INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING NLP-AI IIIT-Hyderabad CIIL, Mysore ICON 2003 19-22 DECEMBER, 2003


INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING

NLP-AI IIIT-Hyderabad

CIIL, Mysore

ICON 2003

19-22 DECEMBER, 2003


Computational Linguistics: HOW ONE FEEDS THE OTHER

• We can study anything about language ...

• 1. Formalize some insights

• 2. Study the formalism mathematically

• 3. Develop & implement algorithms

• 4. Test on real data


NLP: The Big Questions

• What are the right formalisms to encode linguistic knowledge?

– Discrete knowledge: what is possible?

– Continuous knowledge: what is likely?

• How can we compute efficiently with these formalisms?

– Or find approximations that work pretty well?
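
As a minimal sketch of the discrete/continuous distinction above: a set of observed word bigrams can answer "what is possible?", while relative-frequency bigram probabilities can answer "what is likely?". The toy corpus, the function names, and the choice of an unsmoothed bigram model are assumptions made for illustration; they are not taken from the talk.

```python
from collections import Counter

# Hypothetical toy corpus; any tokenized text would do.
CORPUS = [
    ["the", "dog", "barks"],
    ["the", "dog", "barks"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "sleeps"],
]


def bigrams(sentence):
    """Return adjacent word pairs, with start and end markers added."""
    padded = ["<s>"] + sentence + ["</s>"]
    return list(zip(padded, padded[1:]))


# Count each bigram and each left-hand context word.
bigram_counts = Counter(bg for s in CORPUS for bg in bigrams(s))
context_counts = Counter(bg[0] for s in CORPUS for bg in bigrams(s))


def is_possible(sentence):
    """Discrete knowledge: is every bigram of the sentence attested?"""
    return all(bigram_counts[bg] > 0 for bg in bigrams(sentence))


def likelihood(sentence):
    """Continuous knowledge: product of unsmoothed bigram probabilities."""
    p = 1.0
    for first, second in bigrams(sentence):
        if context_counts[first] == 0:
            return 0.0
        p *= bigram_counts[(first, second)] / context_counts[first]
    return p
```

With this corpus, "the dog barks" and "the dog sleeps" are both possible, but the first is twice as likely as the second; "the cat barks" is ruled out entirely because the bigram "cat barks" was never seen. A real system would smooth the counts so that unseen events keep a small nonzero probability.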


Some of the Active Research

• Syntax: It’s converging, but still messy

– “DEEP/SURFACE/SHALLOW structure” problems of syntax

• Phonology: Formalism under hot development

• Speech:

– Better language modeling

– Better models of acoustics, articulatory pronunciation

– Adaptation to particular speakers and dialects

• Translation models and algorithms

• Semantic theories and connection to AI – use stats?

– Too many semantic phenomena. Really hard to determine and disambiguate possible meanings.


Deploying NLP

• Speech recognition and IR have finally gone commercial over the last few years.

• But not much NLP is out in the real world.

• What kind of applications should we be working toward?

• Resources:

– Corpora, with or without annotation

– WordNet; morphologies; maybe a few grammars

– Some languages don’t gel well with NLP or speech modules, or statistical training modules.

– But there are research toolkits that exist and they need to be made available.


Sneaking NLP in through the back door

– ADD FEATURES TO EXISTING INTERFACES

• “Click to translate”

• Spell correction of queries

• Allow multiple types of queries

• Work on document clusters and summaries

• Machines gradually replacing humans at phone/email helpdesks – now becoming a reality elsewhere

– BACK-END PROCESSING

• Information extraction and normalization to build databases: Assemble good text from boilerplate - wherever

– HAND-HELD DEVICES

• Translator

• Personal conversation recorder, with topical search
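
One of the interface features listed above, spell correction of queries, can be sketched with plain edit distance against a known vocabulary. This is only an illustrative baseline under stated assumptions: the vocabulary, the `max_distance` threshold, and the function names are hypothetical, and deployed systems typically also use query logs and word frequencies.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical vocabulary of known query terms.
VOCABULARY = ["translate", "translation", "transcript", "summary"]


def correct(word, vocabulary=VOCABULARY, max_distance=2):
    """Return the closest vocabulary word, or the input if none is close."""
    best = min(vocabulary, key=lambda v: edit_distance(word, v))
    return best if edit_distance(word, best) <= max_distance else word
```

For example, `correct("translte")` picks "translate" (one missing letter), while a query term that is far from everything in the vocabulary is returned unchanged rather than being forced onto a poor match.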


Making Search applications and technology for the masses?

• Allow queries over meanings, not sentences

• Need semantic network extraction from the web

• Simple entities and relationships among them

• Not complete, but linked to original text

• Allow inexact queries – Train on data and learn to generalize from a few tagged examples

• Redundancy factor is important

• Collapse for browsability or space

• Games

• Command-and-control applications

• “Practical dialogue” (computer as assistant)
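
The idea of simple entities and relationships that stay linked to the original text can be sketched as a list of relation records, each carrying its source sentence. Everything here is an assumption for illustration: the single hand-written pattern, the `located_in` label, and the function names are hypothetical, and a realistic extractor would be learned from tagged examples rather than written as one regular expression.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    subject: str
    predicate: str
    obj: str
    source: str  # the original sentence, kept so results link back to text


# Hypothetical pattern: "<Capitalized name> is in <Capitalized name>".
PATTERN = re.compile(r"([A-Z]\w+) is in ([A-Z]\w+)")


def extract(sentences):
    """Build an (incomplete) list of relations from raw sentences."""
    relations = []
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            relations.append(
                Relation(match.group(1), "located_in", match.group(2), sentence)
            )
    return relations


def query(relations, entity):
    """Return every relation that mentions the entity on either side."""
    return [r for r in relations if entity in (r.subject, r.obj)]
```

Given the sentences "Mysore is in Karnataka." and "Karnataka is in India.", a query for "Karnataka" returns both relations, and each result still points at the sentence it came from, so a user can check the extraction against the text.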


Applications in Discourse Modeling

• Following Donia Scott & Hans Kamp, I would say that here is a field that is still unable to come to terms with the semantics-pragmatics divide of texts and natural languages and related problems.

• How do we build a semantic representation and yet keep pragmatic information?

• Highly elliptical utterances that are common in spoken dialogue pose special challenges.

• Many theories but none is complete, although some (or aspects of some) lend themselves more readily to implementation than others.


Different approaches to discourse and dialogue study

• A theory of discourse coherence (Jerry Hobbs, 1985) based on a small, limited set of coherence relations, which is part of a larger, still-developing theory of the relations between text interpretation and belief systems.

• A tripartite organization of discourse structure (cf. Grosz and Sidner 1986) according to the focus of attention of the speaker (the attentional state), the structure of the speaker's purposes (the intentional structure), and the structure of sequences of utterances (the linguistic structure).

• Rhetorical Structure Theory (RST) (Mann & Thompson), in which text spans are organized hierarchically through nucleus-satellite relations.

• Discourse Representation Theory (DRT) (cf. Kamp 1981), a semantic theory developed for the express purpose of representing and computing trans-sentential anaphora and other forms of text cohesion.
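
To make the hierarchical, nucleus-satellite organization of RST concrete, here is a minimal sketch of a tree data structure. The class names, helper functions, and the example analysis are assumptions for illustration only; multinuclear relations, which RST also allows, are left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Span:
    """A leaf: one elementary stretch of text."""
    text: str


@dataclass
class RelationNode:
    """An internal node: a named relation joining a nucleus and a satellite."""
    relation: str
    nucleus: "Node"
    satellite: "Node"


Node = Union[Span, RelationNode]


def central_span(node: Node) -> str:
    """Follow nucleus links down to a leaf, i.e. the most central span."""
    while isinstance(node, RelationNode):
        node = node.nucleus
    return node.text


def relation_names(node: Node) -> list:
    """List relation labels in the tree, parents before children."""
    if isinstance(node, Span):
        return []
    return ([node.relation]
            + relation_names(node.nucleus)
            + relation_names(node.satellite))


# Hypothetical analysis of a three-clause text.
tree = RelationNode(
    relation="evidence",
    nucleus=Span("Meaning should be our priority."),
    satellite=RelationNode(
        relation="elaboration",
        nucleus=Span("Most NLP systems depend on it."),
        satellite=Span("Search and dialogue are two examples."),
    ),
)
```

Following only nucleus links from the root reaches "Meaning should be our priority.", which is one simple way such a tree could support summarization: satellites can be dropped while the nuclei still carry the main line of the text.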


Discourse Modeling in NLP: Future Directions

• Nature of Discourse Relations: textual, rhetorical, intentional, or informational?

• Number of Discourse Relations:

• Level of Abstraction at which Discourse is Described:

• Nature of Discourse Segments, and the psychological reality issue

• Role of Intentions in Discourse:

• Mechanisms for Handling Key Linguistic Phenomena & Reasoning


  

SINCE MOST NLP SYSTEMS MUST DEPEND ON CAREFUL HANDLING OF MEANING, AS THIS MODEL SHOWS, THAT SHOULD BE OUR PRIORITY AREA NOW.


Thank You & Welcome Once Again