Natural Language Processing


Page 1: Natural Language Processing

Natural Language Processing

Page 2: Natural Language Processing

Introduction

▪ Natural Language Processing (NLP) is a subfield of Artificial Intelligence and linguistics devoted to making computers understand statements or words written by humans.

▪ A language is a system: a set of symbols together with a set of rules.

1. Symbols are combined to convey or broadcast information.

2. Rules of grammar govern how the symbols may be combined.

Page 3: Natural Language Processing

Introduction

▪ The history of NLP is generally taken to begin in the 1950s. In 1950, Alan Turing published an article titled "Computing Machinery and Intelligence", which proposed what is now called the Turing test as a criterion of intelligence.

▪ Natural languages are languages that living creatures use for communication.

▪ Artificial languages are mathematically defined classes of signals that can be used for communication with machines.

▪ A language is a set of sentences that may be used as signals to convey semantic information.

▪ The meaning of a sentence is the semantic information it conveys.

Page 4: Natural Language Processing

Problems faced in NLP

1. Sentences are incomplete descriptions of the information they are intended to convey.

2. The same word can have different meanings.

3. New words, expressions and meanings are generated quite freely.

4. There are many ways of saying the same thing.

Page 5: Natural Language Processing

STEPS OF NATURAL LANGUAGE PROCESSING

▪ Morphological Analysis: Individual words are analyzed into their components, and non-word tokens such as punctuation are separated from the words.

▪ Syntactic Analysis: Linear sequences of words are transformed into structures that show how the words relate to each other.

▪ Semantic Analysis: The structures created by the syntactic analyzer are assigned meanings.

▪ Discourse integration: The meaning of an individual sentence may depend on the sentences that precede it and may influence the meanings of the sentences that follow it.

▪ Pragmatic Analysis: The structure representing what was said is reinterpreted to determine what was actually meant.
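The first step, morphological analysis, can be illustrated with a minimal sketch. It assumes a regular-expression tokenizer and a naive suffix stripper; the names `tokenize`, `split_suffix` and `SUFFIXES` are hypothetical and are not part of the slides, and real morphological analyzers are considerably richer.

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of word characters; [^\w\s] matches one character
    # that is neither a word character nor whitespace (i.e. punctuation).
    return re.findall(r"\w+|[^\w\s]", text)

# Hypothetical suffix list, for illustration only.
SUFFIXES = ("ed", "ing", "s")

def split_suffix(word):
    """Return (stem, suffix) using naive suffix stripping."""
    for suffix in SUFFIXES:
        # Require a stem of at least three characters to avoid over-stripping.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)], suffix
    return word, ""

tokens = tokenize("Bill printed the file.")
print(tokens)
print([split_suffix(t) for t in tokens if t.isalpha()])
```

The punctuation mark comes out as its own token, and "printed" is split into the stem "print" and the suffix "ed".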

Page 6: Natural Language Processing

Syntax analysis

▪ The lexicon of a language is its vocabulary, which includes its words and expressions. Morphology is the analysis, identification and description of the structure of words.

▪ It involves dividing a text into paragraphs, sentences and words.

▪ Words are generally accepted as the smallest units of syntax. Syntax refers to the rules and principles that govern the sentence structure of an individual language.

Page 7: Natural Language Processing

Syntactic Analysis

– S → NP VP

– NP → the NP1

– NP → PRO

– NP → PN

– NP → NP1

– NP1 → ADJS N

– ADJS → ε | ADJ ADJS

– VP → V

– VP → V NP

– N → file | printer

– PN → Bill

– PRO → I

– ADJ → short | long | fast

– V → printed | created | want
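The grammar above can be written down as data and used to check whether a word sequence belongs to the language. The following is a minimal sketch, assuming a simple backtracking recognizer; the dictionary layout and the names `GRAMMAR`, `derive` and `accepts` are illustrative assumptions, not something given in the slides.

```python
# Each nonterminal maps to a list of productions; a production is a list of
# symbols. The empty list stands for the empty string (epsilon).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "NP1"], ["PRO"], ["PN"], ["NP1"]],
    "NP1": [["ADJS", "N"]],
    "ADJS": [[], ["ADJ", "ADJS"]],
    "VP": [["V"], ["V", "NP"]],
    "N": [["file"], ["printer"]],
    "PN": [["Bill"]],
    "PRO": [["I"]],
    "ADJ": [["short"], ["long"], ["fast"]],
    "V": [["printed"], ["created"], ["want"]],
}

def derive(symbol, words, start):
    """Yield every position at which a derivation of symbol from start can end."""
    if symbol not in GRAMMAR:
        # Terminal symbol: it must match the next word exactly.
        if start < len(words) and words[start] == symbol:
            yield start + 1
        return
    for production in GRAMMAR[symbol]:
        positions = {start}
        for part in production:
            positions = {end for pos in positions
                         for end in derive(part, words, pos)}
        yield from positions

def accepts(sentence):
    """Return True if the whole sentence can be derived from S."""
    words = sentence.split()
    return len(words) in derive("S", words, 0)

print(accepts("Bill printed the file"))
print(accepts("printed Bill the file"))
```

The recognizer terminates on this grammar because the only recursive rule, ADJS → ADJ ADJS, consumes a word before it recurses. A left-recursive grammar would need a different strategy.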

Page 8: Natural Language Processing

A parse tree for a sentence:

S
  NP
    PN
      Bill
  VP
    V
      printed
    NP
      the
      NP1
        ADJS
          ε
        N
          file

▪ Text: Bill printed the file

Page 9: Natural Language Processing

Syntax Tree Example

Page 10: Natural Language Processing

Syntactic Analysis Example

▪ Sentence: John ate the apple.

▪ Grammar:

1. S -> NP VP

2. VP -> V NP

3. NP -> NAME

4. NP -> ART N

5. NAME -> John

6. V -> ate

7. ART -> the

8. N -> apple

▪ Parse tree:

S
  NP
    NAME
      John
  VP
    V
      ate
    NP
      ART
        the
      N
        apple
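The eight rules for "John ate the apple." are enough to build the parse tree mechanically. The sketch below is one possible way to do it, assuming a top-down backtracking parser that returns nested tuples; `RULES`, `parse`, `parse_sentence` and `show` are hypothetical names introduced for illustration.

```python
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["NAME"], ["ART", "N"]],
    "NAME": [["John"]],
    "V": [["ate"]],
    "ART": [["the"]],
    "N": [["apple"]],
}

def parse(symbol, words, start):
    """Yield (tree, end) for each way symbol can derive words[start:end]."""
    if symbol not in RULES:
        # Terminal symbol: a leaf is just the word itself.
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for production in RULES[symbol]:
        # Each entry pairs the children built so far with the position reached.
        partial = [([], start)]
        for part in production:
            partial = [(children + [subtree], end)
                       for children, pos in partial
                       for subtree, end in parse(part, words, pos)]
        for children, end in partial:
            yield (symbol, *children), end

def parse_sentence(sentence):
    """Return the first tree covering the whole sentence, or None."""
    words = sentence.rstrip(".").split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None

def show(tree, depth=0):
    """Render a tree as indented text, one node per line."""
    if isinstance(tree, str):
        return "  " * depth + tree
    label, *children = tree
    lines = ["  " * depth + label]
    lines += [show(child, depth + 1) for child in children]
    return "\n".join(lines)

print(show(parse_sentence("John ate the apple.")))
```

Representing each node as a tuple whose first element is the category keeps the tree easy to inspect and to pass on to a later semantic step.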

Page 11: Natural Language Processing

Semantic Analysis

▪ Semantic analysis must map individual words to appropriate objects in the knowledge base or database.

▪ It must create the correct structures to correspond to the way the meanings of the individual words combine with each other.

▪ Thus a mapping is made between the syntactic structures and objects in the task domain. Structures for which no such mapping is possible are rejected.

▪ E.g., the sentence “Colorless green ideas…” would be rejected as semantically anomalous because the combination of colorless and green makes no sense.
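The idea of rejecting structures that cannot be mapped onto the task domain can be sketched with a toy knowledge base. Everything in the example is an assumption made for illustration: the word types, the verb frames and the function name `interpret` do not come from the slides.

```python
# Hypothetical knowledge base: words mapped to the type of object they denote.
WORD_TYPES = {
    "Bill": "person",
    "I": "person",
    "file": "file",
    "printer": "printer",
}

# Hypothetical verb frames: verb -> (event, required agent type, required object type).
VERB_FRAMES = {
    "printed": ("printing", "person", "file"),
    "created": ("creating", "person", "file"),
}

def interpret(subject, verb, obj):
    """Map a (subject, verb, object) triple to an event frame.

    Returns None when no mapping into the knowledge base is possible,
    which corresponds to rejecting the structure.
    """
    frame = VERB_FRAMES.get(verb)
    if frame is None:
        return None
    event, agent_type, object_type = frame
    if WORD_TYPES.get(subject) != agent_type:
        return None
    if WORD_TYPES.get(obj) != object_type:
        return None
    return {"event": event, "agent": subject, "object": obj}

print(interpret("Bill", "printed", "file"))
print(interpret("file", "printed", "Bill"))
```

The second call is syntactically well formed but is rejected because a file cannot be the agent of a printing event in this toy domain.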

Page 12: Natural Language Processing

Knowledge Base Fragment

Page 13: Natural Language Processing

Partial Meaning for a Sentence

Page 14: Natural Language Processing

Discourse Integration

▪ The meaning of an individual sentence may depend on the sentences that precede it and may influence the meaning of the sentences that follow it.

▪ Example: the word “it” in the sentence “You wanted it” depends on the prior discourse context.

▪ Specifically, without context we do not know whom the pronoun “I” or the proper noun “Bill” refers to.

▪ Pinning down these references requires an appeal to a model of the current discourse context, from which we can learn that the current user is USER068 and that the only person named “Bill” about whom we could be talking is USER073.

▪ Once the correct referent for Bill is known, we can also determine exactly which file is being referred to.
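Reference resolution against a discourse model can be sketched as a simple lookup. The identifiers USER068 and USER073 are taken from the slide; the dictionary layout and the names `DISCOURSE_CONTEXT` and `resolve` are assumptions made for this example.

```python
# Hypothetical discourse model. Only the two user identifiers come from the
# slide; the structure around them is an assumption.
DISCOURSE_CONTEXT = {
    "current_user": "USER068",
    "known_people": {"Bill": ["USER073"]},
}

def resolve(reference, context):
    """Return the entity a reference denotes, or None if unknown or ambiguous."""
    if reference == "I":
        # The first-person pronoun refers to whoever is currently speaking.
        return context["current_user"]
    candidates = context["known_people"].get(reference, [])
    # A proper noun is resolved only when exactly one candidate exists.
    return candidates[0] if len(candidates) == 1 else None

print(resolve("I", DISCOURSE_CONTEXT))
print(resolve("Bill", DISCOURSE_CONTEXT))
```

Returning None for unknown or ambiguous names leaves room for a later step to ask the user for clarification instead of guessing.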

Page 15: Natural Language Processing

Pragmatic Analysis

▪ The final step toward effective understanding is to decide what to do as a result.

▪ One possible thing to do is to record what was said as a fact and be done with it.

▪ For some sentences, whose intended effect is clearly declarative, that is precisely the correct thing to do.

▪ But for other sentences, including this one, the intended effect is different.

▪ We can discover this intended effect by applying a set of rules that characterize cooperative dialogues.

▪ The final step in pragmatic processing is to translate from the knowledge-based representation to a command to be executed by the system.

▪ The result of the understanding process is

Page 16: Natural Language Processing

Pragmatic Analysis

Page 17: Natural Language Processing

Summary

▪ We have seen the results of the main processes that combine to form a natural language system.

▪ In a complete natural language processing system, all of these processes are necessary.

▪ But not all programs are written with exactly these components; sometimes two or more of them are collapsed.

▪ Collapsing the components will result in a system that is easier to build for restricted subsets of English but one that is harder to extend to wider coverage.