Natural Language Processing SoSe 2017 Introduction to Natural Language Processing Dr. Mariana Neves April 24th, 2017


Natural Language Processing

SoSe 2017

Introduction to Natural Language Processing

Dr. Mariana Neves April 24th, 2017


Outline

● Introduction to NLP

● NLP Applications

● NLP Techniques

● Linguistic Knowledge

● Challenges

● NLP course


Natural Language

(http://www.transparent.com/learn-japanese/articles/dec_99.html)

(http://expertenough.com/2392/german-language-hacks)


Artificial Language

(https://netbeans.org/features/java/)

(http://noobite.com/learn-programming-start-with-python/)

Natural Language Processing

Lexical Analysis

Syntax Analysis

Semantic Analysis

Discourse Analysis

Pragmatic Analysis

Natural Language Processing

(http://rutumulkar.com/blog/2016/NLP-ML)


Outline

● Introduction to NLP

● NLP Applications

● NLP Techniques

● Linguistic Knowledge

● Challenges

● NLP course


Spell and Grammar Checking

● Checking spelling and grammar

● Suggesting alternatives for the errors
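
One common way to suggest alternatives for a misspelled word is to rank dictionary words by edit distance. The following is a minimal sketch under that assumption; the tiny `VOCAB` list, the `suggest` helper, and the `max_distance` threshold are illustrative choices, not something taken from the slides.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed one row of the table at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy vocabulary; a real checker would load a full dictionary.
VOCAB = ["language", "natural", "processing", "sentence", "grammar"]


def suggest(word, vocab=VOCAB, max_distance=2):
    """Return vocabulary words within max_distance edits, closest first."""
    if word in vocab:
        return []  # correctly spelled, nothing to suggest
    scored = sorted((edit_distance(word, v), v) for v in vocab)
    return [v for d, v in scored if d <= max_distance]


print(suggest("langauge"))
```

Real spell checkers usually also weight candidates by word frequency and context; this sketch ranks by distance only.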


Word Prediction

● Predicting the most probable next word to be typed by the user
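
A simple way to illustrate next-word prediction is a bigram model that counts which word most often follows the current one. This is only a sketch: the three-sentence `CORPUS` and the function names are made up for illustration, and practical systems use much larger corpora and smoothing.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, how often each other word follows it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model


def predict_next(model, word):
    """Return the most frequent follower of word, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus.
CORPUS = [
    "i like natural language processing",
    "i like natural language",
    "i like green tea",
]
model = train_bigrams(CORPUS)
print(predict_next(model, "like"))
```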


Information Retrieval

● Finding information relevant to the user’s query
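
As a rough illustration of how relevance can be scored, the sketch below ranks documents by summed TF-IDF weight of the query terms. The three documents in `DOCS` and the helper names are assumptions made for this example; the slides do not prescribe a particular ranking function.

```python
import math
from collections import Counter

# Hypothetical toy document collection.
DOCS = {
    "d1": "the cat sat on the mat",
    "d2": "dogs and cats are popular pets",
    "d3": "machine translation converts text between languages",
}


def tokenize(text):
    return text.lower().split()


def build_index(docs):
    """Compute per-document term frequencies and corpus-wide IDF weights."""
    tfs = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    df = Counter(term for tf in tfs.values() for term in tf)
    idf = {term: math.log(len(docs) / n) for term, n in df.items()}
    return tfs, idf


def search(query, tfs, idf):
    """Return ids of documents with a positive score, best match first."""
    scores = {}
    for doc_id, tf in tfs.items():
        score = sum(tf[t] * idf.get(t, 0.0) for t in tokenize(query))
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


tfs, idf = build_index(DOCS)
print(search("machine translation", tfs, idf))
```

Note that this sketch matches exact tokens only, so the query "cat" does not match "cats"; stemming or lemmatization would be needed for that.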

(https://hpi.de/plattner/olelo)

Text Categorization

● Assigning one or more predefined categories to a text
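
One classic approach to text categorization is a multinomial Naive Bayes classifier. The sketch below assumes add-one smoothing and a made-up four-sentence training set (`TRAIN`); it is meant to show the idea, not to reproduce any particular tool.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled training examples.
TRAIN = [
    ("the team won the match", "sports"),
    ("the striker scored a goal", "sports"),
    ("parliament passed the new law", "politics"),
    ("the minister announced an election", "politics"),
]


def train(examples):
    """Collect word counts per class, document counts per class, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def classify(text, word_counts, class_counts, vocab):
    """Pick the class with the highest log-probability (add-one smoothing)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify("the striker won", *model))
```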

http://www.uclassify.com/browse/mvazquez/News-Classifier

Summarization

● Generating a short summary from one or more documents, sometimes based on a given query
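
A very simple form of summarization is extractive: score each sentence by how frequent its content words are in the document and keep the top-scoring ones. The sketch below assumes this frequency heuristic, a hand-picked `STOPWORDS` set, and a naive regex sentence splitter; query-based summarization is not covered.

```python
import re
from collections import Counter

# Hypothetical minimal stop-word list.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it", "are"}


def content_words(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return [s for s in sentences if s in top]


TEXT = ("NLP studies language. NLP systems process language automatically. "
        "The weather is nice.")
print(summarize(TEXT, 1))
```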

http://smmry.com/

(https://hpi.de/plattner/olelo)

Question answering

● Answering questions with a short answer

http://start.csail.mit.edu/index.php

Question Answering

(https://hpi.de/plattner/olelo)

Question answering

● IBM Watson on Jeopardy!

https://www.youtube.com/watch?v=WFR3lOm_xhE

Information Extraction

● Extracting important concepts from texts and assigning them to slots in a given template
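
Template filling can be illustrated with a hand-written pattern that maps parts of a sentence to named slots. The sketch below assumes a hypothetical "birth" template with the slots `person`, `date`, and `place`, and a single regular expression; the example sentence is illustrative, and real systems typically learn such extractors rather than hard-coding them.

```python
import re

# Hypothetical pattern for a "birth" template: person, date, place.
BIRTH_PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)+) was born on "
    r"(?P<date>\d{1,2} [A-Z][a-z]+ \d{4}) in (?P<place>[A-Z][a-z]+)"
)


def fill_birth_template(text):
    """Return the filled slots as a dict, or None if the pattern does not match."""
    match = BIRTH_PATTERN.search(text)
    return match.groupdict() if match else None


print(fill_birth_template("Angela Merkel was born on 17 July 1954 in Hamburg."))
```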


Information Extraction

● Includes named-entity recognition

(https://en.wikipedia.org/wiki/Angela_Merkel)

Information Extraction

(https://www.ncbi.nlm.nih.gov/pubmed/17910981)

Machine Translation

● Translating a text from one language to another

(http://www.lemonde.fr/)

(https://www.youtube.com/watch?v=UtnXrXZYg_g)

Sentiment Analysis

● Identifying sentiments and opinions stated in a text
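
A baseline for sentiment analysis is a lexicon lookup with simple negation handling. The word lists below (`POSITIVE`, `NEGATIVE`, `NEGATIONS`) are tiny hand-made assumptions for illustration; the rule that a negation flips only the next word is likewise a simplification.

```python
import re

# Hypothetical miniature sentiment lexicon.
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "terrible", "awful", "hate", "sad"}
NEGATIONS = {"not", "never", "no"}


def sentiment(text):
    """Label text as 'positive', 'negative', or 'neutral' from lexicon hits."""
    score = 0
    negate = False
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in NEGATIONS:
            negate = True  # flip the polarity of the next word
            continue
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        score += -polarity if negate else polarity
        negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("The plot was not good."))
```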


Optical Character Recognition

● Recognizing printed or handwritten texts and converting them to computer-readable texts


Speech recognition

● Recognizing spoken language and transcribing it into text


Speech synthesis

● Producing spoken language from text


Spoken dialog systems

● Running a dialog between the user and the system


Levels of difficulty

● Easy (mostly solved)

– Spell and grammar checking

– Some text categorization tasks

– Some named-entity recognition tasks

Levels of difficulty

● Intermediate (good progress)

– Information retrieval

– Sentiment analysis

– Machine translation

– Information extraction

Levels of difficulty

● Difficult (still hard)

– Question answering

– Summarization

– Dialog systems


Outline

● Introduction to NLP

● NLP Applications

● NLP Techniques

● Linguistic Knowledge

● Challenges

● NLP course


Section splitting

● Splitting a text into sections


Sentence splitting

● Splitting a text into sentences
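
Sentence splitting is harder than splitting on every period, because abbreviations such as "Dr." also end in one. The sketch below assumes a small hand-made `ABBREVIATIONS` set and whitespace tokenization; it is a rule-based illustration only.

```python
# Hypothetical abbreviation list; real splitters use larger lists or trained models.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "prof.", "e.g.", "i.e.", "etc."}


def split_sentences(text):
    """Split on tokens ending in . ! or ?, unless the token is a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        ends_sentence = token.endswith((".", "!", "?"))
        if ends_sentence and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without final punctuation
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("Dr. Neves teaches NLP. The course starts in April!"))
```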


Part-of-speech tagging

● Assigning a syntactic tag to each word in a sentence
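
To make the idea of tagging concrete, the sketch below looks words up in a small lexicon and falls back to suffix rules. The `LEXICON`, the tag names, and the fallback rules are assumptions for illustration; taggers such as the one in the CoreNLP demo are statistical and use context.

```python
# Hypothetical toy lexicon mapping words to coarse part-of-speech tags.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "chases": "VERB", "sees": "VERB",
}


def tag(sentence):
    """Tag each word using the lexicon, then suffix heuristics, then NOUN as default."""
    tagged = []
    for word in sentence.lower().split():
        if word in LEXICON:
            pos = LEXICON[word]
        elif word.endswith("ly"):
            pos = "ADV"
        elif word.endswith(("ing", "ed")):
            pos = "VERB"
        else:
            pos = "NOUN"
        tagged.append((word, pos))
    return tagged


print(tag("The dog chases a cat"))
```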

http://nlp.stanford.edu:8080/corenlp/

Parsing

● Building the syntactic tree of a sentence
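
One textbook way to build a syntactic tree is the CKY algorithm over a grammar in Chomsky normal form. The sketch below assumes a three-rule toy grammar (`BINARY`) and lexicon (`LEXICAL`) invented for this example, and it keeps only one tree per label and span.

```python
# Hypothetical toy grammar in Chomsky normal form: (left, right) -> parent.
BINARY = {("NP", "VP"): "S", ("DET", "N"): "NP", ("V", "NP"): "VP"}
LEXICAL = {"the": "DET", "a": "DET", "dog": "N", "cat": "N", "chases": "V"}


def parse(sentence):
    """Return a bracketed tree for the sentence, or None if it cannot be parsed."""
    words = sentence.lower().split()
    n = len(words)
    # chart[i][j] maps a label to a bracketed tree covering words[i:j].
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        if word in LEXICAL:
            label = LEXICAL[word]
            chart[i][i + 1][label] = f"({label} {word})"
    for span in range(2, n + 1):
        for start in range(0, n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                for left, ltree in chart[start][mid].items():
                    for right, rtree in chart[mid][end].items():
                        parent = BINARY.get((left, right))
                        if parent and parent not in chart[start][end]:
                            chart[start][end][parent] = f"({parent} {ltree} {rtree})"
    return chart[0][n].get("S")


print(parse("the dog chases a cat"))
```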

http://nlp.stanford.edu:8080/corenlp/

Named-entity recognition

● Identifying pre-defined entity types in a sentence


Word sense disambiguation

● Determining the intended meaning of a word or entity in its context
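
A simple baseline for word sense disambiguation is a simplified Lesk approach: choose the sense whose dictionary gloss shares the most words with the context. The two glosses for "tie" in `SENSES` below are written for this example, not quoted from a dictionary, and the stop-word list is an assumption.

```python
# Hypothetical sense inventory with hand-written glosses.
SENSES = {
    "tie": {
        "necktie": "a long narrow piece of cloth worn around the neck with a shirt",
        "draw": "a result in a game or match in which both teams have the same score",
    }
}
STOPWORDS = {"a", "an", "the", "of", "in", "or", "and", "with", "he", "which"}


def disambiguate(word, context):
    """Return the sense whose gloss overlaps most with the context words."""
    context_words = set(context.lower().split()) - STOPWORDS

    def overlap(sense):
        gloss_words = set(SENSES[word][sense].split()) - STOPWORDS
        return len(context_words & gloss_words)

    return max(SENSES[word], key=overlap)


print(disambiguate("tie", "he wore a shirt and a tie"))
```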

http://www.thefreedictionary.com/tie

Word sense disambiguation

http://www.ling.gu.se/~lager/Home/pwe_ui.html

Semantic role labeling

● Extracting subject-predicate-object triples from a sentence

http://cogcomp.cs.illinois.edu/page/demo_view/srl

Outline

● Introduction to NLP

● NLP Applications

● NLP Techniques

● Linguistic Knowledge

● Challenges

● NLP course


Phonetics and phonology

● The study of linguistic sounds and their relations to words

http://german.about.com/library/blfunkabc.htm

Morphology

● The study of internal structures of words and how they can be modified

● Parsing complex words into their components

(http://allthingslinguistic.com/post/50939757945/morphological-typology-illustrations-from)

Syntax

● The study of the structural relationships between words in a sentence


Semantics

● The study of the meaning of words, and how these combine to form the meanings of sentences

– Synonymy: fall & autumn

– Hypernymy & hyponymy (is a): animal & dog

– Meronymy (part of): finger & hand

– Homonymy: fall (verb & season)

– Antonymy: big & small


Pragmatics

● Social use of language

● The study of how language is used to accomplish goals, and the influence of context on meaning.

● Understanding the aspects of a language that depend on situation and world knowledge.

“Give me the salt!”

or

“Could you please give me the salt?”


Discourse

● The study of linguistic units larger than a single statement


John reads a book. He borrowed it from his friend.

(http://en.wikipedia.org/wiki/Berlin)


Outline

● Introduction to NLP

● NLP Applications

● NLP Techniques

● Linguistic Knowledge

● Challenges

● NLP course


Paraphrasing

● Different words/sentences express the same meaning

– Season of the year

● Fall

● Autumn

– Book delivery time

● When will my book arrive?

● When will I receive my book?


Ambiguity

● One word/sentence can have different meanings

– Fall

● The third season of the year

● Moving down towards the ground or towards a lower position

– The door is open.

● Expressing a fact

● A request to close the door


Phonetics and Phonology

http://worldsgreatestsmile.com/html/phonological_ambiguity.html

Syntax and ambiguity

● I saw the man with a telescope.

– Who had the telescope?
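
The two readings of the telescope sentence correspond to two different parse trees: the prepositional phrase attaches either to the verb (I used the telescope) or to the noun (the man had the telescope). The sketch below counts parses with a CKY-style chart; the grammar in `RULES` and the lexicon are a toy assumption written for this one sentence.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left, right).
RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "DET", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]
LEXICAL = {"i": "NP", "saw": "V", "the": "DET", "a": "DET",
           "man": "N", "telescope": "N", "with": "P"}


def count_parses(sentence, root="S"):
    """Count the distinct parse trees that the toy grammar assigns to the sentence."""
    words = sentence.lower().strip(".!?").split()
    n = len(words)
    # chart[(i, j)][label] = number of trees with that label over words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        label = LEXICAL.get(word)
        if label:
            chart[(i, i + 1)][label] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in RULES:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)][root]


print(count_parses("I saw the man with a telescope."))
```

Under this grammar the sentence receives two parses, while "I saw the man" receives one; deciding between the two readings requires context or world knowledge beyond syntax.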

(http://www.realtytrac.com/landing/2009-year-end-foreclosure-report.html)

Semantics

● The astronomer loves the star.

– Star in the sky

– Celebrity


(http://www.businessnewsdaily.com/2023-celebrity-hiring.html)

(http://en.wikipedia.org/wiki/Star#/media/File:Starsinthesky.jpg)


Discourse analysis

● Alice understands that you like your mother, but she …

– Does “she” refer to Alice or to your mother?


Outline

● Introduction to NLP

● NLP Applications

● NLP Techniques

● Linguistic Knowledge

● Challenges

● NLP course


NLP Course

● Home page:

– https://hpi.de/plattner/teaching/summer-term-2017/natural-language-processing.html

● Lecture

– Monday 11:00-12:30

– HS3

– 3 credit points


Grading

● 60% Project

● 40% Final exam (written)

● You have to pass both of them


Project

● Development of an NLP application

– Information Retrieval

– Information Extraction

– Text Summarization

– Question Answering

– Sentiment Analysis

– Machine Translation

– etc.


Project

● The application should include the following components:

– Syntactic parsing (e.g., POS tagging)

– Semantics (e.g., NER)

– Discourse analysis


Project

● Any NLP or ML libraries

– Stanford CoreNLP

– NLTK

– spaCy

– Apache OpenNLP

– GATE

– SAP HANA (contact me)

– R

– Weka

– TensorFlow

– etc.


Project

● Any language

– English, German, etc.

(Check available NLP tools)

● Any text collections

– Social media, Web pages, publications, Wikipedia, etc.

– Benchmarks or newly created datasets

● Any domain


Project

● Teams (2-3 students)

● Send me an email with your proposal (by May 5th, 2017)

● Updates (presentations) on the progress of the project

– Mid-term and final presentation

– Also considered for grading (commitment)


Course book

● Speech and Language Processing

– Daniel Jurafsky and James H. Martin

– New (3rd) edition: https://web.stanford.edu/~jurafsky/slp3/


Deep learning book

● Deep Learning (Adaptive Computation and Machine Learning)

– Ian Goodfellow, Yoshua Bengio, Aaron Courville

– The MIT Press, 2016

– http://www.deeplearningbook.org

– NLP in section 12.4


Journal and conferences

● Journal

– Computational Linguistics

● Conferences

– ACL: Association for Computational Linguistics

– NAACL: North American Chapter

– EACL: European Chapter

– HLT: Human Language Technology

– EMNLP: Empirical Methods in Natural Language Processing

– COLING: International Conference on Computational Linguistics

– LREC: Language Resources and Evaluation


NLP Course

● Contact

[email protected]

– Room V-0.01 (Villa), appointments on request

● We have a student position for NLP at the EPIC chair!
