CS 7650: Natural Language Processing
Alan Ritter (many slides from Greg Durrett)
Administrivia
‣ Course website: https://aritter.github.io/CS-7650/
‣ Piazza and Gradescope: links on the course website
  ‣ We will do our best to make sure questions about the homework, etc. get answered within 24 hours
‣ TA office hours: see spreadsheet
Course Requirements
‣ Prior exposure to machine learning very helpful but not required
‣ Programming / Python experience
‣ Probability
‣ Linear Algebra
‣ Multivariable Calculus
There will be a lot of math and programming!
Coursework Plan
‣ Problem Set 1 (math review) is out now on Gradescope (due Jan 25)
‣ 3 Programming Projects (fairly substantial implementation effort)
  ‣ Text classification
  ‣ Named entity recognition (BiLSTM-CNN-CRF)
  ‣ Neural chatbot (Seq2Seq with attention)
‣ 2 written assignments + midterm exam
‣ Mostly math problems related to ML / NLP
‣ Final project (details on course website, will discuss later)
Free Textbooks!
‣ 2 really awesome free textbooks available
‣ There will be assigned readings from both
‣ Both freely available online
Programming Projects: Computation
‣ Modern NLP methods require non-trivial computation
‣ Training neural networks with many parameters can take a long time (it is a very good idea to start working on the assignments early!)
‣ You probably want to use a GPU
‣ Google Colab: free GPUs (some limitations)
‣ The programming projects are designed with Colab in mind
What’s the goal of NLP?
‣ Be able to solve problems that require deep understanding of text
‣ Example: dialogue systems
“Siri, what’s the most valuable American company?”
  ‣ recognize that marketCap is the target value
  ‣ recognize predicate
  ‣ do computation
“Apple”
“Who is its CEO?”
  ‣ resolve references
“Tim Cook”
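The steps annotated on this dialogue can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the knowledge base, the `marketCap` scores, the company names other than Apple, and the keyword matching are all made up for the example, and a real assistant would not work by string matching.

```python
# Toy knowledge base. "Acme Corp" and "Globex" are hypothetical, and the
# marketCap numbers are placeholder scores, not real figures.
COMPANIES = {
    "Apple": {"country": "US", "marketCap": 3.0, "ceo": "Tim Cook"},
    "Acme Corp": {"country": "US", "marketCap": 1.0, "ceo": "A. Example"},
    "Globex": {"country": "DE", "marketCap": 5.0, "ceo": "B. Example"},
}


def most_valuable(country):
    """Recognize the predicate (country filter) and do the computation (argmax)."""
    candidates = {n: c for n, c in COMPANIES.items() if c["country"] == country}
    return max(candidates, key=lambda n: candidates[n]["marketCap"])


class Dialogue:
    """Keeps the last mentioned entity so that "its" can be resolved."""

    def __init__(self):
        self.last_entity = None

    def ask(self, question):
        q = question.lower()
        if "most valuable american company" in q:
            # marketCap is the target value; "American" is the predicate.
            self.last_entity = most_valuable("US")
            return self.last_entity
        if "its ceo" in q:
            # Resolve references: "its" points at the previous answer.
            if self.last_entity is None:
                return None
            return COMPANIES[self.last_entity]["ceo"]
        return None


dialogue = Dialogue()
print(dialogue.ask("Siri, what's the most valuable American company?"))  # Apple
print(dialogue.ask("Who is its CEO?"))  # Tim Cook
```

The second question only makes sense because the system stored the first answer; asked on its own, "Who is its CEO?" returns nothing in this sketch.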
Automatic Summarization
…
…
One of New America’s writers posted a statement critical of Google. Eric Schmidt, Google’s CEO, was displeased. The writer and his team were dismissed.
‣ compress text
‣ provide missing context
‣ paraphrase to provide clarity
Machine Translation
Trump Pope family watch a hundred years a year in the White House balcony
People’s Daily, August 30, 2017
NLP Analysis Pipeline
Text → Text Analysis → Annotations → Applications
‣ Text Analysis
  ‣ Syntactic parses
  ‣ Coreference resolution
  ‣ Entity disambiguation
  ‣ Discourse analysis
‣ Applications
  ‣ Summarize
  ‣ Extract information
  ‣ Answer questions
  ‣ Identify sentiment
  ‣ Translate
‣ NLP is about building these pieces!
‣ All of these components are modeled with statistical approaches trained with machine learning
How do we represent language?
Text → Labels, Sequences/tags, Trees
‣ Labels
  ‣ the movie was good → +
  ‣ Beyoncé had one of the best videos of all time → subjective
‣ Sequences/tags
  ‣ [Tom Cruise]PERSON stars in the new [Mission Impossible]MOVIE film
‣ Trees
  ‣ I eat cake with icing (parse tree with nodes S, NP, VP, PP; POS tags VBZ, NN)
  ‣ flights to Miami → λx. flight(x) ∧ dest(x)=Miami
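As a rough sketch, each of these representations maps onto an ordinary data structure. The BIO tag encoding, the nested-tuple tree format, the simplified node labels, and the toy `ENTITIES` records below are assumptions made for illustration; the slide itself only shows the labels, tags, tree, and logical form.

```python
# Label: one category for the whole text.
labeled_example = ("the movie was good", "+")

# Sequence of tags: one tag per token (BIO encoding is an assumed convention).
tokens = "Tom Cruise stars in the new Mission Impossible film".split()
tags = ["B-PERSON", "I-PERSON", "O", "O", "O", "O", "B-MOVIE", "I-MOVIE", "O"]


def bio_spans(tokens, tags):
    """Collect (label, text) spans from a BIO-tagged token sequence."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)
        else:
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]


# Tree: nested tuples of (label, child, ...), with plain strings as leaves.
# The node labels are simplified; this is one of the possible attachments.
tree = ("S", ("NP", "I"),
        ("VP", ("V", "eat"),
         ("NP", ("NP", "cake"), ("PP", ("P", "with"), ("NP", "icing")))))


def leaves(node):
    """Read the words back off a tree, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


# Logical form: λx. flight(x) ∧ dest(x)=Miami as a predicate over toy records.
ENTITIES = [
    {"id": "F1", "kind": "flight", "dest": "Miami"},
    {"id": "F2", "kind": "flight", "dest": "Boston"},
    {"id": "H1", "kind": "hotel", "dest": "Miami"},
]


def flight_to_miami(x):
    return x["kind"] == "flight" and x["dest"] == "Miami"


print(bio_spans(tokens, tags))  # [('PERSON', 'Tom Cruise'), ('MOVIE', 'Mission Impossible')]
print(leaves(tree))  # ['I', 'eat', 'cake', 'with', 'icing']
print([e["id"] for e in ENTITIES if flight_to_miami(e)])  # ['F1']
```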
How do we use these representations?
Text → Text Analysis → Labels, Sequences, Trees, … → Applications
‣ Extract syntactic features
‣ Tree-structured neural networks
‣ Tree transducers (for machine translation)
‣ …
‣ end-to-end models
‣ Main question: What representations do we need for language? What do we want to know about it?
‣ Boils down to: what ambiguities do we need to resolve?
Why is language hard? (and how can we handle that?)
Language is Ambiguous!
‣ Hector Levesque (2011): “Winograd schema challenge” (named after Terry Winograd, the creator of SHRDLU)
The city council refused the demonstrators a permit because they ______ violence
  ‣ they feared (they = the city council)
  ‣ they advocated (they = the demonstrators)
‣ This is so complicated that it’s an AI challenge problem! (AI-complete)
‣ Referential/semantic ambiguity
Language is Ambiguous!
‣ Ambiguous News Headlines:
  ‣ Teacher Strikes Idle Kids
  ‣ Hospitals Sued by 7 Foot Doctors
  ‣ Ban on Nude Dancing on Governor’s Desk
  ‣ Iraqi Head Seeks Arms
  ‣ Stolen Painting Found by Tree
  ‣ Kids Make Nutritious Snacks
  ‣ Local HS Dropouts Cut in Half
‣ Syntactic/semantic ambiguity: parsing needed to resolve these, but need context to figure out which parse is correct
slide credit: Dan Klein
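To see why parsing alone does not settle such cases, here is a minimal sketch that counts the parses a grammar licenses. It uses a CKY-style chart over a tiny hand-written grammar in Chomsky normal form; the grammar, the lexicon, and the example sentence "I eat cake with icing" are assumptions chosen for illustration, not a grammar used in the course.

```python
from collections import defaultdict

# Toy binary rules: (left child, right child) -> possible parent labels.
BINARY_RULES = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # PP attaches to the verb phrase
    ("NP", "PP"): ["NP"],   # PP attaches to the noun phrase
    ("P", "NP"): ["PP"],
}
LEXICON = {"I": ["NP"], "eat": ["V"], "cake": ["NP"], "with": ["P"], "icing": ["NP"]}


def count_parses(words, root="S"):
    """Count distinct parse trees rooted in `root` with a CKY chart."""
    n = len(words)
    # chart[i, j, label] = number of trees with this label over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for label in LEXICON[word]:
            chart[i, i + 1, label] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for (left, right), parents in BINARY_RULES.items():
                    count = chart[i, k, left] * chart[k, j, right]
                    if count:
                        for parent in parents:
                            chart[i, j, parent] += count
    return chart[0, n, root]


print(count_parses("I eat cake with icing".split()))  # 2
```

The two parses differ only in where "with icing" attaches: to "cake" (cake that has icing) or to "eat" (eating by means of icing). The grammar accepts both, so choosing between them takes context or world knowledge.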
Language is Really Ambiguous!
‣ There aren’t just one or two possibilities which are resolved pragmatically
il fait vraiment beau
  ‣ It is really nice out
  ‣ It’s really nice
  ‣ The weather is beautiful
  ‣ It is really beautiful outside
  ‣ He makes truly beautiful
  ‣ He makes truly boyfriend
  ‣ It fact actually handsome
‣ Combinatorially many possibilities, many you won’t even register as ambiguities, but systems still have to resolve them
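A quick way to see the combinatorial growth is to enumerate word-by-word translations. The gloss table below is an assumption read off the candidate outputs listed on this slide; it ignores reordering, multi-word phrases, and everything else a real translation system has to consider, so the true space is far larger.

```python
from itertools import product

# Assumed word-level glosses, taken from the candidate translations above.
GLOSSES = {
    "il": ["it", "he"],
    "fait": ["is", "makes", "fact"],
    "vraiment": ["really", "truly", "actually"],
    "beau": ["nice", "beautiful", "handsome", "boyfriend"],
}


def word_by_word_candidates(sentence):
    """Enumerate every combination of per-word glosses, in source order."""
    options = [GLOSSES[word] for word in sentence.split()]
    return [" ".join(choice) for choice in product(*options)]


candidates = word_by_word_candidates("il fait vraiment beau")
print(len(candidates))  # 72, i.e. 2 * 3 * 3 * 4
print("he makes truly boyfriend" in candidates)  # True
```

Even this four-word sentence with a handful of glosses per word yields 72 literal candidates, and the count multiplies with every additional word.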
‣ Lots of data!
slide credit: Dan Klein
What do we need to understand language?
‣ World knowledge: have access to information beyond the training data
DOJ greenlights Disney - Fox merger
  ‣ DOJ: Department of Justice
  ‣ greenlights: metaphor; “approves”
‣ What is a green light? How do we understand what “green lighting” does?
What do we need to understand language?
‣ Grounding: learn what fundamental concepts actually mean in a data-driven way
McMahan and Stone (2015), Golland et al. (2010)
What do we need to understand language?
‣ Linguistic structure
‣ …but computers probably won’t understand language the same way humans do
‣ However, linguistics tells us what phenomena we need to be able to deal with and gives us hints about how language works
Centering Theory Grosz et al. (1995)
What techniques do we use? (to combine data, knowledge, linguistics, etc.)
A brief history of (modern) NLP
Timeline axis: 1980, 1990, 2000, 2010, 2020
‣ “AI winter”: rule-based, expert systems
‣ earliest stat MT work at IBM
‣ Penn treebank (parse trees: S → NP VP)
‣ Ratnaparkhi tagger (POS tags: NNP VBZ)
‣ Collins vs. Charniak parsers
‣ Unsup: topic models, grammar induction
‣ Sup: SVMs, CRFs, NER, Sentiment
‣ Semi-sup, structured prediction
‣ Neural
Structured Prediction
‣ All of these techniques are data-driven! Some data is naturally occurring, but you may need to label it
‣ Supervised techniques work well on very little data
  ‣ Figure labels: unsupervised learning; annotation (two hours!); better system!
  ‣ “Learning a Part-of-Speech Tagger from Two Hours of Annotation” Garrette and Baldridge (2013)
‣ Even neural nets can do pretty well!
Less Manual Structure?
Bahdanau et al. (2014), DeNero et al. (2008)
Does manual structure have a place?
‣ Neural nets don’t always work out of domain!
‣ Coreference: rule-based systems are still about as good as deep learning out-of-domain (Moosavi and Strube, 2017; figure compares Newswire and Wikipedia)
‣ LORELEI: there is a transition point (in amount of training data) below which phrase-based systems are better
‣ Why is this? Inductive bias!
‣ Can multi-task learning help?
Does manual structure have a place?
Trump Pope family watch a hundred years a year in the White House balcony
‣ Maybe manual structure would help…
Where are we?
‣ NLP consists of: analyzing and building representations for text, solving problems involving text
‣ These problems are hard because language is ambiguous; solving them requires drawing on data, knowledge, and linguistics
‣ Knowing which techniques to use requires understanding dataset size, problem complexity, and a lot of tricks!
‣ NLP encompasses all of these things
NLP vs. Computational Linguistics
‣ NLP: build systems that deal with language data
‣ CL: use computational tools to study language
Hamilton et al. (2016)
NLP vs. Computational Linguistics
‣ Computational tools for other purposes: literary theory, political science…
Bamman, O’Connor, Smith (2013)
Course Goals
‣ Cover fundamental machine learning techniques used in NLP
‣ Make you a “producer” rather than a “consumer” of NLP tools
‣ Cover modern NLP problems encountered in the literature: what are the active research topics in 2021?
‣ The three assignments should teach you what you need to know to understand nearly any system in the literature
‣ Understand how to look at language data and approach linguistic phenomena
Assignments
‣ 3 Homework Assignments
  ‣ Implementation-oriented
  ‣ ~2 weeks per assignment, 3 “slip days” for automatic extensions
These projects require understanding of the concepts, ability to write performant code, and ability to think about how to debug complex systems. They are challenging, so start early!
Final Project
‣ Final project (20%)
  ‣ Groups of 3-4 preferred, 1 is possible.
  ‣ Good idea to run your project idea by me in office hours or by email.
  ‣ 4 page report + final project presentation.
Gather.town Hangout