reuben-reilly
Statistical inference: n-gram models over sparse data
Statistical NLP function
• Taking some data (generated in accordance with some unknown probability distribution) and then making inferences about this distribution.
• Example: we might look at many prepositional phrase attachments in a corpus and use them to try to predict prepositional phrase attachments for English in general.
• We will examine the classic task of language modelling (aka Shannon game) where the problem is to predict the next word given the previous words.
• Importance: speech or optical character recognition, statistical machine translation (SMT), spelling correction, and handwriting recognition.
Uses
• Word sense disambiguation
• Probabilistic parsing
Building n-gram models
• http://svr-www.eng.cam.ac.uk/~prc14/toolkit.html
• Preprocess the corpus using ASCII files
• Check this: http://books.google.com/ngrams
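The counting step behind an n-gram model can be sketched in a few lines of Python. This is only an illustration, not the toolkit linked above: the function name `ngram_counts`, the toy sentence, and the whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous sequence of n tokens (hypothetical helper)."""
    return Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Toy corpus (an assumption for illustration), lowercased and split on spaces.
text = "the cat sat on the mat and the cat slept"
tokens = text.lower().split()

unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)

print(bigrams[("the", "cat")])
print(unigrams[("the",)])
```

Real corpora need more careful preprocessing (punctuation, sentence boundaries, casing) than a plain `split()`.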
MLE
• Example: "come across"
• 10 occurrences of "come across"
• 8 of which were followed by "as"
• Once by "more" and once by "a"
• $P_{\mathrm{MLE}}(w_n \mid w_1 \cdots w_{n-1}) = \dfrac{C(w_1 \cdots w_n)}{C(w_1 \cdots w_{n-1})}$