
Classification & Mallet Shallow Processing Techniques for NLP Ling570 November 14, 2011


  • Slide 1
  • Classification & Mallet Shallow Processing Techniques for NLP Ling570 November 14, 2011
  • Slide 2
  • Roadmap. Classification: Feature templates; Case Study Examples: Text Categorization, Coreference Resolution. Classification Systems Overview. Mallet.
  • Slide 3
  • Classification Problem Steps. Input processing: Split data into training/dev/test.
  • Slide 4
  • Classification Problem Steps. Input processing: Split data into training/dev/test; Convert data into an Attribute-Value Matrix (identify candidate features, perform feature selection, create AVM representation).
  • Slide 5
  • Classification Problem Steps. Input processing: Split data into training/dev/test; Convert data into an Attribute-Value Matrix (identify candidate features, perform feature selection, create AVM representation). Training.
  • Slide 6
  • Classification Problem Steps. Input processing: Split data into training/dev/test; Convert data into an Attribute-Value Matrix (identify candidate features, perform feature selection, create AVM representation). Training. Testing. Evaluation.
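The first step, splitting the data, can be sketched as follows. This is a minimal illustration, not the course's own procedure: the 80/10/10 proportions, the fixed random seed, and the function name `split_data` are all assumptions made for the example.

```python
import random

def split_data(instances, train_frac=0.8, dev_frac=0.1, seed=0):
    """Shuffle the instances and split them into train/dev/test lists.

    The fractions and the seed are illustrative assumptions; whatever
    remains after the train and dev portions becomes the test set.
    """
    items = list(instances)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = int(len(items) * train_frac)
    n_dev = int(len(items) * dev_frac)
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]
    return train, dev, test

train, dev, test = split_data(range(100))
print(len(train), len(dev), len(test))
```

Features would then be chosen and tuned on the training and dev portions only, with the test portion held back for the final evaluation step.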
  • Slide 7
  • Feature Template Example: Prevword (or w-1)
  • Slide 8
  • Feature Template Example: Prevword (or w-1). Template corresponds to many features, e.g. time flies like an arrow
  • Slide 9
  • Feature Template Example: Prevword (or w-1). Template corresponds to many features, e.g. time flies like an arrow: w-1=, w-1=time, w-1=flies, w-1=like, w-1=an
  • Slide 10
  • Feature Template Example: Prevword (or w-1). Template corresponds to many features, e.g. time flies like an arrow: w-1=, w-1=time, w-1=flies, w-1=like, w-1=an. Shorthand for: w-1= 0 or w-1=time 1
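How one template expands into many features can be sketched in a few lines. The sentence-start marker `<s>` and the helper names `prevword_features` and `binary_vector` are assumptions for illustration; the slides do not say what value the first word's feature takes.

```python
def prevword_features(tokens, start="<s>"):
    """Instantiate the Prevword template: one 'w-1=...' feature per token.

    `start` is a hypothetical marker for the position before the first word.
    """
    feats = []
    prev = start
    for tok in tokens:
        feats.append(f"w-1={prev}")
        prev = tok
    return feats

def binary_vector(active, feature_space):
    """Expand the shorthand: every feature in the space is 0 except the active one."""
    return {feat: int(feat == active) for feat in feature_space}

tokens = "time flies like an arrow".split()
feats = prevword_features(tokens)
print(feats)

# For the token "flies", only w-1=time fires; all other Prevword features are 0.
print(binary_vector("w-1=time", feats))
```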
  • Slide 11
  • AVM Example: Time flies like an arrow
    Note: this is a compact form of the true sparse vector, in which each feature w-1=w is 0 or 1, for every w in the vocabulary V
          w-1     w0      w-1 w0       w+1     label
    x1            Time                 flies   N
    x2    Time    flies   Time flies   like    V
    x3    flies   like    flies like   an      P
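The compact rows and their sparse expansion can be sketched like this. The boundary markers `<s>` and `</s>` and the helper names `build_avm` and `to_sparse` are assumptions for the example; only the three labelled instances from the slide are generated.

```python
def build_avm(tokens, labels, start="<s>", end="</s>"):
    """Build one compact attribute-value row per labelled token.

    `start` and `end` are hypothetical boundary markers. Rows are produced
    only for the tokens that have a label.
    """
    rows = []
    for i, label in enumerate(labels):
        prev = tokens[i - 1] if i > 0 else start
        nxt = tokens[i + 1] if i + 1 < len(tokens) else end
        rows.append({
            "w-1": prev,
            "w0": tokens[i],
            "w-1 w0": f"{prev} {tokens[i]}",
            "w+1": nxt,
            "label": label,
        })
    return rows

def to_sparse(row):
    """Expand a compact row into binary features; absent features are implicitly 0."""
    return {f"{attr}={val}": 1 for attr, val in row.items() if attr != "label"}

tokens = "Time flies like an arrow".split()
rows = build_avm(tokens, ["N", "V", "P"])  # labels for x1..x3 as on the slide
for row in rows:
    print(row)
print(to_sparse(rows[1]))
```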
  • Slide 12
  • Text Categorization Task: Given a document, assign it to one of a finite set of classes. What are the classes? What are the features?
  • Slide 13
  • Text 1 Several hundred protesters, some wearing goggles and gas masks, marched past authorities in a downtown street Sunday, hours after riot police forced Occupy Portland demonstrators out of a pair of weeks-old encampments in nearby parks. Police moved in shortly before noon and drove protesters into the street after dozens remained in the camp in defiance of city officials. Mayor Sam Adams had ordered that the camp shut down Saturday at midnight, citing unhealthy conditions and the encampments' attraction of drug users and thieves. Anti-Wall Street protesters and their supporters flooded a city park area in Portland early Sunday in defiance of an eviction order, and authorities elsewhere stepped up pressure against the demonstrators, arresting nearly two dozen. (Nov. 13) More than 50 protesters were arrested in the police action, but officers did not use tear gas, rubber bullets or other so-called non-lethal weapons, police said. Washington Post, online 11/13/2011
  • Slide 14
  • Text 2 George Washington coach Mike Lonergan looked at the stat sheet, tried to muster a smile then clicked off the reasons why the Colonials lost to No. 24 California on Sunday night. A piercing 21-0 run by the Golden Bears at the end of the first half was at the top of the list. Not even a second straight 20-point effort from Tony Taylor was enough to dig George Washington out of the early hole, and the Colonials spent the rest of the night in a futile game of catch-up. "I've never really been involved with a run quite like that," Lonergan said after Cal's 81-54 win over George Washington. "I tried calling a couple timeouts. It was very disappointing that we just never really got our composure back the rest of that half. To end it that way and not even score any points, that was basically the game right there." Washington Post, online 11/13/2011
  • Slide 15
  • Text 3 Jersey Boys at the National Theatre By Jane Horwitz, Sunday, November 13, 5:29 PM Jersey Boys is irresistible, and the touring company now at the National Theatre gets it almost entirely right. This Broadway hit (it has been running since fall 2005 and has played Washington before as well) rises well above the so-called jukebox show genre. Subtitled The Story of Frankie Valli & the Four Seasons, the musical tells a tale that transcends show business gossip to become a close character study of four talented but very different blue-collar guys from New Jersey who just happen to have sung some of the best close-harmony rock/pop tunes of the late 1950s, the 1960s and into the 1970s. Washington Post, online 11/13/2011
  • Slide 16
  • What categories? What features?
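One possible answer to the feature question is a bag of words: each distinct word in the document becomes a feature whose value is its count. The slides leave both questions open, so this is only an illustration; the tokenizing regular expression and the name `bag_of_words` are assumptions.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count each word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# A fragment of Text 1 from the slides.
doc = ("Police moved in shortly before noon and drove protesters "
       "into the street after dozens remained in the camp")
feats = bag_of_words(doc)
print(feats.most_common(3))
```

Words such as "protesters" or "police" would plausibly point toward a news category, while "coach" or "timeouts" in Text 2 would point toward sports; the category inventory itself is a design choice the slides leave to the reader.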
  • Slide 17
  • Example: Coreference Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. Logue, a renowned speech therapist, was summoned to help the King overcome his speech impediment...
  • Slide 19
  • Example: Coreference Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. Logue, a renowned speech therapist, was summoned to help the King overcome his speech impediment... Can be viewed as a classification problem
  • Slide 20
  • Example: Coreference Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. Logue, a renowned speech therapist, was summoned to help the King overcome his speech impediment... Can be viewed as a classification problem What are the inputs?
  • Slide 21
  • Example: Coreference Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. Logue, a renowned speech therapist, was summoned to help the King overcome his speech impediment... Can be viewed as a classification problem What are the inputs? What are the categories?
  • Slide 22
  • Example: Coreference Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. Logue, a renowned speech therapist, was summoned to help the King overcome his speech impediment... Can be viewed as a classification problem What are the inputs? What are the categories? What features would be useful?
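One common way to cast coreference as classification, offered here only as an assumption since the slides pose the questions without answering them, is the mention-pair framing: each input is a pair of mentions, the categories are coreferent or not coreferent, and features describe the pair. The mention list, the pronoun set, and the helper `pair_features` below are hypothetical.

```python
from itertools import combinations

# Mentions from the passage, in textual order (hand-picked for illustration).
mentions = ["Queen Elizabeth", "her", "King George VI", "Logue", "the King", "his"]

PRONOUNS = {"he", "she", "his", "her", "him", "it", "they"}

def pair_features(antecedent, anaphor):
    """Simple illustrative features for one (earlier mention, later mention) pair."""
    a_words = set(antecedent.lower().split())
    b_words = set(anaphor.lower().split())
    return {
        "exact_match": antecedent.lower() == anaphor.lower(),
        "word_overlap": len(a_words & b_words),
        "anaphor_is_pronoun": anaphor.lower() in PRONOUNS,
    }

# Every earlier/later pair is one classification instance.
pairs = list(combinations(mentions, 2))
print(len(pairs))
print(pair_features("King George VI", "the King"))
```

A classifier trained on such pairs would output coreferent or not for each one; linking the pairwise decisions into entity chains is a separate step not shown here.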
  • Slide 23
  • Example: NER. Named Entity tagging: John visited New York last Friday → [person John] visited [location New York] [time last Friday]. As a classification problem: John/PER-B visited/O New/LOC-B York/LOC-I last/TIME-B Friday/TIME-I. Input? Features? Classes?
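The conversion from bracketed entities to per-token classes can be sketched as below, using the slide's own TYPE-B / TYPE-I / O tag names. The span representation (start index, exclusive end index, type) and the function name `spans_to_tags` are assumptions for the example.

```python
def spans_to_tags(tokens, spans):
    """Turn entity spans into one class label per token.

    Each span is (start, end_exclusive, entity_type). The first token of a
    span gets TYPE-B, the rest get TYPE-I, and everything else gets O.
    """
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        tags[start] = f"{etype}-B"
        for i in range(start + 1, end):
            tags[i] = f"{etype}-I"
    return [f"{tok}/{tag}" for tok, tag in zip(tokens, tags)]

tokens = "John visited New York last Friday".split()
spans = [(0, 1, "PER"), (2, 4, "LOC"), (4, 6, "TIME")]
tagged = spans_to_tags(tokens, spans)
print(" ".join(tagged))
```

Under this encoding the input to the classifier is a single token in context, and the class inventory is one B and one I tag per entity type plus O.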
  • Slide 24
  • Classifiers & Systems
  • Slide 25
  • Classifiers: Wide variety. Differ on several dimensions: Supervision, Learning Function, Input Features
  • Slide 26
  • Supervision in Classifiers. Supervised: True label/class of each training instance is provided to the learner at training time (Naïve Bayes, MaxEnt, Decision Trees, Neural nets, etc.)
  • Slide 27
  • Supervision in Classifiers. Supervised: True label/class of each training instance is provided to the learner at training time (Naïve Bayes, MaxEnt, Decision Trees, Neural nets, etc.). Unsupervised: No true labels are provided for examples during training (Clustering: k-means; Min-cut algorithms)
  • Slide 28
  • Supervision in Classifiers. Supervised: True label/class of each training instance is provided to the learner at training time (Naïve Bayes, MaxEnt, Decision Trees, Neural nets, etc.). Unsupervised: No true labels are provided for examples during training (Clustering: k-means; Min-cut algorithms). Semi-supervised (bootstrapping): True labels are provided for only a subset of examples (Co-training, semi-supervised SVM/CRF, etc.)
  • Slide 29
  • Inductive Bias: What form of function is learned? Function that separates members of different classes: Linear separator, Higher order functions, Voronoi diagrams, etc.
  • Slide 30
  • Inductive Bias: What form of function is learned? Function that separates members of different classes: Linear separator, Higher order functions, Voronoi diagrams, etc. Graphically, decision boundary + + + - - -
  • Slide 31
  • Machine Learning Functions Problem: Can the representation effectively model the class to be learned?
  • Slide 32
  • Machine Learning Functions Problem: Can the representation effectively model the class to be learned? Motivates selection of learning algorithm ++ - - -
  • Slide 33
  • Machine Learning Functions Problem: Can the representation effectively model the class to be learned? Motivates selection of learning algorithm ++ - - - For this function, Linear discriminant is GREAT!
  • Slide 34
  • Machine Learning Functions Problem: Can the representation effectively model the class to be learned? Motivates selection of learning algorithm ++ - - - For this function, Linear discriminant is GREAT! Rectangular boundaries (e.g. ID trees) TERRIBLE!
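The contrast can be made concrete with a toy experiment. Everything here is an assumption for illustration: the data are a 5x5 integer grid labelled positive when x + y > 4 (a diagonal boundary), the linear discriminant is a plain perceptron, and a single axis-aligned threshold stands in for rectangular boundaries. A deeper tree could approximate the diagonal with a staircase of many splits, which is the inefficiency the slide points at.

```python
from itertools import product

# Toy data (an assumption): positive when x + y > 4, so the true boundary is diagonal.
data = [((x, y), 1 if x + y > 4 else -1) for x, y in product(range(5), repeat=2)]

def train_perceptron(data, max_epochs=5000):
    """Learn weights [wx, wy, bias] with the standard perceptron update."""
    w = [0, 0, 0]
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in data:
            score = w[0] * x + w[1] * y + w[2]
            if label * score <= 0:  # misclassified or on the boundary
                w[0] += label * x
                w[1] += label * y
                w[2] += label
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: converged
            break
    return w

def linear_accuracy(w, data):
    """Fraction of points strictly on the correct side of the learned line."""
    correct = sum(1 for (x, y), label in data
                  if label * (w[0] * x + w[1] * y + w[2]) > 0)
    return correct / len(data)

def best_stump_accuracy(data):
    """Best accuracy of any single axis-aligned threshold (either axis, either polarity)."""
    best = 0.0
    for axis in (0, 1):
        for threshold in range(5):
            for polarity in (1, -1):
                correct = sum(
                    1 for point, label in data
                    if (polarity if point[axis] > threshold else -polarity) == label
                )
                best = max(best, correct / len(data))
    return best

weights = train_perceptron(data)
print("linear:", linear_accuracy(weights, data))
print("single axis-aligned split:", best_stump_accuracy(data))
```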
  • Slide 35
  • Machine Learning Functions Problem: Can the representation effectively model the class to
