Parsing Discourse Relations
Giuseppe Riccardi Signals and Interactive Systems Lab
University of Trento, Italy
Behavioral Analytics
Parser Run on Genia Corpus
Among 25 cases, 2 homozygous deletions and 1 hemizygous deletion were found in HCC samples. No point mutation was identified in the remaining 22 tumor samples without p16 gene deletions. Hypermethylation was detected in 24% (6/25) of tumor samples. However, the corresponding non-tumor liver tissue specimens were always unmethylated at the p16 locus. Loss of p16 protein expression occurred in 16 of 35 (45.7%) tumor samples, and all the non-tumor liver tissue specimens showed positive p16 staining. For the 25 cases examined for p16 gene alterations, the loss of p16 protein expression was observed in all tumors with p16 gene alterations and also in 3 tumors without p16 gene alterations. (Source: Genia corpus)
Parser Output:
§ Hypermethylation was detected in 24% (6/25) of tumor samples However(Comparison) the corresponding non-tumor liver tissue specimens were always unmethylated at the p16 locus
§ Loss of p16 protein expression occurred in 16 of 35 (45.7%) tumor samples and(Expansion) all the non-tumor liver tissue specimens showed positive p16 staining
Social Media User Opinions: Negative
The acting is below average, even from the likes of Curtis. You're more likely to get a kick out of her work in Halloween H20. Sutherland is wasted and Baldwin, well, he's acting like a Baldwin, of course. The real star here are Stan Winston's robot design, some schnazzy CGI, and the occasional good gore shot, like picking into someone's brain. So, if robots and body parts really turn you on, here's your movie. Otherwise, it's pretty much a sunken ship of a movie.
5/1/12
Social Media User Opinions: Positive
From here on, the plot takes a back seat, and we are treated to some of the best camera work and action staged. Most all the action is plausible and will hold you at the edge of your seat. There are a few melodramatic parts here, but, they tend to work out well. There is no general antagonist in this film, but the action and suspense makes you forget all about that. Daylight is a great film, I saw a non-matinee showing of it, and I thought it was worth every penny. The characterizations are mostly flat, one dimesional, but they have enough in them to get you to care for some of the characters. Rob Cohen (Dragonheart) does a great job with this film.
Discourse Relation Parsing
Giuseppe Riccardi
• Joint work with
– Sucheta Ghosh, U. Trento
– Richard Johansson, U. Trento/U. Gothenburg
– Sara Tonelli, FBK-Irst
Ghosh S., Tonelli S., Riccardi G. and Johansson R., “End-to-End Discourse Parser Evaluation”, IEEE International Conference on Semantic Computing, Menlo Park, USA, 2011
Ghosh S., Johansson R., Riccardi G. and Tonelli S., “Shallow Discourse Parsing with Conditional Random Fields”, International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, 2011
Ghosh S., Johansson R., Riccardi G. and Tonelli S., “Improving Recall Through Global Constraint Selection”, to appear in LREC, 2012
Discourse Parser
– From raw text extract:
– Discourse relations:
• Discourse Predicate (Connective)
• Connective sense
• Arg1
• Arg2
– Explicit Connective
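One way to hold the four extracted elements is a small record type. The following is a sketch in Python, assuming the arguments are kept as plain strings (a real parser would more likely store token spans). The class and field names are illustrative and are not taken from the parser. The example values come from the Genia parser output shown earlier, under the PDTB convention that Arg2 is the argument attached to the connective.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscourseRelation:
    """Illustrative container for one explicit discourse relation."""
    connective: str        # discourse predicate, e.g. "However"
    sense: str             # connective sense, e.g. "Comparison"
    arg1: str              # text of the first argument
    arg2: str              # text of the argument attached to the connective
    explicit: bool = True  # only explicit connectives are handled here

    def render(self) -> str:
        # Mimic the "connective(sense)" notation of the parser output.
        return f"{self.arg1} {self.connective}({self.sense}) {self.arg2}"


relation = DiscourseRelation(
    connective="However",
    sense="Comparison",
    arg1="Hypermethylation was detected in 24% (6/25) of tumor samples",
    arg2=("the corresponding non-tumor liver tissue specimens were "
          "always unmethylated at the p16 locus"),
)
print(relation.render())
```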
Parsing Architecture
Parser end2end Architecture (components shown in the diagram):
• Doc
• Parser: Stanford (K&M), producing Parse_Tree
• Chunklink: by Sabine Buchholz; CoNLL’00 task
• AddDiscourse: Pitler & Nenkova ’09; connective sense detection
• RootExtract + Morpha: morphology and all features; Johansson + Minnen et al.
• Windowing (-2,+2); Arg2; Arg1
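The Windowing(-2,+2) step can be read as limiting the argument search to a few sentences around the connective. A minimal Python sketch follows, assuming the window is counted in whole sentences relative to the sentence that contains the connective (the slides do not spell this out). `sentence_window` and its parameters are hypothetical names.

```python
def sentence_window(sentences, conn_idx, before=2, after=2):
    """Return the sentences within the window around the connective sentence.

    sentences: list of sentences in document order
    conn_idx:  index of the sentence that contains the connective
    The window is clipped at the document boundaries.
    """
    start = max(0, conn_idx - before)
    end = min(len(sentences), conn_idx + after + 1)
    return sentences[start:end]


doc = ["s0", "s1", "s2", "s3", "s4", "s5", "s6"]
print(sentence_window(doc, 3))  # two sentences on each side of s3
print(sentence_window(doc, 0))  # window clipped at the document start
```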
Features: Example
Selected Features: Arg1
Features used for Arg1 and Arg2 segmentation and labeling.
F1. Token (T)
F2. Sense of Connective (CONN)
F3. IOB chain (IOB)
F4. PoS tag
F5. Lemma (L)
F6. Inflection (INFL)
F7. Main verb of main clause (MV)
F8. Boolean feature for MV (BMV)
Additional features used only for Arg1:
F9. Previous Sentence (PREV)
F10. Arg2 Labels
Inter vs Intra Sentence Arguments
This film should be brilliant. However, it can’t hold up.
Illustration: PREV Feature
PREV values for the example, for the tokens of the first sentence and then of the connective sentence:
However However However However However 0 0 0 0 0
[Figure: bar chart of P, R and F1 on a 0 to 0.9 axis; legend entries: Intra+Prev, Inter+Prev, -PREV, +PREV. Plotted values: P 0.77, R 0.61, F1 0.68; P 0.52, R 0.27, F1 0.36.]
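The PREV illustration can be sketched in a few lines of Python. This assumes that PREV copies the sentence-initial connective of the following sentence onto every token of the preceding sentence and is 0 elsewhere, which is how the example values read. The function name, the connective lexicon and the punctuation-free tokenization are assumptions made for illustration.

```python
def prev_feature(prev_tokens, conn_tokens, connectives):
    """Assign a PREV value to each token of two adjacent sentences.

    prev_tokens: tokens of the sentence before the connective sentence
    conn_tokens: tokens of the sentence that may start with a connective
    connectives: lower-cased connective lexicon (hypothetical)
    """
    first = conn_tokens[0] if conn_tokens else ""
    value = first if first.lower() in connectives else "0"
    # Tokens of the previous sentence carry the connective; the rest get 0.
    return [value] * len(prev_tokens) + ["0"] * len(conn_tokens)


prev_tokens = ["This", "film", "should", "be", "brilliant"]
conn_tokens = ["However", "it", "can't", "hold", "up"]
values = prev_feature(prev_tokens, conn_tokens, {"however"})
print(" ".join(values))
```

With a sentence that does not start with a connective, every token receives 0.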
Selected Features: Arg2
Features used for Arg1 and Arg2 segmentation and labeling.
F1. Token (T)
F2. Sense of Connective (CONN)
F3. IOB chain (IOB)
Ghosh S., Johansson R., Riccardi G. and Tonelli S., “Shallow Discourse Parsing with Conditional Random Fields”, International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, 2011
Parser Evaluation
Lightweight Features
- Reduce dimensionality of IOB chain features
- Control robustness of parser (with respect to the syntactic parse)
- Binary features selected from IOB chain
Lightweight Features
The IOB chain feature is replaced by two pairs of Boolean features:
(1) Whether the second top parent node is starting (B) or not
(2) Whether the third top parent node is starting (B) or not
(3) Whether the second top parent node is ending (E) or not
(4) Whether the third top parent node is ending (E) or not
Example: in the tree diagram, the IOB chain feature for the token “flashed” is I-S/E-VP/E-SBAR/E-S/C-VP. The replacing Boolean features for “flashed” are, respectively:
(1) 0 (← E-VP)
(2) 0 (← E-SBAR)
(3) 1 (← E-VP)
(4) 1 (← E-SBAR)
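The replacement of the IOB chain by four Boolean features can be sketched as follows. This is an illustration in Python, assuming the chain is written from the top of the tree downwards and separated by "/", so that the second and third top parent nodes are the second and third elements. The function name is hypothetical.

```python
def lightweight_features(iob_chain):
    """Map an IOB chain to the four Boolean lightweight features.

    Returns (second starts B, third starts B, second ends E, third ends E)
    as 0/1 integers. Missing nodes in short chains count as 0.
    """
    nodes = iob_chain.split("/")

    def tag(index):
        # "E-SBAR" -> "E"; empty string if the chain is too short.
        return nodes[index].split("-", 1)[0] if index < len(nodes) else ""

    second, third = tag(1), tag(2)
    return (int(second == "B"), int(third == "B"),
            int(second == "E"), int(third == "E"))


features = lightweight_features("I-S/E-VP/E-SBAR/E-S/C-VP")
print(features)
```

For the chain of “flashed” this reproduces the values listed on the slide.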
Parser Evaluation: Arg2 Exact Match
                     P     R     F1
Baseline             0.53  0.46  0.49
Gold-Standard        0.84  0.74  0.79
Gold-Lightweight     0.80  0.74  0.77
AutoConn+GoldSPT     0.82  0.70  0.76
GoldConn+AutoSPT     0.76  0.61  0.68
Lightweight(Auto)    0.72  0.56  0.63
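As a consistency check on the table, F1 is the harmonic mean of P and R. The Python sketch below recomputes it from the reported precision and recall and compares it with the reported F1. The helper name `f1` is hypothetical, and the numbers are copied from the table above.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) for Arg2 exact match, as listed in the table.
rows = {
    "Baseline": (0.53, 0.46, 0.49),
    "Gold-Standard": (0.84, 0.74, 0.79),
    "Gold-Lightweight": (0.80, 0.74, 0.77),
    "AutoConn+GoldSPT": (0.82, 0.70, 0.76),
    "GoldConn+AutoSPT": (0.76, 0.61, 0.68),
    "Lightweight(Auto)": (0.72, 0.56, 0.63),
}

for name, (p, r, reported) in rows.items():
    print(f"{name:18s} recomputed F1 = {f1(p, r):.2f} (reported {reported:.2f})")
```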
N-Best Parse Re-ranking
• Online Passive-Aggressive Perceptron
• Structured Voted Perceptron
• Linear Preference Learning Support Vector Machine
• Linear Best vs. Rest Support Vector Machine
N-Best ReRanking with Global Constraints
• GF0. Log Posteriors
• GF1. Overgeneration
• GF2. Undergeneration
• GF3. Intersentential Arg2
• GF4. Arg1 after the connective sentence
• GF5. Argument overlapping with the connective
• GF6. Argument begins with I- tag
• GF7. Argument begins with E- tag
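The idea of re-ranking an n-best list with global features can be sketched as a linear scorer: each hypothesis gets a weighted sum of its global feature values, and the highest-scoring one is kept. This is a simplified Python illustration and not any of the four learners listed on the earlier slide. The weights, the hypothesis format and the feature values are hypothetical; in practice the weights would be learned.

```python
def rerank(nbest, weights):
    """Return the hypothesis with the highest weighted feature sum."""
    def score(hyp):
        return sum(weights.get(name, 0.0) * value
                   for name, value in hyp["features"].items())
    return max(nbest, key=score)


# Hypothetical weights: reward the log posterior (GF0) and penalize
# arguments that overlap with the connective (GF5).
weights = {"GF0": 1.0, "GF5": -2.0}

nbest = [
    # Most probable labeling, but its argument overlaps the connective.
    {"id": "hyp1", "features": {"GF0": -0.2, "GF5": 1.0}},
    # Slightly less probable labeling without a constraint violation.
    {"id": "hyp2", "features": {"GF0": -0.5, "GF5": 0.0}},
]

best = rerank(nbest, weights)
print(best["id"])
```

With these illustrative weights the penalty outweighs the difference in log posterior, so the second hypothesis is selected.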
N-Best ReRanking with Global Constraints
Exact match scores. The size of the n-best list used is given in parentheses.

              Arg1                       Arg2
              P      R      F1           P      R      F1
Baseline      69.88  48.51  57.26        83.44  75.14  79.07
Online PA     66.10  53.92  59.39 (16)   82.59  76.39  79.37 (4)
Struct Per    67.18  52.64  59.03 (4)    82.96  76.28  79.48 (8)
Best vs Rest  66.19  52.83  58.94 (8)    81.69  77.14  79.35 (4)
Pref-Linear   66.54  53.31  59.20 (4)    82.82  76.28  79.42 (4)
Research Challenges
§ Speech, Dialog and Discourse
§ Speech signal vs linguistic correlates: “Eat your porridge! You’re not going to football practice”
§ Parser
§ Trade-off between coverage and agreement
§ Robustness of features
§ Semantic Annotation
§ Domain/Genre Adaptation
Research Challenges
§ Speech, Dialog and Discourse
§ Acoustics vs lexical correlates: “Eat your porridge! You’re not going to football practice”
§ Parser
§ Trade-off amongst sense-depth, coverage, agreement
§ Robustness of features
§ Semantic Annotation
§ Domain/Genre Adaptation
Publications
Speech (LUNA Corpus)
• Tonelli S., Riccardi G., Prasad R. and Joshi A., “Annotation of Discourse Relations for Conversational Spoken Dialogs”, LREC, Valletta, 2010.
Text (PDTB corpus)
• Ghosh S., Johansson R., Riccardi G. and Tonelli S., “Shallow Discourse Parsing with Conditional Random Fields”, International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, 2011
• Ghosh S., Tonelli S., Riccardi G. and Johansson R., “End-to-End Discourse Parser Evaluation”, IEEE International Conference on Semantic Computing, Menlo Park, USA, 2011