SIGIR, August 2005, Salvador, Brazil. On the Collective Classification of Email “Speech Acts”. Vitor R. Carvalho & William W. Cohen, Carnegie Mellon University. Outline: Email “Speech Acts” and Applications; Sequential Nature of Negotiations; Collective Classification and Results.
SIGIR, August 2005, Salvador, Brazil
On the Collective Classification of Email “Speech Acts”
Vitor R. Carvalho & William W. Cohen, Carnegie Mellon University
Outline
1. Email “Speech Acts” and Applications
2. Sequential Nature of Negotiations
3. Collective Classification and Results
Classifying Email into Acts [Cohen, Carvalho & Mitchell, EMNLP-04]
[Figure: taxonomy of email acts]
Verbs: Commissive (Deliver, Commit); Directive (Request, Propose, Amend)
Nouns: Activity (Ongoing, Event, Meeting, Other); Delivery (Opinion, Data)
• An Act is a verb-noun pair (e.g., propose meeting).
• One single email message may contain multiple acts. Not all pairs make sense.
• Try to describe commonly observed behaviors, rather than all possible speech acts.
• Also include non-linguistic usage of email (delivery of files).
• Most of the acts can be learned (EMNLP-04).
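The verb-noun representation above can be sketched as a small data structure. This is only an illustration: the class names `Act` and `Message`, the field names, and the example message are assumptions, and the sketch does not model which verb-noun pairs "make sense".

```python
from dataclasses import dataclass, field

# Verb and noun names as they appear on the slide (assumed to be the full vocabulary).
VERBS = {"Deliver", "Commit", "Request", "Propose", "Amend"}
NOUNS = {"Activity", "Ongoing", "Event", "Meeting", "Other",
         "Delivery", "Opinion", "Data"}


@dataclass(frozen=True)
class Act:
    """One email act: a verb-noun pair such as (Propose, Meeting)."""
    verb: str
    noun: str

    def __post_init__(self):
        # Reject names outside the vocabulary; pair validity is not checked.
        if self.verb not in VERBS:
            raise ValueError(f"unknown verb: {self.verb}")
        if self.noun not in NOUNS:
            raise ValueError(f"unknown noun: {self.noun}")


@dataclass
class Message:
    """Hypothetical message record; one message may carry several acts."""
    msg_id: str
    text: str
    acts: set = field(default_factory=set)


# Made-up example: a single message that both proposes a meeting and delivers data.
msg = Message("m1", "Can we meet on Tuesday? The draft is attached.")
msg.acts.add(Act("Propose", "Meeting"))
msg.acts.add(Act("Deliver", "Data"))
```

Storing acts as a set reflects that a message can contain multiple acts and that each act is either present or absent.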
Email Acts - Applications
• Email overload – improved email clients.
• Negotiating/managing shared tasks is a central use of email.
• Tracking commitments, delegations, pending answers; integrating to-do/task lists to email, etc.
• Iterative Learning of Email Tasks and Speech Acts [Kushmerick & Khoussainov, 2005]
• Predicting Social Roles and Group Leadership. [Leuski, 2004] [Carvalho et al., in progress]
Idea: Predicting Acts from Surrounding Acts
[Figure: Example of Email Thread Sequence. Messages are linked by In-Reply-To and labeled with acts: Delivery, Request, Commit, Proposal, Request, Commit, Delivery, Commit, Delivery]
• Act has little or no correlation with other acts of same message
• Strong correlation with previous and next message’s acts
Related work on the Sequential Nature of Negotiations
• [Winograd and Flores, 1986] “Conversation for Action Structure”
• [Murakoshi et al., 1999] “Construction of Deliberation Structure in Email”
• [Kushmerick & Lau, 2005] “Learning the structure of interactions between buyers and e-commerce vendors”
Data: CSPACE Corpus
• Few large, free, natural email corpora are available
• CSPACE corpus (Kraut & Fussell)
o Emails associated with a semester-long project for Carnegie Mellon MBA students in 1997
o 15,000 messages from 277 students, divided into 50 teams (4 to 6 students/team)
o Rich in task negotiation.
o 1500+ messages (4 teams) had their “Speech Acts” labeled.
o One of the teams was double labeled, and the inter-annotator agreement ranges from 72 to 83% (Kappa) for the most frequent acts.
Evidence of Sequential Correlation of Acts
[Figure: Transition diagram for most common verbs from CSPACE corpus]
• It is NOT a Probabilistic DFA
• Act sequence patterns: (Request, Deliver+), (Propose, Commit+, Deliver+), (Propose, Deliver+); most common act was Deliver
• Less regularity than expected (considering previous deterministic negotiation state diagrams)
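A transition diagram of this kind can be estimated by counting which verbs in a reply follow which verbs in the message it replies to. The sketch below shows one way to do that. The function names and the four toy reply links are invented for illustration and are not CSPACE statistics.

```python
from collections import Counter, defaultdict


def transition_counts(links):
    """Count parent-verb -> child-verb transitions.

    links: iterable of (parent_verbs, child_verbs) pairs, one pair per
    reply link, where each element is a set of verb labels.
    """
    counts = defaultdict(Counter)
    for parent_verbs, child_verbs in links:
        for p in parent_verbs:
            for c in child_verbs:
                counts[p][c] += 1
    return counts


def transition_frequencies(counts):
    """Normalize each parent verb's counts into relative frequencies."""
    freqs = {}
    for p, row in counts.items():
        total = sum(row.values())
        freqs[p] = {c: n / total for c, n in row.items()}
    return freqs


# Toy reply links (made up): each pair is (acts of parent, acts of reply).
toy_links = [
    ({"Request"}, {"Deliver"}),
    ({"Request"}, {"Deliver"}),
    ({"Request"}, {"Commit"}),
    ({"Propose"}, {"Commit"}),
]
freqs = transition_frequencies(transition_counts(toy_links))
```

Because a message can carry several acts, the rows describe co-occurrence across a reply link rather than a single-state automaton. That is one way to read the remark that the diagram is not a probabilistic DFA.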
Content versus Context
• Content: Bag of Words features only
• Context: Parent and Child Features only (table below)
• 8 MaxEnt classifiers, trained on 3F2 and tested on 1F3 team dataset
• Only 1st child message was considered (vast majority – more than 95%)
[Figure: Kappa Values on 1F3 using Relational (Context) features and Textual (Content) features. Bar chart with one pair of bars (Context, Content) for each of Request, Deliver, Commit, Propose, Directive, Commissive, Meeting, dData; axis labeled “Kappa Values (%)”, ticks from 0 to 0.5.]
Set of Context Features (Relational)
Parent Boolean Features: Parent_Request, Parent_Deliver, Parent_Commit, Parent_Propose, Parent_Directive, Parent_Commissive, Parent_Meeting, Parent_dData
Child Boolean Features: Child_Request, Child_Deliver, Child_Commit, Child_Propose, Child_Directive, Child_Commissive, Child_Meeting, Child_dData
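The 16 Boolean context features listed above could be built from the act labels of a message's parent and first child roughly as follows. This is a minimal sketch: the function name and the treatment of a missing parent or child (all features false) are assumptions.

```python
# The eight acts used in the feature names on the slide.
ACTS = ["Request", "Deliver", "Commit", "Propose",
        "Directive", "Commissive", "Meeting", "dData"]


def context_features(parent_acts=None, child_acts=None):
    """Return the Parent_* and Child_* Boolean features for one message.

    parent_acts / child_acts: sets of act names for the parent message and
    the first child message, or None when the message has no such neighbor
    (assumed here to yield all-False features).
    """
    parent_acts = parent_acts or set()
    child_acts = child_acts or set()
    feats = {}
    for act in ACTS:
        feats[f"Parent_{act}"] = act in parent_acts
        feats[f"Child_{act}"] = act in child_acts
    return feats


# Example: the parent is a Request (a Directive); the message has no child.
example = context_features(parent_acts={"Request", "Directive"})
```

These features depend on the labels of neighboring messages, which are unknown at test time. That is why a collective procedure is needed.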
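[Figure: thread of messages labeled Delivery, Request, Commit, Proposal, Request, with the acts of one message unknown (???); its parent message and child message are marked]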
Dependency Network
• Dependency networks are probabilistic graphical models in which the full joint distribution of the network is approximated with a set of conditional distributions that can be learned independently. The conditional probability distributions in a DN are calculated for each node given its neighboring nodes (its Markov blanket).
• Approx inference (Gibbs sampling)
• Markov blanket = parent message and child message
• Heckerman et al., JMLR-2000. Neville & Jensen, KDD-MRDM-2003.
$$\Pr(X) \approx \prod_i \Pr\bigl(X_i \mid \mathrm{Neighbors}(X_i)\bigr)$$
[Figure: the Current Message node linked to its Parent Message and Child Message; each message carries act variables (Request, Commit, Deliver, …)]
Collective Classification Procedure
(based on Dependency Networks Model)
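The steps of the procedure are not preserved in this text, so the following is only a minimal sketch of one Gibbs-style collective classification loop consistent with the dependency network description. It is not necessarily the authors' exact procedure. It makes three assumptions: labels start from a content-only classifier, each act is resampled given the current labels of the parent and child messages, and the final label is a majority vote over sweeps. The callables `content_prob` and `context_prob` are hypothetical stand-ins for trained classifiers, and the toy probabilities are set to extreme values so the example is deterministic.

```python
import random

ACTS = ["Request", "Deliver", "Commit", "Propose"]


def collective_classify(messages, parent_of, child_of,
                        content_prob, context_prob, n_iters=50, seed=0):
    """Gibbs-style relabeling over a thread (illustrative sketch).

    content_prob(act, msg) -> P(act | content only)
    context_prob(act, msg, parent_acts, child_acts)
        -> P(act | content and current neighbor labels)
    """
    rng = random.Random(seed)
    # Step 1: initialize every message from the content-only classifier.
    labels = {m: {a for a in ACTS if content_prob(a, m) >= 0.5}
              for m in messages}
    votes = {(m, a): 0 for m in messages for a in ACTS}
    # Step 2: repeatedly resample each act given the neighbors' current labels.
    for _ in range(n_iters):
        for m in messages:
            parent_acts = labels.get(parent_of.get(m), set())
            child_acts = labels.get(child_of.get(m), set())
            new_acts = set()
            for a in ACTS:
                if rng.random() < context_prob(a, m, parent_acts, child_acts):
                    new_acts.add(a)
                    votes[(m, a)] += 1
            labels[m] = new_acts
    # Step 3: keep an act if it was sampled in more than half of the sweeps.
    return {m: {a for a in ACTS if 2 * votes[(m, a)] > n_iters}
            for m in messages}


# Toy thread: m2 replies to m1. All probabilities are made up.
CONTENT = {("Request", "m1"): 1.0, ("Deliver", "m2"): 0.4}


def toy_content_prob(act, msg):
    return CONTENT.get((act, msg), 0.0)


def toy_context_prob(act, msg, parent_acts, child_acts):
    # A reply to a Request is taken to be a Deliver with certainty.
    if act == "Deliver" and "Request" in parent_acts:
        return 1.0
    return toy_content_prob(act, msg)


baseline = {m: {a for a in ACTS if toy_content_prob(a, m) >= 0.5}
            for m in ["m1", "m2"]}
result = collective_classify(["m1", "m2"], {"m2": "m1"}, {"m1": "m2"},
                             toy_content_prob, toy_context_prob)
```

In the toy thread the content-only baseline leaves `m2` unlabeled, while the collective pass labels it Deliver because its parent is a Request.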
Improvement over Content-only baseline
[Figure: Kappa (y axis, 0.25 to 0.55) versus Iteration (x axis, 0 to 50) for Deliver, Commissive and Request]
• Kappa often improves after iteration
• Kappa unchanged for “deliver”
Leave-one-team-out Experiments
• 4 teams: 1f3 (170 msgs), 2f2 (137 msgs), 3f2 (249 msgs), 4f4 (165 msgs)
• (x axis) = Bag-of-words only
• (y axis) = Collective classification results
• Different teams present different styles for negotiations and task delegation.
[Figure: scatter plot of Kappa Values, both axes from 0 to 80; one series per team (4f4, 1f3, 3f2, 2f2) plus a Reference line]
Leave-one-team-out Experiments
• Consistent improvement of Commissive, Commit and Meet acts
[Figure: scatter plot of Kappa Values, both axes from 0 to 70; series: Commiss/Commit/Meet, Direct/dData/Request, Proposal/Delivery, plus a Reference line]
Leave-one-team-out Experiments
• Deliver and dData performance usually decreases
• Associated with data distribution, FYI, file sharing, etc.
• For “non-delivery”, improvement in avg. Kappa is statistically significant (p=0.01 on a two-tailed T-test)
[Figure: scatter plot of Kappa Values, both axes from 0 to 80; series: Non-delivery, Deliver/dData, plus a Reference line]
Act by Act Comparative Results
Kappa Values (%)
Act          Baseline   Collective
Commissive   37.66      42.55
Commit       30.74      32.77
Meeting      47.81      52.42
Directive    58.27      58.37
Request      47.25      49.55
Propose      36.84      40.72
Deliver      42.01      38.69
dData        44.98      43.44
Kappa values with and without collective classification, averaged over the four test sets in the leave-one-team-out experiment.
Conclusion
• Sequential patterns of email acts were studied in the CSPACE corpus. Less regularity than expected.
• We proposed a collective classification procedure for Email Speech Acts based on a Dependency Net model.
• Modest improvements over the baseline on acts related to negotiation (Request, Commit, Propose, Meet, etc.). No improvement/deterioration was observed for Deliver/dData (acts less associated with negotiations).
• Degree of linkage in our dataset is small – which makes the observed results encouraging.
Thank you!
Inter-Annotator Agreement
Kappa Statistic
• A = probability of agreement in a category
• R = prob. of agreement for 2 annotators labeling at random
• Kappa range: -1…+1
Email Act   Kappa
Deliver     0.75
Commit      0.72
Request     0.81
Amend       0.83
Propose     0.72
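With A and R as defined above, the standard Kappa statistic is (A - R) / (1 - R), which equals +1 for perfect agreement and 0 for chance-level agreement. The sketch below computes it for one binary act. Estimating R from each annotator's own rate of positive labels (Cohen's formulation) is an assumption, since the text does not say how R was obtained, and the example labels are made up.

```python
def kappa(a, r):
    """Kappa from observed agreement a and chance agreement r."""
    if r >= 1.0:
        raise ValueError("chance agreement must be below 1")
    return (a - r) / (1.0 - r)


def kappa_from_labels(labels1, labels2):
    """Kappa for two annotators' 0/1 labels of one act over the same messages.

    Chance agreement is estimated from each annotator's marginal rate of
    positive labels (an assumption; see the lead-in).
    """
    if len(labels1) != len(labels2) or not labels1:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels1)
    a = sum(x == y for x, y in zip(labels1, labels2)) / n
    p1 = sum(labels1) / n
    p2 = sum(labels2) / n
    r = p1 * p2 + (1 - p1) * (1 - p2)
    return kappa(a, r)


# Made-up labels for four messages: the annotators disagree on the second one.
k = kappa_from_labels([1, 1, 0, 0], [1, 0, 0, 0])
```

Here A = 0.75 and R = 0.5, so the sketch gives a Kappa of 0.5.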