ICA Slides

2010 Conference of the International Communication Association

Page 1: ICA Slides

What is the problem? How can we deal with concept drift? Summary

Extensional Mapping-Chains for Studying Concept Drift in Political Ontologies

Shenghui Wang (1), Stefan Schlobach (2), Janet Takens (3), Wouter van Atteveldt (3)

(1) The Network Institute
(2) Department of Computer Science
(3) Department of Communication Science

Vrije Universiteit Amsterdam

ICA 2010, Singapore

Page 2: ICA Slides

Content analysis in Communication Science

Communication scientists study all sorts of media content related to human communication

Content analysis based on the NET method

concepts: political actors and issues
relations: associations, opinions, or actions

Example

"Het Openbaar Ministerie (OM) wil de komende vier jaar mensenhandel uitroeien."
(The Public Prosecution Service (OM) wants to eradicate human trafficking within the next four years.)

Page 4: ICA Slides

Example (continued)

The example sentence about the Openbaar Ministerie is coded as a relation from the actor "om" to the issue "human trafficking" with value -1.

Page 5: ICA Slides

Semantic network analysis

[Figure: a semantic network drawn as a graph. Only the numeric node labels survive in this transcript.]

Page 6: ICA Slides

Network-based communication science study

What information can we extract from these networks?

Politicians are networking

Politics is perceived by citizens via media

Media study by semantic network analysis

Who is determining the subjects?
Who is teaming up?
Who is more credible?
Who owns which topic?

Page 7: ICA Slides

Before network analysis

We first need to build the networks!

Requires: large corpora with annotated textual content

Manual coding against coding books (ontologies)
Automated content analysis in progress

Page 9: ICA Slides

What is the problem?

Problems with constructing annotated content

Data from different time periods or genres

Coded by different teams at different moments

Manifesto Research Group: 25 countries, from 1945 to 2006
Comparative Policy Agendas project: media content, manifestos, legislative texts, government press statements, etc.
Election campaign coverage from 1994 to 2006

Page 10: ICA Slides

What are the challenges?

Interoperability problem while sharing information

Different teams use different code books

Example

illegal immigration

labour migrants

Different coding books should be merged or at least connected

Not the focus of this paper

Page 13: ICA Slides

Follow the Fashion?

Page 14: ICA Slides

Women’s role?

Suffragettes said that women’s role in society is unacceptable

Pope says that women’s role in society is unacceptable

Page 15: ICA Slides

Concept drift

Our problem: concept drift

The meaning of concepts changes over time

Analysis based on evolving concepts must consider temporal locality

Studying concept drift is useful in itself

Page 16: ICA Slides

Datasets

Five political ontologies that were used to annotate newspaper articles

23,639 manually annotated newspaper articles from five recent Dutch national election campaigns

Manual mappings between the ontologies exist, but most of them link lexically very similar concepts

Page 17: ICA Slides

Detecting concept drift

We use extensional mapping techniques

Consider concepts at different times to be different concepts

Use an extensional method to detect the links between concepts at different times

Assumption: similar sentences should be coded with similar concepts; therefore, similar concepts should have similar extensions.
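
The assumption above can be illustrated with a minimal sketch. The slides do not say which similarity measure is used, so this example swaps in cosine similarity over bag-of-words counts purely for illustration. The function names and the toy sentences are invented here and are not taken from the dataset.

```python
from collections import Counter
from math import sqrt


def extension_vector(sentences):
    """Sum word counts over all sentences annotated with one concept."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[term] * v[term] for term in u.keys() & v.keys())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def extensional_similarity(extension_a, extension_b):
    """Compare two concepts through the sentences coded with them."""
    return cosine(extension_vector(extension_a), extension_vector(extension_b))


# Toy extensions (invented): sentences coded with a concept in one campaign
# and with two candidate concepts in a later campaign.
ext_94_asylum = ["asylum seekers arrive at the border",
                 "more asylum seekers expected"]
ext_98_refugees = ["refugees and asylum seekers arrive",
                   "refugees need shelter"]
ext_98_budget = ["the budget deficit grows",
                 "cuts in the budget"]

sim_related = extensional_similarity(ext_94_asylum, ext_98_refugees)
sim_unrelated = extensional_similarity(ext_94_asylum, ext_98_budget)
print(sim_related > sim_unrelated)
```

Because the two concepts are compared only through their annotated sentences, their labels never enter the computation. This is what allows a concept to be linked to a differently named concept in a later period.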

Page 18: ICA Slides

Representing concept drift using mapping chains
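
One way to picture a mapping chain is as a layered graph: starting from one concept, each time period contributes the concepts most similar to those reached so far. The sketch below assumes the pairwise similarities are already computed and that each concept is linked to its `top_k` best matches in the next period. The function name, `top_k`, `threshold`, the labels and the numbers are all illustrative assumptions, not the authors' procedure or data.

```python
def build_chain(start, periods, similarity, top_k=2, threshold=0.0):
    """Follow a concept through successive periods.

    periods:    ordered list of concept-label lists, one list per later period
    similarity: dict mapping (source, target) to a similarity score
    Returns a list of weighted edges (source, target, weight).
    """
    edges = []
    frontier = [start]
    for concepts in periods:
        next_frontier = []
        for source in frontier:
            # Score every candidate in the next period and drop weak links.
            scored = [(similarity.get((source, target), 0.0), target)
                      for target in concepts]
            scored = [pair for pair in scored if pair[0] > threshold]
            scored.sort(reverse=True)
            for weight, target in scored[:top_k]:
                edges.append((source, target, weight))
                if target not in next_frontier:
                    next_frontier.append(target)
        frontier = next_frontier
    return edges


# Invented labels and scores, for illustration only.
similarity = {
    ("t1_a", "t2_a"): 0.30, ("t1_a", "t2_b"): 0.10, ("t1_a", "t2_c"): 0.02,
    ("t2_a", "t3_a"): 0.25, ("t2_b", "t3_a"): 0.05, ("t2_b", "t3_b"): 0.20,
}
periods = [["t2_a", "t2_b", "t2_c"], ["t3_a", "t3_b"]]

chain = build_chain("t1_a", periods, similarity)
for source, target, weight in chain:
    print(source, "->", target, weight)
```

Note that a wrong link early in the chain pulls all of its descendants into the result, so the choice of `top_k` and `threshold` controls how quickly a chain fans out.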

Page 19: ICA Slides

Evaluating concept drift

What can we learn from those chains?

Do they agree with the political reality?

Do they tell us something we had not noticed before?

Are some concepts more stable or unstable than others?

Quantitative evaluation is interesting, but qualitative analysis seems to tell us something too.

Page 20: ICA Slides

Qualitative analysis of mapping chains

Association vs. similarity

Early erroneous associations can render large parts of the analysis practically useless.

Page 22: ICA Slides

“productiviteit” (Productivity)

[Figure: mapping chain starting from 94_productiviteit. The edge layout is not recoverable from this transcript; the surviving edge weights range from 0.0315 to 0.1505. Nodes by campaign year:]

1994: 94_productiviteit
1998: 98_welvaart valence
2002: 02_economische groei, 02_welvaart
2003: 03_economische groei, 03_financieringstekort, 03_spaarloon
2006: 06_economic growth, 06_begroting, 06_bezuinigingen, 06_spaarloon, 06_levensloopregeling

“euthanasie” (Euthanasia)

[Figure: mapping chain starting from 94_euthanasie. The edge layout is not recoverable from this transcript; the surviving edge weights range from 0.0165 to 0.3491. Nodes by campaign year:]

1994: 94_euthanasie
1998: 98_oeuthanasie, 98_hreferendum
2002: 02_euthanasie, 02_milieuactivist, 02_referendum, 02_cdavvdlpf
2003: 03_euthanasie, 03_homohuwelijk, 03_milieuactivist, 03_justitie, 03_referendum eu, 03_referendum, 03_zondagsrust, 03_scholieren
2006: 06_gay marriage, 06_abortion, 06_criminelen, 06_verbetering communicatie overheid burger, 06_asielzoekers, 06_gratis schoolboeken, 06_referendum, 06_burgerinitiatief, 06_werknemers, 06_sunday rest, 06_leerlingen, 06_education

Page 23: ICA Slides

If we know two end-point concepts have the same meaning

Kite-shaped chains

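[Figure: kite-shaped chain from 94_asielzoekers to 06_asielzoekers. The edge layout is not recoverable from this transcript. Nodes by campaign year:]

1994: 94_asielzoekers
1998: 98_rcriminaliteit, 98_avluchtelingen, 98_okerken, 98_asielzoekers, 98_kabinet kokmierlods
2002: 02_criminaliteit, 02_jusititie, 02_cellentekort, 02_drugkoeriers, 02_mensenrechten, 02_instroom beperking, 02_asielzoekers, 02_democratie, 02_buitenlanders, 02_bedrijfsleven
2003: 03_politie, 03_justitie, 03_criminaliteit, 03_asielzoekers, 03_opvang illegalen, 03_illegalen, 03_vluchtelingen
2006: 06_asielzoekers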
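
One possible reading of a kite-shaped chain is a chain pruned to the links that lie on some path between two end-point concepts known to mean the same thing. The sketch below implements that reading. The function name, the pruning rule and the toy labels are assumptions made for illustration, not the authors' stated procedure.

```python
def kite_shaped(edges, start, end):
    """Keep only the edges that lie on a path from start to end.

    edges: list of (source, target) pairs pointing forward in time.
    """
    forward, backward = {}, {}
    for source, target in edges:
        forward.setdefault(source, set()).add(target)
        backward.setdefault(target, set()).add(source)

    def reachable(origin, graph):
        # Depth-first search collecting every node reachable from origin.
        seen, stack = {origin}, [origin]
        while stack:
            node = stack.pop()
            for neighbour in graph.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        return seen

    # A node is kept if it can be reached from start AND can reach end.
    keep = reachable(start, forward) & reachable(end, backward)
    return [(s, t) for s, t in edges if s in keep and t in keep]


# Invented toy chain: only one branch arrives at the known end point.
edges = [
    ("94_x", "98_a"), ("94_x", "98_b"),
    ("98_a", "02_a"), ("98_b", "02_b"),
    ("02_a", "06_x"), ("02_b", "06_y"),
]
kite = kite_shaped(edges, "94_x", "06_x")
print(kite)
```

Branches that never reach the end point are discarded, so whatever remains fans out from the start concept and converges again on the end concept.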

Page 24: ICA Slides

“christelijken” (Christians)

[Figure: mapping chain from 94_christelijken to 06_christenen. The edge layout is not recoverable from this transcript. Nodes by campaign year:]

1994: 94_christelijken
1998: 98_ochristelijk christenen, 98_oabortus
2002: 02_normen waarden, 02_multiculturele samenleving
2003: 03_multiculturele samenleving
2006: 06_christenen

“asielzoekers” (Asylum seekers)

[Figure: the same kite-shaped chain from 94_asielzoekers to 06_asielzoekers as on Page 23, with the same nodes.]

Page 25: ICA Slides

Summary

By looking at extensions of concepts, we can detect concept drift

Domain experts found that the detected concept drift makes sense

Automated matching techniques can help domain experts to find hidden links between concepts

More work needs to be done