Detecting Macro-patterns in the EU Mediasphere
ISL Seminars
Ilias Flaounas
University of Bristol
June 8, 2010
I. Flaounas (University of Bristol) June 8, 2010 1 / 16
Introduction
Traditionally, content analysis of news media has been the research domain of social scientists.
But their studies have many limitations. They work:
with few news media per study.
with small numbers of news items.
over short time periods.
on a single country’s media.
...
Examples
Examples of papers published in recent issues of the Journal of Communication:
“A total of 529 stories from NBC Nightly News and 322 stories aired on Special Report about Iraq, and 64 and 47, respectively, about Afghanistan were analyzed by two coders”
S. Aday, “Chasing the Bad News: An Analysis of 2005 Iraq and Afghanistan War Coverage on NBC and Fox News Channel”, J. of Com. 60, 144-164 (2010).
“Our corpus of data consisted of Channel 2’s broadcasts on the eve of MDHH between 7:30 p.m. and midnight in the years 1994-2007 [...]. All 278 items aired on the 14 examined evenings were coded.”
O. Meyers et al., “Prime Time Commemoration: An Analysis of Television Broadcasts on Israel’s Memorial Day for the Holocaust and the Heroism”, J. of Com. 59, 456-480 (2009).
Example of Coding Scheme
These questionnaires have to be completed manually.
Moreover, typically more than one coder has to complete the same questionnaire for the same news items.
Nowadays...
Most media offer their content online in a convenient form.
Our Dataset
215 media outlets
over the 27 EU countries
in 22 different languages
over a 6-month period
A total of 1.3M news articles.
What macro-patterns can be found using modern AI techniques?
The EU mediasphere
Co-coverage network: We link two outlets if they share more stories than expected by chance (χ²-scores).
This network has 203 nodes and 6702 edges.
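The linking rule can be illustrated with a small sketch. The slides do not give the exact form of the χ² test, so this assumes a Pearson χ² statistic on the 2x2 table of "covered / not covered" for a pair of outlets, and a cut-off of 3.84 (the 5% critical value for one degree of freedom). The function names, the threshold and the toy data are all hypothetical.

```python
def chi2_score(n_a, n_b, n_ab, n_total):
    """Pearson chi-square statistic for the 2x2 coverage table of two outlets.

    n_a, n_b: stories covered by each outlet; n_ab: stories covered by both;
    n_total: all stories. Assumes 0 < n_a < n_total and 0 < n_b < n_total.
    """
    observed = [
        [n_ab, n_a - n_ab],                       # A covers: B covers / B does not
        [n_b - n_ab, n_total - n_a - n_b + n_ab], # A does not cover
    ]
    row = [n_a, n_total - n_a]
    col = [n_b, n_total - n_b]
    score = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n_total
            score += (observed[i][j] - expected) ** 2 / expected
    return score


def co_coverage_edges(stories_by_outlet, n_total, threshold=3.84):
    """Link two outlets if they share more stories than chance predicts.

    threshold is an assumed cut-off, not a value taken from the slides.
    """
    names = sorted(stories_by_outlet)
    edges = []
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            sa, sb = stories_by_outlet[a], stories_by_outlet[b]
            shared = len(sa & sb)
            expected_shared = len(sa) * len(sb) / n_total
            if shared > expected_shared and chi2_score(len(sa), len(sb), shared, n_total) > threshold:
                edges.append((a, b))
    return edges


# Toy example with hypothetical outlets and story ids 0..99.
toy = {
    "A": set(range(0, 20)),
    "B": set(range(0, 18)) | {50, 51},
    "C": set(range(80, 100)),
}
edges = co_coverage_edges(toy, n_total=100)
```

Here outlets A and B share 18 stories where about 4 would be expected by chance, so they are linked; C shares none with either and stays a singleton.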
A bit sparser...
This network has 197 nodes, 3386 edges and 3 connected components.
Singleton nodes are omitted.
What kind of connected components are formed?
We sparsify the network as far as possible, using modularity maximization as the stopping criterion.
The probability that two non-singleton outlets from the same country end up in the same connected component is 82.9% (p < 0.001).
Nationality is the major underlying criterion for which stories media outlets choose to publish.
We will therefore work at the country level rather than the outlet level.
Which are the strongest connections between countries?
We go as sparse as possible while keeping the network connected.
This network has 27 nodes and 112 edges.
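One way to read "as sparse as possible while keeping the network connected" is to find the highest edge-weight threshold at which the thresholded graph is still connected. The sketch below does this with a union-find pass over edges sorted by weight. This is an interpretation, not a procedure stated in the slides; the function name and the toy weights are hypothetical.

```python
def sparsest_connected_threshold(nodes, weighted_edges):
    """Return the largest weight w such that keeping edges with weight >= w
    leaves the graph connected, or None if no threshold connects it
    (also None for fewer than two nodes)."""
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = len(parent)
    # Add the strongest edges first; stop when everything is in one component.
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2], reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
            if components == 1:
                return w
    return None


# Toy example with hypothetical countries and similarity weights.
toy_nodes = ["A", "B", "C", "D"]
toy_edges = [("A", "B", 9), ("B", "C", 7), ("A", "C", 8), ("C", "D", 3), ("A", "D", 1)]
threshold = sparsest_connected_threshold(toy_nodes, toy_edges)
kept = [e for e in toy_edges if e[2] >= threshold]
```

In the toy graph the weakest edge that is still needed is C-D with weight 3, so the edge A-D (weight 1) is dropped and the other four are kept.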
Can we explain the relations between countries?
We found a significant (p < 0.001) correlation between countries’ media-content similarity and their:
Economic proximity (based on trade volume): 31.03%
Cultural proximity (based on song contest voting patterns): 32.05%
Geographical proximity (based on shared borders): 33.86%
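The slides do not say how these correlations and p-values were computed. A common choice for comparing two country-by-country matrices is a Mantel-style test: correlate the off-diagonal entries, then permute the country labels of one matrix to estimate a p-value. The sketch below illustrates that assumed approach; the function names and the toy matrix are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle entries after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(a, b, permutations=999, seed=0):
    """Correlation between two symmetric matrices with a one-sided permutation p-value."""
    identity = list(range(len(a)))
    flat_a = upper_triangle(a, identity)
    observed = pearson(flat_a, upper_triangle(b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if pearson(flat_a, upper_triangle(b, perm)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (permutations + 1)


# Toy check: a matrix is perfectly correlated with itself.
toy = [
    [0, 5, 1, 2],
    [5, 0, 3, 1],
    [1, 3, 0, 4],
    [2, 1, 4, 0],
]
r, p = mantel_test(toy, toy)
```

With real data, `a` would hold the media-content similarities and `b` one of the proximity matrices (trade, voting or borders).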
How ‘close’ are countries really, based on common media interests?
We use χ²-scores as similarities and project the countries onto a 2D plane using Multidimensional Scaling.
We colour the Eurozone members in blue.
These countries are closer to the center, that is, to the average EU media content.
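As an illustration of the projection step, here is a minimal sketch of classical (Torgerson) Multidimensional Scaling in pure Python. The slides do not state which MDS variant was used or how the χ² similarities were turned into dissimilarities, so the sketch simply takes a distance matrix as input; the function name and the test points are hypothetical.

```python
import math
import random


def classical_mds_2d(dist, iters=300, seed=0):
    """Embed n items in 2D from an n x n symmetric distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double centering turns squared distances into an inner-product matrix.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    axes = []
    for _ in range(2):
        # Power iteration for the current leading eigenvector of b.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        axes.append([math.sqrt(max(lam, 0.0)) * x for x in v])
        # Deflate so the next pass finds the second eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return list(zip(axes[0], axes[1]))


# Toy check: distances between the corners of a 3 x 4 rectangle.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
toy_dist = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points] for p in points]
embedding = classical_mds_2d(toy_dist)
```

The recovered coordinates may be rotated or reflected relative to the originals, but the pairwise distances are preserved, which is what matters for reading the plot.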
Ranking of countries
Based on their deviation from average EU media content (in 26D space).
Rank  Country         Euro  A.Year
1     France          Y     1957
2     Austria         Y     1995
3     Germany         Y     1957
4     Greece          Y     1981
5     Ireland         Y     1973
6     Cyprus          Y     2004
7     Slovenia        Y     2004
8     Spain           Y     1986
9     Slovakia        Y     2004
10    Italy           Y     1957
11    Belgium         Y     1957
12    Luxembourg      Y     1957
13    Bulgaria        N     2007
14    Netherlands     Y     1957
15    U. Kingdom      N     1973
16    Finland         Y     1995
17    Sweden          N     1995
18    Poland          N     2004
19    Estonia         N     2004
20    Denmark         N     1973
21    Portugal        Y     1986
22    Malta           Y     2004
23    Czech Republic  N     2004
24    Romania         N     2007
25    Latvia          N     2004
26    Hungary         N     2004
27    Lithuania       N     2004
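A ranking of this kind can be sketched as follows, assuming each country is described by a numeric profile vector, the "average EU media content" is the centroid of those vectors, and deviation is the Euclidean distance to that centroid. The distance measure is an assumption, as is reading rank 1 as the country closest to the average. The function name and toy profiles are hypothetical.

```python
import math


def rank_by_deviation(profiles):
    """Rank items by Euclidean distance from the centroid, closest first.

    profiles: dict mapping a name to an equal-length list of numbers.
    Returns (ranking, deviation) where deviation maps name -> distance.
    """
    names = list(profiles)
    dim = len(profiles[names[0]])
    centroid = [sum(profiles[n][k] for n in names) / len(names) for k in range(dim)]
    deviation = {
        n: math.sqrt(sum((profiles[n][k] - centroid[k]) ** 2 for k in range(dim)))
        for n in names
    }
    return sorted(names, key=deviation.get), deviation


# Toy example with hypothetical 2D profiles.
toy_profiles = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [5.0, 0.0]}
ranking, deviation = rank_by_deviation(toy_profiles)
```

In the toy data the centroid is (2, 0), so B is closest to the average, followed by A and then C.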
Any other important factors?
Correlation of each factor with countries’ deviation from average EU media content.
Factor              Correlation (%)  p-value
In Eurozone         70.65            <0.001
Accession Year      -49.32           0.009
GDP 2008            44.75            0.020
Population          23.05            0.247
Area                15.63            0.435
Population Density  7.45             0.712
Methods Summary
RSS parsing
Web page content scraping
Statistical Machine Translation
Stemming, stop-words removal
Bag-of-words representation
Best Reciprocal Hit Clustering
Network construction
Multidimensional Scaling
Statistics...
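Two of the steps above, the bag-of-words representation and Best Reciprocal Hit matching, can be sketched together. The slides give no implementation details, so this assumes cosine similarity between word-count vectors and pairs two articles when each is the other's most similar article in the other outlet. Stemming and translation are left out, and the stop-word list, the `min_similarity` cut-off and all names are hypothetical.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a full one.
STOP_WORDS = {"the", "a", "of", "in", "on", "and", "to"}


def bag_of_words(text):
    """Word counts of a lower-cased text with stop words removed."""
    return Counter(w for w in text.lower().split() if w not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def best_reciprocal_hits(docs_a, docs_b, min_similarity=0.3):
    """Pairs (i, j) where docs_a[i] and docs_b[j] are each other's best match."""
    bags_a = [bag_of_words(d) for d in docs_a]
    bags_b = [bag_of_words(d) for d in docs_b]
    best_for_a = [max(range(len(bags_b)), key=lambda j: cosine(a, bags_b[j])) for a in bags_a]
    best_for_b = [max(range(len(bags_a)), key=lambda i: cosine(bags_a[i], b)) for b in bags_b]
    return [
        (i, j)
        for i, j in enumerate(best_for_a)
        if best_for_b[j] == i and cosine(bags_a[i], bags_b[j]) >= min_similarity
    ]


# Toy example with hypothetical headlines from two outlets.
outlet_a = ["volcano ash cloud closes airports", "election results announced in parliament"]
outlet_b = ["parliament election results", "airports closed by volcano ash cloud", "football cup final"]
pairs = best_reciprocal_hits(outlet_a, outlet_b)
```

The similarity cut-off keeps unrelated articles from being paired just because neither has a better match.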
Thank you!