REVEAL Project: Co-funded by the EU FP7 Programme Nr.: 610928 www.revealproject.eu © 2016 REVEAL consortium Stefanie Wiegand & Stuart E. Middleton University of Southampton IT Innovation Centre {sw,sem}@it-innovation.soton.ac.uk Veracity & Velocity of Social Media Content during Breaking News: Analysis of November 2015 Paris Shootings

Veracity & Velocity of Social Media Content during Breaking News



Stefanie Wiegand & Stuart E. Middleton

University of Southampton IT Innovation Centre

{sw,sem}@it-innovation.soton.ac.uk

Veracity & Velocity of Social Media Content during Breaking News: Analysis of November 2015 Paris Shootings


Overview

Introduction

Experiment

Results

Discussion

Future work


What's this all about?


Problems:

Journalists doing breaking UGC verification – speed vs. accuracy

Echo chamber can make false rumours go viral

Automate information gathering – Journalists make the final decision

Ideas:

First 60 mins of a UGC post: filter mentions by attribution to trusted sources

Visualise traffic patterns for posts attributed to trusted and untrusted sources

Can traffic analysis help to verify / debunk content?

First 5 mins: rank UGC not seen before by mention count

Provide a ranked list of likely eyewitness UGC every 5 mins

Can we produce a high quality eyewitness UGC feed?

Introduction


Experiment setup


Data

5 viral UGC posts (3 eyewitness, 2 debunked) - manually identified

38GB of serialised data covering the first 6h after the first attack

5.9M posts, ~40k attributed sources, ~418k unique URLs

~160k - 1.8M posts in the first hour per UGC test case

Technology

Target UGC Image/Video → TinEye → Duplicate Images/Videos

Posts → Text extraction → Sources → PostgreSQL

PostgreSQL → Triple store → Trust knowledge model → Trusted posts
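The "Posts → Text extraction → Sources" and "Trust knowledge model → Trusted posts" steps can be illustrated with a minimal sketch. It is not the REVEAL implementation: the regular expressions, the `TRUSTED` and `UNTRUSTED` sets, and the rule that an untrusted source outweighs a trusted one are all assumptions made for illustration, and the PostgreSQL and triple store stages are replaced by in-memory sets.

```python
import re
from urllib.parse import urlparse

# Hypothetical trust lists standing in for the trust knowledge model.
TRUSTED = {"bbc.co.uk", "@afp"}
UNTRUSTED = {"example-rumours.com"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def extract_sources(text):
    """Return candidate sources (URL hosts and @mentions) found in a post."""
    sources = set()
    for url in URL_RE.findall(text):
        # Drop trailing sentence punctuation before parsing the host.
        host = urlparse(url.rstrip(".,;:!?)")).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:
            sources.add(host)
    # Look for @mentions only outside URLs.
    for mention in MENTION_RE.findall(URL_RE.sub(" ", text)):
        sources.add(mention.lower())
    return sources


def classify_post(text):
    """Label a post as trusted, untrusted, or unknown (assumed precedence)."""
    sources = extract_sources(text)
    if sources & UNTRUSTED:
        return "untrusted"
    if sources & TRUSTED:
        return "trusted"
    return "unknown"
```

For example, `classify_post("RT @AFP: photo")` yields `"trusted"` under these hypothetical lists, while a post with no recognised source is labelled `"unknown"`.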

Experiment


Experiment method


Verification (Experiment 1)

Filter (un-)trusted content in first 60 mins of 5 target UGC posts

Examine velocity of trusted and untrusted sources mentioning target UGC

When is target UGC attributed to trusted sources?
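The verification steps above can be sketched as a simple bucketing of posts by trust label over the first 60 minutes. This is a sketch under stated assumptions: the function names, the `(timestamp, label)` input shape, and the 10-minute bucket width are hypothetical and are not taken from the slides.

```python
from datetime import datetime, timedelta

LABELS = ("trusted", "unknown", "untrusted")


def velocity_series(posts, first_seen, window_min=60, bucket_min=10):
    """Count posts per time bucket and trust label.

    posts: iterable of (posted_at: datetime, label: str) pairs.
    first_seen: datetime at which the target UGC first appeared.
    """
    n_buckets = window_min // bucket_min
    series = {label: [0] * n_buckets for label in LABELS}
    for posted_at, label in posts:
        offset_min = (posted_at - first_seen).total_seconds() / 60
        # Ignore posts outside the window or with an unrecognised label.
        if 0 <= offset_min < window_min and label in series:
            series[label][int(offset_min // bucket_min)] += 1
    series["total"] = [sum(col) for col in zip(*(series[l] for l in LABELS))]
    return series


def first_trusted_minute(posts, first_seen):
    """Minutes until the first trusted mention, or None if there is none."""
    offsets = [
        (posted_at - first_seen).total_seconds() / 60
        for posted_at, label in posts
        if label == "trusted" and posted_at >= first_seen
    ]
    return min(offsets) if offsets else None
```

Plotting the four lists returned by `velocity_series` against the bucket index gives the kind of trusted / unknown / untrusted / total curves shown on the Experiment 1 slides, and `first_trusted_minute` addresses the question of when the target UGC is first attributed to a trusted source.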

Identification (Experiment 2)

Temporally segment first 5 mins of posts for 5 target event times

Filter old URLs (including alternative URLs)

Rank by mention frequency

Does the target UGC appear high in the ranked list?
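The identification steps above can be sketched as follows. The function name, the input shapes, and the `aliases` mapping from alternative URLs to a canonical URL are hypothetical; the slides do not say how alternative URLs were resolved or how ties were broken.

```python
from collections import Counter


def rank_new_urls(segment_posts, seen_before, aliases=None):
    """Rank previously unseen URLs in one 5-minute segment by mention count.

    segment_posts: iterable of URL lists, one list per post in the segment.
    seen_before: set of canonical URLs observed before the segment started.
    aliases: optional dict mapping an alternative URL to its canonical URL.
    """
    aliases = aliases or {}
    counts = Counter()
    for urls in segment_posts:
        # Map alternatives to canonical form; count each URL once per post.
        canonical = {aliases.get(u, u) for u in urls}
        counts.update(u for u in canonical if u not in seen_before)
    # Highest mention count first; ties broken alphabetically for stability.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

The position of the target UGC in the returned list is the rank that Experiment 2 reports.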

Experiment


Experiment 1 - Case P1

Results

[Figure P1: content items [#] (y-axis, 0 to 400) against time [min] (x-axis, 10 to 60); series: trusted, unknown, untrusted, total]


Experiment 1 - Case P2

Results

[Figure P2: content items [#] (y-axis, 0 to 1200) against time [min] (x-axis, 10 to 60); series: trusted, unknown, untrusted, total]


Experiment 1 - Case P3

Results

[Figure P3: content items [#] (y-axis, 0 to 250) against time [min] (x-axis, 10 to 60); series: trusted, unknown, untrusted, total]


Experiment 1 - Case P3

Results

[Figure trusted/untrusted P3: content items [#] (y-axis, 0 to 5) against time [min] (x-axis, 10 to 60); series: trusted, untrusted]


Experiment 1 - Case D1

Results

[Figure D1: content items [#] (y-axis, 0 to 3500) against time [min] (x-axis, 10 to 60); series: trusted, unknown, untrusted, total]


Experiment 1 - Case D2

Results

[Figure D2: content items [#] (y-axis, 0 to 3500) against time [min] (x-axis, 10 to 60); series: trusted, unknown, untrusted, total]


Experiment 2


Results

Target image ID | P1 | P2 | P3 | D1 | D2
Number of followers of author | 335 | 1.4k | 218 | 2.8k | 151k
Content likes | 11 | 408 | 35 | 17k | 29k
Content retweets | 83 | 3.3k | 194 | 22k | 30k
Total # of tweets in 60 minute window | 483918 | 162111 | 811079 | 1501000 | 1837173
Total # of unique mentioned URLs in 60 minute window | 785 | 4331 | 535 | 7907 | 13252
Ranking of target image set in total for 5 minute segment (top x percent) | 9 / 653 (2%) | 1 / 603 (1%) | 61 / 1097 (6%) | 427 / 11605 (4%) | 1 / 11337 (1%)
Total number of eyewitness content in 5 minute segment | 25 | 2 | 12 | 29 | 30
Unique number of eyewitness content in 5 minute segment | 4 | 1 | 4 | 13 | 14
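The "top x percent" figures in the ranking row are consistent with dividing the rank by the segment total and rounding up to the next whole percent. That rounding rule is inferred from the numbers rather than stated on the slide, and the helper name below is hypothetical:

```python
import math


def top_percent(rank, total):
    """Percentile position of a rank, rounded up to a whole percent (assumed rule)."""
    return math.ceil(100 * rank / total)


# Rank / total pairs as reported for P1, P2, P3, D1 and D2.
reported = [(9, 653), (1, 603), (61, 1097), (427, 11605), (1, 11337)]
percentages = [top_percent(rank, total) for rank, total in reported]
```

Under this assumption `percentages` matches the slide's values of 2, 1, 6, 4 and 1 percent.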


How is this useful to journalists?


Posts by trusted sources matter for verification

Wisdom of the crowds is not always wisdom at all

Twitter "echo chamber" is less useful than a post by a trusted source

Easier/faster to spot new eyewitness UGC

Filter feeds down to tens of posts, not thousands

Reduce information overload for journalists in first 5 mins

Additional analysis can improve eyewitness UGC identification further

Eyewitness classification

Image analysis (e.g. Exif metadata)

Author profile pages

Discussion


Where to go from here


Cross check known facts

Extend knowledge model to support this

e.g. image classification of weather/lighting ↔ time & location of event

e.g. mentions of known event actors

Use linked open data to visualise source bias

This can include political, religious or other bias

Observational study of journalists verifying UGC

Expert journalists demonstrate best-practice verification on specific examples

We train our algorithms on observed best practice

We check our algorithms' results against the journalists' ground truth

Future work


Any questions?

Stefanie Wiegand & Stuart E. Middleton

University of Southampton IT Innovation Centre

email: {sw|sem}@it-innovation.soton.ac.uk

web: www.it-innovation.soton.ac.uk

twitter: @RevealEU, @IT_Innov, @stuart_e_middle

Many thanks for your attention!