WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

WikiSym 2012 presentation

Page 1: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Copyright 2011 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Deletion Discussions in Wikipedia*: Decision Factors and Outcomes

Jodi Schneider, Alexandre Passant, & Stefan Decker

WikiSym 2012, Linz, Austria

Wednesday 29th August 2012

*enWP: English Wikipedia

Page 2: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Big questions about WP

Is crowdsourcing sustainable?
Is content bias manageable?
Does it matter who writes WP?
How can newcomers be welcomed and socialized?

Page 3: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

… are related to Deletion

Is crowdsourcing sustainable? How do we maintain content through deletion?

Is content bias manageable? Are new articles needed? Are they welcomed?

Does it matter who writes WP? … or who makes deletion decisions?

How can newcomers be welcomed and socialized? Deletion threatens editor retention

– 1 in 3 editors begin by creating a new article
– 7 times as likely to stay if their article is kept

Source: [[User:Mr.Z-man/newusers]] via [[Wikipedia:Wikipedia_Signpost/2011-04-04/Editor_retention]]

Page 4: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Overall Goals

Understand outcomes of deletion discussions
– What are good outcomes for articles? ... for the community?

Provide support to various groups
– Readers/New Editors
– Debate Closers
– People Reading Archived Debates

Page 5: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

This Study’s Research Questions

1. What factors contribute to the decision about whether to delete a given article?

2. When multiple factors are given, what is the relative importance of those factors?

3. What are the outcomes of deletion discussions, both for articles and for the community?

Page 6: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Overview

Outcomes (RQ3)
Data, Methods, Previous Research
Factors (RQs 1&2)
Future Work on Support (Demo)

Page 7: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Articles: Good Outcomes

Page 8: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

… Content Expansion

Page 9: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Good Rationale

Page 10: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Good Outcome?

Page 11: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Community: Good Outcomes

Learning to argue effectively
Becoming more detached from content
Introducing new editors to community values
Developing new editors’ editing skills

Page 12: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Example: Good Community Outcomes

William Vickers (fiddler)
– 1 main author – their first article
– Nominated for deletion after 1 hour and 20 minutes
– Shaped during the process

Page 13: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Changes During AfD

Article renamed to William Vickers manuscript
Discography added
26 edits from this author

Page 14: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Supporting the Editor

This was the first article this editor created.
The editor later created 11 more articles.
The creator made many more edits to this article:
– 26 edits, compared to 3-9 edits to his later articles.

Page 15: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Page 16: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Overview

Outcomes (RQ3)
Data, Methods, Previous Research
Factors (RQs 1&2)
Future Work on Support (Demo)

Page 17: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Discussion-based Deletion

“Articles for Deletion” (AfD)

Most contentious
Articulated decision-making
500+ deletion discussions/week
~12% of deletions

Lam & Riedl. “Is Wikipedia growing a longer tail?” GROUP ’09

Page 18: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Dataset

Data Corpus: “Typical Day”
72 deletion discussions
January 29, 2011
English Wikipedia only

Page 19: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Methods

Deep analysis of a moderate-sized dataset

Representative sample
Intensive manual analysis
Annotation with multiple coders
Descriptive statistics

Page 20: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Previous Research

Shallow analysis of large datasets

Redacted content
– West & Lee, “What Wikipedia deletes”, WikiSym 2011

Vote sequencing
– Taraborelli & Ciampaglia, “Beyond notability”, SASOW 2011

Decision quality
– Lam, Karim & Riedl, “The effects of group composition on decision quality in a social production community”, GROUP 2010

Who participates, what & how much gets deleted
– Priedhorsky, Chen, Lam, Panciera, Terveen, & Riedl, “Creating, destroying, and restoring value in Wikipedia”, GROUP 2007
– Geiger & Ford, “Participation in Wikipedia’s article deletion processes”, WikiSym 2011

Page 21: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

From Reading to Editing

How can newcomers be welcomed and socialized? Deletion threatens editor retention

– 1 in 3 editors begin by creating a new article
– 7 times as likely to stay if their article is kept

Source: [[User:Mr.Z-man/newusers]] via [[Wikipedia:Wikipedia_Signpost/2011-04-04/Editor_retention]]

Page 22: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Instructions?

Page 23: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Notabili-what?

22% of all deletions are speedy deleted for A7: No indication of importance

Geiger & Ford WikiSym 2011

Page 24: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Reader’s View of Deletion

Page 25: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Novices vs. Experts in deletion discussions

Worthwhile content that is poorly defended -> deleted

Need Wikipedia knowledge (procedural knowledge)
Need content knowledge

Page 26: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Articulate Values/Criteria

4 Factors in Deletion Discussions cover
– 91% of comments
– 70% of discussions

Page 27: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Articulate Values/Criteria

4 Factors in Deletion Discussions cover
– 91% of comments
– 70% of discussions

The best way to avoid deletion is for readers to understand these criteria.

Page 28: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Article Feedback

Page 29: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

4 Factors (RQ1)

Factor | Example (used to justify 'keep')
Notability | Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia.
Sources | Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band.
Maintenance | …this article is savable but at its current state, needs a lot of improvement.
Bias | It is by no means spam (it does not promote the products).
Other | I'm advocating a blanket "hangon" for all articles on newly-drafted players

Page 30: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Articulate Values/Criteria

4 Factors in Deletion Discussions cover
– 91% of comments
– 70% of discussions

The best way to avoid deletion is for readers to understand these 4 criteria:
– Notability
– Sources
– Maintenance
– Bias

Page 31: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Factors in Context

Page 32: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Relative importance (RQ2)

Notability trumped by other values

Comprehensiveness > Notability (given Sources)
– Keeping a (non-notable) Velvet Underground album
– we shouldn’t mechanically apply notability guidelines in this instance, where it would “[punch a] hole in their otherwise comprehensive discography.”

Maintenance > Notability
– Deleting a notable topic due to maintenance
– this is the rare case where notability is not the main argument in favor of deletion. It has been demonstrated that the subject is already covered in numerous other articles and that those articles do a much better, more thorough job of covering the topic.

Page 33: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Issues

Discussions fail without comments

Interactions with article creators
– Contentious
– Learning opportunity

Conflicts around consensus values
– Notability: “Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?”
– Reliable sources

Policy development is separated from case debates
– “Frankly, the basis of my disagreement with you here is that I don’t agree with the guideline.”

Page 34: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Future Work

Factor-based view of deletion
Please give me feedback!

Page 35: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Page 36: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Page 37: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Page 38: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Thanks!

[email protected]
http://jodischneider.com/jodi.html
@jschneider
User:Jodi.a.schneider

Page 39: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Deletion Workflow

Page 40: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Articles for Deletion (AfD)

Page 41: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Friction with outside

Page 42: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes
Page 43: WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes

Novices don’t understand notability

Notability vs. real-world importance
– “Emsworth Cricket Club is one of the oldest cricket clubs in the world, and this really is worth a mention. Especially on a website, where pointless people … gets a mention.”
– “Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?”