Download pptx - Fail ir16 intro

#FAIL!THINGS THAT DIDN‘T WORK OUT IN SOCIAL

MEDIA RESEARCH- AND WHAT WE CAN LEARN FROM THEM

Workshop at Internet Research 16, Phoenix, October 21st, 2015.

• Workshop hashtag: #fail2015b• Conference hashtag: #ir16• Workshop website:

http://failworkshops.wordpress.com• Etherpad: https://pad.okfn.org/p/fail

WELCOME

Luca Rossi@LR

Karine Nahon@karineb

Katrin Weller@kwelle

http://failworkshops.wordpress.com/

https://pad.okfn.org/p/fail

https://pad.okfn.org/p/fail

ABOUT #FAIL! WORKSHOPS

• Traveling on to different conferences. First workshop was at WebSci15 (June 2015)

• Aim: collect various examples for things that can go wrong and share them with different communities

learn from experiencesConnect different research communities

WHAT WE‘VE LEARNED SO FAR

2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 20130

100

200

300

400

500

600

TwitterFacebookYouTubeBlogsWikisFoursquareLinkedInMySpace

Number of publications per year, which mention the respective social media platform‘s name in their title. Scopus Title Search. For details: http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/

SOCIAL MEDIA RESEARCH

http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/

2008-2013 papers on Twitter and elections: data sources

Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.

6

Data source numberNo information 11Collected manually from Twitter website (Copy-Paste / Screenshot)

6

Twitter API (no further information) 8Twitter Search API 3Twitter Streaming API 1Twitter Rest API 1Twitter API user timeline 1Own program for accessing Twitter APIs 4Twitter Gardenhose 1Official Reseller (Gnip, DataSift) 3YourTwapperKeeper 3Other tools (e.g. Topsy) 6Received from colleagues 1

SOCIAL MEDIA RESEARCH

What we discussed at the first workshop…

CURRENT PROBLEMS

Challenge 1: users

• How to involve social media users in the research process?

• Presentation by Elodie Crespel: “Extending data collection with web browser extension” – Participants may be creative in their use of

technology – flexibility is needed.

Challenge 2: methods

• Data analytics: which approaches should be chosen?

• Taha Yasseri: “The double-edged sword of statistical significance” – Questions the p-value as a standard for data

analytics.– “too much of attention and reliance on specific

measures or methods without being aware of the logic behind them, can be misleading”

Challenge 3: tools

• Many researchers use third party tools for data collection or analysis – which may not always work as expected.

• Presentation by Michael Bossetta and Anamaria Dutceac Segesten: “Tracing Eurosceptic Party Networks via Hyperlink Network Analysis and #FAIL!ng: Can Web Crawlers Keep up with Web Design?” – Exemplary case: issuecrawler.

Challenge 4: content

• Content analysis is heavily effected by the dynamic nature of social media.

• Presentation by Marie Van Cranenbroeck: “Managing and Using Unstable Data in a Social Science Research about Museums and Audiences on Social Media” – Data collection and storage challenges

Specific details and additions

• Researchers and users may have different ideas about the definition of social media / social networks

• Lack of evaluation standards• Availability of data (also: not enough data)• Data may be corrupt (e.g. missing data)• Social media as a moving target (Karpf, D. (2012).

Social science research methods in Internet time. Information, Communication & Society 15(5):639-661. )

Meta discussion

• Social media research can have various forms. Different disciplines involved.

• Best practices and pitfalls in social media research are mainly discussed informally. Few possibilities to share unsuccessful approaches.

WHAT WE‘D LIKE TO LEARN TODAY

• Towards a categorization of challenges for social media research: what can go wrong?

• Collection of more experiences• Structuring them into different categories

WHAT WE‘D LIKE TO LEARN TODAY

Today: - 4 presentations- Think about your own experiences!

- … in connection to each presentation - … in general

9:00 Introduction: “What we’ve learnt from the first workshop and what we’d like to learn today”

9:15 Shawn Walker: “Complexity of collecting social media data in ephemeral contexts”

9:40 Cornelius Puschmann: “Why LIWC sucks (or: saner options for social media content analysis)”

10:05 Break10:20 Luca Rossi: “The fourth deadly sin of social media researchers

(or: scientific research and unstable socio-technical platforms)”10:45 Marco Toledo Bastos, “Individual Behavior from Aggregate

Social Media Data“11:10 Discussion & Conclusions12:00 End

PROGRAM

• Other experiences? Share your thoughts!• Main categories of #fail cases?• Top 3 take away messages for next workshop?

DISCUSSION

WHERE TO GO FROM HERE?

• Next steps – lessons learnt for future workshop organisation

• Which additional conferences?• Publication? Guidebook?

• Archiving: – URLs may vanish (Question: linear rate of decay?) – Images missing– Platforms changing (moving target!) – not just about the interface!

• Visualization of results– Word cloud (compare histograms)

• Tools – sentiment140, Internet Archive, GNIP

• Methods– Content Analysis:

• replicability? Validation?• Context for social media contents (e.g. surrounding tweets).• LIWC, General Inquirer

– Predictions – „Data Science“

• Lack of theory

• Data Quality: – Can we still cite/use data and research published in 2007/2008`?– Baseline? (how to define for a moving target)

• Theory:– Can we only do descriptive work for single platforms?– Look for the theory instead for the data?

• Meta– Systematic review of existing literature is needed

• Documentation– Timeframe generalizaion– Document time, cultures?– How long will my results be valid?– Have a general base for comparison