‘Big Social Data’ in Context: Connecting Social Media Data and Other Sources

Axel Bruns and Tim Highfield
Social Media Research Group
Queensland University of Technology
Brisbane, Australia
a.bruns / t.highfield @ qut.edu.au
@snurb_dot_info / @timhighfield

Paper by Axel Bruns and Tim Highfield presented at the ACSPRI conference, Sydney, 7-10 Dec. 2014.

THE PROMISE OF BIG SOCIAL DATA

• Social media and big data:
  – Substantial growth in social media usage
  – User activity generates data and metadata
  – Readily accessible through APIs
  – New tools for processing and visualising big data at scale
• Emergence of social media analytics:
  – Large-scale tracking of public user activities
  – ‘Trending topics’, user sentiment, network influencers
  – Scholarly and commercial research
  – A ‘computational turn’ towards the digital humanities (David Berry)
  – Ethical concerns around profiling and content ownership

BIG DATA AND SOCIETY

• New methodologies:
  – Empirical, large-scale, real-time investigation
  – Data-led, comprehensive evaluation rather than small-scale sampling of public communication
  – But also: combined quantitative/qualitative approaches
  – Not studying the Internet, but studying society with the Internet (Richard Rogers)
• Applications:
  – Political engagement, especially during elections, crises, scandals
  – Crisis communication during natural and human-made disasters
  – Engagement with mainstream media: watching, reading, sharing, …
  – Brand communication, especially during brand crises
  – Identification of earthquakes (USGS), tracking of epidemics (Google)
  – …

SOCIAL MEDIA AND BEYOND

• Facebook, Twitter:
  – Useful but highly particular areas of online activity
  – Not necessarily generalisable to overall activity patterns
  – Current research approaches and API limitations introduce further biases
• E.g. publics on Twitter:
  – Micro: @reply and retweet conversations
  – Meso: follower/followee networks
  – Macro: #hashtag ‘communities’ (Bruns & Moe, 2014)
• Key needs in Twitter research:
  – Situation of hashtags in wider communicative ecology on Twitter
  – Day-to-day uses of Twitter, beyond and outside hashtags
  – Dynamics of everyday quasi-private, interpersonal, and/or public communication
  – Track impact of social and technological changes on these uses
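As a rough illustration of the micro and macro layers listed above, the sketch below tags a tweet's text by the markers it contains. The function name `classify_tweet` and the regular expressions are assumptions made for this example, not part of the project's tooling. The meso layer (follower/followee networks) cannot be read from tweet text, so it is deliberately left out.

```python
import re

# Simplified, assumed patterns; structured tweet metadata would be
# more reliable than matching on text alone.
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")
RETWEET = re.compile(r"^RT @(\w+)")


def classify_tweet(text):
    """Return the communicative layers suggested by a tweet's text."""
    layers = set()
    if RETWEET.match(text) or text.startswith("@"):
        layers.add("micro")  # retweet or @reply conversation
    if HASHTAG.search(text):
        layers.add("macro")  # participation in a hashtag 'community'
    return {
        "layers": sorted(layers),
        "mentions": MENTION.findall(text),
        "hashtags": [tag.lower() for tag in HASHTAG.findall(text)],
    }
```

A tweet with neither marker returns an empty `layers` list. That is the day-to-day activity "beyond and outside hashtags" which hashtag-based datasets miss.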

BIG DATA, RARE DATA?

• The political economy of social media research:
  – API-based data access is shaped to privilege certain approaches
  – Research funding is easier to obtain for specific, limited purposes
  – Longitudinal, ‘big’ data access requires ongoing, substantial funding and infrastructure
  – Exploratory, data-driven research is difficult to sell to most funding bodies
  – Also related to divergent resources available to different scholarly disciplines
• Most ‘difficult’ large-scale social media research is conducted by Facebook / Twitter and commercial research institutes

RESEARCH PROJECT

• ARC Future Fellowship:
  – Four-year project, $876,973
  – Axel Bruns (FF), Tim Highfield (Postdoc), Felix Münch (PhD1, 2014-2017); PhD2 (2015-2018) – enquire within

At the intersection of mainstream, niche, and social media, the processes by which public opinion forms and public debate unfolds are increasingly complex, and poorly understood. This project draws on large datasets and innovative methods to develop a new model of the Australian online public sphere.

• Also supported by ARC LIEF project:
  – Two-year project (2014/15; QUT, Curtin, Deakin, Swinburne) to develop comprehensive infrastructure for large-scale social media data analytics

WHAT DATA SOURCES ARE THERE?

• Data sources:
  – Facebook: user engagement with public pages (profile activity is semi-private)
  – Twitter:
    • hashtag, keyword, URL sharing datasets (public accounts only)
    • Australian network data; Australian firehose (public accounts only)
  – Other social media sources…
  – Experian Hitwise:
    • Australian Web browsing data (ISP-level, anonymous and opt-in panels, 1.5m users)
    • Australian Web searching data (same methodology)
  – Proprietary datasets:
    • Website analytics for major news sites (e.g. News Ltd. / Fairfax Digital sites)
  – Mainstream media monitoring:
    • Content databases for mainstream media coverage

PROJECT AGENDA

• Data sources:
  – Australian Internet browse / search patterns (Experian Hitwise)
  – Online news media reading patterns (Fairfax Digital)
  – Big social data on news sharing via social media (ARC LIEF)
• Multiple overlapping publics / networks:
  – What drives their formation and dissipation?
  – How do they interact and interweave?
  – How are they interleaved with the wider media ecology?
  – Social media do not contain publics: publics transcend social media

RESEARCH AIMS

• Methodological development (Y1):
  – How do we process and integrate these data?
    • Standard methods for gathering, processing, storing, analysing datasets
    • Regular, automated workflows
• Short term (Y2):
  – What happens as news breaks?
    • Search, browsing, reading, sharing patterns
    • Formation of ad hoc publics
• Medium term (Y3):
  – How do themes, topics, actors wax and wane?
    • Prominence in user activities
    • Development of stable issue publics
• Long term (Y4):
  – How do these patterns affect public opinion formation?
    • Predictable patterns, stable networks of interaction
    • Structural analysis of the online public sphere
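One possible first step towards the Y1 question of how to integrate these data is to align per-day activity counts from each source on a shared calendar. The sketch below is a minimal illustration under that assumption. The function `align_daily_counts` and the source names in the usage note are hypothetical, and the slides do not describe the actual data formats.

```python
from datetime import date


def align_daily_counts(sources, fill=0):
    """Merge {source: {day: count}} into {day: {source: count}}.

    Days missing from a source receive `fill`, so every row has the
    same keys. Filling with 0 is an assumption: it treats missing
    data as no activity, which may not hold for every source.
    """
    names = sorted(sources)
    all_days = sorted({day for series in sources.values() for day in series})
    return {
        day: {name: sources[name].get(day, fill) for name in names}
        for day in all_days
    }
```

For example, passing `{"search": {...}, "shares": {...}}` keyed by `datetime.date` values yields one row per day with both counts side by side, a convenient shape for the search, browsing, reading and sharing comparisons listed for Y2.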

HITWISE: NEWS SEARCHING TRENDS

HITWISE: NEWS BROWSING TRENDS

TWITTER: NEWS SHARING TRENDS

THE AUSTRALIAN TWITTERSPHERE

[Figure: network map with labelled clusters: Education; Agriculture; Literature; Adelaide / SA; Food; Wine; Beer; Parenting; Mums; PR; Netizens; Marketing; Investing; Real Estate; Home Business; Sole Traders; Self-Help; HR / Support; Followback; Urban Media; Utilities; Advertising; Business; Fashion; Beauty; Arts; Cinema; Journalists; Politics; Hard Right; Leftists; News; Cycling; Talkback; Music; TV; V8s; UFC; NRL; AFL; Football; Horse Racing; Cricket; NRU; Celebrities; Hillsong; Perth; Pop; Media; Teen Idols; Cody Simpson]

~140k Australian accounts with degree > 1000, as of Sep. 2013
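The degree threshold mentioned above can be illustrated with a small sketch. It assumes that "degree" means followers plus followees and that the network is available as a list of (follower, followee) pairs. Neither assumption is confirmed by the slides, and `high_degree_accounts` is a hypothetical name.

```python
from collections import Counter


def high_degree_accounts(follow_edges, threshold=1000):
    """Return accounts whose degree exceeds `threshold`.

    follow_edges: iterable of (follower, followee) pairs.
    Degree is assumed to be in-degree plus out-degree.
    """
    degree = Counter()
    for follower, followee in follow_edges:
        degree[follower] += 1  # one more account followed
        degree[followee] += 1  # one more follower gained
    return {account for account, d in degree.items() if d > threshold}
```

Filtering like this keeps a network map legible, but it also removes low-activity accounts from view, which is worth remembering when reading the clusters as a picture of the whole user base.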

Q&A (3 SEP. TO 7 OCT. 2014)

ABC NEWS (JUNE 2012 TO SEP. 2014)

DAILY TELEGRAPH (JUNE 2012 TO SEP. 2014)

NEXT STEPS

• Further data points:
  – More detailed data on search patterns (Experian Hitwise)
  – Readership patterns (Fairfax Digital sites)
  – Facebook audience engagement patterns with news pages
• Further analytical approaches:
  – Activity patterns around key issues and events (e.g. G20, AFC Asian Cup, ANZAC Day, Queensland state election)
  – Correlation of activity patterns across datasets
  – Computational modelling of patterns to identify cross-influence of activities on different platforms on each other
• Further theory development:
  – Ad hoc publics, issue publics, public sphericules in the Australian public sphere
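The "correlation of activity patterns across datasets" listed above could begin with something as simple as a lagged Pearson correlation between two daily series, for example searches and shares about the same topic. The sketch below is one such starting point under that assumption, not the project's method. The function names are hypothetical, and a high correlation on its own does not establish cross-influence.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(leader, follower)
    # Drop the last `lag` leader values and the first `lag` follower values
    # so that each remaining pair is offset by exactly `lag` steps.
    return pearson(leader[:-lag], follower[lag:])
```

Scanning `lag` over a small range (say 0 to 7 days) and comparing the coefficients gives a first indication of whether activity in one dataset tends to precede activity in another.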