Network analysis of Brexit discussion on social media · We analyze the engagement of citizens, media and politicians on social media during the campaign peri-od of Brexit referendum

NetworkanalysisofBrexitdiscussiononsocialmedia

KennethBenoitLondonSchoolofEconomicsandPoliticalScience,K.R.Benoit@lse.ac.ukAkitakaMatsuoLondonSchoolofEconomicsandPoliticalScience,[email protected] ABSTRACT:Weanalyzetheengagementofcitizens,mediaandpoliticiansonsocialmediaduringthecampaignperi-odofBrexitreferendumintheUKinJune2016.WefocusonthesocialmediaconversationonTwitter,the largestmicro-bloggingservicewithmorethan300millionactiveusersandapproximately500mil-lionnewmessagesgeneratedperday.WeanalysethenetworksofTweetsfocusingonboththetopicsusershavediscussedduringthereferendumcampaignandthenetworkoffollowerships.Usingmachinelearning,weclassifyusersasfavouringLeaveorRemain,andusethisinformationtocomparetheesti-matesofthetopicstheydiscussinthedebate.ThetopicanalysisrevealsthatthestructureoftopicsbyLeaveusersaremoredenselyrelatedthanthosebyRemainusers.Furthermore,byanalyzingthenet-workanalysisoffollowership,werevealthattheimportantaccountsintheBrexitdebatesuchasmediaandpoliticiansarecategoricallydifferentbetweenLeaveandRemain,indicatingafundamentallydivid-edmessagingcommunicationstructure,reinforcingworkdoneelsewhereonhowsocialnetworksformideological“echochambers”reinforcingone’spre-existingattitudes. KEYWORDS:SocialMedia,Brexit,NetworkAnalysis,TextAnalysis

2

1.Introduction

Inthispaper,weanalyzetheengagementofcitizens,mediaandpoliticiansonsocialmedia

duringthecampaignperiodofthe“Brexit”referendumheldintheUKinJune2016.Thepar-

ticularfocusisplacedonthesocialmediaconversationonTwitter,thelargestmicro-blogging

servicewithmorethan300millionactiveusersandover500millionnewmessagesgenerated

perday.WeanalysetraitsofthenetworksofTweetsfocusingonboththenetworkoftopics

usershavediscussedduring the referendumcampaignand thenetworkof followership.The

purposeofouranalysisistodetectthedifferencesbetweenLeaveandRemainforvariousas-

pectsofnetworkofTwittercommunications, inparticularthenetworkoftopicseachsideof

thedebate(LeaveorRemain)usedinTwitterconversations,andalsothenetworkoffollower-

followeerelations.TheformerfocusesonthestructureofTweetcontents,andthe latterfo-

cusesonthe informationenvironmentofTwitteruserswhoparticipated inconversationson

theBrexitreferendumonsocialnetworkingplatforms.

The analysis of topics reveals that the structure of topics by Leave supporters are more

denselyrelatedtooneanotherthanthoseusedbyRemainsupporters.Thenetworkanalysisof

followershiprevealsthattheimportantaccountsintheBrexitdebatesuchasmediaandpoliti-

ciansarecategoricallydifferentbetweenLeaveandRemain.Thisobserveddivisionindicatesa

fundamentallydividedstructureforconveyingandreceivingpoliticalcommunication, indicat-

ingthatsimilartoresultsshowninothercontexts(Flaxman,GoelandRao,2016;DelVicarioet

al.,2016;AllcottandGentzkow,2017),socialmediaintheBrexitcontextformedsimilarideo-

3

logical“echochambers”(Sunstein,2001)inwhichusersreinforcedtheirpre-existingattitudes

byfollowingaccountswithlike-mindedcontent.

2.Data

InordertocapturethediscussionontheBrexitreferenduminTwitterinitsentirety,weset

uptheTwitterdownloaderthroughanaccesstotheTwitter“firehose”,thehigh-capacitylive

streamofallTweets,whichguaranteedthedeliveryofallpostsmatchingthecapturecriteria.

AnotheroptiontocaptureTweetsistouseTwitter’sStreamingAPI1,whichdeliversarandom

sampleofTweetsuptoonepercentofallTweetsgeneratedatagiventime.

Table1.CaptureCriteria

TheuseofthestreamingAPIispreferredinmanystudiesbecauseanyTwitterusershavean

1https://dev.Twitter.com/streaming/overview

4

accesstothispublicAPI.Forourpurposes,thecapturingmethodalsoprovidedamoreaccu-

ratepictureofTwitterconversationforourresearchpurposes,becausethevolumeofTweets

inpronouncedeventssuchasBrexitmightexceedtheStreamingAPI’sratelimit.2Throughthe

trialandexploration,weselectedasetofsearchcriteriaforcapturingTweets,basedonacore

setofunambiguouslyBrexit-relatedterms.

The search termswe used are presented in Table 1. The capture terms consists of three

sets.The first isageneral search term (“Brexit”), thesecondconsistsofprominenthashtags

(hyperlinksonTwitterprefacedby the“”symbol) related toBrexit,and the thirdconsistsof

Twitterusernamesofgroupsandusersclearlysetupforthepurposesofcommunicatingabout

Brexit(e.g.@voteleave).

AnyTweetsthatcontainoneofthesetermswerecapturedinourdatacollectionwhichwe

startedonJanuary6,2016andranthroughJuly2016,themonthfollowingthereferendum.

OurdownloadeffortscapturedalargenumberofTweets.FromthestarttotheendofJune

2016,26millionTweetswerecollected,where10millionTweetsareoriginalTweets (asop-

posedtoretweets).OurTwitterdatahasaverylongtail.Thedataincludesabout3.5million

TwitterusersinwhichmajorityofaccountssentonlyasingleBrexit-relatedTweet.There are

a small number ofuserswhogeneratedaverylargenumberofTweets.Someofthehighest

volumeaccounts,whichgeneratedtensofthousandsofTweetsduringthistime,areobviously

spamaccounts,andwetrytoexcludetheseaccounts.3Figure1isatimeseriesofthevolume

2ForthedetailedcomparisonbetweenFirehoseandStreamingAPI,seeMorstatteretal.(2013)3Thesespamaccountstypicallyexclusivelysentre-tweets,andalsodespiteveryhighvolumeofTweets,

5

ofcapturedTweetsonthelogarithmicscale.Eachlineshowsthevolumeoforiginalpostsand

reTweetsoneachday.

ThedayaftertheBrexitreferendum(24June)recordedthelargestvolumeofTwittertrans-

actions.Over13millionTweetswererecordedonthedayofthereferendumitself.

Figure1.VolumeofBrexitTweets(Jan-Sep2016)

Note:y-axisisonalogarithmicscale

3.ClassifyingRemainv.LeaveAccounts

Thefirst,crucialtaskfromthelargecorpusofTweetswithmillionsofusersistopredictthe

theirnamesarerarelymentionedbyotherTwitteraccounts.

6

sides(LeaveorRemain)ofTwitteraccountsexpressedintheirTweetsonBrexitconversation.

We conduct this task by generating a training set consisting of human-classifed accounts to

generateacoretrainingset,andusingsupervisedmachinelearningtopredictthesideofthe

remainingusers.Theworkisconductedthroughtheseveralsteps.Thefirststepistoidentifya

fewhundred importantaccountsandmanuallyclassifytheseaccounts intoLeaveorRemain.

Weselectedtheseimportantaccountsbasedonthenumberofmentionstoauseraccountin

TweetsonBrexit.AusermentionisapartofTwittertextthatcontainsauser’sscreenname

followedbythe“@”character.ThisfeatureofTwittermakesaTweetwithareferencetospe-

cificuserssearchable,andisusedtoaimaTweetatamentioneduser.

Fortop200mostmentionedaccounts,ourresearchcollaboratorsvisitedinDecember2016

thewebsiteoftheirTwitteraccountstocheckwhethertheseaccountsarestillactiveandtheir

positionintheBrexitreferendum.Wefoundthatfifty-fiveaccountsareonLeaveandtwenty-

fiveareRemain.Fromthis listofknownaccounts,weextend the listofTwitteraccountson

eachsidethroughtheanalysisofhashtagusage.Ahashtag isanalpha-numericstringwitha

prependedhash (“#”) in socialmedia texts. InclusionofhashtagswillmakeTweetseasier to

findanddelivermessages tootherTwitteruserswho share the interestswith theuserwho

posted theoriginalTweetwithhashtags.We identify thehashtagsdisproportionallyusedby

oneoftwosidesbylookingatthehashtaguseofhuman-codedaccounts,4andidentifyother

4 Hashtags by the Remain side are #strongerin, #intogether, #infor, #votein, #libdems, #voting, #in-crowd,#bremain,and#greenerin.HashtagsusedbytheLeavesideare#voteleave,#inorout,#takecon-trol,#voteout,#takecontrol, #borisjohnson,#projecthope,#independenceday,#ivotedleave,#project-fear,#britain,#boris,#lexit,#go,#takebackcontrol,#labourleave,#no2eu,#betteroffout,#june23,and

7

TwitterusersactiveintheBrexitdiscussion(i.e.morethan50Tweetsinourdata).

Weuse this listofhashtags to finda largenumberofLeaveandRemainTwitteraccounts

fromtheaccountsactive intheBrexitreferendumdebateonTwitter,about15,000accounts

withmorethan50TweetsinourBrexitTwittercorpus.WecalculatedtheLeavehastagscore

ofeachaccountbycountingthenumberofLeavehashtagssubtractedbythenumberofRe-

mainhashtag.Weselect thetop (orbottom)tenpercentuseraccountsas theLeave (orRe-

main)asthetrainingdata.

ThenextstepistheclassificationofTwitteraccountsinto“Leave”and“Remain”usingthis

trainingdata.Forthesimplicityandinterpretabilityofoutcomes,weuseasimpleNaiveBayes

classiferwithaflatprior.Forcreatingthefeaturematrix,weemploya“bag-of-words”meth-

odology.WefirstcombineallTweetsfromeachaccountandthenextractfeatures.Sincethe

document featurematrix isextremely sparse (over97percent sparsity)evenonlyusinguni-

grams,wedonottryn-grams.Weapplylower-casingtoallwordsandthenstemthem,except

forTwitterspecificentities(hashtagsandmentions).Featureswithadocumentfrequencyless

than0.1percentwerealsoremoved.

Table2:AverageAccuracyByFeatureSelectionsFeatureType Accuracy NAllFeatures 0.93 64206HashtagsandMentions 0.88 18705Hashtags 0.85 7972Mentions 0.87 10733

#democracy. Since thesehashtags areused for trainingdata generation,we removed them from thefeaturesintheclassificationmodel.

8

Thenextstepistheselectionoffeatures.Theconcernistheoverfittingofthemodel.Even

afterthecleaningofthedata,thereareover60,000featuresleftforonly3,000trainingdocu-

ments.

Tocheckwhethertheinclusionofallfeaturesisappropriate,wetestedseveraldifferentse-

lections of features. The setting we have tested are: hashtag-only,mentions-only, hashtags

and menstions, and all features. For each of these four settings, we run a ten-fold cross-

validationtentimeseach,andcalculatedtheaverageaccuracyforout-of-samplepredictions.5

Table2reporttheoutcomes.

Allsettingsarereasonablyaccurate,buttheuseofall featuresoutperfomedtheotherap-

proaches,andwethereforedecidedtousethatmodelforgeneratingourLeaveversusRemain

labelsonthefullsetofuseraccounts.

We estimate themodelwith full training data. The estimatedmodel provide two sets of

predictions:thefirstistheprobablityofeachfeaturebelongtotheLeaveorRemain,andthe

secondisthepredictedprobabilityofeachaccountbelongstotheLeaveorRemainside.Fig-

ure2depictstherelativefrequenciesofthehastagsusedbyeachside,plottingthesizeofeach

hashtagproportionaltoitsrelativefrequency,asa“wordcloud”.Thepartitionsdistinguishthe

usagebythepartitionsofourclassifer.

5Sinceboth“Remain”and”Leave”areimportantoutcomes,wereporttheaccuracyratherthanotherstatisticsthatfocusonpositivepredictability,suchasprecisionorrecall.

9

Figure2.WordCloudofHighFrequencyHashtags

Note:Topseventyhashtagsineachcategory.Thecategoriesofhashtagsaregeneratedfromthepredictedproba-bilityofahashtagtobelongtoLeave(0-0.25:Remain,0.25-0.75:Neutral,0.75-1:Leave).Sizeofhashtagispropor-tionaltothefrequency.Colorindicatesthecategory.AftertrainingtheNBclassificationmachine,weapplyittoallthreemillionaccounts.Theclas-

sificationresultsatTweetlevelareshowninthetableandfigurebelow.AsFigure2indicates,

theTweetsbyLeaveaccountsexceedthevolumeofTweetsbyRemainaccountsinmostofthe

timeapproachingtothereferendum.ThegreaternumberoftotalTweetsbyRemain(Table4)

10

istheresultofthemassivevolumeofTweetsafterthereferendum.Thedayafterthereferen-

dum(24June,2016)recordedthehighestvolumeinourdataandthemajorityTweetsonthe

dayweremadebytheaccountsourmodelclassifiedasRemain.

Table3:ClassificationResultsofTweetsSide Count Oct.Remain 9780223 36.93%Neutral 7786297 29.40%Leave 8914207 33.66%

Figure3Time-seriesofLeaveandRemainTweets(JanJune2016)

Note:y-axisisonalogarithmicscale

11

4.Networkoftopics

Applying the Structural TopicModel (STM) by Roberts et al. (2014, 2015),we attempt to

identify thetopics inTwitterconversationsandtheirdynamicacrosstimeandbysideof the

debate. STM is amethodology that iswell-suited to identifying the topics in the debate on

Twitter, their associationswith each side, and their chronological evolution. The strengthof

STM,whichisbuiltuponothermodelstodetectthetopics,suchastheLatentDirichletAlloca-

tionmodel,isthatitallowstheincorporationofadditionalcovariatestoestimatetheeffectof

thesefactorsontheuseoftopics.

Thereareothermethods formeasuring topics that canbeused for thispurpose, suchas

Twitter-LDA(Zhaoetal.,2011)orAuthor-Topic-model(Xuetal.,2011).However,thesemod-

elsarenotparticularlysuitableforourresearchpurposes,becausethesemodelsaremorefo-

cusedon the classificationof individual documents, anddo soby setting restrictionson the

topic distributions of each document. A simple LDAmodel (Blei, Ng and Jordan, 2003) and

STM,which is based on the LDA, assume that each document is amixture of topics. This is

moresuitableforourdataaggregationstrategydetailedbelow.Also,thefeatureofSTMtoal-

lowtheuseofadditionalcovariatesasthedeterminantsofprevalenceofeachtopicisparticu-

larlyusefulforthisresearch,becauseoneofthemaininterestsforustofitatopicmodelisto

explorethetemporaldynamicsofevolutionoftopicsbyeachsideofLeaveandRemain.The

unitofobservationinthemodelisatextofcombinedTweetscreatedbyeachuserinagiven

12

week.6 TheTwitterusers’ positions inBrexit are theprediction from theNaiveBayesmodel

describedintheprevioussection.Therearetworeasonswhyweusethisaggregationstrategy.

First,bycombiningTweetsfromeachuser,wecanprovidemoreinformationtothemodelto

improve the performance of the classifier (Hong and Davison, 2010). Second, we combine

Tweetsforeachweeksothatwecancapturethedynamicsoftopics.Thesideisabinaryvari-

able that takesavalueofeitherLeaveorRemain.Neutralusersare removed fromthedata

becausemanyusersarethe lowfrequencyTweeters(onBrexit,at least)whomightnotpro-

videtheenoughinformationtoourNaiveBayesclassifiertoclearlydeterminetheirside.Inthe

modelspecification,weusedtwocovariatesandtheir interaction.OneisSideandanotheris

Week.TheWeekvariabletakesanintegervaluethat indicatesaweeknumberintheyearof

2016, and converted to a b-spline to prevent over fitting. An interaction between Side and

Weekisincludedasweexpectthatthetopicsdiscusseddonotmoveinparallelacrosssides.In

STM, thenumberof topics (k)has tobe fed,andwechosek=40after theexplorationofk

through10,20,30and40.

STM, likeothertopicmodels, isamethodofunsupervisedmachine learningthatdoesnot

requirepre-codedtrainingdata.Thismeansthat it isuptoresearchers’qualitative judgment

to interpret the estimated topicsmeans and give them substantively understandable labels

(Changetal.,2009;Wallachetal.,2009).Thejudgmentisbasedonthehigh-probabilityorex-

clusivewords foreach topic.Outof forty topicsweestimated, thirty-fiveseemtohavecon-

sistentthemeswhilefivedonotprovideconsistent,interpretablecontents.Dependingonthe

6Aweekstartsat12amGMTonMonday.

13

effectofSideontheprevalenceoftopics,wecategorizetopicsintoLeave,Remain,andNeu-

tral.Table4providesthelabelsweappliedtoourtopics(seetheappendixforamoredetailed

presentationoftheproportionoftopicsbyLeaveversusRemainside,usedtodeterminethe

associationinthetable).

Thetimevariable,anothercovariateofthemodel,allowsustocheckthedynamicsoftopics

byeachside.Thefollowingtwofiguresareexamplesofthetopicevolutions.Thehorizontalax-

isindicatestheweekoftheyearin2016(thereferendumwasheldonWeek26)andtheverti-

calaxisistheproportionoftopicintheentireTwitterconversationintheweek.Thefirstfigure

isonthetopic“Brexit:TheMovie”,whichisadocumentaryfilmcreatedbyafilmdirectorwho

advocatestheUKconcessionfromEU,hadasignificantspikeontheweekof itsreleasedate

(11May2017)fromLeave(blueline),whilethereisnoresponsetothistopicfromRemainside

Twitteraccounts(redline).

TherearealsomultipletopicsfromRemainsides.ThefollowingfigureisoneofsuchRemain

topics.Weconsiderthisisatopiconthe“FinancialRisk”orfinancialconcernsabouttheBrexit

becauseofitshighprobabilitywords(e.g.brexit,fears,vote,stocks,markets,global).Thetopic

hadbeenalwaysdiscussedbyRemainTwitterusersthroughoutthetime-perioduntilthetime

ofthereferendum,andtherewasahugespikeof this topic inacoupleofweeksbeforethe

referendum.

14

Table4DetectedTopics

15

Figure4:TopicEvolution“Brexit:TheMovie”

Note:y-axisindicatestheproportionoftopic

Figure5TopicEvolution“FinancialRisk”

Note:y-axisindicatestheproportionoftopic

16

5.Structureoftopics

STMallowstheestimatedtopics,meaningthatitcancorrectlyidentifytopicsthatco-occur

intheTweetsbyauser.Bycheckingthetopiccorrelations,wecanalsomapthenetworkstruc-

tureoftopics.Thefollowingfigureshowsthenetworkoftopics(Figure6).

Figure6Networkoftopics

Note:TheredcirclesareRemaintopics(judgedfromtheestimatedproportionsoftopics,describedabove),andbluecirclesareRemaintopics.Thewidthsofthelinesthatconnecttwotopicsindicateproximitybetweentwotopicsmeasuredbythestrengthofcorrelations.Thedistanceoftwotopicsintheplotalsoreflectstheproximity.

17

The figure indicates that theLeave topicsarestronglycorrelated.Thedensenetworkofa

numberofLeavetopicsonthebottomleft inthefigure indicatesthattheLeaveTweetshad

wellorganizedtopicstructurecenteredaround“VoteLeave”,thecoretopicbyLeaveside.

Incontrast,theRemaintopicsarerathersporadic.Therearecorrelatedtopicsofmobiliza-

tionattempts(middle-left),butthesearenotmuchrelatedtotopicsofsubstantialissues.For

remain side, topicson the substantial issuesaremostlyoneconomy (middle-top),but these

are rather separated from the core topic of remain side (“Stronger In” topic on the bottom

left). In termsof thedisseminationof topics on the socialmedia, the Leave side seemed to

havetheadvantageagainsttheRemainside.BecauseauserismorelikelytoTweetboththe

simplemessageoftakingaction(e.g.“VoteLeave”)andthesubstantialissuesthatexplainwhy

suchactionsarenecessary (e.g. “Jobsand social security”), followersof suchaccountwould

havereceivedthestrongmessageforLeave.

6.Analysisoffollowershipnetwork

Inthissection,weestimatethepositionsofTwitteraccountsthroughanetworkanalysisof

followershipduringtheBrexitreferendumcampaign.Thedataweusearethenetworkoffol-

lowers for 400 accounts that are the most frequently mentioned in the Brexit referendum

Tweets.Weuseamethodofnetwork-basedideologyscalingdevelopedinBarberà(2015).The

methodologyrecoversanissue/ideologyspacefromthenetworkoffollowership,andlocates

18

followersandfollowersinasingledimensionalspace.Thekeyassumptionofthismethodology

isthatTwitterusersprefertofollowaccountsthathaveopinionssimilartotheirownopinions.

Inthisspecificapplication,wethinkthatbyfocusingonthe importantaccounts inBrexitde-

bateselectedbythefrequencyofmentionstotheaccountsinTweetsrelatedtoBrexit,wecan

reproducetheideologicalspaceinBrexit(i.e.LeaveorRemain).

Theestimatedresultsfromthismodelconsistoftwoparts.

1.EstimatedpositionsofimportantTwitteraccounts

2.EstimatedpositionsofTwitteraccountswhofollowtheseimportantaccounts.

Wewillexploretheresultsinthisorder.

7.Ideologydistributionoffollowees

Thefollowingistheestimateofallfrequentlymentionedaccounts(Figure7).Thehorizontal

dimension is the position (Right = Remain). The vertical dimension is the popularity (high =

popular).

Thenwehighlightwhatweconsiderthepolitically influentialaccounts,namelyTwitterac-

countsofMPandmediaoutlets.

19

Figure7.NetworkbasedscalingofBrexitissuepositionsfrequentlymentioneduseraccounts

TheresultsaredepictedinFigure8.Themessageswecangetfromthispicturearethefol-

lowing:

• Conservatives MPs positions are very diverse. Their positions are distributed from

moderatelyLeavetomoderatelyRemain.

• LabourMPspositionsarerelativelycoherentandplacedasthefirmRemain.Anexcep-

tionwas Kate Hoey, whowas one of themost active figures in the Grassroots Out

group.

• OtherpartiesMPpositionsareinthelocationweexpect(SNP,Greens,UKIP).

20

• MediaaccountpositionsrangedfromfrommoderateLeave(TheSun)tofirmRemain

(FinancialTimes).

Figure8:NetworkbasedscalingofBrexitissuepositionsofpoliticallyimportantTwitterac-

counts

8.Distributionoffollowers

Wenowmoveontotheanalysisofthedistributionoffollowers,depictedinFigure9.There

aretwopeaksinthedistributionoffolloweraccounts.TheplotofthedistributionsofRemain

21

andLeaveusers(basedontheNaveBayesestimation)explainswherethesetwopeakscome

from.ThepeakontheleftcorrespondstotheLeaveusers,andpeakontherightcorresponds

to the Remain users. This result indicates that there are distinct patterns in the choice of

whomtofollowbytheuserswhoparticipatedintheBrexitdebateonTwitter.Leaveusersfol-

lowed Leave accountsmore, and therefore they aremore likely tobe exposed to the infor-

mationbroadcastedbyLeaveaccountsontheirTwitterfeed,thenasaresultmentionedthese

accountsfrequentlyintheirTweetstodisseminatetheinformationfurther.Asimilarpatternis

foundamongtheusersclassifiedaspro-Remain.

Figure9:Distributionofnon-eliteaccountsbasedonthenetworkscaling

22

Figure10:Distributionofnon-eliteaccountsbasedonthenetworkscalingbyside

9.ComparisonbetweenfollowershipnetworkonBrexitandallMPsandmedia

In theprevious sectionwe found that therearedistinctpatternsof followershipbyLeave

andRemainTwitterusers.Inthissubsection,weattempttoaddresstheoriginofthispattern,

inparticularwhetherthispatternhasanyrelationswiththenormalleft-rightideologicalspec-

trum,throughthecomparisonoftwoideologicalscaling,onefromtheLeaveandRemainposi-

tions intheBrexitTweetsnetworkandanotherfromthepositions.Followingthestrategyof

Barbera(2015),weobtainedthefollowershipnetworkdataofallMPsduringtheBrexitcam-

23

paignperiod (i.e.before the2017GeneralElection)andmajormediaoutlets,andestimated

thepositionsofTwitterusersinthisnetwork.

Thenextfigureshowsthedifferencesandsimilaritiesbetweentheideologicalpositionses-

timatedfromthesetwonetworks11.Thehorizontalaxis indicatesthepositioningofTwitter

users in thenetworkofBrexitTweets (where largervalues indicatebeingmorepro-Remain)

andtheverticalaxis indicatesthepositioningofTwitterusersinthefollowershipofMPsand

mediaaccounts.

Thelargervalueindicatesamoreconservativeposition.Thereisaclearcorrelationbetween

twoestimates.TheconservativeTwitterusersaremorelikelytotaketheLeaveposition.How-

ever,therelationsarenotstrong.Thecorrelationbetweenthesetwovariablesareonly0.396.

ThisresultsuggeststhatalthoughtheLeave-Remaindimensionispartiallythereflectionoflib-

eralconservativedimension,butotherconcerns,extractedasthetopicofeitherLeaveorRe-

main inourtopicmodelbutnotwell-representedbytheexistingparties,wouldhaveplayed

thekeyroleforsomecitizensintheUKtodecidetheirsideonthedebate.

10.Discussion

Our investigation of a large set of Twitter data yields important insights into the engage-

ment of citizens,media and politicians using Twitter for political communication during the

campaignperiodofBrexit referendum in theUK in June2016. Inparticular,wehave shown

thatboth thecontentand thenetworkof followerships reveal importantdifferencesamong

24

thesidesinthedebate,bothabouthowindividualsandorganizationsself-selecttheircontent

andwhatmessages they communicate, dependingonwhether they supported the Leaveor

RemainsidesinthedebateoverBrexit.

Ouranalysisofthenetworkoftopicsthatusershavediscussedduringthereferendumcam-

paignandthenetworkoffollowershipwasaimedatdetectingthedifferencesbetweenLeave

andRemain.Ouranalysis shows that thestructureof followershipnetworks isbothstrongly

dividedandcategoricallydifferentbetweentheLeaveandRemainsides.

OuranalysisofthetopicsusesinthedebaterevealsthatthestructureoftopicsbyLeaveus-

ersaremoredenselyrelatedtooneanotherthoseusedthatbyRemainusers.Inotherwords,

themessaginganddiscourseofthepro-Leavesidewasmoredenseandmorecoherentthan

the network of pro-Remain supporters, who advanced more diverse arguments in favor of

theirposition.Thismirrorsearlierworkonthesentimentof themessagesthat revealsmore

positive, less tentative,andmore future-looking rhetoricamongLeave supporters compared

toRemain.

Overall,ouranalysisofpoliticaldiscussiononTwitterhasrevealedthat thisplatformdoes

contain informative content in the formofpolitical communication thatmeaningfullydistin-

guishes thesides in theBrexitdebate. It isbothpossible topredict theorientationofauser

fromtheuser’sTweets,andpossibletodistinguishdistincttopicsanddistinctpositioningfrom

thecontentandfollowershipstructuresoftheuser’ssocialmediausage.

25

Figure11:ComparisonofBrexitideologyandnormalLeft-Rightideologybasedonnetwork-

basedscaling

26

References Allcott,H.andGentzkow,M.(2017).SocialMediaandFakeNewsinthe2016Election.JournalofEconomicPerspectives,31(2):211–236.Barberá,P.(2014).Birdsofthesamefeathertweettogether:BayesianidealpointestimationusingTwitterdata.PoliticalAnalysis,23(1),76-91.Blei,D.M.,Ng,A.Y.,&Jordan,M.I.(2003).Latentdirichletallocation.JournalofmachineLearningresearch,3(Jan),993-1022.Chang,J.,Gerrish,S.,Wang,C.,Boyd-Graber,J.L.,&Blei,D.M.(2009).Readingtealeaves:Howhumansinterprettopicmodels.InAdvancesinneuralinformationprocessingsystems(pp.288-296).DelVicario,Michela,AlessandroBessi,FabianaZollo,FabioPetroni,AntonioScala,GuidoCal-darelli,DelVicario,M.,Bessi,A.,Zollo,F.,Petroni,F.,Scala,A.,Caldarelli,G.,...&Quattrociocchi,W.(2016).Thespreadingofmisinformationonline.ProceedingsoftheNationalAcademyofScien-ces,113(3),554-559.Flaxman,S.,Goel,S.,&Rao,J.M.(2016).Filterbubbles,echochambers,andonlinenewscon-sumption.PublicOpinionQuarterly,80(S1),298-320.Hong,L.,&Davison,B.D.(2010,July).Empiricalstudyoftopicmodelingintwitter.InProcee-dingsofthefirstworkshoponsocialmediaanalytics(pp.80-88).ACM.Morstatter,F.,Pfeffer,J.,Liu,H.,&Carley,K.M.(2013,July).IstheSampleGoodEnough?ComparingDatafromTwitter'sStreamingAPIwithTwitter'sFirehose.InICWSM.Roberts,M.E.,BrandonM.S.andTingley,D.(2015).stm:RPackageforStructuralTopicModels.JournalofStatisticalSoftwareVV.Roberts,M.E.,Stewart,B.M.,Tingley,D.,Lucas,C.,Leder-Luis,J.,Gadarian,S.K.,...&Rand,D.G.(2014).StructuralTopicModelsforOpen-EndedSurveyResponses.AmericanJournalofPo-liticalScience,58(4),1064-1082.Sunstein,C.(2001).EchoChambers.PrincetonUniversityPress.

27

Wallach,H.M.,Murray,I.,Salakhutdinov,R.,&Mimno,D.(2009,June).Evaluationmethodsfortopicmodels.InProceedingsofthe26thannualinternationalconferenceonmachinelear-ning(pp.1105-1112).ACM.Xu,Z.,Ru,L.,Xiang,L.,&Yang,Q.(2011,August).Discoveringuserinterestontwitterwithamodifiedauthor-topicmodel.InProceedingsofthe2011IEEE/WIC/ACMInternationalConfe-rencesonWebIntelligenceandIntelligentAgentTechnology-Volume01(pp.422-429).IEEEComputerSociety.Zhao,W.X.,Jiang,J.,Weng,J.,He,J.,Lim,E.P.,Yan,H.,&Li,X.(2011,April).Comparingtwitterandtraditionalmediausingtopicmodels.InEuropeanConferenceonInformationRetrieval(pp.338-349).Springer,Berlin,Heidelberg.

Documents

Network analysis of Brexit discussion on social media · We analyze the engagement of citizens, media and politicians on social media during the campaign peri-od of Brexit referendum