Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
NetworkanalysisofBrexitdiscussiononsocialmedia
KennethBenoitLondonSchoolofEconomicsandPoliticalScience,K.R.Benoit@lse.ac.ukAkitakaMatsuoLondonSchoolofEconomicsandPoliticalScience,[email protected] ABSTRACT:Weanalyzetheengagementofcitizens,mediaandpoliticiansonsocialmediaduringthecampaignperi-odofBrexitreferendumintheUKinJune2016.WefocusonthesocialmediaconversationonTwitter,the largestmicro-bloggingservicewithmorethan300millionactiveusersandapproximately500mil-lionnewmessagesgeneratedperday.WeanalysethenetworksofTweetsfocusingonboththetopicsusershavediscussedduringthereferendumcampaignandthenetworkoffollowerships.Usingmachinelearning,weclassifyusersasfavouringLeaveorRemain,andusethisinformationtocomparetheesti-matesofthetopicstheydiscussinthedebate.ThetopicanalysisrevealsthatthestructureoftopicsbyLeaveusersaremoredenselyrelatedthanthosebyRemainusers.Furthermore,byanalyzingthenet-workanalysisoffollowership,werevealthattheimportantaccountsintheBrexitdebatesuchasmediaandpoliticiansarecategoricallydifferentbetweenLeaveandRemain,indicatingafundamentallydivid-edmessagingcommunicationstructure,reinforcingworkdoneelsewhereonhowsocialnetworksformideological“echochambers”reinforcingone’spre-existingattitudes. KEYWORDS:SocialMedia,Brexit,NetworkAnalysis,TextAnalysis
2
1.Introduction
Inthispaper,weanalyzetheengagementofcitizens,mediaandpoliticiansonsocialmedia
duringthecampaignperiodofthe“Brexit”referendumheldintheUKinJune2016.Thepar-
ticularfocusisplacedonthesocialmediaconversationonTwitter,thelargestmicro-blogging
servicewithmorethan300millionactiveusersandover500millionnewmessagesgenerated
perday.WeanalysetraitsofthenetworksofTweetsfocusingonboththenetworkoftopics
usershavediscussedduring the referendumcampaignand thenetworkof followership.The
purposeofouranalysisistodetectthedifferencesbetweenLeaveandRemainforvariousas-
pectsofnetworkofTwittercommunications, inparticularthenetworkoftopicseachsideof
thedebate(LeaveorRemain)usedinTwitterconversations,andalsothenetworkoffollower-
followeerelations.TheformerfocusesonthestructureofTweetcontents,andthe latterfo-
cusesonthe informationenvironmentofTwitteruserswhoparticipated inconversationson
theBrexitreferendumonsocialnetworkingplatforms.
The analysis of topics reveals that the structure of topics by Leave supporters are more
denselyrelatedtooneanotherthanthoseusedbyRemainsupporters.Thenetworkanalysisof
followershiprevealsthattheimportantaccountsintheBrexitdebatesuchasmediaandpoliti-
ciansarecategoricallydifferentbetweenLeaveandRemain.Thisobserveddivisionindicatesa
fundamentallydividedstructureforconveyingandreceivingpoliticalcommunication, indicat-
ingthatsimilartoresultsshowninothercontexts(Flaxman,GoelandRao,2016;DelVicarioet
al.,2016;AllcottandGentzkow,2017),socialmediaintheBrexitcontextformedsimilarideo-
3
logical“echochambers”(Sunstein,2001)inwhichusersreinforcedtheirpre-existingattitudes
byfollowingaccountswithlike-mindedcontent.
2.Data
InordertocapturethediscussionontheBrexitreferenduminTwitterinitsentirety,weset
uptheTwitterdownloaderthroughanaccesstotheTwitter“firehose”,thehigh-capacitylive
streamofallTweets,whichguaranteedthedeliveryofallpostsmatchingthecapturecriteria.
AnotheroptiontocaptureTweetsistouseTwitter’sStreamingAPI1,whichdeliversarandom
sampleofTweetsuptoonepercentofallTweetsgeneratedatagiventime.
Table1.CaptureCriteria
TheuseofthestreamingAPIispreferredinmanystudiesbecauseanyTwitterusershavean
1https://dev.Twitter.com/streaming/overview
4
accesstothispublicAPI.Forourpurposes,thecapturingmethodalsoprovidedamoreaccu-
ratepictureofTwitterconversationforourresearchpurposes,becausethevolumeofTweets
inpronouncedeventssuchasBrexitmightexceedtheStreamingAPI’sratelimit.2Throughthe
trialandexploration,weselectedasetofsearchcriteriaforcapturingTweets,basedonacore
setofunambiguouslyBrexit-relatedterms.
The search termswe used are presented in Table 1. The capture terms consists of three
sets.The first isageneral search term (“Brexit”), thesecondconsistsofprominenthashtags
(hyperlinksonTwitterprefacedby the“”symbol) related toBrexit,and the thirdconsistsof
Twitterusernamesofgroupsandusersclearlysetupforthepurposesofcommunicatingabout
Brexit(e.g.@voteleave).
AnyTweetsthatcontainoneofthesetermswerecapturedinourdatacollectionwhichwe
startedonJanuary6,2016andranthroughJuly2016,themonthfollowingthereferendum.
OurdownloadeffortscapturedalargenumberofTweets.FromthestarttotheendofJune
2016,26millionTweetswerecollected,where10millionTweetsareoriginalTweets (asop-
posedtoretweets).OurTwitterdatahasaverylongtail.Thedataincludesabout3.5million
TwitterusersinwhichmajorityofaccountssentonlyasingleBrexit-relatedTweet.There are
a small number ofuserswhogeneratedaverylargenumberofTweets.Someofthehighest
volumeaccounts,whichgeneratedtensofthousandsofTweetsduringthistime,areobviously
spamaccounts,andwetrytoexcludetheseaccounts.3Figure1isatimeseriesofthevolume
2ForthedetailedcomparisonbetweenFirehoseandStreamingAPI,seeMorstatteretal.(2013)3Thesespamaccountstypicallyexclusivelysentre-tweets,andalsodespiteveryhighvolumeofTweets,
5
ofcapturedTweetsonthelogarithmicscale.Eachlineshowsthevolumeoforiginalpostsand
reTweetsoneachday.
ThedayaftertheBrexitreferendum(24June)recordedthelargestvolumeofTwittertrans-
actions.Over13millionTweetswererecordedonthedayofthereferendumitself.
Figure1.VolumeofBrexitTweets(Jan-Sep2016)
Note:y-axisisonalogarithmicscale
3.ClassifyingRemainv.LeaveAccounts
Thefirst,crucialtaskfromthelargecorpusofTweetswithmillionsofusersistopredictthe
theirnamesarerarelymentionedbyotherTwitteraccounts.
6
sides(LeaveorRemain)ofTwitteraccountsexpressedintheirTweetsonBrexitconversation.
We conduct this task by generating a training set consisting of human-classifed accounts to
generateacoretrainingset,andusingsupervisedmachinelearningtopredictthesideofthe
remainingusers.Theworkisconductedthroughtheseveralsteps.Thefirststepistoidentifya
fewhundred importantaccountsandmanuallyclassifytheseaccounts intoLeaveorRemain.
Weselectedtheseimportantaccountsbasedonthenumberofmentionstoauseraccountin
TweetsonBrexit.AusermentionisapartofTwittertextthatcontainsauser’sscreenname
followedbythe“@”character.ThisfeatureofTwittermakesaTweetwithareferencetospe-
cificuserssearchable,andisusedtoaimaTweetatamentioneduser.
Fortop200mostmentionedaccounts,ourresearchcollaboratorsvisitedinDecember2016
thewebsiteoftheirTwitteraccountstocheckwhethertheseaccountsarestillactiveandtheir
positionintheBrexitreferendum.Wefoundthatfifty-fiveaccountsareonLeaveandtwenty-
fiveareRemain.Fromthis listofknownaccounts,weextend the listofTwitteraccountson
eachsidethroughtheanalysisofhashtagusage.Ahashtag isanalpha-numericstringwitha
prependedhash (“#”) in socialmedia texts. InclusionofhashtagswillmakeTweetseasier to
findanddelivermessages tootherTwitteruserswho share the interestswith theuserwho
posted theoriginalTweetwithhashtags.We identify thehashtagsdisproportionallyusedby
oneoftwosidesbylookingatthehashtaguseofhuman-codedaccounts,4andidentifyother
4 Hashtags by the Remain side are #strongerin, #intogether, #infor, #votein, #libdems, #voting, #in-crowd,#bremain,and#greenerin.HashtagsusedbytheLeavesideare#voteleave,#inorout,#takecon-trol,#voteout,#takecontrol, #borisjohnson,#projecthope,#independenceday,#ivotedleave,#project-fear,#britain,#boris,#lexit,#go,#takebackcontrol,#labourleave,#no2eu,#betteroffout,#june23,and
7
TwitterusersactiveintheBrexitdiscussion(i.e.morethan50Tweetsinourdata).
Weuse this listofhashtags to finda largenumberofLeaveandRemainTwitteraccounts
fromtheaccountsactive intheBrexitreferendumdebateonTwitter,about15,000accounts
withmorethan50TweetsinourBrexitTwittercorpus.WecalculatedtheLeavehastagscore
ofeachaccountbycountingthenumberofLeavehashtagssubtractedbythenumberofRe-
mainhashtag.Weselect thetop (orbottom)tenpercentuseraccountsas theLeave (orRe-
main)asthetrainingdata.
ThenextstepistheclassificationofTwitteraccountsinto“Leave”and“Remain”usingthis
trainingdata.Forthesimplicityandinterpretabilityofoutcomes,weuseasimpleNaiveBayes
classiferwithaflatprior.Forcreatingthefeaturematrix,weemploya“bag-of-words”meth-
odology.WefirstcombineallTweetsfromeachaccountandthenextractfeatures.Sincethe
document featurematrix isextremely sparse (over97percent sparsity)evenonlyusinguni-
grams,wedonottryn-grams.Weapplylower-casingtoallwordsandthenstemthem,except
forTwitterspecificentities(hashtagsandmentions).Featureswithadocumentfrequencyless
than0.1percentwerealsoremoved.
Table2:AverageAccuracyByFeatureSelectionsFeatureType Accuracy NAllFeatures 0.93 64206HashtagsandMentions 0.88 18705Hashtags 0.85 7972Mentions 0.87 10733
#democracy. Since thesehashtags areused for trainingdata generation,we removed them from thefeaturesintheclassificationmodel.
8
Thenextstepistheselectionoffeatures.Theconcernistheoverfittingofthemodel.Even
afterthecleaningofthedata,thereareover60,000featuresleftforonly3,000trainingdocu-
ments.
Tocheckwhethertheinclusionofallfeaturesisappropriate,wetestedseveraldifferentse-
lections of features. The setting we have tested are: hashtag-only,mentions-only, hashtags
and menstions, and all features. For each of these four settings, we run a ten-fold cross-
validationtentimeseach,andcalculatedtheaverageaccuracyforout-of-samplepredictions.5
Table2reporttheoutcomes.
Allsettingsarereasonablyaccurate,buttheuseofall featuresoutperfomedtheotherap-
proaches,andwethereforedecidedtousethatmodelforgeneratingourLeaveversusRemain
labelsonthefullsetofuseraccounts.
We estimate themodelwith full training data. The estimatedmodel provide two sets of
predictions:thefirstistheprobablityofeachfeaturebelongtotheLeaveorRemain,andthe
secondisthepredictedprobabilityofeachaccountbelongstotheLeaveorRemainside.Fig-
ure2depictstherelativefrequenciesofthehastagsusedbyeachside,plottingthesizeofeach
hashtagproportionaltoitsrelativefrequency,asa“wordcloud”.Thepartitionsdistinguishthe
usagebythepartitionsofourclassifer.
5Sinceboth“Remain”and”Leave”areimportantoutcomes,wereporttheaccuracyratherthanotherstatisticsthatfocusonpositivepredictability,suchasprecisionorrecall.
9
Figure2.WordCloudofHighFrequencyHashtags
Note:Topseventyhashtagsineachcategory.Thecategoriesofhashtagsaregeneratedfromthepredictedproba-bilityofahashtagtobelongtoLeave(0-0.25:Remain,0.25-0.75:Neutral,0.75-1:Leave).Sizeofhashtagispropor-tionaltothefrequency.Colorindicatesthecategory.AftertrainingtheNBclassificationmachine,weapplyittoallthreemillionaccounts.Theclas-
sificationresultsatTweetlevelareshowninthetableandfigurebelow.AsFigure2indicates,
theTweetsbyLeaveaccountsexceedthevolumeofTweetsbyRemainaccountsinmostofthe
timeapproachingtothereferendum.ThegreaternumberoftotalTweetsbyRemain(Table4)
10
istheresultofthemassivevolumeofTweetsafterthereferendum.Thedayafterthereferen-
dum(24June,2016)recordedthehighestvolumeinourdataandthemajorityTweetsonthe
dayweremadebytheaccountsourmodelclassifiedasRemain.
Table3:ClassificationResultsofTweetsSide Count Oct.Remain 9780223 36.93%Neutral 7786297 29.40%Leave 8914207 33.66%
Figure3Time-seriesofLeaveandRemainTweets(JanJune2016)
Note:y-axisisonalogarithmicscale
11
4.Networkoftopics
Applying the Structural TopicModel (STM) by Roberts et al. (2014, 2015),we attempt to
identify thetopics inTwitterconversationsandtheirdynamicacrosstimeandbysideof the
debate. STM is amethodology that iswell-suited to identifying the topics in the debate on
Twitter, their associationswith each side, and their chronological evolution. The strengthof
STM,whichisbuiltuponothermodelstodetectthetopics,suchastheLatentDirichletAlloca-
tionmodel,isthatitallowstheincorporationofadditionalcovariatestoestimatetheeffectof
thesefactorsontheuseoftopics.
Thereareothermethods formeasuring topics that canbeused for thispurpose, suchas
Twitter-LDA(Zhaoetal.,2011)orAuthor-Topic-model(Xuetal.,2011).However,thesemod-
elsarenotparticularlysuitableforourresearchpurposes,becausethesemodelsaremorefo-
cusedon the classificationof individual documents, anddo soby setting restrictionson the
topic distributions of each document. A simple LDAmodel (Blei, Ng and Jordan, 2003) and
STM,which is based on the LDA, assume that each document is amixture of topics. This is
moresuitableforourdataaggregationstrategydetailedbelow.Also,thefeatureofSTMtoal-
lowtheuseofadditionalcovariatesasthedeterminantsofprevalenceofeachtopicisparticu-
larlyusefulforthisresearch,becauseoneofthemaininterestsforustofitatopicmodelisto
explorethetemporaldynamicsofevolutionoftopicsbyeachsideofLeaveandRemain.The
unitofobservationinthemodelisatextofcombinedTweetscreatedbyeachuserinagiven
12
week.6 TheTwitterusers’ positions inBrexit are theprediction from theNaiveBayesmodel
describedintheprevioussection.Therearetworeasonswhyweusethisaggregationstrategy.
First,bycombiningTweetsfromeachuser,wecanprovidemoreinformationtothemodelto
improve the performance of the classifier (Hong and Davison, 2010). Second, we combine
Tweetsforeachweeksothatwecancapturethedynamicsoftopics.Thesideisabinaryvari-
able that takesavalueofeitherLeaveorRemain.Neutralusersare removed fromthedata
becausemanyusersarethe lowfrequencyTweeters(onBrexit,at least)whomightnotpro-
videtheenoughinformationtoourNaiveBayesclassifiertoclearlydeterminetheirside.Inthe
modelspecification,weusedtwocovariatesandtheir interaction.OneisSideandanotheris
Week.TheWeekvariabletakesanintegervaluethat indicatesaweeknumberintheyearof
2016, and converted to a b-spline to prevent over fitting. An interaction between Side and
Weekisincludedasweexpectthatthetopicsdiscusseddonotmoveinparallelacrosssides.In
STM, thenumberof topics (k)has tobe fed,andwechosek=40after theexplorationofk
through10,20,30and40.
STM, likeothertopicmodels, isamethodofunsupervisedmachine learningthatdoesnot
requirepre-codedtrainingdata.Thismeansthat it isuptoresearchers’qualitative judgment
to interpret the estimated topicsmeans and give them substantively understandable labels
(Changetal.,2009;Wallachetal.,2009).Thejudgmentisbasedonthehigh-probabilityorex-
clusivewords foreach topic.Outof forty topicsweestimated, thirty-fiveseemtohavecon-
sistentthemeswhilefivedonotprovideconsistent,interpretablecontents.Dependingonthe
6Aweekstartsat12amGMTonMonday.
13
effectofSideontheprevalenceoftopics,wecategorizetopicsintoLeave,Remain,andNeu-
tral.Table4providesthelabelsweappliedtoourtopics(seetheappendixforamoredetailed
presentationoftheproportionoftopicsbyLeaveversusRemainside,usedtodeterminethe
associationinthetable).
Thetimevariable,anothercovariateofthemodel,allowsustocheckthedynamicsoftopics
byeachside.Thefollowingtwofiguresareexamplesofthetopicevolutions.Thehorizontalax-
isindicatestheweekoftheyearin2016(thereferendumwasheldonWeek26)andtheverti-
calaxisistheproportionoftopicintheentireTwitterconversationintheweek.Thefirstfigure
isonthetopic“Brexit:TheMovie”,whichisadocumentaryfilmcreatedbyafilmdirectorwho
advocatestheUKconcessionfromEU,hadasignificantspikeontheweekof itsreleasedate
(11May2017)fromLeave(blueline),whilethereisnoresponsetothistopicfromRemainside
Twitteraccounts(redline).
TherearealsomultipletopicsfromRemainsides.ThefollowingfigureisoneofsuchRemain
topics.Weconsiderthisisatopiconthe“FinancialRisk”orfinancialconcernsabouttheBrexit
becauseofitshighprobabilitywords(e.g.brexit,fears,vote,stocks,markets,global).Thetopic
hadbeenalwaysdiscussedbyRemainTwitterusersthroughoutthetime-perioduntilthetime
ofthereferendum,andtherewasahugespikeof this topic inacoupleofweeksbeforethe
referendum.
14
Table4DetectedTopics
15
Figure4:TopicEvolution“Brexit:TheMovie”
Note:y-axisindicatestheproportionoftopic
Figure5TopicEvolution“FinancialRisk”
Note:y-axisindicatestheproportionoftopic
16
5.Structureoftopics
STMallowstheestimatedtopics,meaningthatitcancorrectlyidentifytopicsthatco-occur
intheTweetsbyauser.Bycheckingthetopiccorrelations,wecanalsomapthenetworkstruc-
tureoftopics.Thefollowingfigureshowsthenetworkoftopics(Figure6).
Figure6Networkoftopics
Note:TheredcirclesareRemaintopics(judgedfromtheestimatedproportionsoftopics,describedabove),andbluecirclesareRemaintopics.Thewidthsofthelinesthatconnecttwotopicsindicateproximitybetweentwotopicsmeasuredbythestrengthofcorrelations.Thedistanceoftwotopicsintheplotalsoreflectstheproximity.
17
The figure indicates that theLeave topicsarestronglycorrelated.Thedensenetworkofa
numberofLeavetopicsonthebottomleft inthefigure indicatesthattheLeaveTweetshad
wellorganizedtopicstructurecenteredaround“VoteLeave”,thecoretopicbyLeaveside.
Incontrast,theRemaintopicsarerathersporadic.Therearecorrelatedtopicsofmobiliza-
tionattempts(middle-left),butthesearenotmuchrelatedtotopicsofsubstantialissues.For
remain side, topicson the substantial issuesaremostlyoneconomy (middle-top),but these
are rather separated from the core topic of remain side (“Stronger In” topic on the bottom
left). In termsof thedisseminationof topics on the socialmedia, the Leave side seemed to
havetheadvantageagainsttheRemainside.BecauseauserismorelikelytoTweetboththe
simplemessageoftakingaction(e.g.“VoteLeave”)andthesubstantialissuesthatexplainwhy
suchactionsarenecessary (e.g. “Jobsand social security”), followersof suchaccountwould
havereceivedthestrongmessageforLeave.
6.Analysisoffollowershipnetwork
Inthissection,weestimatethepositionsofTwitteraccountsthroughanetworkanalysisof
followershipduringtheBrexitreferendumcampaign.Thedataweusearethenetworkoffol-
lowers for 400 accounts that are the most frequently mentioned in the Brexit referendum
Tweets.Weuseamethodofnetwork-basedideologyscalingdevelopedinBarberà(2015).The
methodologyrecoversanissue/ideologyspacefromthenetworkoffollowership,andlocates
18
followersandfollowersinasingledimensionalspace.Thekeyassumptionofthismethodology
isthatTwitterusersprefertofollowaccountsthathaveopinionssimilartotheirownopinions.
Inthisspecificapplication,wethinkthatbyfocusingonthe importantaccounts inBrexitde-
bateselectedbythefrequencyofmentionstotheaccountsinTweetsrelatedtoBrexit,wecan
reproducetheideologicalspaceinBrexit(i.e.LeaveorRemain).
Theestimatedresultsfromthismodelconsistoftwoparts.
1.EstimatedpositionsofimportantTwitteraccounts
2.EstimatedpositionsofTwitteraccountswhofollowtheseimportantaccounts.
Wewillexploretheresultsinthisorder.
7.Ideologydistributionoffollowees
Thefollowingistheestimateofallfrequentlymentionedaccounts(Figure7).Thehorizontal
dimension is the position (Right = Remain). The vertical dimension is the popularity (high =
popular).
Thenwehighlightwhatweconsiderthepolitically influentialaccounts,namelyTwitterac-
countsofMPandmediaoutlets.
19
Figure7.NetworkbasedscalingofBrexitissuepositionsfrequentlymentioneduseraccounts
TheresultsaredepictedinFigure8.Themessageswecangetfromthispicturearethefol-
lowing:
• Conservatives MPs positions are very diverse. Their positions are distributed from
moderatelyLeavetomoderatelyRemain.
• LabourMPspositionsarerelativelycoherentandplacedasthefirmRemain.Anexcep-
tionwas Kate Hoey, whowas one of themost active figures in the Grassroots Out
group.
• OtherpartiesMPpositionsareinthelocationweexpect(SNP,Greens,UKIP).
20
• MediaaccountpositionsrangedfromfrommoderateLeave(TheSun)tofirmRemain
(FinancialTimes).
Figure8:NetworkbasedscalingofBrexitissuepositionsofpoliticallyimportantTwitterac-
counts
8.Distributionoffollowers
Wenowmoveontotheanalysisofthedistributionoffollowers,depictedinFigure9.There
aretwopeaksinthedistributionoffolloweraccounts.TheplotofthedistributionsofRemain
21
andLeaveusers(basedontheNaveBayesestimation)explainswherethesetwopeakscome
from.ThepeakontheleftcorrespondstotheLeaveusers,andpeakontherightcorresponds
to the Remain users. This result indicates that there are distinct patterns in the choice of
whomtofollowbytheuserswhoparticipatedintheBrexitdebateonTwitter.Leaveusersfol-
lowed Leave accountsmore, and therefore they aremore likely tobe exposed to the infor-
mationbroadcastedbyLeaveaccountsontheirTwitterfeed,thenasaresultmentionedthese
accountsfrequentlyintheirTweetstodisseminatetheinformationfurther.Asimilarpatternis
foundamongtheusersclassifiedaspro-Remain.
Figure9:Distributionofnon-eliteaccountsbasedonthenetworkscaling
22
Figure10:Distributionofnon-eliteaccountsbasedonthenetworkscalingbyside
9.ComparisonbetweenfollowershipnetworkonBrexitandallMPsandmedia
In theprevious sectionwe found that therearedistinctpatternsof followershipbyLeave
andRemainTwitterusers.Inthissubsection,weattempttoaddresstheoriginofthispattern,
inparticularwhetherthispatternhasanyrelationswiththenormalleft-rightideologicalspec-
trum,throughthecomparisonoftwoideologicalscaling,onefromtheLeaveandRemainposi-
tions intheBrexitTweetsnetworkandanotherfromthepositions.Followingthestrategyof
Barbera(2015),weobtainedthefollowershipnetworkdataofallMPsduringtheBrexitcam-
23
paignperiod (i.e.before the2017GeneralElection)andmajormediaoutlets,andestimated
thepositionsofTwitterusersinthisnetwork.
Thenextfigureshowsthedifferencesandsimilaritiesbetweentheideologicalpositionses-
timatedfromthesetwonetworks11.Thehorizontalaxis indicatesthepositioningofTwitter
users in thenetworkofBrexitTweets (where largervalues indicatebeingmorepro-Remain)
andtheverticalaxis indicatesthepositioningofTwitterusersinthefollowershipofMPsand
mediaaccounts.
Thelargervalueindicatesamoreconservativeposition.Thereisaclearcorrelationbetween
twoestimates.TheconservativeTwitterusersaremorelikelytotaketheLeaveposition.How-
ever,therelationsarenotstrong.Thecorrelationbetweenthesetwovariablesareonly0.396.
ThisresultsuggeststhatalthoughtheLeave-Remaindimensionispartiallythereflectionoflib-
eralconservativedimension,butotherconcerns,extractedasthetopicofeitherLeaveorRe-
main inourtopicmodelbutnotwell-representedbytheexistingparties,wouldhaveplayed
thekeyroleforsomecitizensintheUKtodecidetheirsideonthedebate.
10.Discussion
Our investigation of a large set of Twitter data yields important insights into the engage-
ment of citizens,media and politicians using Twitter for political communication during the
campaignperiodofBrexit referendum in theUK in June2016. Inparticular,wehave shown
thatboth thecontentand thenetworkof followerships reveal importantdifferencesamong
24
thesidesinthedebate,bothabouthowindividualsandorganizationsself-selecttheircontent
andwhatmessages they communicate, dependingonwhether they supported the Leaveor
RemainsidesinthedebateoverBrexit.
Ouranalysisofthenetworkoftopicsthatusershavediscussedduringthereferendumcam-
paignandthenetworkoffollowershipwasaimedatdetectingthedifferencesbetweenLeave
andRemain.Ouranalysis shows that thestructureof followershipnetworks isbothstrongly
dividedandcategoricallydifferentbetweentheLeaveandRemainsides.
OuranalysisofthetopicsusesinthedebaterevealsthatthestructureoftopicsbyLeaveus-
ersaremoredenselyrelatedtooneanotherthoseusedthatbyRemainusers.Inotherwords,
themessaginganddiscourseofthepro-Leavesidewasmoredenseandmorecoherentthan
the network of pro-Remain supporters, who advanced more diverse arguments in favor of
theirposition.Thismirrorsearlierworkonthesentimentof themessagesthat revealsmore
positive, less tentative,andmore future-looking rhetoricamongLeave supporters compared
toRemain.
Overall,ouranalysisofpoliticaldiscussiononTwitterhasrevealedthat thisplatformdoes
contain informative content in the formofpolitical communication thatmeaningfullydistin-
guishes thesides in theBrexitdebate. It isbothpossible topredict theorientationofauser
fromtheuser’sTweets,andpossibletodistinguishdistincttopicsanddistinctpositioningfrom
thecontentandfollowershipstructuresoftheuser’ssocialmediausage.
25
Figure11:ComparisonofBrexitideologyandnormalLeft-Rightideologybasedonnetwork-
basedscaling
26
References Allcott,H.andGentzkow,M.(2017).SocialMediaandFakeNewsinthe2016Election.JournalofEconomicPerspectives,31(2):211–236.Barberá,P.(2014).Birdsofthesamefeathertweettogether:BayesianidealpointestimationusingTwitterdata.PoliticalAnalysis,23(1),76-91.Blei,D.M.,Ng,A.Y.,&Jordan,M.I.(2003).Latentdirichletallocation.JournalofmachineLearningresearch,3(Jan),993-1022.Chang,J.,Gerrish,S.,Wang,C.,Boyd-Graber,J.L.,&Blei,D.M.(2009).Readingtealeaves:Howhumansinterprettopicmodels.InAdvancesinneuralinformationprocessingsystems(pp.288-296).DelVicario,Michela,AlessandroBessi,FabianaZollo,FabioPetroni,AntonioScala,GuidoCal-darelli,DelVicario,M.,Bessi,A.,Zollo,F.,Petroni,F.,Scala,A.,Caldarelli,G.,...&Quattrociocchi,W.(2016).Thespreadingofmisinformationonline.ProceedingsoftheNationalAcademyofScien-ces,113(3),554-559.Flaxman,S.,Goel,S.,&Rao,J.M.(2016).Filterbubbles,echochambers,andonlinenewscon-sumption.PublicOpinionQuarterly,80(S1),298-320.Hong,L.,&Davison,B.D.(2010,July).Empiricalstudyoftopicmodelingintwitter.InProcee-dingsofthefirstworkshoponsocialmediaanalytics(pp.80-88).ACM.Morstatter,F.,Pfeffer,J.,Liu,H.,&Carley,K.M.(2013,July).IstheSampleGoodEnough?ComparingDatafromTwitter'sStreamingAPIwithTwitter'sFirehose.InICWSM.Roberts,M.E.,BrandonM.S.andTingley,D.(2015).stm:RPackageforStructuralTopicModels.JournalofStatisticalSoftwareVV.Roberts,M.E.,Stewart,B.M.,Tingley,D.,Lucas,C.,Leder-Luis,J.,Gadarian,S.K.,...&Rand,D.G.(2014).StructuralTopicModelsforOpen-EndedSurveyResponses.AmericanJournalofPo-liticalScience,58(4),1064-1082.Sunstein,C.(2001).EchoChambers.PrincetonUniversityPress.
27
Wallach,H.M.,Murray,I.,Salakhutdinov,R.,&Mimno,D.(2009,June).Evaluationmethodsfortopicmodels.InProceedingsofthe26thannualinternationalconferenceonmachinelear-ning(pp.1105-1112).ACM.Xu,Z.,Ru,L.,Xiang,L.,&Yang,Q.(2011,August).Discoveringuserinterestontwitterwithamodifiedauthor-topicmodel.InProceedingsofthe2011IEEE/WIC/ACMInternationalConfe-rencesonWebIntelligenceandIntelligentAgentTechnology-Volume01(pp.422-429).IEEEComputerSociety.Zhao,W.X.,Jiang,J.,Weng,J.,He,J.,Lim,E.P.,Yan,H.,&Li,X.(2011,April).Comparingtwitterandtraditionalmediausingtopicmodels.InEuropeanConferenceonInformationRetrieval(pp.338-349).Springer,Berlin,Heidelberg.