A SURVEY OF TECHNIQUES FOR EVENT DETECTION IN TWITTERrali.iro.umontreal.ca/rali/sites/default/files/... · of techniques for event detection from Twitter streams. These techniques

Computational Intelligence, Volume 0, Number 0, 2013

A SURVEY OF TECHNIQUES FOR EVENT DETECTION IN TWITTER

FARZINDAR ATEFEH AND WAEL KHREICH

NLP Technologies Inc., Montreal, QC, Canada

Twitter is among the fastest-growing microblogging and online social networking services. Messages postedon Twitter (tweets) have been reporting everything from daily life stories to the latest local and global news andevents. Monitoring and analyzing this rich and continuous user-generated content can yield unprecedentedly valu-able information, enabling users and organizations to acquire actionable knowledge. This article provides a surveyof techniques for event detection from Twitter streams. These techniques aim at finding real-world occurrencesthat unfold over space and time. In contrast to conventional media, event detection from Twitter streams posesnew challenges. Twitter streams contain large amounts of meaningless messages and polluted content, which neg-atively affect the detection performance. In addition, traditional text mining techniques are not suitable, because ofthe short length of tweets, the large number of spelling and grammatical errors, and the frequent use of informaland mixed language. Event detection techniques presented in literature address these issues by adapting techniquesfrom various fields to the uniqueness of Twitter. This article classifies these techniques according to the event type,detection task, and detection method and discusses commonly used features. Finally, it highlights the need forpublic benchmarks to evaluate the performance of different detection approaches and various features.

Received 29 November 2012; Revised 17 April 2013; Accepted 11 July 2013

Key words: event detection, event identification, microblogs, monitoring social media, Twitter data stream.

1. INTRODUCTION

Microblogging is a broadcast medium that allows users to exchange small digitalcontent such as short texts, links, images, or videos. Although it is a relatively new com-munication medium compared with traditional media, microblogging has gained increasedattention among users, organizations, and research scholars in different disciplines. Thepopularity of microblogging stems from its distinctive communication services such asportability, immediacy, and ease of use, which allow users to instantly respond and spreadinformation with limited or no restrictions on content. Virtually any person witnessing orinvolved in any event is nowadays able to disseminate real-time information, which canreach the other side of the world as the event unfolds. For instance, during recent socialupheavals and crises, millions of people on the ground turned to Twitter to report and followsignificant events.

Twitter is currently the most popular and fastest-growing microblogging service, withmore than 140 million users producing over 400 million tweets per day—mostly mobile—as of June 2012.1 Twitter enables users to post status updates, or tweets, no longer than 140characters to a network of followers using various communication services (e.g., cell phones,e-mails, Web interfaces, or other third-party applications). While some users consider the140-character constraint as a severe limitation, many argue that it is the feature that setsTwitter apart—short information is easier to consume and faster to spread. Even thoughtweets are limited in size, Twitter is updated hundreds of millions of times a day by peopleall over the world, and its content varies tremendously based on user interests and behaviors

Address correspondence to Wael Khreich, NLP Technologies Inc., 52 Le Royer, Montreal, QC, Canada;e-mail: [email protected]

1 http://news.cnet.com/8301-1023_3-57448388-93/twitter-hits-400-million-tweets-per-day-mostly-mobile/ (accessedon November 8, 2012).

© 2013 Wiley Periodicals, Inc.

http://news.cnet.com/8301-1023_3-57448388-93/twitter-hits-400-million-tweets-per-day-mostly-mobile/

COMPUTATIONAL INTELLIGENCE

(Java et al. 2007; Krishnamurthy et al. 2008; Zhao and Rosson 2009). Therefore, Twitterstreams contain large and diverse amount of information ranging from daily-life stories tothe latest local and worldwide news and events (Hurlock and Wilson 2011; Kwak et al.2010).

Online social media sites (Facebook, Twitter, Youtube, etc.) have revolutionized theway we communicate with individuals, groups, and communities and altered everydaypractices (Boyd and Ellison 2007). Several recent workshops, such as Semantic Analy-sis in Social Media (Farzindar and Inkpen 2012), are increasingly focusing on the impactof social media on our daily lives. For instance, Twitter has changed the way peopleand businesses perform, seek advice, and create “ambient awareness” (a sort of virtualomnipresence) and reinforced the weak and strong tie of friendship (Geser 2011; McFedries2007; Thompson 2008). Unlike other media sources, Twitter messages provide timely andfine-grained information about any kind of event, reflecting, for instance, personal per-spectives, social information, conversational aspects, emotional reactions, and controversialopinions.

Monitoring and analyzing this rich and continuous flow of user-generated content canyield unprecedentedly valuable information, which would not have been available from tra-ditional media outlets. Tweets can be seen as a dynamic source of information enablingindividuals, corporations, and government organizations to stay informed of “what is hap-pening now.” For instance, people would be interested in getting advice, opinions, facts, orupdates on news or events (Java et al. 2007; Krishnamurthy et al. 2008; Zhao and Rosson2009). Companies are increasingly using Twitter to advertise and recommend products,brands, and services; to build and maintain reputations; to analyze users’ sentiment regard-ing their products (or those of their competitors); to respond to customers’ complaints; andto improve decision making and business intelligence (Farzindar 2012; Jansen et al. 2009;Jiang et al. 2011; Liu et al. 2012; Pak and Paroubek 2010). Twitter has also emerged as afast communication channel for gathering and spreading breaking news (Amer-Yahia et al.2012; Phuvipadawat and Murata 2010; Sankaranarayanan et al. 2009), for predicting elec-tion results, and for sharing political events and conversations (Small 2011; Tumasjan et al.2010). It has also become an important analytical tool for crime prediction (Wang et al.2012) and monitoring terrorist activities.2

In general, events can be defined as real-world occurrences that unfold over space andtime (Allan et al. 1998; Troncy et al. 2010; Xie et al. 2008; Yang et al. 1998). Event detectionfrom conventional media sources has been long addressed in the Topic Detection and Track-ing (TDT) research program, which mainly aims at finding and following events in a streamof broadcast news stories (Allan et al. 1998; Yang et al. 1998). However, event detectionfrom Twitter streams pose new challenges that are different from those faced by the eventdetection tasks in traditional media. In contrast with the well-written, structured, and editednews releases, Twitter messages are restricted in length and written by anyone. Therefore,tweets include large amounts of informal, irregular, and abbreviated words, large numberof spelling and grammatical errors, and improper sentence structures and mixed languages.In addition, Twitter streams contain large amounts of meaningless messages (Hurlock andWilson 2011), polluted content (Lee et al. 2011), and rumors (Castillo et al. 2011), whichnegatively affect the performance of the detection algorithms.

This article provides a survey of techniques found in the literature for event detectionfrom Twitter streams. These techniques are classified according to the event type (specified

2 http://www.nextgov.com/big-data/2012/07/pentagon-seeks-predict-terrorism-monitoring-facebook-twitter/57085/(accessed on November 8, 2012).

http://www.nextgov.com/big-data/2012/07/pentagon-seeks-predict-terrorism-monitoring-facebook-twitter/57085/

TECHNIQUES FOR EVENT DETECTION IN TWITTER

or unspecified), detection task (retrospective or new event detection), and detection method(supervised or unsupervised) as well as to the target application (Tables 1 and 2). Commonlyused feature representations are also presented and discussed (Table 2). Event detectionfrom Twitter streams is a vibrant research area that draws on techniques from various fieldssuch as machine learning, natural language processing, data mining, information extractionand retrieval, and text mining. This article does not provide an exhaustive review of existingapproaches but rather selects representative techniques to give the reader a perspective onthe main research directions.

Section 2 provides background information on Twitter and presents the motivation andchallenges associated with event detection from Twitter streams. Section 3 briefly describesrelevant techniques for event detection from traditional media outlets. In Section 4, thetechniques for detecting real-world events from Twitter data streams are described and cat-egorized according to the type of event, detection task, and detection methods with thecommonly used feature representations. Finally, Section 5 provides a general discussion,followed by the conclusion.

2. TWITTER

2.1. Service Overview

As described in Section 1, Twitter is currently the most popular fastest-growingmicroblogging service. It is the eighth most popular site in the world according to the3-month Alexa traffic rankings.3 This popularity is also reflected in the increasingly largenumber of research papers published about Twitter in various fields. Twitter’s core func-tion allows users to post and read short messages. Similar to blogging sites, Twitter allowssharing any kind of information, thoughts, opinions, and ideas to keep people updatedor informed of happenings. However, Twitter encourages concise description of ideas viatweets—short text messages that are no longer than 140 characters—which allow for effec-tive and timely communication of information. These tweets are automatically posted as astreams (and publicly accessible) on the user’s profile on Twitter and instantly sent to theuser’s network of followers. Messages can be posted on Twitter through various communi-cation services such as cell phones, emails, Web interfaces, or other third-party applications.In particular, tweets can be easily posted via mobile devices, such as short message servicemessages, providing an efficient medium for instant information dissemination and con-sumption. The portability, immediacy, and ease of use are among the main reasons behindTwitter’s popularity.

Twitter is also a unique online social networking service that allows people to createprofiles, communicate, and connect with other people on the service. The social relationshipon Twitter is asymmetric and can be conceptualized as a directed social network or followernetwork (Brzozowski and Romero 2011). A user can follow any other user without requiringan approval or a reciprocal connection from the followed users. Twitter does not impose anylimits on the number of followers to a user account; however, one user account can typicallyfollow up to 2000 users. There is, however, a user-dependent limit to follow additional usersbased on the ratio of followers to be followed.4 By default, posted messages are available toanyone. Although users can modify their privacy settings to only update their followers andto decide who can follow them, these are not commonly used. Users can consume tweets

3 http://www.alexa.com/siteinfo/twitter.com (accessed on November 8, 2012).4 https://support.twitter.com/entries/68916 (accessed on November 8, 2012).

http://www.alexa.com/siteinfo/twitter.com

https://support.twitter.com/entries/68916


TA

BL

E1.

Taxo

nom

yof

Eve

ntD

etec

tion

Tech

niqu

esin

Twit

ter.

Ref

eren

ces

Type

ofev

ent

Det

ecti

onm

etho

dD

etec

tion

task

App

lica

tion

Spe

cifi

edU

nspe

cifi

edS

uper

vise

dU

nsup

ervi

sed

NE

DR

ED

San

kara

nara

yana

net

al.(

2009

)x

xx

xB

reak

ing-

new

sde

tect

ion

Phu

vipa

daw

atan

dM

urat

a(2

010)

xx

xB

reak

ing-

new

sde

tect

ion

Petr

ovic

etal

.(20

10)

xx

xG

ener

al(u

nkno

wn)

even

tdet

ecti

onB

ecke

ret

al.(

2011

a)x

xx

xG

ener

al(u

nkno

wn)

even

tdet

ecti

onL

ong

etal

.(20

11)

xx

xG

ener

al(u

nkno

wn)

even

tdet

ecti

onW

eng

and

Lee

(201

1)x

xx

Gen

eral

(unk

now

n)ev

entd

etec

tion

Cor

deir

o(2

012)

xx

xG

ener

al(u

nkno

wn)

even

tdet

ecti

onPo

pesc

uan

dPe

nnac

chio

tti(

2010

)x

xx

Con

trov

ersi

alne

ws

even

tsab

out

cele

brit

ies

Pope

scu

etal

.(20

11)

xx

xC

ontr

over

sial

new

sev

ents

abou

tce

lebr

itie

sB

enso

net

al.(

2011

)x

xx

Mus

ical

even

tdet

ecti

onL

eean

dS

umiy

a(2

010)

xx

xG

eoso

cial

even

tmon

itor

ing

Sak

akie

tal.

(201

0)x

xx

Nat

ural

disa

ster

even

tsm

onit

orin

gB

ecke

ret

al.(

2011

)x

xx

Que

ry-b

ased

even

tret

riev

alM

asso

udie

tal.

(201

1)x

xx

Que

ry-b

ased

even

tret

riev

alM

etzl

eret

al.(

2012

)x

xx

Que

ry-b

ased

stru

ctur

edev

ent

retr

ieva

lG

uet

al.(

2011

)x

xx

Que

ry-b

ased

stru

ctur

edev

ent

retr

ieva

l


TA

BL

E2.

Sum

mar

yof

Det

ecti

onTe

chni

ques

and

Feat

ure

Rep

rese

ntat

ions

.

Ref

eren

ces

Det

ecti

onte

chni

ques

Gen

eral

feat

ures

Twit

ter-

spec

ific

feat

ures

San

kara

nara

yana

net

al.(

2009

)N

aive

Bay

escl

assi

fier

and

onli

neTe

rmve

ctor

Has

htag

san

dti

mes

tam

pscl

uste

ring

Phu

vipa

daw

atan

dM

urat

a(2

010)

Onl

ine

clus

teri

ngTe

rmve

ctor

,pro

per

noun

sH

asht

ags,

#fol

low

ers,

(con

vent

iona

lNE

R)

#ret

wee

tsan

dti

mes

tam

psPe

trov

icet

al.(

2010

)O

nlin

ecl

uste

ring

(bas

edon

loca

lity

#tw

eets

,#us

ers

and

entr

opy

–se

nsit

ive

hash

ing)

ofm

essa

ges

Bec

ker

etal

.(20

11a)

Onl

ine

clus

teri

ngan

dsu

ppor

tvec

tor

Term

vect

orH

asht

ags,

mul

ti-w

ord

mac

hine

clas

sifi

erha

shta

gsw

ith

spec

ial

capi

tali

zati

on,r

etw

eets

,re

plie

san

dm

enti

ons.

Lon

get

al.(

2011

)H

iera

rchi

cald

ivis

ive

clus

teri

ngW

ord

freq

uenc

yan

den

trop

yP

roba

bili

tyof

wor

doc

curr

ing

inha

shta

gsW

eng

and

Lee

(201

1)D

iscr

ete

wav

elet

anal

ysis

and

grap

hIn

divi

dual

wor

ds–

part

itio

ning

Cor

deir

o(2

012)

Con

tinu

ous

wav

elet

anal

ysis

and

–H

asht

agoc

curr

ence

sla

tent

Dir

ichl

etal

loca

tion

Pope

scu

and

Penn

acch

iott

i(20

10)

Gra

dien

tboo

sted

deci

sion

tree

sC

orre

lati

onof

targ

etev

ents

Pro

port

ion

ofno

uns,

verb

s,(o

ren

titi

es)

wit

hth

eW

eban

dqu

esti

ons,

bad

wor

ds,e

tc.;

trad

itio

naln

ews

med

ia#t

wee

t,#r

etw

eets

,#re

plie

s,#t

wee

tspe

rus

er,h

asht

ags;

prop

orti

onof

twee

tsan

dha

shta

gsin

volv

ing

buzz

ines

s,se

ntim

ent,

cont

rove

rsy


TA

BL

E2.

Con

tinu

ed.

Ref

eren

ces

Det

ecti

onte

chni

ques

Gen

eral

feat

ures

Twit

ter-

spec

ific

feat

ures

Pope

scu

etal

.(20

11)

Gra

dien

tboo

sted

deci

sion

tree

sPa

rt-o

f-S

peec

hta

ggin

gan

dre

gula

rR

elat

ive

posi

tion

alin

form

atio

n,le

ngth

expr

essi

ons

(in

addi

tion

toth

efe

atur

esof

snap

shot

,cat

egor

y,la

ngua

geus

edby

Pope

scu

etal

.(20

11))

(in

addi

tion

toth

efe

atur

esus

edby

Pope

scu

etal

.(20

11))

Ben

son

etal

.(20

11)

Fact

orgr

aph

mod

elan

dco

ndit

iona

lTe

rmve

ctor

sfo

rar

tist

nam

esW

ord

shap

e,pa

tter

nsfo

rem

otic

ons,

rand

omfi

elds

(ext

ract

edfr

omW

ikip

edia

)ti

me

refe

renc

es,v

enue

type

san

dfo

rci

tyve

nue

nam

es.

Lee

and

Sum

iya

(201

0)S

tati

stic

alm

odel

ing

ofno

rmal

–#T

wee

t,#C

row

d,#M

ovin

gCro

wd

crow

dbe

havi

orba

sed

onge

otag

sS

akak

ieta

l.(2

010)

Sup

port

vect

orm

achi

necl

assi

fier

–#W

ords

,#ke

ywor

dsan

dth

ew

ords

surr

ound

ing

user

squ

ery

Bec

ker

etal

.(20

11)

Rec

ursi

vequ

ery

cons

truc

tion

Term

freq

uenc

yan

dco

-loc

atio

nH

asht

ags

and

UR

LM

asso

udie

tal.

(201

1)G

ener

ativ

ela

ngua

gem

odel

ing

Em

otic

ons,

post

leng

th,s

hout

ing,

hype

rlin

ks,c

apit

aliz

atio

n,re

cenc

y,#r

epos

tsan

d#f

ollo

wer

sM

etzl

eret

al.(

2012

)Te

mpo

ralq

uery

expa

nsio

nte

chni

que

Bur

stin

ess

scor

eba

sed

onth

efr

eque

ncy

ofqu

ery

term

occu

rren

ceG

uet

al.(

2011

)E

vent

mod

elin

g(E

Tre

e)Te

rmve

ctor

and

n-gr

amm

odel

sR

epli

esto

twee

ts


by reading their timeline—a chronological streams of incoming tweets from everyone theyfollow.

Twitter has evolved over time and adopted suggestions originally proposed by users tomake the platform more flexible. It currently provides different ways for users to converseand interact by referencing each other in posted messages in a well-defined markup vocab-ulary. Placing the “@” symbol before a user name (also called a handle, “@username”)creates a mention or a reply link to the referenced user’s account. A mention is used any-where in the message for signaling that the mentioned user is also registered on Twitter.A reply is a special mention from one user in response to another user’s message startingwith the replied-to @username (Honeycutt and Herring 2009). Mentions are displayed inthe referenced user accounts to keep track of messages mentioning their names. Twitter alsoallows users to forward or retweet someone else’s tweet to their followers. It is commonlycarried out by using the RT prefix before the user name that originated the message, “RT@username.” Retweeting is a common practice on Twitter to share useful or interestinginformation while giving credit to the original user (Boyd et al. 2010). Topics on Twittercan be categorized by a hashtag, which is any keyword preceded by a hash sign “#” (e.g.,#nlptechnologies). Hashtags were developed as a means to create groupings on Twitter.Twitter users can use hashtag to indicate the subject of their messages, to collate tweets fromdifferent users on a shared subject, and to regularly track specific events in real time.

Twitter provides an application programming interface (API),5 which allows developersto programmatically access the public data streams as well as many features of the service.For instance, Twitter streaming API provides filtering by location, keywords, author, andothers. The availability of Twitter data has motivated significant research work in variousdisciplines and led to numerous applications and tools.

2.2. Twitter as a Source of Information

Twitter is becoming the microphone of the masses, which altered news production andconsumption (Murthy 2011). Many real-world examples have shown the effectiveness andthe timely information reported by Twitter during disasters and social movements. Repre-sentative examples include the bomb blasts in Mumbai in November 2008 (Oh et al. 2011),the flooding of the Red River Valley in the United States and Canada in March and April2009 (Starbird et al. 2010), the U.S. Airways plane crash on the Hudson river in January2009, the devastating earthquake in Haiti in 2010, the demonstrations following the IranianPresidential elections in 2009, and the “Arab Spring” in the Middle East and North Africaregion (Khondker 2011; Khan 2012).

Several studies have analyzed Twitter’s user intentions (Java et al. 2007; Krishnamurthyet al. 2008; Zhao and Rosson 2009; Kwak et al. 2010; Kaplan and Haenlein 2011). Forinstance, Java et al. (2007) categorized user intentions on Twitter into daily chatter, con-versations, sharing information, and reporting news. They also identified Twitter users asinformation sources, friends, and information seekers. Krishnamurthy et al. (2008) pre-sented similar classification of user intentions and also included evangelists and spammersthat are looking to follow anyone. According to Kaplan and Haenlein (2011), people aremotivated by the concept of ambient awareness—being updated about even the most trivialmatters in other peoples’ lives and by the platform for virtual exhibitionism and voyeurismprovided for both active contributors and passive observers. Many research efforts have alsofocused on Twitter user motivations in specific environments such as at work (Zhao and

5 https://dev.twitter.com/docs/streaming-apis.

https://dev.twitter.com/docs/streaming-apis


Rosson 2009; Lovejoy and Saxton 2012), during conferences (Reinhardt et al. 2009), and inpolitics (Tumasjan et al. 2010; Small 2011).

2.3. Challenges

Tweets have reported everything from daily life stories to latest local and worldwideevents. Twitter content reflects real-time events in our life and contains rich social infor-mation and temporal attributes. Monitoring and analyzing this rich and continuous flow ofuser-generated content can yield unprecedentedly valuable information. However, Twitterstreams contain large amounts of meaningless messages (pointless babbles) (Hurlock andWilson 2011) and rumors (Castillo et al. 2011). These are important to build users’ socialnetworks (Kaplan and Haenlein 2011) and may help to understand people’s reactions toevents, however they negatively affect event detection performance. Nevertheless, Twittercharacteristics and popularity are particularly alluring for spammers and other content pol-luters (Lee et al. 2011), to spread advertisements, pornography, viruses, and phishing, orjust to compromise system reputation (Benevenuto et al. 2010).

A major challenge facing event detection from Twitter streams is therefore to separatethe mundane and polluted information from interesting real-world events. In practice, highlyscalable and efficient approaches are required for handling and processing the increasinglylarge amount of Twitter data (especially for real-time event detection). Other challenges areinherent to Twitter’s design and usage. These are mainly due to the short length of tweetmessages, the frequent use of (dynamically evolving) informal, irregular, and abbreviatedwords, the large number of spelling and grammatical errors, and the use of improper sen-tence structure and mixed languages. Such data sparseness, lack of context, and diversityof vocabulary make the traditional text analysis techniques less suitable for tweets (Metzleret al. 2007). In addition, different events may enjoy different popularity among users andcan differ significantly in content, number of messages and participants, periods, inherentstructure, and causal relationships (Nallapati et al. 2004).

3. EVENT DETECTION IN TRADITIONAL MEDIA

This section provides a brief overview of event detection techniques applied to tra-ditional media outlets. Some of these techniques have been adapted to event detectionin Twitter. They are generally classified into document-pivot and feature-pivot techniquesdepending on whether they rely on document or temporal features.

3.1. Document-Pivot Techniques

Event detection has long been addressed in the TDT program (Allan 2002), an ini-tiative sponsored by the Defense Advanced Research Projects Agency, concerned withevent-based organization of textual news document streams. The motivation for the TDTresearch initiative was to provide core technology for news monitoring tools from multi-ple sources of traditional media (including newswire and broadcast news) to keep usersupdated about news and developments. Originally, the TDT consisted of three main tasks:segmentation, detection, and tracking. These tasks attempt to segment news text into cohe-sive stories, detect new (unforeseen) events, and track the development of a previouslyreported event.

According to TDT, the objective of event detection is to discover new or previouslyunidentified events, where each event refers to a specific thing that happens at a specifictime and place (Allan et al. 1998). Event detection consists of three major phases: datapreprocessing, data representation, and data organization or clustering (Yang et al. 1998;


Yang et al. 2002). Data preprocessing involves filtering out stopwords (on, of, are, etc.) andapplying words stemming and tokenization techniques.

Traditional data representation for event detection involves the term vectors or bag ofwords whose entries are nonzero if the corresponding terms appear in the document. Eachterm in the vector is typically weighted using the classical term frequency–inverse documentfrequency (tf-idf ) approach (Salton 1988), which evaluates how important a word is to adocument in a corpus. However, the term vector model is subject to the curse of dimension-ality when the text is long. More importantly, the temporal order of words and the semanticand syntactic features of the text (e.g., named entity and grammar) are discarded by theterm vectors; thus, the model may not capture the similarity (or dissimilarity) among related(or unrelated) events. For instance, it would be difficult for the term vector to distinguishbetween two different airplane crashes. In fact, Allan, Lavrenko, Marlin, and Swan (2000)presented an upper bound for full-text similarity, which leads to exploring other data repre-sentation techniques such as semantical and contextual features (Allan, Lavrenko, and Jin2000).

The named entity vector is an alternative data representation (Kumaran and Allan 2004),which attempts to extract information answering the 4Ws questions: who, what, when, andwhere (Mohd 2007). Both term and named entity vectors have been integrated into a mixedvector (Yang et al. 1999; Kumaran and Allan 2004). Probabilistic representations have alsobeen applied including language models (Leek et al. 2002) and more advanced probabilisticframeworks that incorporate both content and time information (Li et al. 2005). Similaritybetween events is measured using traditional metrics such as the Euclidean distance, Pear-son’s correlation coefficient, and cosine similarity. More recently, other similarity measureshave been proposed such as the Hellinger distance (Brants et al. 2003) and the clusteringindex (Jo and Lee 2007).

In TDT, event detection has been broadly divided into two categories: retrospectiveevent detection (RED) and new event detection (NED)6 (Yang et al. 1998; Allan et al. 1998).RED focuses on discovering previously unidentified events from accumulated historicalcollections (Yang et al. 1998), while NED involves the discovery of new events from livestreams in (near) real time (Allan et al. 1998). Clustering-based algorithms (Berkhin 2002;Aggarwal and Zhai 2012) have been mainly employed for both RED and NED tasks.

Retrospective event detection involves iterative clustering algorithms that require theentire document collection, to organize the documents into topic clusters. Hierarchical clus-tering approaches, such as the bottom-up hierarchical agglomerative clustering (HAC) (Jainand Dubes 1988), have been widely employed for this task. At first, each single data pointis represented by a cluster, and then closest clusters are merged on the basis of the simi-larity measure until all data points become a single cluster or some termination criteria aresatisfied. Several variations of the HAC algorithm have been employed in TDT tasks. Forinstance, Yang et al. (1998) employed the group average clustering to detect news eventsfrom accumulated news stories. A two-layer HAC approach based on affinity propagationhas also been proposed to reduce the false positives; however, at the expense of increasedcomplexity (Dai et al. 2010). The traditional k-means algorithm and its variants, such ask-median and k-means++, have also been applied (Bouras and Tsogkas 2010).

New event detection has been characterized as a query-free retrieval tasks because theevent information is not known a priori and hence cannot be expressed as an explicit query(Allan et al. 1998). In contrast to RED, NED must provide decisions (new or old events)as documents arrive. Therefore, the employed clustering approaches are typically based on

6 Also known as first-story detection or novelty detection.


incremental (greedy) algorithms that process the input streams sequentially and merge anevent with the most similar one or create a new cluster if the similarity measure exceedsa predefined threshold (Allan et al. 1998). In practice, such approach could be time andresource intensive and may be unfeasible without employing specialized techniques forimproved system efficiency (Luo et al. 2007). For instance, using a sliding time windowover old stories and comparing the new story with the most recent number of stories wouldalleviate the resource requirements (Yang et al. 1998; Papka 1999; Luo et al. 2007). Theunderlying assumption is that the occurrence of stories related to the same event should beclose in time. Other techniques for improved system efficiency include limiting the num-ber of terms per document, limiting the number of total terms kept, and employing parallelprocessing (Luo et al. 2007).

These event detection techniques are typically called document-pivot techniques,because they detect events by clustering documents on the basis of their textual similarity.However, the TDT line of research assumes that all documents are relevant and contain someold or new events of interest (Allan, Lavrenko, and Jin 2000). This assumption is clearlyviolated in Twitter data streams, where relevant events are buried in large amounts of noisydata (Becker et al. 2011b; Castillo et al. 2011; Hurlock and Wilson 2011; Lee et al. 2011).In addition, these techniques are not designed to handle the speed and scale requirements ofsocial media.

3.2. Feature-Pivot TechniquesTrend detection tasks over textual data collection generally aim to identify topic

areas that were previously unseen or rapidly growing in importance within the corpus(Kontostathis et al. 2004). Recently, there has been a significant interest in bursty eventdetection techniques in traditional media (Kleinberg 2002; Fung et al.2005; He, Chang,and Lim 2007; He, Chang, Lim, and Zhang 2007; Wang et al. 2007; Goorha and Ungar2010). These feature-pivot techniques model an event in text streams as a bursty activ-ity, with certain features rising sharply in frequency as the event emerges. An event istherefore conventionally represented by a number of keywords showing burst in appear-ance counts (Kleinberg 2002). The underlying assumption is that some related words wouldshow an increased usage as an event occurs. Different from traditional RED and NEDapproaches, these techniques analyze feature distributions and discover events by groupingbursty features with identical trends.

In his seminal work, Kleinberg (2002) proposed in infinite-state automaton to modelthe arrival times of documents in a streams to identify bursts that have high intensity overlimited durations of time. The states of the probabilistic automaton correspond to the fre-quencies of individual words, while the state transitions capture the burst, which correspondto a significant change in word frequency. Fung et al. (2005) modeled word appearanceas binomial distribution, identified the bursty words according to a heuristic-based thresh-old, and grouped bursty features to find bursty events. He, Chang, Lim, and Zhang (2007)applied spectral analysis using discrete Fourier transformation (DFT) to categorize featuresfor different event characteristics (e.g., important or not, and periodic or aperiodic events).DFT converts the signals from the time domain into the frequency domain, such that a burstin the time domain corresponds to a spike in the frequency domain. However, DFT can-not identify the period of a bursty event. Therefore, He, Chang, and Lim (2007) employedGaussian mixture models to identify feature bursts and their associated periods. Snowsillet al. (2010) presented an online approach for detecting events in news streams based onstatistical significant tests of n-gram word frequency within a time frame. An incrementalsuffix tree data structure was applied to reduce the time and space constraints required foronline detection.


Direct application of feature-pivot techniques to Twitter streams may not be suitable,because the temporal distributions of features are very noisy and not all bursts are relevantevents of interest.

4. EVENT DETECTION IN TWITTER

This section describes various techniques proposed for event detection from Twitterstreams. Table 1 presents a taxonomy of these techniques according to the type of event,detection task, and detection methods. Depending on the type of events, these techniquesare classified into unspecified and specified event detection. According to the detectiontask and target application, they are classified into RED and NED techniques. Further-more, depending on the detection method, the presented techniques are also categorized intosupervised and unsupervised (or a combination of both) techniques (Table 1). In addition,the main event detection methods are further illustrated in Table 2 along with the featurerepresentations, which are divided into general and Twitter-specific features.

4.1. Unspecified versus Specified Event

Depending on the available information on the event of interest, event detection can beclassified into specified and unspecified techniques, as shown in columns 2 and 3 of Table 1.Because no prior information is available about the event, the unspecified event detectiontechniques rely on the temporal signal of Twitter streams to detect the occurrence of a real-world event. These techniques typically require monitoring for bursts or trends in Twitterstreams, grouping the features with identical trend into events, and ultimately classifyingthe events into different categories. On the other hand, the specified event detection relies onspecific information and features that are known about the event, such as a venue, time, type,and description, which are provided by the user or from the event context. These features canbe exploited by adapting traditional information retrieval and extraction techniques (such asfiltering, query generation and expansion, clustering, and information aggregation) to theunique characteristics of tweets.

The following subsections describe the techniques for unspecified and specified eventdetection. These techniques are then further classified according to the detection task(Section 4.2) and detection methods (Section 4.3)

4.1.1. Unspecified Event Detection. The nature of Twitter posts reflect events as theyunfold; hence, these tweets are particularly useful for unknown event detection. Unknownevents of interest are typically driven by emerging events, breaking news, and general topicsthat attract the attention of a large number of Twitter users. Because no event information isavailable, unknown events are typically detected by exploiting the temporal patterns or sig-nal of Twitter streams. New events of general interest exhibit a burst of features in Twitterstreams yielding, for instance, a sudden increased use of specific keywords. Bursty featuresthat occur frequently together in tweets can then be grouped into trends (Mathioudakis andKoudas 2010). In addition to trending events, endogenous or nonevent trends are also abun-dant on Twitter (Naaman et al. 2011). Techniques for unspecified event detection in Twittermust therefore discriminate trending events of general interest from the trivial or noneventtrends (exhibiting similar temporal pattern) using scalable and efficient algorithms. Thetechniques described in the following text attempted to address these challenges.

Sankaranarayanan et al. (2009) proposed a news processing system based on Twitter,called TwitterStand, to capture tweets that correspond to late breaking news. They employ anaive Bayes classifier to separate news from irrelevant information and an online clustering


algorithm based on weighted term vector according to tf-idf and cosine similarity to formclusters of news. In addition, hashtags are used to reduce clustering errors. Clusters arealso associated with time information for management and for determining the clusters ofinterest. Other issues addressed include removing the noise and determining the relevantlocations associated with the tweets.

Phuvipadawat and Murata (2010) presented a method to collect, group, rank, and trackbreaking news from Twitter. They first sample tweets (through Twitter streaming API) usingpredefined search queries, for example, “#breakingnews” and “#breaking news” keyword,and index their content with Apache Lucene.7 Messages that are similar to each other arethen grouped together to form a news story. Similarity between messages are based ontf-idf with an increased weight for proper noun terms, hashtags, and usernames. Propernouns are identified using the Stanford Named Entity Recognizer (NER) trained on conven-tional news corpora. They use a weighted combination of number of followers (reliability)and the number of retweeted messages (popularity) with a time adjustment for the fresh-ness of the message to rank each cluster. New messages are included in a cluster if theyare similar to the first message and to the top-k terms in that cluster. The authors stressthe importance of proper nouns identification to enhance the similarity comparison betweentweets and hence improve the overall system accuracy. An application based on the proposedmethod called Hot-streams has been developed.

Petrovic et al. (2010) adapted the online NED approach proposed for news media (Allan,Lavrenko, and Jin 2000), which is based on cosine similarity between documents to detectnew events that have never appeared in previous tweets. They focused on improving theefficiency of online NED algorithm and proposed a constant time and space approach basedon an adapted variant of the locality sensitive hashing methods (Gionis et al. 1999), whichlimits the search to a small number of documents. However, they did not consider replies,retweets, and hashtags in their experiments or the significance of newly detected events (e.g.,trivial or not). Results have shown that ranking according to the number of users is betterthan ranking according to the number of tweets and considering entropy of the messagereduces the amount of spam messages in output.

Becker et al. (2011a) focused on online identification of real-world event content andits associated Twitter messages using an online clustering technique, which continuouslyclusters similar tweets and then classifies the clusters content into real-world events ornonevents. These nonevents involve Twitter-centric topics, which are trending activitiesin Twitter that do not reflect any real-world occurrences (Naaman et al. 2011). Twitter-centric activities are difficult to detect, because they often share similar temporal distributioncharacteristics with real-world events. Their clustering approach is based on a classical(threshold-based) incremental clustering algorithm that has been proposed for NED in newsdocuments (Allan et al. 1998). Each message is represented as a tf-idf weight vector of itstextual content, and cosine similarity is used to compute the distance from a message tocluster centroids. In addition to traditional preprocessing steps such as stop-word elimina-tion and stemming, the weight of hashtag terms are doubled because they are considered astrong indication of the message content. The authors combined temporal, social, topical,and Twitter-centric features. The temporal features rely on term frequency that appear in theset of messages associated with a cluster over time. The social features include the percent-age of messages containing users interaction (i.e., retweets, replies, and mentions) out of allmessages in a cluster. The topical features are based on the hypothesis that event clusterstend to revolve around a central topic, whereas nonevent clusters often center around various

7 http://lucene.apache.org.

http://lucene.apache.org.


common terms (e.g., “sleep” or “work”) that do not reflect a single theme. The Twitter-centric features are based on the frequency of multiword hashtags with special capitalization(e.g., #BadWrestlingNames). Because the clusters constantly evolve over time, the featuresare periodically updated for old clusters and computed for newly formed ones. Finally, asupport vector machine (SVM) classifier is trained on a labeled set of cluster features andused to decide whether the cluster (and its associated messages) contains real-world eventinformation.

Long et al. (2011) adapted a traditional clustering approach by integrating some spe-cific features to the characteristics of microblog data.8 These features are based on “topicalwords,” which are more popular than others with respect to an event. Topical words areextracted from daily messages on the basis of word frequency, word occurrence in hash-tag, and word entropy. A (top-down) hierarchical divisive clustering approach is employedon a co-occurrence graph (connecting messages in which topical words co-occur) to dividetopical words into event clusters. To track changes among events at different time, amaximum-weighted bipartite graph matching is employed to create event chains, with avariation of Jaccard coefficient as similarity measures between clusters. Finally, cosine sim-ilarity augmented with a time interval between messages is used to find the top-k mostrelevant posts that summarize an event. These event summaries are then linked to eventchain clusters and plotted on the time line. For event detection, the authors found that top-down divisive clustering outperforms both k-means and traditional hierarchical clusteringalgorithms.

Weng and Lee (2011) proposed an event detection based on clustering of discretewavelet signals built from individual words generated by Twitter. In contrast withFourier transforms, which have been proposed for event detection from traditional media(Section 3.2), wavelet transformations are localized in both time and frequency domain andhence able to identify the time and the duration of a bursty event within the signal. Waveletsconvert the signals from the time domain to time-scale domain, where the scale can beconsidered as the inverse of frequency. Signal construction is based on time-dependent vari-ant of document frequency–inverse document frequency (DF-IDF), where DF counts thenumber of tweets (document) containing a specific word, while IDF accommodates wordfrequency up to the current time step. A sliding window is then applied to capture the changeover time using the H-measure (normalized wavelet entropy). Trivial words are filteredout on the basis of (a threshold set on) signals cross-correlation, which measure similaritybetween two signals as function of a time lag. The remaining words are then clustered toform events with a modularity-based graph partitioning technique, which splits the graphinto subgraphs each corresponding to an event. Finally, significant events are detected on thebasis of the number of words and the cross-correlation among the words related to an event.

Similarly, Cordeiro (2012) proposed a continuous wavelet transformation based onhashtag occurrences combined with a topic model inference using latent Dirichlet allocation(LDA) (Blei et al. 2003). Instead of individual words, hashtags are used for building waveletsignals. An abrupt increase in the number of a given hashtag is considered a good indicatorof an event that is happening at a given time. Therefore, all hashtags were retrieved fromtweets and then grouped in intervals of 5 minutes. Hashtag signals are constructed over timeby counting the hashtag mentions in each interval, grouping them into separated time series(one for each hashtag), and concatenating all tweets that mention the hashtag during eachtime series. Adaptive filters are then used to remove noisy hashtag signals, before applyingthe continuous wavelet transformation and getting a time-frequency representation of the

8 The authors applied their approach to Sina (http://t.sina.com.cn), a popular microblog in China.


signal. Next, wavelet peak and local maxima detection techniques are used to detect peaksand changes in the hashtag signal. Finally, when an event is detected within a given timeinterval, LDA is applied to all tweets related to the hashtag in each corresponding time seriesto extract a set of latent topics, which provide an improved summary of event description.

4.1.2. Specified Event Detection. Specified event detection includes known orplanned social events. These events could be partially or fully specified with the related con-tent or metadata information such as location, time, venue, and performers. The techniquesdescribed later attempt to exploit Twitter textual content or metadata information or both,using a wide range of machine learning, data mining, and text analysis techniques.

Popescu and Pennacchiotti (2010) focused on identifying controversial events that pro-voke public discussions with opposing opinions in Twitter, such as controversies involvingcelebrities. Their detection framework is based on the notion of a Twitter snapshot, a tripletconsisting of a target entity (e.g., Barack Obama), a given period (e.g., 1 day), and a set oftweets about the entity from the target period. Given a set of Twitter snapshots, an eventdetection module first distinguishes between event and nonevent snapshots using a super-vised gradient boosted decision trees (Friedman 2001), trained on manually labeled data set.To rank these event snapshots, a controversy model assigns higher scores to controversial-event snapshots, on the basis of a regression algorithm applied to a large number of features.The employed features are based on Twitter-specific characteristics including linguistic,structural, buzziness,9 sentiment, and controversy features, and on external features suchas news buzz and Web-news controversy. These external features require time alignmentof entities in news media and Twitter sources, to capture entities that are trending in bothsources because they are more likely to refer to real-world events. The authors have also pro-posed to merge the two stages (detection and scoring) into a single-stage system by includingthe event detection score as an additional feature into the controversy model, which yieldedan improved performance. Feature analysis of the single-stage system revealed that the eventscore is the most relevant feature because it discriminates event from nonevent snapshots.Hashtags are found to be important semantic features for tweets, because they help identifythe topic of a tweet and estimate the topical cohesiveness of a set of tweets. Neverthe-less, external features based on news and the Web are also found useful; hence, correlationwith traditional media helps validate and explain social media reactions. In addition, thelinguistic, structural, and sentiment features also provide considerable effects. The authorsconcluded that a rich, varied set of features is crucial for controversy detection.

In a successive work, Popescu et al. (2011) employed the same framework describedearlier, but with additional features to extract events and their descriptions from Twitter. Thekey idea is based on the importance and the number of the entities to capture commonsenseintuitions about event and nonevent snapshots. As observed by the authors: “Most eventsnapshots have a small set of important entities and additional minor entities while non-event snapshots may have a larger set of equally unimportant entities.” These new featuresare inspired from the document aboutness system (Paranjpe 2009) and aim at ranking theentities in a snapshot with respect to their relative importance to the snapshot. This includesrelative positional information (e.g., offset of term in snapshot), term-level information(term frequency, Twitter corpus IDF), and snapshot-level information (length of snapshot,category, language). Opinion extraction tools such as an off-the-shelf part-of-speech (POS)tagger and regular expressions have also been applied for improved event and main entity

9 Approximated by the number of tweets in a snapshot referring to an entity over the average number of tweets in theNprevious snapshots referring to the same entity.


extraction. The number of snapshots containing action verbs, the buzziness of an entity inthe news on a given day, and the number of reply tweets are among the most useful newfeatures found by the authors.

Benson et al. (2011) present a novel approach to identify Twitter messages for con-cert events using a factor graph model, which simultaneously analyzes individual messages,clusters them according to event type, and induces a canonical value for each event prop-erty. The motivation is to infer a comprehensive list of musical events from Twitter (basedon artist–venue pairs) to complete an existing list (e.g., city event calender table) by discov-ering new musical events mentioned by Twitter users that are difficult to find in other mediasources. At the message level, this approach relies on a conditional random field (CRF) toextract the artist name and location of the event. The input features to CRF model includeword shape; a set of regular expressions for common emoticons, time references, and venuetypes; a bag of words for artist names extracted from external source (e.g., Wikipedia); anda bag of words for city venue names. Clustering is guided by term popularity, which is analignment score among the message term labels (artist, venue, none) and some candidatevalue (e.g., specific artist or venue name). To capture the large text variation in Twitter mes-sages, this score is based on a weighted combination of term similarity measures, includingcomplete string matching, and adjacency and equality indicators scaled by the inverse doc-ument frequency. In addition, a uniqueness factor (favoring single messages) is employedduring clustering to uncover rare event messages that are dominated by the popular onesand to discourage various messages from the same events to cluster into multiple events.On the other hand, a consistent indicator is employed to discourage messages from multi-ple events to form a single cluster. The factor graph model is then employed to capture theinteraction between all components and provide the final decision. The output of the modelconsists of a musical event-based clustering of messages, where each cluster is representedby an artist–venue pairs.

Lee and Sumiya (2010) present a geosocial local event detection system based on mod-eling and monitoring crowd behaviors via Twitter, to identify local festivals. They rely ongeographical regularities deduced from the usual behavior patterns of crowds using geotags.First, Twitter geotagged data are collected and preprocessed over a long period for a specificregion (Fujisaka et al. 2010). The region is then divided into several regions of interest (ROI)using the k-means algorithm, applied to the geographical coordinates (longitudes/latitudes)of the collected data. Geographical regularities of crowd within each ROI are then estimatedfrom historical data based on three main features: the number of tweets, users, and movingusers within an ROI. Statistics for these features are then accumulated over historical datausing 6-hour time interval to form the estimated behavior of crowd within each ROI. Finally,unusual events in the monitored geographical area can be detected by comparing statisticsfrom new tweets with those of the estimated behavior. The authors found that an increaseduser activity (moving inside or coming to an ROI) combined with an increased number oftweets provides strong indicator of local festivals.

Sakaki et al. (2010) exploited tweets to detect specific types of events such asearthquakes and typhoons. They formulated event detection as a classification problemand trained an SVM on a manually labeled Twitter data set comprising positive events(earthquakes and typhoons) and negative events (other events or nonevents). Three typesof features have been employed: the number of words (statistical), the keywords in a tweetmessage, and the words surrounding users queries (contextual). Analysis of the number oftweets over time for earthquakes and typhoons data revealed an exponential distribution ofevents. Parameters of the exponential distribution are estimated from historical data andthen used for computation of a reliable wait time (during which more information is beinggathered from related tweets) before raising an alarm. Experiments have shown that the


statistical features provided the best results, while a small improvement in performancehas been achieved by the combination of the three features. The authors have also appliedKalman filtering and particle filtering (Fox et al. 2003) for estimation of earthquake centerand typhoon trajectory from Twitter temporal and spatial information. They found that parti-cle filters outperformed Kalman filters in both cases, because of the inappropriate Gaussianassumption of the latter for this type of problems.

Becker et al. (2011) presented a system for augmenting information about plannedevents with Twitter messages, using a combination of simple rules and query building strate-gies. To identify Twitter messages for an event, they begin with simple and precise querystrategies derived from the event description and its associated aspects (e.g., combiningtime and venue). An annotator is then asked to label the results returned by each strat-egy for over 50 events that provide high-precision tweets. To improve recall, they employterm-frequency analysis and co-location techniques on the resulting high-precision tweetsto identify descriptive event terms and phrases, which are then used recursively to definenew queries. In addition, they build queries using URL and hashtag statistics from the high-precision tweets for an event. Finally, they build a rule-based classifier to select among thisnew set of queries and then use the selected queries to retrieve additional event messages.In a related work, Becker et al. (2011b) proposed centrality-based approaches to extracthigh-quality, relevant, and useful Twitter messages related to an event. These approachesare based on the observation that the most topically central messages in a cluster are morelikely to reflect key aspects of the event than other, less central cluster messages. The tech-niques from both works have recently been extended and incorporated into a more generalapproach that aims at identifying social media contents for known events across differentsocial media sites (Becker et al. 2012).

Massoudi et al. (2011) employed a generative language modeling approach based onquery expansion and microblog “quality indicators” to retrieve individual microblog mes-sages. However, the authors only considered the existence of a query term within a specificpost and discarded its local frequency. The quality indicators include part of the blog “cred-ibility indicators” proposed by Weerkamp and de Rijke (2008) such as emoticons, postlength, shouting, capitalization, and the existence of hyperlinks, extended with specificmicroblog characteristics such as a recency factor, and the number of reposts and followers.The recency factor is based on difference between the query time and the post time. The val-ues provided with these microblog-specific indicators are averaged into a single value andare weight combined with the credibility indicators to compute the overall prior probabilityfor a microblog post. The query expansion technique selects top-k terms that occur in a user-specified number of posts close to the query date. The final query is therefore a weightedmixture of the original and expanded query. The combination of the quality indicator termsand the microblog characteristics has been shown to outperform each method alone. In addi-tion, tokens with numeric or nonalphabetic characters have turned out beneficial for queryexpansion.

Rather than retrieving individual microblog messages in response to an event query,Metzler et al. (2012) proposed retrieving a ranked list (or timeline) of historical eventsummaries. The search task involves temporal query expansion, timespan retrieval, andsummarization. In response to a user query, this approach retrieves a ranked set of times-pans10 on the basis of the occurrence of the query keywords. A burstiness score is thencomputed for all terms that occur in messages posted during each of the retrieved times-pans. This score is based on the frequency of term occurrence within the retrieved timespan

10 The authors suggest dividing the microblog streams into hourly based timespans.


to that within the entire microblog archive. The idea is to capture terms that are heavily dis-cussed and trending during a retrieved timespan, because they are more likely to be relatedto the query. The scores for each term are aggregated (using geometric mean) over allretrieved timespans, and the k highest weighted terms are considered for query expansion.The expanded query is now used to identify the 1000 highest scoring timespans, with respectto the term expansion weight and to the cosine similarity between the burstiness of the queryterms and the burstiness of the timespan terms. Adjacent timespans (contiguous in time)are then merged into longer time interval to form the final ranked list. To produce a shortsummary for each retrieved time interval, a small set of query-relevant messages postedduring the timespan are then selected. These relevant messages are retrieved as top rankedmessage according to a weighted variant of the query likelihood scoring function, which isbased on the burstiness score for expansion terms and a Dirichlet smoothed language mod-eling estimate for each term in the message. The authors showed that their approach is morerobust and effective than the traditional relevance-based language models (Lavrenko andCroft 2001) applied to the collected Twitter corpus and to English Gigaword corpus.

Gu et al. (2011) proposed an event modeling approach called ETree for event modelingfrom Twitter streams. ETree employs n-gram-based content analysis techniques to groupa large number of event-related messages into semantically coherent information blocks,an incremental modeling process to construct hierarchical theme structures, and a lifecycle-based temporal analysis technique to identify potential causal relationships betweeninformation blocks. The n-gram model is used to detect frequent key phrases among alarge number of event-related messages, where each phrase represents an initial informa-tion block. Semantically coherent messages are merged into the corresponding informationblock. The weighted cosine similarity is computed between each of the remaining messages(that does not include any key phrase) and each information block, and the messages withhigh similarities are merged into the corresponding information block. In addition, repliesto tweets are also merged into the corresponding information block. An incremental (top-down) hierarchical algorithm based on weighted cosine similarity is proposed to constructand update the theme structures, where each theme is considered as a tree structure withinformation blocks as leaf nodes and subtopics as internal nodes. For instance, when a newtweet becomes available, it may be assigned to an existing theme or node or may become anew theme (in this case, the hierarchy must be reconstructed). Finally, casual relationshipsbetween information blocks are computed on the basis of content (weighted cosine) similar-ity and temporal relevance. Temporal information is based on the time boundaries of eachinformation block as well as on the temporal distribution reflecting the number of messagesposted within each period. The authors show that the n-gram-based block identificationgenerates coherent information blocks with high coverage. An event is considered coherentif more than half of its information blocks are relevant, while the coverage of an event isdefined as the percentage of messages that are captured into one of the identified informa-tion blocks. In addition, ETree is shown more efficient compared with its nonincrementalversion and to TSCAN—a widely used algorithm that derives major themes of events fromthe eigenvectors of a temporal block association matrix (Chen and Chang Chen 2008).

4.2. New versus Retrospective Event

Similar to event detection from conventional media, described in Section 3, eventdetection in Twitter can also be classified into RED and NED depending on the task andapplication requirements as well as on the type of event.

Because NED techniques involve continuous monitoring of Twitter signals for discover-ing new events in near real time, they are naturally suited for detecting unknown real-world


events or breaking news, as shown in Table 1, column 6. In general, trending events onTwitter could be aligned with real-world breaking news. However, sometimes a comment,person, or photo related to real-world breaking news may become more trending on Twitterthan the original event. One such example is Bobak Ferdowsi’s hairstyle, which became viralon social media while NASA’s Curiosity rover was landing on Mars11 Although the NEDapproaches do not impose any assumption on the event, they are not restricted to unspec-ified event detection. When the monitoring task involves specific events (natural disasters,celebrities, etc.) or a specific information about the event description (e.g., geographicallocation), these information could be integrated into the NED system, for instance, by usingfiltering techniques (Sakaki et al. 2010) or exploiting additional features such as the con-troversy (Popescu and Pennacchiotti 2010) or the geotagged information (Lee and Sumiya2010), to better focus on the event of interest. Most NED approaches could also be appliedto historical data to detect and analyze past events.

While most research focused on NED to exploit the timely information provided byTwitter streams, recent studies have shown an interest in RED from Twitter’s historical data(Table 1, column 7). Existing microblog search services, such as those offered by Twit-ter and Google, only provide limited search capabilities that allow to retrieve individualmicroblog posts in response to a query (Metzler et al. 2012). The challenges in findingTwitter messages relevant to a given user query are mainly due to the sparseness of thetweets and the large number of vocabulary mismatch (which is dynamically evolving). Forexample, relevant messages may not contain any query term, or new abbreviation termsor hashtags may emerge with the event. Traditional query expansion techniques rely onterms that co-occur with query terms in relevant documents. In contrast, event retrieval fromTwitter data focused on temporal and dynamic query expansion techniques. Recent researchefforts have started to focus on providing more structured and comprehensive summaries ofTwitter events.

4.3. Detection Methods and Features

Event detection from Twitter streams draws on techniques from different fields, whichare extensively covered in the literature, including machine learning and data mining(Murphy 2012; Hastie et al. 2009), natural language processing (Manning and Schütze1999; Jurafsky and Martin 2009), information extraction (Hogenboom et al. 2011), text min-ing (Hogenboom et al. 2011; Aggarwal 2011), and information retrieval (Baeza-Yates andRibeiro-Neto 2011). In this section, the major directions of the survey approaches for eventdetection from Twitter are discussed.

In general, machine learning tasks involve learning a mapping function: f .X/ ! Y ,from an input space X to an output space Y . These tasks are typically divided into super-vised and unsupervised learning approaches. In supervised learning, a labeled set of Ninput–output pairs ¹.x1; y1/; : : : ; .xn; yn/º is provided at training time to learn f .X/, whileunsupervised learning relies solely on the input data ¹.x1; : : : ; yn/º to discover interestingpatterns (i.e., estimate both f .X/ and Y ).

The remainder of this section categorizes the event detection techniques into supervisedand unsupervised learning or a combination of both approaches (as shown in Table 1) anddiscusses the feature representation for each approach, which are classified into general andTwitter-specific features as illustrated in Table 2.

11 http://www.examiner.com/article/photos-mohawk-guy-bobak-ferdowsi-s-hair-goes-viral-as-curiosity-lands-on-mars.


4.3.1. Unsupervised Detection Approaches. Similar to event detection from conven-tional media (Section 3), most techniques for unspecified event detection from Twitterstreams rely on clustering approaches, as described in Section 4.1.1 and also shown inTables 1 and 2. Clustering approaches are naturally suitable for unspecified NED fromTwitter, because they are unsupervised in that they require no labeled data for training.However, given the increasingly large amounts of data and the real-time nature of Twitterstreams, clustering algorithms for NED must be efficient and highly scalable. In addition,they should not require any prior knowledge (e.g., the number of clusters), because Twitterdata are dynamically evolving and new events arise over time. Ideally, these clustering algo-rithms must process and analyze new tweets as they become available using a single or fewpasses within a limited amount of time and memory and be able to provide decisions at anygiven time. Therefore, partitioning clustering techniques such as K-means, K-median, andK-medoid or other approaches based on the expectation–maximization algorithm (Aggarwaland Zhai 2012; Berkhin 2002) are also not suitable because they require a prior knowledgeof the number of clusters (K).

Several threshold-based online (or incremental) approaches for clustering Twitterstreams have been mainly adopted from the NED in the TDT (Becker et al. 2011a; Petrovicet al. 2010; Phuvipadawat and Murata 2010; Sankaranarayanan et al. 2009). As described inSection 3, incremental clustering approaches are appropriate for grouping of continuouslygenerated text, by setting a maximum similarity between new tweets and any of the existingclusters. When the similarity is greater than a preset threshold, the new tweet is consideredsimilar to and merged with the closest cluster; otherwise, it is considered as a new event anda new cluster is formed. Although some work has focused on improving the efficiency of theoriginal online clustering algorithm (Petrovic et al. 2010), little research efforts have beendevoted to threshold settings and fragmentation issues. Thresholds are typically set empiri-cally using the training (or validation) data and assumed to generalize to unseen messages.Fragmentation occurs when tweets that talk about the same event are grouped in differentclusters. Fragmentation is inherent to incremental clustering and depends on threshold set-tings. To alleviate this issue, Sankaranarayanan et al. (2009) suggested a periodic secondpass to merge similar or duplicate clusters, while Petrovic et al. (2010) proposed comparingnew tweets with a fixed number of most recent clusters.

Graph-based clustering algorithms have also been proposed for the NED task. A hierar-chical divisive clustering approach is used on a co-occurrence graph (connecting messagesaccording to word co-occurrence) to divide topical words into event clusters (Long et al.2011). A modularity-based graph partitioning technique is used to form events by splittingthe graph into subgraphs each corresponding to an event (Weng and Lee 2011). The poweriteration method employed in the PageRank algorithm (Ipsen and Wills 2006) is proposedto alleviate the computational burden associated with finding the largest eigenvalue of themodularity matrix (Weng and Lee 2011). In general, hierarchical clustering algorithms donot scale to the large size of the data because they require the full similarity matrix, whichcontains the pairwise similarity between groups (Becker et al. 2010; Cordeiro 2012). Inaddition, scalable graph partitioning algorithms may not capture the highly skewed eventdistribution of Twitter data as they are biased toward balanced partitioning (Becker et al.2010).

New event detection approaches that are based on unsupervised clustering typicallyemploy shallow feature representations, including the weighted term vector accord-ing to tf-idf computed over some period (Petrovic et al. 2010; Weng and Lee 2011)and augmented with other features such as hashtag occurrence (Becker et al. 2011a;Sankaranarayanan et al. 2009). In addition to word and hashtag frequency, some authorshave included other features such as word entropy (Long et al. 2011) and proper names


(Phuvipadawat and Murata 2010). Because of their discriminating power, and for improvedefficiency, the hashtag features have been recently used without any additional features(Cordeiro 2012). Cosine similarity is most commonly used within these online clusteringalgorithms to compute the distance between the (augmented) term vectors and the centersof clusters. The approaches based on spectral analysis and weighted graphs (Cordeiro 2012;Long et al. 2011; Weng and Lee 2011) inherently detect the bursty events associated with thepreviously considered features over time. On the other hand, NED approaches that are onlybased on clustering typically exploit Twitter social network information, which reflect thebursty events, to rank the resulting event clusters and their associated messages before pre-senting the most relevant real-world events. This includes ranking event clusters accordingto the number of tweets, retweets, users, and followers (Petrovic et al. 2010; Phuvipadawatand Murata 2010).

New event detection of specific type of events could be addressed similar to anomalydetection techniques (Khreich et al. 2009), which rely on modeling normal user behaviorand detecting any deviation from this baseline profile. This alternate unsupervised learningapproach has been shown effective in detecting local festival events, by learning the nor-mal behavior of users in a given location over some period, and in detecting deviations aspossible events (Lee and Sumiya 2010). Boxplot statistics such as the median, quartiles,and minimum and maximum values have been used to learn the geographical regularitiesof a crowd within a region of interest using features such as the number of tweets and thenumber of users, and user-movement based on geotag information (Lee and Sumiya 2010).From this perspective, online and incremental learning of sequential models such as hiddenMarkov models (Khreich et al. 2012b) would provide a more robust solution for modelingthe temporal behavior of the data.

In contrast to traditional query expansion techniques, which rely on query-term co-occurrence, the unsupervised approaches for RED focus on temporal and dynamic queryexpansion techniques. Because of the temporal and social aspects in Twitter streams, peoplesearching Twitter show more interest in timely information (e.g., related to news or events)and social information related, for instance, to other users or popular trends (Teevan et al.2011). The proposed temporal query expansion techniques are based on the number of postsclose to the query date to retrieve individual messages (Massoudi et al. 2011) or on past(possibly distant) timespans to extract various messages that are related to a specific event(Metzler et al. 2012). Queries are extended on the basis of features such as the relativeterm frequency, hashtags, existence of hyperlinks and emoticons, number of replies andfollowers, post length, abbreviation, and capitalization that co-trend with the query termsin relevant timespans. This allows queries to adapt to new terms, abbreviations, or hashtagsthat may emerge during specific events.

Providing a structured and comprehensive summary of Twitter events, which goesbeyond simple message retrieval, is currently an active area of research. As a postprocessingstep to event detection, it would provide a more comprehensive view of the content than alist of tweets. Effective summarization and structuring of tweets is also important for severalmicroblog-related applications ranging from trend detection to microblog retrieval and sen-timent analysis (Efron 2011; Kim et al. 2011; Sharifi et al. 2010). A “phrase reinforcement”algorithm is proposed to summarize multiple tweets related to the same event byfinding the most commonly used phrase that encompasses the topic (Sharifi et al. 2010).Long et al. (2011) return the k most relevant and diverse posts to capture the event contextbased on cosine similarity between posts within a given time interval, while Cordeiro (2012)returns the set of hashtags related to the events based on an LDA topic model. A more struc-tured view of events is provided by the ETree framework (Gu et al. 2011). ETree identifiesthe major aspects of the event, the key message clusters, and their hierarchical structure


and causal relationships by using the term vector, number of replies, and employing n-gramanalysis, incremental modeling, and a life cycle–based temporal analysis.

4.3.2. Supervised Detection Approaches. While most NED approaches for unspeci-fied events involve unsupervised clustering of new tweets (as shown in Table 1), the NEDtechniques that focus on detecting a specified type of event mainly rely on supervisedlearning approaches. Although manually labeling a large number of twitter messages is alabor-intensive and time-consuming task, it is more feasible for specified events than forunspecified events (as illustrated in Table 3). When some event descriptions are known, fil-tering techniques could be used to reduce the amounts of irrelevant messages and make iteasier for a human expert to annotate a data set of “reasonable” size. Furthermore, filter-ing according to specific event descriptions, such as keywords, location, or time, would alsoreduce the amount of Twitter messages that must be processed during system operation andallow the detection algorithm to focus on a restricted set of tweets.

Several supervised classification algorithms have been proposed for NED of speci-fied events, including naive Bayes (Becker et al. 2011a; Sankaranarayanan et al. 2009),SVM (Becker et al. 2011a; Sakaki et al. 2010), and gradient boosted decision trees(Popescu and Pennacchiotti 2010; Popescu et al. 2011). These classifiers are typicallytrained on a small set of Twitter messages collected over a few weeks or months andthen filtered and labeled according to the target event as, for instance, an event ornonevent (Becker et al. 2011a; Sankaranarayanan et al. 2009), an earthquake or non-earthquake event (Sakaki et al. 2010), and a controversial or noncontroversial event(Popescu and Pennacchiotti 2010; Popescu et al. 2011). The labeling procedure usuallyinvolves two human annotators with specific domain knowledge. An agreement mea-sure, such as Cohen’s Kappa measure (Carletta 1996), is then used to evaluate the levelof interannotator agreement. Ambiguous events with a high level of disagreement arediscarded.

In addition to filtering out part of the irrelevant messages, when the detection taskinvolves specified events, additional features (other than word or hashtag frequency) couldbe included in the detection algorithm for improved system accuracy. These features mayvary widely depending on the target event and its description. For instance, in addition toword frequency, Sakaki et al. (2010) considered special keywords mentioning an “earth-quake”, its variant, or related words (e.g., “shaking”) as well as the contextual informationsurrounding these keywords. For detecting controversial events about celebrities, Popescuand Pennacchiotti (2010) employed a large set of linguistic, structural, burst, sentiment, andcontroversy features from Twitter and external features such as news buzz and Web-newscontroversy. For improved event detection accuracy and event description quality, Popescuet al. (2011) augmented this set with more sophisticated features based on informationretrieval and natural language processing techniques, such as relative positional information,POS tagging, and main entity extraction.

4.3.3. Hybrid Detection Approaches. While supervised classification and unsuper-vised clustering approaches have been applied separately for NED, a combination ofboth approaches has also been proposed (Table 1). Some approaches employ classifi-cation or detection techniques to identify relevant or important tweets before clustering(Sankaranarayanan et al. 2009). A trained classifier used to discriminate between events andnonevents would help reduce the amount of (noisy) data provided for clustering and henceimprove system efficiency. However, this approach is sensitive to the classification accuracyand threshold settings; relevant real-world events could be discarded before reaching the


TA

BL

E3.

Sum

mar

yof

Dat

aS

ets

and

Eva

luat

ion

Met

rics

.

Col

lect

ion

Cor

pus

size

Tem

pora

lsco

peE

valu

atio

n

San

kara

nara

yana

net

al.(

2009

)G

arde

nHos

eB

irdD

og–

–Q

uali

tativ

eP

huvi

pada

wat

and

Mur

ata

(201

0)S

trea

min

gA

PI

(pre

defi

ned

10tw

eets

–Q

uali

tativ

ese

arch

quer

ies

targ

etin

gbr

eaki

ngne

ws)

Petr

ovic

etal

.(20

10)

Str

eam

ing

AP

I16

3.5

Mtw

eets

6m

onth

sA

vera

gepr

ecis

ion

over

1000

thre

ads

from

syst

emou

tput

man

ually

labe

led

asre

leva

nt,

neut

ral,

orsp

amB

ecke

ret

al.(

2011

a)S

trea

min

gA

PI

(tw

eets

from

2.6

Mtw

eets

1m

onth

Mac

ro-a

vera

gedF1

over

300

NY

C-b

ased

user

s)tw

eetc

lust

ers

man

ually

labe

led

asre

al-w

orld

even

t,Tw

itte

r-ce

ntri

cac

tivit

y,ot

her

none

vent

,or

ambi

guou

sL

ong

etal

.(20

11)

Sin

aA

PI

22M

post

s2.

5m

onth

sP

reci

sion

Wen

gan

dL

ee(2

011)

RE

ST

AP

I4.

3M

twee

ts1

mon

thP

reci

sion

on21

syst

em-d

etec

ted

even

ts(t

wee

tsfr

om1k

mos

tfo

llow

edS

inga

pore

-bas

edus

ers

+th

eir

foll

ower

s)C

orde

iro

(201

2)S

prit

zer

13.6

Mtw

eets

8da

ysV

isua

lill

ustr

atio

nPo

pesc

uan

dPe

nnac

chio

tti(

2010

)Fi

reho

se74

0K

snap

shot

s7

mon

ths

Ave

rage

prec

isio

n,ar

eaun

der

RO

Ccu

rve

over

10-f

old

cros

s-va

lida

tion


TA

BL

E3.

Con

tinu

ed.

Col

lect

ion

Cor

pus

size

Tem

pora

lsco

peE

valu

atio

n

Pope

scu

etal

.(20

11)

Fire

hose

5040

snap

shot

s–

Ave

rage

prec

isio

n,m

ean

reci

proc

alra

nkov

erB

enso

net

al.(

2011

)Tw

eets

from

NY

C-b

ased

user

s4.

7M

twee

ts3

wee

kend

sP

reci

sion

/rec

allo

ver

reco

rds

scra

ped

from

NY

C.c

omm

usic

even

tgui

deL

eean

dS

umiy

a(2

010)

Sea

rch

AP

I(t

wee

tsge

otag

ged

21.6

Mtw

eets

1.5

mon

ths

Pre

cisi

on/r

ecal

love

rsy

stem

asfr

omJa

pan)

resu

lts

man

ually

judg

edag

ains

t15

pres

peci

fied

fest

ival

sin

a3-

day

peri

odS

akak

ieta

l.(2

010)

Sea

rch

AP

I(q

ueri

es:{

typh

oon}

597

twee

ts–

Pre

cisi

on/r

ecal

lan

d{e

arth

quak

e+

shak

ing}

)B

ecke

ret

al.(

2011

)Tw

itte

rA

PI

––

Qua

lita

tive

Mas

soud

ieta

l.(2

011)

Que

ries

base

don

tren

ding

topi

cs11

0M

twee

ts6

mon

ths

Sta

ndar

dIR

met

rics

over

man

ually

asse

ssed

pool

ofto

p20

resu

lts

ofea

chqu

ery

Met

zler

etal

.(20

12)

Str

eam

ing

AP

I46

.6M

twee

ts6

mon

ths

Pre

cisi

on@

10fo

r50

diff

eren

tev

entt

ype

quer

ies

Gu

etal

.(20

11)

Sea

rch

AP

I(q

ueri

esfo

r20

3.5

Mtw

eets

5m

onth

sC

over

age/

cohe

renc

em

anua

llyse

lect

edev

ents

acro

ssse

ven

cate

gori

es)

NYC.com


clustering stage. Other approaches first proceed with a clustering and then attempt to clas-sify whether a cluster contains relevant information about real-world events. Because theclusters constantly evolve over time, the features are periodically updated for old clustersand computed for newly formed ones.

Another hybrid approach proposed to identify Twitter messages corresponding to con-cert events uses a factor graph model, which simultaneously extracts the artist name andlocation of the event using a supervised CRF classifier and then clusters them according toevent type and induces a canonical value for each event property (Benson et al. 2011). Thisnovel approach could be seen as a specific named entity recognizer. In this line of research,Ritter et al. (2011) developed more general NLP tools for Twitter text, including a POStagger, a shallow parser, and a named entity recognizer based on supervised CRF models.

5. DISCUSSION

As discussed in the previous section and presented in Tables 1 and 2, event detectiontechniques from Twitter mainly rely on unsupervised and supervised detection approaches.While the benefit of unsupervised clustering approaches is that they do not require labeleddata, several options are still available for potential optimization. For instance, setting thethresholds of incremental clustering algorithms should be based on more advanced and pos-sibly adaptive techniques rather than simply relying on static values computed from smalldata sets during system design. Alternative techniques for improved clustering efficiencyand scalability may be considered, such as blocking or canopy techniques (Bilenko et al.2006; McCallum et al. 2000; Reuter and Cimiano 2012). Candidate retrieval or blockingmethods alleviate the scalability issue by selecting a subset of object pairs with large sim-ilarity values (e.g., between messages or between messages and clusters), leaving out theremaining pairs as dissimilar, and hence reducing the number of messages that are con-sidered as potential events (Bilenko et al. 2006; Reuter and Cimiano 2012). Performingclustering in two stages is another alternative. First, an approximate distance measure is usedto efficiently (and roughly) divide the data into overlapping subsets or “canopies.” Then,a more rigorous clustering stage using expensive distance measurements is applied to themessages that occur in a common canopy (McCallum et al. 2000). Cluster fragmentation isalso another issue that deserves more focus. In addition to the temporal aspect of the mes-sages, other features such as the location proximity (using geotags) could be used as anotherindicator that messages are related to the same event.

On the other hand, the proposed supervised event detection approaches generallyassume a static environment. A single classifier is typically trained off-line on a relativelysmall batch of Twitter data labeled manually. The classifier is then deployed for detect-ing events directly or combined with a clustering approach. These techniques are thereforerestricted in scope, because limited labeled data are available for training and Twitter is acontinuously evolving environment. For instance, users may leave or join the service, activeusers may become inactive, new terms, abbreviations, and hashtags may emerge. A staticclassifier is prone to both false positive and negative errors when a concept drift occurs inthe data streams. Techniques such as incremental learning (Joshi and Kulkarni 2012) andensemble methods (Khreich et al. 2012a; Kuncheva 2004; Polikar 2006) may be employedto account for unseen events and adapt to changes that may occur over time.

Other approaches that have proved useful when dealing with sparse labeled data includesemi-supervised learning (Chapelle et al. 2006; Zhu Updated on July 19, 2008) and trans-fer learning (Pan and Yang 2010; Pan et al. 2012). Semi-supervised learning exploits asmall amount of labeled data together with the large amount of unlabeled data to build


classifiers (Chapelle et al. 2006; Zhu Updated on July 19, 2008). Transfer learning meth-ods are designed to extract useful knowledge from different but related domains (Chapelleet al. 2006; Zhu Updated on July 19, 2008). For instance, transferring existing knowledgein Wikipedia documents to help classify Twitter messages. Following this line of research,Meij et al. (2012) proposed a method for automatically mapping tweets to Wikipedia arti-cles to facilitate semantic mining. This approach is based on a combination of high-recallconcept ranking and high-precision machine learning including random forests or gradientboosted regression trees.

People are increasingly searching the Web not only to find documents but also to makedecisions. Because queries are inherently ambiguous and difficult to interpret in isolation,recent advances in Web search technologies are relying on the notion of relevance to reduceambiguity. Therefore, several learning to rank models have been proposed for informationretrieval tasks to capture document relevance by combining various global, context-specific,and user-specific features (Li et al. 2008). In addition to traditional information retrievalmethods, the previously described techniques for semi-supervised and transfer learning maybe used for feature generation, ranking model selection, and labeled data collection formodel training and evaluation. The challenge with RED resides in designing features thatcapture the importance of an event with respect to specific keywords, while taking intoaccount the temporal, geospacial, and social network specific to the user.

Performance evaluation of different approaches and features is a major issue facingevent detection in Twitter. In typical information retrieval tasks, precision, recall, and F-measures are common performance metrics. The precision is the number of relevant eventsdetected over the total number of events detected, while the recall is the number of rele-vant events detected over the total number of relevant events that exist in the data streams.F-measures are weighted harmonic means of precision and recall. Recall is generally diffi-cult to compute for large and noisy data sets, because manual enumeration of all relevantevents that exist in a given Twitter streams is time consuming for small sets and infeasiblefor larger ones. Therefore, as illustrated in Table 3, some of the work surveyed in this articleonly focused on precision measures such as average precision or precision@K (which cap-ture the fraction of correctly detected events out of the top-K detected ones), while othersonly presented few examples of the detected events. Representative data sets and commontestbed are highly required for evaluation of different detection techniques and feature rep-resentations that are proposed for event detection from Twitter. Therefore, sharing (and ifpossible merging) the labeled data sets online (such as those presented in Table 3), as well asusing crowd sourcing services such as Amazon’s Mechanical Turk for larger scale labeling,would provide more representative data for evaluation.

Most techniques focused on English language. In addition to stop-words and trivialnonalphanumeric words, all non-English words are filtered out. However, depending onthe event location, Twitter messages could be written in mixed languages or in completelynon-English languages. Accurate translation of all non-English tweets would increase thecomputational load the amounts of data to process by downstream applications. There is agrowing body of mining translingual knowledge from textual data found on the Web for sta-tistical machine translation (Nie et al. 2012). In general, techniques for both RED and NEDdo not always require high-quality text translations. Parallel or even comparable corpora canbe directly used to train models or learn term similarity measures for query translation.

This survey focused on event detection from a single source of social media infor-mation. More robust solutions would be provided by integrating and combining eventinformation from multiple social sources, such as Facebook, Flicker, and Youtube (Chenand Roy 2009; Kennedy et al. 2007; Kinsella et al. 2011; Mirkovic et al. 2011; Rattenburyet al. 2007; Stefanidis et al. 2011; Tang et al. 2012). Missing information from one source


may be available in the other. Clustering and classification approaches could be applied tovarious social media sources simultaneously using their common feature representationsincluded in the metadata (title, user, time, tags, location, etc.). As an alternative, these detec-tion approaches could be applied to each social media source to further exploit site-specificfeatures and then be merged using learned similarity measures (Becker et al. 2010). Fur-thermore, multiple text streams could be indexed and aligned by the same set of time pointscalled coordinated text streams (Wang et al. 2007), to determine the interesting events andassociations between different streams. These approaches and additional features can helpdetect new events from complementary social media sites.

Although the research community is making progress toward specific event detectionsubtasks, simultaneously monitoring and analyzing the events and activities from differentsocial media services remain a challenge. A considerable effort is still required to achieveefficient and reliable event detection systems, such as designing better feature extraction andquery generation techniques, more accurate filtering and detection algorithms, improvedtechniques to combine and analyze information from multiple sources (social and tra-ditional media) and multiple languages, and enhanced summarization and visualizationapproaches. Many organizations and research scholars are actively developing new systemsand algorithms to overcome these challenges to exploit this rich and continuous flow ofuser-generated content.

6. CONCLUSION

Event detection aims at finding real-world occurrences that unfold over space and time.As a fast-growing microblogging and online social networking service, Twitter providesunprecedentedly valuable user-generated content that can be transformed into actionable andsituational knowledge. More importantly, messages posted on Twitter—currently exceed-ing 400 million tweets per day—could reveal information about real-world events as theyunfold. However, event detection from Twitter data must efficiently and accurately uncoverrelevant information about events of general or specific interest, which is buried within alarge amount of mundane information (e.g., meaningless, polluted, and rumor messages).This article provides a survey of techniques proposed for event detection from Twitterdata. These techniques are classified according to the type of target event into specifiedor unspecified event detection. Depending on the detection task and target application,these techniques are also classified into RED or NED. Nevertheless, they are also catego-rized according to the detection methods that involve supervised, unsupervised, and hybridapproaches. General and Twitter-specific feature representations corresponding to each cat-egory are also presented and discussed. Finally, this article highlights major issues and openresearch challenges, in particular, the need for publicly available testbeds for comprehensiveevaluation of performance and objective comparison of different detection approaches.

REFERENCES

AGGARWAL, C. C. 2011. An introduction to social network data analytics. In Social Network Data Analytics.Edited by C. C. AGGARWAL. Springer: New York, pp. 1–15.

AGGARWAL, C. C., and C. ZHAI. 2012. A survey of text clustering algorithms. In Mining Text Data. Edited byC. C. AGGARWAL, and C. ZHAI. Springer: New York, pp. 77–128.

ALLAN, J. 2002. Topic Detection and Tracking: Event-based Information Organization. Kluwer AcademicPublishers: Norwell, MA.


ALLAN, J., J. CARBONELL, G. DODDINGTON, J. YAMRON, and Y. YANG. 1998. Topic detection and trackingpilot study final report. In Proceedings of the DARPA Broadcast News Transcription and UnderstandingWorkshop, Lansdowne, VA, pp. 194–218.

ALLAN, J., V. LAVRENKO, and H. JIN. 2000. First story detection in TDT is hard. In Proceedings of the NinthInternational Conference on Information and Knowledge Management, CIKM ’00, ACM, New York, NY,pp. 374–381.

ALLAN, J., V. LAVRENKO, D. MARLIN, and R. SWAN. 2000. Detections, bounds, and timelines: UMass andTDT–3. In Proceedings of Topic Detection and Tracking (TDT–3), Vienna, VA, pp. 167–174.

ALLAN, J., R. PARKA, and LAVRENKO V. 1998. On-line new event detection and tracking. In Proceedings of the21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval,SIGIR ’98, ACM, New York, NY, pp. 37–45.

AMER-YAHIA, S., S. ANJUM, A. GHENAI, A. SIDDIQUE, S. ABBAR, S. MADDEN, A. MARCUS, and M.EL-HADDAD. 2012. MAQSA: A system for social analytics on news. In Proceedings of the 2012 ACMSIGMOD International Conference on Management of Data, SIGMOD ’12, ACM, New York, NY,pp. 653–656.

BAEZA-YATES, R. A., and B. RIBEIRO-NETO. 2011. Modern Information Retrieval the Concepts and Technol-ogy Behind Search (2nd ed.). Pearson Education Ltd.: Harlow, England.

BECKER, H., F. CHEN, D. ITER, M. NAAMAN, and L. GRAVANO. 2011. Automatic identification and presen-tation of Twitter content for planned events. In International AAAI Conference on Weblogs and SocialMedia, Barcelona, Spain.

BECKER, H., D. ITER, M. NAAMAN, and L. GRAVANO. 2012. Identifying content for planned events acrosssocial media sites. In Proceedings of the Fifth ACM International Conference on Web Search and DataMining, WSDM ’12, ACM, New York, NY, pp. 533–542.

BECKER, H., M. NAAMAN, and L. GRAVANO. 2010. Learning similarity metrics for event identification in socialmedia. In WSDM’10, ACM, New York, pp. 291–300.

BECKER, H., M. NAAMAN, and L. GRAVANO. 2011a. Beyond trending topics: Real-world event identificationon Twitter. In ICWSM, Barcelona, Spain.

BECKER, H., M. NAAMAN, and L. GRAVANO. 2011b. Selecting quality Twitter content for events. InInternational AAAI Conference on Weblogs and Social Media, Barcelona, Spain.

BENEVENUTO, F., G. MAGNO, T. RODRIGUES, and V. ALMEIDA. 2010. Detecting spammers on Twitter. InProceedings of the 7th Annual Collaboration, Electronic Messaging, Anti-Abuse and Spam Conference(CEAS), Redmond, WA.

BENSON, E., A. HAGHIGHI, and R. BARZILAY. 2011. Event discovery in social media feeds. In Proceed-ings of the 49th Annual Meeting of the Association for Computational Linguistics: Human LanguageTechnologies, Volume 1 of HLT ’11, Association for Computational Linguistics, Stroudsburg, PA,pp. 389–398.

BERKHIN, P. 2002. A survey of clustering data mining techniques. Technical report, Yahoo!, Inc.

BILENKO, M., B. KAMATH, and R. J. MOONEY. 2006. Adaptive blocking: learning to scale up record linkage.In Proceedings of the Sixth International Conference on Data Mining, ICDM ’06, IEEE Computer Society,Washington, DC, pp. 87–96.

BLEI, D. M., A. Y. NG, and M. I. JORDAN. 2003. Latent Dirichlet allocation. Journal of Machine LearningResearch, 3: 993–1022.

BOURAS, C., and V. TSOGKAS. 2010. Assigning Web news to clusters. In Proceedings of the 2010 Fifth Inter-national Conference on Internet and Web Applications and Services, ICIW ’10, IEEE Computer Society,Washington, DC, pp. 1–6.

BOYD, D., S. GOLDER, and G. LOTAN. 2010. Tweet, tweet, retweet: Conversational aspects of retweeting onTwitter. In 2010 43rd Hawaii International Conference on System Sciences (HICSS), pp. 1–10.

BOYD, D. M., and N. B. ELLISON. 2007. Social network sites: Definition, history, and scholarship. Journal ofComputer-Mediated Communication, 13(1): 210–230.


BRANTS, T., F. CHEN, and A. FARAHAT. 2003. A system for new event detection. In Research and Developmentin Information Retrieval, New York, NY, pp. 330–337.

BRZOZOWSKI, M. J., and D. M. ROMERO. 2011. Who should I follow? Recommending people in directed socialnetworks. In Proceedings of the Fifth International Conference on Weblogs and Social Media, ICWSM,The AAAI Press.

CARLETTA, J. 1996. Assessing agreement on classification tasks: the kappa statistic. Computational Linguistics,22(2): 249–254.

CASTILLO, C., M. MENDOZA, and B. POBLETE. 2011. Information credibility on Twitter. In Proceedings of the20th International Conference on World Wide Web, WWW ’11, ACM, New York, NY, pp. 675–684.

CHAPELLE, O., B. SCHÃULKOPF, and B. ZIEN. 2006. Semi-Supervised Learning. Adaptive Computation andMachine Learning series. MIT Press: Cambridge, MA.

CHEN, C. C., and M. CHANG CHEN. 2008. TSCAN: A novel method for topic summarization and contentanatomy. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval, SIGIR ’08, ACM, New York, NY, pp. 579–586.

CHEN, L., and A. ROY. 2009. Event detection from Flickr data through wavelet-based spatial analysis. In Pro-ceedings of the 18th ACM Conference on Information and Knowledge Management, CIKM ’09, ACM,New York, NY, pp. 523–532.

CORDEIRO, M. 2012. Twitter event detection: Combining wavelet analysis and topic inference summarization.In Doctoral Symposium on Informatics Engineering, DSIE’2012.

DAI, X. Y., Q. C. CHEN, X. L. WANG, and J. XU. 2010. Online topic detection and tracking of financial newsbased on hierarchical clustering. In 2010 International Conference on Machine Learning and Cybernetics(ICMLC), Vol. 6, pp. 3341–3346.

EFRON, M. 2011. Information search and retrieval in microblogs. Journal of the American Society ofInformation Science and Technology, 62(6): 996–1008.

FARZINDAR, A. 2012. Industrial perspectives on social networks. In EACL 2012 - Workshop on SemanticAnalysis in Social Media.

FARZINDAR, A., and D. INKPEN. 2012. Proceedings of the Workshop on Semantic Analysis in Social Media.Association for Computational Linguistics, Avignon, France.

FOX, D., J. HIGHTOWER, L. LIAO, D. SCHULZ, and G. BORRIELLO. 2003. Bayesian filtering for locationestimation. IEEE Pervasive Computing, 2: 24–33.

FRIEDMAN, J. H. 2001. Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29:1189–1232.

FUJISAKA, T., R. LEE, and K. SUMIYA. 2010. Discovery of user behavior patterns from geo-tagged micro-blogs. In Proceedings of the 4th International Conference on Uniquitous Information Management andCommunication, ICUIMC ’10, ACM, New York, NY, pp. 36:1–36:10.

FUNG, G. P. C., J. XU YU, P. S. YU, and H. LU. 2005. Parameter free bursty events detection in text streams.In Proceedings of the 31st International Conference on Very Large Data Bases, VLDB ’05, VLDBEndowment, pp. 181–192.

GESER, H. 2011. Has tweeting become inevitable? Twitter’s strategic role in the world of digital communication.Online Publications.

GIONIS, A., P. INDYK, and R. MOTWANI. 1999. Similarity search in high dimensions via hashing. In Proceedingsof the 25th International Conference on Very Large Data Bases, VLDB ’99. Morgan Kaufmann PublishersInc.: San Francisco, CA, pp. 518–529.

GOORHA, S., and L. UNGAR. 2010. Discovery of significant emerging trends. In Proceedings of the 16th ACMSIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’10, ACM, New York,NY, pp. 57–64.

GU, H., X. XIE, Q. LV, Y. RUAN, and L. SHANG. 2011. ETree: Effective and efficient event modeling for real-time online social media networks. In Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011IEEE/WIC/ACM International Conference, Vol. 1, pp. 300–307.


HASTIE, T., R. TIBSHIRANI, and J. FRIEDMAN. 2009. The Elements of Statistical Learning: Data Mining,Inference, and Prediction, Second Edition. Springer: Stanford, CA.

HE, Q., K. CHANG, and E.-P. LIM. 2007. Analyzing feature trajectories for event detection. In Proceedingsof the 30th Annual International ACM SIGIR Conference on Research and Development in InformationRetrieval, SIGIR ’07, ACM, New York, NY, pp. 207–214.

HE, Q., K. CHANG, E.-P. LIM, and J. ZHANG. 2007. Bursty feature representation for clustering text streams. InSIAM International Conference on Data Mining.

HOGENBOOM, F., F. FRASINCAR, U. KAYMAK, and F. DE JONG. 2011. An overview of event extraction fromtext. In Workshop on Detection, Representation, and Exploitation of Events in the Semantic Web (DeRiVE2011) at Tenth International Semantic Web Conference (ISWC 2011), Vol. 779 of CEUR Workshop Pro-ceedings. Edited by VAN ERP, M., W. R. VAN HAGE, L. HOLLINK, A. JAMESON, and R. TRONCY.CEUR-WS.org: Koblenz, Germany; pp. 48–57.

HONEYCUTT, C., and S. C. HERRING. 2009. Beyond microblogging: conversation and collaboration via Twitter.In 42nd Hawaii International Conference on System Sciences, HICSS ’09, pp. 1–10.

HURLOCK, J., and M. WILSON. 2011. Searching Twitter: separating the tweet from the chaff. In InternationalAAAI Conference on Weblogs and Social Media, Barcelona, Spain.

IPSEN, I. C. F., and R. S. WILLS. 2006. Mathematical properties and analysis of Google’s pagerank. Boletín dela Sociedad Española de Matemática Aplicada, 34: 191–196.

JAIN, A. K., and R. C. DUBES. 1988. Algorithms for Clustering Data. Prentice-Hall: Upper Saddle River, NJ.

JANSEN, B. J., M. ZHANG, K. SOBEL, and A. CHOWDURY. 2009. Twitter power: Tweets as electronic word ofmouth. Journal of the American Society for Information Science and Technology, 60(11): 2169–2188.

JAVA, A., X. SONG, T. FININ, and B. TSENG. 2007. Why we Twitter: Understanding microblogging usage andcommunities. In Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining andSocial Network Analysis, WebKDD/SNA-KDD ’07, ACM, New York, NY, pp. 56–65.

JIANG, L., M. YU, M. ZHOU, X. LIU, and T. ZHAO. 2011. Target-dependent Twitter sentiment classification.In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: HumanLanguage Technologies - Volume 1, HLT ’11, Association for Computational Linguistics, Stroudsburg, PA,pp. 151–160.

JO, T., and M. LEE. 2007. The evaluation measure of text clustering for the variable number of clusters. In Pro-ceedings of the 4th International Symposium on Neural Networks: Part II–Advances in Neural Networks,ISNN ’07. Springer-Verlag: Berlin, Heidelberg, pp. 871–879.

JOSHI, P., and P. KULKARNI. 2012. Incremental learning: areas and methods – a survey. International Journalof Data Mining & Knowledge Management Process, 2(5): 43–51.

JURAFSKY, D., and J. H. MARTIN. 2009. Speech and Language Processing: An Introduction to Natural Lan-guage Processing, Computational Linguistics, and Speech Recognition (2nd ed.). Prentice Hall: UpperSaddle River, NJ.

KAPLAN, A. M., and M. HAENLEIN. 2011. The early bird catches the news: Nine things you should know aboutmicro-blogging. Business Horizons, 54(2): 105–113.

KENNEDY, L., M. NAAMAN, S. AHERN, R. NAIR, and T. RATTENBURY. 2007. How Flickr helps us makesense of the world: Context and content in community-contributed media collections. In Proceedingsof the 15th International Conference on Multimedia, MULTIMEDIA ’07, ACM, New York, NY, pp.631–640.

KHAN, A. A. 2012. The role social of media and modern technology in arabs spring. Far East Journal ofPsychology and Business, 7(1): 56–63.

KHONDKER, H. H. 2011. Role of new media in the Arab Spring. Globalizations, 8(5): 575–679.

KHREICH, W., E. GRANGER, A. MIRI, and R. SABOURIN. 2012a. Adaptive ROC-based ensembles of HMMsapplied to anomaly detection. Pattern Recognition, 45(1): 208–230.

KHREICH, W., E. GRANGER, A. MIRI, and R. SABOURIN. 2012b. A survey of techniques for incrementallearning of HMM parameters. Information Sciences, 197: 105–130.


KHREICH, W., E. GRANGER, R. SABOURIN, and A. MIRI. 2009. Combining hidden Markov models for anomalydetection. In International Conference on Communications (ICC), Dresden, Germany, pp. 1–6.

KIM, H. D., K. GANESAN, P. SONDHI, and C. ZHAI. 2011. Comprehensive review of opinion summarization.Technical report, University of Illinois at Urbana-Champaign.

KINSELLA, S., A. PASSANT, and BRESLIN. J. 2011. Topic classification in social media using metadata fromhyperlinked objects. In Advances in Information Retrieval, Vol. 6611 of Lecture Notes in Computer Sci-ence. Edited by CLOUGH, P., C. FOLEY, C. GURRIN, G. JONES, W. KRAAIJ, H. LEE, and V. MUDOCH.Springer Berlin: Heidelberg, pp. 201–206.

KLEINBERG, J. 2002. Bursty and hierarchical structure in streams. In Proceedings of the Eighth ACM SIGKDDInternational Conference on Knowledge Discovery and Data Mining, KDD ’02, ACM, New York, NY,pp. 91–101.

KONTOSTATHIS, A., L. GALITSKY, W. M. POTTENGER, S. ROY, and D. J. PHELPS. 2004. A survey of emerg-ing trend detection in textual data mining. In A Comprehensive Survey of Text Mining: Clustering,Classification and Retrieval. Edited by M. W. BERRY. Springer: New York.

KRISHNAMURTHY, B., P. GILL, and M. ARLITT. 2008. A few chirps about Twitter. In Proceedings of the FirstWorkshop on Online Social Networks, WOSN ’08, ACM, New York, NY, pp. 19–24.

KUMARAN, G., and J. ALLAN. 2004. Text classification and named entities for new event detection. In Researchand Development in Information Retrieval, Sheffield, UK, pp. 297–304.

KUNCHEVA, L. I. 2004. Classifier ensembles for changing environments. In Multiple Classifier Systems,Vol. 3077, Cagliari, Italy, pp. 1–15.

KWAK, H., C. LEE, H. PARK, and S. MOON. 2010. What is Twitter, a social network or a news media? InProceedings of the 19th International Conference on World Wide Web, WWW ’10, ACM, New York, NY,pp. 591–600.

LAVRENKO, V., and W. B. CROFT. 2001. Relevance based language models. In Proceedings of the 24th AnnualInternational ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’01,ACM, New York, NY, pp. 120–127.

LEE, K., B. EOFF, and J. CAVERLEE. 2011. Seven months with the devils: A long-term study of content polluterson Twitter. In International AAAI Conference on Weblogs and Social Media, Barcelona, Spain.

LEE, R., and K. SUMIYA. 2010. Measuring geographical regularities of crowd behaviors for Twitter-based geo-social event detection. In Proceedings of the 2nd ACM SIGSPATIAL International Workshop on LocationBased Social Networks, LBSN ’10, ACM, New York, NY, pp. 1–10.

LEEK, T., R. SCHWARTZ, and S. SISTA. 2002. Probabilistic approaches to topic detection and tracking. In TopicDetection and Tracking. Edited by J. ALLAN. Kluwer Academic Publishers: Norwell, MA, pp. 67–83.

LI, P., C. BURGES, and Q. WU. 2008. McRank: Learning to rank using multiple classification and gradientboosting. In Advances in Neural Information Processing Systems 20. Edited by PLATT, J., D. KOLLER, Y.SINGER, and S. ROWEIS. MIT Press: Cambridge, MA, pp. 897–904.

LI, Z., B. WANG, M. LI, and W. MA. 2005. A probabilistic model for retrospective news event detection. InProceedings of the 28th Annual International ACM SIGIR Conference on Research and Development inInformation Retrieval, SIGIR ’05, ACM, New York, NY, pp. 106–113.

LIU, K. L., W. LI, and M. GUO. 2012. Emoticon smoothed language models for Twitter sentiment analysis. InProceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, Toronto, ON.

LONG, R., H. WANG, Y. CHEN, O. JIN, and Y. YU. 2011. Towards effective event detection, tracking and summa-rization on microblog data. In Web-Age Information Management, Vol. 6897 of Lecture Notes in ComputerScience. Edited by WANG, H., S. LI, S. OYAMA, X. HU, and T. QIAN. Springer: Berlin/Heidelberg,pp. 652–663.

LOVEJOY, K., and G. D. SAXTON. 2012. Information, community, and action: How nonprofit organizations usesocial media. Journal of Computer-Mediated Communication, 17(3): 337–353.

LUO, G., C. TANG, and P. S. YU. 2007. Resource-adaptive real-time new event detection. In Proceedings of the2007 ACM SIGMOD International Conference on Management of Data, SIGMOD ’07, ACM, New York,NY, pp. 497–508.


MANNING, C. D., and H. SCHÜTZE. 1999. Foundations of Statistical Natural Language Processing. MIT Press:Cambridge, MA.

MASSOUDI, K., M. TSAGKIAS, M. DE RIJKE, and W. WEERKAMP. 2011. Incorporating query expan-sion and quality indicators in searching microblog posts. In Proceedings of the 33rd EuropeanConference on Advances in Information Retrieval, ECIR’11. Springer-Verlag: Berlin, Heidelberg,pp. 362–367.

MATHIOUDAKIS, M., and N. KOUDAS. 2010. TwitterMonitor: Trend detection over the Twitter stream. InSIGMOD Conference, Indianapolis, IN, pp. 1155–1158.

MCCALLUM, A., K. NIGAM, and L. H. UNGAR. 2000. Efficient clustering of high-dimensional datasets with application to reference matching. In Proceedings of the Sixth ACM SIGKDD Interna-tional Conference on Knowledge Discovery and Data Mining, KDD ’00, ACM, New York, NY,pp. 169–178.

MCFEDRIES, P. 2007. Technically speaking: All A-Twitter. IEEE Spectrum, 44(10): 84.

MEIJ, E., W. WEERKAMP, and M. DE RIJKE. 2012. Adding semantics to microblog posts. In Proceedings ofthe fifth ACM International Conference on Web Search and Data Mining, WSDM ’12, ACM, New York,NY, pp. 563–572.

METZLER, D., C. CAI, and E. H. HOVY. 2012. Structured event retrieval over microblog archives. In HLT-NAACL, pp. 646–655.

METZLER, D., S. DUMAIS, and C. MEEK. 2007. Similarity measures for short segments of text. In Proceed-ings of the 29th European Conference on IR Research, ECIR’07. Springer-Verlag: Berlin, Heidelberg,pp. 16–27.

MIRKOVIC, M., D. CULIBRK, S. PAPADOPOULOS, C. ZIGKOLIS, Y. KOMPATSIARIS, G. MCARDLE, andV. CRNOJEVIC. 2011. A comparative study of spatial, temporal and content-based patterns emergingin Youtube and Flickr. In Computational Aspects of Social Networks (CASoN), 2011 InternationalConference, pp. 189–194.

MOHD, M. 2007. Named entity patterns across news domains. In Proceedings of the 1st BCS IRSG con-ference on Future Directions in Information Access, FDIA’07, British Computer Society, Swinton, UK,pp. 5–5.

MURPHY, K. P. 2012. Machine Learning: A Probabilistic Perspective. Adaptive Computation and MachineLearning Series. Mit Press: Cambridge, MA.

MURTHY, D. 2011. Twitter: microphone for the masses? Media, Culture & Society, 33(5): 779–789.

NAAMAN, M., H. BECKER, and GRAVANO. L. 2011. Hip and trendy: characterizing emerging trends on Twitter.Journal of the American Society of Information Science and Technology, 62(5): 902–918.

NALLAPATI, R., A. FENG, F. PENG, and J. ALLAN. 2004. Event threading within news topics. In Proceedingsof the Thirteenth ACM International Conference on Information and Knowledge Management, CIKM ’04,ACM, New York, NY, pp. 446–453.

NIE, J. Y., J. GAO, and G. CAO. 2012. Translingual mining from text data. In Mining Text Data. Edited by C.C.AGGARWAL, and C. ZHAI. Springer: New York, pp. 323–359.

OH, O., M. AGRAWAL, and H. RAO. 2011. Information control and terrorism: Tracking the Mumbai terroristattack through Twitter. Information Systems Frontiers, 13: 33–43.

PAK, A., and P. PAROUBEK. 2010. Twitter as a corpus for sentiment analysis and opinion mining. In Proceedingsof the Seventh International Conference on Language Resources and Evaluation (LREC’10), EuropeanLanguage Resources Association (ELRA), Valletta, Malta.

PAN, S. J., and Q. YANG. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and DataEngineering, 22: 1345–1359.

PAN, W., E. ZHONG, and Q. YANG. 2012. Transfer learning for text mining. In Mining Text Data. Edited by C.C.AGGARWAL, and C. ZHAI. Springer: New York, pp. 223–257.

PAPKA, R. 1999. On-line new event detection, clustering and tracking. Ph.D. thesis, Department of ComputerScience, University of Massachusetts, Amherst.


PARANJPE, D. 2009. Learning document aboutness from implicit user feedback and document structure. InProceedings of the 18th ACM Conference on Information and Knowledge Management, CIKM ’09, ACM,New York, NY, pp. 365–374.

PETROVIC, S., M. OSBORNE, and V. LAVRENKO. 2010. Streaming first story detection with application toTwitter. In Human Language Technologies: The 2010 Annual Conference of the North American Chapterof the Association for Computational Linguistics, HLT ’10, pp. 181–189.

PHUVIPADAWAT, S., and T. MURATA. 2010. Breaking news detection and tracking in Twitter. InIEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology(WI-IAT), Vol. 3, Toronto, ON, pp. 120–123.

POLIKAR, R. 2006. Ensemble based systems in decision making. IEEE Circuits and Systems Magazine, 6(3):21–45.

POPESCU, A. M., and M. PENNACCHIOTTI. 2010. Detecting controversial events from Twitter. In Proceedingsof the 19th ACM international Conference on Information and Knowledge Management, CIKM ’10, ACM,New York, NY, pp. 1873–1876.

POPESCU, A. M., M. PENNACCHIOTTI, and D. PARANJPE. 2011. Extracting events and event descriptions fromTwitter. In Proceedings of the 20th International Conference Companion on World Wide Web, WWW ’11,pp. 105–106.

RATTENBURY, T., N. GOOD, and M. NAAMAN. 2007. Towards automatic extraction of event and placesemantics from Flickr tags. In Proceedings of the 30th Annual International ACM SIGIR Confer-ence on Research and Development in Information Retrieval, SIGIR ’07, ACM, New York, NY, pp.103–110.

REINHARDT, W., B. GUNTER, E. MARTIN, and C. COSTA. 2009. How people are using Twitter duringconferences. In Proceedings of 5th EduMedia Conference, Salzburg, Vienna, pp. 145–156.

REUTER, T., and P. CIMIANO. 2012. A systematic investigation of blocking strategies for real-time classificationof social media content into events. In International AAAI Conference on Weblogs and Social Media,Dublin, Ireland.

RITTER, A., M. SAM CLARK, and O. ETZIONI. 2011. Named entity recognition in tweets: Anexperimental study. In Proceedings of the Conference on Empirical Methods in Natural Lan-guage Processing, EMNLP ’11, Association for Computational Linguistics, Stroudsburg, PA, pp.1524–1534.

SAKAKI, T., M. OKAZAKI, and Y. MATSUO. 2010. Earthquake shakes Twitter users: Real-time event detectionby social sensors. In Proceedings of the 19th International Conference on World Wide Web, WWW ’10,ACM, New York, NY, pp. 851–860.

SALTON, G. 1988. Automatic Text Processing: The Transformation, Analysis and Retrieval of Information byComputer. Addison-Wesley: Boston.

SANKARANARAYANAN, J., H. SAMET, B. E. TEITLER, M. D. LIEBERMAN, and J. SPERLING. 2009.TwitterStand: News in tweets. In Proceedings of the 17th ACM SIGSPATIAL International Con-ference on Advances in Geographic Information Systems, GIS ’09, ACM, New York, NY, pp.42–51.

SHARIFI, B., M. A. HUTTON, and J. KALITA. 2010. Summarizing microblogs automatically. In HumanLanguage Technologies: The 2010 Annual Conference of the North American Chapter of the Associa-tion for Computational Linguistics, HLT ’10, Association for Computational Linguistics, Stroudsburg, PA,pp. 685–688.

SMALL, T. A. 2011. What the hashtag? Information, Communication & Society, 14(6): 872–895.

SNOWSILL, T., F. NICART, M. STEFANI, T. DE BIE, and N. CRISTIANINI. 2010. Finding surprising patternsin textual data streams. In Cognitive Information Processing (CIP), 2010 2nd International Workshop,Tuscany, Italy, pp. 405–410.

STARBIRD, K., P. PALEN, A. L. HUGHES, and S. VIEWEG. 2010. Chatter on the red: What hazards threatreveals about the social life of microblogged information. In Proceedings of the 2010 ACM conference onComputer Supported Cooperative Work, CSCW ’10, ACM, New York, NY, pp. 241–250.


STEFANIDIS, A, A. CROOKS, and J. RADZIKOWSKI. 2011. Harvesting ambient geospatial information fromsocial media feeds. GeoJournal: 1–20.

TANG, J., X. WANG, H. GAO, X. HU, and H. LIU. 2012. Enriching short text representation in microblog forclustering. Frontiers of Computer Science in China, 6(1): 88–101.

TEEVAN, J., D. RAMAGE, and M. R. MORRIS. 2011. #TwitterSearch: A comparison of microblog search andweb search. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining,WSDM ’11, ACM, New York, NY, pp. 35–44.

THOMPSON, C. 2008. Brave New World of Digital Intimacy. New York Times. Online. Accessed on November1, 2012.

TRONCY, R., B. MALOCHA, and A. T. S. FIALHO. 2010. Linking events with media. In Proceedings ofthe 6th International Conference on Semantic Systems, I-SEMANTICS ’10, ACM, New York, NY,pp. 42:1–42:4.

TUMASJAN, A., T. O. SPRENGER, P. G. SANDNER, and I. M. WELPE. 2010. Predicting elections with Twitter:What 140 characters reveal about political sentiment. In Proceedings of the Fourth International Conferenceon Weblogs and Social Media, ICWSM. The AAAI Press: Washington, DC.

WANG, X., M. S. GERBER, and D. E. BROWN. 2012. Automatic crime prediction using events extracted fromTwitter posts. In Proceedings of the 5th International Conference on Social Computing, Behavioral-CulturalModeling and Prediction, SBP’12. Springer-Verlag: Berlin, Heidelberg, pp. 231–238.

WANG, X., C. ZHAI, X. HU, and R. SPROAT. 2007. Mining correlated bursty topic patterns from coordinatedtext streams. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discoveryand Data Mining, KDD ’07, ACM, New York, NY, pp. 784–793.

WEERKAMP, W., and M. DE RIJKE. 2008. Credibility improves topical blog post retrieval. In ACL, Columbus,OH, pp. 923–931.

WENG, J., and B.-S. LEE. 2011. Event detection in Twitter. In ICWSM, Barcelona, Spain.

XIE, L., H. SUNDARAM, and M. CAMPBELL. 2008. Event mining in multimedia stream. Proceedings of theIEEE, 96(4): 623–647.

YANG, Y., J. G. CARBONELL, R. D. BROWN, T. PIERCE, B. T. ARCHIBALD, and X. LIU. 1999. Learningapproaches for detecting and tracking news events. IEEE Intelligent Systems, 14(4): 32–43.

YANG, Y., T. PIERCE, and J. CARBONELL. 1998. A study of retrospective and on-line event detection. InProceedings of the 21st Annual International ACM SIGIR Conference on Research and Development inInformation Retrieval, SIGIR ’98, ACM, New York, NY, pp. 28–36.

YANG, Y., J. ZHANG, J. CARBONELL, and C. JIN. 2002. Topic-conditioned novelty detection. In Proceedings ofthe Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’02,ACM, New York, NY, pp. 688–693.

ZHAO, D., and M. B. ROSSON. 2009. How and why people Twitter: The role that micro-blogging plays ininformal communication at work. In Proceedings of the ACM 2009 International Conference on SupportingGroup Work, GROUP ’09, ACM, New York, NY, pp. 243–252.

ZHU, X. Updated on July 19, 2008. Semi-supervised learning literature survey, University of Wisconsin MadisonTechnical Report TR 1530.

Documents

A SURVEY OF TECHNIQUES FOR EVENT DETECTION IN TWITTERrali.iro.umontreal.ca/rali/sites/default/files/... · of techniques for event detection from Twitter streams. These techniques