9
Research Article Crowd Sensing Based Burst Computing of Events Using Social Media Zheng Xu, 1,2 Yunhuai Liu, 2 Lin Mei, 2 Hui Zhang, 1 and Chuanping Hu 2 1 Tsinghua University, Beijing 100084, China 2 e ird Research Institute of the Ministry of Public Security, Shanghai 201142, China Correspondence should be addressed to Zheng Xu; [email protected] Received 2 July 2014; Revised 25 August 2014; Accepted 29 September 2014 Academic Editor: Xiangfeng Luo Copyright © 2015 Zheng Xu et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. With the popularity of web, the internet is becoming a major information provider and poster of an event due to its real-time, open, and dynamic features. In this paper, crowd sensing based burst computation algorithm of a web event is developed in order to let the people know a web event clearly and help the social group or government process the events effectively. Different temporal features of web events are developed to provide the basics for the proposed computation algorithm. e burst power is presented to integrate the above temporal features of an event. Empirical experiments on real datasets including Google Zeitgeist and Google Trends show that the number of web pages and the average clustering coefficient can be used to detect events. e evaluations on real dataset show that the proposed function integrating the number of web pages and the average clustering coefficient can be used for event detection efficiently and correctly. 1. Introduction Crowd sensing is a process of acquisition, integration, and analysis of big and heterogeneous data generated by a diversity of sources in urban spaces, such as sensors, devices, vehicles, buildings, and human. With the help of cloud computing [14], internet of things [57], and Big Data [8, 9], crowd sensing connects unobtrusive and ubiquitous sens- ing technologies, advanced data management and analytics models, and novel visualization methods to create solutions that improve urban environment, human life quality, and city operation systems. Current research in crowd sensing addresses the following issues: crowd sensing as a novel methodology for user-centered research; development of new services and applications based on human sensing, computation, and problem solving; engineering of improved crowd sensing platforms including quality control mecha- nisms; incentive design of work; usage of crowd sensing for professional business; and theoretical frameworks for evaluation. Nowadays, no countries, no communities, and no person are immune to emergency events [10]. An emergency event is a sudden, urgent, usually unexpected incident or occurrence that requires an immediate reaction or assistance for emergency situations faced by social group (e.g., the corporations) or the recipients of public assistance [11]. How to prepare for, respond to, and recover from the emergency events is important. An apparent choice for processing an emergency event is to analyze its related information. Due to the popularity of the web, most emergency events are reported in the form of web resources. Particularly, with the development of the social media, people can get/post more and more information of emergency events from/to the web. In fact, a web user can be seen as a sensor of an emergency event. For example, if a user makes a post in microblogs or BBS about an earthquake occurrence, then she/he can be seen as an “earthquake sensor.” Web can be seen as a sensor receiver. In this paper, we call the web users as “social sensors.” In our view, using related web resources as crowd sensing to analyze emergency events has the following advantages. (1) Real-Time of Web. Web can provide related information immediately aſter an emergency event happens, which is associated with the sudden feature of emergency events. Traditional media such as newspapers and magazines cannot Hindawi Publishing Corporation International Journal of Distributed Sensor Networks Volume 2015, Article ID 184751, 8 pages http://dx.doi.org/10.1155/2015/184751

Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

Research ArticleCrowd Sensing Based Burst Computing of EventsUsing Social Media

Zheng Xu12 Yunhuai Liu2 Lin Mei2 Hui Zhang1 and Chuanping Hu2

1Tsinghua University Beijing 100084 China2TheThird Research Institute of the Ministry of Public Security Shanghai 201142 China

Correspondence should be addressed to Zheng Xu xuzhengshueducn

Received 2 July 2014 Revised 25 August 2014 Accepted 29 September 2014

Academic Editor Xiangfeng Luo

Copyright copy 2015 Zheng Xu et alThis is an open access article distributed under theCreativeCommonsAttribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

With the popularity of web the internet is becoming amajor information provider and poster of an event due to its real-time openand dynamic features In this paper crowd sensing based burst computation algorithm of a web event is developed in order to letthe people know a web event clearly and help the social group or government process the events effectively Different temporalfeatures of web events are developed to provide the basics for the proposed computation algorithm The burst power is presentedto integrate the above temporal features of an event Empirical experiments on real datasets including Google Zeitgeist and GoogleTrends show that the number of web pages and the average clustering coefficient can be used to detect events The evaluations onreal dataset show that the proposed function integrating the number of web pages and the average clustering coefficient can be usedfor event detection efficiently and correctly

1 Introduction

Crowd sensing is a process of acquisition integration andanalysis of big and heterogeneous data generated by adiversity of sources in urban spaces such as sensors devicesvehicles buildings and human With the help of cloudcomputing [1ndash4] internet of things [5ndash7] and Big Data [8 9]crowd sensing connects unobtrusive and ubiquitous sens-ing technologies advanced data management and analyticsmodels and novel visualization methods to create solutionsthat improve urban environment human life quality andcity operation systems Current research in crowd sensingaddresses the following issues crowd sensing as a novelmethodology for user-centered research development ofnew services and applications based on human sensingcomputation and problem solving engineering of improvedcrowd sensing platforms including quality control mecha-nisms incentive design of work usage of crowd sensingfor professional business and theoretical frameworks forevaluation Nowadays no countries no communities and noperson are immune to emergency events [10] An emergencyevent is a sudden urgent usually unexpected incident or

occurrence that requires an immediate reaction or assistancefor emergency situations faced by social group (eg thecorporations) or the recipients of public assistance [11] Howto prepare for respond to and recover from the emergencyevents is important An apparent choice for processing anemergency event is to analyze its related information Dueto the popularity of the web most emergency events arereported in the form of web resources Particularly with thedevelopment of the social media people can getpost moreand more information of emergency events fromto the webIn fact a web user can be seen as a sensor of an emergencyevent For example if a user makes a post in microblogs orBBS about an earthquake occurrence then shehe can beseen as an ldquoearthquake sensorrdquo Web can be seen as a sensorreceiver In this paper we call theweb users as ldquosocial sensorsrdquoIn our view using related web resources as crowd sensing toanalyze emergency events has the following advantages

(1) Real-Time of Web Web can provide related informationimmediately after an emergency event happens which isassociated with the sudden feature of emergency eventsTraditional media such as newspapers and magazines cannot

Hindawi Publishing CorporationInternational Journal of Distributed Sensor NetworksVolume 2015 Article ID 184751 8 pageshttpdxdoiorg1011552015184751

2 International Journal of Distributed Sensor Networks

report an emergency event immediately On the contraryweb can release this issue properly For example from somereports the time when Chinese web users know the Septem-ber 11 attacks is only 5 minutes later than the president of theUnited States [12 13] Recently with the rapid development ofsocial media such as Twitter and Facebook web becomes animportant eventsrsquo information provider more than ever [14]

(2) Free Spread of Web The free spread of information byweb can provide comprehensive perspective of emergencyevents Traditional media usually give some dedicated pointssuch as expert opinions to public In other words traditionalmedia usually try to conclude some emergency events ratherthan reporting them Different from the traditional mediaweb can provide different perspectives of an emergencyevent Different users can give their own opinions about anemergency event The open feature of web ensures that usersknow the different aspects of opinions about an emergencyevent

(3) Constant Change of Web The dynamic feature of webinformation can keep up with the evolution of emergencyevents Of course an emergency event is not static On thecontrary the information of it may change with the timeIn some studies [15] the change of an event is named eventevolution The event evolution generates a large volume oftemporal data For example when you search ldquoMoammarGadhafirdquo in Google the total number of news about it in24102011 is 23400 These large numbers of informationshould have an appropriate intermediate to support themBesides the large volume of data the information of anemergency event updates quickly For example the increasednumber of the news is 22800 in 25102011 which is almostequal to that in the past Web as a live and active corpus cankeep up with the evolution of emergency events

In a web environment web pages come from one or moresources Event detection is the task of capturing the first webpage that mentions previously unseen events This task haspractical applications in several domains including intelli-gence gathering financial market analyses and news anal-yses where useful information is buried in a large amountof data that grows rapidly with time The recently releasedGoogle Flu (httpwwwgoogleorgflutrends) Trends show-cases an important application on estimating flu epidemicsbased on queries received from massive web users The USgovernment is building a massive computer system that canmonitor news blogs and emails for antiterrorism purposes[16] and event detection is an essential component of thissystem In this paper we introduce a new web mining taskmdashburst power computation of emergency events using crowdsensing Given an emergency event the related web pages canbe found examples of web news microblogs and forumsBased on the semantics of these web pages formed by socialsensors the temporal features of an event are given And thenthe burst power is computedThe major contributions of thispaper are as follows

(1) In this work we define the novel problem of com-puting the burst power of an emergency event The

temporal features of an emergency event imaged onthe web are built which integrate the number ofthe increased web pages the number of increasedkeywords the distribution of keywords and therelations of keywords At last some heuristic rules aregiven to detect the different states of an emergencyevent

(2) We give the definitions of five impact factors includ-ing the number of increased web pages the numberof increased keywords the number of communitiesthe average clustering coefficient and the averagesimilarities of web pages These five impact factorscontain statistic and content information of an event

(3) Experiments on real datasets (real web events) showthe good performance of the proposed algorithm andverify its effectiveness and robustness The proposedalgorithm can help the social group or governmentprocess the emergency events and let the people knowan emergency event clearly

The rest of the paper is organized as follows In Section 2we give the related work of event detection In Section 3 weformally define the problem and a series of definitions aregiven The methods for generating five impact factors aregiven in Section 4 We discuss our experiments and resultsin Section 5 At last some conclusions are given

2 Related Work

The proposed states detection problem is similar to theresearch of Topic Detection and Tracking (TDT) Variousmethods have been proposed to manage news stories spotnews events and track the process of events [17ndash24] Usuallythe TDT research generates a hierarchical structure of anevent which aims at clustering related news into it OverallTDT technologies have been attempting to detect or clusternews stories into these events without focusing on or inter-preting with the sudden urgent and unexpected features ofemergency events [25] Since event evolution technologiesare similar to the emergency event states detection we willlist some related works about it Event evolution proposed byAllan [26] is a subtopic of topic detection and tracking Inhis study two important conclusions are given (1) a seminalevent may lead to several other events (2) the events at thebeginning may have more influence on the events comingimmediately after than the events at the later time Allan usedthe ontology tomeasure the similarity of events However theontology is difficult to get which makes the work difficultto be used directly Mei and Zhai [27] investigated themeevolution which is similar to event evolution They proposeda temporal pattern discovery technique on the basis of thetimestamps of text streams The theme of each interval isidentified and the evolution of theme between successiveintervals is extracted Unfortunately the proposed techniquedid not consider the different states of an event which mayimpact its result Wei and Chang [28] proposed an eventevolution pattern discovery technique which identifies eventepisodes together with their temporal relationships An event

International Journal of Distributed Sensor Networks 3

episode is defined as a stage or subevent of an event Theabove study differs from this paper their study deals with anevent and their event episodes whereas this paper handlesthe different states of emergency events imaged on the webLater Yang and Shi [29] aimed at discovering event evolutiongraphs from news corpora The proposed event evolutiongraph is used to present the underlying structure of theeventsTheproposedmethod uses the event timestamp eventcontent similarity temporal proximity and web pages distri-bution proximity to model the event evolution relationshipsRecently Jo et al [30] studied the method to discover theevolution of topics (ie events) over time in a timestampdocument collection They tried to capture the topologyof topic evolution that is inherent in a given corpus Theyclaimed that the topology of the topic evolution discovered byhis method is very rich and carries concrete information onhow the corpus has evolved over time Event detection basedon prior user queries is reported in [31 32] Fung et al [31]proposed to first identify the bursty features related to theuser query and then organize the documents related to thosebursty features into an event hierarchy In [32] a user specifiesan event (or a topic) of interest using several keywords as aquery The response to the query is a combination of streams(eg news feeds emails) that are sufficiently correlated andcollectively contain all query keywords within a time periodThe proposed work is also related to event detection usingclick-through data [33] Event ranking with user attention isreported in [34] where the events are firstly detected fromnews streamsUser attention is then derived from the numberof page-views (collected through web browser toolbars) forall the news articles in the same event Leskovec et al [35 36]proposed the method for outbreak detection based on cost-effective function

Overall the above methods have been proved to makegood performance on general events other than emergencyevents The emergency events possess dynamic real-timemultistates sudden and urgent features In this paper weconsider the burst feature of an emergency event imaged onthe web

3 Problems Formulation

In this section the basic definition of the proposedmethod isintroduced The temporal features of an emergency event aregiven in the next section The burst factors of the proposedmethod are given in the last section

31 Basic Definitions An event is something that happens atsome specific time and often some specific places [37ndash39]In fact this definition of events from TDT can be relaxedsince some events do not happen at some specific placeManyevents are launched by some news or stories only on the webwhich cannot get the exact location stamp Instead the timeof an emergency event can always be identified since somenews of it has an exact timestamp 119905 Besides the timestampin this paper we only take web pages into consideration sincethey can be easily processed and analyzed An emergencyevent is defined as follows

Definition 1 (emergency event 119890) An emergency event 119890 is atuple 119871

119890 119865119890 where 119871

119890is the life course of 119890 119865

119890is the set of

basic features describing 119890

Definition 2 (the basic features and life cycle of emergencyevent imaged on the web) 119865

119890and 119871

119890The basic features of an

emergency event contain three components including seedsset 119878(119905

119894 119905119895) web pages set 120593(119905

119894 119905119895) and keywords set 120595(119905

119894 119905119895)

The life cycle of an emergency event contains the starting andending timestamps ⟨119905

119904 119905119890⟩

Seeds set consists of the core keywords of an emergencyevent from timestamp 119905

119894to 119905119895 Usually these keywords can be

used to search the related web pages covering one web eventFor example we can use ldquoChina rail crashrdquo as the seeds tosearch related web pages covering the event The seeds set ofan emergency event can be found from news website whichprovides the hot news topics each day Besides the seeds sethow to get the web pages related to the emergency event isanother issue Web page set is a set with 119899 related web pagesreturn by search engines using the seeds from timestamp 119905

119894

to 119905119895 which is denoted by 120593(119905

119894 119905119895) = 1198891 1198892 119889119894 119889119899

Keywords set is a setwith119898 keywords extracted from120593(119905119894 119905119895)

which is denoted by 120595(119905119894 119905119895) = 1198961 1198962 119896119894 119896119898 Web

page 119889119894is represented by a keyword vector which is denoted

as

119889119894= 1199081 1199082 119908119898 (1)

where 119908119895= (1 + log 119905119891(119896

119895)) lowast log(1 + 119899119889119891(119896

119895)) [38] 119905119891(119896

119895)

means the term frequency of keyword 119896119895in web page 119889

119894

and 119889119891(119896119895) means the web page frequency of keyword 119896

119895in

120593(119905119894 119905119895)

Herein we use search engine such as Google (httpwwwgooglecom) and Baidu (httpwwwbaiducom) to get theweb pages related to an emergency event imaged on the webThe reasons are as follows

(1)Updating InformationRapidlyThe information of an eventrefreshes quickly For example the information of ldquoJapannuclear crisisrdquo from social sensors may update per hour andeven per minute Web search engines such as Google providethe interface to search information up to per minute

(2) Different Information Sources The information of anemergency event usually comes from different sources suchas news blogs and BBS Web search engines such as Googleand Baidu provide the interface to search information fromvarious websites

32 Basic Temporal Features In this section we present fivebasic temporal features including (1) the number of increasedweb pages (2) the number of increased keywords (3) thedistribution of keywords on web pages (4) the associatedrelations of keywords and (5) the similarities of web pages

Temporal Feature 1 (the number of increased web pages fromtimestamp 119905

119894to 119905119895 |120593(119905119894 119905119895)|) The elements in 120593(119905

119894 119905119895) do not

appear from the starting timestamp 119905119904to 119905119894 that is forall119889

119899isin

120593(119905119894 119905119895) rarr 119889

119899notin 120593(119905119904 119905119894)

4 International Journal of Distributed Sensor Networks

Temporal Feature 2 (the number of increased keywords fromtimestamp 119905

119894to 119905119895 |120595(119905119894 119905119895)|) The elements in 120595(119905

119894 119905119895) do not

appear from the starting timestamp 119905119904to 119905119894 that is forall119896

119898isin

120595(119905119894 119905119895) rarr 119896

119898notin 120595(119905119894 119905119895)

Temporal Feature 3 The distribution of keywords on webpages from timestamp 119905

119894to 119905119895 120577(119905119894 119905119895) For an emergency

event 119890 the web pages in 120593(119905119894 119905119895) can be represented as a

vector by the keywords in120595(119905119894 119905119895)These vectors can be stored

as a matrix

120577 (119905119894 119905119895) = (

11990811 1199081119898

d

1199081198991 sdot sdot sdot 119908

119899119898

) (2)

Temporal Feature 4 (the associated relations of keywords fromtimestamp 119905

119894to 119905119895 Γ(119905119894 119905119895)) For an emergency event 119890 the

associated relations of keywords can be stored as a matrix

Γ (119905119894 119905119895) = (

11989111 1198911119898

d

1198911198981 sdot sdot sdot 119891

119898119898

) (3)

where 119891119894119895means the weight of relation between 119896

119894and 119896

119895

which can be computed by

119891119894119895=

log ((119873 (119896119894and 119896119895) lowast 119899) (119873 (119896

119894) lowast 119873 (119896

119895)))

log 119899 (4)

where 119873(119896119894) means the number of in 120593(119905

119894 119905119895) containing

119896119894and 119873(119896

119894and 119896119895) is the number of web pages in 120593(119905

119894 119905119895)

containing both 119896119894and 119896119895

Temporal Feature 5 (the similarities between web pages fromtimestamp 119905

119894to 119905119895 Ξ(119905119894 119905119895)) For an emergency event 119890 the

similarities between web pages can be stored as a matrix

Ξ (119905119894 119905119895) = (

11988611 1198861119899

d

1198861198991 sdot sdot sdot 119886

119899119899

) (5)

where 119886119894119895means the similarities between 119889

119894and 119889119895 which can

be computed by

119886119894119895=

119889119894sdot 119889119895

1003817100381710038171003817119889119894

1003817100381710038171003817

10038171003817100381710038171003817119889119895

10038171003817100381710038171003817

(6)

where 119889119894 and 119889

119895 denote the mathematic model of vectors

119889119894and 119889

119895

33 Basic Burst Factors In this section we present basicburst factors including the number of communities in contextgraph the average clustering coefficient of the context graphand the average similarities of web pages

Impact Factor 1 (the number of communities in contextgraph from timestamp 119905

119894to 119905119895 |119862(119905

119894 119905119895)|) A community

is a subgraph of the context graph which reflects a partof context of an event 119890 The set of communities is asegmentation of the context graph Each context communityis a part of the context graph which is with no commonkeywords of other community The set of communities of anevent 119890 is denoted as

119862119890= 1198881 1198882 119888|119862

119890| forall119896

119894isin 119888119894and 119896119895isin 119888119895997888rarr 119896119894= 119896119895 (7)

Impact Factor 2 (the average clustering coefficient [24] ofthe context graph from timestamp 119905

119894to 119905119895 119862119862(119905

119894 119905119895)) In

graph theory a clustering coefficient is a measure of degreeto which nodes in a graph tend to cluster together Theclustering coefficient of the keyword 119896

119894in context graph can

be computed by

119862119862 (119896119894) =

2119897119901 (119901 minus 1)

(8)

where 119901means the number of neighbor node of the keyword119896119894and 119897 means the number of edges between these neighbor

nodes Thus the average clustering coefficient of the contextgraph can be computed by

119862119862 (119905119894 119905119895) =

sum119898

119894=1 119862119862 (119896119894)

119898 (9)

Impact Factor 3 (the average similarities of web pages fromtimestamp 119905

119894to 119905119895 119860119878(119905

119894 119905119895)) For an event 119890 the similarities

can be computed by cosine function

Sim (119889119894 119889119895) =

119889119894sdot 119889119895

1003817100381710038171003817119889119894

1003817100381710038171003817

10038171003817100381710038171003817119889119895

10038171003817100381710038171003817

(10)

where 119889119894 and 119889

119895 denote the mathematic model of vectors

119889119894and 119889

119895 Thus the average similarities of web pages can be

computed by

119860119878 (119905119894 119905119895) =

sum119898

119894=1sum119898

119895=1 Sim (119889119894 119889119895)

119898 (119898 minus 1) (11)

4 Computing the Burst Power

In this section based on the above definitions the methodfor computing the burst power of an emergency event isproposed

41 Basic Definitions of Burst Power After giving the basictemporal features of emergency events we define burst poweras follows

Definition 3 (burst power) 119900119901(119905119894 119905119895) For an emergency event

119890 the burst power from timestamp 119905119894to 119905119895is the influence

degree on the society

For example high |120593(119905119894 119905119895)| or |120595(119905

119894 119905119895)|means high influ-

ence degree of an event on the society thus the event has highburst power

International Journal of Distributed Sensor Networks 5

Keywords Web pages

The outbreak power is minimum when the bipartite graph of

keywords and web pages is complete

Keywords Web pages

The outbreak power is maximum when all of the keywords are only

provided by one web page

k1

k2

k3

k4

w1

w2

w3

k1

k2

k3

k4

w1

w2

w3

Figure 1 The illustration of the maximum and minimum burst power

According to the characteristic of burst power the timeinterval with high influence to the society will possess highpossibility to be the peak ormilestone of an emergency eventInspired by [40] before we propose the algorithm to computethe burst power we introduce two important definitions asfollows

Definition 4 (the representative power of keywords) 119903119901(119896)The representative power of keyword 119896 is the probability of 119896to represent the event 119890 correctly

Definition 5 (the confidence of web pages) 119888119908(119889) Theconfidence of a web page 119889 is the expected representativepower of keywords provided by 119889

Different keywords related to one event reveal the variousaspects of the event For example the keyword ldquoChinardquoreveals the place of the event ldquoChina rail crashrdquo On the otherhand the keywords ldquorailrdquo and ldquocrashrdquo reveal the object of theevent ldquoChina rail crashrdquo

42 Basic Heuristics for Computing Burst Power Based onthe common sense and the observations on real data wehave four basic heuristic rules which serve as the bases forthe computing of burst power These four heuristic rules arerelevant to the data If there is burst situation in the data theywould be correct Given the discussion field of this paper allemergency events have some different effect and spreadingpower Thus these four heuristic rules are appropriate foremergency events situation

Heuristic Rule 1 If ignoring 120595(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120593(119905119894 119905119895)|

According to heuristic rule 1 the burst power is propor-tional to the number of increased web pages So we can get119900119901(119905119894 119905119895)infin|120593(119905

119894 119905119895)|

Heuristic Rule 2 If ignoring 120593(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120595(119905119894 119905119895)|

According to heuristic rule 2 the burst power is propor-tional to the number of increased keywords So we can get119900119901(119905119894 119905119895)infin|120595(119905

119894 119905119895)|

If two time intervalswith the same |120593(119905119894 119905119895)| and |120595(119905

119894 119905119895)|

the distribution of keywords will determine 119900119901(119905119894 119905119895) So we

give heuristic rule 3

Heuristic Rule 3 If the bipartite graph of 120577(119905119894 119905119895) is a complete

graph 119900119901(119905119894 119905119895) is the lowest that is (forall119908

119899119898isin 120577(119905119894 119905119895) rarr

119908119899119898

= 0) rarr 119900119901(119905119894 119905119895)min

According to heuristic rule 3 if all of the keywords appearin each web page the similarity between them is 1 whichmeans all of the web pages are copies from one web pageThissituation means that the emergency event is with the lowestdiversity

Since120595(119905119894 119905119895) and 120593(119905

119894 119905119895) are dependent the distribution

of keywords on the web pages should be considered So wegive heuristic rule 4Heuristic Rule 4 Since120595(119905

119894 119905119895) and120593(119905

119894 119905119895) are dependent the

distribution of keywords on web pages should be consideredIf all of the keywords are provided by only one web page119900119901(119905119894 119905119895) is the highest

43 Computing the Burst Power Heuristic rule 4 givesthe condition of maximum 119900119901(119905

119894 119905119895) Figure 1 gives the

illustration of heuristic rules 3 and 4 Based on heuristic

6 International Journal of Distributed Sensor Networks

rules 3 and 4 we can conclude that the confidence of aweb page and the representative power of a keyword aredetermined by each other andwe can use an iterativemethodto compute them Thus we compute the confidence of aweb page by calculating the average representative power ofkeywords it provides as follows

119888119908 (119889ℎ) =

sum(119896119898isin

997888

119889ℎ)cap(119908ℎ119899gt0)

119903119901 (119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(12)

where |997888119889ℎ| means the number of web pages with keywords

119896119898Inspired by [40] we use probability function to compute

the representative power of keywords

119903119901 (119896119898) = 1minus prod

(119889ℎisin

997888

119896119898)cap(119908ℎ119898gt0)

(1minus 119888119908 (119889ℎ)) (13)

where997888119896119898is the set of web pages providing 119896

119898

The above two equations show how to compute theconfidence of a web page However since the similaritiesbetween web pages are not zero we put the similaritiesbetween web pages into (8) The equation can be revised as(9) which considers the similarities between web pages

1199031199011015840

(119896119898) = 119903119901 (119896

119898) + sum

(119896119894isin

997888997888

119896119896119894)cap(119891119894119895gt0)

119891119894119895lowast 119903119901 (119896

119894)

(14)

where997888997888119896119896119894is the set of keywords similarities against keyword

119896119894Since 1199031199011015840(119896) may be higher than 1 we adopt the widely

used logistic function to set 1199031199011015840(119896) into (0 1)Then (9) can berevised as

11990311990110158401015840

(119896119898) =

11 + 119890minus1199031199011015840(119896119898)

(15)

Equation (7) is revised as

119888119908 (119889ℎ) =

sum119896119898isin

997888

119889ℎcap119908ℎ119898gt011990311990110158401015840

(119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(16)

and the burst power function of time interval (119905119894 119905119895) can be

computed by the sum of confidence of all web pages

119900119901 (119905119894 119905119895) =

119899

sum

ℎ=1(1minus 119888119908 (119889

ℎ)) (17)

where 119899 means the number of web pages from time interval(119905119894 119905119895)

As described above we can compute the burst power ofthe proposed method

5 Experiments and Analysis

51 Datasets The events in our experiments are extractedfrom Google and Baidu We select 150 events with about

Table 1 The details of datasets

Feature ValueAverage number of seeds per event 2Average number of web pages per event 1012Average number of keywords per event 4534Average number of days per event 30Average number of web pages per day 34Average number of keywords per day 853

1500000 web pages in our experiments including politicsevents accidents events disasters events and terrorismevents The web pages of each event are downloaded fromGoogle Stanford tagger (httpnlpstanfordedu) is used toreserve the noun words in the web pages The keywordsare selected by their document frequencies Table 1 showsthe statistics of our experimental dataset When Google andBaidu provide the events it also gives some keywords forhelping users to search them After we get the seed set of anevent by the search engine a certain number of the web pagesare collected as samples by automatic crawling and searchingwith the seed setThe detailed steps for collecting related webresources of an event in our experiments are as follows

(1) Get the seed set of an event such as ldquoChina traincrashrdquo which can be seen as 119878(119905

119894 119905119895) of an event

(2) Search the seed set as the query and download therelated web pages with search engine which can beseen as 120593(119905

119894 119905119895) of an event

(3) Identify the starting timestamp of an event by 120593(119905119894 119905119895)

and the ending timestamp by download time whichcan be seen as 119905

119904and 119905119890of an event

(4) Get |120593(119905119894 119905119895)| |120595(119905

119894 119905119895)| 120577(119905119894 119905119895) and Ξ(119905

119894 119905119895) per day

(5) Do step (4) of different information sources includingnews blogs and BBS

52 Experimental Results After obtaining the temporal fea-tures per day of each event we select ten human annotatorsto test whether the burst power of each event is correct ornotThe data fromGoogle Trends is selected to compare withthat of the proposed method If the human annotator thinksthe proposed methods are equal to his own imagination orGoogle Trends we set the precision of this detection to trueIn additional the annotators are set to do their evaluationsindependently which ensure the reliability and validity of theresults Before the human annotator started to evaluate theexperimental results we provide the abstract descriptions ofeach event for them For example we will give some newsstories and concepts to the human annotators This trainingsession continued until the human annotators are familiarwith the concepts and the temporal feature of events In thenext step each annotator was asked to test the results of eachevent independently

In fact the overall precision of the proposed methodachieves 92 showing the accuracy of our burst powercomputation algorithm From the experiments on the real

International Journal of Distributed Sensor Networks 7

data we know that the proposed algorithm can compute theburst power of an event accuratelyThe information fromwebcan be integrated into burst power The factor can be used todetect states of an emergency event

6 Conclusions

With the popularity of web the internet is becoming amajor information provider and poster of an event due toits real-time open and dynamic features However facedwith the hugeness disorder and continuous web resourcesit is impossible for people to efficiently recognize collectand organize the events In this paper crowd sensing basedburst computation algorithm of a web event is developed inorder to let the people know a web event clearly and help thesocial group or government process the events effectivelyThedefinition of ldquosocial sensorsrdquo is firstly introduced which isthe foundation of using web resources to compute the burstpower of events on the web Secondly different temporalfeatures of web events are developed to provide the basicsfor the proposed computation algorithmMoreover the burstpower is presented to integrate the above temporal featuresof an event Empirical experiments on real datasets includingGoogle Zeitgeist and Google Trends show that that thenumber of web pages and the average clustering coefficientcan be used to detect events Some strategies integrating thenumber of web pages and the average clustering coefficientare also employed The evaluations on real dataset show thatthe proposed function integrating the number of web pagesand the average clustering coefficient can be used for eventdetection efficiently and correctly

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceandTechnologyMajor Project underGrant 2013ZX01033002-003 in part by the National High Technology Research andDevelopment Program of China (863 Program) under Grants2013AA014601 and 2013AA014603 in part by National KeyTechnology Support Program under Grant 2012BAH07B01in part by the National Science Foundation of China underGrants 61300202 and 61300028 in part by the Project of theMinistry of Public Security under Grant 2014JSYJB009 inpart by the China Postdoctoral Science Foundation underGrant 2014M560085 in part by the Science Foundation ofShanghai under Grant 13ZR1452900 in part by the ChinaNational Social Science Fund 06BFX051 and in part bythe Shanghai University training and selection of outstand-ing young teachers in special research fund hzf05046 Theauthors must be grateful for the helpful suggestion from YLiu L Mei H Zhang and C Hu

References

[1] Y Liu L Ni and C Hu ldquoA generalized probabilistic topologycontrol for wireless sensor networksrdquo IEEE Journal on SelectedAreas in Communications vol 30 no 9 pp 1780ndash1788 2012

[2] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[3] C Hu Z Xu Y Liu L Mei L Chen and X Luo ldquoSemanticlink network-based model for organizing multimedia big datardquoIEEE Transactions on Emerging Topics in Computing vol 2 no3 pp 376ndash387 2014

[4] Y Liu Y Zhu L Ni and G Xue ldquoA reliability-oriented trans-mission service in wireless sensor networksrdquo IEEE TransactionsonParallel andDistributed Systems vol 22 no 12 pp 2100ndash21072011

[5] Y Liu Q Zhang and L M Ni ldquoOpportunity-based topologycontrol in wireless sensor networksrdquo IEEE Transactions onParallel andDistributed Systems vol 21 no 3 pp 405ndash416 2010

[6] X Liu Y Yang D Yuan and J Chen ldquoDo we need to handleevery temporal violation in scientific workflow systemsrdquo ACMTransactions on Software Engineering and Methodology vol 23no 1 Article ID 2559938 2014

[7] LWang J Tao R Ranjan et al ldquoG-Hadoop mapReduce acrossdistributed data centers for data-intensive computingrdquo FutureGeneration Computer Systems vol 29 no 3 pp 739ndash750 2013

[8] Z Xu X Wei X Luo et al ldquoKnowle A semantic link networkbased system for organizing large scale online news eventsrdquoFuture Generation Computer Systems vol 43-44 pp 40ndash502015

[9] Z Xu X Luo S Zhang X Wei L Mei and C Hu ldquoMin-ing temporal explicit and implicit semantic relations betweenentities using web search enginesrdquo Future Generation ComputerSystems vol 37 pp 468ndash477 2014

[10] D Haddow A Bullock and P Coppola Introduction to Emer-gency Management Butterworth-Heinemann 2010

[11] 2012 httpdefinitionsuslegalcomeemergency-event[12] 2012 httpwwwwhointcsrsarsen[13] 2012 httpenwikipediaorgwikiSeptember 11 attacks[14] J Farber T Myers J Trevathan I Atkinson and T Andersen

ldquoRiskr a low-technological Web20 disaster service to monitorand share informationrdquo in In Proceedings of 15th InternationalConference on Network-Based Information Systems (NBiS rsquo12)pp 311ndash318 IEEE Melbourne Australia September 2012

[15] J Makkonen ldquoInvestigations on event evolution in TDTrdquo inProceedings of the Conference of the North American Chapterof the Association for Computational Linguistics on HumanLanguage (NAACLstudent rsquo03) pp 43ndash48 Edmonton CanadaMay 2003

[16] C C Yang X Shi and C-P Wei ldquoDiscovering event evolutiongraphs fromnews corporardquo IEEETransactions on SystemsManand Cybernetics Part A Systems and Humans vol 39 no 4 pp850ndash863 2009

[17] J Abonyi B Feil S Nemeth and P Arva ldquoModifiedGath-GEVaclustering for fuzzy segmentation of multivariate time-seriesrdquoFuzzy Sets and Systems vol 149 no 1 pp 39ndash56 2005

[18] X Wu Y-J Lu Q Peng and C-W Ngo ldquoMining eventstructures from web videosrdquo IEEEMultimedia vol 18 no 1 pp38ndash51 2011

8 International Journal of Distributed Sensor Networks

[19] Q He K Chang E-P Lim and A Banerjee ldquoKeep it simplewith time a reexamination of probabilistic topic detectionmodelsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 10 pp 1795ndash1808 2010

[20] C Sung and T G Kim ldquoCollaborative modeling processfor development of domain-specific discrete event simulationsystemsrdquo IEEE Transactions on Systems Man and CyberneticsPart C Applications and Reviews vol 42 no 4 pp 532ndash5462012

[21] E Keogh K Chakrabarti M Pazzani and S MehrotraldquoDimensionality reduction for fast similarity search in largetime series databasesrdquo Knowledge and Information Systems vol3 no 3 pp 263ndash286 2001

[22] J Himberg K Korpiaho H Mannila J Tikanmaki and H TT Toivonen ldquoTime series segmentation for context recognitionin mobile devicesrdquo in Proceedings of the 1st IEEE InternationalConference on Data Mining (ICDM rsquo01) pp 203ndash210 December2001

[23] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[24] P Xiong Y Fan and M Zhou ldquoWeb service configurationunder multiple quality-of-service attributesrdquo IEEE TransactionsonAutomation Science and Engineering vol 6 no 2 pp 311ndash3212009

[25] J Allan G Carbonell G Doddington J Yamron and Y YangldquoTopic detection and tracking pilot study final reportrdquo in Pro-ceedings of the Broadcast News Transcription and UnderstandingWorkshop 1998

[26] J Allan Topic Detection and Tracking Event-Based InformationOrganization Kluwer Norwell Mass USA 2000

[27] Q Mei and C Zhai ldquoDiscovering evolutionary theme patternsfrom text an exploration of temporal text miningrdquo in Pro-ceedings of the 11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 198ndash207 August2005

[28] C-P Wei and Y-H Chang ldquoDiscovering event evolution pat-terns fromdocument sequencesrdquo IEEETransactions on SystemsMan and Cybernetics Part A Systems and Humans vol 37 no2 pp 273ndash283 2007

[29] C C Yang andX Shi ldquoDiscovering event evolution graphs fromnewswiresrdquo in Proceedings of the 15th International Conferenceon World Wide Web pp 945ndash946 May 2006

[30] Y Jo C Lagoze and C L Giles ldquoDetecting research topicsvia the correlation between graphs and textsrdquo in Proceedings ofthe 13th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo07) pp 370ndash379 August2007

[31] G P C Fung J X Yu H Liu and P S Yu ldquoTime-dependentevent hierarchy constructionrdquo in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo07) pp 300ndash309 ACM August 2007

[32] V Hristidis O Valdivia M Vlachos and P S Yu ldquoContinuouskeyword search on multiple text streamsrdquo in Proceedings of the15th ACM Conference on Information and Knowledge Manage-ment (CIKM rsquo06) pp 802ndash803 November 2006

[33] Q Zhao T-Y Liu S S Bhowmick andW-Y Ma ldquoEvent detec-tion from evolution of click-through datardquo in Proceedings ofthe 12th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo06) pp 484ndash493 August2006

[34] C Wang M Zhang L Ru and S Ma ldquoAutomatic online newstopic ranking using media focus and user attention based onaging theoryrdquo in Proceedings of the 17th ACM Conference onInformation and Knowledge Management (CIKM rsquo08) pp 1033ndash1042 October 2008

[35] J Leskovec A Krause C Guestrin C Faloutsos J VanBriesenand N Glance ldquoCost-effective outbreak detection in networksrdquoin Proceedings of the 13th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo07) pp420ndash429 August 2007

[36] J Leskovec and E Horvitz ldquoPlanetary-scale views on a largeinstant-messaging networkrdquo in Proceedings of the 17th Interna-tional Conference onWorldWideWeb (WWW rsquo08) pp 915ndash924April 2008

[37] R Nallapati A Feng F Peng and J Allan ldquoEvent threadingwithin news topicsrdquo in Proceedings of the 13th ACM Conferenceon Information and Knowledge Management (CIKM rsquo04) pp446ndash453 November 2004

[38] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing and Managementvol 24 no 5 pp 513ndash523 1988

[39] X Jin S Spangler R Ma and J Han ldquoTopic initiator detectionon the world wide webrdquo in Proceedings of the 19th InternationalWorld Wide Web Conference pp 481ndash490 2010

[40] X Yin J Han and P S Yu ldquoTruth discovery with multiple con-flicting information providers on the Webrdquo IEEE Transactionson Knowledge and Data Engineering vol 20 no 6 pp 796ndash8082008

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 2: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

2 International Journal of Distributed Sensor Networks

report an emergency event immediately On the contraryweb can release this issue properly For example from somereports the time when Chinese web users know the Septem-ber 11 attacks is only 5 minutes later than the president of theUnited States [12 13] Recently with the rapid development ofsocial media such as Twitter and Facebook web becomes animportant eventsrsquo information provider more than ever [14]

(2) Free Spread of Web The free spread of information byweb can provide comprehensive perspective of emergencyevents Traditional media usually give some dedicated pointssuch as expert opinions to public In other words traditionalmedia usually try to conclude some emergency events ratherthan reporting them Different from the traditional mediaweb can provide different perspectives of an emergencyevent Different users can give their own opinions about anemergency event The open feature of web ensures that usersknow the different aspects of opinions about an emergencyevent

(3) Constant Change of Web The dynamic feature of webinformation can keep up with the evolution of emergencyevents Of course an emergency event is not static On thecontrary the information of it may change with the timeIn some studies [15] the change of an event is named eventevolution The event evolution generates a large volume oftemporal data For example when you search ldquoMoammarGadhafirdquo in Google the total number of news about it in24102011 is 23400 These large numbers of informationshould have an appropriate intermediate to support themBesides the large volume of data the information of anemergency event updates quickly For example the increasednumber of the news is 22800 in 25102011 which is almostequal to that in the past Web as a live and active corpus cankeep up with the evolution of emergency events

In a web environment web pages come from one or moresources Event detection is the task of capturing the first webpage that mentions previously unseen events This task haspractical applications in several domains including intelli-gence gathering financial market analyses and news anal-yses where useful information is buried in a large amountof data that grows rapidly with time The recently releasedGoogle Flu (httpwwwgoogleorgflutrends) Trends show-cases an important application on estimating flu epidemicsbased on queries received from massive web users The USgovernment is building a massive computer system that canmonitor news blogs and emails for antiterrorism purposes[16] and event detection is an essential component of thissystem In this paper we introduce a new web mining taskmdashburst power computation of emergency events using crowdsensing Given an emergency event the related web pages canbe found examples of web news microblogs and forumsBased on the semantics of these web pages formed by socialsensors the temporal features of an event are given And thenthe burst power is computedThe major contributions of thispaper are as follows

(1) In this work we define the novel problem of com-puting the burst power of an emergency event The

temporal features of an emergency event imaged onthe web are built which integrate the number ofthe increased web pages the number of increasedkeywords the distribution of keywords and therelations of keywords At last some heuristic rules aregiven to detect the different states of an emergencyevent

(2) We give the definitions of five impact factors includ-ing the number of increased web pages the numberof increased keywords the number of communitiesthe average clustering coefficient and the averagesimilarities of web pages These five impact factorscontain statistic and content information of an event

(3) Experiments on real datasets (real web events) showthe good performance of the proposed algorithm andverify its effectiveness and robustness The proposedalgorithm can help the social group or governmentprocess the emergency events and let the people knowan emergency event clearly

The rest of the paper is organized as follows In Section 2we give the related work of event detection In Section 3 weformally define the problem and a series of definitions aregiven The methods for generating five impact factors aregiven in Section 4 We discuss our experiments and resultsin Section 5 At last some conclusions are given

2 Related Work

The proposed states detection problem is similar to theresearch of Topic Detection and Tracking (TDT) Variousmethods have been proposed to manage news stories spotnews events and track the process of events [17ndash24] Usuallythe TDT research generates a hierarchical structure of anevent which aims at clustering related news into it OverallTDT technologies have been attempting to detect or clusternews stories into these events without focusing on or inter-preting with the sudden urgent and unexpected features ofemergency events [25] Since event evolution technologiesare similar to the emergency event states detection we willlist some related works about it Event evolution proposed byAllan [26] is a subtopic of topic detection and tracking Inhis study two important conclusions are given (1) a seminalevent may lead to several other events (2) the events at thebeginning may have more influence on the events comingimmediately after than the events at the later time Allan usedthe ontology tomeasure the similarity of events However theontology is difficult to get which makes the work difficultto be used directly Mei and Zhai [27] investigated themeevolution which is similar to event evolution They proposeda temporal pattern discovery technique on the basis of thetimestamps of text streams The theme of each interval isidentified and the evolution of theme between successiveintervals is extracted Unfortunately the proposed techniquedid not consider the different states of an event which mayimpact its result Wei and Chang [28] proposed an eventevolution pattern discovery technique which identifies eventepisodes together with their temporal relationships An event

International Journal of Distributed Sensor Networks 3

episode is defined as a stage or subevent of an event Theabove study differs from this paper their study deals with anevent and their event episodes whereas this paper handlesthe different states of emergency events imaged on the webLater Yang and Shi [29] aimed at discovering event evolutiongraphs from news corpora The proposed event evolutiongraph is used to present the underlying structure of theeventsTheproposedmethod uses the event timestamp eventcontent similarity temporal proximity and web pages distri-bution proximity to model the event evolution relationshipsRecently Jo et al [30] studied the method to discover theevolution of topics (ie events) over time in a timestampdocument collection They tried to capture the topologyof topic evolution that is inherent in a given corpus Theyclaimed that the topology of the topic evolution discovered byhis method is very rich and carries concrete information onhow the corpus has evolved over time Event detection basedon prior user queries is reported in [31 32] Fung et al [31]proposed to first identify the bursty features related to theuser query and then organize the documents related to thosebursty features into an event hierarchy In [32] a user specifiesan event (or a topic) of interest using several keywords as aquery The response to the query is a combination of streams(eg news feeds emails) that are sufficiently correlated andcollectively contain all query keywords within a time periodThe proposed work is also related to event detection usingclick-through data [33] Event ranking with user attention isreported in [34] where the events are firstly detected fromnews streamsUser attention is then derived from the numberof page-views (collected through web browser toolbars) forall the news articles in the same event Leskovec et al [35 36]proposed the method for outbreak detection based on cost-effective function

Overall the above methods have been proved to makegood performance on general events other than emergencyevents The emergency events possess dynamic real-timemultistates sudden and urgent features In this paper weconsider the burst feature of an emergency event imaged onthe web

3 Problems Formulation

In this section the basic definition of the proposedmethod isintroduced The temporal features of an emergency event aregiven in the next section The burst factors of the proposedmethod are given in the last section

31 Basic Definitions An event is something that happens atsome specific time and often some specific places [37ndash39]In fact this definition of events from TDT can be relaxedsince some events do not happen at some specific placeManyevents are launched by some news or stories only on the webwhich cannot get the exact location stamp Instead the timeof an emergency event can always be identified since somenews of it has an exact timestamp 119905 Besides the timestampin this paper we only take web pages into consideration sincethey can be easily processed and analyzed An emergencyevent is defined as follows

Definition 1 (emergency event 119890) An emergency event 119890 is atuple 119871

119890 119865119890 where 119871

119890is the life course of 119890 119865

119890is the set of

basic features describing 119890

Definition 2 (the basic features and life cycle of emergencyevent imaged on the web) 119865

119890and 119871

119890The basic features of an

emergency event contain three components including seedsset 119878(119905

119894 119905119895) web pages set 120593(119905

119894 119905119895) and keywords set 120595(119905

119894 119905119895)

The life cycle of an emergency event contains the starting andending timestamps ⟨119905

119904 119905119890⟩

Seeds set consists of the core keywords of an emergencyevent from timestamp 119905

119894to 119905119895 Usually these keywords can be

used to search the related web pages covering one web eventFor example we can use ldquoChina rail crashrdquo as the seeds tosearch related web pages covering the event The seeds set ofan emergency event can be found from news website whichprovides the hot news topics each day Besides the seeds sethow to get the web pages related to the emergency event isanother issue Web page set is a set with 119899 related web pagesreturn by search engines using the seeds from timestamp 119905

119894

to 119905119895 which is denoted by 120593(119905

119894 119905119895) = 1198891 1198892 119889119894 119889119899

Keywords set is a setwith119898 keywords extracted from120593(119905119894 119905119895)

which is denoted by 120595(119905119894 119905119895) = 1198961 1198962 119896119894 119896119898 Web

page 119889119894is represented by a keyword vector which is denoted

as

119889119894= 1199081 1199082 119908119898 (1)

where 119908119895= (1 + log 119905119891(119896

119895)) lowast log(1 + 119899119889119891(119896

119895)) [38] 119905119891(119896

119895)

means the term frequency of keyword 119896119895in web page 119889

119894

and 119889119891(119896119895) means the web page frequency of keyword 119896

119895in

120593(119905119894 119905119895)

Herein we use search engine such as Google (httpwwwgooglecom) and Baidu (httpwwwbaiducom) to get theweb pages related to an emergency event imaged on the webThe reasons are as follows

(1)Updating InformationRapidlyThe information of an eventrefreshes quickly For example the information of ldquoJapannuclear crisisrdquo from social sensors may update per hour andeven per minute Web search engines such as Google providethe interface to search information up to per minute

(2) Different Information Sources The information of anemergency event usually comes from different sources suchas news blogs and BBS Web search engines such as Googleand Baidu provide the interface to search information fromvarious websites

32 Basic Temporal Features In this section we present fivebasic temporal features including (1) the number of increasedweb pages (2) the number of increased keywords (3) thedistribution of keywords on web pages (4) the associatedrelations of keywords and (5) the similarities of web pages

Temporal Feature 1 (the number of increased web pages fromtimestamp 119905

119894to 119905119895 |120593(119905119894 119905119895)|) The elements in 120593(119905

119894 119905119895) do not

appear from the starting timestamp 119905119904to 119905119894 that is forall119889

119899isin

120593(119905119894 119905119895) rarr 119889

119899notin 120593(119905119904 119905119894)

4 International Journal of Distributed Sensor Networks

Temporal Feature 2 (the number of increased keywords fromtimestamp 119905

119894to 119905119895 |120595(119905119894 119905119895)|) The elements in 120595(119905

119894 119905119895) do not

appear from the starting timestamp 119905119904to 119905119894 that is forall119896

119898isin

120595(119905119894 119905119895) rarr 119896

119898notin 120595(119905119894 119905119895)

Temporal Feature 3 The distribution of keywords on webpages from timestamp 119905

119894to 119905119895 120577(119905119894 119905119895) For an emergency

event 119890 the web pages in 120593(119905119894 119905119895) can be represented as a

vector by the keywords in120595(119905119894 119905119895)These vectors can be stored

as a matrix

120577 (119905119894 119905119895) = (

11990811 1199081119898

d

1199081198991 sdot sdot sdot 119908

119899119898

) (2)

Temporal Feature 4 (the associated relations of keywords fromtimestamp 119905

119894to 119905119895 Γ(119905119894 119905119895)) For an emergency event 119890 the

associated relations of keywords can be stored as a matrix

Γ (119905119894 119905119895) = (

11989111 1198911119898

d

1198911198981 sdot sdot sdot 119891

119898119898

) (3)

where 119891119894119895means the weight of relation between 119896

119894and 119896

119895

which can be computed by

119891119894119895=

log ((119873 (119896119894and 119896119895) lowast 119899) (119873 (119896

119894) lowast 119873 (119896

119895)))

log 119899 (4)

where 119873(119896119894) means the number of in 120593(119905

119894 119905119895) containing

119896119894and 119873(119896

119894and 119896119895) is the number of web pages in 120593(119905

119894 119905119895)

containing both 119896119894and 119896119895

Temporal Feature 5 (the similarities between web pages fromtimestamp 119905

119894to 119905119895 Ξ(119905119894 119905119895)) For an emergency event 119890 the

similarities between web pages can be stored as a matrix

Ξ (119905119894 119905119895) = (

11988611 1198861119899

d

1198861198991 sdot sdot sdot 119886

119899119899

) (5)

where 119886119894119895means the similarities between 119889

119894and 119889119895 which can

be computed by

119886119894119895=

119889119894sdot 119889119895

1003817100381710038171003817119889119894

1003817100381710038171003817

10038171003817100381710038171003817119889119895

10038171003817100381710038171003817

(6)

where 119889119894 and 119889

119895 denote the mathematic model of vectors

119889119894and 119889

119895

33 Basic Burst Factors In this section we present basicburst factors including the number of communities in contextgraph the average clustering coefficient of the context graphand the average similarities of web pages

Impact Factor 1 (the number of communities in contextgraph from timestamp 119905

119894to 119905119895 |119862(119905

119894 119905119895)|) A community

is a subgraph of the context graph which reflects a partof context of an event 119890 The set of communities is asegmentation of the context graph Each context communityis a part of the context graph which is with no commonkeywords of other community The set of communities of anevent 119890 is denoted as

119862119890= 1198881 1198882 119888|119862

119890| forall119896

119894isin 119888119894and 119896119895isin 119888119895997888rarr 119896119894= 119896119895 (7)

Impact Factor 2 (the average clustering coefficient [24] ofthe context graph from timestamp 119905

119894to 119905119895 119862119862(119905

119894 119905119895)) In

graph theory a clustering coefficient is a measure of degreeto which nodes in a graph tend to cluster together Theclustering coefficient of the keyword 119896

119894in context graph can

be computed by

119862119862 (119896119894) =

2119897119901 (119901 minus 1)

(8)

where 119901means the number of neighbor node of the keyword119896119894and 119897 means the number of edges between these neighbor

nodes Thus the average clustering coefficient of the contextgraph can be computed by

119862119862 (119905119894 119905119895) =

sum119898

119894=1 119862119862 (119896119894)

119898 (9)

Impact Factor 3 (the average similarities of web pages fromtimestamp 119905

119894to 119905119895 119860119878(119905

119894 119905119895)) For an event 119890 the similarities

can be computed by cosine function

Sim (119889119894 119889119895) =

119889119894sdot 119889119895

1003817100381710038171003817119889119894

1003817100381710038171003817

10038171003817100381710038171003817119889119895

10038171003817100381710038171003817

(10)

where 119889119894 and 119889

119895 denote the mathematic model of vectors

119889119894and 119889

119895 Thus the average similarities of web pages can be

computed by

119860119878 (119905119894 119905119895) =

sum119898

119894=1sum119898

119895=1 Sim (119889119894 119889119895)

119898 (119898 minus 1) (11)

4 Computing the Burst Power

In this section based on the above definitions the methodfor computing the burst power of an emergency event isproposed

41 Basic Definitions of Burst Power After giving the basictemporal features of emergency events we define burst poweras follows

Definition 3 (burst power) 119900119901(119905119894 119905119895) For an emergency event

119890 the burst power from timestamp 119905119894to 119905119895is the influence

degree on the society

For example high |120593(119905119894 119905119895)| or |120595(119905

119894 119905119895)|means high influ-

ence degree of an event on the society thus the event has highburst power

International Journal of Distributed Sensor Networks 5

Keywords Web pages

The outbreak power is minimum when the bipartite graph of

keywords and web pages is complete

Keywords Web pages

The outbreak power is maximum when all of the keywords are only

provided by one web page

k1

k2

k3

k4

w1

w2

w3

k1

k2

k3

k4

w1

w2

w3

Figure 1 The illustration of the maximum and minimum burst power

According to the characteristic of burst power the timeinterval with high influence to the society will possess highpossibility to be the peak ormilestone of an emergency eventInspired by [40] before we propose the algorithm to computethe burst power we introduce two important definitions asfollows

Definition 4 (the representative power of keywords) 119903119901(119896)The representative power of keyword 119896 is the probability of 119896to represent the event 119890 correctly

Definition 5 (the confidence of web pages) 119888119908(119889) Theconfidence of a web page 119889 is the expected representativepower of keywords provided by 119889

Different keywords related to one event reveal the variousaspects of the event For example the keyword ldquoChinardquoreveals the place of the event ldquoChina rail crashrdquo On the otherhand the keywords ldquorailrdquo and ldquocrashrdquo reveal the object of theevent ldquoChina rail crashrdquo

42 Basic Heuristics for Computing Burst Power Based onthe common sense and the observations on real data wehave four basic heuristic rules which serve as the bases forthe computing of burst power These four heuristic rules arerelevant to the data If there is burst situation in the data theywould be correct Given the discussion field of this paper allemergency events have some different effect and spreadingpower Thus these four heuristic rules are appropriate foremergency events situation

Heuristic Rule 1 If ignoring 120595(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120593(119905119894 119905119895)|

According to heuristic rule 1 the burst power is propor-tional to the number of increased web pages So we can get119900119901(119905119894 119905119895)infin|120593(119905

119894 119905119895)|

Heuristic Rule 2 If ignoring 120593(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120595(119905119894 119905119895)|

According to heuristic rule 2 the burst power is propor-tional to the number of increased keywords So we can get119900119901(119905119894 119905119895)infin|120595(119905

119894 119905119895)|

If two time intervalswith the same |120593(119905119894 119905119895)| and |120595(119905

119894 119905119895)|

the distribution of keywords will determine 119900119901(119905119894 119905119895) So we

give heuristic rule 3

Heuristic Rule 3 If the bipartite graph of 120577(119905119894 119905119895) is a complete

graph 119900119901(119905119894 119905119895) is the lowest that is (forall119908

119899119898isin 120577(119905119894 119905119895) rarr

119908119899119898

= 0) rarr 119900119901(119905119894 119905119895)min

According to heuristic rule 3 if all of the keywords appearin each web page the similarity between them is 1 whichmeans all of the web pages are copies from one web pageThissituation means that the emergency event is with the lowestdiversity

Since120595(119905119894 119905119895) and 120593(119905

119894 119905119895) are dependent the distribution

of keywords on the web pages should be considered So wegive heuristic rule 4Heuristic Rule 4 Since120595(119905

119894 119905119895) and120593(119905

119894 119905119895) are dependent the

distribution of keywords on web pages should be consideredIf all of the keywords are provided by only one web page119900119901(119905119894 119905119895) is the highest

43 Computing the Burst Power Heuristic rule 4 givesthe condition of maximum 119900119901(119905

119894 119905119895) Figure 1 gives the

illustration of heuristic rules 3 and 4 Based on heuristic

6 International Journal of Distributed Sensor Networks

rules 3 and 4 we can conclude that the confidence of aweb page and the representative power of a keyword aredetermined by each other andwe can use an iterativemethodto compute them Thus we compute the confidence of aweb page by calculating the average representative power ofkeywords it provides as follows

119888119908 (119889ℎ) =

sum(119896119898isin

997888

119889ℎ)cap(119908ℎ119899gt0)

119903119901 (119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(12)

where |997888119889ℎ| means the number of web pages with keywords

119896119898Inspired by [40] we use probability function to compute

the representative power of keywords

119903119901 (119896119898) = 1minus prod

(119889ℎisin

997888

119896119898)cap(119908ℎ119898gt0)

(1minus 119888119908 (119889ℎ)) (13)

where997888119896119898is the set of web pages providing 119896

119898

The above two equations show how to compute theconfidence of a web page However since the similaritiesbetween web pages are not zero we put the similaritiesbetween web pages into (8) The equation can be revised as(9) which considers the similarities between web pages

1199031199011015840

(119896119898) = 119903119901 (119896

119898) + sum

(119896119894isin

997888997888

119896119896119894)cap(119891119894119895gt0)

119891119894119895lowast 119903119901 (119896

119894)

(14)

where997888997888119896119896119894is the set of keywords similarities against keyword

119896119894Since 1199031199011015840(119896) may be higher than 1 we adopt the widely

used logistic function to set 1199031199011015840(119896) into (0 1)Then (9) can berevised as

11990311990110158401015840

(119896119898) =

11 + 119890minus1199031199011015840(119896119898)

(15)

Equation (7) is revised as

119888119908 (119889ℎ) =

sum119896119898isin

997888

119889ℎcap119908ℎ119898gt011990311990110158401015840

(119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(16)

and the burst power function of time interval (119905119894 119905119895) can be

computed by the sum of confidence of all web pages

119900119901 (119905119894 119905119895) =

119899

sum

ℎ=1(1minus 119888119908 (119889

ℎ)) (17)

where 119899 means the number of web pages from time interval(119905119894 119905119895)

As described above we can compute the burst power ofthe proposed method

5 Experiments and Analysis

51 Datasets The events in our experiments are extractedfrom Google and Baidu We select 150 events with about

Table 1 The details of datasets

Feature ValueAverage number of seeds per event 2Average number of web pages per event 1012Average number of keywords per event 4534Average number of days per event 30Average number of web pages per day 34Average number of keywords per day 853

1500000 web pages in our experiments including politicsevents accidents events disasters events and terrorismevents The web pages of each event are downloaded fromGoogle Stanford tagger (httpnlpstanfordedu) is used toreserve the noun words in the web pages The keywordsare selected by their document frequencies Table 1 showsthe statistics of our experimental dataset When Google andBaidu provide the events it also gives some keywords forhelping users to search them After we get the seed set of anevent by the search engine a certain number of the web pagesare collected as samples by automatic crawling and searchingwith the seed setThe detailed steps for collecting related webresources of an event in our experiments are as follows

(1) Get the seed set of an event such as ldquoChina traincrashrdquo which can be seen as 119878(119905

119894 119905119895) of an event

(2) Search the seed set as the query and download therelated web pages with search engine which can beseen as 120593(119905

119894 119905119895) of an event

(3) Identify the starting timestamp of an event by 120593(119905119894 119905119895)

and the ending timestamp by download time whichcan be seen as 119905

119904and 119905119890of an event

(4) Get |120593(119905119894 119905119895)| |120595(119905

119894 119905119895)| 120577(119905119894 119905119895) and Ξ(119905

119894 119905119895) per day

(5) Do step (4) of different information sources includingnews blogs and BBS

52 Experimental Results After obtaining the temporal fea-tures per day of each event we select ten human annotatorsto test whether the burst power of each event is correct ornotThe data fromGoogle Trends is selected to compare withthat of the proposed method If the human annotator thinksthe proposed methods are equal to his own imagination orGoogle Trends we set the precision of this detection to trueIn additional the annotators are set to do their evaluationsindependently which ensure the reliability and validity of theresults Before the human annotator started to evaluate theexperimental results we provide the abstract descriptions ofeach event for them For example we will give some newsstories and concepts to the human annotators This trainingsession continued until the human annotators are familiarwith the concepts and the temporal feature of events In thenext step each annotator was asked to test the results of eachevent independently

In fact the overall precision of the proposed methodachieves 92 showing the accuracy of our burst powercomputation algorithm From the experiments on the real

International Journal of Distributed Sensor Networks 7

data we know that the proposed algorithm can compute theburst power of an event accuratelyThe information fromwebcan be integrated into burst power The factor can be used todetect states of an emergency event

6 Conclusions

With the popularity of web the internet is becoming amajor information provider and poster of an event due toits real-time open and dynamic features However facedwith the hugeness disorder and continuous web resourcesit is impossible for people to efficiently recognize collectand organize the events In this paper crowd sensing basedburst computation algorithm of a web event is developed inorder to let the people know a web event clearly and help thesocial group or government process the events effectivelyThedefinition of ldquosocial sensorsrdquo is firstly introduced which isthe foundation of using web resources to compute the burstpower of events on the web Secondly different temporalfeatures of web events are developed to provide the basicsfor the proposed computation algorithmMoreover the burstpower is presented to integrate the above temporal featuresof an event Empirical experiments on real datasets includingGoogle Zeitgeist and Google Trends show that that thenumber of web pages and the average clustering coefficientcan be used to detect events Some strategies integrating thenumber of web pages and the average clustering coefficientare also employed The evaluations on real dataset show thatthe proposed function integrating the number of web pagesand the average clustering coefficient can be used for eventdetection efficiently and correctly

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceandTechnologyMajor Project underGrant 2013ZX01033002-003 in part by the National High Technology Research andDevelopment Program of China (863 Program) under Grants2013AA014601 and 2013AA014603 in part by National KeyTechnology Support Program under Grant 2012BAH07B01in part by the National Science Foundation of China underGrants 61300202 and 61300028 in part by the Project of theMinistry of Public Security under Grant 2014JSYJB009 inpart by the China Postdoctoral Science Foundation underGrant 2014M560085 in part by the Science Foundation ofShanghai under Grant 13ZR1452900 in part by the ChinaNational Social Science Fund 06BFX051 and in part bythe Shanghai University training and selection of outstand-ing young teachers in special research fund hzf05046 Theauthors must be grateful for the helpful suggestion from YLiu L Mei H Zhang and C Hu

References

[1] Y Liu L Ni and C Hu ldquoA generalized probabilistic topologycontrol for wireless sensor networksrdquo IEEE Journal on SelectedAreas in Communications vol 30 no 9 pp 1780ndash1788 2012

[2] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[3] C Hu Z Xu Y Liu L Mei L Chen and X Luo ldquoSemanticlink network-based model for organizing multimedia big datardquoIEEE Transactions on Emerging Topics in Computing vol 2 no3 pp 376ndash387 2014

[4] Y Liu Y Zhu L Ni and G Xue ldquoA reliability-oriented trans-mission service in wireless sensor networksrdquo IEEE TransactionsonParallel andDistributed Systems vol 22 no 12 pp 2100ndash21072011

[5] Y Liu Q Zhang and L M Ni ldquoOpportunity-based topologycontrol in wireless sensor networksrdquo IEEE Transactions onParallel andDistributed Systems vol 21 no 3 pp 405ndash416 2010

[6] X Liu Y Yang D Yuan and J Chen ldquoDo we need to handleevery temporal violation in scientific workflow systemsrdquo ACMTransactions on Software Engineering and Methodology vol 23no 1 Article ID 2559938 2014

[7] LWang J Tao R Ranjan et al ldquoG-Hadoop mapReduce acrossdistributed data centers for data-intensive computingrdquo FutureGeneration Computer Systems vol 29 no 3 pp 739ndash750 2013

[8] Z Xu X Wei X Luo et al ldquoKnowle A semantic link networkbased system for organizing large scale online news eventsrdquoFuture Generation Computer Systems vol 43-44 pp 40ndash502015

[9] Z Xu X Luo S Zhang X Wei L Mei and C Hu ldquoMin-ing temporal explicit and implicit semantic relations betweenentities using web search enginesrdquo Future Generation ComputerSystems vol 37 pp 468ndash477 2014

[10] D Haddow A Bullock and P Coppola Introduction to Emer-gency Management Butterworth-Heinemann 2010

[11] 2012 httpdefinitionsuslegalcomeemergency-event[12] 2012 httpwwwwhointcsrsarsen[13] 2012 httpenwikipediaorgwikiSeptember 11 attacks[14] J Farber T Myers J Trevathan I Atkinson and T Andersen

ldquoRiskr a low-technological Web20 disaster service to monitorand share informationrdquo in In Proceedings of 15th InternationalConference on Network-Based Information Systems (NBiS rsquo12)pp 311ndash318 IEEE Melbourne Australia September 2012

[15] J Makkonen ldquoInvestigations on event evolution in TDTrdquo inProceedings of the Conference of the North American Chapterof the Association for Computational Linguistics on HumanLanguage (NAACLstudent rsquo03) pp 43ndash48 Edmonton CanadaMay 2003

[16] C C Yang X Shi and C-P Wei ldquoDiscovering event evolutiongraphs fromnews corporardquo IEEETransactions on SystemsManand Cybernetics Part A Systems and Humans vol 39 no 4 pp850ndash863 2009

[17] J Abonyi B Feil S Nemeth and P Arva ldquoModifiedGath-GEVaclustering for fuzzy segmentation of multivariate time-seriesrdquoFuzzy Sets and Systems vol 149 no 1 pp 39ndash56 2005

[18] X Wu Y-J Lu Q Peng and C-W Ngo ldquoMining eventstructures from web videosrdquo IEEEMultimedia vol 18 no 1 pp38ndash51 2011

8 International Journal of Distributed Sensor Networks

[19] Q He K Chang E-P Lim and A Banerjee ldquoKeep it simplewith time a reexamination of probabilistic topic detectionmodelsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 10 pp 1795ndash1808 2010

[20] C Sung and T G Kim ldquoCollaborative modeling processfor development of domain-specific discrete event simulationsystemsrdquo IEEE Transactions on Systems Man and CyberneticsPart C Applications and Reviews vol 42 no 4 pp 532ndash5462012

[21] E Keogh K Chakrabarti M Pazzani and S MehrotraldquoDimensionality reduction for fast similarity search in largetime series databasesrdquo Knowledge and Information Systems vol3 no 3 pp 263ndash286 2001

[22] J Himberg K Korpiaho H Mannila J Tikanmaki and H TT Toivonen ldquoTime series segmentation for context recognitionin mobile devicesrdquo in Proceedings of the 1st IEEE InternationalConference on Data Mining (ICDM rsquo01) pp 203ndash210 December2001

[23] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[24] P Xiong Y Fan and M Zhou ldquoWeb service configurationunder multiple quality-of-service attributesrdquo IEEE TransactionsonAutomation Science and Engineering vol 6 no 2 pp 311ndash3212009

[25] J Allan G Carbonell G Doddington J Yamron and Y YangldquoTopic detection and tracking pilot study final reportrdquo in Pro-ceedings of the Broadcast News Transcription and UnderstandingWorkshop 1998

[26] J Allan Topic Detection and Tracking Event-Based InformationOrganization Kluwer Norwell Mass USA 2000

[27] Q Mei and C Zhai ldquoDiscovering evolutionary theme patternsfrom text an exploration of temporal text miningrdquo in Pro-ceedings of the 11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 198ndash207 August2005

[28] C-P Wei and Y-H Chang ldquoDiscovering event evolution pat-terns fromdocument sequencesrdquo IEEETransactions on SystemsMan and Cybernetics Part A Systems and Humans vol 37 no2 pp 273ndash283 2007

[29] C C Yang andX Shi ldquoDiscovering event evolution graphs fromnewswiresrdquo in Proceedings of the 15th International Conferenceon World Wide Web pp 945ndash946 May 2006

[30] Y Jo C Lagoze and C L Giles ldquoDetecting research topicsvia the correlation between graphs and textsrdquo in Proceedings ofthe 13th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo07) pp 370ndash379 August2007

[31] G P C Fung J X Yu H Liu and P S Yu ldquoTime-dependentevent hierarchy constructionrdquo in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo07) pp 300ndash309 ACM August 2007

[32] V Hristidis O Valdivia M Vlachos and P S Yu ldquoContinuouskeyword search on multiple text streamsrdquo in Proceedings of the15th ACM Conference on Information and Knowledge Manage-ment (CIKM rsquo06) pp 802ndash803 November 2006

[33] Q Zhao T-Y Liu S S Bhowmick andW-Y Ma ldquoEvent detec-tion from evolution of click-through datardquo in Proceedings ofthe 12th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo06) pp 484ndash493 August2006

[34] C Wang M Zhang L Ru and S Ma ldquoAutomatic online newstopic ranking using media focus and user attention based onaging theoryrdquo in Proceedings of the 17th ACM Conference onInformation and Knowledge Management (CIKM rsquo08) pp 1033ndash1042 October 2008

[35] J Leskovec A Krause C Guestrin C Faloutsos J VanBriesenand N Glance ldquoCost-effective outbreak detection in networksrdquoin Proceedings of the 13th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo07) pp420ndash429 August 2007

[36] J Leskovec and E Horvitz ldquoPlanetary-scale views on a largeinstant-messaging networkrdquo in Proceedings of the 17th Interna-tional Conference onWorldWideWeb (WWW rsquo08) pp 915ndash924April 2008

[37] R Nallapati A Feng F Peng and J Allan ldquoEvent threadingwithin news topicsrdquo in Proceedings of the 13th ACM Conferenceon Information and Knowledge Management (CIKM rsquo04) pp446ndash453 November 2004

[38] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing and Managementvol 24 no 5 pp 513ndash523 1988

[39] X Jin S Spangler R Ma and J Han ldquoTopic initiator detectionon the world wide webrdquo in Proceedings of the 19th InternationalWorld Wide Web Conference pp 481ndash490 2010

[40] X Yin J Han and P S Yu ldquoTruth discovery with multiple con-flicting information providers on the Webrdquo IEEE Transactionson Knowledge and Data Engineering vol 20 no 6 pp 796ndash8082008

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 3: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

International Journal of Distributed Sensor Networks 3

episode is defined as a stage or subevent of an event Theabove study differs from this paper their study deals with anevent and their event episodes whereas this paper handlesthe different states of emergency events imaged on the webLater Yang and Shi [29] aimed at discovering event evolutiongraphs from news corpora The proposed event evolutiongraph is used to present the underlying structure of theeventsTheproposedmethod uses the event timestamp eventcontent similarity temporal proximity and web pages distri-bution proximity to model the event evolution relationshipsRecently Jo et al [30] studied the method to discover theevolution of topics (ie events) over time in a timestampdocument collection They tried to capture the topologyof topic evolution that is inherent in a given corpus Theyclaimed that the topology of the topic evolution discovered byhis method is very rich and carries concrete information onhow the corpus has evolved over time Event detection basedon prior user queries is reported in [31 32] Fung et al [31]proposed to first identify the bursty features related to theuser query and then organize the documents related to thosebursty features into an event hierarchy In [32] a user specifiesan event (or a topic) of interest using several keywords as aquery The response to the query is a combination of streams(eg news feeds emails) that are sufficiently correlated andcollectively contain all query keywords within a time periodThe proposed work is also related to event detection usingclick-through data [33] Event ranking with user attention isreported in [34] where the events are firstly detected fromnews streamsUser attention is then derived from the numberof page-views (collected through web browser toolbars) forall the news articles in the same event Leskovec et al [35 36]proposed the method for outbreak detection based on cost-effective function

Overall the above methods have been proved to makegood performance on general events other than emergencyevents The emergency events possess dynamic real-timemultistates sudden and urgent features In this paper weconsider the burst feature of an emergency event imaged onthe web

3 Problems Formulation

In this section the basic definition of the proposedmethod isintroduced The temporal features of an emergency event aregiven in the next section The burst factors of the proposedmethod are given in the last section

31 Basic Definitions An event is something that happens atsome specific time and often some specific places [37ndash39]In fact this definition of events from TDT can be relaxedsince some events do not happen at some specific placeManyevents are launched by some news or stories only on the webwhich cannot get the exact location stamp Instead the timeof an emergency event can always be identified since somenews of it has an exact timestamp 119905 Besides the timestampin this paper we only take web pages into consideration sincethey can be easily processed and analyzed An emergencyevent is defined as follows

Definition 1 (emergency event 119890) An emergency event 119890 is atuple 119871

119890 119865119890 where 119871

119890is the life course of 119890 119865

119890is the set of

basic features describing 119890

Definition 2 (the basic features and life cycle of emergencyevent imaged on the web) 119865

119890and 119871

119890The basic features of an

emergency event contain three components including seedsset 119878(119905

119894 119905119895) web pages set 120593(119905

119894 119905119895) and keywords set 120595(119905

119894 119905119895)

The life cycle of an emergency event contains the starting andending timestamps ⟨119905

119904 119905119890⟩

Seeds set consists of the core keywords of an emergencyevent from timestamp 119905

119894to 119905119895 Usually these keywords can be

used to search the related web pages covering one web eventFor example we can use ldquoChina rail crashrdquo as the seeds tosearch related web pages covering the event The seeds set ofan emergency event can be found from news website whichprovides the hot news topics each day Besides the seeds sethow to get the web pages related to the emergency event isanother issue Web page set is a set with 119899 related web pagesreturn by search engines using the seeds from timestamp 119905

119894

to 119905119895 which is denoted by 120593(119905

119894 119905119895) = 1198891 1198892 119889119894 119889119899

Keywords set is a setwith119898 keywords extracted from120593(119905119894 119905119895)

which is denoted by 120595(119905119894 119905119895) = 1198961 1198962 119896119894 119896119898 Web

page 119889119894is represented by a keyword vector which is denoted

as

119889119894= 1199081 1199082 119908119898 (1)

where 119908119895= (1 + log 119905119891(119896

119895)) lowast log(1 + 119899119889119891(119896

119895)) [38] 119905119891(119896

119895)

means the term frequency of keyword 119896119895in web page 119889

119894

and 119889119891(119896119895) means the web page frequency of keyword 119896

119895in

120593(119905119894 119905119895)

Herein we use search engine such as Google (httpwwwgooglecom) and Baidu (httpwwwbaiducom) to get theweb pages related to an emergency event imaged on the webThe reasons are as follows

(1)Updating InformationRapidlyThe information of an eventrefreshes quickly For example the information of ldquoJapannuclear crisisrdquo from social sensors may update per hour andeven per minute Web search engines such as Google providethe interface to search information up to per minute

(2) Different Information Sources The information of anemergency event usually comes from different sources suchas news blogs and BBS Web search engines such as Googleand Baidu provide the interface to search information fromvarious websites

32 Basic Temporal Features In this section we present fivebasic temporal features including (1) the number of increasedweb pages (2) the number of increased keywords (3) thedistribution of keywords on web pages (4) the associatedrelations of keywords and (5) the similarities of web pages

Temporal Feature 1 (the number of increased web pages fromtimestamp 119905

119894to 119905119895 |120593(119905119894 119905119895)|) The elements in 120593(119905

119894 119905119895) do not

appear from the starting timestamp 119905119904to 119905119894 that is forall119889

119899isin

120593(119905119894 119905119895) rarr 119889

119899notin 120593(119905119904 119905119894)

4 International Journal of Distributed Sensor Networks

Temporal Feature 2 (the number of increased keywords fromtimestamp 119905

119894to 119905119895 |120595(119905119894 119905119895)|) The elements in 120595(119905

119894 119905119895) do not

appear from the starting timestamp 119905119904to 119905119894 that is forall119896

119898isin

120595(119905119894 119905119895) rarr 119896

119898notin 120595(119905119894 119905119895)

Temporal Feature 3 The distribution of keywords on webpages from timestamp 119905

119894to 119905119895 120577(119905119894 119905119895) For an emergency

event 119890 the web pages in 120593(119905119894 119905119895) can be represented as a

vector by the keywords in120595(119905119894 119905119895)These vectors can be stored

as a matrix

120577 (119905119894 119905119895) = (

11990811 1199081119898

d

1199081198991 sdot sdot sdot 119908

119899119898

) (2)

Temporal Feature 4 (the associated relations of keywords fromtimestamp 119905

119894to 119905119895 Γ(119905119894 119905119895)) For an emergency event 119890 the

associated relations of keywords can be stored as a matrix

Γ (119905119894 119905119895) = (

11989111 1198911119898

d

1198911198981 sdot sdot sdot 119891

119898119898

) (3)

where 119891119894119895means the weight of relation between 119896

119894and 119896

119895

which can be computed by

119891119894119895=

log ((119873 (119896119894and 119896119895) lowast 119899) (119873 (119896

119894) lowast 119873 (119896

119895)))

log 119899 (4)

where 119873(119896119894) means the number of in 120593(119905

119894 119905119895) containing

119896119894and 119873(119896

119894and 119896119895) is the number of web pages in 120593(119905

119894 119905119895)

containing both 119896119894and 119896119895

Temporal Feature 5 (the similarities between web pages fromtimestamp 119905

119894to 119905119895 Ξ(119905119894 119905119895)) For an emergency event 119890 the

similarities between web pages can be stored as a matrix

Ξ (119905119894 119905119895) = (

11988611 1198861119899

d

1198861198991 sdot sdot sdot 119886

119899119899

) (5)

where 119886119894119895means the similarities between 119889

119894and 119889119895 which can

be computed by

119886119894119895=

119889119894sdot 119889119895

1003817100381710038171003817119889119894

1003817100381710038171003817

10038171003817100381710038171003817119889119895

10038171003817100381710038171003817

(6)

where 119889119894 and 119889

119895 denote the mathematic model of vectors

119889119894and 119889

119895

33 Basic Burst Factors In this section we present basicburst factors including the number of communities in contextgraph the average clustering coefficient of the context graphand the average similarities of web pages

Impact Factor 1 (the number of communities in contextgraph from timestamp 119905

119894to 119905119895 |119862(119905

119894 119905119895)|) A community

is a subgraph of the context graph which reflects a partof context of an event 119890 The set of communities is asegmentation of the context graph Each context communityis a part of the context graph which is with no commonkeywords of other community The set of communities of anevent 119890 is denoted as

119862119890= 1198881 1198882 119888|119862

119890| forall119896

119894isin 119888119894and 119896119895isin 119888119895997888rarr 119896119894= 119896119895 (7)

Impact Factor 2 (the average clustering coefficient [24] ofthe context graph from timestamp 119905

119894to 119905119895 119862119862(119905

119894 119905119895)) In

graph theory a clustering coefficient is a measure of degreeto which nodes in a graph tend to cluster together Theclustering coefficient of the keyword 119896

119894in context graph can

be computed by

119862119862 (119896119894) =

2119897119901 (119901 minus 1)

(8)

where 119901means the number of neighbor node of the keyword119896119894and 119897 means the number of edges between these neighbor

nodes Thus the average clustering coefficient of the contextgraph can be computed by

119862119862 (119905119894 119905119895) =

sum119898

119894=1 119862119862 (119896119894)

119898 (9)

Impact Factor 3 (the average similarities of web pages fromtimestamp 119905

119894to 119905119895 119860119878(119905

119894 119905119895)) For an event 119890 the similarities

can be computed by cosine function

Sim (119889119894 119889119895) =

119889119894sdot 119889119895

1003817100381710038171003817119889119894

1003817100381710038171003817

10038171003817100381710038171003817119889119895

10038171003817100381710038171003817

(10)

where 119889119894 and 119889

119895 denote the mathematic model of vectors

119889119894and 119889

119895 Thus the average similarities of web pages can be

computed by

119860119878 (119905119894 119905119895) =

sum119898

119894=1sum119898

119895=1 Sim (119889119894 119889119895)

119898 (119898 minus 1) (11)

4 Computing the Burst Power

In this section based on the above definitions the methodfor computing the burst power of an emergency event isproposed

41 Basic Definitions of Burst Power After giving the basictemporal features of emergency events we define burst poweras follows

Definition 3 (burst power) 119900119901(119905119894 119905119895) For an emergency event

119890 the burst power from timestamp 119905119894to 119905119895is the influence

degree on the society

For example high |120593(119905119894 119905119895)| or |120595(119905

119894 119905119895)|means high influ-

ence degree of an event on the society thus the event has highburst power

International Journal of Distributed Sensor Networks 5

Keywords Web pages

The outbreak power is minimum when the bipartite graph of

keywords and web pages is complete

Keywords Web pages

The outbreak power is maximum when all of the keywords are only

provided by one web page

k1

k2

k3

k4

w1

w2

w3

k1

k2

k3

k4

w1

w2

w3

Figure 1 The illustration of the maximum and minimum burst power

According to the characteristic of burst power the timeinterval with high influence to the society will possess highpossibility to be the peak ormilestone of an emergency eventInspired by [40] before we propose the algorithm to computethe burst power we introduce two important definitions asfollows

Definition 4 (the representative power of keywords) 119903119901(119896)The representative power of keyword 119896 is the probability of 119896to represent the event 119890 correctly

Definition 5 (the confidence of web pages) 119888119908(119889) Theconfidence of a web page 119889 is the expected representativepower of keywords provided by 119889

Different keywords related to one event reveal the variousaspects of the event For example the keyword ldquoChinardquoreveals the place of the event ldquoChina rail crashrdquo On the otherhand the keywords ldquorailrdquo and ldquocrashrdquo reveal the object of theevent ldquoChina rail crashrdquo

42 Basic Heuristics for Computing Burst Power Based onthe common sense and the observations on real data wehave four basic heuristic rules which serve as the bases forthe computing of burst power These four heuristic rules arerelevant to the data If there is burst situation in the data theywould be correct Given the discussion field of this paper allemergency events have some different effect and spreadingpower Thus these four heuristic rules are appropriate foremergency events situation

Heuristic Rule 1 If ignoring 120595(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120593(119905119894 119905119895)|

According to heuristic rule 1 the burst power is propor-tional to the number of increased web pages So we can get119900119901(119905119894 119905119895)infin|120593(119905

119894 119905119895)|

Heuristic Rule 2 If ignoring 120593(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120595(119905119894 119905119895)|

According to heuristic rule 2 the burst power is propor-tional to the number of increased keywords So we can get119900119901(119905119894 119905119895)infin|120595(119905

119894 119905119895)|

If two time intervalswith the same |120593(119905119894 119905119895)| and |120595(119905

119894 119905119895)|

the distribution of keywords will determine 119900119901(119905119894 119905119895) So we

give heuristic rule 3

Heuristic Rule 3 If the bipartite graph of 120577(119905119894 119905119895) is a complete

graph 119900119901(119905119894 119905119895) is the lowest that is (forall119908

119899119898isin 120577(119905119894 119905119895) rarr

119908119899119898

= 0) rarr 119900119901(119905119894 119905119895)min

According to heuristic rule 3 if all of the keywords appearin each web page the similarity between them is 1 whichmeans all of the web pages are copies from one web pageThissituation means that the emergency event is with the lowestdiversity

Since120595(119905119894 119905119895) and 120593(119905

119894 119905119895) are dependent the distribution

of keywords on the web pages should be considered So wegive heuristic rule 4Heuristic Rule 4 Since120595(119905

119894 119905119895) and120593(119905

119894 119905119895) are dependent the

distribution of keywords on web pages should be consideredIf all of the keywords are provided by only one web page119900119901(119905119894 119905119895) is the highest

43 Computing the Burst Power Heuristic rule 4 givesthe condition of maximum 119900119901(119905

119894 119905119895) Figure 1 gives the

illustration of heuristic rules 3 and 4 Based on heuristic

6 International Journal of Distributed Sensor Networks

rules 3 and 4 we can conclude that the confidence of aweb page and the representative power of a keyword aredetermined by each other andwe can use an iterativemethodto compute them Thus we compute the confidence of aweb page by calculating the average representative power ofkeywords it provides as follows

119888119908 (119889ℎ) =

sum(119896119898isin

997888

119889ℎ)cap(119908ℎ119899gt0)

119903119901 (119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(12)

where |997888119889ℎ| means the number of web pages with keywords

119896119898Inspired by [40] we use probability function to compute

the representative power of keywords

119903119901 (119896119898) = 1minus prod

(119889ℎisin

997888

119896119898)cap(119908ℎ119898gt0)

(1minus 119888119908 (119889ℎ)) (13)

where997888119896119898is the set of web pages providing 119896

119898

The above two equations show how to compute theconfidence of a web page However since the similaritiesbetween web pages are not zero we put the similaritiesbetween web pages into (8) The equation can be revised as(9) which considers the similarities between web pages

1199031199011015840

(119896119898) = 119903119901 (119896

119898) + sum

(119896119894isin

997888997888

119896119896119894)cap(119891119894119895gt0)

119891119894119895lowast 119903119901 (119896

119894)

(14)

where997888997888119896119896119894is the set of keywords similarities against keyword

119896119894Since 1199031199011015840(119896) may be higher than 1 we adopt the widely

used logistic function to set 1199031199011015840(119896) into (0 1)Then (9) can berevised as

11990311990110158401015840

(119896119898) =

11 + 119890minus1199031199011015840(119896119898)

(15)

Equation (7) is revised as

119888119908 (119889ℎ) =

sum119896119898isin

997888

119889ℎcap119908ℎ119898gt011990311990110158401015840

(119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(16)

and the burst power function of time interval (119905119894 119905119895) can be

computed by the sum of confidence of all web pages

119900119901 (119905119894 119905119895) =

119899

sum

ℎ=1(1minus 119888119908 (119889

ℎ)) (17)

where 119899 means the number of web pages from time interval(119905119894 119905119895)

As described above we can compute the burst power ofthe proposed method

5 Experiments and Analysis

51 Datasets The events in our experiments are extractedfrom Google and Baidu We select 150 events with about

Table 1 The details of datasets

Feature ValueAverage number of seeds per event 2Average number of web pages per event 1012Average number of keywords per event 4534Average number of days per event 30Average number of web pages per day 34Average number of keywords per day 853

1500000 web pages in our experiments including politicsevents accidents events disasters events and terrorismevents The web pages of each event are downloaded fromGoogle Stanford tagger (httpnlpstanfordedu) is used toreserve the noun words in the web pages The keywordsare selected by their document frequencies Table 1 showsthe statistics of our experimental dataset When Google andBaidu provide the events it also gives some keywords forhelping users to search them After we get the seed set of anevent by the search engine a certain number of the web pagesare collected as samples by automatic crawling and searchingwith the seed setThe detailed steps for collecting related webresources of an event in our experiments are as follows

(1) Get the seed set of an event such as ldquoChina traincrashrdquo which can be seen as 119878(119905

119894 119905119895) of an event

(2) Search the seed set as the query and download therelated web pages with search engine which can beseen as 120593(119905

119894 119905119895) of an event

(3) Identify the starting timestamp of an event by 120593(119905119894 119905119895)

and the ending timestamp by download time whichcan be seen as 119905

119904and 119905119890of an event

(4) Get |120593(119905119894 119905119895)| |120595(119905

119894 119905119895)| 120577(119905119894 119905119895) and Ξ(119905

119894 119905119895) per day

(5) Do step (4) of different information sources includingnews blogs and BBS

52 Experimental Results After obtaining the temporal fea-tures per day of each event we select ten human annotatorsto test whether the burst power of each event is correct ornotThe data fromGoogle Trends is selected to compare withthat of the proposed method If the human annotator thinksthe proposed methods are equal to his own imagination orGoogle Trends we set the precision of this detection to trueIn additional the annotators are set to do their evaluationsindependently which ensure the reliability and validity of theresults Before the human annotator started to evaluate theexperimental results we provide the abstract descriptions ofeach event for them For example we will give some newsstories and concepts to the human annotators This trainingsession continued until the human annotators are familiarwith the concepts and the temporal feature of events In thenext step each annotator was asked to test the results of eachevent independently

In fact the overall precision of the proposed methodachieves 92 showing the accuracy of our burst powercomputation algorithm From the experiments on the real

International Journal of Distributed Sensor Networks 7

data we know that the proposed algorithm can compute theburst power of an event accuratelyThe information fromwebcan be integrated into burst power The factor can be used todetect states of an emergency event

6 Conclusions

With the popularity of web the internet is becoming amajor information provider and poster of an event due toits real-time open and dynamic features However facedwith the hugeness disorder and continuous web resourcesit is impossible for people to efficiently recognize collectand organize the events In this paper crowd sensing basedburst computation algorithm of a web event is developed inorder to let the people know a web event clearly and help thesocial group or government process the events effectivelyThedefinition of ldquosocial sensorsrdquo is firstly introduced which isthe foundation of using web resources to compute the burstpower of events on the web Secondly different temporalfeatures of web events are developed to provide the basicsfor the proposed computation algorithmMoreover the burstpower is presented to integrate the above temporal featuresof an event Empirical experiments on real datasets includingGoogle Zeitgeist and Google Trends show that that thenumber of web pages and the average clustering coefficientcan be used to detect events Some strategies integrating thenumber of web pages and the average clustering coefficientare also employed The evaluations on real dataset show thatthe proposed function integrating the number of web pagesand the average clustering coefficient can be used for eventdetection efficiently and correctly

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceandTechnologyMajor Project underGrant 2013ZX01033002-003 in part by the National High Technology Research andDevelopment Program of China (863 Program) under Grants2013AA014601 and 2013AA014603 in part by National KeyTechnology Support Program under Grant 2012BAH07B01in part by the National Science Foundation of China underGrants 61300202 and 61300028 in part by the Project of theMinistry of Public Security under Grant 2014JSYJB009 inpart by the China Postdoctoral Science Foundation underGrant 2014M560085 in part by the Science Foundation ofShanghai under Grant 13ZR1452900 in part by the ChinaNational Social Science Fund 06BFX051 and in part bythe Shanghai University training and selection of outstand-ing young teachers in special research fund hzf05046 Theauthors must be grateful for the helpful suggestion from YLiu L Mei H Zhang and C Hu

References

[1] Y Liu L Ni and C Hu ldquoA generalized probabilistic topologycontrol for wireless sensor networksrdquo IEEE Journal on SelectedAreas in Communications vol 30 no 9 pp 1780ndash1788 2012

[2] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[3] C Hu Z Xu Y Liu L Mei L Chen and X Luo ldquoSemanticlink network-based model for organizing multimedia big datardquoIEEE Transactions on Emerging Topics in Computing vol 2 no3 pp 376ndash387 2014

[4] Y Liu Y Zhu L Ni and G Xue ldquoA reliability-oriented trans-mission service in wireless sensor networksrdquo IEEE TransactionsonParallel andDistributed Systems vol 22 no 12 pp 2100ndash21072011

[5] Y Liu Q Zhang and L M Ni ldquoOpportunity-based topologycontrol in wireless sensor networksrdquo IEEE Transactions onParallel andDistributed Systems vol 21 no 3 pp 405ndash416 2010

[6] X Liu Y Yang D Yuan and J Chen ldquoDo we need to handleevery temporal violation in scientific workflow systemsrdquo ACMTransactions on Software Engineering and Methodology vol 23no 1 Article ID 2559938 2014

[7] LWang J Tao R Ranjan et al ldquoG-Hadoop mapReduce acrossdistributed data centers for data-intensive computingrdquo FutureGeneration Computer Systems vol 29 no 3 pp 739ndash750 2013

[8] Z Xu X Wei X Luo et al ldquoKnowle A semantic link networkbased system for organizing large scale online news eventsrdquoFuture Generation Computer Systems vol 43-44 pp 40ndash502015

[9] Z Xu X Luo S Zhang X Wei L Mei and C Hu ldquoMin-ing temporal explicit and implicit semantic relations betweenentities using web search enginesrdquo Future Generation ComputerSystems vol 37 pp 468ndash477 2014

[10] D Haddow A Bullock and P Coppola Introduction to Emer-gency Management Butterworth-Heinemann 2010

[11] 2012 httpdefinitionsuslegalcomeemergency-event[12] 2012 httpwwwwhointcsrsarsen[13] 2012 httpenwikipediaorgwikiSeptember 11 attacks[14] J Farber T Myers J Trevathan I Atkinson and T Andersen

ldquoRiskr a low-technological Web20 disaster service to monitorand share informationrdquo in In Proceedings of 15th InternationalConference on Network-Based Information Systems (NBiS rsquo12)pp 311ndash318 IEEE Melbourne Australia September 2012

[15] J Makkonen ldquoInvestigations on event evolution in TDTrdquo inProceedings of the Conference of the North American Chapterof the Association for Computational Linguistics on HumanLanguage (NAACLstudent rsquo03) pp 43ndash48 Edmonton CanadaMay 2003

[16] C C Yang X Shi and C-P Wei ldquoDiscovering event evolutiongraphs fromnews corporardquo IEEETransactions on SystemsManand Cybernetics Part A Systems and Humans vol 39 no 4 pp850ndash863 2009

[17] J Abonyi B Feil S Nemeth and P Arva ldquoModifiedGath-GEVaclustering for fuzzy segmentation of multivariate time-seriesrdquoFuzzy Sets and Systems vol 149 no 1 pp 39ndash56 2005

[18] X Wu Y-J Lu Q Peng and C-W Ngo ldquoMining eventstructures from web videosrdquo IEEEMultimedia vol 18 no 1 pp38ndash51 2011

8 International Journal of Distributed Sensor Networks

[19] Q He K Chang E-P Lim and A Banerjee ldquoKeep it simplewith time a reexamination of probabilistic topic detectionmodelsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 10 pp 1795ndash1808 2010

[20] C Sung and T G Kim ldquoCollaborative modeling processfor development of domain-specific discrete event simulationsystemsrdquo IEEE Transactions on Systems Man and CyberneticsPart C Applications and Reviews vol 42 no 4 pp 532ndash5462012

[21] E Keogh K Chakrabarti M Pazzani and S MehrotraldquoDimensionality reduction for fast similarity search in largetime series databasesrdquo Knowledge and Information Systems vol3 no 3 pp 263ndash286 2001

[22] J Himberg K Korpiaho H Mannila J Tikanmaki and H TT Toivonen ldquoTime series segmentation for context recognitionin mobile devicesrdquo in Proceedings of the 1st IEEE InternationalConference on Data Mining (ICDM rsquo01) pp 203ndash210 December2001

[23] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[24] P Xiong Y Fan and M Zhou ldquoWeb service configurationunder multiple quality-of-service attributesrdquo IEEE TransactionsonAutomation Science and Engineering vol 6 no 2 pp 311ndash3212009

[25] J Allan G Carbonell G Doddington J Yamron and Y YangldquoTopic detection and tracking pilot study final reportrdquo in Pro-ceedings of the Broadcast News Transcription and UnderstandingWorkshop 1998

[26] J Allan Topic Detection and Tracking Event-Based InformationOrganization Kluwer Norwell Mass USA 2000

[27] Q Mei and C Zhai ldquoDiscovering evolutionary theme patternsfrom text an exploration of temporal text miningrdquo in Pro-ceedings of the 11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 198ndash207 August2005

[28] C-P Wei and Y-H Chang ldquoDiscovering event evolution pat-terns fromdocument sequencesrdquo IEEETransactions on SystemsMan and Cybernetics Part A Systems and Humans vol 37 no2 pp 273ndash283 2007

[29] C C Yang andX Shi ldquoDiscovering event evolution graphs fromnewswiresrdquo in Proceedings of the 15th International Conferenceon World Wide Web pp 945ndash946 May 2006

[30] Y Jo C Lagoze and C L Giles ldquoDetecting research topicsvia the correlation between graphs and textsrdquo in Proceedings ofthe 13th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo07) pp 370ndash379 August2007

[31] G P C Fung J X Yu H Liu and P S Yu ldquoTime-dependentevent hierarchy constructionrdquo in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo07) pp 300ndash309 ACM August 2007

[32] V Hristidis O Valdivia M Vlachos and P S Yu ldquoContinuouskeyword search on multiple text streamsrdquo in Proceedings of the15th ACM Conference on Information and Knowledge Manage-ment (CIKM rsquo06) pp 802ndash803 November 2006

[33] Q Zhao T-Y Liu S S Bhowmick andW-Y Ma ldquoEvent detec-tion from evolution of click-through datardquo in Proceedings ofthe 12th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo06) pp 484ndash493 August2006

[34] C Wang M Zhang L Ru and S Ma ldquoAutomatic online newstopic ranking using media focus and user attention based onaging theoryrdquo in Proceedings of the 17th ACM Conference onInformation and Knowledge Management (CIKM rsquo08) pp 1033ndash1042 October 2008

[35] J Leskovec A Krause C Guestrin C Faloutsos J VanBriesenand N Glance ldquoCost-effective outbreak detection in networksrdquoin Proceedings of the 13th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo07) pp420ndash429 August 2007

[36] J Leskovec and E Horvitz ldquoPlanetary-scale views on a largeinstant-messaging networkrdquo in Proceedings of the 17th Interna-tional Conference onWorldWideWeb (WWW rsquo08) pp 915ndash924April 2008

[37] R Nallapati A Feng F Peng and J Allan ldquoEvent threadingwithin news topicsrdquo in Proceedings of the 13th ACM Conferenceon Information and Knowledge Management (CIKM rsquo04) pp446ndash453 November 2004

[38] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing and Managementvol 24 no 5 pp 513ndash523 1988

[39] X Jin S Spangler R Ma and J Han ldquoTopic initiator detectionon the world wide webrdquo in Proceedings of the 19th InternationalWorld Wide Web Conference pp 481ndash490 2010

[40] X Yin J Han and P S Yu ldquoTruth discovery with multiple con-flicting information providers on the Webrdquo IEEE Transactionson Knowledge and Data Engineering vol 20 no 6 pp 796ndash8082008

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 4: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

4 International Journal of Distributed Sensor Networks

Temporal Feature 2 (the number of increased keywords fromtimestamp 119905

119894to 119905119895 |120595(119905119894 119905119895)|) The elements in 120595(119905

119894 119905119895) do not

appear from the starting timestamp 119905119904to 119905119894 that is forall119896

119898isin

120595(119905119894 119905119895) rarr 119896

119898notin 120595(119905119894 119905119895)

Temporal Feature 3 The distribution of keywords on webpages from timestamp 119905

119894to 119905119895 120577(119905119894 119905119895) For an emergency

event 119890 the web pages in 120593(119905119894 119905119895) can be represented as a

vector by the keywords in120595(119905119894 119905119895)These vectors can be stored

as a matrix

120577 (119905119894 119905119895) = (

11990811 1199081119898

d

1199081198991 sdot sdot sdot 119908

119899119898

) (2)

Temporal Feature 4 (the associated relations of keywords fromtimestamp 119905

119894to 119905119895 Γ(119905119894 119905119895)) For an emergency event 119890 the

associated relations of keywords can be stored as a matrix

Γ (119905119894 119905119895) = (

11989111 1198911119898

d

1198911198981 sdot sdot sdot 119891

119898119898

) (3)

where 119891119894119895means the weight of relation between 119896

119894and 119896

119895

which can be computed by

119891119894119895=

log ((119873 (119896119894and 119896119895) lowast 119899) (119873 (119896

119894) lowast 119873 (119896

119895)))

log 119899 (4)

where 119873(119896119894) means the number of in 120593(119905

119894 119905119895) containing

119896119894and 119873(119896

119894and 119896119895) is the number of web pages in 120593(119905

119894 119905119895)

containing both 119896119894and 119896119895

Temporal Feature 5 (the similarities between web pages fromtimestamp 119905

119894to 119905119895 Ξ(119905119894 119905119895)) For an emergency event 119890 the

similarities between web pages can be stored as a matrix

Ξ (119905119894 119905119895) = (

11988611 1198861119899

d

1198861198991 sdot sdot sdot 119886

119899119899

) (5)

where 119886119894119895means the similarities between 119889

119894and 119889119895 which can

be computed by

119886119894119895=

119889119894sdot 119889119895

1003817100381710038171003817119889119894

1003817100381710038171003817

10038171003817100381710038171003817119889119895

10038171003817100381710038171003817

(6)

where 119889119894 and 119889

119895 denote the mathematic model of vectors

119889119894and 119889

119895

33 Basic Burst Factors In this section we present basicburst factors including the number of communities in contextgraph the average clustering coefficient of the context graphand the average similarities of web pages

Impact Factor 1 (the number of communities in contextgraph from timestamp 119905

119894to 119905119895 |119862(119905

119894 119905119895)|) A community

is a subgraph of the context graph which reflects a partof context of an event 119890 The set of communities is asegmentation of the context graph Each context communityis a part of the context graph which is with no commonkeywords of other community The set of communities of anevent 119890 is denoted as

119862119890= 1198881 1198882 119888|119862

119890| forall119896

119894isin 119888119894and 119896119895isin 119888119895997888rarr 119896119894= 119896119895 (7)

Impact Factor 2 (the average clustering coefficient [24] ofthe context graph from timestamp 119905

119894to 119905119895 119862119862(119905

119894 119905119895)) In

graph theory a clustering coefficient is a measure of degreeto which nodes in a graph tend to cluster together Theclustering coefficient of the keyword 119896

119894in context graph can

be computed by

119862119862 (119896119894) =

2119897119901 (119901 minus 1)

(8)

where 119901means the number of neighbor node of the keyword119896119894and 119897 means the number of edges between these neighbor

nodes Thus the average clustering coefficient of the contextgraph can be computed by

119862119862 (119905119894 119905119895) =

sum119898

119894=1 119862119862 (119896119894)

119898 (9)

Impact Factor 3 (the average similarities of web pages fromtimestamp 119905

119894to 119905119895 119860119878(119905

119894 119905119895)) For an event 119890 the similarities

can be computed by cosine function

Sim (119889119894 119889119895) =

119889119894sdot 119889119895

1003817100381710038171003817119889119894

1003817100381710038171003817

10038171003817100381710038171003817119889119895

10038171003817100381710038171003817

(10)

where 119889119894 and 119889

119895 denote the mathematic model of vectors

119889119894and 119889

119895 Thus the average similarities of web pages can be

computed by

119860119878 (119905119894 119905119895) =

sum119898

119894=1sum119898

119895=1 Sim (119889119894 119889119895)

119898 (119898 minus 1) (11)

4 Computing the Burst Power

In this section based on the above definitions the methodfor computing the burst power of an emergency event isproposed

41 Basic Definitions of Burst Power After giving the basictemporal features of emergency events we define burst poweras follows

Definition 3 (burst power) 119900119901(119905119894 119905119895) For an emergency event

119890 the burst power from timestamp 119905119894to 119905119895is the influence

degree on the society

For example high |120593(119905119894 119905119895)| or |120595(119905

119894 119905119895)|means high influ-

ence degree of an event on the society thus the event has highburst power

International Journal of Distributed Sensor Networks 5

Keywords Web pages

The outbreak power is minimum when the bipartite graph of

keywords and web pages is complete

Keywords Web pages

The outbreak power is maximum when all of the keywords are only

provided by one web page

k1

k2

k3

k4

w1

w2

w3

k1

k2

k3

k4

w1

w2

w3

Figure 1 The illustration of the maximum and minimum burst power

According to the characteristic of burst power the timeinterval with high influence to the society will possess highpossibility to be the peak ormilestone of an emergency eventInspired by [40] before we propose the algorithm to computethe burst power we introduce two important definitions asfollows

Definition 4 (the representative power of keywords) 119903119901(119896)The representative power of keyword 119896 is the probability of 119896to represent the event 119890 correctly

Definition 5 (the confidence of web pages) 119888119908(119889) Theconfidence of a web page 119889 is the expected representativepower of keywords provided by 119889

Different keywords related to one event reveal the variousaspects of the event For example the keyword ldquoChinardquoreveals the place of the event ldquoChina rail crashrdquo On the otherhand the keywords ldquorailrdquo and ldquocrashrdquo reveal the object of theevent ldquoChina rail crashrdquo

42 Basic Heuristics for Computing Burst Power Based onthe common sense and the observations on real data wehave four basic heuristic rules which serve as the bases forthe computing of burst power These four heuristic rules arerelevant to the data If there is burst situation in the data theywould be correct Given the discussion field of this paper allemergency events have some different effect and spreadingpower Thus these four heuristic rules are appropriate foremergency events situation

Heuristic Rule 1 If ignoring 120595(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120593(119905119894 119905119895)|

According to heuristic rule 1 the burst power is propor-tional to the number of increased web pages So we can get119900119901(119905119894 119905119895)infin|120593(119905

119894 119905119895)|

Heuristic Rule 2 If ignoring 120593(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120595(119905119894 119905119895)|

According to heuristic rule 2 the burst power is propor-tional to the number of increased keywords So we can get119900119901(119905119894 119905119895)infin|120595(119905

119894 119905119895)|

If two time intervalswith the same |120593(119905119894 119905119895)| and |120595(119905

119894 119905119895)|

the distribution of keywords will determine 119900119901(119905119894 119905119895) So we

give heuristic rule 3

Heuristic Rule 3 If the bipartite graph of 120577(119905119894 119905119895) is a complete

graph 119900119901(119905119894 119905119895) is the lowest that is (forall119908

119899119898isin 120577(119905119894 119905119895) rarr

119908119899119898

= 0) rarr 119900119901(119905119894 119905119895)min

According to heuristic rule 3 if all of the keywords appearin each web page the similarity between them is 1 whichmeans all of the web pages are copies from one web pageThissituation means that the emergency event is with the lowestdiversity

Since120595(119905119894 119905119895) and 120593(119905

119894 119905119895) are dependent the distribution

of keywords on the web pages should be considered So wegive heuristic rule 4Heuristic Rule 4 Since120595(119905

119894 119905119895) and120593(119905

119894 119905119895) are dependent the

distribution of keywords on web pages should be consideredIf all of the keywords are provided by only one web page119900119901(119905119894 119905119895) is the highest

43 Computing the Burst Power Heuristic rule 4 givesthe condition of maximum 119900119901(119905

119894 119905119895) Figure 1 gives the

illustration of heuristic rules 3 and 4 Based on heuristic

6 International Journal of Distributed Sensor Networks

rules 3 and 4 we can conclude that the confidence of aweb page and the representative power of a keyword aredetermined by each other andwe can use an iterativemethodto compute them Thus we compute the confidence of aweb page by calculating the average representative power ofkeywords it provides as follows

119888119908 (119889ℎ) =

sum(119896119898isin

997888

119889ℎ)cap(119908ℎ119899gt0)

119903119901 (119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(12)

where |997888119889ℎ| means the number of web pages with keywords

119896119898Inspired by [40] we use probability function to compute

the representative power of keywords

119903119901 (119896119898) = 1minus prod

(119889ℎisin

997888

119896119898)cap(119908ℎ119898gt0)

(1minus 119888119908 (119889ℎ)) (13)

where997888119896119898is the set of web pages providing 119896

119898

The above two equations show how to compute theconfidence of a web page However since the similaritiesbetween web pages are not zero we put the similaritiesbetween web pages into (8) The equation can be revised as(9) which considers the similarities between web pages

1199031199011015840

(119896119898) = 119903119901 (119896

119898) + sum

(119896119894isin

997888997888

119896119896119894)cap(119891119894119895gt0)

119891119894119895lowast 119903119901 (119896

119894)

(14)

where997888997888119896119896119894is the set of keywords similarities against keyword

119896119894Since 1199031199011015840(119896) may be higher than 1 we adopt the widely

used logistic function to set 1199031199011015840(119896) into (0 1)Then (9) can berevised as

11990311990110158401015840

(119896119898) =

11 + 119890minus1199031199011015840(119896119898)

(15)

Equation (7) is revised as

119888119908 (119889ℎ) =

sum119896119898isin

997888

119889ℎcap119908ℎ119898gt011990311990110158401015840

(119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(16)

and the burst power function of time interval (119905119894 119905119895) can be

computed by the sum of confidence of all web pages

119900119901 (119905119894 119905119895) =

119899

sum

ℎ=1(1minus 119888119908 (119889

ℎ)) (17)

where 119899 means the number of web pages from time interval(119905119894 119905119895)

As described above we can compute the burst power ofthe proposed method

5 Experiments and Analysis

51 Datasets The events in our experiments are extractedfrom Google and Baidu We select 150 events with about

Table 1 The details of datasets

Feature ValueAverage number of seeds per event 2Average number of web pages per event 1012Average number of keywords per event 4534Average number of days per event 30Average number of web pages per day 34Average number of keywords per day 853

1500000 web pages in our experiments including politicsevents accidents events disasters events and terrorismevents The web pages of each event are downloaded fromGoogle Stanford tagger (httpnlpstanfordedu) is used toreserve the noun words in the web pages The keywordsare selected by their document frequencies Table 1 showsthe statistics of our experimental dataset When Google andBaidu provide the events it also gives some keywords forhelping users to search them After we get the seed set of anevent by the search engine a certain number of the web pagesare collected as samples by automatic crawling and searchingwith the seed setThe detailed steps for collecting related webresources of an event in our experiments are as follows

(1) Get the seed set of an event such as ldquoChina traincrashrdquo which can be seen as 119878(119905

119894 119905119895) of an event

(2) Search the seed set as the query and download therelated web pages with search engine which can beseen as 120593(119905

119894 119905119895) of an event

(3) Identify the starting timestamp of an event by 120593(119905119894 119905119895)

and the ending timestamp by download time whichcan be seen as 119905

119904and 119905119890of an event

(4) Get |120593(119905119894 119905119895)| |120595(119905

119894 119905119895)| 120577(119905119894 119905119895) and Ξ(119905

119894 119905119895) per day

(5) Do step (4) of different information sources includingnews blogs and BBS

52 Experimental Results After obtaining the temporal fea-tures per day of each event we select ten human annotatorsto test whether the burst power of each event is correct ornotThe data fromGoogle Trends is selected to compare withthat of the proposed method If the human annotator thinksthe proposed methods are equal to his own imagination orGoogle Trends we set the precision of this detection to trueIn additional the annotators are set to do their evaluationsindependently which ensure the reliability and validity of theresults Before the human annotator started to evaluate theexperimental results we provide the abstract descriptions ofeach event for them For example we will give some newsstories and concepts to the human annotators This trainingsession continued until the human annotators are familiarwith the concepts and the temporal feature of events In thenext step each annotator was asked to test the results of eachevent independently

In fact the overall precision of the proposed methodachieves 92 showing the accuracy of our burst powercomputation algorithm From the experiments on the real

International Journal of Distributed Sensor Networks 7

data we know that the proposed algorithm can compute theburst power of an event accuratelyThe information fromwebcan be integrated into burst power The factor can be used todetect states of an emergency event

6 Conclusions

With the popularity of web the internet is becoming amajor information provider and poster of an event due toits real-time open and dynamic features However facedwith the hugeness disorder and continuous web resourcesit is impossible for people to efficiently recognize collectand organize the events In this paper crowd sensing basedburst computation algorithm of a web event is developed inorder to let the people know a web event clearly and help thesocial group or government process the events effectivelyThedefinition of ldquosocial sensorsrdquo is firstly introduced which isthe foundation of using web resources to compute the burstpower of events on the web Secondly different temporalfeatures of web events are developed to provide the basicsfor the proposed computation algorithmMoreover the burstpower is presented to integrate the above temporal featuresof an event Empirical experiments on real datasets includingGoogle Zeitgeist and Google Trends show that that thenumber of web pages and the average clustering coefficientcan be used to detect events Some strategies integrating thenumber of web pages and the average clustering coefficientare also employed The evaluations on real dataset show thatthe proposed function integrating the number of web pagesand the average clustering coefficient can be used for eventdetection efficiently and correctly

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceandTechnologyMajor Project underGrant 2013ZX01033002-003 in part by the National High Technology Research andDevelopment Program of China (863 Program) under Grants2013AA014601 and 2013AA014603 in part by National KeyTechnology Support Program under Grant 2012BAH07B01in part by the National Science Foundation of China underGrants 61300202 and 61300028 in part by the Project of theMinistry of Public Security under Grant 2014JSYJB009 inpart by the China Postdoctoral Science Foundation underGrant 2014M560085 in part by the Science Foundation ofShanghai under Grant 13ZR1452900 in part by the ChinaNational Social Science Fund 06BFX051 and in part bythe Shanghai University training and selection of outstand-ing young teachers in special research fund hzf05046 Theauthors must be grateful for the helpful suggestion from YLiu L Mei H Zhang and C Hu

References

[1] Y Liu L Ni and C Hu ldquoA generalized probabilistic topologycontrol for wireless sensor networksrdquo IEEE Journal on SelectedAreas in Communications vol 30 no 9 pp 1780ndash1788 2012

[2] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[3] C Hu Z Xu Y Liu L Mei L Chen and X Luo ldquoSemanticlink network-based model for organizing multimedia big datardquoIEEE Transactions on Emerging Topics in Computing vol 2 no3 pp 376ndash387 2014

[4] Y Liu Y Zhu L Ni and G Xue ldquoA reliability-oriented trans-mission service in wireless sensor networksrdquo IEEE TransactionsonParallel andDistributed Systems vol 22 no 12 pp 2100ndash21072011

[5] Y Liu Q Zhang and L M Ni ldquoOpportunity-based topologycontrol in wireless sensor networksrdquo IEEE Transactions onParallel andDistributed Systems vol 21 no 3 pp 405ndash416 2010

[6] X Liu Y Yang D Yuan and J Chen ldquoDo we need to handleevery temporal violation in scientific workflow systemsrdquo ACMTransactions on Software Engineering and Methodology vol 23no 1 Article ID 2559938 2014

[7] LWang J Tao R Ranjan et al ldquoG-Hadoop mapReduce acrossdistributed data centers for data-intensive computingrdquo FutureGeneration Computer Systems vol 29 no 3 pp 739ndash750 2013

[8] Z Xu X Wei X Luo et al ldquoKnowle A semantic link networkbased system for organizing large scale online news eventsrdquoFuture Generation Computer Systems vol 43-44 pp 40ndash502015

[9] Z Xu X Luo S Zhang X Wei L Mei and C Hu ldquoMin-ing temporal explicit and implicit semantic relations betweenentities using web search enginesrdquo Future Generation ComputerSystems vol 37 pp 468ndash477 2014

[10] D Haddow A Bullock and P Coppola Introduction to Emer-gency Management Butterworth-Heinemann 2010

[11] 2012 httpdefinitionsuslegalcomeemergency-event[12] 2012 httpwwwwhointcsrsarsen[13] 2012 httpenwikipediaorgwikiSeptember 11 attacks[14] J Farber T Myers J Trevathan I Atkinson and T Andersen

ldquoRiskr a low-technological Web20 disaster service to monitorand share informationrdquo in In Proceedings of 15th InternationalConference on Network-Based Information Systems (NBiS rsquo12)pp 311ndash318 IEEE Melbourne Australia September 2012

[15] J Makkonen ldquoInvestigations on event evolution in TDTrdquo inProceedings of the Conference of the North American Chapterof the Association for Computational Linguistics on HumanLanguage (NAACLstudent rsquo03) pp 43ndash48 Edmonton CanadaMay 2003

[16] C C Yang X Shi and C-P Wei ldquoDiscovering event evolutiongraphs fromnews corporardquo IEEETransactions on SystemsManand Cybernetics Part A Systems and Humans vol 39 no 4 pp850ndash863 2009

[17] J Abonyi B Feil S Nemeth and P Arva ldquoModifiedGath-GEVaclustering for fuzzy segmentation of multivariate time-seriesrdquoFuzzy Sets and Systems vol 149 no 1 pp 39ndash56 2005

[18] X Wu Y-J Lu Q Peng and C-W Ngo ldquoMining eventstructures from web videosrdquo IEEEMultimedia vol 18 no 1 pp38ndash51 2011

8 International Journal of Distributed Sensor Networks

[19] Q He K Chang E-P Lim and A Banerjee ldquoKeep it simplewith time a reexamination of probabilistic topic detectionmodelsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 10 pp 1795ndash1808 2010

[20] C Sung and T G Kim ldquoCollaborative modeling processfor development of domain-specific discrete event simulationsystemsrdquo IEEE Transactions on Systems Man and CyberneticsPart C Applications and Reviews vol 42 no 4 pp 532ndash5462012

[21] E Keogh K Chakrabarti M Pazzani and S MehrotraldquoDimensionality reduction for fast similarity search in largetime series databasesrdquo Knowledge and Information Systems vol3 no 3 pp 263ndash286 2001

[22] J Himberg K Korpiaho H Mannila J Tikanmaki and H TT Toivonen ldquoTime series segmentation for context recognitionin mobile devicesrdquo in Proceedings of the 1st IEEE InternationalConference on Data Mining (ICDM rsquo01) pp 203ndash210 December2001

[23] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[24] P Xiong Y Fan and M Zhou ldquoWeb service configurationunder multiple quality-of-service attributesrdquo IEEE TransactionsonAutomation Science and Engineering vol 6 no 2 pp 311ndash3212009

[25] J Allan G Carbonell G Doddington J Yamron and Y YangldquoTopic detection and tracking pilot study final reportrdquo in Pro-ceedings of the Broadcast News Transcription and UnderstandingWorkshop 1998

[26] J Allan Topic Detection and Tracking Event-Based InformationOrganization Kluwer Norwell Mass USA 2000

[27] Q Mei and C Zhai ldquoDiscovering evolutionary theme patternsfrom text an exploration of temporal text miningrdquo in Pro-ceedings of the 11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 198ndash207 August2005

[28] C-P Wei and Y-H Chang ldquoDiscovering event evolution pat-terns fromdocument sequencesrdquo IEEETransactions on SystemsMan and Cybernetics Part A Systems and Humans vol 37 no2 pp 273ndash283 2007

[29] C C Yang andX Shi ldquoDiscovering event evolution graphs fromnewswiresrdquo in Proceedings of the 15th International Conferenceon World Wide Web pp 945ndash946 May 2006

[30] Y Jo C Lagoze and C L Giles ldquoDetecting research topicsvia the correlation between graphs and textsrdquo in Proceedings ofthe 13th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo07) pp 370ndash379 August2007

[31] G P C Fung J X Yu H Liu and P S Yu ldquoTime-dependentevent hierarchy constructionrdquo in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo07) pp 300ndash309 ACM August 2007

[32] V Hristidis O Valdivia M Vlachos and P S Yu ldquoContinuouskeyword search on multiple text streamsrdquo in Proceedings of the15th ACM Conference on Information and Knowledge Manage-ment (CIKM rsquo06) pp 802ndash803 November 2006

[33] Q Zhao T-Y Liu S S Bhowmick andW-Y Ma ldquoEvent detec-tion from evolution of click-through datardquo in Proceedings ofthe 12th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo06) pp 484ndash493 August2006

[34] C Wang M Zhang L Ru and S Ma ldquoAutomatic online newstopic ranking using media focus and user attention based onaging theoryrdquo in Proceedings of the 17th ACM Conference onInformation and Knowledge Management (CIKM rsquo08) pp 1033ndash1042 October 2008

[35] J Leskovec A Krause C Guestrin C Faloutsos J VanBriesenand N Glance ldquoCost-effective outbreak detection in networksrdquoin Proceedings of the 13th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo07) pp420ndash429 August 2007

[36] J Leskovec and E Horvitz ldquoPlanetary-scale views on a largeinstant-messaging networkrdquo in Proceedings of the 17th Interna-tional Conference onWorldWideWeb (WWW rsquo08) pp 915ndash924April 2008

[37] R Nallapati A Feng F Peng and J Allan ldquoEvent threadingwithin news topicsrdquo in Proceedings of the 13th ACM Conferenceon Information and Knowledge Management (CIKM rsquo04) pp446ndash453 November 2004

[38] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing and Managementvol 24 no 5 pp 513ndash523 1988

[39] X Jin S Spangler R Ma and J Han ldquoTopic initiator detectionon the world wide webrdquo in Proceedings of the 19th InternationalWorld Wide Web Conference pp 481ndash490 2010

[40] X Yin J Han and P S Yu ldquoTruth discovery with multiple con-flicting information providers on the Webrdquo IEEE Transactionson Knowledge and Data Engineering vol 20 no 6 pp 796ndash8082008

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 5: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

International Journal of Distributed Sensor Networks 5

Keywords Web pages

The outbreak power is minimum when the bipartite graph of

keywords and web pages is complete

Keywords Web pages

The outbreak power is maximum when all of the keywords are only

provided by one web page

k1

k2

k3

k4

w1

w2

w3

k1

k2

k3

k4

w1

w2

w3

Figure 1 The illustration of the maximum and minimum burst power

According to the characteristic of burst power the timeinterval with high influence to the society will possess highpossibility to be the peak ormilestone of an emergency eventInspired by [40] before we propose the algorithm to computethe burst power we introduce two important definitions asfollows

Definition 4 (the representative power of keywords) 119903119901(119896)The representative power of keyword 119896 is the probability of 119896to represent the event 119890 correctly

Definition 5 (the confidence of web pages) 119888119908(119889) Theconfidence of a web page 119889 is the expected representativepower of keywords provided by 119889

Different keywords related to one event reveal the variousaspects of the event For example the keyword ldquoChinardquoreveals the place of the event ldquoChina rail crashrdquo On the otherhand the keywords ldquorailrdquo and ldquocrashrdquo reveal the object of theevent ldquoChina rail crashrdquo

42 Basic Heuristics for Computing Burst Power Based onthe common sense and the observations on real data wehave four basic heuristic rules which serve as the bases forthe computing of burst power These four heuristic rules arerelevant to the data If there is burst situation in the data theywould be correct Given the discussion field of this paper allemergency events have some different effect and spreadingpower Thus these four heuristic rules are appropriate foremergency events situation

Heuristic Rule 1 If ignoring 120595(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120593(119905119894 119905119895)|

According to heuristic rule 1 the burst power is propor-tional to the number of increased web pages So we can get119900119901(119905119894 119905119895)infin|120593(119905

119894 119905119895)|

Heuristic Rule 2 If ignoring 120593(119905119894 119905119895) and 120577(119905

119894 119905119895) the possibil-

ity of time interval (119905119894 119905119895) with high burst power is increasing

with |120595(119905119894 119905119895)|

According to heuristic rule 2 the burst power is propor-tional to the number of increased keywords So we can get119900119901(119905119894 119905119895)infin|120595(119905

119894 119905119895)|

If two time intervalswith the same |120593(119905119894 119905119895)| and |120595(119905

119894 119905119895)|

the distribution of keywords will determine 119900119901(119905119894 119905119895) So we

give heuristic rule 3

Heuristic Rule 3 If the bipartite graph of 120577(119905119894 119905119895) is a complete

graph 119900119901(119905119894 119905119895) is the lowest that is (forall119908

119899119898isin 120577(119905119894 119905119895) rarr

119908119899119898

= 0) rarr 119900119901(119905119894 119905119895)min

According to heuristic rule 3 if all of the keywords appearin each web page the similarity between them is 1 whichmeans all of the web pages are copies from one web pageThissituation means that the emergency event is with the lowestdiversity

Since120595(119905119894 119905119895) and 120593(119905

119894 119905119895) are dependent the distribution

of keywords on the web pages should be considered So wegive heuristic rule 4Heuristic Rule 4 Since120595(119905

119894 119905119895) and120593(119905

119894 119905119895) are dependent the

distribution of keywords on web pages should be consideredIf all of the keywords are provided by only one web page119900119901(119905119894 119905119895) is the highest

43 Computing the Burst Power Heuristic rule 4 givesthe condition of maximum 119900119901(119905

119894 119905119895) Figure 1 gives the

illustration of heuristic rules 3 and 4 Based on heuristic

6 International Journal of Distributed Sensor Networks

rules 3 and 4 we can conclude that the confidence of aweb page and the representative power of a keyword aredetermined by each other andwe can use an iterativemethodto compute them Thus we compute the confidence of aweb page by calculating the average representative power ofkeywords it provides as follows

119888119908 (119889ℎ) =

sum(119896119898isin

997888

119889ℎ)cap(119908ℎ119899gt0)

119903119901 (119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(12)

where |997888119889ℎ| means the number of web pages with keywords

119896119898Inspired by [40] we use probability function to compute

the representative power of keywords

119903119901 (119896119898) = 1minus prod

(119889ℎisin

997888

119896119898)cap(119908ℎ119898gt0)

(1minus 119888119908 (119889ℎ)) (13)

where997888119896119898is the set of web pages providing 119896

119898

The above two equations show how to compute theconfidence of a web page However since the similaritiesbetween web pages are not zero we put the similaritiesbetween web pages into (8) The equation can be revised as(9) which considers the similarities between web pages

1199031199011015840

(119896119898) = 119903119901 (119896

119898) + sum

(119896119894isin

997888997888

119896119896119894)cap(119891119894119895gt0)

119891119894119895lowast 119903119901 (119896

119894)

(14)

where997888997888119896119896119894is the set of keywords similarities against keyword

119896119894Since 1199031199011015840(119896) may be higher than 1 we adopt the widely

used logistic function to set 1199031199011015840(119896) into (0 1)Then (9) can berevised as

11990311990110158401015840

(119896119898) =

11 + 119890minus1199031199011015840(119896119898)

(15)

Equation (7) is revised as

119888119908 (119889ℎ) =

sum119896119898isin

997888

119889ℎcap119908ℎ119898gt011990311990110158401015840

(119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(16)

and the burst power function of time interval (119905119894 119905119895) can be

computed by the sum of confidence of all web pages

119900119901 (119905119894 119905119895) =

119899

sum

ℎ=1(1minus 119888119908 (119889

ℎ)) (17)

where 119899 means the number of web pages from time interval(119905119894 119905119895)

As described above we can compute the burst power ofthe proposed method

5 Experiments and Analysis

51 Datasets The events in our experiments are extractedfrom Google and Baidu We select 150 events with about

Table 1 The details of datasets

Feature ValueAverage number of seeds per event 2Average number of web pages per event 1012Average number of keywords per event 4534Average number of days per event 30Average number of web pages per day 34Average number of keywords per day 853

1500000 web pages in our experiments including politicsevents accidents events disasters events and terrorismevents The web pages of each event are downloaded fromGoogle Stanford tagger (httpnlpstanfordedu) is used toreserve the noun words in the web pages The keywordsare selected by their document frequencies Table 1 showsthe statistics of our experimental dataset When Google andBaidu provide the events it also gives some keywords forhelping users to search them After we get the seed set of anevent by the search engine a certain number of the web pagesare collected as samples by automatic crawling and searchingwith the seed setThe detailed steps for collecting related webresources of an event in our experiments are as follows

(1) Get the seed set of an event such as ldquoChina traincrashrdquo which can be seen as 119878(119905

119894 119905119895) of an event

(2) Search the seed set as the query and download therelated web pages with search engine which can beseen as 120593(119905

119894 119905119895) of an event

(3) Identify the starting timestamp of an event by 120593(119905119894 119905119895)

and the ending timestamp by download time whichcan be seen as 119905

119904and 119905119890of an event

(4) Get |120593(119905119894 119905119895)| |120595(119905

119894 119905119895)| 120577(119905119894 119905119895) and Ξ(119905

119894 119905119895) per day

(5) Do step (4) of different information sources includingnews blogs and BBS

52 Experimental Results After obtaining the temporal fea-tures per day of each event we select ten human annotatorsto test whether the burst power of each event is correct ornotThe data fromGoogle Trends is selected to compare withthat of the proposed method If the human annotator thinksthe proposed methods are equal to his own imagination orGoogle Trends we set the precision of this detection to trueIn additional the annotators are set to do their evaluationsindependently which ensure the reliability and validity of theresults Before the human annotator started to evaluate theexperimental results we provide the abstract descriptions ofeach event for them For example we will give some newsstories and concepts to the human annotators This trainingsession continued until the human annotators are familiarwith the concepts and the temporal feature of events In thenext step each annotator was asked to test the results of eachevent independently

In fact the overall precision of the proposed methodachieves 92 showing the accuracy of our burst powercomputation algorithm From the experiments on the real

International Journal of Distributed Sensor Networks 7

data we know that the proposed algorithm can compute theburst power of an event accuratelyThe information fromwebcan be integrated into burst power The factor can be used todetect states of an emergency event

6 Conclusions

With the popularity of web the internet is becoming amajor information provider and poster of an event due toits real-time open and dynamic features However facedwith the hugeness disorder and continuous web resourcesit is impossible for people to efficiently recognize collectand organize the events In this paper crowd sensing basedburst computation algorithm of a web event is developed inorder to let the people know a web event clearly and help thesocial group or government process the events effectivelyThedefinition of ldquosocial sensorsrdquo is firstly introduced which isthe foundation of using web resources to compute the burstpower of events on the web Secondly different temporalfeatures of web events are developed to provide the basicsfor the proposed computation algorithmMoreover the burstpower is presented to integrate the above temporal featuresof an event Empirical experiments on real datasets includingGoogle Zeitgeist and Google Trends show that that thenumber of web pages and the average clustering coefficientcan be used to detect events Some strategies integrating thenumber of web pages and the average clustering coefficientare also employed The evaluations on real dataset show thatthe proposed function integrating the number of web pagesand the average clustering coefficient can be used for eventdetection efficiently and correctly

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceandTechnologyMajor Project underGrant 2013ZX01033002-003 in part by the National High Technology Research andDevelopment Program of China (863 Program) under Grants2013AA014601 and 2013AA014603 in part by National KeyTechnology Support Program under Grant 2012BAH07B01in part by the National Science Foundation of China underGrants 61300202 and 61300028 in part by the Project of theMinistry of Public Security under Grant 2014JSYJB009 inpart by the China Postdoctoral Science Foundation underGrant 2014M560085 in part by the Science Foundation ofShanghai under Grant 13ZR1452900 in part by the ChinaNational Social Science Fund 06BFX051 and in part bythe Shanghai University training and selection of outstand-ing young teachers in special research fund hzf05046 Theauthors must be grateful for the helpful suggestion from YLiu L Mei H Zhang and C Hu

References

[1] Y Liu L Ni and C Hu ldquoA generalized probabilistic topologycontrol for wireless sensor networksrdquo IEEE Journal on SelectedAreas in Communications vol 30 no 9 pp 1780ndash1788 2012

[2] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[3] C Hu Z Xu Y Liu L Mei L Chen and X Luo ldquoSemanticlink network-based model for organizing multimedia big datardquoIEEE Transactions on Emerging Topics in Computing vol 2 no3 pp 376ndash387 2014

[4] Y Liu Y Zhu L Ni and G Xue ldquoA reliability-oriented trans-mission service in wireless sensor networksrdquo IEEE TransactionsonParallel andDistributed Systems vol 22 no 12 pp 2100ndash21072011

[5] Y Liu Q Zhang and L M Ni ldquoOpportunity-based topologycontrol in wireless sensor networksrdquo IEEE Transactions onParallel andDistributed Systems vol 21 no 3 pp 405ndash416 2010

[6] X Liu Y Yang D Yuan and J Chen ldquoDo we need to handleevery temporal violation in scientific workflow systemsrdquo ACMTransactions on Software Engineering and Methodology vol 23no 1 Article ID 2559938 2014

[7] LWang J Tao R Ranjan et al ldquoG-Hadoop mapReduce acrossdistributed data centers for data-intensive computingrdquo FutureGeneration Computer Systems vol 29 no 3 pp 739ndash750 2013

[8] Z Xu X Wei X Luo et al ldquoKnowle A semantic link networkbased system for organizing large scale online news eventsrdquoFuture Generation Computer Systems vol 43-44 pp 40ndash502015

[9] Z Xu X Luo S Zhang X Wei L Mei and C Hu ldquoMin-ing temporal explicit and implicit semantic relations betweenentities using web search enginesrdquo Future Generation ComputerSystems vol 37 pp 468ndash477 2014

[10] D Haddow A Bullock and P Coppola Introduction to Emer-gency Management Butterworth-Heinemann 2010

[11] 2012 httpdefinitionsuslegalcomeemergency-event[12] 2012 httpwwwwhointcsrsarsen[13] 2012 httpenwikipediaorgwikiSeptember 11 attacks[14] J Farber T Myers J Trevathan I Atkinson and T Andersen

ldquoRiskr a low-technological Web20 disaster service to monitorand share informationrdquo in In Proceedings of 15th InternationalConference on Network-Based Information Systems (NBiS rsquo12)pp 311ndash318 IEEE Melbourne Australia September 2012

[15] J Makkonen ldquoInvestigations on event evolution in TDTrdquo inProceedings of the Conference of the North American Chapterof the Association for Computational Linguistics on HumanLanguage (NAACLstudent rsquo03) pp 43ndash48 Edmonton CanadaMay 2003

[16] C C Yang X Shi and C-P Wei ldquoDiscovering event evolutiongraphs fromnews corporardquo IEEETransactions on SystemsManand Cybernetics Part A Systems and Humans vol 39 no 4 pp850ndash863 2009

[17] J Abonyi B Feil S Nemeth and P Arva ldquoModifiedGath-GEVaclustering for fuzzy segmentation of multivariate time-seriesrdquoFuzzy Sets and Systems vol 149 no 1 pp 39ndash56 2005

[18] X Wu Y-J Lu Q Peng and C-W Ngo ldquoMining eventstructures from web videosrdquo IEEEMultimedia vol 18 no 1 pp38ndash51 2011

8 International Journal of Distributed Sensor Networks

[19] Q He K Chang E-P Lim and A Banerjee ldquoKeep it simplewith time a reexamination of probabilistic topic detectionmodelsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 10 pp 1795ndash1808 2010

[20] C Sung and T G Kim ldquoCollaborative modeling processfor development of domain-specific discrete event simulationsystemsrdquo IEEE Transactions on Systems Man and CyberneticsPart C Applications and Reviews vol 42 no 4 pp 532ndash5462012

[21] E Keogh K Chakrabarti M Pazzani and S MehrotraldquoDimensionality reduction for fast similarity search in largetime series databasesrdquo Knowledge and Information Systems vol3 no 3 pp 263ndash286 2001

[22] J Himberg K Korpiaho H Mannila J Tikanmaki and H TT Toivonen ldquoTime series segmentation for context recognitionin mobile devicesrdquo in Proceedings of the 1st IEEE InternationalConference on Data Mining (ICDM rsquo01) pp 203ndash210 December2001

[23] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[24] P Xiong Y Fan and M Zhou ldquoWeb service configurationunder multiple quality-of-service attributesrdquo IEEE TransactionsonAutomation Science and Engineering vol 6 no 2 pp 311ndash3212009

[25] J Allan G Carbonell G Doddington J Yamron and Y YangldquoTopic detection and tracking pilot study final reportrdquo in Pro-ceedings of the Broadcast News Transcription and UnderstandingWorkshop 1998

[26] J Allan Topic Detection and Tracking Event-Based InformationOrganization Kluwer Norwell Mass USA 2000

[27] Q Mei and C Zhai ldquoDiscovering evolutionary theme patternsfrom text an exploration of temporal text miningrdquo in Pro-ceedings of the 11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 198ndash207 August2005

[28] C-P Wei and Y-H Chang ldquoDiscovering event evolution pat-terns fromdocument sequencesrdquo IEEETransactions on SystemsMan and Cybernetics Part A Systems and Humans vol 37 no2 pp 273ndash283 2007

[29] C C Yang andX Shi ldquoDiscovering event evolution graphs fromnewswiresrdquo in Proceedings of the 15th International Conferenceon World Wide Web pp 945ndash946 May 2006

[30] Y Jo C Lagoze and C L Giles ldquoDetecting research topicsvia the correlation between graphs and textsrdquo in Proceedings ofthe 13th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo07) pp 370ndash379 August2007

[31] G P C Fung J X Yu H Liu and P S Yu ldquoTime-dependentevent hierarchy constructionrdquo in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo07) pp 300ndash309 ACM August 2007

[32] V Hristidis O Valdivia M Vlachos and P S Yu ldquoContinuouskeyword search on multiple text streamsrdquo in Proceedings of the15th ACM Conference on Information and Knowledge Manage-ment (CIKM rsquo06) pp 802ndash803 November 2006

[33] Q Zhao T-Y Liu S S Bhowmick andW-Y Ma ldquoEvent detec-tion from evolution of click-through datardquo in Proceedings ofthe 12th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo06) pp 484ndash493 August2006

[34] C Wang M Zhang L Ru and S Ma ldquoAutomatic online newstopic ranking using media focus and user attention based onaging theoryrdquo in Proceedings of the 17th ACM Conference onInformation and Knowledge Management (CIKM rsquo08) pp 1033ndash1042 October 2008

[35] J Leskovec A Krause C Guestrin C Faloutsos J VanBriesenand N Glance ldquoCost-effective outbreak detection in networksrdquoin Proceedings of the 13th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo07) pp420ndash429 August 2007

[36] J Leskovec and E Horvitz ldquoPlanetary-scale views on a largeinstant-messaging networkrdquo in Proceedings of the 17th Interna-tional Conference onWorldWideWeb (WWW rsquo08) pp 915ndash924April 2008

[37] R Nallapati A Feng F Peng and J Allan ldquoEvent threadingwithin news topicsrdquo in Proceedings of the 13th ACM Conferenceon Information and Knowledge Management (CIKM rsquo04) pp446ndash453 November 2004

[38] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing and Managementvol 24 no 5 pp 513ndash523 1988

[39] X Jin S Spangler R Ma and J Han ldquoTopic initiator detectionon the world wide webrdquo in Proceedings of the 19th InternationalWorld Wide Web Conference pp 481ndash490 2010

[40] X Yin J Han and P S Yu ldquoTruth discovery with multiple con-flicting information providers on the Webrdquo IEEE Transactionson Knowledge and Data Engineering vol 20 no 6 pp 796ndash8082008

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 6: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

6 International Journal of Distributed Sensor Networks

rules 3 and 4 we can conclude that the confidence of aweb page and the representative power of a keyword aredetermined by each other andwe can use an iterativemethodto compute them Thus we compute the confidence of aweb page by calculating the average representative power ofkeywords it provides as follows

119888119908 (119889ℎ) =

sum(119896119898isin

997888

119889ℎ)cap(119908ℎ119899gt0)

119903119901 (119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(12)

where |997888119889ℎ| means the number of web pages with keywords

119896119898Inspired by [40] we use probability function to compute

the representative power of keywords

119903119901 (119896119898) = 1minus prod

(119889ℎisin

997888

119896119898)cap(119908ℎ119898gt0)

(1minus 119888119908 (119889ℎ)) (13)

where997888119896119898is the set of web pages providing 119896

119898

The above two equations show how to compute theconfidence of a web page However since the similaritiesbetween web pages are not zero we put the similaritiesbetween web pages into (8) The equation can be revised as(9) which considers the similarities between web pages

1199031199011015840

(119896119898) = 119903119901 (119896

119898) + sum

(119896119894isin

997888997888

119896119896119894)cap(119891119894119895gt0)

119891119894119895lowast 119903119901 (119896

119894)

(14)

where997888997888119896119896119894is the set of keywords similarities against keyword

119896119894Since 1199031199011015840(119896) may be higher than 1 we adopt the widely

used logistic function to set 1199031199011015840(119896) into (0 1)Then (9) can berevised as

11990311990110158401015840

(119896119898) =

11 + 119890minus1199031199011015840(119896119898)

(15)

Equation (7) is revised as

119888119908 (119889ℎ) =

sum119896119898isin

997888

119889ℎcap119908ℎ119898gt011990311990110158401015840

(119896119898)

1003816100381610038161003816100381610038161003816

997888119889ℎ

1003816100381610038161003816100381610038161003816

(16)

and the burst power function of time interval (119905119894 119905119895) can be

computed by the sum of confidence of all web pages

119900119901 (119905119894 119905119895) =

119899

sum

ℎ=1(1minus 119888119908 (119889

ℎ)) (17)

where 119899 means the number of web pages from time interval(119905119894 119905119895)

As described above we can compute the burst power ofthe proposed method

5 Experiments and Analysis

51 Datasets The events in our experiments are extractedfrom Google and Baidu We select 150 events with about

Table 1 The details of datasets

Feature ValueAverage number of seeds per event 2Average number of web pages per event 1012Average number of keywords per event 4534Average number of days per event 30Average number of web pages per day 34Average number of keywords per day 853

1500000 web pages in our experiments including politicsevents accidents events disasters events and terrorismevents The web pages of each event are downloaded fromGoogle Stanford tagger (httpnlpstanfordedu) is used toreserve the noun words in the web pages The keywordsare selected by their document frequencies Table 1 showsthe statistics of our experimental dataset When Google andBaidu provide the events it also gives some keywords forhelping users to search them After we get the seed set of anevent by the search engine a certain number of the web pagesare collected as samples by automatic crawling and searchingwith the seed setThe detailed steps for collecting related webresources of an event in our experiments are as follows

(1) Get the seed set of an event such as ldquoChina traincrashrdquo which can be seen as 119878(119905

119894 119905119895) of an event

(2) Search the seed set as the query and download therelated web pages with search engine which can beseen as 120593(119905

119894 119905119895) of an event

(3) Identify the starting timestamp of an event by 120593(119905119894 119905119895)

and the ending timestamp by download time whichcan be seen as 119905

119904and 119905119890of an event

(4) Get |120593(119905119894 119905119895)| |120595(119905

119894 119905119895)| 120577(119905119894 119905119895) and Ξ(119905

119894 119905119895) per day

(5) Do step (4) of different information sources includingnews blogs and BBS

52 Experimental Results After obtaining the temporal fea-tures per day of each event we select ten human annotatorsto test whether the burst power of each event is correct ornotThe data fromGoogle Trends is selected to compare withthat of the proposed method If the human annotator thinksthe proposed methods are equal to his own imagination orGoogle Trends we set the precision of this detection to trueIn additional the annotators are set to do their evaluationsindependently which ensure the reliability and validity of theresults Before the human annotator started to evaluate theexperimental results we provide the abstract descriptions ofeach event for them For example we will give some newsstories and concepts to the human annotators This trainingsession continued until the human annotators are familiarwith the concepts and the temporal feature of events In thenext step each annotator was asked to test the results of eachevent independently

In fact the overall precision of the proposed methodachieves 92 showing the accuracy of our burst powercomputation algorithm From the experiments on the real

International Journal of Distributed Sensor Networks 7

data we know that the proposed algorithm can compute theburst power of an event accuratelyThe information fromwebcan be integrated into burst power The factor can be used todetect states of an emergency event

6 Conclusions

With the popularity of web the internet is becoming amajor information provider and poster of an event due toits real-time open and dynamic features However facedwith the hugeness disorder and continuous web resourcesit is impossible for people to efficiently recognize collectand organize the events In this paper crowd sensing basedburst computation algorithm of a web event is developed inorder to let the people know a web event clearly and help thesocial group or government process the events effectivelyThedefinition of ldquosocial sensorsrdquo is firstly introduced which isthe foundation of using web resources to compute the burstpower of events on the web Secondly different temporalfeatures of web events are developed to provide the basicsfor the proposed computation algorithmMoreover the burstpower is presented to integrate the above temporal featuresof an event Empirical experiments on real datasets includingGoogle Zeitgeist and Google Trends show that that thenumber of web pages and the average clustering coefficientcan be used to detect events Some strategies integrating thenumber of web pages and the average clustering coefficientare also employed The evaluations on real dataset show thatthe proposed function integrating the number of web pagesand the average clustering coefficient can be used for eventdetection efficiently and correctly

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceandTechnologyMajor Project underGrant 2013ZX01033002-003 in part by the National High Technology Research andDevelopment Program of China (863 Program) under Grants2013AA014601 and 2013AA014603 in part by National KeyTechnology Support Program under Grant 2012BAH07B01in part by the National Science Foundation of China underGrants 61300202 and 61300028 in part by the Project of theMinistry of Public Security under Grant 2014JSYJB009 inpart by the China Postdoctoral Science Foundation underGrant 2014M560085 in part by the Science Foundation ofShanghai under Grant 13ZR1452900 in part by the ChinaNational Social Science Fund 06BFX051 and in part bythe Shanghai University training and selection of outstand-ing young teachers in special research fund hzf05046 Theauthors must be grateful for the helpful suggestion from YLiu L Mei H Zhang and C Hu

References

[1] Y Liu L Ni and C Hu ldquoA generalized probabilistic topologycontrol for wireless sensor networksrdquo IEEE Journal on SelectedAreas in Communications vol 30 no 9 pp 1780ndash1788 2012

[2] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[3] C Hu Z Xu Y Liu L Mei L Chen and X Luo ldquoSemanticlink network-based model for organizing multimedia big datardquoIEEE Transactions on Emerging Topics in Computing vol 2 no3 pp 376ndash387 2014

[4] Y Liu Y Zhu L Ni and G Xue ldquoA reliability-oriented trans-mission service in wireless sensor networksrdquo IEEE TransactionsonParallel andDistributed Systems vol 22 no 12 pp 2100ndash21072011

[5] Y Liu Q Zhang and L M Ni ldquoOpportunity-based topologycontrol in wireless sensor networksrdquo IEEE Transactions onParallel andDistributed Systems vol 21 no 3 pp 405ndash416 2010

[6] X Liu Y Yang D Yuan and J Chen ldquoDo we need to handleevery temporal violation in scientific workflow systemsrdquo ACMTransactions on Software Engineering and Methodology vol 23no 1 Article ID 2559938 2014

[7] LWang J Tao R Ranjan et al ldquoG-Hadoop mapReduce acrossdistributed data centers for data-intensive computingrdquo FutureGeneration Computer Systems vol 29 no 3 pp 739ndash750 2013

[8] Z Xu X Wei X Luo et al ldquoKnowle A semantic link networkbased system for organizing large scale online news eventsrdquoFuture Generation Computer Systems vol 43-44 pp 40ndash502015

[9] Z Xu X Luo S Zhang X Wei L Mei and C Hu ldquoMin-ing temporal explicit and implicit semantic relations betweenentities using web search enginesrdquo Future Generation ComputerSystems vol 37 pp 468ndash477 2014

[10] D Haddow A Bullock and P Coppola Introduction to Emer-gency Management Butterworth-Heinemann 2010

[11] 2012 httpdefinitionsuslegalcomeemergency-event[12] 2012 httpwwwwhointcsrsarsen[13] 2012 httpenwikipediaorgwikiSeptember 11 attacks[14] J Farber T Myers J Trevathan I Atkinson and T Andersen

ldquoRiskr a low-technological Web20 disaster service to monitorand share informationrdquo in In Proceedings of 15th InternationalConference on Network-Based Information Systems (NBiS rsquo12)pp 311ndash318 IEEE Melbourne Australia September 2012

[15] J Makkonen ldquoInvestigations on event evolution in TDTrdquo inProceedings of the Conference of the North American Chapterof the Association for Computational Linguistics on HumanLanguage (NAACLstudent rsquo03) pp 43ndash48 Edmonton CanadaMay 2003

[16] C C Yang X Shi and C-P Wei ldquoDiscovering event evolutiongraphs fromnews corporardquo IEEETransactions on SystemsManand Cybernetics Part A Systems and Humans vol 39 no 4 pp850ndash863 2009

[17] J Abonyi B Feil S Nemeth and P Arva ldquoModifiedGath-GEVaclustering for fuzzy segmentation of multivariate time-seriesrdquoFuzzy Sets and Systems vol 149 no 1 pp 39ndash56 2005

[18] X Wu Y-J Lu Q Peng and C-W Ngo ldquoMining eventstructures from web videosrdquo IEEEMultimedia vol 18 no 1 pp38ndash51 2011

8 International Journal of Distributed Sensor Networks

[19] Q He K Chang E-P Lim and A Banerjee ldquoKeep it simplewith time a reexamination of probabilistic topic detectionmodelsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 10 pp 1795ndash1808 2010

[20] C Sung and T G Kim ldquoCollaborative modeling processfor development of domain-specific discrete event simulationsystemsrdquo IEEE Transactions on Systems Man and CyberneticsPart C Applications and Reviews vol 42 no 4 pp 532ndash5462012

[21] E Keogh K Chakrabarti M Pazzani and S MehrotraldquoDimensionality reduction for fast similarity search in largetime series databasesrdquo Knowledge and Information Systems vol3 no 3 pp 263ndash286 2001

[22] J Himberg K Korpiaho H Mannila J Tikanmaki and H TT Toivonen ldquoTime series segmentation for context recognitionin mobile devicesrdquo in Proceedings of the 1st IEEE InternationalConference on Data Mining (ICDM rsquo01) pp 203ndash210 December2001

[23] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[24] P Xiong Y Fan and M Zhou ldquoWeb service configurationunder multiple quality-of-service attributesrdquo IEEE TransactionsonAutomation Science and Engineering vol 6 no 2 pp 311ndash3212009

[25] J Allan G Carbonell G Doddington J Yamron and Y YangldquoTopic detection and tracking pilot study final reportrdquo in Pro-ceedings of the Broadcast News Transcription and UnderstandingWorkshop 1998

[26] J Allan Topic Detection and Tracking Event-Based InformationOrganization Kluwer Norwell Mass USA 2000

[27] Q Mei and C Zhai ldquoDiscovering evolutionary theme patternsfrom text an exploration of temporal text miningrdquo in Pro-ceedings of the 11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 198ndash207 August2005

[28] C-P Wei and Y-H Chang ldquoDiscovering event evolution pat-terns fromdocument sequencesrdquo IEEETransactions on SystemsMan and Cybernetics Part A Systems and Humans vol 37 no2 pp 273ndash283 2007

[29] C C Yang andX Shi ldquoDiscovering event evolution graphs fromnewswiresrdquo in Proceedings of the 15th International Conferenceon World Wide Web pp 945ndash946 May 2006

[30] Y Jo C Lagoze and C L Giles ldquoDetecting research topicsvia the correlation between graphs and textsrdquo in Proceedings ofthe 13th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo07) pp 370ndash379 August2007

[31] G P C Fung J X Yu H Liu and P S Yu ldquoTime-dependentevent hierarchy constructionrdquo in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo07) pp 300ndash309 ACM August 2007

[32] V Hristidis O Valdivia M Vlachos and P S Yu ldquoContinuouskeyword search on multiple text streamsrdquo in Proceedings of the15th ACM Conference on Information and Knowledge Manage-ment (CIKM rsquo06) pp 802ndash803 November 2006

[33] Q Zhao T-Y Liu S S Bhowmick andW-Y Ma ldquoEvent detec-tion from evolution of click-through datardquo in Proceedings ofthe 12th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo06) pp 484ndash493 August2006

[34] C Wang M Zhang L Ru and S Ma ldquoAutomatic online newstopic ranking using media focus and user attention based onaging theoryrdquo in Proceedings of the 17th ACM Conference onInformation and Knowledge Management (CIKM rsquo08) pp 1033ndash1042 October 2008

[35] J Leskovec A Krause C Guestrin C Faloutsos J VanBriesenand N Glance ldquoCost-effective outbreak detection in networksrdquoin Proceedings of the 13th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo07) pp420ndash429 August 2007

[36] J Leskovec and E Horvitz ldquoPlanetary-scale views on a largeinstant-messaging networkrdquo in Proceedings of the 17th Interna-tional Conference onWorldWideWeb (WWW rsquo08) pp 915ndash924April 2008

[37] R Nallapati A Feng F Peng and J Allan ldquoEvent threadingwithin news topicsrdquo in Proceedings of the 13th ACM Conferenceon Information and Knowledge Management (CIKM rsquo04) pp446ndash453 November 2004

[38] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing and Managementvol 24 no 5 pp 513ndash523 1988

[39] X Jin S Spangler R Ma and J Han ldquoTopic initiator detectionon the world wide webrdquo in Proceedings of the 19th InternationalWorld Wide Web Conference pp 481ndash490 2010

[40] X Yin J Han and P S Yu ldquoTruth discovery with multiple con-flicting information providers on the Webrdquo IEEE Transactionson Knowledge and Data Engineering vol 20 no 6 pp 796ndash8082008

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 7: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

International Journal of Distributed Sensor Networks 7

data we know that the proposed algorithm can compute theburst power of an event accuratelyThe information fromwebcan be integrated into burst power The factor can be used todetect states of an emergency event

6 Conclusions

With the popularity of web the internet is becoming amajor information provider and poster of an event due toits real-time open and dynamic features However facedwith the hugeness disorder and continuous web resourcesit is impossible for people to efficiently recognize collectand organize the events In this paper crowd sensing basedburst computation algorithm of a web event is developed inorder to let the people know a web event clearly and help thesocial group or government process the events effectivelyThedefinition of ldquosocial sensorsrdquo is firstly introduced which isthe foundation of using web resources to compute the burstpower of events on the web Secondly different temporalfeatures of web events are developed to provide the basicsfor the proposed computation algorithmMoreover the burstpower is presented to integrate the above temporal featuresof an event Empirical experiments on real datasets includingGoogle Zeitgeist and Google Trends show that that thenumber of web pages and the average clustering coefficientcan be used to detect events Some strategies integrating thenumber of web pages and the average clustering coefficientare also employed The evaluations on real dataset show thatthe proposed function integrating the number of web pagesand the average clustering coefficient can be used for eventdetection efficiently and correctly

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported in part by the National ScienceandTechnologyMajor Project underGrant 2013ZX01033002-003 in part by the National High Technology Research andDevelopment Program of China (863 Program) under Grants2013AA014601 and 2013AA014603 in part by National KeyTechnology Support Program under Grant 2012BAH07B01in part by the National Science Foundation of China underGrants 61300202 and 61300028 in part by the Project of theMinistry of Public Security under Grant 2014JSYJB009 inpart by the China Postdoctoral Science Foundation underGrant 2014M560085 in part by the Science Foundation ofShanghai under Grant 13ZR1452900 in part by the ChinaNational Social Science Fund 06BFX051 and in part bythe Shanghai University training and selection of outstand-ing young teachers in special research fund hzf05046 Theauthors must be grateful for the helpful suggestion from YLiu L Mei H Zhang and C Hu

References

[1] Y Liu L Ni and C Hu ldquoA generalized probabilistic topologycontrol for wireless sensor networksrdquo IEEE Journal on SelectedAreas in Communications vol 30 no 9 pp 1780ndash1788 2012

[2] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[3] C Hu Z Xu Y Liu L Mei L Chen and X Luo ldquoSemanticlink network-based model for organizing multimedia big datardquoIEEE Transactions on Emerging Topics in Computing vol 2 no3 pp 376ndash387 2014

[4] Y Liu Y Zhu L Ni and G Xue ldquoA reliability-oriented trans-mission service in wireless sensor networksrdquo IEEE TransactionsonParallel andDistributed Systems vol 22 no 12 pp 2100ndash21072011

[5] Y Liu Q Zhang and L M Ni ldquoOpportunity-based topologycontrol in wireless sensor networksrdquo IEEE Transactions onParallel andDistributed Systems vol 21 no 3 pp 405ndash416 2010

[6] X Liu Y Yang D Yuan and J Chen ldquoDo we need to handleevery temporal violation in scientific workflow systemsrdquo ACMTransactions on Software Engineering and Methodology vol 23no 1 Article ID 2559938 2014

[7] LWang J Tao R Ranjan et al ldquoG-Hadoop mapReduce acrossdistributed data centers for data-intensive computingrdquo FutureGeneration Computer Systems vol 29 no 3 pp 739ndash750 2013

[8] Z Xu X Wei X Luo et al ldquoKnowle A semantic link networkbased system for organizing large scale online news eventsrdquoFuture Generation Computer Systems vol 43-44 pp 40ndash502015

[9] Z Xu X Luo S Zhang X Wei L Mei and C Hu ldquoMin-ing temporal explicit and implicit semantic relations betweenentities using web search enginesrdquo Future Generation ComputerSystems vol 37 pp 468ndash477 2014

[10] D Haddow A Bullock and P Coppola Introduction to Emer-gency Management Butterworth-Heinemann 2010

[11] 2012 httpdefinitionsuslegalcomeemergency-event[12] 2012 httpwwwwhointcsrsarsen[13] 2012 httpenwikipediaorgwikiSeptember 11 attacks[14] J Farber T Myers J Trevathan I Atkinson and T Andersen

ldquoRiskr a low-technological Web20 disaster service to monitorand share informationrdquo in In Proceedings of 15th InternationalConference on Network-Based Information Systems (NBiS rsquo12)pp 311ndash318 IEEE Melbourne Australia September 2012

[15] J Makkonen ldquoInvestigations on event evolution in TDTrdquo inProceedings of the Conference of the North American Chapterof the Association for Computational Linguistics on HumanLanguage (NAACLstudent rsquo03) pp 43ndash48 Edmonton CanadaMay 2003

[16] C C Yang X Shi and C-P Wei ldquoDiscovering event evolutiongraphs fromnews corporardquo IEEETransactions on SystemsManand Cybernetics Part A Systems and Humans vol 39 no 4 pp850ndash863 2009

[17] J Abonyi B Feil S Nemeth and P Arva ldquoModifiedGath-GEVaclustering for fuzzy segmentation of multivariate time-seriesrdquoFuzzy Sets and Systems vol 149 no 1 pp 39ndash56 2005

[18] X Wu Y-J Lu Q Peng and C-W Ngo ldquoMining eventstructures from web videosrdquo IEEEMultimedia vol 18 no 1 pp38ndash51 2011

8 International Journal of Distributed Sensor Networks

[19] Q He K Chang E-P Lim and A Banerjee ldquoKeep it simplewith time a reexamination of probabilistic topic detectionmodelsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 10 pp 1795ndash1808 2010

[20] C Sung and T G Kim ldquoCollaborative modeling processfor development of domain-specific discrete event simulationsystemsrdquo IEEE Transactions on Systems Man and CyberneticsPart C Applications and Reviews vol 42 no 4 pp 532ndash5462012

[21] E Keogh K Chakrabarti M Pazzani and S MehrotraldquoDimensionality reduction for fast similarity search in largetime series databasesrdquo Knowledge and Information Systems vol3 no 3 pp 263ndash286 2001

[22] J Himberg K Korpiaho H Mannila J Tikanmaki and H TT Toivonen ldquoTime series segmentation for context recognitionin mobile devicesrdquo in Proceedings of the 1st IEEE InternationalConference on Data Mining (ICDM rsquo01) pp 203ndash210 December2001

[23] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[24] P Xiong Y Fan and M Zhou ldquoWeb service configurationunder multiple quality-of-service attributesrdquo IEEE TransactionsonAutomation Science and Engineering vol 6 no 2 pp 311ndash3212009

[25] J Allan G Carbonell G Doddington J Yamron and Y YangldquoTopic detection and tracking pilot study final reportrdquo in Pro-ceedings of the Broadcast News Transcription and UnderstandingWorkshop 1998

[26] J Allan Topic Detection and Tracking Event-Based InformationOrganization Kluwer Norwell Mass USA 2000

[27] Q Mei and C Zhai ldquoDiscovering evolutionary theme patternsfrom text an exploration of temporal text miningrdquo in Pro-ceedings of the 11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 198ndash207 August2005

[28] C-P Wei and Y-H Chang ldquoDiscovering event evolution pat-terns fromdocument sequencesrdquo IEEETransactions on SystemsMan and Cybernetics Part A Systems and Humans vol 37 no2 pp 273ndash283 2007

[29] C C Yang andX Shi ldquoDiscovering event evolution graphs fromnewswiresrdquo in Proceedings of the 15th International Conferenceon World Wide Web pp 945ndash946 May 2006

[30] Y Jo C Lagoze and C L Giles ldquoDetecting research topicsvia the correlation between graphs and textsrdquo in Proceedings ofthe 13th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo07) pp 370ndash379 August2007

[31] G P C Fung J X Yu H Liu and P S Yu ldquoTime-dependentevent hierarchy constructionrdquo in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo07) pp 300ndash309 ACM August 2007

[32] V Hristidis O Valdivia M Vlachos and P S Yu ldquoContinuouskeyword search on multiple text streamsrdquo in Proceedings of the15th ACM Conference on Information and Knowledge Manage-ment (CIKM rsquo06) pp 802ndash803 November 2006

[33] Q Zhao T-Y Liu S S Bhowmick andW-Y Ma ldquoEvent detec-tion from evolution of click-through datardquo in Proceedings ofthe 12th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo06) pp 484ndash493 August2006

[34] C Wang M Zhang L Ru and S Ma ldquoAutomatic online newstopic ranking using media focus and user attention based onaging theoryrdquo in Proceedings of the 17th ACM Conference onInformation and Knowledge Management (CIKM rsquo08) pp 1033ndash1042 October 2008

[35] J Leskovec A Krause C Guestrin C Faloutsos J VanBriesenand N Glance ldquoCost-effective outbreak detection in networksrdquoin Proceedings of the 13th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo07) pp420ndash429 August 2007

[36] J Leskovec and E Horvitz ldquoPlanetary-scale views on a largeinstant-messaging networkrdquo in Proceedings of the 17th Interna-tional Conference onWorldWideWeb (WWW rsquo08) pp 915ndash924April 2008

[37] R Nallapati A Feng F Peng and J Allan ldquoEvent threadingwithin news topicsrdquo in Proceedings of the 13th ACM Conferenceon Information and Knowledge Management (CIKM rsquo04) pp446ndash453 November 2004

[38] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing and Managementvol 24 no 5 pp 513ndash523 1988

[39] X Jin S Spangler R Ma and J Han ldquoTopic initiator detectionon the world wide webrdquo in Proceedings of the 19th InternationalWorld Wide Web Conference pp 481ndash490 2010

[40] X Yin J Han and P S Yu ldquoTruth discovery with multiple con-flicting information providers on the Webrdquo IEEE Transactionson Knowledge and Data Engineering vol 20 no 6 pp 796ndash8082008

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 8: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

8 International Journal of Distributed Sensor Networks

[19] Q He K Chang E-P Lim and A Banerjee ldquoKeep it simplewith time a reexamination of probabilistic topic detectionmodelsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 32 no 10 pp 1795ndash1808 2010

[20] C Sung and T G Kim ldquoCollaborative modeling processfor development of domain-specific discrete event simulationsystemsrdquo IEEE Transactions on Systems Man and CyberneticsPart C Applications and Reviews vol 42 no 4 pp 532ndash5462012

[21] E Keogh K Chakrabarti M Pazzani and S MehrotraldquoDimensionality reduction for fast similarity search in largetime series databasesrdquo Knowledge and Information Systems vol3 no 3 pp 263ndash286 2001

[22] J Himberg K Korpiaho H Mannila J Tikanmaki and H TT Toivonen ldquoTime series segmentation for context recognitionin mobile devicesrdquo in Proceedings of the 1st IEEE InternationalConference on Data Mining (ICDM rsquo01) pp 203ndash210 December2001

[23] X Luo Z Xu J Yu and X Chen ldquoBuilding association linknetwork for semantic link on web resourcesrdquo IEEE Transactionson Automation Science and Engineering vol 8 no 3 pp 482ndash494 2011

[24] P Xiong Y Fan and M Zhou ldquoWeb service configurationunder multiple quality-of-service attributesrdquo IEEE TransactionsonAutomation Science and Engineering vol 6 no 2 pp 311ndash3212009

[25] J Allan G Carbonell G Doddington J Yamron and Y YangldquoTopic detection and tracking pilot study final reportrdquo in Pro-ceedings of the Broadcast News Transcription and UnderstandingWorkshop 1998

[26] J Allan Topic Detection and Tracking Event-Based InformationOrganization Kluwer Norwell Mass USA 2000

[27] Q Mei and C Zhai ldquoDiscovering evolutionary theme patternsfrom text an exploration of temporal text miningrdquo in Pro-ceedings of the 11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining pp 198ndash207 August2005

[28] C-P Wei and Y-H Chang ldquoDiscovering event evolution pat-terns fromdocument sequencesrdquo IEEETransactions on SystemsMan and Cybernetics Part A Systems and Humans vol 37 no2 pp 273ndash283 2007

[29] C C Yang andX Shi ldquoDiscovering event evolution graphs fromnewswiresrdquo in Proceedings of the 15th International Conferenceon World Wide Web pp 945ndash946 May 2006

[30] Y Jo C Lagoze and C L Giles ldquoDetecting research topicsvia the correlation between graphs and textsrdquo in Proceedings ofthe 13th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo07) pp 370ndash379 August2007

[31] G P C Fung J X Yu H Liu and P S Yu ldquoTime-dependentevent hierarchy constructionrdquo in Proceedings of the 13th ACMSIGKDD International Conference on Knowledge Discovery andData Mining (KDD rsquo07) pp 300ndash309 ACM August 2007

[32] V Hristidis O Valdivia M Vlachos and P S Yu ldquoContinuouskeyword search on multiple text streamsrdquo in Proceedings of the15th ACM Conference on Information and Knowledge Manage-ment (CIKM rsquo06) pp 802ndash803 November 2006

[33] Q Zhao T-Y Liu S S Bhowmick andW-Y Ma ldquoEvent detec-tion from evolution of click-through datardquo in Proceedings ofthe 12th ACM SIGKDD International Conference on KnowledgeDiscovery and Data Mining (KDD rsquo06) pp 484ndash493 August2006

[34] C Wang M Zhang L Ru and S Ma ldquoAutomatic online newstopic ranking using media focus and user attention based onaging theoryrdquo in Proceedings of the 17th ACM Conference onInformation and Knowledge Management (CIKM rsquo08) pp 1033ndash1042 October 2008

[35] J Leskovec A Krause C Guestrin C Faloutsos J VanBriesenand N Glance ldquoCost-effective outbreak detection in networksrdquoin Proceedings of the 13th ACM SIGKDD International Confer-ence on Knowledge Discovery and Data Mining (KDD rsquo07) pp420ndash429 August 2007

[36] J Leskovec and E Horvitz ldquoPlanetary-scale views on a largeinstant-messaging networkrdquo in Proceedings of the 17th Interna-tional Conference onWorldWideWeb (WWW rsquo08) pp 915ndash924April 2008

[37] R Nallapati A Feng F Peng and J Allan ldquoEvent threadingwithin news topicsrdquo in Proceedings of the 13th ACM Conferenceon Information and Knowledge Management (CIKM rsquo04) pp446ndash453 November 2004

[38] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing and Managementvol 24 no 5 pp 513ndash523 1988

[39] X Jin S Spangler R Ma and J Han ldquoTopic initiator detectionon the world wide webrdquo in Proceedings of the 19th InternationalWorld Wide Web Conference pp 481ndash490 2010

[40] X Yin J Han and P S Yu ldquoTruth discovery with multiple con-flicting information providers on the Webrdquo IEEE Transactionson Knowledge and Data Engineering vol 20 no 6 pp 796ndash8082008

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

Page 9: Research Article Crowd Sensing Based Burst Computing of ...downloads.hindawi.com/journals/ijdsn/2015/184751.pdf · With the help of cloud computing[ ],internetofthings[ ],andBigData[

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of