Using GIS to Understand Behavior Patterns of Twitter Users

Yue Li

M.S. Civil/Geomatics Engineering, Purdue University

Committee: Dr. Jie Shan (Chair), Dr. Nicole Kong, Dr. James Bethel



Introduction

• Volunteered Geographic Information (VGI) [1]

− Emergency management, event detection, tourist behavior, knowledge discovery…

• Twitter

− The most popular micro-blogging site

− Tweets with longitude and latitude

− A gold mine for scholars in geography, linguistics, sociology, economics, health, and psychology [2]

− Marketing, advertising, regulation, …


Research Goal

• To discover the spatio-temporal pattern of tweets

• To infer the human mobility patterns behind the tweets

• To understand the lifestyle of college students


Study Area

• College town/city, Big Ten Universities

• West Lafayette, IN

− Most densely populated city in IN

− Home of Purdue University

• Ann Arbor, MI

− University of Michigan

• Bloomington, IN

− Indiana University, Bloomington

• Columbus, OH

− Ohio State University


Data

• Geo-tagged tweets downloaded with the Twitter Streaming API

• With longitude and latitude at time of posting

• Nov 18, 2013 to April 2, 2014

− West Lafayette: 59,238

− Ann Arbor: 220,117

− Bloomington: 247,202

− Columbus: 1,936,238
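
A minimal sketch of how geo-tagged tweets might be reduced to one study area. It assumes each tweet arrives as a JSON object whose `coordinates` field is a GeoJSON point in [longitude, latitude] order, as in Twitter API v1.1 tweet objects. The bounding box values and the helper names `in_study_area` and `filter_tweets` are illustrative, not the ones used in this study.

```python
import json

# Hypothetical bounding box (west, south, east, north) in decimal degrees.
# Illustrative values only, not the study's actual extent.
BBOX = (-87.00, 40.38, -86.85, 40.50)

def in_study_area(tweet, bbox=BBOX):
    """Return True if the tweet carries a point location inside bbox."""
    point = tweet.get("coordinates")
    if not point or point.get("type") != "Point":
        return False  # tweet is not geo-tagged with an exact point
    lon, lat = point["coordinates"]  # GeoJSON order: longitude first
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north

def filter_tweets(lines, bbox=BBOX):
    """Parse one JSON tweet per line and keep those inside bbox."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the input
        tweet = json.loads(line)
        if in_study_area(tweet, bbox):
            kept.append(tweet)
    return kept

sample = [
    '{"id": 1, "coordinates": {"type": "Point", "coordinates": [-86.91, 40.43]}}',
    '{"id": 2, "coordinates": null}',
    '{"id": 3, "coordinates": {"type": "Point", "coordinates": [-83.74, 42.28]}}',
]
kept_ids = [t["id"] for t in filter_tweets(sample)]
print(kept_ids)
```

Only the first sample record is kept: the second has no point location and the third falls outside the illustrative box.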


Methods

• Pure Spatial

− Point density analysis

• Pure Temporal

• Spatio-Temporal

− Tweets in Land Use

− Event/Anomaly detection

− Individual Twitter user patterns
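
Point density analysis, the pure spatial method listed above, can be approximated by counting points per grid cell and dividing by the cell area. The sketch below is a simplified stand-in for a GIS point density tool, not the tool used in this study. It assumes projected coordinates (for example meters), and the function name `point_density` and the cell size are hypothetical.

```python
from collections import Counter

def point_density(points, cell_size):
    """Count points per square grid cell and divide by the cell area.

    points: iterable of (x, y) in projected coordinates (e.g. meters).
    Returns {(col, row): points per unit area} for non-empty cells.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    area = cell_size * cell_size
    return {cell: n / area for cell, n in counts.items()}

# Made-up points for illustration.
pts = [(10, 10), (20, 30), (80, 90), (120, 40), (130, 45), (140, 10)]
density = point_density(pts, cell_size=100)
for cell, value in sorted(density.items()):
    print(cell, value)
```

Each of the two occupied cells holds three points, so both report the same density.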


Tweets in West Lafayette


Tweets in Ann Arbor


Tweets in Bloomington


Tweets in Columbus


Tweets by Hour


Tweets by Hour


Tweets and Land Use

• Land use in Ann Arbor, MI

− Industrial

− Mixed Use

− Office

− Public/Education

− Recreation

− Residential

− Transportation

− Vacant

• Spatially join the tweets with land use
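
A spatial join of this kind assigns each tweet the land-use class of the polygon that contains it. The sketch below shows the idea with a plain ray-casting point-in-polygon test. It is an illustration rather than the GIS tool used in this study, and the parcel geometries, coordinates, and function names are hypothetical.

```python
from collections import Counter

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def join_land_use(tweets, parcels):
    """Attach the land-use class of the first containing parcel to each tweet."""
    joined = []
    for x, y in tweets:
        label = None  # stays None when no parcel contains the point
        for land_use, ring in parcels:
            if point_in_polygon(x, y, ring):
                label = land_use
                break
        joined.append(((x, y), label))
    return joined

# Hypothetical parcels: (land-use class, list of polygon vertices).
parcels = [
    ("Residential", [(0, 0), (4, 0), (4, 4), (0, 4)]),
    ("Public/Education", [(4, 0), (8, 0), (8, 4), (4, 4)]),
]
tweets = [(1, 1), (5, 2), (6, 3), (9, 9)]

joined = join_land_use(tweets, parcels)
labels = [label for _, label in joined]
print(Counter(labels))
```

Counting the joined labels per class, and then per hour, is one way to relate tweet volume to land use.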


Tweets and Land Use


Tweets and Land Use

1 - Commercial; 2 - Industrial; 3 - Mixed Use; 4 - Office; 5 - Public/Education; 6 - Recreation; 7 - Residence; 8 - Transportation; 9 - Vacant/River


Event Detection

• Spatially and temporally aggregated

− Football game, concert, festival, …

• Use the Purdue shooting on Jan 21, 2014 as an example

− Lockdown from around 12:00 to 14:00

• Temporally

− 710 tweets from 231 unique users between 12:00 and 14:00 on Jan 21

− 7,443 tweets from 1,080 unique users between 12:00 and 14:00 across the whole dataset

• Spatially

− How to measure spatial anomaly?
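
The temporal counts on this slide can be put on a per-day footing with a short calculation. The sketch assumes that the whole-dataset figure covers the full collection period stated on the Data slide (Nov 18, 2013 to April 2, 2014) and that tweets were collected on every day of it. Both assumptions are mine, not statements from the slides.

```python
from datetime import date

# Figures taken from the slides.
start, end = date(2013, 11, 18), date(2014, 4, 2)
tweets_event_window = 710   # 12:00-14:00 on Jan 21, 2014
tweets_all_windows = 7443   # 12:00-14:00 across the whole dataset

# Assumption: collection ran on every day of the period, endpoints included.
n_days = (end - start).days + 1
mean_per_day = tweets_all_windows / n_days
ratio = tweets_event_window / mean_per_day

print(n_days, round(mean_per_day, 1), round(ratio, 1))
```

Under these assumptions the lockdown window carried roughly 13 times the average daily volume for the same two hours.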


Hypotheses

• Challenge: the point process is inhomogeneous/clustered even outside the lockdown period

− Were tweets significantly more clustered during the lockdown than on average?

• Intensity of tweets is correlated with distance to campus buildings

• Extent of clustering is positively correlated with chi-square value


Covariate: Purdue Buildings

Purdue building shapefile converted to a tessellation

R libraries: maptools, sp, spatstat

Functions: as.mask → im → tess


Randomization Test

Algorithm (by Ken Kellner):

1. Select 710 random tweets, without replacement, from dates 1/16/14 - 1/26/14 and hours 12:00 - 14:00

2. Call quadratcount() and quadrat.test() on new random dataset

3. Save chi-square value

4. Repeat 1000 times to obtain distribution of chi-square values

5. Compare actual chi-square value obtained on 1/21/14 with distribution

6. Quasi-p value: proportion of resampled chi-square values more extreme than the observed value

Assumption: a greater chi-square value means a more inhomogeneous/clustered pattern

Tested with simulation
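
The steps above can be sketched in Python. This is a stand-in for illustration, not the R code behind the slides: it replaces spatstat's `quadratcount()` and `quadrat.test()` with a hand-written Pearson chi-square over a regular grid, whereas the study used a tessellation built from Purdue buildings. The grid size, the synthetic points, and the function names are all assumptions.

```python
import random

def quadrat_chi_square(points, bounds, nx, ny):
    """Pearson chi-square of quadrat counts against a uniform expectation."""
    xmin, ymin, xmax, ymax = bounds
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
        row = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
        counts[row * nx + col] += 1
    expected = len(points) / (nx * ny)  # equal-area quadrats
    return sum((c - expected) ** 2 / expected for c in counts)

def randomization_test(event_points, pool, bounds, nx=4, ny=4,
                       n_iter=1000, seed=0):
    """Return (observed statistic, quasi-p value)."""
    rng = random.Random(seed)
    observed = quadrat_chi_square(event_points, bounds, nx, ny)
    n = len(event_points)
    # Steps 1-4: resample n points without replacement, keep each statistic.
    null_stats = [
        quadrat_chi_square(rng.sample(pool, n), bounds, nx, ny)
        for _ in range(n_iter)
    ]
    # Steps 5-6: share of resampled statistics at least as large as observed.
    quasi_p = sum(s >= observed for s in null_stats) / n_iter
    return observed, quasi_p

# Synthetic data: a uniform background plus a tight cluster for the "event".
rng = random.Random(42)
background = [(rng.random(), rng.random()) for _ in range(2000)]
event = [(rng.uniform(0, 0.24), rng.uniform(0, 0.24)) for _ in range(100)]
pool = background + event  # the comparison window includes the event itself

observed, quasi_p = randomization_test(event, pool, bounds=(0, 0, 1, 1))
print(observed, quasi_p)
```

Fixing the random seed keeps the resampling reproducible. A small quasi-p value means that few random subsets were as unevenly spread as the event subset.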


Randomization Test Result

Chi-square: 85162.85

Quasi-p value: 0.038

• We were able to detect a change in the pattern of tweets during the lockdown, when presumably more people were stuck in Purdue buildings than average.


Event Detection

• We can see the anomaly in Twitter data both temporally and spatially

• However, we are still looking for a complete, integrated algorithm that can also be applied to other events

• To be continued


Frequent Twitter Users

• Top 10 Twitter users with the most tweets in Ann Arbor

• Plot the tweets of each individual Twitter user

• Four typical patterns

− Work-Home

− Work-Road-Home

− Work-Home-Short Visit

− Multiple Clusters
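
A minimal sketch of the first two steps: ranking users by tweet count and grouping each top user's points for plotting. It assumes tweets are available as (user id, longitude, latitude) records. The record layout, the sample values, and the function names are hypothetical, and labeling the four patterns is left to visual inspection of the plots.

```python
from collections import Counter, defaultdict

def top_users(tweets, k=10):
    """Return the k users with the most tweets as (user, count) pairs."""
    return Counter(user for user, _, _ in tweets).most_common(k)

def points_by_user(tweets, users):
    """Group (lon, lat) points by user, ready for one map per user."""
    wanted = set(users)
    grouped = defaultdict(list)
    for user, lon, lat in tweets:
        if user in wanted:
            grouped[user].append((lon, lat))
    return dict(grouped)

# Hypothetical records: (user id, longitude, latitude).
tweets = [
    ("u1", -83.74, 42.28), ("u1", -83.73, 42.29), ("u1", -83.74, 42.28),
    ("u2", -83.70, 42.30), ("u2", -83.71, 42.30),
    ("u3", -83.75, 42.27),
]
ranked = top_users(tweets, k=2)
grouped = points_by_user(tweets, [user for user, _ in ranked])
print(ranked)
print(grouped)
```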


Frequent Twitter Users


Frequent Twitter Users


Future Work

• On-going research

• Complete the analysis in all four study areas and compare the patterns

• Develop/Find an algorithm for event detection

• …

• Any suggestions are welcome!


References

• 1. Goodchild, M. F., 2007. Citizens as sensors: The world of volunteered geography, GeoJournal, 69, 211-221.

• 2. Ghosh, D., and R. Guha, 2013. What are we ‘tweeting’ about obesity? Mapping tweets with topic modeling and Geographic Information System, Cartography and Geographic Information Science, 40(2), 90-102.


QUESTIONS?