Reality Mining, (Big Data) and Urban Sensing
Darshan Santani
ETH Zurich
15 April 2010
Trivia
Taxi Observations* by Location and Booking Frequency of Zone in Singapore [1]
* Sampled dataset (~10,000 observations)
Outline
• Reality Mining
• Applications
• Holy Grail!
• Challenges
• Discussion and Q & A
Reality Mining Study [2]
• “ … collection and analysis of machine-sensed environmental data pertaining to human social behavior, with the goal to identify predictable patterns of future human behavior …”
• .. extracting information from real world sensor data …
• Reality Mining vs. Data Mining
• Nathan Eagle, Alex (Sandy) Pentland, MIT, 2005
• 100 mobile phones, 9 months, 45,000 hours of communication logs, location and proximity data
Key Results
Social Network Analysis in the wild! [3]
Why do we care?
• Social Science
  – Social Network Analysis
  – Behavioral Modeling
  – Human Mobility
• Systems Research
  – Transportation
  – Environmental Modeling
  – Healthcare
Enabled Applications
Human Mobility Patterns using call detail records (CDRs) [4]
Why do we care (again)?
• Social Science
  – Social Network Analysis
  – Behavioral Modeling
  – Human Mobility
• Systems Research
  – Transportation
  – Environmental Modeling
  – Healthcare
Enabled Applications (contd.)
Environmental Monitoring - NoiseTube [5]
Real-time Traffic Monitoring [6]
Mobile Millennium, UC Berkeley [7]
100 probe vehicles, each carrying a GPS-enabled Nokia N95 phone
San Francisco Bay Area, California
Virtual Trip Lines (VTL)
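The virtual trip line idea can be illustrated with a small geometric check. The sketch below is only an illustration, not the Mobile Millennium implementation: it assumes a trip line is a short line segment in planar (x, y) coordinates and that a phone reports a measurement whenever the path between two consecutive GPS fixes crosses that segment. The function names and the sample coordinates are hypothetical.

```python
def _orient(p, q, r):
    """Signed area test: >0 if r is left of the directed line p->q, <0 if right."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def crosses(trip_line, prev_fix, curr_fix):
    """Return True if the move prev_fix -> curr_fix properly crosses trip_line.

    trip_line is a pair of (x, y) endpoints; fixes are (x, y) tuples.
    Touching an endpoint or moving along the line does not count as a crossing.
    """
    a, b = trip_line
    d1 = _orient(a, b, prev_fix)
    d2 = _orient(a, b, curr_fix)
    d3 = _orient(prev_fix, curr_fix, a)
    d4 = _orient(prev_fix, curr_fix, b)
    return d1 * d2 < 0 and d3 * d4 < 0


def crossing_indices(trip_line, trace):
    """Indices i such that the step trace[i-1] -> trace[i] crosses trip_line."""
    return [
        i for i in range(1, len(trace))
        if crosses(trip_line, trace[i - 1], trace[i])
    ]


# Hypothetical data: a vertical trip line across a road running along the x-axis.
trip_line = ((0.0, -1.0), (0.0, 1.0))
trace = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
hits = crossing_indices(trip_line, trace)
```

With this design only the fixes around a crossing need to leave the phone, rather than the full GPS trace.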
Holy Grail!
• Urban Planning and Management
  – Real-time city
    • Are the sidewalks along the Bellevue lake good for jogging today, given the air and noise pollution levels?
  – Macroscopic view
    • Is there a need to run supplementary tram services (or send an additional fleet of taxis) towards the end of a soccer match between Switzerland and Germany?
  – Emergency/Crisis Response
    • 2008 Mumbai terrorist attacks
  – Disease Outbreak
Selective Information Broadcasting [1]
Booking Frequency by second
Holy Grail!
• Urban Planning and Management
  – Real-time city
    • Are the sidewalks along the Bellevue lake good for jogging today, given the air and noise pollution levels?
  – Macroscopic vs. Microscopic
    • Is there a need to run supplementary tram services (or send an additional fleet of taxis) towards the end of a soccer match between Switzerland and Germany?
  – Emergency/Crisis Response
    • 2008 Mumbai terrorist attacks
  – Disease Outbreak/Epidemic Modeling
Challenge #1: Big Data
• How big is big enough?
  – Wal-Mart: 100-400 GB/day of RFID data [8]
  – LHC: 40 TB/day [9]
• Storage is cheap!
• Stream data mining
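As one concrete example of a stream-mining primitive, the sketch below shows reservoir sampling (Algorithm R), which keeps a uniform random sample of fixed size from a stream whose length is not known in advance, using memory proportional to the sample size only. It is a generic illustration chosen here, not a method taken from the talk; the function name and the integer stream are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable.

    The stream is read once and only k items are held in memory.
    rng is an optional random.Random instance, useful for reproducible runs.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream: 100,000 observation ids, sampled down to 10.
sample = reservoir_sample(range(100_000), 10, random.Random(42))
```

The same function works unchanged on a generator that reads sensor records from a socket or a log file, since it never needs the stream length.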
Challenge #2: Abstraction
• Low-level details
  – Parallelism!
  – Task distribution
  – Load balancing
  – Fault tolerance
• Programming Productivity
• Google’s MapReduce
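To make the abstraction concrete, here is a minimal single-process sketch of the MapReduce programming model. It is not Google's implementation: a real framework would run the map and reduce functions on many machines and take care of task distribution, load balancing and fault tolerance itself. The booking records, zone names and function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_booking(record):
    """Map step: emit (zone, 1) for every booking record (zone, timestamp)."""
    zone, _timestamp = record
    yield (zone, 1)


def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one zone."""
    return key, sum(values)


def map_reduce(records, mapper, reducer):
    """Run mapper over all records, group by key, then apply reducer per key."""
    mapped = chain.from_iterable(mapper(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in grouped.items())


# Hypothetical input: (zone, timestamp in seconds) booking records.
bookings = [("zone_a", 1), ("zone_b", 2), ("zone_a", 3)]
booking_counts = map_reduce(bookings, map_booking, reduce_counts)
```

The programmer writes only `map_booking` and `reduce_counts`; everything in `map_reduce` stands in for what the framework hides.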
Challenge #3: Privacy(!)
• A “new deal” on data? [10]
  – Right to possess your data
  – Right to control the use of your data
  – Right to distribute or dispose of your data
• How thin or thick is the line between publicity and privacy?
• Trivia again!
  – Erica is travelling to Helsinki in May 2010?
  – Florian and Stephan visited Brussels in February 2010?
Big Money!
IBM Smarter Planet
HP CeNSE
Q & A
Takeaway Message
The last five years have spurred an industrial revolution in sensor data. I believe that applying empirical (and, later, computational) methodologies to this real-world data will help us better understand the underlying cognitive, social, policy and engineering issues present in our socio-technical systems. Reality Mining, which sits at the intersection of computer science, statistics and social science, fits this role nicely.
References
1. D. Santani, R. K. Balan, and C. J. Woodard. Understanding and Improving a GPS-based Taxi System. In 6th USENIX International Conference on Mobile Systems, Applications, and Services (MobiSys), Breckenridge, Colorado, June 2008.
2. N. Eagle and A. (Sandy) Pentland. Reality mining: sensing complex social systems. Personal and Ubiquitous Computing, 10(4):255–268, 2006.
3. N. Eagle, A. S. Pentland, and D. Lazer. Inferring friendship network structure by using mobile phone data. Proceedings of the National Academy of Sciences, 106(36):15274–15278, 2009.
4. C. Song, Z. Qu, N. Blumm, and A.-L. Barabási. Limits of Predictability in Human Mobility. Science, 327(5968):1018–1021, 2010.
5. N. Maisonneuve, M. Stevens, M. E. Niessen, and L. Steels. NoiseTube: Measuring and mapping noise pollution with mobile phones. In I. N. Athanasiadis, P. A. Mitkas, A. E. Rizzoli, and J. M. Gómez, editors, ITEE, pages 215–228. Springer, 2009.
6. J. Yoon, B. Noble, and M. Liu. Surface street traffic estimation. In MobiSys ’07: Proceedings of the 5th International Conference on Mobile Systems, Applications and Services, pages 220–232, New York, NY, USA, 2007.
7. J. C. Herrera, D. B. Work, R. Herring, X. J. Ban, and A. M. Bayen. Evaluation of traffic data obtained via GPS-enabled mobile phones: the Mobile Century field experiment. Working Paper, UCB-ITS-VWP-2009-8, August 2009.
8. I. Alexander, G. Andrea, M. Florian, and E. Fleisch. Estimating data volumes of RFID-enabled supply chains. In AMCIS 2009 Proceedings, page 636, 2009.
9. CERN LHC Computing. http://public.web.cern.ch/public/en/LHC/Computing-en.html, April 2010.
10. A. (Sandy) Pentland. Reality Mining for Companies. In O’Reilly Where 2.0 Conference, May 19-21, San Jose, CA, 2009.