Big but personal (meta)data
How Human Behavior Bounds Privacy and What We Can Do About It
Yves-Alexandre de Montjoye @yvesalexandre MIT Media Lab
De-identification
Entire country of 1.5 million people
Our behavior is unique enough
4 points
Identify 95% of people
de Montjoye, Y. A., Hidalgo, C. A., Verleysen, M., & Blondel, V. D. (2013). Unique in the Crowd: The privacy bounds of human mobility. Scientific Reports, 3, 1376.
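To make the "4 points" claim concrete, here is a minimal sketch of how uniqueness could be estimated on a toy dataset. It is not the paper's code: the function name `estimate_unicity`, the `(antenna, hour)` point format, and the data are all assumptions made for illustration, and the toy result says nothing about the 95% figure.

```python
import random


def estimate_unicity(traces, num_points=4, seed=0):
    """Fraction of users whose trace is the only one containing
    `num_points` points drawn at random from that same trace.

    `traces` maps a (hypothetical) user id to a set of (antenna, hour) points.
    """
    rng = random.Random(seed)
    unique = 0
    tested = 0
    for user, points in traces.items():
        if len(points) < num_points:
            continue  # not enough points to draw from
        tested += 1
        # Points an observer is assumed to know about this user.
        known = set(rng.sample(sorted(points), num_points))
        # Every user whose trace contains all the known points.
        matches = [u for u, pts in traces.items() if known <= pts]
        if matches == [user]:
            unique += 1
    return unique / tested if tested else 0.0


# Toy traces, invented for illustration only.
traces = {
    "a": {("A1", 8), ("A2", 9), ("A3", 18), ("A1", 22)},
    "b": {("A1", 8), ("A2", 9), ("A4", 18), ("A1", 22)},
    "c": {("A5", 7), ("A2", 9), ("A3", 18), ("A6", 23)},
}

print(estimate_unicity(traces, num_points=4))
```

In this sketch a user counts as identified when no other trace contains the same known points, so no name or phone number is involved at any step.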
What it means
1. It is possible to re-identify mobile phone metadata (even if there is no name or phone number)
Estimating Privacy
[Figure axes: spatial resolution, temporal resolution, number of points]
What it means
2. It is not simply a question of coarsening the data (we’d just need a few more points)
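The coarsening point can be illustrated with a small sketch. Everything here is an assumption for illustration: the `coarsen` and `matching_users` helpers, the antenna-to-region mapping, and the toy traces. It only shows the mechanism: after coarsening, one known point no longer singles a user out, but one more point does.

```python
def coarsen(traces, antenna_to_region, hours_per_bin):
    """Lower the resolution: antennas become regions, hours become time bins."""
    return {
        user: {(antenna_to_region[a], h // hours_per_bin) for a, h in points}
        for user, points in traces.items()
    }


def matching_users(traces, known_points):
    """Users whose trace contains every known point."""
    return sorted(u for u, pts in traces.items() if set(known_points) <= pts)


# Toy data, invented for illustration only.
traces = {
    "a": {("A1", 8), ("A2", 9), ("A3", 18)},
    "b": {("A1", 8), ("A2", 10), ("A4", 18)},
}
antenna_to_region = {"A1": "R1", "A2": "R1", "A3": "R2", "A4": "R3"}

coarse = coarsen(traces, antenna_to_region, hours_per_bin=6)

# One coarse point matches both users...
print(matching_users(coarse, {("R1", 1)}))
# ...but adding a second coarse point isolates user "a" again.
print(matching_users(coarse, {("R1", 1), ("R2", 3)}))
```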
Predicting personality using metadata
de Montjoye, Y. A., Quoidbach, J., Robic, F., & Pentland, A. S. (2013). Predicting personality using novel mobile phone-based metrics. In Social Computing, Behavioral-Cultural Modeling and Prediction (pp. 48-55). Springer Berlin Heidelberg.
What it means
3. It is not “just” metadata or what is directly visible in the data (e.g. one might use it to predict your personality)
Eagle, N., de Montjoye, Y. A., & Bettencourt, L. M. (2009). Community computing: Comparisons between rural and urban societies using mobile phone data. IEEE Computational Science and Engineering.
We should use this data
Deville, P. et al. (2014). Dynamic population mapping using mobile phone data. Proceedings of the National Academy of Sciences, 201408439.
Wesolowski, A., Eagle, N., Tatem, A. J., Smith, D. L., Noor, A. M., Snow, R. W., & Buckee, C. O. (2012). Quantifying the impact of human mobility on malaria. Science, 338(6104), 267-270.
(but in a privacy-conscientious way)
We should use this data
by: understanding what the real risks are
and designing solutions
Privacy-conscientious anonymization
de Montjoye, Y. A., Smoreda, Z., Trinquart, R., Ziemlicki, C., & Blondel, V. D. (2014). D4D-Senegal: The Second Mobile Phone Data for Development Challenge. arXiv preprint arXiv:1407.4885.
e.g. 2-week mobility traces of 27 x 300,000 individuals + Bandicoot’s behavioral indicators
Online systems: from privacy to security
openPDS/SafeAnswers:
- Only shares answers, not raw data
- Security mechanisms
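The "answers, not raw data" idea can be sketched as follows. This is not the openPDS API: the `PersonalDataStore` class, its methods, and the example question are hypothetical names chosen for illustration, and the sketch leaves out the security mechanisms a real system would need.

```python
class PersonalDataStore:
    """Hypothetical store: raw records stay inside, only answers leave."""

    def __init__(self, records):
        self._records = list(records)  # raw metadata, never returned directly
        self._questions = {}           # approved question name -> function

    def register_question(self, name, func):
        """Approve a question that may be computed over the raw records."""
        self._questions[name] = func

    def answer(self, name):
        """Run an approved question and return only its answer."""
        if name not in self._questions:
            raise KeyError(f"question not approved: {name}")
        return self._questions[name](self._records)


def mostly_active_at_night(records):
    """Example question: is more than half of the activity at night?"""
    night = sum(1 for r in records if r["hour"] >= 22 or r["hour"] < 6)
    return night > len(records) / 2


# Toy records, invented for illustration only.
records = [
    {"antenna": "A1", "hour": 23},
    {"antenna": "A2", "hour": 2},
    {"antenna": "A1", "hour": 14},
]

pds = PersonalDataStore(records)
pds.register_question("mostly_active_at_night", mostly_active_at_night)
print(pds.answer("mostly_active_at_night"))
```

A service calling `answer` receives a single boolean rather than the location records. Nothing in this sketch stops an approved question from returning the records themselves, so in practice the questions would have to be vetted and run under the kind of security mechanisms the slide mentions.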
de Montjoye, Y. A., Wang, S. S., & Pentland, A. S. (2012). On the Trusted Use of Large-Scale Personal Data. IEEE Data Engineering Bulletin, 35(4).
de Montjoye, Y. A., Shmueli, E., Wang, S. S., & Pentland, A. S. (2014). openPDS: Protecting the Privacy of Metadata through SafeAnswers. PLoS ONE, 9(7), e98790.