Privacy Challenges of Telco Big Data

ITU telco big data workshop

Dr. Günter Karjoth
Lucerne University of Applied Sciences and Arts, Switzerland
June 17, 2014

Mobile phones are great sources of data – but we must be careful about privacy

Sources of Big Data

Big data from cheap phones
  location data (cell, GPS)
  search queries
  e-mails, Twitter, social networks

2012 Swisscom “Ville Vivante”: mobile data from the city of Geneva
  15 million “movements” measurable from 2 million cell phone calls made on the Swisscom system during one day

2 / 15 Dr. Günter Karjoth | Privacy Challenges of Telco Big Data | June 17, 2014 © 2014 Günter Karjoth

Source: villevivante.ch

Telecom Operators’ Benefits

Data Trove – telecom operators increasingly treat customer data as an asset to be mined rather than a mere by-product of running networks

Selling data on mobile users’ locations, movements, and web browsing habits may grow into a multibillion-dollar market

Telecom big data products may be helpful well beyond the realm of advertising:
  for credit card companies wanting to detect fraud,
  for ambulance operators plotting routes to avoid traffic,
  ...

Examples of data disclosure

2012 Orange “Data for Development” Challenge
  2.5B anonymized records from 5 months’ worth of calls made by 5M people in Ivory Coast → “to see what it’s possible to do with the data”

2013 Telefonica O2 “Campus Party”
  Footfall counts for the London Metropolitan Area over the course of 3 weeks

Mobile Phones regarded as “radiolocation bugs”

German politician Malte Spitz published his telecom data from 8/09 to 2/10: 78% of his movements had been captured.

ZEIT ONLINE “enriched” this data set with publicly available records that anyone can easily access (tweets and blogs).

Swiss politician Balthasar Glättli published his telecom data from 1/13 to 8/13.

Sources:

http://www.watson.ch/!533090301

http://www.zeit.de/digital/datenschutz/2011-02/vorratsdaten-malte-spitz

Big Data: Big Concerns

Big data technology enables massive data aggregation beyond what was previously possible

Inference concerns arise even with non-sensitive data

Safeguards

In the field of analytics, customer data can be used responsibly in two ways:

the appropriate customer permissions (consent) must be in place, or

it must be anonymized so that no individual can be identified.

+ In Europe, anonymized data fall outside the scope of data protection legislation (an exception is the French Data Protection Law).

Anonymization of Personal Data

Macro data
  “On Mondays on trajectory X there are 160% more passengers than on Tuesdays.”

Micro data
  Data is at the granularity of individuals!

[Diagram: micro data, researcher]

+ Re-identification risk!

Medical Data Released as Anonymous

SSN  Name  Race   DoB       Sex     ZIP    Status    Health Problems
           asian  09/27/64  female  94139  divorced  hypertension
           asian  09/30/64  female  94139  divorced  obesity
           asian  04/18/64  male    94139  married   chest pain
           asian  04/15/64  male    94139  married   obesity
           black  03/13/63  male    94138  married   hypertension
           black  03/18/63  male    94138  married   shortness of breath
           black  09/13/64  female  94141  married   shortness of breath
           black  09/07/64  female  94141  married   obesity
           white  05/14/61  male    94138  single    chest pain
           white  05/08/61  male    94138  single    obesity
           white  09/15/61  female  94142  widow     shortness of breath

Voter List

Name            Address         City           ZIP    DoB      Sex     Party
Sue J. Carlson  900 Market St.  San Francisco  94142  9/15/61  female  democrat
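The two tables above can be linked on the attributes they share. The following is a minimal sketch in Python, not taken from the slides: it assumes (DoB, Sex, ZIP) acts as the quasi-identifier, and the names `MEDICAL`, `VOTERS`, `norm_dob`, and `link` are illustrative choices.

```python
# Sketch of a linkage attack: join the "anonymous" medical table with the
# public voter list on (DoB, Sex, ZIP). All names here are illustrative.

MEDICAL = [  # (race, dob, sex, zip, status, health problem), as on the slide
    ("asian", "09/27/64", "female", "94139", "divorced", "hypertension"),
    ("asian", "09/30/64", "female", "94139", "divorced", "obesity"),
    ("asian", "04/18/64", "male", "94139", "married", "chest pain"),
    ("asian", "04/15/64", "male", "94139", "married", "obesity"),
    ("black", "03/13/63", "male", "94138", "married", "hypertension"),
    ("black", "03/18/63", "male", "94138", "married", "shortness of breath"),
    ("black", "09/13/64", "female", "94141", "married", "shortness of breath"),
    ("black", "09/07/64", "female", "94141", "married", "obesity"),
    ("white", "05/14/61", "male", "94138", "single", "chest pain"),
    ("white", "05/08/61", "male", "94138", "single", "obesity"),
    ("white", "09/15/61", "female", "94142", "widow", "shortness of breath"),
]

VOTERS = [  # (name, address, city, zip, dob, sex, party)
    ("Sue J. Carlson", "900 Market St.", "San Francisco",
     "94142", "9/15/61", "female", "democrat"),
]


def norm_dob(text):
    """Turn 'MM/DD/YY' into a tuple so '9/15/61' equals '09/15/61'."""
    month, day, year = text.split("/")
    return (int(month), int(day), int(year))


def link(medical, voters):
    """Return {voter name: health problem} for unambiguous matches only."""
    index = {}
    for _race, dob, sex, zip_code, _status, problem in medical:
        index.setdefault((norm_dob(dob), sex, zip_code), []).append(problem)
    matches = {}
    for name, _addr, _city, zip_code, dob, sex, _party in voters:
        candidates = index.get((norm_dob(dob), sex, zip_code), [])
        if len(candidates) == 1:  # exactly one medical record fits
            matches[name] = candidates[0]
    return matches


print(link(MEDICAL, VOTERS))
```

Removing SSN and Name does not help here: the remaining attributes single out one medical record for the listed voter.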

Protecting Microdata Tables

Anonymization techniques transform the values of the quasi-identifier attributes.

Two main anonymization techniques: randomization and generalization.
  09/15/61 → 60-64, or
  94142 → 941**

k-Anonymity – Each release of data must be such that every combination of values of quasi-identifiers can be indistinctly matched to at least k customers.

+ Trade-off between Data Utility and Privacy!

Additional administrative safeguards such as contractual terms may be needed (limited disclosure).
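Generalization and the k-anonymity condition can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the quasi-identifier is taken to be (DoB, Sex, ZIP) from the medical table shown earlier, dates are generalized to five-year bands and ZIP codes to their first three digits as in the two examples above, and the names `ROWS`, `generalize`, and `k_of` are hypothetical.

```python
from collections import Counter

# Quasi-identifier columns (DoB, Sex, ZIP) of the medical table from the
# earlier slide. The choice of quasi-identifier is an assumption.
ROWS = [
    ("09/27/64", "female", "94139"), ("09/30/64", "female", "94139"),
    ("04/18/64", "male", "94139"), ("04/15/64", "male", "94139"),
    ("03/13/63", "male", "94138"), ("03/18/63", "male", "94138"),
    ("09/13/64", "female", "94141"), ("09/07/64", "female", "94141"),
    ("05/14/61", "male", "94138"), ("05/08/61", "male", "94138"),
    ("09/15/61", "female", "94142"),
]


def generalize(dob, sex, zip_code):
    """Coarsen DoB to a five-year band and ZIP to a three-digit prefix."""
    year = int(dob.split("/")[2])
    low = year - year % 5
    return (f"{low:02d}-{low + 4:02d}", sex, zip_code[:3] + "**")


def k_of(rows):
    """Size of the smallest group of rows sharing the same values."""
    return min(Counter(rows).values())


k_raw = k_of(ROWS)                                   # every row is unique
k_generalized = k_of([generalize(*r) for r in ROWS])
print(k_raw, k_generalized)
```

In this sketch the generalized table keeps only a birth band, sex, and ZIP prefix per row, which shows the utility cost paid for the larger group size.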

Data Types and Uniqueness

Relational Data

Transactional (set-valued) Data

Sequential Data

Trajectories

Graphs

Text

How much personal data is needed for unique re-identification:

(year of birth, sex, 3-digit ZIP code) → 0.04% of the American population

(date of birth, sex, 5-digit ZIP code) → 63–87% of the American population

2 spatio-temporal points → 50%

4 spatio-temporal points → 95%

+ Re-identification requires outside information!
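How uniqueness grows with the number of known spatio-temporal points can be sketched in Python. The traces below are hypothetical toy data, not the data behind the figures above, and `unique_fraction` enumerates every choice of p points exhaustively, which is a simplification chosen for this example rather than the method of the credited study.

```python
from itertools import combinations

# Hypothetical toy traces: user -> set of (cell, hour) observations.
TRACES = {
    "u1": {("A", 8), ("B", 9), ("C", 18)},
    "u2": {("A", 8), ("B", 9), ("D", 18)},
    "u3": {("A", 8), ("E", 12), ("C", 18)},
    "u4": {("F", 7), ("B", 9), ("C", 18)},
}


def unique_fraction(traces, p):
    """Fraction of all p-point samples that match exactly one user."""
    unique = 0
    total = 0
    for user, points in traces.items():
        for known in combinations(sorted(points), p):
            # Users whose trace contains every point the attacker knows.
            holders = [u for u, pts in traces.items() if set(known) <= pts]
            total += 1
            if holders == [user]:
                unique += 1
    return unique / total


for p in (1, 2, 3):
    print(p, unique_fraction(TRACES, p))
```

The known points play the role of the outside information: without them, the traces cannot be tied to a person.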

Credit: Yves-Alexandre de Montjoye, et al.

Anonymization of Trajectory Data

ID  Trajectory                 Status      ...
1   〈b2 → d3 → c4 → f6 → c7〉   On-welfare  ...
2   〈f6 → c7 → e8〉             Student     ...
3   〈d3 → c4 → f6 → e8〉        Retired     ...
4   〈b2 → c5 → c7 → e8〉        Student     ...
5   〈d3 → c7 → e8〉             Retired     ...
6   〈c5 → f6 → e8〉             Full-time   ...
7   〈b2 → f6 → c7 → e8〉        Full-time   ...
8   〈b2 → c5 → f6 → c7〉        On-welfare  ...

high dimensionality

data sparseness

sequential

LKC-Privacy: Every subsequence of length at most L of any trajectory is shared by at least K records, and the confidence of inferring any sensitive value from it is not greater than C.
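The LKC-privacy condition can be checked mechanically on the trajectory table above. The Python sketch below is one possible reading, with several assumptions: each entry such as b2 is treated as one location-time point, subsequences need not be contiguous, and “On-welfare” is treated as the only sensitive value. The names `RECORDS`, `is_subsequence`, and `lkc_violations` are illustrative.

```python
from itertools import combinations

# Trajectory table from the slide: (trajectory, status).
RECORDS = [
    (("b2", "d3", "c4", "f6", "c7"), "On-welfare"),
    (("f6", "c7", "e8"), "Student"),
    (("d3", "c4", "f6", "e8"), "Retired"),
    (("b2", "c5", "c7", "e8"), "Student"),
    (("d3", "c7", "e8"), "Retired"),
    (("c5", "f6", "e8"), "Full-time"),
    (("b2", "f6", "c7", "e8"), "Full-time"),
    (("b2", "c5", "f6", "c7"), "On-welfare"),
]


def is_subsequence(q, path):
    """True if the points of q occur in path in the same order."""
    remaining = iter(path)
    return all(point in remaining for point in q)


def lkc_violations(records, L, K, C, sensitive):
    """List (subsequence, reason, value) for every violated condition."""
    candidates = set()
    for path, _status in records:
        for n in range(1, L + 1):
            candidates.update(combinations(path, n))
    violations = []
    for q in sorted(candidates):
        statuses = [s for path, s in records if is_subsequence(q, path)]
        if len(statuses) < K:  # too few records share q
            violations.append((q, "support", len(statuses)))
            continue
        for value in sorted(sensitive):
            confidence = statuses.count(value) / len(statuses)
            if confidence > C:  # q reveals the sensitive value too reliably
                violations.append((q, value, confidence))
    return violations


print(lkc_violations(RECORDS, 1, 2, 0.5, {"On-welfare"}))
print(len(lkc_violations(RECORDS, 2, 2, 0.5, {"On-welfare"})))
```

Under these assumptions the table passes for L = 1 but fails for L = 2, for example because the pair b2, d3 occurs in a single record only.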

Anonymization of Trajectories – An evolving research field

Source: Abul, Bonchi & Nanni: “Never Walk Alone: Uncertainty for Anonymity in Moving Objects Databases”, 2009.

Prepare to share but be aware

Spatio-temporal data is highly unique

Increasing public availability of other datasets (‘Open Data’ movement)

What is needed:

Appropriate anonymization techniques

+ Do anonymization algorithms scale?

Additional administrative safeguards such as internal controls and contractual terms

Telco operators have to weigh legal and reputational risk against business opportunity.

When disclosing data, what matters is not how sensitive the data is to us but how characteristic it is. It is the latter that determines the effort needed to link it with other data and uncover our identity.

Thank You for Your Attention

Lucerne University of Applied Sciences and Arts

Competence Center Information Security

Dr. Günter Karjoth

Lecturer on Information Security and Privacy

Zentralstrasse 9, CH-6002 Luzern

T: +41 41 228 99 78

[email protected]

Credit: Jorge Stolfi
