Big Data for PPDR:

Ethical & Privacy Challenges

Critical Communications World Amsterdam, 1 June 2016

ROBERT HORVITZ

“Big Data refers to the exponential growth both in the availability & automated use of information… by corporations, governments & other large organisations…”

---“Big Data & Smart Devices & their Impact on Privacy,” study for the European Parliament’s LIBE Committee (2015)

• Global public safety & security market worth $393 billion in 2016*

• PPDR spending on ICT services & equipment = $220.82 billion in 2015

– Predicted to reach $370.64 billion in 2020 = 10.9% CAGR**

* Markets&Markets estimate

** Homeland Security Research estimate
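
The growth rate quoted above can be checked from the two spending figures. The following is a minimal sketch, assuming the standard compound-annual-growth-rate formula over the five years from 2015 to 2020; the function name `cagr` is illustrative and not taken from the cited estimate.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.109 means 10.9%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the slide, in billions of US dollars.
rate = cagr(220.82, 370.64, 2020 - 2015)
print(f"{rate:.1%}")  # prints 10.9%
```

The result agrees with the 10.9% CAGR stated on the slide.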

Not only data is big

Law enforcement databases in EU Member States

• Not reported in the European Sourcebook of Crime and Criminal Justice Statistics (HEUNI, 2014)

• 78 single-country crime-related databases found by RAND Europe in 2011*

– RAND noted the absence of a more comprehensive listing & problem of deciding what to include. They did not include “smart aggregations” [dataset fusions with automated search or AI]

* RAND Europe, “Development of a European Crime Report” (for DG HOME)

Schengen Information System II (SIS II)

• Managed now by eu-LISA

• Believed to be Europe’s largest law enforcement database (~50 million records)

• Facial recognition data recently added

• eu-LISA also provides access to the Eurodac fingerprint database for asylum seekers (~2.8 million records)

SIS II

“…the addition of new functions (in the case of SIS II) and new purposes (both for SIS II and Eurodac), and the features of forthcoming databases (EES and RTP), have blurred the distinction between border control, counter terrorism, the fight against transnational crimes and migration management… The original collection of data for a well-defined and narrow purpose has been combined with further storage for new purposes…”

---“Profiling: Protecting citizens’ rights, fighting illicit profiling” (study for DG JUSTICE)

US Police Data Initiative

• Launched by the White House in 2015

• Promotes public release of data related to law enforcement

• ~2,400 datasets released so far…

– Mainly statistics, annual reports, “lessons learned”

• …out of ~10,000 maintained by law enforcement agencies, nationally & locally

http://hallofjustice.sunlightfoundation.com/

Transparency

• Difficult to balance data subjects’ rights with the need to protect crime investigations

• BUT IT MUST BE DONE

• Because the evolution of surveillance technology & growth of analytical capabilities are shifting power toward the state

One cannot demand all personal activities & assets to be transparent while arguing the need for state activities & assets to be opaque

, Statistics in focus 18/2013

• Proliferating use of video & real-time sensors

• Picture quality improvements (migration from B/W slowscan to color-HD to 4K to THz)

• Spread of body-cams, road-cams, car-cams, public space surveillance nets

• “Paperwork” becoming digital/electronic

• “Internet of Things” making environmental impacts of individual behaviours easier to monitor, record – and attribute.

If crime is trending downward, what is driving law enforcement dataset growth?

• Because a growing number of human activities leave traces in the environment which can be automatically sensed, the distinction between “personal” data & “Internet of Things” data is becoming uncertain

• Obtaining “informed consent” of data subjects is difficult when the data is collected automatically by sensors that are not specifically targeted and subjects are not aware of being sensed

• The new EU Directive 2016/680 contains several broad derogations for law enforcement, but the conditions for applying them need deeper analysis and probably evaluation by the ECHR and EDPB

“The old model of anonymity doesn’t seem to be the right model when we are talking about large-scale metadata.”

---Yves-Alexandre de Montjoye, the study’s lead author, quoted in Natasha Singer, “With a Few Bits of Data, Researchers Identify ‘Anonymous’ People,” New York Times, 29 January 2015

A study of anonymised credit card transactions by 1.1 million people in 10,000 stores found that an average of 4 random pieces of information were enough to re-identify 90% of card users unambiguously
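
The kind of measurement behind this finding can be illustrated on toy data. The sketch below is not the study's code or data: it assumes each user's record is a set of (shop, day) points, generates purely synthetic traces, and estimates what fraction of users are singled out by `k` randomly chosen points from their own record. The names `unicity` and `synthetic_traces` and all parameter values are hypothetical.

```python
import random


def unicity(traces, k, seed=0):
    """Fraction of users whose k randomly drawn points match no other user's trace."""
    rng = random.Random(seed)
    eligible = [user for user, points in traces.items() if len(points) >= k]
    unique = 0
    for user in eligible:
        # The k pieces of outside knowledge an observer is assumed to hold.
        known = set(rng.sample(sorted(traces[user]), k))
        # Count every trace (including the user's own) that contains all of them.
        matches = sum(1 for points in traces.values() if known <= points)
        if matches == 1:
            unique += 1
    return unique / len(eligible) if eligible else 0.0


def synthetic_traces(n_users=500, n_shops=50, n_days=30, n_purchases=20, seed=1):
    """Purely synthetic (shop, day) purchase points; not real transaction data."""
    rng = random.Random(seed)
    return {
        user: {(rng.randrange(n_shops), rng.randrange(n_days))
               for _ in range(n_purchases)}
        for user in range(n_users)
    }


traces = synthetic_traces()
for k in (1, 2, 3, 4):
    print(k, round(unicity(traces, k), 3))
```

On data like this the share of uniquely matched users rises quickly with `k`, even though no name or account number appears anywhere in the traces. The exact figures depend entirely on the assumed parameters and say nothing about the real dataset.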

“In the context of Big Data, it is crucial to note that the processing of non-sensitive data can lead, through data mining, to the generation of data that reveals sensitive information.”

---“Big Data and Smart Devices and their Impact on Privacy,” study for the European Parliament’s LIBE Committee (2015)

Biometric data: increasingly fit for purpose

• “Automated fingerprint algorithms [for matching & identification] have improved by an order of magnitude in the past ten years...

• “Identification based on facial recognition is accurate in controlled conditions in 99 cases out of 100...

• “A clear trend towards increasing fusion of such biometric modalities to improve performance and accuracy is emerging… In terms of law enforcement, the prevailing trend is towards multi-biometrics…”

---Biometrics in Large-scale IT, eu-LISA (2015), http://www.eulisa.europa.eu/Publications/Reports/Biometrics%20in%20Large-Scale%20IT.pdf

“Is big data personal data?”

“More often than not, yes. It is personal data. Businesses and governments want to know what people are doing… Even better, they want to anticipate their behaviour. Ideally, they would like to change their behaviour.”

--- Giovanni Buttarelli, European Data Protection Supervisor, “Living in a future Big Data world: can prosperity, freedom and fundamental rights be reconciled?” speech delivered on 27 February 2016

Privacy protection considerations

EU requirements for handling “personal” data

• “Personal data” according to Directive 95/46/EC is “any information relating to an identified or identifiable natural person ('data subject')”

• Particularly for “data controllers,” different requirements apply to the handling of “personal data” than to nonpersonal data

• Shrinking cost & size of electronic sensors, processors, recorders & storage

[Charts: hard drive cost per gigabyte; processor cost per MIPS; RAM price per MB; average price of image sensors & accelerometers]

What drives dataset growth?

There’s a conundrum here…

• The decreasing cost of storage & processing encourages retention of useless & obsolete data, because the relative cost of discarding data that could someday prove useful grows ever larger.

• Thus the relevance & reliability (the “quality”) of stored data gradually decreases as the quantity increases.

• Integrating diverse datasets & applying machine-learning/neural-net algorithms can show patterns & relationships not evident in individual datasets.

• But it can also add “knowledge” of uncertain validity, based on unknown rules & variables, enlarging the scope for speculation, suspicion & “false positive” identifications.

Machine Learning

• “Complex and often inscrutable, even at times to their programmers, machine learning models are starting to be used in areas such as credit offers, entrepreneurial funding or hiring. As these methods continue to advance, it may become more difficult to explain or account for the decisions machines make through this process unless mechanisms are built into their designs to ensure accountability.”

---Office of the US President, Big Data: A Report on Algorithmic Systems, Opportunity & Civil Rights (May 2016), https://www.whitehouse.gov/sites/default/files/microsites/ostp/2016_0504_data_discrimination.pdf

From “Spurious Correlations” by Tyler Vigen http://www.tylervigen.com/spurious-correlations
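
The risk of such chance patterns grows with the number of datasets compared. As a rough illustration, the sketch below generates unrelated random series and reports the strongest pairwise correlation found among them. It is a toy demonstration under assumed parameters (200 series of 10 points each), not a reconstruction of any chart on the site cited above, and the function names are hypothetical.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_series=200, n_points=10, seed=0):
    """Largest |r| among all pairs of independent random series."""
    rng = random.Random(seed)
    series = [[rng.gauss(0, 1) for _ in range(n_points)] for _ in range(n_series)]
    return max(abs(pearson(a, b)) for a, b in combinations(series, 2))


print(round(strongest_chance_correlation(), 2))
```

With 200 series there are 19,900 pairs to compare, so some pairs correlate strongly by chance alone even though every series is independent noise. Fusing many datasets and searching them for patterns carries the same hazard.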

• 4 May 2016: New EU General Data Protection Regulation 2016/679 published in Official Journal

– To be implemented by Member States by 25 May 2018 (2-year transition period)

• Does not apply to the processing of personal data for “national security” or “prevention, investigation, detection or prosecution of criminal offences or… prevention of threats to public security”

• These are covered by Directive 2016/680, also published on 4 May

– EU Member States must adopt it by 6 May 2018

EU Directive 2016/680 - Protection of personal data in the context of law

enforcement & criminal justice

• Replaces Framework Decision 2008/977/JHA

• Probably the most important change is the further harmonisation of privacy protection rules so country of origin no longer determines which rules apply

• Also important that the European Data Protection Supervisor’s office is expanding to an independent board and replacing the Article 29 Working Party

Data management challenges

• Data subject rights vs data possessor capabilities

– When is identity “knowable”?

• Common vs heterogeneous, application specific & proprietary data standards

• Metadata tags to regulate processor access, use and re-use?

• Fairness & privacy protection by design
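
One way to picture the metadata-tag question above is a record that carries its permitted purposes with it. The following is a minimal sketch only: the class `TaggedRecord`, the purpose labels and the helper functions are all hypothetical, and nothing here reflects an existing standard or any system described in this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedRecord:
    payload: str                 # the data itself (placeholder text here)
    collected_for: str           # original purpose of collection
    allowed_purposes: frozenset  # purposes for which re-use is permitted


def may_process(record: TaggedRecord, purpose: str) -> bool:
    """Allow processing only for a purpose listed in the record's tags."""
    return purpose in record.allowed_purposes


def filter_for_purpose(records, purpose):
    """Return only the records a processor may use for the stated purpose."""
    return [r for r in records if may_process(r, purpose)]


records = [
    TaggedRecord("entry A", "border_control", frozenset({"border_control"})),
    TaggedRecord("entry B", "border_control",
                 frozenset({"border_control", "crime_investigation"})),
]
usable = filter_for_purpose(records, "crime_investigation")
print([r.payload for r in usable])  # prints ['entry B']
```

The design choice in this sketch is that the tag travels with the data, so a check on re-use for a new purpose can be made at the point of access rather than relying on the processor's own records.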

Thanks for your time!

Robert Horvitz

Senior Consultant

SCF Associates Ltd

Horvitz@volny.cz
