The Ethics of Urban Big Data
and Smart Cities
Prof. Rob Kitchin
Maynooth University
Data and the city
• Rich history of data being generated about cities
• Urban data are a key input for understanding city life, solving urban problems, formulating policy and plans, guiding operational governance, modelling possible futures, and tackling a diverse set of other issues
• For as long as data have been generated about cities, various kinds of data-informed urbanism have been occurring
• Data-informed urbanism is increasingly being complemented and replaced by data-driven, networked urbanism
• Post-Millennium, the urban data landscape is being transformed, moving from small to big data
Urban big data
• Directed
o Surveillance: CCTV, drones/satellite
o Scaled public admin records
• Automated
o Automated surveillance
o Digital devices
o Sensors, actuators, transponders, meters (IoT)
o Interactions and transactions
• Volunteered
o Social media
o Sousveillance/wearables
o Crowdsourcing/neogeography
o Citizen science
Urban big data
• Diverse range of public and private generation of fine-scale (uniquely indexical) data about citizens and places in real-time:
o utilities
o transport providers
o environmental agencies
o mobile phone operators
o social media sites
o travel and accommodation websites
o home appliances and entertainment systems
o financial institutions and retail chains
o private surveillance and security firms
o remote sensing, aerial surveying
o emergency services
• Producing a data deluge that can be combined, analyzed, acted upon
Single systems
Integrated, city & sector wide
Data-driven, networked urbanism
www.dublindashboard.ie
Data-driven, networked urbanism
• Cities are becoming ever more instrumented and
networked, their systems interlinked and integrated
• Consequently, cities are becoming knowable and
controllable in new dynamic ways
• Urban operational governance and city services are becoming highly responsive to a form of networked urbanism in which big data systems are:
o prefiguring and setting the urban agenda
o producing a deluge of contextual and actionable data
o influencing and controlling how city systems respond and perform in real-time
Creating smart cities
• Tackle pressing issues
• New forms of operational governance
• More efficient, competitive and productive service delivery
• Increase resilience and sustainability
• More transparency and accountability
• Enhance participation in city life and quality of life
• Stimulate creativity, innovation, entrepreneurship and
economic growth
• Improve models and simulations for future development
Eight critiques of smart cities
• City as a knowable, rational, steerable machine
• Ahistorical, aspatial and homogenizing
• Technocratic governance and solutionism
• Corporatisation of governance
• Serve certain interests and reinforce inequalities
• The politics of urban data
• Social, political, ethical effects
• Buggy, brittle, hackable urban systems
The politics of urban data
• Big data and dashboards are not simply technical tools
• Nor are they pragmatic, neutral, objective or non-ideological; nor can they speak for themselves
• Data do not exist independently of the ideas, instruments, practices, contexts, knowledges and systems used to generate, process & analyze them
• Big data and dashboards express a normative notion about what should be measured, for what reasons, and what they should tell us
• And they have normative effect - being used to influence decision-making, modify institutional behaviour, condition workers, etc
The politics of urban data
Digital socio-technical assemblage
• System/process performs a task
o Material Platform (infrastructure – hardware)
o Code Platform (operating system)
o Code/algorithms (software)
o Data(base)
o Interface
o Reception/Operation (user/usage)
• Context frames the system/task
o Systems of thought
o Forms of knowledge
o Finance
o Political economies
o Governmentalities & legalities
o Organisations and institutions
o Subjectivities and communities
o Marketplace
o Places
o Practices
Ethics of data-driven urbanism
• Data-driven, networked urbanism raises all kinds of ethical & related questions
• Data ownership and control
• Data integration and data markets
• Data security and integrity
• Dataveillance and privacy
• Data quality and provenance
• Data uses
Privacy and big urban data
• Privacy debates concern acceptable practices with regard to accessing and disclosing personal and sensitive information about a person:
o identity privacy (to protect personal and confidential data);
o bodily privacy (to protect the integrity of the physical person);
o territorial privacy (to protect personal space, objects and property);
o locational and movement privacy (to protect against the tracking of spatial behaviour);
o communications privacy (to protect against the surveillance of conversations and correspondence);
o transactions privacy (to protect against monitoring of queries/searches, purchases, and other exchanges)
A Taxonomy of Privacy Harms (compiled from Solove 2006)
Domain, then Privacy breach: Description
• Information Collection
o Surveillance: Watching, listening to, or recording of an individual’s activities
o Interrogation: Various forms of questioning or probing for information
• Information Processing
o Aggregation: The combination of various pieces of data about a person
o Identification: Linking information to particular individuals
o Insecurity: Carelessness in protecting stored information from leaks and improper access
o Secondary Use: Use of information collected for one purpose for a different purpose without the data subject’s consent
o Exclusion: Failure to allow the data subject to know about the data that others have about her and participate in its handling and use, including being barred from being able to access and correct errors
• Information Dissemination
o Breach of Confidentiality: Breaking a promise to keep a person’s information confidential
o Disclosure: Revelation of information about a person that impacts the way others judge her character
o Exposure: Revealing another’s nudity, grief, or bodily functions
o Increased Accessibility: Amplifying the accessibility of information
o Blackmail: Threat to disclose personal information
o Appropriation: The use of the data subject’s identity to serve the aims and interests of another
o Distortion: Dissemination of false or misleading information about individuals
• Invasion
o Intrusion: Invasive acts that disturb one’s tranquillity or solitude
o Decisional Interference: Incursion into the data subject’s decisions regarding her private affairs
Privacy and big urban data
• Intensifies datafication
• The capture and circulation of data are:
o indiscriminate and exhaustive (involve all individuals, objects, transactions, etc.);
o distributed (occur across multiple devices, services and places);
o platform independent (data flow easily across platforms, services, and devices);
o continuous (data are generated on a routine and automated basis).
• Much greater levels of intensified scrutiny and modes of surveillance/dataveillance
• Tasks previously unmonitored or caught through disciplinary gaze now routinely tracked and traced
• All but impossible to live everyday lives without leaving digital footprints and shadows
• Mass recording, organizing, storing and sharing big data changes the uses to which data can be put
Location/movement data
• Controllable digital CCTV cameras + ANPR + facial recognition
• Smart phones: cell masts, GPS, wifi
• Sensor networks: capture and track phone identifiers such as MAC addresses
• Wifi mesh: capture & track phones with wifi turned on
• Smart card tracking: barcodes/RFID chips (buildings & public transport)
• Vehicle tracking: unique ID transponders for automated road tolls & car parking
• Other staging points: ATMs, credit card use, metadata tagging
• Electronic tagging; shared calendars
Data permissions that can be sought by Android apps (from Hein 2014)
Data type: Permissions
• Accounts log: email log
• App Activity: name, package name, process number of activity, processed id
• App Data Usage: cache size, code size, data size, name, package name
• App Install: installed at, name, package name, unknown sources enabled, version code, version name
• Battery: health, level, plugged, present, scale, status, technology, temperature, voltage
• Device Info: board, brand, build version, cell number, device, device type, display, fingerprint, IP, MAC address, manufacturer, model, OS platform, product, SDK code, total disk space, unknown sources enabled
• GPS: accuracy, altitude, latitude, longitude, provider, speed
• MMS: from number, MMS at, MMS type, service number, to number
• NetData: bytes received, bytes sent, connection type, interface type
• PhoneCall: call duration, called at, from number, phone call type, to number
• SMS: from number, service number, SMS at, SMS type, to number
• TelephonyInfo: cell tower ID, cell tower latitude, cell tower longitude, IMEI, ISO country code, local area code, MEID, mobile country code, mobile network code, network name, network type, phone type, SIM serial number, SIM state, subscriber ID
• WifiConnection: BSSID, IP, linkspeed, MAC addr, network ID, RSSI, SSID
• WifiNeighbors: BSSID, capabilities, frequency, level, SSID
• Root Check: root status code, root status reason code, root version, sig file version
• Malware Info: algorithm confidence, app list, found malware, malware SDK version, package list, reason code, service list, sigfile version
Privacy and big urban data
• Deepens inferencing
• Big data and predictive modelling enable a great deal of inference beyond the data generated
• Can infer information about an individual that is not directly encoded in a database but that constitutes PII, which can produce ‘predictive privacy harms’
• For example, co-proximity and co-movement with others
can be used to infer political, social, and/or religious
affiliation.
• Also can produce ‘the tyranny of the minority’
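To make the co-proximity point concrete, the following is a minimal sketch of how repeated co-presence might be counted from pseudonymous sightings. The device IDs, places and hours are invented for illustration, and the function name `co_presence_counts` is hypothetical; real systems would differ in data and method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sightings: (device_id, place, hour). All values are invented.
sightings = [
    ("dev_a", "hall_1", 19),
    ("dev_b", "hall_1", 19),
    ("dev_c", "cafe_2", 19),
    ("dev_a", "hall_1", 20),
    ("dev_b", "hall_1", 20),
    ("dev_c", "hall_1", 20),
]

def co_presence_counts(records):
    """Count how often each pair of devices shares the same place and hour."""
    by_slot = {}
    for device, place, hour in records:
        by_slot.setdefault((place, hour), set()).add(device)
    pairs = Counter()
    for devices in by_slot.values():
        # Every pair seen together in this slot gets one more co-presence.
        for a, b in combinations(sorted(devices), 2):
            pairs[(a, b)] += 1
    return pairs

counts = co_presence_counts(sightings)
for pair, n in counts.most_common():
    print(pair, n)
```

Nothing in the input records an affiliation; the association between devices is derived purely from where and when they appear together, which is the kind of inference beyond the data generated that the slide describes.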
Privacy and big urban data
• Weak anonymization enables re-identification
• A key strategy for ensuring individual privacy is anonymization, through pseudonyms, aggregation or other strategies.
• A pseudonym simply means that a unique tag is used to identify a person in place of a name.
• The tag is persistent, distinguishable from others and recognizable on an on-going basis, meaning it can be tracked over time and space and used to create detailed individual profiles.
• It is no different from other persistent identifiers such as social security numbers and in effect constitutes PII.
• Talk by some companies of ‘anonymous identifiers’ is thus something of an oxymoron, especially when the identifier is directly linked to an account with known personal details
• Inference and the linking of a pseudonym to other accounts and transactions mean it can potentially be re-identified.
• It is possible to reverse engineer anonymization strategies by combing and combining datasets
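The points above can be illustrated with a small sketch, assuming a salted hash as the pseudonym and an invented auxiliary dataset. The card numbers, stop names, salt and the helper names `pseudonym` and `link` are all hypothetical; this is one simple linkage pattern, not a description of any particular system.

```python
import hashlib

def pseudonym(identifier: str, salt: str = "demo-salt") -> str:
    """Replace an identifier with a persistent tag: same input, same tag."""
    return hashlib.sha256((salt + identifier).encode()).hexdigest()[:12]

# Hypothetical "de-identified" travel records keyed by pseudonym.
trips = [
    {"tag": pseudonym("card-1001"), "home_stop": "Stop A", "work_stop": "Stop K"},
    {"tag": pseudonym("card-1002"), "home_stop": "Stop B", "work_stop": "Stop K"},
]

# Hypothetical auxiliary dataset that carries names.
directory = [
    {"name": "Person X", "home_stop": "Stop A", "work_stop": "Stop K"},
    {"name": "Person Y", "home_stop": "Stop C", "work_stop": "Stop M"},
]

def link(trip_rows, named_rows, keys=("home_stop", "work_stop")):
    """Map tag -> name wherever exactly one named row shares the quasi-identifiers."""
    matches = {}
    for trip in trip_rows:
        candidates = [p["name"] for p in named_rows
                      if all(p[k] == trip[k] for k in keys)]
        if len(candidates) == 1:  # a unique match re-identifies the tag
            matches[trip["tag"]] = candidates[0]
    return matches

reidentified = link(trips, directory)
print(reidentified)
```

Because the tag is persistent, a single successful link attaches a name to every past and future record that carries that tag.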
Privacy and big urban data
• Opacity and automation create obfuscation and reduce control
• The emerging big data landscape is complex and fragmented.
• Various smart city technologies are composed of multiple interacting systems run by a number of corporate and state actors.
• Data are thus passed between ‘devices, platforms, services, applications, and analytics engines’ and shared with third parties.
• Across this maze-like assemblage data can be ‘leaked, intercepted, transmitted, disclosed, dis/assembled across data streams, and repurposed’ in ways that are difficult to track and control
• Moreover, algorithmic processing is black-boxed, so it’s not clear how data are being processed
• Opacity and automation undermine the FIPPs at the heart of privacy regulation in a number of respects:
o making it difficult for individuals to seek access to verify, query, correct or delete data, or even to know whom to ask (given the tangled set of roles as data processors and controllers);
o to know how data collected about them are used;
o to assess how fair any actions taken upon the data are;
o to hold data controllers to account
Privacy and big urban data
• Data are being shared and repurposed and used in unpredictable and unexpected ways
• One of the key features of the data revolution is the wholesale erosion of data minimization principles;
• that is, the undermining of the purpose specification and use limitation principles, which hold that data should only be generated to perform a particular task, only retained as long as they are needed for that task, and only used to perform that task.
• The solution pursued by many companies is to repackage data by de-identifying them (using pseudonyms or aggregation) or creating derived data, with only the original dataset being subjected to data minimization. The repackaged data can then be sold on and repurposed in a plethora of ways
• The data and services that data brokers offer are used to perform a wide variety of tasks for which the data were never intended, including to predictively profile, socially sort, behaviourally nudge, and regulate, control and govern individuals and the various systems and infrastructures with which they interact
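As a rough sketch of what the data minimization principles described above could look like in code, the example below applies purpose specification, a retention period and use limitation to invented road-toll records. The policy values, field names and the functions `minimise` and `may_use` are assumptions made for illustration only.

```python
from datetime import date, timedelta

# Hypothetical policy: one stated purpose, the fields it needs, a retention period.
POLICY = {
    "purpose": "toll_billing",
    "fields": {"vehicle_tag", "toll_point", "charged_on"},
    "retention": timedelta(days=90),
}

def minimise(records, policy, today):
    """Drop records past retention and strip fields the purpose does not need."""
    kept = []
    for record in records:
        if today - record["charged_on"] > policy["retention"]:
            continue  # retained only as long as needed for the task
        kept.append({k: v for k, v in record.items() if k in policy["fields"]})
    return kept

def may_use(requested_purpose, policy):
    """Use limitation: allow only the purpose the data were generated for."""
    return requested_purpose == policy["purpose"]

# Invented records; "driver_photo" is not needed for billing.
records = [
    {"vehicle_tag": "T1", "toll_point": "P3",
     "charged_on": date(2024, 1, 10), "driver_photo": "img1"},
    {"vehicle_tag": "T2", "toll_point": "P3",
     "charged_on": date(2023, 6, 1), "driver_photo": "img2"},
]

result = minimise(records, POLICY, today=date(2024, 2, 1))
print(result)
print(may_use("toll_billing", POLICY), may_use("marketing_profile", POLICY))
```

Repackaging, as described in the slide, amounts to copying the data into a derived dataset that no such policy governs, so checks like these apply only to the original.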
Privacy and big urban data
• Notice and consent is an empty exercise or absent
• Individuals interact with a number of smart city technologies on a daily basis, each of which is generating data about them.
• Given the volume and diversity of these interactions it is simply too onerous for individuals to police their privacy across dozens of entities, to weigh up the costs and benefits of agreeing to terms and conditions without knowing how the data might be used now and in the future, and to assess the cumulative and holistic effects of their data being merged with other datasets
• In the case of some smart city technologies there is little mechanism to seek notice and consent
• For example, CCTV, ANPR and MAC address tracking, and sensing by the Internet of Things, all take place with no attempt at consent and often with little notification
• Moreover, there is no ability to opt-out
• As such, there is no sense in which a person can selectively reveal themselves; instead they must always reveal themselves.
• If a person is unaware that data about them are being generated, then it is impossible to discover and query the purposes to which those data are being put
Fair Information Practice Principles (OECD, 1980)
Principle: Description
• Notice: Individuals are informed that data are being generated and the purpose to which the data will be put
• Choice: Individuals have the choice to opt-in or opt-out as to whether and how their data will be used or disclosed
• Consent: Data are only generated and disclosed with the consent of individuals
• Security: Data are protected from loss, misuse, unauthorized access, disclosure, alteration and destruction
• Integrity: Data are reliable, accurate, complete and current
• Access: Individuals can access, check and verify data about themselves
• Use: Data are only used for the purpose for which they are generated and individuals are informed of each change of purpose
• Accountability: The data holder is accountable for ensuring the above principles and has mechanisms in place to assure compliance
Redundant in the age of big urban data?
http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/
Hacking the City?
• Weak security and encryption
• Insecure legacy systems and poor maintenance
• Large and complex attack surfaces and interdependencies
• Cascade effects
• Human error and disgruntled (ex)employees
Suggested solutions
• Market
o Industry standards and self-regulation
o Privacy/security as competitive advantage
• Technological
o End-to-end strong encryption, access controls, security controls, audit trails, backups, up-to-date patching, etc.
o Privacy enhancement tools
• Policy and regulation
o FIPPs
o Privacy by design
o Security by design
• Governance
o Vision and strategy: (1) smart city advisory board and smart city strategy
o Oversight of delivery and compliance: (2) smart city governance, risk and compliance board
o Day-to-day delivery: (3) core privacy/security team, smart city privacy/security assessments, and (4) computer emergency response team
Conclusion
• We are entering an era of embedded and mobile computation
• Devices and infrastructures are producing vast quantities of data in
real-time, and are responsive to these data, enabling new kinds of
monitoring, regulation and control
• Cities are becoming data-driven and are enacting new forms of
algorithmic governance
• Whilst data-driven, networked urbanism undoubtedly provides a set of
solutions for urban problems, it also raises a number of ethical and
normative questions
• The challenge facing urban managers and citizens is to realise the
benefits of planning and delivering city services using urban data and
real-time responsive systems whilst minimizing pernicious effects
• At present, little serious thought has been expended on the latter
Background
http://www.maynoothuniversity.ie/progcity
@progcity
@robkitchin
https://www.maynoothuniversity.ie/people/rob-kitchin