Privacy‐Aware Preservation: Challenges from the Perspective of a Linked Data Privacy Auditing Framework. Mariano P. Consens, University of Toronto. PRELIDA Consolidation and Dissemination Workshop, Riva del Garda, 2014. 17/10/2014.


Privacy‐Aware Preservation: Challenges from the Perspective of a Linked Data Privacy Auditing Framework

Mariano P. Consens
University of Toronto

PRELIDA Consolidation and Dissemination Workshop, Riva del Garda, 2014

17/10/2014 Consens


A Linked Data Privacy Auditing Perspective?

• Recent work on a Linked Data Publishing Framework using two RDFS ontologies (L2TAP+SCIP) [SC12, SC14]
– Publishing privacy log events as Linked Data
  • Enable log integration via secure web access to all events
– Encoding privacy‐related events in RDF
  • Simple target for mapping key Contextual Integrity concepts
– SPARQL solutions for
  • Log construction (from policies and dataset descriptions)
  • Obligation derivation
  • Log‐based auditing of compliance checking (detection of privacy violations and attribution)
• Facilitates best practices using audit logs and monitoring as an effective oversight regime

[SC14] R. Samavi, M. P. Consens, “Publishing L2TAP Logs to Facilitate Transparency and Accountability”. In Linked Data on the Web (LDOW2014), WWW Workshops, 2014.
[SC12] R. Samavi, M. P. Consens, “L2TAP+SCIP: An audit‐based privacy framework leveraging Linked Data”. In 8th International Conference on Collaborative Computing (CollaborateCom2012), 2012.


Simple Contextual Integrity Ontology

• The SCIP ontology is designed to capture Contextual Integrity requirements
– Contexts (Participants and their Roles, Data Attributes, Purposes)
– Norms and Information Transmission Principles
• Privacy as the right to appropriate flows of personal information
– Social Contexts (roles of individuals in society)
– Context‐relative information norms

Helen Nissenbaum. Privacy in Context: Technology, Policy, and the Integrity of Social Life, Stanford Law Books, 2009


Privacy Policies are not just RBAC

Sample Privacy Preferences that could apply to a glucose level website
P1: Access by university researchers should be for research purposes
P2: Consent must be obtained at least two days prior to access
P3: Should receive notification of publications using the data within 9 months
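A minimal Python sketch of why preferences like P1 to P3 go beyond role‐based access control. The names `Request`, `rbac_allows`, and `evaluate` are hypothetical and not part of L2TAP or SCIP. It assumes P1 is a role plus purpose check, P2 a two‐day consent lead time, and P3 a post‐access obligation whose nine‐month deadline is approximated as 270 days counted from the access time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Request:
    role: str             # requester's role, e.g. "university_researcher"
    purpose: str          # declared purpose of the access
    consent_at: datetime  # when consent was obtained
    access_at: datetime   # when the access takes place


def rbac_allows(req: Request) -> bool:
    # Plain RBAC looks only at the role of the requester.
    return req.role == "university_researcher"


def evaluate(req: Request):
    """Return (allowed, obligations) under the assumed encoding of P1-P3."""
    p1 = req.role == "university_researcher" and req.purpose == "research"
    p2 = req.access_at - req.consent_at >= timedelta(days=2)
    obligations = []
    if p1 and p2:
        # P3 (assumption): notification is due within 270 days of access.
        deadline = req.access_at + timedelta(days=270)
        obligations.append(("notify_publications", deadline))
    return p1 and p2, obligations
```

A request from a university researcher with a marketing purpose passes `rbac_allows` but fails `evaluate`, and a permitted request carries a pending obligation that a role check alone cannot express.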


L2TAP+SCIP Motivation

• Increasing need for privacy frameworks that allow
– Individuals to express their privacy preferences
– Service providers to interpret, enforce, and be held accountable for respecting individuals' privacy concerns
• Compliance (e.g., HIPAA Privacy Rule, Gramm‐Leach‐Bliley Act, EU Directive 95/46/EC)
• EU Agency Recommendation (ENISA, 2011)
– Research on information accountability technology should be promoted, aimed at the technical ability to hold information processors accountable for their storage, use and dissemination of third‐party data.


Related Work

• Linked Data privacy
– Expressing access control policies, SPPO (ACL) [Sacco, 2011]
– Using SWRL to express access rules [Mühleisen, 2010]
– Leveraging the linked data architecture for providing authorization and access restrictions (based on WebID) [Story, 2009], [Hollenbach et al., 2009]
• Policy monitoring approaches
– LPU [Barth et al., 2006], MFOTL [Basin et al., 2010], PrivacyLFP [Datta et al., 2011]
– Use linear, metric temporal logic (LTL, MFOTL)
– Provide proof‐based systems for run time monitoring of policies
• Access control and privacy policy languages
– Expressing access control policies [Sandhu et al., 1996], [Jajodia et al., 2001]
– Expressing and enforcing privacy policies (P‐RBAC) [Ni et al., 2007], [Ni et al., 2008], [Li et al., 2012]


Talk Outline

• Background on Perspective, Privacy Policies
• Privacy awareness in OAIS?
• L2TAP Linked Data Logs
– L2TAP Log Events
– SCIP Privacy‐related Events
• Query‐based Auditing
• Closing Remarks


Privacy‐Aware Preservation in OAIS

• The PDI (Preservation Description Information) includes Access Rights Information
– Access restrictions pertaining to the Content Information, including the legal framework, licensing terms, and access control
– Contains access and distribution conditions stated in the Submission Agreement, related to both preservation (by the OAIS) and final usage (by the Consumer)
– Includes the specifications for the application of rights enforcement measures


Privacy‐awareness evolution?

• Changes in OAIS from 2002 (CCSDS 650.0‐B‐1) to 2012 (CCSDS 650.0‐M‐2)
– Addition of Access Rights Information to PDI
– Removal of Annex A (existing archive examples)
  • Of the 5 examples removed, 2 deal with privacy issues


Life Sciences Data Archive

• LSDA is responsible for collecting, cataloging, storing and making accessible the data of NASA funded Life Sciences space flight investigations 

• The LSDA has strict security measures for data from human subjects which require sensitivity and secure handling due to the Human Data Privacy Act

• Only mean‐pooled human data is made available to the public
– Privacy via Anonymity vs. Usability


National Collaborative Perinatal Project

• NCPP was a multi‐institutional, multi‐year (1959‐1965 to 1974) study of pregnant women and the children born from those pregnancies, to provide baseline information useful for later determining the causes of neurological diseases
– The NIH NINDS project expended more than $200 million over two decades to collect NCPP
– “It is unlikely that a study of this duration and magnitude will be repeated”
• NCPP transferred to the National Archives and Records Administration (NARA)
– After NARA and NIH resolved the privacy and access concerns


Privacy as a Value‐added Transformation of NCPP

• NARA provides NCPP as received from NINDS
• NARA has created Public Use Files for the two data files containing personal identifiers in conformance with the Freedom of Information Act and NARA

• NARA enforces restrictions on access to records whose release might result in unwarranted invasion of personal privacy



Big (Health) Data Motivation

• Privacy protection a challenge for social computing and data driven science

• Consider big data biomedical research
– Massive datasets of human genome, biological imaging, and clinical information collected and aggregated from individual health records
• Data subjects' privacy in clinical research
– Addressed by multiple legislations and regulations (e.g., U.S. Department of Health and Human Services or HHS)


Scenario: Medical Research Study

• Research teams are interested in analyzing the primary reasons for ICU hospitalization and examining the effectiveness of medication across patient demographics

• MIMIC II is a public clinical database provided by PhysioNet of data on de‐identified patient ICU admissions

• Researchers must comply with MIMIC II data use agreement, as well as HHS, Hospital/University, and other regulations

[Diagram labels: Dataset MIMIC II; Data Provider PhysioNet; Data Provider Auditor; External Auditor; Research Team RT1; Research Team RT2; L2TAP Audit Log RT1; L2TAP Audit Log RT2; Auditor RT1; Auditor RT2]

Overview of L2TAP Logs

• L2TAP provides a set of classes and properties that can be used to represent and publish a log of privacy events as Linked Data

• L2TAP‐related events in the log
– Log Initialization
  • Who is the logger, what time model is used
– Participant Registration
  • DataSubject, DataRequestor, DataSender, ObligationPerformer, ObligationWitness, PrivacyLogger, PrivacyExpert


Log Initialization Event

[Figure: log initialization event, with the callout "Who?"]

Participant Registration Event

[Figure: participant registration event, with the callouts "When?", "What?", and "URI minted by PrivacyLogger"]

L2TAP Participants in OAIS Scenario

[Figure labels: Individuals (as Source of Data); Auditors (of Information Flows)]

Overview of SCIP Privacy‐related Events

• Recap: the SCIP ontology is designed to capture Contextual Integrity requirements
– Contexts (Participants and their Roles, Data Attributes, Purposes)
– Norms and Information Transmission Principles
• SCIP‐encoded privacy‐related events in RDF
– Privacy Preferences
– Access Requests and Responses
– Obligation Acceptances and Performances
– Access Activities


SCIP in the Medical Research Study

[Diagram labels: Research Team RT1; Data Provider PhysioNet; Privacy Policies; L2TAP Audit Log]

Privacy Preference

[Figure: privacy preference, with the callout "Purpose!"]

Privacy Preferences in OAIS Administration


SCIP in the Medical Research Study

[Diagram labels: Research Team RT1; Data Provider PhysioNet; Privacy Policies; L2TAP Audit Log; Access Request]

Access Request


SCIP in the Medical Research Study

[Diagram labels: Research Team RT1; Data Provider PhysioNet; Privacy Policies; L2TAP Audit Log; Access Request; Access Response]

Access Response


Requests and Responses in OAIS Access


SCIP in the Medical Research Study

[Diagram labels: Research Team RT1; Data Provider PhysioNet; Privacy Policies; L2TAP Audit Log; Access Request; Access Response; Obligation Acceptance]

Obligation Acceptance


SCIP in the Medical Research Study

[Diagram labels: Research Team RT1; Data Provider PhysioNet; Privacy Policies; L2TAP Audit Log; Access Request; Access Response; Obligation Acceptance; Performed Obligation]

Performed Obligation


SCIP in the Medical Research Study

[Diagram labels: Research Team RT1; Data Provider PhysioNet; Privacy Policies; L2TAP Audit Log; Access Request; Access Response; Obligation Acceptance; Performed Obligation; Access Activity]

Access Activity


Activities in OAIS Data Management 



L2TAP+SCIP flows affect most OAIS areas!



Log Construction

• Log Construction requires descriptions of the participants, policies, and access requests
– Data provider motivated to express the policies that govern data usage
– Research team institution motivated to facilitate researcher accountability
– Access requests can be derived from dataset description (SPARQL Construct queries)
• Accommodates multiple log scenarios
– Distribution, Replication, Third‐party custodians


Obligation Derivation

• Given an access request, we need a mechanism to log all applicable obligations
• The process consists of the following three tasks:
– find matches between an access request and privacy preferences
– generate the set of obligations
– construct the logical expression that describes how the individual satisfaction of each obligation contributes to the overall compliance of the originally matched access request
• SPARQL Construct queries to derive the scip:Obligations that are returned in scip:AccessResponse
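The three derivation tasks can be sketched in plain Python as a stand‐in for the SPARQL Construct queries named on the slide. This is an illustration only: the `Preference` class, the `derive` function, the dictionary keys, and the obligation names are hypothetical, matching on role and purpose is a simplification, and combining obligations with a plain conjunction is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preference:
    role: str
    purpose: str
    obligations: tuple  # obligation names that apply when the preference matches


def derive(request: dict, preferences: list):
    """Match preferences, collect obligations, build a compliance expression."""
    # Task 1: preferences whose role and purpose match the access request.
    matched = [p for p in preferences
               if p.role == request["role"] and p.purpose == request["purpose"]]
    # Task 2: the set of obligations from all matched preferences (order kept).
    obligations = []
    for p in matched:
        for name in p.obligations:
            if name not in obligations:
                obligations.append(name)
    # Task 3: a logical expression over the obligations (assumed conjunction).
    expression = ("and", *obligations)
    return matched, obligations, expression
```

The returned expression is what a later compliance check would evaluate once the satisfaction of each obligation is known.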


Compliance Checking via SPARQL

• Algorithm
1. Determine the individual satisfaction of each obligation (ASK query)
2. Evaluate how the individual satisfaction of each obligation contributes to the overall compliance of an access request (multiple ASK queries)
3. Determine the access request compliance (SELECT query)
• Representative compliance queries
– Which access requests are not compliant at time t?
– Which access requests have been discharged?
– What obligations are pending?
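Step 1 of the algorithm and the "pending" query can be illustrated with a small Python sketch in place of the ASK queries. The three‐state model (fulfilled, pending, violated), the deadline field, and all names here are assumptions made for illustration, not the L2TAP obligation model itself.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Obligation:
    name: str
    deadline: datetime
    performed_at: Optional[datetime] = None  # None until a performance is logged


def state(ob: Obligation, t: datetime) -> str:
    """Step 1: satisfaction of a single obligation as of time t."""
    if ob.performed_at is not None and ob.performed_at <= min(t, ob.deadline):
        return "fulfilled"
    if t <= ob.deadline:
        return "pending"   # not performed by time t, but the deadline has not passed
    return "violated"      # the deadline passed without a timely performance


def pending(obligations, t: datetime):
    """Answers 'What obligations are pending?' at time t."""
    return [ob.name for ob in obligations if state(ob, t) == "pending"]
```

Because the state depends on the query time t, the same log can yield different answers as deadlines pass, which is why the compliance queries are phrased "at time t".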


Step 3 Compliance Checking Query

SELECT DISTINCT ?request WHERE {
  ?response scip:responseTo ?request .
  ?response scip:contextObligation ?obligation .
  ?response scip:accessDecision ?accessDecision .
  FILTER ((!(φtf) && (φtp)) && ?accessDecision)
}

• Which access requests are not compliant at time t?
• Which access requests have been discharged?
• Which access requests are compliant at time t but are not yet discharged?

Framework Extensibility: φ can be substituted by an expression whose propositional value is deduced from a more sophisticated obligation model
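One possible reading of the FILTER in the query, sketched in Python. The slide does not define φtf and φtp; the sketch assumes they are the obligation expression φ evaluated with pending obligations treated as false and as true respectively. The functions `evaluate` and `classify`, the nested tuple encoding of φ, and the state names are all hypothetical.

```python
def evaluate(expr, states, pending_as):
    """Evaluate a nested ('and' | 'or', ...) expression over obligation names.

    states maps a name to 'fulfilled', 'pending', or 'violated';
    pending_as is the truth value substituted for pending obligations.
    """
    if isinstance(expr, str):
        s = states[expr]
        return pending_as if s == "pending" else s == "fulfilled"
    op, *args = expr
    values = [evaluate(a, states, pending_as) for a in args]
    return all(values) if op == "and" else any(values)


def classify(expr, states, access_granted=True):
    phi_tf = evaluate(expr, states, pending_as=False)  # assumed meaning of φtf
    phi_tp = evaluate(expr, states, pending_as=True)   # assumed meaning of φtp
    if not access_granted:
        return "not granted"
    if phi_tf:
        return "discharged"                     # holds even if nothing pending is met
    if phi_tp:
        return "compliant, not yet discharged"  # the case (!φtf && φtp)
    return "not compliant"                      # fails even if all pending are met
```

Under this reading the three questions on the slide correspond to the three outcomes for a granted request, and the FILTER as written selects the middle one.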


Experimental Validation

• Experimental validation of the scalability and practicality
– Custom Java application (SyntheticSCIP) used to generate a hypothetical audit log scenario with a growing number of access requests
– Six representative compliance queries timed using a Virtuoso 6 installation on an Ubuntu server

[Figure: query time in seconds (y‐axis, 0 to 2500) against the number of access requests in thousands (x‐axis: 10, 50, 100, 400, 1,000) for queries Q1 to Q6]


Closing Remarks

• PRELIDA’s list of desirable features for a Preservable (Linked Data) Dataset archive should consider including Privacy Policies
– Why not use a Linked Data representation?
• Privacy‐aware Preservation has implications throughout the entire OAIS Reference Model
• Privacy preferences are dynamic
– Age/time‐based policy changes
– The social context of norms changes


Monitoring Policy Changes in OAIS Preservation Planning
