29
Exploiting Personal Data, Preserving Privacy Pradeep K. Murukannaiah and Munindar P. Singh Department of Computer Science North Carolina State University Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 1 / 29

Exploiting Personal Data, Preserving Privacy · (Dropbox Sync API) Place Labels Sensor Data Place Model (Semi-supervised) Tagging Subsystem (Android Activity) Event & Action Logger

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

  • Exploiting Personal Data, Preserving Privacy

    Pradeep K. Murukannaiah and Munindar P. Singh

    Department of Computer ScienceNorth Carolina State University

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 1 / 29

  • Personal Data and Applications: Current Setting

    ValueUser Applica-on

    Applica-on

    Sensor

    Service

    Sensor

    Service

    Control

    Control

    ValueInforma-on

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 2 / 29

  • Current Setting: Shortcomings

    Potential for privacy loss

    I Users’ personal data stored by several third-parties

    Fragmented user experience

    I Each application may interpret users’ data differently

    Difficult to deliver a personalized experience

    I Applications often rely on data aggregated from multiple users

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 3 / 29

  • Personal Data and Applications: Proposed Setting

    User Applica+on

    Applica+on

    Sensor

    Service

    Sensor

    Service

    Informa+on

    Value

    Middlew

    are

    Context

    Context

    Value

    Control

    I The middleware maps raw sensor data into high-level contexts andshares contexts with applications, respecting user’s privacy policies

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 4 / 29

  • Benefits of the Middleware Setting

    Potential benefits to an end userI Uniform location information architecture

    I Enhances learnabilityI Simplified interaction design

    I Enhances ease of useI Middleware-based service design

    I Preserves privacy

    Potential benefits to a developer

    I Separation of concernsI Simplifies development processI Yields modular implementations

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 5 / 29

  • The Disconnect between Data and Intent

    An example: Intelligent Ringer application

    I Automatically sets a ringer mode for incoming calls

    I Consumes data from Bluetooth sensor

    Why does the ringer application require from Bluetooth sensor?

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 6 / 29

  • Bridging Data and Intent

    Xipho: A methodology for engineering personal agents

    I Metamodel: Actor, goal, plan, belief, resource, and dependency

    I Spans requirements, design, specification, and implementation phases

    Callee

    Callee's neighborCaller

    To reach by phone

    To be not disturbed

    To be reachable by phone

    To work uninterrupted

    To not disturb a neighbor

    To set an appropriate ringer mode

    To answer the call

    +To update the caller

    To not answer the

    call

    To send an appropriate notification

    AND

    To preserve privacy

    AND

    Perspective

    Actor Goal Plan means-end

    dependencycontribution (+/-)

    Soft goal

    LEGEND

    - +

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 7 / 29

  • Bridging Data and IntentXipho maps data abstractions to agent capabilities via systematic steps

    Ringer Manager

    Context

    Neighbor's context

    Caller's context

    ORSet as silent

    ORSet as vibrate

    Ambience

    Activity Social Circle

    Casual Emergency

    Noisy Quiet

    Set as loud or vibrate

    Set as silent or vibrate

    Set as loud

    To be reachable by phone

    To work uninterrupted

    To not disturb a neighbor

    +

    Context Middleware

    Acquire context

    Example: Activity = ?A1 ∧ Social circle = ?S1 ∧ Neighbor’s context = ?N1∧ Caller’s context = Emergency → Set as loud ∨ Set as vibrate

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 8 / 29

  • Middleware Architecture

    Ringer Manager’s context instances (data abstractions)

    SpaceCaller's ContextActivityNeighbor's

    Context

    EmergencyCasual Near

    Social Circle

    Proximity to Phone

    Far

    The middleware

    I Elicits context instances

    I Recognizes context instances

    I Acquires context resources

    Callee Callee's contextContext

    Middleware

    Elicit context instances of open

    abstractions

    Recognize context

    instances Caller

    Caller's context

    Calle's Neighbor

    Neighbor's context

    Ringer Manager

    Acquire context resources

    Sensors & Network

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 9 / 29

  • Platys Middleware: Implementation

    Personal Computer

    Android Phone

    Shared Information Store (Dropbox Sync API)

    Place Labels

    Sensor DataPlace Model

    (Semi-supervised)

    Tagging Subsystem(Android Activity)

    Event & Action Logger(Android Service)

    Application Manager (Android Service)

    Platys Reasoner (Apache Daemon)

    API (Android IDL)

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 10 / 29

  • Platys Reasoner

    Platys Reasoner learns a user’s contexts from raw sensor data

    Traditional unsupervised approaches

    I Often require frequent sensing

    I Do not capture subjective nuances

    Traditional supervised approaches

    I Several labels for each place to perform well

    Platys Reasoner: A middle ground solution

    I Active learning seeks to reduce labeling effort

    I Semi-supervised learning deals with infrequent sensor data

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 11 / 29

  • Platys Reasoner: Intuition

    Home

    09:30

    Lab

    09:00 11:30

    Lab

    11:15 12:00

    Home

    14:00

    ???

    08:30 09:15 10:10 12:30 17:15

    ???

    {w1 w2, w3} {w5, w6, w7} {w6, w7}{…} {w1} {w4, w8}{w4, w5} {w6, w7, w8} {w2, w3}

    {g1} {…} {g2} {g3} {g3}

    Time:

    Place Label:

    WiFi:

    GPS:

    Ground Truth: Home Lab LabHome Home

    User labeled data points Unlabeled data points Data points for place prediction

    11:45

    {…}

    Stream of intermittent sensor data and place labels

    Home Lab Lab Home ??? ???Place Label:

    Active learning enable intelligent place labeling

    Home Lab Lab Home ??? ???Place Label:

    Data points with Platys Reasoner assigned labels

    Lab Lab HomeHome Lab Lab

    Semi-supervised learning exploits latent structure

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 12 / 29

  • Platys Reasoner: Active and semi-supervised learner

    Pool of Unlabeled Sensor Data Instances

    Active Learning

    Sensor 1 Sensor 2 Sensor 3 …

    Prompt the User to Label

    A Few Labeled Instances

    Several Labeled Instances

    Training

    Semi-Supervised Learning

    Iterative Clustering

    Places from Sensor Data

    User

    Label

    Classify Instances

    Prediction

    Filter Non-Places

    Infer New Labels

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 13 / 29

  • Evaluation

    Platys middleware

    I Simplifies application development, enhancing privacy and userexperience of applications produced

    Xipho methodology

    I Connects data and intent, enhancing software comprehensibility

    Platys Reasoner

    I Learns contextual abstractions from raw sensor data, reducing the usereffort required for training the reasoner

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 14 / 29

  • Platys Developer Study: Design

    Factor and Alternatives

    I Framework: Platys middleware and Android location provider

    Hypotheses

    1. Time and effort: Platys middleware reduces time and effort required toimplement an application compared to the control setting

    2. Code quality: Platys middleware yields more modular implementationsthan the control setting

    3. Usability and privacy: Platys middleware yields more usable andprivacy-preserving applications than the control setting

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 15 / 29

  • Platys Developer Study: Design

    Subjects

    I 46 students from a graduate-level computer science course

    I Balanced group for each alternative (education level, Android and generalprogramming experience)

    Study Unit

    I An application to implement (Ringer Manager)

    I Functional requirements specified as three–four use cases

    I Usability requirement specified as broad-brush guidelines

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 16 / 29

  • Analyses

    Modeling time (continuous value)

    I Equality of mean values: Two-tailed t-test

    I Equality of variance values: Two-tailed F -test

    I Normality tests (K -S test) conducted

    Modeling effort, model completeness and comprehensibility(ratings on a scale of 1–7)

    I Equality of medians: Two-tailed Wilcoxon’s ranksum-test

    I Normality assumption not required

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 17 / 29

  • Platys Developer Study: Results

    0 5 10 15 20

    P

    C

    timeprep (hours)

    µP 6= µC ∗∗(p = 0.001)

    0 5 10 15 20

    P

    C

    timeloc (hours)

    µP < µC ∗(p = 0.081)

    0 5 10 15 20

    P

    C

    timecore (hours)

    µX 6= µC(p = 0.311)

    0 5 10 15 20

    P

    C

    timeusability (hours)

    µP < µC ∗∗(p = 0.021)

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 18 / 29

  • Platys Developer Study: Results

    2 3 4

    P

    C

    MCC

    µP < µC ∗∗(p = 0.001)

    1.5 2 2.5

    P

    C

    NoLM

    µP < µC ∗∗(p = 0.001)

    500 1,000 1,500

    P

    C

    NoS

    µP < µC(p = 0.272)

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 19 / 29

  • Xipho Developer Study: Design

    Factor

    I Methodology

    Alternatives

    I Xipho and Tropos

    Hypotheses

    1. Time and effort: Xipho reduces modeling time and effort compared toTropos

    2. Completeness: Xipho models are more complete than Tropos models

    3. Comprehensibility: Xipho models are easier to comprehend thanTropos models

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 20 / 29

  • Xipho Developer Study: Design

    Subjects

    I 46 students from a graduate-level computer science course

    I Balanced group for each alternative (education level, modelingexperience, and programming experience)

    Study Unit

    I Three applications to practice, model, and verify

    I Requirements specified as three–four use cases

    Phase Task Group 1 Group 2 Group 3

    1 Practice Alarm Motivator Reminder2 Modeling Motivator Reminder Alarm3 Verification Reminder Alarm Motivator

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 21 / 29

  • Xipho Developer Study: Results

    0 2 4 6 8 10

    X

    C

    Practice hours per scenario

    µX < µC (p = 0.325)σ2X < σ

    2C (p = 0.087

    ∗)

    0 2 4 6 8 10

    X

    C

    Modeling hours per scenario

    µX < µC (p = 0.046∗∗)σ2X < σ

    2C (p = 0.563)

    0 20 40 60 80 100

    C

    X

    Modeling effort (% responses)

    x̃X < x̃C (p = 0.01∗∗)

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 22 / 29

  • Xipho Developer Study: Results

    I Three subjects rated each model for completeness and comprehensibility

    I Ratings with insufficient inter-rater agreement (Krippendorff’s α < 0.67)were discarded

    0 20 40 60 80 100

    C

    X

    Completeness (% ratings)

    x̃X > x̃C (p = 0.206)

    0 20 40 60 80 100

    C

    X

    Comprehensibility (% ratings)

    x̃X > x̃C (p = 0.029∗∗)

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 23 / 29

  • Platys Reasoner User Study: Design

    Data Collection

    I 10 users employed an Android phone installed with Platys middleware astheir primary phone for 10 weeks

    I Platys middleware collected sensor readingsI GPS, WiFi, Bluetooth sensor readings, and Google POI data

    I Each user labeled places of his interest multiple times

    Baselines

    I Two staypoint-based approaches

    I Two supervised approaches (logistic regression and SVM)

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 24 / 29

  • Platys Reasoner User Study: Results

    0.4

    0.6

    0.8

    1

    3 mins20 m

    30 mins200 m

    3 hours1.2 km

    1 day96 km

    AB

    C

    D

    E

    F

    Place Radius and Duration (log scale)

    Platys (Place-or-not)Platys (Which-place)Staypoint [Zheng 2012]Staypoint for optimal place radius andduration for each user A–F

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 25 / 29

  • Platys Reasoner User Study: Results

    1 2 3 4 5 6 70.3

    0.4

    0.5

    0.6

    0.7

    Number of labels trained with

    F-M

    easu

    re(a

    vera

    ged

    acro

    ssus

    ers)

    SVMSVM + Active learning

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 26 / 29

  • Platys Reasoner User Study: Results

    1 2 3 4 5 6 70.3

    0.4

    0.5

    0.6

    0.7

    0.8

    Number of labels trained with

    F-M

    easu

    re(a

    vera

    ged

    acro

    ssus

    ers)

    SVM + Active learningSemi-supervised SVM +

    Active learning

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 27 / 29

  • Papers

    1 P. K. Murukannaiah and M. P. Singh. Platys Social: Relating shared places andprivate social circles. IEEE Internet Computing, 16(3):53–59, May 2012.

    2 P. K. Murukannaiah and M. P. Singh. Understanding location-based userexperience. IEEE Internet Computing, 18(6):72–76, Nov 2014.

    3 P. K. Murukannaiah and M. P. Singh. Xipho: Extending Tropos to engineercontext-aware personal agents. In Proc. AAMAS, pp. 309–316, 2014.

    4 P. K. Murukannaiah and M. P. Singh. Platys: An active learning framework forplace-aware application development and its evaluation. ACM Transactions onSoftware Engineering and Methodology, 24(3):1–33, May 2015.

    5 L. Zavala, P. K. Murukannaiah, N. Poosamani, T. Finin, A. Joshi, I. Rhee, and M.P. Singh. Platys: From position to place-oriented mobile computing. AIMagazine, 36(2):50–62, June 2015.

    Murukannaiah and Singh (NCSU) Exploiting Personal Data, Preserving Privacy 28 / 29

  • Thank You