Application-screen Masking: A Hybrid Approach

Abigail Goldsteen, Ksenya Kveler, Tamar Domany, Igor Gokhman, Boris Rozenberg, Ariel Farkash
Information Privacy and Security, IBM Research – Haifa

Presented by Abigail Goldsteen
W2SP Workshop, San Jose, May 2014

© 2014 IBM Corporation



Agenda
• Problem
• Existing approaches
• Our approach
• Challenges and limitations
• Comparison between approaches
• Summary
• Questions


Problem
• How to share information while safeguarding the privacy and security of sensitive data
• Need to prevent users from viewing information they are not authorized to see

[Diagram labels: Existing applications; New users/use cases]

Example

[Diagram: Data Center (Germany) and Outsourced Call Center (India). The sample screen shows the fields Name, National ID and Balance, with the values John Smith, 35 and 127.50$]


Existing approaches
1. Redesign application
   o Can be very complicated and costly
   o Not always possible due to lack of skills
2. Mask values in database
   o Difficult to maintain several copies
   o May “break” the application
3. Mask application-screens
   o Sensitive values are removed/masked after the application has constructed the visual layout of the screen

[Diagram labels: Application server; Masking; Client]

Rule types

Content-based
• Based on the text value or its format
• Can be defined using
  o Regular expressions
  o Natural Language Processing (NLP)
  o Other data classification techniques
• Example:
  o A regular expression depicting email addresses

Context-based
• Based on the visual structure of the screen
• Can be defined using
  o UI constructs (labeled fields, table columns, drop-down boxes, etc.)
  o A relationship between two entities on the screen
  o Absolute locations
• Example:
  o Mask all labeled fields in which the label is “Email Address”
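
As a rough illustration of the two rule types, the sketch below applies a content-based rule (a deliberately simplified email regular expression) and a context-based rule (mask every field labeled “Email Address”) to a screen modeled as a list of labeled fields. The screen model, the field values and the mask string are assumptions made for this example and are not taken from the presented system.

```javascript
// Hypothetical screen model: a flat list of labeled fields.
const screen = [
  { label: "Name", value: "John Smith" },
  { label: "Email Address", value: "john.smith@example.com" },
  { label: "Notes", value: "Prefers contact via js@example.org" },
];

const MASK = "*****";

// Content-based rule: mask anything that looks like an email address,
// wherever it appears. The pattern is intentionally simplistic.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function maskByContent(fields) {
  return fields.map((f) => ({ ...f, value: f.value.replace(EMAIL_PATTERN, MASK) }));
}

// Context-based rule: mask the whole value of every field with the given label,
// regardless of what the value looks like.
function maskByContext(fields, label) {
  return fields.map((f) => (f.label === label ? { ...f, value: MASK } : { ...f }));
}

const byContent = maskByContent(screen);
const byContext = maskByContext(screen, "Email Address");
```

The content-based rule also masks the address inside the free-text “Notes” field, while the context-based rule touches only the labeled field.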

Existing application-screen masking approaches (1)

At the network level:
✓ Fast
✓ Secure
× Simplistic content-based rules

[Diagram: Client, Web Proxy (Masking), Application server. HTTP requests pass through the proxy to the server; the HTTP response is masked at the proxy, and the client receives the masked HTTP response and displays the masked screen]

Existing application-screen masking approaches (2)

At the presentation level:
✓ Context-based rules defined on screen
× Difficulties in handling complex screens
× Severe performance issues

[Diagram labels: Application server; HTTP request; HTTP response; VNC Server; Unmasked Remote Framebuffer (RFB); Masking; OCR; Masked RFB; Client; Masked screen]

Existing application-screen masking approaches (3)

At the operating system level:
✓ Context-based rules defined on screen
× Installation on every end-user machine
× Security issues

[Diagram labels: Application server; HTTP request; HTTP response; Client; Masking; Masked screen]


Hybrid approach
• Masking at the network level
  ✓ Fast
  ✓ Secure
• Easy rule definition at the presentation level
  ✓ Context-based rules defined on screen
  ✓ Content-based rules are also supported

Some features
• All sensitive information is removed from the message and does not reach the browser
  o Cannot be viewed on screen or in page source
• Masking server and proxy are placed within the enterprise’s internal network
  o Sensitive information does not leave the premises
• Client requests are also intercepted to check if they contain masked data
  o The request is reconstructed with the original data before sending to the server
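
The last feature, reconstructing requests that contain masked data, can be sketched as a token table held by the proxy. This is a minimal illustration under assumptions: the `MaskingSession` class, the `__MASKED_n__` token format and the in-memory map are hypothetical, and the slides do not say how the actual system stores or matches masked values.

```javascript
// Hypothetical per-session token store kept inside the masking proxy.
class MaskingSession {
  constructor() {
    this.tokenToOriginal = new Map();
    this.counter = 0;
  }

  // Create an opaque token for a sensitive value and remember the mapping.
  mask(original) {
    const token = `__MASKED_${++this.counter}__`;
    this.tokenToOriginal.set(token, original);
    return token;
  }

  // Response path: replace every occurrence of each sensitive value with a token,
  // so the original never reaches the browser.
  maskResponse(body, sensitiveValues) {
    let masked = body;
    for (const value of sensitiveValues) {
      masked = masked.split(value).join(this.mask(value));
    }
    return masked;
  }

  // Request path: put the original values back before forwarding to the server.
  restoreRequest(body) {
    let restored = body;
    for (const [token, original] of this.tokenToOriginal) {
      restored = restored.split(token).join(original);
    }
    return restored;
  }
}

const session = new MaskingSession();
const response = '{"name":"John Smith","balance":"127.50$"}';
const maskedResponse = session.maskResponse(response, ["John Smith"]);

// The browser later echoes the token back, e.g. in a form submission.
const request = '{"customer":"__MASKED_1__","action":"close"}';
const restoredRequest = session.restoreRequest(request);
```

Because the map lives only in the proxy, the browser sees tokens alone, while the application server keeps receiving the values it expects.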

Masking rules
• Rules are expressed in JavaScript
  o Powerful
    • Can define any type of context-based rule
  o Flexible
    • Can work on many payload formats (e.g., HTML, XML, JSON, etc.)
  o Fast
    • Executed using existing, optimized engine¹
• Each rule is executed on a specific HTTP message
  o Can be filtered based on URL, server or client IP and username
• Several possible masking methods
  o Remove, Replace, Encrypt, etc.

¹ Mozilla SpiderMonkey, https://developer.mozilla.org/en-US/docs/SpiderMonkey
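
To make the idea of a JavaScript rule concrete, here is a minimal sketch of a rule that is filtered by URL and username and applies a masking method to a JSON payload. The rule object shape, the `runRule` helper, the URL pattern, the username and the payload are all hypothetical; the slides do not show the real rule format. “Encrypt” is left out to keep the sketch dependency-free.

```javascript
// Hypothetical masking methods, named after two of the methods listed above.
const methods = {
  remove: () => "",
  replace: (_value, replacement = "*****") => replacement,
};

// Hypothetical rule: applies only to matching messages and masks the
// "email" property of every entry in the payload's "contacts" array.
const rule = {
  filter: { urlPattern: /^\/patients\//, usernames: ["callcenter01"] },
  apply(payload, mask) {
    for (const contact of payload.contacts ?? []) {
      if ("email" in contact) contact.email = mask(contact.email);
    }
    return payload;
  },
};

// A rule runs only on messages that pass its URL and username filter.
function matches(filter, message) {
  return filter.urlPattern.test(message.url) && filter.usernames.includes(message.username);
}

function runRule(rule, message, method) {
  if (!matches(rule.filter, message)) return message.body; // leave other messages untouched
  const payload = JSON.parse(message.body);
  return JSON.stringify(rule.apply(payload, (v) => methods[method](v)));
}

const message = {
  url: "/patients/17",
  username: "callcenter01",
  body: '{"contacts":[{"name":"John Smith","email":"john@example.com"}]}',
};

const replaced = runRule(rule, message, "replace");
const removed = runRule(rule, message, "remove");
const untouched = runRule(rule, { ...message, username: "admin" }, "replace");
```

The same rule body could walk an HTML or XML tree instead of a JSON object; only the parsing and serialization steps would differ.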

Visual rule authoring
• Creating JavaScript rules for individual HTTP messages is very difficult
  o Each displayed element (e.g., table) may originate from several different messages
    • May have different formats
    • May come from AJAX requests
  o Need to use several tools to inspect network traffic, understand the underlying DOM and associate the displayed element with the messages that created it
  o Need to write scripts that are syntactically correct and validate that masking is performed correctly
• Need some tool to facilitate the rule authoring process

“Selection tool”


“Selection tool” close-up
• Web-based tool, implemented in JavaScript
• A floating panel attached to the original application
• Intercepts mouse hovering and click events to enable selection


Technical challenges (1)
• Automatically creating scripts from user selections
• Our solution: We devised an algorithm for detecting the origin of each screen element while the page is loading
  o Monitors all web page modifications, compares the DOM before and after the modifications and captures the changes that were initiated by HTTP messages
  o Creates a map between each visual element and the message it came from, including the message’s URL and the location of the element within the message (e.g., XPath)
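
The origin-detection idea can be sketched without a browser by modeling the DOM as a map from element id to displayed text. Everything here is a simplification made for illustration: the `snapshot` and `diff` helpers, the message objects and the location strings are hypothetical, and the real algorithm works on a live DOM rather than a flat map.

```javascript
// Simplified stand-in for the DOM: element id -> displayed text.
function snapshot(dom) {
  return new Map(dom);
}

// Return the ids of elements that were added or whose text changed.
function diff(before, after) {
  const changed = [];
  for (const [id, text] of after) {
    if (before.get(id) !== text) changed.push(id);
  }
  return changed;
}

// Hypothetical messages: each carries its URL and the fragments it renders,
// together with the location of each fragment inside the message.
const messages = [
  { url: "/app/page.html", fragments: [{ id: "name", text: "John Smith", location: "/html/body/div[1]" }] },
  { url: "/app/balance.json", fragments: [{ id: "balance", text: "127.50$", location: "$.account.balance" }] },
];

const dom = new Map();
const originMap = new Map(); // element id -> { url, location }

for (const message of messages) {
  const before = snapshot(dom);
  for (const f of message.fragments) dom.set(f.id, f.text); // the page applies the message
  const after = snapshot(dom);

  // Attribute every change observed during this message to the message itself.
  for (const id of diff(before, after)) {
    const fragment = message.fragments.find((f) => f.id === id);
    originMap.set(id, { url: message.url, location: fragment.location });
  }
}
```

With such a map, a click on the “balance” element can be translated into a rule that targets `/app/balance.json` at the recorded location.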

Technical challenges (2)
• Interacting with the target application without changing it
  o Need to catch DOM changes and add listeners for mouse events in the target application
  o Browsers’ same-origin policy prevents pages/frames from different origins from manipulating each other’s DOMs²
    • This prevents the naïve solution of presenting the target application in its own frame within a larger rule-authoring tool page
  o Possible solutions:
    • Browser add-on
    • Standalone tool
    Both require installation on the rule-author’s machine
• Our solution is based on hidden frames and “injecting” the selection tool code into the application messages using the runtime proxy

² J. Ruderman, “The same origin policy”, http://www.mozilla.org/projects/security/components/same-origin.html
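
One way a proxy could “inject” tool code is to add a script tag to HTML responses before they reach the browser. The sketch below shows only that step and rests on assumptions: the script path and the `injectSelectionTool` function are hypothetical, and the hidden-frame part of the authors’ solution is not modeled. Code injected this way runs inside the application’s own page, so the same-origin policy does not stop it from reading the DOM or attaching mouse listeners.

```javascript
// Hypothetical script tag pointing at the selection tool code.
const TOOL_SCRIPT_TAG = '<script src="/__selection_tool__/tool.js"></script>';

// Insert the tool's script tag just before the closing body tag of HTML responses.
function injectSelectionTool(contentType, body) {
  if (!/^text\/html\b/i.test(contentType)) return body; // only HTML pages are modified
  const index = body.search(/<\/body\s*>/i);
  if (index === -1) return body + TOOL_SCRIPT_TAG; // no closing tag: append at the end
  return body.slice(0, index) + TOOL_SCRIPT_TAG + body.slice(index);
}

const html = "<html><body><p>Hi</p></body></html>";
const injected = injectSelectionTool("text/html; charset=utf-8", html);
const json = injectSelectionTool("application/json", '{"a":1}');
```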

Limitations
1. Cannot mask information that does not flow over the network, i.e., information generated on the client side
   o Example: an average that is calculated in the browser using JavaScript
2. Cannot mask information that flows in binary format
   o Examples: images, Java applets, Adobe Flash objects, etc.
3. May fail client-side validation
   o Example: a field that checks for a valid email address
   o Solution: use format-preserving masking techniques
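
The third limitation and its remedy can be illustrated with a toy validator. In this sketch the validator regex and both masking functions are assumptions made for the example; the character substitution shown is not cryptographic, and a real deployment might use format-preserving encryption instead.

```javascript
// Simplified stand-in for a client-side email validator.
const isValidEmail = (s) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(s);

// Naive masking: the result no longer looks like an email address.
const maskNaive = () => "*****";

// Format-preserving sketch: replace letters and digits but keep the length
// and the structural characters ('@', '.'), so the value still validates.
function maskPreservingFormat(value) {
  return value.replace(/[A-Za-z]/g, "x").replace(/[0-9]/g, "0");
}

const original = "john.smith42@example.com";
const naive = maskNaive(original);
const preserved = maskPreservingFormat(original);
```

Here `naive` fails the validator while `preserved` passes it, so a form that re-validates the displayed value keeps working.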


Comparison of approaches (1)
• Rule strength and granularity
  o We compare our context-based approach with content-based rules and database masking, based on 4 criteria:
    • Masking granularity – the ability to mask exactly what is needed
    • Logical rule coverage – the ability to describe a rule by its logical content (e.g., mask only patient emails)
    • Visual rule coverage – the ability to mask all or part of the elements in a given area of the screen
    • Visual screen context – the ability to create rules in the context of the presentation layer

Examples
• Masking granularity:
  o A content-based rule will always mask all phone numbers in the application
    • Cannot mask only patient phone numbers and not physician phone numbers
• Logical rule coverage:
  o At the DB layer, any data item can be specified for masking only once, even if it appears on several pages or has several different formats
    • Cannot support cases where a data item in a table appears in two different contexts, one that should be masked and one that shouldn’t
• Visual rule coverage:
  o Our approach enables masking all items in a given area of the screen, even though there may not be any correlation in the format or database table

Comparison of approaches (2)
• Rule enforcement mechanisms
  o We compare our network-level enforcement with masking at the database level and at the presentation layer (using OCR), based on 3 criteria:
    • Application integrity – effects on the proper functioning of the application
    • Role-based masking – different masking based on user roles
    • Impact of screen complexity – do complex screens make masking more difficult?
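
Role-based masking, one of the criteria above, amounts to choosing what to mask from the identity of the requesting user. The sketch below is a minimal illustration only: the role names, the policy table and the record fields are hypothetical and do not come from the presented system.

```javascript
// Hypothetical policy: the fields each role may see in clear text.
const clearFieldsByRole = {
  accountManager: new Set(["name", "nationalId", "balance"]),
  callCenterAgent: new Set(["name"]),
};

// Return a copy of the record in which every field the role is not
// allowed to see is replaced by a mask string.
function maskForRole(record, role) {
  const allowed = clearFieldsByRole[role] ?? new Set(); // unknown roles see nothing
  const result = {};
  for (const [field, value] of Object.entries(record)) {
    result[field] = allowed.has(field) ? value : "*****";
  }
  return result;
}

const record = { name: "John Smith", nationalId: "35", balance: "127.50$" };
const forManager = maskForRole(record, "accountManager");
const forAgent = maskForRole(record, "callCenterAgent");
const forUnknown = maskForRole(record, "visitor");
```

The same underlying record yields three different views, without keeping a separately masked copy of the data per role.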

Examples
• Application integrity
  o At the DB layer, illegal or missing values can result in “breaking” the application
  o At the network layer, client-side validation or calculations may be compromised
• Impact of screen complexity
  o The difficulty of masking at the presentation layer is directly correlated with screen complexity
    • Overlapping or partially visible windows pose a significant challenge
  o Network-based masking is somewhat affected by application complexity, e.g., a screen constructed from many different messages
    • Masking is still possible, but rule definition is more complicated


Summary
• We showed a hybrid approach that combines context-based rule creation at the presentation level with enforcement at the network level
• This enables:
  o Powerful and flexible rule language
  o Easy and straightforward rule authoring process
  o Minimal performance impact at runtime
• Masking rules are defined in a simple and intuitive manner while navigating the target application and clicking on sensitive areas
• Requires minimal changes to the existing environment – no changes to the application or database


Questions?


Thank you