38
D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion, Zhezhe Chen, Feng Qin, and Dong Xuan The Ohio State University Infocom 2013

D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Embed Size (px)

Citation preview

Page 1: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

D2Taint: Differentiated and Dynamic

Information Flow Tracking on Smartphones for Numerous Data

Sources

Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion, Zhezhe Chen, Feng Qin,

and Dong XuanThe Ohio State University

Infocom 2013

Page 2: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Outline

Introduction

Background

Differentiated and Dynamic Tagging

IFT with Dynamic and Differentiated Tagging

Evaluation Method & Experimental Results

Conclusions

Page 3: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Introduction

Trend Micro reports that over 25,000 Android malware samples were found in June 2012 alone

46%-55% of smartphone apps transmit users’ private information over networks without users’ awareness or consent

Page 4: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Introduction

TaintDroid

extend Dalvik VM to tag smartphone data using 32 possible types based on their origins

Just 32 origins ...

Page 5: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Introduction

D2Taint

track sensitive data from a large number of possible internal and external sources

partition sources into disjoint classes (correspond to different sensitivities)

tag structure updates itself on-the-fly

Page 6: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Background

Information Flow Tracking Basics

Compiler analysis on programs written in special type-safe programming languages

Software instrumentation at source code, bytecode, or binary level

Architecture support for IFT

Page 7: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Background

Pre-defined structure

sources IDs or sensitivity level

Tag propagation policy

a = b + c --> a.tag = max(b.tag, c.tag)

Tag checking

“Passwords” may not be sent over network

Page 8: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

TaintDroid

Page 9: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

TaintDroid

Page 10: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

TaintDroid

Dalvik opcode: http://pallergabor.uw.hu/androidblog/dalvik_opcodes.html

Page 11: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Differentiated and Dynamic Tagging

Source level

different info sources may have different sensitivities in terms of security

Page 12: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Differentiated and Dynamic Tagging

Application level

different amounts of storage space to capture heterogeneous sources and correlations

Page 13: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Differentiated and Dynamic Tagging

User level

adapt to changing information source access patterns

Page 14: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Differentiated and Dynamic Tagging

By examining the source level and application level

Differentiated classes

By examining the user level

Tag dynamics

Page 15: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Differentiated and Dynamic Tagging

Tag structure

Page 16: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Tag Structure

Tag Tag scheme IDscheme ID Class 1Class 1 Class 2Class 2 Class 3Class 3

Class 1 TableClass 1 Table0001 google.com0001 google.com0010 yahoo.com0010 yahoo.com

Tag schemeTag scheme00 2/3/100 2/3/101 2/2/201 2/2/210 4/1/110 4/1/111 3/2/111 3/2/1

Page 17: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Tag Structures Examples

32 bits, 1 class for each bit: TaintDroid

32 bits, 2-bit tag scheme ID, 3 classes, 16/8/6 bits per class, 4/4/2 bits per source

32 bits, 2-bit tag scheme ID, 2 classes, 24/6 bits per class, 3/2 bits per source

Page 18: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Tag Dynamics

Each class can have different length at different times

Perform “on-demand” machine learning based on statistical properties of tag space usage and location information tables’ recent hash values

Adjust its tag structure

Page 19: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Tag Dynamics Issues

Tag Scheme Switching

tag scheme config -> preconfigured

when to switch tag scheme -> on-the-fly

Tag merging

Page 20: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

IFT with Dynamic and Differentiated Tagging

Page 21: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

IFT with Dynamic and Differentiated Tagging

When an app start, the dynamic tagging component first loads two configuration files

tag structure definitions

user-defined classes and known data sources

Page 22: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

IFT with Dynamic and Differentiated Tagging

The dynamic tagging component checks the data source list for each incoming data source by the tag assigner

and tracks incoming sources’ statistics and determines whether it should switch D2Taint to a different tag scheme

Page 23: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Dynamic Tagging Component

Dynamic Tagging Core

tag scheme settings: scheme number, bits per tag, number of classes, and a pointer to the class list

class structure: number of classes in the tag system, bits per hashcode, number of reserved slots for the class, and a text description of the class

Page 24: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Dynamic Tagging Component

Dynamic Tagging Core

use a global location information table list to record all source information

after a certain number of new sources are added into an location information table, D2Taint decides whether to switch the tag scheme based on these new sources

Page 25: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Dynamic Tagging Component

Tag Merger

a.tag = b.tag ⊕ c.tag

if using the same tag scheme?

if using the different tag scheme?

truncate certain significant bits

Page 26: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Information Flow Tracking Component

Taint Map

we do not store tags of method local variables, method arguments, and class instance fields adjacent to them in memory

Page 27: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Taint MapMethod local variables and method arguments

when Dalvik VM allocates a stack frame for a method, our system allocates a stack taint map for it

Class instance fields

to be stored in objects’ taint maps

objects’ taint map is stored immediately after that allocated for the object

Page 28: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Information Flow Tracking Component

Tag Assigner

insert our tag assigner logic into file I/O, network I/O, sensor, and other library functions that read private information

Page 29: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Information Flow Tracking Component

Tag Propagator

interpreted code and native code: same as TaintDroid

also propagate tags via Binder IPC

Page 30: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Information Flow Tracking Component

Tag Checker

trustable sites

Page 31: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Evaluation Method & Experimental Results

Android 2.2 on Nexus One

Select 84 “top free” apps from Google Play

CaffeineMark for benchmark

Page 32: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Evaluation Method & Experimental Results

Real-world

71 out of 84 apps leak information

reveal the paths by which the information is leaked

33 apps transmit data among many various external sources

12 apps leak devices’ IMEIs/EIDs

Page 33: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Evaluation Method & Experimental Results

Performance

-> 9%-> 7.3%-> 16%-> 3%-> 13%

-> 21%

Page 34: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Evaluation Method & Experimental Results

Java Macrobenchmark

Page 35: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Evaluation Method & Experimental Results

CaffeineMark’s memory footprint

Android: 21664 KB

D2Taint: 22528 KB

this test ignored the memory used by location information table, which will dynamically increases as more information sources arrived

Page 36: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Evaluation Method & Experimental Results

Sequential websites

dynamic static

Page 37: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Evaluation Method & Experimental Results

Random websites

dynamic static

Page 38: D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,

Conclusions

A novel IFT tagging strategy

using differentiated and dynamic tagging

dynamic tag structure