NetworkProfiler: Towards Automatic Fingerprinting of Android Apps
Shuaifu Dai, Alok Tongaonkar, Xiaoyin Wang, Antonio Nucci, and Dawn Song
Presented by: Junaed Bin Halim

Outline

• Goal
• Motivation
• System Overview
• Evaluation
• Limitations
• Related Work
• Conclusion / Questions

Goal

• What?
  – Develop a systematic tool
    • Automatically generate network profiles
    • In HTTP traffic
  – To identify apps
• How?
  – By detecting fingerprints / signatures

Motivation

• Why do we need to identify applications?
  – To classify the traffic generated by applications for better network management.
  – Operators gain clear visibility into their network
    • Better security: intrusion detection
    • Better throughput: e.g., prioritizing real-time video over downloads
• What is traffic classification?
  – Categorizing network traffic according to various parameters, e.g., port number or protocol

Motivation (contd.)

• Why only Android apps?
  – Smartphone usage is increasing
    • 488m smartphones vs. 415m PCs in 2011
  – Users install applications (apps) on their smartphones (avg. 26 to 41)
    • Most applications generate network traffic
  – Researchers prefer Android over iOS (openness, availability of tools, etc.)
• Why HTTP traffic?
  – More than 80% of smartphone traffic is HTTP.

Observation

•An app can have many different network behaviors

It is important to cover as many network behaviors as possible

Key Idea: Identify the invariant parts of the flows belonging to an app

Network Profiler System Overview

Fingerprint Extractor : Parser

• Each HTTP request is composed of 3 parts
  – m: method
  – p: page
    • pc: page component
    • fn: file name
  – q: query
    • k-v: key-value pair
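The decomposition above can be illustrated with a minimal Python sketch. It is not the parser from the paper: the function name `parse_request_line`, the rule that a last path segment containing a dot is the file name, and the example request are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

def parse_request_line(request_line):
    """Split an HTTP request line into method (m), page parts (pc, fn), and query (q)."""
    method, target, _version = request_line.split(" ", 2)
    parts = urlsplit(target)
    # Page components are the non-empty path segments.
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: the last segment is a file name only if it contains a dot.
    filename = segments.pop() if segments and "." in segments[-1] else None
    # The query becomes a dict of key-value pairs.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    return {"m": method, "pc": segments, "fn": filename, "q": query}

# Hypothetical request, not taken from the paper's dataset.
parsed = parse_request_line(
    "GET /ads/v2/fetch.php?app_id=com.example.game&fmt=json HTTP/1.1"
)
```

Here `parsed["pc"]` holds the page components, `parsed["fn"]` the file name, and `parsed["q"]` the key-value pairs.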

Fingerprint Extractor: Clusterer

• Uses agglomerative clustering to group HTTP requests by similarity.
• How to find similarity?
  – Use the Jaccard index as a measure of similarity
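For reference, the Jaccard index of two sets is the size of their intersection divided by the size of their union. The short Python sketch below computes it. The example key sets are made up, and returning 1.0 for two empty sets is a convention chosen here, not something the slides specify.

```python
def jaccard(a, b):
    """Jaccard index: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    if not a and not b:
        # Convention assumed here: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical query keys from two requests: 2 shared keys out of 4 distinct keys.
keys_1 = {"app_id", "fmt", "sdk"}
keys_2 = {"app_id", "fmt", "lang"}
similarity = jaccard(keys_1, keys_2)
```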

Fingerprint Extractor: Clusterer (2)

• Cluster
  – Distance between pages, d_p: 1 - similarity
  – Distance between queries, d_q: 1 - similarity
  – Distance between headers, d_h: (d_p + d_q)/2
  – Same cluster if d_h is within the threshold [threshold = 0.6]
  – Merge clusters A and B if cluster C is similar to both.
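A minimal sketch of threshold-based agglomerative clustering over these distances follows. Several choices are assumptions rather than details confirmed by the slides: requests are reduced to sets of page components and query keys, single linkage (closest pair across two clusters) decides a merge, and a header distance at or below 0.6 counts as "same cluster".

```python
def jaccard_distance(a, b):
    """1 - Jaccard index; two empty sets are treated as distance 0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def header_distance(r1, r2):
    """Average of the page distance and the query distance."""
    d_page = jaccard_distance(r1["page"], r2["page"])
    d_query = jaccard_distance(r1["query"], r2["query"])
    return (d_page + d_query) / 2

def cluster(requests, threshold=0.6):
    """Repeatedly merge any two clusters whose closest members are within the threshold."""
    clusters = [[i] for i in range(len(requests))]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage (an assumption): distance of the closest pair.
                d = min(
                    header_distance(requests[a], requests[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if d <= threshold:
                    clusters[i] += clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters

# Hypothetical requests: the first two share a page and most query keys.
requests = [
    {"page": ["ads", "fetch.php"], "query": ["app_id", "fmt"]},
    {"page": ["ads", "fetch.php"], "query": ["app_id", "fmt", "lang"]},
    {"page": ["api", "v1", "scores"], "query": ["team"]},
]
groups = cluster(requests)
```

The result is a list of clusters, each a list of indices into `requests`.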

Fingerprint Extractor: Generation

• Build a state machine for each cluster
• Merge state machines that contain the same hosts

Fingerprint Extractor: Generation(2)

• Query values:
  – Some have the app name embedded
    • Extract keywords from the manifest file
  – Any unique keyword is sufficient
• Third-party traffic:
  – Presence of app_id or key
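The keyword check described above might look like the following Python sketch. Everything concrete in it is hypothetical: the package name, the stopword list used to drop generic tokens, the helper names, and the assumption that the identifying query keys are literally named `app_id` or `key`.

```python
def keywords_from_package(package_name, stopwords=("com", "org", "net", "android", "app")):
    """Derive candidate keywords from a package name found in the manifest."""
    return [t for t in package_name.lower().split(".") if t not in stopwords]

def matching_keywords(query, keywords):
    """Return the keywords that occur inside any query value."""
    values = [v.lower() for v in query.values()]
    return [k for k in keywords if any(k in v for v in values)]

def identifier_keys(query, id_keys=("app_id", "key")):
    """Return the identifier-like keys present in a third-party request."""
    return [k for k in id_keys if k in query]

# Hypothetical app and requests.
keywords = keywords_from_package("com.example.birdgame")
hits = matching_keywords({"bundle": "com.example.birdgame", "fmt": "json"}, keywords)
ids = identifier_keys({"app_id": "a14f3c", "fmt": "json"})
```

A real system would still need to check that a matched keyword is unique to one app, which this sketch does not attempt.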

Droid Driver

• Executes Android apps and collects the network traces
• Consists of two components
  – Random Tester
    • For traffic exchanged with the app provider or a third party
  – Directed Tester
    • For traffic exchanged with a CDN or other hosts
• Runs either component for an app

Droid Driver: Random Tester

• Runs the app randomly
  – Application events are generated at random
  – For applications that generate
    • Traffic to the app server
    • Third-party traffic
      – AdMob, Google DoubleClick
      – Omniture, Google Analytics
• Efficient
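The slides do not say which tool generates the random events. One common way to do it is the UI/Application Exerciser Monkey that ships with the Android SDK. Assuming that tool, the sketch below only builds the command line; the package name, event count, and seed are placeholders.

```python
def monkey_command(package, event_count=500, seed=None):
    """Build an adb command that sends pseudo-random UI events to one package."""
    cmd = ["adb", "shell", "monkey", "-p", package]
    if seed is not None:
        # A fixed seed makes the pseudo-random event sequence repeatable.
        cmd += ["-s", str(seed)]
    cmd += ["-v", str(event_count)]
    return cmd

# Hypothetical package name; nothing is executed here.
command = monkey_command("com.example.game", event_count=1000, seed=42)
```

With a device or emulator attached, the list could be passed to `subprocess.run(command, check=True)` while a packet capture records the resulting traffic.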

Droid Driver: Directed Tester

• Not every app has a unique ID in its traffic
  – In some cases, the unique ID is the developer ID (Angry Birds, ESPN)
• Directed Tester
  – Consists of 3 components
    • Path Recorder
    • Heuristic Path Generator
    • Path Replayer

Droid Driver: Directed Tester(2)

• Path Recorder
  – Records user events in an emulator
• Heuristic Path Generator
  – Generates unexplored paths
• Path Replayer
  – Forces the app to execute a given path
  – Captures the network trace
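The slides do not describe the heuristic itself, so the following Python sketch is only a stand-in that shows the general idea of proposing paths that were not recorded. It assumes a hypothetical graph of UI screens and a depth limit, and it enumerates simple paths with a depth-first search.

```python
def unexplored_paths(graph, start, recorded, max_len=4):
    """Enumerate simple paths from start that are not among the recorded ones."""
    seen = {tuple(p) for p in recorded}
    found = []
    stack = [(start,)]
    while stack:
        path = stack.pop()
        if len(path) > 1 and path not in seen:
            found.append(list(path))
        if len(path) < max_len:
            # Reverse-sorted push so that neighbors are visited in sorted order.
            for nxt in sorted(graph.get(path[-1], []), reverse=True):
                if nxt not in path:
                    stack.append(path + (nxt,))
    return found

# Hypothetical screen graph and recorded seed paths.
ui_graph = {"Main": ["List", "Settings"], "List": ["Detail"], "Detail": [], "Settings": []}
recorded = [["Main", "List"], ["Main", "List", "Detail"]]
candidates = unexplored_paths(ui_graph, "Main", recorded)
```

Each candidate path would then be handed to the replayer, which drives the app along it while the network trace is captured.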

Evaluation

• Downloaded 90k apps
• 70k use the Internet
• Evaluated on 2 kinds of traffic
  – Ad traffic
    • Identified ad libraries from the manifest files of 32k apps
      – 25k use 1 ad library
      – 4k, 1k, 600, and 400 apps use 2, 3, 4, and 5 ad libraries, respectively
      – Fewer than 300 use more than 5
  – Non-ad traffic
    • Considered only 6 apps
      – YouTube, Flixster, ESPN ScoreCenter, CNET News, Pandora, and Zedge

Evaluation: Ad Traffic

Evaluation: Ad Traffic

Evaluation: Non-Ad Traffic

• Manually generated seed action paths.
• Used Directed Testing to generate traffic.
• All ad traffic was excluded.
• The remaining traffic was annotated with the name of the app.

Results

• All applications were successfully identified in the authors' experiment
  – Applies to the apps for which a network profile was generated
  – Not all were verified

Limitations

• Only identifies apps that generate network traffic
  – Most applications do these days
• Only works for HTTP traffic
  – Does not work for HTTPS
  – Does not work for apps that use proprietary protocols (Skype, etc.)
• Uses supervised learning
  – Applications must be known prior to classification.
  – New signatures are needed if the app developer changes the HTTP request structure

Related Work

• Several works have tried to classify traffic
  – Packet inspection
    • Port based
      – Historically, many applications used “well-known” ports
      – Classifier looks only at the port in TCP SYN packets
      – Not all applications have registered a port with IANA
    • Payload based
      – Payload is visible and known to the classifier
      – Does not work if the payload is obfuscated/encrypted
  – Packet inspection is computationally expensive
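Port-based classification, as described above, amounts to a table lookup on the destination port. The Python sketch below uses a small hand-picked subset of IANA well-known ports; the table and the function name are illustrative, not a complete classifier.

```python
# A few IANA well-known ports (illustrative subset only).
WELL_KNOWN_PORTS = {21: "ftp", 22: "ssh", 25: "smtp", 53: "dns", 80: "http", 443: "https"}

def classify_by_port(dst_port):
    """Label a flow by its destination port, or "unknown" if the port is not in the table."""
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")

labels = [classify_by_port(p) for p in (80, 443, 8080)]
```

The last flow shows the weakness noted on the slide: a service on an unregistered or non-standard port is left unlabeled.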

Related Work (contd.)

• Classification based on statistical traffic properties
  – Empirical models of connection characteristics, such as bytes, duration, arrival periodicity
  – Flow duration, packet inter-arrival time, packet size, and byte profile
  – Distributions of packet lengths and packet inter-arrival times
  – Etc.
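As a concrete illustration of such statistical properties, the Python sketch below derives a few per-flow features from a list of packets. The input format (timestamp in seconds, size in bytes) and the feature names are assumptions made for this example.

```python
from statistics import mean

def flow_features(packets):
    """packets: list of (timestamp_seconds, size_bytes) tuples in arrival order."""
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Inter-arrival times are the gaps between consecutive packets.
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "duration": times[-1] - times[0],
        "total_bytes": sum(sizes),
        "mean_packet_size": mean(sizes),
        "mean_inter_arrival": mean(gaps) if gaps else 0.0,
    }

# Hypothetical three-packet flow.
features = flow_features([(0.0, 100), (0.5, 300), (2.0, 200)])
```

A statistical classifier would be trained on feature vectors like this one instead of on packet payloads.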

Related Work (contd.)

• Machine learning
  – Based on statistical properties of the traffic
  – Supervised learning (classification)
  – Unsupervised learning (clustering)

• Different works use different ML algorithms

•See: “A Survey of Techniques for Internet Traffic Classification using Machine Learning” - Thuy T.T. Nguyen, Grenville Armitage

Related Work : Examples

• Discoverer: 2007
  – Automatically reverse engineers the protocol message formats of an application from its network trace
    • Application session: group of messages
    • Message format specification: sequence of fields
    • Common field semantics: length, offset, pointer, cookie, endpoint-address, etc.
  – Discoverer derives the message format specification
    • Using clustering

Related Work: Examples

• EarlyBird: 2004, Polygraph: 2005, Hamsa: 2006
  – Detect previously unknown worms and viruses
  – Generate signatures of worms by identifying common byte sequences in the network traffic
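The general idea of a content signature, a byte sequence shared by all suspicious flows, can be sketched as follows. This is not the algorithm used by EarlyBird, Polygraph, or Hamsa; it is a simple longest-common-substring reduction over made-up payloads.

```python
def longest_common_substring(a, b):
    """Longest contiguous byte sequence shared by a and b (dynamic programming)."""
    best = b""
    # prev[j] is the length of the common suffix of a[:i-1] and b[:j].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > len(best):
                    best = a[i - cur[j]:i]
        prev = cur
    return best

def common_signature(payloads):
    """Greedy reduction: yields a substring common to all payloads, not always the longest one."""
    signature = payloads[0]
    for payload in payloads[1:]:
        signature = longest_common_substring(signature, payload)
    return signature

# Hypothetical payloads that share one invariant byte sequence.
payloads = [
    b"AAAA\x90\x90EVILCODE1234",
    b"zz\x90\x90EVILCODEqq",
    b"\x90\x90EVILCODE",
]
signature = common_signature(payloads)
```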

Related Work (contd.)

• Intrusion detection
  – 2 approaches
    • Signature based
    • Anomaly based

• This paper uses signature-based application classification

• Anomaly-based detection
  – Monitors system activity to classify it as normal or anomalous

Anomaly Based Detection

• Triggers an alarm when some type of unusual behavior occurs on the network.
  – Anything that deviates from “normal” is unusual
• Heuristic based
• Examples:
  – Protocol anomaly: HTTP traffic on a non-standard port
  – Application anomaly: a segment of binary code in a user password.
  – Statistical anomaly: too much UDP compared to TCP traffic.
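The statistical-anomaly example above can be made concrete with a small Python sketch. The baseline UDP share and the tolerance are arbitrary values chosen for illustration; a real detector would learn the "normal" share from observed traffic.

```python
def udp_share(packets):
    """packets: iterable of protocol names such as "tcp" or "udp"."""
    counts = {"tcp": 0, "udp": 0}
    for proto in packets:
        if proto in counts:
            counts[proto] += 1
    total = counts["tcp"] + counts["udp"]
    return counts["udp"] / total if total else 0.0

def is_statistical_anomaly(packets, baseline_share=0.2, tolerance=0.3):
    """Flag traffic whose UDP share exceeds the baseline by more than the tolerance."""
    return udp_share(packets) > baseline_share + tolerance

# Hypothetical traffic samples.
normal = ["tcp"] * 8 + ["udp"] * 2
suspicious = ["tcp"] * 2 + ["udp"] * 8
flags = (is_statistical_anomaly(normal), is_statistical_anomaly(suspicious))
```

Choosing the baseline and tolerance is exactly the "modeling the norm" difficulty that makes anomaly-based detection prone to false alarms.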

Signature Based vs Anomaly Based

• Signature based:
  – Strength: precise if signatures are correctly generated
  – Weakness: requires prior knowledge about the signatures
• Anomaly based:
  – Strength: has the potential to detect new or unknown attacks
  – Weakness: often results in false alarms due to the difficulty of modeling the “norm”

Related Work: Application Profiling

• ProfileDroid: 2012
  – Profiles applications at 4 layers:
    • Static layer, user layer, operating system layer, and network layer
    • Network layer metrics: traffic intensity, origin of traffic, CDN + cloud traffic, Google traffic, third-party traffic, incoming vs. outgoing traffic, number of distinct traffic sources, ratio of HTTP to HTTPS traffic
    • Relies completely on users running apps to generate traffic

Problems with existing works

• Not scalable
• Require user involvement / not automatic
• Coupled with the underlying TCP/application-layer protocol

Inter-component control flow graph

• Used to specify control flow in Android applications
• Model components:
  – Activity
  – Service
  – Broadcast receivers
• External signals: user events
• Internal signals: generated by method calls

See: http://danious.files.wordpress.com/2013/05/dominguezthesis2.pdf

Inter-component control flow graph (contd.)

Inter-component control flow graph (contd.)

Why this paper in CSCE 715?

• Network operators can provide better security for their network
  – Block malicious traffic
  – Apply traffic engineering

•Is that all?

• The smartphone apps you use reveal your personality
  – Cornell University study, 2011
    • Appthusiasts
    • Appcentrics
    • Live Wires
    • Creators
    • Connectors
    • Apprentices
  – App market research firm Flurry Analytics also confirms this

http://www.news.cornell.edu/stories/2011/02/trevor-pinch-links-app-usage-personality-types
http://sachendra.wordpress.com/2011/05/11/the-smartphone-apps-you-use-reveal-your-personality/
http://wallstcheatsheet.com/stocks/can-your-apple-device-app-usage-reveal-your-personality.html/

Conclusion

• NetworkProfiler can identify applications with high precision
  – Uses network traces generated by the apps
  – Needs to know the patterns of the generated traffic beforehand
  – Works only for known applications
• Directed Testing can automate traffic generation from all paths of an application

Questions?