PUMA: Programmable UI Automation for Large-Scale Dynamic Analysis of
Mobile Apps
Shuai Hao, Bin Liu, Suman Nath,
William G.J. Halfond, Ramesh Govindan
Mobile App Explosion 2
[Chart: Number of Apps, y-axis from 0 to 1,200,000]
Source: http://en.wikipedia.org/wiki/Google_Play
1.2 million
App Behaviors are Complex 3
Source: forbes.com
Source: droidforums.net
Source: zdnet.com
Dynamic Analysis 4
These app behaviors can be studied by dynamic analysis.
But given the huge number of apps, we need scalable dynamic analysis methods.
Dynamic Analysis with UI Automation 5
UI-driven app exploration (or Monkey)
Source: http://mttnow.com/ios-automated-ui-testing
Monkey: High-level Idea 6
[Diagram: the Monkey sends UI events to the app and receives UI info back]
Monkey in Dynamic Analysis 7
Example: Check number of Ads on each page
Monkey: Overview 8
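[Diagram: one step of the Monkey]
An App Page
UI Extraction: yields the UI Structure (a Page containing Text, Image, List, Button, Button)
Action Dictionary: Button: Click; Multipage: Scroll; List: Scroll
Decide UI Action: uses the UI Structure, the Action Dictionary, and the set of previously visited pages
UI Action: Click "Continue" Button
Next App Page
The Monkey talks to the app over a UI Extraction Channel and a UI Action Channel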
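The loop in this diagram can be sketched in a few lines of Python. This is only an illustration against a toy, in-memory app, not PUMA's code: the `UIElement` class, the `explore` function, and the `pages`/`transitions` dictionaries are all hypothetical stand-ins for UI extraction and for the effect of an action on a real device.

```python
from dataclasses import dataclass

# Action dictionary from the slide: UI element type -> action the monkey can take.
ACTION_DICTIONARY = {"Button": "click", "List": "scroll", "Multipage": "scroll"}


@dataclass(frozen=True)
class UIElement:
    kind: str   # e.g. "Button", "Text", "Image", "List"
    label: str


def explore(pages, transitions, start):
    """Explore a toy app.

    pages:       page name -> list of UIElement (stands in for UI extraction)
    transitions: (page name, element label) -> next page name
    Returns the visited pages in order and the log of actions taken.
    """
    visited = []   # the set of previously visited pages, kept in visit order
    log = []
    frontier = [start]
    while frontier:
        name = frontier.pop()
        if name in visited:
            continue
        visited.append(name)
        for element in pages[name]:                        # UI extraction
            action = ACTION_DICTIONARY.get(element.kind)   # decide UI action
            if action is None:
                continue                                   # e.g. Text, Image
            log.append((name, action, element.label))      # perform UI action
            next_page = transitions.get((name, element.label))
            if next_page is not None:
                frontier.append(next_page)
    return visited, log


pages = {
    "home": [UIElement("Text", "Welcome"), UIElement("Button", "Continue"),
             UIElement("Image", "logo")],
    "list": [UIElement("List", "items"), UIElement("Button", "Back")],
}
transitions = {("home", "Continue"): "list", ("list", "Back"): "home"}
visited, log = explore(pages, transitions, "home")
```

Here the only per-app knowledge is the element types; everything else (which page counts as already visited, which element to act on first) is hard-coded, which is the part the rest of the talk argues should be programmable.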
Monkey-based App Analyses 9
AMC (MobiSys’13): Check accessibility of UI elements
DECAF (NSDI’14): Check ad violation on each page
SmartAds (MobiSys’13): Deliver ad based on page content
VanarSena (MobiSys’14): Test whether app crashes
AppsPlayground (CODASPY’13): Detect privacy leak and malware
ContextualFuzzing (MSR-TR’13): Test app crash and performance issue
Monkey Coverage vs. Speed 10
Problem
Monkey may never finish app exploration
[Screenshot labels: "Refresh"; "Updates can also be pushed to device"]
Solution
Optimize app exploration in an analysis-specific way
Example
AMC: uses a structural similarity measure
DECAF: uses a structural similarity measure and an ML-based technique to avoid visiting similar pages
Flexibility in Access to App Info 11
Problem
Analysis may require data that is not available in UI structure
Solution
Instrument app or extract info from environment
Example
SmartAds: uses app instrumentation to obtain in-app context info
VanarSena: uses app instrumentation to control memory and isolated storage
Monkey-based App Analyses 12
Result
Complex and customized code
e.g. DECAF: > 4300 LOC with 70% for monkey
Impact
A tool built for analysis X is very hard to reuse for analysis Y.
Our Goal 13
Provide a programmable way for
monkey-based app analyses.
• so that both analyses X and Y can be implemented easily by customization
Approach 14
Event-Driven UI Automation
[Diagram: three separate tools, each pairing its analysis logic with its own Monkey]
AMC: UI Property + Monkey
DECAF: Ad Fraud + Monkey
SmartAds: Page Content + Monkey
Approach 15
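[Diagram: the three per-tool stacks (AMC: UI Property + Monkey; DECAF: Ad Fraud + Monkey; SmartAds: Page Content + Monkey) are replaced by a single stack]
PUMA: Programmable Monkey, exposing an API
Analyses on top of the API: UI Property, Ad Fraud, Page Content
Event-Driven UI Automation: the monkey delivers events to each analysis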
Framework Requirements 16
System | Exploration Target | Page Transition | Property Checked | Action Taken | Required Instrumentation
AMC | Distinct | UI | Accessibility | - | No
DECAF | Distinct | UI | Ad layout | - | No
SmartAds | All | UI | Page content | - | Yes
A3E | Distinct | UI | None | - | Yes
AppsPlayground | Distinct | UI, text | Info flow | - | Yes
VanarSena | Distinct | UI, text | Crash | Inject fault | Yes
ContextualFuzzing | All | UI | Crash, perf. | Change context | No
Requirement per column:
Exploration Target: Customize the definition of state equivalence
Page Transition: Customize the definition of next state
Property Checked: Specify which properties to check
Action Taken: Change the environment
Required Instrumentation: Specify instrumentation for the app
Key Idea: Event-driven Paradigm 17
Generic monkey loop:
1. Go to initial page
2. Pick a clickable UI element
3. Perform click, user input
4. Effect environment changes, if needed
5. Wait for page done
6. Check page equivalence
7. Need continue?
Events exposed by the loop:
Next Click
Text Input
Modify environment
In-line Analysis
State Equivalence
Terminating App
App-specific Events
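One way to read the event-driven paradigm is as a fixed loop that calls back into analysis-specific handlers. The sketch below is a hypothetical Python rendering of that idea, not PUMAScript and not PUMA's API: the `Analysis` class, its method names, and the toy `pages` dictionary are assumptions made for illustration. It covers four of the events (State Equivalence, Next Click, In-line Analysis, Terminating App) and uses the network usage profiler as the example analysis.

```python
class Analysis:
    """Hypothetical handler set; each method stands for one event of the loop."""

    def state_key(self, page):          # State Equivalence
        return page["name"]

    def next_click(self, candidates):   # Next Click
        return candidates[0] if candidates else None

    def on_page(self, page):            # In-line Analysis
        pass

    def should_stop(self, clicks):      # Terminating App
        return clicks >= 50


def run_monkey(pages, start, analysis):
    """Generic loop: the exploration logic is fixed, the handlers are pluggable."""
    seen, stack, clicks = set(), [start], 0
    while stack and not analysis.should_stop(clicks):
        page = pages[stack.pop()]
        key = analysis.state_key(page)
        if key in seen:                 # equivalent to a page already explored
            continue
        seen.add(key)
        analysis.on_page(page)
        candidates = list(page["links"])
        while candidates:
            target = analysis.next_click(candidates)
            if target is None:
                break
            candidates.remove(target)
            stack.append(target)        # in this toy app a click opens `target`
            clicks += 1
    return seen


class NetworkProfiler(Analysis):
    """Sums traffic per explored page; pages with the same layout are equivalent."""

    def __init__(self):
        self.total_kb = 0

    def state_key(self, page):
        return page["layout"]

    def on_page(self, page):
        self.total_kb += page["kb"]


class AllPagesProfiler(NetworkProfiler):
    """Same analysis, but every distinct page name is its own state."""

    def state_key(self, page):
        return page["name"]


pages = {
    "home": {"name": "home", "layout": "menu", "kb": 400, "links": ["a", "b"]},
    "a": {"name": "a", "layout": "article", "kb": 200, "links": ["home"]},
    "b": {"name": "b", "layout": "article", "kb": 500, "links": []},
}
distinct, everything = NetworkProfiler(), AllPagesProfiler()
run_monkey(pages, "home", distinct)
run_monkey(pages, "home", everything)
```

Changing only `state_key` switches the exploration target between the "Distinct" and "All" rows of the requirements table, while `run_monkey` itself stays untouched.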
Framework Requirements 19
[Table repeated from slide 16, with each column now mapped to a PUMA event]
Exploration Target: State Equivalence
Page Transition: Next Click, Text Input
Property Checked: In-line Analysis
Action Taken: Modify environment
Required Instrumentation: App-specific Events
Example: Network Usage Profiler 20
Count bytes sent and received in HTTP traffic
[Figure: app pages annotated with traffic volumes of 400 KB, 1 MB, 200 KB, and 500 KB]
Events used: State Equivalence, Next Click, App-specific Events
<described later>
State Equivalence 22
Determine whether two pages are equivalent or not
Structural similarity, content similarity, customized
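The slide names structural similarity as one option without defining it. As a rough sketch of what such a check could look like, the Python below compares two UI trees by the cosine similarity of their widget-type path counts. The tree encoding `(type, [children])`, the function names, and the 0.95 threshold are assumptions for illustration, not PUMA's actual measure.

```python
from collections import Counter
from math import sqrt


def structure_features(node, prefix=""):
    """Count every root-to-node path of widget types in a tree of (type, [children])."""
    kind, children = node
    path = f"{prefix}/{kind}"
    counts = Counter([path])
    for child in children:
        counts.update(structure_features(child, path))
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two path-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pages_equivalent(page_a, page_b, threshold=0.95):
    """Treat two pages as the same state when their structures are similar enough."""
    similarity = cosine_similarity(structure_features(page_a), structure_features(page_b))
    return similarity >= threshold


article_1 = ("Page", [("Text", []), ("Image", []), ("Button", [])])
article_2 = ("Page", [("Text", []), ("Image", []), ("Button", [])])
menu = ("Page", [("List", [("Button", []), ("Button", [])])])
```

Because only widget types are compared, two articles with different text collapse into one state, which is what an analysis exploring "distinct" pages wants; a content-similarity variant would compare the text as well.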
Next Click 24
Determine which UI element to click
Sequential order, max-type order, customized
App-Specific Events 26
Users define named events
Each named event is associated with a codepoint in the binary
Codepoints are selected with a regexp-based filter language
Instrumentation through SIF
S. Hao, D. Li, W. Halfond, R. Govindan. “SIF: Selective Instrumentation Framework for Mobile Applications.” In ACM MobiSys, 2013.
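To make the idea of regexp-selected codepoints concrete, here is a small Python sketch that maps named events to regular expressions over callee signatures and reports which call sites each event would attach to. The event names, the patterns, the `call_sites` format, and the `match_codepoints` helper are all hypothetical; this is not the filter syntax of PUMAScript or SIF.

```python
import re

# Hypothetical named events: event name -> regular expression over callee signatures.
EVENT_FILTERS = {
    "http_request": re.compile(r"^org\.apache\.http\..*\.execute\("),
    "location_read": re.compile(
        r"^android\.location\.LocationManager\.getLastKnownLocation\("
    ),
}


def match_codepoints(call_sites, filters=EVENT_FILTERS):
    """Return {event name: [call sites whose callee matches the event's pattern]}.

    call_sites is a list of (site, callee_signature) pairs, as a static pass
    over the app binary might produce them.
    """
    matches = {name: [] for name in filters}
    for site, callee in call_sites:
        for name, pattern in filters.items():
            if pattern.search(callee):
                matches[name].append(site)
    return matches


call_sites = [
    ("com.example.Main.onCreate:12",
     "org.apache.http.impl.client.DefaultHttpClient.execute(HttpUriRequest)"),
    ("com.example.Main.onResume:30",
     "android.location.LocationManager.getLastKnownLocation(String)"),
    ("com.example.Main.onClick:44", "android.util.Log.d(String,String)"),
]
codepoints = match_codepoints(call_sites)
```

An instrumenter would then insert a callback at each matched site, so that the analysis is notified when, for example, an HTTP request is issued.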
Putting It All Together 27
PUMA
Instrumenter
Interpreter
Monkey
PUMAScript Original
App
Instrumented App
Implementation Challenges 28
Obtaining UI structure
  Intercept uiautomator events
Supporting page scrolling
  Check for scrolling direction
  Scroll in a zig-zag pattern
  Combine partial pages
Detecting page loading completion
  WINDOW_CONTENT_CHANGED event
  Use window-based moving average heuristic
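The slide does not spell out the moving average heuristic. One plausible reading, sketched below in Python, is to bucket the timestamps of WINDOW_CONTENT_CHANGED events and declare the page loaded once the average event count over the last few buckets falls below a threshold. The function name and every parameter value (bucket size, window length, threshold, timeout) are assumptions, not values taken from PUMA.

```python
def load_complete_time(event_times, bucket=0.5, window=3, threshold=1.0, timeout=10.0):
    """Estimate when a page finished loading.

    event_times: timestamps (seconds since the click) of content-changed events.
    Returns the end time of the first run of `window` buckets whose average
    event count is below `threshold`, or `timeout` if that never happens.
    """
    n_buckets = int(timeout / bucket)
    counts = [0] * n_buckets
    for t in event_times:
        index = int(t / bucket)
        if 0 <= index < n_buckets:
            counts[index] += 1
    for end in range(window, n_buckets + 1):
        average = sum(counts[end - window:end]) / window
        if average < threshold:
            return end * bucket
    return timeout


# A burst of UI updates that dies down after about 1.6 s.
events = [0.1, 0.2, 0.3, 0.6, 0.7, 1.1, 1.2, 1.6]
done_at = load_complete_time(events)
```

Averaging over a window rather than reacting to the first quiet bucket keeps a brief pause in the middle of loading from being mistaken for completion.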
Evaluation 29
Research Questions
1. Can PUMA support rapid development of large-scale dynamic app analyses?
2. What insights into the app ecosystem can these analyses provide?
Evaluation 30
Methodology
Implemented 7 analysis studies
Downloaded 3,600 apps from Google Play
Ran on emulators and phones
Implemented Analysis Studies 31
Accessibility Violation Detection: Check accessibility of UI elements
Content-based App Search: Crawl in-app data for search engine
UI Structure Classifier: Classify apps based on UI structure
Ad Fraud Detection: Detect ads placement violation
Network Usage Profiler: Profile in-app HTTP traffic use
Permission Usage Profiler: Profile in-app permission usage
Stress Testing: Inject null HTTP response for test
RQ1: Rapid Development? 32
Accessibility Violation Detection
Content-based App Search
UI Structure Classifier
Ad Fraud Detection
Network Usage Profiler
Permission Usage Profiler
Stress Testing
Finished development and experiments within 2 weeks
RQ1: Rapid Development? 33
PUMA needs < 100 LOC for each study
Study Task | PUMAScript (LOC) | User Code (LOC)
Accessibility violation detection | 11 | 60
Content-based app search | 14 | 0
UI structure classifier | 11 | 0
Ad fraud detection | 11 | 52
Network usage profiler | 19 | 8
Permission usage profiler | 20 | 5
Stress testing | 16 | 5
[Callout: ~4300 LOC]
Cross-Analysis Scaling Optimization 34
[Diagram: UI Property, Ad Fraud, and Page Content analyses each run on their own Monkey API instance and each consume Computing Resources]
With PUMA
Script 1: after page is loaded, check UI property
Script 2: after page is loaded, check ad fraud
Script 3: after page is loaded, crawl page content
Cross-Analysis Scaling Optimization 35
[Diagram: UI Property, Ad Fraud, and Page Content analyses, each on the Monkey API, merged into one run]
With PUMA
after page is loaded:
1. check UI property
2. check ad fraud
3. crawl page content
For 100 apps and 1 phone, it took 8 hours instead of 20 hours to finish.
2.5x speedup
RQ2: Insights into App Ecosystem? 36
Accessibility Violation Detection: Check accessibility of UI elements
Content-based App Search: Crawl in-app data for search engine
UI Structure Classifier: Classify apps based on UI structure
Ad Fraud Detection: Detect ads placement violation
Network Usage Profiler: Profile in-app HTTP traffic use
Permission Usage Profiler: Profile in-app permission usage
Stress Testing: Inject null HTTP response for test
[Highlighted on this slide: Accessibility Violation Detection and Content-based App Search]
RQ2: Insights into App Ecosystem? 37
Accessibility Violation Detection
K. Lee, J. Flinn, T. J. Giuli, B. Noble, C. Peplin. “AMC: Verifying User Interface Properties for Vehicular Applications.” In ACM MobiSys’13.
[Figure: screenshots illustrating the rules checked: Scrolling, Button Distance, Button Size, Word Count; labels on the screenshots: "Start", "Stop"]
RQ2: Insights into App Ecosystem? 38
Accessibility Violation Detection
~55% of our apps violate ≥1 accessibility rule
[Bar chart: one bar per rule (user action per task, word count, button size, button distance, scrolling); x-axis from 0 to 2500]
RQ2: Insights into App Ecosystem? 39
Content-based App Search
RQ2: Insights into App Ecosystem? 40
Content-based App Search
[Figure: ranked app results under three search indexes: Name; Name + Desc.; Name + Desc. + In-app Data (labelled "By PUMA"). Two ranked lists survive in this transcript.]
Rank App
1 How to Draw Singers
2 Love Frames for Free
3 How to Draw Celebrities
4 U-Meme
5 Collage Free
6 Your Voice
Rank App
1 MixerBox
2 Shane Dawson Episodes
3 Hits Music Ringback Tone
4 Mobo Movie News & Stars
5 Mobo Fashion Trends & Deals
6 How to Draw Singers
…
19 Rangers
RQ2: Insights into App Ecosystem? 41
Content-based App Search
#apps returned by search index
Search Index | Mean | Median
Name | 8.31 | 3
Name + Desc. | 199.43 | 66
Name + Desc. + In-app Data | 300.37 | 131
In-app data can greatly improve search relevance
Summary 42
Programmable UI-Automation Framework
Separates exploration and analysis logic
Incorporates a generic monkey
Exposes event-driven programming abstractions
https://github.com/USC-NSL/sif
https://github.com/USC-NSL/puma