
USING ROBOTS FOR ANDROID APP TESTING

Shauvik Roy Choudhary
PhD, Georgia Tech
Founder, MoQuality

@shauvik
http://shauvik.com
http://moquality.com

About me

Shauvik Roy Choudhary

PhD from Georgia Tech
Founder, MoQuality

http://shauvik.com

PhD Thesis: Cross-platform Testing & Maintenance of Web & Mobile Apps

Industry: Google, Yahoo!, FullStory, Fujitsu Labs, IBM Research, HSBC, Goldman Sachs

Entrepreneurship: Georgia Tech TI:GER program, VentureLab, ATDC, NSF SBIR

2015 IEEE International Conference on Automated Software Engineering

Test Structure

Setup

Exercise

Verify

Teardown

Testing using a Robot

Setup: start the app

Exercise: inputs ???

Verify: app shall not crash

Teardown: stop the app

Android Monkey Tool

$ adb shell monkey -p <your.package.name> -v <number of events>

Useful Monkey Options

-s <seed> — fix the seed of the pseudo-random event stream, so a run can be reproduced

--throttle <milliseconds> — insert a delay between events

--pct-<event_type> <percent> — adjust the event mix (e.g., --pct-touch)

--ignore-<issue> — keep going past issues (e.g., --ignore-crashes, --ignore-timeouts)
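To see why the -s <seed> option matters, here is a small Python sketch (illustrative only, not Monkey's actual implementation; the event alphabet is hypothetical): a fixed seed makes the pseudo-random event stream reproducible, so a crash found in one run can be replayed.

```python
import random

# Hypothetical event alphabet, standing in for Monkey's touch/motion/system events.
EVENTS = ["touch", "motion", "trackball", "nav", "syskey", "appswitch"]

def monkey_stream(seed, count):
    """Generate a reproducible pseudo-random event sequence, in the spirit of
    Monkey's -s <seed> option: the same seed always yields the same events."""
    rng = random.Random(seed)  # seeded generator, so the run is replayable
    return [rng.choice(EVENTS) for _ in range(count)]

# The same seed reproduces the exact same event sequence, which is
# how a crash found by a random run can be replayed later.
assert monkey_stream(42, 100) == monkey_stream(42, 100)
```
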

Automated Test Input Generation Techniques

Dynodroid (FSE ’13)

A3E (OOPSLA ’13)

SwiftHand (OOPSLA ’13)

DroidFuzzer (MoMM ’13)

Orbit (FASE ’13)

Monkey (2008)

ACTEve (FSE ’12)

GUIRipper (ASE ’12)

JPF-Android (SE Notes ’12)

PUMA (MobiSys ’14)

EvoDroid (FSE ’14)

Null IntentFuzzer (WODA ’14)

IntentFuzzer (2009)

Push-button techniques

Sapienz (ISSTA ’16)

TrimDroid (ICSE ’16)

Goal of Testing:

1. Make the app crash

2. Test different behaviors

Tool Strategies

1. Instrumentation strategy -- App/Platform

2. Events generation strategy -- UI/System

3. Testing strategy -- Black-box/White-box

4. Exploration strategy -- Random/Model-based/Systematic

Exploration Strategy

1. Random Exploration Strategy

Randomly selects an event for exploration
Tools: Monkey, Dynodroid

Advantages
● Efficiently generates events
● Suitable for stress testing

Drawbacks
● Hardly generates specific inputs
● App behavior/coverage agnostic
  ○ might generate redundant events
● Typically no stopping criterion

2. Model-based Exploration Strategy

Use a GUI model of the app to explore systematically
Typically FSMs (states = Activities, edges = Events)

Tools: A3E, SwiftHand, GUIRipper, PUMA, Orbit

Advantages
● Intuitively more effective
● Can reduce redundant events

Drawbacks
● Does not consider events that alter non-GUI state
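A minimal sketch of the idea (the app model below is hypothetical; real tools infer the model dynamically while exercising the app rather than having it up front): a breadth-first traversal of an FSM whose states are Activities and whose edges are events visits each state once, which is how model-based tools cut down on redundant event sequences.

```python
from collections import deque

# Toy GUI model of a hypothetical app: states are Activities,
# edges are the events that move between them (an FSM, as in A3E/SwiftHand).
GUI_MODEL = {
    "MainActivity":     {"tap_settings": "SettingsActivity", "tap_list": "ListActivity"},
    "SettingsActivity": {"tap_back": "MainActivity"},
    "ListActivity":     {"tap_item": "DetailActivity", "tap_back": "MainActivity"},
    "DetailActivity":   {"tap_back": "ListActivity"},
}

def explore(model, start):
    """Breadth-first exploration of the GUI model: each Activity is
    visited exactly once, avoiding redundant event sequences."""
    visited, order = {start}, [start]
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for event, nxt in model[state].items():
            if nxt not in visited:
                visited.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

# Visits every Activity once, in breadth-first order from MainActivity.
print(explore(GUI_MODEL, "MainActivity"))
```
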

3. Systematic Exploration Strategy

Use sophisticated techniques (e.g., symbolic execution & evolutionary algorithms) to systematically explore the app

Tools: ACTEve and EvoDroid

Advantages
● Can explore behavior that is hard to reach by random techniques

Drawbacks
● Less scalable compared to other techniques

Example: to tap inside a widget, symbolic execution collects the constraint

(x_left < x < x_right) ∧ (y_top < y < y_bottom)

SAT Solver

x = 5; y = 10
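A minimal sketch of this step (the function name is ours, and picking the interval midpoint stands in for a real constraint solver such as Z3): given a widget's bounds, produce concrete tap coordinates satisfying the constraint above.

```python
def solve_tap(x_left, x_right, y_top, y_bottom):
    """Return a concrete (x, y) satisfying
    (x_left < x < x_right) and (y_top < y < y_bottom).
    A real tool hands this constraint to a solver; here the
    midpoint is enough to satisfy the strict inequalities."""
    assert x_right - x_left > 1 and y_bottom - y_top > 1, "widget too small"
    x = (x_left + x_right) // 2
    y = (y_top + y_bottom) // 2
    return x, y

# For a widget spanning x in (0, 10) and y in (5, 15):
print(solve_tap(0, 10, 5, 15))  # → (5, 10), the concrete input from the slide
```
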

Automated Test Input Generation Techniques

Name      | No platform instr. | No app instr. | UI events | System events | Exploration strategy | Testing strategy
----------|--------------------|---------------|-----------|---------------|----------------------|-----------------
Monkey    | ✔                  | ✔             | ✔         | ✖             | Random               | Black-box
ACTEve    | ✖                  | ✖             | ✔         | ✔             | Systematic           | White-box
Dynodroid | ✖                  | ✔             | ✔         | ✔             | Random               | Black-box
A3E-DF    | ✔                  | ✖             | ✔         | ✖             | Model-based          | Black-box
SwiftHand | ✔                  | ✖             | ✔         | ✖             | Model-based          | Black-box
GUIRipper | ✔                  | ✖             | ✔         | ✖             | Model-based          | Black-box
PUMA      | ✔                  | ✔             | ✔         | ✖             | Model-based          | Black-box

Experiments

Research Criteria

C1. Ease of use

C2. Android framework compatibility

C3. Code coverage achieved

C4. Fault detection ability

Mobile App Benchmarks

Union of all 68 subject apps, drawn from F-Droid and other open-source repositories

Experimental Setup

Debian host → Ubuntu guest (VirtualBox + Vagrant), 2 cores, 6GB RAM
Android emulators: 4GB RAM

Emulator versions: v2.3 (Gingerbread), v4.1 (Jelly Bean), v4.4 (KitKat)

Tools installed on guest:

● Removed default timeouts

● Default config; no special tuning

Experimental Protocol

● Run each tool for 1 hour on each benchmark
● Repeat 10 times to account for non-deterministic behavior
● Collect results:

○ Coverage report (every 5 min): Emma HTML reports, parsed to extract statement coverage

○ Logcat: parsed to extract unique failure stack traces (regex)
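The failure-extraction step can be sketched as follows (the logcat fragment and the regex are illustrative, not the study's actual parser; real logcat output has more message formats to cope with): extract the exception type plus first stack frame from each crash and de-duplicate, so the same failure seen in several runs counts once.

```python
import re

# A fabricated logcat fragment; real input comes from `adb logcat`.
LOGCAT = """\
E/AndroidRuntime( 1234): FATAL EXCEPTION: main
E/AndroidRuntime( 1234): java.lang.NullPointerException: missing view
E/AndroidRuntime( 1234):     at com.example.Main.onCreate(Main.java:42)
I/ActivityManager(  200): Start proc com.example
E/AndroidRuntime( 5678): FATAL EXCEPTION: main
E/AndroidRuntime( 5678): java.lang.NullPointerException: missing view
E/AndroidRuntime( 5678):     at com.example.Main.onCreate(Main.java:42)
"""

def unique_failures(log):
    """Extract (exception, first stack frame) pairs from logcat and
    de-duplicate them, so one crash signature is counted once."""
    pattern = re.compile(
        r"E/AndroidRuntime\(\s*\d+\): (java\.\S+:.*)\n"
        r"E/AndroidRuntime\(\s*\d+\):\s+(at \S+)")
    return {(exc.strip(), frame) for exc, frame in pattern.findall(log)}

# Two crash dumps, but one unique failure signature.
print(unique_failures(LOGCAT))
```
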

Results

C1. Ease of Use & C2. Android Compatibility

Name            | Ease of use   | OS    | Emulator/Device
----------------|---------------|-------|------------------
Monkey          | NO_EFFORT     | Any   | Any
ACTEve          | MAJOR_EFFORT  | v2.3  | Emulator (custom)
Dynodroid       | NO_EFFORT     | v2.3  | Emulator (custom)
A3E-Depth-first | LITTLE_EFFORT | Any   | Any
SwiftHand       | MAJOR_EFFORT  | v4.1+ | Any
GUIRipper       | MAJOR_EFFORT  | Any   | Emulator
PUMA            | LITTLE_EFFORT | v4.3+ | Any

C3. Overall Code Coverage Achieved

C3. Coverage Analysis by Benchmark App

[Chart: statement coverage per benchmark app (% coverage vs. applications), including DivideAndConquer, RandomMusicPlayer, k9mail, PasswordMakerPro, …]

C3. Code Coverage Achieved Over Time

C4. Fault Detection Ability

Pairwise Comparison: Coverage and Failures

[Charts: pairwise comparison of the tools by coverage achieved and by failures detected]

Observations and Discussion

1. Random testing can be effective (somewhat surprisingly)

2. Strategy makes a difference (in the behaviors covered)

3. System events matter (in addition to UI events): broadcast receivers, intents, SMS, notifications

4. Restarts should be minimized (for efficient exploration)

5. Practical considerations matter (for practical usefulness)

5.1 Manual inputs

5.2 Initial state

Open Issues for Future Work

1. Reproducibility (allow for reproducing observed behaviors)


2. Mocking and sandboxing (support reproducibility, avoid side effects, ease testing)


3. Find problems across platforms (address fragmentation)


Takeaway

● Random approaches can be a good start to automated testing

● No single strategy alone is effective enough to cover all behaviors => a combination is more effective

● Use our test infrastructure: https://moquality.com/robots (coming soon)
