
Page 1: Limitations with Activity Recognition Methodology and Data Sets

Gary Weiss, Fordham University

Jeffrey Lockhart, Cambridge University

Work supported by National Science Foundation Grant No. 1116124.

LIMITATIONS WITH ACTIVITY RECOGNITION METHODOLOGY AND DATA SETS

HASCA 2014, September 13, 2014

Page 2: Limitations with Activity Recognition Methodology and Data Sets


Our WISDM (Wireless Sensor Data Mining) Lab has been working on activity recognition for several years
Have focused on building and deploying a real-world system called Actitracker
Recent work has focused on implementing, analyzing, and using different types of models
When comparing our AR work with other work, we identified several key issues in methodology, which also impact the resulting data sets

GENESIS OF THIS WORK

Page 3: Limitations with Activity Recognition Methodology and Data Sets

Identify some methodological issues and their resulting impact on data sets
Make people aware of these issues
Propose mechanisms for addressing these issues

The largest focus is on model type, but many other factors are considered

The ultimate goal is to generate more diverse data sets and to precisely label their underlying assumptions

OVERVIEW

Page 4: Limitations with Activity Recognition Methodology and Data Sets

Model Type: Personal, Impersonal, Hybrid

Collection Method: Fully Natural, Semi-Natural, Laboratory

Data: Number of Subjects; Population (college, elderly, etc.); Traits (height, weight, income, education, …); Activities (running, jogging, standing, …); Duration (1 hour of data, …)

FACTORS IMPACTING ACTIVITY RECOGNITION

Page 5: Limitations with Activity Recognition Methodology and Data Sets

Sensors: Type (accelerometer, gyroscope, barometer); Sampling rate (20 Hz, 50 Hz, …); Number of sensors; Location of sensors (pocket, belt, wrist, …); Orientation (facing up, down, in, out)

Features: Raw features; Transformed features; Window size

Results: Accuracy; Consistency

(A structured sketch covering these factors follows below.)

FACTORS IMPACTING ACTIVITY RECOGNITION
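The factors above could be reported alongside each released data set in a small, structured form. The sketch below is a minimal example in Python, not a standard proposed in this talk; the field names and example values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Minimal "data set card" covering the factors listed on the two
# "Factors Impacting Activity Recognition" slides.  Illustrative only.
@dataclass
class ARDatasetCard:
    model_type: str                 # "personal", "impersonal", or "hybrid"
    collection_method: str          # "fully natural", "semi-natural", "laboratory"
    num_subjects: int
    population: str                 # e.g., "college students", "elderly"
    subject_traits: Dict[str, str] = field(default_factory=dict)  # height, weight, ...
    activities: List[str] = field(default_factory=list)           # walking, jogging, ...
    duration_per_subject: str = ""                                 # e.g., "1 hour"
    sensor_types: List[str] = field(default_factory=list)         # accelerometer, ...
    sampling_rate_hz: float = 0.0                                  # e.g., 20.0
    sensor_location: str = ""                                      # pocket, belt, wrist, ...
    sensor_orientation: str = ""                                   # facing up/down/in/out
    window_size_s: float = 0.0                                     # sliding-window length
    features: List[str] = field(default_factory=list)             # mean, std, FFT bins, ...

# Example entry with made-up values:
card = ARDatasetCard(
    model_type="impersonal",
    collection_method="semi-natural",
    num_subjects=36,
    population="college students",
    activities=["walking", "jogging", "sitting", "standing"],
    sensor_types=["accelerometer"],
    sampling_rate_hz=20.0,
    sensor_location="front pants pocket",
    window_size_s=10.0,
)
print(card)
```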

Page 6: Limitations with Activity Recognition Methodology and Data Sets

We examined 34 published AR papers; many were smartphone-based

Several papers cover multiple data sets, and thus 38 data sets were analyzed

Several papers utilized multiple model types, and hence 47 distinct models were analyzed

Detailed analysis published in Lockhart's MS thesis: Benefits of Personalized Data Mining Approaches to Human Activity Recognition with Smartphone Sensor Data
A table in the thesis describes each of the factors listed on the prior two slides for each data set
Summary information is described in this presentation

ANALYSIS OF AR RESEARCH

Page 7: Limitations with Activity Recognition Methodology and Data Sets

Personal Models: built from labeled data from the intended user; requires new users to provide training data; our AR results show high accuracy (~98%)

Impersonal/Universal Models: built from a panel of representative users; no training phase required, works "out of the box"; our AR results show modest performance (~76%)

Hybrid Models: built from a panel of users that includes the intended user; requires a training phase for the user; our AR results are much closer to personal models even though the panel includes dozens of users (~95%)

(A partitioning sketch for the three model types follows below.)

BACKGROUND ON MODEL TYPE
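To make the distinction concrete, here is a rough sketch of how training and test data could be partitioned for each model type. The records and user IDs are synthetic placeholders; this is not the WISDM lab's actual evaluation code.

```python
import random

random.seed(0)

# Synthetic stand-in data: (user_id, feature_vector, activity_label).
# In real AR work these would be windowed accelerometer features.
users = [f"user{i}" for i in range(10)]
activities = ["walking", "jogging", "sitting"]
records = [(u, [random.random() for _ in range(6)], random.choice(activities))
           for u in users for _ in range(50)]

target_user = "user0"

# Personal model: train and test only on the intended user's labeled data.
personal = [r for r in records if r[0] == target_user]
cut = int(0.7 * len(personal))
personal_train, personal_test = personal[:cut], personal[cut:]

# Impersonal (universal) model: train on everyone EXCEPT the target user,
# then test on the target user ("leave one user out").
impersonal_train = [r for r in records if r[0] != target_user]
impersonal_test = [r for r in records if r[0] == target_user]

# Hybrid model: the training panel includes the target user's own data,
# e.g., plain cross-validation over the pooled records.  Because the same
# user appears in both train and test, results track personal models.
pooled = records[:]
random.shuffle(pooled)
cut = int(0.9 * len(pooled))
hybrid_train, hybrid_test = pooled[:cut], pooled[cut:]

print(len(personal_train), len(impersonal_train), len(hybrid_train))
```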

Page 8: Limitations with Activity Recognition Methodology and Data Sets

Our results show that personal models perform very well with only small amounts of data per activity, so there is little practical need for hybrid models given that they also require training data

Why are hybrid models often used in research papers? The experimental setup is simple: apply cross-validation to a single pooled data set, with no need to carefully partition the data. With n users, personal and impersonal models require n separate partitions.

Hybrid models are often assumed to approximate impersonal models and are treated as such; in actuality they are much closer to personal models (a grouped cross-validation sketch follows below)

ISSUES WITH HYBRID MODELS
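One practical way to avoid accidentally evaluating a hybrid model when an impersonal one is intended is to make the cross-validation folds respect user boundaries. The sketch below, which assumes scikit-learn is available, contrasts plain k-fold on pooled data (users leak across train and test, i.e., a hybrid setup) with user-grouped folds; it is a generic illustration, not the pipeline used in the surveyed papers.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, KFold

rng = np.random.default_rng(0)
n_users, windows_per_user = 10, 40
X = rng.normal(size=(n_users * windows_per_user, 6))      # synthetic features
y = rng.integers(0, 3, size=n_users * windows_per_user)   # synthetic labels
groups = np.repeat(np.arange(n_users), windows_per_user)  # user id per window

def folds_with_user_overlap(cv, **split_kwargs):
    """Count folds in which the same user appears in both train and test."""
    overlaps = 0
    for train_idx, test_idx in cv.split(X, y, **split_kwargs):
        if set(groups[train_idx]) & set(groups[test_idx]):
            overlaps += 1
    return overlaps

# Plain k-fold on pooled data: every fold mixes users across train and test,
# so the resulting accuracy estimate reflects a hybrid model.
print("KFold folds with user overlap:",
      folds_with_user_overlap(KFold(n_splits=5, shuffle=True, random_state=0)))

# GroupKFold keeps each user's windows entirely in train or entirely in test,
# so the estimate reflects an impersonal (universal) model.
print("GroupKFold folds with user overlap:",
      folds_with_user_overlap(GroupKFold(n_splits=5), groups=groups))
```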

Page 9: Limitations with Activity Recognition Methodology and Data Sets

The hybrid model is the most popular, and authors often claim their results generalize to new users (not true)

In 10 of the 19 hybrid cases there were 10 or fewer users, making the models even closer to personal models (we had 59 users)

We could not determine the model type in 6 cases, which is a serious methodological issue

In 53% of the cases we identify methodological issues (40% + 13%)

Model Type    Count   Percentage
Personal         12         26%
Impersonal       10         21%
Hybrid           19         40%
Unknown           6         13%

ISSUE 1: MODEL TYPE

Analysis of 47 models from 38 data sets

Page 10: Limitations with Activity Recognition Methodology and Data Sets

The number of subjects is often small: 11 studies had fewer than 5 subjects and 12 had fewer than 10

HASC 2010 & 2012 had more users but little data per user

This impacts the ability to evaluate performance: our results show impersonal models are very inconsistent across users, yet 4 studies evaluated universal models with fewer than 8 users and only 2 had at least 30 (a per-user breakdown sketch follows below)

Populations should also be diverse, but many studies focus on college students; personal information (height, weight, etc.) should also be provided

ISSUE 2: # SUBJECTS & DIVERSITY

Chart: Distribution of impersonal model performance across 59 users
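Because impersonal performance varies so widely from user to user, reporting a single overall accuracy can hide the problem. The sketch below, using made-up predictions, shows the kind of per-user breakdown that makes this inconsistency visible.

```python
import random
from collections import defaultdict

random.seed(1)

# Made-up (user, true_label, predicted_label) triples standing in for the
# output of an impersonal model evaluated with leave-one-user-out testing.
users = [f"user{i}" for i in range(8)]
labels = ["walking", "jogging", "sitting"]
results = []
for u in users:
    skill = random.uniform(0.55, 0.95)   # each simulated user gets a different error rate
    for _ in range(100):
        true = random.choice(labels)
        pred = true if random.random() < skill else random.choice(labels)
        results.append((u, true, pred))

per_user = defaultdict(lambda: [0, 0])   # user -> [correct, total]
for user, true, pred in results:
    per_user[user][0] += (true == pred)
    per_user[user][1] += 1

accuracies = {u: correct / total for u, (correct, total) in per_user.items()}
for u, acc in sorted(accuracies.items(), key=lambda kv: kv[1]):
    print(f"{u}: {acc:.1%}")
print(f"mean={sum(accuracies.values()) / len(accuracies):.1%}  "
      f"min={min(accuracies.values()):.1%}  max={max(accuracies.values()):.1%}")
```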

Page 11: Limitations with Activity Recognition Methodology and Data Sets

Many possible distinctions, but there are 3 main categories:
Fully natural: normal daily activities
Semi-natural: subjects operate in their normal environment but may be directed (e.g., asked to walk for 5 minutes)
Laboratory: structured tasks in a controlled environment

The type of collection environment should be documented, since it impacts the results and the ability to replicate them

We have released an AR data set that is semi-natural and our Actitracker data set that is fully natural (except for the self-training phase)

ISSUE 3: COLLECTION METHODOLOGY

Page 12: Limitations with Activity Recognition Methodology and Data Sets

Type of sensor and number of sensors: usually provided, so not an issue

Location: the precise location and orientation are often not specified

Our results indicate these factors are important. For a smartphone, which pants pocket? How is it oriented? Mine is almost always down and in (i.e., screen facing the thigh).

ISSUE 4: SENSORS

Page 13: Limitations with Activity Recognition Methodology and Data Sets

There is usually little choice in how to represent the raw features, aside from the sampling rate

Raw sensor data are transformed into multivariate records using a sliding window and summary features (a windowing sketch follows below); half of the studies do not report the window size

The vast majority of smartphone AR research uses only basic statistics, which yield good results that appear competitive with more complex features (e.g., based on FFT information)

ISSUE 5: FEATURES & FEATURE GENERATION

Chart: Distribution of window sizes for the 52% of studies that report this information
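To illustrate the sliding-window transformation, the sketch below turns a synthetic tri-axial accelerometer stream into windowed records with basic per-axis statistics. The 20 Hz sampling rate and 10-second window are example values only, and real pipelines typically compute many more summary features.

```python
import math
import random
import statistics

random.seed(0)

# Synthetic raw accelerometer stream: (timestamp_ns, x, y, z) at ~20 Hz.
RATE_HZ = 20
samples = [(int(i * 1e9 / RATE_HZ),
            random.gauss(0, 2), random.gauss(9.8, 2), random.gauss(0, 2))
           for i in range(RATE_HZ * 60)]          # one minute of data

WINDOW_S = 10                                      # 10-second window
WINDOW_N = WINDOW_S * RATE_HZ                      # samples per window

def window_features(window):
    """Basic summary statistics per axis for one window of raw samples."""
    feats = {}
    for name, idx in (("x", 1), ("y", 2), ("z", 3)):
        axis = [s[idx] for s in window]
        feats[f"mean_{name}"] = statistics.fmean(axis)
        feats[f"std_{name}"] = statistics.pstdev(axis)
    # Average resultant acceleration over the window.
    feats["avg_resultant"] = statistics.fmean(
        math.sqrt(s[1] ** 2 + s[2] ** 2 + s[3] ** 2) for s in window)
    return feats

# Non-overlapping windows; an overlapping scheme just uses a smaller step.
records = [window_features(samples[i:i + WINDOW_N])
           for i in range(0, len(samples) - WINDOW_N + 1, WINDOW_N)]
print(len(records), "records with features:", list(records[0]))
```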

Page 14: Limitations with Activity Recognition Methodology and Data Sets

It is important that all AR data sets:
Release the raw data
Release the transformed data, or a script to generate it, since descriptions of higher-level features are often not sufficiently well specified

Our data sets include raw and transformed data, and recently we also released the transformation scripts. Interestingly, researchers found inconsistencies between our raw and transformed data and helped us identify several bugs.

FEATURES & FEATURE GENERATION

Page 15: Limitations with Activity Recognition Methodology and Data Sets

Two main data sets:

Activity Prediction: 36 users with semi-natural data collection; all data is labeled with the activity

Actitracker Data: data from our publicly available Actitracker app; the data set will be updated periodically; fully natural data collection, with semi-natural collection for the self-training data; the self-training data is labeled, the remaining data is not

Available from: http://www.cis.fordham.edu/wisdm/dataset.php (a loading sketch follows below)


WISDM ACTIVITY RECOGNITION DATA SETS
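For readers who download the Activity Prediction data from the URL above, a minimal loader might look like the sketch below. It assumes the raw file uses a comma-separated user,activity,timestamp,x,y,z; line format (the format the released WISDM raw files have used), and the file name used here is an assumption, so check the data set's readme before relying on it.

```python
from collections import Counter, namedtuple

Sample = namedtuple("Sample", "user activity timestamp x y z")

def load_wisdm_raw(path):
    """Parse raw lines assumed to look like: user,activity,timestamp,x,y,z;"""
    samples = []
    with open(path) as f:
        for line in f:
            line = line.strip().rstrip(";")
            if not line:
                continue
            parts = line.split(",")
            if len(parts) != 6:
                continue                     # skip malformed lines
            try:
                samples.append(Sample(int(parts[0]), parts[1], int(parts[2]),
                                      float(parts[3]), float(parts[4]),
                                      float(parts[5])))
            except ValueError:
                continue                     # skip lines with bad numeric fields
    return samples

if __name__ == "__main__":
    data = load_wisdm_raw("WISDM_ar_v1.1_raw.txt")   # file name is an assumption
    print(len(data), "samples")
    print(Counter(s.activity for s in data).most_common())
```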

Page 16: Limitations with Activity Recognition Methodology and Data Sets


All activity recognition research should clearly describe the relevant factors and the experimental methodology; we propose a list of factors/issues to include, since many existing studies do not provide important information

We highlight the role of model type and show that many studies do not specify the model type or use hybrid models

Hybrid models are inappropriate in most cases, and many studies assume they approximate impersonal/universal models, which is contradicted by our research


CONCLUSIONS

Page 17: Limitations with Activity Recognition Methodology and Data Sets


Material based on Jeff Lockhart’s MS Thesis

Activity Recognition research was supported by all WISDM Lab members

Funding provided by NSF Grant 1116124


ACKNOWLEDGEMENTS

Page 18: Limitations with Activity Recognition Methodology and Data Sets


Information is available from wisdmproject.com; papers are available under the "About: Publications" tab, including Jeff's MS thesis and:

Jeff Lockhart, Gary Weiss (2014). The Benefits of Personalized Smartphone-Based Activity Recognition Models. In Proc. SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, Philadelphia, PA, 614-622.

Information about our app is available from actitracker.com; the app is available for download from Google Play

Feel free to download our data sets and ask us about our data


MORE INFORMATION