Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Barnan Das, Narayanan C. Krishnan, Diane J. Cook


DESCRIPTION

This paper was selected at the ICDM Workshop on Data Mining in Biomedical Informatics and Healthcare (DMBIH), 2013.


Page 1: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Barnan Das, Narayanan C. Krishnan, Diane J. Cook

Barnan Das

School of Electrical Engineering and Computer Science

Washington State University

***Self-portraits by William Utermohlen, an American artist living in London, after he was diagnosed with Alzheimer’s disease in 1995. Utermohlen died from the consequences of Alzheimer’s disease in March 2007.

Page 2: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Worldwide dementia population: 36 million

Payment for care in 2012: $200 billion

Unpaid caregivers: 15 million

Actual and expected number of Americans aged 65 and older with Alzheimer’s:

• 2010: 5.1 million
• 2030: 7.7 million
• 2050: 13.2 million

Source: World Health Organization and Alzheimer’s Association.

Page 3: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes


Page 4: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Automated Prompting


Help with Activities of Daily Living (ADLs)

Page 5: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Existing Work

• Rule-based (temporal or contextual)
• Activity initiation
• RFID and video-input based prompts for activity steps

Our Contribution

• Learning-based
• Sub-activity level prompts
• No audio/video input

Page 6: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Architectural Overview


Page 7: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Data

Raw Data

• 8 daily activities (Sweeping, Cooking, Medication, Watering Plants, etc.)
• 300 elderly participants
• Prompts issued when errors were committed

Clean Data

• 1 activity step = 1 data point
• 17 engineered features (length of activity step, location in apartment, number of sensors involved, distribution of sensor events, etc.)
• Binary class {no-prompt, prompt}, encoded as 0/1

Page 8: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Class Distribution

Prompt: 149

No-prompt: 3831

Total number of data points: 3980
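To make the scale of the imbalance concrete, the short sketch below derives a few ratios from the counts on this slide. It assumes that the smaller count (149) is the prompt class and the larger count (3831) is the no-prompt class; the variable names are illustrative only.

```python
# Counts from the Class Distribution slide. Assumption: prompts are the
# rare class (149) and no-prompt steps are the common class (3831).
n_prompt = 149
n_no_prompt = 3831
total = n_prompt + n_no_prompt

minority_share = n_prompt / total          # fraction of points that are prompts
imbalance_ratio = n_no_prompt / n_prompt   # no-prompt points per prompt point

# Accuracy of a trivial classifier that always predicts "no-prompt".
majority_baseline_accuracy = n_no_prompt / total

print(f"total = {total}")
print(f"minority share = {minority_share:.2%}")
print(f"imbalance ratio = {imbalance_ratio:.1f} : 1")
print(f"always-no-prompt accuracy = {majority_baseline_accuracy:.2%}")
```

Under that assumption, a classifier that never prompts would still score a high accuracy, which suggests why accuracy alone is a poor yardstick for this data.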

Page 9: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Overlapping Classes

Page 10: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Overlapping Classes in Prompting Data


3D PCA Plot of prompting data

Page 11: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Existing Approaches


• Discard data of the overlapping region

• Treat overlapping region as a separate class

• Polynomial combination of existing features

• Using kernel methods
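As an illustration of the third approach in the list above, the following sketch expands a feature vector with polynomial combinations of its entries. The function name `polynomial_features` and the toy input are hypothetical; the slides do not say which degree or library earlier work used.

```python
from itertools import combinations_with_replacement
from math import prod

def polynomial_features(x, degree=2):
    """Append all monomials of total degree 2..degree to feature vector x."""
    expanded = list(x)
    for d in range(2, degree + 1):
        # Each index tuple (i, j, ...) contributes the product x[i] * x[j] * ...
        for idx in combinations_with_replacement(range(len(x)), d):
            expanded.append(prod(x[i] for i in idx))
    return expanded

# Two original features become five: x1, x2, x1^2, x1*x2, x2^2.
print(polynomial_features([2.0, 3.0]))
```

The idea is that classes which overlap in the original feature space may separate more cleanly in the expanded one, at the cost of a rapidly growing number of features.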

Page 12: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Tomek Links

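The slide itself is a figure, so as a reminder of the concept: two points form a Tomek link when they belong to different classes and each is the other's nearest neighbor. The sketch below is a brute-force illustration on hypothetical 2-D data, not the authors' code; the helper names are made up for this example.

```python
import math

def nearest_neighbor(i, points):
    """Index of the point closest to points[i] (Euclidean), excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )

def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    links = []
    for i, j in enumerate(nn):
        # Mutual nearest neighbors with different class labels.
        if i < j and nn[j] == i and labels[i] != labels[j]:
            links.append((i, j))
    return links

# Toy data (hypothetical): 0 = no-prompt, 1 = prompt.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (3.0, 3.0), (3.2, 3.0)]
labels = [0, 0, 0, 1, 1, 1]
links = tomek_links(points, labels)
print(links)
```

Only the pair at indices 2 and 3 qualifies here, because it is the only mutual-nearest-neighbor pair that straddles the two classes. Tomek-link cleaning typically removes the majority-class member of each such pair.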

Page 13: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Cluster-Based Under-Sampling (ClusBUS)

• Form clusters
• Under-sample candidate clusters

Page 14: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Two Critical Components

Choice of Clustering Algorithm: DBSCAN

• Density-based
• Non-spherical clusters
• No need to predetermine number of clusters

Determining Candidate Clusters: Empirically Determined

• Based on minority class dominance (r) in clusters
• Threshold determined by q-quantile values of r
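The under-sampling step can be sketched as follows, given cluster assignments that are assumed to come from a density-based clusterer such as DBSCAN. Several details are assumptions not stated on the slides: r is taken as the fraction of minority points in a cluster, the threshold is the q-quantile of r over clusters that contain both classes, and majority points are dropped from clusters whose r meets the threshold. The function names and the toy data are hypothetical.

```python
from collections import defaultdict

def quantile(values, q):
    """q-quantile (0 <= q <= 1) of values, with linear interpolation."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def clusbus_undersample(cluster_ids, labels, q=0.5, minority=1, noise=-1):
    """Return indices of the points kept after under-sampling candidate clusters."""
    members = defaultdict(list)
    for i, c in enumerate(cluster_ids):
        if c != noise:                      # noise points are left untouched
            members[c].append(i)

    # Minority class dominance r: share of minority points in each cluster.
    r = {c: sum(labels[i] == minority for i in idx) / len(idx)
         for c, idx in members.items()}

    # Assumption: only mixed clusters (0 < r < 1) represent class overlap.
    mixed = [v for v in r.values() if 0 < v < 1]
    if not mixed:
        return list(range(len(labels)))
    tau = quantile(mixed, q)

    drop = set()
    for c, idx in members.items():
        if 0 < r[c] < 1 and r[c] >= tau:    # candidate cluster
            drop.update(i for i in idx if labels[i] != minority)
    return [i for i in range(len(labels)) if i not in drop]

# Toy example: three clusters plus one noise point (cluster id -1).
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, -1]
labels      = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0]
kept = clusbus_undersample(cluster_ids, labels, q=0.5)
print(kept)
```

In the toy example the clusters have r = 0, 0.25, and 0.75, so the median threshold is 0.5 and only the single majority point in the third cluster (index 8) is removed. Every minority point is preserved, which is the main appeal of under-sampling only inside overlapping regions.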

Page 15: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Empirically Determined Threshold


Page 16: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Experimental Setup

Alternative Sampling Method: SMOTE

Classifiers

• C4.5 Decision Tree
• Naïve Bayes
• k-Nearest Neighbor
• SVM

Performance Metrics: TP Rate, G-mean, AUC
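For readers unfamiliar with the three metrics, the sketch below computes them from scratch using their standard definitions: TP rate is recall on the positive (prompt) class, G-mean is the geometric mean of the TP rate and TN rate, and AUC is computed as the probability that a randomly chosen positive is scored above a randomly chosen negative. The labels, predictions, and scores are made-up toy values, not results from the study.

```python
import math

def tp_rate(y_true, y_pred, positive=1):
    """Fraction of positive (prompt) instances predicted as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / sum(t == positive for t in y_true)

def tn_rate(y_true, y_pred, positive=1):
    """Fraction of negative (no-prompt) instances predicted as negative."""
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tn / sum(t != positive for t in y_true)

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of TP rate and TN rate."""
    return math.sqrt(tp_rate(y_true, y_pred, positive) * tn_rate(y_true, y_pred, positive))

def auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise ranking; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy, imbalanced example: 2 prompts among 8 steps.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.2, 0.1, 0.6]

print(f"TP rate = {tp_rate(y_true, y_pred):.3f}")
print(f"G-mean  = {g_mean(y_true, y_pred):.3f}")
print(f"AUC     = {auc(y_true, scores):.3f}")
```

Unlike plain accuracy, all three metrics penalize a classifier that ignores the minority class: a model that never predicts a prompt has a TP rate and G-mean of zero.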

Page 17: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Results (1)

[Figure: two bar charts, TP Rate and G-mean (y-axis from 0 to 1), for the classifiers C4.5, Naïve Bayes, IBk, and SMO, each comparing Original, SMOTE, and ClusBUS.]

Page 18: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Results (2)

[Figure: bar chart of AUC, the area under the ROC curve (y-axis from 0 to 1), for the classifiers C4.5, Naïve Bayes, IBk, and SMO, each comparing Original, SMOTE, and ClusBUS.]

Page 19: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Conclusion

• Automated prompting framed as a classification problem

• Proposed ClusBUS: an under-sampling-based preprocessing technique

• Addressing class overlap also helps address class imbalance

Page 20: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Contact Us

Barnan Das: [email protected]

Dr. Diane Cook: [email protected]

http://casas.wsu.edu

Page 21: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes


Page 22: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Backup Slides


Page 23: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Activities


Sweeping

Cooking

Taking Medication

Watering Plants

Watching DVD

Selecting Outfit

Taking Phone Call

Writing Birthday Card

Page 24: Handling Class Overlap and Imbalance to Detect Prompt Situations in Smart Homes

Feature Generation

Feature #  Feature Name   Description
1          stepLength     Length of the step in time (seconds)
2          numSensors     Number of unique sensors involved with the step
3          numEvents      Number of sensor events associated with the step
4          prevStep       Previous step
5          nextStep       Next step
6          timeActBegin   Time (seconds) elapsed since the beginning of the activity
7          timePrevStep   Time (seconds) difference between the last event of the previous step and the first event of the current step
8          stepsActBegin  Number of steps visited since the start of the activity
9          activityID     Activity ID
10         stepID         Step ID
11         location       Set of features representing sensor frequencies in kitchen, dining room, living room, etc. when the activity was performed
12         Class          Binary class: 1 = "Prompt", 0 = "No-Prompt"
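As a rough illustration of how some of these features could be derived from annotated sensor events, consider the sketch below. The `SensorEvent` record, the sensor IDs, and the choice to measure timeActBegin at the first event of the step are all assumptions made for this example; the slides do not describe the raw data format.

```python
from dataclasses import dataclass

@dataclass
class SensorEvent:
    timestamp: float   # seconds since an arbitrary reference time (assumed)
    sensor_id: str     # hypothetical sensor identifier
    step_id: int       # activity step this event was annotated with

def step_features(step_events, activity_start, prev_step_last_event=None):
    """Compute a few of the listed features for one activity step.

    step_events must be sorted by timestamp and belong to a single step.
    """
    first = step_events[0].timestamp
    last = step_events[-1].timestamp
    return {
        "stepLength": last - first,
        "numSensors": len({e.sensor_id for e in step_events}),
        "numEvents": len(step_events),
        "timeActBegin": first - activity_start,
        # Undefined for the first step of an activity.
        "timePrevStep": None if prev_step_last_event is None
                        else first - prev_step_last_event,
    }

# Hypothetical step: three events from two sensors.
events = [
    SensorEvent(12.0, "M017", 3),
    SensorEvent(15.5, "M017", 3),
    SensorEvent(20.0, "D002", 3),
]
features = step_features(events, activity_start=0.0, prev_step_last_event=10.0)
print(features)
```

The remaining features (prevStep, nextStep, stepsActBegin, activityID, stepID, and the location frequencies) would need the full annotated sequence of steps and are omitted here.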