Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities Put Knoesis Banner Keynote at IEEE BigData 2014 , Oct 28, 2014 Amit Sheth LexisNexis Ohio Eminent Scholar & Exec. Director, The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis ) Wright State, USA

Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

DESCRIPTION

Amit Sheth's keynote at IEEE BigData 2014, Oct 29, 2014. Abstract at: http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm

Page 1: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Put Knoesis Banner

Keynote at IEEE BigData 2014, Oct 28, 2014

Amit Sheth

LexisNexis Ohio Eminent Scholar & Exec. Director,

The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)

Wright State, USA

Page 2: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

2

Thanks: My team (missing Pramod, Hemant, ...)

Collaborators: Clinicians: Dr. William Abrahams (OSU-Wexner), Dr. Shalini Forbis (Dayton Children's), Dr. Sangeeta Agrawal (VA); Valerie Shalin (WSU, cognitive scientist), Payam Barnaghi (U-Surrey), Ramesh Jain (UCI), …

Funding: NSF (esp. IIS-1111183 “SoCS: Social Media Enhanced Organizational Sensemaking in Emergency Response”), AFRL, NIH, industry, …

Page 3: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

3

http://hrboss.com/hiringboss/articles/big-data-infographic

Big Data 2014

Page 4: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

4

Only 0.5% to 1% of the data is used for analysis.

http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode

http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume

Page 5: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

5

Variety – not just structure but modality: multimodal, multisensory

Structured

Unstructured

Semi structured

Audio

Video

Images

Page 6: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

6

Velocity

Fast Data

Rapid Changes

Real-Time/Stream Analysis

Current application examples: financial services, stock brokerage, weather tracking, movies/entertainment and online retail

Page 7: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

7

About 2 billion of the 5+ billion have data connections, so they perform “citizen sensing”. And there are more devices connected to the Internet than the entire human population.

These ~2 billion citizen sensors and 10 billion devices and objects connected to the Internet make this an era of the Internet of Things (IoT) and the Internet of Everything (IoE).

http://www.cisco.com/web/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf

Ever Increasing Connected Devices and People

Page 8: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

8

“The next wave of dramatic Internet growth will come through the confluence of people, process, data, and things — the Internet of Everything (IoE).”

- CISCO IBSG, 2013

http://www.cisco.com/web/about/ac79/docs/innov/IoE_Economy.pdf

Beyond the IoE-based infrastructure, it is the possibility of developing applications that span the Physical, Cyber, and Social worlds that is very exciting.

Internet of Things / Everything : Future Trends

Page 9: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

10

We are still working on the simpler representations of the real-world!

What has not changed?

http://artint.info/html/ArtInt_8.html http://en.wikipedia.org/wiki/Traffic_congestion

solve

represent interpret real-world

simplified representation

compute

Page 10: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

11

What should change?

We need computational paradigms to tap into the rich pulse of the human populace and utilize diverse data

Represent, capture, and compute with richer and fine-grained representations of real-world problems

solve

represent interpret real-world

richer representation

compute

+

Richer representation of traffic observations

Effective solutions

People interpreting a real-world event

Page 11: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

High CO

High Wheeze

Reduced CO level => better asthma control

Horizontal operators (semantic integration) operate on data from heterogeneous sources to create integrated/correlated data streams.

Vertical operators (semantic abstraction) operate on artifacts at each level and raise them to the next level.

Carbon Monoxide

High Luminosity

Wheeze

Luminosity

Low Wheeze

Low Luminosity

High CO influences Wheezing Level (Low/High)

Physical-Cyber-Social Computing for Actionable Insights from Multimodal Data

“a holistic treatment of data, information, and knowledge from the PCS worlds to integrate, correlate, interpret, and provide contextually relevant abstractions to humans.”¹

¹ Amit Sheth, Pramod Anantharam, Cory Henson, “Physical-Cyber-Social Computing: An Early 21st Century Approach,” IEEE Intelligent Systems, vol. 28, no. 1, pp. 78-82, Jan.-Feb. 2013. http://doi.ieeecomputersociety.org/10.1109/MIS.2013.20

12

Page 12: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

13

I will use applications in 3 domains to demonstrate

• Healthcare: ADHF, Asthma, GI, Dementia
– Using kHealth system

• Traffic Analytics:
– Understanding traffic flow

• Social Media Analysis:
– Crisis coordination using Twitris

Page 13: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

14

http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/

MIT Technology Review, 2012

The Patient of the Future

Page 14: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

15

Asthma: A Multi-faceted and Symptomatically Variable Health Challenge

Personal level Signals

Public level Signals

Population level Signals

“… survey indicates that adult patients and caregivers of pediatric patients report variability in asthma symptoms over time, even when asthma medications are taken.”¹

¹ Marcus, Philip, Kevin R. Murphy, Abid Rahman, and Christopher D. O’Brien. "Intrapatient symptom variability in adults and children with asthma: Results of a survey." Advances in Therapy 22, no. 5 (2005): 488-497.

Page 15: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

16

Asthma: Actionable Information

How is my Asthma control?

Should I take additional medication today?

How can I reduce my asthma attacks at home?

Far better an approximate answer to the right question, which is often vague, than the exact answer to the wrong question, which can always be made precise.

-- John Tukey, Ann. Math. Stat. 33 (1962)

Page 16: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

17

Personal level Signals

Public level Signals

Population level Signals

Domain Knowledge

Asthma: Challenges in Heterogeneity, Variability, and Personalization

http://www.tuberktoraks.org/managete/fu_folder/2011-03/html/2011-3-291-311.html

Contextual Personalized Actionable

OR

Page 17: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

18

My 2004-2005 formulation of SMART DATA - Semagix

Formulation of Smart Data strategy providing services for Search, Explore, Notify.

“Use of Ontologies and Data repositories to gain relevant insights”

Page 18: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

19

Smart Data (2014 retake)

Smart data makes sense out of Big data

It provides value by harnessing the challenges posed by the volume, velocity, variety, and veracity of big data, in turn providing actionable information and improving decision making.

Page 19: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

20

OF human, BY human FOR human

Smart data is about extracting value by improving human involvement in data creation, processing, and consumption. It is about (improving) computing for human experience.

Another perspective on Smart Data

Page 20: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

21

Petabytes of Physical (sensory)-Cyber-Social data every day! More on PCS Computing: http://wiki.knoesis.org/index.php/PCS

‘OF human’ : Relevant Real-time Data Streams for Human Experience

Page 21: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Use of Prior Human-created Knowledge Models

22

‘BY human’: Involving Crowd Intelligence in data processing

Crowdsourcing and Domain-expert guided Machine Learning Modeling

Page 22: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

23

Detection of events, such as wheezing sound, indoor temperature, humidity,

dust, and CO level

Weather Application

Asthma Healthcare Application

High CO content at home during day

‘FOR human’ : Improving Human Experience (Smart Health)

Population Level

Personal

Public Health

Action in the Physical World

Luminosity CO level

CO in gush during day time

Page 23: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Electricity usage over a day, device at work, power consumption, cost/kWh,

heat index, relative humidity, and public events from social stream

Weather Application

Power Monitoring Application

24

‘FOR human’ : Improving Human Experience (Smart Energy)

Population Level Observations

Personal Level Observations

Action in the Physical World

Washing and drying has resulted in significant cost since it was done during peak load period. Consider changing this time to night.

Page 24: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

25

Big Data is pervasive. It is Smart Data that matters!

Page 25: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

DATA: Observations from machine and social sensors

KNOWLEDGE: for interpretation of observations

26

ACTIONS: situation awareness useful for decision making

Primary challenge is to bridge the gap between data and actions

Contextualization

Personalization

Page 26: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

27

In the process, engaging both top and bottom brain

“The Theory of Cognitive Modes* emphasizes the constant and close interaction of the top and bottom systems. They don’t work in isolation — or in competition — but seamlessly together.”

“the top part of the brain is involved in setting up plans, controlling movements, registering changes in where objects are located in space, and revising plans when anticipated events do not occur.”

“bottom is involved in classifying and interpreting what we perceive, and allows us to confer meaning on the world.”

*http://brainblogger.com/2013/12/19/top-brain-bottom-brain-part-3-the-theory-of-cognitive-modes/ by G. Wayne Miller and Stephen M. Kosslyn, PhD | December 19, 2013

Page 27: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

28

Can we take inspiration from the ‘Theory of Cognitive Modes’ to develop a computational model?

Mover, Perceiver, Stimulator, Adaptor

http://online.stanford.edu/pgm-fa12

T & B, B, T (T: Top brain, B: Bottom brain)

our baby step toward a computational model for perception

(Machine Perception)

Page 28: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

29

J. McCarthy

M. Weiser

D. Engelbart

J. C. R. Licklider

Toward a symbiotic partnership between machines and people

http://j.mp/k-che

http://knoesis.org/index.php/Computing_For_Human_Experience

Page 29: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

30

RDF OWL

Semantic Sensor Networks (SSN)

How are machines supposed to integrate and interpret sensor data?

Page 30: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

31

Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K., Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).

W3C Semantic Sensor Network Ontology

Page 31: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

32

Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K., Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).

W3C Semantic Sensor Network Ontology

Page 32: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

SSN Ontology

Intellego

0 Raw Data [in TEXT], e.g., number: “150”

1 Annotated Data [in RDF], e.g., label: Systolic blood pressure of 150 mmHg

2 Interpreted data (deductive) [in OWL], e.g., threshold: Elevated Blood Pressure

3 Interpreted data (abductive) [in OWL], e.g., diagnosis: Hyperthyroidism

less useful … more useful

33

Levels of Abstraction
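The first three levels above can be sketched in a few lines of Python. This is only an illustration, not the Intellego implementation: the `Annotated` class, the function names, and in particular the 140 mmHg cut-off are assumptions (the slide gives 150 mmHg as an elevated reading but states no threshold).

```python
from dataclasses import dataclass

# Assumed cut-off for illustration only; the slide does not state one.
ELEVATED_SYSTOLIC_MMHG = 140


@dataclass
class Annotated:
    """Level 1: a raw value with a label and a unit attached."""
    property_name: str
    value: float
    unit: str


def annotate(raw: str) -> Annotated:
    """Level 0 -> 1: attach meaning to a bare number."""
    return Annotated("systolic blood pressure", float(raw), "mmHg")


def interpret_deductive(obs: Annotated) -> str:
    """Level 1 -> 2: apply a threshold rule to the annotated value."""
    if obs.property_name == "systolic blood pressure" and obs.value >= ELEVATED_SYSTOLIC_MMHG:
        return "ElevatedBloodPressure"
    return "NormalBloodPressure"


level0 = "150"                        # raw data
level1 = annotate(level0)             # annotated data
level2 = interpret_deductive(level1)  # interpreted data (deductive)
print(level1)
print(level2)
```

Level 3 (abductive interpretation, e.g., a diagnosis) needs prior knowledge that links observed properties to candidate explanations, so it does not reduce to a single threshold rule like the one here.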

Page 33: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

34

… and do it efficiently and at scale

What if we could automate this interpretation of Data?

Page 34: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

35

Making sense of sensor data with

Henson et al., “An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web,” Applied Ontology, 2011

Page 35: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

36

People are good at making sense of sensory input

What can we learn from cognitive models of perception?

The key ingredient is prior knowledge

Page 36: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

37

* based on Neisser’s cognitive model of perception

ObserveProperty

PerceiveFeature

Explanation

Discrimination

1

2

Translating low-level signals into high-level knowledge

Focusing attention on those aspects of the environment that provide useful information

Prior Knowledge

Convert a large number of observations to semantic abstractions that provide insights and translate into decisions

Perception Cycle*

Page 37: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

38

To enable machine perception, Semantic Web technology is used to integrate sensor data with prior knowledge on the Web

W3C SSN XG 2010-2011, SSN Ontology

Page 38: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

39

W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph

Prior knowledge on the Web

Page 39: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

40

W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph

Prior knowledge on the Web

Page 41: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

42

Inference to the best explanation

• In general, explanation is an abductive problem, and hard to compute

Finding the sweet spot between abduction and OWL

• Single-feature assumption* enables use of OWL-DL deductive reasoner

* An explanation must be a single feature which accounts for all observed properties

Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building

Representation of Parsimonious Covering Theory in OWL-DL

Explanation

Page 42: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

43

ExplanatoryFeature ≡ ∃ssn:isPropertyOf⁻.{p1} ⊓ … ⊓ ∃ssn:isPropertyOf⁻.{pn}

elevated blood pressure

clammy skin

palpitations

Hypertension

Hyperthyroidism

Pulmonary Edema

Observed Property Explanatory Feature

Explanatory Feature: a feature that explains the set of observed properties

Explanation
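Under the single-feature assumption, the definition above amounts to a set-containment check: a feature is explanatory if every observed property is one of its properties. The sketch below illustrates this with plain Python sets rather than an OWL-DL reasoner. The `KNOWLEDGE` dictionary is a hypothetical stand-in for the prior knowledge; the slide lists these properties and features but its edges are not recoverable from the text, so the links here are assumptions.

```python
# Hypothetical prior knowledge: feature -> properties it can account for.
# The edges are assumed for illustration; they are not taken from the slide.
KNOWLEDGE = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def explanatory_features(observed, knowledge=KNOWLEDGE):
    """Return features that account for ALL observed properties."""
    observed = set(observed)
    return {feature for feature, props in knowledge.items() if observed <= props}


one = explanatory_features({"elevated blood pressure"})
two = explanatory_features({"elevated blood pressure", "palpitations"})
three = explanatory_features({"elevated blood pressure", "palpitations", "clammy skin"})
print(sorted(one))
print(sorted(two))
print(sorted(three))
```

Each additional observation can only shrink the candidate set, which is why choosing what to observe next matters.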

Page 43: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

44

ObserveProperty

PerceiveFeature

Explanation

Discrimination2

Focusing attention on those aspects of the environment that provide useful information

Discrimination is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features

Discrimination

Page 44: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

45

ExpectedProperty ≡ ∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ∃ssn:isPropertyOf.{fn}

elevated blood pressure

clammy skin

palpitations

Hypertension

Hyperthyroidism

Pulmonary Edema

Expected Property Explanatory Feature

Expected Property: would be explained by every explanatory feature

Discrimination

Page 45: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

46

NotApplicableProperty ≡ ¬∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ¬∃ssn:isPropertyOf.{fn}

elevated blood pressure

clammy skin

palpitations

Hypertension

Hyperthyroidism

Pulmonary Edema

Not Applicable Property Explanatory Feature

Not Applicable Property: would not be explained by any explanatory feature

Discrimination

Page 46: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

47

DiscriminatingProperty ≡ ¬ExpectedProperty ⊓ ¬NotApplicableProperty

elevated blood pressure

clammy skin

palpitations

Hypertension

Hyperthyroidism

Pulmonary Edema

Discriminating Property Explanatory Feature

Discriminating Property: is neither expected nor not-applicable

Discrimination
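The three property classes used in discrimination map directly onto set operations: expected properties are the intersection over the current explanatory features, not-applicable properties lie outside their union, and discriminating properties are whatever remains. The sketch below uses plain Python sets instead of OWL-DL. The `KNOWLEDGE` edges and the function name are assumptions for illustration, not the slide's actual graph.

```python
# Hypothetical prior knowledge: feature -> properties it can account for.
KNOWLEDGE = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def classify_properties(explanatory, knowledge=KNOWLEDGE):
    """Split all known properties into expected, not-applicable, discriminating."""
    if not explanatory:
        raise ValueError("need at least one explanatory feature")
    all_props = set().union(*knowledge.values())
    per_feature = [knowledge[f] for f in explanatory]
    expected = set.intersection(*per_feature)        # explained by every feature
    not_applicable = all_props - set().union(*per_feature)  # explained by none
    discriminating = all_props - expected - not_applicable  # explained by some
    return expected, not_applicable, discriminating


expected, not_applicable, discriminating = classify_properties(
    {"Hyperthyroidism", "Pulmonary Edema"}
)
print(sorted(expected), sorted(not_applicable), sorted(discriminating))
```

With these assumed edges, only "clammy skin" separates the two remaining candidates, so it is the one property worth observing next.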

Page 47: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

48

Semantic scalability: Resource savings of abstracting sensor data

Orders of magnitude resource savings for generating and storing relevant abstractions vs. raw observations.

Relevant abstractions

Raw observations

Page 48: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Qualities: High BP, Increased Weight

Entities: Hypertension, Hypothyroidism

kHealth

Machine Sensors

Personal Input

EMR/PHR

Comorbidity risk score e.g., Charlson Index

Longitudinal studies of cardiovascular risks

- Find risk factors
- Validation
  - domain knowledge
  - domain expert

Find contribution of each risk factor

Risk Assessment Model

Current Observations
- Physical
- Physiological
- History

Risk Score (e.g., 1 => continue, 3 => contact clinic)

Model Creation

Validate correlations

Historical observations, e.g., EMR, sensor observations

49

Risk Score: from Data to Abstraction and Actionable Information

Page 49: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

50

Use of OWL reasoner is resource intensive (especially on resource-constrained devices), in terms of both memory and time

• Runs out of resources with prior knowledge >> 15 nodes

• Asymptotic complexity: O(n³)

How do we implement machine perception efficiently on a resource-constrained device?

Page 50: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

51

intelligence at the edge

Approach 1: Send all sensor observations to the cloud for processing

Approach 2: Downscale semantic processing so that each device is capable of machine perception

Page 51: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

52

0101100011010011110010101100011011011010110001101001111001010110001101011000110100111

Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning

Henson et al., “An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices,” ISWC 2012.

Efficient execution of machine perception
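As a rough illustration of the idea, prior knowledge can be packed into integers used as bit vectors, so that the "explains all observed properties" check becomes a single AND and compare per feature. This is a simplified sketch, not the algorithm from the cited paper: the property-to-bit assignment, the `FEATURES` edges, and the function names are all assumptions.

```python
# Assumed bit positions: one bit per property.
PROPERTIES = ["elevated blood pressure", "clammy skin", "palpitations"]
BIT = {prop: 1 << i for i, prop in enumerate(PROPERTIES)}

# Hypothetical prior knowledge: feature -> properties it can account for.
FEATURES = {
    "Hypertension": ["elevated blood pressure"],
    "Hyperthyroidism": ["elevated blood pressure", "clammy skin", "palpitations"],
    "Pulmonary Edema": ["elevated blood pressure", "palpitations"],
}


def encode(props):
    """Encode a collection of properties as an integer bit vector."""
    mask = 0
    for prop in props:
        mask |= BIT[prop]
    return mask


FEATURE_MASKS = {name: encode(props) for name, props in FEATURES.items()}


def explain(observed_mask):
    """Features whose bit vector covers every observed bit."""
    return [name for name, mask in FEATURE_MASKS.items()
            if (observed_mask & mask) == observed_mask]


observed = encode(["elevated blood pressure", "palpitations"])
print(bin(observed), explain(observed))
```

Each check is a constant-time integer operation for small property sets, so the cost of one explanation pass grows with the number of features rather than with reasoner overhead.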

Page 52: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

53

O(n³) < x < O(n⁴)

O(n)

Efficiency Improvement

• Problem size increased from 10s to 1000s of nodes

• Time reduced from minutes to milliseconds

• Complexity growth reduced from polynomial to linear

Evaluation on a mobile device

Page 53: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

54

Semantic Perception for smarter analytics: 3 ideas to take away

1 Translate low-level data to high-level knowledge

Machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making

2 Prior knowledge is the key to perception

Using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web

3 Intelligence at the edge

By downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices

Page 54: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

kHealth: Knowledge-enabled Healthcare

Applied to ADHF, Asthma, GI, and Dementia

55

Page 55: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Brief Introduction Video

Page 56: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

57

Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information

canary in a coal mine

Empowering Individuals (who are not Larry Smarr!) for their own health

kHealth: knowledge-enabled healthcare

Page 57: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

60

25 million: people in the U.S. are diagnosed with asthma (7 million are children)¹

300 million: people suffering from asthma worldwide²

$50 billion: spent on asthma alone in a year²

155,000: hospital admissions in 2006³

593,000: emergency department visits in 2006³

¹ http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/

² http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html

³ Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.

Asthma: Severity of the problem

Page 58: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

61

What can we do to avoid an asthma episode?

Real-time health signals from personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arriving continuously in fine grained samples potentially with missing information and uneven sampling frequencies.

Variety, Volume, Veracity, Velocity

Value: What risk factors influence asthma control? What is the contribution of each risk factor?

Semantics: Understanding relationships between health signals and asthma attacks for providing actionable information

WHY Big Data to Smart Data: Asthma example

Page 59: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

kHealth: Health Signal Processing Architecture

Personal level Signals

Public level Signals

Population level Signals

Domain Knowledge

Risk Model

Events from Social Streams

Take Medication before going to work

Avoid going out in the evening due to high pollen levels

Contact doctor

Analysis

Personalized Actionable Information

Data Acquisition & aggregation

62

Page 60: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

63

Asthma Domain Knowledge

Domain Knowledge

ICS= inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA= inhaled short-acting beta2-agonist ; *consider referral to specialist

Asthma Control and Actionable Information

Page 61: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

64

Patient Health Score (diagnostic)

Risk assessment model

Semantic Perception

Personal level Signals

Public level Signals

Domain Knowledge

Population level Signals

GREEN: well controlled

YELLOW: not well controlled

RED: poorly controlled

How controlled is my asthma?

Page 62: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

65

Population Level

Personal

Wheeze – Yes

Do you have tightness of chest? – Yes

Observations Physical-Cyber-Social System Health Signal Extraction Health Signal Understanding

<Wheezing=Yes, time, location>

<ChestTightness=Yes, time, location>

<PollenLevel=Medium, time, location>

<Pollution=Yes, time, location>

<Activity=High, time, location>

Wheezing, ChestTightness, PollenLevel, Pollution, Activity

Wheezing, ChestTightness, PollenLevel, Pollution, Activity, RiskCategory

<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>

<2, 1, 1, 3, 1, RiskCategory>

<2, 1, 1, 3, 1, RiskCategory>

<2, 1, 1, 3, 1, RiskCategory>

<2, 1, 1, 3, 1, RiskCategory>

.

.

.

Expert Knowledge

Background Knowledge

tweet reporting pollution level and asthma attacks

Acceleration readings from on-phone sensors

Sensor and personal observations

Signals from personal, personal spaces, and community spaces

Risk Category assigned by doctors

Qualify

Quantify

Enrich

Outdoor pollen and pollution

Public Health

Patient Health Score (diagnostic): Details

Well Controlled – continue

Not Well Controlled – contact nurse

Poorly Controlled – contact doctor
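The "Quantify" step and the mapping from control level to action can be sketched as follows. This is an illustration under stated assumptions, not the kHealth implementation: the slide's example row <2, 1, 1, 3, 1> is consistent with Medium=2, Yes=1, High=3, but the codes for "No" and "Low", and all names in the code, are hypothetical. Assigning a control level from a feature vector is left out, since the slides attribute that to doctors and the risk model.

```python
from enum import Enum

# Assumed ordinal encoding; only Yes=1, Medium=2, High=3 are implied by the slide.
CODES = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Column order taken from the slide's tuples (RiskCategory omitted).
COLUMNS = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]


def quantify(observations):
    """Turn qualitative observations into a numeric feature vector."""
    return [CODES[observations[name]] for name in COLUMNS]


class ControlLevel(Enum):
    """Control levels with the color and action listed on the slides."""
    WELL_CONTROLLED = ("green", "continue")
    NOT_WELL_CONTROLLED = ("yellow", "contact nurse")
    POORLY_CONTROLLED = ("red", "contact doctor")

    @property
    def color(self):
        return self.value[0]

    @property
    def action(self):
        return self.value[1]


observations = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
features = quantify(observations)
print(features)
print(ControlLevel.NOT_WELL_CONTROLLED.color, ControlLevel.NOT_WELL_CONTROLLED.action)
```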

Page 63: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

66

Patient Vulnerability Score (prognostic)

Risk assessment model

Semantic Perception

Personal level Signals

Public level Signals

Domain Knowledge

Population level Signals

Patient health Score

How vulnerable* is my control level today?

*considering changing environmental conditions and current control level

Page 64: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

67

Sensordrone – for monitoring environmental air quality

Wheezometer – for monitoring wheezing sounds

Can I reduce my asthma attacks at night?

What are the triggers? What is the wheezing level?

What is the propensity toward asthma?

What is the exposure level over a day?

Commute to Work

Patient Vulnerability Score (prognostic): Details

Luminosity

CO level

CO in gush during day time

Actionable Information

Personal level Signals

Public level Signals

Population level Signals

What is the air quality indoors?

Page 65: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Sensordrone (carbon monoxide, temperature, humidity)

Node Sensor (exhaled nitric oxide)

68

Sensors

Android Device (w/ kHealth App)

Total cost: ~ $500

kHealth Kit for the application for Asthma management

Along with two sensors in the kit, the application uses a variety of population level signals from the web:

Pollen level Air Quality Temperature & Humidity

Page 66: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

69

Usability and decision support trial

Dr. Shalini G. Forbis, MD, MPH

Page 67: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Preliminary insights from patient data

S1 S2

Sensor data QA data

Number of Observations

Page 68: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

[Figure, two panels. Panel 1: "Did patient take albuterol last night due to cough or wheeze?" (x-axis: 6/2/2014 to 6/12/2014; y-axis: 0 to 1). Panel 2: Exhaled Nitric Oxide (x-axis: 6/2/2014 to 6/13/2014; y-axis: 0 to 0.25).]

Medication (Albuterol) related to decreasing Exhaled Nitric Oxide

Page 69: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

[Figure, two panels. Panel 1: "How much did asthma or asthma symptoms limit patient's activity today?" (x-axis: 6/2/2014 to 6/12/2014; y-axis: 0 to 1). Panel 2: Exhaled Nitric Oxide (x-axis: 6/2/2014 to 6/13/2014; y-axis: 0 to 0.25).]

Activity limitation related to high exhaled Nitric Oxide

Page 70: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

[Figure, two panels. Panel 1: "Has patient had wheeze, chest tightness, or asthma related cough today?" (x-axis: 6/2/2014 to 6/12/2014; y-axis: 0 to 1). Panel 2: Nitric Oxide (x-axis: timestamps shown as serial values 41792.60 to 41803.56; y-axis: 0 to 0.25).]

Low exhaled Nitric Oxide observed with absence of coughing

Page 71: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

[Figure, two panels. Panel 1: Pollen (x-axis: 6/2/2014 to 6/12/2014; y-axis: 0 to 2.5). Panel 2: "How much did asthma or asthma symptoms limit patient's activity today?" (x-axis: 6/2/2014 to 6/12/2014; y-axis: 0 to 1).]

Activity limitation observed with high pollen activity

Page 72: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

75

Two research directions for kHealth asthma with more data…

Root cause analysis: Find Triggers of Asthma

Derive the cause of asthma attacks for a given patient using statistical techniques + knowledge of asthma and its triggers

Action recommendation: Minimize Asthma Attacks

Model actions based on utility theory (cost of actions and their rewards) + knowledge of action consequences
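One way to read "utility theory (cost of actions and rewards)" is as picking the action with the highest expected net utility. The sketch below shows that reading only. The slides describe this as a future research direction and give no model, so the action names (borrowed from the kHealth architecture slide), the probabilities, the benefit, and the costs are all made-up placeholder values, not results.

```python
# Placeholder actions: every number here is hypothetical, for illustration only.
ACTIONS = [
    {"name": "take medication before going to work", "cost": 1.0, "p_avoid_attack": 0.6},
    {"name": "avoid going out in the evening", "cost": 3.0, "p_avoid_attack": 0.5},
    {"name": "do nothing", "cost": 0.0, "p_avoid_attack": 0.0},
]

# Assumed reward for avoiding one attack, in the same arbitrary units as cost.
BENEFIT_OF_AVOIDED_ATTACK = 10.0


def expected_utility(action, benefit=BENEFIT_OF_AVOIDED_ATTACK):
    """Expected reward of the action minus its cost."""
    return action["p_avoid_attack"] * benefit - action["cost"]


def recommend(actions=ACTIONS):
    """Return the action with the highest expected utility."""
    return max(actions, key=expected_utility)


best = recommend()
print(best["name"], expected_utility(best))
```

In a real system the probabilities would have to come from the patient's own data and domain knowledge of action consequences, which is exactly the part the slide leaves open.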

Page 73: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

I will use applications in 3 domains to demonstrate

• Healthcare: ADHF, Asthma, GI
  – Using kHealth system
• Traffic Analytics:
  – Understanding traffic flow
• Social Media Analysis:
  – Crisis coordination using Twitris

Page 74: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities


Understanding traffic flow variations

Page 75: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Big Data to Smart Data: Traffic Management example

Vehicular traffic data from the San Francisco Bay Area, aggregated from on-road sensors (numerical) and incident reports (textual)

http://511.org/

Every-minute updates of speed, volume, travel time, and occupancy, resulting in 178 million link status observations, 738 active events, and 146 scheduled events, with many unevenly sampled observations collected over 3 months.

Variety, Volume, Veracity, Velocity

Value:
• Can we detect the onset of traffic congestion?
• Can we characterize traffic congestion based on events?
• Can we provide actionable information to decision makers?

Semantics: Representing prior knowledge of traffic leads to a focused exploration of this massive dataset
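As a rough illustration of the first Value question on this slide (detecting the onset of congestion), the sketch below flags the start of every sustained run of slow speed readings on one link. The function name, the 0.6 ratio, and the run length of 3 are assumptions made for illustration; the slide does not say how onset was actually detected, and the sketch assumes evenly spaced readings even though the real observations are unevenly sampled.

```python
def congestion_onsets(speeds_mph, free_flow_mph, ratio=0.6, min_run=3):
    """Return the start index of each run of at least `min_run` consecutive
    readings that fall below `ratio * free_flow_mph`."""
    threshold = ratio * free_flow_mph
    onsets = []
    run_start = None
    run_len = 0
    for i, speed in enumerate(speeds_mph):
        if speed < threshold:
            if run_len == 0:
                run_start = i  # remember where this slow run began
            run_len += 1
            if run_len == min_run:
                onsets.append(run_start)  # report each run once, at its start
        else:
            run_len = 0  # run broken by a normal-speed reading
    return onsets


# Made-up per-minute speeds for one link, for illustration only
speeds = [62, 60, 30, 61, 35, 30, 28, 25, 58, 20, 22, 21]
print(congestion_onsets(speeds, free_flow_mph=60))
```

Requiring several consecutive slow readings keeps a single noisy observation from being reported as congestion.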

Page 76: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Semantic Annotation using Background Knowledge

Image Credit: http://traffic.511.org/index

slow-moving-traffic

Domain knowledge in the form of traffic vocabulary

Domain knowledge of traffic flow synthesized from sensor data


Explained-by

Horizontal operator: relating/mapping data from different modalities to a concept (theme) within a spatio-temporal context. Spatial context even includes what it means to have slow traffic for the type of road.
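One way to picture the horizontal operator is as a join between numerical sensor observations and textual incident reports that share a road link and a time window. The sketch below is a simplified assumption, not the deck's implementation: the class names, the per-road-type thresholds, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical speeds (mph) below which traffic counts as slow for each road type
SLOW_THRESHOLD_MPH = {"freeway": 40.0, "arterial": 15.0}


@dataclass(frozen=True)
class SpeedObservation:
    link_id: str
    road_type: str
    time: datetime
    speed_mph: float


@dataclass(frozen=True)
class IncidentReport:
    link_id: str
    start: datetime
    end: datetime
    text: str


def is_slow_moving(obs: SpeedObservation) -> bool:
    """'Slow' depends on the road type, as the slide's spatial-context remark suggests."""
    return obs.speed_mph < SLOW_THRESHOLD_MPH[obs.road_type]


def explained_by(obs: SpeedObservation, incidents):
    """Return incidents on the same link whose time span covers a slow observation."""
    if not is_slow_moving(obs):
        return []
    return [
        inc for inc in incidents
        if inc.link_id == obs.link_id and inc.start <= obs.time <= inc.end
    ]


# Made-up records, for illustration only
obs = SpeedObservation("link-101", "freeway", datetime(2014, 6, 2, 8, 15), 22.0)
incidents = [
    IncidentReport("link-101", datetime(2014, 6, 2, 8, 0), datetime(2014, 6, 2, 9, 0),
                   "accident blocking right lane"),
    IncidentReport("link-202", datetime(2014, 6, 2, 8, 0), datetime(2014, 6, 2, 9, 0),
                   "road work"),
]
explanations = explained_by(obs, incidents)
print([inc.text for inc in explanations])
```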

Page 77: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

I will use applications in 2 domains to demonstrate

• Healthcare: ADHF, Asthma, GI
  – Using kHealth system
• Traffic Analytics:
  – Understanding traffic flow
• Social Media Analysis:
  – Crisis coordination using Twitris

Page 78: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

[BIG] Ad-hoc Community with Varying but [FEW] Important Intents

Image: http://www.gizmodo.com.au/2012/04/how-we-identify-single-voices-in-a-crowd/

BIG QUESTION: Can these needles be identified in the haystack of massive datasets?

[REQUEST/DEMAND] Me and @CeceVancePR are coordinating a clothing/food drive for families affected by Hurricane Sandy. If you would like to donate, DM us

[OFFER/SUPPLY] Does anyone know how to donate clothes to hurricane #Sandy victims?

Coordination teams want to hear!

Page 79: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Uncoordinated Engagement

• May lead to a second disaster to be managed:
  – Under-supply of required demands
  – Over-supply of not required resources
• Hurricane Sandy example: “Thanks, but no thanks”, NPR, Jan 12 2013

Story link: http://www.npr.org/2013/01/09/168946170/thanks-but-no-thanks-when-post-disaster-donations-overwhelm

Page 80: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Matching requests with offers

• How to volunteer, donate to Hurricane Sandy: <URL>
• If you have clothes to donate to those who are victims of Hurricane Sandy …
• Red Cross is urging blood donations to support those affected <URL>
• I have TONS of cute shoes & purses I want to donate to hurricane victims …
• Does anyone know how to donate clothes to hurricane #Sandy victims?
• Does anyone know of community service organizations to volunteer to help out?

REQUEST (demand): Needs to get something, suggests scarcity
OFFER (supply): Offers or wants to give, suggests abundance
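Once messages carry an intent and a resource type, the matching step itself can be as simple as the sketch below. It assumes intent (REQUEST or OFFER) and resource have already been extracted upstream; the `Message` class, the `match` function, and the resource labels attached to the sample tweets are illustrative assumptions, not the Twitris implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    text: str
    intent: str    # "REQUEST" or "OFFER", assumed to be labeled by an upstream classifier
    resource: str  # assumed resource label, e.g. "clothing" or "blood"


def match(messages):
    """Pair every REQUEST with every OFFER that names the same resource."""
    offers_by_resource = {}
    for msg in messages:
        if msg.intent == "OFFER":
            offers_by_resource.setdefault(msg.resource, []).append(msg)
    pairs = []
    for msg in messages:
        if msg.intent == "REQUEST":
            for offer in offers_by_resource.get(msg.resource, []):
                pairs.append((msg, offer))
    return pairs


# Tweets from the slides; the intent and resource labels here are assumptions
messages = [
    Message("Me and @CeceVancePR are coordinating a clothing/food drive for families "
            "affected by Hurricane Sandy. If you would like to donate, DM us",
            "REQUEST", "clothing"),
    Message("Does anyone know how to donate clothes to hurricane #Sandy victims?",
            "OFFER", "clothing"),
    Message("Red Cross is urging blood donations to support those affected <URL>",
            "REQUEST", "blood"),
]
pairs = match(messages)
for request, offer in pairs:
    print(request.resource, "|", offer.text)
```

The blood request stays unmatched here, which is exactly the kind of under-supply a coordination team would want surfaced.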

Page 81: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Match-making: Assisting Coordination

Diagram labels: CITIZEN SENSORS; VICTIM SITE; RESPONSE TEAMS (including humanitarian org. and ‘pseudo’ responders); DEMAND; SUPPLY; Matchable (twice)

• Want to help animals in #Oklahoma? @ASPCA tells how you can help: http://t.co/mt8l9PwzmOx
• Where do I go to help out for volunteer work around Moore? Anyone know?
• Anyone know where to donate to help the animals from the Oklahoma disaster? #oklahoma #dogs
• If you would like to volunteer today, help is desperately needed in Shawnee. Call 273-5331 for more info

Image: http://offthewallsocial.com/tag/social-media/

Page 82: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Wrapping up: For more on the importance of what we talked about

Two excellent videos
• Vinod Khosla: the Power of Storytelling and the Future of Healthcare
• Larry Smarr: The Human Microbiome and the Revolution in Digital Health

Page 83: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Wrapping up: Take Away

• Big Data is everywhere
  – at individual and community levels, not just limited to corporations
  – with growing complexity: Physical-Cyber-Social
• Analysis is not sufficient
• Need interaction between bottom-up techniques and top-down processing

Page 84: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Wrapping up: Take Away

• Focus on Humans and improve human life and experience with SMART Data.
  – Data to Information to Personally and Contextually Relevant Abstractions (Semantic Perception)
  – Actionable Information (Value from data) to assist and support humans in decision making.
• Focus on Value -- SMART Data
  – Big Data Challenges without the intention of deriving Value is a “Journey without GOAL”.

Page 85: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

Special thanks

Amit Sheth’s PhD students:
• Ashutosh Jadhav*
• Hemant Purohit
• Vinh Nguyen
• Lu Chen
• Pavan Kapanipathi*
• Pramod Anantharam*
• Sujan Perera
• Maryam Panahiazar
• Sarasi Lalithsena
• Shreyansh Batt
• Kalpa Gunaratna
• Delroy Cameron
• Sanjaya Wijeratne
• Wenbo Wang

Special thanks: Pramod. This presentation covers some of the work of my PhD students. Key contributors: Pramod Anantharam, Cory Henson and TK Prasad.

Page 86: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities


• Among top universities in the world in World Wide Web (cf: 10-yr impact, Microsoft Academic Search: among top 10 in June 2014)

• Among the largest academic groups in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications

• Exceptional student success: internships and jobs at top salaries (IBM Watson/Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups)

• 100 researchers including 15 World Class faculty (>3K citations/faculty avg) and ~45 PhD students, practically all funded

• Extensive research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …)

Page 87: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities


Top organization in WWW: 10-yr Field Rating (MAS)

Page 88: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities


Smart Data - How you and I can and will exploit Big Data http://knoesis.org

Page 89: Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities


thank you, and please visit us at

http://knoesis.org

Smart Data - How you and I will exploit Big Data