DE-IDENTIFIED BIOMETRIC DATA WAREHOUSE

higi SH llc © 2017 – confidential and proprietary

POWERED BY

James Rapp, Sr. Software Engineer

Daniel Neems, PhD, Data Science Researcher

What is higi?

higi enables consumers to collect and use their biometric data to act on their health by connecting them to partners that want to motivate action.

The largest self-screening connected network in the world.

• Founded in 2012
• 4.5+ million registered account holders
• 78% of the U.S. population lives within 5 miles of a higi station
• 80+ integrated health devices, activity trackers and apps
• 36.5+ million people have used a higi station
• 136+ million miles of activity logged
• 198+ million tests completed across the higi network
• 50+ retailer banners
• Over 11,000 stations nationwide

V 1.0 De-identified Analytics Implementation


V 1.0 De-identified Analytics Issues

• Rigid schema
• Costly
• Brittle
• Poor performance

V 1.5 Analytics Implementation

[Diagram components: Table Storage; Blob Storage; Blob Migration (Cloud Service); Analytics Processor (Cloud Service); Hadoop (HDInsight); Reports (CSV); Automation via Azure Runbook]

V 1.5 De-identified Analytics Issues

• Very costly
• Time-consuming queries
• Brittle
• Rigid report output
• Required dev effort to update
• Hard to debug and maintain

V 2.0 Analytics Goals

• Easy to implement
• Easy to maintain
• Flexible
• Inexpensive
• Performant

V 2.0 Analytics Implementation

[Diagram components: Table Storage; Blob Storage; Blob Migration (WebJob, timer trigger); Blob to S3 Migration (WebJob, event-based); Snowflake; Reporting (Tableau)]
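The two migration stages named on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes data flows from Table Storage into Blob storage on a timer and that each new blob then triggers a copy to S3, which the slide labels suggest but do not spell out. `InMemoryStore`, `timer_triggered_export`, `make_blob_to_s3_handler`, the `raw` prefix, and the sample row are all hypothetical stand-ins; the real jobs ran as Azure WebJobs and their code is not shown in the deck.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a blob container or an S3 bucket."""
    objects: Dict[str, bytes] = field(default_factory=dict)
    listeners: List[Callable[[str], None]] = field(default_factory=list)

    def put(self, key: str, data: bytes) -> None:
        # Store the object, then notify subscribers (models an event trigger).
        self.objects[key] = data
        for notify in self.listeners:
            notify(key)


def timer_triggered_export(rows: List[dict], blob_store: InMemoryStore, blob_name: str) -> None:
    """Stage 1 (assumed): on a schedule, dump a batch of rows as JSON lines into blob storage."""
    payload = "\n".join(json.dumps(row, sort_keys=True) for row in rows).encode("utf-8")
    blob_store.put(blob_name, payload)


def make_blob_to_s3_handler(blob_store: InMemoryStore, s3_store: InMemoryStore, prefix: str):
    """Stage 2 (assumed): whenever a blob is created, copy it unchanged to S3 under a prefix."""
    def on_blob_created(blob_name: str) -> None:
        s3_store.put(f"{prefix}/{blob_name}", blob_store.objects[blob_name])
    return on_blob_created


# Wire the stages together and push one made-up batch through.
blobs = InMemoryStore()
s3 = InMemoryStore()
blobs.listeners.append(make_blob_to_s3_handler(blobs, s3, prefix="raw"))
timer_triggered_export([{"event": "checkin", "station": 1}], blobs, "2017-01-01.jsonl")
```

Keeping the second stage event-based means a new file lands in S3 as soon as it exists in Blob storage, without a second schedule to maintain; loading from S3 into Snowflake is left out of the sketch.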

How does Snowflake help?

• Faster
• Scalable
• Reports customizable by business
• Devs deal with raw data and leave the analysis to our data science team

Overview of Data Operations

• Greg Rumple, Chief Information Officer
• Ross Goglia, MBA, Data Science Product Manager
• Daniel Neems, PhD, Data Scientist
• Khan Siddiqui, MD, Chief Technology Officer, Chief Medical Officer
• Robert Bakos, VP of Software/Engineering

Overview of Data Operations

• Data Operations
• Data Infrastructure
• Data Product
• Data Science
• BI Tools/Dashboards
• Data Studies/Publications
• Machine Learning
• For External Consumption

Data Architecture (1st generation)


Data Silos

Data Architecture (2nd generation)

Data Pipeline in the Cloud

Data Architecture (3rd generation)

Data Warehouse in the Cloud

Snowflake Data Warehouse

• User registration and demographics
• Health (e.g. kiosk) activities
• Fitness activities
• Lifestyle activities (e.g. gym)
• Nutrition activities/purchases
• Challenge creation and joining
• Score updates
• Achievement earning
• Points earning
• Reward redemption
• Connection of 3rd party devices
• Participation in loyalty programs
• User data share opt-ins (via API)
• Community creation and joining
• Friend following
• Social activities (e.g. sharing, inviting)
• Chatter and commenting/liking
• Ad views and clicks
• Survey taking
• Logins across kiosk/web/mobile

Categories: Activity, Integrations, Content, Gamification, Social

Time Saving Scalability

Query on an X-Small Warehouse

Query on an X-Large Warehouse
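The trade-off behind this comparison can be worked through with a little arithmetic. The sketch below uses Snowflake's published credit rates for standard warehouses, where each size up doubles the hourly rate and billing is per second with a 60-second minimum each time a warehouse starts. The runtimes are hypothetical, not measurements from the talk, and the example assumes an ideal 16x speedup, which real queries will not always achieve.

```python
# Credits per hour for standard warehouse sizes; each step up doubles the rate.
CREDITS_PER_HOUR = {"X-Small": 1, "Small": 2, "Medium": 4, "Large": 8, "X-Large": 16}


def credits_used(size: str, running_seconds: float) -> float:
    """Credits consumed by one warehouse run (per-second billing, 60 s minimum)."""
    billed_seconds = max(running_seconds, 60.0)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600.0


# Hypothetical runtimes for the same query on two warehouse sizes.
xsmall_credits = credits_used("X-Small", 1600)  # 1600 s at 1 credit/hour
xlarge_credits = credits_used("X-Large", 100)   # 100 s at 16 credits/hour
```

Under these assumptions both runs cost the same number of credits (about 0.44), so the larger warehouse returns the answer sooner at no extra cost. When the speedup is less than linear, the larger warehouse costs proportionally more.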

Snowflake Data Warehouse


Tableau-Snowflake Demo


Data Science and Machine Learning Approach

[Diagram: Demographic Factor A, Demographic Factor B, Behavioral Factor A and Behavioral Factor B as inputs; High User Engagement or Positive Health Outcomes as the outcome]

Machine Learning Workflow

Predictive analytics and ML models

• retention and engagement prediction
• health-related prediction
• statistically significant drivers of engagement and behavior
• user cluster analysis
• recommendation engines
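To make "retention and engagement prediction" concrete, here is a minimal sketch. It is not the team's method: the deck does not name a model, so plain logistic regression trained by gradient descent is swapped in for illustration. The two features (weekly station visits, whether a tracker is connected) and every data point are synthetic and hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: List[float], b: float, x: List[float]) -> float:
    """Estimated probability that a user with features x is retained."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X: List[List[float]], y: List[int],
                 lr: float = 0.1, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic users: [weekly_station_visits, has_connected_tracker]; label 1 = retained.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [3, 0], [4, 1], [5, 0], [6, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(X, y)
```

The fitted weights give a rough reading of which factors push the predicted probability up or down, in the spirit of the "drivers of engagement" item above. A real analysis would add significance testing and held-out validation.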

Thank you!


[email protected]@higi.com

[email protected]@higi.com
