Data Lakes, Warehouses and Marts, Oh My! · 2019. 7. 27. · Science Center. Original Data...

Data Lakes, Warehouses and Marts, Oh My!

Chris Williams, MD

University of Oklahoma Health Science Center

Original Data Warehouse

Disclosures

In the past 12 months, I have not had any significant financial interest or other relationship with the manufacturers of the products or providers of the services that will be discussed in my presentation.

Jargon Free

Learning Objectives

• What is driving adoption?
• Common roadmap for data wrangling
• Importance of a strategic approach

What is Driving Adoption?

The Why

Source: Gartner (February 2014)

Hospitals are in the information business; some of them choose to act like it.

timoelliott.com

https://www.forbes.com/sites/bernardmarr/2016/06/14/data-driven-decision-making-10-simple-steps-for-any-business/

Data Driven Decisions

1. Start with strategy
2. Narrow your focus
3. What don’t you know?
4. What data do you need?
5. What data do you have?
6. Estimate the costs
7. Collect the data
8. Analyze
9. Distribute
10. Learn

Common roadmap for data wrangling

www.datawatch.com/what-is-data-wrangling/

Data -> Knowledge

https://panoply.io/data-warehouse-guide/data-mart-vs-data-warehouse/

Importing Data from multiple systems

Ingesting Data from:

• EHR
• LIS
• Payor
• Billing
• HR
• Inventory
• Etc… weather?

Importing Data from multiple systems

Data formats:

• Relational (MS SQL, Oracle)
• NoSQL (Caché, MUMPS, document stores)
• Text
• Excel
• CSV
• HL7
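
As a rough illustration of what ingesting from multiple systems involves, the sketch below maps two of the formats listed above (a CSV extract and an HL7 v2 message) onto one shared record layout. The sample feeds, the function names, and the record layout are all hypothetical, and the HL7 handling is deliberately simplified: it reads only the PID segment and ignores escape sequences, repetitions, and custom delimiters.

```python
import csv
import io

# Hypothetical sample feeds; real extracts would arrive as files or interface messages.
BILLING_CSV = "mrn,last_name,first_name\n1001,Doe,Jane\n"
HL7_MESSAGE = (
    "MSH|^~\\&|LIS|LAB|EDW|HOSP|20190727120000||ORU^R01|MSG0001|P|2.3\r"
    "PID|1||1002^^^HOSP^MR||Smith^John\r"
)


def records_from_csv(text, source):
    """Map CSV rows onto the shared record layout."""
    return [
        {
            "source": source,
            "mrn": row["mrn"],
            "last_name": row["last_name"],
            "first_name": row["first_name"],
        }
        for row in csv.DictReader(io.StringIO(text))
    ]


def records_from_hl7(message, source):
    """Pull patient identity from the PID segments of an HL7 v2 message."""
    records = []
    for segment in message.split("\r"):      # segments end with a carriage return
        fields = segment.split("|")          # fields are pipe-delimited
        if fields[0] != "PID":
            continue
        mrn = fields[3].split("^")[0]        # PID-3, first component
        name = fields[5].split("^")          # PID-5: family^given
        records.append(
            {
                "source": source,
                "mrn": mrn,
                "last_name": name[0],
                "first_name": name[1] if len(name) > 1 else "",
            }
        )
    return records


staged = records_from_csv(BILLING_CSV, "billing") + records_from_hl7(HL7_MESSAGE, "lis")
for record in staged:
    print(record)
```

Keeping a `source` field on every record is one simple way to preserve lineage once feeds from different systems are combined.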

Data Warehouse

Typically a relational database that houses vast amounts of transformed and “cleaned” data
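
To make “transformed and cleaned” a little more concrete, here is a minimal sketch of one cleaning step: normalizing dates that arrive in different formats and dropping duplicate rows. The field names (`mrn`, `collected`) and the list of accepted date formats are assumptions for illustration, not a description of any particular warehouse.

```python
from datetime import datetime

# Assumed input formats; a real feed inventory would drive this list.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def to_iso_date(raw):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(rows):
    """Normalize fields and drop duplicates, preserving input order."""
    seen = set()
    cleaned = []
    for row in rows:
        key = (row["mrn"].strip(), to_iso_date(row["collected"]))
        if key[1] is None or key in seen:
            continue  # a real pipeline would route rejects to review, not drop them
        seen.add(key)
        cleaned.append({"mrn": key[0], "collected": key[1]})
    return cleaned


raw_rows = [
    {"mrn": " 1001", "collected": "07/27/2019"},
    {"mrn": "1001", "collected": "2019-07-27"},   # same event, different format
    {"mrn": "1002", "collected": "20190726"},
    {"mrn": "1003", "collected": "not a date"},   # unparseable, rejected
]
cleaned_rows = clean(raw_rows)
print(cleaned_rows)
```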

Enterprise Data Model

Star Schema

https://bidatapro.net/2018/04/23/what-is-fact-table-in-data-warehouse/
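
A star schema places one fact table of measurements at the center, with dimension tables around it that describe each measurement. The sketch below builds a tiny one in an in-memory SQLite database; the table names, columns, and sample values are hypothetical and chosen only to show the shape of the design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes, one row per date or test.
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT, month TEXT);
    CREATE TABLE dim_test (test_key INTEGER PRIMARY KEY, test_name TEXT);

    -- Fact table: one row per measured event, with keys pointing at the dimensions.
    CREATE TABLE fact_lab_result (
        date_key INTEGER REFERENCES dim_date(date_key),
        test_key INTEGER REFERENCES dim_test(test_key),
        turnaround_minutes INTEGER
    );

    INSERT INTO dim_date VALUES
        (20190726, '2019-07-26', '2019-07'),
        (20190727, '2019-07-27', '2019-07');
    INSERT INTO dim_test VALUES (1, 'Potassium'), (2, 'Troponin');
    INSERT INTO fact_lab_result VALUES
        (20190726, 1, 40), (20190727, 1, 50), (20190727, 2, 30);
    """
)

# A typical star-schema query: join the fact table to its dimensions, then aggregate.
rows = conn.execute(
    """
    SELECT t.test_name, d.month, COUNT(*), AVG(f.turnaround_minutes)
    FROM fact_lab_result AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_test AS t ON t.test_key = f.test_key
    GROUP BY t.test_name, d.month
    ORDER BY t.test_name
    """
).fetchall()
print(rows)
```

Because every descriptive attribute lives in a dimension table, the same fact rows can be sliced by month, by test, or by both without changing the fact table itself.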

Data Lake

An unmodified archive of the data feeds entering the data warehouse: raw data kept in its native formats.

Data Lake

Pros:

• Relatively inexpensive
• Revisit the data in its native format
• Flexibility
• Staging area for proof of concept (POC)

Data Lake

Cons:

• Added cost/complexity
• Maintain “current” copies of data
• Additional maintenance burden
    – The dreaded “Data Swamp”
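
The trade-offs above can be seen in even a very small landing routine. The sketch below stores an incoming feed byte-for-byte under a source/date folder and writes a small manifest beside it; recording where each file came from is one common way to keep a lake from drifting toward a swamp. The folder layout, the manifest fields, and the function name `land_raw_feed` are assumptions for illustration.

```python
import hashlib
import json
import tempfile
from datetime import date
from pathlib import Path


def land_raw_feed(lake_root, source, filename, payload, received=None):
    """Store a feed unmodified under <lake_root>/<source>/<date>/ with a manifest."""
    received = received or date.today()
    target_dir = Path(lake_root) / source / received.isoformat()
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / filename
    target.write_bytes(payload)  # bytes are written exactly as received

    manifest = {
        "source": source,
        "file": filename,
        "received": received.isoformat(),
        "bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),  # detects later modification
    }
    manifest_path = target_dir / (filename + ".manifest.json")
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return target


# Usage with a temporary directory standing in for the lake's storage.
lake = tempfile.mkdtemp()
payload = b"raw feed bytes, stored as-is"
stored = land_raw_feed(lake, "lis", "results.hl7", payload, date(2019, 7, 27))
print(stored)
```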

https://panoply.io/data-warehouse-guide/data-mart-vs-data-warehouse/

Data Mart

Smaller collections of data, curated from multiple sources, and formatted for a targeted reporting purpose

Data Mart

• Often used to refresh Key Performance Indicator reports
• Source for Data Visualization tools
• Forecasting Models
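
As a minimal sketch of how a mart might feed a KPI report, the example below derives a small, purpose-built summary table from a larger warehouse table in SQLite. The table names, the turnaround-time KPI, and the `refresh_mart` function are hypothetical; a production mart would usually be refreshed on a schedule and draw on several warehouse tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-in for a (much larger) warehouse table.
    CREATE TABLE warehouse_lab_result (lab TEXT, month TEXT, turnaround_minutes INTEGER);
    INSERT INTO warehouse_lab_result VALUES
        ('Chemistry', '2019-06', 50), ('Chemistry', '2019-06', 70),
        ('Chemistry', '2019-07', 40), ('Hematology', '2019-07', 30);
    """
)


def refresh_mart(connection):
    """Rebuild the KPI mart from the warehouse table."""
    connection.executescript(
        """
        DROP TABLE IF EXISTS mart_turnaround_kpi;
        CREATE TABLE mart_turnaround_kpi AS
        SELECT lab,
               month,
               COUNT(*) AS results,
               AVG(turnaround_minutes) AS avg_turnaround
        FROM warehouse_lab_result
        GROUP BY lab, month;
        """
    )


refresh_mart(conn)

# A dashboard or report would query the small mart, not the warehouse.
kpi = conn.execute(
    "SELECT lab, month, results, avg_turnaround "
    "FROM mart_turnaround_kpi ORDER BY lab, month"
).fetchall()
print(kpi)
```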

Data Lakes, Warehouses, Marts

Visualization

Trends:

• Moving away from tables
• Intuitive dashboards
• Interactive, self-service

Visualization

Examples

• Tableau
• Power BI
• Sisense
• Many others

https://panoply.io/data-warehouse-guide/data-mart-vs-data-warehouse/

Importance of a Strategic Approach

Strategy

The kernel

• Diagnosis
• Guiding Policy
• Coherent Action

- Richard Rumelt

Diagnosis

“What is the problem we are trying to solve?”

Guiding Policy

THINK BIG PICTURE

Coherent Action

Lagging Data Maturity

● No strategy in place
● Haphazard approach
● Data capture is inconsistent and a consequence of business processes

Limited Data Maturity

● Recognition of need (Diagnosis)
● Lack of Guiding Policy
● Data Silos
● Data Lag
● Inaccessible by end users

Functional Data Maturity

● Guiding Policy is in Place
● Actions are beginning to align
● Data Driven Decision Making is being attempted

Data Maturity

• Strategy is driving actions
• Data is driving models
• Decision Making is shifted
    – Retrospective -> Real-Time -> Anticipated

Increasing Data Literacy

Gartner defines data literacy as the ability to read, write and communicate data in context, including an understanding of data sources and constructs, analytical methods and techniques applied — and the ability to describe the use case, application and resulting value.

Increasing Data Literacy

Users are asking:

• Where is the data coming from / how is it captured?
• Am I asking the right questions?
• What assumptions am I making?

Acknowledgements

Dr. David McClintock, Michigan Medicine

Dr. Matthew Atkins, OU Physicians

Dr. Gregory Blakey, OU Physicians

Dr. Dale Bratzler, OU Physicians

Alan Smith, OU Medicine

Questions?
