Taming Big Data in Healthcare, Paolo Furlani

pfurlani@teamDRG - Talend Real-Time Open Source Data ...Taming Big Data in Healthcare Paolo Furlani. Agenda Overview of DGR ... Infrastructure was not ready for big data Building a


Page 1


Taming Big Data in Healthcare

Paolo Furlani

Page 2

Agenda

➢ Overview of DRG

➢ Initial Challenges

➢ The Platform that We Needed

➢ Overview of Snowflake and Talend

➢ Our way forward

Page 3

Our mission:

Bringing together real-world data streams for algorithmically driven responsiveness

Page 4

Initial challenges

➢ Infrastructure was not ready for big data

➢ Building a big data team was costly and time-consuming

➢ Existing big data solutions were complicated & hard to integrate

➢ Pressure to move fast

Page 5

The platform we needed

➢ A mature SQL Engine that works with big data

➢ Supports multi-terabyte data volumes

➢ Hosted in the Cloud

… But does it exist?

Page 6

Data Warehouse Built for the Cloud...

Page 7

The glue that holds it all together:

Talend!

• Quick to get up and running

• Scalable compute performance

• Minimal coding involved

• Keeps data gurus focused on building workflows instead of coding

Page 8

Our current architecture

Page 9

Talend Integration Cloud

Page 10

Our way forward

➢ Snowflake: our big data engine

➢ AWS S3: Data Lake + Disaster Recovery

➢ Talend: Connecting the pieces together

Next up: Streamlining advanced machine learning with AWS EMR Spark Clusters

Page 11

Be Eligible to Win Prizes at the End of the Show!