Growing a Data Pipeline for Analytics Roberto Vitillo, Staff Data Engineer @ Mozilla 26th PyData London Meetup


Page 1: Growing a Data Pipeline for Analytics

Growing a Data Pipeline for Analytics

Roberto Vitillo, Staff Data Engineer @ Mozilla
26th PyData London Meetup

Page 2: Growing a Data Pipeline for Analytics

Page 3: Growing a Data Pipeline for Analytics

Page 4: Growing a Data Pipeline for Analytics

brew install apache-spark

Page 5: Growing a Data Pipeline for Analytics

Page 6: Growing a Data Pipeline for Analytics

Don’t do it yourself!

Page 7: Growing a Data Pipeline for Analytics

Input → ETL → Output

Storage

Page 8: Growing a Data Pipeline for Analytics

JSON

JSON?
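The slide leaves its concern with raw JSON implicit. One plausible reading, given the later advice to use schemas, is that schemaless records let missing fields and wrong types slip through unnoticed. A minimal sketch of that idea, assuming a hypothetical schema and field names (`client_id`, `os`, `session_length`) that do not come from the talk:

```python
import json

# Hypothetical schema for illustration only: field name -> expected Python type.
SCHEMA = {"client_id": str, "os": str, "session_length": int}


def validate(record, schema=SCHEMA):
    """Return a list of problems found in one decoded JSON record."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


# Both strings are valid JSON; only the schema check tells them apart.
good = json.loads('{"client_id": "a1", "os": "Linux", "session_length": 42}')
bad = json.loads('{"client_id": "a2", "session_length": "42"}')

good_problems = validate(good)
bad_problems = validate(bad)
```

Here `good_problems` is empty, while `bad_problems` reports the missing `os` field and the string-typed `session_length`. A real pipeline would more likely rely on a schema-aware format or a validation library than on a hand-written check like this.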

Page 9: Growing a Data Pipeline for Analytics

Page 10: Growing a Data Pipeline for Analytics

Page 11: Growing a Data Pipeline for Analytics

Page 12: Growing a Data Pipeline for Analytics

Page 13: Growing a Data Pipeline for Analytics

JSON

Parquet

Spark, Hive, Pig …
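Parquet is a columnar format, so the JSON-to-Parquet step amounts to pivoting row-shaped records into per-field columns. The sketch below shows only that layout idea with plain Python lists; it does not read or write Parquet, and the sample records are made up for illustration:

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists.

    Fields missing from a row are stored as None in that row's position.
    """
    names = []
    for row in rows:
        for key in row:
            if key not in names:
                names.append(key)
    return {name: [row.get(name) for row in rows] for name in names}


# Hypothetical records, as they might look after decoding JSON.
rows = [
    {"os": "Linux", "session_length": 30},
    {"os": "Darwin", "session_length": 45},
    {"os": "Linux", "session_length": 15},
]

columns = to_columns(rows)

# An aggregate over one field only has to touch that field's column.
total = sum(columns["session_length"])
```

With rows, summing `session_length` means decoding every full record. With columns, the query reads a single list, which is the property that query engines exploit on columnar files.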

Page 14: Growing a Data Pipeline for Analytics

JSON

Parquet

Spark, Hive, Pig … ???

Page 15: Growing a Data Pipeline for Analytics

“The easier it is to ask questions, the more questions will be asked”

Page 16: Growing a Data Pipeline for Analytics

Page 17: Growing a Data Pipeline for Analytics

Modern SQL supports maps, arrays & structs

Page 18: Growing a Data Pipeline for Analytics

Page 19: Growing a Data Pipeline for Analytics

JSON

Parquet

Spark, Hive, Pig …

Presto, Re:dash
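To give a feel for the kind of question a SQL layer makes cheap to ask, here is a small sketch. It uses Python's built-in `sqlite3` as a stand-in, not Presto, and the `sessions` table and its columns are assumptions for illustration:

```python
import sqlite3

# In-memory SQLite database standing in for the real query engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (os TEXT, session_length INTEGER)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("Linux", 30), ("Darwin", 45), ("Linux", 15)],
)

# Average session length per operating system, in one declarative statement.
result = conn.execute(
    "SELECT os, AVG(session_length) FROM sessions GROUP BY os ORDER BY os"
).fetchall()
conn.close()
```

The same aggregate written against raw files would need a custom job, whereas here it is a single query that an analyst can edit and rerun.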

Page 20: Growing a Data Pipeline for Analytics

TL;DR

• Don’t build your own pipeline unless you really have to

• Use schemas

• Exploit columnar storage

• Use SQL
