Observation Pattern Theory Hypothesis What will happen? How can we make it happen? Predictive Analytics Prescriptive Analytics What happened? Why

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Observation, Pattern, Theory, Hypothesis. Descriptive Analytics: What happened? Diagnostic Analytics: Why did it happen? Predictive Analytics: What will happen? Prescriptive Analytics: How can we make it happen? From INFORMATION to OPTIMIZATION. Confirmation, Theory, Hypothesis, Observation.
  • Slide 6
  • Implement Data Warehouse; Physical Design; ETL Development; Reporting & Analytics Development; Install and Tune; Reporting & Analytics Design; Dimension Modelling; ETL Design; Setup Infrastructure; Understand Corporate Strategy. Data sources; ETL; BI and analytics; Data warehouse. Gather Requirements; Business Requirements; Technical Requirements.
  • Slide 7
  • Ingest all data regardless of requirements. Store all data in native format without schema definition. Do analysis using analytic engines like Hadoop. Interactive queries; Batch queries; Machine Learning; Data warehouse; Real-time analytics; Devices.
  • Slide 8
  • What happened? What is happening? Why did it happen? What are key relationships? What will happen? What if? How risky is it? What should happen? What is the best option? How can I optimize? Data sources
  • Slide 9
  • Slide 10
  • Massive Compute and Storage; Deployment expertise; Data of all Volume, Variety, Velocity; Speed; Scale; Economics; Always Up, Always On; Open and flexible; Time to value
  • Slide 11
  • Azure Facts: >4 trillion objects in Azure; 300,000-1M+ requests per second; double compute and storage every 6 months. Azure Storage; HDInsight; Data Factory; ML; Stream Analytics; Database; DocumentDB; Search; Event Hubs
  • Slide 12
  • Microsoft's cloud Hadoop offering. 100% open source Apache Hadoop. Built on the latest releases across Hadoop (2.6). Up and running in minutes with no hardware to deploy. Harness existing .NET and Java skills. Utilize familiar BI tools for analysis, including Microsoft Excel.
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Use Cases: Where?
      • Active Archive / Compliance Reporting: Restricted data = "down here". "Up there" could be considered for other scenarios.
      • ETL / Data Warehouse Optimization: Often has "down here" gravity, but cloud-based ETL offload has a big payout.
      • Smart Meter Analysis: Typically born "up there".
      • Single View of Customer: May have heavy "down here" gravity; unless you're using SaaS apps, then why not "up there"?
      • New Data for Product Management: Restricted data = "down here". "Up there" could be considered for many scenarios.
      • Vehicle Data for Transportation/Logistics: Why not "up there"?
      • Vehicle Data for Insurance: May have heavy "down here" gravity (e.g., join with risk data).
  • Slide 17
  • Slide 18
  • Rockwell Automation has partnered with one of the six oil and gas supermajors to build unmanned, internet-connected gas dispensers. Each dispenser emits real-time management metrics, allowing them to detect anomalies and predict when proactive maintenance needs to occur. Store sensor data every 5 minutes: temperature, pressure, vibration, etc. Tens of thousands of data points per second. Azure Blobs; Azure HDInsight; Hive, Pig; Azure SQL DB; Power BI for O365; Mobile Notification Hub; Mobile Device; Real-time notification.
  • Slide 19
  • JustGiving wanted to harness the power of their data by using network science to map people's connections and relationships so that they could connect people with the causes they care about. Based on 15 years of data, the JustGiving GiveGraph is the world's largest ecosystem of giving behavior. It contains more than 81 million person nodes, thousands of causes, and 285 million connections, and is the engine that drives JustGiving's social platform, enabling levels of personalization and engagement that a traditional infrastructure would be unable to deliver. SQL Server On-premises Agent; Azure Blobs; Azure HDInsight; Give Graph; Azure Tables; Web API; Website + Event store; Service Bus; Serves results; Azure Cache; Activity Feeds.
  • Slide 20
  • Use Cases: Ad Placement and Offers; Active Archive; ETL Offload; Single View of Customer; Recommendation Engine; Customer Targeting and Acquisition; New Data for Product Management; Vehicle Data; Web Personalization and Experience
  • Slide 21
  • Slide 22
  • Microsoft contribution to Apache code. Hadoop 2.0. Sample Query: Hive 10 HDP 1.3 / Hive 11 HDP 2.0 32x Speedup 40x Speedup HDP 2.1 100x Speedup
  • Slide 23
  • Data Node; Task Tracker; Name Node; Job Tracker; HMaster; Coordination; Region Server
  • Slide 24
  • Stream processing; Search and query; Data analytics (Excel); Web/thick client dashboards; Devices to take action
  • Slide 25
  • Slide 26
  • Slide 27
  • A hyper scale repository for big data analytic workloads
  • Slide 28
  • Slide 29
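
Slide 7 describes storing all data in its native format without a schema definition and running analysis afterwards. A minimal sketch of that "schema on read" idea follows, in plain Python rather than on Hadoop or HDInsight. The sample records, the field names, and the helper names `read_with_schema` and `mean_temperature_by_device` are all hypothetical and chosen only for illustration; they do not come from the slides.

```python
import json

# Hypothetical raw events, kept exactly as they arrived.
# No schema is enforced at write time, so records may carry different fields.
RAW_LINES = [
    '{"device": "pump-1", "temperature": 71.5, "pressure": 30.1}',
    '{"device": "pump-2", "temperature": 69.0}',
    '{"device": "pump-1", "vibration": 0.02, "temperature": 73.5}',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time: keep only the requested fields.

    Fields missing from a record come back as None instead of failing.
    """
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}


def mean_temperature_by_device(lines):
    """A small batch-style query: mean temperature per device."""
    totals = {}
    for row in read_with_schema(lines, ["device", "temperature"]):
        if row["temperature"] is None:
            continue  # skip records that carry no temperature reading
        total, count = totals.get(row["device"], (0.0, 0))
        totals[row["device"]] = (total + row["temperature"], count + 1)
    return {device: total / count for device, (total, count) in totals.items()}


if __name__ == "__main__":
    print(mean_temperature_by_device(RAW_LINES))
```

The point of the sketch is that the query, not the ingestion step, decides which fields matter, so a different question later can read the same raw lines with a different field list.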