Developing Big Data Analytics Applications with JavaScript and .NET for Windows Azure and Windows
Matt Winkler (@mwinkle)
Principal Program Manager
3-038



Page 2

What is big?

Page 3

Image courtesy of CERN

Page 4

The Large Hadron Collider produces 1 PB/sec

Page 5

But, I don’t have a Large Hadron Collider

Page 6

But you do have…
• Sensors
• Clicks
• Logs
• Transactional records
• Call centers
• Medical transcriptions
• Images
• Documents
• Signals from social media
• Simulations

Page 7

Systems like Hadoop evolved to extract value from this data, shaped at the intersection of physics and economics

Page 8

Redundant, distributed, scalable storage

Easily distribute the computation

Page 9

Page 10

Getting Started with HDInsight on Azure and Windows

Page 11

Introduction to Map/Reduce

Functionally

Map: f(k1, v1) → list(k2, v2)
Reduce: f(k2, list(v2)) → (k2, v3)

In Practice, WordCount

Input: The quick brown fox jumps over the lazy dog

Map: (the,1), (quick,1), (brown,1), (fox,1), (jumps,1), (over,1), (the,1), (lazy,1), (dog,1)

Shuffle: (the,(1,1)), (quick,1), (brown,1), (fox,1), (jumps,1), (over,1), (lazy,1), (dog,1)

Reduce: (the,2), (quick,1), (brown,1), (fox,1), (jumps,1), (over,1), (lazy,1), (dog,1)

In Code
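
A minimal single-process sketch of the three phases above, not the session's original listing. The function names and the `runWordCount` driver are hypothetical; on a cluster the framework performs the shuffle and calls map and reduce on many nodes.

```javascript
// Map: f(k1, v1) -> list(k2, v2).
// k1 is the line offset (unused here), v1 is one line of text.
function map(key, line) {
  return line
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((word) => word.length > 0)
    .map((word) => [word, 1]);
}

// Shuffle: collect every emitted value under its key, giving (k2, list(v2)).
function shuffle(pairs) {
  const groups = new Map();
  for (const [k, v] of pairs) {
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(v);
  }
  return groups;
}

// Reduce: f(k2, list(v2)) -> (k2, v3). For WordCount, v3 is the sum of the ones.
function reduce(key, values) {
  return [key, values.reduce((sum, v) => sum + v, 0)];
}

// Hypothetical local driver that chains the three phases in memory.
function runWordCount(lines) {
  const mapped = lines.flatMap((line, offset) => map(offset, line));
  return [...shuffle(mapped)].map(([key, values]) => reduce(key, values));
}

const counts = runWordCount(["The quick brown fox jumps over the lazy dog"]);
console.log(counts); // first pair is ["the", 2]; every other word has count 1
```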

Then scale to terabytes or petabytes of data across tens, hundreds, or thousands of nodes

Page 12

Map/Reduce in JavaScript
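
A sketch of what a WordCount job script could look like. It assumes a `map(key, value, context)` / `reduce(key, values, context)` shape with `context.write` and a `hasNext()`/`next()` value iterator, modeled on the HDInsight preview's JavaScript samples; treat that interface as an assumption, not documented API. `runLocally` is a hypothetical helper for trying the two functions without a cluster.

```javascript
// Job script: the two functions the framework is assumed to call.
var map = function (key, value, context) {
  var words = value.split(/[^a-zA-Z]+/);
  for (var i = 0; i < words.length; i++) {
    if (words[i] !== "") {
      context.write(words[i].toLowerCase(), 1);
    }
  }
};

var reduce = function (key, values, context) {
  var sum = 0;
  while (values.hasNext()) {
    sum += parseInt(values.next(), 10);
  }
  context.write(key, sum);
};

// Hypothetical local stand-in for the cluster: collects writes, groups them
// by key, and hands reduce an iterator exposing hasNext()/next().
function runLocally(lines) {
  var emitted = Object.create(null);
  var mapContext = {
    write: function (k, v) {
      (emitted[k] = emitted[k] || []).push(v);
    }
  };
  lines.forEach(function (line, offset) {
    map(offset, line, mapContext);
  });

  var output = Object.create(null);
  var reduceContext = {
    write: function (k, v) { output[k] = v; }
  };
  Object.keys(emitted).forEach(function (k) {
    var items = emitted[k];
    var pos = 0;
    var iterator = {
      hasNext: function () { return pos < items.length; },
      next: function () { return items[pos++]; }
    };
    reduce(k, iterator, reduceContext);
  });
  return output;
}

console.log(runLocally(["The quick brown fox", "jumps over the lazy dog"]));
```

Only `map` and `reduce` would be submitted as the job; the helper exists so the logic can be checked on a laptop first.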

Page 13

Map/Reduce in .NET

Page 14

What’s After Wordcount?
• Reverse indexing
• Distributed data cleansing
• Data transformation
• Machine learning algorithms
• Traditional analytics
• Predictive analytics
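
Reverse indexing, the first item above, fits the same map/reduce shape as WordCount: map emits (word, document id) and reduce collects the ids per word. A minimal in-memory sketch follows; the function names, the `buildIndex` driver, and the sample documents are all made up for illustration.

```javascript
// Map: (docId, text) -> one (word, docId) pair per distinct word in the document.
function mapDocument(docId, text) {
  const words = new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
  return [...words].map((word) => [word, docId]);
}

// Reduce: (word, list(docId)) -> (word, sorted list of unique docIds).
function reducePostings(word, docIds) {
  return [word, [...new Set(docIds)].sort()];
}

// Hypothetical in-memory driver standing in for the shuffle phase.
function buildIndex(docs) {
  const grouped = new Map();
  for (const [docId, text] of Object.entries(docs)) {
    for (const [word, id] of mapDocument(docId, text)) {
      if (!grouped.has(word)) grouped.set(word, []);
      grouped.get(word).push(id);
    }
  }
  const index = new Map();
  for (const [word, ids] of grouped) {
    const [key, postings] = reducePostings(word, ids);
    index.set(key, postings);
  }
  return index;
}

// Made-up sample documents, for illustration only.
const index = buildIndex({
  doc1: "The quick brown fox",
  doc2: "The lazy dog",
  doc3: "Quick, lazy fox",
});
console.log(index.get("fox")); // [ 'doc1', 'doc3' ]
```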

Recommended Reading: Data-Intensive Text Processing with MapReduce

Page 15

Hive, Like SQL, Just Bigger

SELECT airlinelocal.Origin, airlinelocal.Dest, airlinelocal.Carrier,
       AVG(averagearrivaldelay - airlinelocal.ArrDelayMinutes) AS AvgDiffFromAverage
FROM airlinelocal
JOIN reallybadroutes
  ON (airlinelocal.Origin = reallybadroutes.Origin
      AND airlinelocal.Dest = reallybadroutes.Dest)
GROUP BY airlinelocal.Origin, airlinelocal.Dest, airlinelocal.Carrier
ORDER BY AvgDiffFromAverage DESC

You write → Hive compiles → Hadoop executes

• Hive M/R Filter
• Hive M/R Join
• Hive M/R Aggregate
• Hive M/R Order
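
To make the stages concrete, here is a rough single-process sketch of the work the query asks for: join, aggregate, then order. It is not what Hive generates. The rows are made up, `runQuery` is a hypothetical name, and `averagearrivaldelay` is assumed to be a column of `reallybadroutes`. The query as written has no WHERE clause, so the only filtering here is the inner join dropping unmatched routes.

```javascript
// Made-up rows, for illustration only; column names follow the query.
const airlinelocal = [
  { Origin: "SEA", Dest: "SFO", Carrier: "AA", ArrDelayMinutes: 10 },
  { Origin: "SEA", Dest: "SFO", Carrier: "AA", ArrDelayMinutes: 30 },
  { Origin: "SEA", Dest: "SFO", Carrier: "UA", ArrDelayMinutes: 50 },
  { Origin: "JFK", Dest: "LAX", Carrier: "AA", ArrDelayMinutes: 5 },
];
const reallybadroutes = [
  { Origin: "SEA", Dest: "SFO", averagearrivaldelay: 40 },
];

function runQuery(flights, badRoutes) {
  // Join: index the small table by (Origin, Dest), then match each flight.
  const routeDelay = new Map(
    badRoutes.map((r) => [`${r.Origin}|${r.Dest}`, r.averagearrivaldelay])
  );
  const joined = flights
    .filter((f) => routeDelay.has(`${f.Origin}|${f.Dest}`))
    .map((f) => ({
      ...f,
      diff: routeDelay.get(`${f.Origin}|${f.Dest}`) - f.ArrDelayMinutes,
    }));

  // Aggregate: GROUP BY Origin, Dest, Carrier with a running sum and count.
  const groups = new Map();
  for (const row of joined) {
    const key = `${row.Origin}|${row.Dest}|${row.Carrier}`;
    const g = groups.get(key) ||
      { Origin: row.Origin, Dest: row.Dest, Carrier: row.Carrier, sum: 0, n: 0 };
    g.sum += row.diff;
    g.n += 1;
    groups.set(key, g);
  }

  // Order: ORDER BY AvgDiffFromAverage DESC.
  return [...groups.values()]
    .map(({ sum, n, ...keys }) => ({ ...keys, AvgDiffFromAverage: sum / n }))
    .sort((a, b) => b.AvgDiffFromAverage - a.AvgDiffFromAverage);
}

console.log(runQuery(airlinelocal, reallybadroutes));
```

With these sample rows the result has two groups, AA at 20 and UA at -10; the JFK to LAX flight is dropped by the join.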

Page 16

Hive
LINQ To Hive

Page 17

Easy to get started

Write Hadoop jobs in the language of your choice

Use your tools to process big data

Page 18

Resources
• Microsoft Big Data
• Azure HDInsight
• .NET SDK For Hadoop

Please submit session evals on the Build Windows 8 App or at http://aka.ms/BuildSessions

Page 19

Resources
• Follow us on Twitter @WindowsAzure
• Get Started: www.windowsazure.com/build


Page 20

© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Page 21

• Appendix beyond this