Deep Learning is Like Water: It's Everywhere! Arno Candel, PhD CTO, H2O.ai @ArnoCandel AI By The Bay, San Francisco March 6, 2017

Arno Candel AIByTheBay 030617


Page 1: Arno Candel AIByTheBay 030617

Deep Learning is Like Water: It's Everywhere!

Arno Candel, PhD, CTO, H2O.ai, @ArnoCandel

AI By The Bay, San Francisco

March 6, 2017

Page 2: Arno Candel AIByTheBay 030617

Meet the H2O Makers

Page 3: Arno Candel AIByTheBay 030617


Software Product: H2O - AI for Business Transformation
• Distributed Data Frame with Scalable Execution Engine
• Distributed Algorithms: Deep Learning, GBM, RF, GLM, K-Means, PCA, …
• Apache v2 Open Source (github.com/h2oai)

H2O is Easy to Use and Deploy
• h2o.ai/download and run anywhere, immediately
• Client APIs: R, Python, Java, Scala, REST, Flow GUI
• Spark (cf. Sparkling Water), Hadoop, Bare Metal
• Productionize with auto-generated Java/C++ scoring code

H2O.ai - Makers of H2O, Sparkling Water, Deep Water, …

Page 4: Arno Candel AIByTheBay 030617

https://www.cbinsights.com

H2O.ai - At the Core of AI

Page 5: Arno Candel AIByTheBay 030617

H2O.ai - Loved By The Best

Page 6: Arno Candel AIByTheBay 030617

H2O.ai - Visionary in 2017 Gartner MQ for Data Science

Page 7: Arno Candel AIByTheBay 030617

H2O Deep Water got (Tech)Crunch’ed

Page 8: Arno Candel AIByTheBay 030617

H2O.ai - Highlighted by Fortune Magazine

http://fortune.com/2017/02/23/artificial-intelligence-companies/

Page 9: Arno Candel AIByTheBay 030617

Powerful, Scalable Techniques for Deep Learning and AI

Brand new: Dec 2016

H2O Book - Written by the Community

Page 10: Arno Candel AIByTheBay 030617

H2O.ai Customer Love

http://www.h2o.ai/customers/

Page 11: Arno Candel AIByTheBay 030617

http://www.h2o.ai/customers/

H2O.ai Customer Love

Page 12: Arno Candel AIByTheBay 030617


High Level Architecture of H2O

HDFS

S3

NFS

Distributed In-Memory

Parallel Parser

Lossless Compression

H2O Compute Engine

Production Scoring Environment

Exploratory & Descriptive Analysis

Feature Engineering & Selection

Supervised & Unsupervised Modeling

Model Evaluation & Selection

Predict

Data & Model Storage

Model Export: Standalone Scoring Code

C/C++/Java, R/Py/etc.

Data Prep Export: Plain Old Java Object

Local

SQL

LDAP Kerberos SSL HTTPS

HTTP

Page 13: Arno Candel AIByTheBay 030617

Native APIs: Java, Scala; REST APIs: R, Python, Flow, JavaScript, Java


# R
library(h2o)
h2o.init()
h2o.deeplearning(x = 1:4, y = 5, training_frame = as.h2o(iris))

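# Python
import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator

h2o.init()
dl = H2ODeepLearningEstimator()
# iris_hex: an H2OFrame holding the iris data (iris.hex on the slide), assumed loaded beforehand
# Column indices are 0-based in Python, so columns 0-3 match x = 1:4 in the R call
dl.train(x=list(range(4)), y="Species", training_frame=iris_hex)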

// Scala
import _root_.hex.deeplearning.DeepLearning
import _root_.hex.deeplearning.DeepLearningParameters
val dlParams = new DeepLearningParameters()
dlParams._train = iris.hex
dlParams._response_column = 'Species
val dl = new DeepLearning(dlParams)
val dlModel = dl.trainModel.get

All heavy lifting is done by the backend!

Built-in interactive GUI and notebook - no coding needed!
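Because the clients listed on this slide talk to the backend over REST, even a plain HTTP request can query a running H2O node. The following is a minimal sketch, not part of the original deck: it assumes a node on H2O's default port 54321 and uses the `/3/Cloud` cluster-status endpoint. The helper names `cloud_status_url` and `fetch_cloud_status` are hypothetical and exist only for this illustration.

```python
import json
from urllib.request import urlopen


def cloud_status_url(host="localhost", port=54321):
    """Build the URL of the REST endpoint that reports cluster status.

    Assumption: the node listens on H2O's default port 54321.
    """
    return f"http://{host}:{port}/3/Cloud"


def fetch_cloud_status(host="localhost", port=54321, timeout=5.0):
    """GET the cluster status and return the parsed JSON as a dict.

    Requires a running H2O node; raises URLError if none is reachable.
    """
    with urlopen(cloud_status_url(host, port), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds the URL; call fetch_cloud_status() once a node is running.
    print(cloud_status_url())
```

The R, Python and Flow clients are thin layers over requests of this kind, which is why the computation itself stays on the cluster.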

Page 14: Arno Candel AIByTheBay 030617

Live Demo of Distributed Deep Learning in H2O

Airline dataset: 116M flights in the U.S. over 20 years

10x in-memory compression vs CSV

All cluster CPU cores are busy

Page 15: Arno Candel AIByTheBay 030617

Brand-new: H2O XGBoost Integration (Gradient Boosting)

Why XGBoost?
• Competitive accuracy and speed (great for Kaggle)
• GPU support (for small/medium data)
• Efficient on sparse data

Why integrate into H2O?
• Ease of use (Flow GUI, R/Py APIs)
• Real-time model status (varimp, metrics)
• Efficient data preprocessing (sparse, categorical)
• Integration into H2O ecosystem (modeling, deployment, support)

Page 16: Arno Candel AIByTheBay 030617

Live Demo of GPU Gradient Boosting in H2O

Kaggle dataset: BNP Paribas Cardif Claims Management

Page 17: Arno Candel AIByTheBay 030617

Deep Water = THE Deep Learning Platform

H2O integrates the top open-source DL tools

Native GPU support is up to 100x faster than

Enterprise Ready
Easy to train and deploy, interactive, scalable, etc.
Flow, R, Python, Spark/Scala, Java, REST, POJO, Steam

New Big Data Use Cases (previously impossible or difficult in H2O)
• Image - social media, manufacturing, healthcare, …
• Video - UX/UI, security, automotive, social media, …
• Sound - automotive, security, call centers, healthcare, …
• Text - NLP, sentiment, security, finance, fraud, …
• Time Series - security, IoT, finance, e-commerce, …

Deep Water: Best Open-Source Deep Learning

Enterprise Deep Learning for Business Transformation

Page 18: Arno Candel AIByTheBay 030617

Deep Water Brings State-Of-The-Art Deep Learning on GPUs to H2O

H2O Deep Learning: simple multi-layer networks, CPUs
• Limited to business analytics, statistical models (CSV data)
• 1-5 layers
• MBs/GBs of data

H2O Deep Water: arbitrary networks, CPUs or GPUs
• Large networks for big data (e.g. image 1000x1000x3 -> 3M inputs per observation)
• 1-1000 layers
• GBs/TBs of data
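The "3M inputs per observation" figure on this slide is plain arithmetic on the image dimensions. The sketch below reproduces it and, as an added illustration, counts the parameters of one fully connected layer on top of those inputs. The 200-unit layer width is an assumption made here, not a number from the deck, and both function names are hypothetical.

```python
def flattened_input_size(height, width, channels):
    """Number of input values when every pixel channel is one input."""
    return height * width * channels


def dense_layer_parameters(n_inputs, n_units):
    """Weights plus biases of a single fully connected layer."""
    return n_inputs * n_units + n_units


# The slide's example: a 1000x1000 image with 3 color channels.
n_inputs = flattened_input_size(1000, 1000, 3)

# Assumption for illustration only: a first hidden layer with 200 units.
n_params = dense_layer_parameters(n_inputs, 200)

print(f"{n_inputs:,} inputs per observation")
print(f"{n_params:,} parameters in the first dense layer")
```

Under that assumption a single dense layer already holds about 600 million parameters, which gives a sense of why the slide separates image-scale workloads from CSV-style business data.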

Page 19: Arno Candel AIByTheBay 030617

Open-Source - Leverage Community Code, Data and Models

Best Image Classifier as of Aug 2016: Google + Microsoft Hybrid Architecture

https://research.googleblog.com/2016/08/improving-inception-and-image.html

Open-source implementation

H2O takes DL model definition as input

Page 20: Arno Candel AIByTheBay 030617

Build your own models with Deep Water Today!
https://github.com/h2oai/h2o-3/blob/master/h2o-py/tests/testdir_algos/deepwater/pyunit_inception_resnet_v2_deepwater.py

Live Demo of Deep Water for Image Classification on GPUs

Page 21: Arno Candel AIByTheBay 030617

Yesterday: Small Data (<GB)
Today: Big Data (TeraBytes, ExaBytes)

Data + Skills are good for business

Data + Machine Learning ARE the business

Things are Changing Quickly

Page 22: Arno Candel AIByTheBay 030617

Challenges With AI and Deep Learning

Page 23: Arno Candel AIByTheBay 030617

CEO: “We will transform our business with AI”
Management: “Hire someone to give us AI”
Senior Data Scientist: “I should look into AI”
Junior Data Scientist: “I use TensorFlow all the time”
High School Kid: “I did my internship on Deep Learning”
Average Joe: “I want a self-driving car (and keep my job)”

Stanford Professors: “Focus on interpretability, start with simple models!”

The Hype and Reality of AI

Page 24: Arno Candel AIByTheBay 030617

H2O.ai Stanford Advisors


Sri/CEO, Boyd, Hastie, Tibshirani

In a conference room not so far away

me

Page 25: Arno Candel AIByTheBay 030617

Which Open-Source AI Platform to Use?

Page 26: Arno Candel AIByTheBay 030617

Which Programming Language To Use?

Which one for Development vs Production?

Page 27: Arno Candel AIByTheBay 030617

Which Hardware To Use?

Which one for Development vs Production?

Analog/Neuromorphic

Page 28: Arno Candel AIByTheBay 030617

Who Does the Work and on What Infrastructure?

Which one for Development vs Production?

Cloud? Which? On Premise?

Data Lake? Micro-Services?

Page 29: Arno Candel AIByTheBay 030617

Which one for Development vs Production?

When is the Model Good Enough?

Crowdsourcing? Trust a Genius? Internal Bake-Off?

Page 30: Arno Candel AIByTheBay 030617

What problem are you solving in the first place?

What problem should/could you be solving instead?

What can you learn from the model?

How can you improve the models? More, better data?

How can you characterize the model?

Do you need AI, Deep Learning or just a simple model?

Back to the Drawing Board!

Page 31: Arno Candel AIByTheBay 030617

Gradient Boosting Machine

Generalized Linear Modeling

Deep Learning

Distributed Random Forest

Do you need AI, Deep Learning or just a Simple Model?

Page 32: Arno Candel AIByTheBay 030617

Future Of AI: Or What’s Left for Humans to Do?

Charlie Chaplin - Modern Times (1936)