Real-time Analytics Powered by GPU-Accelerated Databases
Chris Prendergast and Woody Christy, GTC, May 8, 2017
Kinetica Background
Wins IDC HPC Innovation Excellence Award for work with US Postal Service.
2009 2012
United States Army Intelligence seeks a means to assess terrorist and other national security threats.
No database on the market was fast or flexible enough to meet their needs.
Founders Amit Vij and Nima Negahban start on the pioneering use of GPUs while building a GPU-accelerated database from the ground up.
2014
Commercialization: entered production with USPS.
2016
Rebranded to Kinetica. Seed funding. Moved HQ to San Francisco. Expanded management team. Hired field team.
Wins IDC HPC Innovation Excellence Award for work with US Army.
GPUdb goes live with US Army Intelligence.
Patent granted for “Method and system for improving computational concurrency using a multi-threaded GPU calculation engine”
Evolution of Analytics
Simple Reporting: List customer energy consumption in the past 3 years.
Standard Analytics: What is the average consumption by region monthly? Per household? Residential vs. commercial?
Real-time Analytics: What is the current energy consumption by a region/household? How does that compare to historic averages? How does it compare to other regions?
Machine Learning: Given location, history, demographics, and usage, what is the likelihood of service issues or an outage?
Deep Learning: Deduce from unspecified signals across a wide range of datasets the likelihood that this customer will consume more or less energy, or have a service interruption.
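The difference between the standard and real-time questions above can be sketched in a few lines of Python. This is only an illustration: the readings, region names, and function names are invented for the example and do not come from the presentation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings: (region, month, kWh). All values are invented.
READINGS = [
    ("north", "2017-01", 310.0),
    ("north", "2017-02", 290.0),
    ("south", "2017-01", 420.0),
    ("south", "2017-02", 380.0),
]

def average_by_region(readings):
    """Standard analytics: average consumption per region."""
    by_region = defaultdict(list)
    for region, _month, kwh in readings:
        by_region[region].append(kwh)
    return {region: mean(values) for region, values in by_region.items()}

def compare_to_history(readings, region, current_kwh):
    """Real-time analytics: ratio of a live reading to the historic average."""
    return current_kwh / average_by_region(readings)[region]

averages = average_by_region(READINGS)
ratio = compare_to_history(READINGS, "north", 360.0)
```

The first function answers a question about stored history; the second needs the same history at the moment a new reading arrives, which is where query latency starts to matter.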
GPU Acceleration
GPU Acceleration Overcomes Processing Bottlenecks
4,000+ cores per device in many cases, versus 16 to 32 cores per typical CPU-based device.
The high performance computing trend is toward using GPUs to solve massive processing challenges.
GPU acceleration brings high performance compute to commodity hardware.
Parallel processing is ideal for scanning an entire dataset and for brute-force compute.
GPUs are designed around thousands of small, efficient cores that are well suited to performing repeated similar instructions in parallel. This makes them well suited to the compute-intensive workloads required of large datasets.
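The brute-force pattern described above (apply the same simple instruction to every element, then combine partial results) can be sketched as follows. This is a CPU-only illustration in which Python threads stand in for GPU cores; it is not GPU code and not how Kinetica is implemented, and the function names are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def count_matches(chunk, threshold):
    """Apply the same comparison to every value in one chunk."""
    return sum(1 for value in chunk if value > threshold)

def parallel_scan(values, threshold, workers=4):
    """Split the dataset into chunks, scan each chunk, and add up the results."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda c: count_matches(c, threshold), chunks)
        return sum(partials)

data = list(range(1000))
total = parallel_scan(data, threshold=899)
```

Every chunk runs identical logic with no dependency on the others, which is the property that lets this kind of full scan spread across thousands of cores.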
Kinetica: A Distributed, In-Memory Database
GPU-accelerated database operations
Natural language processing based full-text search
Native GIS and IP-address object support
Real-time data handlers to ingest structured and unstructured data
Deep integration with open source and commercial frameworks and applications: Hadoop, Spark, NiFi, Accumulo, H2O, Tableau, Kibana and Caravel
Predictable scale-out for data ingestion and querying
No typical tuning, indexing, and tweaking
Distributed visualization pipeline built in
Kinetica: Unique Strengths & Capabilities
Fast, Distributed, OLAP Engine for Fast Moving, Large Scale Data
OLAP Performance, Scalability, Stability
Geospatial Processing & Visualization
API for GPU Powered Data & Compute Orchestration
Converged AI and BI
Native Geospatial and Visualization Pipeline
Fast Data
In-Database Analytics
Interactive Location-Based Analytics
Challenges with Lambda and Kappa Architectures
What is the main problem?
Database or cache system serving up pre-computed aggregates
It also takes a lot of effort to re-compute aggregates and to load the serving database or cache
Performance BI
LARGE TELCO
CASE STUDY
Query 1: Simple average calculation on the 1.8B row table
Query 2: Sum aggregation with a subquery aggregation joining both tables
[Chart: query times compared with a Leading Enterprise Database; values shown: 0.09s, 2.5s, 345s, 44s, 0.65s, 0.68s]
Real-Time, Advanced Analytics, Speed Layer for Teradata or Oracle
Parallel ingestion of events
Lambda-type architecture for Teradata or Oracle
Kinetica is the speed layer with real-time analytic capabilities for millisecond SLAs
Converge machine learning, deep learning, NLP, streaming and location analytics, and fast query, reporting & analytics with Kinetica & Teradata/Oracle
DATA IN MOTION AND REST
DATA WAREHOUSE / TRANSACTIONAL
Amazon Kinesis
ANALYSTS
MOBILE USERS
DASHBOARDS & APPLICATIONS
ALERTING SYSTEMS
Kinetica Connectors
STREAM / ETL PROCESSING
Fast GPU-accelerated, in-memory database
Converge ML, DL, streaming, location, and QR&A
Speed Layer for Hadoop
Parallel Ingestion
Parallel ingestion of events
Kinetica is the speed layer with real-time analytic capabilities
HDFS for archival store
Much looser coupling than traditional lambda architecture
Batch mode Spark or MR jobs can push data to Kinetica as needed for fast query on data loaded from HDFS
EVENTS
MESSAGE BROKERS
Amazon Kinesis
ANALYSTS
MOBILE USERS
DASHBOARDS & APPLICATIONS
ALERTING SYSTEMS
Put, get, scan
Execute complex analytics on the fly
Kinetica Connectors
STREAM PROCESSING
HDFS (Hadoop Distributed File System)
STREAMING ANALYTICS, SIMPLIFIED
SIMPLIFY YOUR ARCHITECTURE
• No need to regularly recompute aggregates.
• No need to load and manage a separate serving system or cache to make deep historical aggregates available to your stream processing code.
• Aggregates are always up to date, as they are computed on demand; the latest events are always included.
• Better performance with significantly reduced operational complexity, hardware footprint and cost.
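The idea of computing aggregates on demand, rather than maintaining a separately loaded cache of precomputed results, can be sketched with a toy in-memory store. The `EventStore` class and its methods are hypothetical and exist only for this illustration; they are not the Kinetica API.

```python
from collections import defaultdict

class EventStore:
    """Toy in-memory store; a stand-in for illustration, not the Kinetica API."""

    def __init__(self):
        self._events = []

    def put(self, key, value):
        """Ingest one event."""
        self._events.append((key, value))

    def sum_by_key(self):
        """Scan all events at query time, so nothing has to be precomputed."""
        totals = defaultdict(float)
        for key, value in self._events:
            totals[key] += value
        return dict(totals)

store = EventStore()
store.put("meter-1", 2.0)
store.put("meter-2", 5.0)
before = store.sum_by_key()
store.put("meter-1", 3.0)   # a new event arrives
after = store.sum_by_key()  # reflects the new event with no recompute job
```

Because the aggregate is derived from the raw events on each query, there is no second system to reload and no window in which the served numbers are stale. The trade-off is that every query pays for a full scan, which is the cost the deck argues GPU parallelism absorbs.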
EVENTS
MESSAGE BROKERS
Amazon Kinesis
ANALYSTS
MOBILE USERS
DASHBOARDS & APPLICATIONS
ALERTING SYSTEMS
PUT, GET, SCAN
Execute complex analytics on the fly
Kinetica Connectors
STREAM PROCESSING
INTELLIGENCE: US Army - INSCOM
US Army's in-memory computational engine for any data with a geospatial or temporal attribute, for a major joint cloud initiative within the Intelligence Community (ICITE).
Intel analysts are able to conduct near real-time analytics, fuse SIGINT, ISR, and GEOINT streaming big data feeds, and visualize them in a web browser.
For the first time in history, military analysts are able to query and visualize billions to trillions of near real-time objects in a production environment.
Major executive military and congressional visibility.
U.S. Army INSCOM Shift from Oracle to GPUdb
Oracle Spatial (92 minutes)
GPUdb (20 ms)
1 GPUdb server vs 42 servers with Oracle 10gR2 (2011)
42x Lower Space, 28x Lower Cost, 38x Lower Power Cost
CASE STUDY: LOCATION BASED ANALYTICS
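The two timings quoted for INSCOM imply a speedup that is easy to work out. The calculation below assumes both figures describe the same query, which the slide suggests but does not state outright.

```python
# Both timings come from the slide; the unit conversion is the only added step.
ORACLE_SPATIAL_MS = 92 * 60 * 1000  # 92 minutes expressed in milliseconds
GPUDB_MS = 20                       # 20 ms

speedup = ORACLE_SPATIAL_MS / GPUDB_MS
```

Under that assumption the ratio comes to roughly 276,000 times faster, achieved on 1 server instead of 42.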
LOGISTICS: Route optimization
DISTRIBUTED ANALYSIS AT SCALE
200,000 USPS devices emitting location each minute → 250+ million events captured and analyzed daily, tracked on 10 nodes.
USPS is the single largest logistics entity in the country, moving more individual items in four hours than UPS, FedEx, and DHL combined move all year.
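A quick back-of-the-envelope check shows how the device count relates to the daily event volume. It assumes every device reports once a minute around the clock and that load is spread evenly across nodes; neither assumption is stated in the presentation.

```python
DEVICES = 200_000          # from the slide
MINUTES_PER_DAY = 24 * 60  # assumes one location event per device per minute
NODES = 10                 # from the slide
SECONDS_PER_DAY = 86_400

events_per_day = DEVICES * MINUTES_PER_DAY
# Assumes an even spread across nodes, which is an idealization.
events_per_node_per_second = events_per_day / NODES / SECONDS_PER_DAY
```

This gives an upper bound of 288 million events per day, consistent with the quoted "250+ million" if not every device reports for the full 24 hours, and a sustained ingest rate of a few hundred events per second per node.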
CASE STUDY: LOCATION BASED ANALYTICS
15,000 simultaneous sessions
PREDICTIVE INFRASTRUCTURE MANAGEMENT
Kinetica operates as a speed layer with ESRI to monitor, manage, and predict infrastructure health.
LARGE UTILITY COMPANY
CASE STUDY: LOCATION BASED ANALYTICS
LOGISTICS & FLEET MANAGEMENT
Kinetica enables agile tracking of shipments to help store managers track inventory and arrival times.
• Visibility and tracking of deliveries & trucks for store managers
• ETA & Notifications: Provide estimated time of delivery, notifications, and custom location-based alerting
• Route optimization based on truck size and on whether cargo is perishable or contains hazardous materials.
LARGE RETAILER
CASE STUDY: LOCATION BASED ANALYTICS
PIPELINE & WELL ANALYTICS
Kinetica enables interactive query and geospatial visualization of large numbers of upstream and midstream assets.
• Complex joins across several tables with 300 million rows of data, approx. 100 GB in size.
• Create custom visualizations and charts.
• Visualization of wells by land ownership, region, etc.
ENERGY RESEARCH
CASE STUDY: LOCATION BASED ANALYTICS
LIFE SCIENCES: GENOMICS RESEARCH
CASE STUDY: ADVANCED IN-DATABASE ANALYTICS
GPU acceleration on Kinetica enables processing of transcriptomics to run simulations for drug research.
• Seeking out signals from a massive collection of drug targets combined with historical data.
• Accelerate simulations of chemical reactions.
• In-database processing to develop models, leveraging GPU acceleration for performance, and direct access to CUDA APIs via UDFs deployed within Kinetica.
"One of the things I like about Kinetica is it gives us more of a general-purpose use of the technology. There has been a lot of software created to answer certain questions [but] highly specialized tools have limited functionality and are tuned to do a certain workload."
Mark Ramsey, Chief Data Officer at GSK
RISK MANAGEMENT
Large financial institution moves counterparty risk analysis from overnight to real-time.
• Data collected by an xVA library, which computes risk metrics for each trade.
• Risk computations are becoming more complex and computationally heavy. xVA analysis needs to project years into the future.
• Kinetica enables banks to move from batch/overnight analysis to a streaming/real-time system for flexible real-time monitoring by traders, auditors and management.
MULTINATIONAL BANK
CASE STUDY: ADVANCED IN-DATABASE ANALYTICS
Faster Analytics on Inventory and Sales
LARGE RETAILER
CASE STUDY
Query 1: Sum of retail sales grouped by region
Query 2: Sum of inventory available grouped by type
[Chart: query times compared with an Enterprise In-Mem DB; values shown: 34s, 44s, 0.65s, 0.68s]
Stop by Booth #431 and get your free T-shirt
www.kinetica.com