© 2010 IBM Corporation
IBM InfoSphere Streams
Enabling a smarter planet
Roger Rea
InfoSphere Streams Product [email protected]
Sept 15, 2010
Moore’s Law drives new waves of technology …
Welcome to the Decade of Smart
[Chart: 2 technology waves, multicore chips and embedded chips. Y-axis: billions of units shipped (1 to 2,000). X-axis: 1960 to 2015. Milestones marked: S/360, IBM PC, World Wide Web, the “Internet of Things”.]
Source: IDC, SSR and IBM Market Insights
Time is ripe for a new era of computing in support of Big Data
• Emerging trends create the need for new languages
  • Scientific programming: Fortran
  • Business programming: COBOL
  • Systems programming at a higher level: C
  • Increased productivity: C++
  • Web programming: Java
  • Streaming data sources and multicore architectures: Streams Processing Language
IBM InfoSphere Streams
• Streaming analytic applications
• Multiple input streams
• Advanced streaming analytics
• Eclipse based IDE
• Define sources, apply operators, define intermediary and final output sinks
• User defined operators in Java or C++
• Optimizing compiler automates deployment and connections
• Extremely low latency
• Cluster of up to 125 nodes
[Diagram labels: InfoSphere Streams Studio (IDE for Streams Processing Language); Source Adapters; Sink Adapters; Operator Repository; Automated, Optimized Deploy and Management (Scheduler)]
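The "define sources, apply operators, define sinks" workflow listed on this slide can be illustrated with a minimal sketch. It uses plain Python generators as a stand-in and is not Streams Processing Language; the names `source`, `filter_op`, `map_op`, `sink` and `run_pipeline` are hypothetical and chosen only for this example.

```python
from typing import Callable, Iterable, Iterator, List


def source(readings: Iterable[float]) -> Iterator[float]:
    """Hypothetical source adapter: emits one tuple at a time."""
    for value in readings:
        yield value


def filter_op(stream: Iterable[float], keep: Callable[[float], bool]) -> Iterator[float]:
    """Operator: forwards only the tuples that satisfy the predicate."""
    for value in stream:
        if keep(value):
            yield value


def map_op(stream: Iterable[float], fn: Callable[[float], float]) -> Iterator[float]:
    """Operator: transforms each tuple and forwards the result."""
    for value in stream:
        yield fn(value)


def sink(stream: Iterable[float], out: List[float]) -> None:
    """Hypothetical sink adapter: collects the final output stream."""
    for value in stream:
        out.append(value)


def run_pipeline(readings: Iterable[float]) -> List[float]:
    """Wire source -> filter -> map -> sink, one tuple flowing through at a time."""
    results: List[float] = []
    positive = filter_op(source(readings), lambda v: v > 0)
    doubled = map_op(positive, lambda v: v * 2)
    sink(doubled, results)
    return results
```

Because each stage is a generator, a tuple passes through the whole chain before the next one is read, which loosely mirrors continuous processing of a stream rather than batch processing of a stored data set.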
Scalable stream processing
• InfoSphere Streams provides:
  • A programming model and IDE for defining data sources and software analytic modules, called operators, that are fused into process execution units (PEs)
  • Infrastructure to support the composition of scalable stream processing applications from these components
  • Deployment and operation of these applications across distributed x86 processing nodes, when scaled processing is required
  • Stream connectivity between data sources and PEs of a stream processing application
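The idea of fusing several operators into one execution unit can be sketched as function composition. This is only an illustration under stated assumptions: the `fuse` helper, the `Operator` type and the two sample operators are hypothetical, and the real Streams compiler does considerably more than compose functions.

```python
from typing import Callable, Iterable, List, Optional

# Assumption for this sketch: an operator takes one tuple and returns
# either a transformed tuple or None to drop it.
Operator = Callable[[int], Optional[int]]


def fuse(*operators: Operator) -> Operator:
    """Combine several operators into one callable that runs them back to back."""
    def execution_unit(item: int) -> Optional[int]:
        for op in operators:
            result = op(item)
            if result is None:  # an upstream operator dropped the tuple
                return None
            item = result
        return item
    return execution_unit


def drop_odd(x: int) -> Optional[int]:
    """Sample operator: keep even values only."""
    return x if x % 2 == 0 else None


def square(x: int) -> Optional[int]:
    """Sample operator: square the value."""
    return x * x


def run(unit: Operator, stream: Iterable[int]) -> List[int]:
    """Feed a stream through one execution unit and collect what survives."""
    out: List[int] = []
    for item in stream:
        result = unit(item)
        if result is not None:
            out.append(result)
    return out
```

For example, `run(fuse(drop_odd, square), range(6))` passes each value through both operators inside a single call, with no queue or network hop between them.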
Streams offers tremendous deployment flexibility
With only a simple recompile of the application, the same operators can run:
• All on one machine, fused into one multi-threaded process
• All on one machine, each operator in its own process
• Each operator in its own process, each process on its own machine
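The deployment options on this slide can be approximated in a small sketch: the same list of operators runs either fused in one loop or with one worker per operator connected by queues. Threads stand in for separate processes or machines here, purely as an assumption to keep the example self-contained; the function names are hypothetical.

```python
import queue
import threading
from typing import Callable, Iterable, List, Sequence

Operator = Callable[[int], int]
_DONE = object()  # sentinel marking the end of the stream


def run_fused(operators: Sequence[Operator], items: Iterable[int]) -> List[int]:
    """All operators fused: each tuple passes through every operator in one loop."""
    out: List[int] = []
    for item in items:
        for op in operators:
            item = op(item)
        out.append(item)
    return out


def _worker(op: Operator, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Run one operator, reading from its input queue and writing to the next."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate end-of-stream downstream
            return
        outbox.put(op(item))


def run_one_worker_per_operator(operators: Sequence[Operator],
                                items: Iterable[int]) -> List[int]:
    """Each operator in its own worker (threads used as a stand-in for processes)."""
    queues = [queue.Queue() for _ in range(len(operators) + 1)]
    threads = [
        threading.Thread(target=_worker, args=(op, queues[i], queues[i + 1]))
        for i, op in enumerate(operators)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    out: List[int] = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        out.append(item)
    for t in threads:
        t.join()
    return out
```

Both functions take the same operator list, so switching layouts changes only which runner is called, not the operators themselves. The fused form avoids queue hand-offs; the per-worker form lets stages overlap in time.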
ANISE: Active Network for Information from Synchrotron Experiments
High-speed network to process data from synchrotrons in Canada and the US using the CANARIE network
[Architecture diagram labels: Science Studio (Browser, Client Services Layer, Business Model Layer, Persistence Layer, Device Proxies, Service Proxies); ANISE (Data Service, Processing Service); Laboratory Control Module, IOCs and Beamline at Canadian Light Source, Canada, and at Argonne Lab., US; Stream Computing (general, common component; Science Studio specific component: XRD Processing, XRF Processing)]
TerraEchos Adelos™: Covert Intrusion Detection
• State-of-the-art covert surveillance based on Streams platform
• Acoustic signals from buried fiber optic cables are monitored, analyzed and reported in real time to locate intruders
• Currently designed to scale up to 1600 streams of raw binary data
Forecasting Space Weather at LOFAR Outrigger in Scandinavia (LOIS)
Swedish Institute of Space Physics
[Diagram labels: Triaxial Antenna, InfoSphere Streams, Solar Flares]
• Radio signal input and data preparation
• Signal detection and noise filtering
• Strength and 3D directional analysis
Space weather prediction regarding impact on satellites and electric grids
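As a rough illustration of what a noise filtering and signal detection stage over a stream might look like, the sketch below smooths samples with a moving average and flags values above a threshold. This is a generic technique picked for the example, not the method used in the LOIS project; the function names, window size and threshold are all hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple


def moving_average(samples: Iterable[float], window: int) -> Iterator[float]:
    """Noise filtering: yield the mean of the most recent `window` samples."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent: Deque[float] = deque(maxlen=window)
    for sample in samples:
        recent.append(sample)
        yield sum(recent) / len(recent)


def detect(samples: Iterable[float], window: int,
           threshold: float) -> Iterator[Tuple[int, float]]:
    """Signal detection: yield (index, smoothed value) when above the threshold."""
    for index, smoothed in enumerate(moving_average(samples, window)):
        if smoothed > threshold:
            yield index, smoothed
```

Both functions consume samples one at a time and keep only a small window in memory, so they can run over an unbounded input without storing it first.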
Real Time Marine Mammal Position and Behavior Modeling
[Diagram labels: Analytics & Sensors, Advanced Acoustical Analytics, InfoSphere Streams]
• Filter wind & wave noise
• Model marine mammal environment
• Correlate to Galway Bay ecosystem
What are key advantages of Streams?
Compiling groups of operators into single processes enables:
• Efficient use of cores
• Distributed execution
• Very fast data exchange
• Can be automatic or tuned
• Can be scaled with the push of a button
Language built for Streaming applications:
• Reusable operators
• Rapid application development
• Continuous “pipeline” processing
Extremely flexible and high performance transport:
• Very low latency
• High data rates
Easy to extend:
• Built in adaptors
• Extend with C++ and Java
• Extend running applications
Use the data that gives you a competitive advantage:
• Can handle virtually any data type
• Use data that is too expensive and time sensitive for other approaches
QUESTIONS?