Build your own Data Processing
Platform in the Cloud integrating
data from Millions of Things and
other sources
Philipp Behre, Solutions Architect, Amazon Web Services
What to expect from this session
• How to Collect, store, and analyze data from small things in a big world?
• What tools are there for Data Engineers to build a cloud based data
platform with AWS?
• How to Enable your business teams to make data informed decisions?
• How get smart support for people to make decisions with confidence based
on near-real time predictions?
Next: Start creating !!
The Person has the context to decide
The Person has the context to decide
Analy
ze &
decid
e
The Cloud make decisions with smart
situational awareness
Monitor
& have
the final
say Enable
smart
decisions &
act
One Example: Water Pipe
Connect – Secure – Integrate
DEVICE SDKSet of client libraries to
connect, authenticate and
exchange messages
DEVICE GATEWAYCommunicate with devices via
MQTT and HTTP
AUTHENTICATIONSecure with mutual
authentication and encryption
RULES ENGINETransform messages
based on rules and
route to AWS Services
AWS Services
- - - - -
3P Services
SHADOWPersistent thing state
during intermittent
connections
APPLICATIONS
AWS IoT API
REGISTRYIdentity and Management of
your things
AWS IoT: How it Works
Water Pipe – Simplified Data Flow
Sens
ors
Val
ve
“If you can’t measure it, you can’t improve it”-Lord Kelvin
analyze your data
make data-informed decisions
improve your processes
RetrospectiveAnalyze historical
trends to know
what's happening in
the app
Predictive Anticipate user
behavior to enhance
experience
InquisitiveDiscover latent user
behavior to shape
product or marketing
decisions
Three Types of Data-Driven Decision Making
IoT-Data Architectures build
out of AWS services
Primitives for IoT – with a focus on collect, store, analyze
AWS
Lambda
Amazon EMR
Amazon S3
AWS IoT
Amazon
Kinesis
Amazon Machine
LearningAmazon Redshift
Amazon QuickSight
Amazon
Cognito
Amazon Elasticsearch
Service
Amazon DynamoDB
Understanding your data - People involved
• BI Analysts
• Data Engineers
• Application Developers
• Data Scientists
• ….
Actually … everyone in your company making a
decision!
BI Analyst with existing BI Tools
BI Analyst BI Tools
Amazon EC2
Amazon
Redshift
QuickSight
API
• Primary tool is SQL
• Data is largely structured with well known data sources
• Primary concern is fast, consistent performance
• Need to extend SQL with custom functions
Amazon
QuickSight
Amazon
QuickSightBI Tools
Amazon EC2
Data Engineer familiar with Hadoop and Spark
Data
Engineer
Existing Structured
Data
Amazon Redshift
New Structured
Data Amazon
Redshift
Amazon
EMR
Enrichment /
Transformation
ETL
Data Source
Amazon Redshift
Integration
Data Scientist with existing toolsets
Data
ScientistTool kits like R
Studio installed
Amazon EC2
Unstructured and semi-
structured Data
Amazon S3
Structured
Data
Amazon
Redshift• Work with unstructured datasets
• Use existing toolsets to connect to Redshift
Example: Querying Redshift with R Packages
• RJDBC – supports SQL queries
• dplyr – Uses R code for data
analysis
• RPostgreSQL - R compliant
driver or Database Interface
(DBI)R
User
R
Studio
Amazo
n EC2
Unstructure
d Data
Amazon S3
User Profile
Amazon
RDS
Amazon
Redshift
Connecting R with Amazon Redshift blog post: https://blogs.aws.amazon.com/bigdata/post/Tx1G8828SPGX3PK/Connecting-R-with-Amazon-Redshift
Application Developers can build smart
applications using Amazon Machine Learning
Structured
Data/Predictions
Amazon Redshift
Generate/Qu
ery
Predictions
Amazon
QuickSight
Application
Amazon
Machine
Learning
Visualize
• All skill levels
• Machine Learning technology is accessed through APIs / SDKs
• Embed visualizations in applications
Back to our water pipe …
Instantly React – getting ‘smarter’
Smart Application – supporting people
Val
ve
6
IoT
Sens
ors
1
1 sensors send data
Amazon
Kinesis
Streams
2
2 Inbound stream (raw data)
Amazon
DynamoD
B
4[triggers]
4 write aggregate & trigger event
Amazon
S3
5 process event
5
[updates state]
[gets real-time
prediction]
[write data &
Prediction]
[notify]
Amazon
Machine
Learning
Amazon
Kinesis
Analytics
3
3 time-series aggregation
Follow up and capture results
Amazon S3
1 follow up
1 2
2 Capture result & activity
3
3 Frequently load to S3
Collect business and contextual data – learn
and improve
Amazon S3
Machine
Learning
2 transform and load
Amazon EMR
Amazon Redshift
2
3 let apps query data
3
4 let people query data
Amazon QuickSight4
5 re-train prediction model5
Amazon
Kinesis
Firehose
1 store additional data
1
The Cloud make decisions with smart
situational awareness
monit
or make
decision
s
Start building !!!
monit
or make
decision
s
Resources
AWS IoT Landing Page: http://aws.amazon.com/iot
AWS Mobile Landing Page: http://aws.amazon.com/mobile
YouTube Channels/Playlist:
• AWS re:Invent 2015 Mobile/IoT Sessions:
http://bit.ly/22ik1V1
• AWS re:Invent 2015 Big Data / Analytics Sessions:
• http://bit.ly/1S2
• AWS Webinar Channel: http://bit.ly/1QVI2IY
Questions?AMAZON CONFIDENTIAL
Start building today!!