DEMO
Provide analytics to help dog owners determine the best times and locations to walk their dog
Santa Cruz: April? Fridays?
San Mateo: Tuesdays?
Santa Clara: Sundays?
Alameda: October?
Data pipeline
Data Ingestion: User Messages (User 1, User 2, ..., User N), Meet-up Requests
Batch and Real-time Layer: Distributed File System, Real-time Processing
Serving Layer: Datastore, User Interface
Data ingestion
User Messages (User 1, User 2, ..., User N) and Meet-up Requests
Message Topic (JSON format)
{"timestamp": [2015, 1, 23, 13, 55, 0], "county": ["Dundy County", "NE"], "creatorID": 15854, "senderID": 844090, "rank": 0, "messageID": 622878, "message": "Let's meet up at 2PM today!"}
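A minimal sketch of decoding one message from the topic, using only Python's standard library. The helper name `parse_message` is hypothetical, and the field orders (`[county, state]` and `[year, month, day, hour, minute, second]`) are assumptions read off the sample message.

```python
import json
from datetime import datetime

# Sample message copied from the slide.
RAW = """{"timestamp": [2015, 1, 23, 13, 55, 0], "county": ["Dundy County", "NE"], "creatorID": 15854, "senderID": 844090, "rank": 0, "messageID": 622878, "message": "Let's meet up at 2PM today!"}"""


def parse_message(raw: str) -> dict:
    """Decode one JSON message and unpack its fields (hypothetical helper)."""
    msg = json.loads(raw)
    county, state = msg["county"]          # assumed order: [county name, state code]
    sent_at = datetime(*msg["timestamp"])  # assumed order: [Y, M, D, h, m, s]
    return {
        "state": state,
        "county": county,
        "sent_at": sent_at,
        "creator_id": msg["creatorID"],
        "sender_id": msg["senderID"],
        "message_id": msg["messageID"],
        "text": msg["message"],
    }


parsed = parse_message(RAW)
print(parsed["state"], parsed["county"], parsed["sent_at"].isoformat())
# prints: NE Dundy County 2015-01-23T13:55:00
```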
Real-time processing
Distributed file storage
Batch and real-time layer
Distributed File System, Real-time Processing, Datastore
{"timestamp": [2015, 1, 23, 13, 55, 0], "county": ["Dundy County", "NE"], "creatorID": 15854, "senderID": 844090, "rank": 0, "messageID": 622878, "message": "Let's meet up at 2PM today!"}
( (state, county), 1 )
( (state, county, year+month), 1 )
( (state, county, year+month+day), 1 )
( (state, county), json(message) )
reduceByKey(_ + _)
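The slide's `reduceByKey` expression appears to be Spark's pair-RDD operation written in Scala notation: it sums the values of all pairs that share a key. The sketch below imitates that map-then-reduce step in plain Python rather than Spark. The helpers `keyed_pairs` and `reduce_by_key` are hypothetical, the integer encoding of year+month and year+month+day is an assumption, and the second and third messages are made-up sample data.

```python
from collections import Counter


def keyed_pairs(msg: dict) -> list:
    """Emit the three (key, 1) pairs from the slide for one decoded message."""
    year, month, day = msg["timestamp"][:3]
    county, state = msg["county"]         # assumed order: [county name, state code]
    year_month = year * 100 + month       # assumed encoding, e.g. 201501
    year_month_day = year_month * 100 + day  # assumed encoding, e.g. 20150123
    return [
        ((state, county), 1),
        ((state, county, year_month), 1),
        ((state, county, year_month_day), 1),
    ]


def reduce_by_key(pairs) -> dict:
    """Plain-Python stand-in for reduceByKey(_ + _): sum the values per key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


messages = [
    {"timestamp": [2015, 1, 23, 13, 55, 0], "county": ["Dundy County", "NE"]},
    {"timestamp": [2015, 1, 23, 18, 5, 0], "county": ["Dundy County", "NE"]},  # made-up
    {"timestamp": [2015, 1, 24, 9, 30, 0], "county": ["Dundy County", "NE"]},  # made-up
]

pairs = [pair for msg in messages for pair in keyed_pairs(msg)]
totals = reduce_by_key(pairs)
print(totals[("NE", "Dundy County", 20150123)])  # prints: 2
```

Because addition is associative and commutative, the same per-key sum can be computed in partial batches and merged, which is what makes this reduction suitable for a distributed setting.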
Serving layer
Datastore, User Interface
Real-time processing, Batch processing
Partition key, Clustering column, Value

by_county_rt_msgs
STATE (VARCHAR)  COUNTY (VARCHAR)    DATE (INT)  TIME (INT)  MESSAGE (VARCHAR)
CA               Santa Clara County  20150205    124523      "JSON_msg"

by_county_day
STATE (VARCHAR)  COUNTY (VARCHAR)    DATE (INT)  COUNT (INT)
CA               Santa Clara County  20150205    72

by_county_month
STATE (VARCHAR)  COUNTY (VARCHAR)    DATE (INT)  COUNT (INT)
CA               Santa Clara County  201502      2361
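To illustrate how a partition key and clustering columns shape reads, here is a small in-memory sketch of the by_county_rt_msgs layout. It is not the datastore's API: the class `RtMessageTable` is hypothetical, and it assumes that STATE and COUNTY form the partition key while DATE and TIME are the clustering columns, since the slide's legend does not survive in text form. The second row is made-up sample data.

```python
from bisect import insort
from collections import defaultdict


class RtMessageTable:
    """In-memory stand-in for by_county_rt_msgs (hypothetical, for illustration)."""

    def __init__(self):
        # One list of rows per partition, keyed by (state, county).
        self._partitions = defaultdict(list)

    def insert(self, state, county, date, time, message):
        # Rows stay sorted by the assumed clustering columns (date, time).
        insort(self._partitions[(state, county)], (date, time, message))

    def latest(self, state, county, limit=5):
        """Return up to `limit` rows for one county, newest first."""
        rows = self._partitions.get((state, county), [])
        return rows[-limit:][::-1]


table = RtMessageTable()
table.insert("CA", "Santa Clara County", 20150205, 124523, "JSON_msg")
table.insert("CA", "Santa Clara County", 20150205, 90000, "JSON_msg_earlier")  # made-up

newest = table.latest("CA", "Santa Clara County", limit=1)
print(newest)  # prints: [(20150205, 124523, 'JSON_msg')]
```

Keeping each county's rows together and ordered by date and time means a "most recent messages for this county" query touches a single partition and reads a contiguous slice.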
Austin Ouyang
Previous employment: Dynetics, Inc. (RF Systems Engineer)
Education: MS Biomedical Engineering (University of Texas Southwestern)
BS Electrical Engineering (University of Illinois at Urbana-Champaign)
Hobbies: algorithmic futures trading, rock climbing, and cycling
Contact: [email protected], http://github.com/aouyang1