Use Case: UberEATS ETD Prediction
[Architecture diagram: model training runs as batch jobs on HADOOP / YARN against the Hive Feature Store; real-time prediction reads the Cassandra Feature Store over the network (realtime).]
Use Case: UberEATS ETD ML Pipeline
[Pipeline diagram: Hive data feeds the Feature Store; Model Training produces a Model that the UberEATS App queries for ETD; Model Performance is monitored.]
Problems
● Hard to figure out good features
● Hard to build the pipelines to generate features
● Can't compute some features in real time
Solution: DSL and Feature Store
● Database of curated and crowd-sourced features
● Make it easy to use and transform these features in ML projects
● Make it easy to discover new useful features
● Batch and realtime serving
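The feature-store idea above can be sketched in a few lines. This is a hypothetical, simplified API for illustration only (Michelangelo's actual DSL is not shown in these slides); `nearest`, `get_feature`, and the in-memory `FEATURE_STORE` are all assumptions:

```python
# Sketch of a feature-DSL style lookup (hypothetical API, not Uber's actual
# interface): curated features are fetched from a shared store by name and
# reshaped with small composable transforms before reaching the model.

def nearest(value, granularity):
    """Round a raw value to a coarser bucket (a typical DSL transform)."""
    return round(value / granularity) * granularity

# Toy in-memory "feature store": curated features keyed by entity id.
FEATURE_STORE = {
    ("restaurant", 42): {"avg_prep_time_min": 13.7, "avg_demand_lunch": 88.2},
}

def get_feature(entity_type, entity_id, name):
    return FEATURE_STORE[(entity_type, entity_id)][name]

# Model-ready features: basis features pulled by name, then transformed.
features = {
    "prep_time_bucket": nearest(
        get_feature("restaurant", 42, "avg_prep_time_min"), 5),
    "lunch_demand": get_feature("restaurant", 42, "avg_demand_lunch"),
}
```

Because features are addressed by name, the same lookup-and-transform expression can run in a batch training job and in the real-time serving path.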
Data Pipeline For Predictions
[Diagram: Data Lake → (Spark or SQL) → Basis Features → Feature DSL → Transformed Features → ML Model → Predictions]
Data Pipeline For Predictions w/ Feature Palette
[Diagram: Data Lake and Feature Store → (Spark or SQL) → Basis Features → Feature DSL → Transformed Features → ML Model → Predictions]
Use Case: UberEATS ETD Model Details
[Diagram: the Feature Store feeds a GBT Regression model serving ETD to the UberEATS App]
● restaurant features
○ location, avg prep-time, avg delivery time, avg demand during lunch ...
● contextual features
○ time of day, day of week, ...
● order features
○ #items, total cost, ...
● near real-time features
○ info about the past N orders, ...
● ...
● Feature store provides aggregate features for real-time prediction
○ These features are time-consuming to compute in real-time
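The near real-time aggregates above (e.g. info about the past N orders) are the kind of feature that is too slow to compute at request time. A minimal sketch of the idea, assuming a streaming job maintains the window and the serving path only reads the result (the `RecentOrderStats` class is illustrative, not Uber's code):

```python
# Sketch: maintain an aggregate over the last N orders incrementally,
# so real-time prediction reads a precomputed value instead of scanning
# order history on every request.
from collections import deque

class RecentOrderStats:
    """Aggregates over the last N orders for one restaurant."""
    def __init__(self, n):
        self.orders = deque(maxlen=n)   # old orders fall out automatically

    def record(self, prep_time_min):
        self.orders.append(prep_time_min)

    def features(self):
        if not self.orders:
            return {"recent_avg_prep": None, "recent_order_count": 0}
        return {
            "recent_avg_prep": sum(self.orders) / len(self.orders),
            "recent_order_count": len(self.orders),
        }

stats = RecentOrderStats(n=3)
for t in [10, 12, 20, 14]:      # the oldest order (10) leaves the window
    stats.record(t)
print(stats.features())
```

In a production feature store, the updated aggregate would be written to a low-latency store (Cassandra, in the architecture shown earlier) keyed by restaurant.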
Problem
● Often you want to train a model per city
● But it is hard to train and manage 400+ models for a project
Solution
● Let users define a partitioning scheme
● Automatically train a model per partition
● Manage and deploy as a single logical model
1. Define partition scheme
2. Make train / test split
3. Keep same split for each level
[Tree diagrams: a partition hierarchy with a model (M) at each node, shown at each step]
4. Train model for every node
5. Prune bad models
6. At prediction time, use best model for each node
Use Case: UberEATS ETD Prediction Performance
● Partitioned GBDT Regression Model
● Latency (measured from client)
○ p50: 7ms
○ p95: 15ms
○ p99: 20ms
Conclusion
● We present a scalable ML-as-a-service system
● We focus on the scalability challenges and solutions
○ The feature store is key to enabling aggregate features for real-time prediction
■ Same API to access the feature store for both batch training and real-time prediction
○ Partitioned models greatly simplify model management and selection
■ Per-city model performance is often worse than the global model's
○ A scalable, low-latency real-time prediction service enables interactive user experiences
■ Load balancing across containers without global state
■ Fast one-button deployment
■ Hot-swap model upgrade
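The hot-swap upgrade mentioned above can be sketched as an atomic reference swap. This is an assumed mechanism for illustration (the slides do not describe the implementation); the `ModelHolder` class and its API are hypothetical:

```python
# Sketch of hot-swap model upgrade: the serving process holds the current
# model behind one guarded reference. Deployment loads the new model off
# the request path, then swaps the reference, so in-flight requests never
# see a half-loaded model and no restart is needed.
import threading

class ModelHolder:
    def __init__(self, model):
        self._lock = threading.Lock()
        self._model = model

    def swap(self, new_model):
        with self._lock:
            old, self._model = self._model, new_model
        return old                    # retire the old model once drained

    def predict(self, x):
        with self._lock:
            model = self._model       # grab a consistent reference
        return model(x)               # inference runs outside the lock

holder = ModelHolder(lambda x: "v1")
v1_answer = holder.predict(None)
holder.swap(lambda x: "v2")           # one-button deploy: swap in place
v2_answer = holder.predict(None)
```

Keeping inference outside the lock means the swap only contends for the brief reference read, which matters at the p99-latency budgets quoted above.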