Use Case: UberEATS ETD Prediction
[Architecture diagram: model training runs as batch jobs on HADOOP / YARN against the Hive Feature Store; real-time prediction reads the Cassandra Feature Store over the network (realtime).]
Use Case: UberEATS ETD ML Pipeline
[Pipeline diagram: Hive data feeds the Feature Store; Model Training produces a Model that the UberEATS App queries for ETD; Model Performance is monitored.]
Problems
● Hard to figure out good features
● Hard to build the pipelines to generate features
● Can't compute some features in real time
Solution: DSL and Feature Store
● Database of curated and crowd-sourced features
● Make it easy to use and transform these features in ML projects
● Make it easy to discover new useful features
● Batch and realtime serving
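The feature-store idea above can be sketched in a few lines. This is a hypothetical, simplified API for illustration only (Michelangelo's actual DSL is not shown in these slides); `nearest`, `get_feature`, and the in-memory `FEATURE_STORE` are all assumptions:

```python
# Sketch of a feature-DSL style lookup (hypothetical API, not Uber's actual
# interface): curated features are fetched from a shared store by name and
# reshaped with small composable transforms before reaching the model.

def nearest(value, granularity):
    """Round a raw value to a coarser bucket (a typical DSL transform)."""
    return round(value / granularity) * granularity

# Toy in-memory "feature store": curated features keyed by entity id.
FEATURE_STORE = {
    ("restaurant", 42): {"avg_prep_time_min": 13.7, "avg_demand_lunch": 88.2},
}

def get_feature(entity_type, entity_id, name):
    return FEATURE_STORE[(entity_type, entity_id)][name]

# Model-ready features: basis features pulled by name, then transformed.
features = {
    "prep_time_bucket": nearest(
        get_feature("restaurant", 42, "avg_prep_time_min"), 5),
    "lunch_demand": get_feature("restaurant", 42, "avg_demand_lunch"),
}
```

Because features are addressed by name, the same lookup-and-transform expression can run in a batch training job and in the real-time serving path.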
Data Pipeline For Predictions
[Diagram: Data Lake → (Spark or SQL) → Basis Features → Feature DSL → Transformed Features → ML Model → Predictions]
Data Pipeline For Predictions w/ Feature Palette
[Diagram: Data Lake and Feature Store → (Spark or SQL) → Basis Features → Feature DSL → Transformed Features → ML Model → Predictions]
Use Case: UberEATS ETD Model Details
[Diagram: the Feature Store feeds a GBT Regression model serving ETD to the UberEATS App]
● restaurant features
○ location, avg prep-time, avg delivery time, avg demand during lunch ...
● contextual features
○ time of day, day of week, ...
● order features
○ #items, total cost, ...
● near real-time features
○ info about the past N orders, ...
● ...
● Feature store provides aggregate features for real-time prediction
○ These features are time-consuming to compute in real-time
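The near real-time aggregates above (e.g. info about the past N orders) are the kind of feature that is too slow to compute at request time. A minimal sketch of the idea, assuming a streaming job maintains the window and the serving path only reads the result (the `RecentOrderStats` class is illustrative, not Uber's code):

```python
# Sketch: maintain an aggregate over the last N orders incrementally,
# so real-time prediction reads a precomputed value instead of scanning
# order history on every request.
from collections import deque

class RecentOrderStats:
    """Aggregates over the last N orders for one restaurant."""
    def __init__(self, n):
        self.orders = deque(maxlen=n)   # old orders fall out automatically

    def record(self, prep_time_min):
        self.orders.append(prep_time_min)

    def features(self):
        if not self.orders:
            return {"recent_avg_prep": None, "recent_order_count": 0}
        return {
            "recent_avg_prep": sum(self.orders) / len(self.orders),
            "recent_order_count": len(self.orders),
        }

stats = RecentOrderStats(n=3)
for t in [10, 12, 20, 14]:      # the oldest order (10) leaves the window
    stats.record(t)
print(stats.features())
```

In a production feature store, the updated aggregate would be written to a low-latency store (Cassandra, in the architecture shown earlier) keyed by restaurant.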
Problem
● Often you want to train a model per city
● But it is hard to train and manage 400+ models for a project
Solution
● Let users define a partitioning scheme
● Automatically train a model per partition
● Manage and deploy as a single logical model
1. Define partition scheme
2. Make train / test split
3. Keep same split for each level
[Tree diagrams: a partition hierarchy with a model (M) at each node, shown at each step]
4. Train model for every node
5. Prune bad models
6. At prediction time, use best model for each node
Use Case: UberEATS ETD Prediction Performance
● Partitioned GBDT Regression Model
● Latency (measured from client)
○ p50: 7ms
○ p95: 15ms
○ p99: 20ms
Conclusion
● We present a scalable ML-as-a-service system
● We focus on the scalability challenges and solutions
○ The feature store is key to enabling aggregate features for real-time prediction
■ Same API to access the feature store for both batch training and real-time prediction
○ Partitioned models greatly simplify model management and selection
■ Per-city model performance is often worse than the global model's
○ A scalable, low-latency real-time prediction service enables interactive user experiences
■ Load balancing across containers without global state
■ Fast one-button deployment
■ Hot-swap model upgrade
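The hot-swap upgrade mentioned above can be sketched as an atomic reference swap. This is an assumed mechanism for illustration (the slides do not describe the implementation); the `ModelHolder` class and its API are hypothetical:

```python
# Sketch of hot-swap model upgrade: the serving process holds the current
# model behind one guarded reference. Deployment loads the new model off
# the request path, then swaps the reference, so in-flight requests never
# see a half-loaded model and no restart is needed.
import threading

class ModelHolder:
    def __init__(self, model):
        self._lock = threading.Lock()
        self._model = model

    def swap(self, new_model):
        with self._lock:
            old, self._model = self._model, new_model
        return old                    # retire the old model once drained

    def predict(self, x):
        with self._lock:
            model = self._model       # grab a consistent reference
        return model(x)               # inference runs outside the lock

holder = ModelHolder(lambda x: "v1")
v1_answer = holder.predict(None)
holder.swap(lambda x: "v2")           # one-button deploy: swap in place
v2_answer = holder.predict(None)
```

Keeping inference outside the lock means the swap only contends for the brief reference read, which matters at the p99-latency budgets quoted above.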