Scaling Recommendations at Quora
Nikhil Dandekar @nikhilbd
9/16/2016
Quora’s Mission
“To share and grow the world’s knowledge”
● Millions of questions & answers
● Millions of users
● Over a million topics
● Growing exponentially...
Lots of high-quality textual information
Lots of data relations
Agenda
● Scaling the home page feed
● Scaling the Machine Learning environment
● Pragmatism: aka don’t chase every new, shiny object
Scaling the Home Page Feed
Recommendations at Quora
● Home feed
● Digest emails
● Topics to follow
● Users to follow
● Related Questions
● Related Topics (topic → topic)
● Trending topics
● …..
Home feed
● Goal: personalized, engaging experience for reading/writing
● Show a ranked list of stories (questions/answers)
● ML model predicts an interestingness score for each story
● Training data:
○ impression logs from the past
○ x: features about user/story/interactions
○ y: score based on actions (answer/follow, upvote/click)
What is interestingness?
[Diagram: positive and negative user actions on a story (click, upvote, downvote, expand, share, answer, pass, follow)]
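As a toy illustration of turning these logged actions into a training label, here is a minimal sketch; the action weights below are hypothetical stand-ins, not Quora’s actual values.

    # Minimal sketch: derive a y label for one (user, story) impression
    # from the actions the user took. Weights are made-up illustrations.
    ACTION_WEIGHTS = {
        "answer": 10.0,
        "follow": 5.0,
        "upvote": 3.0,
        "share": 3.0,
        "expand": 1.0,
        "click": 1.0,
        "pass": 0.0,
        "downvote": -5.0,
    }

    def interestingness_label(actions):
        """Score one impression from the logged actions on it."""
        return sum(ACTION_WEIGHTS.get(a, 0.0) for a in actions)

    print(interestingness_label(["click", "upvote"]))  # 4.0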
Performance and Cost
[Diagram: for each of x millions of users, personalized ranking narrows millions of questions and answers down to the best 20]
Scaling challenge:
● Content growing exponentially
○ Time spent per ranking request growing exponentially
● Users growing exponentially
○ Number of ranking requests growing exponentially
● Computational resources spent on ranking growing quadratically with respect to user growth
● Solution: multi-phase ranking! (see the sketch below)
● Use an unpersonalized model to reduce the number of candidates for the personalized model
● Cache the computed score in storage
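A minimal sketch of the two-phase idea, assuming stand-in scoring functions; the names and size parameters below are illustrative, not Quora’s actual API.

    # Phase 1 prunes with a cheap, unpersonalized score (cacheable per
    # story); phase 2 spends the expensive personalized model only on
    # the shortlist. All names here are stand-ins for illustration.
    def rank_feed(user, candidates, cheap_score, personalized_score,
                  shortlist_size=1000, page_size=20):
        shortlist = sorted(candidates, key=cheap_score,
                           reverse=True)[:shortlist_size]
        ranked = sorted(shortlist,
                        key=lambda story: personalized_score(user, story),
                        reverse=True)
        return ranked[:page_size]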
Performance and Cost
[Diagram: two-phase funnel. An unpersonalized (1p) ranker narrows millions of questions and answers down to thousands; a personalized (2p) ranker then picks the best 20, for each of x millions of users]
Feed backend system
[Diagram: requests from the web tier (Python) hit an aggregator, keyed by user_id; aggregators fan out to leaf servers, which are sharded by object_id]
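A minimal, in-process sketch of this scatter-gather pattern; the sharding scheme, scoring stub, and shard count are assumptions for illustration, not the real distributed service.

    import heapq

    NUM_LEAVES = 3

    def score(user_id, story_id):
        # Stand-in for the real ranking model
        return (hash((user_id, story_id)) % 1000) / 1000.0

    def leaf_rank(shard, user_id, k):
        """Each leaf scores its local shard and returns its top-k."""
        return heapq.nlargest(k, ((score(user_id, s), s) for s in shard))

    def aggregate(user_id, shards, k=20):
        """The aggregator merges per-leaf partials into a global top-k."""
        partials = (pair for shard in shards
                    for pair in leaf_rank(shard, user_id, k))
        return heapq.nlargest(k, partials)

    # Stories sharded across leaves by object_id
    shards = [[s for s in range(100) if s % NUM_LEAVES == i]
              for i in range(NUM_LEAVES)]
    top_20 = aggregate(user_id=42, shards=shards)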
Scaling the Machine Learning Environment
ML applications
● Feed / digest
● Search
● Answer ranking / Answer collapsing
● User-user, user-topic recommendations
● Related questions
● Duplicate questions
● Question-topics
● Question quality
● Spam users / content
● ….and a lot more
Machine Learning environment
ML Models
● Logistic Regression
● Gradient Boosted Decision Trees
● LambdaMART
● Random Forest
● Matrix Factorization
● Deep Neural Networks
● LDA
● k-means
● k-NNs
● ...and others
Machine Learning environment
● Productionizing ML training:
○ Continuous retraining of models to adapt to new data
○ Use Luigi to keep track of task dependencies (see the sketch below)
● Use Amazon EC2 spot instances for training tasks
○ Usually much cheaper than the on-demand price
○ Can spawn multiple boxes at once and shut them down after training is complete
● Extremely important to have automatic monitoring of each task’s input/output
○ Data can change in unexpected ways
○ Don’t want bugs in upstream models to affect downstream models
[Diagram: Luigi task graph in which a data populator feeds training tasks for models 1-3; each step verifies the data (counts, class proportions, ...) and the resulting metrics (MSE, R2, AUC, ...)]
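A minimal Luigi sketch of this kind of pipeline; the task names, file targets, and verification checks are hypothetical stand-ins for the real jobs.

    import luigi

    class PopulateData(luigi.Task):
        def output(self):
            return luigi.LocalTarget("training_data.csv")

        def run(self):
            with self.output().open("w") as f:
                f.write("x,y\n1,0\n2,1\n")

    class TrainModel(luigi.Task):
        model_id = luigi.IntParameter()

        def requires(self):
            # Luigi tracks this dependency and reruns upstream as needed
            return PopulateData()

        def output(self):
            return luigi.LocalTarget(f"model_{self.model_id}.txt")

        def run(self):
            with self.input().open() as f:
                rows = f.read().splitlines()[1:]
            # Verify data before training (counts, class proportions, ...)
            assert len(rows) > 0, "upstream data is empty"
            with self.output().open("w") as f:
                f.write(f"model {self.model_id}: {len(rows)} rows\n")

    if __name__ == "__main__":
        luigi.build([TrainModel(model_id=i) for i in (1, 2, 3)],
                    local_scheduler=True)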
Machine Learning platform goals
● Need an ML platform that is:
○ Easy to ramp up on
○ Easy to iterate on
○ Fast
○ Reliable
○ Reusable
○ Production-ready
Machine Learning platform
● Have a centralized ML platform that is shared across teams
○ Write training scripts in C++/Python and run them on remote boxes
○ Provide Python wrappers with IPython integration
○ Store data on Redshift/S3 and have training boxes communicate with them directly (see the sketch below)
[Diagram: the dev laptop drives CPU/GPU training boxes, which talk directly to storage services (Redshift, S3, ...)]
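For the storage path, a hedged sketch of a training box pulling its dataset straight from S3; the bucket and key names are hypothetical, and it assumes AWS credentials plus the boto3 package.

    import boto3

    # Hypothetical bucket/key: the training box reads its data directly
    # from S3 instead of routing it through the developer's laptop.
    s3 = boto3.client("s3")
    s3.download_file("example-ml-bucket",
                     "datasets/feed_training.csv",
                     "/tmp/feed_training.csv")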
Lego ML platform
[Screenshot: using the platform from an IPython notebook]
Alchemy Feature Engineering Framework
● Single way to define and add ML features (see the sketch below)
● Features are reusable
○ Different ML applications do not define / calculate them separately
● Available both offline (training time) and online (prediction time)
● Single point for logging, monitoring, documentation, etc.
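A minimal sketch of the define-once, use-everywhere idea; the registry/decorator API below is invented for illustration and is not Alchemy’s actual interface.

    # Hypothetical feature registry: each feature is defined once and
    # shared by training (offline) and prediction (online) code paths.
    FEATURE_REGISTRY = {}

    def feature(name):
        """Register an extractor under a single canonical name."""
        def decorator(fn):
            FEATURE_REGISTRY[name] = fn
            return fn
        return decorator

    @feature("story_answer_count")
    def story_answer_count(user, story):
        return len(story.get("answers", []))

    @feature("user_follows_topic")
    def user_follows_topic(user, story):
        return int(story.get("topic") in user.get("followed_topics", set()))

    def feature_vector(user, story):
        # Same code path offline (training) and online (prediction)
        return {name: fn(user, story)
                for name, fn in FEATURE_REGISTRY.items()}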
Pragmatism
What matters for your ML algorithm:
● Relevance
● Speed: fast prediction, (relatively) fast training
● Fast development and iteration time
● Reliability / Robustness
● Cost
● Debuggability
● Low technical debt
Occam’s razor for Machine Learning
● Given two models that perform more or less equally, you should always prefer the less complex one
● E.g., a Deep Learning model:
○ +1% in accuracy
○ 10x training time
○ 1.5x prediction time
○ Costly to store and maintain
● Look at all the factors, not just relevance
Distributing ML training
● Distributed ML training helps you scale with data
● But most of what people do in practice can fit into a single, multi-core machine (see the sketch below)
● Trade-offs:
○ Relevance gains
○ Training speed
○ Development and iteration time
○ Costs
● Use what works best given these factors, with an eye out for the future
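As one way to exhaust a single box before going distributed, here is a sketch using scikit-learn’s Random Forest (one of the models listed earlier) across all cores; the data is synthetic and the hyperparameters are illustrative.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic data standing in for a real training set
    X, y = make_classification(n_samples=100_000, n_features=50,
                               random_state=0)

    # n_jobs=-1 trains the trees in parallel on every available core
    model = RandomForestClassifier(n_estimators=100, n_jobs=-1)
    model.fit(X, y)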
In summary
● Figure out how to scale up your data and your models
● But scaling is not just about data and the models
○ Think about your ML environment too
● Be pragmatic
○ Don’t chase every new, shiny object
We are hiring!
● https://www.quora.com/careers
● Technical Lead - Machine Learning
● Software Engineer - Machine Learning
● Software Engineer - NLP
● Engineering Manager - Machine Learning
Thanks!