Lessons Learned from Building real-life Recsys
Xavier Amatriain (Quora) Deepak Agarwal (LinkedIn)
What is a recommender system?
A recommender system recommends items to users to optimize a utility composed of one or more objectives. Almost every website is powered by a recommender system.
Web Recommender Problem
User i with user features xi
(demographics, browse history, geo-location, search history, topics of questions answered, Topics interested in, …)
visits
item j with item features xj
(keywords, content categories, author, ...)
Algorithm selects
(i, j) : response yij
Interaction (click, share, like, answer, ask, follow, …) or no interaction
Which item should we select? • The one with highest predicted utility • The one most useful for improving the utility prediction model
Exploit Explore
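The exploit/explore tradeoff above is commonly handled with a bandit-style policy. A minimal epsilon-greedy sketch (the epsilon-greedy choice, function names, and epsilon value are assumptions; the slide names the tradeoff, not an algorithm):

```python
import random

def select_item(items, predicted_utility, epsilon=0.1, rng=random):
    """Epsilon-greedy: usually exploit the item with the highest
    predicted utility; occasionally explore a random item to
    gather training data for the utility prediction model."""
    if rng.random() < epsilon:
        return rng.choice(items)               # explore
    return max(items, key=predicted_utility)   # exploit
```

With epsilon = 0 this reduces to pure exploitation; raising epsilon spends more traffic on improving the model.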
Today • We are going to talk about recommender systems at LinkedIn and Quora
Agenda • Recommender Systems at LinkedIn (Deepak)
• Context & Overview • End-to-end of recommender systems in practice:
• Examples --- Jobs Recommendation, LinkedIn Feed • Lessons Learned
• Recommender Systems at Quora (Xavier) • Context & Overview • Lessons Learned
• Conclusion (Xavier)
Our vision Create economic opportunity for every member of the global workforce
Our mission Connect the world’s professionals to make them more productive and successful
Our core value Members first!
Companies Jobs Skills People Schools Knowledge
Actors and value propositions
Value proposition for Users (Members)
CONNECT
with your professional world
STAY INFORMED
through professional news and knowledge
GET HIRED
and build your career
Value proposition for Customers
HIRE MARKET SELL @WORK
Several Recommendation Problems
• Member experience • LinkedIn Feed • PYMK (People You May Know) • Job recommendation • …
Recommendation Problems continued ….
• Customer experience • Recruiter (source candidates for recruiters) • Sales Solutions (close deals with companies) • LinkedIn Learning (course recommendation) • Recommend user segments in advertising
Recommendations: Delivery Mechanisms
• Pull Model: Serve the most relevant recommendations when the user visits • Desktop, mobile web, mobile app, tablet, …
• Push Model: Get in touch with user to deliver recommendations {Email, Notifications}
• Higher relevance bar (do not spam and inundate the users) • Right message, right user, right time, right frequency, right channel
Done through ML and optimization
MATCH-MAKING: Know your items, your users and their interactions
User Characteristics
• Profile information: title, seniority, skills, education, endorsements, presentations, …
• Behavioral: activities, search, …
• Edge features (ego-centric network): connection strength, content affinities, …
Professional profile of record
Item Features
• Articles: author, sharer, keywords, named entities, topics, category, likes, comments, latent representation, etc.
• Jobs: company, title, skills, keywords, geo, …
• …
User Intent
• Why are you here? • Hire, get hired, stay informed, grow network, nurture connections, sell, market, …
• Explicit (e.g., visiting jobs homepage, search query), • Implicit (needs to be inferred, e.g., based on activities)
How to Scale Recommendations?
• Formulate objectives to optimize
• Optimize via ML models • incorporate both implicit and explicit signals about user and items
• Automate
Connecting long-term objectives to proxies that can be optimized by machines/algorithms
• Long-term objectives: return visits to site, connections, quality job applies, …
• Short-term proxies: CTR, connection probability, apply probability, …
• Large-scale optimization via ML, UI changes, …
Experiment Learn Innovate
Automation Optimize proxies with short feedback loop via Machine Learning !!
[Diagram] Input signals — Whom? (user profile, user intent), What? (item filtering, item understanding), Context, Interaction data — feed a machine-learning pipeline that SCOREs items (P(click), P(share), similarity, …) and then RANKs them (sort by score, multi-objective, business rules).
Under the Hood of a Typical Recommender System at LinkedIn
Example Application: Job Recommendation
Objective: Job Applications
Predict the probability that a user i would apply for a job j
given … • User features
• Profile: Industry, skills, job positions, companies, education • Network: Connection patterns
• Item (i.e., job) features • Source: Poster, company, location • Content: Keywords, title, skills
• Data about users’ past interactions with different types of items • Items: Jobs, articles, updates, courses, comments • Interactions: Apply, click, like, share, connect, follow
System Architecture
[Diagram] The user hits the Front End Service, which calls the Ranking Service backed by an Item Index and User Feature Stores.
• Online: Front End Service, Ranking Service, Item Index, User Feature Stores, User DB, Item DB, Live Index Updater
• Streaming: Data Stream Processing over User Activity Data Streams; User Feature Pipelines, Item Feature Pipelines
• Offline: Offline Data Pipelines, Model Training Pipelines, Offline Index Builder; ETL between the online and offline sides
• Infrastructure: Photon-ML; Apache Hadoop, Pig, Scalding, Spark, …; Search Index; Experimentation platform; Ranking Library
Feature Generation • Types: User features, item features, activity features • Processing methods: Streaming, offline
Streaming example: skills required by a new job j
[Diagram] Job DB → Kafka → Skill Extraction Pipeline → Live Index Updater → Item Index
• Skill extractor: an ML model that predicts p(job j requires skill s) based on the job description, …; skills are standardized
• Kafka: distributed data/event delivery and queueing system
• Metadata and data ETLed to Hadoop
Model Training
[Diagram] Feature processing: raw user features and raw item features each pass through a DAG of transformers, yielding a feature vector xi for user i, a feature vector zj for item j, and a matching feature vector mij (trees, similarities).
Parameter learning: p(i applies for j) = f( xi, zj, mij | θ, αi, βj ), with a global parameter vector θ, a parameter vector αi for each user i, and a parameter vector βj for each item j.
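The scoring function above can be made concrete. A minimal sketch, assuming an additive logistic form in the spirit of Photon-ML's GLMix models (the exact form of f, the feature layout, and all names here are assumptions):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def p_apply(x_i, z_j, m_ij, theta, alpha_i, beta_j):
    """p(user i applies for job j) = f(xi, zj, mij | theta, alpha_i, beta_j):
    a global linear score over all features, plus per-user and per-item
    corrections, through a logistic link (form assumed)."""
    global_score = (dot(theta["x"], x_i) + dot(theta["z"], z_j)
                    + dot(theta["m"], m_ij))
    user_score = dot(alpha_i, z_j)  # per-user coefficients on item features
    item_score = dot(beta_j, x_i)   # per-item coefficients on user features
    return 1.0 / (1.0 + math.exp(-(global_score + user_score + item_score)))
```

The per-user αi and per-item βj let the model personalize beyond what the shared θ captures, at the cost of storing one parameter vector per entity.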
Model Deployment
[Diagram] The learned model p(i applies for j) = f( xi, zj, mij | θ, αi, βj ) is pushed to serving: the global parameter vector θ and each user's parameter vector αi go to the User Feature Stores; each item's parameter vector βj goes to the Item Index via the Live Index Updater.
Online Ranking
[Diagram] The user hits the Front End Service, which calls the Ranking Service; the Ranking Service fetches user features & parameters from the User Feature Stores and scores candidates from the Item Index. User Activity Data Streams feed Data Stream Processing and the User Feature Pipelines online, and are ETLed to the Offline Data Pipelines (Model Training Pipelines, Offline Index Builder) offline; the Live Index Updater applies updates from the Item DB and Item Feature Pipelines to the Item Index.
Online A/B Experiments
• Experiment setting: select a user segment; allocate traffic to different models
• Result reporting: report experimental results; impact on a large number of metrics
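Traffic allocation for such experiments is typically done by deterministic hashing, so a user stays in the same variant across visits. A minimal sketch (the hashing scheme and all names are assumptions, not LinkedIn's actual platform):

```python
import hashlib

def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to one of several model variants.
    variants: list of (name, traffic fraction) pairs summing to 1.0."""
    h = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(h[:8], 16) / 0xFFFFFFFF  # pseudo-uniform in [0, 1]
    cumulative = 0.0
    for name, fraction in variants:
        cumulative += fraction
        if bucket <= cumulative:
            return name
    return variants[-1][0]  # guard against float rounding
```

Salting the hash with the experiment name keeps bucket assignments independent across concurrently running experiments.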
LinkedIn Feed
The Feed
• Deliver on the Value Propositions: • Stay connected with your Network (your network is your identity!) • Ability to build your professional reputation • Stay informed with relevant professional knowledge • Discover opportunities • Generate revenue (directly or indirectly)
Function of the Feed
• Heterogeneity of types • Organic content
• Articles by influencers, articles by network, shares by network, content by topic (follows), jobs, PYMK, group discussions, etc.
• Sponsored • Sponsored updates, job ads, …
Challenges of the Feed
The Feed: not all types are equal
[Figure] Action rates per type (normalized)
Impression Discounting
• Reduce the chance of showing the same item to the same user repeatedly
• Decay the score of an item based on #times that the user saw the item before
• Using real-time feedback
• Discounting by user segments and item types
[Figure] Impression discounting curves of a few item types, plus a global curve (over all types)
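The count-based decay above can be sketched as follows (the exponential form and the decay constant are assumptions; the slides only state that scores are decayed by prior impression count, per user segment and item type):

```python
import math

def discounted_score(base_score, impressions_seen, decay=0.3):
    """Decay an item's relevance score by the number of times this
    user has already seen it (exponential form assumed). In practice
    the decay rate would vary by user segment and item type."""
    return base_score * math.exp(-decay * impressions_seen)
```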
Diversification • Users’ experience deteriorates when exposed to the same kind of items multiple
times on the same page
• Decay relevance scores of repeat items from the same actor and of the same type
Discounting actor repetitions. CTR drop for adjacent group discussions: 2 adjacent discussions → 21% drop; 3 adjacent discussions → 48% drop.
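The repeat-item decay described above might look like this (the multiplicative penalty and its value are assumptions):

```python
def diversify(ranked_items, penalty=0.7):
    """Re-rank a scored feed page: each repeat of the same
    (actor, item type) pair multiplies the item's score by a
    penalty < 1, pushing near-duplicates down the page."""
    seen = {}
    rescored = []
    for item_id, actor, item_type, score in ranked_items:
        repeats = seen.get((actor, item_type), 0)
        rescored.append((item_id, score * penalty ** repeats))
        seen[(actor, item_type)] = repeats + 1
    return sorted(rescored, key=lambda t: -t[1])
```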
How to Combine Different Objectives
• The feed system serves updates based on relevance scores
• Adjust the serving strategy to optimize revenue while enforcing engagement (e.g., CTR) constraints
• For user x and item i, rank by: eCPI(i|x) + SB * pCTR(i|x)
• Maximize revenue such that engagement >= engagement target
– eCPI: estimated revenue for a given update – For organic updates, eCPI = 0 – SB: shadow bid (intrinsic valuation of organic clicks to LinkedIn)
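The shadow-bid ranking rule can be written down directly (function and variable names are assumptions):

```python
def blended_score(ecpi, pctr, shadow_bid):
    """Rank score = eCPI(i|x) + SB * pCTR(i|x). Organic updates have
    eCPI = 0, so they compete purely on shadow-bid-weighted engagement."""
    return ecpi + shadow_bid * pctr

def rank_feed(updates, shadow_bid):
    """updates: list of (update_id, ecpi, pctr); highest score first."""
    return sorted(updates, key=lambda u: -blended_score(u[1], u[2], shadow_bid))
```

Raising SB favors engagement (the conservative setting); lowering it favors revenue (the aggressive setting), which traces out the tradeoff curve described on the next slide.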
Tradeoff Points and the Efficient Frontier
[Figure] Revenue gain (relative) vs. engagement gain (relative): a conservative operating point (high SB) trades revenue for engagement, an aggressive one (low SB) does the opposite, and both dominate the original system (no optimization); a better efficient frontier permits even more aggressive points (very low SB).
Encouraging Viral loops: Some heuristics • Value of share, comment, like > Value of click • Rank by using linear combination of CTR and Viral Action Rates
• Lose CTR but gain more viral actions (shares, likes, comments) • Increasing viral actions increases unique user visits & feed sessions
• Viral action triggers notification to actors in many cases (e.g., like/comment on a post written by your connection)
• Encourage users to share/comment/like more • Boost article scores by users who share good stuff and who don’t share very often
• May lose some CTR in short-term but increase cohort that shares on LinkedIn
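The linear combination of CTR and viral action rates might be sketched as follows (the specific weights are assumptions, chosen only to reflect "value of share, comment, like > value of click"):

```python
def viral_score(pctr, pshare, pcomment, plike,
                w_click=1.0, w_share=4.0, w_comment=3.0, w_like=2.0):
    """Linear combination of predicted action rates; viral actions
    (share, comment, like) are weighted above a plain click, so an
    item likely to be shared can outrank a slightly more clickable one."""
    return (w_click * pctr + w_share * pshare
            + w_comment * pcomment + w_like * plike)
```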
The Feed: a three-stage ranker
[Diagram]
• First stage: each update type (Update Type 1 … Update Type N) scores and orders its potential updates
• Second stage (blending results): rank-orders every update using an ML model
• Third stage (multiple objectives): adjusts for diversity, impression discounting, and the balance of objectives (engagement & revenue)
LESSONS LEARNT
1. Cost of a Bad Recommendation
• How can ML be applied when a few bad recommendations can hurt the brand? • Maximize precision without significantly hurting performance metrics
• Collect negative feedback from users, crowd; incorporate within algorithms • Create better product focus, filter unnecessary content from inventory
• E.g., unprofessional content on Feed
• Better insights/explanations associated with recommendations help build trust
2. Data Tracking • Proper data tracking and monitoring is not always easy!
• Data literacy and understanding across organization (front-end, UI, SRE) • Proper tooling, continuous monitoring very important to scale the process
• Our philosophy: Loose coupling between FE and BE teams! • FE (client) emits limited events along with trackingid • BE emits more details and joins against trackingid
• Tracking events can have significant impact • View-port tracking (what the user actually saw) for more informative negatives
3. Content Inventory
• A high-quality, comprehensive content inventory is as important as the recommendation algorithms
• Examples: Learning, Jobs, Feed
• Supply and demand analysis, gap analysis, proactively producing more high quality content for inventory
4. A/B Testing with Network Interference • Random treatment assignments (spillover effects, need to adjust)
• Treatment recommendations affect control group as well
• A like/share in treatment may create a new item when ranking in control