
Best application paper at IEEE Big Data: Linkedin expertise search


Expertise Search @ LinkedIn

Viet Ha-Thuc, Ganesh Venkataraman, Mario Rodriguez, Shakti Sinha, Senthil Sundaram and Lin Guo

Viet Ha-Thuc


• 200+ countries and territories

• 2+ new members per second


Talent Solutions

Help recruiters and companies search for the right talent with the desired expertise


Agenda

Introduction

Skill Reputation Scores

Personalized Learning-to-Rank

Results & Lessons


Introduction

Skills
– 40K+ standardized skills
– Members get endorsed on skills
– Represent professional expertise


Introduction

Expertise search on LinkedIn
– Queries contain skills and no personal name


Introduction

Unique challenges of LinkedIn expertise search

– Scale: 400M members x 40K standardized skills

– Sparsity of skills in profiles

– Personalization


Agenda

Introduction

Skill Reputation Scores

Personalized Learning-to-Rank

Results & Lessons


Reputation

Information a decision maker uses to make a judgment on an entity with a record (*)

(*) “Building web reputation systems”, Glass and Farmer, 2010


Skill Reputation Scores

Decision Maker: searcher

Record: Professional career

Skill reputation: member expertise on a skill

Judgment: Hire?


Estimating Skill Reputation

Inputs: endorsements, profile, browsemap

Supervised learning algorithm estimates P(expert | member, skill)

Members x Skills matrix (rows: members, columns: skills, ?: unknown):

?    .85  .45
?    ?    .35
?    .42  ?
?    ?    .05


Estimating Skill Reputation

Matrix Factorization

Inputs: endorsements, profile, browsemap

Members x Skills matrix (rows: members, columns: skills, ?: unknown):

?    .85  .45
?    ?    .35
?    .42  ?
?    ?    .05

Member factors. Each row is a representation of a member in latent space:

0.5  1
0.7  0
0    0.6
0.1  0

Skill factors. Each column represents a skill in latent space:

0.2  0.3  0.5
0.5  0.7  0.2


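Estimating Skill Reputation

Inputs: endorsements, profile, browsemap

Members x Skills matrix (rows: members, columns: skills, ?: unknown):

?    .85  .45
?    ?    .35
?    .42  ?
.02  ?    ?

Member factors (members x latent dimensions):

0.5  1
0.7  0
0    0.6
0.1  0

Skill factors (latent dimensions x skills):

0.2  0.3  0.5
0.5  0.7  0.2

Fill in unknown cells in the original matrix (product of the two factor matrices):

.6   .85  .45
.14  .21  .35
.3   .42  .12
.02  .03  .05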
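The fill-in step can be checked numerically. The sketch below multiplies the two factor matrices shown on the slide; the variable names are illustrative and NumPy is an assumption, not something the deck prescribes.

```python
import numpy as np

# Member factors from the slide: 4 members x 2 latent dimensions.
M = np.array([[0.5, 1.0],
              [0.7, 0.0],
              [0.0, 0.6],
              [0.1, 0.0]])

# Skill factors from the slide: 2 latent dimensions x 3 skills.
S = np.array([[0.2, 0.3, 0.5],
              [0.5, 0.7, 0.2]])

# Each cell, known or unknown, is the dot product of a member row and a skill column.
R_completed = M @ S
print(np.round(R_completed, 2))
```

The product reproduces the known cells and supplies a value for every cell that was marked "?".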


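Matrix Factorization

Matrix factorization by Alternating Least Squares optimization

R (Members x Skills, ?: unknown):

?    .85  .45
?    ?    .35
?    .42  ?
.02  ?    ?

M (member factors, held fixed in this step):

0.5  1
0.7  0
0    0.6
0.1  0

S (skill factors): ? (solved in this step)

$S_{i+1} = \arg\min_{S} \lVert R - M_i S \rVert^2$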


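Matrix Factorization

Matrix factorization by Alternating Least Squares optimization

R (Members x Skills, ?: unknown):

?    .85  .45
?    ?    .35
?    .42  ?
.02  ?    ?

M (member factors, marked ? on the slide: re-estimated in this step):

0.5  1
0.7  0
0    0.6
0.1  0

S (skill factors, held fixed in this step):

0.2  0.3  0.5
0.5  0.7  0.2

$M_{i+1} = \arg\min_{M} \lVert R - M S_{i+1} \rVert^2$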
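The two alternating updates can be sketched in a few lines of NumPy. This is an illustration only, not the production setup: it assumes the squared error is taken over observed cells, and it adds a small ridge term `reg` (an assumption, not on the slides) so each least-squares solve stays well-posed. The function name `als` and its parameters are hypothetical.

```python
import numpy as np

def als(R, mask, rank=2, n_iters=20, reg=0.01, seed=0):
    """Alternate between solving for S with M fixed and for M with S fixed."""
    rng = np.random.default_rng(seed)
    n_members, n_skills = R.shape
    M = rng.random((n_members, rank))
    S = rng.random((rank, n_skills))
    ridge = reg * np.eye(rank)  # assumed regularizer, keeps the solves invertible
    for _ in range(n_iters):
        # S step: for each skill column, least squares over members with a known score.
        for j in range(n_skills):
            rows = mask[:, j]
            A = M[rows]
            S[:, j] = np.linalg.solve(A.T @ A + ridge, A.T @ R[rows, j])
        # M step: for each member row, least squares over skills with a known score.
        for i in range(n_members):
            cols = mask[i]
            B = S[:, cols].T
            M[i] = np.linalg.solve(B.T @ B + ridge, B.T @ R[i, cols])
    return M, S

# The sparse example matrix from the slide; NaN stands for "?".
nan = np.nan
R = np.array([[nan, 0.85, 0.45],
              [nan, nan, 0.35],
              [nan, 0.42, nan],
              [0.02, nan, nan]])
mask = ~np.isnan(R)

M_hat, S_hat = als(R, mask)
completed = M_hat @ S_hat  # every cell now has an estimate
print(np.round(completed, 2))
```

With so few observed cells the recovered factors are not unique, so the estimates will generally differ from the hand-picked factors on the slides.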


Matrix Factorization

Matrix factorization by Alternating Least Squares optimization
– Apache Mahout

Uses skill co-occurrence patterns to infer missing skills
– Members who know “Big Data” are also likely to know “Hadoop”


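Skill Reputation Feature

Project a query into latent space: $Q = s_j + s_k$

$\text{Reputation} = m_i \cdot (s_j + s_k) = m_i \cdot s_j + m_i \cdot s_k$

Efficiency: Pre-compute and index member-skill scores $m_i \cdot s_j$

($m_i$: row of the member matrix M; $s_j$, $s_k$: columns of the skill matrix S)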
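Because the dot product is linear, the query-time score is a sum of per-skill scores that can be computed offline. A minimal sketch, reusing the small factor matrices from the earlier slides; the function name `reputation` and the skill indices are hypothetical.

```python
import numpy as np

M = np.array([[0.5, 1.0], [0.7, 0.0], [0.0, 0.6], [0.1, 0.0]])  # member factors
S = np.array([[0.2, 0.3, 0.5], [0.5, 0.7, 0.2]])                # skill factors

# Offline: pre-compute every member-skill score m_i . s_j (this is what gets indexed).
member_skill_scores = M @ S

def reputation(member, query_skills):
    """Query time: sum the pre-computed scores for the skills tagged in the query."""
    return float(sum(member_skill_scores[member, j] for j in query_skills))

# A query tagged with skills j=1 and k=2, scored for member i=0 in two ways.
query_vector = S[:, 1] + S[:, 2]           # Q = s_j + s_k
direct = float(M[0] @ query_vector)        # m_i . (s_j + s_k)
from_index = reputation(0, [1, 2])         # m_i . s_j + m_i . s_k
print(round(direct, 2), round(from_index, 2))
```

Both routes give the same value, so no latent-space arithmetic is needed while serving a query.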


Features

Reputation feature

Social Connection

Homophily
– Geo
– Industry

Textual Features


Agenda

Introduction

Skill Reputation Scores

Personalized Learning-to-Rank

Results & Lessons


Ranking

▪ Manual tuning vs. Learning to Rank (LTR)

▪ Why Learning to Rank?
– Hard to manually tune with a very large number of features
– Challenging to personalize
– LTR allows leveraging a large volume of click data in an automated way


Training Data: click logs

Top-K randomization

Labels by searcher action on a result:
– Uncertain (removed)
– Bad: label = 0
– Good (click): label = 1
– Perfect (InMail): label = 3
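One way to turn a logged result list into these graded labels is sketched below. The slide does not say how "bad" and "uncertain" results are told apart, so the rule used here is an assumption: results with no action that sit above the last acted-on result are treated as bad, and results below it are treated as uncertain and dropped. The function name and the action strings are hypothetical.

```python
def label_results(actions):
    """Map per-result actions (in rank order) to (rank, label) pairs.

    actions: list of "inmail", "click", or None for no action.
    Assumed rule: no action above the last acted-on result = bad (0);
    everything below the last acted-on result is uncertain and removed.
    """
    acted = [i for i, action in enumerate(actions) if action is not None]
    if not acted:
        return []  # no interaction at all: nothing can be labeled
    last_acted = acted[-1]
    labeled = []
    for rank, action in enumerate(actions[: last_acted + 1]):
        if action == "inmail":
            labeled.append((rank, 3))  # perfect
        elif action == "click":
            labeled.append((rank, 1))  # good
        else:
            labeled.append((rank, 0))  # bad
    return labeled

example = label_results([None, "click", None, "inmail", None])
print(example)
```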


Learning to Rank

Coordinate Ascent (listwise)
– Relevance is considered relative to each query
– Allows optimizing the quality metric directly

Objective function
– Normalized Discounted Cumulative Gain (NDCG@K)
– Graded relevance labels
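For reference, NDCG@K over graded labels can be computed as in the sketch below. The slides do not state the gain and discount functions, so the common choices of gain 2^label - 1 and a log2 position discount are assumptions here; the function names are illustrative.

```python
import math

def dcg_at_k(labels, k):
    """Discounted cumulative gain of graded labels listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(labels[:k]))

def ndcg_at_k(labels, k):
    """DCG normalized by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0

# Graded labels as in the training data: 3 = perfect, 1 = good, 0 = bad.
well_ranked = ndcg_at_k([3, 1, 0], k=3)
badly_ranked = ndcg_at_k([0, 1, 3], k=3)
print(well_ranked, badly_ranked)
```

A listwise learner such as coordinate ascent would adjust one feature weight at a time and keep a change only if the average of this metric across queries improves.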


Agenda

Introduction

Skill Reputation Scores

Personalized Learning-to-Rank

Results & Lessons


Experiments

Query Tagging

Target Segment: skill and no-name

Baseline
– No skill reputation feature
– Hand-tuned

Search Products: Flagship and premium

A/B Tests for 4 weeks
– Novelty effect: ignore 1st week
– Size: hundreds of thousands of searches


Results

Improvements over the baseline

           CTR@10   # Messages per Search
Flagship   +11%     +20%
Premium    +18%     +37%


Take-Aways

Going beyond text features
– Exploit structured data

Matrix factorization for large-scale reputation estimation
– 400M members x 40K skills

Personalized Learning-to-Rank is crucial

Full Paper: http://arxiv.org/pdf/1602.04572v1.pdf


We are hiring!

email: [email protected]