DESCRIPTION
Data By The People, For The People
Daniel Tunkelang, Director, Data Science at LinkedIn
Invited Talk at the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012)

LinkedIn has a unique data collection: the 175M+ members who use LinkedIn are also the content those same members access using our information retrieval products. LinkedIn members performed over 4 billion professionally-oriented searches in 2011, most of those to find and discover other people. Every LinkedIn search and recommendation is deeply personalized, reflecting the user's current employment, career history, and professional network. In this talk, I will describe some of the challenges and opportunities that arise from working with this unique corpus. I will discuss work we are doing in the areas of relevance, recommendation, and reputation, as well as the ecosystem we have developed to incent people to provide the high-quality semi-structured profiles that make LinkedIn so useful.

Bio: Daniel Tunkelang leads the data science team at LinkedIn, which analyzes terabytes of data to produce products and insights that serve LinkedIn's members. Prior to LinkedIn, Daniel led a local search quality team at Google. Daniel was a founding employee of faceted search pioneer Endeca (recently acquired by Oracle), where he spent ten years as Chief Scientist. He has authored fourteen patents, written a textbook on faceted search, created the annual workshop on human-computer interaction and information retrieval (HCIR), and participated in the premier research conferences on information retrieval, knowledge management, databases, and data mining (SIGIR, CIKM, SIGMOD, SIAM Data Mining). Daniel holds a PhD in Computer Science from CMU, as well as BS and MS degrees from MIT.
Citation preview
Data By The People, For The People
Daniel Tunkelang, Director, Data Science, LinkedIn
1
Why do 175M+ people use LinkedIn?
2
Identity: find and be found
3
Insights: discover and share knowledge
4
People use LinkedIn because of other people.
5
People as Users + People as Data
Unique opportunities and challenges!
§ Search
§ Recommendations
§ Networking
6
Search
7
People search is personal!
8
But not all relevance factors are personal.
9
Good Bad
People are semi-structured objects.
10
for i in [1..n]
    s ← w1 w2 … wi
    if Pc(s) > 0
        a ← new Segment()
        a.segs ← {s}
        a.prob ← Pc(s)
        B[i] ← {a}
    for j in [1..i-1]
        for b in B[j]
            s ← wj wj+1 … wi
            if Pc(s) > 0
                a ← new Segment()
                a.segs ← b.segs U {s}
                a.prob ← b.prob * Pc(s)
                B[i] ← B[i] U {a}
    sort B[i] by prob
    truncate B[i] to size k
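The pseudocode on this slide can be sketched in Python roughly as follows. This is a minimal illustration, not LinkedIn's implementation. It assumes that B[j] holds the top segmentations of the first j words, so the sketch starts each new segment at word j+1 (the slide writes wj); the concept-probability table TOY_PC and its values are hypothetical and exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Segmentation:
    segs: Tuple[str, ...]  # segments chosen so far, in query order
    prob: float            # product of Pc over those segments


def segment(words: List[str], pc: Callable[[str], float], k: int) -> List[Segmentation]:
    """Return up to k most probable segmentations of `words` (beam of size k)."""
    n = len(words)
    # beams[i] holds the best segmentations of the first i words; beams[0] is unused.
    beams: List[List[Segmentation]] = [[] for _ in range(n + 1)]
    for i in range(1, n + 1):
        # Case 1: the first i words form one segment.
        s = " ".join(words[:i])
        p = pc(s)
        if p > 0:
            beams[i].append(Segmentation((s,), p))
        # Case 2: extend a segmentation of the first j words with words j+1..i.
        # (Assumption: the new segment starts at word j+1, not word j.)
        for j in range(1, i):
            s = " ".join(words[j:i])
            p = pc(s)
            if p > 0:
                for b in beams[j]:
                    beams[i].append(Segmentation(b.segs + (s,), b.prob * p))
        # Keep only the k most probable candidates.
        beams[i].sort(key=lambda a: a.prob, reverse=True)
        del beams[i][k:]
    return beams[n]


# Hypothetical concept probabilities, invented for illustration only.
TOY_PC: Dict[str, float] = {
    "software": 0.1,
    "developer": 0.1,
    "software developer": 0.05,
    "linkedin": 0.2,
}

best = segment("software developer linkedin".split(),
               lambda s: TOY_PC.get(s, 0.0), k=2)
for candidate in best:
    print(candidate.segs, candidate.prob)
```

With these toy numbers the two-word concept "software developer" outranks the word-by-word split, which is the behavior the beam is meant to surface.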
LinkedIn uses scale to derive structure.
11
Software Developer
Social network is more than a ranking signal.
12
People are a gateway to other entities.
13
Search: Summary
14
People finding people.
People being found.
People finding content.
Through other people.
Recommendations
15
Recommendation products at LinkedIn
16
Similar Profiles
Events You May Be Interested In
News
Network updates
Connections
LinkedIn’s recommender ecosystem
17
Recommendations drive:
> 50% of connections
> 50% of job applications
> 50% of group joins
Inputs for recommender systems
18
Content
Social Graph
…
Behavior: Page Views, Actions, Queries
Jobs You Might Be Interested In
19
How LinkedIn matches people to jobs
20
Job: title, geo, company, industry, description, functional area, …
User Base (Filtered)
Candidate:
    General: expertise, specialties, education, headline, geo, experience
    Current Position: title, summary, tenure length, industry, functional area, …
Corpus Stats: transition probabilities, connectivity, yrs of experience to reach title, education needed for this title, …
Matching:
    Binary: exact matches (geo, industry, …)
    Soft: transition probabilities, similarity, …
    Text
Derived features (example values):
    Similarity (candidate expertise, job description): 0.56
    Similarity (candidate specialties, job description): 0.2
    Transition probability (candidate industry, job industry): 0.43
    Title Similarity: 0.8
    Similarity (headline, title): 0.7
    . . .
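Two of the derived feature types named on this slide, text similarity and transition probability, can be illustrated with a small Python sketch. The slide does not say how LinkedIn computes either one, so the choices here are assumptions: similarity is taken to be cosine similarity over bag-of-words counts, and transition probability is estimated by relative frequency from observed moves. All function names and sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words term-count vectors (assumed measure)."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def transition_probabilities(
    moves: Iterable[Tuple[str, str]]
) -> Dict[Tuple[str, str], float]:
    """Estimate P(to | from) by relative frequency over observed (from, to) moves."""
    moves = list(moves)
    pair_counts = Counter(moves)
    from_counts = Counter(src for src, _ in moves)
    return {(src, dst): n / from_counts[src] for (src, dst), n in pair_counts.items()}


# Hypothetical industry moves, invented for illustration only.
observed_moves = [
    ("banking", "software"),
    ("banking", "banking"),
    ("banking", "banking"),
    ("software", "software"),
]
p = transition_probabilities(observed_moves)

features = {
    "similarity(headline, title)":
        cosine_similarity("senior software developer", "software developer"),
    "transition(candidate industry, job industry)":
        p.get(("banking", "software"), 0.0),
}
print(features)
```

A matching model would then consume a feature vector like this one alongside the binary exact-match features; how the features are weighted is not stated in the slides.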
Is job-hunting socially contagious?
21
[Posse, 2012]
Social referral
22
Suggest based on connection strength and relevance to target user.
2x conversion!
[Amin et al, 2012]
Suggested skill endorsements
23
Recommendations: Summary
24
Content is king.
Connections provide social dimension.
Context determines where and when a recommendation is appropriate.
Networking
25
People You May Know
26
Closing the triangles
§ Triads suggest and affect relationships. [Simmel, 1908], [Granovetter, 1973]
§ Triangle closing is a Big Data problem. [Shah, 2011]
§ Use machine learning to rank candidates.
27
[Figure: Alice, Bob, and Carol, with a "?" marking the candidate connection]
Shared connections as a signal
28
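Triangle closing with shared connections as the signal can be sketched as below. This is only an illustration of candidate generation with a single feature; the slides say LinkedIn ranks candidates with machine learning, which this sketch does not attempt. The graph, the function name suggest_connections, and the members Dave and Erin are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def suggest_connections(graph: Dict[str, Set[str]], member: str) -> List[Tuple[str, int]]:
    """Rank members not yet connected to `member` by number of shared connections."""
    shared: Counter = Counter()
    for friend in graph.get(member, set()):
        for candidate in graph.get(friend, set()):
            # Skip the member themselves and anyone already connected.
            if candidate != member and candidate not in graph[member]:
                shared[candidate] += 1
    # Sort by shared-connection count (descending), then by name for a stable order.
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical undirected graph, invented for illustration only.
graph = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Alice", "Dave"},
    "Carol": {"Alice", "Dave", "Erin"},
    "Dave": {"Bob", "Carol"},
    "Erin": {"Carol"},
}

print(suggest_connections(graph, "Bob"))    # candidates that would close a triangle for Bob
print(suggest_connections(graph, "Alice"))
```

Each candidate returned would close at least one open triangle; the count of shared connections could serve as one input feature to a learned ranker.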
Power of social proof
29
More power of social proof
30
…
Networking: Summary
31
Close triangles to suggest connections.
Connections as social proof.
Unleash the power of weak ties.
Conclusion
§ People use LinkedIn because of other people.
§ Primary use cases:
    – Find and be found.
    – Discover and share knowledge.
§ People are at the heart of LinkedIn’s products:
    – Search
    – Recommendations
    – Networking
32
[Chart: LinkedIn Members (Millions), 2004 to 2011; plotted values: 2, 4, 8, 17, 32, 55, 90]
175M+ members
25th most visited website worldwide (Comscore 6-12)
Company pages: >2M
62% non U.S.
2/sec
85% of Fortune 500 companies use LinkedIn to hire
Thank You!
33
We’re Hiring!
Learn more at http://data.linkedin.com/