Science is anchored by scientists
Keeping up with significant advances
– We look for recent work by key authors
Evaluating quality of new research
– We identify & invite experts to review
Funding impactful research
– We look for investigators with a proven record
Scholars on Google Scholar
Challenges
– Identify publications by an author
– Keep them up-to-date
– Pivot from articles to authors and back
Opportunities
– Find me what I need to read
– Identify key researchers in areas
Overview
Author disambiguation
– Quickly set up an author profile, then maintain it automatically
Integration with Scholar search
– Work by key authors
– Find experts in an area
Personalized recommendations
– Find me what I need to read
Author disambiguation approach
Build a statistical model that groups articles sharing an author name by the actual author
– Author lists, journals, co-authors, research area, affiliations, text of articles
– Multi-dimensional model
– Aim for high precision and very good recall
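A minimal sketch of the grouping idea, not the model Scholar actually uses. It assumes each article is a dict with hypothetical `coauthors`, `venue`, and `keywords` fields, combines them with made-up weights, and links articles only above a fairly strict threshold so that precision is favored over recall.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similarity(x, y):
    """Hypothetical weighted score over a few of the signals listed above."""
    return (0.5 * jaccard(x["coauthors"], y["coauthors"])
            + 0.3 * (1.0 if x["venue"] == y["venue"] else 0.0)
            + 0.2 * jaccard(x["keywords"], y["keywords"]))

def group_articles(articles, threshold=0.5):
    """Single-link grouping via union-find; a higher threshold favors precision."""
    parent = list(range(len(articles)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(articles)), 2):
        if similarity(articles[i], articles[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(articles)):
        groups.setdefault(find(i), []).append(articles[i]["title"])
    return sorted(groups.values())

# Three articles that all list the same author name (illustrative data).
articles = [
    {"title": "A", "coauthors": {"Lee", "Kim"}, "venue": "VLDB", "keywords": {"index", "query"}},
    {"title": "B", "coauthors": {"Lee", "Kim"}, "venue": "VLDB", "keywords": {"query", "join"}},
    {"title": "C", "coauthors": {"Rossi"}, "venue": "Cell", "keywords": {"protein"}},
]
groups = group_articles(articles)
print(groups)
```

Articles A and B share co-authors and a venue, so they land in one group; C stays on its own rather than being merged on weak evidence.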
Disambiguation approach – II
Many authors work in multiple areas
– With multiple co-authors, multiple communities
– Forcing these into one group would break other groupings
Allow such groups to remain separate
– Make it trivially easy for authors to merge
Final disambiguation step is human
– Key is to make the human step very simple
How does it work?
Present groups of articles matching name
– Author selects groups written by her
After setup, updates are automated
Author can merge/add/remove
Changes fed back into the statistical model
– Improves update precision & recall
– Updates are automated, so they must be high-precision and high-recall
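The setup and feedback loop above can be sketched roughly as follows. The `AuthorProfile` class, its method names, and the feedback record format are all hypothetical, chosen only to illustrate the flow: the author confirms candidate groups, later edits individual articles, and every action is logged for the model.

```python
class AuthorProfile:
    """Hypothetical in-memory profile; names and fields are illustrative only."""

    def __init__(self, candidate_groups):
        # candidate_groups: group id -> set of article ids proposed by the model
        self.candidate_groups = candidate_groups
        self.articles = set()
        self.feedback = []  # (action, item) records to feed back into the model

    def select_groups(self, group_ids):
        """Setup step: the author picks the groups she wrote."""
        for gid in group_ids:
            self.articles |= self.candidate_groups[gid]
            self.feedback.append(("confirm_group", gid))

    def add(self, article_id):
        """Manual correction: an article the model missed."""
        self.articles.add(article_id)
        self.feedback.append(("add", article_id))

    def remove(self, article_id):
        """Manual correction: an article wrongly attributed to the author."""
        self.articles.discard(article_id)
        self.feedback.append(("remove", article_id))

profile = AuthorProfile({"g1": {"a1", "a2"}, "g2": {"a3"}, "g3": {"a4"}})
profile.select_groups(["g1", "g2"])  # merging groups is just selecting both
profile.remove("a2")
profile.add("a5")
print(sorted(profile.articles))
```

In this sketch, merging two groups that the model kept separate costs the author one extra selection, which is the point of keeping the human step simple.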
What do you get?
List of all your publications
Citation metrics – overall and per article
Links to co-authors
Follow all your citations
Colleagues can follow your work
Personalized recommendations
How well does it work?
Worldwide adoption
– Widely published and cited authors
– Most authors take a few minutes
– Most authors opt for automated updates
– All countries, all areas
Why does it work?
Statistical model is quite effective
– Able to achieve high precision
– Recall tradeoff is small
– Flexibility to keep groups split is key, but rarely needed
For most authors, setup takes 5-10 minutes
– Effort is one-time, updates are automated
Enables many useful services – well worth the few minutes!
Finding Scholars
Queries including name
– Add matching author profiles as results
Keyword queries
– Link author names to author profiles
Browse researchers in an area
– Link author interests to search over profiles
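One way to picture the lookups above is a pair of small indexes over profiles, one by name and one by interest. This is only a sketch: the profile fields, the function names, and the crude substring match on names are assumptions for illustration, not how Scholar search is implemented.

```python
from collections import defaultdict

def build_indexes(profiles):
    """Index profile ids by lowercased full name and by lowercased interest."""
    by_name = defaultdict(list)
    by_interest = defaultdict(list)
    for p in profiles:
        by_name[p["name"].lower()].append(p["id"])
        for interest in p["interests"]:
            by_interest[interest.lower()].append(p["id"])
    return by_name, by_interest

def profiles_for_query(query, by_name):
    """Return ids of profiles whose full name appears in the query text."""
    q = query.lower()
    return sorted(pid for name, ids in by_name.items() if name in q for pid in ids)

# Illustrative profiles only.
profiles = [
    {"id": "u1", "name": "Ada Park", "interests": ["Databases", "Information Retrieval"]},
    {"id": "u2", "name": "Ben Ortiz", "interests": ["Databases"]},
]
by_name, by_interest = build_indexes(profiles)

name_hits = profiles_for_query("ada park query optimization", by_name)
area_hits = sorted(by_interest["databases"])
print(name_hits, area_hits)
```

The name index serves queries that include a name, and the interest index serves browsing researchers in an area.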
Personalized recommendations
Challenge: keep up with rapid growth in articles
Approach:
– Centered around author profiles
– Analyze research interests & evolution
– Factor in co-authors & their evolution
– Leverage citation graph
– Scan all newly seen articles
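The approach above could be sketched as a scoring pass over newly seen articles. Everything here is an assumption for illustration: the profile and article fields, the three signals (topic overlap, co-author match, citations to the author's own articles), and the weights. It also ignores how interests evolve over time.

```python
def recommend(new_articles, profile, top_k=2):
    """Rank newly seen articles for one profile using hypothetical signals."""

    def score(article):
        # Share of the article's keywords that match the author's interests.
        topic = (len(article["keywords"] & profile["interests"])
                 / max(len(article["keywords"]), 1))
        # Whether any known co-author wrote the article.
        coauthor = 1.0 if article["authors"] & profile["coauthors"] else 0.0
        # Share of the article's references that cite the author's own work.
        cites = (len(article["references"] & profile["articles"])
                 / max(len(article["references"]), 1))
        return 0.5 * topic + 0.2 * coauthor + 0.3 * cites

    ranked = sorted(new_articles, key=score, reverse=True)
    return [a["id"] for a in ranked[:top_k] if score(a) > 0]

# Illustrative data only.
profile = {
    "interests": {"disambiguation", "clustering"},
    "coauthors": {"Kim"},
    "articles": {"p1", "p2"},
}
new_articles = [
    {"id": "n1", "keywords": {"clustering", "entity"}, "authors": {"Kim", "Olsen"},
     "references": {"p1", "x9"}},
    {"id": "n2", "keywords": {"protein"}, "authors": {"Rossi"}, "references": {"x1"}},
    {"id": "n3", "keywords": {"disambiguation"}, "authors": {"Wu"},
     "references": {"x2", "x3"}},
]
picks = recommend(new_articles, profile)
print(picks)
```

Because scoring is centered on the profile, better disambiguation directly improves what gets recommended.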
Finally…
Individuals are key for recovering structure
– Allow efficient, cascade-able disambiguation
– Depts, institutions, funding agencies
– Most other analyses can be layered on top
Find what I need to read – classic hard problem
– Much progress, much much more to do