Academic Publishing in Europe, 30 January 2013. Speaker: Kevin Cohn
Kevin Cohn, Chief Operating Officer
@Atypon
Improving Research Efficiency
Academic Publishing in Europe, Berlin, 30 January 2013
User and Content Fingerprinting
• Provider of Software as a Service content delivery for publishers
• Literatum platform used to deliver 15M journal articles and 70,000 eBooks
• 1.5 billion user sessions in 2012
About Atypon
• Research efficiency can be greatly improved if publishers tap into their huge volume of data to better connect users to content.
Thesis
Users don’t want “advanced search...”
...but they do want relevant results.
This is the APE I’m looking for.
Data can drive this behavior.
• Relevancy is the only order that matters
• More than 50% of clicks are on the first result
• More than 90% of clicks are on the first page
• Filters/facets aren’t used
Observations
• Give users what they want: a simple, Google-like search interface
• But use proprietary data to calculate relevancy for each individual user
Objectives
Automatic Topic Modeling
• Based on a statistical model called latent Dirichlet allocation (LDA)
• Creates “topics”: collections of words that frequently occur together
Topic #1: {mammal, primate, hominoidea}
Topic #2: {academic, publishing, europe}
Automatic Topic Modeling
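As a rough illustration of how LDA can recover topics like the two above, here is a minimal sketch of a collapsed Gibbs sampler in pure Python. It is not Atypon's implementation. The toy corpus, the function name `lda_gibbs`, and the hyperparameters `alpha` and `beta` are all assumptions chosen for the example.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic in proportion to
                # (topic weight in this doc) * (word weight in this topic).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed per-document topic mixtures (each row sums to 1).
    theta = [
        [(n + alpha) / (len(doc) + n_topics * alpha) for n in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    top_words = [[w for w, c in tw.most_common(3) if c > 0] for tw in topic_word]
    return theta, top_words

# Toy corpus (an assumption): "ape" appears in both subject areas.
docs = [
    "ape mammal primate hominoidea".split(),
    "primate mammal ape hominoidea mammal".split(),
    "ape academic publishing europe".split(),
    "academic publishing europe ape publishing".split(),
]
theta, top_words = lda_gibbs(docs, n_topics=2)
for d, mix in enumerate(theta):
    print(d, [round(p, 2) for p in mix])
print(top_words)
```

Each document comes back as a mixture over topics rather than a single label. On a real corpus, a tested library implementation would be a more sensible choice than a hand-written sampler.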
[Figures: Topic #1 and Topic #2]
• My search for “APE” returns results about this conference, not primates
• The same is true for recommendations
• Better related articles (topics 1 and 2 are not related, despite sharing “APE”)
Applications
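The "APE" example can be sketched as re-ranking keyword matches by how closely each article's topic mixture matches a per-user fingerprint. The talk does not give the actual scoring formula, so everything here is an assumption for illustration: the cosine-similarity score, the averaging in `user_fingerprint`, the article titles, and the hand-written topic mixtures.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def user_fingerprint(read_mixtures):
    """Average the topic mixtures of the articles a user has read."""
    n = len(read_mixtures)
    return [sum(col) / n for col in zip(*read_mixtures)]

def rerank(candidates, fingerprint):
    """Sort (title, topic_mixture) pairs by similarity to the fingerprint."""
    return sorted(candidates, key=lambda c: cosine(c[1], fingerprint), reverse=True)

# Hypothetical topic mixtures: [primates topic, publishing topic].
matches_for_ape = [
    ("Great ape locomotion", [0.95, 0.05]),
    ("APE 2013 conference report", [0.10, 0.90]),
]

# A reader whose history is mostly about publishing.
publishing_reader = user_fingerprint([[0.05, 0.95], [0.20, 0.80]])
ranked = rerank(matches_for_ape, publishing_reader)
print([title for title, _ in ranked])
```

The same similarity function could also drive related-article suggestions: two articles that share the word "APE" but have very different topic mixtures would score low against each other.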
• Topics are self-updating = low-cost, low-maintenance
• Flat (not hierarchical) = avoids troublesome questions about classification
• Probabilistic (not binary) = better at expressing relevancy to topics
Not a Taxonomy/Ontology...
• Topics are “collections of words that occur together with great frequency”
• Knowing that “APE” is an acronym for “Academic Publishing in Europe”
• Knowing that “CC0” and “CC BY” are Creative Commons license types
...But Is Helped by Them
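One way a controlled vocabulary might help is as a preprocessing step: known multi-word terms are merged into single tokens before the text reaches the topic model, so "CC BY" is counted as one unit rather than two unrelated words. The sketch below assumes a small hypothetical term list (`KNOWN_TERMS`) and a hypothetical `tokenize` helper. Neither comes from the talk.

```python
import re

# Hypothetical controlled vocabulary: surface form -> canonical token.
KNOWN_TERMS = {
    "academic publishing in europe": "academic_publishing_in_europe",
    "cc by": "cc_by",
    "cc0": "cc0",
}

def tokenize(text, known_terms=KNOWN_TERMS):
    """Lowercase, merge known multi-word terms into one token, then split."""
    text = text.lower()
    # Replace longer terms first so they win over shorter overlapping ones.
    for term in sorted(known_terms, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        text = re.sub(pattern, known_terms[term], text)
    return re.findall(r"[a-z0-9_]+", text)

print(tokenize("Licensed CC BY at Academic Publishing in Europe"))
```

The term list only shapes the input. The topics themselves are still learned from word co-occurrence.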
• We didn’t invent ATM (or LDA)
• Our implementation started as a collaboration with academic researchers...
• ...and will require considerable experimentation and testing to get right
Worth Mentioning
• Usage is not personally identifiable
• Usage is not shared with third parties
• Users can opt out of personalization
Privacy
• ATM uses proprietary data to calculate relevancy for each individual user
• Gives users what they want: a simple, Google-like search interface
• Improves research efficiency by freeing up searching time for reading
Summary
Thank You
Kevin Cohn, Chief Operating Officer, Atypon