SciReader: A Recommender System for Biomedical Literature
Desai, P.1,2, Lehmann, B.2, Telis, N.2, Pritchard, J.P.2
1 Stanford Center for Genomics and Personalized Medicine (SCGPM), 2 Department of Genetics, Stanford University
Motivation
With the recent explosion in biomedical research, it has become increasingly important, and yet challenging, to keep up with the relevant literature. SciReader is a personalized recommender system that specifically aims to help researchers and practitioners in the biomedical community parse through the large volume of literature and filter publications that may be relevant and of interest to them.
SciReader was initially developed at the Pritchard lab (Genetics Department, Stanford School of Medicine) and is now maintained and operated by the SCGPM. It is currently being migrated to Google Cloud and should be available soon (http://scireader.org).
Introduction
• SciReader is a cloud-based service that uses novel algorithms to classify and cluster published biomedical corpora using topic modeling (Latent Dirichlet Allocation).
• Users provide basic information, i.e., topics or keywords of interest and journal papers.
• Results are best when the user creates a ‘library’ and uploads papers of interest to it. SciReader then generates personalized recommendations based on relevancy, recency, impact factor and sentiment analysis, updated daily.
• Weekly email digests of important publications in your field of research.
• Relevant trending Twitter feeds provided in real time.
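As an illustration of how the four signals listed above (relevancy, recency, impact factor and sentiment) could be combined into one ranking, here is a minimal sketch. It is not SciReader's actual scoring method, which the poster does not describe. The weights, the 30-day half-life, the impact-factor cap of 50, the sentiment range of [-1, 1], the article field names and the function names `cosine`, `recency`, `score` and `recommend` are all assumptions made for this example. Relevancy is taken here to be the cosine similarity between an article's topic vector and the average topic vector of the user's library.

```python
import math
from datetime import date

# Hypothetical weights: the poster does not say how the signals are combined.
WEIGHTS = {"relevancy": 0.5, "recency": 0.2, "impact": 0.2, "sentiment": 0.1}


def cosine(a, b):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recency(published, today, half_life_days=30.0):
    """Exponential decay: 1.0 for an article published today, 0.5 after one half-life."""
    age_days = max((today - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def score(article, library_topics, today):
    """Weighted sum of the four signals, each scaled to the range [0, 1]."""
    signals = {
        "relevancy": cosine(article["topics"], library_topics),
        "recency": recency(article["published"], today),
        # Assumed cap: impact factors of 50 or more all count as 1.0.
        "impact": min(article["impact_factor"] / 50.0, 1.0),
        # Assumed input range [-1, 1], rescaled to [0, 1].
        "sentiment": (article["sentiment"] + 1.0) / 2.0,
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())


def recommend(articles, library_topics, today, top_n=10):
    """Return the top_n articles, highest score first."""
    ranked = sorted(
        articles, key=lambda a: score(a, library_topics, today), reverse=True
    )
    return ranked[:top_n]
```

Rerunning `recommend` once a day with the current date would let older articles decay out of the list, which is one simple way to obtain daily-updated recommendations.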
Topic Modeling of PubMed/bioRxiv using LDA
• The cornerstone of SciReader is its topic model of PubMed.
• Topic models represent a class of computer programs that seem to ‘automagically’ extract underlying themes or topics from large unstructured texts.
• LDA: Latent Dirichlet Allocation, a topic model algorithm
• Mathematically:
• We used titles and abstracts from all the articles published in PubMed in 2012 (~1.2 million) to train an LDA model, which was then used to create a ‘topic inferencer’.
• Our topic model has 150 topics, which were grouped into 20 ‘supertopics’.
• All articles from PubMed and bioRxiv have been ‘topic modeled’ using this inferencer.
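The train-then-infer workflow described above can be sketched on a toy scale as follows. This is an illustration only: the poster does not state which LDA implementation or inference method was used, so collapsed Gibbs sampling is an assumption here, as are the hyperparameters `alpha` and `beta`, the number of sweeps, and the function names `train_lda`, `infer_topics` and `to_supertopics`. Documents are lists of tokens. The inferencer keeps the trained topic-word distributions fixed and resamples only the topic assignments of the new document.

```python
import random


def train_lda(docs, num_topics, alpha=0.1, beta=0.01, sweeps=100, seed=0):
    """Fit LDA with collapsed Gibbs sampling; return the trained model as a dict."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ids = [[word_id[w] for w in doc] for doc in docs]
    # Count tables: document-topic, topic-word, and per-topic totals.
    n_dk = [[0] * num_topics for _ in ids]
    n_kw = [[0] * V for _ in range(num_topics)]
    n_k = [0] * num_topics
    # Random initial topic assignment for every token.
    z = [[rng.randrange(num_topics) for _ in doc] for doc in ids]
    for d, doc in enumerate(ids):
        for v, k in zip(doc, z[d]):
            n_dk[d][k] += 1
            n_kw[k][v] += 1
            n_k[k] += 1
    for _ in range(sweeps):
        for d, doc in enumerate(ids):
            for i, v in enumerate(doc):
                # Remove the token's current assignment, then resample it.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1
    # Smoothed topic-word distributions; each row sums to 1.
    phi = [
        [(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
        for k in range(num_topics)
    ]
    return {"word_id": word_id, "phi": phi, "alpha": alpha}


def infer_topics(model, doc, sweeps=50, seed=0):
    """Estimate topic proportions for an unseen document with phi held fixed."""
    rng = random.Random(seed)
    phi, alpha = model["phi"], model["alpha"]
    K = len(phi)
    # Words that were not seen during training are ignored.
    words = [model["word_id"][w] for w in doc if w in model["word_id"]]
    z = [rng.randrange(K) for _ in words]
    counts = [0] * K
    for k in z:
        counts[k] += 1
    for _ in range(sweeps):
        for i, v in enumerate(words):
            counts[z[i]] -= 1
            weights = [(counts[t] + alpha) * phi[t][v] for t in range(K)]
            z[i] = rng.choices(range(K), weights=weights)[0]
            counts[z[i]] += 1
    total = len(words) + K * alpha
    return [(n + alpha) / total for n in counts]


def to_supertopics(theta, groups):
    """Sum topic proportions within each supertopic (name -> list of topic indices)."""
    return {name: sum(theta[k] for k in ks) for name, ks in groups.items()}
```

With the sizes given on the poster, `num_topics` would be 150 and `groups` would map 20 supertopic names to disjoint sets of topic indices. A corpus of about 1.2 million abstracts would need an optimized LDA library rather than pure Python.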
SciReader Screenshots
[Figure: Basic overview of the Recommender pipeline]
[Figure: Example topics ‘discovered’ by LDA: blood cancers, genomics, obstetrics]
Summary
• SciReader is a great way for researchers and medical practitioners to stay abreast of advances in their field.
• SciReader’s topic model and database can be used as a research tool to perform longitudinal studies on the history of disease based on publication data, as well as other bibliometric studies.