Copyright © 2004-2009, KISTI, BlogTalk2009
Generating Researcher Networks with Identified Persons
on a Semantic Service Platform
15 Sep. 2009
Hanmin Jung
KISTI
Agenda
Researcher networks are useful for finding
  Collaborators
  Speakers (key persons of a researcher group)
Issues
  Getting sources
  Resolving identities
  Finding experts
  Generating networks
Getting sources …
Sources
Identified Entities
Papers: 453,124 Elsevier international journal papers with full texts and metadata
Persons: 1,352,220
Topics: 339,947
Institutions: 91,514
Locations: 409,575 (with GPS coordinates)
RDF Triples: 283,087,518 (2008.11)
Resolving identities …
How to resolve identities?
How to merge different identifiers into one?
OntoFrame
[Figure: OntoFrame architecture. Components: legacy DB tables with field information, Listener, Triple Generator, OntoURI®, OntoReasoner®, RDF triple store, search engine, ontologies (schemata and instances), and the OntoFrame 2008 service. Labeled interfaces: WS API, WS API/SPARQL, SQL/expanded triples, XML, answers.]
Ontology
Reference and Academic Knowledge Ontologies
OntoFrame
Syntactic-to-Semantic Process
[Figure: process flow spanning a modeling-time process, an indexing-time process, and a run-time process. Step labels, in extraction order: Select Database & Ontology; Edit Mapping Rules; Design Ontology Model; Edit URI Generation Rules; Edit Identity Resolution Rules; Crawl Database; Normalize Field Values; Extract Topics; Resolve Identities; Refer Authority Data; Apply sameAs Relations; Apply Mapping Rules; Test Mapping Process; Apply Identity Resolution Rules; Apply URI Generation Rules; Generate RDF Triples; Assign URIs.]
Identity Resolution
Case 1: One or two persons?
  Barry G.T. Lowden
  Barry Lowden
Case 2: One or two persons?
  Christian Becker
  Christian Becker
Identity Resolution
Rules for Resolving Personal Identities
Class  | Resource       | Kind    | Match  | Relation | Source       | Weight
Person Order 1
Person | Name           | Pivot   | Exact  | Single   | OntoURI      |
Person | hasInstitution | Feature | Exact  | Single   | OntoURI      | 2
Person | Email          | Feature | Number | Single   |              | 4
Person | hasCoauthor    | Feature | Number | Multiple | OntoReasoner | 1
Person hasTopic threshold 0.8
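One possible reading of the rule table, sketched in Python: the name is a pivot that must match exactly before two person records are compared at all, and every matching feature then adds its weight to a score. The slide does not say how the weights are combined or where the 0.8 threshold applies, so the summation below, the per-co-author counting, the `PersonRecord` type, and the function name are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Feature weights copied from the rule table; everything else is illustrative.
WEIGHTS = {"institution": 2, "email": 4, "coauthor": 1}


@dataclass
class PersonRecord:  # hypothetical record type, not part of OntoFrame
    name: str
    institution: Optional[str] = None
    email: Optional[str] = None
    coauthors: Set[str] = field(default_factory=set)


def match_score(a: PersonRecord, b: PersonRecord) -> Optional[int]:
    """Return None if the pivot (name) differs, else the summed feature weights."""
    if a.name != b.name:  # pivot: exact match required
        return None
    score = 0
    if a.institution and a.institution == b.institution:
        score += WEIGHTS["institution"]
    if a.email and a.email == b.email:
        score += WEIGHTS["email"]
    # Assumption: the "Multiple" relation counts once per shared co-author.
    score += WEIGHTS["coauthor"] * len(a.coauthors & b.coauthors)
    return score
```

A caller would compare the returned score with a cut-off of its own choosing; two records with different names are never merged under this reading.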
Identity Resolution
Authority Data
Normalized Form | Variant Form                                | Kind         | Class
IBM             | International Business Machines Corporation | Abbreviation | Institution
Microsoft       | MS                                          | Abbreviation | Institution
Microsoft       | 마이크로소프트                                | Korean       | Institution
London          | 런던                                         | Korean       | Location
Academic Inc.   | Academic Press Inc, LTD                     | Alternative  | Publication
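Authority data of this kind can be applied as a plain lookup before identities are compared. The sketch below is a minimal Python illustration that uses the rows of the table as sample data; the `normalize` function and the choice to key the lookup on (class, variant form) are assumptions, not the OntoFrame implementation.

```python
# Sample authority data taken from the table: (class, variant form) -> normalized form.
AUTHORITY = {
    ("Institution", "International Business Machines Corporation"): "IBM",
    ("Institution", "MS"): "Microsoft",
    ("Institution", "마이크로소프트"): "Microsoft",
    ("Location", "런던"): "London",
    ("Publication", "Academic Press Inc, LTD"): "Academic Inc.",
}


def normalize(entity_class: str, value: str) -> str:
    """Map a variant form to its normalized form; unknown values pass through."""
    cleaned = value.strip()
    return AUTHORITY.get((entity_class, cleaned), cleaned)
```

Keying on the class keeps a variant such as "MS" from being rewritten when it appears as something other than an institution.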
Identity Resolution
sameAs
Authorization
∅
Identity Resolution
sameAs
Candidates
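Once sameAs relations between identifiers are accepted, the identifiers they connect can be collapsed into one canonical identifier. A common way to do this is a union-find (disjoint-set) structure, sketched below in Python. The class name, the example identifiers, and the rule of choosing the lexicographically smallest URI as canonical are assumptions for illustration; the slides do not describe how OntoFrame stores or applies sameAs.

```python
class SameAsIndex:
    """Union-find over identifiers linked by sameAs (illustrative sketch)."""

    def __init__(self):
        self._parent = {}

    def find(self, uri: str) -> str:
        """Return the canonical identifier of the group that contains uri."""
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path directly at the root.
        while self._parent[uri] != root:
            self._parent[uri], uri = root, self._parent[uri]
        return root

    def add_same_as(self, a: str, b: str) -> None:
        """Record that a and b denote the same entity."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Assumption: the smaller URI string becomes the canonical one.
            self._parent[max(ra, rb)] = min(ra, rb)
```

After `add_same_as("p:2", "p:1")` and `add_same_as("p:3", "p:2")`, all three identifiers resolve to the same canonical one, so triples about any of them can be read as triples about one person.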
ReSIST (2006 ~ 2008)
Resilience Knowledge Base
"Deliverable D31: Final Workshop report" by ReSIST
LOD Project
http://richard.cyganiak.de/2007/10/lod/
Linking Open Data Community Project
Available in RDF and SVG (Scalable Vector Graphics) versions
KISTI
Finding experts …
How to extract topics?
How to determine topics of a researcher?
Topic Extraction
System Architecture
Propagating Topics of Entities
Topic Propagation
Article Person
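A minimal Python sketch of topic propagation, assuming that topics extracted from an article flow to each of its authors and that a person's topic count is simply the number of their articles carrying that topic. The function name and the input shapes are hypothetical; the slide shows only the Article and Person entities.

```python
from collections import Counter, defaultdict


def propagate_topics(article_topics, article_authors):
    """Propagate article topics to authors.

    article_topics:  dict mapping article id -> iterable of topics
    article_authors: dict mapping article id -> iterable of person ids
    Returns a dict mapping person id -> Counter of topic frequencies.
    """
    person_topics = defaultdict(Counter)
    for article, authors in article_authors.items():
        for person in authors:
            # Each article contributes one count per topic to each author.
            person_topics[person].update(article_topics.get(article, ()))
    return dict(person_topics)
```

The per-person counters can then be ranked to decide which topics characterize a researcher.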
Experts Finding
Process
  Knowledge expansion
    Making direct relations for a shorter access path
  Experts retrieval
    Querying with SPARQL for a given topic
    Converting SPARQL to SQL
    Using a backward chaining path
  Post-processing
    Grouping and counting retrieved authors
    Ranking by name or by number of achievements
    Making an XML document as the result
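The post-processing step can be sketched in Python as follows: group the retrieved rows by author, count distinct achievements, rank, and serialize to XML. The row shape `(author, article)`, the element and attribute names, and the tie-break by name are assumptions for illustration, since the slides do not show the actual XML schema.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def experts_to_xml(rows, topic):
    """rows: iterable of (author_name, article_id) pairs from the retrieval step."""
    # Group by author and count distinct articles (duplicate rows are ignored).
    counts = Counter(author for author, _article in set(rows))
    # Rank by number of achievements, breaking ties by name.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    root = ET.Element("experts", topic=topic)
    for name, n in ranked:
        ET.SubElement(root, "expert", name=name, achievements=str(n))
    return ET.tostring(root, encoding="unicode")
```

Ranking by name alone, the other option named on the slide, would only change the sort key to `item[0]`.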
Knowledge Expansion
Inference Rule
@prefix isrl: <http://www.kisti.re.kr/isrl/ResearchRefOntology#>
(?x isrl:hasCreatorInfo ?y) (?y isrl:hasCreator ?z) ->
(?x isrl:createdByPerson ?z)
Article --hasCreatorInfo--> CreatorInfo --hasCreator--> Person
Article --createdByPerson--> Person (inferred by the rule)
……
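The effect of the inference rule can be reproduced over a small set of triples in Python. This is a sketch of forward application of that single rule, with triples represented as plain `(subject, predicate, object)` tuples and prefixed names kept as strings; it is not how OntoReasoner is implemented.

```python
def expand_created_by_person(triples):
    """Add (?x isrl:createdByPerson ?z) wherever
    (?x isrl:hasCreatorInfo ?y) and (?y isrl:hasCreator ?z) both hold."""
    triples = set(triples)
    # Index creator-info nodes by the persons they point to.
    creators = {}
    for s, p, o in triples:
        if p == "isrl:hasCreator":
            creators.setdefault(s, set()).add(o)
    inferred = {
        (article, "isrl:createdByPerson", person)
        for article, p, info in triples
        if p == "isrl:hasCreatorInfo"
        for person in creators.get(info, ())
    }
    return triples | inferred
```

With the direct `createdByPerson` triple materialized, a query for the authors of an article needs one pattern instead of two, which is the shorter access path the expansion step aims for.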
Experts Retrieval
Backward Chaining Path
Generating networks …
How to find a researcher group?
How about similar researchers?
OntoFrame 2008
Researcher Networks (T, P)
Process
Getting co-author pairs for a target topic (T)
  # startYear, endYear and <topURI> are placeholders filled in before execution
  SELECT DISTINCT ?person1 ?person2
  WHERE {
    ?article aca:yearOfAccomplishment ?year .
    FILTER(?year >= startYear && ?year <= endYear) .
    ?article aca:hasTopicOfArticle <topURI> .
    ?article aca:createdByPerson ?person1 .
    ?article aca:createdByPerson ?person2 .
    # Keep each unordered pair once; STR() makes the URI comparison portable
    FILTER(STR(?person1) < STR(?person2)) .
  }
Selecting a target researcher (P) from the pairs
Tracing group members connected with that researcher (the seed)
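Tracing the group around a seed researcher can be sketched as a breadth-first traversal over the co-author pairs returned by the query. The Python sketch below assumes that "connected" is meant transitively, so the group is the whole connected component containing the seed; the slide does not state a depth limit, and the function name is hypothetical.

```python
from collections import defaultdict, deque


def trace_group(pairs, seed):
    """Return every researcher reachable from seed through co-author pairs."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    group, queue = {seed}, deque([seed])
    while queue:
        current = queue.popleft()
        # Visit co-authors that are not in the group yet.
        for other in neighbours[current] - group:
            group.add(other)
            queue.append(other)
    return group
```

For the pairs `("A", "B")`, `("B", "C")`, `("D", "E")` and seed `"A"`, the traced group contains A, B and C, while D and E form a separate group.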
Researcher Networks (P)
Process
Getting co-author pairs including a target researcher (P)
  # startYear, endYear and <perURI> are placeholders filled in before execution
  SELECT ?per1 ?per2
  WHERE {
    ?article aca:yearOfAccomplishment ?year .
    FILTER(?year >= startYear && ?year <= endYear) .
    ?article aca:createdByPerson ?per1 .
    ?article aca:createdByPerson ?per2 .
    # Keep each unordered pair once per article; STR() makes the URI comparison portable
    FILTER(STR(?per1) < STR(?per2)) .
    FILTER(?per1 = <perURI> || ?per2 = <perURI>) .
  }
Ranking the pairs by frequency of co-authorship
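Ranking by co-authorship frequency amounts to counting how often each partner appears next to the target researcher in the returned pairs. The Python sketch below assumes the query result is available as a list of `(per1, per2)` tuples with one row per co-authored article, which is what a SELECT without DISTINCT would be expected to return; the function name is hypothetical.

```python
from collections import Counter


def rank_coauthors(pairs, target):
    """Return (co-author, frequency) tuples, most frequent first."""
    counts = Counter()
    for a, b in pairs:
        if a == target:
            counts[b] += 1
        elif b == target:
            counts[a] += 1
    return counts.most_common()
```

Pairs that do not include the target are skipped, so the function also works on an unfiltered pair list.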
Similar Researchers
Similar Researchers (P)
Process (1/2)
Getting topics of a target researcher (P)
  # <perURI> is a placeholder filled in before execution
  SELECT ?per1 ?topic
  WHERE {
    ?article aca:createdByPerson ?per1 .
    ?article aca:hasTopicArea ?topicArea .
    ?topicArea aca:hasTopicOfTopicArea ?topic .
    FILTER(?per1 = <perURI>) .
  }
Ranking and selecting the top n topics for that researcher
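Selecting the top n topics can be sketched in Python as a frequency count over the `(person, topic)` rows returned by the query. The slide does not say what the ranking is based on; counting how often each topic occurs and breaking ties alphabetically are assumptions, as is the function name.

```python
from collections import Counter


def top_topics(rows, n=5):
    """rows: iterable of (person, topic) pairs for one researcher."""
    counts = Counter(topic for _person, topic in rows)
    # Most frequent first; ties are broken alphabetically for a stable result.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [topic for topic, _count in ranked[:n]]
```

The resulting list plays the role of `<topic[0]>` through `<topic[4]>` in the second step when n is 5.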
Process (2/2)
Getting researchers who share most of those topics with the target researcher (here, any four of the top five)
  # <perURI> and <topic[0]> .. <topic[4]> are placeholders filled in before execution
  SELECT DISTINCT ?per2
  WHERE {
    ?per2 aca:hasTopicOfPerson ?topic1 .
    ?per2 aca:hasTopicOfPerson ?topic2 .
    ?per2 aca:hasTopicOfPerson ?topic3 .
    ?per2 aca:hasTopicOfPerson ?topic4 .
    FILTER(?per2 != <perURI>) .
    # Strict ordering forces the four topics to be pairwise distinct
    FILTER(STR(?topic1) < STR(?topic2) && STR(?topic2) < STR(?topic3) && STR(?topic3) < STR(?topic4)) .
    # Each topic must be one of the target's top five; the filters sit in the same group as the patterns they constrain
    FILTER(?topic1=<topic[0]> || ?topic1=<topic[1]> || ?topic1=<topic[2]> || ?topic1=<topic[3]> || ?topic1=<topic[4]>) .
    FILTER(?topic2=<topic[0]> || ?topic2=<topic[1]> || ?topic2=<topic[2]> || ?topic2=<topic[3]> || ?topic2=<topic[4]>) .
    FILTER(?topic3=<topic[0]> || ?topic3=<topic[1]> || ?topic3=<topic[2]> || ?topic3=<topic[3]> || ?topic3=<topic[4]>) .
    FILTER(?topic4=<topic[0]> || ?topic4=<topic[1]> || ?topic4=<topic[2]> || ?topic4=<topic[3]> || ?topic4=<topic[4]>) .
  }
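The condition the query encodes, four distinct topics all drawn from the target's top five, is equivalent to a set-overlap test. The Python sketch below expresses it that way, assuming each person's topics are already available as a mapping; the function name and the `min_shared` parameter are hypothetical.

```python
def similar_researchers(person_topics, target, top, min_shared=4):
    """Return persons (other than target) sharing at least min_shared of the top topics.

    person_topics: dict mapping person id -> iterable of that person's topics
    top:           the target researcher's top topics (five in the query)
    """
    top = set(top)
    return sorted(
        person
        for person, topics in person_topics.items()
        if person != target and len(top & set(topics)) >= min_shared
    )
```

Lowering `min_shared` loosens the notion of similarity without having to rewrite the query with a different number of topic variables.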
Conclusions
Processes to Generate Researcher Networks
  Getting sources: Papers
  Resolving identities: Rules, authority data, sameAs
  Finding experts: Topics, reasoning
  Generating networks: Topic- and person-constrained
Next Research Topic
  Service mashup to get researcher networks directly
Thank you
“A lot of times, people don’t know what they want until you show it to them.”
by Steve Jobs