Un cycle de vie complet pour l'enrichissement sémantique des folksonomies
A full lifecycle for the semantic enrichment of folksonomies
EGC 2011 – Brest
25-28.01.2011
Freddy Limpens, Fabien Gandon
(Edelweiss, INRIA Sophia Antipolis)
Michel Buffa
(Kewi-I3S, UNSA-CNRS)
From social tagging to folksonomies
Tags freely associated to resources …
… collected and shared on the web
… resulting in
FOLKSONOMIES
A mass of users for a mass of resources
Limitations of folksonomies
Spelling variations of tags:
newyork = new_york = nyc
Limitations of folksonomies
Ambiguity of tags:
paris
… in France?
… or in Texas, USA?
How to turn folksonomies ...
... into topic structures (thesaurus)?
pollution has narrower: soil pollutions
pollution related: pollutant
pollution related: energy
… without overloading users
… and by collecting all users' expertise into the process
1. State of the art
State of the art
Involving users in tag structuring:
• Simple syntax to structure tags (Huynh-Kim Bang et al. 2008)
• Crowdsourcing strategy to validate tag-concept mappings (Lin & Davis 2010)
RDF ? ☐ Resource Description Framework ☐ Rwanda Defense Force
State of the art
Automatic extraction of tag semantics:
• Similarity based on co-occurrence patterns (Specia & Motta 2007; Cattuto et al. 2008)
• Association rule mining (Mika 2005; Hotho et al. 2006)
State of the art
Tags and Semantic Web models
TAGS + SCOT + SIOC + FOAF for tags and tagging:
tags:Tagging #11111
  tags:taggedResource -> sioc:Item http://www.windenergy.com
  tags:associatedTag -> scot:Tag #wind-energy
  tags:taggedBy -> foaf:Agent #freddy.limpens
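The tagging above can be written down as three triples sharing the same tagging node. A minimal sketch in plain Python follows; the `Triple` type and the `tagging_triples` helper are illustrative names, not part of any of the cited vocabularies, and only the property names come from the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def tagging_triples(tagging_id, resource, tag, agent):
    """Return the three triples linking one tagging to its resource, tag and tagger."""
    return [
        Triple(tagging_id, "tags:taggedResource", resource),
        Triple(tagging_id, "tags:associatedTag", tag),
        Triple(tagging_id, "tags:taggedBy", agent),
    ]


# Values taken from the slide's example.
triples = tagging_triples(
    "#11111", "http://www.windenergy.com", "#wind-energy", "#freddy.limpens"
)
for t in triples:
    print(t.subject, t.predicate, t.obj)
```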
2. Tagging & folksonomy enrichment models
Tagging model
Tagging = linking a resource with a sign
What is a tagging?
(1) picture shows "nature"
(2) place located l:england
(3) editing makes me :)
Tagging model
NiceTag (Monnin et al., 2010):
Tagging as named graphs*
*Carroll et al. (2005)
nt:ManualTagAction (named graph):
  nt:TaggedResource http://www.windenergy.com nt:isAbout scot:Tag #wind-energy
  sioc:has_creator -> sioc:UserAccount freddy
  sioc:has_container -> sioc:Container delicious.com
Folksonomy enrichment
Two complementary semantic enrichments:
Tagging (nt:ManualTagAction): http://www.windenergy.com nt:isAbout wind-energy
Structuring tags as in a thesaurus (SKOS):
wind-energy has broader: renewable energy
wind-energy close match: windenergy
wind-energy has narrower: wind turbine
wind-energy related: environment
Tagging model
Supporting diverging points of view
car skos:related pollution
john agrees
paul disagrees
Supporting diverging points of view
Reification of relations with named graphs
srtag:TagSemanticStatement: car skos:related pollution
srtag:SingleUser "john" srtag:hasApproved
srtag:SingleUser "paul" srtag:hasRejected
srtag:TagStructureComputer "r2d2" srtag:hasProposed
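One way to picture a reified statement is as a record holding the relation plus the agents who proposed, approved or rejected it. The sketch below is an in-memory illustration only: the class and method names are hypothetical and it does not use named graphs or RDF.

```python
from dataclasses import dataclass, field


@dataclass
class TagSemanticStatement:
    """A relation between two tags, with each agent's position on it."""
    subject_tag: str
    relation: str
    object_tag: str
    proposed_by: set = field(default_factory=set)
    approved_by: set = field(default_factory=set)
    rejected_by: set = field(default_factory=set)

    def approve(self, agent):
        # Assumption: an agent holds at most one position per statement.
        self.rejected_by.discard(agent)
        self.approved_by.add(agent)

    def reject(self, agent):
        self.approved_by.discard(agent)
        self.rejected_by.add(agent)


# The slide's example: r2d2 proposes, john approves, paul rejects.
stmt = TagSemanticStatement("car", "skos:related", "pollution", proposed_by={"r2d2"})
stmt.approve("john")
stmt.reject("paul")
```

Keeping the positions on the statement itself is what lets diverging points of view coexist: nothing is overwritten when paul disagrees with john.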
Ademe scenario
Experts: produce docs + tag
Archivists: centralize + tag
Public audience: read + tag
Life-cycle grounded on usage analysis
Ademe’s dataset
                            Delicious   TheseNet   Cadic
# tags                      1015        6583       1439
# resources                 196         1425       4675
# taggings (1R - 1T - 1U)   3015        10160      25515
# users                     812         1425       1
What:
  Delicious: bookmarks of users of the tag "ademe"
  TheseNet: keywords for Ademe's PhD projects
  Cadic: archivists' indexing lexicon
3. Going through the folksonomy enrichment life-cycle
Folksonomy enrichment life-cycle
Flat folksonomy
• Adding tags
• Automatic processing
• User-centric structuring
• Detect conflicts
• Global structuring
Structured folksonomy
Automatic processing
3 methods to automatically extract tag semantics from the flat folksonomy:
1. String-based
2. Co-occurrence patterns
3. User-based associations
1. String-based metrics
pollution / pollutant
=> « pollution » related to « pollutant »
pollution / soil pollutions
=> « pollution » broader than « soil pollutions »
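The slide does not say which string metrics are used, so the following is only a sketch under stated assumptions: token inclusion stands in for the broader/narrower test, `difflib.SequenceMatcher` stands in for the similarity metric, and the 0.7 threshold and the crude plural stripping are arbitrary illustrative choices.

```python
from difflib import SequenceMatcher


def normalize(tag):
    """Lower-case a tag and treat '_' and '-' as spaces."""
    return tag.lower().replace("_", " ").replace("-", " ").strip()


def singular(token):
    # Crude plural stripping, enough for "pollutions" -> "pollution".
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def string_relation(tag_a, tag_b, related_threshold=0.7):
    """Guess the relation of tag_a to tag_b from their spelling alone."""
    a, b = normalize(tag_a), normalize(tag_b)
    if a.replace(" ", "") == b.replace(" ", ""):
        return "spelling variant"
    tokens_a = {singular(t) for t in a.split()}
    tokens_b = {singular(t) for t in b.split()}
    if tokens_a < tokens_b:
        return "broader"   # tag_a is contained in the more specific tag_b
    if tokens_b < tokens_a:
        return "narrower"
    if SequenceMatcher(None, a, b).ratio() >= related_threshold:
        return "related"
    return None


print(string_relation("pollution", "Soil pollutions"))  # broader
print(string_relation("pollution", "pollutant"))        # related
print(string_relation("newyork", "new_york"))           # spelling variant
```

A variant such as "nyc" is out of reach for this kind of test, which is one reason the other two methods are needed.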
1. String-based metrics: results on full dataset
[Figure: tags from experts and tags from archivists, linked by close match, related and broader relations]
2. Co-occurrence patterns
Example of folksonomy
cc               ecology  energy  wind turbine  sustainability  housing
ecology          0        1       1             3               1
energy           1        0       2             4               3
wind turbine     1        2       0             1               1
sustainability   3        4       1             0               4
housing          1        3       1             4               0
IF σ > 0.85 => "energy" related "sustainability"
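The slide gives the threshold but not the definition of σ. As an illustration, the sketch below assumes σ is the cosine similarity between the co-occurrence profiles of two tags, computed over all the other tags (the entries for the two compared tags are left out). The function name is hypothetical; the matrix is the one from the slide.

```python
import math

TAGS = ["ecology", "energy", "wind turbine", "sustainability", "housing"]
COOCCURRENCE = [
    [0, 1, 1, 3, 1],
    [1, 0, 2, 4, 3],
    [1, 2, 0, 1, 1],
    [3, 4, 1, 0, 4],
    [1, 3, 1, 4, 0],
]


def context_similarity(tag_a, tag_b, tags=TAGS, matrix=COOCCURRENCE):
    """Cosine similarity of two tags' co-occurrence profiles over the other tags."""
    i, j = tags.index(tag_a), tags.index(tag_b)
    others = [k for k in range(len(tags)) if k not in (i, j)]
    u = [matrix[i][k] for k in others]
    v = [matrix[j][k] for k in others]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


sigma = context_similarity("energy", "sustainability")
if sigma > 0.85:
    print('"energy" related "sustainability"')
```

Under this assumption the energy / sustainability pair scores about 0.89. A plain cosine over the full rows would score lower on the same matrix, so the choice of measure matters for which pairs cross the threshold.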
2. Co-occurrence patterns
Cadic dataset
3. User-based association
Tags: renewable energy, wind-energy
Users: Alex, Delphine, Claire, Monique, Anne
Hyponym relations (broader/narrower):
« renewable energy » broader than « wind-energy »
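A minimal sketch of the user-based idea, assuming the inclusion test is on user sets: a tag is taken as broader than another when everyone who uses the second tag also uses the first. Which users use which tag is not given on the slide, so the assignment below is made up for illustration.

```python
def user_based_hyponyms(tag_users):
    """Return (broad, "broader", narrow) when narrow's users are a strict subset of broad's."""
    relations = []
    for broad, broad_users in tag_users.items():
        for narrow, narrow_users in tag_users.items():
            if broad != narrow and narrow_users and narrow_users < broad_users:
                relations.append((broad, "broader", narrow))
    return relations


# Hypothetical user sets; only the names and the two tags come from the slide.
tag_users = {
    "renewable energy": {"Alex", "Delphine", "Claire", "Monique", "Anne"},
    "wind-energy": {"Delphine", "Claire"},
}
relations = user_based_hyponyms(tag_users)
print(relations)
```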
3. User-based association
TheseNet dataset
Global results of automatic processing
Total with 3 automatic methods: 83027 relations for 9037 tags
– 68633 related
– 11254 hyponym
– 3193 spelling variants
Computed relations are not always accurate
Firefox extension SRTAgEditor
Capturing users' contributions
Embedding structuring tasks within everyday activity (e.g. searching)
Capturing user's point of view
Example: rejecting a relation
srtag:TagSemanticStatement: energie skos:broader france
John srtag:hasRejected
Capturing user's point of view
Example: proposing another relation
srtag:TagSemanticStatement: energie skos:related energy
John srtag:hasRejected
srtag:TagSemanticStatement: energie skos:closeMatch energy
John srtag:hasProposed
Conflict detection
environment / pollution
narrower: John srtag:hasApproved, Anne srtag:hasApproved
broader: Monique srtag:hasApproved, Delphine srtag:hasApproved
Using rules:
IF num(narrower)/num(broader) ≥ c THEN narrower wins ELSE related wins
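The rule above can be sketched as a small function. The value of the threshold c is not given on the slide, so 1.5 is an arbitrary placeholder, and the handling of an empty "broader" side is an added assumption.

```python
def solve_conflict(narrower_approvals, broader_approvals, c=1.5):
    """Apply: IF num(narrower)/num(broader) >= c THEN narrower wins ELSE related wins."""
    n_narrower = len(narrower_approvals)
    n_broader = len(broader_approvals)
    if n_broader == 0:
        # The ratio is undefined here; assume unopposed "narrower" approvals win.
        return "narrower" if n_narrower > 0 else "related"
    return "narrower" if n_narrower / n_broader >= c else "related"


# The slide's example: two approvals on each side.
outcome = solve_conflict({"John", "Anne"}, {"Monique", "Delphine"})
print(outcome)
```

With two approvals on each side the ratio is 1.0, so for any c above 1 the solver falls back on the less constrained "related".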
Conflict detection
[Figure: hierarchy of relations, where "related" is less constrained than "broader", "narrower" and "close match"; for environment / pollution, the related, narrower and broader statements are marked as conflicting, ConflictSolver choice, debatable or rejected]
Conflict detection
Experimentation at ADEME
Global map
Includes all points of view, highlights conflicts + consensuses
Referent choices
Choices of the referent user (e.g. archivists at Ademe)
Each point of view corresponds to a layer
Enriching individual points of view
Integrating others' contributions:
1. Current user -> "Anne"
2. ReferentUser (e.g. archivists)
3. ConflictSolver (software agent)
4. Other individual users
5. Automatons (metrics)
Anne is looking for resources tagged "environnement"
Search: environnement
Suggestions grouped under BROADER, NARROWER, RELATED, CLOSE MATCH:
preoccupation environnementales
grenelle de lenvironnement
competences environnementales
environment
environmental
domaines environnementaux
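Reading the numbered list as a priority order (an assumption; the slide only numbers the sources), the relation shown to Anne for a given pair of tags could be picked as in this sketch. The source keys, the function name and the example statements are all hypothetical.

```python
# Sources of statements, highest priority first (order taken from the slide's list).
PRIORITY = [
    "current_user",
    "referent_user",
    "conflict_solver",
    "other_users",
    "automatons",
]


def pick_relation(statements, priority=PRIORITY):
    """Return (source, relation) from the highest-priority source that has an opinion."""
    for source in priority:
        relation = statements.get(source)
        if relation is not None:
            return source, relation
    return None


# Hypothetical statements about the pair (environnement, environment).
statements = {
    "automatons": "skos:related",
    "referent_user": "skos:closeMatch",
}
choice = pick_relation(statements)
print(choice)
```

Here the archivists' choice overrides the computed relation, and Anne's own statement, if she had made one, would override both.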
Application to community detection (Érétéo, 2011)
Algorithm based on random label propagation (Raghavan et al., 2007)
Why not use tags and their semantic relations instead?
Application to Ademe's social network:
• linking 1 tag to each user (the most often used one)
• when two users are linked AND their tags are related => merge them
• with 9200 tags => group users in 30 communities
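The merge step in the bullets above can be sketched with a union-find structure. This is a simplification for illustration, not the label propagation algorithm of the cited work: users are labelled with their most used tag, and two linked users are merged when their labels are related. Merging users with identical labels is an added assumption, and the users, links and related pairs below are made up.

```python
from collections import Counter


def most_used_tag(tags):
    """Return the tag a user applied most often."""
    return Counter(tags).most_common(1)[0][0]


def detect_communities(user_tags, links, related):
    """Group linked users whose most used tags are identical or related."""
    label = {user: most_used_tag(tags) for user, tags in user_tags.items()}
    parent = {user: user for user in user_tags}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in links:
        if label[a] == label[b] or frozenset((label[a], label[b])) in related:
            parent[find(a)] = find(b)

    groups = {}
    for user in user_tags:
        groups.setdefault(find(user), set()).add(user)
    return list(groups.values())


# Hypothetical data: tag usage per user, social links, and related tag pairs.
user_tags = {
    "Alex": ["pollution", "pollution", "chimie"],
    "Anne": ["chimie", "chimie"],
    "Claire": ["biomasse"],
}
links = [("Alex", "Anne"), ("Anne", "Claire")]
related = {frozenset(("pollution", "chimie"))}
communities = detect_communities(user_tags, links, related)
print(communities)
```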
Results for the "biggest" tags:
1. pollution
2. développement durable
3. énergie
4. chimie
5. pollution de l'air
6. métaux
7. biomasse
8. déchets
5. Conclusion
What we do:
Help online communities structure their tags
wind-energy has broader: renewable energy
wind-energy related: sustainability
wind-energy has narrower: wind turbine
wind-energy related: environment
Our contributions:
• An approach to bridge tagging with the Semantic Web
• Automatic processing of tags
• User interface to capture tag structuring embedded in everyday tasks
• Implementation within the ISICIL solution (tagging server)
Future work
• More user interfaces
• ISICIL: tests with end users at Ademe and Orange Labs
• Testing on other types of communities
• Temporal dimension
• Multilingualism
Thank you for your attention !
me: [email protected] http://www-sop.inria.fr/members/Freddy.Limpens
my advisors: Fabien Gandon: [email protected], Michel Buffa: [email protected]
"community detection guy": Guillaume Erétéo: [email protected]
ISICIL team: http://isicil.inria.fr
References
ANGELETOU S., SABOU M. & MOTTA E. (2008). Semantically Enriching Folksonomies with FLOR. In CISWeb Workshop at European Semantic Web Conference ESWC.
BRAUN S., SCHMIDT A., WALTER A., NAGYPÁL G. & ZACHARIAS V. (2007). Ontology maturing: a collaborative web 2.0 approach to ontology engineering. In CKC, volume 273 of CEUR Workshop Proceedings: CEUR-WS.org.
CATTUTO C., BENZ D., HOTHO A. & STUMME G. (2008). Semantic grounding of tag relatedness in social bookmarking systems. In Proceedings of the 7th International Conference on The Semantic Web, Berlin, Heidelberg: Springer-Verlag.
GANDON F., BOTTOLIER V., CORBY O. & DURVILLE P. (2007). RDF/XML source declaration, W3C member submission. http://www.w3.org/Submission/rdfsource/.
HALPIN H. & PRESUTTI V. (2009). An ontology of resources: Solving the identity crisis. In ESWC, volume 5554 of Lecture Notes in Computer Science, p. 521–534: Springer.
HOTHO A., JÄSCHKE R., SCHMITZ C. & STUMME G. (2006). Information retrieval in folksonomies: Search and ranking. In The Semantic Web: Research and Applications, LNCS (4011), Heidelberg: Springer.
HUYNH-KIM BANG B., DANÉ E. & GRANDBASTIEN M. (2008). Merging semantic and participative approaches for organizing teachers’ documents. In Proceedings of World Conference on Educational Multimedia, Hypermedia & Telecommunications, p. 4959–4966, Vienna, Austria.
KIM H.-L., YANG S.-K., SONG S.-J., BRESLIN J. G. & KIM H.-G. (2007). Tag Mediated Society with SCOT Ontology. In Semantic Web Challenge, ISWC.
LIN H. & DAVIS J. (2010). Computational and crowdsourcing methods for extracting ontological structure from folksonomy. In ESWC (2), volume 6089 of Lecture Notes in Computer Science, p. 472–477: Springer.
MIKA P. (2005). Ontologies are Us: a Unified Model of Social Networks and Semantics. In ISWC, volume 3729 of LNCS, p. 522–536: Springer.
MONNIN A., LIMPENS F., GANDON F. & LANIADO D. (2010). Speech acts meet tagging: Nicetag ontology. In I-SEMANTICS ’10: Proceedings of the 6th International Conference on Semantic Systems, p. 1–10, New York, NY, USA: ACM.
PASSANT A. & LAUBLET P. (2008). Meaning of a tag: A collaborative approach to bridge the gap between tagging and linked data. In Proceedings of the WWW 2008 Workshop Linked Data on the Web (LDOW2008), Beijing, China.
SPECIA L. & MOTTA E. (2007). Integrating folksonomies with the semantic web. In Proc. of the European Semantic Web Conference (ESWC2007), volume 4519 of LNCS, p. 624–639, Berlin Heidelberg, Germany: Springer-Verlag.