Hybrid Approaches to Taxonomy & Folksonomy

Semantic Technology, 2009

Stephanie Lemieux
Earley & Associates
[email protected]
www.earley.com


Agenda

• The taxonomy/folksonomy debate
• Tagging pitfalls
• Social tagging & the enterprise
• Hybrid approaches to taxonomy/folksonomy
• Corporate tagging tools

About me

• Stephanie Lemieux
– Senior Consultant at Earley & Associates, Inc.
– Masters in Library and Information Studies (MLIS), specializing in taxonomy development, content management, search, IA
– Developed enterprise taxonomies and helped a variety of clients through CMS deployments
– Projects include: Motorola, Ford Foundation, Best Buy, American Greetings, Urban Land Institute
– Blog: http://sethearley.wordpress.com/

The tired debate

Taxonomy | Folksonomy
Control | Democracy
Top-down | Bottom-up
Arduous process | Just do it
Accurate | Good enough
Restrictive | Flexible
Static | Evolving
Expensive to maintain | Low cost – “crowdsourced”


The relevance problem

• Search results should be relevant to what a searcher wants, but technology can only determine if it is relevant to a search term*

• Taxonomies and folksonomies = 2 approaches to the problem of relevance with common goal of describing content, each with particular gaps

*Billy Cripe: Folksonomy, Keywords & Tags: Social & Democratic User Interaction in Enterprise Content Management
http://www.oracle.com/technology/products/content-management/pdf/OracleSocialTaggingWhitePaper.pdf

Taxonomy

• Added by a small number of individuals: authors/originators or “authorized” persons (e.g. librarians)

• Describes meaning or purpose of content based on a set viewpoint for a specific audience using a controlled vocabulary

• Relationships between terms are defined
– Hierarchical (e.g. Computer hardware > Keyboard)
– Associative (e.g. Computer hardware – Software)
– Equivalent (e.g. Laptop = Notebook Computer)
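The three relationship types above can be sketched as a small data structure. This is only an illustration: the class and method names (Term, Taxonomy, preferred_label) are hypothetical and are not taken from the slides or from any particular taxonomy tool.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One controlled-vocabulary term and its three kinds of relationships."""
    label: str
    broader: set = field(default_factory=set)    # hierarchical parents
    narrower: set = field(default_factory=set)   # hierarchical children
    related: set = field(default_factory=set)    # associative links
    synonyms: set = field(default_factory=set)   # equivalent (non-preferred) labels


class Taxonomy:
    """Hypothetical in-memory taxonomy, for illustration only."""

    def __init__(self):
        self.terms = {}

    def term(self, label):
        # Create the term on first use so callers can link terms in any order.
        return self.terms.setdefault(label, Term(label))

    def add_hierarchical(self, broader, narrower):
        self.term(broader).narrower.add(narrower)
        self.term(narrower).broader.add(broader)

    def add_associative(self, a, b):
        self.term(a).related.add(b)
        self.term(b).related.add(a)

    def add_equivalent(self, preferred, synonym):
        self.term(preferred).synonyms.add(synonym)

    def preferred_label(self, label):
        # Resolve a label or synonym to its preferred term (case-insensitive).
        wanted = label.lower()
        for term in self.terms.values():
            if wanted == term.label.lower() or wanted in {s.lower() for s in term.synonyms}:
                return term.label
        return None


taxonomy = Taxonomy()
taxonomy.add_hierarchical("Computer hardware", "Keyboard")
taxonomy.add_associative("Computer hardware", "Software")
taxonomy.add_equivalent("Laptop", "Notebook Computer")
```

With the equivalence recorded, taxonomy.preferred_label("notebook computer") resolves to "Laptop", which is the kind of control that free tags lack.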


Tags

• Added by authors and consumers (individual motivation)

• Can connote any type of meaning or purpose

• No compression around a single viewpoint, no control of vocabulary

• Self-correcting through volume


Why tagging is so interesting…

• Adding individual value to the act of classification – user control over findability

• Reducing the cognitive burden (i.e. it’s easy)

• Reduced technological investment (i.e. it’s cheap)

• Can leverage emergent structure (folksonomy)


The downside…

Neither tags nor taggers are perfect…

• No language control
– Misspellings: Library vs. libary; Plam pilot
– Compound words: TimBernersLee
– Case & number: Folksonomy, Folksonomies
– Personal tags: To read; My dog; @work
– Single-use tags: Billybobsdog

Study: 40% of Flickr tags and 28% of del.icio.us tags were flawed in these ways (Guy & Tonkin, 2006. http://www.dlib.org/dlib/january06/guy/01guy.html)
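Some of these flaws can be reduced with a simple cleanup pass. The sketch below is a rough illustration, not a method from the cited study: KNOWN_TERMS and PERSONAL_TAGS are hypothetical lists, the plural folding is deliberately naive, and the 0.8 similarity cutoff is an arbitrary assumption.

```python
import difflib
import re

KNOWN_TERMS = ["library", "folksonomy", "palm pilot"]     # hypothetical vocabulary
PERSONAL_TAGS = {"to read", "toread", "my dog", "@work"}  # hypothetical personal-tag list


def split_compound(tag):
    # "TimBernersLee" -> "Tim Berners Lee": add a space before each inner capital.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", tag)


def singularize(word):
    # Deliberately naive plural folding: "folksonomies" -> "folksonomy".
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def normalize(tag):
    """Return a cleaned tag, or None when the tag is purely personal."""
    tag = split_compound(tag.strip()).lower()
    if tag in PERSONAL_TAGS:
        return None  # personal tags are set aside rather than shared
    singular = singularize(tag)
    if singular in KNOWN_TERMS:
        return singular  # only fold plurals onto terms we already know
    # Snap near-misses such as "libary" onto a known term when one is close enough.
    close = difflib.get_close_matches(tag, KNOWN_TERMS, n=1, cutoff=0.8)
    return close[0] if close else tag
```

Single-use tags such as Billybobsdog pass through unchanged here; spotting them needs usage counts rather than string cleanup.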

The downside…

• Varying levels of granularity
• Same tag, different meanings
• Lack of relationships between tags – which is broader? Narrower?
• Lack of consistency/approach to change – even a single user can change language and hamper own personal retrieval

[Figure: example tags Robin, Bird, Turdus migratorius]

…Known as “tag noise”


The downside…

• Most tag search does not account for stemming, plurals, etc.

E.g.

Search on Delicious:

Folksonomy: 16049

Folksonomies: 4404

Both: 2642
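The three counts above allow a quick estimate of what a single-form search misses. The arithmetic below assumes the figures count tagged items and that "Both" means items carrying both tags; neither assumption is stated on the slide.

```python
folksonomy = 16049      # items tagged "folksonomy"
folksonomies = 4404     # items tagged "folksonomies"
both = 2642             # items carrying both tags (assumed meaning of "Both")

# Inclusion-exclusion: items carrying at least one of the two forms.
either = folksonomy + folksonomies - both

# Items that a search on the singular form alone never returns.
missed_by_singular = folksonomies - both

share_missed = missed_by_singular / either
print(either, missed_by_singular, f"{share_missed:.1%}")
```

Under those assumptions, 1762 of 17811 items (about 9.9%) are invisible to a search on "folksonomy" alone.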


The tagging hype cycle

http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html

The web vs. the enterprise

• Shirky: “there is no shelf”
– Traditional organization schemes are built to deal with physical collections and constraints.
– They don’t work well on the web
  • large corpus
  • no clear edges
  • no formal categories
  • no authority
– E.g. Flickr, Delicious: social tagging works well in this context

• The enterprise is much more defined
  • smaller corpuses
  • formal entities
  • coordinated users, clear tasks
  • need for reliable retrieval
– Social tagging is more of a challenge, needs clear arena

Role of folksonomy in the enterprise?

• Tagging external links
– Seeing what colleagues are interested in
– Sharing links with a specific team
– Subscribing to link feeds
– Monitoring news/blog coverage of the company
– Consumer/competitor research
– Tracking industry trends

• Tagging internal links
– Finding/facilitating access to most popular pages on the intranet
– Seeing what intranet pages mean to staff

Role of folksonomy in the enterprise?

• Social aspects
– Identifying subject matter experts
– Connecting people who share interests
– Encouraging collaboration & resource sharing

• Improve your taxonomy, information retrieval
– User tagging to refine the corporate taxonomy
  • New concepts
  • New terminology
– Seeing what employees find interesting
– Distributing tagging tasks

The downside…

• Potential issues of security, inappropriateness
– Can implement some level of vetting

• Privacy concerns
– Can be anonymous tagging, although this removes some social value
– Can create role or team-based collections

• Need higher ratio of active participants due to population size

The content continuum

Key concept: Not all content is created equal

[Figure: content types placed along a continuum of tagging/organizing processes, running from lower cost, unfiltered, lower value to higher cost, reviewed/vetted/approved, higher value. Content types shown: Message text, External News Reports, Discussion postings, Links, Engineering document repositories, Success Stories, Policies, Approved Methods, Best Practices]

What if we blended the two?

Folksonomy/Taxonomy

• Low cost
• Findability
• Flexible
• Structured relationships
• User terminology
• Oversight
• Social sharing
• Consistency

Hybrid approaches

• Co-existence
• Tag-influenced taxonomy
• Taxonomy-influenced tagging
• Tag hierarchies/ontologies


Co-existence

• Taxonomy and folksonomy are used side by side

• Strengths of each approach preserved, philosophy of each kept “pure”


Web example: Flickr & Library of Congress: http://www.flickr.com/photos/library_of_congress/


Co-existence – public library


Raytheon – corporate example

• Used in Raytheon employee portal - website lists (“Suggested sites” feature box)

• How does it work:
– inserted “Suggested Sites” in a "feature" box to the right of the regularly ranked results
– website suggestions (URLs) are submitted along with recommended tags/keywords, which are subsequently verified and approved by librarians

http://www.slideshare.net/CJMConnors/i-kms-singapore-presentation

Variation: Tag mediation

• Vetting & editing tags

• Pros:
– Weeds out potentially inappropriate tags
– Eliminates misspellings, plural issues, etc.
– Some can be done automatically (spell-checker, e.g.)
– Enhances findability

• Cons:
– Higher effort/cost
– Perceived lack of trust
– Who knows better?

Tag-influenced taxonomy

• Taxonomy & tagging co-exist, tags serve as pool of candidate terms to enrich taxonomy, keep it current
– Find new terminology (synonyms, popular language)
– Find new concepts

• Performed as separate processes (taxonomy tagging = formal, tagging = informal) or combined in single interface

Tag-influenced taxonomy

• Requires formal vetting process

• Can be supported by automation (e.g. candidate tags pulled & filtered with script to remove taxonomy terms, stop words)

• Evaluate candidates based on
– Frequency (“literary warrant”)
– Salience within context

• Look at tags used in conjunction with taxonomy
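The filtering script mentioned above might look something like the following. It is a minimal sketch under assumptions: STOP_WORDS, TAXONOMY_TERMS, the tag log, and the min_count threshold are all hypothetical, and judging salience within context is left to the human vetting step.

```python
from collections import Counter

STOP_WORDS = {"the", "and", "of", "to", "a"}          # hypothetical stop list
TAXONOMY_TERMS = {"laptop", "keyboard", "software"}   # hypothetical existing terms


def candidate_terms(tag_log, min_count=2):
    """Return (tag, count) pairs worth showing to a taxonomist, most frequent first."""
    counts = Counter(tag.strip().lower() for tag in tag_log)
    candidates = [
        (tag, count)
        for tag, count in counts.items()
        if tag
        and tag not in STOP_WORDS        # drop stop words
        and tag not in TAXONOMY_TERMS    # drop terms the taxonomy already has
        and count >= min_count           # frequency as a rough stand-in for warrant
    ]
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(candidates, key=lambda pair: (-pair[1], pair[0]))


tag_log = ["netbook", "Laptop", "netbook", "the", "smartphone", "netbook",
           "smartphone", "the", "tablet"]
print(candidate_terms(tag_log))
```

The output is only a shortlist for the formal vetting process; nothing here decides whether a candidate actually belongs in the taxonomy.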

Taxonomy-influenced tagging

• Presenting choices/suggestions to user from controlled set of terms/tags
– Sometimes users prefer easy choice
  • Drop-down menus
  • Check boxes
  • Type ahead
  • Tree view
– “influenced” – option to enter own tag? Good source of new terms
– Enforces consistency
– Offers structure
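Type-ahead against a controlled set, with the option to enter a new tag, can be sketched as below. The vocabulary and the function names (suggest, choose_tag) are hypothetical; a real tool would query its own term store rather than a small dictionary.

```python
# Hypothetical controlled vocabulary: preferred term -> synonyms.
VOCABULARY = {
    "Laptop": ["Notebook Computer"],
    "Keyboard": [],
    "Knowledge management": ["KM"],
}


def suggest(prefix, limit=5):
    """Return preferred terms whose label or synonym starts with the typed prefix."""
    typed = prefix.strip().lower()
    if not typed:
        return []
    matches = []
    for preferred, synonyms in VOCABULARY.items():
        labels = [preferred] + synonyms
        if any(label.lower().startswith(typed) for label in labels):
            matches.append(preferred)
    return sorted(matches)[:limit]


def choose_tag(user_input):
    """Prefer a controlled term; otherwise keep the free tag and flag it as new."""
    typed = user_input.strip().lower()
    for preferred, synonyms in VOCABULARY.items():
        if typed in [preferred.lower()] + [s.lower() for s in synonyms]:
            return preferred, False
    return user_input.strip(), True  # True marks a new, uncontrolled tag
```

Typing "note" suggests "Laptop" through its synonym, while an unknown entry such as "netbook" is kept and flagged, which is where the new candidate terms come from.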


WWW example: ZigTag (2008)

Definitions from Wikipedia & Wordnet

Tagging with type-ahead against database of 3M unique concepts & 8M synonyms

Zigtag

• Type ahead & synonyms encourage consistency
• Users can enter new tags
• Synonyms based on Wikipedia, so can be “dirty data”
• No hierarchy, only equivalent relationships so far

Zigtag search

Still get problems with uncontrolled tags & recall

Interesting relationships from Wikipedia

Browsable tag cloud


Example: myedna (Education.au)

http://www.educationau.edu.au/jahia/webdav/site/myjahiasite/shared/papers/tagging_hayman.ppt

Fully taxonomy-directed tagging


Buzzillions.com

• Review site: tags are “controlled” not against a taxonomy, but against other tags – reduces redundancy

• Only popular tags exposed as faceted navigation


SharePoint?

• Plugins make taxonomy easy, present like tags

E.g. KWizCom: plugin manages taxonomy and tags in easy interface… can opt-out of letting users create own tags


Taxonomy-directed tagging

• Pros:
– More consistency
– Better support for findability
– Relationships, definitions leveraged – adding meaning to the tags
– Realistic for the enterprise

• Cons:
– Not really folksonomy anymore...
– Can be forcing terminology on user
– Need to develop reference list of concepts – manually through taxonomy or need large corpus to derive automatically

Tag hierarchies

• 2 flavors: user-powered, automatic derivation

• User-powered
– Social approach
– Bogus hierarchies possible
– Small population will contribute

• RawSugar tried it (no longer around): taggers could specify hierarchy in own account, tags clustered based on common groups


Raw Sugar example


More user-powered tag relationships

• E.g. LibraryThing

LibraryThing allows any user to combine (or uncombine) 2 tags that are semantically equivalent.

www.librarything.com
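Combining and uncombining equivalent tags can be modeled with shared groups of tags. The sketch below is a generic illustration and makes no claim about how LibraryThing implements the feature; the TagCombiner class and the example tags are hypothetical.

```python
class TagCombiner:
    """Tracks groups of tags that users have declared equivalent (illustrative only)."""

    def __init__(self):
        self.group_of = {}   # tag -> the set of tags it is combined with

    def _group(self, tag):
        # A tag that has never been combined forms a group of one.
        return self.group_of.setdefault(tag, {tag})

    def combine(self, a, b):
        # Merge both groups and point every member at the same shared set.
        merged = self._group(a) | self._group(b)
        for tag in merged:
            self.group_of[tag] = merged

    def uncombine(self, tag):
        # Pull one tag back out of its group; the rest stay combined.
        group = self._group(tag)
        group.discard(tag)
        self.group_of[tag] = {tag}

    def equivalents(self, tag):
        return sorted(self._group(tag))


combiner = TagCombiner()
combiner.combine("sci-fi", "science fiction")
combiner.combine("sf", "sci-fi")
```

Because any user can combine tags, a design like this would also need a history or moderation step so that a mistaken merge can be undone, which is what uncombine stands in for here.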

Automatic derivation

• Tag hierarchies, facets, ontologies, or “folksontology”

• Done through statistical/clustering algorithms

http://www.pui.ch/phred/automated_tag_clustering/
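One simple statistical approach is to group tags that tend to be applied to the same resources. The sketch below uses Jaccard overlap with single-link grouping; it is a generic illustration rather than the algorithm described at the link above, and the sample data and 0.5 threshold are assumptions.

```python
from itertools import combinations

# Hypothetical data: resource id -> tags applied to it.
TAGGED = {
    "r1": {"python", "programming"},
    "r2": {"python", "programming", "code"},
    "r3": {"recipe", "cooking"},
    "r4": {"cooking", "recipe", "food"},
    "r5": {"programming", "code"},
}


def resources_by_tag(tagged):
    # Invert the data: tag -> set of resources that carry it.
    index = {}
    for resource, tags in tagged.items():
        for tag in tags:
            index.setdefault(tag, set()).add(resource)
    return index


def jaccard(a, b):
    # Share of resources the two tags have in common.
    return len(a & b) / len(a | b)


def cluster_tags(tagged, threshold=0.5):
    """Group tags whose resource sets overlap strongly (single-link clustering)."""
    index = resources_by_tag(tagged)
    cluster_of = {tag: {tag} for tag in index}
    for a, b in combinations(sorted(index), 2):
        if jaccard(index[a], index[b]) >= threshold and cluster_of[a] is not cluster_of[b]:
            merged = cluster_of[a] | cluster_of[b]
            for tag in merged:
                cluster_of[tag] = merged
    unique = {frozenset(cluster) for cluster in cluster_of.values()}
    return sorted(sorted(cluster) for cluster in unique)


print(cluster_tags(TAGGED))
```

On this toy data the programming tags and the cooking tags fall into two separate clusters; with real tag data, ambiguous tags and a poorly chosen threshold can easily chain unrelated groups together.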

Delicious & CiteULike hierarchy

http://heymann.stanford.edu/taghierarchy.html


Clustering at Flickr


Auto clustering/facets

• Still not very mature
• Time-sensitive
• Community-sensitive
• Ambiguous tags
• Improve with volume (self-correcting)

http://www.pui.ch/phred/automated_tag_clustering/

Intelligent tags

• Moving toward more semantic tagging with machine-readable tags
– Flickr: can tag images with machine tags of the form namespace:predicate=value
  e.g. geo:quartier="SoHo"
  e.g. lastfm:event=34640 – makes your photo appear on a lastfm event page
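The namespace:predicate=value pattern is easy to parse. The sketch below is an assumption-laden illustration: the regular expression accepts a simple character set chosen for this example and should not be read as Flickr's official machine-tag grammar.

```python
import re

# namespace:predicate=value, where the value may be wrapped in double quotes.
# The allowed characters are an assumption made for this sketch.
MACHINE_TAG = re.compile(
    r'^(?P<namespace>[A-Za-z][A-Za-z0-9_]*)'
    r':(?P<predicate>[A-Za-z][A-Za-z0-9_]*)'
    r'=(?P<value>.+)$'
)


def parse_machine_tag(tag):
    """Return (namespace, predicate, value), or None for an ordinary tag."""
    match = MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    value = match.group("value").strip('"')  # drop surrounding quotes, if any
    return match.group("namespace"), match.group("predicate"), value
```

Once parsed, a tag such as lastfm:event=34640 can be routed by namespace, which is what lets another service pick up the photo.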

Intelligent tags

• MOAT: Meaning of a tag – part of linked data movement, mapping tags to semantic web
– http://moat-project.org/

• Adding to the triplet
– User – resource – tag – meaning
– Meaning = URI to a resource containing meaning (e.g. DBPedia)

<tag:RestrictedTagging>
  <tag:taggedResource rdf:resource="http://example.org/post/1"/>
  <foaf:maker rdf:resource="http://apassant.net/alex"/>
  <tag:associatedTag rdf:resource="http://tags.moat-project.org/tag/apple"/>
  <moat:tagMeaning rdf:resource="http://dbpedia.org/resource/Apple_Records"/>
</tag:RestrictedTagging>

Conclusion

• Not all content is created equal – tags and taxonomies have their sweet spots

• Hybrid approaches are emerging
– taxonomy-influenced tagging leading the pack in popularity on the web
– co-existence in the enterprise

• Look for more developments on the semantic web/linked data front for making tags more intelligent


Corporate social tagging tools


Corporate social tagging software

• http://www.connectbeam.com/


Corporate social tagging software

• http://www.cogenz.com/


Corporate social tagging software

• http://www-306.ibm.com/software/lotus/products/connections/dogear.html


Corporate social tagging software

• BEA AquaLogic Pathways

• http://www.bea.com/framework.jsp?CNT=index.jsp&FP=/content/products/aqualogic/pathways/


Corporate social tagging software

• http://www.newsgator.com/business/socialsites/default.aspx


Stephanie Lemieux
[email protected]

Blog: sethearley.wordpress.com
Twitter: stephlemieux

Send an email to [email protected] for a free pass to one of our conference calls.

Questions?