Advanced Semantics and Search
Beyond Tag Clouds and Taxonomies
Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Agenda
Introduction
– 2.0 is really 1.35
Semantic Search - Integrated Design
– Examples – Good, Bad, Ugly
– Themes and Conclusions
Integrated Solutions – How to Beat the Crowd
– People, Technology, Tags, Semantics
Conclusion
KAPS Group: General
Knowledge Architecture Professional Services
Virtual Company: Network of consultants – 12-15
Partners – FAST, Inxight, Siderean, Nstein, etc.
Consulting, Strategy, Knowledge architecture audit
Taxonomies: Enterprise, Marketing, Insurance, etc.
Services:
– Taxonomy development, consulting, customization
– Technology Consulting – Search, CMS, Portals, etc.
– Metadata standards and implementation
– Knowledge Management: Collaboration, Expertise, e-learning
– Applied Theory – Faceted taxonomies, complexity theory, natural categories
2.0 – Reality Check - General
Evolution, not Revolution
Tyranny of the majority - worst type of central authority
More Madness of Crowds than Wisdom of Crowds
Enterprise 2.0 – still looking for a problem to solve
– Social Networking is a small part of business
“Things fall apart; the centre cannot hold;
Mere anarchy is loosed upon the world,
…
The best lack all conviction, while the worst
Are full of passionate intensity.”
- The Second Coming – W.B. Yeats
2.0 – Reality Check - Search
Folksonomies don’t compare with taxonomies or ontologies
Serendipity browsing is small part of search
Fundamental Limits
– Limited areas of success – popular sites are popular
– Quality Content – finance, science, etc. – not good candidates
– No mechanism for improving folksonomies
– Scale – Too Big (million hits) – Too Little (200 items) – Amazon and LibraryThing
– Need intrinsic value of tagging – not tagging for better tags
Bad Tags - idiosyncratic or too broad, errors, limited reach
– Most people can’t tag very well – learned skill
Semantics and Search: An Integrated Approach: Elements
Multiple Knowledge Structures
– Facet – orthogonal dimension of metadata
– Taxonomy - Subject matter / aboutness
– Ontology – Relationships / Facts
• Subject – Verb - Object
Software - Text analytics, auto-categorization, entity extraction
People – tagging, evaluating tags, fine tune rules and taxonomy
People – Users, social tagging, suggestions
Rich Search Results – context and conversation
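The three knowledge structures named on this slide can be sketched as simple data structures. This is only an illustration under stated assumptions: the class, field, function, and verb names (`Document`, `ancestors`, `objects_of`, `partners_with`, `works_for`) and the sample taxonomy are hypothetical and do not come from the deck.

```python
from dataclasses import dataclass, field

# Facet: each key is an independent (orthogonal) metadata dimension.
# Topics carry the "aboutness" and point into the taxonomy below.
@dataclass
class Document:
    title: str
    facets: dict = field(default_factory=dict)
    topics: list = field(default_factory=list)

# Taxonomy: a subject hierarchy stored as child -> parent (hypothetical terms).
TAXONOMY = {
    "Search": "Information Management",
    "Enterprise Search": "Search",
    "Text Analytics": "Information Management",
}

def ancestors(topic):
    """Walk up the taxonomy from a topic to the root."""
    path = []
    while topic in TAXONOMY:
        topic = TAXONOMY[topic]
        path.append(topic)
    return path

# Ontology: relationships / facts as (subject, verb, object) triples.
TRIPLES = [
    ("KAPS Group", "partners_with", "FAST"),
    ("KAPS Group", "partners_with", "Inxight"),
    ("Tom Reamy", "works_for", "KAPS Group"),
]

def objects_of(subject, verb):
    """Return every object linked to a subject by a given verb."""
    return [o for s, v, o in TRIPLES if s == subject and v == verb]

doc = Document(
    title="Faceted search design",
    facets={"source": "journal", "type": "article"},
    topics=["Enterprise Search"],
)
```

A facet value filters, a taxonomy node generalizes (`ancestors("Enterprise Search")` walks up to broader subjects), and a triple answers a factual question (`objects_of("KAPS Group", "partners_with")`).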
Integrated Design – Facets & Semantics
Design Issues - General
What is the right combination of elements?
– Faceted navigation, metadata, browse, search, categorized search results, file plan
What is the right balance of elements?
– Dominant dimension or equal facets
Full Facets – Multiple intersecting filters
– 1 or 2 filters (source / type) – No
When to combine search, topics, and facets?
– Search first and then filter by topics / facet
– Browse/facet front end with a search box
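The "search first and then filter" pattern with multiple intersecting filters can be sketched as follows. This is a minimal in-memory illustration, not how any particular search product works: the sample documents, the facet names, and the helpers `search`, `apply_filters`, and `facet_counts` are all hypothetical.

```python
from collections import Counter

# Hypothetical sample collection; each key other than "title" is a facet.
DOCS = [
    {"title": "Taxonomy basics", "source": "blog", "type": "article", "topic": "taxonomy"},
    {"title": "Faceted search design", "source": "journal", "type": "article", "topic": "search"},
    {"title": "Search tuning notes", "source": "blog", "type": "memo", "topic": "search"},
    {"title": "Tag cloud search study", "source": "journal", "type": "report", "topic": "folksonomy"},
]

def search(docs, query):
    """Naive keyword search: case-insensitive substring match on the title."""
    q = query.lower()
    return [d for d in docs if q in d["title"].lower()]

def apply_filters(docs, filters):
    """Keep documents matching every selected facet value (intersection)."""
    return [d for d in docs if all(d.get(f) == v for f, v in filters.items())]

def facet_counts(docs, facet):
    """Count result documents per facet value, for display next to the results."""
    return Counter(d[facet] for d in docs)

hits = search(DOCS, "search")                        # step 1: search
by_source = facet_counts(hits, "source")             # step 2: show facet counts
narrowed = apply_filters(hits, {"source": "journal", "type": "article"})  # step 3: filter
```

Because each filter only narrows the current result set, filters can be applied in any order and combined freely. That is what distinguishes full facets from a fixed one- or two-filter interface.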
Integrated Design – Facets & Semantics
Design Issues - General
Good Information Architecture
– Space wars – summary or full facet display
– Simplicity vs. research power
– Source and Type are basics
– Standard Facets – People, Companies, Place, Industry
– Interactive interface – sliders, date ranges
Semantics still hardest – summaries, related, rank
Taxonomy – just another facet?
– Keywords vs. simple taxonomy
Tag Clouds / Clusters – how useful?
Feedback – numbers of stories vs. top stories
Integrated Design – Facets & Semantics
Design Issues - Users
Homogeneity of Audience and Content
Model of the Domain – broad
– How many facets do you need?
– More facets and let users decide
– Allow for customization – can’t define a single set
User Analysis – tasks, labeling, communities
• Issue – labels that people use to describe their business and labels that they use to find information
Match the structure to domain and task
– Users can understand different structures
Integrated Solution: Enterprise and eCommerce
Semantics, Technology, People, Policy
Design the right balance for each area
– Products – facets, Publishing – more software emphasis – for tags
– Enterprise – more precise targets, high quality content, more direct role for policy
New Relationship of Central and Crowd
– Not top down or bottom up
– Interpenetration of opposites
Variety of Knowledge structures
– Folksonomies, taxonomies, ontologies, facets
Integrated Solutions: Technology
Text Analytics – Taxonomy management, entity extraction, categorization, sentiment
– Auto-populate variety of metadata – author, title, date, etc.
– Relevance – best bets to weights and classes of documents
Search – Integrated features, facets and clusters and tag clouds and feedback
Enterprise Content Management
– Place to add metadata, supported by policy
– Gather input from authors, tag clouds plus
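As a rough sketch of how text analytics might auto-populate metadata for an author to review, the example below combines keyword-rule categorization with dictionary-based entity extraction. Real text analytics tools use far richer rules and models. The rule table, the entity dictionary, and the functions `categorize`, `extract_entities`, and `suggest_metadata` are assumptions made for illustration only.

```python
import re

# Hypothetical categorization rules: category -> trigger terms.
CATEGORY_RULES = {
    "Search": ["search", "relevance", "facet"],
    "Taxonomy": ["taxonomy", "taxonomies", "category"],
}

# Hypothetical entity dictionary: known name -> entity type.
KNOWN_ENTITIES = {
    "FAST": "Company",
    "Inxight": "Company",
    "Tom Reamy": "Person",
}

def categorize(text, min_hits=1):
    """Score each category by counting its trigger terms; best score first."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {}
    for category, terms in CATEGORY_RULES.items():
        hits = sum(words.count(term) for term in terms)
        if hits >= min_hits:
            scores[category] = hits
    return sorted(scores, key=lambda c: (-scores[c], c))

def extract_entities(text):
    """Find known names as whole words (case-sensitive), grouped by type."""
    found = {}
    for name, kind in KNOWN_ENTITIES.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.setdefault(kind, []).append(name)
    return found

def suggest_metadata(text):
    """Suggested metadata for a human to confirm or correct, not a final tag set."""
    return {"categories": categorize(text), "entities": extract_entities(text)}

sample = "Tom Reamy compared FAST search relevance with a simple taxonomy."
suggested = suggest_metadata(sample)
```

The output is deliberately a suggestion. In line with the slide, the content management system is the place where authors accept or fix these values, supported by policy.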
Integrated Solution: People
Programmers, Librarians, Taxonomists, Metadata specialists
– Integrate, design, develop rules, monitor activity & quality
Authors, Subject Matter Experts
– Input into design (important facets), rules, activity meaning
Users – Web 2.0
– Feedback – quality and usability
– Suggestions – missing terms, bad categorization & entity
– Tag Clouds & folksonomy – for social networking features, not for information retrieval
Conclusions
90% of what you hear about Folksonomies (2.0) is hype – again
– Folksonomies are a great source for first drafts and social research
– Social Networking is really good – for social networking
Semantic Infrastructure solution (people, policy, technology, semantics) and feedback is best approach
Integrated design is essential – not facets as add on
Semantics is still not there – hardest, but some progress
Text Analytics (Entity extraction and auto-categorization) are essential
Future – new kinds of applications:
– Text Mining, research tools, sentiment
Questions?