Taxonomy Strategies LLC
June 2, 2009 Copyright 2009 Taxonomy Strategies LLC. All rights reserved.
Getting Started with Business Taxonomy Design
Joseph A. Busch, Founder & Principal
Ron Daniel Jr., Principal
Taxonomy Strategies LLC The business of organized information
Workshop agenda
Time Duration Description
9:00-9:15 15 min Introduction
9:15-9:30 15 min Warm-up exercise
9:30-9:45 15 min Taxonomy background
9:45-10:00 15 min Taxonomy exercise
10:00-10:15 15 min Taxonomy background continued
10:15-10:30 15 min Coffee break
10:30-11:15 45 min Taxonomy process
11:15-11:45 30 min Taxonomy exercise
11:45-12:00 15 min Q & A
Who we are: Joseph Busch
• Over 25 years in the business of organized information.
  – Founder, Taxonomy Strategies LLC
  – Director, Solutions Architecture, Interwoven
  – VP, Infoware, Metacode Technologies (acquired by Interwoven, November 2000)
  – Program Manager, Getty Foundation
  – Manager, Pricewaterhouse
• Metadata and taxonomies community leadership.
  – President, American Society for Information Science & Technology
  – Director, Dublin Core Metadata Initiative
  – Adviser, National Research Council Computer Science and Telecommunications Board
  – Reviewer, National Science Foundation Division of Information and Intelligent Systems
  – Founder, Networked Knowledge Organization Systems/Services
Who we are: Ron Daniel, Jr.
• Over 15 years in the business of metadata & automatic classification.
  – Principal, Taxonomy Strategies
  – Standards Architect, Interwoven
  – Senior Information Scientist, Metacode Technologies (acquired by Interwoven, November 2000)
  – Technical Staff Member, Los Alamos National Laboratory
• Metadata and taxonomies community leadership.
  – Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group
  – Acting chair, XML Linking working group
  – Member, RDF working groups
  – Co-editor, PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports.
Who are you?
Your role:
• Content Manager
• Editor
• Information Architect
• Usability Expert
• Librarian
• Records Manager
• Knowledge Engineer
• Ontologist
• Chief Information Officer
• Communications
• Administration
Industrial sector:
• Financial Services (Banking & Insurance)
• High Tech (Computers, Software & Telecommunications)
• Heavy Manufacturing (Steel, Automobiles, Aircraft, etc.)
• Government (Federal, State or local)
• Manufacturing (Consumer Products, etc.)
• Medical & Health Care
• Mining & Refining (Petrochemicals, Oil & Gas)
• Pharmaceuticals (Drugs, Biotech)
What sectors do you work in?
Pop Quiz
On a blank piece of paper:
• What question(s) did you want to have answered by coming to today’s talks?
Flag one question to be discussed later.
You do NOT have to provide your name.
Please DO provide your job title, division, and either company name or company type.
Exercise 1: How do you organize your sock drawer?
Like this?
Or, like this?
Getting Started with Business Taxonomy Design: BACKGROUND
Simple definition of metadata and taxonomy
Metadata is data about data. In our case it is a set of fields of library catalog-like data about published content.
Metadata
• Title
• Author
• Department
• Audience
• Topic
The taxonomy is the lists of values that go into the metadata fields.
Topics
• Employee Services
• Compensation
• Retirement
• Insurance
• Further Education
• Finance and Budget
• Products and Services
• Support Services
• Infrastructure
• Supplies
Audience
• Internal
  – Executives
  – Managers
• External
  – Suppliers
  – Customers
  – Partners
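The relationship between metadata fields and taxonomy values can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TAXONOMY` dictionary, the `invalid_fields` helper, and the sample record (title, author) are hypothetical, while the field names and values are taken from the slide.

```python
# Hypothetical controlled vocabularies: each controlled metadata field
# maps to the taxonomy values it may contain (values taken from the slide).
TAXONOMY = {
    "Topic": {
        "Employee Services", "Compensation", "Retirement", "Insurance",
        "Further Education", "Finance and Budget", "Products and Services",
        "Support Services", "Infrastructure", "Supplies",
    },
    "Audience": {
        "Internal", "Executives", "Managers",
        "External", "Suppliers", "Customers", "Partners",
    },
}


def invalid_fields(record):
    """Return the controlled fields whose value is not in the taxonomy."""
    return sorted(
        field
        for field, allowed in TAXONOMY.items()
        if field in record and record[field] not in allowed
    )


# A made-up content item: free-text fields (Title, Author) are not checked,
# controlled fields (Topic, Audience) must use taxonomy values.
record = {
    "Title": "Retirement Plan Overview",
    "Author": "J. Doe",
    "Department": "HR",
    "Audience": "Managers",
    "Topic": "Retirement",
}
print(invalid_fields(record))
```

The point of the sketch is the split of responsibilities: the metadata schema names the fields, and the taxonomy supplies the allowed values for the controlled ones.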
Traditional v. business taxonomy: Side-by-side comparison
Traditional Taxonomy
• Detailed model for real world.
• Absolute granularity and ultimate classification.
• Modern ‘cladistic’ approach yields far deeper hierarchies with very low fan-out.
Business Taxonomy
• Simple & usable for common tasks.
• Granularity is small groups, not individual items.
• Modern ‘faceted’ approach uses multiple small facets which combine to yield small groups.
Kingdom Animalia
Phylum Chordata
Class Reptilia
Order Squamata
Family Colubridae
Genus Pituophis
Species catenifer
Pacific Gopher Snake
(Pituophis catenifer)
Business taxonomy problem: How can a customer pick from >5,000 faucets without quitting?
Refine search by:
• Category
• Price
• Brand
• Color/Finish
• # Handles
• Series Name
• Water Filter?
• Faucet Spray
• Handle Shape
• Soap Dispenser?
How business taxonomy translates into front-end interface
Metadata Field: Size
Taxonomy Values: 4.5, 5.5, 6, 6.5, 7, 8, …
Metadata Field: Color
Taxonomy Values: Black, Blue, Brown, Green, Grey, Ivory, …
Metadata Field: Type
Taxonomy Values: Athletic Inspired, Boots, Loafers and Slip-ons, Oxfords and More, Sandals
Metadata Field: Brand
Taxonomy Values: Antonio Maurizi, Bacco Bucci, Ben Sherman, Bruno Magli, …
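As a rough sketch of how such fields could drive a "refine by" interface, the Python below filters a tiny catalog by selected facet values and counts the remaining values per facet. The catalog items and the helper names `refine` and `facet_counts` are hypothetical; only the field names and values come from the slide.

```python
# Hypothetical catalog: every item is tagged with the four metadata fields.
SHOES = [
    {"name": "Shoe A", "Size": 8, "Color": "Black", "Type": "Boots", "Brand": "Ben Sherman"},
    {"name": "Shoe B", "Size": 8, "Color": "Brown", "Type": "Boots", "Brand": "Bruno Magli"},
    {"name": "Shoe C", "Size": 6.5, "Color": "Black", "Type": "Sandals", "Brand": "Bacco Bucci"},
]


def refine(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(field) == value for field, value in selected.items())
    ]


def facet_counts(items, field):
    """Count how many items carry each value of one facet (for the side menu)."""
    counts = {}
    for item in items:
        counts[item[field]] = counts.get(item[field], 0) + 1
    return counts


boots = refine(SHOES, Type="Boots")
print(facet_counts(boots, "Color"))
print([item["name"] for item in refine(SHOES, Type="Boots", Color="Black")])
```

Each click in the interface adds one more field/value pair to the selection, so the result set shrinks without the user ever typing a query.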
How business taxonomy translates into front-end interface… for YOUR BUSINESS
Metadata Field: Topic
Taxonomy Values: Manufacturing, Benefits, Infrastructure, Quality, Safety, …
Metadata Field: Locale
Taxonomy Values: North America, Europe, Asia, South America, …
Metadata Field: Document Type
Taxonomy Values: Forms, Policies, Procedures, Reports, News, …
Metadata Field: Department
Taxonomy Values: HR, Sales and Marketing, Communications, Shipping, …
XYZ Corp. Intranet
Departments: HR, Finance, IT, More…
Document Types: Forms, Policies, Reports, News, More…
Topics: Benefits, Manufacturing, Quality, Safety, More…
Regions: N. America, Europe, Asia, S. America, More…
Exercise 2: High Level Taxonomy Identification
Your Org’s Site
Grouping A: Lorem ipso, Factorum delos, Istab uno, Librea johe, More…
Grouping B: Lorem ipso, Factorum delos, Istab uno, Librea johe, More…
Grouping C: Lorem ipso, Factorum delos, Istab uno, Librea johe, More…
Grouping D: Lorem ipso, Factorum delos, Istab uno, Librea johe, More…
Metadata Field A:___________________
Taxonomy Values:
_________________
_________________
_________________
_________________
_________________
_________________
_________________
Metadata Field B:___________________
Taxonomy Values:
_________________
_________________
_________________
_________________
_________________
_________________
_________________
Metadata Field C:___________________
Taxonomy Values:
_________________
_________________
_________________
_________________
_________________
_________________
_________________
Metadata Field D:___________________
Taxonomy Values:
_________________
_________________
_________________
_________________
_________________
_________________
_________________
Why use facets in a business taxonomy?
• Categorize in multiple, independent categories.
• Allow combinations of categories to narrow the choice of items.
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4).
  – Easier to maintain
  – Easier to reuse existing lists
  – Can be easier to navigate, if software supports it
  – Accommodates different needs and preferences
Example: 42 values to maintain (10+6+11+15), 9,900 combinations (10x6x11x15)
Main Ingredients: Chocolate, Dairy, Fruits, Grains, Meat & Seafood, Nuts, Olives, Pasta, Spices & Seasonings, Vegetables
Meal Type: Breakfast, Brunch, Lunch, Supper, Dinner, Snack
Cuisines: African, American, Asian, Caribbean, Continental, Eclectic/Fusion/International, Jewish, Latin American, Mediterranean, Middle Eastern, Vegetarian
Cooking Methods: Advanced, Bake, Broil, Fry, Grill, Marinade, Microwave, No Cooking, Poach, Quick, Roast, Sauté, Slow Cooking, Steam, Stir-fry
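The arithmetic on this slide can be checked with a short Python sketch. The facet sizes are the counts of the four recipe lists above; the variable names are just for illustration.

```python
from math import prod

# Number of values in each recipe facet, counted from the lists on the slide.
facet_sizes = {
    "Main Ingredients": 10,
    "Meal Type": 6,
    "Cuisines": 11,
    "Cooking Methods": 15,
}

# Maintenance cost grows with the SUM of the facet sizes...
values_to_maintain = sum(facet_sizes.values())

# ...while discriminatory power grows with their PRODUCT.
combinations = prod(facet_sizes.values())

# The general claim: 4 independent facets of 10 values each distinguish
# as many groups as a single hierarchy with 10**4 leaf nodes.
single_hierarchy_nodes = 10 ** 4

print(values_to_maintain, combinations, single_hierarchy_nodes)
```

Adding a fifth facet of 10 values would add only 10 values to maintain but multiply the number of combinations by 10.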
Justification for business taxonomy
• Easier information management
• Flexibility to respond to changing needs
• Foundation for findability and usability
• Typical ROI scenarios:
  – Greater sales on a public shopping site
  – Faster and more consistent responses by call center staff
  – Reduced regulatory and legal risk
  – Improved knowledge worker productivity
  – Improved overall staff productivity
• Don’t justify the taxonomy, justify the goal the taxonomy will help you achieve.
Effectiveness of applications of a business taxonomy
• For a product catalog, e.g., HomeDepot.com
  – Conversion rate increases: 20% increase (Petersen).
  – Lift in average order size: 20% increase (Petersen).
• For knowledge workers, e.g., call center support staff
  – Time saved: 36% faster than search (Chen & Dumais).
• For knowledge workers, e.g., analysts
  – Increase in productivity: 25% productivity increase from not re-creating content (Taylor).
  – Estimated productivity loss exceeded $10M per year, about $500 per employee per year (Nielsen).
How do taxonomies improve search?
• Input (query) side
  – “Search” using a small set of pre-defined values instead of trying to guess what word or words might have been used in the content.
  – Have synonyms mapped together so searches for “car”, “auto”, and “automobile” return the same things.
• Output (results) side
  – Organize search results into groups of related items.
  – Sorting and filtering
  – Refining search results
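A minimal sketch of the synonym mapping on the input side, assuming a hypothetical preferred term "Automobiles" and a made-up index of document IDs; real search engines offer their own synonym configuration, which this does not try to reproduce.

```python
# Hypothetical synonym ring: every entry term points at one preferred term.
SYNONYMS = {
    "car": "Automobiles",
    "auto": "Automobiles",
    "automobile": "Automobiles",
}

# Hypothetical index: content is tagged with the preferred term only.
INDEX = {
    "Automobiles": ["doc-1", "doc-7"],
}


def search(query):
    """Map the query to its preferred term, then look up the tagged content."""
    preferred = SYNONYMS.get(query.strip().lower())
    return INDEX.get(preferred, [])


print(search("car"), search("Auto"), search("automobile"))
```

Because content is tagged once with the preferred term, adding a new synonym is a one-line vocabulary change and requires no re-tagging.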
Google search on “pcb”: returns > 28M items
Taxonomy could suggest “Polychlorinated Biphenyls” vs. “Printed Circuit Boards” or “Pakistan Cricket Board”
169,169 items
Categorized results: refine search by clicking on categories
Taxonomy in action on the results side: www.CareerBuilder.com search on IT positions
By Category, By Company, By City, By State
Intra-site navigation through metadata and taxonomy
Forest Park Master Plan
Construction Update: July 2003
Aviation Field: The fields are complete and are open to the public. Work is still underway on the paths. The Forest Park Softball League is seeking teams for fall play. Contact Roger Berry at 289-5307.
Boathouse: Project is complete and open. The City of St. Louis awarded the contract for the operator to
Home > Dept. of Parks, Recreation & Forestry > Division of Parks > News & Announcements
Leisure & Culture Transportation & Infrastructure Parks & Gardens Construction, Maint, Impro
Dynamically populated with query: SELECT thumbnail, URL WHERE Format = Video/* AND Org = Parks. Select a random result if list is long.
Org = “Division of Parks” AND Type = “Online Forms”
Topic = AROUND(“Parks & Gardens”, “Construction, Maintenance & Improvements”)
SELECT Title, Description, URL WHERE Org = “Parks Division” AND Type = HomePage
Org = RELATED_ORG(Topic = “Parks & Gardens”, “Construction, Maintenance & Improvements”) AND Type = HomePage (Get Title and Description)
Breadcrumbs and left-nav are dynamic and based on the directory in which content is created.
Main content tagged with:
ORG = Parks, Recreation & Forestry Division/Parks Department
TOPIC = Leisure & Culture/Parks & Gardens; Transport & Infrastructure/Construction, Maintenance & Improvements
CONTENT TYPE = News and Announcements
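The metadata-driven page regions above can be approximated in Python as follows. This is a sketch only: the slide's query notation is pseudocode, and the page records, the `select` helper, and the `pick_one` helper are hypothetical stand-ins for whatever query layer the content management system provides.

```python
import random

# Hypothetical tagged pages; field names follow the slide (Org, Type).
PAGES = [
    {"Title": "Park Permit Application", "URL": "/forms/permit",
     "Org": "Division of Parks", "Type": "Online Forms"},
    {"Title": "Division of Parks", "URL": "/parks",
     "Org": "Division of Parks", "Type": "HomePage"},
    {"Title": "Street Repair Request", "URL": "/forms/streets",
     "Org": "Street Department", "Type": "Online Forms"},
]


def select(pages, fields, **where):
    """Return the requested fields of every page matching all conditions."""
    return [
        {field: page[field] for field in fields}
        for page in pages
        if all(page.get(key) == value for key, value in where.items())
    ]


def pick_one(results, rng=random):
    """Pick a random result when several match; None when nothing matches."""
    return rng.choice(results) if results else None


# Roughly: Org = "Division of Parks" AND Type = "Online Forms"
forms = select(PAGES, ["Title", "URL"], Org="Division of Parks", Type="Online Forms")
print(forms)
print(pick_one(forms))
```

Each dynamic region of the page is just a stored query over the tags, so new content appears in the right places as soon as it is tagged.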
Getting Started with Business Taxonomy Design: TAXONOMY DEVELOPMENT METHODOLOGY
Taxonomy development methods
Method                       Description
Automated analysis           Munge, blast, crunch text to analyze corpus.
Workshopping                 Guide group in activities to identify key concepts.
Strawman                     Prepare best guess, then bring it to the table to discuss.
Adapt existing vocabularies  Customize internal terminology, industry standards, etc.
Hybrid                       Combination of some or all of these methods.
Key components to a successful taxonomy project
• Identify business case
• Planning & research
• Set-up taxonomy team
• Define use cases
• Build high-level taxonomy
• Build-out taxonomy detail
• Maintain & evolve taxonomy
• Validation testing & review
• Migrate content
• Interview stakeholders
Define business case: Business case examples
• Improve search and browsing to reduce the amount of time employees spend looking for information.
• Reduce business silos, foster collaboration and content reuse, and thereby reduce redundant work.
• Reduce the amount of time employees spend e-mailing basic information to each other.
• Build confidence that employees are getting the most up-to-date information, and increase employee loyalty by helping them stay “up to date” on the company.
Research & planning
• Identify target content to be focused on.
  – Provide a list of websites (and/or other target content file stores).
  – Prioritize this list for the purposes of the taxonomy project.
• Gather any query logs, usage statistics and usability surveys.
• Collect any existing documentation related to audience personas, content organization, metadata, keywords, and any other guidelines or standards.
• Identify and gather any internal classifications (org charts, sales regions, records retention schedule, code of conduct, product lists, etc.) and any relevant industry standard classifications (UNSPSC, NAICS, USPS, regulated activities, etc.).
Interview stakeholders
• Recruit people from business-critical functions such as marketing, public relations, product marketing, legal, etc.
  – Include people who have credibility, are early adopters, hold large amounts of content, and are “squeaky wheels” or “fans.”
• Conduct 10-20 interviews.
• The goal is for stakeholders to be the review board during the taxonomy development process, and beyond.
Define use cases: Intranet examples
• Content related to business areas or facilities
  – By geographic location, by type, by specific facility, by access restrictions, by audience, etc.
• Company-wide content
  – By business function, by topic, by access rights, etc.
Use Case: Create a safety policies and procedures website for facilities organized by state.
Use Scenario: Find all safety policies and procedures related to facilities located in Ohio.
Use Case: Locate any content that has policies and procedures around a particular topic.
Use Scenario: A company-wide policy regarding smoking has changed and references to outdated policies should be removed. Find official policies, as well as newsletters related to the smoking policy company-wide.
Define use cases: .com examples
• Web content managers
  – By content type, by topic, by location, etc.
• Public users seeking information by topic, by location, etc.
Use Case: Provide search for dividend schedules, earnings statements and stock splits; and the corresponding press releases for a specific time period.
Use Scenario: An investor who recently sold stock is preparing taxes and would like to do a concise search so that they can find historical information about their holdings.
Use Case: Find and recall all public-facing pages that describe a specific safety tip.
Use Scenario: Find and recall all public-facing pages that discuss gas safety.
Build high-level taxonomy
• Identify the types of actors
  – Audiences, roles & access rights
• Identify the types of content
• Identify the types of activities
  – Business processes, applications & uses
• Identify the types of named entities
  – Products, services, projects, organizations, locations, etc.
• Topics will be everything else.
A business taxonomy should have no more than 6-10 broad divisions.
Build high level taxonomy: Oracle.com top-level taxonomy
Audience, Products, Location, Organization, Content Type
Product Line
Application
Technology
Industry Solution
Person
“Is a” groups of Products
The Oracle.com taxonomy has no explicit topics, only actors, content types, and named entities.
Build high level taxonomy: SGMS top-level taxonomy
http://mysearch.internet.gov.sg/
Topics
The SGMS (Singapore Government Metadata Standard) Taxonomy is much more focused on Topics.
Build-out taxonomy detail
• Get agreement on the broad divisions first, then build out the detailed taxonomy.
• Use existing terminologies whenever they are available for business functions, locations, products & services, etc.
• Only build a vocabulary when no alternative authoritative source exists.
• Only create categories for which there already is content, or is likely to be content soon.
• Keep the taxonomy broad and shallow.
  – Roll up more specific terms into broader categories.
A business taxonomy should have no more than 1,200 categories.
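One way to picture "roll up more specific terms into broader categories" is the Python sketch below. Everything in it is an assumption made for illustration: the parent links, the content counts, and especially the `MIN_ITEMS` threshold, which the slides do not specify.

```python
# Hypothetical broader-term links (child -> parent).
PARENT = {
    "Dental Insurance": "Insurance",
    "Vision Insurance": "Insurance",
    "Insurance": "Employee Services",
}

# Hypothetical number of content items tagged with each candidate category.
CONTENT_COUNT = {
    "Dental Insurance": 1,
    "Vision Insurance": 0,
    "Insurance": 12,
}

# Assumed cut-off: a category needs at least this much content to be kept.
MIN_ITEMS = 3


def rolled_up(term):
    """Walk up to the nearest category with enough content to justify itself."""
    while CONTENT_COUNT.get(term, 0) < MIN_ITEMS and term in PARENT:
        term = PARENT[term]
    return term


for candidate in ("Dental Insurance", "Vision Insurance", "Insurance"):
    print(candidate, "->", rolled_up(candidate))
```

Running a check like this over a draft taxonomy gives a quick list of categories that have little or no content behind them and could be merged into their parents.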
Build out taxonomy detail: NASA Taxonomy
http://nasataxonomy.jpl.nasa.gov/
Validation testing and review
Method: Walk-thru
  Process: Show & explain
  Who: Taxonomist, SME, Team
  Requires: Rough taxonomy
  Validation: Approach; appropriateness to task
Method: Walk-thru
  Process: Check conformance to editorial rules
  Who: Taxonomist
  Requires: Draft taxonomy; editorial rules
  Validation: Consistent look and feel
Method: Usability Testing
  Process: Contextual analysis (card sorting, scenario testing, etc.)
  Who: Users
  Requires: Rough taxonomy; tasks & answers
  Validation: Tasks are completed successfully; time to complete task is reduced
Method: User Satisfaction
  Process: Survey
  Who: Users
  Requires: Rough taxonomy; UI mockup; search prototype
  Validation: Reaction to taxonomy; reaction to new interface; reaction to search results
Method: Tagging Samples
  Process: Tag sample content with taxonomy
  Who: Taxonomist, Team, Indexers
  Requires: Sample content; rough taxonomy (or better)
  Validation: Content ‘fit’; fills out content inventory; training materials for people & algorithms; basis for quantitative methods
Migrate content
• Prioritize content to be tagged
  – Identify and dispose of ROT (redundant, outdated, trivial content).
• Use business rules to automate content tagging
  – Tag landing pages for major sections.
  – Lower-level pages inherit tags from top-level pages.
• Use workflow to enforce tagging
  – Require entry of simple tagging in order to submit an item into the content management system.
• Use templates to guide user tagging
  – Pre-populate template fields whenever possible.
  – Use context-sensitive pick lists.
  – Call out to taxonomy service for more complex controlled vocabularies.
• Provide tagging incentives
  – Almost instantaneous feedback.
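The "lower-level pages inherit tags from top-level pages" rule could look roughly like this in Python. The paths, the tag values, and the `inherited_tags` helper are hypothetical; the sketch assumes that inheritance follows the URL path hierarchy and that more specific sections override broader ones.

```python
# Hypothetical tags applied by hand to the landing pages of major sections.
LANDING_TAGS = {
    "/parks": {"Org": "Division of Parks", "Topic": "Parks & Gardens"},
    "/parks/news": {"Content Type": "News and Announcements"},
}


def inherited_tags(path):
    """Merge the tags of every tagged ancestor section of a page.

    Sections are visited from the top of the site down, so a more
    specific section overrides a broader one on conflicting fields.
    """
    tags = {}
    parts = path.strip("/").split("/")
    for depth in range(1, len(parts) + 1):
        prefix = "/" + "/".join(parts[:depth])
        tags.update(LANDING_TAGS.get(prefix, {}))
    return tags


print(inherited_tags("/parks/news/2003-07-construction-update"))
```

With a rule like this, only the handful of landing pages needs manual tagging, and the bulk of the migrated content picks up usable default tags automatically.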
Maintain and evolve taxonomy
• Taxonomy building is iterative.
  – A taxonomy should be improved over time and maintained.
• Designate a taxonomy editor as the single point-of-contact for taxonomy changes.
• Log change requests and notify requestors.
• Prioritize taxonomy changes, e.g.
  – Improves information access, use and reuse.
  – Requires creating new data or metadata.
  – Affects program operations or has a financial impact.
  – Enables communication campaigns or organizational strategy.
  – Positive impact on users
Exercise: BUILD A HIGH LEVEL TAXONOMY
Exercise: Promo website taxonomy
What is Dunder Mifflin Promo?
• The new online division that markets promotional products.
  – DM Promo was designed to reinvent the business of selling promotional paper products.
• The DMP website will provide:
  – Product catalog browse by category, brand, cost, popularity, feature, etc.
  – Product specs, series, schedule, imprinting, & colors.
  – Various types of content such as product ideas, articles, testimonials, etc.
  – Account information, shipping & returns.
DMP Products include:
• Logo binders & filing supplies
• Logo calendars & planners
• Logo paper, cardstock & pads
• Logo pens & pencils
• Logo promotional products (badges & lanyards, mugs, stress balls, tote bags, mp3 players)
• Logo trophies & novelties (custom money, banners & signs, origami, party hats, paper boats)
• Logo wear (shirts, t-shirts, sweatshirts, fleece, bathing suits, hats, bandanas, team uniforms, socks)
Exercise 3: Identify topics for Promo website taxonomy
1. Form groups
  – No more than 10 in a group.
  – Appoint recorder & reporter.
2. Brainstorm topics (10 min)
  – Write one topic on each Post-it
3. Sort Post-its into groups (5 min)
4. Present taxonomies (10 min)
5. Compare taxonomies (5 min)