Taxonomy Strategies LLC

June 2, 2009 Copyright 2009 Taxonomy Strategies LLC. All rights reserved.

Getting Started with Business Taxonomy Design

Joseph A. Busch, Founder & Principal

Ron Daniel Jr., Principal

Taxonomy Strategies LLC: The business of organized information

Workshop agenda

Time          Duration   Description
9:00-9:15     15 min     Introduction
9:15-9:30     15 min     Warm-up exercise
9:30-9:45     15 min     Taxonomy background
9:45-10:00    15 min     Taxonomy exercise
10:00-10:15   15 min     Taxonomy background continued
10:15-10:30   15 min     Coffee break
10:30-11:15   45 min     Taxonomy process
11:15-11:45   30 min     Taxonomy exercise
11:45-12:00   15 min     Q & A

Who we are: Joseph Busch

• Over 25 years in the business of organized information.
  – Founder, Taxonomy Strategies LLC
  – Director, Solutions Architecture, Interwoven
  – VP, Infoware, Metacode Technologies (acquired by Interwoven, November 2000)
  – Program Manager, Getty Foundation
  – Manager, Pricewaterhouse

• Metadata and taxonomies community leadership.
  – President, American Society for Information Science & Technology
  – Director, Dublin Core Metadata Initiative
  – Adviser, National Research Council Computer Science and Telecommunications Board
  – Reviewer, National Science Foundation Division of Information and Intelligent Systems
  – Founder, Networked Knowledge Organization Systems/Services

Who we are: Ron Daniel, Jr.

• Over 15 years in the business of metadata & automatic classification.
  – Principal, Taxonomy Strategies
  – Standards Architect, Interwoven
  – Senior Information Scientist, Metacode Technologies (acquired by Interwoven, November 2000)
  – Technical Staff Member, Los Alamos National Laboratory

• Metadata and taxonomies community leadership.
  – Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group
  – Acting chair, XML Linking working group
  – Member, RDF working groups
  – Co-editor, PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports.

Who are you?

Your Role

• Content Manager
• Editor
• Information Architect
• Usability Expert
• Librarian
• Records Manager
• Knowledge Engineer
• Ontologist
• Chief Information Officer
• Communications
• Administration

Industrial Sector

• Financial Services: Banking & Insurance
• High Tech: Computers, Software & Telecommunications
• Heavy Manufacturing: Steel, Automobiles, Aircraft, etc.
• Government: Federal, State or local
• Manufacturing: Consumer Products, etc.
• Medical & Health Care
• Mining & Refining: Petrochemicals, Oil & Gas
• Pharmaceuticals: Drugs, Biotech

What sectors do you work in?

Pop Quiz

On a blank piece of paper:

• What question(s) did you want to have answered by coming to today’s talks?

Flag one question to be discussed later.

You do NOT have to provide your name.

Please DO provide your job title, division, and either company name or company type.

Exercise 1: How do you organize your sock drawer?

Like this?

Or like this?

Getting Started with Business Taxonomy Design: BACKGROUND

Simple definition of metadata and taxonomy

Metadata is data about data. In our case it is a set of fields of library catalog-like data about published content.

Metadata fields:

• Title
• Author
• Department
• Audience
• Topic

The taxonomy is the lists of values that go into the metadata fields.

Topics:

• Employee Services
• Compensation
• Retirement
• Insurance
• Further Education
• Finance and Budget
• Products and Services
• Support Services
• Infrastructure
• Supplies

Audience:

• Internal
  – Executives
  – Managers
• External
  – Suppliers
  – Customers
  – Partners
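As a rough illustration of this distinction, the metadata fields can be modeled as the keys of a record, and the taxonomy as the controlled lists of values those fields may take. The sketch below is a minimal Python example. The sample record, the `TAXONOMY` dictionary layout, and the `invalid_fields` helper are assumptions made for illustration and are not part of the workshop material.

```python
# Taxonomy: controlled value lists for the metadata fields that take them.
# The values come from the slide; the dictionary layout is an assumption.
TAXONOMY = {
    "Audience": {"Executives", "Managers", "Suppliers", "Customers", "Partners"},
    "Topic": {
        "Employee Services", "Compensation", "Retirement", "Insurance",
        "Further Education", "Finance and Budget", "Products and Services",
        "Support Services", "Infrastructure", "Supplies",
    },
}


def invalid_fields(record):
    """Return the controlled fields whose value is not in the taxonomy."""
    return [
        field
        for field, allowed in TAXONOMY.items()
        if field in record and record[field] not in allowed
    ]


# Metadata: a set of fields describing one piece of content (hypothetical).
record = {
    "Title": "2009 Retirement Plan Overview",
    "Author": "J. Doe",
    "Department": "HR",
    "Audience": "Managers",
    "Topic": "Retirement",
}

print(invalid_fields(record))
# "Pensions" is not one of the listed Topic values, so it is reported.
print(invalid_fields({**record, "Topic": "Pensions"}))
```

Free-text fields such as Title and Author are left unchecked here; only fields that have a value list in the taxonomy are validated.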

Traditional v. business taxonomy: Side-by-side comparison

Traditional Taxonomy

• Detailed model for real world.
• Absolute granularity and ultimate classification.
• Modern ‘cladistic’ approach yields far deeper hierarchies with very low fan-out.

Example: Pacific Gopher Snake (Pituophis catenifer)

Kingdom Animalia
Phylum Chordata
Class Reptilia
Order Squamata
Family Colubridae
Genus Pituophis
Species catenifer

Business Taxonomy

• Simple & usable for common tasks.
• Granularity is small groups, not individual items.
• Modern ‘faceted’ approach uses multiple small facets which combine to yield small groups.

Business taxonomy problem: How can a customer pick from >5,000 faucets w/o quitting?

Refine search by:

• Category
• Price
• Brand
• Color/Finish
• # Handles
• Series Name
• Water Filter?
• Faucet Spray
• Handle Shape
• Soap Dispenser?

How business taxonomy translates into front-end interface

Metadata Field: Size
Taxonomy Values: 4.5, 5.5, 6, 6.5, 7, 8, …

Metadata Field: Color
Taxonomy Values: Black, Blue, Brown, Green, Grey, Ivory, …

Metadata Field: Type
Taxonomy Values: Athletic Inspired, Boots, Loafers and Slip-ons, Oxfords and More, Sandals

Metadata Field: Brand
Taxonomy Values: Antonio Maurizi, Bacco Bucci, Ben Sherman, Bruno Magli, …
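The way metadata fields and taxonomy values drive a "refine search" interface can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product records and the `refine` and `facet_counts` helpers are hypothetical, and a real storefront would normally delegate this work to its search engine.

```python
from collections import Counter

# Hypothetical catalog: each product is tagged with one value per facet.
PRODUCTS = [
    {"name": "Shoe A", "Size": "7", "Color": "Black", "Type": "Boots", "Brand": "Bruno Magli"},
    {"name": "Shoe B", "Size": "8", "Color": "Black", "Type": "Sandals", "Brand": "Ben Sherman"},
    {"name": "Shoe C", "Size": "7", "Color": "Brown", "Type": "Boots", "Brand": "Bacco Bucci"},
    {"name": "Shoe D", "Size": "7", "Color": "Black", "Type": "Boots", "Brand": "Ben Sherman"},
]


def refine(products, **selected):
    """Keep only the products that match every selected facet value."""
    return [
        p for p in products
        if all(p.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(products, facet):
    """Count the remaining products under each value of one facet."""
    return Counter(p[facet] for p in products)


matches = refine(PRODUCTS, Color="Black", Type="Boots")
print([p["name"] for p in matches])
print(facet_counts(matches, "Brand"))
```

Each click on a facet value adds one more keyword argument to `refine`, and `facet_counts` supplies the per-value counts that such interfaces typically show next to each remaining choice.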

How business taxonomy translates into front-end interface…for YOUR BUSINESS

Metadata Field: Topic
Taxonomy Values: Manufacturing, Benefits, Infrastructure, Quality, Safety, …

Metadata Field: Locale
Taxonomy Values: North America, Europe, Asia, South America, …

Metadata Field: Document Type
Taxonomy Values: Forms, Policies, Procedures, Reports, News, …

Metadata Field: Department
Taxonomy Values: HR, Sales and Marketing, Communications, Shipping, …

XYZ Corp. Intranet

Departments: HR, Finance, IT, More…
Document Types: Forms, Policies, Reports, News, More…
Topics: Benefits, Manufacturing, Quality, Safety, More…
Regions: N. America, Europe, Asia, S. America, More…

Exercise 2: High Level Taxonomy Identification

Your Org’s Site

Grouping A: Lorem ipso, Factorum delos, Istab uno, Librea johe, More…
Grouping B: Lorem ipso, Factorum delos, Istab uno, Librea johe, More…
Grouping C: Lorem ipso, Factorum delos, Istab uno, Librea johe, More…
Grouping D: Lorem ipso, Factorum delos, Istab uno, Librea johe, More…

Worksheet (seven blank lines for taxonomy values under each field):

Metadata Field A: ___________________
Taxonomy Values: _________________

Metadata Field B: ___________________
Taxonomy Values: _________________

Metadata Field C: ___________________
Taxonomy Values: _________________

Metadata Field D: ___________________
Taxonomy Values: _________________

Why use facets in a business taxonomy?

• Categorize in multiple, independent categories.
• Allow combinations of categories to narrow the choice of items.
• 4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4).
  – Easier to maintain
  – Easier to reuse existing lists
  – Can be easier to navigate, if software supports it
  – Accommodates different needs and preferences

Example: 42 values to maintain (10+6+11+15); 9,900 combinations (10x6x11x15).

Main Ingredients (10): Chocolate, Dairy, Fruits, Grains, Meat & Seafood, Nuts, Olives, Pasta, Spices & Seasonings, Vegetables

Meal Type (6): Breakfast, Brunch, Lunch, Supper, Dinner, Snack

Cuisines (11): African, American, Asian, Caribbean, Continental, Eclectic/Fusion/International, Jewish, Latin American, Mediterranean, Middle Eastern, Vegetarian

Cooking Methods (15): Advanced, Bake, Broil, Fry, Grill, Marinade, Microwave, No Cooking, Poach, Quick, Roast, Sauté, Slow Cooking, Steam, Stir-fry
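The arithmetic behind the facet argument can be checked directly. The short Python sketch below uses the facet sizes from the recipe example; the variable names are illustrative only.

```python
from math import prod

# Number of values in each facet of the recipe example.
FACET_SIZES = {
    "Main Ingredients": 10,
    "Meal Type": 6,
    "Cuisines": 11,
    "Cooking Methods": 15,
}

# Values an editor has to maintain: the facet sizes add up.
values_to_maintain = sum(FACET_SIZES.values())

# Distinct combinations a user can select: the facet sizes multiply.
combinations = prod(FACET_SIZES.values())

# Four facets of ten values each, compared with a single hierarchy.
uniform_values = 4 * 10
uniform_combinations = 10 ** 4

print(values_to_maintain, combinations)
print(uniform_values, uniform_combinations)
```

Maintenance effort grows with the sum of the facet sizes, while discriminatory power grows with their product, which is why a handful of short lists can stand in for one very large hierarchy.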

Justification for business taxonomy

• Easier information management
• Flexibility to respond to changing needs
• Foundation for findability and usability
• Typical ROI scenarios:
  – Greater sales on a public shopping site
  – Faster and more consistent responses by call center staff
  – Reduced regulatory and legal risk
  – Improved knowledge worker productivity
  – Improved overall staff productivity
• Don’t justify the taxonomy; justify the goal the taxonomy will help you achieve.

Effectiveness of applications of a business taxonomy

• For a product catalog, e.g., HomeDepot.com
  – Conversion rate increases: 20% increase. Petersen.
  – Lift in average order size: 20% increase. Petersen.

• For knowledge workers, e.g., call center support staff
  – Time saved: 36% faster than search. Chen & Dumais.

• For knowledge workers, e.g., analysts
  – Increase in productivity: 25% productivity increase from not re-creating content. Taylor.
  – Estimated productivity loss exceeded $10M per year, about $500 per employee per year. Nielsen.

How do taxonomies improve search?

• Input (Query) Side
  – “Search” using a small set of pre-defined values instead of trying to guess what word or words might have been used in the content.
  – Have synonyms mapped together so searches for “car”, “auto”, and “automobile” return the same things.

• Output (Results) Side
  – Organize search results into groups of related items.
  – Sorting and filtering
  – Refining search results
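The synonym mapping on the query side can be sketched as a lookup that normalizes each query word to a single preferred term before matching. The example below is a minimal Python illustration; the `SYNONYMS` table, the document ids, and the `search` helper are all hypothetical.

```python
# Hypothetical synonym ring: every variant maps to one preferred term.
SYNONYMS = {
    "car": "automobile",
    "auto": "automobile",
    "automobile": "automobile",
}

# Hypothetical content items, each tagged with preferred terms.
DOCUMENTS = {
    "doc-1": {"automobile"},
    "doc-2": {"insurance"},
    "doc-3": {"automobile", "insurance"},
}


def normalize(query):
    """Map a query word to its preferred term (or itself if unmapped)."""
    word = query.strip().lower()
    return SYNONYMS.get(word, word)


def search(query):
    """Return the ids of documents tagged with the query's preferred term."""
    term = normalize(query)
    return sorted(doc_id for doc_id, tags in DOCUMENTS.items() if term in tags)


print(search("car"), search("Auto"), search("automobile"))
```

Because content is tagged with the preferred term only, all three queries resolve to the same tag and return the same documents.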

Google search on “pcb” returns > 28M items.

Taxonomy could suggest “Polychlorinated Biphenyls” vs. “Printed Circuit Boards” or “Pakistan Cricket Board”.
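One way a taxonomy could offer such suggestions is a lookup from an ambiguous term to the concepts it may denote. The following Python sketch is an assumption about how that might look; the `SENSES` table and the `suggest` helper are hypothetical and do not describe how any particular search engine works.

```python
# Hypothetical lookup of ambiguous terms to the concepts they may denote.
SENSES = {
    "pcb": [
        "Polychlorinated Biphenyls",
        "Printed Circuit Boards",
        "Pakistan Cricket Board",
    ],
}


def suggest(query):
    """Return candidate concepts for a query, or an empty list if none are known."""
    return SENSES.get(query.strip().lower(), [])


for concept in suggest("PCB"):
    print(f"Did you mean: {concept}?")
```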

169,169 items

Categorized results: refine search by clicking on categories.

Taxonomy in action on the results side: www.CareerBuilder.com search on IT positions

By Category By Company By City By State

Intra-site navigation through metadata and taxonomy

Forest Park Master Plan

Construction Update: July 2003

Aviation Field: The fields are complete and are open to the public. Work is still underway on the paths. The Forest Park Softball League is seeking teams for fall play. Contact Roger Berry at 289-5307.

Boathouse: Project is complete and open. The City of St. Louis awarded the contract for the operator to

Home > Dept. of Parks, Recreation & Forestry > Division of Parks > News & Announcements

Leisure & Culture Transportation & Infrastructure Parks & Gardens Construction, Maint, Impro

Dynamically populated with query: SELECT thumbnail, URL WHERE Format = Video/* and Org = Parks. Select a random result if list is long.

Org = “Division of Parks” AND Type = “Online Forms”

Topic = AROUND(“Parks & Gardens”, “Construction, Maintenance & Improvements”)

SELECT Title, Description, URL WHERE Org = “Parks Division” AND Type = HomePage

Org = RELATED_ORG(Topic = “Parks & Gardens”, “Construction, Maintenance & Improvements”) AND Type = HomePage (Get Title and Description)

Breadcrumbs and left-nav are dynamic and based on directory in which content is created.

Main content tagged with:
ORG = Parks, Recreation & Forestry Division/Parks Department
TOPIC = Leisure & Culture/Parks & Gardens; Transport & Infrastructure/Construction, Maintenance & Improvements
CONTENT TYPE = News and Announcements
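The page annotations above amount to small queries over tagged content. A minimal Python sketch of that idea follows. It assumes a hypothetical in-memory inventory and a hypothetical `select` helper; the slide's own `AROUND` and `RELATED_ORG` functions are not defined in the source, so they are not reproduced here.

```python
# Hypothetical content inventory; field names follow the slide's tags.
# All titles, URLs, and the "Street Department" org are invented.
ITEMS = [
    {"Title": "Parks Division", "URL": "/parks", "Org": "Division of Parks",
     "Type": "HomePage", "Topic": "Parks & Gardens"},
    {"Title": "Picnic Permit", "URL": "/parks/permit", "Org": "Division of Parks",
     "Type": "Online Forms", "Topic": "Parks & Gardens"},
    {"Title": "Street Repairs", "URL": "/streets/repairs", "Org": "Street Department",
     "Type": "News and Announcements",
     "Topic": "Construction, Maintenance & Improvements"},
]


def select(items, fields, **where):
    """Return the requested fields of every item matching all conditions."""
    return [
        {f: item[f] for f in fields}
        for item in items
        if all(item.get(k) == v for k, v in where.items())
    ]


# Roughly: Org = "Division of Parks" AND Type = "Online Forms"
forms = select(ITEMS, ["Title", "URL"], Org="Division of Parks", Type="Online Forms")
print(forms)
```

Each dynamic region of the page would run one such query, so adding a correctly tagged form to the inventory makes it appear in the navigation without any change to the page template.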

Getting Started with Business Taxonomy Design: TAXONOMY DEVELOPMENT METHODOLOGY

Taxonomy development methods

Method: Description

• Automated analysis: Munge, blast, crunch text to analyze corpus.
• Workshopping: Guide group in activities to identify key concepts.
• Strawman: Prepare best guess, then bring it to the table to discuss.
• Adapt Existing Vocabularies: Customize internal terminology, industry standards, etc.
• Hybrid: Combination of some or all of these methods.

Key components to a successful taxonomy project

• Identify business case
• Planning & research
• Set up taxonomy team
• Define use cases
• Build high-level taxonomy
• Build out taxonomy detail
• Maintain & evolve taxonomy
• Validation testing & review
• Migrate content
• Interview stakeholders

Define business case: Business case examples

• Improve search and browsing to reduce the amount of time employees spend looking for information.

• Reduce business silos, foster collaboration and content reuse, and thereby reduce redundant work.

• Reduce the amount of time employees spend e-mailing basic information to each other.

• Build confidence that employees are getting the most up-to-date information, and increase employee loyalty by helping them stay “up to date” on the company.

Research & planning

• Identify target content to be focused on.
  – Provide a list of websites (and/or other target content file stores).
  – Prioritize this list for the purposes of the taxonomy project.

• Gather any query logs, usage statistics and usability surveys.

• Collect any existing documentation related to audience personas, content organization, metadata, keywords, and any other guidelines or standards.

• Identify and gather any internal classifications (org charts, sales regions, records retention schedule, code of conduct, product lists, etc.); and any relevant industry standard classifications (UNSPSC, NAICS, USPS, regulated activities, etc.).

Interview stakeholders

• Recruit people from business-critical functions such as marketing, public relations, product marketing, legal, etc.
  – Include people who have credibility, are early adopters, hold large amounts of content, and are “squeaky wheels” or “fans.”

• Conduct 10-20 interviews.

• The goal is for stakeholders to be the review board during the taxonomy development process, and beyond.

Define use cases: Intranet examples

• Content related to business areas or facilities
  – By geographic location, by type, by specific facility, by access restrictions, by audience, etc.

• Company-wide content
  – By business function, by topic, by access rights, etc.

Use Case: Create a safety policies and procedures website for facilities organized by State.

Use Scenario: Find all safety policies and procedures related to facilities located in Ohio.

Use Case: Locate any content that has policies and procedures around a particular topic.

Use Scenario: A company-wide policy regarding smoking has changed and references to outdated policies should be removed. Find official policies, as well as newsletters related to the smoking policy company-wide.

Define use cases: .com examples

• Web content managers
  – By content type, by topic, by location, etc.

• Public users seeking information by topic, by location, etc.

Use Case: Provide search for dividend schedules, earnings statements and stock splits; and the corresponding press releases for a specific time period.

Use Scenario: An investor who recently sold stock is preparing taxes and would like to do a concise search so that they can find historical information about their holdings.

Use Case: Find and recall all public-facing pages that describe a specific safety tip.

Use Scenario: Find and recall all public-facing pages that discuss gas safety.

Build high-level taxonomy

• Identify the types of actors
  – Audiences, roles & access rights

• Identify the types of content

• Identify the types of activities
  – Business processes, applications & uses

• Identify the types of named entities
  – Products, services, projects, organizations, locations, etc.

• Topics will be everything else.

A business taxonomy should have no more than 6-10 broad divisions.

Build high level taxonomy: Oracle.com top-level taxonomy

Top-level divisions: Audience, Products, Location, Organization, Content Type

Other labeled nodes in the diagram: Product Line, Application, Technology, Industry Solution, Person

Diagram note: “Is a” groups of Products

The Oracle.com taxonomy has no explicit topics, only actors, content types, and named entities.

Build high level taxonomy: SGMS top-level taxonomy

http://mysearch.internet.gov.sg/

Topics

The SGMS (Singapore Government Metadata Standard) Taxonomy is much more focused on Topics.

Build-out taxonomy detail

• Get agreement on the broad divisions first, then build out the detailed taxonomy.

• Use existing terminologies whenever they are available for business functions, locations, products & services, etc.

• Only build a vocabulary when no alternative authoritative source exists.

• Only create categories for which there already is content, or for which there is likely to be content soon.

• Keep the taxonomy broad and shallow.
  – Roll up more specific terms into broader categories.

A business taxonomy should have no more than 1,200 categories.
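The size guidelines from the workshop (no more than 6-10 broad divisions, no more than 1,200 categories, broad and shallow) lend themselves to a simple automated check. The Python sketch below assumes the taxonomy is held as nested dictionaries. That representation, the sample categories, and the helper names are illustrative assumptions.

```python
# Hypothetical taxonomy as nested dicts: each key is a category,
# each value holds its narrower categories.
TAXONOMY = {
    "Audience": {"Internal": {}, "External": {}},
    "Content Type": {"Forms": {}, "Policies": {}, "Reports": {}},
    "Location": {"North America": {}, "Europe": {}, "Asia": {}},
}


def count_categories(tree):
    """Count every category in the tree, at all levels."""
    return sum(1 + count_categories(children) for children in tree.values())


def depth(tree):
    """Return the number of levels in the tree (0 for an empty tree)."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())


def check(tree, max_divisions=10, max_categories=1200):
    """Return warnings for the size guidelines stated in the workshop."""
    warnings = []
    if len(tree) > max_divisions:
        warnings.append(f"{len(tree)} broad divisions (limit {max_divisions})")
    total = count_categories(tree)
    if total > max_categories:
        warnings.append(f"{total} categories (limit {max_categories})")
    return warnings


print(count_categories(TAXONOMY), depth(TAXONOMY), check(TAXONOMY))
```

The workshop gives no numeric depth limit, so `depth` is only reported here rather than enforced.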

Build out taxonomy detail: NASA Taxonomy

http://nasataxonomy.jpl.nasa.gov/

Validation testing and review

Method: Walk-thru
Process: Show & explain
Who: Taxonomist, SME, Team
Requires: Rough taxonomy
Validation: Approach; appropriateness to task

Method: Walk-thru
Process: Check conformance to editorial rules
Who: Taxonomist
Requires: Draft taxonomy; editorial rules
Validation: Consistent look and feel

Method: Usability Testing
Process: Contextual analysis (card sorting, scenario testing, etc.)
Who: Users
Requires: Rough taxonomy; tasks & answers
Validation: Tasks are completed successfully; time to complete task is reduced

Method: User Satisfaction
Process: Survey
Who: Users
Requires: Rough taxonomy; UI mockup; search prototype
Validation: Reaction to taxonomy; reaction to new interface; reaction to search results

Method: Tagging Samples
Process: Tag sample content with taxonomy
Who: Taxonomist, Team, Indexers
Requires: Sample content; rough taxonomy (or better)
Validation: Content ‘fit’; fills out content inventory; training materials for people & algorithms; basis for quantitative methods

Migrate content

• Prioritize content to be tagged.
  – Identify and dispose of ROT (redundant, outdated, or trivial content).

• Use business rules to automate content tagging.
  – Tag landing pages for major sections.
  – Lower-level pages inherit tags from top-level pages.

• Use workflow to enforce tagging.
  – Require entry of simple tagging in order to submit an item into the content management system.

• Use templates to guide user tagging.
  – Pre-populate template fields whenever possible.
  – Use context-sensitive pick lists.
  – Call out to taxonomy service for more complex controlled vocabularies.

• Provide tagging incentives.
  – Almost instantaneous feedback.
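The business rule that lower-level pages inherit tags from tagged landing pages can be sketched as a walk down the URL path. The Python example below is a minimal illustration. The paths, the tag values, and the `inherited_tags` helper are hypothetical, and a real content management system would apply such rules through its own configuration.

```python
# Hypothetical tags assigned by hand to landing pages, keyed by URL path.
LANDING_PAGE_TAGS = {
    "/hr": {"Department": "HR"},
    "/hr/benefits": {"Topic": "Benefits"},
}


def inherited_tags(path):
    """Merge tags from every ancestor landing page, nearest page winning."""
    tags = {}
    parts = path.strip("/").split("/")
    # Walk from the top-level section down to the page itself, so that
    # tags set on deeper landing pages override those set higher up.
    for i in range(1, len(parts) + 1):
        prefix = "/" + "/".join(parts[:i])
        tags.update(LANDING_PAGE_TAGS.get(prefix, {}))
    return tags


print(inherited_tags("/hr/benefits/retirement/401k.html"))
```

Only the two landing pages are tagged by hand; every page beneath them picks up Department and Topic automatically, which keeps manual tagging effort proportional to the number of sections rather than the number of pages.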

Maintain and evolve taxonomy

• Taxonomy building is iterative.
  – A taxonomy should be improved over time and maintained.

• Designate a taxonomy editor as the single point-of-contact for taxonomy changes.

• Log change requests and notify requestors.

• Prioritize taxonomy changes, e.g.
  – Improves information access, use and reuse.
  – Requires creating new data or metadata.
  – Affects program operations or has a financial impact.
  – Enables communication campaigns or organizational strategy.
  – Positive impact on users.

Exercise: BUILD A HIGH LEVEL TAXONOMY

Exercise: Promo website taxonomy

What is Dunder Mifflin Promo?

• The new online division that markets promotional products. DM Promo was designed to reinvent the business of selling promotional paper products.

• The DMP website will provide:
  – Product catalog browse by category, brand, cost, popularity, feature, etc.
  – Product specs, series, schedule, imprinting, & colors.
  – Various types of content such as product ideas, articles, testimonials, etc.
  – Account information, shipping & returns.

DMP Products include:

• Logo binders & filing supplies
• Logo calendars & planners
• Logo paper, cardstock & pads
• Logo pens & pencils
• Logo promotional products (badges & lanyards, mugs, stress balls, tote bags, mp3 players)
• Logo trophies & novelties (custom money, banners & signs, origami, party hats, paper boats)
• Logo wear (shirts, t-shirts, sweatshirts, fleece, bathing suits, hats, bandanas, team uniforms, socks)

Exercise 3: Identify topics for Promo website taxonomy

1. Form groups
   – No more than 10 in a group.
   – Appoint recorder & reporter.
2. Brainstorm topics (10 min)
   – Write one topic on each Post-it.
3. Sort Post-its into groups (5 min)
4. Present taxonomies (10 min)
5. Compare taxonomies (5 min)

Questions?

Joseph Busch, 415-377-7912, [email protected]

Ron Daniel, 925-691-8374, [email protected]