4 Myths about Taxonomies. Taxonomy Strategies LLC. ITIMG – Industrial Technical Information Managers Group Meeting, Newport Beach, CA. April 11, 2005. Copyright 2005 Taxonomy Strategies LLC. All rights reserved.




TAXONOMY STRATEGIES LLC: The business of organized information

Who I am

Over 25 years in the business of organized information
  Founder & Principal, Taxonomy Strategies
  Director, Solutions Architecture, Interwoven
  VP, Infoware, Metacode Technologies
  Program Manager, Getty Foundation
  Manager, Pricewaterhouse
  Assistant Director for Technical Services, Hampshire College
  Chief, Technical Services, Paul Weiss Rifkind Wharton & Garrison

Metadata & taxonomies community leadership
  President, American Society for Information Science & Technology
  Trustee, Dublin Core Metadata Initiative
  Co-Founder, Networked Knowledge Organization Systems/Services
  Adviser, National Research Council Computer Science and Telecommunications Board
  Reviewer, National Science Foundation Division of Information and Intelligent Systems


Recent & current projects

Government
  Commodity Futures Trading Commission
  Defense Intelligence Agency
  ERIC
  Federal Aviation Administration
  Federal Reserve Bank of Atlanta
  Forest Service
  GSA Office of Citizen Services (www.firstgov.gov)
  Head Start
  Infocomm Development Authority of Singapore
  NASA (nasataxonomy.jpl.nasa.gov)
  Small Business Administration
  Social Security Administration
  USDA Economic Research Service
  USDA e-Government Program (www.usda.gov)

Commercial
  Allstate Insurance
  Blue Shield of California
  Debevoise & Plimpton
  Halliburton
  Hewlett Packard
  Motorola
  PeopleSoft
  Pricewaterhouse Coopers
  Siderean Software
  Sprint
  Time Inc.

Commercial subcontracts
  Agency.com – Top financial services
  Critical Mass – Fortune 50 retailer
  Deloitte Consulting – Big credit card
  Gistics/OTB – Direct selling giant

NGOs
  CEN
  IDEAlliance
  IMF
  OCLC


What I do

Organize Stuff


Agenda

Myth #1: The Web has changed everything
Myth #2: Taxonomies are monolithic hierarchies
Myth #3: Literary warrant
Myth #4: Knowledge workers


Finding information should not be about “Feeling Lucky”


Something is wrong with this picture

“…search is so fundamental that people should have been focusing on it all along. The reality of the situation is that there was a great assumption that search was actually working just fine.”

— Harley Manning, Research Director


Why doesn’t search work?

For search engines to work, they need better stuff to work on!

Otherwise it’s garbage in … and garbage out.

Correctly matching content with questions (regardless of the technology) requires better content to work on.


How to fix search … add metadata to search on

“Adding metadata to unstructured content allows it to be managed like structured content. Applications that use structured content work better.”

“Enriching content with structured metadata is critical for supporting search and personalized content delivery.”

“Content that has been adequately tagged with metadata can be leveraged in usage tracking, personalization and improved searching.”


What is metadata? Another view of Dublin Core

Asset metadata – Who, Where & When: Title, Creator, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language

Subject metadata – What & Why: Subject, Description, Coverage

Relational metadata – Links between and to: Relation

Use metadata – How can it be used: Rights & Permissions

[Diagram labels: Functionality; Difficult to Generate]

Better resource description = Better navigation & discovery


Dublin Core is a little more complicated

Elements: 1. Identifier 2. Title 3. Creator 4. Contributor 5. Publisher 6. Subject 7. Description 8. Coverage 9. Format 10. Type 11. Date 12. Relation 13. Source 14. Rights 15. Language

Refinements: Abstract, Access rights, Alternative, Audience, Available, Bibliographic citation, Conforms to, Created, Date accepted, Date copyrighted, Date submitted, Education level, Extent, Has format, Has part, Has version, Is format of, Is part of, Is referenced by, Is replaced by, Is required by, Issued, Is version of, License, Mediator, Medium, Modified, Provenance, References, Replaces, Requires, Rights holder, Spatial, Table of contents, Temporal, Valid

Encodings: Box, DCMIType, DDC, IMT, ISO3166, ISO639-2, LCC, LCSH, MESH, Period, Point, RFC1766, RFC3066, TGN, UDC, URI, W3CDTF

Types: Collection, Dataset, Event, Image, Interactive Resource, Moving Image, Physical Object, Service, Software, Sound, Still Image, Text


Metadata is a data model – A scheme for e-Forms

| Element | Namespace | Source | Purpose |
|---|---|---|---|
| Identifier | dc:identifier | System supplied | Basic accountability |
| Registrar | dc:creator | LDAP validated | Accountability & maintenance |
| Form Name | dc:title | User | Text search, results display |
| Form Number | dcterms:alternative | User | Text search, results display |
| Revision Date | dcterms:modified | User | Filter or rank search results |
| Agency | dc:publisher | FIPS 95-2 | Key index to retrieve & aggregate assets |
| Form Type | dc:type | Form Type vocabulary | Browse or group search results |
| Industry Code | us:naics | NAICS codes | Browse or group search results |
| Jurisdiction | dc:coverage | FIPS 5-2 | Browse or group search results |
| Purpose | us:feabrm | FEA Business Ref Model | Browse or group search results |
| ... | … | ... | ... |

Subject
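One way to read the e-Forms scheme is as a lookup from local element names to namespace-qualified properties. The sketch below is only an illustration under that assumption: the element names and namespaces come from the slide, while `EFORM_SCHEMA`, `to_qualified`, and the sample record are hypothetical.

```python
# Hypothetical encoding of the e-Forms scheme shown on the slide.
# Element names and namespaces are from the slide; the rest is illustrative.
EFORM_SCHEMA = {
    "Identifier": "dc:identifier",
    "Registrar": "dc:creator",
    "Form Name": "dc:title",
    "Form Number": "dcterms:alternative",
    "Revision Date": "dcterms:modified",
    "Agency": "dc:publisher",
    "Form Type": "dc:type",
    "Industry Code": "us:naics",
    "Jurisdiction": "dc:coverage",
    "Purpose": "us:feabrm",
}


def to_qualified(record):
    """Map local element names to namespace-qualified properties.

    Returns (mapped, unknown): mapped holds the qualified metadata,
    unknown lists local names that are not part of the scheme.
    """
    mapped, unknown = {}, []
    for name, value in record.items():
        prop = EFORM_SCHEMA.get(name)
        if prop is None:
            unknown.append(name)
        else:
            mapped[prop] = value
    return mapped, unknown


# Sample record; the values are invented for illustration only.
sample = {"Form Name": "Example form", "Form Number": "X-1", "Colour": "blue"}
mapped, unknown = to_qualified(sample)
```

Keeping the mapping in one table means a local label such as "Form Number" can change without touching the qualified property that search and aggregation rely on.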


How is Dublin Core used in corporate environments?

[Bar chart. Value labels as extracted: 57%, 43%, 43%, 29%. Category labels: De facto, Simple, Access enabler, Compliance. Vertical axis: 0% to 60%.]

Base: 20 corporate information managers

Source: CEN/ISSS Workshop on Dublin Core – Guidance information for the deployment of Dublin Core metadata in Corporate Environments


Dublin Core framework for corporate use

Not just 15 elements
A framework to enable cross-resource exploration and use
Dublin Core is a framework for “integration metadata” at BellSouth


Agenda

Myth #1: The Web has changed everything
Myth #2: Taxonomies are monolithic hierarchies
Myth #3: Literary warrant
Myth #4: Knowledge workers


What is a taxonomy? Systematics view

Hierarchical classification of things into a tree structure

Linnaeus …
Levels: Kingdom, Phylum, Class, Order, Family, Genus, Species
Animalia
  Chordata
    Mammalia
      Carnivora
        Canidae
          Canis
            C. familiari

UNSPSC …
Levels: Segment, Family, Class, Commodity
44-Office Equipment and Accessories and Supplies
  .12-Office Supplies
    .17-Writing Instruments
      .05-Mechanical pencils
      .06-Wooden pencils
      .07-Colored pencils
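The UNSPSC example lends itself to a small sketch of how a hierarchical code carries its own ancestry. It assumes the four two-digit levels on the slide (44, .12, .17, .05) concatenate into one eight-digit string; the function names are hypothetical.

```python
def split_unspsc(code):
    """Split an 8-digit code into its four two-digit levels."""
    if len(code) != 8 or not code.isdigit():
        raise ValueError("expected an 8-digit code, got %r" % code)
    levels = ("segment", "family", "class", "commodity")
    return {level: code[i * 2:i * 2 + 2] for i, level in enumerate(levels)}


def ancestors(code):
    """Return the code prefixes for segment, family, class, and commodity."""
    return [code[:n] for n in (2, 4, 6, 8)]


# 44 / .12 / .17 / .05 from the slide: mechanical pencils.
pencil = "44121705"
parts = split_unspsc(pencil)
path = ancestors(pencil)
```

Because every prefix identifies an ancestor, rolling results up to "Writing Instruments" or "Office Supplies" is a string-prefix match rather than a tree walk.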


Taxonomic metadata – e-Forms example

Metadata elements and the taxonomy behind each:

Agency: 0001 Legislative; 1000 Judicial; 1100 Executive Office of Pres; 0003 Exec Depts (1200 Agriculture, 1300 Commerce, 9700 Defense, 9100 Education, 8900 Energy, 7500 HHS, 7000 DHS, 8600 HUD, 1400 Interior, 1500 Justice, 1600 Labor, 1900 State, 6900 Transport, 2000 Treasury, 3600 Veterans); Ind Agencies; Intl Orgs

Form Type: Application, Approval, Claim, Information request, Information submission, Instructions, Legal filing, Payment, Procurement, Renewal, Reservation, Service request, Test, Other input, Other transaction

Keyword Topic: Agriculture & food, Commerce, Communications, Education, Energy, Env pro, Foreign rels, Govt, Health & safety, Housing & comm dev, Labor, Law, Named grps, National def, Nat resources, Recreation, Sci & tech, Social pgms, Transport

Audience: All, General, Citizen, Business, Govt, Employee, Native American, Non-resident, Tourist, Special group

Industry Impact: 00 Generic, 11 Agriculture, 21 Mining, 22 Utilities, 23 Construct, 31-33 Manuf, 42 Wholesale, 44-45 Retail, 48-49 Trans, 51 Info, 52 Finance, 54 Profession, 55 Mgmt, 56 Support, 61 Education, 62 Health Care, 71 Arts, 72 Hospitality, 81 Other Services, 92 Public Admin

Jurisdiction: Federal, State +, Local +, Other +

BRM Impact: Citizen Srvcs, Social Srvs, Defense, Disasters, Econ Dev, Education, Energy, Env Mgmt, Law Enf, Judicial, Correctional, Health, Security, Income Sec, Intelligence, Intl Affairs, Nat Resour, Transport, Workforce, Science, Delivery, Support, Management


The power of taxonomy facets

4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
Easier to maintain
Can be easier to navigate
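The arithmetic behind this claim can be checked with a short sketch. The facet names and values below are placeholders, not taken from the deck; only the counts (4 facets, 10 values each) follow the slide.

```python
from itertools import product

# Four hypothetical facets with ten values each (names are placeholders).
facets = {
    name: ["%s-%d" % (name, i) for i in range(10)]
    for name in ("agency", "form_type", "topic", "audience")
}

# Terms someone has to maintain: 10 + 10 + 10 + 10.
nodes_to_maintain = sum(len(values) for values in facets.values())

# Distinct combinations a tagged item can fall into: 10 * 10 * 10 * 10.
combinations = list(product(*facets.values()))
distinct_classes = len(combinations)
```

Under these assumptions, 40 maintained terms distinguish as many classes as the leaf level of a single 10,000-node hierarchy.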


Taxonomic metadata example: Form SS-4. Employer Identification Number (EIN)

| Facet | Values |
|---|---|
| Agency | IRS |
| Content Type | Information Submission |
| Industry Impact | Generic |
| Jurisdiction | Federal |
| Programs & Services | Support Delivery of Services/General Government/Taxation Management |
| Keyword Topic | Commerce/Employment taxes |
| Audience | Business |
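A record tagged this way can be retrieved by any combination of facets. The following is a minimal sketch, assuming slash-separated values denote a path within a facet; the SS-4 values are from the slide, while the second record and the `matches` and `search` helpers are hypothetical.

```python
# Facet values for Form SS-4 are from the slide; the second record is invented.
FORMS = [
    {
        "id": "SS-4",
        "Agency": "IRS",
        "Content Type": "Information Submission",
        "Industry Impact": "Generic",
        "Jurisdiction": "Federal",
        "Programs & Services": "Support Delivery of Services/General Government/Taxation Management",
        "Keyword Topic": "Commerce/Employment taxes",
        "Audience": "Business",
    },
    {
        "id": "EXAMPLE-1",  # hypothetical form, for contrast only
        "Agency": "Example Agency",
        "Content Type": "Application",
        "Industry Impact": "Generic",
        "Jurisdiction": "Federal",
        "Programs & Services": "Example Area/Example Program",
        "Keyword Topic": "Education/Example topic",
        "Audience": "Citizen",
    },
]


def matches(record, facet, wanted):
    """True if the facet value equals `wanted` or sits below it in a '/' path."""
    value = record.get(facet, "")
    return value == wanted or value.startswith(wanted + "/")


def search(records, criteria):
    """Return ids of records that satisfy every facet criterion."""
    return [
        r["id"]
        for r in records
        if all(matches(r, facet, wanted) for facet, wanted in criteria.items())
    ]


federal = search(FORMS, {"Jurisdiction": "Federal"})
business_commerce = search(FORMS, {"Audience": "Business", "Keyword Topic": "Commerce"})
```

Each added criterion narrows the result set independently, which is the behavior faceted navigation depends on.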


Methods used to create & maintain metadata

[Bar chart. Value labels as extracted: 71%, 57%, 43%, 43%. Category labels: Forms, Distributed production, Centralized production, Not automated. Vertical axis: 0% to 80%.]

Base: 20 corporate information managers

Source: CEN/ISSS Workshop on Dublin Core – Guidance information for the deployment of Dublin Core metadata in Corporate Environments


Agenda

Myth #1: The Web has changed everything
Myth #2: Taxonomies are monolithic hierarchies
Myth #3: Literary warrant
Myth #4: Knowledge workers


Literary warrant

The “literature” on which a controlled vocabulary is based.

The “official names” of people, organizations, events, places, and things are taken from published sources.

| Type of Entity | Authoritative Sources |
|---|---|
| Author names | Title page |
| Places | US Board on Geographic Names, National Geospatial-Intelligence Agency, ISO 3166, UN Statistics Division |
| Subjects | Existing literature |


Why vocabulary differences are necessary

Terminology is needed before “literature” establishes warrant.

Categories are needed for internal purposes such as sorting, analysis, and other ad hoc groupings.

Organizations, places, and other entities change over time.


Folksonomies: Emergent topics


Some vocabulary differences are necessary: Grouping

| ISO 3166-1 | UN Code | Internal Code | Name | Official Name |
|---|---|---|---|---|
| AUT | 40 | 122 | Austria | Republic of Austria |
| BEL | 56 | 124 | Belgium | Kingdom of Belgium |
| DNK | 208 | 128 | Denmark | Kingdom of Denmark |
| FRA | 250 | 132 | France | French Republic |
| DEU | 276 | 134 | Germany | Federal Republic of Germany |
| SMR | 674 | 135 | San Marino | Republic of San Marino |
| ITA | 380 | 136 | Italy | Italian Republic |
| LUX | 442 | 137 | Luxembourg | Grand Duchy of Luxembourg |
| … | … | … | … | … |
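A crosswalk like this can be held as a small lookup so that content tagged with the standard code can still be sorted or grouped by the internal one. The rows below are copied from the slide; the helper names are hypothetical, and reading the internal code as a grouping order is an assumption.

```python
# Rows copied from the slide: (ISO 3166-1 alpha-3, UN code, internal code, name).
ROWS = [
    ("AUT", 40, 122, "Austria"),
    ("BEL", 56, 124, "Belgium"),
    ("DNK", 208, 128, "Denmark"),
    ("FRA", 250, 132, "France"),
    ("DEU", 276, 134, "Germany"),
    ("SMR", 674, 135, "San Marino"),
    ("ITA", 380, 136, "Italy"),
    ("LUX", 442, 137, "Luxembourg"),
]

BY_ISO = {iso: (un, internal, name) for iso, un, internal, name in ROWS}


def internal_code(iso):
    """Look up the internal code for an ISO 3166-1 alpha-3 code."""
    return BY_ISO[iso][1]


def sort_by_internal(iso_codes):
    """Order ISO codes by internal code rather than alphabetically."""
    return sorted(iso_codes, key=internal_code)


ordered = sort_by_internal(["SMR", "ITA", "AUT", "LUX"])
```

In this ordering San Marino lands next to Italy, which neither the alphabetic ISO codes nor the UN numbers would produce.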


Some vocabulary differences are necessary: Entities change over time

| Name | Part of | Effective Dates | Entity Type |
|---|---|---|---|
| Serbia and Montenegro | Europe | 2003- | Independent state |
| Serbia and Montenegro | Federal Republic of Yugoslavia | 1991-2003 | Republic |
| Yugoslavia | Europe | 1929-1991 | Independent state |
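Entries with effective dates can be resolved for a given year with a simple filter. This sketch copies the three rows from the slide and assumes half-open intervals (the end year belongs to the successor entry), which the slide does not specify; `Entry` and `in_effect` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    name: str
    part_of: str
    start: int
    end: Optional[int]  # None means still in effect
    entity_type: str


# Rows copied from the slide.
ENTRIES = [
    Entry("Serbia and Montenegro", "Europe", 2003, None, "Independent state"),
    Entry("Serbia and Montenegro", "Federal Republic of Yugoslavia", 1991, 2003, "Republic"),
    Entry("Yugoslavia", "Europe", 1929, 1991, "Independent state"),
]


def in_effect(year):
    """Return entries valid in `year`, treating each range as [start, end)."""
    return [
        e for e in ENTRIES
        if e.start <= year and (e.end is None or year < e.end)
    ]


names_1950 = [e.name for e in in_effect(1950)]
status_1995 = in_effect(1995)[0].entity_type
parent_2004 = in_effect(2004)[0].part_of
```

Keeping superseded entries with their dates lets older content stay findable under the name that was valid when it was written.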


Sources for 7 common taxonomies

| Taxonomy | Definition | Potential Sources |
|---|---|---|
| Organization | Organizational structure. | FIPS 95-2, U.S. Government Manual, Your organizational structure, etc. |
| Content Type | Structured list of the various types of content being managed or used. | DC Types, AGLS Document Type, AAT Information Forms, Records management policy, etc. |
| Industry | Broad market categories such as lines of business, life events, or industry codes. | FIPS 66, SIC, NAICS, etc. |
| Location | Place of operations or constituencies. | FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, etc. |
| Function | Functions and processes performed to accomplish mission and goals. | FEA Business Reference Model, Enterprise Ontology, AAT Functions, etc. |
| Topic | Business topics relevant to your mission and goals. | Federal Register Thesaurus, NAL Agricultural Thesaurus, LCSH, etc. |
| Audience | Subset of constituents to whom a piece of content is directed or intended to be used. | GEM, ERIC Thesaurus, IEEE LOM, etc. |
| Products and Services | Names of products/programs & services. | ERP system, Your products and services, etc. |


How is Dublin Core extended?

[Bar chart. Value labels as extracted: 100%, 86%, 57%, 57%. Category labels: Doc Types, Products & Services, Roles, Inconsistent Encoding. Vertical axis: 0% to 120%.]

Base: 20 corporate information managers

Source: CEN/ISSS Workshop on Dublin Core – Guidance information for the deployment of Dublin Core metadata in Corporate Environments


Business process document types: Local document type lists are commonly invented

Oil & gas services company document types

analysis, appraisals, assessments, forecasts, predictions

agendas, plans, designs, schedules, workflow

applications, proposals, requests, requirements

permits, consents, approvals, rejections, certificates

work orders, correspondence

auditing, compliance, testing, inspections, operations reports

lessons learned, after-action reviews, meeting minutes, FAQs

policies, procedures, training manuals, standards, best practices

research notes, journal articles

newsletters, bulletins, press releases

ads, brochures, data sheets, technical notes, case studies, price lists

checklists, templates, forms, logos, branding

software, database forms


What controlled vocabularies are being used?

[Bar chart. Value labels as extracted: 57%, 29%, 14%, 43%. Category labels as extracted: ERP, LDAP, Business Process, ISO 3166, Language Codes. Vertical axis: 0% to 60%.]

Base: 20 corporate information managers

Source: CEN/ISSS Workshop on Dublin Core – Guidance information for the deployment of Dublin Core metadata in Corporate Environments


Agenda

Myth #1: The Web has changed everything
Myth #2: Taxonomies are monolithic hierarchies
Myth #3: Literary warrant
Myth #4: Knowledge workers


[Pie chart segments: Searching, Creating, Communicating]

Knowledge workers spend up to 2.5 hours each day looking for information …

… But find what they are looking for only 40% of the time.

— Kit Sims Taylor


[Pie chart segments: Creating new content, Recreating existing content, Searching, Communicating. Value labels as extracted: 26%, 9%.]

Knowledge workers spend more time re-creating existing content than creating new content

— Kit Sims Taylor


High cost of not finding information

“The amount of time wasted in futile searching for vital information is enormous, leading to staggering costs …”

— Sue Feldman,

High cost of poor classification

Poor classification costs a 10,000-user organization $10M each year, about $1,000 per employee.

— Jakob Nielsen, useit.com


Opportunities and challenges

80% of enterprise data is unstructured.
Outputs from back office systems are documents: queries & reports.
Avoiding unnecessary recreation of content.
Enabling decision-making transparency.
Promulgating policies & guidelines.
Managing intellectual property.
Supporting products & services throughout their life cycle: development, marketing, sales & support.


Productivity, loyalty, and revenue have provided the ROI


Intranet has provided the best ROI

Intranet

Web/online customer sales

Web dev infrastructure

Middleware to link Web to ERP

e-billing/payment systems

Web/online business sales

Wireless Web access

Extranet/supply chain

e-marketplace/ portal

None


Taxonomy Strategies LLC

April 11, 2005 Copyright 2005 Taxonomy Strategies LLC. All rights reserved.

Joseph A. Busch
+ 415-377-7912
[email protected]
http://ww.taxonomystrategies.com