
Enterprise Content Management (ECM) - AIIM … Notes/ECMP-6-Taxonomy.pdf


Taxonomy

Enterprise Content Management (ECM)

AIIM ECM Certificate programme

ECM Strategy

ECM Practitioner

ECM Specialist

Case Study

© AIIM | All rights reserved


ECM Practitioner Course Outline

Foundations | Tools & Instruments | Futures

1. Introduction
2. Technologies & Functionality
3. Information Architecture
4. Create & Capture
5. Metadata
6. Taxonomy
7. Security & Control
8. Process & Automation
9. Findability
10. Delivery & Presentation
11. Trends & Directions

Agenda

• Defining taxonomies and classification
• Subject-based classification
  – Taxonomies
  – Folksonomies
  – Ontologies
  – Thesaurus and Semantic networks
• Business case for classification
• Standards and guidelines
• Classification challenges


Defining taxonomy (1)

Taxonomy is the science of classifying information

A taxonomy is a law for classifying information

Taxonomies are nearly ubiquitous, but poorly understood

Source: Dictionary.com

Defining taxonomy (2)

“In recent years, the business world has fallen in love with the term ‘taxonomies’. We use it specifically to refer to a hierarchical arrangement of categories within the user interface of a website or intranet.”

Source: Information Architecture for the World Wide Web (Louis Rosenfeld and Peter Morville, 2002)

AIIM website

[Screenshot of the AIIM website with callouts numbered 1 to 4]

Understanding taxonomies

• A taxonomy is a classification scheme
  – Such as the way that an individual classifies the content of their e-mail inbox, a personal CD collection, or the contents of an iPod
• A taxonomy is a knowledge map
  – Reflects how its owner conceives a given body of content (a knowledge domain), for purposes of browsing, navigating, discovering, and sharing that information
• A taxonomy is semantic
  – Indicating the relationships between concepts, such as the relationship between a car and a steering wheel, in that the steering wheel is “part of” a car

Source: Organising Knowledge (Patrick Lambe, 2007)

Category perspectives

Business function

Geo-political

Company focus vs. industry focus

Product or service

Business issues, conditions, events

Type/Source of content

Representations of taxonomies (1)

Lists

Trees

Hierarchies

Polyhierarchies

Matrices

Facets

System Maps

Source: Organising Knowledge (Patrick Lambe, 2007)

Representations of taxonomies (2)

Lists
• Simple collection of related things. The relationship is defined by the purpose of the list.
• Good when the domain is simple and the amount of content is small.
• Basic building blocks of all other taxonomical representations
• Examples: Country codes, types of diseases

Source: Organising Knowledge (Patrick Lambe, 2007)
Source: Wikipedia

Representations of taxonomies (3)

Trees
• Represents a transition from general to more specific relationships, or from whole to part.
• Good when a list gets to be too long, and “naturally” breaks into subcategories.
• Examples: Yellow pages (phone directories)

Source: Organising Knowledge (Patrick Lambe, 2007)
Source: CoreFiling.com

Representations of taxonomies (4)

Hierarchies
• A specific tree structure that has inclusiveness, consistency, and maintains the same “type” of relationship at each level. The “child” inherits all of the characteristics of the “parent”, and each child can only belong in one place in the taxonomy.
• Works best with mature, formal, logical schemes
• Examples: Military rank, Biological, Family Genealogy

Source: Organising Knowledge (Patrick Lambe, 2007)
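The two rules stated for hierarchies (one place per child, and inheritance of the parent's characteristics) can be sketched in a few lines of Python. This is an illustration only, not part of the course material: the `Category` class, its method names, and the sample traits are hypothetical.

```python
class Category:
    """A node in a strict hierarchy: exactly one parent, inherited traits."""

    def __init__(self, name, traits=None):
        self.name = name
        self.traits = dict(traits or {})
        self.parent = None
        self.children = []

    def add_child(self, child):
        # Each child may belong in only one place in the taxonomy.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent: {child.parent.name}")
        child.parent = self
        self.children.append(child)
        return child

    def inherited_traits(self):
        # Start from the parent's traits, then layer this node's own on top.
        base = self.parent.inherited_traits() if self.parent else {}
        return {**base, **self.traits}

    def path(self):
        # Names from the root down to this node.
        prefix = self.parent.path() if self.parent else []
        return prefix + [self.name]


# Illustrative data only (assumed for this sketch).
animal = Category("Animal", {"alive": True})
bird = animal.add_child(Category("Bird", {"has_feathers": True}))
crow = bird.add_child(Category("Crow", {"colour": "black"}))

print(" > ".join(crow.path()))
print(crow.inherited_traits())
```

Trying to attach `crow` under a second parent raises `ValueError`, which is the point at which a strict hierarchy stops being sufficient and a polyhierarchy is needed.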

Representations of taxonomies (5)

Polyhierarchies
• Used when an item belongs in more than one place in the real world, and multiple organising principles are required.
• Provides “virtual linking” between hierarchies.
• Example: a single collection of content concerning diseases can be organised/taxonomised via affected body part and causes

Source: Organising Knowledge (Patrick Lambe, 2007)
Source: Rosenfeld, Morville (2006)
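The disease example can be sketched by letting one item carry several category paths. This is a minimal sketch under assumptions: the function names, the category labels "By body part" and "By cause", and the sample diseases are illustrative choices, not taken from the course.

```python
from collections import defaultdict

# item -> set of category paths; an item may appear under several hierarchies.
placements = defaultdict(set)


def place(item, path):
    """Record that `item` is filed under `path` (a tuple of category names)."""
    placements[item].add(path)


def items_under(path):
    """Return every item filed at `path` or anywhere below it."""
    return sorted(
        item
        for item, paths in placements.items()
        if any(p[: len(path)] == path for p in paths)
    )


# Illustrative content only.
place("Viral hepatitis", ("By body part", "Liver"))
place("Viral hepatitis", ("By cause", "Infection"))
place("Cirrhosis", ("By body part", "Liver"))
place("Influenza", ("By cause", "Infection"))

print(items_under(("By body part", "Liver")))
print(items_under(("By cause",)))
```

The content itself is stored once; only the placements multiply, which is one way to read the “virtual linking” between hierarchies.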

Representations of taxonomies (6)

Matrices
• Provides a 2 or 3-dimensional cross linking of taxonomies, and an ability to provide differing views into the same body of content.
• Example: The same content could be located based on project manager, project initiation, and/or affected standards

Source: Organising Knowledge (Patrick Lambe, 2007)

Representations of taxonomies (7)

Facets
• A multi-dimensional taxonomy comprised of multiple tags, each tag representing an individual taxonomy, thus the content is categorised in multiple ways, within a single interface.
• Example: selecting wines based on characteristics such as type, price, varietals, regions, and appellations.

Source: Organising Knowledge (Patrick Lambe, 2007)
Source: wine.com

Representations of taxonomies (8)

System maps
• Visual representations of a domain of knowledge
• Labels represent taxonomy categories
• Example: A collection of medical content relating to the human nervous system is accessible via a diagram of the human body. Each component of that system is illustrated in context, and labelled appropriately.

Source: Organising Knowledge (Patrick Lambe, 2007)
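The wine example given for facets can be sketched as filtering over independently tagged attributes. The catalogue entries, field names, and helper functions below are hypothetical, chosen only to show facets combining in a single view.

```python
# Hypothetical catalogue: each wine carries one value per facet.
wines = [
    {"name": "Wine A", "type": "red", "region": "Bordeaux", "varietal": "Merlot", "price": 18},
    {"name": "Wine B", "type": "white", "region": "Marlborough", "varietal": "Sauvignon Blanc", "price": 14},
    {"name": "Wine C", "type": "red", "region": "Napa Valley", "varietal": "Merlot", "price": 35},
]


def filter_by_facets(items, **selected):
    """Keep items matching every selected facet value (facets combine with AND)."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts


print([w["name"] for w in filter_by_facets(wines, type="red", varietal="Merlot")])
print(facet_counts(wines, "type"))
```

Each facet is a small taxonomy in its own right; the interface only has to intersect the selections and show the remaining counts.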

Defining classification

Classification: “The systematic identification and arrangement of business activities and/or records into categories according to logically structured conventions, methods and procedural rules represented in a classification system”

Source: ISO 15489

What is classification?

In simple terms, it’s just grouping information together

Common examples of classification:
• Cars: by make, model, performance
• Food: tinned/fresh, type (meat, vegetable, grain)
• TV programmes: comedy, thriller, quiz show
• Clothes: adult/child, expensive/cheap, winter/summer

Dewey Decimal system

• Used to classify information throughout the western world
• Very Euro-centric

000 General & Bibliography
100 Philosophy & Psychology
200 Religion
300 Social Science
400 Languages & Linguistics
500 Sciences
600 Technology
800 Literature
900 Geography & History

Chinese library classification

• 43,600 categories. Constantly expanding to meet the needs of a rapidly changing nation
• Political considerations drive some organisation

1) Marxism, Leninism, Maoism & Deng Xiaoping Theory
2) Philosophy and Religion
3) Social Sciences
4) Politics and Law
5) Military Science
6) Economics
7) Culture, Science, Education, and Sports
8) Languages and Linguistics
9) Literature
10) Art
11) History and Geography
12) Natural Science
13) Mathematics, Physics and Chemistry
14) Astronomy and Geoscience
15) Life Sciences
16) Medicine and Health Sciences
17) Agricultural Sciences
18) Industrial Technology
19) Transportation
20) Aviation and Aerospace
21) Environmental Science

US Library of Congress

• Used to categorise books published in the United States
• Expanded categories emphasise USA-specific history and interests

A) General Works
B) Philosophy, Psychology, Religion
C) History: Auxiliary Sciences
D) History: General and Old World
E) History: United States
F) History: Western Hemisphere
G) Geography, Anthropology, Recreation
H) Social Science
J) Political Science
K) Law
L) Education
M) Music
N) Fine Arts
P) Literature & Languages
Q) Science
R) Medicine
S) Agriculture
T) Technology
U) Military Science
V) Naval Science
Z) Bibliography & Library Science

What are classification schemes?

A classification scheme…
• Is the structure an organisation uses for organising, accessing/retrieving, storing and managing its information
• Can be used to classify records

A Business Classification Scheme (BCS) is a classification scheme based on an organisation’s business functions and activities

These are predominantly used for Records Management purposes

Classification schemes: Types

Deployment
• Keyword / thesaurus-based
• Hierarchical / tree style

Principles of classification
• Functional
• Subject / thematic
• Organisational

Diagram label: “Generally preferred”

Hierarchical / tree style BCSs: Key

C = CLASS
F = FILE
R = RECORD
D = DOCUMENT

Schematic example: Hierarchical / tree BCS

[Diagram: a tree of classes (C) with files (F) at the lowest level]

Populated example: Hierarchical / tree BCS

Levels: Super Function > Function > Sub Function > Activity

Innovation, Knowledge Transfer and Technical Infrastructure (super function)
  Innovation (function)
  Knowledge Transfer (function)
  Technical Infrastructure (function)
    Standards and Accreditation (sub function)
      Policy Management (activity)
      Infrastructure Support (activity)
    National Measurement System (sub function)
      Policy Management (activity)
    Civil Space Activity (sub function)
      Space Regulation (activity)
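The populated BCS can be held as a nested mapping and walked to produce full classification paths. This is a sketch under stated assumptions: the names come from the slide, but placing the three sub functions under "Technical Infrastructure" is an assumption (the extracted layout does not show the parent), and `class_paths` and `find` are hypothetical helpers.

```python
# Names are from the slide; the nesting under "Technical Infrastructure"
# is assumed, because the extracted text does not show the parent function.
bcs = {
    "Innovation, Knowledge Transfer and Technical Infrastructure": {
        "Innovation": {},
        "Knowledge Transfer": {},
        "Technical Infrastructure": {
            "Standards and Accreditation": {
                "Policy Management": {},
                "Infrastructure Support": {},
            },
            "National Measurement System": {"Policy Management": {}},
            "Civil Space Activity": {"Space Regulation": {}},
        },
    }
}


def class_paths(tree, prefix=()):
    """Yield the full path of every class in the scheme, depth first."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from class_paths(children, path)


def find(tree, name):
    """Return every path whose last element is `name`."""
    return [p for p in class_paths(tree) if p[-1] == name]


for path in find(bcs, "Policy Management"):
    print(" / ".join(path))
```

"Policy Management" appears under two sub functions, so an activity name alone does not identify a class; the full path does.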

Agenda: Subject-based classification

Toward subject-based classification

• It’s often valuable to create multiple classifications
  – Users: Intended audience
  – Content: Inherent subject matter
  – Context: Temporal, organisational or political drivers
• User-understood terms are critical
  – Especially important for e-commerce
  – People search Google for “cheap flights” 75x more than “low fares” (Source: Gerry McGovern)
  – Who are the users? Scientists? Consumers?
• Context matters
  – Why this user with this content?

Source: Louis Rosenfeld LLC

Taxonomies in context

Source: Yahoo!

Hierarchies as implicit semantics

Divides information space into categories & subcategories, relating broader & narrower concepts via parent-child relationship

• Generic = Class-species: Species B (crow) is a member of Class A (Bird) & inherits characteristics of its parent
• Whole-Part = B is a part of A (e.g., Index Finger is part of Hand)
• Instance = B is an instance of A (e.g., Indian Ocean is an Ocean)
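Making the three parent-child relationship types explicit, rather than leaving them implicit in the tree, can be sketched as follows. The examples are the ones on the slide; the `Relation` enum, its wording, and the helper functions are assumptions for illustration.

```python
from enum import Enum


class Relation(Enum):
    GENERIC = "is a kind of"        # class-species
    WHOLE_PART = "is a part of"
    INSTANCE = "is an instance of"


# (child, relation, parent) links using the slide's examples.
links = [
    ("Crow", Relation.GENERIC, "Bird"),
    ("Index Finger", Relation.WHOLE_PART, "Hand"),
    ("Indian Ocean", Relation.INSTANCE, "Ocean"),
]


def describe(child, relation, parent):
    """Render one link as a readable sentence fragment."""
    return f"{child} {relation.value} {parent}"


def inherits_traits(relation):
    # Only the generic (class-species) link is described as inheriting
    # the characteristics of its parent.
    return relation is Relation.GENERIC


for link in links:
    print(describe(*link))
```

A plain folder tree draws all three links the same way, which is why the slide calls the semantics implicit.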

Differing views

Simple truth: People see (and label!) the world differently…

Sand trap, or bunker?

Personal taxonomy

• Personal classification of information
• E-mail folders: the most common manifestation
• Can improve relevance and findability to an individual
  – Some approaches enable personal classification in addition to “authorised” taxonomy
  – Gmail and some other systems employ faceted classification as well
• From enterprise perspective, personal taxonomies can be quite problematic
  – No interoperability, linguistic chaos
  – Impossible to establish enterprise-wide standards and vocabularies
• When combined with peers, can become a “folksonomy”

Folksonomy

• Collaborative tagging of content with minimal controls
• Relevance between metadata and content may be determined by users in a democratic fashion
• Clusters emerge and communities typically self-organise around them (“Wisdom of the crowd”)
• Typically arise in Web-based communities where individuals share content, then create and use tags
• Best used when there is a critical mass of taggers
• Can be a useful “bottom-up” approach to developing taxonomies
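The “democratic” selection of tags can be sketched as a vote count over tagging events. Everything here is hypothetical (users, items, tags, and the `min_votes` threshold); it only illustrates why a critical mass of taggers matters.

```python
from collections import Counter

# Hypothetical tagging events: (user, item, tag). No controlled vocabulary.
events = [
    ("ann", "photo1", "sunset"),
    ("bob", "photo1", "sunset"),
    ("cat", "photo1", "beach"),
    ("ann", "photo2", "beach"),
    ("bob", "photo2", "Beach "),
]


def normalise(tag):
    # Minimal control: trim whitespace and ignore case.
    return tag.strip().lower()


def popular_tags(events, item, min_votes=2):
    """Tags applied to `item` by at least `min_votes` distinct users."""
    votes = Counter()
    seen = set()
    for user, tagged_item, tag in events:
        key = (user, tagged_item, normalise(tag))
        if tagged_item == item and key not in seen:
            seen.add(key)          # one vote per user, item and tag
            votes[key[2]] += 1
    return [tag for tag, n in votes.most_common() if n >= min_votes]


print(popular_tags(events, "photo1"))
print(popular_tags(events, "photo2"))
```

With few taggers almost nothing clears the threshold; tags that do clear it are candidates for promotion into a managed taxonomy, which is the “bottom-up” route.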

Folksonomy example

Source: flickr.com

What is an ontology?

• Explicit specification or conceptualisation of a domain
  – Often subsume thesauri, but employ richer semantic relationships among terms and attributes
  – Apply rigid rules specifying terms and relationships
  – Do more than just control vocabulary; are a knowledge representation
• Semantic technologies are typically centred around ontologies
• An ontology for salad would contain the structure for how it relates to everything, from ingredients to growers to the rodents that might eat it, and how a salad is different in Japan vs. Italy

Why develop an ontology?

To improve knowledge sharing and reuse, and make software more adaptable to an environment

• Share common understanding of the structure of information among people or software agents
• Enable reuse of domain knowledge
• Make domain assumptions explicit
• Separate domain knowledge from operational knowledge
• Analyse domain knowledge

Source: http://www.alphaworks.ibm.com/contentnr/introsemantics

The challenge of meaning

• Meaning is a hard problem for machines and humans alike
  – Same term can have multiple meanings
  – Multiple terms can have the same meaning
  – Ultimately meaning is contextual
• Dublin Core designed to disambiguate at a fundamental level
  – E.g., distinguishes definitively among “Creator”, “Contributor” and “Publisher”
• But in the wild, it is much harder to achieve semantic agreement

Controlled vocabularies

Supporting tools based on collections of terms used to tag, track and describe content

For example, users may wish to organise content according to
• business sector
• geographical location
• product type
• organisation type
• policy topic

Allow content to be described using only 'official terms'
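Restricting descriptions to 'official terms' amounts to validating metadata against an approved list per field. The sketch below assumes that reading; the field names follow the slide, while the term lists and the `validate` helper are invented for illustration.

```python
# Hypothetical official terms, one approved list per metadata field.
OFFICIAL_TERMS = {
    "business sector": {"Energy", "Retail", "Transport"},
    "geographical location": {"Europe", "Asia"},
}


def validate(metadata):
    """Return the (field, value) pairs that are not official terms."""
    return [
        (field, value)
        for field, value in metadata.items()
        if value not in OFFICIAL_TERMS.get(field, set())
    ]


record = {"business sector": "Retail", "geographical location": "Mars"}
print(validate(record))
```

An empty result means the record uses only approved terms; anything returned would be rejected or sent back to the author for correction.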

Controlled vocabularies: Types

• Simple lists
  – Lists of terms allowed to be used to describe an information resource
• Synonym rings
  – A 'ring' of connected terms, all treated as equivalent for searching
  – Synonym rings can be used to link acronyms, variant spellings or scientific / popular terms
• Thesaurus
  – Hierarchical arrangement of broader and narrower meanings

Simple lists and synonym rings

Simple list of bovine diseases
• Anaplasmosis
• Babesiosis
• Bovine spongiform encephalopathy (BSE)
• Cysticercosis

Synonym ring for BSE
• BSE
• Mad cows’ disease
• Bovine spongiform encephalopathy
• Prion disease
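Treating every member of a ring as equivalent for searching can be sketched as query expansion. The ring members are the ones on the slide; the documents, function names, and the naive substring matching are assumptions made for this sketch.

```python
# Ring members come from the slide; the documents are made up for illustration.
bse_ring = {
    "bse",
    "mad cows' disease",
    "bovine spongiform encephalopathy",
    "prion disease",
}
rings = [bse_ring]


def expand(query):
    """Return the query plus every term in the same synonym ring."""
    term = query.strip().lower()
    for ring in rings:
        if term in ring:
            return set(ring)
    return {term}


def search(query, documents):
    """Return titles of documents whose text mentions any equivalent term.

    Matching is a plain substring test, which is too crude for real use
    (a short term such as "bse" also matches inside longer words).
    """
    terms = expand(query)
    return [
        title
        for title, text in documents.items()
        if any(term in text.lower() for term in terms)
    ]


documents = {
    "Herd report": "Two suspected cases of bovine spongiform encephalopathy.",
    "Press note": "Officials comment on BSE controls.",
    "Feed guide": "Guidance on winter feeding.",
}

print(search("BSE", documents))
```

A search for any one member of the ring returns the same documents, whichever variant the author happened to use.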

Thesaurus

• A networked collection of controlled vocabulary terms, using associative relationships
• Used to manage and identify the relationships among and between terms
  – E.g. Equal to, Related to, Opposite of
• Some examples from a hypothetical domain
  – Lettuce = Frisée (a.k.a. ‘a synonym ring’)
  – Lettuce is a narrower type of Greens
  – Coriander is related to Cilantro; but they are not equal
• Useful to reconcile different lexicons across business units or functional groups
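The three kinds of link in the hypothetical lettuce domain can be sketched as separate relationship tables. The `Thesaurus` class and its method names are assumptions for illustration; the terms and the relationships between them are taken from the slide as given.

```python
from collections import defaultdict


class Thesaurus:
    """Tiny thesaurus with equivalent, broader/narrower and related links."""

    def __init__(self):
        self.equivalent = defaultdict(set)
        self.broader = defaultdict(set)
        self.narrower = defaultdict(set)
        self.related = defaultdict(set)

    def add_equivalent(self, a, b):
        self.equivalent[a].add(b)
        self.equivalent[b].add(a)

    def add_broader(self, term, broader_term):
        # Broader and narrower are inverse links, so record both directions.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def add_related(self, a, b):
        # Related terms are associated but not interchangeable.
        self.related[a].add(b)
        self.related[b].add(a)


t = Thesaurus()
t.add_equivalent("Lettuce", "Frisée")
t.add_broader("Lettuce", "Greens")
t.add_related("Coriander", "Cilantro")

print(t.narrower["Greens"])
print(t.related["Coriander"])
```

Keeping "related" apart from "equivalent" is what lets a search suggest Cilantro to someone looking for Coriander without silently merging the two.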

Sample thesaurus


Ontologies and taxonomies and thesauri

How does this relate to Taxonomies and Thesauri?
“We have all agreed to call this thing lettuce. Lettuce is a vegetable.”

There is a much larger potential pool of semantic information that a taxonomy may or may not contain:

“Lettuce grows in the ground. Rabbits are a hazard to lettuce growers. Tomatoes and cucumbers are often eaten with lettuce, and the three of these things together make what is called a salad. But, a salad is not only defined by the collection of these three things…in Japan, a mixture of seaweed and sesame seeds is a salad. In the Midwestern United States, a collection of Jell-O and radishes is called a salad, and there is no lettuce involved.”
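The extra semantic information in the lettuce passage can be sketched as subject-predicate-object statements, one common way of representing ontology content. The statements paraphrase the slide; the predicate names and the `query` helper are invented for this sketch.

```python
# Subject-predicate-object statements paraphrasing the lettuce example.
# The predicate names are invented for this sketch.
triples = {
    ("Lettuce", "is_a", "Vegetable"),
    ("Lettuce", "grows_in", "Ground"),
    ("Rabbit", "is_hazard_to", "Lettuce grower"),
    ("Salad", "may_contain", "Lettuce"),
    ("Salad", "may_contain", "Tomato"),
    ("Salad", "may_contain", "Cucumber"),
    ("Salad (Japan)", "may_contain", "Seaweed"),
    ("Salad (Japan)", "may_contain", "Sesame seeds"),
}


def query(subject=None, predicate=None, obj=None):
    """Return triples matching every argument that is not None."""
    return sorted(
        t
        for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


print(query(subject="Lettuce"))
print(query(predicate="may_contain", obj="Lettuce"))
```

A taxonomy would hold only the `is_a` statement; the remaining statements are the larger pool of semantic information the slide refers to.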

Agenda: Business case for classification

Benefits of classifying records (1)

• Providing linkages between individual records which accumulate to provide a continuous record of activity
• Ensuring records are named in a consistent manner over time
• Assisting in the retrieval of all records relating to a particular function or activity
• Determining security protection and access appropriate for sets of records
• Allocating user permissions for access to, or action on, particular groups of records

Benefits of classifying records (2)

• Distributing responsibility for management of particular sets of records
• Distributing records for action
• Determining appropriate retention periods and disposition actions for records

Agenda: Standards and guidelines

Standards and guidelines

• ISO 15489 - the international standard for records management
• MoReq2 - the Model Requirements for the Management of Electronic Records
• DIRKS - the Design and Implementation of Record-Keeping Systems methodology
• ISO 2788 - Guidelines for the Establishment of Monolingual Thesauri

Agenda: Classification challenges

Classification challenges (1)

• Laborious and difficult to develop
  – Tendency to over-analyse
• May need more than one
• Classification of content into a categorisation scheme is ongoing work

Classification challenges (2)

• Categories need ongoing care and feeding (including thesauri, taxonomies, controlled vocabularies and ontologies)
  – Content changes, context changes
  – Vocabularies change
  – Experience may breed new perspectives

What you have learned

• How to leverage classification in general and taxonomies in particular as part of an ECM strategy
• Different approaches to subject-based organisation schemes:
  – Taxonomies
  – Thesauri
  – Semantic networks
  – Ontologies
  – Folksonomies
• Managing classification challenges