Using standards to make vocabularies available. The Becta VMS (Vocabulary Management Service) Mike Taylor <[email protected]> Giraffatitan brancai reconstruction from Paul (1988)

Presentation on Becta Vocabulary management systems by Mike Taylor at the CETIS MDR SIG meeting on 2008-02-12

Page 1: Becta Vms

Using standards to make vocabularies available.

The Becta VMS (Vocabulary Management Service)

Mike Taylor <[email protected]>

Giraffatitan brancai reconstruction from Paul (1988)

Page 3: Becta Vms

Contents

Vocabularies

Becta

The Becta Vocabulary Management Service

The Zthes XML format

The Zthes web service

So what?

What now?

Page 4: Becta Vms

Vocabularies

Vocabularies are sets of terms used to tag documents.

Their use increases both recall and precision of searching.

At the simplest level, all Flickr tags form a vocabulary.

Richer vocabularies have semantics and structure.

Thesauri, taxonomies, ontologies, authority lists and control lists are all more or less the same thing as vocabularies. (Purists will hate me for saying that.)

Page 5: Becta Vms

Semantics and structure

Terms may carry scope notes.

Terms may be listed with synonyms.

Links may exist between terms:
  BT (broader term) e.g. car BT vehicle
  NT (narrower term) e.g. animal NT dog
  UF (use for, preferred term) e.g. dog UF hound
  USE (non-preferred term) e.g. hound USE dog
  RT (related term) e.g. vehicle RT travel

Mappings to other languages are possible.

(Some semantics and structure can be induced by usage patterns in unstructured vocabularies.)

Page 6: Becta Vms

Sample terms from a vocabulary

dog:
  UF hound, canine
  BT animal
  NT dachshund, dalmatian, poodle
  Scope note: includes domestic dogs only; wolves and African hunting dogs are listed separately.

animal:
  UF creature, beast, brute
  BT organism
  NT dog, cat, Brachiosaurus altithorax, slug
  RT life
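As a minimal sketch, the two sample entries could be held in code like this. The `Term` class, its field names and the `use_index` helper are illustrative assumptions, not part of any Becta or Zthes API. The helper derives the USE links by inverting the UF links, since the two are mirror images of each other.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term with its links (field names are illustrative)."""
    name: str
    uf: list = field(default_factory=list)  # non-preferred synonyms (use for)
    bt: list = field(default_factory=list)  # broader terms
    nt: list = field(default_factory=list)  # narrower terms
    rt: list = field(default_factory=list)  # related terms
    scope_note: str = ""


# The two sample entries from the slide.
VOCAB = {
    "dog": Term(
        "dog",
        uf=["hound", "canine"],
        bt=["animal"],
        nt=["dachshund", "dalmatian", "poodle"],
        scope_note="includes domestic dogs only",
    ),
    "animal": Term(
        "animal",
        uf=["creature", "beast", "brute"],
        bt=["organism"],
        nt=["dog", "cat", "Brachiosaurus altithorax", "slug"],
        rt=["life"],
    ),
}


def use_index(vocab):
    """Invert the UF links: map each non-preferred term to its preferred term."""
    return {alt: term.name for term in vocab.values() for alt in term.uf}


USE = use_index(VOCAB)
print(USE["hound"])
```

Storing only the UF side and deriving USE avoids the two lists drifting out of step.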

Page 7: Becta Vms

Searching with a vocabulary

Two main ways to use a vocabulary:

1. Visible to the user. Can be browsed to find suitable search terms.

2. Behind the scenes: non-preferred terms mapped to preferred terms or synonyms expanded.

Expansion of query terms can include expansion to broader and narrower terms, or translated terms.

Relevance ranking can take term-closeness into account.
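The behind-the-scenes use can be sketched as follows, assuming toy data drawn from the earlier sample terms. The dictionary layout and the `expand_query` function are hypothetical, for illustration only. A non-preferred word is first mapped to its preferred term, and the query is then widened to narrower terms down to a chosen depth.

```python
# Toy data based on the sample terms; the dictionary layout is an assumption.
USE = {"hound": "dog", "canine": "dog", "creature": "animal"}
NARROWER = {
    "animal": ["dog", "cat"],
    "dog": ["dachshund", "dalmatian", "poodle"],
}


def expand_query(word, depth=1):
    """Return the preferred term for `word` plus narrower terms up to `depth` levels."""
    preferred = USE.get(word, word)  # map a non-preferred term to its preferred term
    expanded = [preferred]
    frontier = [preferred]
    for _ in range(depth):
        # Step one level down the NT links from everything found at the last level.
        frontier = [nt for term in frontier for nt in NARROWER.get(term, [])]
        for term in frontier:
            if term not in expanded:
                expanded.append(term)
    return expanded


print(expand_query("hound"))
print(expand_query("creature", depth=2))
```

Because terms are appended level by level, the position of a term in the result reflects its distance from the original query term, which is one simple basis for term-closeness ranking.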

Page 8: Becta Vms

Becta

British Educational Communications and Technology Agency.

An agency of the Department for Education and Skills.

Oversees procurement of IT equipment for schools.

In charge of e-learning strategy.

Page 9: Becta Vms

Becta VMS

Creating vocabularies is a pain.

Tools are expensive.

Becta needed to facilitate vocabulary creation for Curriculum Online.

Created the Vocabulary Management System (VMS)
-- Studio (not available without training)
-- Bank: http://bank.vocman.com/
-- Spine

Page 10: Becta Vms

Vocabulary bank

Page 12: Becta Vms

Downloaded XML

<Zthes xmlns:dc='http://purl.org/dc/elements/1.1'>
  <thes>
    <dc:title>Early Years Foundation Stage</dc:title>
    <dc:description>Curriculum guidance for the Foundation Stage in England</dc:description>
    <dc:date>22/10/2007</dc:date>
    <dc:identifier>eyfs</dc:identifier>
    <dc:language>En-GB</dc:language>
    <thesNote label='version'>1.0</thesNote>
    <thesNote label='globallyUniqueId'>1001-eyfs</thesNote>
    <thesNote label='authority' vocab='0001-Authority'>QCA</thesNote>
  </thes>
  <term>
    <termId>000639</termId>
    <termName>Early Support</termName>
    <termType>PT</termType>
    <termLanguage>En-GB</termLanguage>
    <termSortKey>3</termSortKey>
    <termNote label='globallyUniqueId'>1001-000639</termNote>
    <termNote label='authority' vocab='0001-Authority'>QCA</termNote>
    <termNote label='source'>1001-eyfs</termNote>
    <termNote label='curriculumType' vocab='0001-CurriculumType'>category2</termNote>
    <relation>
      <relationType>BT</relationType>
      <termId>000635</termId>
      <termName>Inclusive Practice</termName>
      <termType>PT</termType>
      <termLanguage>En-GB</termLanguage>
    </relation>
    [...]
  </term>
  [...]

Page 13: Becta Vms

Downloaded XML: the vocabulary

<Zthes xmlns:dc='http://purl.org/dc/elements/1.1'>
  <thes>
    <dc:title>Early Years Foundation Stage</dc:title>
    <dc:description>Curriculum guidance for the Foundation Stage in England</dc:description>
    <dc:date>22/10/2007</dc:date>
    <dc:identifier>eyfs</dc:identifier>
    <dc:language>En-GB</dc:language>
    <thesNote label='version'>1.0</thesNote>
    <thesNote label='globallyUniqueId'>1001-eyfs</thesNote>
    <thesNote label='authority' vocab='0001-Authority'>QCA</thesNote>
  </thes>
  [...]

Page 14: Becta Vms

Downloaded XML: a term

  <term>
    <termId>000639</termId>
    <termName>Early Support</termName>
    <termType>PT</termType>
    <termLanguage>En-GB</termLanguage>
    <termSortKey>3</termSortKey>
    <termNote label='globallyUniqueId'>1001-000639</termNote>
    <termNote label='authority' vocab='0001-Authority'>QCA</termNote>
    <termNote label='source'>1001-eyfs</termNote>
    <termNote label='curriculumType' vocab='0001-CurriculumType'>category2</termNote>
    <relation>
      <relationType>BT</relationType>
      <termId>000635</termId>
      <termName>Inclusive Practice</termName>
      <termType>PT</termType>
      <termLanguage>En-GB</termLanguage>
    </relation>
    [...]
  </term>

Page 15: Becta Vms

Downloaded XML: a relation

    <relation>
      <relationType>BT</relationType>
      <termId>000635</termId>
      <termName>Inclusive Practice</termName>
      <termType>PT</termType>
      <termLanguage>En-GB</termLanguage>
    </relation>
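Reading such a download needs nothing beyond a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a trimmed, well-formed copy of the sample document. The `read_terms` function and the shape of the dictionaries it returns are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A trimmed, well-formed copy of the downloaded document shown on the slides.
ZTHES_XML = """\
<Zthes xmlns:dc='http://purl.org/dc/elements/1.1'>
  <thes>
    <dc:title>Early Years Foundation Stage</dc:title>
    <dc:identifier>eyfs</dc:identifier>
  </thes>
  <term>
    <termId>000639</termId>
    <termName>Early Support</termName>
    <termType>PT</termType>
    <relation>
      <relationType>BT</relationType>
      <termId>000635</termId>
      <termName>Inclusive Practice</termName>
    </relation>
  </term>
</Zthes>
"""

# Namespace URI copied exactly from the sample (no trailing slash there).
DC = {"dc": "http://purl.org/dc/elements/1.1"}


def read_terms(xml_text):
    """Return (vocabulary title, list of term dicts) from a Zthes XML string."""
    root = ET.fromstring(xml_text)
    title = root.findtext("thes/dc:title", namespaces=DC)
    terms = []
    for term in root.findall("term"):
        terms.append({
            "id": term.findtext("termId"),      # direct child only, not the relation's id
            "name": term.findtext("termName"),
            "relations": [
                (rel.findtext("relationType"), rel.findtext("termName"))
                for rel in term.findall("relation")
            ],
        })
    return title, terms


title, terms = read_terms(ZTHES_XML)
print(title)
print(terms)
```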

Page 16: Becta Vms

The Zthes format

An open, freely available, specification:

http://zthes.z3950.org/

Very simple – no attempt to generalise.

In use by various organisations in different domains:
  Becta (education)
  Synapse/Factiva (business intelligence)
  ELVIS/Decomate II/Elise II (European projects)
  Natural History Museum (biological taxonomy)
  OCLC (libraries)

Was considered (along with SKOS and MARC authorities) by the BS 8723-5:2007 part 5 committee.

Defeated by NIH (not-invented-here) syndrome.

Page 17: Becta Vms

The Z in Zthes ... some history

Zthes started life as a Z39.50 profile in 1999. (ANSI/NISO Z39.50 is a venerable search/retrieve standard.)

Zthes was quickly expanded by the addition of an XML format.

An SRU profile for Zthes followed in 2003. (SRU is Search/Retrieve via URL.)

XML format and SRU profile are currently at v1.0 (2006).

Some small additions on the way to support OCLC's use.

Page 18: Becta Vms

Zthes SRU in the Becta VMS

Requests are REST-like URLs:

http://bank.vocman.com/bank-webapp/sru/CurrentTerms
  ?operation=SearchRetrieve
  &maximumRecords=10
  &recordSchema=zthes
  &query=zthes.relType="BT" and zthes.termGuid="1000-KSWO-0005"

Search for records related by “BT” (broader term) to the term with identifier “1000-KSWO-0005”, and return the first ten.

query contains a CQL query: simple but powerful.

(This URL omits SRU's version parameter – naughty!)
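A request like the one above can be assembled with the standard library, as in this rough sketch. The `sru_url` function and its argument names are made up for illustration. The `operation` value is copied from the slide (the SRU specification itself spells it `searchRetrieve`), and `version=1.1` is an assumption based on the version the server reports in its response. The code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://bank.vocman.com/bank-webapp/sru/CurrentTerms"


def sru_url(term_guid, rel_type="BT", maximum_records=10):
    """Build a search URL for terms related to `term_guid` by `rel_type`."""
    # The CQL query, using the two index names that appear on the slide.
    cql = f'zthes.relType="{rel_type}" and zthes.termGuid="{term_guid}"'
    params = {
        "version": "1.1",               # assumed; the slide's URL leaves it out
        "operation": "SearchRetrieve",  # spelling as shown on the slide
        "maximumRecords": maximum_records,
        "recordSchema": "zthes",
        "query": cql,
    }
    # urlencode percent-encodes the quotes and spaces inside the CQL query.
    return f"{BASE_URL}?{urlencode(params)}"


url = sru_url("1000-KSWO-0005")
print(url)
# Decoding the query string again recovers the original CQL text.
print(parse_qs(urlsplit(url).query)["query"][0])
```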

Page 19: Becta Vms

Zthes SRU response

<srw:searchRetrieveResponse xmlns:srw='http://www.loc.gov/zing/srw/'>
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <term xmlns:k-int='http://www.k-int.com/' xmlns:dc='http://purl.org/dc/elements/1.1/'>
        <termId>KSWO-0005</termId>
        <termName>Working in groups</termName>
        <termType>PT</termType>
        <termNote label='source'>1000-QCA Metadata Standard: XTags</termNote>
        <termNote label='authority' vocab='0001-Authority'>QCA</termNote>
        <termNote label='globallyUniqueId'>1000-KSWO-0005</termNote>
        <k-int:termRevisionNumber>0</k-int:termRevisionNumber>
        <k-int:termInstanceId>8931</k-int:termInstanceId>
        <relation>
          <relationType>BT</relationType>
          <termId>KSWO</termId>
          <termName>Key Skills: working with others</termName>
        </relation>
      </term>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>

Page 20: Becta Vms

Zthes SRU response: vehicle

<srw:searchRetrieveResponse xmlns:srw='http://www.loc.gov/zing/srw/'>
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      [...]
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>

Page 21: Becta Vms

Zthes SRU response: payload

      <term xmlns:k-int='http://www.k-int.com/' xmlns:dc='http://purl.org/dc/elements/1.1/'>
        <termId>KSWO-0005</termId>
        <termName>Working in groups</termName>
        <termType>PT</termType>
        <termNote label='source'>1000-QCA Metadata Standard: XTags</termNote>
        <termNote label='authority' vocab='0001-Authority'>QCA</termNote>
        <termNote label='globallyUniqueId'>1000-KSWO-0005</termNote>
        <k-int:termRevisionNumber>0</k-int:termRevisionNumber>
        <k-int:termInstanceId>8931</k-int:termInstanceId>
        <relation>
          <relationType>BT</relationType>
          <termId>KSWO</termId>
          <termName>Key Skills: working with others</termName>
        </relation>
      </term>
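Separating the vehicle from the payload might look like the sketch below, which parses a trimmed copy of the response with `xml.etree.ElementTree`. The `payload_terms` function and its return shape are illustrative assumptions. It searches below each `srw:record` rather than expecting the term as a direct child, because a full SRU response normally wraps the record data in further `srw:` elements.

```python
import xml.etree.ElementTree as ET

SRW = {"srw": "http://www.loc.gov/zing/srw/"}

# A trimmed copy of the response shown on the slides.
RESPONSE = """\
<srw:searchRetrieveResponse xmlns:srw='http://www.loc.gov/zing/srw/'>
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <term>
        <termId>KSWO-0005</termId>
        <termName>Working in groups</termName>
        <relation>
          <relationType>BT</relationType>
          <termId>KSWO</termId>
          <termName>Key Skills: working with others</termName>
        </relation>
      </term>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
"""


def payload_terms(response_text):
    """Unwrap the SRU vehicle; return (record count, [(term name, broader term names)])."""
    root = ET.fromstring(response_text)
    count = int(root.findtext("srw:numberOfRecords", namespaces=SRW))
    results = []
    for record in root.findall("srw:records/srw:record", SRW):
        # The Zthes payload is un-namespaced, so plain element names are used here.
        for term in record.findall(".//term"):
            broader = [
                rel.findtext("termName")
                for rel in term.findall("relation")
                if rel.findtext("relationType") == "BT"
            ]
            results.append((term.findtext("termName"), broader))
    return count, results


count, results = payload_terms(RESPONSE)
print(count, results)
```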

Page 24: Becta Vms

So what?

The advantage that all web services bring: loose coupling.

As useful as the Becta VMS Bank is, it is not the only useful application of the vocabularies.

Using the Zthes/SRU web service, anyone can make applications that search and navigate vocabularies.

(And they should work with other Zthes/SRU vocabularies.)

I will not insult your intelligence by using the word “mashup”.

Page 25: Becta Vms

What now?

Becta has to demonstrate that its facilities are useful ...

... which means it has to make them useful.

– Do these facilities help you?
– If so, how might you use them?
– If not, could they be made useful?
– How?

Feedback, please!
– Talk to me.
– Email me on <[email protected]>
– http://www.surveymonkey.com/s.aspx?sm=YJt7RtxHmJQEgQFvXHZSTQ%3d%3d