Multilingual Subject Access to Catalogues of National Libraries (MSAC)

TEL-ME-MOR/M-CAST Seminar on Subject Access

Prague, November 24, 2006

Marie.Balikova@nkp.cz

Goal of the initiative

• to provide users with an authorized indexing and retrieval tool for multilingual subject searching in an online environment

• the initiative complies with the main goals and recommendations currently defined by IFLA for the activities of its Classification and Indexing Section:

– Changing Roles of Subject Access Tools (Berlin)

– Implementation and Adaptation of Global Tools for Subject Access to Local Needs (Buenos Aires)

– Cataloguing and Subject Tools for Global Access: International Partnerships (Oslo)

– Interoperability of subject access for multilingual and multi-script networked environment (Seoul)


CZENAS - MSAC

Czech National Subject Authority File (CZENAS)

• cooperative venture of three large libraries in Czechia:

– National Library of the Czech Republic

– Moravian Library in Brno

– Research Library in Olomouc

Multilingual Subject Access to Catalogues of National Libraries (MSAC)

• joint initiative of seven national libraries:

– National and University Library, Zagreb, Croatia

– National Library of the Czech Republic, Prague

– National Library of Latvia, Riga

– Martynas Mazvydas National Library of Lithuania, Vilnius

– National and University Library St. Kliment Ohridski, Skopje, Macedonia

– Slovak National Library in Martin, Slovakia

– National and University Library, Ljubljana, Slovenia


Multilingualism – What do users want?

– to search a multilingual collection by using queries in one language or

– to retrieve documents in a number of specific languages

– to prefer an interface in the language of their choice

• solution: the users are provided with the language support they need

• possible limits:

– technologies

– language skills of the staff

– financial means

Because of these limits, there have been only a few attempts to create a multilingual subject access tool or to integrate existing library systems in the area of multilingual subject access


Factors affecting subject indexing – What should librarians want?

• standardization of the subject retrieval process and of indexing and classification tools, which

– minimizes duplication of work in sharing information

– supports a shared cataloguing process at national and international level

• interoperability among different indexing and classification schemes, which consists in

– intellectual mapping between terms in different controlled vocabularies

– using a switching language as an intermediary for moving among equivalent terms in different vocabularies, above all multilingual ones

• increasing precision and recall through the Z39.50 protocol and its profiles, and applying authority control whenever possible: in all databases searched, the same subject search criteria are introduced for both remote and local databases


Subject analysis process in an online environment – What should librarians prefer?

they should

• prefer a post-coordinated indexing system

• simplify the application syntax of subject heading strings

• support conceptual compatibility of indexing formulas/preferred terms used in various indexing languages

• support harmonisation between various indexing languages

• support mapping between verbal terms and equivalent notations of classification scheme

• improve subject access for OPACs and for Web resources


UDC classification system in an online environment – Can it be useful?

UDC can enhance subject access, because it

– provides context to search terms

– covers all subjects

– enables language-independent notations to be linked to search terms in various natural languages

– enables other languages to be joined later without the need to classify the resources again

– could serve as a switching language, which ensures convertibility between information languages

– supports very detailed expressions of complex subjects using a variety of common and special auxiliaries, specific symbols and punctuation

– is more flexible than other universal classification schemes

– indicates entities which occur in more than one domain (class)


examples

heading  water

UDC  546.212 (inorganic chemistry)

UDC 556-032.2 (hydrology)

UDC  628.1.03 (water management)

heading  incest

UDC  316.835.2 (sociology)  

UDC 343.542.5 (criminal law)  

UDC 616.89-008.442.38 (psychiatry)
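
The way a notation supplies language-independent context can be sketched in a few lines. The sketch below is only an illustration, not part of MSAC: it reads the first digit of each notation and looks up the UDC main class. The main-class captions are abbreviated and given from the general UDC outline, and the names `UDC_MAIN_CLASSES`, `EXAMPLES`, `main_class` and `contexts` are assumptions made for this example.

```python
# Abbreviated captions of the UDC main classes (class 4 is vacant).
# These captions are an assumption for illustration, not MSAC data.
UDC_MAIN_CLASSES = {
    "0": "Science and knowledge. Generalities",
    "1": "Philosophy. Psychology",
    "2": "Religion. Theology",
    "3": "Social sciences",
    "5": "Mathematics. Natural sciences",
    "6": "Applied sciences. Medicine. Technology",
    "7": "The arts. Recreation. Sport",
    "8": "Language. Linguistics. Literature",
    "9": "Geography. Biography. History",
}

# Headings and notations copied from the examples above.
EXAMPLES = {
    "water": ["546.212", "556-032.2", "628.1.03"],
    "incest": ["316.835.2", "343.542.5", "616.89-008.442.38"],
}


def main_class(notation):
    """Return the main-class caption for a notation, based on its first digit."""
    return UDC_MAIN_CLASSES.get(notation.strip()[:1], "unknown")


def contexts(heading):
    """Pair every notation linked to a heading with its main class."""
    return [(n, main_class(n)) for n in EXAMPLES.get(heading, [])]


for notation, caption in contexts("incest"):
    print(notation, "->", caption)
```

Two notations for "incest" fall into different main classes (3 and 6), which is exactly the kind of context a bare search term lacks.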


What system should be used for developing MSAC?

• the UDC system proved to be the most suitable for creating a common multilingual indexing tool

• all the participating libraries used it, even if in different versions

• in MSAC it is applied as an enumerative classification, with functionality very similar to that of DDC

• UDC numbers, both simple and complex (pre-combined), are treated as single numbers

• the same citation order should be adopted

– international exchange of information demands consistency in building UDC class numbers

• in the MSAC system UDC class numbers are used alongside their descriptions

– 608.1 -- bioethics / bioetika

– 608.3 -- biological safety / biologická bezpečnost


What is the role of the Czech National Subject Authority File in the MSAC initiative?

integrated indexing and retrieval tool in which controlled verbal terms are linked to equivalent UDC notations

• respecting the IFLA recommendation to consider possible relationships between subject authority records and classification

• respecting LC practice

topical authority file – controlled vocabulary in which the following kinds of relationships between terms are defined:

• equivalence (expressed: USE)

• hierarchy (expressed: BT-Broader term; NT-Narrower term)

• association (expressed: RT-Related term)

Czech authority file of topical terms - base for the multilingual controlled vocabulary
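
The three relationship types can be modelled with a very small data structure. The following is a minimal sketch under stated assumptions: the class names `TopicalTerm` and `AuthorityFile` are invented here, and the sample links (including the non-preferred term "health law") are illustrative only and are not taken from CZENAS.

```python
from dataclasses import dataclass, field


@dataclass
class TopicalTerm:
    """One preferred term with its thesaurus relationships."""
    label: str
    broader: set = field(default_factory=set)   # BT
    narrower: set = field(default_factory=set)  # NT
    related: set = field(default_factory=set)   # RT


class AuthorityFile:
    def __init__(self):
        self.terms = {}  # preferred label -> TopicalTerm
        self.use = {}    # non-preferred label -> preferred label (USE)

    def add_term(self, label):
        return self.terms.setdefault(label, TopicalTerm(label))

    def add_use(self, non_preferred, preferred):
        self.add_term(preferred)
        self.use[non_preferred] = preferred

    def add_hierarchy(self, broader, narrower):
        # BT and NT are reciprocal, so both directions are recorded at once.
        self.add_term(broader).narrower.add(narrower)
        self.add_term(narrower).broader.add(broader)

    def add_related(self, a, b):
        # RT is symmetric.
        self.add_term(a).related.add(b)
        self.add_term(b).related.add(a)

    def resolve(self, label):
        """Follow a USE reference; return the preferred term or None."""
        return self.terms.get(self.use.get(label, label))


# Hypothetical links, for illustration only.
aut = AuthorityFile()
aut.add_hierarchy("law", "medical law")
aut.add_related("medical law", "bioethics")
aut.add_use("health law", "medical law")

print(aut.resolve("health law"))
```

Recording BT/NT and RT in both directions at insertion time keeps the file consistent without a separate validation pass.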


Formats: UNIMARC? MARC21?

MSAC supports both UNIMARC and MARC 21

• UNIMARC: Croatia, Lithuania

• Comarc (based on UNIMARC): Slovenia, Macedonia

• MARC 21: Latvia, Slovakia, Czechia

intention: to respect the MARC formats as much as possible; in view of the specific needs identified, however, some extensions and corrections have to be introduced

• fields for entering combinations of language variants and UDC notations are extended by

– subfield “b” (UDC equivalent notation)

– subfield “c” (UDC qualifier)

– UNIMARC - tag 450: subfields a, b, c

– MARC 21 - tag 750: subfields a, b, c

• the MARC 21 authority format had to be extended by a special field 089 for entering the UDC number


What English terms are used?

• English equivalents of preferred terms are chosen, mostly LCSH terms

• if LCSH equivalents are not found (LC terms being too broad), reference sources such as the LC titles and subtitles file, encyclopedias, manuals, language vocabularies, web pages and full-text databases are consulted

Approval process:

• proposals of preferred terms linked to UDC class numbers and English equivalents are sent to the editorial staff for approval; the approved authority records are then entered into the authority database via a special program procedure


What entities are involved in the mapping process?

• the mapping process is done intellectually

• it consists in establishing equivalents between the controlled subject terms used in the indexing systems of participating libraries through a switching language

• switching language: UDC notations based on UDC MRF and English equivalents

• mapping links are defined between preferred terms represented by isolated lexical units only

• subject heading strings as a whole are excluded and are not mapped

• authority records as a whole are excluded and are not mapped

• links are established only between topical main headings (main entries), UDC numbers and language equivalents
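
The role of the switching language can be illustrated with a small lookup that never links two national vocabularies directly: every term is tied only to a pivot made of an English equivalent and a UDC notation. This is a sketch under assumptions. The Croatian term and its notation are taken from the Croatian example later in this presentation and the Czech term "malířství" from the combination examples; the Czech entry "cizinci", the language code `cze`, and the names `RECORDS`, `pivots_for` and `equivalents` are hypothetical.

```python
# Each row: (English equivalent, UDC notation, language code, term).
RECORDS = [
    ("aliens", "323.113", "scr", "Tuđinci"),
    ("aliens", "323.113", "cze", "cizinci"),   # hypothetical entry
    ("painting", "75", "cze", "malířství"),
]


def pivots_for(term, lang, records=RECORDS):
    """Return the (English, UDC) pivots a term in a given language is linked to."""
    return sorted({(eng, udc) for eng, udc, l, t in records
                   if l == lang and t == term})


def equivalents(term, source_lang, target_lang, records=RECORDS):
    """Find target-language terms that share a pivot with the source term."""
    pivots = set(pivots_for(term, source_lang, records))
    return sorted({t for eng, udc, l, t in records
                   if l == target_lang and (eng, udc) in pivots})


print(pivots_for("Tuđinci", "scr"))
print(equivalents("Tuđinci", "scr", "cze"))
```

Because each library maps only to the pivot, adding an eighth language means adding rows for that language alone; none of the existing links has to be revisited.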


Is the combination of verbal expressions and UDC notations effective?

simple combination

• one verbal expression is mapped to one simple UDC notation

– painting / malířství – UDC 75

• one verbal expression is mapped to one compound/complex UDC notation

– medical law / medicínské právo – UDC 34:61

– history of law / právní dějiny – UDC 34(091)

– Anglo-American law / angloamerické právo – UDC 34(410+73)

complex combination

• one verbal expression is mapped to multiple UDC notations

– death/smrt – UDC equivalent 128 (metaphysics)

– death/smrt – UDC equivalent 2-186 (theological anthropology)

– death/smrt – UDC equivalent 233-186 (Hinduism)

– death/smrt – UDC equivalent 393 (ethnography)

– death/smrt – UDC equivalent 616-036.88 (medicine)

• one UDC notation is mapped to multiple verbal expressions

– 34 -- law / právo * laws / zákony * legal aspects / právní aspekty * legal regulations / právní předpisy
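
Both directions of the complex combination can be held in a pair of indexes built from the same list of links. The sketch below uses only the term and notation pairs listed above; the variable names and the flat triple layout are assumptions made for the example.

```python
from collections import defaultdict

# (English term, Czech term, UDC notation) triples copied from the examples above.
LINKS = [
    ("painting", "malířství", "75"),
    ("death", "smrt", "128"),
    ("death", "smrt", "2-186"),
    ("death", "smrt", "233-186"),
    ("death", "smrt", "393"),
    ("death", "smrt", "616-036.88"),
    ("law", "právo", "34"),
    ("laws", "zákony", "34"),
    ("legal aspects", "právní aspekty", "34"),
    ("legal regulations", "právní předpisy", "34"),
]

term_to_udc = defaultdict(list)   # verbal expression -> UDC notations
udc_to_terms = defaultdict(list)  # UDC notation -> (English, Czech) pairs

for english, czech, udc in LINKS:
    term_to_udc[english].append(udc)
    term_to_udc[czech].append(udc)
    udc_to_terms[udc].append((english, czech))

print(term_to_udc["smrt"])
print(udc_to_terms["34"])
```

A search on the Czech term "smrt" and one on the English term "death" reach the same five notations, while a search on notation 34 returns all four term pairs.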


What MSAC indexes can you use for browsing?

Topical terms – Multilingual

Topical terms – Czech: 26845

Topical terms – English: 24082

Topical terms – Croatian: 1031

Topical terms – Latvian: 44

Topical terms – Lithuanian: 1535

Topical terms – Macedonian: 734

Topical terms – Slovak: 1665

Topical terms – Slovenian: 1010

UDC

Subject fields: Astronomy, Demography, Law, Politics, Sociology, Sport, Theater, Librarianship

Screenshots (slides 17-24, images not reproduced):

• MSAC - Welcome page

• Browsing – Lithuanian search term

• Browsing – Lithuanian index

• Lithuanian search term - standard record

• Lithuanian search term – MARC 21 record

• MSAC – UNIMARC – all languages

• UDC – Czech – English combination

How can MSAC terms be applied in the BIB databases of cooperating libraries?

• the MSAC initiative is based on entering MSAC search terms into the bibliographic records of cooperating institutions

• this requires transferring authority terms in a specific MSAC format from the NL CR AUT database to the databases of cooperating libraries (to create an auxiliary authority file)

• authoritative forms of MSAC terms are to be transferred via FTP

• the file of authoritative MSAC forms should contain combinations of English variants, variants in the respective languages, and UDC notations


Application of MSAC search terms: Is a special field needed?

• cooperating libraries are required to add a special field, e.g. tag 690, to their bibliographic records in both UNIMARC and MARC 21 formats

• the intention is to enter MSAC search terms (English term + combination of the respective natural-language term and UDC number) in these special fields

• the aim is to offer multilingual access to the collections of cooperating libraries without their having to abandon or change their existing subject systems


Simple and complex UDC combinations: task to be solved

• One term is linked to one UDC notation

– the process could be done semi-automatically

– it is based on the concordance between existing subject terms applied in the previous subject analysis process and those originating from the MSAC file

– intellectual checking of the data is expected

• One term is linked to two or more UDC notations

– the choice of UDC notation depends on

  • the subject of the document

  • the decision of the cataloguer

– the process should be done intellectually
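
The split between the two cases can be sketched as a simple triage step. This is an assumption about how such a pre-sorting might look, not a description of the MSAC procedure; the function name `triage` and the dictionary input are invented for the example.

```python
def triage(term_to_udc):
    """Split terms into semi-automatic candidates and cases for the cataloguer.

    term_to_udc maps a subject term to the list of UDC notations
    linked to it in the MSAC file.
    """
    auto, manual = {}, {}
    for term, notations in term_to_udc.items():
        unique = sorted(set(notations))
        if len(unique) == 1:
            # One notation: a candidate link that still needs a human check.
            auto[term] = unique[0]
        else:
            # Several notations (or none): the choice depends on the document.
            manual[term] = unique
    return auto, manual


auto, manual = triage({
    "painting": ["75"],
    "death": ["128", "2-186", "233-186", "393", "616-036.88"],
})
print(auto)
print(manual)
```

Only the `auto` group is suitable for batch processing; every entry in `manual` has to be decided record by record.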


Solution: simple combination - Croatian application

• Croatian – authority format of MSAC record

– LDR  001  ph119254

– 089  |a 314.151-054.6 |2 msac-udc

– 75007  |a aliens |2 msac-eng

– 75017  |a Tuđinci |b 323.113 |2 msac-scr

• Croatian – application in BIB record

– 69007  |a aliens |2 msac-eng

– 69017  |a Tuđinci |b 323.113 |2 msac-scr


Solution: Complex combination – Lithuanian AUT record

• Lithuanian – authority format of MSAC record

– 089  |a 316.835.2 |c sociology |2 msac-udc

– 089  |a 343.542.5 |c criminal law |2 msac-udc

– 089  |a 616.89-008.442.38 |c psychiatry |2 msac-udc

– 75007  |a incest |2 msac-eng

– 75037  |a kraujomaiša |b 343.542.5 |c baudžiamoji teisė |2 msac-lit

– 75037  |a kraujomaiša |b 616.89-008.442.38 |c psichiatrija |2 msac-lit

– 75037  |a kraujomaiša |b 316.835.2 |c sociologija |2 msac-lit
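
Field lines written in this display form can be split into tag, indicator characters and subfields with plain string handling. The sketch below assumes the layout shown on the slide (three-character tag, optional indicator characters, subfields introduced by `|`); the function name `parse_field` and the returned dictionary shape are assumptions, and no MARC library is involved.

```python
def parse_field(line):
    """Parse one display line such as '75037  |a term |b 343.542.5 |2 msac-lit'."""
    head, _, rest = line.strip().partition("|")
    head = head.strip()
    tag, indicators = head[:3], head[3:]
    subfields = []
    for chunk in rest.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # The first character group is the subfield code, the remainder its value.
        code, _, value = chunk.partition(" ")
        subfields.append((code, value.strip()))
    return {"tag": tag, "ind": indicators, "subfields": subfields}


field = parse_field(
    "75037  |a kraujomaiša |b 343.542.5 |c baudžiamoji teisė |2 msac-lit"
)
print(field["tag"], field["ind"], field["subfields"])
```

Subfields are kept as an ordered list of pairs rather than a dictionary so that their order, and any repeated codes, survive the round trip.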


Solution: Complex combination – Lithuanian BIB record

• Lithuanian - application in BIB records

Note that the UDC qualifier in subfield c is not used in BIB records

• from the point of view of law

– 69007  |a incest |2 msac-eng

– 69037  |a kraujomaiša |b 343.542.5 |2 msac-lit

• or from the point of view of psychiatry

– 69007  |a incest |2 msac-eng

– 69037  |a kraujomaiša |b 616.89-008.442.38 |2 msac-lit

• or from the point of view of sociology

– 69007  |a incest |2 msac-eng

– 69037  |a kraujomaiša |b 316.835.2 |2 msac-lit
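
The step from the authority record to the bibliographic fields follows a visible pattern in these examples: tag 750 becomes 690, the indicator characters are kept, subfield c is dropped, and only the language variant whose UDC notation matches the cataloguer's choice is copied. A minimal sketch of that pattern, assuming fields are held as `(tag, indicators, subfields)` tuples; `AUT_INCEST` and `to_bib_fields` are names invented for the example.

```python
# Part of the Lithuanian authority record shown earlier, as (tag, ind, subfields).
AUT_INCEST = [
    ("089", "", [("a", "343.542.5"), ("c", "criminal law"), ("2", "msac-udc")]),
    ("750", "07", [("a", "incest"), ("2", "msac-eng")]),
    ("750", "37", [("a", "kraujomaiša"), ("b", "343.542.5"),
                   ("c", "baudžiamoji teisė"), ("2", "msac-lit")]),
    ("750", "37", [("a", "kraujomaiša"), ("b", "616.89-008.442.38"),
                   ("c", "psichiatrija"), ("2", "msac-lit")]),
    ("750", "37", [("a", "kraujomaiša"), ("b", "316.835.2"),
                   ("c", "sociologija"), ("2", "msac-lit")]),
]


def to_bib_fields(authority_fields, chosen_udc):
    """Build 690 fields for a BIB record from the 750 fields of an AUT record."""
    bib = []
    for tag, ind, subfields in authority_fields:
        if tag != "750":
            continue
        values = dict(subfields)
        # Language variants carry a notation in $b; keep only the chosen one.
        if "b" in values and values["b"] != chosen_udc:
            continue
        # The UDC qualifier ($c) is not carried over to BIB records.
        kept = [(code, value) for code, value in subfields if code != "c"]
        bib.append(("690", ind, kept))
    return bib


for field in to_bib_fields(AUT_INCEST, "343.542.5"):
    print(field)
```

The `chosen_udc` argument stands for the cataloguer's intellectual decision; the function only automates the copying that follows it.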


What is the role of UIG in MSAC?

• MSAC searching scenario is based on search capabilities of Uniform Information Gateway (UIG)

• Uniform Information Gateway (UIG) provides simultaneous searching in different Czech and foreign resources (library catalogues, union catalogues, full text databases etc.) through one user interface

• It is based on the MetaLib metasearch system

• UIG MetaLib uses Z39.50 for communication; cooperating libraries should therefore provide access via the Z39.50 protocol
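
A subject or title search sent over Z39.50 is usually expressed with Bib-1 attributes. The sketch below only builds query strings in the prefix query notation (PQF) used by the YAZ toolkit; it does not contact any server. Bib-1 use attribute 4 (title) and 21 (subject heading) are standard, but which attributes the MSAC target catalogues actually accept is an assumption, as are the names `BIB1_USE`, `pqf` and `pqf_or`.

```python
# Bib-1 "use" attributes for the two search types; whether a given
# target catalogue supports them is an assumption.
BIB1_USE = {"title": 4, "subject": 21}


def pqf(index, term):
    """Build a single-term PQF query, e.g. '@attr 1=21 "incest"'."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'@attr 1={BIB1_USE[index]} "{escaped}"'


def pqf_or(index, terms):
    """Combine several terms with @or, e.g. an English and a national variant."""
    queries = [pqf(index, t) for t in terms]
    result = queries[0]
    for query in queries[1:]:
        result = f"@or {result} {query}"
    return result


print(pqf("subject", "incest"))
print(pqf_or("subject", ["incest", "kraujomaiša"]))
```

Combining the English equivalent with a national-language variant in one `@or` query is one way a metasearch layer could reach records indexed in either language.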


MSAC searching scenario: easy to use

• the list of categories offered in the Metasearch feature was extended by a new category of MSAC libraries for testing

• the category contains the library catalogues of national libraries cooperating on the creation of the multilingual subject access tool - Multilingual Subject Access to Library Catalogues (MSAC)

• when searching in the category of MSAC libraries, we start in the MetaSearch module using the Advanced search procedure, in which we can search by subject or title

Screenshots (slides 33-50, images not reproduced):

• UIG - Welcome page

• UIG – MSAC libraries

• Metasearch module – advanced search

• Results by databases

• Metasearch results

• From the Lithuanian database

• From the Czech database

• From the Slovenian database

• From the Slovak database

• From the Latvian database

• Continuing the search process

• Introducing a more specific term

• (three untitled screenshots)

• From the Lithuanian database

• From the Slovenian database

• From the Czech database

Future development

• the idea of creating a multilingual subject retrieval tool, or of introducing a mapping scheme into existing systems, is considered an essential element of The European Library service

• MSAC project – beginning phase

• problems:

– only voluntary work by teams of the participating libraries

– communication mostly via e-mail only

– no external financial support

• new perspective:

– integration into The European Library (TEL)

– joining the MACS project

– using the MACS Link Management System as a central source (clearinghouse) of mapping results (including subject headings and classification links)


Multilingual subject access - a challenge

Thank you for your attention!

Special thanks to

Rita Maciulevičienė

Špela Razpotnik

Romano Krauth

Senka Naumovska

Hanka Peťová

Anna Maulina

„Subject team“ from the Czech National Library

MSAC: http://sigma.nkp.cz/eng/auv

CZENAS: http://sigma.nkp.cz/eng/aut

JIB: http://www.jib.cz/V?RN=205587682
