ALEG Project Technical System Implementation Update Kent Fitch 8 July 2000



ALEG Project Technical System Implementation


Kent Fitch

8 July 2000


• Introduction
• Stage 1: report summary
• Stage 2: progress report
  – the big picture
  – data model
  – visual design
  – implementation options

• Stage 3 + : timetable, what to expect


• Kent Fitch, Project Computing Pty Ltd
  – applications programmer/analyst
  – database administrator
  – IBM systems & network programmer
  – UNIX systems & network programmer
  – commercial & public-domain software developer
  – web developer (since 1994)
  – not affiliated with commercial software suppliers

Stage 1 : report summary

• ALEG goals:
  – unify ALEG resources under a single data model and software/hardware platform
    • decentralised control
    • consolidated resource, but
      – partners’ identity/profile not subsumed
      – separate, customisable views
  – web based maintenance and access
    • as simple and powerful as possible


Stage 1 : report summary

• ALEG goals (continued)

  – query & data retrieval
    • Z39.50
    • XML, customisable presentation
  – links to external resources
    • holdings (Kinetica?)
    • manuscripts and archive items (RAAM?)

Stage 2 : progress report

• The big picture

• Data model

• Visual design

• Operational issues

• Implementation options

Stage 2 : progress report

• Stage 2 is the “detailed design stage”
  – Most important: agree on data model
  – Then “motherhood” issues:
    • “nice” user interface (simple, fast, powerful, flexible…)
    • “wonderful” operational characteristics (simple, fast, powerful, flexible, robust, cheap)
      – security, charging, logging...
  – And finally:
    • evaluate implementation options against desires / requirements
    • implementation plan
      – including data take-up

Stage 2 : the big picture

• Library systems are facing challenges
  – recognition that existing data models are not adequate, especially given:
    • declining budgets
    • growth in resources (especially electronic)
    • “challenge” from web-based resource discovery systems (e.g. Google)

Stage 2 : the big picture

• “The web changes everything”
  – charging model
  – user interface expectations
  – interconnectedness of all things…
    • metadata
    • Topic Maps
    • XLink
  – users expect everything, all the time, for free

Stage 2 : the big picture

• New data models:
  – Dublin Core: simple 15 element metadata set
  – IFLA’s FRBR: Functional Requirements for Bibliographic Records
  – INDECS: INteroperability of Data in E-Commerce Systems
  – CIDOC’s CRM: object oriented reference model for cultural heritage
  – Harmony project ABC: common cross-realm entities (work, event, agent)

…more on this later
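
To make the "simple 15 element metadata set" concrete, here is a minimal sketch that lists the Dublin Core elements and checks a record against them. The sample record, the `aleg:` identifier scheme and the helper name `unknown_elements` are assumptions made for illustration only; they are not part of the ALEG design.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def unknown_elements(record):
    """Return the keys of `record` that are not Dublin Core elements, sorted."""
    return sorted(set(record) - DC_ELEMENTS)


# Hypothetical record; the identifier scheme is invented for this example.
voss = {
    "title": "Voss",
    "creator": "White, Patrick",
    "type": "Text",
    "language": "en",
    "identifier": "aleg:work/0001",
}

print(unknown_elements(voss))                                # no unknown keys
print(unknown_elements({"title": "Voss", "genre": "novel"}))  # "genre" is not a DC element
```

A flat set like this is easy to exchange, which is its appeal; it says nothing about relationships between records, which is where the richer models in the list come in.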

Stage 2 : the big picture

• Where does ALEG fit in all of this?
  – about literary resources, but not a library catalogue
  – emphasis on resources which help interpret and understand:
    • reviews/criticisms
    • setting & subject classification
    • biographical material
    • archival items
    • linking, relationships


Stage 2 : the big picture

• Where does ALEG fit in all of this? (continued)

  – central database, but decentralised control and separate, customisable views
  – web based maintenance and access
  – linking to full text
    • via SETIS & others for large works
    • scanned reviews/articles (copyright issues)
    • electronic journals and other resources


Stage 2 : the big picture

• Where does ALEG fit in all of this? (continued)

  – integration with external electronic resources
    • as a user:
      – holdings (Kinetica?)
      – manuscripts (RAAM/son-of-RAAM?)
    • as a provider:
      – Z39.50 target
      – support specialised views, deliver data in XML format
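
As a rough illustration of "deliver data in XML format", the sketch below serialises one record with Python's standard `xml.etree.ElementTree`. The element names (`record`, `title`, `creator`), the identifier and the function name `record_to_xml` are assumptions for the example; the slides do not define an ALEG XML schema.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record_id, fields):
    """Serialise (name, value) pairs as child elements of a <record> element."""
    root = ET.Element("record", {"id": record_id})
    for name, value in fields:
        child = ET.SubElement(root, name)
        child.text = value
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(root, encoding="unicode")


# Hypothetical record and identifier, for illustration only.
xml_text = record_to_xml(
    "aleg:work/0001",
    [("title", "Voss"), ("creator", "White, Patrick")],
)
print(xml_text)
```

Because the output is plain XML, each partner could apply its own stylesheet to it, which is one way to get the "specialised views" without changing the stored data.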

Stage 2 : the big picture

• For ALEG to be relevant

– it must be designed to integrate with and take advantage of the systems of the present and the future, not those of the past

– it must ‘play nicely’ in the ‘semantic web’

Stage 2 : the big picture

• “The systems of the present and the future”
  – web based
  – common metadata based
  – common data model, supporting
    • unambiguous identification
    • rich relationships, explicit and implied
    • interoperability with other systems

Stage 2 : data model

• The data model lays the foundation of a system
  – worth spending a lot of time to “get it right”
• “Getting it right”
  – explicitly represent the objects the system has to deal with
    • adding attributes later is easy
    • adding concepts later is hard
  – an eye to interoperability with other systems and future system and industry directions

Stage 2 : data model

• Being explicit:

“Metadata not descriptions”
  – identifiers, not words
  – relationships, not labels
  – events, not things

Godfrey Rust, Technical Coordinator, INDECS project
Metadata 2010, presentation to British Library Seminar, Sept 99

Stage 2 : data model

• The rest of the world has been doing a LOT of thinking about this too!

  – IFLA FRBR
  – Dublin Core
  – INDECS
  – Harmony ABC
  – Topic Maps….

Stage 2 : data model IFLA FRBR

From: Deconstructing the Library Catalogue, Tom Delsey, National Library of Canada. Presentation to British Library Seminar, Sept 99

[Figure: FRBR entities. Work (realized through) Expression (embodied in) Manifestation (exemplified by) Item]

[Figure: FRBR diagram; only the label "Corporate Body" survives]

[Figure: FRBR subject relationships; surviving labels: Work, Corporate Body, "subject of"]

Stage 2 : data model IFLA FRBR

From: Deconstructing the Library Catalogue, Tom Delsey, National Library of Canada. Presentation to British Library Seminar, Sept 99

Work related to Work:
successor, supplement, complement, summarization, adaptation, transformation, imitation, whole / part

Stage 2 : data model IFLA FRBR

From: Deconstructing the Library Catalogue, Tom Delsey, National Library of Canada. Presentation to British Library Seminar, Sept 99

Expression related to Expression:
abridgement, revision, translation, musical arrangement, whole / part

Stage 2 : data model IFLA FRBR

From: Deconstructing the Library Catalogue, Tom Delsey, National Library of Canada. Presentation to British Library Seminar, Sept 99

Manifestation related to Manifestation:
reproduction, alternate, whole / part

Stage 2 : data model IFLA FRBR

From: Deconstructing the Library Catalogue, Tom Delsey, National Library of Canada. Presentation to British Library Seminar, Sept 99

Item related to Item:
reconfiguration, reproduction, whole / part
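
The FRBR chain and its typed relationships can be sketched as plain data classes. This is only an illustration of the model's shape, not a proposed ALEG schema: the class fields, the labels "edition A" and "library 1", and the helper `count_items` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Item:
    """A single exemplar, e.g. one copy held by one library."""
    location: str


@dataclass
class Manifestation:
    """An embodiment of an expression, e.g. one edition."""
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """One realization of a work, e.g. the original text or a translation."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract creation; `related` holds (relationship type, other work)."""
    title: str
    expressions: List[Expression] = field(default_factory=list)
    related: List[Tuple[str, "Work"]] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Walk Work -> Expression -> Manifestation -> Item and count the items."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# Placeholder editions and locations, invented for illustration.
novel = Work("Voss (novel)", [
    Expression("original text", [
        Manifestation("edition A", [Item("library 1"), Item("library 2")]),
    ]),
])
# Work-to-work relationship, using one of the types listed on the slide.
opera = Work("Voss (opera)", related=[("adaptation", novel)])

print(count_items(novel))
print([(kind, other.title) for kind, other in opera.related])
```

Keeping the relationship type as data rather than as a fixed field is what lets the same structure carry successor, supplement, adaptation and the other types without a schema change.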

Stage 2 : data model INDECS

From: A Common Model to Support Interoperable Metadata - Progress report on reconciling metadata requirements from the Dublin Core and INDECS/DOI Communities, D-Lib January 1999 - Bearman/Miller/Rust/Trant/Weibel

Stage 2 : data model

• Topic Maps
  – “A thesaurus on steroids”
  – “The Global Positioning System for the Web”
    – Charles Goldfarb, SGML
  – ISO standard, influenced by HyTime (SGML)
  – a framework for defining topics, associations, scopes, occurrences and attributes (facets) separate from an underlying data base

Stage 2 : data model Topic Maps

• Traditional approach
  • data bases are self contained
  • others can’t ‘point in’ and make assertions and build relationships about your data
• but enter:
  • common markup, metadata and semantics
  • universal addressability
  • “the web”
• and then...
  • relationships and assertions can be separated from the base data

Stage 2 : data model Topic Maps

• So what?
  – Many different views and organisations of the base data can be supported
  – Building and maintaining topics and associations can become a specialist task
    • divide and conquer
    • plurality of interpretations and values
  – Base data can be easily reused, reinterpreted, combined with other databases
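
The idea of separating relationships and assertions from the base data can be sketched as two independent structures: records keyed by identifier, and a list of associations that merely point at those identifiers. The identifiers, association names and the helper `associations_for` are assumptions for illustration; this is not the Topic Maps interchange syntax.

```python
# Base data: owned by the database, keyed by identifier.
BASE = {
    "agent/white": "White, Patrick",
    "agent/malouf": "David Malouf",
    "work/voss-novel": "Voss (novel)",
    "work/voss-opera": "Voss (opera)",
    "place/london": "London",
}

# Association layer: maintained separately, it only refers to identifiers.
# Each entry is (subject id, association type, object id).
ASSOCIATIONS = [
    ("agent/white", "author of", "work/voss-novel"),
    ("agent/malouf", "librettist of", "work/voss-opera"),
    ("work/voss-opera", "adaptation of", "work/voss-novel"),
    ("agent/white", "born in", "place/london"),
]


def associations_for(topic_id):
    """List (association type, label of the other topic) for one topic.

    Associations where the topic is the object are marked "(inverse)".
    """
    found = []
    for subject, kind, obj in ASSOCIATIONS:
        if subject == topic_id:
            found.append((kind, BASE[obj]))
        elif obj == topic_id:
            found.append((kind + " (inverse)", BASE[subject]))
    return found


for kind, label in associations_for("work/voss-opera"):
    print(kind, "->", label)
```

A second list of associations, built by a different specialist with a different interpretation, could sit alongside this one without touching `BASE`, which is the "plurality of interpretations" point above.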

Stage 2 : data model Topic Maps

[Figure: topic map example. Topics shown: White, Patrick; Voss (novel); Tree of Man; Flaws in the Glass; David Malouf; Voss (opera); Fly away Peter]

Stage 2 : data model Topic Maps

[Figure: topic map example, continued. Topics shown: St Kilda; 20th Century; Acme Armaments Factory; Leura; Katoomba; Blue Mtns]

Stage 2 : data model Topic Maps

[Figure: topic map example, continued. Surviving labels: London; born; lived in]


Stage 2 : data model


• Proposed ALEG data model
  – Titles and agents:
    • FRBR, as expanded/interpreted by INDECS
  – Subjects:
    • Topic Maps
  – Plus:
    • Award, archival item
  – Minus:
    • FRBR Item (but what about holdings?)

Stage 2 : data model


Stage 2 : visual design

“The practice of simplicity”

Stage 2 : visual design

The help manual for a well designed system:

Stage 2 : visual design

The face of the most successful search engine:

Stage 2 : visual design

• Some approaches to searching:
  – detailed search form
  – refinement
  – “intelligent” grouping/ranking of results
• Not mutually exclusive
• Regardless of approach:
  – search from every screen
  – users don’t understand boolean searches
    – (“cats and dogs”)
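
One simple reading of "grouping of results" is to count hits per agent and per form, so that a long flat list becomes a short summary the user can open up. The sketch below assumes each hit is an (agent, form) pair; the sample hits and the function name `group_counts` are invented for illustration.

```python
from collections import Counter


def group_counts(hits):
    """Group (agent, form) hits into {agent: Counter({form: count})}."""
    grouped = {}
    for agent, form in hits:
        grouped.setdefault(agent, Counter())[form] += 1
    return grouped


# Invented sample hits; real counts would come from the database.
sample_hits = [
    ("White, Patrick (1912-1990)", "Novels"),
    ("White, Patrick (1912-1990)", "Novels"),
    ("White, Patrick (1912-1990)", "Dramas"),
    ("White, Patrick A.T", "Verse"),
]

for agent, forms in group_counts(sample_hits).items():
    print(agent, "-", sum(forms.values()), "results")
    for form, count in forms.most_common():
        print("   ", form, "-", count)
```

Grouping like this needs no boolean operators from the user: a plain name search returns the groups, and refinement is just a click on one of them.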

Stage 2 : visual design Detailed



Stage 2 : visual design

ALEG Search Results for: “Patrick White”

1. White, Patrick (1912-1990) - 1200 results
2. White, Patrick ([1984]-) - 3 results
3. White, Patrick A.T - 2 results


Stage 2 : visual design

ALEG Search Results for: “Patrick White”

1. White, Patrick (1912-1990) - 1200 results
   Biographical Details
   Works By - 140
     Short Stories - 27
     Dramas - 22
     Novels - 13
     Verse - 8
     Criticism - 1
   Subject - 256 results
   Reviews of Works - 1134 results
2. White, Patrick ([1984]-) - 3 results
3. White, Patrick A.T - 2 results







Stage 2 : visual design

ALEG Search Results for: “Patrick White”

1. White, Patrick (1912-1990) - 1200 results
   Biographical Details
   Works By - 140
     Short Stories - 27
     Dramas - 22
     Novels - 13
       Voss
         Criticism of - 56
         As subject - 23
         Related Work - 2
       The Tree of Man
       The Vivisector








Stage 2 : visual design

ALEG Search Results for: “Patrick White”

1. White, Patrick (1912-1990) - 1200 results
   Biographical Details
   Works By - 140
     Short Stories - 27
     Dramas - 22
     Novels - 13
       Voss
         Criticism of - 56
         As subject - 23
         Related Work - 2
       The Tree of Man
       The Vivisector

Search scope: White, Patrick (1912-1990)

Search term : David Malouf


Stage 2 : visual design

• Searching for “David Malouf” within the scope of “Patrick White (1912 - 1990)”
  – Would return
    • references to “Voss [the Opera]”
    • David Malouf reviews of Patrick White works and vice versa
    • ...
  – What about:
    • a review by David Malouf of Voss written under a pseudonym?
    • Searching for graveyard
      – would it return subjects of cemeteries? (automatically?)
      – how would “related” terms be shown?
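
The questions above are left open in the slides. One possible answer, sketched here under stated assumptions, is to search on agent identifiers rather than on names (so a pseudonym resolves to the same agent) and to expand subject terms through a table of related terms. The pseudonym "A. Reviewer", the identifiers, the sample records and all function names are hypothetical.

```python
# Hypothetical lookup tables; real ones would be maintained in the database.
NAMES = {"david malouf": "agent/malouf", "patrick white": "agent/white"}
PSEUDONYMS = {"a. reviewer": "agent/malouf"}   # invented pseudonym
RELATED_TERMS = {"graveyard": {"cemeteries"}}

# Invented review records: which agent they are about, and who wrote them.
RECORDS = [
    {"id": "rev/1", "scope": "agent/white", "reviewer": "agent/malouf",
     "credited_as": "A. Reviewer"},
    {"id": "rev/2", "scope": "agent/white", "reviewer": "agent/other"},
    {"id": "rev/3", "scope": "agent/other", "reviewer": "agent/malouf"},
]


def resolve_agent(name):
    """Map a real name or pseudonym to an agent identifier, or None."""
    key = name.strip().lower()
    return NAMES.get(key) or PSEUDONYMS.get(key)


def expand_subject(term):
    """Return the term plus any related subject terms."""
    key = term.strip().lower()
    return {key} | RELATED_TERMS.get(key, set())


def search_in_scope(scope_id, agent_name):
    """Find records within a scope whose reviewer is the named agent."""
    agent_id = resolve_agent(agent_name)
    if agent_id is None:
        return []
    return [r["id"] for r in RECORDS
            if r["scope"] == scope_id and r["reviewer"] == agent_id]


print(search_in_scope("agent/white", "David Malouf"))  # finds the pseudonymous review
print(sorted(expand_subject("graveyard")))
```

Whether expansion should happen automatically, or be offered to the user as a suggestion, is exactly the design question the slide raises; the sketch only shows that the data model can support either.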

Stage 2 : visual design

ALEG Title Maintenance

Work Title:
Type: single work
Form:
Genre:

Agent
   Biographical Details
   Works By - 140
     Short Stories - 27
     Dramas - 22
     Novels - 13
       Voss
         Criticism of - 56
         As subject - 23
         Related Work - 2
       The Tree of Man
       The Vivisector

One Tree Hill+

Stage 2 : implementation options

• Three components
  – web front end data maintenance
  – web front end data retrieval
  – back end data storage
    • including Z39.50 target

Stage 2 : implementation options

• Browser based maintenance system
  – it is now possible to build very sophisticated applications in the browser
  – technologies:
    • Document Object Model (DOM)
    • Cascading Style Sheets (CSS)
    • JavaScript
  – Unfortunately, Netscape has not kept up
    • IE 5 is much more functional for web applications
    • some hope that “Mozilla” will be competitive

Stage 2 : implementation options

• Browser based user access interface
  – choice between power and platform dependencies
    • target “lowest common denominator”?
    • several popular targets (Netscape 2/4, IE5)?
    • As at 18 June, reports browser market share as:
      – Internet Explorer 86%
      – Netscape 14%
  – accessibility issues are very important
    • making the facility available to all users

Stage 2 : implementation options

• Adapt existing proprietary library catalogue
  – ADFA has just implemented Voyager
• Build something
  – based on proprietary text or XML system
    • dbTextWorks, Blue Angel, Tamino, ...
  – based on “standard” tools
    • SQL, Apache, Java XML packages, HTML

Stage 2 : implementation options

• Adapt existing proprietary library catalogue
  – Pro
    • because it is there
    • Z39.50 interface and simple web interface likely to be ready to
  – Con
    • ALEG is not a library catalogue
    • compromises to data model, maintenance screens, user interface, access model
    • hard to estimate risks and cost
    • vendor lock-in

Stage 2 : implementation options

• Build something based on proprietary text or XML system
  – Pro
    • some infrastructure may be ready to use
  – Con
    • initial software cost, plus training and maintenance
    • have to build nearly everything
    • vendor lock-in

Stage 2 : implementation options

• Build something based on standard tools
  – Pro
    • free tools
    • no vendor lock-in
    • no constraints/complete flexibility
    • no “unknowns”
  – Con
    • have to build nearly everything
    • requires “faith”
    • no vendor “support”

Stage 3 + : what to expect

• Timetable
  – Stage 2 to do:
    • user interface principles
    • access/logging principles
    • evaluate implementation options, decide
    • data conversion plan
    • produce implementation timetable
    • done by 21st July (2 weeks late)


Stage 3 + : what to expect

• Timetable (continued)

  – Stage 3: 24 July - 17 October:
    • development of prototype
    • design/implement database
    • design/implement prototype user interfaces
    • trial data load from AUSTLIT, BAL, Lu Rees (+ ?)
    • hardware/software commissioning
    • 8 September: early demo (basic maintenance, search, browse functions) for partner feedback


Stage 3 + : what to expect

• Timetable (continued)

  – Stage 4: 18 October - 8 January
    • refinement of prototype into production system
      – incorporate feedback
      – complete all user interface functions
      – add Z39.50 target, MARC extract, XML/XSL extract
      – add user maintenance/accounting
    • complete data load
    • volume and load testing
  – Stage 5: 9 January - 18 January
    • production
