ETDE: UK Comments on CAI batches Heather Cholerton UK Technical Support


ETDE: UK Comments on CAI batches

Heather Cholerton

UK Technical Support

Experience so far

• 8000+ records through CAI for ETDE

• 3500+ from “UK” published journals

• About 2700 reviewed by UK for selection

• Of these about 1000 out of scope, rejected

• About 1700 selected and indexing checked so far

Overall impressions

• Many terms generated by CAI

• Usually 25–30

• UK indexers usually reduce to 10–15

• All records received have had some indexing intervention

• Need to remove terms not useful for searching unless primary theme
  – E.g. heat, operation, availability, uses
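The pruning step described above can be sketched in a few lines of Python. This is only an illustration, not the CAI system's or the UK indexers' actual procedure: the names `LOW_VALUE_TERMS` and `prune_terms` are hypothetical, and the term list contains only the four examples given on the slide.

```python
# Hypothetical list of terms that rarely help searching.
# The four entries are the examples from the slide; a real list would be longer.
LOW_VALUE_TERMS = {"heat", "operation", "availability", "uses"}


def prune_terms(generated_terms, primary_themes=()):
    """Drop low-value terms unless they are a primary theme of the record."""
    themes = {t.lower() for t in primary_themes}
    kept = []
    for term in generated_terms:
        key = term.lower()
        # Skip a low-value term only when it is not a primary theme.
        if key in LOW_VALUE_TERMS and key not in themes:
            continue
        kept.append(term)
    return kept
```

For a record about heat pumps, where heat is a primary theme, calling `prune_terms(terms, primary_themes=["heat"])` keeps "heat" and still drops the other low-value terms.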

Multiple meanings (1) Real index terms

• Examples include
  – Emission (carbon dioxide or photons etc.)
  – Buses (vehicles or electrical)
  – SMEs (Small and Medium Enterprises or Superconducting Magnetic Energy Storage)
  – Solutions (answers or solute in solvent)
  – Plants (power plants or vegetation)
  – Bees (insects or computer model)

Multiple meanings (2) - Common words

• Examples
  – Lead (metal, noun) for lead (verb)
  – Currents (electrical) for current (at present)
  – WHO (organisation) for who (interrogative pronoun)
  – Dates (fruits) for dates (calendar)
  – Gadolinium phosphides for GDP

Multiple meanings (3) - Comments

• Not possible just to use hidden term

• Can the context be recognised?

• Is there any other way round this?

• Does it matter for most users?
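One way to explore the question of whether context can be recognised is a simple cue-word check. The sketch below is an assumption for illustration only: `CONTEXT_CUES`, the cue words, and `guess_sense` are all hypothetical and are not part of the CAI system. It returns a sense only when exactly one sense has supporting cue words in the text, and otherwise leaves the decision to the indexer.

```python
import re

# Hypothetical cue words for two of the ambiguous terms listed on the slides.
CONTEXT_CUES = {
    "buses": {
        "vehicles": {"transport", "passengers", "diesel", "route"},
        "electrical": {"voltage", "substation", "busbar", "grid"},
    },
    "plants": {
        "power plants": {"power", "generation", "turbine", "reactor"},
        "vegetation": {"growth", "species", "soil", "leaves"},
    },
}


def guess_sense(term, text):
    """Return the sense best supported by cue words, or None if undecided."""
    senses = CONTEXT_CUES.get(term.lower())
    if not senses:
        return None  # term is not in the (hypothetical) cue table
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Count how many cue words for each sense appear in the text.
    scores = {sense: len(cues & words) for sense, cues in senses.items()}
    best = max(scores.values())
    winners = [sense for sense, n in scores.items() if n == best]
    if best == 0 or len(winners) > 1:
        return None  # no evidence, or a tie: leave it to the indexer
    return winners[0]
```

A check of this kind could at most flag likely senses. It would not remove the need for the indexer to confirm the term.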

Missing index terms

• Examples:
  – Thailand for Thai
  – USA for Upper North West
  – Cyanobacteria for Spirulina
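Missing terms of this kind could in principle be suggested from a lookup of text forms to thesaurus terms. The sketch below is a minimal illustration under stated assumptions: `HIDDEN_TERMS` and `suggest_missing_terms` are hypothetical names, and the table holds only two of the single-word examples from the slide.

```python
import re

# Hypothetical lookup: word form seen in a record -> index term it implies.
# Entries are taken from the slide's examples.
HIDDEN_TERMS = {
    "thai": "Thailand",
    "spirulina": "Cyanobacteria",
}


def suggest_missing_terms(text, assigned_terms):
    """Suggest index terms implied by the text but not yet assigned."""
    assigned = {t.lower() for t in assigned_terms}
    words = set(re.findall(r"[a-z]+", text.lower()))
    suggestions = []
    for form, term in sorted(HIDDEN_TERMS.items()):
        if form in words and term.lower() not in assigned:
            suggestions.append(term)
    return suggestions
```

For a record titled "Spirulina cultivation on Thai farms" with no matching terms assigned, the function suggests both "Cyanobacteria" and "Thailand" for the indexer to review.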

Time to index

• UK indexers found it very time consuming initially

• But after 150–200 records indexed, faster than manual indexing

• May require extra time if there are missing index terms

810 author terms

• Useful to understand author’s view of “themes”

• Rarely used to generate new terms

• On balance, not as useful as hoped

• If cost neutral, recommend keep

Effects on searching

• If not manually checked:
  – Potential to miss records
  – Potential to retrieve incorrect records
  – Database quality impacts
  – Search strategy impacts
  – May increase time for expert searching
  – But probably few effects for naïve searchers

Conclusions

• Faster than manual indexing once learned

• Indexer must be alert for incorrect terms

• Indexer must be aware of meanings of terms used in INIS/ETDE thesaurus to capture multiple meanings

• Indexer must add missing terms

• Indexer should propose hidden terms

• Still currently requires manual intervention on every ETDE record to index correctly