Experience so far
• 8000+ records through CAI for ETDE
• 3500+ from “UK” published journals
• About 2700 reviewed by UK for selection
• Of these about 1000 out of scope, rejected
• About 1700 selected and indexing checked so far
Overall impressions
• Many terms generated by CAI
• Usually 25-30
• UK indexers usually reduce to 10-15
• All records received have had some indexing intervention
• Need to remove terms not useful for searching unless primary theme
– E.g. heat, operation, availability, uses
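The pruning rule in the last bullet can be sketched in a few lines of Python. This is only an illustration: the set `GENERIC_TERMS` (filled from the examples above) and the function `prune_terms` are hypothetical names, not part of CAI or the ETDE tools.

```python
# Hypothetical list of terms judged too generic for searching,
# taken from the examples on this slide.
GENERIC_TERMS = {"heat", "operation", "availability", "uses"}


def prune_terms(candidate_terms, primary_themes=()):
    """Drop generic terms unless the indexer marked them as a primary theme."""
    keep_anyway = {t.lower() for t in primary_themes}
    return [
        term
        for term in candidate_terms
        if term.lower() not in GENERIC_TERMS or term.lower() in keep_anyway
    ]


terms = ["heat", "solar collectors", "operation", "heat pumps", "uses"]
print(prune_terms(terms))                           # generic terms removed
print(prune_terms(terms, primary_themes=["heat"]))  # "heat" kept as primary theme
```

The decision about what counts as a primary theme still sits with the indexer; the sketch only applies that decision consistently.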
Multiple meanings (1) - Real index terms
• Examples include
– Emission (carbon dioxide or photons etc.)
– Buses (vehicles or electrical)
– SMEs (Small and Medium Enterprises or Superconducting Magnetic Energy Storage)
– Solutions (answers or solute in solvent)
– Plants (power plants or vegetation)
– Bees (insects or computer model)
Multiple meanings (2) - Common words
• Examples
– Lead (metal, noun) for lead (verb)
– Currents (electrical) for current (at present)
– WHO (organisation) for who (interrogative pronoun)
– Dates (fruits) for dates (calendar)
– Gadolinium phosphides for GDP
Multiple meanings (3) - Comments
• Not possible just to use hidden term
• Can the context be recognised?
• Is there any other way round this?
• Does it matter for most users?
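One possible way to approach the context question is sketched below in Python. It is a toy illustration under stated assumptions, not a description of how CAI works: the cue words in `CONTEXT_CUES` are invented for the example, and `guess_sense` and `mentions_acronym` are hypothetical helpers.

```python
import re

# Hypothetical cue words per sense; a real system would need far richer evidence.
CONTEXT_CUES = {
    "buses": {
        "vehicles": {"passenger", "diesel", "transport", "route"},
        "electrical": {"voltage", "substation", "busbar", "grid"},
    },
    "plants": {
        "power plants": {"power", "turbine", "generation", "reactor"},
        "vegetation": {"species", "growth", "leaves", "soil"},
    },
}


def guess_sense(term, text):
    """Return the sense with the most distinct cue words present, or None if unclear."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {sense: len(cues & words) for sense, cues in CONTEXT_CUES[term].items()}
    ranked = sorted(scores.values(), reverse=True)
    if ranked[0] == 0 or (len(ranked) > 1 and ranked[0] == ranked[1]):
        return None  # no evidence, or a tie: leave the decision to the indexer
    return max(scores, key=scores.get)


def mentions_acronym(acronym, text):
    """Case-sensitive whole-word match, so 'WHO' does not match 'who'."""
    return re.search(r"\b" + re.escape(acronym) + r"\b", text) is not None
```

A case-sensitive match separates WHO from who only when the text is in normal case; titles set entirely in capitals would still need a manual check, and ties such as "Growth of plants near power stations" are deliberately returned as undecided.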
Missing index terms
• Examples:
– Thailand for Thai
– USA for Upper North West
– Cyanobacteria for Spirulina
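A check for missing terms of this kind could look like the Python sketch below. It assumes a simple lookup from a word in the text to the index term that should accompany it; the table `PROPOSED_MAPPINGS` and the function `missing_descriptors` are hypothetical and use only two of the examples above.

```python
import re

# Hypothetical lookup: word found in the text -> index term that should be added.
PROPOSED_MAPPINGS = {
    "thai": "Thailand",
    "spirulina": "Cyanobacteria",
}


def missing_descriptors(text, assigned_terms):
    """Return index terms suggested by the text that are not yet assigned."""
    assigned = {t.lower() for t in assigned_terms}
    suggestions = []
    for word in re.findall(r"[a-z]+", text.lower()):
        descriptor = PROPOSED_MAPPINGS.get(word)
        if (
            descriptor
            and descriptor.lower() not in assigned
            and descriptor not in suggestions
        ):
            suggestions.append(descriptor)
    return suggestions


print(missing_descriptors("Spirulina cultivation in Thai ponds", ["Aquaculture"]))
```

Suggestions are returned in the order the trigger words appear in the text, so the indexer can review them against the record.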
Time to index
• UK indexers found it very time consuming initially
• But after 150-200 records indexed, faster than manual indexing
• May require extra time if there are missing index terms
810 author terms
• Useful to understand author’s view of “themes”
• Rarely used to generate new terms
• On balance, not as useful as hoped
• If cost neutral, recommend keep
Effects on searching
• If not manually checked:
– Potential to miss records
– Potential to retrieve incorrect records
– Database quality impacts
– Search strategy impacts
– May increase time for expert searching
– But probably few effects for naïve searchers
Conclusions
• Faster than manual indexing once learned
• Indexer must be alert for incorrect terms
• Indexer must be aware of meanings of terms used in INIS/ETDE thesaurus to capture multiple meanings
• Indexer must add missing terms
• Indexer should propose hidden terms
• Still currently requires manual intervention on every ETDE record to index correctly