A Science Information Committee (SIC) presentation authored by Smith, V.S., Blagoderov, V., Kitching, I. and Simonsen, T., given at the Natural History Museum, London, UK, on May 14th, 2010.
Scaling-up collections digitisation
Vincent S. Smith, Vladimir Blagoderov, Ian Kitching & Thomas Simonsen
“the rate of progress by the UK taxonomic institutions in digitising and making collections information available is disappointingly low… there is a significant risk of damage to the international reputation of major institutions such as The Natural History Museum”
House of Lords Science and Technology Committee, Report on Taxonomy and Systematics, 2009
Rate of digitisation at the NHM
Specimen focus
SatScan™ (by SmartDrive)
Example outputs
Diptera: http://sciaroidea.info/node/44309
Coreidae: http://sciaroidea.info/node/44310
Sackler Lab Trials
Nine test projects over 1 month (ent., bot. & palaeoent.)
- Assess utility for coll. management and research
- Understand technical & practical limitations
Key Facts
• Minimal resolved structures: 0.06 - 0.1 mm
• Depth of field: 10 - 80 mm
• File size (15000 x 14000): 340 MB (TIFF)
• Scanning time (45 x 50 cm): 5-7 min, depending on exposure
• Stitching time, 200-400 tiles: 5:30-9:30 min (batchable, overnight)
Sackler Lab Trials: Aperture, Exposure, Depth of Field & Resolution
Aperture                             Open   Midway   Closed
Exposure (ms)                        11     41       810
DoF (mm)                             6      17       80
Smallest resolvable structure (µm)   56     59       98
Entomology dept.
General points
• Best suited to drawers of numerous, uniformly positioned, medium-sized specimens
• Excellent results with large and medium-sized beetles, moths and butterflies
• Sufficient information is usually preserved to allow identification of these specimens
• Objects smaller than 10 mm could not be imaged as adequately
• Such images could be used in other ways
• Specimen labels and barcodes (when not obscured) could be easily read from the digitised image
Implications
• Of the 135,000 drawers in Entomology, 85,000 could be usefully imaged at the current level of resolution with this system
• This work could be completed in ~2024 person-days (ten person-years) using one system
• Other lens / camera options might be explored to image the remaining drawers
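The workload figures above imply a throughput that can be checked with simple arithmetic. The sketch below derives drawers per person-day and working days per person-year from the quoted numbers. The 7-hour working day used for the minutes-per-drawer figure is an assumption for illustration, not a figure from the trial.

```python
# Figures quoted in the Implications list.
DRAWERS_TO_IMAGE = 85_000
PERSON_DAYS = 2_024
PERSON_YEARS = 10

# ASSUMPTION (not from the presentation): productive hours in a working day.
ASSUMED_HOURS_PER_DAY = 7.0


def implied_rates(drawers, person_days, person_years, hours_per_day):
    """Return (drawers per person-day, working days per person-year,
    minutes available per drawer under the assumed working day)."""
    drawers_per_day = drawers / person_days
    days_per_year = person_days / person_years
    minutes_per_drawer = hours_per_day * 60 / drawers_per_day
    return drawers_per_day, days_per_year, minutes_per_drawer


drawers_per_day, days_per_year, minutes_per_drawer = implied_rates(
    DRAWERS_TO_IMAGE, PERSON_DAYS, PERSON_YEARS, ASSUMED_HOURS_PER_DAY
)
print(f"{drawers_per_day:.1f} drawers per person-day")
print(f"{days_per_year:.1f} working days per person-year")
print(f"{minutes_per_drawer:.1f} minutes per drawer (assumed {ASSUMED_HOURS_PER_DAY} h day)")
```

Under that assumed working day, each drawer gets roughly ten minutes, which leaves a few minutes for handling and focusing on top of the quoted 5-7 min scanning time.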
Caveats
NHM Issues
• Metadata
• Utility of surface (usually dorsal) view images - not a panacea
• Assigning specimen level identifiers (physical, virtual or both)
• Image storage (85k stitched images = 28,222 GB or 27.6 TB)
• Software workflow (managing identifiers, cropping etc.)
• Integration with existing systems (KE EMu and DAMS)
• Challenges to research & collection management processes (e.g. staff time, curation activities)
• Cost: circa £50k (outright purchase) or £2k per month hire
Hardware / Software issues
• Max. scanning area ~ 500 x 600 mm – insufficient for some drawers
• Occasional errors during scanning and stitching
• Focusing (currently time consuming)
• Inconvenient access to scanning area
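The image storage figure in the NHM issues list can be reproduced from the Key Facts file size. This is a minimal sketch, assuming 340 MB per stitched TIFF and binary units (1 GB = 1024 MB, 1 TB = 1024 GB). The binary convention is an assumption, chosen because it matches the quoted 28,222 GB and 27.6 TB.

```python
# Figures from the presentation.
STITCHED_IMAGES = 85_000
MB_PER_IMAGE = 340  # one 15000 x 14000 stitched TIFF

# ASSUMPTION: binary unit steps, which reproduce the quoted totals.
UNIT_STEP = 1024


def storage_estimate(images, mb_per_image, unit_step=UNIT_STEP):
    """Return total storage as (MB, GB, TB)."""
    total_mb = images * mb_per_image
    total_gb = total_mb / unit_step
    total_tb = total_gb / unit_step
    return total_mb, total_gb, total_tb


total_mb, total_gb, total_tb = storage_estimate(STITCHED_IMAGES, MB_PER_IMAGE)
print(f"{total_mb:,} MB = {total_gb:,.0f} GB = {total_tb:.1f} TB")
```

The estimate covers a single copy of the stitched images only. Raw tiles, backups and derived crops would add to it.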
Metadata capture is rate limiting
• Specimen images & metadata need not be captured together
• Link back together through common identifiers
• Specimen level identifiers can be physical, virtual or both
• Assignment of virtual identifiers might be automated
• Prioritise metadata capture on research & collection activities
• Image and re-image as required
• Crowdsource metadata capture, assignment of identifiers and image cropping
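The idea of capturing images and metadata separately and linking them through common identifiers can be sketched as below. Everything in this example is hypothetical: the record types, the `virtual_id` scheme (scan file name plus crop index), the file name and the placeholder label text are illustrative assumptions, not part of the SatScan workflow or of any NHM system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecimenImage:
    specimen_id: str   # common identifier (physical barcode or virtual ID)
    drawer_scan: str   # hypothetical file name of the stitched drawer image
    crop_box: tuple    # (x, y, width, height) of the specimen within the scan


@dataclass(frozen=True)
class SpecimenMetadata:
    specimen_id: str
    label_text: str


def virtual_id(drawer_scan, index):
    """Hypothetical automated scheme: scan name plus crop position."""
    return f"{drawer_scan}#{index:04d}"


def link_records(images, metadata):
    """Join images to metadata on specimen_id; metadata may still be missing."""
    meta_by_id = {m.specimen_id: m for m in metadata}
    return {img.specimen_id: (img, meta_by_id.get(img.specimen_id)) for img in images}


# Placeholder data: two crops from one drawer scan, metadata captured for one.
scan = "drawer_0001.tif"
boxes = [(0, 0, 500, 400), (500, 0, 500, 400)]
images = [SpecimenImage(virtual_id(scan, i), scan, box)
          for i, box in enumerate(boxes, start=1)]
metadata = [SpecimenMetadata(virtual_id(scan, 1), "placeholder label transcription")]

linked = link_records(images, metadata)
for specimen_id, (img, meta) in linked.items():
    status = meta.label_text if meta else "metadata not yet captured"
    print(specimen_id, img.crop_box, status)
```

Because the join tolerates missing metadata, imaging can run ahead of transcription, and records can be completed later as research or curation priorities dictate.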
Possible Applications
Collection management
• Accurate specimen counts for the entire collection
• Collections audit and security
• Improving accessibility to the entire collection
• Saving curator & visitor time
• Improving curation
• Updating identifications (crowdsourcing possibilities)
• Encouraging typification (discovery of unrecognized/unlabelled types)
• Populating KE EMu
Research
• Acquiring images for use with automated identification software
• Manual identifications
• Morphometric analysis of specimens
• Support the monitoring of environmental change
• Supporting biodiversity conservation research
• Studies on colour pattern variations
Public engagement
• Visual & engaging equipment on display in Sackler Lab.
• Innovating crowdsourcing possibilities with the public
• Meets NHM strategic commitments on collection accessibility
Next Steps…
Larger Scale Project to address NHM Issues
Acknowledgements
• SmartDrive Ltd (esp. Mike Broderick & Dennis Murphy)
http://sciaroidea.info/sites/sciaroidea.info/files/SatScanTrialReport.pdf