An update on the Transforming Taxonomic Interfaces Initiative
Matt Yoder, Michael Twidale, Andrea Thomer, Kenney Guo
(seeking feedback)


Quick History of TTI
- Describing phenotypes semantically
- Hymenoptera Anatomy Ontology: unify a complex lexicon
- Application of the HAO to taxonomy: very "old" science, monographic/comprehensive

Why richer semantics?
- Species are numerous, rarely studied: maximize work
- Giant integrated databases
- Nanopublications
- Populate the graph of life with inference
- Calculate species limits

Rapid prototyping of semantic enhancements to biodiversity informatics platforms
- CSCW: conceptual paper
- Semantic Express: can I express X?
- Formalized suite of taxonomists' end-use requirements
- Catalog of annotated wireframe interfaces
- Formalized description of how to rapidly prototype features
- All this to catalyze development of workbenches for taxonomists

Distilling interviews
- ~35 taxonomists
- Monographers, integrative, still novice after 3 years
- Both confirm and ignore frustrations with relative phenotype descriptors
- Lean on images (stacks!)
- Figure out their own workflow
- Rebooting wet-cycles: picking up is not instantaneous
- Validation over multiple sources

Distilling interviews
- Digital work is overwhelmingly desktop based (!cloud)
- Paper is ubiquitous, so is Excel
- Digitization can be extramural
- Enter and refine
- The myth of the collaborative taxonomy?!
Taxonomic Interfaces Hackathon
- Idea: 12 people, proposing ideas for application interfaces, documented in workbooks
- Outcome: a

Representing data (Kenney): Simple Example
- Representations compared: natural language; semantic annotation (CharaParser); EQ using terms; EQ using term IDs; EQ/Manchester
- Natural language: Notaulus shape: falciform
- EQ using terms: Entity = notaulus; Quality = falciform
- EQ using term IDs: Entity = HAO: ; Quality = PATO:
- EQ/Manchester: has part some (notaulus and (is bearer of some falciform))

Representing data (Kenney): Complex Example
- Representations compared: natural language; semantic annotation (CharaParser); EQ using terms; EQ using term IDs; EQ/Manchester
- Natural language: Malar space length: longer than 0.5 of compound eye height
- EQ using terms: Entity = malar line; Quality = length
- EQ using term IDs: Entity = HAO: ; Quality = PATO:
- EQ/Manchester: has part some (malar line and (is bearer of some (length and (has_measurement some ((has_unit some (length and (inheres in some compound eye))) and (has_magnitude some float[>= 0.5f]))))))

Representing data (Kenney): Super Complex Example
- "the antero-dorsally projecting process of the posterior maxilla is located posterior to the laterally projecting and bent knob of the ethmoid in species X" (Dahdul et al., 2010)

Representing data (Kenney): Human Variants
- Natural language: dermal sculpture on skull-roof weak
- curator 1: PATO: poorly developed
- curator 2: PATO: decreased magnitude
- curator 3: PATOTEMP: weakly sculptured surface

Taxonomic information work
- Data are both collected (specimens, measurements, molecular sequences) and created (coded traits) through skill and judgment
- What to record, what not; what's the difference that makes a difference (to tell a species apart)?
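Returning to the "Representing data" examples: the step from an Entity/Quality pair to the EQ/Manchester string can be illustrated with a small sketch. It is not the CharaParser pipeline or any tool used in the project; the names `EQ`, `to_manchester`, and `relative_length` are hypothetical, and term labels are plain strings because the HAO and PATO identifiers are not given in the slides. The string templates simply mirror the two expressions shown on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQ:
    """An Entity-Quality statement using term labels (hypothetical helper)."""
    entity: str   # anatomical term label, e.g. an HAO class label
    quality: str  # quality term label or nested class expression, e.g. from PATO

    def to_manchester(self) -> str:
        # Same pattern as the slides: the organism has a part (the entity)
        # that is the bearer of some quality.
        return f"has part some ({self.entity} and (is bearer of some {self.quality}))"


def relative_length(entity: str, reference_entity: str, minimum_ratio: float) -> str:
    """Build the 'length relative to another structure' pattern of the complex example."""
    measurement = (
        f"(has_unit some (length and (inheres in some {reference_entity}))) "
        f"and (has_magnitude some float[>= {minimum_ratio}f])"
    )
    quality = f"(length and (has_measurement some ({measurement})))"
    return EQ(entity, quality).to_manchester()


# Simple example: "Notaulus shape: falciform"
simple = EQ("notaulus", "falciform").to_manchester()

# Complex example: "Malar space length: longer than 0.5 of compound eye height"
complex_ = relative_length("malar line", "compound eye", 0.5)

if __name__ == "__main__":
    print(simple)
    print(complex_)
```

Keeping the quality as a nestable string is a deliberate simplification: it is enough to show how the complex case reuses the simple pattern, but a real workbench would presumably build expressions over ontology term identifiers rather than labels.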
Species are not described in isolation
- New species concepts are situated amongst existing concepts and literature
- Literally placed in a graph structure
- As much a conceptual engineering task as descriptive

Taxonomic information work
- When a researcher works with data, it's not just through data seeking, entry or retrieval
- Other data sources (species descriptions, matrices, specimen records) are assessed, integrated and improved
- Middle ground between information seeking and retrieval?
- Information use? Information interaction? Information engineering? Semantic engineering?

Representations of Confidence
- trusting your own work (e.g. using terms like "circa")
- trusting another's (in a team) work
- trusting another's (in another team) work

Workflows and Flows of Work
{ picture redacted }

Workflows and Flows of Work
- Keeping track of the work
- Capturing context
- What doesn't get written down: How much? What? Why should we worry?
- What taxonomists do: they observe a lot more than they write down

Potential Implications for LIS
- How do other domains of data and metadata represent uncertainty?
- How might taxonomists' data practices inform data curation best practices?
- Taxonomy data = very long-lived data! Might other fields learn something from its management?
- Finding physical/digital analogs/parallels: tagging (notes on specimens, notes on data) so that users feel comfortable with the paradigm
- Software development considerations: how do you engineer big complex applications for the integrative taxonomist?

Acknowledgements
- Funding: NSF-DBI
- Collaborating taxonomists