A Dissertation defense presented to the Department of Computer Science


David A. Gaitros, Dissertation Defense, FSU, December 2006
Overview
Acknowledgements Problem Definition Research Statement Goals and Challenges Semantic Associations Ontology in Semantic Associations MorphBank Architecture MorphBank Object Relations Annotation and Collections Semantic Annotations Example MorphBank Semantic Association Results Future work Questions
I am going to cover the topic of Semantic Associations using Annotations and the application of this idea to MorphBank. MorphBank is a very large and complex project with many facets. Much of the background is covered in the dissertation, and additional information can be obtained by going directly to the web site at http://morphbank.net.


THE REPRESENTATION OF ASSOCIATION SEMANTICS WITH ANNOTATIONS IN A BIODIVERSITY INFORMATICS SYSTEM
A Dissertation Defense presented to the Department of Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy
David A. Gaitros
The Representation of Association Semantics with Annotations in a Biodiversity Informatics System
Supported by the National Science Foundation (NSF) Biological Database Informatics (BDI) program (Grant DBI ). $2.25 million, 3-year project.
Dissertation Committee: Dr. Greg Riccardi, Dr. Fredrik Ronquist, Dr. Robert van Engelen, Dr. Ashok Srinivasan
December 8th, 2006

Welcome, committee and distinguished guests.
Overview
Acknowledgements Problem Definition Research Statement Goals and Challenges Semantic Associations Ontology in Semantic Associations MorphBank Architecture MorphBank Object Relations Annotation and Collections Semantic Annotations Example MorphBank Semantic Association Results Future work Questions

I am going to cover the topic of Semantic Associations using Annotations and the application of this idea to MorphBank. MorphBank is a very large and complex project with many facets. Much of the background is covered in the dissertation, and additional information can be obtained by going directly to the web site at http://morphbank.net.
Acknowledgement
MorphBank Primary Investigators
Dr. Fredrik Ronquist
Dr. Greg Riccardi
Dr. Austin Mast
Dr. Robert van Engelen
Dr. Corinne Jörgensen
Dr. Peter Jörgensen
Dr. Greg Erickson
MorphBank Development Team
Mr. Wilfredo Blanco
Mrs. Neelima Jammigumpula
Mr. Steve Winner
Mrs. Karolina Maneva-Jakimoska
Mrs. Cynthia Gaitros
Mrs. Debbie Paul
Ms. Katja Seltmann
Mr. Chris Cprek

I am showing the acknowledgements first for three reasons: the magnitude of the project, the contribution to the whole project by a large number of people, and my wish to express my appreciation to them.
Acknowledgement (continued)
Research Associates
Dr. Gordon Erlebacher
Dr. Andy Deans
Dr. Matthew Buffington
Mr. Shayne Steele
Student Research Associates
Mr. Gabriel Logan
Mr. Jason Simmons
Mr. Stanislov Ustymenko
Mr. Wei Zhang
Ms. Allison von Eberstein
Ms. Janet Capps

This list does not include many other participants who have also made contributions to MorphBank and indirectly to this research. Thanks also to the Spring 2004 Software Engineering class for their work with me on the original MorphBank requirements document.
Problem Statement
Scientists can produce large amounts of data but cannot always process or search it.
In biodiversity, specimens can be dissected, cataloged, photographed, analyzed, and stored in a variety of media.
Much of the detailed knowledge of these specimens is still kept in personal journals, scientific logs, hand-written notes, and human memory.
Such informal methods of storing and retrieving information present a problem when other biologists attempt to search for biodiversity subject matter.
How can we help solve this problem?

What would a dissertation be without a problem to solve? Scientists can and do produce terabytes of data. Some, such as meteorologists, can do this daily. Making sense of this data is becoming a very difficult problem. Biodiversity information is no different. However, much of the data is very informal and kept on non-digital media. Many scientific collections consist of specimens in boxes locked in cabinets that have never been cataloged in a central data repository.
Research Statement
This research adds value to image repositories by collecting and publishing semantically rich, user-specified associations among images and other objects.

Read statement. I wanted to increase the productivity of a biodiversity web site by creating a method that would allow biologists, and eventually other scientists, to search and discover new relationships among data. I am going to show that I not only conducted the research but was also able to take the demonstration of its utility past the prototype stage. This was a very successful venture.
Research Goals
Gather available data standards for biodiversity and semantic associative systems.
Develop models.
Transform models into a relational database.
Develop data retrieval methods.
Research and develop methods to expose MorphBank data.
Develop a prototype semantic associative annotation tool.
Research automated object association.
Show that a semantically rich environment is useful to research scientists.

Look at the databases, including MorphBank, and the literature for any proposed standards on naming and structure. Come up with a schema and review the ideas with the community. Create a relational database. How can we get at the data? We want to make the data easily available to the world. Build the annotation tool. Research what is being done in the area of data mining. Are people going to use the system?
Here I want to make a comment concerning the initial perception of the research. When I started the PhD program here at FSU, a few of the professors told me I would find out two things: (1) my initial idea of the outcome of my research and the final version would be very different, and (2) I would probably have to scale back the scope of what I wanted to accomplish. The first was true. The second was not: as it turns out, I was able to accomplish much more.
Research Challenges
Finding consensus on data naming standards.
Finding a flexible and reliable taxonomic name server.
Developing a model for semantic associations.
Developing a prototype of a functional semantic association annotation tool.
The magnitude of the work that must be accomplished:
Management of a development team
Creation of a development environment
Creation of a commercial-quality web site
Populating the database
Maintenance of a biodiversity system
Attracting sufficient users to determine the feasibility of such a system.

How mature are the data standards out there? Naming of specimens is a tedious task. Many scientists avoid this problem by just allowing users to type in the names, which is too prone to errors and difficult to search. How is the data associated? I did not know when we started. The annotation tool I need does not exist, and I really do not want to write one. There is a massive amount of work that must be done.
Semantic Associations
Represent a very complex set of relations among objects.
Allow users to gain insight into, or query for, interesting relationships among large amounts of data.
Inside a semantically rich environment, ontologies and context are preserved.
The novel approach is integrating ad-hoc annotation data with semantic associations, together with tools that allow for the discovery of the relationships.

Semantic associations are complex relationships built upon the ontology of the terms used in the data. They are dynamic rather than static, so they change over time. We want scientists to use the ontology that best describes their perception of their research. Say you had two mathematicians describing proofs of the same problem, each using a different type of mathematics to describe the solution. This would be an example of using different ontologies. There are research projects that involve semantic associations, and there are research projects that involve annotations. The novel approach here is marrying the two to create an environment where ontologies are preserved and where, using annotations, new semantic associations can be created, searched, and found.
Semantic Associations
Associations that have a direct relation are easy to find; others are not.
View Information: Head, posterior, cleaned in alcohol
Locality Information: Europe
Specimen Data: Female, indeterminate, adult, Diplolepis rosae
Contributor: Johan Liljeblad and Fredrik Ronquist
General Comments: 12 records
Determinations: 15 records
Related Phylogenetic Characters: 1 record
External Data Sources: 15 sources

Some of the data is easy to store and retrieve: view, locality, specimen, contributor, and so on. For all of these we can create static tables and write queries that retrieve the data quite easily. However, ad-hoc comments, determination annotations, different physical characteristics, and even an unspecified number of external sources all prove difficult to handle in a data repository. They can be stored, but linking them with other objects is often not attempted.
Semantic Associations
What we would like to be able to find:
Other images by this contributor
Related specimens
Other images that use this view
Contributor data
About the view
Image
Specimen data
Other related objects
Related comments
Place where this specimen was collected
The nature of the relationships

Here is an example. These are some of the things we wanted to do with MorphBank. Note that any object that is one edge away is easy to find. Objects two or more edges away become difficult. An image can find a contributor, a view, or a specimen. But say I wanted to find the related specimen, where the specimen was located, and then other related images collected at that site and perhaps their contributors. ALL PATHS ARE NOT DEFINED.
Semantic Associations
What we would like to be able to find:
Any phylogenetic characters/states
Other taxonomic descriptions
All related images
Associated publications
Specimen data
All annotations
Annotation contributors
All determination annotations
All related images
Other objects contributed
External data links

Let's take the problem another step and show the complexity of the situation. On the previous slide we noted the relationship between image and specimen. We can take any relation (for example, specimen) and expand the desired paths in an almost infinite series.
Ontology in Semantic Associations
Ontology is a specification of a specialization (Charles Canton).
Ontologies represent a community consensus among participants, and there is pressure against change.
Issues:
What if someone desires to use a different taxonomic structure to describe a specimen?
What if an error is discovered in the current ontology?
How do you deviate from the current ontology without distorting the data and relationships?

Before we go any further, let's talk about ontology. It is a funny word, and its definition causes some problems. It is usually used in philosophy, but here we use it to mean a consensus among participants on meanings and definitions. Change is hard, but for science to progress change is often necessary, so in MorphBank we must allow scientists to use their own ontology.
Ontology in Semantic Associations
MorphBank has several software and internal features that address this problem.
Through the use of Semantic Associations with Annotations, users can preserve the use of their own ontologies without inhibiting anyone else or corrupting data.
MorphBank allows for local modifications on external data references.

We have several features that allow this. Most notable is the use of collections and annotations, which give scientists a forum for discussion and for agreement or disagreement.
MorphBank Architecture
Architecture diagram labels: Working Set Under Review Released MorphBank Version 2.5 Data Service Browse Search Upload Admin Annotation Read Only ITIS MorphBank Security Service Login About News Help Contributor Unregistered User Scientist Lead Group Coordinator Administrator

I want to give you a little background on MorphBank. There was considerable effort at the beginning of the research project to create a valid and correct architecture. In a software engineering sense, this project was carried out correctly. A tremendous amount of work was done early on the discovery and analysis of the requirements, and this early work eased the production effort later on. There are things we would change, but that is always the case.
MorphBank Object Model
The early idea of creating a valid data model with a centralized catalog was absolutely paramount to the success of the project. Since all objects in MorphBank (image, specimen, view, locality, publication, user, group, annotation, collection) have a unique serial number and inherit from a base object, relationships among these objects can be built with maximum flexibility.
MorphBank Inheritance Relationships
Here we see an example. The idea of a collection is central to MorphBank. A collection is a group of related items that have an idea in common. A collection can hold any number of related objects, including the relation myCollection, which is a restricted subset of a Collection used internally within the database. A collection can have many objects, and likewise an object can be in many collections. Here is the idea: by putting related objects in a collection, I can show their relationship to other objects regardless of the distance.
MorphBank Object Relationships
Another viewpoint. Note that there are hard-coded relationships between specimen and locality, image and view, and user and group. They all, including annotation and myCollection, inherit from the base object. So, by going through the base object, I can find any relationship of distance 1. Using the concept of a Collection, I can use annotations and myCollection to find any relationship of distance N.
MorphBank Annotation Architecture
Here is how the annotation tool works. As I stated earlier, the idea of an annotation has changed. I originally concentrated on the graphic nature of the tool and spent quite a bit of time on it. Although this is important, I was somewhat surprised later in the project by the importance of annotating data relationships.
MorphBank Object Relationship
The relationship annotation and the ability to include external data references (different data models and ontologies) were so important that we included in the research the ability to import and store XML documents inside MorphBank.
Example: Image Annotation Overview Using an XML Schema
Annotations, Collections, and Associations
The research program started with a concentration on annotations. However, the idea of a collection, and of building a relationship between the two, evolved over time.
Annotation: A note that describes, explains, and/or evaluates the contents of a book, article, video, image, etc. This information is always accompanied by a citation.
Collection: Several things grouped together or considered a whole. [Webster's Dictionary]
Associations: Phrases that lend meaning to information, making it understandable and actionable, and provide new and possibly unexpected insights. [Boanerges Aleman-Meza]
Collection
This is a partial screen shot of a collection screen. At this time we limit a collection to images, but internally we are able to put any valid MorphBank object in a Collection, including another Collection. If we have time, I can demonstrate this.
Annotation
Screen labels: Select Related; New Determination; Taxon name; Annotations
This is an annotation screen that shows Determination Annotations, a specialized annotation that allows biologists to assign a taxonomic determination to a specimen and to agree, disagree, or agree with qualification on other determinations. This particular screen is full of the complex relations that we have described.
Screen labels: Title, comments, and image; Annotation; Associate Related Materials
Annotation
Looking at the show feature, we start to see these relationships. Note the references to the specimen and related annotations. Related objects are also present but not shown because of size limitations.
Annotation
Here is another type of annotation that allows users to import XML data and also place a marker on an image.
Annotation
This is the annotation tool that allows the user to insert specific markers on an image. Note that the image is not altered. The marker and annotations are stored separately. There is no restriction on the number of annotations an image or any other object may have. The original data is not altered.
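The idea of keeping markers apart from the image they describe can be sketched as follows. This is only an illustration under stated assumptions: the class names (`Image`, `Marker`, `MarkerStore`), the field names, and the coordinate values are hypothetical and are not taken from the MorphBank schema. Only the image identifier 104272 appears in the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Immutable image record; annotating never modifies it (hypothetical model)."""
    object_id: int
    pixels: bytes


@dataclass
class Marker:
    """A marker placed at an (x, y) position on one image."""
    image_id: int
    x: float
    y: float
    text: str


@dataclass
class MarkerStore:
    """Holds markers separately from the images they point to."""
    markers: list = field(default_factory=list)

    def add(self, image: Image, x: float, y: float, text: str) -> Marker:
        marker = Marker(image.object_id, x, y, text)
        self.markers.append(marker)
        return marker

    def for_image(self, image: Image) -> list:
        # No upper limit: an image may carry any number of markers.
        return [m for m in self.markers if m.image_id == image.object_id]


image = Image(object_id=104272, pixels=b"\x00\x01\x02")
store = MarkerStore()
store.add(image, 25.0, 52.0, "wing")      # coordinates are made up
store.add(image, 40.0, 10.0, "antenna")
```

Because `Image` is frozen and markers only reference its identifier, adding or deleting markers cannot change the stored image data.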
Annotation
Text version of previous annotation
Specimen record of an adult female of form Indeterminate Pteroceraphron mirablipennis gathered by D. C. Darling of the institute CNCI. The specimen was gathered on August 4th, near Indiana: Porter Co.: Cowles Bog: Dune Acres, United States of America. This particular specimen is of class Insecta, order Hymenoptera, family Ceraphronidae, genus and species Pteroceraphron mirablipennis. This particular image (104272) was submitted by Dr. Andy Deans on August 8th, 2006 and released November 12th. The view of the image is of the body with a lateral view using auto-montage photography. No particular preparation. There are six related images of this same specimen. There are two related determination annotations: (1) one that identifies that the wings and antennae match the key, and (2) one that states this diagnosis is for the genus Pteroceraphron.

If we were to attempt to write out the complete annotation of the previous image, it would look something like this. Note that this is not a complete description. Finding information in this document and parsing it would be very difficult.
Annotation
XML excerpt (markup lost in extraction; surviving values): 104272 lanceolot wings 25.2352.1 This is a lanceolot wing 67572 Andy Deans 3 HymAtol .

This represents the XML version of the annotation that can be produced by MorphBank. It is used in communicating with external sources. It is in a very machine-readable format but is less useful to humans. It is also very verbose, and I was only able to place a small portion of the document on the slide.
Semantic Annotation
Semantic Annotation
Research Results
Collections are a form of annotation, because the items in a collection define a relationship.
Inheritance is a strategy for annotation, and we now know we can extend this model to new meanings.
Through the baseObject class we can form complex relationships, and through annotations we can provide meaning to those relationships.
Through Collections we can form relationships among objects that would otherwise have no direct links to each other, and through annotations we can provide meaning to those relationships.
There is no limit to how this capability can be extended.
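A minimal sketch of how a collection can relate objects that have no direct link is given below. It assumes a central catalog that assigns serial numbers and stores undirected links between objects. The names `Catalog`, `BaseObject`, `link`, and `distance` are hypothetical and do not come from the MorphBank code base.

```python
from collections import deque


class BaseObject:
    """Every object gets a serial number and a set of directly linked ids."""
    def __init__(self, name: str):
        self.id = None
        self.name = name
        self.links = set()


class Image(BaseObject):
    pass


class Collection(BaseObject):
    pass


class Catalog:
    """Central catalog: registers objects and records links between them."""
    def __init__(self):
        self.objects = {}
        self._next_id = 1

    def register(self, obj: BaseObject) -> BaseObject:
        obj.id = self._next_id
        self._next_id += 1
        self.objects[obj.id] = obj
        return obj

    def link(self, a: BaseObject, b: BaseObject) -> None:
        a.links.add(b.id)
        b.links.add(a.id)

    def distance(self, start: BaseObject, goal: BaseObject):
        """Breadth-first search; returns the number of edges, or None."""
        seen = {start.id}
        queue = deque([(start.id, 0)])
        while queue:
            current, hops = queue.popleft()
            if current == goal.id:
                return hops
            for neighbour in self.objects[current].links:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, hops + 1))
        return None


catalog = Catalog()
wing = catalog.register(Image("wing image"))
head = catalog.register(Image("head image"))
before = catalog.distance(wing, head)   # no path exists yet

shared = catalog.register(Collection("images for one study"))
catalog.link(shared, wing)
catalog.link(shared, head)
after = catalog.distance(wing, head)    # now reachable through the collection
```

In this sketch, membership in a collection is just another link, so placing two unrelated images in the same collection creates a path of length two between them. An annotation attached to the collection could then state what that relationship means.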
Research Results
Through inheritance we restrict the semantics that are used with objects to improve context searches.
Fields of data are not open to interpretation.
Fields are distinct to the objects they reference.
Example: Determination annotations inherit from Annotations and further restrict the meaning of that type of annotation.
Tools can now be built that allow for more extensive and elaborate building of relationships.
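The determination example can be illustrated with a small class hierarchy. This is a sketch, not the MorphBank implementation: the class and field names are assumptions, and the three permitted opinions are taken from the determination annotation screen described earlier (agree, disagree, agree with qualification).

```python
class Annotation:
    """General annotation: free-form title and comment on any object."""
    def __init__(self, object_id: int, title: str, comment: str):
        self.object_id = object_id
        self.title = title
        self.comment = comment


class DeterminationAnnotation(Annotation):
    """Inherits from Annotation and narrows what the fields may mean."""
    ALLOWED_OPINIONS = ("agree", "disagree", "agree with qualification")

    def __init__(self, object_id: int, title: str, comment: str,
                 taxon_name: str, opinion: str):
        if opinion not in self.ALLOWED_OPINIONS:
            raise ValueError(f"unsupported opinion: {opinion!r}")
        super().__init__(object_id, title, comment)
        self.taxon_name = taxon_name
        self.opinion = opinion


determination = DeterminationAnnotation(
    104272, "Genus diagnosis", "Wings and antennae match the key",
    taxon_name="Pteroceraphron", opinion="agree",
)
```

Because the subclass rejects any opinion outside the fixed vocabulary, a search over `opinion` never has to interpret free text.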
Research Results
Versions 2.2 and 2.5 of MorphBank documented and released
Currently working on subsequent versions
Updating documentation
Under configuration control
hits on the web site per day
3 accepted conference papers, 1 biodiversity journal publication, 3 Taxonomic Data Working Group presentations, 1 ATOL/PBI presentation
Over 100,000 data items
Over 60,000 images
98 groups
121 registered users from 85 organizations (for example: FSU, UF, Harvard, Yale, USC, American Museum of Natural History, Duke, Johns Hopkins)
350 annotations

As stated before, the results were significant. Multiple versions of the database were produced and documented for release. A simple search on the web shows the site being used throughout the community, and it has received rave reviews. In particular, the capability of collections and annotations, and the uniqueness of that capability, are drawing more attention to the site. Images, data, users, and groups are constantly being added, and as the capability of the system grows, so will its use. Efforts are underway to require that biologists publish their images in MorphBank so they can be used as references in publications.
Research Results
336 determination annotations
1,544 distinct objects contained in 384 collections
Received very positive feedback from trial participants
Received praise from the National Science Foundation for the quality and quantity of work accomplished to date
First biodiversity system to offer semantic association annotations, general annotations, legacy annotations, and determination annotations
Currently being used by organizations for collaboration on specimen determinations

MorphBank, through the use of semantic associations and annotations, represents the ability to increase our understanding of the relationships among distant entities. The more MorphBank is used (even if data is not added), the greater our understanding of relationships.
Research Results
Developed prototype for semantic search on internally stored XML documents
External objects are exposed through LSIDs in an RDF format.
XML documents are being exposed and used by other organizations and data repositories:
Morphobank
Genbank
Provide direct links using the MorphBank Show function as URLs used in conference and journal papers
As the amount of data grows in MorphBank, so does the wealth of semantic associations.

We were able to accomplish more. The ideas on collections, the extended use of XML data, Life Science Identifiers, exposing objects as RDF documents, and the proliferation of the use of the site are all beyond our original expectations.
Research Results
David Gaitros Contribution
Analysis of the problem
Analysis of the original MorphBank version 1.0.
Analysis of data requirements and gathering of initial MorphBank requirements.
Research of the current state of knowledge of annotations in scientific systems.
Research of available taxonomic name servers.
Modeling
Creation of the MorphBank security model.
Creation of the MorphBank data model and schema.
Creation of the semantic association annotation model.
Project Manager
Leadership of the design team for the MorphBank system.
Management of the production of MorphBank versions 2.2 and 2.5.
Procurement of hardware and software licenses.
Management of the MorphBank NSF/BDI grant under the direction of the Primary Investigators.
Oversight of the functional and design review meetings with users and primary investigators.
Presentations of the project at conferences and workshops.

MorphBank is a very large project, and many people were involved with it. Besides the basic research on MorphBank, my particular contribution involved much more. Remember that in order to get the results, MorphBank had to work.
Research Results
Software Design and Development
Design and implementation of the initial MorphBank Administration Model.
Design and implementation of the initial version of the Taxonomic name selection module.
Design and implementation of the MorphBank Annotation Software.
Design and implementation of the initial version of the MorphBank Collection module.
Design of the external search and exposure feature for the release of MorphBank images in response to MorphOBank external references requirements.
Design of the software test plans.
Contributor to the MorphBank users manual.
Future Work
Continue to extend the capability of Annotations and Collections:
Turn on the feature that allows for the annotation of any object
Turn on the feature that allows for any object to be in a collection
Research more efficient search techniques for semantic associations
Complete development and release of phylogenetic character state software
Research the possibility of further developing the extensible schema capability
Analyze the complexity of relationships of the objects associated through collections and annotations
Expand and mature the use of Life Science Identifiers
Implement a security strategy that is separate from the implementation of the software
Map the current data schema to the ABCD standard for the purpose of exporting data
Publish results in a high-quality journal
Continue exposure at conferences and workshops
QUESTIONS
Environment Requirements
One of the major problems with semantic associations is the complexity and reliability of the relationships.
Allowing unqualified individuals to make contributions to the data repositories introduces errors that make the data unreliable.
Relationship connections are easily corrupted if heuristics are not followed.
Environment Requirements
Features of MorphBank that satisfy environment requirements:
Secure login of ALL contributors
Restriction of contributors to the area of their expertise
Group membership and data ownership
Categories of data:
In-progress
Under review
Released (cannot be altered, only annotated)
Strict adherence to add, update, view, and delete heuristics
All objects are centrally cataloged and uniquely identifiable
All objects can be accessed via a globally unique identifier
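The three data categories above behave like a small state machine. The following sketch shows one way the "released data cannot be altered, only annotated" rule could be enforced. The `State` and `Record` names and the method names are hypothetical and are not MorphBank's actual implementation.

```python
from enum import Enum


class State(Enum):
    IN_PROGRESS = "in-progress"
    UNDER_REVIEW = "under review"
    RELEASED = "released"


class Record:
    """A data item that moves through the three categories in order."""
    def __init__(self, data: dict):
        self.data = dict(data)
        self.state = State.IN_PROGRESS
        self.annotations = []

    def advance(self) -> None:
        # Move to the next category; a released record stays released.
        order = list(State)
        index = order.index(self.state)
        if index < len(order) - 1:
            self.state = order[index + 1]

    def update(self, key: str, value) -> None:
        if self.state is State.RELEASED:
            raise PermissionError("released data cannot be altered, only annotated")
        self.data[key] = value

    def annotate(self, text: str) -> None:
        # Annotations are allowed in every state and never touch self.data.
        self.annotations.append(text)


record = Record({"locality": "Europe"})
record.update("sex", "female")   # allowed while in progress
record.advance()                 # under review
record.advance()                 # released
record.annotate("Determination agreed")
```

After the second `advance()`, any call to `record.update(...)` raises `PermissionError`, while `annotate` continues to work.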
Semantic Annotation
We want multiple annotations on any MorphBank object, so that scientists can add ad-hoc data to the database without creating new tables or adding columns to existing tables.
How do we store and retrieve this information in an efficient and reliable manner?
How do we relate these annotations correctly to all other objects?
Semantic Annotation
Most disciplines have a common language and phrases that they use in describing articles in their area.
Example: communication of a pilot to a control tower.
Pilot: "Tallahassee ground control, this is Cessna 3245 Yankee on ramp, ready for taxi to active runway with information Bravo."
We can pick out specific information that appears in an exact order.
This system of formal semantics in aviation communication allows the participants to communicate efficiently and effectively without misunderstanding.
We can schematize this conversation:

We found during the course of the research that biologists, like practitioners in other disciplines, have their own common language and ontologies. If we can capture these types of phrases and dissect them, we can schematize them for storage.
Semantic Annotation
XML excerpt (markup lost in extraction; surviving values): Tallahassee Ground Control Cessna 32345/Taxittoramp>

This is a simple illustration of how we were able to formulate the different annotation and collection schemas.
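To make the schematizing step concrete, the sketch below breaks the pilot's phrase into named fields with a regular expression. The field names (`facility`, `aircraft`, `position`, `request`, `information`) are assumptions chosen for this illustration. The element names of the original XML on the slide did not survive extraction and are not reproduced here.

```python
import re

# Fixed word order of the radio call, with one named group per slot.
PATTERN = re.compile(
    r"(?P<facility>.+?) this is (?P<aircraft>.+?) on (?P<position>.+?) "
    r"ready for (?P<request>.+?) with information (?P<information>\w+)\.?$"
)


def schematize(phrase: str):
    """Return the phrase as a dict of named fields, or None if it does not fit."""
    match = PATTERN.match(phrase)
    return match.groupdict() if match else None


call = ("Tallahassee ground control this is Cessna 3245 Yankee on ramp "
        "ready for taxi to active runway with information Bravo.")
fields = schematize(call)
```

Once a phrase is reduced to named fields like these, each field can be stored and searched separately instead of searching the sentence as plain text.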
Semantic Annotation
With a biological image annotation we have several distinct parts:
Specimen (biological item of interest)
Image (a specimen may have more than one image)
Type of Annotation
Text Description of Annotation
Title of Annotation
Date (time stamp)
Location (X/Y coordinate of the area on the image)
Associated MorphBank Object (Image, Specimen, Publication, Group, User, Annotation, Location, View)
Semantic Annotation
All aspects of the annotation can be placed into a schema and searched accurately.
Searches on plain text present a problem.
Example: web search for "Fruit Fly"
Solution:
Allow researchers to use restricted semantic annotation in writing the text description
Place data in an XML document
Items can be searched quickly and efficiently
No restrictions on content
New semantics can be added at any time

In MorphBank and in this research we want to make several improvements to searching for information. We want to make it more accurate: unlike a Google search, a search for a specific string should return only the related objects. We also need it to be fast. There are several ways to accomplish this: using the specific relationships built into the system, searching only related objects rather than the whole database, and using the power of XML documents by specifying the attributes.
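The difference between a plain-text search and a search restricted by XML attributes can be sketched with Python's standard `xml.etree.ElementTree` module. The document below is invented for illustration: the element names, attribute names, annotation texts, and the image ids 200001 and 200002 are all assumptions. Only the id 104272 appears in the slides.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<annotations>
  <annotation imageId="104272" type="determination">
    <title>wing shape</title>
    <comment>Wings match the key for the genus.</comment>
  </annotation>
  <annotation imageId="200001" type="general">
    <title>fruit fly on fruit</title>
    <comment>A fly resting on fruit, not Drosophila.</comment>
  </annotation>
  <annotation imageId="200002" type="determination">
    <title>wing venation</title>
    <comment>Consistent with the fruit fly genus Drosophila.</comment>
  </annotation>
</annotations>
"""

root = ET.fromstring(DOCUMENT)


def text_search(term: str) -> list:
    """Plain-text search: matches the term anywhere in any annotation."""
    return [a.get("imageId") for a in root.iter("annotation")
            if term in "".join(a.itertext()).lower()]


def typed_search(annotation_type: str, term: str) -> list:
    """Search only the comment field of annotations of one type."""
    path = f"./annotation[@type='{annotation_type}']"
    return [a.get("imageId") for a in root.findall(path)
            if term in a.findtext("comment", "").lower()]


loose = text_search("fruit fly")
strict = typed_search("determination", "fruit fly")
```

Here `loose` also returns the general comment that merely mentions a fly on fruit, while `strict` returns only the determination annotation, which is the kind of narrowing the slide describes.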
Semantic Annotation
Figure labels: A, B, Red, Green, Black