Development of the Generation Challenge Program Ontology for Crops

Elizabeth Arnaud (Bioversity International) and Rosemary Shrestha (CRIL-CIMMYT), Richard Bruskiewich (IRRI)

TDWG 2008 Annual Conference, 20-25 October 2008

Fremantle, Western Australia

TDWG annual conference, 20-25 October 2008, Fremantle, Western Australia

The Generation Challenge Programme

Science for better crops in the tropics

For the majority of crop farmers in the developing world, the ravages of drought, low soil fertility, crop pests and diseases are aggravated by their limited access to improved crops.

The Generation Challenge Programme

Science for better crops in the tropics

By using advances in molecular biology and harnessing the rich global stocks of crop genetic resources, the Generation CP creates and provides a new generation of plants that meet farmer needs.

http://www.generationcp.org/

Consultative Group on International Agricultural Research (CGIAR)

GCP subprograms

SP1 - Genetic Diversity of Global Genetic Resources

SP2 - Genomics towards gene discovery

SP3 - Trait Capture for Crop Improvement

SP4 - Bioinformatics and Crop Information Systems

Building an 'integrated platform' of molecular biology and bioinformatics tools = Molecular breeding platform

SP5 - Capacity Building and Enabling Delivery

The Generation Challenge Programme

Target areas: Drought-prone environments

Mandate crops: All the CGIAR mandate crops = 22 crops

Commissioned and competitive projects: 275 projects in 5 years

GCP New Challenge initiatives

Cereals

• Rice/drought/Africa

• Wheat/drought/Asia

• Sorghum/drought/Africa

• Rice-Sorghum-Maize/soil problem/Asia & Africa

Legumes

1. Cowpeas/drought/Africa

2. Chickpeas/drought/Africa and Asia

Root and tubers

1. Cassava/virus/Africa

Integration across diverse crop datasets

Volume and complexity of biological data is increasing

Historical data are scattered in numerous crop-specific databases

Each database uses slightly different terminologies for terms related to phenotypes

Integration across Diverse GCP Crop Data

[Diagram: five linked data domains, connected by arrows labelled "has", "determines" and "affects", and tagged with subprogrammes SP1, SP2 and SP3]

Germplasm

• Inventory
• Identification (passport)
• Genealogy

Genotype

• Genetic Maps
• Physical Maps
• DNA Sequence
• Functional Annotation
• Molecular Variation (Natural or Induced)

Phenotype

• Anatomical
• Developmental
• Field Performance
• Stress Response

Molecular Expression

• Transcriptome
• Proteome
• Metabolome
• Physiology

Environment

• Location (GIS)
• Climate
• Day Length
• Ecosystem
• Agronomy
• Stresses

An integrated platform for molecular breeding

To support and encourage researchers to share and reuse information among agricultural databases

To form the basis for the generation of data templates, web services and software.

GCP Scientific Domain Model

• Germplasm identification ("passport") and pedigree data
• Phenotypic characterization and evaluation data
• Geographic location and environmental descriptions
• Genotype and molecular data
• Genomic map data for markers and loci
• Functional genomics data

The exchange of new findings and joint work on projects presuppose that all those involved have the same understanding of the terms they use. This calls for an extensively standardized description of plant development stages with phenological characteristics and coding.

Prof. Dr. F. Klingauf, President of the Federal Biological Research Centre for Agriculture and Forestry, Berlin and Braunschweig

Importance of crop ontology

Similar plant structures are described by their species-specific terms.

Fruit

Kernel in Maize

Grain in Wheat

Pod in Beans

Grain or caryopsis in Rice
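
The naming problem on this slide can be pictured as a lookup from a crop-specific term to one shared concept. The following is a minimal sketch, not part of the GCP tooling: the table `SPECIES_TERMS` and the function `shared_concept` are hypothetical names, and only the five crop/term pairs come from the slide.

```python
from typing import Optional

# Illustrative table: crop-specific terms from the slide, all pointing to "fruit".
SPECIES_TERMS = {
    ("maize", "kernel"): "fruit",
    ("wheat", "grain"): "fruit",
    ("beans", "pod"): "fruit",
    ("rice", "grain"): "fruit",
    ("rice", "caryopsis"): "fruit",
}


def shared_concept(crop: str, term: str) -> Optional[str]:
    """Return the cross-species concept for a crop-specific term, or None if unknown."""
    key = (crop.strip().lower(), term.strip().lower())
    return SPECIES_TERMS.get(key)


if __name__ == "__main__":
    for crop, term in [("Maize", "Kernel"), ("Rice", "caryopsis"), ("Wheat", "pod")]:
        print(crop, term, "->", shared_concept(crop, term))
```

Keying on the pair (crop, term) rather than on the term alone keeps "grain" in wheat and "grain" in rice as separate entries, so each crop database can keep its own vocabulary.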

The GCP Ontology

"Thesaurus" of biological concepts that can be shared and used across species, to which genetic and phenotypic data can be associated

Integrative data mining on GCP annotated data using the platform and web services

Developed with crop experts, for plant structure, developmental stages, traits and expression of the traits

For selected priority GCP crops: Wheat, Maize, Sorghum, Chickpea, Banana & Plantain

GCP Sources for mapping the terms

International Crop Information System (ICIS) model (http://www.icis.cgiar.org)

• IMIS (maize)
• IRIS (rice)
• IWIS (wheat)

Musa germplasm information system (http://www.musa-diversity.org)

ICRISAT information system (Sorghum, chickpea)

CIP information system (potato)

Crop descriptors for traits (Bioversity International)

GCP data templates

GCP datasets (http://www.generationcp.org)

GCP Ontology

Developing the GCP ontology

[Workflow diagram with numbered steps 1 to 4. Elements shown: Crop DB; GCP crop ontology (GCP concept ID); mapping to the Plant Structure Ontology and the Trait Ontology (PO concept ID & TO concept ID, recorded as DBXref); GCP data templates; data annotation with the GCP ontology]

www.plantontology.org/

www.gramene.org/plant_ontology/

GCP ontology term has:

Term: plant height

ID: GCP_322*.0000021

Namespace: maize_trait

Definition: Measurement of plant height from soil surface to the highest point of the plant.

Synonyms: PHT, PTHT, Planth. Shoot height

Dbxrefs: PO:10202, TO:0000207, IMIS_TRAITID:1008

is_a: GCP_322.0000108
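
The fields on this slide correspond closely to tags of the OBO flat-file format (`id`, `name`, `namespace`, `def`, `synonym`, `xref`, `is_a`). The sketch below is a simplified illustration rather than the released GCP file: the stanza text is a hand transcription, the asterisk in the slide's ID is dropped on the assumption that it is a footnote mark, only two synonyms and two cross-references are included, and the helper names `parse_stanza` and `xrefs_by_source` are hypothetical.

```python
from collections import defaultdict

# Assumed, simplified OBO-style rendering of the "plant height" term on the slide.
# Real OBO files carry extra detail (synonym scopes, dbxref lists) omitted here.
STANZA = """\
[Term]
id: GCP_322.0000021
name: plant height
namespace: maize_trait
def: "Measurement of plant height from soil surface to the highest point of the plant."
synonym: "PHT"
synonym: "PTHT"
xref: TO:0000207
xref: IMIS_TRAITID:1008
is_a: GCP_322.0000108
"""


def parse_stanza(text):
    """Collect 'tag: value' lines into a dict of lists, since tags may repeat."""
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(":")  # split at the first colon only
        fields[tag.strip()].append(value.strip().strip('"'))
    return dict(fields)


def xrefs_by_source(fields):
    """Group cross-references by prefix, e.g. 'TO' -> ['0000207']."""
    grouped = defaultdict(list)
    for xref in fields.get("xref", []):
        source, _, local_id = xref.partition(":")
        grouped[source].append(local_id)
    return dict(grouped)


if __name__ == "__main__":
    term = parse_stanza(STANZA)
    print(term["name"][0], term["id"][0])
    print(xrefs_by_source(term))
```

Grouping the cross-references by prefix shows how one GCP term can point at both a public ontology (TO) and a crop database trait (IMIS_TRAITID) at the same time.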

Building the ontology with OBO-Edit

Terms are linked by relationships such as is-a, part-of, has-a, disjoint from, derived from, etc.

It is structured as a hierarchical directed acyclic graph (DAG)

Terms can have more than one parent and zero, one or more children

Draft releases of the OBO-formatted ontology files for rice, wheat and maize traits are available at http://cropforge.org/projects/gcpontology/

http://oboedit.org/
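
The DAG property described on this slide (several parents allowed, no cycles) can be illustrated with a few typed edges. This is a minimal sketch under stated assumptions: the term names and edges are invented for illustration and are not taken from the GCP ontology files, and `parents`, `ancestors` and `is_acyclic` are hypothetical helper names.

```python
# Hypothetical edges, written as (child, relationship, parent).
EDGES = [
    ("flag leaf", "is_a", "leaf"),
    ("flag leaf", "part_of", "tiller"),
    ("leaf", "part_of", "shoot"),
    ("tiller", "part_of", "shoot"),
    ("shoot", "part_of", "plant"),
]


def parents(term):
    """Direct parents of a term, with the relationship that links them."""
    return [(rel, parent) for child, rel, parent in EDGES if child == term]


def ancestors(term):
    """All terms reachable by following edges upward; each is visited once."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _, parent in parents(current):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_acyclic():
    """True if no term appears among its own ancestors."""
    terms = {t for child, _, parent in EDGES for t in (child, parent)}
    return all(t not in ancestors(t) for t in terms)


if __name__ == "__main__":
    print(parents("flag leaf"))            # two parents, via different relationships
    print(sorted(ancestors("flag leaf")))
    print(is_acyclic())
```

Because "flag leaf" has two parents, a plain tree could not hold it; the graph form lets a query for any ancestor still reach the term.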

%HSATIVUM_TILLER1_FLAG_1

Complex trait name

Description: The trait is scored for severity of the disease caused by Helminthosporium sativum (leaf spot) at tiller 1 and flag 1 stage in percentage.

Complex trait names created by breeders in the crop databases are to be decomposed into simple terms that are readable by both humans and computers, and mapped against the ontology

Ontology for Crops

[Diagram: components of a crop ontology. Element labels as shown on the slide:]

Phenotype (phenotypic qualities)

• Plant Ontology: Plant structure, Development stages
• Qualities & Units Ontology (PATO): "values" have "units" (units implicitly indicate the attribute)
• Assessment Methods Ontology (e.g. ICIS)
• Qualifier

Genotype Factor (G)

• Markers/alleles/sequence ontology

External environmental data (E)

• Treatment, Location, Climatic variables/water, Growth conditions, Stress
• Management/agronomy

Temporal factor (T)

• Time Ontology

Experiment factor (ED)

• Experimental design

Other labels in the diagram: EFF, EC, TS
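
The decomposition of a breeder's trait code into simple terms can be sketched for the slide's example, `%HSATIVUM_TILLER1_FLAG_1`. Everything about the code layout here is an assumption made for illustration (a leading unit symbol, then an agent code, then stage tokens made of a word and a number); the lookup tables hold only what the slide's description states, and `decompose` is a hypothetical helper, not a GCP tool.

```python
import re

# Assumed lookup tables, filled only from the description on the slide.
UNITS = {"%": "percent"}
AGENTS = {"HSATIVUM": "Helminthosporium sativum (leaf spot)"}


def decompose(trait_name):
    """Split a breeder trait code into unit, causal agent and stage terms.

    Assumed layout: optional unit symbol, agent code, then stage tokens
    such as TILLER1 or FLAG_1 (a word followed by a number).
    """
    unit = None
    if trait_name and trait_name[0] in UNITS:
        unit = UNITS[trait_name[0]]
        trait_name = trait_name[1:]
    agent_code, _, rest = trait_name.partition("_")
    stages = [
        f"{word.lower()} {number}"
        for word, number in re.findall(r"([A-Z]+)_?(\d+)", rest)
    ]
    return {
        "unit": unit,
        "agent": AGENTS.get(agent_code, agent_code),  # fall back to the raw code
        "stages": stages,
    }


if __name__ == "__main__":
    print(decompose("%HSATIVUM_TILLER1_FLAG_1"))
```

Each resulting piece could then be matched to a separate ontology (unit, causal agent, development stage), which is the mapping step the slide calls for.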

GCP Ontology – present and future prospects:

[Diagram: GCP Ontology components arranged along an axis from "Present Status" to "Future Plan". Element labels as shown on the slide:]

Data Source CGIAR

• GCP data templates
• ICIS dataset

GCP Domain Module Ontology

• General Germplasm Ontology
• Taxonomic Ontology
• Plant Anatomy & Development Ontology
• Phenotype & Trait Ontology
• Structural & Functional Genomic Ontology
• Location & Environment Ontology
• General Science Ontology

Web Interface (Chado/koios)

• Query
• Linkage to external ontologies

http://pantheon.generationcp.org

Thank you!

[Photo: Crops' Harvest Celebration, San Isidoro Feria, Lucban, Philippines]