
Data Integration and Extraction over Molecular Biological Data

Page 1: Data Integration and Extraction over Molecular Biological Data

Cui Tao

supported by NSF

Page 2: Data Integration and Extraction over Molecular Biological Data

Motivation

Online biological data:

  • Highly diverse in granularity and variety
  • Various formats
  • Different terminologies, ID systems, units

Page 3: Data Integration and Extraction over Molecular Biological Data

How to Build a Gene Extraction Ontology?

  • Concepts
  • Relationship sets
  • Constraints
  • Data Frames

Page 4: Data Integration and Extraction over Molecular Biological Data

How to Build a Gene Extraction Ontology?

(G*A*U*C*)*

(G*A*T*C*)*
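
The two patterns on this slide read as value recognizers for nucleotide sequences: `(G*A*U*C*)*` accepts any string over the RNA alphabet, and `(G*A*T*C*)*` any string over the DNA alphabet. The following is a minimal sketch, assuming the patterns are meant as data-frame recognizers. The function name `classify_sequence` and the choice to reject empty strings are illustrative assumptions, not taken from the slides. Each slide pattern is equivalent to a repeated character class, which the sketch uses to avoid nested quantifiers.

```python
import re

# (G*A*T*C*)* accepts exactly the strings over {G, A, T, C}; the
# character-class form is equivalent and avoids nested quantifiers.
# Assumption: '+' is used instead of '*' so empty strings are rejected.
DNA = re.compile(r"[GATC]+")
RNA = re.compile(r"[GAUC]+")


def classify_sequence(text: str) -> set[str]:
    """Return the set of sequence types whose recognizer accepts `text`."""
    candidate = text.strip().upper()
    labels = set()
    if DNA.fullmatch(candidate):
        labels.add("DNA")
    if RNA.fullmatch(candidate):
        labels.add("RNA")
    return labels
```

A string containing neither T nor U, such as `GACC`, is accepted by both recognizers, so a pattern alone cannot always tell DNA from RNA; surrounding context would have to decide.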

Page 5: Data Integration and Extraction over Molecular Biological Data

Knowledge Sources

  • Gene Ontology (Molecular Function, Biological Process, Cellular Component): thousands of terms
  • All Species Toolkit: 1,231,935 species names
  • Protein Databases: thousands of protein names
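
One plausible use of these knowledge sources is as lexicons: lists of known terms that a recognizer looks up in page text. The sketch below assumes that reading. The three Gene Ontology terms and two species names are a tiny hand-picked sample, and `build_recognizer` and `annotate` are hypothetical helper names, not part of the presented system.

```python
import re

# Tiny illustrative lexicons; real ones would be loaded from the knowledge sources.
LEXICONS = {
    "GO term": ["DNA repair", "protein binding", "nucleus"],
    "species": ["Escherichia coli", "Homo sapiens"],
}


def build_recognizer(terms):
    """Compile one case-insensitive pattern matching any term as a whole phrase."""
    # Longest terms first, so a longer phrase wins over a term that is its prefix.
    alternatives = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


RECOGNIZERS = {label: build_recognizer(terms) for label, terms in LEXICONS.items()}


def annotate(text):
    """Return (start, label, matched_text) triples found in `text`, sorted by position."""
    hits = []
    for label, pattern in RECOGNIZERS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group(0)))
    return sorted(hits)
```

With lexicons of thousands or millions of entries, a single regular-expression alternation becomes impractical; a trie or hash-based lookup would be the more likely choice.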

Page 6: Data Integration and Extraction over Molecular Biological Data

Extraction Rules

  • Statistical NLP
  • Machine learning
      • Naïve Bayes
      • Hidden Markov Models
      • Decision Trees

Page 7: Data Integration and Extraction over Molecular Biological Data

Integration

Page 13: Data Integration and Extraction over Molecular Biological Data

Integration: Information Hidden behind Links

Page 17: Data Integration and Extraction over Molecular Biological Data

Query-based Extraction

  • Query the gene extraction ontology
  • Find applicable resources
  • Fill out forms
  • Extract information
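
The steps on this slide can be sketched as a small pipeline. This is an illustration under stated assumptions, not the presented system: the `Resource` record, the `example.org` URLs, the form field names, and the extraction patterns are all hypothetical, and fetching the page is left out so the sketch runs without network access.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class Resource:
    """A hypothetical online source and the ontology concepts it can supply."""
    name: str
    form_url: str         # placeholder URL, not a real service
    query_field: str      # form field that takes the gene name
    provides: frozenset   # ontology concepts this source can answer
    patterns: dict        # concept -> regex with one capture group


RESOURCES = [
    Resource("example-gene-db", "https://example.org/gene/search", "q",
             frozenset({"Gene Sequence"}),
             {"Gene Sequence": r"Sequence:\s*([GATC]+)"}),
    Resource("example-protein-db", "https://example.org/protein/search", "gene",
             frozenset({"Protein Function"}),
             {"Protein Function": r"Function:\s*([^\n<]+)"}),
]


def find_resources(wanted):
    """Find applicable resources: keep sources that supply a requested concept."""
    return [r for r in RESOURCES if r.provides & wanted]


def fill_form(resource, gene_name):
    """Fill out a form: build the GET request URL for the resource's search form."""
    return resource.form_url + "?" + urlencode({resource.query_field: gene_name})


def extract(resource, page_text, wanted):
    """Extract information: apply each requested concept's pattern to the page."""
    found = {}
    for concept in resource.provides & wanted:
        match = re.search(resource.patterns[concept], page_text)
        if match:
            found[concept] = match.group(1)
    return found
```

A caller would turn the user's query into a set of requested concepts, call `find_resources`, request the URL from `fill_form` for each hit, and pass the response text to `extract`.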

Page 18: Data Integration and Extraction over Molecular Biological Data

Query-based Extraction

Example: “Find the alfR gene, its sequence, its protein's function, and any mutant that inhibits this gene.”

Ontology concepts involved: Gene, Gene Name, Gene Sequence, Mutant, Protein Function, Mutant Function

Page 23: Data Integration and Extraction over Molecular Biological Data

Contribution

  • Provides a way to automatically integrate online biological data from different sources
  • Provides an approach that finds suitable online resources, fills out online forms, and extracts data according to the user's query