34
Generating Knowledge Networks from Phenotypic Descriptions Fagner Leal Patr´ ıcia Cavoto, Julio dos Reis, Andr´ e Santanch` e [email protected] Laboratory of Information Systems University of Campinas Campinas - S˜ ao Paulo - Brazil October 24, 2016

Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Generating Knowledge Networks fromPhenotypic Descriptions

Fagner LealPatrıcia Cavoto, Julio dos Reis, Andre Santanche

[email protected]

Laboratory of Information SystemsUniversity of Campinas

Campinas - Sao Paulo - Brazil

October 24, 2016

Page 2: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research ScenarioPhenotype Descriptions

I Morphological structuresI Behavior traitsI Life cycles; etc.

Examples:

1. No dark longitudinal stripes on head and body.2. Scattered breast melanophores (Fuiman et al., 1983).

Pteronotropis hubbsi can also be distinguished from Notropischalybaeus by the presence of two caudal spots, one large spotcentered at the base of the caudal fin below the flexednotochord and a smaller spot located dorsally above it, and bythe presence of 9 dorsal rays in late metalarvae. Notropischalybaeus has a single caudal spot in which no part extendsabove the notochord and 8 dorsal rays (Marshall, 1947).

1 / 16

Page 3: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research ScenarioPhenotype Descriptions

I Morphological structuresI Behavior traitsI Life cycles; etc.

Examples:

1. No dark longitudinal stripes on head and body.2. Scattered breast melanophores (Fuiman et al., 1983).

Pteronotropis hubbsi can also be distinguished from Notropischalybaeus by the presence of two caudal spots, one large spotcentered at the base of the caudal fin below the flexednotochord and a smaller spot located dorsally above it, and bythe presence of 9 dorsal rays in late metalarvae. Notropischalybaeus has a single caudal spot in which no part extendsabove the notochord and 8 dorsal rays (Marshall, 1947).

1 / 16

Page 4: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research ScenarioPhenotype Descriptions

I Morphological structuresI Behavior traitsI Life cycles; etc.

Examples:

1. No dark longitudinal stripes on head and body.

2. Scattered breast melanophores (Fuiman et al., 1983).Pteronotropis hubbsi can also be distinguished from Notropischalybaeus by the presence of two caudal spots, one large spotcentered at the base of the caudal fin below the flexednotochord and a smaller spot located dorsally above it, and bythe presence of 9 dorsal rays in late metalarvae. Notropischalybaeus has a single caudal spot in which no part extendsabove the notochord and 8 dorsal rays (Marshall, 1947).

1 / 16

Page 5: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research ScenarioPhenotype Descriptions

I Morphological structuresI Behavior traitsI Life cycles; etc.

Examples:

1. No dark longitudinal stripes on head and body.2. Scattered breast melanophores (Fuiman et al., 1983).

Pteronotropis hubbsi can also be distinguished from Notropischalybaeus by the presence of two caudal spots, one large spotcentered at the base of the caudal fin below the flexednotochord and a smaller spot located dorsally above it, and bythe presence of 9 dorsal rays in late metalarvae. Notropischalybaeus has a single caudal spot in which no part extendsabove the notochord and 8 dorsal rays (Marshall, 1947).

1 / 16

Page 6: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research Scenario

I Biology Knowledge Bases

e.g., FishBase: knowledge base about fishes1

I Identification Keys (IK)sI Artifacts to identify specimensI Observable characteristics

1http://www.fishbase.org2 / 16

Page 7: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research Scenario

I Biology Knowledge Basese.g., FishBase: knowledge base about fishes1

I Identification Keys (IK)sI Artifacts to identify specimensI Observable characteristics

1http://www.fishbase.org2 / 16

Page 8: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research Scenario

I Biology Knowledge Basese.g., FishBase: knowledge base about fishes1

I Identification Keys (IK)s

I Artifacts to identify specimensI Observable characteristics

1http://www.fishbase.org2 / 16

Page 9: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research Scenario

I Biology Knowledge Basese.g., FishBase: knowledge base about fishes1

I Identification Keys (IK)sI Artifacts to identify specimensI Observable characteristics

1http://www.fishbase.org2 / 16

Page 10: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research ScenarioIdentification Keys

Example of IK to Teleostean families

Drawbacks:I Need previous knowledgeI Need to follow the flow

3 / 16

Page 11: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Research ScenarioIdentification Keys

Example of IK to Teleostean families

Drawbacks:I Need previous knowledgeI Need to follow the flow

3 / 16

Page 12: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

GoalTo recognize and explicit phenotype elements locked in theIdentification Keys. Using the Entity-Quality (EQ) representation:

I Entity: morphological structureI Quality: qualifier state of the Entity

4 / 16

Page 13: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

GoalTo recognize and explicit phenotype elements locked in theIdentification Keys. Using the Entity-Quality (EQ) representation:

I Entity: morphological structureI Quality: qualifier state of the Entity

4 / 16

Page 14: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Related WorkInformation Extraction

Reference Context Approach

Ciaramita et al., 2005 Interactions in molecular biology Unsupervised Learning andRules over Dependency Trees

Song et al., 2015 Biomedical anatomic entities Dictionary-based

Pyysalo and Ananiadou, 2014 Biomedical Anatomic entities Supervised learning

Ramakrishnan et al., 2008 Biomedical Anatomical entities Dictionary-based, Rules overDependencies Trees and Statis-tical Learning

Fundel et al., 2007 Gene and Protein Interaction Rules over Dependency Trees

Cui, 2012 Morphological structures of or-ganisms

Unsupervised Learning

5 / 16

Page 15: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodGeneral View

Step 1:It explores isolated sentences

Step 2:It explores the sentence correlations

6 / 16

Page 16: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 1 - General View

Assumption:The typical way in which phenotype descriptions are written canguide the extraction of EQ elements.

7 / 16

Page 17: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 1 - General View

Assumption:The typical way in which phenotype descriptions are written canguide the extraction of EQ elements.

7 / 16

Page 18: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 1 - General View

Assumption:The typical way in which phenotype descriptions are written canguide the extraction of EQ elements.

7 / 16

Page 19: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 1 - Match Algorithm

Identifying Entities and Qualities:

8 / 16

Page 20: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 1 - Output

9 / 16

Page 21: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 2 - General View

Assumption:The structure of Identification Keys holds correlations that can beexploited to improve the extraction of EQ statements.

Generally, in phenotype descriptions:1. Alternative sentences refer to the same Entities.2. Alternative sentences assign complementary Qualities to

Entity.

10 / 16

Page 22: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 2 - General View

Assumption:The structure of Identification Keys holds correlations that can beexploited to improve the extraction of EQ statements.Generally, in phenotype descriptions:

1. Alternative sentences refer to the same Entities.2. Alternative sentences assign complementary Qualities to

Entity.

10 / 16

Page 23: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 2 - Algorithm

Compare the two relations, based on:(a) Existence of antonymy between the quality parts(b) Relation Type(c) Grammatical classes of quality parts(d) Relation Directions

Similarity =∑d

i=a vi

11 / 16

Page 24: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 2 - Algorithm

Compare the two relations, based on:(a) Existence of antonymy between the quality parts(b) Relation Type(c) Grammatical classes of quality parts(d) Relation Directions

Similarity =∑d

i=a vi

11 / 16

Page 25: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 2 - Algorithm

Compare the two relations, based on:(a) Existence of antonymy between the quality parts(b) Relation Type(c) Grammatical classes of quality parts(d) Relation Directions

Similarity =∑d

i=a vi

11 / 16

Page 26: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

MethodStep 2 - Output

12 / 16

Page 27: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Evaluation - Numerical AssessmentGold Standard-based Assessment

Gold standard set: 100 phenotype descriptions (randomly selected)were manually annotated

XXXXXXXXXXXMeasuresElements EQ pair Entity

Recall 0,45 0,76

Precision 0,87 0,94

F-measure 0,59 0,84

13 / 16

Page 28: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Evaluation - Application ExperimentsEQ sharing through taxons

Figure 1: Bipartite network of Species and EQs 14 / 16

Page 29: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Evaluation - Application ExperimentsEQ sharing through taxons

Figure 2: Projection of bipartide network

15 / 16

Page 30: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Conclusion

Original approach to automatically recognize Entities andQualities, exploring :

I Writing characteristics of phenotype descriptionsI Organizational structure of IKs

Future WorkI To compare against other approachesI To recognize complete EQs in Step 2 (not only the quality

part)I To calibrate the parameters and thresholds

16 / 16

Page 31: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Thank you!

Page 32: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Classical Measures

Recall = TPTP + FN (1)

Precision = TPTP + FP (2)

F -measure = 2 ∗ Precision ∗ RecallPrecision + Recall (3)

Examples of:I True Positive:

I expected: E [lips]Q[notfringed ]I recognized:E [lips]Q[notfringed ]

I False Positive:I expected E [vertebrae]Q[119 to 132]I recognized: E [vertebrae]Q[132]

I False Negative:I recognized E [breastmelanophores]Q[Scattered ]

Page 33: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Considering Partial MatchesI Complete Miss (CM): false negativeI Wrong Hit (WH): false PositiveI Full Match (FM): true Positive

Partial Precision = Partial MatchFull Match + Partial Match + Wrong Hit (4)

Full Precision = Full MatchFull Match + Partial Match + Wrong Hit (5)

Partial Recall = PartialMatchFull Match + Partial Match + Complete Miss

(6)

Full Recall = FullMatchFull Match + Partial Match + Complete Miss (7)

Page 34: Generating Knowledge Networks from Phenotypic Descriptionsescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/pantoja-fagner-slides.pdf2. Scattered breast melanophores (Fuiman et

Considering Partial MatchesTotal Precision = Partial Precision + Full PrecisionTotal Recall = Partial Recall + Full Recall

XXXXXXXXXXXMeasuresElements EQ pair Entity

Partial-Recall 0.05 0.08Full-Recall 0.39 0.67Partial-Precision 0.11 0.1Full-Precision 0.75 0.84

Table 1: Results concerning Perfect and also Partial Matches

XXXXXXXXXXXMeasuresElements EQ pair Entity

Total Recall 0,45 0,76Total Precision 0,87 0,94Total F-measure 0,59 0,84

Table 2: Total Results