Java framework for molecular interactions (JAMI)
IntAct team




PSI-MI standard formats


2 different formats – different needs

PSI-XML 2.5
- complex relationships, hierarchy
- more complete
- binary/n-ary interactions and complexes
- schema
- can only be used by developers and software
- complex schema sometimes confusing => difficult parsing

PSI-MITAB 2.5, 2.6 and 2.7
- simple format
- easy to read (though not for everyone)
- easy to index
- cannot be used for complex relationships
- too many columns if trying to add all the information of XML


Segmentation of the tools/software

PSI-XML 2.5
- MI-XML validator
- can be used to exchange fully MIMIx/IMEx compliant data

PSI-MITAB 2.5, 2.6 and 2.7
- PSICQUIC
- Data best practices
- clustering and scoring
- can be easily used for visualization/networking

Endless conversions

Need to unify our tools/software


Common Java framework?

PSI-XML 2.5, PSI-MITAB 2.7, databases + other formats

Common API/framework (interfaces)

● PSICQUIC and indexing
● Semantic validator
● Data enricher
● Protein update (and others)
● Clustering and scoring
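
The idea of a common set of interfaces can be illustrated with a minimal sketch. Every name below (`SimpleInteraction`, `InteractionSource`, `TabRowsSource`, `NaryListSource`, `ParticipantCounter`) is hypothetical and is not the JAMI API; the point is only that a tool written against the interfaces does not need to know which format the data came from.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical format-neutral view of an interaction (not the JAMI API). */
interface SimpleInteraction {
    List<String> participantIds();
}

/** Hypothetical source of interactions; one implementation per format. */
interface InteractionSource {
    List<SimpleInteraction> interactions();
}

/** Stand-in for a tabular source: each row holds the two identifiers of a binary interaction. */
class TabRowsSource implements InteractionSource {
    private final List<String[]> rows;

    TabRowsSource(List<String[]> rows) {
        this.rows = rows;
    }

    public List<SimpleInteraction> interactions() {
        List<SimpleInteraction> result = new ArrayList<>();
        for (final String[] row : rows) {
            result.add(() -> Arrays.asList(row[0], row[1]));
        }
        return result;
    }
}

/** Stand-in for a hierarchical source: interactions may be n-ary. */
class NaryListSource implements InteractionSource {
    private final List<List<String>> groups;

    NaryListSource(List<List<String>> groups) {
        this.groups = groups;
    }

    public List<SimpleInteraction> interactions() {
        List<SimpleInteraction> result = new ArrayList<>();
        for (final List<String> group : groups) {
            result.add(() -> group);
        }
        return result;
    }
}

/** A tool written once against the interfaces works with both sources. */
class ParticipantCounter {
    static int countParticipants(InteractionSource source) {
        int total = 0;
        for (SimpleInteraction interaction : source.interactions()) {
            total += interaction.participantIds().size();
        }
        return total;
    }
}
```

`ParticipantCounter.countParticipants` accepts either source unchanged, which is the kind of code sharing the slide argues for.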


Simplified view of the PSI-XML 2.5 schema

[Diagram: EntrySet, Entry, Source, Experiment, Interaction, Participant, Interactor, Feature and Range, linked by 1-to-n relationships]


JAMI model interfaces


Different interactors

Interaction with participants

Set of interactors


How is the interactor described?

Legend: MITAB and XML / XML only / MITAB only


Interactor extensions

● More specific fields
● Shortcuts
● Utility methods
● Sequence only for proteins and nucleic acids (Polymer)
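
A small class hierarchy can illustrate how such extensions might look. This is a sketch under stated assumptions: the class names (`BasicInteractorSketch`, `PolymerSketch`, `ProteinSketch`, `SmallMoleculeSketch`) and their fields are hypothetical and do not reproduce the JAMI interfaces. Only the polymer branch carries a sequence, as the last bullet describes.

```java
/** Hypothetical base type: fields shared by every kind of interactor. */
class BasicInteractorSketch {
    private final String shortName;

    BasicInteractorSketch(String shortName) {
        this.shortName = shortName;
    }

    String getShortName() {
        return shortName;
    }
}

/** Hypothetical extension for sequence-bearing molecules (proteins, nucleic acids). */
class PolymerSketch extends BasicInteractorSketch {
    private final String sequence;

    PolymerSketch(String shortName, String sequence) {
        super(shortName);
        this.sequence = sequence;
    }

    String getSequence() {
        return sequence;
    }

    /** Utility method: length of the sequence, or 0 when no sequence is known. */
    int getSequenceLength() {
        return sequence == null ? 0 : sequence.length();
    }
}

/** Hypothetical protein: adds a more specific field as a shortcut. */
class ProteinSketch extends PolymerSketch {
    private final String uniprotAccession;

    ProteinSketch(String shortName, String sequence, String uniprotAccession) {
        super(shortName, sequence);
        this.uniprotAccession = uniprotAccession;
    }

    String getUniprotAccession() {
        return uniprotAccession;
    }
}

/** Hypothetical small molecule: an interactor without a sequence. */
class SmallMoleculeSketch extends BasicInteractorSketch {
    SmallMoleculeSketch(String shortName) {
        super(shortName);
    }
}
```

Code that only needs the shared fields can accept `BasicInteractorSketch`, while sequence-aware code can require `PolymerSketch`.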


How is the organism described?

Legend: MITAB and XML / XML only


Controlled vocabulary terms

Shortcuts


Basic objects


Different interactions

Based on experimental details


Interaction evidence

[Diagram: Source, Publication, Experiment, Interaction Evidence, ParticipantEvidence, Interactor, FeatureEvidence, Range, BindingFeature and inferred interactions, linked by 1-to-n relationships]


Differences with XML

● Interaction evidence
  ➢ One experiment
  ➢ One interaction type

● Experiment
  ➢ One host organism
  ➢ No participant identification method
  ➢ No feature detection method

● Participant evidence
  ➢ One experimental role
  ➢ One expressed in organism
  ➢ One participant identification method

● Feature
  ➢ One feature detection method
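
These cardinalities can be made concrete with a minimal sketch. The class names and fields are hypothetical, not the JAMI API, and the rule that the experiment must be non-null is an assumption made for illustration. Each "one ..." bullet becomes a single field instead of a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical experiment: a single host organism, no identification or detection methods. */
class ExperimentSketch {
    final String hostOrganism;

    ExperimentSketch(String hostOrganism) {
        this.hostOrganism = hostOrganism;
    }
}

/** Hypothetical participant evidence: exactly one value for each experimental detail. */
class ParticipantEvidenceSketch {
    final String interactorName;
    final String experimentalRole;
    final String expressedInOrganism;
    final String identificationMethod;

    ParticipantEvidenceSketch(String interactorName, String experimentalRole,
                              String expressedInOrganism, String identificationMethod) {
        this.interactorName = interactorName;
        this.experimentalRole = experimentalRole;
        this.expressedInOrganism = expressedInOrganism;
        this.identificationMethod = identificationMethod;
    }
}

/** Hypothetical interaction evidence: one experiment and one interaction type. */
class InteractionEvidenceSketch {
    private final ExperimentSketch experiment;
    private final String interactionType;
    private final List<ParticipantEvidenceSketch> participants = new ArrayList<>();

    InteractionEvidenceSketch(ExperimentSketch experiment, String interactionType) {
        // Assumption for this sketch: an evidence cannot exist without its experiment.
        if (experiment == null) {
            throw new IllegalArgumentException("an interaction evidence needs an experiment");
        }
        this.experiment = experiment;
        this.interactionType = interactionType;
    }

    ExperimentSketch getExperiment() {
        return experiment;
    }

    String getInteractionType() {
        return interactionType;
    }

    void addParticipant(ParticipantEvidenceSketch participant) {
        participants.add(participant);
    }

    /** Read-only view, so participants are only added through addParticipant. */
    List<ParticipantEvidenceSketch> getParticipants() {
        return Collections.unmodifiableList(participants);
    }
}
```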


Modelled interaction

[Diagram: Source, Publication, Experiment, Modelled Interaction, ModelledParticipant, Interactor, ModelledFeature, Range, binding site and inferred interactions, linked by 1-to-n relationships]


Differences with XML

● Modelled interaction
  ➢ Experiment not required

● Modelled participant
  ➢ No experimental role
  ➢ No participant identification method

● Modelled feature
  ➢ No feature detection method


Cooperative interaction

● Cooperative mechanism (CV term)

● Effect outcome (CV term)

● Response (CV term)

● Affected interactions (Modelled interactions)


Allosteric interaction

● Allostery mechanism (CV term)

● Allostery type (CV term)

● Allosteric molecule (modelled participant not interactor?)

● Allosteric effector (modelled participant not interactor?)

● Allosteric PTM (modelled feature)


Complexes

PSI-MI standard formats

[Diagram: Source, Publication, Experiment, Complex, Component, Interactor, ComponentFeature, Range, binding site, inferred interactions and interaction evidence, linked by 1-to-n relationships]


Differences with XML

● Complex
  ➢ Experiment not required
  ➢ Can have a list of publications
  ➢ Parameters and confidences?

● Component
  ➢ No experimental role
  ➢ No participant identification method
  ➢ No biological role?

● Component feature
  ➢ No feature detection method

interactor only


Datasources

● Equivalent to EntrySet/Entry

● Embedded parser

● Parsing events and list of errors (line number, column number)

● MITAB column

● Object id
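
Event-based error reporting with line and column positions can be sketched as follows. All names (`ParseIssue`, `ParseListener`, `CollectingListener`, `TabLineScanner`) are hypothetical and are not the JAMI datasource API. The check itself (flagging empty tab-separated fields) is only a placeholder for real syntax rules, and "column" here means the field index within a tab-separated line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical description of a problem found while parsing. */
class ParseIssue {
    final int line;
    final int column;
    final String message;

    ParseIssue(int line, int column, String message) {
        this.line = line;
        this.column = column;
        this.message = message;
    }
}

/** Hypothetical listener notified of parsing events. */
interface ParseListener {
    void onIssue(ParseIssue issue);
}

/** Listener that simply keeps every issue in a list of errors. */
class CollectingListener implements ParseListener {
    final List<ParseIssue> issues = new ArrayList<>();

    public void onIssue(ParseIssue issue) {
        issues.add(issue);
    }
}

class TabLineScanner {
    /** Reports each empty tab-separated field with 1-based line and column numbers. */
    static void scan(String text, ParseListener listener) {
        String[] lines = text.split("\n", -1);
        for (int l = 0; l < lines.length; l++) {
            // The -1 limit keeps trailing empty fields instead of dropping them.
            String[] fields = lines[l].split("\t", -1);
            for (int c = 0; c < fields.length; c++) {
                if (fields[c].isEmpty()) {
                    listener.onIssue(new ParseIssue(l + 1, c + 1, "empty field"));
                }
            }
        }
    }
}
```

Because the scanner reports through a listener instead of throwing, it can continue after the first problem and the caller decides whether to collect, log or abort.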


JAMI core and modules

● JAMI core: https://psimi.googlecode.com/svn/trunk/psi-jami

● JAMI MITAB parser: https://psimi.googlecode.com/svn/branches/psimitab-parser-2.0.0-SNAPSHOT

● JAMI PSI-XML 2.5 parser: https://psimi.googlecode.com/svn/branches/psi25-xml-2.0.0-SNAPSHOT


MITAB/PSI-XML parsing

PSI-XML 2.5 PSI-MITAB 2.7

Common API/framework (interfaces)

Current Java XML model

Current Java MITAB model


JAMI goals and benefits

● Utility methods
  ➢ Extract all uniprot cross references
  ➢ Compare two interactors

● Unit testing (in progress)

● Same interfaces for XML and MITAB
  ➢ Do not duplicate code
  ➢ Share code and tools
  ➢ Help to develop faster
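
The two utility methods named above could look roughly like this. It is a sketch, not the JAMI utilities: `CrossRef` and `XrefUtils` are hypothetical, the database name `uniprotkb` is assumed to be how UniProt references are labelled, and "same interactor" is deliberately simplified to "shares at least one UniProt accession".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical cross reference: a database name plus an accession in that database. */
class CrossRef {
    final String database;
    final String accession;

    CrossRef(String database, String accession) {
        this.database = database;
        this.accession = accession;
    }
}

class XrefUtils {
    /** Returns the accessions of all cross references whose database is "uniprotkb" (assumed name). */
    static List<String> uniprotAccessions(List<CrossRef> xrefs) {
        List<String> result = new ArrayList<>();
        for (CrossRef xref : xrefs) {
            if ("uniprotkb".equalsIgnoreCase(xref.database)) {
                result.add(xref.accession);
            }
        }
        return result;
    }

    /** Simplified comparison: true when both lists share at least one UniProt accession. */
    static boolean shareUniprotAccession(List<CrossRef> first, List<CrossRef> second) {
        List<String> firstAccessions = uniprotAccessions(first);
        for (String accession : uniprotAccessions(second)) {
            if (firstAccessions.contains(accession)) {
                return true;
            }
        }
        return false;
    }
}
```

Since the helpers only depend on the cross-reference type, they work the same way whether the interactor was read from XML or from MITAB.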

Community effort


Application example: PSI-MI validator


Validator updated source code

● MI schema validator: https://psimi.googlecode.com/svn/tags/psimi-schema-validator-2.1.0-SNAPSHOT

● MI schema validator in command line: https://psimi.googlecode.com/svn/branches/psimi-schema-validator-cli-2.1.0-SNAPSHOT

● PSI-MI validator: http://wwwdev.ebi.ac.uk/intact/validator/

● JAMI-HTML writer: https://psimi.googlecode.com/svn/trunk/jami-html


CV rules (2)

JAMI model


Syntax validation

● Managed by each datasource
  ➢ SAX validation (syntax and grammar => schema)
  ➢ MITAB syntax validation based on events

● MITAB syntax
  ➢ 15, 36 or 42 columns
  ➢ Invalid fields
    ➢ Missing database or database accession
    ➢ Special characters not properly escaped
    ➢ Missing alias db source or name
    ➢ Missing annotation topic
    ➢ ...
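
Two of the MITAB checks listed above can be sketched in a few lines. `MitabSyntaxSketch` is a hypothetical class, not the validator's code. It assumes tab-separated columns, identifiers written as `database:accession`, and `-` for an empty column, and it ignores quoting and escaping, which a real parser has to handle.

```java
import java.util.ArrayList;
import java.util.List;

class MitabSyntaxSketch {
    /** True when the line has one of the accepted column counts: 15, 36 or 42. */
    static boolean hasValidColumnCount(String line) {
        // The -1 limit keeps trailing empty columns so they are counted.
        int columns = line.split("\t", -1).length;
        return columns == 15 || columns == 36 || columns == 42;
    }

    /** Lists the problems found in a single "database:accession" field. */
    static List<String> checkIdentifier(String field) {
        List<String> problems = new ArrayList<>();
        if (field.equals("-")) {
            return problems; // assumed marker for an empty column, nothing to check
        }
        int colon = field.indexOf(':');
        // Without a colon the whole value is treated as a bare accession.
        String database = colon < 0 ? "" : field.substring(0, colon);
        String accession = colon < 0 ? field : field.substring(colon + 1);
        if (database.isEmpty()) {
            problems.add("missing database");
        }
        if (accession.isEmpty()) {
            problems.add("missing database accession");
        }
        return problems;
    }
}
```

Returning a list of problems rather than a boolean lets the caller turn each one into a separate validation message.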


XML syntax error: missing interaction detection method


XML syntax validation


MITAB syntax error: wrong number of columns and missing database


MITAB syntax validation


XML CV validation

File context / Error message


MITAB CV validation

File context / Error message


XML MIMIx validation


MITAB MIMIx validation


Common HTML view


Next steps

● Performance review

● More unit testing

● Rules can be obsolete in MITAB

● Current rules are outdated and need to be reviewed

● PSI enricher


PSI-XML schema 2.5 issues and next steps


Minor issues: confidences

● Need a unit?

● Add a confidence type


Minor issues: interaction types

• Interaction type without experiment ref


Minor issues: publications

• No concept of publication element?

• Should be able to describe a publication in the BibRef element of an ExperimentDescription with both an Xref (publication id) and a ListOfAttribute (publication date, journal, authors, etc.)

<bibref>
  <xref>
    <primaryRef db="pubmed" dbAc="MI:0446" id="2556388" refTypeAc="MI:0685" refType="primary-reference"></primaryRef>
  </xref>
  <attributeList>
    <attribute name="author-list" nameAc="MI:0636">Valentin-Ranc C, Carlier MF</attribute>
    ....
  </attributeList>
</bibref>
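
Reading such a fragment with the standard Java DOM parser could be sketched as below. The element and attribute names are taken from the fragment above; the `BibrefReader` class is hypothetical, and the sketch assumes a namespace-free fragment that contains exactly one `primaryRef` and at least one `attribute`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class BibrefReader {
    /**
     * Returns { publication id, text of the first attribute } from a bibref fragment.
     * Any parsing problem is rethrown as an IllegalArgumentException.
     */
    static String[] read(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // Both the cross reference and the attribute list live under the same bibref.
            Element primaryRef = (Element) document.getElementsByTagName("primaryRef").item(0);
            Element attribute = (Element) document.getElementsByTagName("attribute").item(0);
            return new String[] { primaryRef.getAttribute("id"), attribute.getTextContent() };
        } catch (Exception e) {
            throw new IllegalArgumentException("not a usable bibref fragment", e);
        }
    }
}
```

With both pieces available under one element, a parser can build a single publication object carrying the identifier and the descriptive attributes together.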


Complexes

• InteractionRef : need to define experiment?


Experiment confidence?


Participant experimental interactor?


Two flavours make parsing more difficult

• Mix of compact/expanded XML?

• Some elements allow experimentRef but not experimentDescription


Next steps

● Define better way to represent complexes

● Dynamic interactions?

● List of interactors as an interactor?

● Should we use namespaces => modules?
