EMBL-EBI MSD Search tools

MSDlite

The “Atlas” Pages

The Atlas: Ligands

The Atlas: Sequence

Simple search interface

Strengths:
• simple, easy-to-use form
• allows multiple search fields to be combined
• relatively fast, despite performing quite complex SQL queries

Weaknesses:
• does not expose the power of a relational database
• the user can't specify the relationship between search fields:
    "name" AND "title" AND "keyword"
    "name" OR "title" OR "keyword"
    ( "name" OR "title" ) AND NOT "keyword"
• the search form is defined by the authors of the search system, not the author of a query
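
To make the three field relationships concrete, here is a minimal sketch that runs each of them against a toy table. The table `entry`, its columns, its rows and the helper `search` are all hypothetical and only stand in for the real search database; SQLite is used purely so the example is self-contained.

```python
import sqlite3

# Hypothetical toy table; the real MSD schema is far larger.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id TEXT, name TEXT, title TEXT, keyword TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?, ?)",
    [
        ("e1", "lysozyme", "lysozyme structure", "lysozyme"),
        ("e2", "lysozyme", "hydrolase complex", "hydrolase"),
        ("e3", "kinase", "lysozyme inhibitor", "inhibitor"),
        ("e4", "kinase", "kinase domain", "transferase"),
    ],
)


def search(where, term):
    """Run one WHERE template; every '?' is bound to the same '%term%' pattern."""
    sql = f"SELECT id FROM entry WHERE {where} ORDER BY id"
    pattern = f"%{term}%"
    return [row[0] for row in conn.execute(sql, [pattern] * where.count("?"))]


# "name" AND "title" AND "keyword": what a fixed form typically offers
all_fields = search("name LIKE ? AND title LIKE ? AND keyword LIKE ?", "lysozyme")

# "name" OR "title" OR "keyword"
any_field = search("name LIKE ? OR title LIKE ? OR keyword LIKE ?", "lysozyme")

# ( "name" OR "title" ) AND NOT "keyword"
mixed = search("(name LIKE ? OR title LIKE ?) AND keyword NOT LIKE ?", "lysozyme")

print(all_fields, any_field, mixed)
```

The same search term gives three different result sets depending only on how the fields are related, which is the choice a fixed form hides from the user.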

Describing complex searches

• We want to allow the user to entirely control their query
• Since HTML forms are inherently static, we'll use an applet to provide a dynamic "form" that will let the user:
  • choose the fields to be searched
  • specify the relationships between search fields
  • choose the result fields and how results are presented
  • perform "complex" sub-queries, e.g. SSM, FASTA

Graphical DB search system

MSDpro uses an applet for constructing queries and a server to execute them

Avoids the need for the user to understand a complex database schema or know SQL

The user describes their query entirely graphically, including logical operations such as AND, OR and NOT

Applet generates an XML description of the user’s query, which is sent to the MSD query server and converted to SQL automatically
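
The slides do not show the XML format that MSDpro uses, so the following is only a sketch of the general idea under an invented schema: a query tree of `<and>`, `<or>`, `<not>` and `<field>` elements is walked recursively and turned into a parameterised SQL WHERE clause. The element names, attribute names, the table `entry` and the field whitelist are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelists: only known fields and operators reach the SQL text.
ALLOWED_FIELDS = {"entry_id", "name", "title", "keyword"}
OPERATORS = {"eq": "=", "like": "LIKE", "le": "<=", "ge": ">="}


def to_sql(node, params):
    """Recursively convert one query element into a SQL condition."""
    if node.tag == "field":
        name, op = node.get("name"), node.get("op")
        if name not in ALLOWED_FIELDS or op not in OPERATORS:
            raise ValueError(f"unsupported field or operator: {name!r} {op!r}")
        params.append(node.get("value"))  # values are bound, never inlined
        return f"{name} {OPERATORS[op]} ?"
    children = [to_sql(child, params) for child in node]
    if node.tag == "not" and len(children) == 1:
        return f"NOT ({children[0]})"
    if node.tag in ("and", "or") and children:
        return "(" + f" {node.tag.upper()} ".join(children) + ")"
    raise ValueError(f"unsupported element: <{node.tag}>")


def query_to_sql(xml_text, table="entry"):
    """Return (sql, params) for a hypothetical <query> document."""
    root = ET.fromstring(xml_text)
    select = root.get("select")
    if select not in ALLOWED_FIELDS:
        raise ValueError(f"unsupported result field: {select!r}")
    params = []
    where = to_sql(root[0], params)
    return f"SELECT DISTINCT {select} FROM {table} WHERE {where}", params


example = """
<query select="entry_id">
  <and>
    <field name="title" op="like" value="%kinase%"/>
    <or>
      <field name="keyword" op="eq" value="transferase"/>
      <field name="keyword" op="eq" value="hydrolase"/>
    </or>
    <not><field name="name" op="eq" value="lysozyme"/></not>
  </and>
</query>
"""

sql, params = query_to_sql(example)
print(sql)
print(params)
```

Keeping user-supplied values as bound parameters and checking field names against a whitelist is one way for such a server to accept arbitrary query shapes without exposing raw SQL to the user.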

Automatic SQL generation

The query server is a Java servlet that:
• accepts a query description as XML
• converts the user’s query description into a true SQL query, which is then submitted to the search database

Searches can include components that are executed outside of the database, e.g. sequence similarity, determined using FASTA, or structural similarity, determined using SSM
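
One possible way to combine an external component with the database part of a search is sketched below. It is not the MSD implementation: `external_similarity_hits` is a hypothetical stand-in that returns fixed entry identifiers instead of running FASTA or SSM, and the table `entry` and its contents are invented. The external hits are simply passed to the SQL query as bound values in an `IN` list.

```python
import sqlite3


def external_similarity_hits(sequence):
    """Hypothetical stand-in for an external tool such as FASTA or SSM.

    A real server would run the program and parse its output; here a
    fixed set of entry identifiers is returned for illustration.
    """
    return {"1abc", "2xyz", "3def"}


def combined_search(conn, keyword, sequence):
    """Restrict a database search to entries found by the external step."""
    hits = sorted(external_similarity_hits(sequence))
    if not hits:
        return []
    placeholders = ", ".join("?" for _ in hits)
    sql = (
        "SELECT DISTINCT entry_id FROM entry "
        f"WHERE keyword = ? AND entry_id IN ({placeholders}) "
        "ORDER BY entry_id"
    )
    return [row[0] for row in conn.execute(sql, [keyword, *hits])]


# Hypothetical toy data standing in for the search database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (entry_id TEXT, keyword TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?)",
    [
        ("1abc", "hydrolase"),
        ("2xyz", "transferase"),
        ("3def", "hydrolase"),
        ("4ghi", "hydrolase"),
    ],
)

result = combined_search(conn, "hydrolase", "MKTAYIAKQR")
print(result)
```

Entry `4ghi` matches the keyword but is dropped because the external step did not report it, so the external result acts as one more AND-ed condition on the query.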

Visualisation

• The process of representing abstract data to aid in understanding the meaning of the data.

• Not to be confused with rendering data (drawing pictures)

• Typically, though, we render data in such a way as to visualise the information within that data.

Introduction

Biological data comes from and is of interest to:
• Chemists: reaction mechanism, drug design
• Biologists: sequence, expression, homology, function
• Structural biologists: atomic structure, fold, classification, function
• Medicine: clinical effect
• Education
• Media

Presentation of diverse information to a diverse audience. Each has their own point of view (context).
• Expert = scientist working within their own field of expertise
• Non-expert = scientist using data/information outside their field
• Novice = non-scientist

Web pages

These are notoriously badly designed, often resulting in the information on that site being unusable.
• The front page should load quickly
• The main point should appear on the first full screen
• Clutter: not logically laid out
• Too busy: cannot find the salient point
• 8% of men and 0.5% of women are colour blind
• Bad text/fonts

Too often it doesn’t work
• User will go somewhere else
• The latest whizz-bang stuff only works on the latest browsers
• Only works in one browser, because it was only tested on one
• Does not conform to standard HTML

Not just presentation of results

Google is a good design

Asking questions

• Biological data is very complex
  • Chemistry, Biology, Physics, Statistics, Medicine...
  • Most users will be from a different field
• Asking the right question is difficult
  • The user cannot use the correct terminology
  • Too many things to query (2000 attributes in MSD)
  • SQL: not suitable for most users
• Interface too complex
  • Too many check boxes, widgets etc.
  • Trying to be too clever
  • The “Go” button is buried somewhere

Result presentation

Results
• Biological data is complex
  • Chemistry, physics, biology, statistics, medicine…
• Expert users want all the detail
  • i.e. they want to use a specific method
  • They want all the details
  • They want (I hope) the statistical validity of the results
• The non-expert wants the best-practice answer returned within their own context
  • They want comparative analysis with other fields
  • They want to know the results are valid

Query design

• Suitable for text queries
• Only one logic
  • AND or OR
• Predefined
  • Easy to use
  • Limited scope
  • 2000 attributes -> 2000 check-boxes!

The simple text box design is very common

Query design

• Graphical interface
• Multiple logic
  • AND/OR/NOT
  • Under the user's control
• Slower
• Steep learning curve
  • Some users just cannot get it
• Intuitive once mastered
• Pretty

Query design

HIS|SER:S/H>C2.0
HIS.ne2:S/S>C2.0
HIS.[n]:S/T>C2.0

• Figurative 2D sketch for a 3D query (active sites)
• Informative: presents the meaning of the question
• Slower
• Less error prone

select distinct entry_id, ligand_id
  from contact_search sel
 where neighbour_code_3_letter in ('SER','HIS')
   and DISTANCE <= 2.0
   and type_id = 1
   and neighbour_substruct_code = 'side'
   and MACROMOL_SEC_STRUCT_TYPE = 1
intersect
select distinct entry_id, ligand_id
  from contact_search sel
 where neighbour_code_3_letter = 'HIS'
   and (   NEIGHBOUR_ATOM_NAME = 'NE2' and type_id = 1 and distance <= 2.0
        or NEIGHBOUR_SYMBOL = 'N' and type_id = 1 and distance <= 2.0)
   and TYPE_ID != 0
 group by entry_id, ligand_id
having count(distinct neighbour_residue_id) >= 2
intersect
select distinct entry_id, ligand_id
  from contact_search sel
 where neighbour_code_3_letter = 'HIS'
   and NEIGHBOUR_ATOM_NAME = 'NE2'
   and DISTANCE <= 2.0
   and type_id = 1
   and neighbour_substruct_code = 'side'
   and MACROMOL_SEC_STRUCT_TYPE = 2
intersect
select distinct entry_id, ligand_id
  from contact_search sel
 where neighbour_code_3_letter = 'HIS'
   and NEIGHBOUR_SYMBOL = 'N'
   and DISTANCE <= 2.0
   and type_id = 1
   and neighbour_substruct_code = 'side'
   and MACROMOL_SEC_STRUCT_TYPE = 3
intersect
select distinct entry_id, ligand_id
  from residue_contact sel
 where neighbour_code_3_letter in ('HIS','SER','HIS')
   and BOND_STRENGTH != 10
 group by entry_id, ligand_id
having count(*) >= 3;
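
A query of this shape can be assembled mechanically from a list of contact constraints, one SELECT per constraint joined with INTERSECT. The sketch below illustrates that idea only: the `Contact` class, the helper functions and the toy rows are hypothetical, the table and column names are borrowed from the SQL above, and the `group by ... having` sub-selects of the original query are left out for brevity. SQLite stands in for the real database.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Contact:
    """One hypothetical contact constraint between a ligand and a residue."""
    residues: Tuple[str, ...]
    sec_struct_type: int
    atom_name: Optional[str] = None
    element: Optional[str] = None
    max_distance: float = 2.0


def contact_select(c, params):
    """Build one SELECT for a constraint, appending its bound values to params."""
    marks = ", ".join("?" for _ in c.residues)
    clauses = [f"neighbour_code_3_letter IN ({marks})"]
    params.extend(c.residues)
    if c.atom_name is not None:
        clauses.append("neighbour_atom_name = ?")
        params.append(c.atom_name)
    if c.element is not None:
        clauses.append("neighbour_symbol = ?")
        params.append(c.element)
    clauses += [
        "distance <= ?",
        "type_id = 1",
        "neighbour_substruct_code = 'side'",
        "macromol_sec_struct_type = ?",
    ]
    params += [c.max_distance, c.sec_struct_type]
    return (
        "SELECT DISTINCT entry_id, ligand_id FROM contact_search WHERE "
        + " AND ".join(clauses)
    )


def build_query(contacts):
    """Join one SELECT per constraint with INTERSECT."""
    params = []
    selects = [contact_select(c, params) for c in contacts]
    return "\nINTERSECT\n".join(selects), params


# Toy data: only ligand ZN in entry 1abc satisfies all three constraints.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact_search (entry_id TEXT, ligand_id TEXT, "
    "neighbour_code_3_letter TEXT, neighbour_atom_name TEXT, "
    "neighbour_symbol TEXT, distance REAL, type_id INTEGER, "
    "neighbour_substruct_code TEXT, macromol_sec_struct_type INTEGER)"
)
conn.executemany(
    "INSERT INTO contact_search VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("1abc", "ZN", "SER", "OG", "O", 1.9, 1, "side", 1),
        ("1abc", "ZN", "HIS", "NE2", "N", 2.0, 1, "side", 2),
        ("1abc", "ZN", "HIS", "ND1", "N", 1.8, 1, "side", 3),
        ("2xyz", "ZN", "HIS", "NE2", "N", 1.9, 1, "side", 2),
        ("2xyz", "ZN", "SER", "OG", "O", 3.5, 1, "side", 1),
    ],
)

contacts = [
    Contact(("SER", "HIS"), sec_struct_type=1),
    Contact(("HIS",), sec_struct_type=2, atom_name="NE2"),
    Contact(("HIS",), sec_struct_type=3, element="N"),
]
sql, params = build_query(contacts)
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Each constraint narrows the candidate (entry, ligand) pairs independently, so adding a further contact to the sketch only adds one more SELECT to the intersection.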

YAMGP (yet another molecular graphics program)

Many different programs are available

AstexViewer@MSD-EBI

Quanta

Rasmol

MolMol

Chime

O

Spock

Swiss-PDBviewer

Molscript

iMol

Pymol

Chimera

XtalView

Bobscript

InsightII

Raster3D

WebLab-viewer

POVRay

Yasara

LigPlot

WebMol

Grasp

Mage

Whatif

VMD

Frodo

Result visualisation

Multiple types of biological data
• Textual data
• 3D structure
• 2D chemical sketches
• 1D sequence
• Node linked
• General/derived data
• Web pages
• Errors/Variance
• Data provenance

AstexViewer@MSD-EBI

• Java 1.1 Applet
  • Should run under most browsers
  • Small footprint, high speed
• Structure
  • Line, stick, ball & stick, sphere, schematic, surface + texture map
• Written by Mike Hartshorn (Astex Therapeutics Ltd)
• Multiple structures supported

AstexViewer@MSD-EBI

Sequence
• Multiple sequence alignment
• Editing, annotation, colours…
• Consensus alignment
• Pick, brushing & magic lens

Chemistry
• 2D flat representation
• Annotation, colours…
• Interaction types
• Placement fn(contact distance)
• Editable
• Pick, brush and magic lens

Graphs

• 2D, 2D grid and ND
• Linkage plots
• Annotation, colours…
• Ramachandran, etc…
• Pick, brush, magic lens

AstexViewer@MSD-EBI

Visualisation
• Lensing
• Linked views
• Brushing
• Picking
• Flying views
• Hyperbolic distortion
• Animation
• Solid rendering
• Depth cues
• Colour, lighting
• Highlighting
• Etc…

Visualisation: comparative analysis

• Similarity/Difference
  • Data superposition
  • Attribute display
    • Colour, size…
• Correlation
  • Attribute mapping
    • Sequence colour by structure alignment