EMBL-EBI
MSD Search tools
MSDlite
The “Atlas” Pages
The Atlas: Ligands
The Atlas: Sequence
Simple search interface
Strengths:
• simple, easy-to-use form
• allows multiple search fields to be combined
• relatively fast, despite performing quite complex SQL queries
Weaknesses:
• not exposing the power of a relational database
• user can't specify the relationship between search fields:
  • "name" AND "title" AND "keyword"
  • "name" OR "title" OR "keyword"
  • ( "name" OR "title" ) AND NOT "keyword"
• the search form is defined by the authors of the search system, not the author of a query
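The three relationships listed above return different result sets even for the same search term. A minimal sketch with Python's built-in sqlite3 module makes this concrete; the `entry` table, its four rows and the term `kinase` are toy assumptions, not the MSD schema or its data.

```python
import sqlite3

# Toy table standing in for a search database (hypothetical, not the MSD schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (entry_id TEXT, name TEXT, title TEXT, keyword TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?, ?)",
    [
        ("e1", "kinase", "kinase", "kinase"),
        ("e2", "kinase", "other", "other"),
        ("e3", "other", "kinase", "kinase"),
        ("e4", "other", "other", "kinase"),
    ],
)

# The three relationships between search fields, written as SQL conditions.
RELATIONSHIPS = {
    "all_and": "name = :t AND title = :t AND keyword = :t",
    "all_or": "name = :t OR title = :t OR keyword = :t",
    "or_and_not": "(name = :t OR title = :t) AND NOT keyword = :t",
}

def search(relationship, term):
    """Return the sorted entry ids matching one relationship for one term."""
    condition = RELATIONSHIPS[relationship]
    sql = f"SELECT entry_id FROM entry WHERE {condition} ORDER BY entry_id"
    return [row[0] for row in conn.execute(sql, {"t": term})]

results = {name: search(name, "kinase") for name in RELATIONSHIPS}
for name, ids in results.items():
    print(name, ids)
```

A fixed form can only offer one of these relationships; letting the query author pick between them is what the simple interface cannot do.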
Describing complex searches
• We want to allow the user to entirely control their query
• Since HTML forms are inherently static, we'll use an applet to provide a dynamic "form" that will let the user:
  • choose the fields to be searched
  • specify the relationships between search fields
  • choose the result fields and how results are presented
  • perform "complex" sub-queries, e.g. SSM, FASTA
Graphical DB search system
MSDpro uses an applet for constructing queries and a server to execute them
Avoids the need for the user to understand a complex database schema or know SQL
The user describes their query entirely graphically, including logical operations such as AND, OR and NOT
Applet generates an XML description of the user’s query, which is sent to the MSD query server and converted to SQL automatically
Automatic SQL generation
• The query server is a Java servlet that:
  • accepts a query description as XML
  • converts the user’s query description into a true SQL query, which is then submitted to the search database
• Searches can include components that are executed outside of the database, e.g.
  • sequence similarity, determined using FASTA
  • structural similarity, determined using SSM
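The conversion step can be sketched as a recursive walk over the XML query description. The sketch below is in Python rather than as a Java servlet, and the element names (`and`, `or`, `not`, `field`), the `entry` table and the column whitelist are all hypothetical, since the slides do not show MSDpro's actual XML format. Values are passed as bound parameters rather than pasted into the SQL text.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist of searchable columns; real MSD column names may differ.
ALLOWED_FIELDS = {"name", "title", "keyword"}

def to_sql(node, params):
    """Recursively convert one query element into a SQL condition string."""
    if node.tag in ("and", "or"):
        parts = [to_sql(child, params) for child in node]
        if not parts:
            raise ValueError(f"<{node.tag}> needs at least one child")
        return "(" + f" {node.tag.upper()} ".join(parts) + ")"
    if node.tag == "not":
        children = list(node)
        if len(children) != 1:
            raise ValueError("<not> needs exactly one child")
        return "(NOT " + to_sql(children[0], params) + ")"
    if node.tag == "field":
        name = node.get("name")
        if name not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {name!r}")
        params.append(node.get("value"))  # bound later, never interpolated
        return f"{name} = ?"
    raise ValueError(f"unknown element: <{node.tag}>")

def xml_to_sql(xml_text, table="entry"):
    """Return (sql, params) for a whole XML query description."""
    params = []
    where = to_sql(ET.fromstring(xml_text), params)
    return f"SELECT entry_id FROM {table} WHERE {where}", params

QUERY_XML = """
<and>
  <or>
    <field name="name" value="lysozyme"/>
    <field name="title" value="lysozyme"/>
  </or>
  <not><field name="keyword" value="mutant"/></not>
</and>
"""

sql, params = xml_to_sql(QUERY_XML)
print(sql)
print(params)
```

Components executed outside the database, such as a FASTA or SSM search, would need an extra element type whose results are fed back into the query; that part is not covered here.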
Visualisation
• The process of representing abstract data to aid in understanding the meaning of the data.
• Not to be confused with rendering data (drawing pictures)
• Typically, though, we render data in such a way as to visualise the information within that data.
Introduction
• Biological data comes from and is of interest to:
  • Chemists: reaction mechanism, drug design
  • Biologists: sequence, expression, homology, function
  • Structural biologists: atomic structure, fold, classification, function
  • Medicine: clinical effect
  • Education
  • Media
• Presentation of diverse information to a diverse audience. Each has their own point of view (context).
  • Expert = scientist working within their own field of expertise
  • Non-expert = scientist using data/information outside their field
  • Novice = non-scientist
Web pages
• These are notoriously badly designed, often resulting in the information on the site being unusable.
  • The front page should load quickly
  • The main point should appear on the first full screen
  • Clutter: not logically laid out
  • Too busy: cannot find the salient point
  • 8% of men and 0.5% of women are colour blind
  • Bad text/fonts
• Too often it doesn’t work
  • The user will go somewhere else
  • The latest whizz-bang stuff only works on the latest browsers
  • Only works in one browser, because it was only tested on one
  • Does not conform to standard HTML
• Not just presentation of results
• Google is a good design
Asking questions
• Biological data is very complex
  • Chemistry, biology, physics, statistics, medicine...
  • Most users will be from a different field
• Asking the right question is difficult
  • The user cannot use the correct terminology
  • Too many things to query (2000 attributes in MSD)
  • SQL: not suitable for most users
• Interface too complex
  • Too many check boxes, widgets, etc.
  • Trying to be too clever
  • The “Go” button is buried somewhere
Result presentation
Results
• Biological data is complex
  • Chemistry, physics, biology, statistics, medicine…
• Expert users want all the detail
  • i.e. they want to use a specific method
  • They want all the details
  • They want (I hope) the statistical validity of the results
• The non-expert wants the best-practice answer returned within their own context
  • They want comparative analysis with other fields
  • They want to know the results are valid
Query design
Suitable for text queries
• Only one logic
  • AND or OR
• Predefined
  • Easy to use
  • Limited scope
  • 2000 attributes -> 2000 check-boxes!
The simple text box design is very common
Query design
• Graphical interface
• Multiple logic
  • AND/OR/NOT
  • Under the user's control
• Slower
• Steep learning curve
  • Some users just cannot get it
  • Intuitive once mastered
• Pretty
Query design
HIS|SER:S/H>C2.0
HIS.ne2:S/S>C2.0
HIS.[n]:S/T>C2.0
• Figurative 2D sketch for 3D query (active sites)
• Informative: presents meaning for the question
• Slower
• Less error prone
select distinct entry_id, ligand_id
  from contact_search sel
 where neighbour_code_3_letter in ('SER','HIS')
   and DISTANCE <= 2.0 and type_id = 1
   and neighbour_substruct_code = 'side' and MACROMOL_SEC_STRUCT_TYPE = 1
intersect
select distinct entry_id, ligand_id
  from contact_search sel
 where neighbour_code_3_letter = 'HIS'
   and (   NEIGHBOUR_ATOM_NAME = 'NE2' and type_id = 1 and distance <= 2.0
        or NEIGHBOUR_SYMBOL = 'N' and type_id = 1 and distance <= 2.0)
   and TYPE_ID != 0
 group by entry_id, ligand_id
having count(distinct neighbour_residue_id) >= 2
intersect
select distinct entry_id, ligand_id
  from contact_search sel
 where neighbour_code_3_letter = 'HIS'
   and NEIGHBOUR_ATOM_NAME = 'NE2' and DISTANCE <= 2.0 and type_id = 1
   and neighbour_substruct_code = 'side' and MACROMOL_SEC_STRUCT_TYPE = 2
intersect
select distinct entry_id, ligand_id
  from contact_search sel
 where neighbour_code_3_letter = 'HIS'
   and NEIGHBOUR_SYMBOL = 'N' and DISTANCE <= 2.0 and type_id = 1
   and neighbour_substruct_code = 'side' and MACROMOL_SEC_STRUCT_TYPE = 3
intersect
select distinct entry_id, ligand_id
  from residue_contact sel
 where neighbour_code_3_letter in ('HIS','SER','HIS') and BOND_STRENGTH != 10
 group by entry_id, ligand_id
having count(*) >= 3;
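The generated SQL above follows a regular pattern: one SELECT per contact constraint, combined with INTERSECT so that only entry/ligand pairs satisfying every constraint survive. A minimal sketch of that pattern with Python's sqlite3 module; the table and column names are borrowed from the query above, but the cut-down table, the toy rows and the helper `build_contact_query` are hypothetical and not MSD code.

```python
import sqlite3

def build_contact_query(constraints):
    """Build one SELECT per (residue names, max distance) constraint, joined by INTERSECT."""
    if not constraints:
        raise ValueError("need at least one constraint")
    selects, params = [], []
    for residues, max_distance in constraints:
        placeholders = ", ".join("?" for _ in residues)
        selects.append(
            "SELECT DISTINCT entry_id, ligand_id FROM contact_search "
            f"WHERE neighbour_code_3_letter IN ({placeholders}) AND distance <= ?"
        )
        params.extend(residues)
        params.append(max_distance)
    return " INTERSECT ".join(selects), params

# Cut-down toy version of the contact table (made-up ids and distances).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact_search "
    "(entry_id TEXT, ligand_id TEXT, neighbour_code_3_letter TEXT, distance REAL)"
)
conn.executemany(
    "INSERT INTO contact_search VALUES (?, ?, ?, ?)",
    [
        ("toy1", "LIG", "HIS", 1.9),
        ("toy1", "LIG", "SER", 2.0),
        ("toy2", "LIG", "HIS", 1.8),
        ("toy2", "LIG", "SER", 3.5),  # too far: toy2 fails the SER constraint
    ],
)

# A ligand contacted by a HIS within 2.0 and by a SER within 2.0.
sql, params = build_contact_query([(["HIS"], 2.0), (["SER"], 2.0)])
hits = conn.execute(sql, params).fetchall()
print(hits)
```

The sketch leaves out the secondary-structure, atom-name and residue-count conditions of the full query; each would be a further condition inside the relevant SELECT.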
YAMGP (yet another molecular graphics program)
Many different programs are available
AstexViewer@MSD-EBI
Quanta
Rasmol
MolMol
Chime
O
Spock
Swiss-PDBviewer
Molscript
iMol
Pymol
Chimera
XtalView
Bobscript
InsightII
Raster3D
WebLab-viewer
POVRay
Yasara
LigPlot
WebMol
Grasp
Mage
Whatif
VMD
Frodo
Result visualisation
• Multiple types of biological data
  • Textual data
  • 3D structure
  • 2D chemical sketches
  • 1D sequence
  • Node linked
  • General/derived data
  • Web pages
  • Errors/variance
  • Data provenance
AstexViewer@MSD-EBI
• Java 1.1 applet
  • Should run under most browsers
  • Small footprint, high speed
• Structure
  • Line, stick, ball & stick, sphere, schematic, surface + texture map
• Written by Mike Hartshorn (Astex Therapeutics Ltd)
• Multiple structures supported
AstexViewer@MSD-EBI
• Sequence
  • Multiple sequence alignment
  • Editing, annotation, colours…
  • Consensus alignment
  • Pick, brushing and magic lens
• Chemistry
  • 2D flat representation
  • Annotation, colours…
  • Interaction types
  • Placement fn(contact distance)
  • Editable
  • Pick, brush and magic lens
Graphs
• 2D, 2D grid and ND
• Linkage plots
• Annotation, colours…
• Ramachandran, etc…
• Pick, brush and magic lens
AstexViewer@MSD-EBI
• Visualisation
  • Lensing
  • Linked views
  • Brushing
  • Picking
  • Flying views
  • Hyperbolic distortion
  • Animation
  • Solid rendering
  • Depth cues
  • Colour, lighting
  • Highlighting
  • Etc…
Visualisation: comparative analysis
• Similarity/difference
  • Data superposition
  • Attribute display: colour, size…
• Correlation
  • Attribute mapping: sequence colour by structure alignment
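Attribute mapping, such as colouring a sequence by structure alignment, amounts to turning a per-residue number into a display attribute. A minimal sketch, assuming the attribute is a per-residue distance after superposition and using a linear blue-to-red ramp; the function names, the 0 to 4 range and the example distances are illustrative assumptions, not AstexViewer's API.

```python
def lerp_colour(value, lo, hi, start=(0, 0, 255), end=(255, 0, 0)):
    """Map value in [lo, hi] linearly onto an RGB ramp, clamping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))

def colour_sequence(sequence, distances, lo=0.0, hi=4.0):
    """Pair each residue letter with the colour for its distance."""
    if len(sequence) != len(distances):
        raise ValueError("need one distance per residue")
    return [(res, lerp_colour(d, lo, hi)) for res, d in zip(sequence, distances)]

# Well-aligned residues come out blue, poorly aligned ones red (toy values).
coloured = colour_sequence("HSH", [0.0, 4.0, 9.0])
for residue, rgb in coloured:
    print(residue, rgb)
```

The same mapping could drive size or any other display attribute; only the ramp endpoints change.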