Page 1: CATH Soap Web Services  SWSScan

CATH Soap Web Services

http://farmer2.biochem.ucl.ac.uk/cgi-bin/SWSScan.cgi

SWSScan

Page 2: CATH Soap Web Services  SWSScan

Orengo Group, UCL

What we want to offer

• Access to the CATH algorithms

• Access to CATH data

• “Programmable” access to our own resources for a new automated update protocol

Page 3: CATH Soap Web Services  SWSScan

Sequence Comparisons

• BLAST query sequences against complete CATH domain sequence database

• NW query sequences against complete CATH domain sequence database

• Scan query sequences against the CATH SAM HMM library

• Scan query sequences against the CATH HMMER HMM library

Page 4: CATH Soap Web Services  SWSScan

Structural Comparisons

• SSAP is a residue-based pairwise comparison algorithm that scores structural similarity between chains/domains in PDB format.

• Cathedral uses graphs to identify the fold of the query structure and then confirms fold assignment using SSAP

• Both algorithms require additional files that can be derived from the original PDB on the fly

Page 5: CATH Soap Web Services  SWSScan

Why SOAP?

• It’s simple

• Allows management of our farms without changing the current configuration.

• A protocol being adopted by other FP6 projects such as BioSapiens

Page 6: CATH Soap Web Services  SWSScan

SWSScan

SOAP HTTP Layer

Perl Layer

SGE Farm Manager

[Diagram: 20 boxes, each labelled "Animal", beneath the SGE Farm Manager]

Page 7: CATH Soap Web Services  SWSScan

SOAP Bits

#!/usr/local/bin/perl -w
# Server side (SWSScan.cgi): hand incoming SOAP calls to the SWSScan module.
use SOAP::Transport::HTTP;

SOAP::Transport::HTTP::CGI
    ->dispatch_to('/usr/local/bin/cath/perl', 'SWSScan')
    ->handle;

#!/usr/local/bin/perl -w
# Client side: submit a BLAST scan and keep the returned job id.
use SOAP::Lite;

my $urn   = "SWSScan";
my $proxy = "http://farmer2.biochem.ucl.ac.uk/cgi-bin/SWSScan.cgi";

my $URI         = SOAP::Lite->uri("urn:$urn");
my $hWebService = $URI->proxy($proxy);

# $in holds the query data
my $job_id = $hWebService->submit_scan($in, "blast")->result;

Page 8: CATH Soap Web Services  SWSScan

Input formats

• Accept as many formats as possible to make the service as useable as possible

• Makes the web service responsible for parsing data rather than the user

• Ultimately parses all input data into a standard data structure

Page 9: CATH Soap Web Services  SWSScan

Input Formats

• Scalar
  – Single or multiple sequence FASTA
  – Full or chain/domain PDB
  – Single or list of CATH chain/domain ids
  – Single or list of PDB codes

• Array
  – As scalar in each element

• Object
  – Standard data structure only
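
The scalar and array cases above could be folded into one list of records along the following lines. This is a minimal sketch, not the service's actual parser: the function names `guess_format` and `normalise_input`, the regular expressions, and the `{ format, data }` record layout are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical classifier: guess the kind of one scalar input.
# The patterns are illustrative assumptions, not the service's real rules.
sub guess_format {
    my ($text) = @_;
    return 'fasta'    if $text =~ /^>/m;
    return 'pdb'      if $text =~ /^(?:ATOM|HETATM|HEADER)\b/m;
    return 'pdb_code' if $text =~ /^\s*[0-9][A-Za-z0-9]{3}\s*$/;
    return 'cath_id'  if $text =~ /^\s*[0-9][A-Za-z0-9]{3}[A-Za-z0-9][0-9]{0,2}\s*$/;
    return 'unknown';
}

# Accept a scalar or an array reference and return an array reference of
# { format => ..., data => ... } records (a stand-in for the standard structure).
sub normalise_input {
    my ($in) = @_;
    my @records;
    for my $item (ref $in eq 'ARRAY' ? @$in : ($in)) {
        my $format = guess_format($item);
        # FASTA and PDB text stay whole; anything else may be a list of ids.
        my @parts = ($format eq 'fasta' || $format eq 'pdb')
                  ? ($item)
                  : grep { length } split /[\s,]+/, $item;
        push @records, map { +{ format => guess_format($_), data => $_ } } @parts;
    }
    return \@records;
}

my $records = normalise_input([ ">query1\nMKVL\n", "1cuk 12asA0" ]);
print "$_->{format}\n" for @$records;
```

Doing the classification on the server keeps the client call identical for every input type, which is the point made on the previous slide.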

Page 10: CATH Soap Web Services  SWSScan

SWSScan Output

• Job id is returned rather than the results because
  – Apache has a timeout
  – The farm could be busy
  – The job could be big

• Requires a monitoring call and a retrieval call
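
The submit, monitor, retrieve cycle might look like the sketch below. Only `submit_scan` appears in these slides; `job_status`, `fetch_results`, the status strings, and the `MockService` class are hypothetical and exist only so the example runs without the real server.

```perl
use strict;
use warnings;

# Stand-in for the remote service so the sketch runs offline.
# Only submit_scan appears in the slides; job_status and fetch_results
# are hypothetical names for the monitoring and retrieval calls.
package MockService;
sub new           { return bless { polls => 0 }, shift }
sub submit_scan   { return 'job-1' }
sub job_status    { my $s = shift; return ++$s->{polls} < 3 ? 'RUNNING' : 'DONE' }
sub fetch_results { return { matches => [] } }

package main;

# Submit, poll until the job finishes, then retrieve the results.
sub run_scan {
    my ($service, $input, $method, %opt) = @_;
    my $interval  = $opt{interval}  // 5;      # seconds between polls
    my $max_polls = $opt{max_polls} // 100;    # give up after this many

    my $job_id = $service->submit_scan($input, $method);
    for (1 .. $max_polls) {
        my $status = $service->job_status($job_id);
        return $service->fetch_results($job_id) if $status eq 'DONE';
        die "job $job_id failed\n"              if $status eq 'FAILED';
        sleep $interval;
    }
    die "job $job_id still running after $max_polls polls\n";
}

my $results = run_scan(MockService->new, ">query1\nMKVL\n", 'blast', interval => 0);
print "retrieved ", scalar @{ $results->{matches} }, " matches\n";
```

With a real SOAP::Lite client each call would return a response object, so `->result` would be needed on every call, as in the `submit_scan` example earlier.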

Page 11: CATH Soap Web Services  SWSScan

SWSScan Output

• A structured data object: a results object holds a keyed hash of query sequence objects, each of which contains an array of matching sequence objects

• All returned data is contained in the object including any requested files
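
As a rough illustration of that nesting, the sketch below builds a small results object by hand and walks it. The field names are taken from the Wrapper, Sequence and matches slides; the hash key `query1`, the function `summarise`, and the exact layout are assumptions.

```perl
use strict;
use warnings;

# Hand-built example of the nesting; 'query1' is a made-up key.
my $results = {
    date     => 'Tue Nov 29 10:29:27 GMT 2005',
    source   => 'CATH',
    Sequence => {
        query1 => {
            id      => 'query1',
            matches => [
                { type => 'BLAST', id => '12asA0', evalue => '1.9', raw_score => '15.4' },
            ],
        },
    },
};

# One tab-separated line per match, best e-value first within each query.
sub summarise {
    my ($res) = @_;
    my @lines;
    for my $query_id (sort keys %{ $res->{Sequence} }) {
        my $query = $res->{Sequence}{$query_id};
        for my $m (sort { $a->{evalue} <=> $b->{evalue} } @{ $query->{matches} || [] }) {
            push @lines, join "\t", $query_id, @{$m}{qw(type id evalue)};
        }
    }
    return @lines;
}

print "$_\n" for summarise($results);
```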

Page 12: CATH Soap Web Services  SWSScan

Wrapper

date e.g. ‘date’

source e.g. ‘CATH’

Sequence e.g. Sequence Object

Page 13: CATH Soap Web Services  SWSScan

Sequence

date e.g. ‘set date’

source e.g. ‘source of data’

type e.g. ‘?’

Id e.g. ‘1cuk001’

length e.g. size of residue array

sequence

pdb_header/footer e.g. ‘raw text’

fasta_header e.g. ‘raw text’

wolf_file, sec_file, etc. e.g. ‘raw text’

matches-> an array of sequence objects

Page 14: CATH Soap Web Services  SWSScan

Residue

sequence_number e.g. key to Residue array

letter e.g. ‘residue single letter’

coords e.g. ‘ATOM 1 N MET 1 -7.750 -4.498 -20.265 1.00 21.82

ATOM 2 CA MET 1 -7.178 -5.177 -19.122 1.00 18.39

ATOM 3 C MET 1 -7.857 -4.686 -17.798 1.00 20.00

ATOM 4 O MET 1 -8.202 -5.472 -16.932 1.00 18.10

ATOM 5 CB MET 1 -5.728 -4.808 -19.125 1.00 19.97

ATOM 6 CG MET 1 -4.838 -5.882 -18.694 1.00 29.38

ATOM 7 SD MET 1 -3.162 -5.241 -18.626 1.00 33.40

ATOM 8 CE MET 1 -3.477 -3.687 -19.473 1.00 39.23
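
One way to build such residue records from ATOM lines is sketched below. It assumes the whitespace-separated layout printed on this slide (no chain column) rather than fixed-column PDB format, stores `coords` as an array of raw lines rather than one text block, and uses a deliberately partial residue-name map; `residues_from_atoms` is a hypothetical name.

```perl
use strict;
use warnings;

# Partial three-letter to one-letter map; extend as needed.
my %one_letter = (ALA => 'A', GLY => 'G', LEU => 'L', MET => 'M', SER => 'S');

# Group whitespace-separated ATOM lines (as printed on the slide, with no
# chain column) into residue records keyed by sequence number.
sub residues_from_atoms {
    my (@lines) = @_;
    my %residue;
    for my $line (@lines) {
        my ($record, $serial, $atom, $res_name, $seq_num) = split ' ', $line;
        next unless defined($seq_num) && $record eq 'ATOM';
        $residue{$seq_num} ||= {
            sequence_number => $seq_num,
            letter          => $one_letter{$res_name} // 'X',   # 'X' if unmapped
            coords          => [],
        };
        push @{ $residue{$seq_num}{coords} }, $line;
    }
    return \%residue;
}

my $residues = residues_from_atoms(
    'ATOM 1 N MET 1 -7.750 -4.498 -20.265 1.00 21.82',
    'ATOM 2 CA MET 1 -7.178 -5.177 -19.122 1.00 18.39',
);
print "$residues->{1}{letter}: ", scalar @{ $residues->{1}{coords} }, " atoms\n";
```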

Page 15: CATH Soap Web Services  SWSScan

Sequence - matches

date e.g. 'Tue Nov 29 10:29:27 GMT 2005'

type e.g. 'BLAST'

id e.g. '12asA0'

source e.g. 'CATH'

sequence_length e.g. '330'

matched_length e.g. '22'

query_start e.g. '1'

query_stop e.g. '20'

match_start e.g. '185'

match_stop e.g. '206'

raw_score e.g. '15.4'

evalue e.g. '1.9'

sequence_id e.g. '36.364'

Page 16: CATH Soap Web Services  SWSScan

Other Web Services

• Retrieval of CATH data
  – Accession codes
  – PDB definitions of domains
  – Derived CATH files

• Gene3D
  – CATH <-> Uniprot mappings

Page 17: CATH Soap Web Services  SWSScan

Questions

• We are also offering the use of our machines; is some kind of quota system expected?

• Is using a job id an acceptable protocol?

• If you only use a retrieval call, how do you tell the difference between “no results” and “no results YET”?

• Will there be a central repository of code?

• What about BioPerl?

• How much more complex is the structured data model going to get?