Introduction to SRS


HKU

ComputerCentre

Introduction to SRS

Frankie Cheung frankie@cc.hku.hk

Part 0. Introduction

SRS Introduction

SRS started out as a Sequence Retrieval System that employed sophisticated parsing and indexing of database text files.

Fast access to diverse life science data (genetic, protein, cellular, molecular, and clinical) for researchers and bioinformaticians

Integration of public and proprietary data through one interface

Unique ability to perform cross-database queries

Rapid string search of large volumes of data

Seamless integration of data and analysis tools

Starting SRS

Browse the BIOSUPPORT homepage: http://bioinfo.hku.hk/

Select the “Tools” option.

On the “Tools” page, find and click on the Sequence Retrieval System (SRS).

SRS Main Page

After a successful login, the default page is a “Temporary Project”.

In a “Temporary Project”, the results of your work are not stored after you close SRS.

If you start a “Permanent Project”, the results of your work are stored even after you close SRS.

Permanent Project

After you select a “Permanent Project”, your account name and project name are displayed. (Because there is no password protection under the current policy, this approach is not recommended.)

All query and job results are saved in your “Permanent Project”.

Option “Save to desktop”: download and save your project information to your local machine

Option “Rename project”: change the name of your project

Option “Delete project”: delete all information related to your current project

“Quick Searches” page

Quick access with the default databank selection

“Quick Text Search”

Quick sequence query with the default databank selection

Enter your search string and click “Search” to start searching.

Default databank settings:

Nucleotide Seq: EMBL

Protein Seq: SWALL

Protein Structure: PDB

Genome: LocusLink

Mutations: OMIM

Metabolic Pathway: Pathway

Part 1. Simple Query

Simple query

Go back to “Select Databank” and click on “PIR” to select the PIR databank.

Then click the “Query” button to open the query form.

Click the first “AllText” pull-down menu and change it to “Description”. Enter “cytochrome” in the blank field beside it.

Click the second “AllText” pull-down menu and change it to “Organism”. Enter “pig” in the blank field beside it.

Click the “Search” button to search the PIR databank for sequences that satisfy your query form.

After a short while, a list of result entries appears. Click the PIR:S26019 entry.
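The query form in this walkthrough combines two field conditions, and an entry has to satisfy both to be listed. A minimal Python sketch of that AND logic follows. The records are made up (the IDs and field values are hypothetical, not real PIR entries), and case-insensitive substring matching is only an assumption about how the form compares text.

```python
# Toy records standing in for databank entries; all values are hypothetical.
records = [
    {"id": "X0001", "description": "cytochrome c oxidase chain I", "organism": "pig"},
    {"id": "X0002", "description": "cytochrome b", "organism": "human"},
    {"id": "X0003", "description": "hemoglobin alpha chain", "organism": "pig"},
]


def field_contains(record, field, text):
    """Case-insensitive substring test on one field (an assumed matching rule)."""
    return text.lower() in record[field].lower()


def simple_query(records, **conditions):
    """Return IDs of records that satisfy every field condition (logical AND)."""
    return [
        r["id"]
        for r in records
        if all(field_contains(r, f, t) for f, t in conditions.items())
    ]


hits = simple_query(records, description="cytochrome", organism="pig")
print(hits)
```

Only the record that matches both “Description” and “Organism” is returned; a record that matches one field alone is dropped.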

Gene Sequence

SRS displays the PIR:S26019 entry, including ID, accession number, description, keywords, etc.

Scroll down to see the references related to this sequence; a Medline reference hyperlink is also available.

Scroll down to see the gene function and the gene sequence.

To save the gene sequence information to your local machine, click the “Save” button in the leftmost panel.

Click the “Output to” button and choose the text/html format.

Click the “Use View” button and choose the “Complete entries” format.

Then click “Save” again to save PIR:S26019 to your local machine.

Simple query: Exercise

15 Minutes Break

Please try the following exercises by yourself!

Question 1.

Search all the “SARS virus” in EMBL databases

Question 2.

How many CDS regions are there in “EMBL:AY278488” (SARS coronavirus BJ01)?

Click AY278488 to see its details.

Count the number of CDS entries.

CDS entries of EMBL:AY278488:

orf1ab orf1a polyprotein spike glycoprotein S putative uncharacterized protein 1 putative uncharacterized protein 2 envelope protein E membrane protein M putative uncharacterized protein 3 putative uncharacterized protein 4 nucleocapsid protein N putative uncharacterized protein 5
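Counting CDS entries by eye can also be done on a saved flat file. The sketch below counts feature-table lines whose key is CDS in text laid out like an EMBL flat file. The excerpt is illustrative only: its coordinates and product names are invented and are not copied from the real AY278488 record.

```python
# Illustrative excerpt in EMBL flat-file layout; coordinates and products are
# made up and are NOT taken from the real AY278488 record.
embl_text = """\
ID   EXAMPLE1; SV 1; linear; genomic RNA; STD; VRL; 3000 BP.
FT   source          1..3000
FT   CDS             100..900
FT                   /product="example protein A"
FT   CDS             950..1500
FT                   /product="example protein B"
FT   CDS             complement(1600..2900)
FT                   /product="example protein C"
//
"""


def count_cds(flat_file_text):
    """Count feature-table lines whose feature key is CDS."""
    count = 0
    for line in flat_file_text.splitlines():
        parts = line.split()
        # A feature line reads "FT <key> <location>"; qualifier lines have a
        # "/qualifier" token in second position, so they are not counted.
        if len(parts) >= 3 and parts[0] == "FT" and parts[1] == "CDS":
            count += 1
    return count


print(count_cds(embl_text))
```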

Question 3.

What is the protein ID for the EMBL:AY278741 (SARS coronavirus Urbani) spike glycoprotein?

Click AY278741 to see its details.

EMBL:AY278741

Question 4.

Following the link of the spike glycoprotein of EMBL:AY278741, what is the biological function of this spike protein?

EMBL:AY278741

Find the biological function of VGL2_CVHSA.

Part 2. Extended Search

Extended Search

Click the “Select Databanks” button at the top and select PIR as the query target.

Click the “Query” button at the top to go back to the query form.

Click the “Extended Query” button in the left panel to display the extended search form.

In the “Extended” search form, find the “Super Family” field and enter “cytochrome-c”. Click “Search” to start the query.

After a short while, the extended search result list appears.

Using Search Results

Click the “Result” button at the top to reuse your query results.

On the “Result” page, enter “Q1 !Q3” in the top blank field, click “Expression” to combine the results of the two queries (Q1 but not Q3), and then click “Search”.

After a short while, SRS gives the result of Query 1 but not Query 3: all sequences in PIR whose description contains the string “cytochrome” and whose organism contains “pig”, but whose super-family does not contain the string “cytochrome-c”.
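The “Q1 !Q3” combination behaves like a set difference: keep what is in Query 1 and drop anything that also appears in Query 3. A small Python sketch of that idea, using placeholder entry IDs that are not real PIR hits:

```python
# Hypothetical result sets; the entry IDs are placeholders, not real PIR hits.
q1 = {"PIR:A0001", "PIR:A0002", "PIR:A0003"}  # description and organism match
q3 = {"PIR:A0002", "PIR:A0009"}               # super-family match


def but_not(left, right):
    """Entries in `left` that are absent from `right` (the "Q1 !Q3" idea)."""
    return left - right


result = but_not(q1, q3)
print(sorted(result))
```

The entry present in both sets is removed, and the entry that appears only in Q3 never enters the result.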

Extended Search: Exercise

15 Minutes Break

Please try the following exercises by yourself!

Question 1.

Search all the human DNA sequence entries created after 01-Jan-2003 with sequence length greater than 100000 in EMBL.
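Question 1 combines three conditions: an organism, a creation date lower bound, and a sequence length lower bound. The sketch below shows the same filter in Python over invented entries (IDs, dates, and lengths are hypothetical). Treating both bounds as strict (“after”, “greater than”) is my reading of the question wording.

```python
from datetime import date

# Hypothetical entries; IDs, dates and lengths are invented for illustration.
entries = [
    {"id": "E1", "organism": "Homo sapiens", "created": date(2003, 3, 14), "length": 150_000},
    {"id": "E2", "organism": "Homo sapiens", "created": date(2002, 11, 2), "length": 200_000},
    {"id": "E3", "organism": "Homo sapiens", "created": date(2003, 6, 30), "length": 50_000},
    {"id": "E4", "organism": "Mus musculus", "created": date(2003, 8, 1), "length": 300_000},
]


def extended_search(entries, organism, created_after, min_length):
    """Keep entries for the organism that were created strictly after the
    given date and are strictly longer than min_length."""
    return [
        e["id"]
        for e in entries
        if e["organism"] == organism
        and e["created"] > created_after
        and e["length"] > min_length
    ]


matches = extended_search(entries, "Homo sapiens", date(2003, 1, 1), 100_000)
print(matches)
```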

Question 2.

Search all the mouse protein sequence entries created before 01-Jul-2002 with NCBI TAX_ID less than 4000 in SWISSPROT.

Question 3.

Using the previous result, search for the entries that have links to the “PROSITE” database.

Identify the query ID (Q6 in this case; you may not have the same query ID).

Enter the expression that finds the entries in the previous result that have links to the “PROSITE” database (“Q6<PROSITE” in this case; you may not have the same query ID).
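The link expression can be pictured as a lookup in a cross-reference table. The Python sketch below is an assumption-laden model, not SRS itself: the `links` table and all entry IDs are hypothetical, and the reading of the two operators (“<” keeps the query's own entries that have a link, “>” returns the linked entries in the target databank) is inferred from how these slides use “Q6<PROSITE” and the “... > EMBL” expression of Q9.

```python
# Hypothetical cross-reference table: source entry -> linked entries elsewhere.
links = {
    "SWISSPROT:P1": ["PROSITE:PS0001", "EMBL:X1"],
    "SWISSPROT:P2": ["EMBL:X2"],
    "SWISSPROT:P3": ["PROSITE:PS0002"],
}


def databank(entry_id):
    """Databank prefix of an ID such as "EMBL:X1"."""
    return entry_id.split(":", 1)[0]


def keep_linked(query_set, target):
    """Model of "Q<TARGET": entries of the query set that link to the target."""
    return {
        e for e in query_set
        if any(databank(x) == target for x in links.get(e, []))
    }


def follow_links(query_set, target):
    """Model of "Q>TARGET": target-databank entries reached from the query set."""
    return {
        x for e in query_set
        for x in links.get(e, [])
        if databank(x) == target
    }


q6 = {"SWISSPROT:P1", "SWISSPROT:P2", "SWISSPROT:P3"}
print(sorted(keep_linked(q6, "PROSITE")))
print(sorted(follow_links(q6, "EMBL")))
```

Note the difference in what comes back: `keep_linked` returns entries from the original databank, while `follow_links` returns entries from the target databank.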

Part 3. Linking related sequence / database

Linking Related Sequence

Go back to result #3 and click to select the first two sequences. Click the “Link” button to search for related sequences.

Make sure “Find related entries” is selected. Click “EMBL” to search for related sequences in this databank, then click the “Search” button.

Result: 5 entries found in “EMBL” are related to your selected sequences.

Refine the result list by choosing “SeqSimpleView” in the left panel.

Linking database: Exercise

15 Minutes Break

Please try the following exercises by yourself!

Question 1.

Find in the EMBL database the corresponding sequences that link to the longest sequence of the previous result (question #2 in extended search).

Go back to the previous result, Question #2 in extended search (Query #6 in this case).

Sort the result list to find the longest sequence.

Select the longest sequence and click the “Link” button.

Search the “EMBL” database for entries related to the selected sequence.

EMBL:AP000423 is matched.
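The sorting step in this exercise amounts to ordering the result list by sequence length and taking one end of it. A minimal Python sketch with an invented result list (IDs and lengths are hypothetical):

```python
# Hypothetical result list; IDs and lengths are invented.
results = [
    {"id": "S1", "length": 320},
    {"id": "S2", "length": 1540},
    {"id": "S3", "length": 87},
    {"id": "S4", "length": 910},
]


def longest(results):
    """ID of the entry with the greatest sequence length."""
    return max(results, key=lambda e: e["length"])["id"]


def n_shortest(results, n):
    """IDs of the n shortest entries, shortest first."""
    return [e["id"] for e in sorted(results, key=lambda e: e["length"])[:n]]


print(longest(results))
print(n_shortest(results, 2))
```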

Question 2.

Using the previous result (question #1 in simple query), make a query to find the entries that have links to the SWISSPROT database.

Go back to the previous result, Question #1 in simple query (Query #2 in this case).

Click the “Link” button with “Unselected only”.

Select “Swissprot” and show only results with related entries linked to “Swissprot”.

Around 11 EMBL entries are matched. The returned list should contain EMBL entries, not SWISSPROT entries.

Question 3.

Using the previous result (question #1 in simple query), make a query to find the entries that have no links to the SWISSPROT database.

Select “Swissprot” and show only results without related entries.

Around 32 EMBL entries are matched.

Part 4. Application

Application: BLAST

Go back to the query result page -> Q9 "(([PIR-ID:ODHU1] | [PIR-ID:ODBO1]) > EMBL )". Click the checkbox of EMBL:MIHSM1 and then click “Launch” with the application “BLASTN”.

When the BLAST window appears, change “Complete Database to search” to Yeast Complete Genome. Change other parameters as you like and then click “Launch” to execute.

The BLAST job is submitted to the batch queue. Click the hourglass icon (top left corner) to check whether the job has finished.

After the job result is ready, click to see the BLAST job result.

After a short while, the BLAST result list appears; click any entry to view the details.

BLAST alignment result with the databanks

Application: ClustalW

Go back to the BLAST result list and click the checkboxes to select the first 5 sequences. Change the Application pull-down menu to “NClustalW” and then click “Launch”.

The “ClustalW” window then appears; change the parameters as you like.

After you click “Launch”, SRS runs ClustalW on your selected sequences and gives the result.

Application: EMBOSS (e.g. PRETTYPLOT)

Starting from the ClustalW result, click “Tools”.

Choose “prettyplotN” and then click “Launch”.

The “prettyplotN” interface opens; scroll down.

Click to select “Display the consensus”, and then click “Launch”.

Click the first graphic file to view the result; the second file will be empty.

Identical residues are shown in red. The last line shows the consensus of the alignment.
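To make the idea of a consensus line concrete, here is a small Python sketch that takes the most common residue in each alignment column and also reports the columns where all sequences agree. It uses a plain majority rule, which is an assumption for illustration and not necessarily how PRETTYPLOT computes its consensus; the toy alignment is invented.

```python
from collections import Counter

# Toy alignment (equal-length rows, "-" marks a gap); sequences are invented.
alignment = [
    "MKT-AYIA",
    "MKTLAYLA",
    "MRT-AYIA",
]


def column_consensus(alignment, gap="-"):
    """Most common non-gap residue per column; a gap if the column is all gaps."""
    consensus = []
    for column in zip(*alignment):
        residues = [c for c in column if c != gap]
        if residues:
            consensus.append(Counter(residues).most_common(1)[0][0])
        else:
            consensus.append(gap)
    return "".join(consensus)


def identical_columns(alignment, gap="-"):
    """Indices of columns where every sequence carries the same residue."""
    return [
        i for i, column in enumerate(zip(*alignment))
        if len(set(column)) == 1 and column[0] != gap
    ]


print(column_consensus(alignment))
print(identical_columns(alignment))
```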

Application: EMBOSS (e.g. SIXPACK)

Go back to the query result page -> Q9 "(([PIR-ID:ODHU1] | [PIR-ID:ODBO1]) > EMBL )". Click the checkbox of EMBL:MIHSM1 and then click “Tools”.

Choose “SIXPACK” and then click “Launch”.

Click “Launch” to run the SIXPACK program.

On the result page, see the number of possible ORFs in the 6 frames.

See the translation in all 6 frames.
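A six-frame translation reads the DNA in three offsets on the forward strand and three on the reverse complement. The Python sketch below illustrates the idea with the standard genetic code. It is not the SIXPACK implementation; the frame labels (+1..+3, -1..-3) are my own convention and may not match SIXPACK's numbering, and the input sequence is invented.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order line up with AMINO.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Translations of the three forward and three reverse-strand frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


dna = "ATGGCCTAA"  # invented example sequence
for name, protein in six_frames(dna).items():
    print(name, protein)
```

A stop codon appears as `*`, so an open reading frame is a stretch of a frame's translation between two `*` symbols.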

Application: EMBOSS (e.g. TMAP)

Several EMBOSS programs can be launched directly from SRS, for example TMAP.

Go back to the Q4 result, click the checkbox of PIR:S26019, and then click “Tools”.

A list of applications is shown; change the pull-down menu to “TMAP” and click “Launch”.

When the “TMAP” window appears, click “Launch” to run TMAP on your selected sequence.

After a short while, TMAP gives the predicted transmembrane segments.

Scroll to the bottom and click to see the diagram of membrane-spanning regions.

Application: Exercise

15 Minutes Break

Please try the following exercises by yourself!

Question 1.

Do the sequence alignment of the 5 shortest sequences from the previous result (question #2 in extended search) using ClustalW.

Go back to the previous result, Question #2 in extended search (Q6 in this case).

Sort the result list to find the 5 shortest sequences.

Select the 5 shortest sequences and launch “Clustalw”.

There is no need to modify any parameters; click “Launch” to start the sequence alignment.

Question 2.

Do the BLAST search using the EMBL:AY278491 (HKU-SARS virus) sequence against the EMBL (Release): VRL (virus) database.

Go back to the previous result list (Q2 in the normal case) to find the HKU-SARS virus.

Search against the Viruses database.

After a few minutes, the BLAST result list is displayed.

See the BLAST result vs TOR2.

Question 3.

Find the transmembrane segments of the SWISSPROT:VME1_CVHSA (SARS virus M-protein) sequence using TMAP.

After finding the sequence, launch “tmap”.

Part 5. Others

User submits own sequence

Users can submit their own sequences to work with SRS.

Go to “Select Databanks”, click “Expand all”, and find “User Owned Databanks”.

Click “Add Data” beside USERDNA.

Enter your own DNA sequence in the blank field, then click “Launch” to add it to the USERDNA databank.

Wait for SRS to store your DNA sequence.

Your sequence has been successfully added to the USERDNA databank.

You can then search for related linked sequences or launch an application with your sequence. For example, you can launch BLAST with your input sequence.

SRS On-Line Help

-END-
