Introduction to SRS
Frankie Cheung
HKU Computer Centre

Part 0. Introduction
SRS Introduction
SRS started out as a Sequence Retrieval System that employed sophisticated parsing and indexing of database text files.
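The idea of parsing and indexing flat text files can be sketched in a few lines of Python. This is only a toy illustration and not how SRS itself is implemented: the two records, the `DEMO1`/`DEMO2` IDs and the function names are made up, and the layout merely imitates an EMBL-style file with two-letter line codes and `//` terminators.

```python
from collections import defaultdict

# Toy flat-file text in an EMBL-like layout; both records are made up.
FLAT_FILE = """\
ID   DEMO1
DE   cytochrome c example entry
OS   Sus scrofa (pig)
//
ID   DEMO2
DE   hemoglobin example entry
OS   Homo sapiens (human)
//
"""

def parse_entries(text):
    """Split the text on '//' terminators and collect fields by line code."""
    entries = []
    current = {}
    for line in text.splitlines():
        if line.startswith("//"):
            if current:
                entries.append(current)
            current = {}
        elif line.strip():
            code, _, value = line.partition("   ")
            current[code] = (current.get(code, "") + " " + value.strip()).strip()
    if current:
        entries.append(current)
    return entries

def build_index(entries):
    """Map (field code, lower-cased word) to the set of entry IDs containing it."""
    index = defaultdict(set)
    for entry in entries:
        for code, value in entry.items():
            words = value.lower().replace("(", " ").replace(")", " ").split()
            for word in words:
                index[(code, word)].add(entry["ID"])
    return index

entries = parse_entries(FLAT_FILE)
index = build_index(entries)
# Field-specific lookup: description contains "cytochrome" AND organism contains "pig".
hits = index[("DE", "cytochrome")] & index[("OS", "pig")]
```

Once the index exists, a field-specific query becomes a set lookup rather than a scan of the whole file, which is why indexed retrieval stays fast on large databanks.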
SRS Introduction
Fast access to diverse life science data (genetic, protein, cellular, molecular, and clinical) for researchers and bioinformaticians
Integration of public and proprietary data through one interface
Unique ability to perform cross-database queries
Rapid string search of large volumes of data
Seamless integration of data and analysis tools

Starting SRS
Browse the BIOSUPPORT homepage: http://bioinfo.hku.hk/
Select the “Tools” option.
On the “Tools” page, find and click the Sequence Retrieval System (SRS) link.

SRS Main Page
After a successful login, the default page is a “Temporary Project”.
For a “Temporary Project”, the results of your work are not stored after you close SRS.
If you start a “Permanent Project”, the results of your work are stored even after you close SRS.

Permanent Project
After you select a “Permanent Project”, your account name and project name are displayed. (Because there is no password protection under the current policy, this approach is not recommended.)
All query and job results are saved in your “Permanent Project”.
Option “Save to desktop”: download and save your project information to your local machine
Option “Rename project”: change the name of your project
Option “Delete project”: delete all information related to your current project

“Quick Searches” page
Quick access with the default databank selection

“Quick Text Search”
Quick sequence query with the default databank selection
Enter your search string and click “Search” to start searching.

Default databank settings:
Data type            Default databank
Nucleotide Seq       EMBL
Protein Seq          SWALL
Protein Structure    PDB
Genome               LocusLink
Mutations            OMIM
Metabolic Pathway    Pathway

Part 1. Simple Query

Simple query
Go back to “Select Databank” and click “PIR” to select the PIR databank.
Then click the “Query” button to open the query form.
Change the first “AllText” pull-down menu to “Description” and enter “cytochrome” in the blank field beside it.
Change the second “AllText” pull-down menu to “Organism” and enter “pig” in the blank field beside it.
Click the “Search” button to search the PIR databank for sequences that satisfy your query form.
After a short while, a result list of matching entries appears. Click the PIR:S26019 entry.

Gene Sequence
SRS displays the PIR:S26019 entry, including its ID, accession number, description, keywords, etc.
Scroll down to see the references related to this sequence; a Medline reference hyperlink is also available.
Scroll down further to see the gene function and the gene sequence.
To save the gene sequence information to your local machine, click the “Save” button in the leftmost panel.
Use “Output to” to choose the text/html format.
Use “Use View” to choose the “Complete entries” format.
Then click “Save” again to save PIR:S26019 to your local machine.

Simple query: Exercise
15 Minutes Break
Please try the following exercises by yourself!

Simple query: Exercise
Question 1.
Search for all “SARS virus” entries in the EMBL databases.

Simple query: Exercise
Question 2.
How many CDS regions are there in “EMBL:AY278488” (SARS coronavirus BJ01)?
Click AY278488 to see the entry details.
Count the number of CDS entries.
CDS entries of EMBL:AY278488:
orf1ab orf1a polyprotein spike glycoprotein S putative uncharacterized protein 1 putative uncharacterized protein 2 envelope protein E membrane protein M putative uncharacterized protein 3 putative uncharacterized protein 4 nucleocapsid protein N putative uncharacterized protein 5
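
Counting CDS entries by eye works for one record; for a saved entry the same count can be sketched in Python. The sketch below assumes the entry was saved as EMBL flat-file text, where feature-table lines start with `FT` and the feature key begins in column 6. The `SAMPLE` text and the function name `count_cds` are made up for illustration and are not taken from AY278488.

```python
def count_cds(flat_file_text):
    """Count feature-table lines whose feature key is CDS."""
    count = 0
    for line in flat_file_text.splitlines():
        if not line.startswith("FT"):
            continue
        # The feature key occupies the columns right after "FT   ";
        # qualifier lines leave those columns blank.
        key = line[5:21].split()
        if key and key[0] == "CDS":
            count += 1
    return count

# Made-up feature table fragment with two CDS features.
SAMPLE = """\
ID   DEMO1
FT   source          1..300
FT                   /organism="example"
FT   CDS             10..99
FT                   /product="example protein 1"
FT   CDS             120..290
FT                   /product="example protein 2"
//
"""

cds_total = count_cds(SAMPLE)
```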

Simple query: Exercise
Question 3.
What is the protein ID of the spike glycoprotein in EMBL:AY278741 (SARS coronavirus Urbani)?
Click AY278741 to see the entry details.

Simple query: Exercise
Question 4.
Following the link from the spike glycoprotein of EMBL:AY278741, what is the biological function of this spike protein?
Find the biological function of VGL2_CVHSA.

Part 2. Extended Search

Extended Search
Click the “Select Databanks” button at the top and select PIR as the query target.
Click the “Query” button at the top to go back to the query form.
Click the “Extended Query” button in the left panel to display the extended search form.
In the “Extended” search form, find the “Super Family” field and enter “cytochrome-c”.
Click “Search” to start the query.
After a short while, the extended search result list appears.

Using Search Results
Click the “Result” button at the top to reuse your query results.
On the “Result” page, enter “Q1 !Q3” in the blank field at the top and click “Expression” to combine the results of the two queries (Q1 but not Q3), then click “Search”.
After a short while, SRS gives the result of Query 1 but not Query 3: all sequences in PIR whose description contains the string “cytochrome” and whose organism contains “pig”, but whose superfamily does not contain the string “cytochrome-c”.
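
The “but not” combination behaves like a set difference over entry IDs. A minimal Python sketch, assuming each saved query is simply a set of entry IDs; the IDs and the helper name `but_not` are hypothetical.

```python
# Hypothetical result sets: entry IDs returned by two earlier queries.
q1 = {"PIR:DEMO_A", "PIR:DEMO_B", "PIR:DEMO_C"}  # description/organism query
q3 = {"PIR:DEMO_B", "PIR:DEMO_D"}                # superfamily query

def but_not(left, right):
    """Entries in `left` that are not in `right`, as in the expression "Q1 !Q3"."""
    return left - right

combined = but_not(q1, q3)
```

Entries found only by the second query (here `PIR:DEMO_D`) never appear in the combined result, because the operation starts from the first query's entries.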

Extended Search: Exercise
15 Minutes Break
Please try the following exercises by yourself!

Extended Search: Exercise
Question 1.
Search for all human DNA sequence entries created after 01-Jan-2003 with a sequence length greater than 100000 in EMBL.
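
The extended query in this exercise combines a text condition with a date range and a numeric range. The logic can be sketched in Python as a filter over records. The records, field names and the `matches` helper are made up for illustration; in SRS the same conditions are entered in the extended query form.

```python
from datetime import date

# Made-up records standing in for EMBL entries.
RECORDS = [
    {"id": "DEMO1", "organism": "Homo sapiens", "molecule": "DNA",
     "created": date(2003, 3, 15), "length": 150000},
    {"id": "DEMO2", "organism": "Homo sapiens", "molecule": "DNA",
     "created": date(2002, 11, 2), "length": 250000},
    {"id": "DEMO3", "organism": "Homo sapiens", "molecule": "DNA",
     "created": date(2003, 6, 1), "length": 900},
    {"id": "DEMO4", "organism": "Mus musculus", "molecule": "DNA",
     "created": date(2003, 6, 1), "length": 200000},
]

def matches(record):
    """Human DNA, created after 01-Jan-2003, longer than 100000 bases."""
    return (record["organism"] == "Homo sapiens"
            and record["molecule"] == "DNA"
            and record["created"] > date(2003, 1, 1)
            and record["length"] > 100000)

selected = [r["id"] for r in RECORDS if matches(r)]
```

All four conditions must hold at once, so an entry that is long enough but too old (`DEMO2`) or recent but too short (`DEMO3`) is excluded.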

Extended Search: Exercise
Question 2.
Search for all mouse protein sequence entries created before 01-Jul-2002 with an NCBI TAX_ID less than 4000 in SWISSPROT.

Extended Search: Exercise
Question 3.
Using the previous result, search for the entries that have links to the “PROSITE” database.
Identify the query ID (Q6 in this case; your query ID may differ).
Enter the expression that finds the entries in the previous result that have links to the “PROSITE” database (“Q6<PROSITE” in this case; your query ID may differ).
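
An expression such as “Q6<PROSITE” keeps the entries of the earlier result that are cross-referenced to the other databank. A rough Python sketch of that idea, assuming the cross-references are available as a mapping from entry ID to linked databank names; the mapping, the IDs and the function name are hypothetical.

```python
# Made-up cross-references: entry ID -> names of databanks it links to.
LINKS = {
    "SWISSPROT:DEMO1": {"PROSITE", "EMBL"},
    "SWISSPROT:DEMO2": {"EMBL"},
    "SWISSPROT:DEMO3": {"PROSITE"},
}

def with_links_to(result_set, databank, links=LINKS):
    """Keep entries of `result_set` that link to `databank`."""
    return {entry for entry in result_set if databank in links.get(entry, set())}

q6 = {"SWISSPROT:DEMO1", "SWISSPROT:DEMO2", "SWISSPROT:DEMO3"}
linked = with_links_to(q6, "PROSITE")
```

The returned entries still belong to the original databank; only the ones without a link to the target databank are dropped.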
Your result list should look similar to the one shown on the slide.

Part 3. Linking related sequence / database

Linking Related Sequence
Go back to result #3 and select the first two sequences.
Click the “Link” button to search for related sequences.
Make sure “Find related entries” is chosen.
Click “EMBL” to search for related sequences in this databank.
Click the “Search” button.
Result: 5 entries in “EMBL” are related to your selected sequences.
Change the view of the result list to “SeqSimpleView” in the left panel.
The result list is then displayed in this view.

Linking database: Exercise
15 Minutes Break
Please try the following exercises by yourself!

Linking database: Exercise
Question 1.
Find in the EMBL database the sequences that link to the longest sequence of the previous result (Question 2 in the extended search).
Go back to the previous result of Question 2 in the extended search (Query #6 in this case).
Sort the result list to find the longest sequence.
Select the longest sequence and click the “Link” button.
Search the “EMBL” database for entries related to the selected sequence.
EMBL:AP000423 is matched.

Linking database: Exercise
Question 2.
Using the previous result (Question 1 in the simple query), make a query to find the entries that have links to the SWISSPROT database.
Go back to the previous result of Question 1 in the simple query (Query #2 in this case).
Click the “Link” button with “Unselected only”.
Select “Swissprot” and show only results with related entries linked to “Swissprot”.
Around 11 EMBL entries are matched. The returned list should contain EMBL entries, not SWISSPROT entries.

Linking database: Exercise
Question 3.
Using the previous result (Question 1 in the simple query), make a query to find the entries that have no links to the SWISSPROT database.
Select “Swissprot” and show only results without related entries.
Around 32 EMBL entries are matched.

Part 4. Application

Application: BLAST
Go back to the query result page and open Q9, "(([PIR-ID:ODHU1] | [PIR-ID:ODBO1]) > EMBL )".
Tick the checkbox of EMBL:MIHSM1, then click “Launch” with the application “BLASTN”.
When the BLAST window appears, change “Complete Database to search” to Yeast Complete Genome.
Change other parameters as you like, then click “Launch” to run the search.
The BLAST job is submitted to the batch queue.
Click the hourglass icon (top left corner) to check whether the job has finished.
When the job result is ready, click it to see the BLAST job result.
After a short while, the BLAST result list appears; click any entry to view the details.
The BLAST alignment result against the databank is displayed.

Application: ClustalW
Go back to the BLAST result list and tick the checkboxes of the first 5 sequences.
Set the Application pull-down menu to “NClustalW” and click “Launch”.
When the “ClustalW” window appears, change the parameters as you like.
After you click “Launch”, SRS runs ClustalW on your selected sequences and displays the result.

Application: EMBOSS (e.g. PRETTYPLOT)
From the ClustalW result, click “Tools”.
Choose “prettyplotN”, then click “Launch”.
SRS jumps to the “prettyplotN” interface; scroll down.
Select “Display the consensus”, then click “Launch”.
Click the first graphic file to view the result; the second file will be empty.
Identical residues are shown in red. The last line shows the consensus of the alignment.

Application: EMBOSS (e.g. SIXPACK)
Go back to the query result page and open Q9, "(([PIR-ID:ODHU1] | [PIR-ID:ODBO1]) > EMBL )".
Tick the checkbox of EMBL:MIHSM1, then click “Tools”.
Choose “SIXPACK”, then click “Launch”.
Click “Launch” to run the SIXPACK program.
On the result page, see the number of possible ORFs in the six reading frames.
See the translation in all six frames.
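
What a six-frame translation computes can be sketched in plain Python. This is an independent illustration using the standard genetic code, not the SIXPACK implementation; the function names and the `+1`..`-3` frame labels are assumptions made for this sketch, and SIXPACK's own frame numbering and output layout may differ.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons; unknown codons become 'X', '*' marks a stop."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def six_frames(seq):
    """Return translations of the three forward and three reverse frames."""
    seq = seq.upper()
    rev = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames

frames = six_frames("ATGGCC")
```

Candidate ORFs can then be read off each frame as the stretches between `*` characters.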

Application: EMBOSS (e.g. TMAP)
Several EMBOSS programs can be launched directly from SRS, for example TMAP.
Go back to the Q4 result, tick the checkbox of PIR:S26019, then click “Tools”.
A list of applications is shown; change the pull-down menu to “TMAP” and click “Launch”.
When the “TMAP” window appears, click “Launch” to run TMAP on your selected sequence.
After a short while, TMAP outputs the predicted transmembrane segments.
Scroll to the bottom and click to see the diagram of membrane-spanning regions.

Application: Exercise
15 Minutes Break
Please try the following exercises by yourself!

Application: Exercise
Question 1.
Align the 5 shortest sequences from the previous result (Question 2 in the extended search) using ClustalW.
Go back to the previous result of Question 2 in the extended search (Q6 in this case).
Sort the result list to find the 5 shortest sequences.

Application: Exercise, Question 1.
Select the 5 shortest sequences and launch “ClustalW”.
There is no need to modify any parameters; click “Launch” to start the sequence alignment.

Application: Exercise
Question 2.
Run a BLAST search with the EMBL:AY278491 (HKU-SARS virus) sequence against the EMBL (Release): VRL (virus) database.
Go back to the previous result list (Q2 in the normal case) to find the HKU-SARS virus entry.
Search against the Viruses database.
After a few minutes, the BLAST result list is displayed.
See the BLAST result against TOR2.

Application: Exercise
Question 3.
Find the transmembrane segments of the SWISSPROT:VME1_CVHSA (SARS virus M-protein) sequence using TMAP.
After finding the sequence, launch “tmap”.

Part 5. Others

Submitting your own sequence
Users can submit their own sequences to work with in SRS.
Go to “Select Databanks”, click “Expand all” and find “User Owned Databanks”.
Click “Add Data” beside USERDNA.
Enter your own DNA sequence in the blank field, then click “Launch” to add it to the USERDNA databank.
Wait for SRS to store your DNA sequence.
Your sequence has now been added to the USERDNA databank.
You can then search for linked sequences or launch applications with your sequence. For example, you can launch BLAST with your input sequence.
SRS On-Line Help
-END-