
LSM3241: Bioinformatics and Biocomputing Lecture 2: Bioinformatics of viral genome Prof. Chen Yu Zong Tel: 6874-6877 Email: csccyz@nus.edu.sg


Page 1: LSM3241: Bioinformatics and Biocomputing Lecture 2: Bioinformatics of viral genome Prof. Chen Yu Zong Tel: 6874-6877 Email: csccyz@nus.edu.sg

LSM3241: Bioinformatics and Biocomputing

Lecture 2: Bioinformatics of viral genome

Prof. Chen Yu Zong

Tel: 6874-6877
Email: csccyz@nus.edu.sg
http://xin.cz3.nus.edu.sg

Room 07-24, level 7, SOC1, National University of Singapore


Resource of Viral Genomes

NCBI Genome Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=genome


Resource of Viral Genomes

NCBI Genome Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=genome

2,226 entries of viral genomes (1,524 distinct virus strains) in the database. Early 2005 figures: 1,250 entries and 1,022 distinct strains.

1,193 entries of complete viral genomes. Early 2005 figure: 900.


Resource of Viral Genomes

NCBI Genome Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=genome

12 entries of coronavirus genomes (8 in early 2005)

16 entries of influenza H5N1 genomes


Resource of Viral Genomes

NCBI Genome Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=genome

Information on viral genomes in the database can also be retrieved by clicking the viruses link on the database page.


Resource of Viral Genomes

NCBI Genome Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=genome

List of viral genomes: (1,927 entries in Jan 2006, 1,461 in Jan 2005)


Resource of Viral Genomes

NCBI Genome Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=genome

Viral taxonomy groups:


Resource of Viral Genomes

NCBI Genome Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=genome

Viral genome list:


Resource of Viral Genomes

Viral genome list:


Bioinformatics of Viral Genomes

Viral name link:

Viral genome link

All entries


Bioinformatics of Viral Genomes

Viral protein link:

Limit to title search


Bioinformatics of Viral Genomes

SARS coronavirus PP1ab PID link. It gives multiple entries from different strains or from related species.

Viral strain


Different strains of SARS coronavirus


Bioinformatics of Viral Genomes

Note: A viral polyprotein is not a single functional protein; it is a precursor that is cleaved into several mature proteins. Information about these proteins can be difficult to read.

Suggestion: Look at the latest NCBI entry for the same virus from a reputable research group.


Bioinformatics of Viral Genomes

SARS coronavirus unknown sars3a PID link:


Bioinformatics of Viral Genomes

An alternative way to find the SARS coronavirus genome: look for the latest entry with a complete genome and good functional annotation. Not all entries have these.


Bioinformatics of Viral Genomes

The latest good entry: AY572038, SARS coronavirus civet020, complete genome (in Jan 2005: AY310120, SARS coronavirus FRA).
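Once an accession number such as AY572038 is known, the entry can also be requested without browsing. The sketch below only builds the request URL. It assumes NCBI's E-utilities `efetch` service and the `nuccore` database, neither of which is covered in this lecture, and the function name is made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: E-utilities base URL (not part of the lecture material).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def genbank_record_url(accession):
    """Return an efetch URL for the GenBank flat file of one nucleotide accession."""
    params = {"db": "nuccore", "id": accession, "rettype": "gb", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"

url = genbank_record_url("AY572038")
print(url)
# To download the record: urllib.request.urlopen(url).read().decode()
```

The returned text is the same flat file shown on the NCBI entry page, including the feature table with its CDS and mat_peptide lines.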


SARS Coronavirus Genome

You are expected to find the info about each gene (genome location, sequence, function)
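A gene's sequence can be cut out of the genome from its location. This is a minimal sketch assuming GenBank-style coordinates (1-based, both ends included). The genome string and the coordinates are toy values, not taken from any SARS coronavirus entry.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (A<->T, C<->G)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def extract_gene(genome, start, end, strand="+"):
    """Return the gene at GenBank-style coordinates (1-based, both ends inclusive)."""
    if not 1 <= start <= end <= len(genome):
        raise ValueError("location lies outside the genome")
    gene = genome[start - 1:end].upper()
    return gene if strand == "+" else reverse_complement(gene)

# Toy genome and made-up coordinates, for illustration only.
toy_genome = "GGATGAAACCCTAGTT"
print(extract_gene(toy_genome, 3, 14))       # ATGAAACCCTAG
print(extract_gene(toy_genome, 3, 14, "-"))  # CTAGGGTTTCAT
```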


Function of SARS Coronavirus Genes


Bioinformatics of Viral Genomes

Where to find the proteins in the genome entry?

Source 1: mat_peptide

Protein name


Bioinformatics of Viral Genomes

Where to find the proteins in the genome entry?

Source 1: mat_peptide


Bioinformatics of Viral Genomes

Where to find the proteins in the genome entry?

Putative 3C-like protease mat_peptide link:

Protein name

Protein function


Bioinformatics of Viral Genomes

Where to find the proteins in the genome entry?

Source 2: CDS

Protein name
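Both sources (mat_peptide and CDS) sit in the feature table of the genome entry, so they can be listed with a short script. The sketch below is a simplified reader, not a full GenBank parser: it assumes one-line locations and one-line qualifiers. The sample table is hand-written, and its locations and protein_id are invented placeholders.

```python
def parse_features(feature_table, wanted=("CDS", "mat_peptide")):
    """Collect key, location and one-line qualifiers for the wanted feature keys."""
    features, current = [], None
    for line in feature_table.splitlines():
        text = line.strip()
        if not text:
            continue
        if text.startswith("/"):
            # Qualifier line such as /product="..."; attach it to the open feature.
            if current is not None and "=" in text:
                name, value = text[1:].split("=", 1)
                current[name] = value.strip('"')
        elif line[:5] == "     " and len(line) > 5 and line[5] != " ":
            # Feature line: the key starts in column 6, followed by the location.
            key, location = text.split(None, 1)
            current = {"key": key, "location": location} if key in wanted else None
            if current is not None:
                features.append(current)
    return features

# Hand-written sample; locations and protein_id are placeholders, not real data.
SAMPLE_FEATURES = "\n".join([
    "     source          1..1000",
    '                     /organism="SARS coronavirus"',
    "     mat_peptide     101..400",
    '                     /product="putative 3C-like protease"',
    "     CDS             501..900",
    '                     /product="nucleocapsid protein"',
    '                     /protein_id="HYPOTHETICAL_ID.1"',
])

features = parse_features(SAMPLE_FEATURES)
for feature in features:
    print(feature["key"], feature["location"], feature.get("product"))
```

The `product` qualifier carries the protein name, which is the field highlighted on the slides.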


Bioinformatics of Viral Genomes

Where to find the proteins in the genome entry?

Source 2: CDS




Bioinformatics of Viral Genomes

Where to find the proteins in the genome entry?

Nucleocapsid protein protein_id link:

Protein name
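The protein sequence behind a protein_id link is the translation of the corresponding CDS. This is a minimal sketch of that translation using the standard genetic code. The input is a short toy sequence, and the function stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cds):
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Toy coding sequence, for illustration only.
print(translate("ATGTCTGATAATGGACCCCAATAA"))  # MSDNGPQ
```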


Bioinformatics of Viral Genomes

How to find the name or function of a putative protein in a genome?

• Medline keyword search

• Google search


Bioinformatics of Viral Genomes

What if the function of a putative protein is unknown?

• Sequence alignment (BLAST, PSI-BLAST). This will be further discussed in lecture 4.

• Motif analysis (Conduct a PROSITE motif search)

• If sequence analysis fails or is in doubt, try a machine learning method (SVMProt, Nucleic Acids Res., 31:3692-3697; ProtFun, Bioinformatics, 19:635-642). This will be studied in lecture 5.
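A PROSITE motif search boils down to pattern matching, so a rough local version is possible. The sketch below converts basic PROSITE pattern syntax into a Python regular expression and reports 1-based match positions. It covers only the common syntax elements, the helper names are made up, and the protein sequence is a toy example. The pattern used is the N-glycosylation site pattern `N-{P}-[ST]-{P}`.

```python
import re

def prosite_to_regex(pattern):
    """Convert a basic PROSITE pattern, e.g. 'N-{P}-[ST]-{P}', to a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P}: any residue but P
        element = element.replace("(", "{").replace(")", "}")    # x(2,4): 2 to 4 residues
        element = element.replace("x", ".")                      # x: any residue
        element = element.replace("<", "^").replace(">", "$")    # sequence termini
        parts.append(element)
    return "".join(parts)

def find_motif(pattern, sequence):
    """Return (1-based start, matched text) for every match, overlaps included."""
    regex = re.compile(f"(?=({prosite_to_regex(pattern)}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# Toy protein sequence, for illustration only.
hits = find_motif("N-{P}-[ST]-{P}", "MNGTAANPSQNASK")
print(hits)  # [(2, 'NGTA'), (11, 'NASK')]
```

A match only suggests a possible site; short patterns like this one occur often by chance, so alignment evidence is still needed.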


Bioinformatics of Viral Genomes

Drug design:

• Step 1: Finding the right target in the genome

  – A key protein involved in the viral cycle (stop the disease process)

  – Different from human proteins (reduce side-effects)

• Step 2: Finding or making a chemical agent to stop the protein

  – In the majority of cases: protein inhibitors

• Step 3: Testing and clinical trials
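The "different from human proteins" criterion in Step 1 can be checked roughly by comparing each viral protein with its closest human match. The sketch below computes percent identity over pre-aligned sequence pairs and keeps the candidates below a cut-off. The aligned fragments, the protein names and the 30% threshold are all invented for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

# Toy pre-aligned pairs (viral fragment, closest human fragment); invented data.
candidates = {
    "protein_A": ("MKV-LTAGS", "MKVQLTAGS"),
    "protein_B": ("MSGFRKMAF", "QT-YHPLDE"),
}
THRESHOLD = 30.0  # arbitrary illustrative cut-off, not a published value

selective = [name for name, (viral, human) in candidates.items()
             if percent_identity(viral, human) < THRESHOLD]
print(selective)  # ['protein_B']
```

In practice the aligned pairs would come from a sequence alignment tool such as BLAST rather than being typed by hand.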


Bioinformatics of Viral Genomes

SARS Drug design:

The target: 3C like protease


Bioinformatics of Viral Genomes

SARS Drug design:

• Inhibitor design: Finding inhibitors of similar proteins, such as those of the same name (3C like proteases or 3C proteases of other species), may offer clues to inhibitor design.

Search from NCBI


Bioinformatics of Viral Genomes

Search from NCBI finds 19 references.
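The same literature search can be scripted. This is a minimal sketch, assuming NCBI's E-utilities `esearch` service for PubMed (not covered in this lecture). It builds the search URL and reads the hit count and PubMed IDs from the XML reply. The reply shown is a hand-written stand-in, and the search term is only a guess at what was typed for the slide.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: E-utilities base URL (not part of the lecture material).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def pubmed_search_url(term, retmax=20):
    """Build an esearch URL that returns up to retmax PubMed IDs for a query."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"

def parse_esearch(xml_text):
    """Return (total hit count, list of IDs) from an esearch XML reply."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("Count"))
    ids = [node.text for node in root.findall("./IdList/Id")]
    return count, ids

search_url = pubmed_search_url("3C-like protease inhibitor")
print(search_url)

# Hand-written stand-in for a server reply; the IDs are placeholders, not real hits.
sample_reply = ("<eSearchResult><Count>2</Count>"
                "<IdList><Id>111</Id><Id>222</Id></IdList></eSearchResult>")
print(parse_esearch(sample_reply))  # (2, ['111', '222'])
```

Each returned ID still has to be opened and its abstract read, which is the manual step described on the following slides.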


Bioinformatics of Viral Genomes

Check each abstract to find the name of one or more inhibitors.

Be prepared to read the full paper to find inhibitors


Bioinformatics of Viral Genomes

Make sure the paper discusses inhibitors of the right protein.

This one actually discusses inhibitors of a protease family, so it may not be suitable for the SARS 3C-like protease.


Bioinformatics of Viral Genomes

SARS Drug design:

• Inhibitor design: Finding inhibitors of similar proteins, such as those of the same name (3C like proteases or 3C proteases of other species), may offer clues to inhibitor design.

Search from Google


Bioinformatics of Viral Genomes

Search from Google finds numerous entries.


Bioinformatics of Viral Genomes

Check each entry to find the name of one or more inhibitors.

Be prepared to read the full paper to find inhibitors


Bioinformatics of Viral Genomes

Design of SARS 3C like protease inhibitors using rhinovirus 3C like protease inhibitors as templates


Summary of Today’s lecture

• Genome database at NCBI

• Viral genomes

  – SARS coronavirus genome as an example

• Finding proteins from a genome

• Therapeutic target identification from a genome and inhibitor design