An Analysis of “Coronavirus 3CLpro proteinase cleavage sites: Possible relevance to SARS virus pathology”

Connie Wu

Article Resources

BMC Bioinformatics 2004, 5:72

Published on Jun 6, 2004

Article URL: http://www.biomedcentral.com/1471-2105/5/72

NetCorona URL: http://www.cbs.dtu.dk/services/NetCorona

Outline

SARS outbreak in 2003

Introduction to SARS virus

Experimental database used

Pattern Recognition Method

Neural Network Method

Biological Significance of NetCorona

SARS Outbreak in 2003

A Chinese man was found to have caught the infectious respiratory disease in Hong Kong, the first case to emerge from the general population since July 2003.

The outbreak infected more than 8,000 people in close to 30 nations and killed more than 750.

SARS Virus

Belongs to the family of human coronaviruses, which normally cause mild cold symptoms in humans.

Proteolytic cleavage of host proteins by viral proteinases plays a role in the pathology of other virus families, such as picornaviruses.

Virus proliferation can be arrested using specific proteinase inhibitors.

Experimental database

Seven full-length coronavirus genomes were retrieved from the GenBank database.

Each sequence contained eleven 3CLpro proteinase cleavage sites, giving a total of 77 identifiable sites.

The main 3CLpro cleavage sites (P1) in the polyproteins were identified using alignment without gaps.

P1 = residue N-terminal to the cleavage site

P1’ = residue C-terminal to the cleavage site

Consensus Pattern Recognition

Coronavirus 3CLpro cleavage sites have glutamine (Q) in position P1 and show a strong preference for leucine (L) in position P2.

‘LQ’ consensus pattern prediction: 60/77 true positives (78%), plus 196 additional false positives from random occurrence of this amino acid pair.

‘LQ[S/A]’ consensus pattern prediction: 48/77 true positives (62%), plus 36 additional false positives.
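
The two consensus patterns above can be sketched as a simple regular-expression scan. This is a minimal illustration, assuming the protein is given as a one-letter amino acid string. The function name `consensus_hits` and the toy sequence are hypothetical and not taken from the article. Each hit is reported at the position of the P1 glutamine.

```python
import re

def consensus_hits(sequence, pattern):
    """Return 1-based positions of the P1 glutamine for every pattern match."""
    # A lookahead is used so that overlapping matches are not skipped.
    regex = re.compile(f"(?=({pattern}))")
    # In both patterns the Q is the second residue, so its 1-based
    # position is the 0-based match start plus 2.
    return [m.start() + 2 for m in regex.finditer(sequence)]

# Made-up sequence, for illustration only (not a real polyprotein).
toy = "MKLQSGAALQTPLQAV"

lq_sites = consensus_hits(toy, "LQ")        # simple pattern
lqsa_sites = consensus_hits(toy, "LQ[SA]")  # stricter pattern
print(lq_sites, lqsa_sites)
```

On the toy sequence the stricter pattern drops the `LQT` hit, which mirrors the trade-off on the slide: fewer false positives, but fewer true sites recovered.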

Limitations of Pattern Recognition

Simple consensus pattern recognition (e.g. ‘LQ’): low specificity, high sensitivity.

Sophisticated consensus pattern recognition (e.g. ‘LQ[S/A]’): high specificity, low sensitivity.
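
The trade-off can be made concrete with the counts quoted for the two patterns. The sketch below is only arithmetic on those slide figures. The slides do not give true-negative counts for the pattern methods, so precision (true positives over all predicted sites) is used here as a stand-in for specificity. That substitution is an assumption of this example.

```python
TOTAL_SITES = 77  # known 3CLpro cleavage sites in the data set

def sensitivity(tp, fn):
    """Fraction of real cleavage sites that the pattern recovers."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Fraction of predicted sites that are real cleavage sites."""
    return tp / (tp + fp)

# (true positives, false positives) as quoted on the slides.
patterns = {"LQ": (60, 196), "LQ[S/A]": (48, 36)}

summary = {}
for name, (tp, fp) in patterns.items():
    fn = TOTAL_SITES - tp  # real sites the pattern missed
    summary[name] = (sensitivity(tp, fn), precision(tp, fp))
    print(f"{name}: sensitivity={summary[name][0]:.2f}, "
          f"precision={summary[name][1]:.2f}")
```

With these counts, ‘LQ’ recovers more true sites but fewer than one in four of its predictions is real, while ‘LQ[S/A]’ is correct for more than half of its predictions at the cost of missing more sites.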

Neural Network

Input: a sequence window of 9 amino acids centered on the glutamine in the P1 position.

Output: a score between 0 and 1 assigned to every glutamine present.

Score > 0.8 = most likely cleaved

Score 0.5 ~ 0.8 = possibly cleaved

Score < 0.5 = likely not cleaved

67/77 true positives (87.0%)

1358/1372 true negatives (99.0%)
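
The windowing and the score thresholds can be sketched as follows. This is an illustration only: the function names, the symmetric four-residue flank, and the `X` padding at the sequence ends are assumptions of this example, not details confirmed by the slides.

```python
def glutamine_windows(sequence, flank=4, pad="X"):
    """Yield (1-based position, 9-residue window) for every glutamine.

    The window is taken symmetrically around the Q, as described on
    the slide. Ends are padded with a placeholder (an assumption).
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "Q":
            # Residue i sits at index i + flank in the padded string.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows

def interpret(score):
    """Map a score in [0, 1] to the three categories on the slide."""
    if score > 0.8:
        return "most likely cleaved"
    if score >= 0.5:
        return "possibly cleaved"
    return "likely not cleaved"

print(glutamine_windows("AVLQSGF"))
print(interpret(0.842), "|", interpret(0.6), "|", interpret(0.3))
```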

Neural Network

Three-layered neural network

Two hidden neurons
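
A three-layered network with two hidden neurons is small enough to write out by hand. The forward pass below is a rough sketch, assuming a one-hot encoding of the 9-residue window over the 20 standard amino acids (180 inputs) and sigmoid units. The encoding, the activation function, and the random untrained weights are all assumptions of this example, so the printed score has no biological meaning.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(window):
    """Encode each residue as 20 zeros/ones (assumed encoding)."""
    vec = []
    for residue in window:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(window, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> two hidden neurons -> one output score in (0, 1)."""
    x = one_hot(window)
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Untrained, randomly initialised weights (illustration only).
rng = random.Random(0)
n_inputs = 9 * len(AMINO_ACIDS)
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(2)]
b_hidden = [0.0, 0.0]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(2)]
b_out = 0.0

window = "SAVLQSGFR"  # illustrative window with Q at the centre
score = forward(window, w_hidden, b_hidden, w_out, b_out)
print(f"untrained score: {score:.3f}")
```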

Neural Network Training

Training was done with three-fold cross-validation, and Matthews correlation coefficients were calculated by summing values over all combinations of training and test sets.

The averaged score of the three networks arising from the three-fold cross-validation was used for prediction.
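
The pieces named above can be sketched in a few lines: a three-fold split, the Matthews correlation coefficient computed from pooled confusion counts, and the averaging of three network scores. The interleaved fold assignment and the reading of "summing values" as pooling the confusion counts across folds are assumptions of this example. The MCC printed at the end is plain arithmetic on the true-positive and true-negative counts quoted in these slides, not a value reported by the article.

```python
import math

def three_fold_splits(items):
    """Yield (train, test) pairs; each fold serves as the test set once."""
    folds = [items[i::3] for i in range(3)]  # interleaved split (assumed)
    for k in range(3):
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield train, folds[k]

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def ensemble_score(scores):
    """Average the scores of the three cross-validation networks."""
    return sum(scores) / len(scores)

# Counts quoted on the slides: 67/77 true positives, 1358/1372 true negatives.
tp, fn = 67, 77 - 67
tn, fp = 1358, 1372 - 1358
mcc = matthews_cc(tp, tn, fp, fn)
print(f"MCC from the slide counts: {mcc:.2f}")
```

MCC is a useful summary here because cleaved glutamines are heavily outnumbered by non-cleaved ones, so plain accuracy would look high even for a predictor that never reports a site.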

Neural Network on Host Cell Proteins

Cystic fibrosis transmembrane conductance regulator (CFTR), an ATP-dependent chloride channel, is predicted to be cleaved at Gln762 with a high score of 0.842.

Transcription factor OCT-1 is predicted to be cleaved at Gln62 by the 3CLpro proteinase with a high confidence score of 0.874.

Limitations of NetCorona

High specificity

Low sensitivity

Not accurate in predicting sites with relatively low cleavage efficiency in vivo.

High-scoring cleavage sites that are inaccessible to the proteinase need to be disregarded.

Significance of NetCorona

Employed by researchers suspecting a possible viral proteinase cleavage.

Useful if working with coronavirus function.

May facilitate proteinase inhibitor drug discovery.

Possible future strategy for drug development