Artificial Neural Networks Thomas Nordahl Petersen & Morten Nielsen


Page 1: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Artificial Neural Networks

Thomas Nordahl Petersen & Morten Nielsen

Page 2: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Use of artificial neural networks

• A data-driven method to predict a feature, given a set of training data

• In biology, input features could be amino acid sequences or nucleotides

• Secondary structure prediction

• Signal peptide prediction

• Surface accessibility

• Propeptide prediction

[Figure: protein from N-terminus to C-terminus: signal peptide, propeptide, mature/active protein]

Page 3: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Neural network prediction methods

http://www.cbs.dtu.dk/services/

Page 4: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Pattern recognition

Page 5: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Biological Neural network

Page 6: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Biological neuron structure

Synapse

Neuron

Terminal

Several connections

Page 7: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Diversity of interactions in a network enables complex calculations

• Similar in biological and artificial systems

• Excitatory (+) and inhibitory (-) relations between compute units

[Figure labels: fire, 0, 1]

Page 8: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Transfer of biological principles to artificial neural network algorithms

• Non-linear relation between input and output

• Massively parallel information processing

• Data-driven construction of algorithms

• Ability to generalize to new data items

Page 9: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Sparse encoding of amino acid sequence windows

Page 10: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Sparse encoding

Input neuron   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Amino acid
A              1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
R              0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
N              0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
D              0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
C              0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Q              0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
E              0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0
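The sparse encoding in the table can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the amino acid order is assumed to follow the column order used in the BLOSUM matrix of these slides.

```python
# Amino acid order assumed from the column order used in these slides.
ALPHABET = "ARNDCQEGHILKMFPSTWYV"

def sparse_encode(residue):
    """Return a 20-element one-hot list for a single amino acid."""
    vector = [0] * len(ALPHABET)
    vector[ALPHABET.index(residue)] = 1
    return vector

def encode_window(window):
    """Concatenate the one-hot vectors of every residue in a window."""
    encoded = []
    for residue in window:
        encoded.extend(sparse_encode(residue))
    return encoded

print(sparse_encode("R"))
print(len(encode_window("IKEEHVI")))  # 7 residues x 20 inputs each
```

Each residue in a sequence window contributes 20 input neurons, so the input layer grows linearly with the window length.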

Page 11: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

BLOSUM encoding (Blosum50 matrix)

    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
A   4  -1  -2  -2   0  -1  -1   0  -2  -1  -1  -1  -1  -2  -1   1   0  -3  -2   0
R  -1   5   0  -2  -3   1   0  -2   0  -3  -2   2  -1  -3  -2  -1  -1  -3  -2  -3
N  -2   0   6   1  -3   0   0   0   1  -3  -3   0  -2  -3  -2   1   0  -4  -2  -3
D  -2  -2   1   6  -3   0   2  -1  -1  -3  -4  -1  -3  -3  -1   0  -1  -4  -3  -3
C   0  -3  -3  -3   9  -3  -4  -3  -3  -1  -1  -3  -1  -2  -3  -1  -1  -2  -2  -1
Q  -1   1   0   0  -3   5   2  -2   0  -3  -2   1   0  -3  -1   0  -1  -2  -1  -2
E  -1   0   0   2  -4   2   5  -2   0  -3  -3   1  -2  -3  -1   0  -1  -3  -2  -2
G   0  -2   0  -1  -3  -2  -2   6  -2  -4  -4  -2  -3  -3  -2   0  -2  -2  -3  -3
H  -2   0   1  -1  -3   0   0  -2   8  -3  -3  -1  -2  -1  -2  -1  -2  -2   2  -3
I  -1  -3  -3  -3  -1  -3  -3  -4  -3   4   2  -3   1   0  -3  -2  -1  -3  -1   3
L  -1  -2  -3  -4  -1  -2  -3  -4  -3   2   4  -2   2   0  -3  -2  -1  -2  -1   1
K  -1   2   0  -1  -3   1   1  -2  -1  -3  -2   5  -1  -3  -1   0  -1  -3  -2  -2
M  -1  -1  -2  -3  -1   0  -2  -3  -2   1   2  -1   5   0  -2  -1  -1  -1  -1   1
F  -2  -3  -3  -3  -2  -3  -3  -3  -1   0   0  -3   0   6  -4  -2  -2   1   3  -1
P  -1  -2  -2  -1  -3  -1  -1  -2  -2  -3  -3  -1  -2  -4   7  -1  -1  -4  -3  -2
S   1  -1   1   0  -1   0   0   0  -1  -2  -2   0  -1  -2  -1   4   1  -3  -2  -2
T   0  -1   0  -1  -1  -1  -1  -2  -2  -1  -1  -1  -1  -2  -1   1   5  -2  -2   0
W  -3  -3  -4  -4  -2  -2  -3  -2  -2  -3  -2  -3  -1   1  -4  -3  -2  11   2  -3
Y  -2  -2  -2  -3  -2  -1  -2  -3   2  -1  -1  -2  -1   3  -3  -2  -2   2   7  -1
V   0  -3  -3  -3  -1  -2  -2  -3  -3   3   1  -2   1  -1  -2  -2   0  -3  -1   4

Page 12: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Sequence encoding (continued)

• Sparse encoding

– V:0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

– L:0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0

– V.L=0 (unrelated)

• Blosum encoding

– V: 0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4

– L:-1 -2 -3 -4 -1 -2 -3 -4 -3 2 4 -2 2 0 -3 -2 -1 -2 -1 1

– V.L = 0.88 (highly related)

– V.R = -0.08 (close to unrelated)
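The comparison between the two encodings can be checked numerically. The sketch below assumes that "V.L" on the slide denotes a normalised dot product (cosine similarity); that normalisation is an assumption, and with the matrix rows as printed in these slides it gives values near, though not identical to, the quoted 0.88 and -0.08. The helper names are hypothetical.

```python
import math

# Rows copied from the BLOSUM matrix shown in these slides.
V = [0, -3, -3, -3, -1, -2, -2, -3, -3, 3, 1, -2, 1, -1, -2, -2, 0, -3, -1, 4]
L = [-1, -2, -3, -4, -1, -2, -3, -4, -3, 2, 4, -2, 2, 0, -3, -2, -1, -2, -1, 1]
R = [-1, 5, 0, -2, -3, 1, 0, -2, 0, -3, -2, 2, -1, -3, -2, -1, -1, -3, -2, -3]

def dot(a, b):
    """Plain dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Dot product normalised by the lengths of both vectors (assumed normalisation)."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

# One-hot vectors for two different residues never share a non-zero position.
sparse_v = [0] * 19 + [1]
sparse_l = [0] * 10 + [1] + [0] * 9

print(dot(sparse_v, sparse_l))   # orthogonal, so the sparse encoding sees V and L as unrelated
print(round(cosine(V, L), 2))    # strongly positive: V and L are similar
print(round(cosine(V, R), 2))    # close to zero: V and R are close to unrelated
```

The qualitative picture matches the slide: the sparse encoding treats every pair of different residues as unrelated, while the BLOSUM encoding carries similarity between residues into the network input.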

Page 13: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

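[Figure: feed-forward network with input units I1, I2, I3, hidden units h1, h2 and output unit O1; weights w_{i,j} (shown: w_{1,1}, w_{1,2}, w_{3,1}) connect input to hidden, weights v_{j,1} (shown: v_{1,1}, v_{2,1}) connect hidden to output]

\sigma(x) = 1 / (1 + e^{-x})

o = H_1 v_{1,1} + H_2 v_{2,1}

O_1 = \sigma(o)

Error = O_1 - True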
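A forward pass through the small network on this slide (three inputs, two hidden units, one output) might look as follows. This is a sketch under stated assumptions: the weights are made up for illustration, and the hidden units are assumed to apply the same sigmoid to their weighted inputs, which the slide does not state explicitly.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w, v):
    """inputs: [I1, I2, I3]; w[i][j]: weight from input i to hidden unit j;
    v[j]: weight from hidden unit j to the single output."""
    hidden = []
    for j in range(len(v)):
        # Assumption: hidden units also pass their weighted input through the sigmoid.
        h_in = sum(inputs[i] * w[i][j] for i in range(len(inputs)))
        hidden.append(sigmoid(h_in))
    o = sum(hidden[j] * v[j] for j in range(len(v)))  # o = H1*v1,1 + H2*v2,1
    return sigmoid(o)                                 # O1 = sigmoid(o)

# Hypothetical weights, chosen only for illustration.
w = [[0.5, -0.5], [1.0, 0.0], [-1.0, 1.0]]
v = [1.0, -1.0]
output = forward([1, 0, 1], w, v)
error = output - 1.0  # Error = O - True, with a true value of 1
print(output, error)
```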

Page 14: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Sigmoidal or logistic function
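As a small numerical illustration of the curve on this slide, the logistic function and its slope can be tabulated in Python. The slope formula s(1 - s) is a standard property of the logistic function rather than something stated on the slide, and the function names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_slope(x):
    """Derivative of the logistic function, s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (-5, -1, 0, 1, 5):
    print(x, round(sigmoid(x), 3), round(sigmoid_slope(x), 3))
```

The output saturates towards 0 and 1 for large negative and positive inputs, and the slope is largest at x = 0.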

Page 16: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Training and error reduction

Page 17: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Training and error reduction

Page 18: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Training and error reduction

Size matters
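The idea of reducing the error during training can be sketched with a toy example. The code below is not the training procedure used by the authors: it assumes plain gradient descent on the squared error of a single sigmoid unit, with a made-up training set, learning rate and number of epochs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(examples, epochs=500, rate=0.5):
    """Gradient descent on squared error for one sigmoid unit (a toy stand-in)."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    history = []  # summed squared error per epoch
    for _ in range(epochs):
        total = 0.0
        for inputs, true in examples:
            out = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = out - true                  # Error = O - True
            total += 0.5 * error ** 2
            delta = error * out * (1.0 - out)   # error times the slope of the sigmoid
            weights = [w - rate * delta * x for w, x in zip(weights, inputs)]
            bias -= rate * delta
        history.append(total)
    return weights, bias, history

# Hypothetical training set: logical OR of two inputs.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias, history = train(data)
print(round(history[0], 3), round(history[-1], 3))
```

Printing the first and last entries of `history` shows the summed error falling as the weights are adjusted.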

Page 19: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Secondary Structure Elements

[Figure labels: β-strand, helix, turn, bend]

Page 20: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Neural Network Architecture

[Figure: sliding window over the sequence IKEEHVIIQAEFYLNPDQSGEF….. feeding an input layer, a hidden layer and an output layer (H, E, C), connected by weights]
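The window that feeds the input layer can be sketched as below. The window width (seven residues) and the padding character are assumptions made for illustration, since the slide does not state them; only the residues visible on the slide are used, and the function name is hypothetical.

```python
def sliding_windows(sequence, half_width=3, pad="X"):
    """Yield (central residue, window) pairs; sequence ends are padded with `pad`."""
    padded = pad * half_width + sequence + pad * half_width
    size = 2 * half_width + 1
    for i, residue in enumerate(sequence):
        yield residue, padded[i:i + size]

sequence = "IKEEHVIIQAEFYLNPDQSGEF"  # residues shown on the slide
windows = list(sliding_windows(sequence))
print(windows[0])
print(windows[3])
```

Each window would then be encoded (sparse or BLOSUM) and presented to the network, which predicts the class (H, E or C) of the central residue.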

Page 21: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Predictions and reliability of a prediction

• Normally the best prediction is obtained by averaging results from several predictions ("wisdom of the crowd")

• Two types of neural networks

– Prediction of features in classes/bins, e.g. H, E or C (1,0,0): values close to 1 or 0 are more reliable than values close to 1/2

– Prediction of real values, e.g. surface accessibility (0.43): reliability of a prediction is more difficult to estimate
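Averaging several predictions and reading a rough reliability from the result might be sketched as follows. The network outputs are made-up numbers, and the reliability score (distance of the winning output from 1/2, rescaled to 0..1) is a hypothetical choice for illustration, not a formula from the slides.

```python
def average_predictions(predictions):
    """Average the output vectors of several networks, element by element."""
    count = len(predictions)
    return [sum(values) / count for values in zip(*predictions)]

def classify(outputs, classes="HEC"):
    """Pick the class with the highest averaged output."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    # Hypothetical reliability: 0 when the winning output is 1/2, 1 when it is 0 or 1.
    reliability = abs(outputs[best] - 0.5) * 2
    return classes[best], reliability

# Made-up outputs (H, E, C) from three networks for one residue.
ensemble = [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1], [0.7, 0.3, 0.3]]
mean = average_predictions(ensemble)
print(mean, classify(mean))
```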

Page 23: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Eukaryotic SP & TM

Signal peptide cleavage: 1523 seq

C-terminal end of TM-regions: 669 seq

Page 24: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Signal peptide prediction

Signal peptide likeness

Cleavage site

Combined information

Page 25: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Propeptide prediction

Many secretory proteins and peptides are synthesized as inactive precursors that in addition to signal peptide cleavage undergo post-translational processing to become biologically active polypeptides. Precursors are usually cleaved at sites composed of single or paired basic amino acid residues by members of the subtilisin/kexin-like proprotein convertase (PC) family. In mammals, seven members have been identified, with furin being the one first discovered and best characterized. Recently, the involvement of furin in diseases ranging from Alzheimer's disease and cancer to anthrax and Ebola fever has created additional focus on proprotein processing.

We have developed a method for prediction of cleavage sites for PCs based on artificial neural networks. Two different types of neural networks have been constructed: a furin-specific network based on experimental results derived from the literature, and a general PC-specific network trained on data from the Swiss-Prot protein database. The method predicts cleavage sites in independent sequences with a sensitivity of 95% for the furin neural network and 62% for the general PC network.

Protein Engineering, Design and Selection: 17: 107-112, 2004.

General cleavage: R/K-Xn-R/K, n = 0, 2, 4, 6

Furin cleavage: R-X-R/K-R
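The two cleavage motifs can be located with a simple pattern scan. This is only a motif search, not the neural-network predictor described in the abstract; the function names and the test sequence are hypothetical, and X is assumed to mean any residue.

```python
import re

def furin_motifs(sequence):
    """Find R-X-[R/K]-R motifs, including overlapping ones; returns (start, motif) pairs."""
    return [(m.start(), m.group(1)) for m in re.finditer(r"(?=(R.[RK]R))", sequence)]

def general_pc_motifs(sequence):
    """Find [R/K]-Xn-[R/K] motifs for n = 0, 2, 4, 6; returns (start, n, motif) tuples."""
    hits = []
    for n in (0, 2, 4, 6):
        # The lookahead lets overlapping motifs at neighbouring positions all be reported.
        pattern = re.compile(rf"(?=([RK].{{{n}}}[RK]))")
        for m in pattern.finditer(sequence):
            hits.append((m.start(), n, m.group(1)))
    return sorted(hits)

example = "AARSKRSV"  # made-up sequence for illustration
print(furin_motifs(example))
print(general_pc_motifs(example))
```

Start positions are 0-based indices into the sequence. A motif match alone says nothing about whether the site is actually cleaved, which is the question the neural networks are trained to answer.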

Page 26: Artificial Neural Networks  Thomas Nordahl Petersen & Morten Nielsen

Propeptide prediction

Furin cleavage
