Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
New generation of NMR analysis assisted by Deep Learning and highly sophisticated Web-
tools developed by PDBj-BMRB
Naohiro Kobayashi
PDBj-BMRB group Institute for Protein Research
Osaka university, Japan
The 55th Annual Meeting of the Biophysical Society of Japan Luncheon meeting
1
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
NBDC
wwPDB
Project director Haruki Nakamura
PDBj: Database for X-ray and EM structure and experimental data Haruki Nakamura (Professor) Atsushi Nakagawa (Professor) Rei Kinjyo (Associate Prof.) Daron Standley (Prof. at iFREc) 5 annotators, 2 programmers, 2 research fellows
PDBj-BMRB: Database for NMR structure and experimental data Toshimichi Fujiwara (Professor) Chojiro Kojima (Prof. YNU & Osaka Univ.) Naohiro Kobayashi (Sp.Ap. Associate Prof.) Masashi Yokochi (Programmer Annotator) Takeshi Iwata (Sys. Admin. Annotator)
3-year project April 2014-March 2017
Organization of the project of PDBj at Osaka university supported by JST-NBDC
2
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
We are collecting chemical shift data in collaboration with Univ. Wisconsin-Madison and wwPDB
3
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
O
C CH3 N
H
C O
C N H
C O
C
H
N H
CH2
CH CH3 CH2
C O
C N H
CH2
N C O
C
H
CH2
CH2
COOH
N C O
C
H
CH2 CH2
C O
C N H
CH2
OH
Plasmid
[15N]-NH4Cl [13C]-Glucose Expression Purification
NMR tube
CH2 CH3 CH2
NH2
C
Nearly all of the atoms can be labeled with 13C, 15N!
Stable isotope labeling of protein made amazing NMR analysis possible!
4
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
We can observe nearly all of the atom signals in protein… each one of the atom
This is really amazing… as a spectroscopy technique
5
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
If you prepare 15N-labeled protein…
If the protein has 135 amino acid residues…you can use the 135 signal as 135 atomic probe
How you can imagine, if you can see the signal of each atom in protein
6
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
If you assign all of the NMR signals….
You can hear the sound of atoms!
7
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
Domain orientation Structure modeling
What you can do if you assigned all the NMR signals
Backbone atom dynamics
0.8
0.4
0.0
-0.4
NOE
430 450 470 490 500 Residue number
Ligand interaction
Sugase et al., Nature, 2007
8
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
BioMagResBank (BMRB) Databak of NMR database
9
IPR seminar 2016-06-03
Our activities on BMRB processing (2015)
• 78 entries processed at Osaka (~10%)
• BMRB total 759 entries
0
300
600
900
1200
2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
Ent
ry
Year
MadisonOsaka
10
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
NMR-STAR v3 BMRB/XML BMRBxTool
bmr15400.str
Yokochi et al., J. Biomed. Sem. (2016)
Conversion of BMRB data into machine readable format
11
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
BMRB/XML
bmr15400.xml
BMRBoTool
BMRB/RDF
bmr15400.rdf
RDF: Resource Description Framework
Toward the more machine readable format
12
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
BMRB Mutation OMIM dbSNP 2nd SASA% ∆∆Ghydr 4280 L100V 300005 rs28935168 Coil 22.5 3.9 4280 R106W 300005 rs28934907 Strand 8.0 11 4280 R133C 300005 rs28934904 Coil 49.4 7.2 4280 E137G 300005 rs61748392 Helix 41.7 6.6 4280 A140V 300005 rs28934908 Helix 42.0 1.7
High SASA
High DDGhydr
MeCP2: methylated CpG binding protein NMR structure 1QK9: Wakefiled et al., 1999
R133
E137 The mutations,R133C and E137 have been found in patients with X-linked mental retardation.
-- detailed information from OMIM--
Data analysis using BMRB/PDB/OMIM database
13
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
Multiple database search on the PDBj-BMRB portal
Universal search box Search filter
Search tabs & settings
Search button
14
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25 15
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
MagRO A tool for integrated NMR data analysis
16
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
MagRO: A package of tools for highly automated NMR data analysis
GUIs and automated tools suppot user to analyze NMR signal easily
17
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
However… None of them can start from peak identification
There are many programs that can assign NMR signals automatically
[Why?]
• So many noises and artifacts in NMR
spectra
• Difficult to get weak important signals • Sometimes signals are overlapped
18
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
By the way, the technology neural network has been developed
dramatically in this a few years
Developments of techniques for deep layered parceptron
Developments of Graphic Processing Unit (GPU)
Many layered and connected neural networks is now called: Deep Learning (DNN)
19
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
What is Convolutional Neural Network (CNN)?
Neocognitron: Fukushima et al. (1979)
LeNet-5: LeCun et al. (1988), DQN Hassabis et al. (2015)
Convolution
Filter size:5x5
Stlide:1
40x40 Conv1 Pool1
Max-pooling Window: 2x2 Stlide:2
Conv2
Convolution
Filter size:5x5
Stlide:1
Pool2
Stlide:2
…
Out-put
Hidden leyer
128 dim
Noise or Signal?
Max-pooling Window: 2x2
10x10x32 tensor
20
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
CNN+FNN->ZNN: fully automated noise/signal recognition
CNN: like a “eye-sight” of human, can recognize 2D NMR image but the view field is narrow
FNN: collects the other information, helps CNN ZNN: integrates CNN, FNN noise/signal recognition
ZNN
Noise or Signal?
FNN CNN
weak
strong
wiggle close lone
noisy
border
side-chain
spec-type
8->6->1 : NN
10->8->1 : NN
1600dim 5-leyer
21
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
Id_Robot
Annealing sequential assignment
Assignments
Matrix AASS
1HN-15Na clusters, Ca, Cb, CO chemical shifts
Matrix ART
3D-NMR spectra: HNCO, HNCACB, CBCA(CO)NH
Matrix CCON
Peak identification: CNN/FNN->ZNN
Perfectly automated assignment system using Id_Robot + Anneal_Robot
Anneal_Robot Random mutation 20cycles 40stages (Total:800 annealings)
22
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
Performance of new version
Backbone signal assignment (117 a.a.) manual: 6-8 hours semi-automated (previous version): 30-60min
new version with AI: 50sec (4-CPUs) 30-60 times faster!
…this is only working for spectra with very good quality still required improvements on peak recognition
23
IPR seminar 2016-06-03 Ann. Meeting BSJ 2016-11-25
Acknowledgement
Takashi Nagata, Masato Katahira Kenji Sugase
Kyoto University:
Wisconsin University: Eldon L. Ulrich, John L. Markley
Osaka University: Toshimichi Fujiwara, Chojiro Kojima (YNU, Osaka Univ.) Masashi Yokochi, Takeshi Iwata
Tokushima Bunri University: Yoshikazu hattori
Kengo Tsuda RIKEN Yokohama:
Yutaka Muto, Kanako Kuwasako Musashino University:
24