Bioinformatics
• Develop a plant phosphorylation database
• Completely classify and annotate plant protein kinases and phosphatases
• Develop data models and implement data handling procedures for experimental subprojects
Bioinformatics
• Timeline
– Year 1
• Implement Plant PP Functional Genomics Database
• Data design and implementation for knockouts project
• Classify kinases and phosphatases
– Year 2
• Data design and implementation for proteomics
• Continue knockout acquisition
• Screen completed genome for additional targets
• Classify and annotate functional domains
– Year 3
• Data design and implementation for interactions
• Continue knockout and add proteomics data acquisition
• Classify and annotate functional domains
– Year 4
• Complete knockouts
• Map A. thaliana functional data to other plant genomes
• Integrate expression data from external sources
• Continue proteomics and interaction data acquisition
– Year 5
• Continue proteomics and interaction data acquisition
• Update and extend classifications
Experimental Subprojects
• Gene Knockouts
– Isolate knockout mutations for all the protein kinases and phosphatases encoded by the Arabidopsis genome
• Proteomics
– Create a two-dimensional gel phosphoprotein database
– Integrate the phosphoprotein database with gene sequences
• Interactions of Signaling Components
– Develop three-hybrid and split-hybrid screens for the analysis of plant protein kinases
– Begin genome-wide screening with individual Arabidopsis protein kinases using the three-hybrid and split-hybrid approaches
Communication
• Electronic communication will be used to facilitate interactions between PPP functional genomics personnel
• Features of COW
– Responses organized by Topics
– Topics permit detailed conversations focused on specific PPP issues or projects
– Provides archive of conversations
– Coordination & management of issues
– Text and HTML possible: Web links to data
Tools and Approaches
• Bioinformatics group will draw on a broad range of community experience and resources.
• Resources developed at UCSD include
– Protein Kinase Resource
– Molecular Information Agent
– Family Pairwise Search
– Molecular Pattern Recognition
• Profile/MEME/MAST
– DictyDB
– INFO
Plant PP Functional Genomics Database
• http://www.sdsc.edu/mpr/plant_p
• Data definitions in STAR for sequences and features derived from Protein Kinase Resource
• Relational database (MySQL)
• Currently populated with 850 plant kinases and kinase substrates
• Preliminary classification into Hanks and Quinn groups
[Diagram: PKR/PPFGDB Data Model]
• Sequence record
– UID
– Name, alternates
– Properties
– Sequence
– Source
– URL
– Dates
– Features List: Feature 1 to Feature 4, each with Positions and Ranges
• Feature record
– UID
– Name
– Feature Members List: Sequence 1 to Sequence 4, each with Positions and Ranges
– Alignment
– Method
PKR/PPFGDB Data Model
• STAR data definition language
– Used for CIF and mmCIF
– Data model for PDB at SDSC/Rutgers/NIST
• Sequence and feature dictionaries include methods (written in Perl) to derive data from online sources (SwissProt, PIR, NCBI)
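The sequence and feature records of the data model can be pictured as two record types that point at each other. The following is a minimal Python sketch, not the project's STAR dictionary or MySQL schema: the class names (`SequenceRecord`, `FeatureRecord`, `Location`), the field names, and the `link` helper are hypothetical, chosen only to mirror the labels in the data model diagram. The example values come from the ABL1_HUMAN feature table shown later in these slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Location:
    """Where a feature occurs on one sequence (inclusive residue range)."""
    sequence_uid: str
    start: int
    end: int


@dataclass
class FeatureRecord:
    """Hypothetical feature record: UID, name, member sequences, method."""
    uid: str
    name: str
    members: List[Location] = field(default_factory=list)
    method: str = ""


@dataclass
class SequenceRecord:
    """Hypothetical sequence record: UID, name, residues, source, features."""
    uid: str
    name: str
    residues: str = ""
    source: str = ""
    feature_uids: List[str] = field(default_factory=list)


def link(seq: SequenceRecord, feat: FeatureRecord, start: int, end: int) -> None:
    """Record the feature on the sequence and the sequence on the feature."""
    feat.members.append(Location(seq.uid, start, end))
    if feat.uid not in seq.feature_uids:
        seq.feature_uids.append(feat.uid)


abl = SequenceRecord(uid="PKRPSEQ000008", name="ABL1_HUMAN")
sh3 = FeatureRecord(uid="PKRFEAT85", name="SP-SH3")
link(abl, sh3, 61, 121)
```

Keeping the positions on the feature side means one feature can list many member sequences, which is what the "Feature Members List" box in the diagram suggests.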
Plant Phosphoprotein Functional Genomics Database
• 850 sequences - all matches to keyword “protein kinase” in Entrez
– 770 kinases
– 80 other (mostly kinase substrates)
• Classification (Hanks and Quinn, 1994)
– AGC (9): 72
• I: 1, VIII: 46, Other: 25
– CAMK (3): 204
• I: 177, II: 16, Other: 11
– CMGC (6): 157
• I: 54, II: 45, III: 35, IV: 9, V: 14
– PTK (23): 23
– OPK (14): 297
• II: 7, IV: 10, VIII: 5, X: 251, XII: 5, Other: 19
– Unassigned: 17
Plant Phosphoprotein Functional Genomics Database
• Classification
– 1449 kinases from PKR
– Remove very similar sequences using WU Purge program
• 141 probe sequences
– Use FPS to calculate the P-value of a match to kinase, class, and subclass
• 141 × 1850 Smith-Waterman (SW) comparisons on the Bioccelerator
Family Pairwise Search (FPS)
• Combines information from multiple queries
– Identifies family membership based on a known panel
• Does not require multiple alignment or “training”
• Identifies motifs and folds using known panel
– SCOP families
– PROSITE motifs
– identifies homologs based on similarity to the entire group of sequences
• Not sensitive to spurious matches
– Sensitive “product of P-values” statistic and effective family size
• Server fps.sdsc.edu
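The "product of P-values" idea can be illustrated with a short calculation. The sketch below uses the standard closed form for the P-value of a product of n independent, uniformly distributed P-values; whether FPS uses exactly this formula, and how the effective family size enters, is not stated in these slides, so treat both as assumptions. The function name `product_pvalue` is hypothetical.

```python
import math


def product_pvalue(pvalues):
    """P-value of the product of n independent uniform P-values.

    Uses P(product <= p) = p * sum_{i=0}^{n-1} (-ln p)^i / i!.
    Every input must be > 0 (math.log raises ValueError otherwise).
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one P-value")
    log_product = sum(math.log(p) for p in pvalues)
    x = -log_product
    term, total = 1.0, 1.0          # i = 0 term of the series
    for i in range(1, n):
        term *= x / i               # builds x**i / i! incrementally
        total += term
    return math.exp(log_product) * total


# One query sequence per panel member, each giving a pairwise P-value
# against the candidate (illustrative numbers, not project data).
combined = product_pvalue([1e-4, 1e-3, 0.2])
```

With a single P-value the function returns that value unchanged, which is a quick sanity check on the series.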
Plant Phosphoprotein Functional Genomics Database
• Fibronectin type III domain
• SH2 domain
• SH3 domain
• Phorbol-ester and DAG binding domain
• Ig-like domain
• C2 domain
• Pleckstrin homology domain
• LIM (lin-11 isl-1 mec-3) domain
• DHR domain
• Guanylate kinase domain
• F5/8 type C (phospholipid-binding) domain
• Dimerization and phosphorylation domain
• Transmitter domain
• Calmodulin-binding domain
• Phospholipid binding domain
• P21-binding domain
• Nuclear localization signal
• EGF-like domain
• Polo-homology domain
• Collagen-like
• Death domain
• GAP domain
• Myosin domain
• CBS domain
• SAM domain
• FHA domain
• CUB domain
• Actin-binding domain
• RII-beta subunit binding domain
• P10 binding site domain
• DRBM domain
• Receiver domain
• PAC motif
• Zinc fingers
• MADS domain
• Metallopeptidase domain
• HEAT repeats domain
• Leucine-rich repeats
• Coiled coil
• Many low entropy regions
Domains and Features found with Kinase catalytic domains
Protein Kinase Resource
• PKR Goal
– Integrate sequence, structure, genetics, function and disease information
• How many kinases are there?
– Currently about 4500 kinase sequences (approximately 2-fold redundancy)
– 50+ three-dimensional structures
Protein Kinase Resource
• The SH3 (Src Homology 3) domain is a small conserved sequence of about 60 amino acid residues that interacts with proline-rich peptides to form protein aggregates.
• Structurally, the SH3 domain folds as a compact beta-barrel of five to six anti-parallel beta-strands. The hydrophobic beta-strands are connected by hydrophilic loops to form two orthogonal beta-sheets, bringing the amino and carboxyl termini of the domain close to each other. The ligands of SH3 domains are peptides containing a ten-residue consensus sequence, XPXXPPPFXP (where X is any amino acid residue, F is phenylalanine and P is proline). This peptide forms a left-handed polyproline (PPII) helix that lies along the binding site of the SH3 domain, with its prolines interacting with the aromatic residues on the hydrophobic face of the SH3 domain.
• Functionally, the SH3 domain is involved in cell-cell communication and signal transduction from the cell surface to the nucleus. It acts as part of an adapter molecule and recruits downstream proteins in a signalling pathway. For example, in the Drosophila eye development pathway (the Sevenless pathway), a ligand from the R8 cell, Boss (Bride of Sevenless), binds two molecules of the Sev (Sevenless) receptor on the surface of the R7 cell. This binding dimerizes the receptors, which are protein tyrosine kinases, so that they are close to one another and can transphosphorylate each other.
STAR Data Block (Sequence)
data_ABL1_HUMAN
loop_
_sequence.id
_sequence.type
_sequence.name
_sequence.date_create
_sequence.update_sequence
_sequence.update_annotation
_sequence.synonym
_sequence.citation
_sequence.length
_sequence.mol_weight
_sequence.sequence
PKRPSEQ000008
PROTEIN
ABL1_HUMAN
1986-07-21
1990-04-01
1997-02-01
'ABL1, ABL'
;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=FIBROBLAST;
RX MEDLINE; 90082420. [, Geneva]
RA FAINSTEIN E., EINAT M., GOKKEL E., MARCELLE C., CROCE C.M.,
RA GALE R.P., CANAANI E.;
RL ONCOGENE 4:1477-1481(1989).
;
1130
122944
;
MLEICLKLVGCKSKKGLSSSSSCYLEEALQRPVASDFEPQGLSEAARWNSKENLLAGPSE
NDPNLFVALYDFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVN
SLEKHSWYHGPVSRNAAEYLLSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTAS
DGKLYVSSESRFNTLAELVHHHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEMERT
DITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQ
LLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFI
;
STAR Data Block (uid)
loop_
_uid.id
_uid.status
_uid.date_create
_uid.update
_uid.keywords
_uid.description_short
_uid.description_long
PKRPSEQ000008
ACTIVE
1997-08-4
1997-08-4
;
TRANSFERASE, TYROSINE-PROTEIN KINASE, PROTO-ONCOGENE, ATP-BINDING,
PHOSPHORYLATION, SH2 DOMAIN, SH3 DOMAIN, CHROMOSOMAL TRANSLOCATION,
3D-STRUCTURE, ALTERNATIVE SPLICING
;
;
PROTO-ONCOGENE TYROSINE-PROTEIN KINASE ABL (EC ) (P150)(C-ABL)
;
;
CC -!- CATALYTIC ACTIVITY: ATP + A PROTEIN TYROSINE = ADP +
CC PROTEIN TYROSINE PHOSPHATE.
CC -!- SUBCELLULAR LOCATION: CYTOPLASMIC.
CC -!- TISSUE SPECIFICITY: WIDELY EXPRESSED.
CC -!- DISEASE: PARTICIPATES IN A T(9;22)(Q34;Q11) CHROMOSOMAL
CC TRANSLOCATION THAT PRODUCES A BCR-ABL ONCOGENE RESPONSIBLE FOR
CC CHRONIC MYELOID LEUKEMIA (CML), ACUTE MYELOID LEUKEMIA (AML), AND
CC ACUTE LYMPHOBLASTIC LEUKEMIA (ALL).
CC -!- ALTERNATIVE PRODUCTS: TWO FORMS, IA AND IB HAVE ALTERNATIVE AMINO
CC TERMINI.
CC -!- SIMILARITY: TO OTHER PROTEIN-TYROSINE KINASES IN THE CATALYTIC
CC DOMAIN. BELONGS TO THE ABL SUBFAMILY.
CC -!- SIMILARITY: CONTAINS A COPY EACH OF THE SH2 AND SH3 DOMAINS.
;
STAR Data Block (xref)
_xref.id PKRPSEQ000008
loop_
_xref.dbname
_xref.primary
_xref.acc
_xref.name
_xref.date
_xref.type
_xref.source_db
_xref.source_accession
_xref.comment
SWISS yes P00519 ABL1_HUMAN 1997-02-01 PROTEIN   .     .      'blah blah'
EMBL  no  X16416 .          1997-02-01 NUCLEIC   SWISS P00519 .
EMBL  no  M14752 .          1997-02-01 NUCLEIC   SWISS P00519 .
EMBL  no  S69223 .          1997-02-01 NUCLEIC   SWISS P00519 .
PIR   no  A25582 TVHUA      1997-02-01 PROTEIN   SWISS P00519 .
PDB   no  1AB2   1AB2       1997-02-01 STRUCTURE SWISS P00519 .
PDB   no  1ABL   1ABL       1997-02-01 STRUCTURE SWISS P00519 .
OMIM  no  189980 .          1997-02-01 NUCLEIC   SWISS P00519 .
STAR Data Block (feature table)
_feature_table.id PKRPSEQ000008
loop_
_feature_table.feature_id
_feature_table.feature_name
_feature_table.feature_location
_feature_table.feature_type
_feature_table.feature_source
PKRFEAT85 SP-SH3                  61-121   DOMAIN PRIMARY
PKRFEAT86 SP-SH2                  127-217  DOMAIN PRIMARY
PKRFEAT87 SP-PROTEIN_KINASE       242-493  DOMAIN PRIMARY
PKRFEAT88 SP-NUCLEAR_LOCALIZATION 605-609  DOMAIN PRIMARY
PKRFEAT89 SP-PRO-RICH             782-1019 DOMAIN PRIMARY
PKRFEAT90 SP-ATP                  248-256  SITE   PRIMARY
PKRFEAT91 SP-ATP                  271-271  SITE   PRIMARY
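A STAR loop such as the feature table above is a list of data names followed by values that fill the rows in order. The sketch below reads one such loop into a list of dictionaries. It is a simplified illustration, not a full STAR or mmCIF parser: it assumes whitespace-separated values, optional single-quoted strings, and no semicolon-delimited text fields, and the function name `parse_star_loop` is hypothetical.

```python
import shlex


def parse_star_loop(text):
    """Parse one simple STAR loop_ into a list of {data_name: value} rows."""
    tokens = shlex.split(text)               # honours 'quoted values'
    i = tokens.index("loop_") + 1             # skip anything before the loop
    names = []
    while i < len(tokens) and tokens[i].startswith("_"):
        names.append(tokens[i])               # data names come first
        i += 1
    values = tokens[i:]
    if not names or len(values) % len(names) != 0:
        raise ValueError("loop values do not fill whole rows")
    width = len(names)
    return [dict(zip(names, values[j:j + width]))
            for j in range(0, len(values), width)]


FEATURES = """
_feature_table.id PKRPSEQ000008
loop_
_feature_table.feature_id
_feature_table.feature_name
_feature_table.feature_location
_feature_table.feature_type
_feature_table.feature_source
PKRFEAT85 SP-SH3 61-121 DOMAIN PRIMARY
PKRFEAT86 SP-SH2 127-217 DOMAIN PRIMARY
PKRFEAT87 SP-PROTEIN_KINASE 242-493 DOMAIN PRIMARY
"""

rows = parse_star_loop(FEATURES)
```

Each row can then be addressed by data name, for example `rows[0]["_feature_table.feature_name"]`.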
Molecular Information Agent
• MIA - Molecular Information Agent
– Web resources often change URLs
– Web resources often change available services
– Laboratory scientists may only use services occasionally
• May not know what services are available
• May not know how to efficiently use them
• Want quick answers (where possible)
Molecular Information Agent
• MIA - Molecular Information Agent
– Search for linked information starting from a single item
– Automatic location and validation of web links
• All links are visited and the presence of a usable page confirmed
• Useful information extracted and used as a basis of further searches
– Basis of sequence/structure/information crosslinking in MSD
[Diagram: MIA components. Labels: Template, Resource, Query, Parser, Query Manager, Resource Parser, Keywords, Synopsis & Reference]
Molecular Information Agent
[Diagram: MIA link graph between identifier types and web resources]
• Identifier types shown: PATH_MAP_NUM, PDB_CODE, EC_UNPRECISE, EC_PRECISE, SP_Name, SP_ID, GI_Nuc, GI_PROT, NCBI_UID, EMBL_UID, GENE_SYM, PIR_ID, PIR_ENTRY, NCBI_TAXONOMY
• Resources shown: DSSP, IMB Jena Image, PDBSum, PDB, Biochemical Pathways, PDB Report, 3Dee, ProMotif, Protein Motions, ProCheck, Macromolecular File, STING, GRASS, Columbia Pic, CATH, FSSP, SCOP, TOPS, MMDB, ENZYME, WIT, Swiss 2D PAGE, SwissModel, SEView, PeptideMass, ProtScale, Swiss 3dImage, Compute pI/MW, ProfileScan, DOMO, ProtPattern, ProDom, ProtoMap, Swiss-Prot, Scan PROSITE, PROSITE, ProDoc, NCBI, MGI, Mouse Genome DB, EMBL, Medline, OMIM, GDB, GeneCards, BLAST, PIR, PIR ProClass, NCBI Taxonomy
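[Figure: selected data resources grouped by category]
• Categories: Motifs/Domains, Other, Sequence, Structure
• Resources: NCBI, EMBL, SWISS_PROT, PIR, PDB, SCOP, CATH, Medline, OMIM, FSSP, HSSP, PROSITE, PRODOM, PROTOMAP, DOMO, EC Enzyme, TGD, DSSP, MMDB, WIT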
MIA - Selected Data Resources
Molecular Information Agent
• Limit Searches
• Do not requery resources
• Time-outs and availability issues
MIA - Molecular Information Agent
– Problems
• Web resources often change URLs
• Web resources often change available services
– Laboratory scientists may only use services occasionally
• May not know what services are available
• May not know how to efficiently use them
• Want quick answers (where possible)
MIA - Molecular Information Agent
• Simplifies finding and using information
– Search for all linked information beginning from a single item
– Automatic location and validation of web links
• All links are visited and the presence of a usable page confirmed
• Useful information extracted and used as a basis of further searches
– Basis of sequence/structure/information crosslinking in PDB
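The search behaviour described for MIA (start from one item, follow links, limit the search, never requery a resource, tolerate time-outs) resembles a bounded breadth-first traversal. The following is a minimal sketch of that idea only, not MIA's implementation: `crawl`, `fetch_links`, and the toy `LINKS` table are hypothetical, and no network access is performed. The identifiers in the table are taken from the ABL1_HUMAN cross-reference block, but the links between them are made up for illustration.

```python
from collections import deque


def crawl(start, fetch_links, max_depth=2):
    """Breadth-first search from one item, visiting each item at most once.

    fetch_links(item) returns the items linked from `item`; it may raise
    OSError (which includes TimeoutError) or KeyError when a resource is
    unavailable, in which case the item is kept but not expanded.
    """
    visited = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        item, depth = queue.popleft()
        if depth >= max_depth:            # limit searches
            continue
        try:
            links = fetch_links(item)
        except (OSError, KeyError):       # time-outs and availability issues
            continue
        for nxt in links:
            if nxt not in visited:        # do not requery resources
                visited.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order


# Toy link table standing in for real web resources (hypothetical links).
LINKS = {
    "SP_ID:P00519": ["PDB_CODE:1AB2", "PIR_ID:A25582"],
    "PDB_CODE:1AB2": ["SP_ID:P00519", "OMIM:189980"],
}

found = crawl("SP_ID:P00519", LINKS.__getitem__, max_depth=2)
```

In a real agent `fetch_links` would wrap a page request with a time-out and a parser that extracts identifiers; here a dictionary lookup plays that role so the traversal logic can be read on its own.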
Molecular Pattern Recognition
[Figure: pattern recognition pipeline from Genes to Sequences to Aligned Sequences to Motifs, illustrated with EF-hand segments such as FKEAFSLFDKDGDGCITTKELGTVMRSL and FNKAFELYDQDGDGYIDENELDALLKDL; example motif fragments: FKEAFSLFD, KDGDGTITTK, ELGTVMRSL]
Molecular Pattern Recognition
MCEG   FKEAFSLFDKDGDGTITTKELGTVMRSL
MCDO   FKEAFSLFDKDGDGSITTKELGTVMRSL
MCSP   FKEAFSLFDKDGDGCITTKELGTVMRSL
MCCHM  IREAFRVFDKDGNGYISAAELRHVMTNL
MCUR1C IKAIIQKADANKDGKIDREEFMKLIKS.
MCUR2C IDAIIKKADGNNDGKIRVQEFVKMIESS
KLBOB  FNKAFELYDQDGDGYIDENELDALLKDL
KLCHI  FNKAFEMYDQDGNGYIDENELDALLKDL
KLBOI  LDELFEELDKNGDGEVSFEEFQVLVKKI
KLPGI  LDDLFQELDKNGNGEVSFEEFQVLVKKI
Ligands to Calcium
Hydrophobic patches that stabilize structure
Calcium Binding EF-Hand Structure
Molecular Pattern Recognition
• Motif Learning/Description
– Profile
• Analytical calculation of motif description using finite mixture model
• Learning is fast - seconds on workstation
• Database search is slow - hours on workstation, about 90 sec on Compugen Bio-XLP
– MEME
• Unsupervised learning using expectation maximization
• Learning is quadratic - typically minutes on T3E
• Searching database is linear - seconds on workstation
Profile Analysis
• Describes protein structural and sequence motifs using a position specific scoring matrix and position specific gap penalties calculated from a sequence or multiple sequence alignment
• Evolutionary profile - sequence information
• Structural Profile - structural information
• Evolutionary/Structural Profile - sequence and structural information
Evolutionary Profile
• Mixture distribution using a biologically relevant model
• Explicit evolutionary model for each aligned column
– Sequences weighted for similarity
• Find the group of preferred residues at each position
• Weight mixture components by probability of observed data given the model distribution
Entries are listed as Anc (w) PAM:
A (0.61) 1     T (0.17) 64    S (0.14) 64
E (0.76) 1     D (0.63) 16    Q (0.28) 64    N (0.16) 128
D (0.84) 1     E (0.55) 32    N (0.36) 32    Q (0.09) 128
L (0.75) 32    M (0.36) 128   I (0.31) 64    V (0.30) 64
V (0.53) 32    I (0.31) 128   T (0.16) 64    A (0.15) 64    M (0.09) 256   L (0.08) 256
Evolutionary Profile
MAST/MEME
• Describes protein sequence motifs using a position specific scoring matrix derived from an unsupervised EM algorithm applied to unaligned sequences
• MEME - motif learning
• MAST - database searching
• Servers
– MEME/MAST
– ParaMEME - parallel MEME
– MetaMEME - multiple motif models
MEME/MAST
• Expectation maximization
– Unsupervised learning
– Unaligned sequences
– Automatic pattern width determination
– Automatic extraction of multiple motifs
• Learning is quadratic in the size of training data - typically minutes on CRAY T3E
• Searching database is linear in length of database - seconds on workstation
     A     C     D     E     F     G     H     I     K     L     M     N     P     Q     R     S     T     V     W     Y
  0.40 -6.97  2.23  3.16 -6.50 -4.35 -3.58 -5.13 -0.13 -5.04 -4.24 -3.35 -4.07 -1.69 -3.90 -3.70 -3.74 -4.35 -6.43 -5.43
 -4.59 -4.83  4.20 -2.45 -5.37 -4.85 -3.79 -5.41 -5.13 -5.65 -5.06 -2.28 -5.62 -4.70 -5.05 -4.61 -5.05 -5.20 -5.05 -4.95
 -4.28 -3.74 -7.04 -6.66 -3.22 -6.76 -6.22  1.56 -6.54  2.57 -2.01 -6.41 -6.43 -5.98 -6.49 -6.24 -4.25  2.02 -5.13 -4.96
  1.71 -2.64 -5.41 -4.90 -2.97 -4.21 -4.04 -1.21 -4.64  0.75 -1.90 -4.40 -4.72 -4.27 -4.51 -3.43  1.97  2.23 -3.89 -3.56
 -5.22 -5.32 -5.80 -4.06 -6.24 -5.91 -2.89 -6.23 -5.34 -5.44 -4.37 -5.05 -5.55  4.57 -4.72 -5.54 -5.56 -6.11 -5.29 -6.00
 -2.99 -6.93  3.01  2.91 -6.50 -4.34 -3.59 -0.31 -2.63 -5.06 -4.24 -3.31 -4.08 -1.72 -3.93 -3.70 -3.75 -4.36 -6.44 -5.43
  1.49 -2.76 -5.58 -5.67 -4.92 -3.44 -5.03 -3.16 -5.48 -4.51 -3.80 -4.34 -4.62 -4.79 -5.16 -2.35  3.20  1.66 -5.36 -5.41
 -3.72 -2.67 -4.68 -4.69  4.40 -4.93 -3.37 -2.70 -4.71 -1.98 -2.24 -4.41 -4.39 -4.60 -4.72 -3.71 -4.35 -2.98  0.74  0.33
 -4.25 -3.86 -6.79 -6.24  2.25 -6.20 -5.12  2.82 -5.96  0.95  2.14 -5.90 -5.93 -5.09 -5.65 -5.54 -4.21  0.76 -4.02 -4.10
 -0.94 -3.01 -4.44 -4.17 -4.79 -4.21 -1.99 -4.15  0.00 -4.10 -3.65 -3.24 -3.84 -2.22  4.01 -3.77 -3.76 -4.63 -3.16 -4.12
[Figure: information content per motif position, in bits (axis 0.0 to 6.6); total information content 28.2 bits]
Multilevel consensus sequence:
EDLVQETFIR
D VA DA
   T
MEME/MAST
• Log-Odds form of a MEME motif
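A log-odds matrix of this kind is used in a database search by sliding it along a sequence and summing one score per position. The sketch below shows that scan in its simplest form. It is not MAST: it ignores P-value calculation and multiple motifs, `best_window` is a hypothetical name, and `TOY_PSSM` keeps only the positive scores from the first three rows of the matrix above, with an assumed default of -5.0 for every other residue.

```python
def best_window(seq, pssm, default=-5.0):
    """Return (start, score) of the best-scoring ungapped window, or None.

    pssm is a list of {residue: log-odds score} dicts, one per motif
    position; residues missing from a dict score `default`.
    """
    width = len(pssm)
    best = None
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(col.get(res, default) for col, res in zip(pssm, window))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# First three motif positions only, truncated to their positive scores.
TOY_PSSM = [
    {"E": 3.16, "D": 2.23, "A": 0.40},
    {"D": 4.20},
    {"L": 2.57, "V": 2.02, "I": 1.56},
]

hit = best_window("AAEDLAA", TOY_PSSM)
```

The scan touches each database position once per motif column, which matches the slide's point that searching is linear in the length of the database.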
Motif  Width  Consensus                              Sites  Region
1      10     EDLVQDTFIR                             19     Core Binding
2      16     WLYRITRNLFIDHYRR                       19     Strand Opening
3      36     REAFVLCDIHGLSYKEIAKTLGVSLGTVRSRIHRAR   19     -35 region
4      24     PSPEQQYHNYETLECIQAALDELP               19     ? Region 3.2
5      11     WLAYRLTGCQH                            18     ? Region 1
6      10     QNYQPGTFEG                             18     -10 region
7      14     RCDMPSWDELVRQH                         17
8      15     MQAMPQDYDSFPADE                        16
9      20     HWREPSDEPQGTAVFDATGD                   8
10     8      LHALPDHS                               12
11     8      MAEQLSTS                               12.8
MEME/MAST
• RNA polymerase sigma factor motifs
• Top motifs are all known biologically important regions!
DictyDB: 'alfA-A11H2FUCA' Sequence
[Figure: DictyDB display for sequence "alfA-A11H2FUCA"]
• Reading frames 1, 2, 3 and splice sites
• Codon usage analysis (D. discoideum codon usage used)
• Exons and CDS seqs
• alfA gene translation
• "alfA-A11H2FUCA" DNA sequence, with alfA coding sequence highlighted
• Two GenBank entries, one composite entry
• Textual information for sequence "alfA-A11H2FUCA", for GenBank subsequences "alfA-DDIFUCA" and "alfA-DDA11H2", and for coding subsequences "alfA-DDIFUCA_1.cds", "alfA-DDIFUCA_2.cds" (alfA), and "alfA-DDA11H2_1.cds" (alfA) (see blue vertical rectangles "Exons and CDS Seqs" in display)
DictyDB: Chromosome 3
[Figure: DictyDB display of chromosome 3]
• lagC: position and YAC clones
• Locus positions on chromosome
• Alleles and loci
• lagC locus text description
• (pdiA) allele text description
• YAC clones
INFO: INterruption Finder and Organizer
• Identifies interruptions in one sequence relative to DB sequences
• Task 1: DNA coding sequence vs Protein Databases
• Task 2: Alternative Splice Sites
• Task 3: Protein Secondary Structure vs alpha helix/beta strand Databases
INFO: Coding Sequence Algorithm
• Input DNA translated into 6 frames
• Each 25 aa Window of each translate compared to each 25 aa Window of each DB entry via ktup and Distance Matrix
• Hits above Min Score kept; overlapping Hits to given DB entry comprise a Cluster
• Overlapping Clusters against many DB entries comprise a MegaCluster
• End points of MegaCluster define Exon / Gene
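The grouping steps above (overlapping hits to one database entry form a Cluster, overlapping Clusters across entries form a MegaCluster) amount to two rounds of interval merging. The sketch below shows only that grouping, under stated assumptions: hits have already been scored and filtered by Min Score, and their coordinates have been mapped to positions on the input DNA. The function names and the example hits are hypothetical.

```python
def merge_overlapping(intervals):
    """Merge overlapping (start, end) intervals; ends are inclusive."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def clusters_and_megaclusters(hits):
    """Group hits into per-entry Clusters, then into MegaClusters.

    hits: iterable of (db_entry, start, end) in input-sequence coordinates.
    Returns ({db_entry: [cluster spans]}, [megacluster spans]).
    """
    by_entry = {}
    for entry, start, end in hits:
        by_entry.setdefault(entry, []).append((start, end))
    clusters = {entry: merge_overlapping(spans)
                for entry, spans in by_entry.items()}
    all_spans = [span for spans in clusters.values() for span in spans]
    return clusters, merge_overlapping(all_spans)


example_hits = [
    ("P1", 10, 34), ("P1", 20, 44),   # two overlapping hits to entry P1
    ("P2", 40, 64),                   # overlaps the P1 cluster
    ("P1", 200, 224),                 # a separate region
]
clusters, megaclusters = clusters_and_megaclusters(example_hits)
```

The end points of each span in `megaclusters` correspond to the exon or gene boundaries that the slide says the MegaCluster defines.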
INFO: Three Runs refine Results
• Run 1: High Min Score - no False Positives
• MiniLibrary built from DB entry Hits of Run 1
• Run 2: Low Min Score - run against MiniLibrary - high Specificity - refines MegaCluster ends
• Run 3: Reverse Sequence - Sequences of input translates and MiniLibrary entries are reversed - refines Leading Edge end points
INFO: Algorithm Schematic
A. DNA Sequence:
B. Conceptual Translation:
C. 25-mer Lookup Table:
D. Protein Databank Entry:
E. Compare 25-mers with Databank Entry: ktup = 2 Distance Matrix
F. Group Hits into Clusters: all overlapping hits against one databank entry
Two Clusters against same databank entry
One Cluster against given databank entry
Gene: two Exons, one Intron
G. Group Clusters into MegaClusters: overlapping clusters against different databank entries
Two MegaClusters; two Clusters each, each against two databank entries
One MegaCluster: three Clusters against two databank entries
H. Display Clusters via Histogram Plot: Histogram boundaries give Exons; Introns lie between coding Exons
INFO: PROSITE validation test
[Figure: number of hits (log scale, 1 to 4000) versus score (260 to 400), with true positive and false positive hits plotted separately]
• 531 input DNA sequences analyzed against SwissProt 26.0
• Data from the PROSITE database were used to determine whether a hit was positive or negative
INFO: Exon Sizes reported
[Figure: exon count and fraction found per 10 bases, each plotted against exon size (bases); annotations in the figure include 33 and 75 bases]
INFO: Three Human Examples
[Figure: cluster count versus sequence position (kb) for three human sequences]
• HUMPAIA (J03764)
• HSFESFPS (X06292)
• HUMTHB (M17262)
INFO: Three Alternative Splicings
[Figure: three alternative splicings plotted along sequence position (0 to 5 kb)]
• Peripherin-1
• Peripherin-2
• Peripherin-3
INFO: 1AGN Alpha Helices
A. Alpha Helices: Angiogenin, Peptide Inducer of Neovascularization, Human
PDB entry 1AGN
INFO: 1AGN Beta Strands
B. Beta Strands: Angiogenin, Peptide Inducer of Neovascularization, Human
PDB entry 1AGN