Masters Research Project
A
PROJECT REPORT
ON
MOLECULAR MODELING AGAINST LEUKEMIA
SHARMA MAHESH
International E – Publication www.isca.co.in , www.isca.me
A
PROJECT REPORT
ON
MOLECULAR MODELING AGAINST LEUKEMIA
By
SHARMA MAHESH
Forensic Expert & Investigator, SIFS INDIA, 2443, Hudson Lines,
Kingsway Camp, Delhi-110009, INDIA
Anglia Ruskin University Cambridge, East Road (UK) CB1 1PT
2015
International E - Publication
www.isca.me , www.isca.co.in
International E - Publication 427, Palhar Nagar, RAPTC, VIP-Road, Indore-452005 (MP) INDIA
Phone: +91-731-2616100, Mobile: +91-80570-83382
E-mail: [email protected] , Website: www.isca.me , www.isca.co.in
© Copyright Reserved
2015
All rights reserved. No part of this publication may be reproduced, stored in a
retrieval system or transmitted, in any form or by any means, electronic,
mechanical, photocopying, recording or otherwise, without the prior
permission of the publisher.
ISBN: 978-93-84648-61-9
A
PROJECT REPORT
ON
MOLECULAR MODELING AGAINST LEUKEMIA
By
SHARMA MAHESH
Forensic Expert & Investigator, SIFS INDIA, 2443, Hudson Lines,
Kingsway Camp, Delhi-110009, INDIA
Anglia Ruskin University Cambridge, East Road (UK) CB1 1PT
CEO: Mr. G.V.L.P Subba Rao, BioMed Informatics, Drug Discovery & Clinical Research Center,
Medwin Hospitals, Hyderabad (A.P.), INDIA
GUIDE: Ms. B. Bhavani, Ms. M. Srividya, Ms. B. Chandrakala
DEDICATION
This project is dedicated to my parents, family and friends.
DECLARATION
I, MAHESH SHARMA, hereby declare that the work presented in the
project "MOLECULAR MODELING AGAINST LEUKEMIA" was done by me,
and that all of it was carried out in the lab of BIOMED INFORMATICS, DRUG
DISCOVERY AND CLINICAL RESEARCH CENTER, MEDWIN
HOSPITALS, HYDERABAD. It has not been published or submitted at
any other university, nor copied from any other source.
MAHESH SHARMA
CERTIFICATE
This is to certify that I (MAHESH SHARMA), a student of Bioinformatics, have
carried out the project "MOLECULAR MODELING AGAINST
LEUKEMIA" at BIOMED INFORMATICS, DRUG DISCOVERY AND
CLINICAL RESEARCH CENTER, MEDWIN HOSPITALS,
HYDERABAD. It is being submitted as a minor project for Bioinformatics. This
project has not been submitted to any university, and has not been published
or copied from other sources.
Sign. and seal
CEO: Mr. G.V.L.P Subba Rao, BioMed Informatics, Drug Discovery & Clinical Research Center,
Medwin Hospitals, Hyderabad
GUIDE: Ms. B. Bhavani, Ms. M. Srividya, Ms. B. Chandrakala
ACKNOWLEDGMENT
First, I praise God Almighty for abundantly pouring out his blessings and grace so that I
could finish this project successfully.
I wish to express my deep gratitude to all those who contributed valuable suggestions
and timely assistance toward the completion of this work.
I am very grateful to my guide, Ms. B. Bhavani, for her valuable guidance throughout the
project work presented in this dissertation. It was mainly due to her effort that the
project was made possible. I also express my indebtedness to Ms. M. Srividya and
Ms. B. Chandrakala, who were a constant inspiration for me to complete my project.
Special thanks also go to Mr. G.V.L.P Subba Rao, CEO, BioMed Informatics, Drug
Discovery and Clinical Research Center, Medwin Hospitals, Hyderabad, for his kind
co-operation and for being so supportive of my work.
The support, patience and encouragement given by my family members are beyond measure
and shall be long remembered.
I am also very thankful to all my friends for their timely support during my
dissertation.
MAHESH SHARMA
CEO: Mr. G.V.L.P Subba Rao, BioMed Informatics,
Drug Discovery & Clinical Research Center,
Medwin Hospitals, Hyderabad
INDEX
ABSTRACT, AIM & OBJECTIVE
INTRODUCTION To:
  Bioinformatics
  Leukemia
  NCBI
GENOMICS:
  Gene Information
  Genbank Format
  Fasta Format of Nucleotide Sequence
  Bio-Edit:
    Nucleotide Composition
      Text
      Graph
    Plasmid Creation
    Restriction Mapping
  NCBI Tools:
    ORF (Open Reading Frame)
    Map Viewer
    e-PCR (Electronic Polymerase Chain Reaction)
    Vec-Screen
  Gene Predicting Tools:
    Genscan
    Genemark
  Phylogenetic Analysis:
    BLAST
    FASTA
    MSA (Multiple Sequence Alignment)
    ClustalW
    Phylodraw (Phylogenetic Trees)
    Pair Distance and Root Distance
PROTEOMICS:
  Genpept Format
  Fasta Format of Protein
  Bio-Edit:
    Amino-Acid Composition
      Text
      Graph
  Primary Structure Analysis:
    Protparam
    Protscale
  Secondary Structure Analysis:
    GOR
    SOPMA
  Post-Translational Modifications:
    SignalP
    NetNGly
    NetOGly
    NetAcet
    NetPhos
    Sulfinator
  Topology Prediction:
    SOSUI
  SPDBV Print Screens:
    Loading of Raw Sequence
    After Loading Templates
    Coloring of Raw Sequence & Templates
    Aligning of 1st Template
    Aligning of 2nd Template
    Modeled Protein before Loop Building
    Ramachandran Plot before Loop Building
    Loop Building (Configuration Table)
    Modeled Protein after Loop Building
    Ramachandran Plot after Loop Building
    Protein with H-Bonds
    Protein with Side Chains
  Active-Site Analysis:
    Cavity Method
    Q-Site Finder
  3D Analysis:
    RASMOL (As Many as Possible)
DRUGs:
  Individual Docking Print Screens
  Similar Molecules Print Screens
  Argus Lab (Surfaces):
    Final Molecule
    HOMO
    LUMO
    ESP Mapped Density
  Hyperchem (QSAR Properties):
    Final Molecule
    Single Point
    Geometry Optimization
    Molecule with QSAR Properties Table
    QSAR Properties Values
  Cache:
    UV Visible Transitions
    IR Transitions
  Database Docking Results
Conclusion
Reference
ABSTRACT
Molecular Targeting against leukemia
Chronic myelogenous leukemia (CML) is characterized by the Philadelphia (Ph) chromosome
and bcr/abl gene rearrangement which occurs in pluripotent hematopoietic progenitor cells
expressing the c-kit receptor tyrosine kinase (KIT). Alternative splicing is a crucial
mechanism for generating protein diversity. Different splice variants of a given protein can
display different and even antagonistic biological functions. Therefore, appropriate control
of their synthesis is required to assure the complex orchestration of cellular processes within
multicellular organisms. One of the most exciting developments in cancer research in recent
years has been the clinical validation of molecularly targeted drugs that inhibit the action of
pathogenic tyrosine kinases. The clinical validation of these “first-generation” tyrosine
kinase inhibitors has been the prelude to a second wave of advances in molecular targeting
that is expected to further change the way we classify and treat cancer. Efforts are now being
directed at identifying the tumor subtypes and patients who will benefit the most from the
drugs. Agents directed against new molecular targets are also being explored.
AIM
Homology modeling of tyrosine kinase
Screening of compound library for tyrosine kinase inhibitors
Bioinformatics is the application of information technology and computer science to the
field of molecular biology. The term bioinformatics was coined by Paulien Hogeweg in
1979 for the study of informatic processes in biotic systems. Its primary use since at least the
late 1980s has been in genomics and genetics, particularly in those areas of genomics
involving large-scale DNA sequencing. Bioinformatics now entails the creation and
advancement of databases, algorithms, computational and statistical techniques, and theory
to solve formal and practical problems arising from the management and analysis of
biological data. Over the past few decades rapid developments in genomic and other
molecular research technologies and developments in information technologies have
combined to produce a tremendous amount of information related to molecular biology. It is
the name given to these mathematical and computing approaches used to glean
understanding of biological processes. Common activities in bioinformatics include mapping
and analyzing DNA and protein sequences, aligning different DNA and protein sequences to
compare them and creating and viewing 3-D models of protein structures.
The primary goal of bioinformatics is to increase our understanding of biological processes.
What sets it apart from other approaches, however, is its focus on developing and applying
computationally intensive techniques (e.g., pattern recognition, data mining, machine
learning algorithms, and visualization) to achieve this goal. Major research efforts in the
field include sequence alignment, gene finding, genome assembly, protein structure
alignment, protein structure prediction, prediction of gene expression and protein-protein
interactions, genome-wide association studies and the modeling of evolution (Kumar, 2002).
Leukemia
Cancer (medical term: malignant neoplasm) is a class of diseases in which a group of cells
display uncontrolled growth (division beyond the normal limits), invasion (intrusion on and
destruction of adjacent tissues), and sometimes metastasis (spread to other locations in the
body via lymph or blood). These three malignant properties of cancers differentiate them
from benign tumors, which are self-limited, and do not invade or metastasize. Most cancers
form a tumor but some, like leukemia, do not. The branch of medicine concerned with the
study, diagnosis, treatment, and prevention of cancer is oncology.
Leukemia (British/Canadian English: leukaemia) (Greek leukos λευκός, "white"; aima αίμα,
"blood") is a cancer of the blood or bone marrow and is characterized by an abnormal
proliferation (production by multiplication) of blood cells, usually white blood cells
(leukocytes). Leukemia is a broad term covering a spectrum of diseases. In turn, it is part of
the even broader group of diseases called hematological neoplasms.
CLASSIFICATION OF LEUKEMIA:-
Leukemia is clinically and pathologically subdivided into
a variety of large groups. The first division is between its acute and chronic forms:
• Acute leukemia is characterized by the rapid increase of immature blood cells. This
crowding makes the bone marrow unable to produce healthy blood cells. Immediate
treatment is required in acute leukemia due to the rapid progression and accumulation of the
malignant cells, which then spill over into the bloodstream and spread to other organs of the
body. Acute forms of leukemia are the most common forms of leukemia in children.
• Chronic leukemia is distinguished by the excessive build up of relatively mature, but still
abnormal, white blood cells. Typically taking months or years to progress, the cells are
produced at a much higher rate than normal cells, resulting in many abnormal white blood
cells in the blood. Whereas acute leukemia must be treated immediately, chronic forms are
sometimes monitored for some time before treatment to ensure maximum effectiveness of
therapy. Chronic leukemia mostly occurs in older people, but can theoretically occur in any
age group.
Four major kinds of leukemia
Cell type | Acute | Chronic
Lymphocytic leukemia (or "lymphoblastic") | Acute lymphoblastic leukemia (ALL) | Chronic lymphocytic leukemia (CLL)
Myelogenous leukemia (also "myeloid" or "nonlymphocytic") | Acute myelogenous leukemia (AML) | Chronic myelogenous leukemia (CML)
As a part of our project we will work on Chronic myelogenous leukemia (CML). Chronic
myelogenous (or myeloid) leukemia (CML), also known as chronic granulocytic
leukemia (CGL), is a cancer of the white blood cells. It is a form of leukemia characterized
by the increased and unregulated growth of predominantly myeloid cells in the bone marrow
and the accumulation of these cells in the blood. CML is a clonal bone marrow stem cell
disorder in which proliferation of mature granulocytes (neutrophils, eosinophils, and
basophils) and their precursors is the main finding. It is a type of myeloproliferative disease
associated with a characteristic chromosomal translocation called the Philadelphia
chromosome. It is now treated with imatinib and other targeted therapies, which have
dramatically improved survival.
GENE Responsible for Chronic myelogenous leukemia (NCBI, 2009):-
1 Official Symbol BCR and Name: breakpoint cluster region [Homo sapiens]
Other Aliases: ALL, BCR1, CML, D22S11, D22S662, FLJ16453, PHL
Chromosome: 22; Location: 22q11.23
Annotation: Chromosome 22, NC_000022.10 (23522552..23660224)
MIM: 151410
GeneID: 613
2 Official Symbol ABL1 and Name: c-abl oncogene 1, receptor tyrosine kinase [Homo
sapiens]
Other Aliases: RP11-83J21.1, ABL, JTK7, bcr/abl, c-ABL, p150, v-abl
Other Designations: bcr/c-abl oncogene protein; proto-oncogene tyrosine-protein kinase
ABL1; v-abl Abelson murine leukemia viral oncogene homolog 1
Chromosome: 9; Location: 9q34.1
Annotation: Chromosome 9, NC_000009.11 (133589268..133763062)
MIM: 189980
GeneID: 25
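The Annotation lines in the two records above give each gene's position as an inclusive start..end interval on a chromosome reference sequence. As a minimal sketch, assuming only the two annotation strings quoted above (the helper names parse_annotation and span_length are illustrative and not part of any NCBI tool), the length of each annotated interval can be computed as follows:

```python
import re

# Annotation strings copied from the BCR and ABL1 records above.
ANNOTATIONS = {
    "BCR": "Chromosome 22, NC_000022.10 (23522552..23660224)",
    "ABL1": "Chromosome 9, NC_000009.11 (133589268..133763062)",
}

# Matches an accession such as NC_000022.10 followed by "(start..end)".
PATTERN = re.compile(r"(NC_\d+\.\d+)\s*\((\d+)\.\.(\d+)\)")


def parse_annotation(line):
    """Hypothetical helper: return (accession, start, end) from one annotation line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognised annotation line: {line!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start, end):
    """Length of an interval whose coordinates are 1-based and inclusive."""
    return end - start + 1


for symbol, line in ANNOTATIONS.items():
    accession, start, end = parse_annotation(line)
    print(symbol, accession, span_length(start, end))
```

With these coordinates the BCR interval covers roughly 138 kb and the ABL1 interval roughly 174 kb of genomic sequence.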
A reciprocal translocation between chromosomes 22 and 9 produces the Philadelphia
chromosome, which is often found in patients with chronic myelogenous leukemia. The
chromosome 22 breakpoint for this translocation is located within the BCR gene. The
translocation produces a fusion protein which is encoded by sequence from both BCR and
ABL, the gene at the chromosome 9 breakpoint. Although the BCR-ABL fusion protein has
been extensively studied, the function of the normal BCR gene product is not clear. The
protein has serine/threonine kinase activity and is a GTPase-activating protein for p21rac.
Two transcript variants encoding different isoforms have been found for this gene.
Protein involved in causing disease:-
1. chronic myelogenous leukemia tumor antigen 66 short form [Homo sapiens]
554 aa protein
AAM69373.1 GI:21634465
2. chronic myelogenous leukemia tumor antigen 66 [Homo sapiens]
583 aa protein
AAK73017.1 GI:14718862
3. chronic myelogenous leukemia tumor antigen 28 [Homo sapiens]
268 aa protein
AAM75154.1 GI:21693160
4. abl gene
285 aa protein
1107272B GI:224526
5. abl gene
164 aa protein
1107272A GI:224525
Biochemical pathway of Tyrosine kinase:-
A tyrosine kinase is an enzyme that can transfer a phosphate group from ATP to a tyrosine
residue in a protein. Tyrosine kinases are a subgroup of the larger class of protein kinases.
Phosphorylation of proteins by kinases is an important mechanism in signal transduction for
regulation of enzyme activity.
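The phosphate transfer described above can be written as a generic reaction scheme. This is a sketch of the overall reaction for a protein-tyrosine kinase, with the substrate protein left unspecified:

```latex
\[
\mathrm{ATP} \;+\; \text{[protein]-L-tyrosine}
\;\longrightarrow\;
\mathrm{ADP} \;+\; \text{[protein]-L-tyrosine phosphate}
\]
```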
Most tyrosine kinases have an associated protein tyrosine phosphatase.
Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the
gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid
residues in a protein substrate side chain, resulting in a conformational change affecting
protein function. The enzymes fall into two broad classes, characterised with respect to
substrate specificity: serine/threonine specific and tyrosine specific.
Approximately 2000 kinases are known, and more than 90 Protein Tyrosine Kinases (PTKs)
have been found in the human genome. They are divided into two classes, receptor and non-
receptor PTKs. At present, 58 receptor tyrosine kinases (RTKs) are known, grouped into 20
subfamilies. They play pivotal roles in diverse cellular activities including growth,
differentiation, metabolism, adhesion, motility and death. RTKs are composed of an
extracellular domain, which is able to bind a specific ligand, a transmembrane domain, and
an intracellular catalytic domain, which is able to bind and phosphorylate selected
substrates. Binding of a ligand to the extracellular region causes a series of structural
rearrangements in the RTK that lead to its enzymatic activation. In particular, movement of
some parts of the kinase domain gives free access to adenosine triphosphate (ATP) and the
substrate to the active site. This triggers a cascade of events through phosphorylation of
intracellular proteins that ultimately transmit ("transduce") the extracellular signal to the
nucleus, causing changes in gene expression. Many RTKs are involved in oncogenesis,
either by gene mutation, or chromosome translocation, or simply by over-expression. In
every case, the result is a hyper-active kinase, that confers an aberrant, ligand-independent,
non-regulated growth stimulus to the cancer cells.
GENE INFORMATION (NCBI, 2009)
1: KIT v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog [ Homo sapiens ]
GeneID: 3815 updated 17-Dec-2009
Official Symbol KIT
Official Full Name v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
Other Aliases: C-Kit, CD117, PBT, SCFR
Primary Source Ensembl:ENSG00000157404; HPRD:01287; MIM:164920
Gene type protein coding
Organism Homo sapiens
Other Designations: mast/stem cell growth factor receptor; proto-oncogene tyrosine-protein kinase Kit; soluble KIT variant 1
Chromosome: 4; Location: 4q11-q12
Annotation: Chromosome 4, NC_000004.11 (55524095..55606881)
MIM: 164920
GeneID: 3815
Lineage Eukaryota; Metazoa; Chordata; Craniata; Vertebrata;
Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates;
Haplorrhini; Catarrhini; Hominidae; Homo
Also known as PBT; SCFR; C-Kit; CD117; KIT
Summary This gene encodes the human homolog of the proto-oncogene c-kit. C-kit
was first identified as the cellular homolog of the feline sarcoma viral oncogene v-kit. This
protein is a type 3 transmembrane receptor for MGF (mast cell growth factor, also known as
stem cell factor). Mutations in this gene are associated with gastrointestinal stromal tumors,
mast cell disease, acute myelogenous leukemia, and piebaldism. Multiple transcript variants
encoding different isoforms have been found for this gene. [provided by RefSeq]
GENBANK FORMAT
NCBI Reference Sequence: NM_000222.2
Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma viral
oncogene homolog (KIT), transcript variant 1, mRNA
LOCUS NM_000222 5190 bp mRNA linear PRI 13-DEC-2009
DEFINITION Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene
homolog (KIT), transcript variant 1, mRNA.
ACCESSION NM_000222
VERSION NM_000222.2 GI:148005048
KEYWORDS .
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 5190)
AUTHORS Ganesh,S.K., Zakai,N.A., van Rooij,F.J., Soranzo,N., Smith,A.V.,
Nalls,M.A., Chen,M.H., Kottgen,A., Glazer,N.L., Dehghan,A.,
Kuhnel,B., Aspelund,T., Yang,Q., Tanaka,T., Jaffe,A., Bis,J.C.,
Verwoert,G.C., Teumer,A., Fox,C.S., Guralnik,J.M., Ehret,G.B.,
Rice,K., Felix,J.F., Rendon,A., Eiriksdottir,G., Levy,D.,
Patel,K.V., Boerwinkle,E., Rotter,J.I., Hofman,A., Sambrook,J.G.,
Hernandez,D.G., Zheng,G., Bandinelli,S., Singleton,A.B., Coresh,J.,
Lumley,T., Uitterlinden,A.G., Vangils,J.M., Launer,L.J.,
Cupples,L.A., Oostra,B.A., Zwaginga,J.J., Ouwehand,W.H.,
Thein,S.L., Meisinger,C., Deloukas,P., Nauck,M., Spector,T.D.,
Gieger,C., Gudnason,V., van Duijn,C.M., Psaty,B.M., Ferrucci,L.,
Chakravarti,A., Greinacher,A., O'Donnell,C.J., Witteman,J.C.,
Furth,S., Cushman,M., Harris,T.B. and Lin,J.P.
TITLE Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium
JOURNAL Nat. Genet. 41 (11), 1191-1198 (2009)
PUBMED 19862010
REFERENCE 2 (bases 1 to 5190)
AUTHORS Bodemer,C., Hermine,O., Palmerini,F., Yang,Y., Grandpeix-Guyodo,C.,
Leventhal,P.S., Hadj-Rabia,S., Nasca,L., Georgin-Lavialle,S.,
Cohen-Akenine,A., Launay,J.M., Barete,S., Feger,F., Arock,M.,
Catteau,B., Sans,B., Stalder,J.F., Skowron,F., Thomas,L.,
Lorette,G., Plantin,P., Bordigoni,P., Lortholary,O., Prost,Y.D.,
Moussy,A., Sobol,H. and Dubreuil,P.
TITLE Pediatric Mastocytosis Is a Clonal Disease Associated with D(816)V
and Other Activating c-KIT Mutations
JOURNAL J. Invest. Dermatol. (2009) In press
PUBMED 19865100
REMARK GeneRIF: Observational study of gene-disease association. (HuGE Navigator)
Publication Status: Available-Online prior to print
REFERENCE 3 (bases 1 to 5190)
AUTHORS Akagi,T., Shih,L.Y., Ogawa,S., Gerss,J., Moore,S.R., Schreck,R.,
Kawamata,N., Liang,D.C., Sanada,M., Nannya,Y., Deneberg,S.,
Zachariadis,V., Nordgren,A., Song,J.H., Dugas,M., Lehmann,S. and
Koeffler,H.P.
TITLE Single nucleotide polymorphism genomic arrays analysis of t(8;21)
acute myeloid leukemia cells
JOURNAL Haematologica 94 (9), 1301-1306 (2009)
PUBMED 19734423
REMARK GeneRIF: Observational study of gene-disease association. (HuGE Navigator)
REFERENCE 4 (bases 1 to 5190)
AUTHORS Kwon,J.E., Kang,H.J., Kim,S.H., Lee,Y.C., Hyung,W.J., Noh,S.H.,
Kim,N.K. and Kim,H.
TITLE Pathological characteristics of gastrointestinal stromal tumours
with PDGFRA mutations
JOURNAL Pathology 41 (6), 544-554 (2009)
PUBMED 19900103
REMARK GeneRIF: Observational study of gene-disease association. (HuGE Navigator)
REFERENCE 5 (bases 1 to 5190)
AUTHORS Stec,R., Grala,B., Maczewski,M., Bodnar,L. and Szczylik,C.
TITLE Chromophobe renal cell cancer--review of the literature and
potential methods of treating metastatic disease
JOURNAL J. Exp. Clin. Cancer Res. 28, 134 (2009)
PUBMED 19811659
REMARK GeneRIF: Overexpression of CD117 on cellular membranes of
chromophobe renal cell carcinoma could be a potential target for
kinase inhibitors
Publication Status: Online-Only
REFERENCE 6 (bases 1 to 5190)
AUTHORS Spritz,R.A., Droetto,S. and Fukushima,Y.
TITLE Deletion of the KIT and PDGFRA genes in a patient with piebaldism
JOURNAL Am. J. Med. Genet. 44 (4), 492-495 (1992)
PUBMED 1279971
REFERENCE 7 (bases 1 to 5190)
AUTHORS Giebel,L.B., Strunk,K.M., Holmes,S.A. and Spritz,R.A.
TITLE Organization and nucleotide sequence of the human KIT (mast/stem
cell growth factor receptor) proto-oncogene
JOURNAL Oncogene 7 (11), 2207-2217 (1992)
PUBMED 1279499
REFERENCE 8 (bases 1 to 5190)
AUTHORS Andre,C., Martin,E., Cornu,F., Hu,W.X., Wang,X.P. and Galibert,F.
TITLE Genomic organization of the human c-kit gene: evolution of the
receptor tyrosine kinase subclass III
JOURNAL Oncogene 7 (4), 685-691 (1992)
PUBMED 1373482
REFERENCE 9 (bases 1 to 5190)
AUTHORS Duronio,V., Welham,M.J., Abraham,S., Dryden,P. and Schrader,J.W.
TITLE p21ras activation via hemopoietin receptors and c-kit requires
tyrosine kinase activity but not tyrosine phosphorylation of p21ras
GTPase-activating protein
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 89 (5), 1587-1591 (1992)
PUBMED 1371879
REFERENCE 10 (bases 1 to 5190)
AUTHORS Spritz,R.A., Giebel,L.B. and Holmes,S.A.
TITLE Dominant negative and loss of function mutations of the c-kit
(mast/stem cell growth factor receptor) proto-oncogene in human
piebaldism
JOURNAL Am. J. Hum. Genet. 50 (2), 261-269 (1992)
PUBMED 1370874
COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staff. The
reference sequence was derived from DC376760.1, X06182.1 and
BC071593.1.
On May 22, 2007 this sequence version replaced gi:4557694.
Summary: This gene encodes the human homolog of the proto-oncogene
c-kit. C-kit was first identified as the cellular homolog of the
feline sarcoma viral oncogene v-kit. This protein is a type 3
transmembrane receptor for MGF (mast cell growth factor, also known
as stem cell factor). Mutations in this gene are associated with
gastrointestinal stromal tumors, mast cell disease, acute
myelogenous leukemia, and piebaldism. Multiple transcript variants
encoding different isoforms have been found for this gene.
[provided by RefSeq].
Transcript Variant: This variant (1) represents the longer
transcript and encodes the longer isoform (1).
Publication Note: This RefSeq record includes a subset of the
publications that are available for this gene. Please see the
Entrez Gene record to access additional publications.
COMPLETENESS: full length.
PRIMARY REFSEQ_SPAN PRIMARY_IDENTIFIER PRIMARY_SPAN COMP
1-66 DC376760.1 1-66
67-4869 X06182.1 1-4803
4870-5190 BC071593.1 4859-5179
FEATURES Location/Qualifiers
source 1..5190
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/chromosome="4"
/map="4q11-q12"
gene 1..5190
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/note="v-kit Hardy-Zuckerman 4 feline sarcoma viral
oncogene homolog"
/db_xref="GeneID:3815"
/db_xref="HGNC:6342"
/db_xref="HPRD:01287"
/db_xref="MIM:164920"
exon 1..154
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=1
misc_feature 1
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/note="5'-most transcription initiation site"
misc_feature 26
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/note="major transcription initiation site"
misc_feature 30
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/note="major transcription initiation site"
CDS 88..3018
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/EC_number="2.7.10.1"
/note="isoform 1 precursor is encoded by transcript
variant 1; mast/stem cell growth factor receptor;
proto-oncogene tyrosine-protein kinase Kit; soluble KIT
variant 1"
/codon_start=1
/product="v-kit Hardy-Zuckerman 4 feline sarcoma viral
oncogene homolog isoform 1 precursor"
/protein_id="NP_000213.1"
/db_xref="GI:4557695"
/db_xref="CCDS:CCDS3496.1"
/db_xref="GeneID:3815"
/db_xref="HGNC:6342"
/db_xref="HPRD:01287"
/db_xref="MIM:164920"
/translation="MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKS
DLIVRVGDEIRLLCTDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHG
LSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDL
RFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSK
ASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLT
ISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLI
VEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTF
LVSNSDVNAAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRC
SASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKTSAYFNFAF
KGNNKEQIHPHTLFTPLLIGFVIVAGMMCIIVMILTYKYLQKPMYEVQWKVVEEINGN
NYVYIDPTQLPYDHKWEFPRNRLSFGKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKM
LKPSAHLTEREALMSELKVLSYLGNHMNIVNLLGACTIGGPTLVITEYCCYGDLLNFL
RRKRDSFICSKQEDHAEAALYKNLLHSKESSCSDSTNEYMDMKPGVSYVVPTKADKRR
SVRIGSYIERDVTPAIMEDDELALDLEDLLSFSYQVAKGMAFLASKNCIHRDLAARNI
LLTHGRITKICDFGLARDIKNDSNYVVKGNARLPVKWMAPESIFNCVYTFESDVWSYG
IFLWELFSLGSSPYPGMPVDSKFYKMIKEGFRMLSPEHAPAEMYDIMKTCWDADPLKR
PTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSVRINSVGSTASSSQPLLV
HDDV"
exon 155..424
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=2
exon 425..706
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=3
STS 561..685
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/standard_name="STS-N21003"
/db_xref="UniSTS:21855"
exon 707..843
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=4
exon 844..1012
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=5
exon 1013..1202
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=6
exon 1203..1318
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=7
exon 1319..1433
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=8
exon 1434..1627
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=9b
exon 1628..1734
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=10
exon 1735..1861
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=11
exon 1862..1966
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=12
exon 1967..2077
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=13
exon 2078..2228
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=14
exon 2229..2320
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=15
exon 2321..2448
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=16
exon 2449..2571
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=17
exon 2572..2683
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=18
exon 2684..2783
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=19
exon 2784..2889
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=20
exon 2890..5176
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/inference="alignment:Splign"
/number=21
STS 2999..3294
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/standard_name="GDB:512789"
/db_xref="UniSTS:157531"
STS 3014..3989
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/standard_name="GDB:181531"
/db_xref="UniSTS:155257"
STS 3090..3288
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/standard_name="SHGC4-128"
/db_xref="UniSTS:79238"
STS 3756..4087
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/standard_name="SHGC-12679"
/db_xref="UniSTS:62849"
STS 4234..4555
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/standard_name="SHGC-50170"
/db_xref="UniSTS:72370"
STS 4328..4557
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/standard_name="G15879"
/db_xref="UniSTS:79825"
STS 5000..5107
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/standard_name="SHGC-67781"
/db_xref="UniSTS:853"
polyA_signal 5152..5157
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
polyA_site 5176
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
ORIGIN
1 tctgggggct cggctttgcc gcgctcgctg cacttgggcg agagctggaa cgtggaccag
61 agctcggatc ccatcgcagc taccgcgatg agaggcgctc gcggcgcctg ggattttctc
121 tgcgttctgc tcctactgct tcgcgtccag acaggctctt ctcaaccatc tgtgagtcca
181 ggggaaccgt ctccaccatc catccatcca ggaaaatcag acttaatagt ccgcgtgggc
241 gacgagatta ggctgttatg cactgatccg ggctttgtca aatggacttt tgagatcctg
301 gatgaaacga atgagaataa gcagaatgaa tggatcacgg aaaaggcaga agccaccaac
361 accggcaaat acacgtgcac caacaaacac ggcttaagca attccattta tgtgtttgtt
421 agagatcctg ccaagctttt ccttgttgac cgctccttgt atgggaaaga agacaacgac
481 acgctggtcc gctgtcctct cacagaccca gaagtgacca attattccct caaggggtgc
541 caggggaagc ctcttcccaa ggacttgagg tttattcctg accccaaggc gggcatcatg
601 atcaaaagtg tgaaacgcgc ctaccatcgg ctctgtctgc attgttctgt ggaccaggag
661 ggcaagtcag tgctgtcgga aaaattcatc ctgaaagtga ggccagcctt caaagctgtg
721 cctgttgtgt ctgtgtccaa agcaagctat cttcttaggg aaggggaaga attcacagtg
781 acgtgcacaa taaaagatgt gtctagttct gtgtactcaa cgtggaaaag agaaaacagt
841 cagactaaac tacaggagaa atataatagc tggcatcacg gtgacttcaa ttatgaacgt
901 caggcaacgt tgactatcag ttcagcgaga gttaatgatt ctggagtgtt catgtgttat
961 gccaataata cttttggatc agcaaatgtc acaacaacct tggaagtagt agataaagga
1021 ttcattaata tcttccccat gataaacact acagtatttg taaacgatgg agaaaatgta
1081 gatttgattg ttgaatatga agcattcccc aaacctgaac accagcagtg gatctatatg
1141 aacagaacct tcactgataa atgggaagat tatcccaagt ctgagaatga aagtaatatc
1201 agatacgtaa gtgaacttca tctaacgaga ttaaaaggca ccgaaggagg cacttacaca
1261 ttcctagtgt ccaattctga cgtcaatgct gccatagcat ttaatgttta tgtgaataca
1321 aaaccagaaa tcctgactta cgacaggctc gtgaatggca tgctccaatg tgtggcagca
1381 ggattcccag agcccacaat agattggtat ttttgtccag gaactgagca gagatgctct
1441 gcttctgtac tgccagtgga tgtgcagaca ctaaactcat ctgggccacc gtttggaaag
1501 ctagtggttc agagttctat agattctagt gcattcaagc acaatggcac ggttgaatgt
1561 aaggcttaca acgatgtggg caagacttct gcctatttta actttgcatt taaaggtaac
1621 aacaaagagc aaatccatcc ccacaccctg ttcactcctt tgctgattgg tttcgtaatc
1681 gtagctggca tgatgtgcat tattgtgatg attctgacct acaaatattt acagaaaccc
1741 atgtatgaag tacagtggaa ggttgttgag gagataaatg gaaacaatta tgtttacata
1801 gacccaacac aacttcctta tgatcacaaa tgggagtttc ccagaaacag gctgagtttt
1861 gggaaaaccc tgggtgctgg agctttcggg aaggttgttg aggcaactgc ttatggctta
1921 attaagtcag atgcggccat gactgtcgct gtaaagatgc tcaagccgag tgcccatttg
1981 acagaacggg aagccctcat gtctgaactc aaagtcctga gttaccttgg taatcacatg
2041 aatattgtga atctacttgg agcctgcacc attggagggc ccaccctggt cattacagaa
2101 tattgttgct atggtgatct tttgaatttt ttgagaagaa aacgtgattc atttatttgt
2161 tcaaagcagg aagatcatgc agaagctgca ctttataaga atcttctgca ttcaaaggag
2221 tcttcctgca gcgatagtac taatgagtac atggacatga aacctggagt ttcttatgtt
2281 gtcccaacca aggccgacaa aaggagatct gtgagaatag gctcatacat agaaagagat
2341 gtgactcccg ccatcatgga ggatgacgag ttggccctag acttagaaga cttgctgagc
2401 ttttcttacc aggtggcaaa gggcatggct ttcctcgcct ccaagaattg tattcacaga
2461 gacttggcag ccagaaatat cctccttact catggtcgga tcacaaagat ttgtgatttt
2521 ggtctagcca gagacatcaa gaatgattct aattatgtgg ttaaaggaaa cgctcgacta
2581 cctgtgaagt ggatggcacc tgaaagcatt ttcaactgtg tatacacgtt tgaaagtgac
2641 gtctggtcct atgggatttt tctttgggag ctgttctctt taggaagcag cccctatcct
2701 ggaatgccgg tcgattctaa gttctacaag atgatcaagg aaggcttccg gatgctcagc
2761 cctgaacacg cacctgctga aatgtatgac ataatgaaga cttgctggga tgcagatccc
2821 ctaaaaagac caacattcaa gcaaattgtt cagctaattg agaagcagat ttcagagagc
2881 accaatcata tttactccaa cttagcaaac tgcagcccca accgacagaa gcccgtggta
2941 gaccattctg tgcggatcaa ttctgtcggc agcaccgctt cctcctccca gcctctgctt
3001 gtgcacgacg atgtctgagc agaatcagtg tttgggtcac ccctccagga atgatctctt
3061 cttttggctt ccatgatggt tattttcttt tctttcaact tgcatccaac tccaggatag
3121 tgggcacccc actgcaatcc tgtctttctg agcacacttt agtggccgat gatttttgtc
3181 atcagccacc atcctattgc aaaggttcca actgtatata ttcccaatag caacgtagct
3241 tctaccatga acagaaaaca ttctgatttg gaaaaagaga gggaggtatg gactgggggc
3301 cagagtcctt tccaaggctt ctccaattct gcccaaaaat atggttgata gtttacctga
3361 ataaatggta gtaatcacag ttggccttca gaaccatcca tagtagtatg atgatacaag
3421 attagaagct gaaaacctaa gtcctttatg tggaaaacag aacatcatta gaacaaagga
3481 cagagtatga acacctgggc ttaagaaatc tagtatttca tgctgggaat gagacatagg
3541 ccatgaaaaa aatgatcccc aagtgtgaac aaaagatgct cttctgtgga ccactgcatg
3601 agcttttata ctaccgacct ggtttttaaa tagagtttgc tattagagca ttgaattgga
3661 gagaaggcct ccctagccag cacttgtata tacgcatcta taaattgtcc gtgttcatac
3721 atttgagggg aaaacaccat aaggtttcgt ttctgtatac aaccctggca ttatgtccac
3781 tgtgtataga agtagattaa gagccatata agtttgaagg aaacagttaa taccattttt
3841 taaggaaaca atataaccac aaagcacagt ttgaacaaaa tctcctcttt tagctgatga
3901 acttattctg tagattctgt ggaacaagcc tatcagcttc agaatggcat tgtactcaat
3961 ggatttgatg ctgtttgaca aagttactga ttcactgcat ggctcccaca ggagtgggaa
4021 aacactgcca tcttagtttg gattcttatg tagcaggaaa taaagtatag gtttagcctc
4081 cttcgcaggc atgtcctgga caccgggcca gtatctatat atgtgtatgt acgtttgtat
4141 gtgtgtagac aaatatttgg aggggtattt ttgccctgag tccaagaggg tcctttagta
4201 cctgaaaagt aacttggctt tcattattag tactgctctt gtttcttttc acatagctgt
4261 ctagagtagc ttaccagaag cttccatagt ggtgcagagg aagtggaagg catcagtccc
4321 tatgtatttg cagttcacct gcacttaagg cactctgtta tttagactca tcttactgta
4381 cctgttcctt agaccttcca taatgctact gtctcactga aacatttaaa ttttaccctt
4441 tagactgtag cctggatatt attcttgtag tttacctctt taaaaacaaa acaaaacaaa
4501 acaaaaaact ccccttcctc actgcccaat ataaaaggca aatgtgtaca tggcagagtt
4561 tgtgtgttgt cttgaaagat tcaggtatgt tgcctttatg gtttccccct tctacatttc
4621 ttagactaca tttagagaac tgtggccgtt atctggaagt aaccatttgc actggagttc
4681 tatgctctcg cacctttcca aagttaacag attttggggt tgtgttgtca cccaagagat
4741 tgttgtttgc catactttgt ctgaaaaatt cctttgtgtt tctattgact tcaatgatag
4801 taagaaaagt ggttgttagt tatagatgtc taggtacttc aggggcactt cattgagagt
4861 tttgtcttgg atattcttga aagtttatat ttttataatt ttttcttaca tcagatgttt
4921 ctttgcagtg gcttaatgtt tgaaattatt ttgtggcttt ttttgtaaat attgaaatgt
4981 agcaataatg tcttttgaat attcccaagc ccatgagtcc ttgaaaatat tttttatata
5041 tacagtaact ttatgtgtaa atacataagc ggcgtaagtt taaaggatgt tggtgttcca
5101 cgtgttttat tcctgtatgt tgtccaattg ttgacagttc tgaagaattc taataaaatg
5161 tacatatata aatcaaaaaa aaaaaaaaaa
//
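The coordinates in the feature table above can be cross-checked with simple interval arithmetic. The following is a minimal Python sketch and not part of the original record: the exon and CDS positions are copied from the FEATURES section, while the names `EXONS`, `CDS`, `span_length` and `is_contiguous` are illustrative assumptions.

```python
# Coordinates copied from the FEATURES table of NM_000222.2 (1-based, inclusive).
EXONS = [
    (1, 154), (155, 424), (425, 706), (707, 843), (844, 1012),
    (1013, 1202), (1203, 1318), (1319, 1433), (1434, 1627),
    (1628, 1734), (1735, 1861), (1862, 1966), (1967, 2077),
    (2078, 2228), (2229, 2320), (2321, 2448), (2449, 2571),
    (2572, 2683), (2684, 2783), (2784, 2889), (2890, 5176),
]
CDS = (88, 3018)


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def is_contiguous(spans):
    """True when each span starts right after the previous one ends."""
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(spans, spans[1:]))


cds_length = span_length(*CDS)
codons = cds_length // 3   # complete codons, including the terminal stop
residues = codons - 1      # amino acids once the stop codon is excluded

print("exons:", len(EXONS), "contiguous:", is_contiguous(EXONS))
print("CDS length:", cds_length, "codons:", codons, "residues:", residues)
```

Under these assumptions the 21 exons tile positions 1 to 5176 without gaps, ending at the annotated polyA site, and the CDS spans 2931 bases, that is 977 codons or 976 residues plus a stop codon.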
FASTA Format of Nucleotide Sequence (NCBI, 2009)
>gi|148005048|ref|NM_000222.2| Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), transcript variant 1, mRNA
TCTGGGGGCTCGGCTTTGCCGCGCTCGCTGCACTTGGGCGAGAGCTGGAACGTGGACCAGAGCTCGGATC
CCATCGCAGCTACCGCGATGAGAGGCGCTCGCGGCGCCTGGGATTTTCTCTGCGTTCTGCTCCTACTGCT
TCGCGTCCAGACAGGCTCTTCTCAACCATCTGTGAGTCCAGGGGAACCGTCTCCACCATCCATCCATCCA
GGAAAATCAGACTTAATAGTCCGCGTGGGCGACGAGATTAGGCTGTTATGCACTGATCCGGGCTTTGTCA
AATGGACTTTTGAGATCCTGGATGAAACGAATGAGAATAAGCAGAATGAATGGATCACGGAAAAGGCAGA
AGCCACCAACACCGGCAAATACACGTGCACCAACAAACACGGCTTAAGCAATTCCATTTATGTGTTTGTT
AGAGATCCTGCCAAGCTTTTCCTTGTTGACCGCTCCTTGTATGGGAAAGAAGACAACGACACGCTGGTCC
GCTGTCCTCTCACAGACCCAGAAGTGACCAATTATTCCCTCAAGGGGTGCCAGGGGAAGCCTCTTCCCAA
GGACTTGAGGTTTATTCCTGACCCCAAGGCGGGCATCATGATCAAAAGTGTGAAACGCGCCTACCATCGG
CTCTGTCTGCATTGTTCTGTGGACCAGGAGGGCAAGTCAGTGCTGTCGGAAAAATTCATCCTGAAAGTGA
GGCCAGCCTTCAAAGCTGTGCCTGTTGTGTCTGTGTCCAAAGCAAGCTATCTTCTTAGGGAAGGGGAAGA
ATTCACAGTGACGTGCACAATAAAAGATGTGTCTAGTTCTGTGTACTCAACGTGGAAAAGAGAAAACAGT
CAGACTAAACTACAGGAGAAATATAATAGCTGGCATCACGGTGACTTCAATTATGAACGTCAGGCAACGT
TGACTATCAGTTCAGCGAGAGTTAATGATTCTGGAGTGTTCATGTGTTATGCCAATAATACTTTTGGATC
AGCAAATGTCACAACAACCTTGGAAGTAGTAGATAAAGGATTCATTAATATCTTCCCCATGATAAACACT
ACAGTATTTGTAAACGATGGAGAAAATGTAGATTTGATTGTTGAATATGAAGCATTCCCCAAACCTGAAC
ACCAGCAGTGGATCTATATGAACAGAACCTTCACTGATAAATGGGAAGATTATCCCAAGTCTGAGAATGA
AAGTAATATCAGATACGTAAGTGAACTTCATCTAACGAGATTAAAAGGCACCGAAGGAGGCACTTACACA
TTCCTAGTGTCCAATTCTGACGTCAATGCTGCCATAGCATTTAATGTTTATGTGAATACAAAACCAGAAA
TCCTGACTTACGACAGGCTCGTGAATGGCATGCTCCAATGTGTGGCAGCAGGATTCCCAGAGCCCACAAT
AGATTGGTATTTTTGTCCAGGAACTGAGCAGAGATGCTCTGCTTCTGTACTGCCAGTGGATGTGCAGACA
CTAAACTCATCTGGGCCACCGTTTGGAAAGCTAGTGGTTCAGAGTTCTATAGATTCTAGTGCATTCAAGC
ACAATGGCACGGTTGAATGTAAGGCTTACAACGATGTGGGCAAGACTTCTGCCTATTTTAACTTTGCATT
TAAAGGTAACAACAAAGAGCAAATCCATCCCCACACCCTGTTCACTCCTTTGCTGATTGGTTTCGTAATC
GTAGCTGGCATGATGTGCATTATTGTGATGATTCTGACCTACAAATATTTACAGAAACCCATGTATGAAG
TACAGTGGAAGGTTGTTGAGGAGATAAATGGAAACAATTATGTTTACATAGACCCAACACAACTTCCTTA
TGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAGTTTTGGGAAAACCCTGGGTGCTGGAGCTTTCGGG
AAGGTTGTTGAGGCAACTGCTTATGGCTTAATTAAGTCAGATGCGGCCATGACTGTCGCTGTAAAGATGC
TCAAGCCGAGTGCCCATTTGACAGAACGGGAAGCCCTCATGTCTGAACTCAAAGTCCTGAGTTACCTTGG
TAATCACATGAATATTGTGAATCTACTTGGAGCCTGCACCATTGGAGGGCCCACCCTGGTCATTACAGAA
TATTGTTGCTATGGTGATCTTTTGAATTTTTTGAGAAGAAAACGTGATTCATTTATTTGTTCAAAGCAGG
AAGATCATGCAGAAGCTGCACTTTATAAGAATCTTCTGCATTCAAAGGAGTCTTCCTGCAGCGATAGTAC
TAATGAGTACATGGACATGAAACCTGGAGTTTCTTATGTTGTCCCAACCAAGGCCGACAAAAGGAGATCT
GTGAGAATAGGCTCATACATAGAAAGAGATGTGACTCCCGCCATCATGGAGGATGACGAGTTGGCCCTAG
ACTTAGAAGACTTGCTGAGCTTTTCTTACCAGGTGGCAAAGGGCATGGCTTTCCTCGCCTCCAAGAATTG
TATTCACAGAGACTTGGCAGCCAGAAATATCCTCCTTACTCATGGTCGGATCACAAAGATTTGTGATTTT
GGTCTAGCCAGAGACATCAAGAATGATTCTAATTATGTGGTTAAAGGAAACGCTCGACTACCTGTGAAGT
GGATGGCACCTGAAAGCATTTTCAACTGTGTATACACGTTTGAAAGTGACGTCTGGTCCTATGGGATTTT
TCTTTGGGAGCTGTTCTCTTTAGGAAGCAGCCCCTATCCTGGAATGCCGGTCGATTCTAAGTTCTACAAG
ATGATCAAGGAAGGCTTCCGGATGCTCAGCCCTGAACACGCACCTGCTGAAATGTATGACATAATGAAGA
CTTGCTGGGATGCAGATCCCCTAAAAAGACCAACATTCAAGCAAATTGTTCAGCTAATTGAGAAGCAGAT
TTCAGAGAGCACCAATCATATTTACTCCAACTTAGCAAACTGCAGCCCCAACCGACAGAAGCCCGTGGTA
GACCATTCTGTGCGGATCAATTCTGTCGGCAGCACCGCTTCCTCCTCCCAGCCTCTGCTTGTGCACGACG
ATGTCTGAGCAGAATCAGTGTTTGGGTCACCCCTCCAGGAATGATCTCTTCTTTTGGCTTCCATGATGGT
TATTTTCTTTTCTTTCAACTTGCATCCAACTCCAGGATAGTGGGCACCCCACTGCAATCCTGTCTTTCTG
AGCACACTTTAGTGGCCGATGATTTTTGTCATCAGCCACCATCCTATTGCAAAGGTTCCAACTGTATATA
TTCCCAATAGCAACGTAGCTTCTACCATGAACAGAAAACATTCTGATTTGGAAAAAGAGAGGGAGGTATG
GACTGGGGGCCAGAGTCCTTTCCAAGGCTTCTCCAATTCTGCCCAAAAATATGGTTGATAGTTTACCTGA
ATAAATGGTAGTAATCACAGTTGGCCTTCAGAACCATCCATAGTAGTATGATGATACAAGATTAGAAGCT
GAAAACCTAAGTCCTTTATGTGGAAAACAGAACATCATTAGAACAAAGGACAGAGTATGAACACCTGGGC
TTAAGAAATCTAGTATTTCATGCTGGGAATGAGACATAGGCCATGAAAAAAATGATCCCCAAGTGTGAAC
AAAAGATGCTCTTCTGTGGACCACTGCATGAGCTTTTATACTACCGACCTGGTTTTTAAATAGAGTTTGC
TATTAGAGCATTGAATTGGAGAGAAGGCCTCCCTAGCCAGCACTTGTATATACGCATCTATAAATTGTCC
GTGTTCATACATTTGAGGGGAAAACACCATAAGGTTTCGTTTCTGTATACAACCCTGGCATTATGTCCAC
TGTGTATAGAAGTAGATTAAGAGCCATATAAGTTTGAAGGAAACAGTTAATACCATTTTTTAAGGAAACA
ATATAACCACAAAGCACAGTTTGAACAAAATCTCCTCTTTTAGCTGATGAACTTATTCTGTAGATTCTGT
GGAACAAGCCTATCAGCTTCAGAATGGCATTGTACTCAATGGATTTGATGCTGTTTGACAAAGTTACTGA
TTCACTGCATGGCTCCCACAGGAGTGGGAAAACACTGCCATCTTAGTTTGGATTCTTATGTAGCAGGAAA
TAAAGTATAGGTTTAGCCTCCTTCGCAGGCATGTCCTGGACACCGGGCCAGTATCTATATATGTGTATGT
ACGTTTGTATGTGTGTAGACAAATATTTGGAGGGGTATTTTTGCCCTGAGTCCAAGAGGGTCCTTTAGTA
CCTGAAAAGTAACTTGGCTTTCATTATTAGTACTGCTCTTGTTTCTTTTCACATAGCTGTCTAGAGTAGC
TTACCAGAAGCTTCCATAGTGGTGCAGAGGAAGTGGAAGGCATCAGTCCCTATGTATTTGCAGTTCACCT
GCACTTAAGGCACTCTGTTATTTAGACTCATCTTACTGTACCTGTTCCTTAGACCTTCCATAATGCTACT
GTCTCACTGAAACATTTAAATTTTACCCTTTAGACTGTAGCCTGGATATTATTCTTGTAGTTTACCTCTT
TAAAAACAAAACAAAACAAAACAAAAAACTCCCCTTCCTCACTGCCCAATATAAAAGGCAAATGTGTACA
TGGCAGAGTTTGTGTGTTGTCTTGAAAGATTCAGGTATGTTGCCTTTATGGTTTCCCCCTTCTACATTTC
TTAGACTACATTTAGAGAACTGTGGCCGTTATCTGGAAGTAACCATTTGCACTGGAGTTCTATGCTCTCG
CACCTTTCCAAAGTTAACAGATTTTGGGGTTGTGTTGTCACCCAAGAGATTGTTGTTTGCCATACTTTGT
CTGAAAAATTCCTTTGTGTTTCTATTGACTTCAATGATAGTAAGAAAAGTGGTTGTTAGTTATAGATGTC
TAGGTACTTCAGGGGCACTTCATTGAGAGTTTTGTCTTGGATATTCTTGAAAGTTTATATTTTTATAATT
TTTTCTTACATCAGATGTTTCTTTGCAGTGGCTTAATGTTTGAAATTATTTTGTGGCTTTTTTTGTAAAT
ATTGAAATGTAGCAATAATGTCTTTTGAATATTCCCAAGCCCATGAGTCCTTGAAAATATTTTTTATATA
TACAGTAACTTTATGTGTAAATACATAAGCGGCGTAAGTTTAAAGGATGTTGGTGTTCCACGTGTTTTAT
TCCTGTATGTTGTCCAATTGTTGACAGTTCTGAAGAATTCTAATAAAATGTACATATATAAATCAAAAAA
AAAAAAAAAA
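A FASTA entry such as the one above can be read and its coding region translated with a short script. The sketch below is illustrative only: it embeds just the first two sequence lines (140 bases) of the entry under a shortened, hypothetical header, uses the standard genetic code, and takes the CDS start (position 88) from the GenBank feature table. The names `parse_fasta` and `translate` are assumptions, not functions from any tool used in this report.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Excerpt only: the first 140 bases of NM_000222.2; the header is shortened here.
FASTA_EXCERPT = """>NM_000222.2 excerpt, first 140 bases (hypothetical header)
TCTGGGGGCTCGGCTTTGCCGCGCTCGCTGCACTTGGGCGAGAGCTGGAACGTGGACCAGAGCTCGGATC
CCATCGCAGCTACCGCGATGAGAGGCGCTCGCGGCGCCTGGGATTTTCTCTGCGTTCTGCTCCTACTGCT
"""


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    return lines[0][1:], "".join(lines[1:]).upper()


def translate(dna):
    """Translate complete codons from the start of dna, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


header, sequence = parse_fasta(FASTA_EXCERPT)
cds_start = 88  # 1-based start of the CDS, taken from the feature table
peptide = translate(sequence[cds_start - 1:])
print(peptide)
```

For this excerpt the script prints `MRGARGAWDFLCVLLLL`, which agrees with the first 17 residues of the /translation qualifier in the GenBank record.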
(BioEdit, 2009)
DNA molecule: gi|148005048|ref|NM_000222.2| Homo sapiens v-kit Hardy-Zuckerman 4
feline sarcoma viral oncogene homolog (KIT), transcript variant 1, mRNA
Length = 5190 base pairs
Molecular Weight = 1572664.00 Daltons, single stranded
Molecular Weight = 3148288.00 Daltons, double stranded
G+C content = 42.35%
A+T content = 57.65%
Nucleotide Number Mol%
A 1488 28.67
C 1067 20.56
G 1131 21.79
T 1504 28.98
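The composition figures above follow directly from the base counts. The sketch below is a minimal illustration, not BioEdit's own code: the function name `composition` is an assumption, and a synthetic sequence with the reported counts stands in for the full 5190-base mRNA.

```python
from collections import Counter


def composition(sequence):
    """Return base counts, mol% per base and G+C content (%) for a DNA string."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    mol_percent = {base: round(100 * counts[base] / total, 2) for base in "ACGT"}
    gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
    return counts, mol_percent, gc_percent


# Synthetic stand-in with the same base counts as reported for NM_000222.2.
reported = {"A": 1488, "C": 1067, "G": 1131, "T": 1504}
stand_in = "".join(base * n for base, n in reported.items())

counts, mol_percent, gc_percent = composition(stand_in)
at_percent = round(100 - gc_percent, 2)
print(mol_percent, gc_percent, at_percent)
```

With the reported counts this gives 42.35% G+C and 57.65% A+T, and per-base values of 28.67, 20.56, 21.79 and 28.98 mol% for A, C, G and T, in agreement with the table.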
Plasmid
BioEdit version 7.0.5.2 (6/5/05) Restriction Mapping Utility
(c)1998, Tom Hall
gi|148005048|ref|NM_000222.2| Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma
viral oncogene homolog (KIT), transcript variant 1, mRNA Restriction Map
3/24/2002 12:35:08 PM
5190 base pairs
Translations: none
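As a rough illustration of how a restriction map of this kind is built, recognition sequences can be located by plain string search. The sketch below is not BioEdit's algorithm: it covers only four common palindromic enzymes, reports the 1-based start of each recognition site rather than the cut position, and scans only the first 80 bases of the mRNA. The names `RECOGNITION_SITES` and `find_sites` are assumptions.

```python
# Recognition sequences of four palindromic enzymes (one strand is enough,
# because each site reads the same on the reverse complement).
RECOGNITION_SITES = {
    "SacI": "GAGCTC",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
}

# First 80 bases of NM_000222.2.
WINDOW_1_80 = (
    "TCTGGGGGCTCGGCTTTGCCGCGCTCGCTGCACTTGGGCG"
    "AGAGCTGGAACGTGGACCAGAGCTCGGATCCCATCGCAGC"
)


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Map each enzyme to the 1-based start positions of its recognition site."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start + 1)              # convert to 1-based
            start = sequence.find(motif, start + 1)  # allow overlapping hits
        hits[enzyme] = positions
    return hits


for enzyme, positions in find_sites(WINDOW_1_80).items():
    print(enzyme, positions)
```

In this window the search finds a SacI site starting at position 60 and a BamHI site starting at position 66, and no EcoRI or HindIII site.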
Restriction Enzyme Map:
1
TCTGGGGGCTCGGCTTTGCCGCGCTCGCTGCACTTGGGCGAGAGCTGGAACGTGGACCAGAGCTCGGATCCCATCGCAGC
80
1
AGACCCCCGAGCCGAAACGGCGCGAGCGACGTGAACCCGCTCTCGACCTTGCACCTGGTCTCGAGCCTAGGGTAGCGTCG
80
BanII MwoI Hin4I Hpy8I AlwI
NlaIV Hin4I
Bsp1286I HpyF10VI Hin4I EcoICRI
AlwI
BsgI Cac8I BslI
BbvI MwoI BanII
HpyF10VI BsiHKAI
Bsp1286I
SacI
BamHI
BstYI
81
TACCGCGATGAGAGGCGCTCGCGGCGCCTGGGATTTTCTCTGCGTTCTGCTCCTACTGCTTCGCGTCCAGACAGGCTCTT
160
81
ATGGCGCTACTCTCCGCGAGCGCCGCGGACCCTAAAAGAGACGCAAGACGAGGATGACGAAGCGCAGGTCTGTCCGAGAA
160
BplI BbvI HaeII NlaIV BplI HgaI
Hpy188III
MwoI Cac8I BbeI MwoI
MboII
Hin4I MwoI HaeII HpyF10VI
HpyF10VI HpyF10VI
MnlI MwoI
BanI
KasI
HpyF10VI
BsaHI
NarI
SfoI
BsaJI
161
CTCAACCATCTGTGAGTCCAGGGGAACCGTCTCCACCATCCATCCATCCAGGAAAATCAGACTTAATAGTCCGCGTGGGC
240
161
GAGTTGGTAGACACTCAGGTCCCCTTGGCAGAGGTGGTAGGTAGGTAGGTCCTTTTAGTCTGAATTATCAGGCGCACCCG
240
SapI BsaJI NlaIV BsmBI BstF5I DrdI
EarI PleI FokI BstF5I
MlyI BsmAI
FokI BstF5I
FokI
241
GACGAGATTAGGCTGTTATGCACTGATCCGGGCTTTGTCAAATGGACTTTTGAGATCCTGGATGAAACGAATGAGAATAA
320
241
CTGCTCTAATCCGACAATACGTGACTAGGCCCGAAACAGTTTACCTGAAAACTCTAGGACCTACTTTGCTTACTCTTATT
320
AlwI TspRI AlwI BstF5I
TspDTI
BstYI
FokI
321
GCAGAATGAATGGATCACGGAAAAGGCAGAAGCCACCAACACCGGCAAATACACGTGCACCAACAAACACGGCTTAAGCA
400
321
CGTCTTACTTACCTAGTGCCTTTTCCGTCTTCGGTGGTTGTGGCCGTTTATGTGCACGTGGTTGTTTGTGCCGAATTCGT
400
AlwI TspGWI BsrFI AflIII Bme1580I
AflII
TspDTI BsaAI
SmlI
PmlI
ApaLI
Hpy8I
BsiHKAI
Bsp1286I
401
ATTCCATTTATGTGTTTGTTAGAGATCCTGCCAAGCTTTTCCTTGTTGACCGCTCCTTGTATGGGAAAGAAGACAACGAC
480
401
TAAGGTAAATACACAAACAATCTCTAGGACGGTTCGAAAAGGAACAACTGGCGAGGAACATACCCTTTCTTCTGTTGCTG
480
BceAI AlwI HindIII HincII BslI
BbsI
MslI BstYI Hpy8I
BsrBI
481
ACGCTGGTCCGCTGTCCTCTCACAGACCCAGAAGTGACCAATTATTCCCTCAAGGGGTGCCAGGGGAAGCCTCTTCCCAA
560
481
TGCGACCAGGCGACAGGAGAGTGTCTGGGTCTTCACTGGTTAATAAGGGAGTTCCCCACGGTCCCCTTCGGAGAAGGGTT
560
MboII MspA1I MnlI BpuEI SmlI BanI MboII
EarI
DrdI BslI BsaJI
BsaJI
MnlI
StyI
NlaIV
561
GGACTTGAGGTTTATTCCTGACCCCAAGGCGGGCATCATGATCAAAAGTGTGAAACGCGCCTACCATCGGCTCTGTCTGC
640
561
CCTGAACTCCAAATAAGGACTGGGGTTCCGCCCGTAGTACTAGTTTTCACACTTTGCGCGGATGGTAGCCGAGACAGACG
640
MnlI Hpy188III BslI BspHI BslI
MwoI
MnlI FauI BslI Hpy188III
HpyF10VI
SmlI BsaJI Cac8I BclI
StyI SfaNI
BpuEI
641
ATTGTTCTGTGGACCAGGAGGGCAAGTCAGTGCTGTCGGAAAAATTCATCCTGAAAGTGAGGCCAGCCTTCAAAGCTGTG
720
641
TAACAAGACACCTGGTCCTCCCGTTCAGTCACGACAGCCTTTTTAAGTAGGACTTTCACTCCGGTCGGAAGTTTCGACAC
720
Hin4I MmeI FokI BsaXI BstF5I Cac8I
MwoI
BsaXI TspRI Hin4I Hpy188III
HpyF10VI
Hpy8I TspDTI MnlI
MnlI ApoI
721
CCTGTTGTGTCTGTGTCCAAAGCAAGCTATCTTCTTAGGGAAGGGGAAGAATTCACAGTGACGTGCACAATAAAAGATGT
800
721
GGACAACACAGACACAGGTTTCGTTCGATAGAAGAATCCCTTCCCCTTCTTAAGTGTCACTGCACGTGTTATTTTCTACA
800
MboII ApoI MboII Hpy8I
Cac8I EcoRI TspRI
BmgBI
ApaLI
Bme1580I
BsiHKAI
Bsp1286I
801
GTCTAGTTCTGTGTACTCAACGTGGAAAAGAGAAAACAGTCAGACTAAACTACAGGAGAAATATAATAGCTGGCATCACG
880
801
CAGATCAAGACACATGAGTTGCACCTTTTCTCTTTTGTCAGTCTGATTTGATGTCCTCTTTATATTATCGACCGTAGTGC
880
TatI SfcI
Cac8I MslI
Hpy8I
881
GTGACTTCAATTATGAACGTCAGGCAACGTTGACTATCAGTTCAGCGAGAGTTAATGATTCTGGAGTGTTCATGTGTTAT
960
881
CACTGAAGTTAATACTTGCAGTCCGTTGCAACTGATAGTCAAGTCGCTCTCAATTACTAAGACCTCACAAGTACACAATA
960
SfaNI HphI AclI TspDTI
TspDTI Hpy188III
HincII
Hpy8I
961
GCCAATAATACTTTTGGATCAGCAAATGTCACAACAACCTTGGAAGTAGTAGATAAAGGATTCATTAATATCTTCCCCAT
1040
961
CGGTTATTATGAAAACCTAGTCGTTTACAGTGTTGTTGGAACCTTCATCATCTATTTCCTAAGTAATTATAGAAGGGGTA
1040
BpmI AlwI BsaJI TspDTI MboII
Eco57MI StyI AseI
1041
GATAAACACTACAGTATTTGTAAACGATGGAGAAAATGTAGATTTGATTGTTGAATATGAAGCATTCCCCAAACCTGAAC
1120
1041
CTATTTGTGATGTCATAAACATTTGCTACCTCTTTTACATCTAAACTAACAACTTATACTTCGTAAGGGGTTTGGACTTG
1120
SfcI BsaXI BsaXI BsmI
TspDTI
Hpy8I
1121
ACCAGCAGTGGATCTATATGAACAGAACCTTCACTGATAAATGGGAAGATTATCCCAAGTCTGAGAATGAAAGTAATATC
1200
1121
TGGTCGTCACCTAGATATACTTGTCTTGGAAGTGACTATTTACCCTTCTAATAGGGTTCAGACTCTTACTTTCATTATAG
1200
BstYI AlwI TspDTI BseMII
BaeI
TspRI TspRI BspCNI
BtsI MboII
1201
AGATACGTAAGTGAACTTCATCTAACGAGATTAAAAGGCACCGAAGGAGGCACTTACACATTCCTAGTGTCCAATTCTGA
1280
1201
TCTATGCATTCACTTGAAGTAGATTGCTCTAATTTTCCGTGGCTTCCTCCGTGAATGTGTAAGGATCACAGGTTAAGACT
1280
TspDTI Hpy8I BaeI BanI
BbvI
BsaAI NlaIV
SnaBI MnlI
TspDTI
1281
CGTCAATGCTGCCATAGCATTTAATGTTTATGTGAATACAAAACCAGAAATCCTGACTTACGACAGGCTCGTGAATGGCA
1360
1281
GCAGTTACGACGGTATCGTAAATTACAAATACACTTATGTTTTGGTCTTTAGGACTGAATGCTGTCCGAGCACTTACCGT
1360
BsaHI MwoI Hpy188III BssSI
ZraI HpyF10VI
Hpy188III
AatII
1361
TGCTCCAATGTGTGGCAGCAGGATTCCCAGAGCCCACAATAGATTGGTATTTTTGTCCAGGAACTGAGCAGAGATGCTCT
1440
1361
ACGAGGTTACACACCGTCGTCCTAAGGGTCTCGGGTGTTATCTAACCATAAAAACAGGTCCTTGACTCGTCTCTACGAGA
1440
Cac8I BslI BbvI BanII BseMII AlwNI
NspI PflMI Bsp1286I BspCNI
SphI SfaNI
1441
GCTTCTGTACTGCCAGTGGATGTGCAGACACTAAACTCATCTGGGCCACCGTTTGGAAAGCTAGTGGTTCAGAGTTCTAT
1520
1441
CGAAGACATGACGGTCACCTACACGTCTGTGATTTGAGTAGACCCGGTGGCAAACCTTTCGATCACCAAGTCTCAAGATA
1520
TatI BsrI TspRI FokI BsgI BslI
SfcI
BstF5I PflMI
1521
AGATTCTAGTGCATTCAAGCACAATGGCACGGTTGAATGTAAGGCTTACAACGATGTGGGCAAGACTTCTGCCTATTTTA
1600
1521
TCTAAGATCACGTAAGTTCGTGTTACCGTGCCAACTTACATTCCGAATGTTGCTACACCCGTTCTGAAGACGGATAAAAT
1600
BsmI
1601
ACTTTGCATTTAAAGGTAACAACAAAGAGCAAATCCATCCCCACACCCTGTTCACTCCTTTGCTGATTGGTTTCGTAATC
1680
1601
TGAAACGTAAATTTCCATTGTTGTTTCTCGTTTAGGTAGGGGTGTGGGACAAGTGAGGAAACGACTAACCAAAGCATTAG
1680
DraI FokI BstF5I Hpy8I
1681
GTAGCTGGCATGATGTGCATTATTGTGATGATTCTGACCTACAAATATTTACAGAAACCCATGTATGAAGTACAGTGGAA
1760
1681
CATCGACCGTACTACACGTAATAACACTACTAAGACTGGATGTTTATAAATGTCTTTGGGTACATACTTCATGTCACCTT
1760
Cac8I BstAPI MslI SspI
TatI TspRI
MwoI
HpyF10VI
1761
GGTTGTTGAGGAGATAAATGGAAACAATTATGTTTACATAGACCCAACACAACTTCCTTATGATCACAAATGGGAGTTTC
1840
1761
CCAACAACTCCTCTATTTACCTTTGTTAATACAAATGTATCTGGGTTGTGTTGAAGGAATACTAGTGTTTACCCTCAAAG
1840
TspDTI BseRI Hpy8I BclI
MnlI
1841
CCAGAAACAGGCTGAGTTTTGGGAAAACCCTGGGTGCTGGAGCTTTCGGGAAGGTTGTTGAGGCAACTGCTTATGGCTTA
1920
1841
GGTCTTTGTCCGACTCAAAACCCTTTTGGGACCCACGACCTCGAAAGCCCTTCCAACAACTCCGTTGACGAATACCGAAT
1920
BseMII TaqII BsaJI Hpy188III BpmI
BspCNI BsaJI MnlI
BslI Eco57MI
1921
ATTAAGTCAGATGCGGCCATGACTGTCGCTGTAAAGATGCTCAAGCCGAGTGCCCATTTGACAGAACGGGAAGCCCTCAT
2000
1921
TAATTCAGTCTACGCCGGTACTGACAGCGACATTTCTACGAGTTCGGCTCACGGGTAAACTGTCTTGCCCTTCGGGAGTA
2000
SfaNI EaeI BpuEI SmlI Bme1580I
AloI
PacI SfaNI Bsp1286I
2001
GTCTGAACTCAAAGTCCTGAGTTACCTTGGTAATCACATGAATATTGTGAATCTACTTGGAGCCTGCACCATTGGAGGGC
2080
2001
CAGACTTGAGTTTCAGGACTCAATGGAACCATTAGTGTACTTATAACACTTAGATGAACCTCGGACGTGGTAACCTCCCG
2080
MnlI Hpy188III AloI SspI BsgI NlaIV
MnlI BslI
BseMII BsaJI TspDTI Cac8I
EcoO109I
BspCNI StyI
PspOMI
NlaIV
2081
CCACCCTGGTCATTACAGAATATTGTTGCTATGGTGATCTTTTGAATTTTTTGAGAAGAAAACGTGATTCATTTATTTGT
2160
2081
GGTGGGACCAGTAATGTCTTATAACAACGATACCACTAGAAAACTTAAAAAACTCTTCTTTTGCACTAAGTAAATAAACA
2160
ApaI SspI ApoI TspDTI MboII
BanII HphI
Bme1580I
Bsp1286I
BsaJI
2161
TCAAAGCAGGAAGATCATGCAGAAGCTGCACTTTATAAGAATCTTCTGCATTCAAAGGAGTCTTCCTGCAGCGATAGTAC
2240
2161
AGTTTCGTCCTTCTAGTACGTCTTCGACGTGAAATATTCTTAGAAGACGTAAGTTTCCTCAGAAGGACGTCGCTATCATG
2240
BsgI MboII MboII FalI MboII PleI
TatI
BbvI BstAPI PsiI BsmI SfcI
ScaI
FalI MwoI BbsI MlyI
AlwNI PstI
HpyF10VI
2241
TAATGAGTACATGGACATGAAACCTGGAGTTTCTTATGTTGTCCCAACCAAGGCCGACAAAAGGAGATCTGTGAGAATAG
2320
2241
ATTACTCATGTACCTGTACTTTGGACCTCAAAGAATACAACAGGGTTGGTTCCGGCTGTTTTCCTCTAGACACTCTTATC
2320
BbvI TatI MslI BsmFI BpmI BslI
TspDTI Eco57MI BglII
BsaJI BstYI
StyI
BslI
2321
GCTCATACATAGAAAGAGATGTGACTCCCGCCATCATGGAGGATGACGAGTTGGCCCTAGACTTAGAAGACTTGCTGAGC
2400
2321
CGAGTATGTATCTTTCTCTACACTGAGGGCGGTAGTACCTCCTACTGCTCAACCGGGATCTGAATCTTCTGAACGACTCG
2400
MlyI MnlI BstF5I BseMII
BlpI
PleI FauI FokI
BspCNI MboII
BbsI
2401
TTTTCTTACCAGGTGGCAAAGGGCATGGCTTTCCTCGCCTCCAAGAATTGTATTCACAGAGACTTGGCAGCCAGAAATAT
2480
2401
AAAAGAATGGTCCACCGTTTCCCGTACCGAAAGGAGCGGAGGTTCTTAACATAAGTGTCTCTGAACCGTCGGTCTTTATA
2480
SexAI MwoI MnlI BsmAI
MmeI
HpyF10VI MnlI
BbvI
2481
CCTCCTTACTCATGGTCGGATCACAAAGATTTGTGATTTTGGTCTAGCCAGAGACATCAAGAATGATTCTAATTATGTGG
2560
2481
GGAGGAATGAGTACCAGCCTAGTGTTTCTAAACACTAAAACCAGATCGGTCTCTGTAGTTCTTACTAAGATTAATACACC
2560
MnlI AlwI BsmAI Hpy188III
2561
TTAAAGGAAACGCTCGACTACCTGTGAAGTGGATGGCACCTGAAAGCATTTTCAACTGTGTATACACGTTTGAAAGTGAC
2640
2561
AATTTCCTTTGCGAGCTGATGGACACTTCACCTACCGTGGACTTTCGTAAAAGTTGACACATATGTGCAAACTTTCACTG
2640
BanI FokI AccI
Hin4I BsaHI
BstF5I BstZ17I
NlaIV Hpy8I
Hin4I
AflIII
2641
GTCTGGTCCTATGGGATTTTTCTTTGGGAGCTGTTCTCTTTAGGAAGCAGCCCCTATCCTGGAATGCCGGTCGATTCTAA
2720
2641
CAGACCAGGATACCCTAAAAAGAAACCCTCGACAAGAGAAATCCTTCGTCGGGGATAGGACCTTACGGCCAGCTAAGATT
2720
ZraI BplI BplI BslI BsrFI
AatII Hin4I BbvI BsmI
AhdI Hin4I
BsiEI
2721
GTTCTACAAGATGATCAAGGAAGGCTTCCGGATGCTCAGCCCTGAACACGCACCTGCTGAAATGTATGACATAATGAAGA
2800
2721
CAAGATGTTCTACTAGTTCCTTCCGAAGGCCTACGAGTCGGGACTTGTGCGTGGACGACTTTACATACTGTATTACTTCT
2800
FalI SfaNI Hpy188III FalI BspCNI AarI
SfaNI
BclI BsaWI BlpI FokI BseMII BspMI
BspEI BstF5I
2801
CTTGCTGGGATGCAGATCCCCTAAAAAGACCAACATTCAAGCAAATTGTTCAGCTAATTGAGAAGCAGATTTCAGAGAGC
2880
2801
GAACGACCCTACGTCTAGGGGATTTTTCTGGTTGTAAGTTCGTTTAACAAGTCGATTAACTCTTCGTCTAAAGTCTCTCG
2880
BbsI TspDTI FokI Hin4I
BseYI BsaBI
MboII
AlwI
BstF5I
BstYI
2881
ACCAATCATATTTACTCCAACTTAGCAAACTGCAGCCCCAACCGACAGAAGCCCGTGGTAGACCATTCTGTGCGGATCAA
2960
2881
TGGTTAGTATAAATGAGGTTGAATCGTTTGACGTCGGGGTTGGCTGTCTTCGGGCACCATCTGGTAAGACACGCCTAGTT
2960
BsiHKAI Hin4I SfcI MmeI BsaJI Hpy8I
Bsp1286I PstI BbvI BtgI AccI
2961
TTCTGTCGGCAGCACCGCTTCCTCCTCCCAGCCTCTGCTTGTGCACGACGATGTCTGAGCAGAATCAGTGTTTGGGTCAC
3040
2961
AAGACAGCCGTCGTGGCGAAGGAGGAGGGTCGGAGACGAACACGTGCTGCTACAGACTCGTCTTAGTCACAAACCCAGTG
3040
AlwI BseRI BbvI BseYI MnlI ApaLI MslI
BpmI BstEII
MnlI MnlI Tth111I
Eco57MI
AlwNI Hpy8I HphI
Bme1580I
TspRI
BsiHKAI
Bsp1286I
BseMII
BspCNI
3041
CCCTCCAGGAATGATCTCTTCTTTTGGCTTCCATGATGGTTATTTTCTTTTCTTTCAACTTGCATCCAACTCCAGGATAG
3120
3041
GGGAGGTCCTTACTAGAGAAGAAAACCGAAGGTACTACCAATAAAAGAAAAGAAAGTTGAACGTAGGTTGAGGTCCTATC
3120
MboII EarI FokI BpmI BstF5I
SfaNI BstXI
MnlI Eco57MI
BslI
3121
TGGGCACCCCACTGCAATCCTGTCTTTCTGAGCACACTTTAGTGGCCGATGATTTTTGTCATCAGCCACCATCCTATTGC
3200
3121
ACCCGTGGGGTGACGTTAGGACAGAAAGACTCGTGTGAAATCACCGGCTACTAAAAACAGTAGTCGGTGGTAGGATAACG
3200
BanI BtsI TspRI BsiHKAI EaeI FokI
BstF5I
NlaIV BseMII Bsp1286I
Bme1580I BspCNI AleI
Bsp1286I MslI
MmeI
3201
AAAGGTTCCAACTGTATATATTCCCAATAGCAACGTAGCTTCTACCATGAACAGAAAACATTCTGATTTGGAAAAAGAGA
3280
3201
TTTCCAAGGTTGACATATATAAGGGTTATCGTTGCATCGAAGATGGTACTTGTCTTTTGTAAGACTAAACCTTTTTCTCT
3280
NlaIV MmeI XmnI
MnlI
TspDTI
MnlI
3281
GGGAGGTATGGACTGGGGGCCAGAGTCCTTTCCAAGGCTTCTCCAATTCTGCCCAAAAATATGGTTGATAGTTTACCTGA
3360
3281
CCCTCCATACCTGACCCCCGGTCTCAGGAAAGGTTCCGAAGAGGTTAAGACGGGTTTTTATACCAACTATCAAATGGACT
3360
BsrI PleI BstXI
Hpy8I
NlaIV EcoNI
BmrI MlyI
BsaJI
StyI
BslI
3361
ATAAATGGTAGTAATCACAGTTGGCCTTCAGAACCATCCATAGTAGTATGATGATACAAGATTAGAAGCTGAAAACCTAA
3440
3361
TATTTACCATCATTAGTGTCAACCGGAAGTCTTGGTAGGTATCATCATACTACTATGTTCTAATCTTCGACTTTTGGATT
3440
Eco57I FokI BstF5I
Eco57MI
3441
GTCCTTTATGTGGAAAACAGAACATCATTAGAACAAAGGACAGAGTATGAACACCTGGGCTTAAGAAATCTAGTATTTCA
3520
3441
CAGGAAATACACCTTTTGTCTTGTAGTAATCTTGTTTCCTGTCTCATACTTGTGGACCCGAATTCTTTAGATCATAAAGT
3520
BslI BsaJI TspDTI
AflII TspDTI
SmlI
3521
TGCTGGGAATGAGACATAGGCCATGAAAAAAATGATCCCCAAGTGTGAACAAAAGATGCTCTTCTGTGGACCACTGCATG
3600
3521
ACGACCCTTACTCTGTATCCGGTACTTTTTTTACTAGGGGTTCACACTTGTTTTCTACGAGAAGACACCTGGTGACGTAC
3600
BseYI AlwI TspDTI Hpy8I SapI
BtsI TspRI
BsmAI FalI MboII EarI
FalI
SfaNI Hpy8I
3601
AGCTTTTATACTACCGACCTGGTTTTTAAATAGAGTTTGCTATTAGAGCATTGAATTGGAGAGAAGGCCTCCCTAGCCAG
3680
3601
TCGAAAATATGATGGCTGGACCAAAAATTTATCTCAAACGATAATCTCGTAACTTAACCTCTCTTCCGGAGGGATCGGTC
3680
SexAI DraI MwoI BplI StuI
MwoI
HpyF10VI
HpyF10VI
MnlI
Cac8I
3681
CACTTGTATATACGCATCTATAAATTGTCCGTGTTCATACATTTGAGGGGAAAACACCATAAGGTTTCGTTTCTGTATAC
3760
3681
GTGAACATATATGCGTAGATATTTAACAGGCACAAGTATGTAAACTCCCCTTTTGTGGTATTCCAAAGCAAAGACATATG
3760
BplI TspGWI MnlI
AccI
SfaNI
BstZ17I
TspDTI
Hpy8I
3761
AACCCTGGCATTATGTCCACTGTGTATAGAAGTAGATTAAGAGCCATATAAGTTTGAAGGAAACAGTTAATACCATTTTT
3840
3761
TTGGGACCGTAATACAGGTGACACATATCTTCATCTAATTCTCGGTATATTCAAACTTCCTTTGTCAATTATGGTAAAAA
3840
BsaJI Hpy8I TspRI
3841
TAAGGAAACAATATAACCACAAAGCACAGTTTGAACAAAATCTCCTCTTTTAGCTGATGAACTTATTCTGTAGATTCTGT
3920
3841
ATTCCTTTGTTATATTGGTGTTTCGTGTCAAACTTGTTTTAGAGGAGAAAATCGACTACTTGAATAAGACATCTAAGACA
3920
AloI BseRI MnlI XmnI
TspDTI
PpiI BsaXI SfcI
BsaXI AloI
PpiI
3921
GGAACAAGCCTATCAGCTTCAGAATGGCATTGTACTCAATGGATTTGATGCTGTTTGACAAAGTTACTGATTCACTGCAT
4000
3921
CCTTGTTCGGATAGTCGAAGTCTTACCGTAACATGAGTTACCTAAACTACGACAAACTGTTTCAATGACTAAGTGACGTA
4000
Eco57I TatI SfaNI
BtsI TspRI
Eco57MI
4001
GGCTCCCACAGGAGTGGGAAAACACTGCCATCTTAGTTTGGATTCTTATGTAGCAGGAAATAAAGTATAGGTTTAGCCTC
4080
4001
CCGAGGGTGTCCTCACCCTTTTGTGACGGTAGAATCAAACCTAAGAATACATCGTCCTTTATTTCATATCCAAATCGGAG
4080
NlaIV AleI BtsI TspRI
MslI
BstXI
4081
CTTCGCAGGCATGTCCTGGACACCGGGCCAGTATCTATATATGTGTATGTACGTTTGTATGTGTGTAGACAAATATTTGG
4160
4081
GAAGCGTCCGTACAGGACCTGTGGCCCGGTCATAGATATATACACATACATGCAAACATACACACATCTGTTTATAAACC
4160
MwoI NspI BsrI AccI
MnlI
HpyF10VI Hpy8I
SspI
MnlI
Cac8I
4161
AGGGGTATTTTTGCCCTGAGTCCAAGAGGGTCCTTTAGTACCTGAAAAGTAACTTGGCTTTCATTATTAGTACTGCTCTT
4240
4161
TCCCCATAAAAACGGGACTCAGGTTCTCCCAGGAAATCATGGACTTTTCATTGAACCGAAAGTAATAATCATGACGAGAA
4240
BseMII MnlI PleI TspDTI
TatI
BspCNI MlyI
ScaI
EcoO109I
PpuMI
NlaIV
4241
GTTTCTTTTCACATAGCTGTCTAGAGTAGCTTACCAGAAGCTTCCATAGTGGTGCAGAGGAAGTGGAAGGCATCAGTCCC
4320
4241
CAAAGAAAAGTGTATCGACAGATCTCATCGAATGGTCTTCGAAGGTATCACCACGTCTCCTTCACCTTCCGTAGTCAGGG
4320
XbaI HindIII MslI BsmFI
BsgI SfaNI
Hpy188III MnlI
PsrI
4321
TATGTATTTGCAGTTCACCTGCACTTAAGGCACTCTGTTATTTAGACTCATCTTACTGTACCTGTTCCTTAGACCTTCCA
4400
4321
ATACATAAACGTCAAGTGGACGTGAATTCCGTGAGACAATAAATCTGAGTAGAATGACATGGACAAGGAATCTGGAAGGT
4400
BsgI Hpy8I AflII MlyI
HphI SmlI PleI
AarI
BspMI
MwoI
PsrI
HpyF10VI
4401
TAATGCTACTGTCTCACTGAAACATTTAAATTTTACCCTTTAGACTGTAGCCTGGATATTATTCTTGTAGTTTACCTCTT
4480
4401
ATTACGATGACAGAGTGACTTTGTAAATTTAAAATGGGAAATCTGACATCGGACCTATAATAAGAACATCAAATGGAGAA
4480
BsmAI SwaI SfcI
Hpy8I
TspRI ApoI
DraI
4481
TAAAAACAAAACAAAACAAAACAAAAAACTCCCCTTCCTCACTGCCCAATATAAAAGGCAAATGTGTACATGGCAGAGTT
4560
4481
ATTTTTGTTTTGTTTTGTTTTGTTTTTTGAGGGGAAGGAGTGACGGGTTATATTTTCCGTTTACACATGTACCGTCTCAA
4560
DraI BtsI TspRI BsrGI
MnlI MnlI TatI
Hpy8I
4561
TGTGTGTTGTCTTGAAAGATTCAGGTATGTTGCCTTTATGGTTTCCCCCTTCTACATTTCTTAGACTACATTTAGAGAAC
4640
4561
ACACACAACAGAACTTTCTAAGTCCATACAACGGAAATACCAAAGGGGGAAGATGTAAAGAATCTGATGTAAATCTCTTG
4640
Hpy188III BceAI
4641
TGTGGCCGTTATCTGGAAGTAACCATTTGCACTGGAGTTCTATGCTCTCGCACCTTTCCAAAGTTAACAGATTTTGGGGT
4720
4641
ACACCGGCAATAGACCTTCATTGGTAAACGTGACCTCAAGATACGAGAGCGTGGAAAGGTTTCAATTGTCTAAAACCCCA
4720
EaeI BslI TspRI BpmI HincII
Hpy188III BsrI Eco57MI HpaI
Hpy8I
4721
TGTGTTGTCACCCAAGAGATTGTTGTTTGCCATACTTTGTCTGAAAAATTCCTTTGTGTTTCTATTGACTTCAATGATAG
4800
4721
ACACAACAGTGGGTTCTCTAACAACAAACGGTATGAAACAGACTTTTTAAGGAAACACAAAGATAACTGAAGTTACTATC
4800
HphI TaqII ApoI
4801
TAAGAAAAGTGGTTGTTAGTTATAGATGTCTAGGTACTTCAGGGGCACTTCATTGAGAGTTTTGTCTTGGATATTCTTGA
4880
4801
ATTCTTTTCACCAACAATCAATATCTACAGATCCATGAAGTCCCCGTGAAGTAACTCTCAAAACAGAACCTATAAGAACT
4880
Eco57I TspDTI Bme1580I
Hpy188III
Eco57MI Bsp1286I
4881
AAGTTTATATTTTTATAATTTTTTCTTACATCAGATGTTTCTTTGCAGTGGCTTAATGTTTGAAATTATTTTGTGGCTTT
4960
4881
TTCAAATATAAAAATATTAAAAAAGAATGTAGTCTACAAAGAAACGTCACCGAATTACAAACTTTAATAAAACACCGAAA
4960
PsiI TspRI
BtsI
4961
TTTTGTAAATATTGAAATGTAGCAATAATGTCTTTTGAATATTCCCAAGCCCATGAGTCCTTGAAAATATTTTTTATATA
5040
4961
AAAACATTTATAACTTTACATCGTTATTACAGAAAACTTATAAGGGTTCGGGTACTCAGGAACTTTTATAAAAAATATAT
5040
SspI SspI PleI
MlyI
SspI
5041
TACAGTAACTTTATGTGTAAATACATAAGCGGCGTAAGTTTAAAGGATGTTGGTGTTCCACGTGTTTTATTCCTGTATGT
5120
5041
ATGTCATTGAAATACACATTTATGTATTCGCCGCATTCAAATTTCCTACAACCACAAGGTGCACAAAATAAGGACATACA
5120
DraI BstF5I AflIII
FokI
BsaAI
PmlI
5121
TGTCCAATTGTTGACAGTTCTGAAGAATTCTAATAAAATGTACATATATAAATCAAAAAAAAAAAAAAAA
5190
5121
ACAGGTTAACAACTGTCAAGACTTCTTAAGATTATTTTACATGTATATATTTAGTTTTTTTTTTTTTTTT
5190
MfeI HincII ApoI MboII
Hpy8I EcoRI BsrGI
TatI
Eco57I
Eco57MI
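Each block of the map above pairs a stretch of the top strand with its complementary strand, both written left to right. The pairing can be checked with a minimal Python sketch on the first 40 bases of the final block (positions 5121 to 5160). The function names `complement` and `reverse_complement` are illustrative assumptions and are not part of the tool output.

```python
# Watson-Crick pairing table for the four unambiguous DNA bases.
PAIRS = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the complementary strand, read in the same direction."""
    return strand.upper().translate(PAIRS)


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return complement(strand)[::-1]


# First 40 bases of the last block in the listing (positions 5121-5160).
top = "TGTCCAATTGTTGACAGTTCTGAAGAATTCTAATAAAATG"
bottom = "ACAGGTTAACAACTGTCAAGACTTCTTAAGATTATTTTAC"

# The lower line of the listing should be the base-by-base complement.
print(complement(top) == bottom)
```

The same helper applies to any other block of the listing, since every lower line is printed in the same orientation as the upper one.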
Restriction table:
Enzyme Recognition frequency Positions
__________________________________________________________________________
AarI CACCTGCnnnn'nnnn_ 2 2782, 4347
AatII G_ACGT'C 2 1284, 2643
AccI GT'mk_AC 4 2622, 2940, 3757, 4147
AclI AA'CG_TT 1 908
AflII C'TTAA_G 3 394, 3501, 4345
AflIII A'CryG_T 3 372, 2625, 5101
AhdI GACnn_n'nnGTC 1 2644
AleI CACnn'nnGTG 2 3160, 4012
AloI GAACnnnnnnTCCnnnnnnn_nnnnn' 2 2030, 3898
AloI GGAnnnnnnGTTCnnnnnnn_nnnnn' 2 1998, 3866
AlwI GGATCnnnn'n_ 12 62, 75, 260, 289, 341, 419, 985
1139, 2507, 2810, 2963, 3549
AlwNI CAG_nnn'CTG 3 1424, 2186, 2995
ApaI G_GGCC'C 1 2082
ApaLI G'TGCA_C 3 376, 784, 3002
ApoI r'AATT_y 6 683, 770, 2125, 4429, 4767, 5146
AseI AT'TA_AT 1 1026
BaeI ACnnnnGTAyCnnnnnnn_nnnnn' 1 1195
BaeI GrTACnnnnGTnnnnnnnnnn_nnnnn' 1 1228
BamHI G'GATC_C 1 67
BanI G'GyrC_C 5 104, 537, 1238, 2596, 3124
BanII G_rGCy'C 4 11, 65, 1395, 2082
BbeI G_GCGC'C 1 108
BbsI GAAGACnn'nnnn_ 4 477, 2214, 2394, 2804
BbvI GCAGCnnnnnnnn'nnnn_ 10 15, 89, 1276, 1388, 2173, 2241
2480, 2700, 2925, 2982
BceAI ACGGCnnnnnnnnnnnn'nn_ 2 406, 4631
BclI T'GATC_A 3 600, 1822, 2733
BglII A'GATC_T 1 2306
BlpI GC'TnA_GC 2 2396, 2756
Bme1580I G_kGCm'C 7 380, 788, 1975, 2082, 3006, 3127
4848
BmgBI CAC'GTC 1 783
BmrI ACTGGGnnnn_n' 1 3303
BplI GAGnnnnnCTCnnnnnnnn_nnnnn' 6 82, 114, 2660, 2692, 3653, 3685
BpmI CTGGAGnnnnnnnnnnnnnn_nn' 6 963, 1899, 2286, 3029, 3096, 4694
BpuEI CTTGAGnnnnnnnnnnnnnn_nn' 3 515, 586, 1946
BsaAI yAC'GTr 3 375, 1207, 5102
BsaBI GATnn'nnATC 1 2814
BsaHI Gr'CG_yC 3 105, 1281, 2640
BsaJI C'CnnG_G 15 108, 179, 541, 558, 585, 999
1869, 1870, 2026, 2085, 2289
2934, 3313, 3495, 3764
BsaWI w'CCGG_w 1 2748
BsaXI ACnnnnnCTCCnnnnnnn_nnn' 3 650, 1062, 3896
BsaXI GGAGnnnnnGTnnnnnnnnn_nnn' 3 680, 1092, 3866
BseMII CTCAGnnnnnnnn_nn' 9 1173, 1416, 1844, 2009, 2387, 2770
3007, 3140, 4168
BseRI GAGGAGnnnnnnnn_nn' 3 1784, 2974, 3874
BseYI C'CCAG_C 3 2805, 2988, 3523
BsgI GTGCAGnnnnnnnnnnnnnn_nn' 6 14, 1484, 2050, 2172, 4314, 4325
BsiEI CG_ry'CG 1 2712
BsiHKAI G_wGCw'C 6 65, 380, 788, 2882, 3006, 3155
BslI CCnn_nnn'nnGG 17 64, 462, 534, 589, 590, 627, 1372
1493, 1848, 2076, 2290, 2301
2699, 3113, 3314, 3450, 4653
BsmI GAATG_Cn' 4 1103, 1532, 2209, 2709
BsmAI GTCTCn'nnnn_ 5 195, 2454, 2526, 3526, 4417
BsmBI CGTCTCn'nnnn_ 1 195
BsmFI GGGACnnnnnnnnnn'nnnn_ 2 2267, 4302
Bsp1286I G_dGCh'C 12 11, 65, 380, 788, 1395, 1975
2082, 2882, 3006, 3127, 3155
4848
BspCNI CTCAGnnnnnnn_nn' 9 1174, 1417, 1845, 2010, 2388, 2769
3008, 3141, 4169
BspEI T'CCGG_A 1 2748
BspHI T'CATG_A 1 597
BspMI ACCTGCnnnn'nnnn_ 2 2782, 4347
BsrI ACTG_Gn' 4 1454, 3298, 4109, 4677
BsrBI CCG'CTC 1 453
BsrFI r'CCGG_y 2 362, 2707
BsrGI T'GTAC_A 2 4546, 5160
BssSI C'ACGA_G 1 1349
BstAPI GCAn_nnn'nTGC 2 1695, 2186
BstEII G'GTnAC_C 1 3036
BstF5I GGATG_nn' 15 197, 201, 205, 307, 687, 1465
1636, 2368, 2598, 2757, 2815
3103, 3190, 3395, 5092
BstXI CCAn_nnnn'nTGG 3 3120, 3341, 4014
BstYI r'GATC_y 6 67, 294, 424, 1131, 2306, 2815
BstZ17I GTA'TAC 2 2623, 3758
BtgI C'CryG_G 1 2934
BtsI GCAGTG_nn' 7 1133, 3130, 3592, 3993, 4023, 4520
4933
Cac8I GCn'nGC 11 26, 100, 592, 705, 745, 872, 1361
1687, 2065, 3679, 4088
DraI TTT'AAA 5 1612, 3628, 4428, 4482, 5082
DrdI GACnn_nn'nnGTC 2 227, 485
EaeI y'GGCC_r 3 1935, 3164, 4644
EarI CTCTTCn'nnn_ 4 163, 558, 3063, 3586
Eco57I CTGAAGnnnnnnnnnnnnnn_nn' 4 3372, 3923, 4823, 5162
EcoICRI GAG'CTC 1 63
Eco57MI CTGrAGnnnnnnnnnnnnnn_nn' 10 963, 1899, 2286, 3029, 3096, 3372
3923, 4694, 4823, 5162
EcoNI CCTnn'n_nnAGG 1 3312
EcoO109I rG'GnC_Cy 2 2078, 4190
EcoRI G'AATT_C 2 770, 5146
FalI AAGnnnnnCTTnnnnnnnn_nnnnn' 6 2175, 2207, 2729, 2761, 3565, 3597
FauI CCCGCnnnn'nn_ 2 583, 2356
FokI GGATGnnnnnnnnn'nnnn_ 15 184, 188, 192, 314, 674, 1472
1623, 2375, 2605, 2764, 2822
3090, 3177, 3382, 5099
HaeII r_GCGC'y 2 99, 108
HgaI GACGCnnnnn'nnnnn_ 1 133
Hin4I GAynnnnnvTCnnnnnnnn_nnnnn' 6 52, 79, 650, 2630, 2662, 2869
Hin4I GAbnnnnnrTCnnnnnnnn_nnnnn' 6 47, 84, 682, 2630, 2662, 2901
HincII GTy'rAC 4 448, 912, 4706, 5133
HindIII A'AGCT_T 2 434, 4279
HpaI GTT'AAC 1 4706
HphI GGTGAnnnnnnn_n' 5 893, 2126, 3030, 4328, 4721
Hpy8I GTn'nAC 25 55, 378, 448, 652, 786, 814, 912
1063, 1214, 1653, 1795, 2623
2941, 3004, 3354, 3568, 3589
3758, 3778, 4148, 4336, 4473
4547, 4706, 5133
Hpy188III TC'nn_GA 15 148, 578, 598, 691, 942, 1333
1351, 1888, 2017, 2539, 2749
4262, 4572, 4654, 4877
HpyF10VI GCn_nnnnn'nGC 16 26, 29, 84, 103, 105, 137, 638
714, 1296, 1696, 2187, 2436, 3647
3675, 4084, 4349
KasI G'GCGC_C 1 104
MboII GAAGAnnnnnnn_n' 16 150, 482, 545, 743, 779, 1024
1178, 2148, 2183, 2195, 2214
2399, 2809, 3050, 3573, 5155
MfeI C'AATT_G 1 5126
MlyI GAGTCnnnnn' 7 184, 2228, 2338, 3313, 4188, 4360
5025
MmeI TCCrACnnnnnnnnnnnnnnnnnn_nn' 5 657, 2477, 2922, 3131, 3233
MnlI CCTCnnnnnn_n' 31 86, 507, 539, 561, 561, 652, 693
1241, 1762, 1894, 2006, 2069
2353, 2444, 2449, 2492, 2992
2995, 3003, 3053, 3273, 3277
3679, 3719, 3895, 4088, 4154
4180, 4291, 4486, 4528
MslI CAynn'nnrTG 8 410, 879, 1703, 2255, 3009, 3160
4012, 4290
MspA1I CmG'CkG 1 492
MwoI GCnn_nnn'nnGC 16 25, 28, 83, 102, 104, 136, 637
713, 1295, 1695, 2186, 2435, 3646
3674, 4083, 4348
NarI GG'CG_CC 1 105
NlaIV GGn'nCC 13 69, 106, 186, 539, 1240, 2062
2080, 2598, 3126, 3207, 3299
4004, 4191
NspI r_CATG'y 2 1363, 4094
PacI TTA_AT'TAA 1 1923
PflMI CCAn_nnn'nTGG 2 1372, 1493
PleI GAGTCnnnn'n_ 7 183, 2227, 2338, 3312, 4187, 4360
5024
PmlI CAC'GTG 2 375, 5102
PpiI GAACnnnnnCTCnnnnnnnn_nnnnn' 1 3898
PpiI GAGnnnnnGTTCnnnnnnn_nnnnn' 1 3866
PpuMI rG'GwC_Cy 1 4190
PsiI TTA'TAA 2 2196, 4896
PspOMI G'GGCC_C 1 2078
PsrI GAACnnnnnnTACnnnnnnn_nnnnn' 1 4317
PsrI GTAnnnnnnGTTCnnnnnnn_nnnnn' 1 4349
PstI C_TGCA'G 2 2231, 2915
SacI G_AGCT'C 1 65
SapI GCTCTTCn'nnn_ 2 163, 3586
ScaI AGT'ACT 2 2239, 4232
SexAI A'CCwGG_T 2 2409, 3618
SfaNI GCATCnnnnn'nnnn_ 12 603, 883, 1424, 1921, 1947, 2742
2800, 3112, 3566, 3704, 3958
4320
SfcI C'TryA_G 7 851, 1050, 1518, 2227, 2911, 3909
4446
SfoI GGC'GCC 1 106
SmlI C'TyrA_G 6 394, 530, 565, 1961, 3501, 4345
SnaBI TAC'GTA 1 1207
SphI G_CATG'C 1 1363
SspI AAT'ATT 7 1727, 2044, 2102, 4155, 4971, 5001
5029
StuI AGG'CCT 1 3668
StyI C'CwwG_G 6 558, 585, 999, 2026, 2289, 3313
SwaI ATTT'AAAT 1 4428
TaqII CACCCAnnnnnnnnn_nn' 2 1862, 4746
TatI w'GTAC_w 9 813, 1447, 1750, 2237, 2247, 3952
4230, 4546, 5160
TspDTI ATGAAnnnnnnnnn_nn' 23 318, 342, 676, 909, 940, 1012
1113, 1154, 1203, 1208, 1761
2054, 2139, 2273, 2810, 3263
3503, 3508, 3559, 3705, 3913
4211, 4840
TspGWI ACGGAnnnnnnnnn_nn' 2 353, 3699
TspRI _nnCAsTGnn' 17 268, 675, 783, 1133, 1159, 1461
1760, 3033, 3137, 3599, 3785
4000, 4030, 4422, 4527, 4677
4933
Tth111I GACn'n_nGTC 1 3011
XbaI T'CTAG_A 1 4261
XmnI GAAnn'nnTTC 2 3259, 3904
ZraI GAC'GTC 2 1282, 2641
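In the Recognition column, an apostrophe marks the cut on the top strand, an underscore marks the cut on the bottom strand, and lowercase letters are IUPAC ambiguity codes (for example r = A or G, y = C or T, n = any base). The following is a minimal sketch of how such entries can be turned into cut positions. It assumes that each listed position is the 1-based index of the first base after the top-strand cut, a convention inferred from the listing itself. It scans the top strand only, so asymmetric sites lying on the bottom strand are not found. The helper names `parse_site` and `find_cuts` are hypothetical.

```python
import re

# IUPAC nucleotide codes from the Recognition column, as regex classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def parse_site(site):
    """Split a site such as G'AATT_C into (regex, top-strand cut offset)."""
    before_cut = site.split("'")[0].replace("_", "")
    bases = site.replace("'", "").replace("_", "")
    regex = "".join(IUPAC[base.upper()] for base in bases)
    return regex, len(before_cut)


def find_cuts(site, seq, first_coord=1):
    """List cut positions on the top strand of seq.

    first_coord is the coordinate of seq[0] in the full sequence.
    """
    regex, cut = parse_site(site)
    # A lookahead is used so that overlapping sites are all reported.
    return [first_coord + m.start() + cut
            for m in re.finditer(f"(?=({regex}))", seq.upper())]


# First 40 bases of the last block in the listing (positions 5121-5160).
fragment = "TGTCCAATTGTTGACAGTTCTGAAGAATTCTAATAAAATG"
for name, site in [("MfeI", "C'AATT_G"), ("HincII", "GTy'rAC"),
                   ("EcoRI", "G'AATT_C")]:
    print(name, find_cuts(site, fragment, first_coord=5121))
```

For this fragment the sketch gives 5126 for MfeI, 5133 for HincII and 5146 for EcoRI, which agrees with the last listed position of each of these enzymes in the table.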
Enzymes that cut five or fewer times
Enzyme Recognition frequency Positions
__________________________________________________________________________
AarI CACCTGCnnnn'nnnn_ 2 2782, 4347
AatII G_ACGT'C 2 1284, 2643
AccI GT'mk_AC 4 2622, 2940, 3757, 4147
AclI AA'CG_TT 1 908
AflII C'TTAA_G 3 394, 3501, 4345
AflIII A'CryG_T 3 372, 2625, 5101
AhdI GACnn_n'nnGTC 1 2644
AleI CACnn'nnGTG 2 3160, 4012
AloI GAACnnnnnnTCCnnnnnnn_nnnnn' 2 2030, 3898
AloI GGAnnnnnnGTTCnnnnnnn_nnnnn' 2 1998, 3866
AlwNI CAG_nnn'CTG 3 1424, 2186, 2995
ApaI G_GGCC'C 1 2082
ApaLI G'TGCA_C 3 376, 784, 3002
AseI AT'TA_AT 1 1026
BaeI ACnnnnGTAyCnnnnnnn_nnnnn' 1 1195
BaeI GrTACnnnnGTnnnnnnnnnn_nnnnn' 1 1228
BamHI G'GATC_C 1 67
BanI G'GyrC_C 5 104, 537, 1238, 2596, 3124
BanII G_rGCy'C 4 11, 65, 1395, 2082
BbeI G_GCGC'C 1 108
BbsI GAAGACnn'nnnn_ 4 477, 2214, 2394, 2804
BceAI ACGGCnnnnnnnnnnnn'nn_ 2 406, 4631
BclI T'GATC_A 3 600, 1822, 2733
BglII A'GATC_T 1 2306
BlpI GC'TnA_GC 2 2396, 2756
BmgBI CAC'GTC 1 783
BmrI ACTGGGnnnn_n' 1 3303
BpuEI CTTGAGnnnnnnnnnnnnnn_nn' 3 515, 586, 1946
BsaAI yAC'GTr 3 375, 1207, 5102
BsaBI GATnn'nnATC 1 2814
BsaHI Gr'CG_yC 3 105, 1281, 2640
BsaWI w'CCGG_w 1 2748
BsaXI ACnnnnnCTCCnnnnnnn_nnn' 3 650, 1062, 3896
BsaXI GGAGnnnnnGTnnnnnnnnn_nnn' 3 680, 1092, 3866
BseRI GAGGAGnnnnnnnn_nn' 3 1784, 2974, 3874
BseYI C'CCAG_C 3 2805, 2988, 3523
BsiEI CG_ry'CG 1 2712
BsmI GAATG_Cn' 4 1103, 1532, 2209, 2709
BsmAI GTCTCn'nnnn_ 5 195, 2454, 2526, 3526, 4417
BsmBI CGTCTCn'nnnn_ 1 195
BsmFI GGGACnnnnnnnnnn'nnnn_ 2 2267, 4302
BspEI T'CCGG_A 1 2748
BspHI T'CATG_A 1 597
BspMI ACCTGCnnnn'nnnn_ 2 2782, 4347
BsrI ACTG_Gn' 4 1454, 3298, 4109, 4677
BsrBI CCG'CTC 1 453
BsrFI r'CCGG_y 2 362, 2707
BsrGI T'GTAC_A 2 4546, 5160
BssSI C'ACGA_G 1 1349
BstAPI GCAn_nnn'nTGC 2 1695, 2186
BstEII G'GTnAC_C 1 3036
BstXI CCAn_nnnn'nTGG 3 3120, 3341, 4014
BstZ17I GTA'TAC 2 2623, 3758
BtgI C'CryG_G 1 2934
DraI TTT'AAA 5 1612, 3628, 4428, 4482, 5082
DrdI GACnn_nn'nnGTC 2 227, 485
EaeI y'GGCC_r 3 1935, 3164, 4644
EarI CTCTTCn'nnn_ 4 163, 558, 3063, 3586
Eco57I CTGAAGnnnnnnnnnnnnnn_nn' 4 3372, 3923, 4823, 5162
EcoICRI GAG'CTC 1 63
EcoNI CCTnn'n_nnAGG 1 3312
EcoO109I rG'GnC_Cy 2 2078, 4190
EcoRI G'AATT_C 2 770, 5146
FauI CCCGCnnnn'nn_ 2 583, 2356
HaeII r_GCGC'y 2 99, 108
HgaI GACGCnnnnn'nnnnn_ 1 133
HincII GTy'rAC 4 448, 912, 4706, 5133
HindIII A'AGCT_T 2 434, 4279
HpaI GTT'AAC 1 4706
HphI GGTGAnnnnnnn_n' 5 893, 2126, 3030, 4328, 4721
KasI G'GCGC_C 1 104
MfeI C'AATT_G 1 5126
MmeI TCCrACnnnnnnnnnnnnnnnnnn_nn' 5 657, 2477, 2922, 3131, 3233
MspA1I CmG'CkG 1 492
NarI GG'CG_CC 1 105
NspI r_CATG'y 2 1363, 4094
PacI TTA_AT'TAA 1 1923
PflMI CCAn_nnn'nTGG 2 1372, 1493
PmlI CAC'GTG 2 375, 5102
PpiI GAACnnnnnCTCnnnnnnnn_nnnnn' 1 3898
PpiI GAGnnnnnGTTCnnnnnnn_nnnnn' 1 3866
PpuMI rG'GwC_Cy 1 4190
PsiI TTA'TAA 2 2196, 4896
PspOMI G'GGCC_C 1 2078
PsrI GAACnnnnnnTACnnnnnnn_nnnnn' 1 4317
PsrI GTAnnnnnnGTTCnnnnnnn_nnnnn' 1 4349
PstI C_TGCA'G 2 2231, 2915
SacI G_AGCT'C 1 65
SapI GCTCTTCn'nnn_ 2 163, 3586
ScaI AGT'ACT 2 2239, 4232
SexAI A'CCwGG_T 2 2409, 3618
SfoI GGC'GCC 1 106
SnaBI TAC'GTA 1 1207
SphI G_CATG'C 1 1363
StuI AGG'CCT 1 3668
SwaI ATTT'AAAT 1 4428
TaqII CACCCAnnnnnnnnn_nn' 2 1862, 4746
TspGWI ACGGAnnnnnnnnn_nn' 2 353, 3699
Tth111I GACn'n_nGTC 1 3011
XbaI T'CTAG_A 1 4261
XmnI GAAnn'nnTTC 2 3259, 3904
ZraI GAC'GTC 2 1282, 2641
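Enzymes that cut only a few times are the practical choice for a digest, because the resulting fragments are few and easy to size. As a rough sketch, the fragment lengths for a complete digest can be computed from the listed positions. Three assumptions are made here: the sequence is linear, it is 5190 bases long (the last coordinate in the listing), and each position is the 1-based index of the first base after the cut. The function name `fragment_sizes` is hypothetical.

```python
def fragment_sizes(cut_positions, seq_length):
    """Fragment lengths from a complete digest of a linear sequence.

    Each cut position is the 1-based index of the first base after a cut,
    so a cut at p ends one fragment at p - 1 and starts the next at p.
    """
    starts = [1] + sorted(set(cut_positions))
    ends = [p - 1 for p in starts[1:]] + [seq_length]
    return [end - start + 1 for start, end in zip(starts, ends)]


SEQ_LENGTH = 5190  # assumed from the last coordinate of the listing

print("EcoRI  ", fragment_sizes([770, 5146], SEQ_LENGTH))
print("HindIII", fragment_sizes([434, 4279], SEQ_LENGTH))
```

With the EcoRI positions 770 and 5146 this gives fragments of 769, 4376 and 45 bases, which sum to 5190.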
Position Enzyme(s)
__________________________________________________________________________
11 BanII G_rGCy'C
11 Bsp1286I G_dGCh'C
14 BsgI GTGCAGnnnnnnnnnnnnnn_nn'
15 BbvI GCAGCnnnnnnnn'nnnn_
25 MwoI GCnn_nnn'nnGC
26 HpyF10VI GCn_nnnnn'nGC
26 Cac8I GCn'nGC
28 MwoI GCnn_nnn'nnGC
29 HpyF10VI GCn_nnnnn'nGC
47 Hin4I GAbnnnnnrTCnnnnnnnn_nnnnn'
52 Hin4I GAynnnnnvTCnnnnnnnn_nnnnn'
55 Hpy8I GTn'nAC
62 AlwI GGATCnnnn'n_
63 EcoICRI GAG'CTC
64 BslI CCnn_nnn'nnGG
65 BanII G_rGCy'C
65 BsiHKAI G_wGCw'C
65 Bsp1286I G_dGCh'C
65 SacI G_AGCT'C
67 BamHI G'GATC_C
67 BstYI r'GATC_y
69 NlaIV GGn'nCC
75 AlwI GGATCnnnn'n_
79 Hin4I GAynnnnnvTCnnnnnnnn_nnnnn'
82 BplI GAGnnnnnCTCnnnnnnnn_nnnnn'
83 MwoI GCnn_nnn'nnGC
84 Hin4I GAbnnnnnrTCnnnnnnnn_nnnnn'
84 HpyF10VI GCn_nnnnn'nGC
86 MnlI CCTCnnnnnn_n'
89 BbvI GCAGCnnnnnnnn'nnnn_
99 HaeII r_GCGC'y
100 Cac8I GCn'nGC
102 MwoI GCnn_nnn'nnGC
103 HpyF10VI GCn_nnnnn'nGC
104 MwoI GCnn_nnn'nnGC
104 BanI G'GyrC_C
104 KasI G'GCGC_C
105 HpyF10VI GCn_nnnnn'nGC
105 BsaHI Gr'CG_yC
105 NarI GG'CG_CC
106 NlaIV GGn'nCC
106 SfoI GGC'GCC
108 BbeI G_GCGC'C
108 HaeII r_GCGC'y
108 BsaJI C'CnnG_G
114 BplI GAGnnnnnCTCnnnnnnnn_nnnnn'
133 HgaI GACGCnnnnn'nnnnn_
136 MwoI GCnn_nnn'nnGC
137 HpyF10VI GCn_nnnnn'nGC
148 Hpy188III TC'nn_GA
150 MboII GAAGAnnnnnnn_n'
163 SapI GCTCTTCn'nnn_
163 EarI CTCTTCn'nnn_
179 BsaJI C'CnnG_G
183 PleI GAGTCnnnn'n_
184 MlyI GAGTCnnnnn'
184 FokI GGATGnnnnnnnnn'nnnn_
186 NlaIV GGn'nCC
188 FokI GGATGnnnnnnnnn'nnnn_
192 FokI GGATGnnnnnnnnn'nnnn_
195 BsmBI CGTCTCn'nnnn_
195 BsmAI GTCTCn'nnnn_
197 BstF5I GGATG_nn'
201 BstF5I GGATG_nn'
205 BstF5I GGATG_nn'
227 DrdI GACnn_nn'nnGTC
260 AlwI GGATCnnnn'n_
268 TspRI _nnCAsTGnn'
289 AlwI GGATCnnnn'n_
294 BstYI r'GATC_y
307 BstF5I GGATG_nn'
314 FokI GGATGnnnnnnnnn'nnnn_
318 TspDTI ATGAAnnnnnnnnn_nn'
341 AlwI GGATCnnnn'n_
342 TspDTI ATGAAnnnnnnnnn_nn'
353 TspGWI ACGGAnnnnnnnnn_nn'
362 BsrFI r'CCGG_y
372 AflIII A'CryG_T
375 BsaAI yAC'GTr
375 PmlI CAC'GTG
376 ApaLI G'TGCA_C
378 Hpy8I GTn'nAC
380 Bme1580I G_kGCm'C
380 BsiHKAI G_wGCw'C
380 Bsp1286I G_dGCh'C
394 AflII C'TTAA_G
394 SmlI C'TyrA_G
406 BceAI ACGGCnnnnnnnnnnnn'nn_
410 MslI CAynn'nnrTG
419 AlwI GGATCnnnn'n_
424 BstYI r'GATC_y
434 HindIII A'AGCT_T
448 HincII GTy'rAC
448 Hpy8I GTn'nAC
453 BsrBI CCG'CTC
462 BslI CCnn_nnn'nnGG
477 BbsI GAAGACnn'nnnn_
482 MboII GAAGAnnnnnnn_n'
485 DrdI GACnn_nn'nnGTC
492 MspA1I CmG'CkG
507 MnlI CCTCnnnnnn_n'
515 BpuEI CTTGAGnnnnnnnnnnnnnn_nn'
530 SmlI C'TyrA_G
534 BslI CCnn_nnn'nnGG
537 BanI G'GyrC_C
539 MnlI CCTCnnnnnn_n'
539 NlaIV GGn'nCC
541 BsaJI C'CnnG_G
545 MboII GAAGAnnnnnnn_n'
558 EarI CTCTTCn'nnn_
558 BsaJI C'CnnG_G
558 StyI C'CwwG_G
561 MnlI CCTCnnnnnn_n'
561 MnlI CCTCnnnnnn_n'
565 SmlI C'TyrA_G
578 Hpy188III TC'nn_GA
583 FauI CCCGCnnnn'nn_
585 BsaJI C'CnnG_G
585 StyI C'CwwG_G
586 BpuEI CTTGAGnnnnnnnnnnnnnn_nn'
589 BslI CCnn_nnn'nnGG
590 BslI CCnn_nnn'nnGG
592 Cac8I GCn'nGC
597 BspHI T'CATG_A
598 Hpy188III TC'nn_GA
600 BclI T'GATC_A
603 SfaNI GCATCnnnnn'nnnn_
627 BslI CCnn_nnn'nnGG
637 MwoI GCnn_nnn'nnGC
638 HpyF10VI GCn_nnnnn'nGC
650 Hin4I GAynnnnnvTCnnnnnnnn_nnnnn'
650 BsaXI ACnnnnnCTCCnnnnnnn_nnn'
652 Hpy8I GTn'nAC
652 MnlI CCTCnnnnnn_n'
657 MmeI TCCrACnnnnnnnnnnnnnnnnnn_nn'
674 FokI GGATGnnnnnnnnn'nnnn_
675 TspRI _nnCAsTGnn'
676 TspDTI ATGAAnnnnnnnnn_nn'
680 BsaXI GGAGnnnnnGTnnnnnnnnn_nnn'
682 Hin4I GAbnnnnnrTCnnnnnnnn_nnnnn'
683 ApoI r'AATT_y
687 BstF5I GGATG_nn'
691 Hpy188III TC'nn_GA
693 MnlI CCTCnnnnnn_n'
705 Cac8I GCn'nGC
713 MwoI GCnn_nnn'nnGC
714 HpyF10VI GCn_nnnnn'nGC
743 MboII GAAGAnnnnnnn_n'
745 Cac8I GCn'nGC
770 ApoI r'AATT_y
770 EcoRI G'AATT_C
779 MboII GAAGAnnnnnnn_n'
783 TspRI _nnCAsTGnn'
783 BmgBI CAC'GTC
784 ApaLI G'TGCA_C
786 Hpy8I GTn'nAC
788 Bme1580I G_kGCm'C
788 BsiHKAI G_wGCw'C
788 Bsp1286I G_dGCh'C
813 TatI w'GTAC_w
814 Hpy8I GTn'nAC
851 SfcI C'TryA_G
872 Cac8I GCn'nGC
879 MslI CAynn'nnrTG
883 SfaNI GCATCnnnnn'nnnn_
893 HphI GGTGAnnnnnnn_n'
908 AclI AA'CG_TT
909 TspDTI ATGAAnnnnnnnnn_nn'
912 HincII GTy'rAC
912 Hpy8I GTn'nAC
940 TspDTI ATGAAnnnnnnnnn_nn'
942 Hpy188III TC'nn_GA
963 BpmI CTGGAGnnnnnnnnnnnnnn_nn'
963 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
985 AlwI GGATCnnnn'n_
999 BsaJI C'CnnG_G
999 StyI C'CwwG_G
1012 TspDTI ATGAAnnnnnnnnn_nn'
1024 MboII GAAGAnnnnnnn_n'
1026 AseI AT'TA_AT
1050 SfcI C'TryA_G
1062 BsaXI ACnnnnnCTCCnnnnnnn_nnn'
1063 Hpy8I GTn'nAC
1092 BsaXI GGAGnnnnnGTnnnnnnnnn_nnn'
1103 BsmI GAATG_Cn'
1113 TspDTI ATGAAnnnnnnnnn_nn'
1131 BstYI r'GATC_y
1133 TspRI _nnCAsTGnn'
1133 BtsI GCAGTG_nn'
1139 AlwI GGATCnnnn'n_
1154 TspDTI ATGAAnnnnnnnnn_nn'
1159 TspRI _nnCAsTGnn'
1173 BseMII CTCAGnnnnnnnn_nn'
1174 BspCNI CTCAGnnnnnnn_nn'
1178 MboII GAAGAnnnnnnn_n'
1195 BaeI ACnnnnGTAyCnnnnnnn_nnnnn'
1203 TspDTI ATGAAnnnnnnnnn_nn'
1207 BsaAI yAC'GTr
1207 SnaBI TAC'GTA
1208 TspDTI ATGAAnnnnnnnnn_nn'
1214 Hpy8I GTn'nAC
1228 BaeI GrTACnnnnGTnnnnnnnnnn_nnnnn'
1238 BanI G'GyrC_C
1240 NlaIV GGn'nCC
1241 MnlI CCTCnnnnnn_n'
1276 BbvI GCAGCnnnnnnnn'nnnn_
1281 BsaHI Gr'CG_yC
1282 ZraI GAC'GTC
1284 AatII G_ACGT'C
1295 MwoI GCnn_nnn'nnGC
1296 HpyF10VI GCn_nnnnn'nGC
1333 Hpy188III TC'nn_GA
1349 BssSI C'ACGA_G
1351 Hpy188III TC'nn_GA
1361 Cac8I GCn'nGC
1363 NspI r_CATG'y
1363 SphI G_CATG'C
1372 BslI CCnn_nnn'nnGG
1372 PflMI CCAn_nnn'nTGG
1388 BbvI GCAGCnnnnnnnn'nnnn_
1395 BanII G_rGCy'C
1395 Bsp1286I G_dGCh'C
1416 BseMII CTCAGnnnnnnnn_nn'
1417 BspCNI CTCAGnnnnnnn_nn'
1424 AlwNI CAG_nnn'CTG
1424 SfaNI GCATCnnnnn'nnnn_
1447 TatI w'GTAC_w
1454 BsrI ACTG_Gn'
1461 TspRI _nnCAsTGnn'
1465 BstF5I GGATG_nn'
1472 FokI GGATGnnnnnnnnn'nnnn_
1484 BsgI GTGCAGnnnnnnnnnnnnnn_nn'
1493 BslI CCnn_nnn'nnGG
1493 PflMI CCAn_nnn'nTGG
1518 SfcI C'TryA_G
1532 BsmI GAATG_Cn'
1612 DraI TTT'AAA
1623 FokI GGATGnnnnnnnnn'nnnn_
1636 BstF5I GGATG_nn'
1653 Hpy8I GTn'nAC
1687 Cac8I GCn'nGC
1695 BstAPI GCAn_nnn'nTGC
1695 MwoI GCnn_nnn'nnGC
1696 HpyF10VI GCn_nnnnn'nGC
1703 MslI CAynn'nnrTG
1727 SspI AAT'ATT
1750 TatI w'GTAC_w
1760 TspRI _nnCAsTGnn'
1761 TspDTI ATGAAnnnnnnnnn_nn'
1762 MnlI CCTCnnnnnn_n'
1784 BseRI GAGGAGnnnnnnnn_nn'
1795 Hpy8I GTn'nAC
1822 BclI T'GATC_A
1844 BseMII CTCAGnnnnnnnn_nn'
1845 BspCNI CTCAGnnnnnnn_nn'
1848 BslI CCnn_nnn'nnGG
1862 TaqII CACCCAnnnnnnnnn_nn'
1869 BsaJI C'CnnG_G
1870 BsaJI C'CnnG_G
1888 Hpy188III TC'nn_GA
1894 MnlI CCTCnnnnnn_n'
1899 BpmI CTGGAGnnnnnnnnnnnnnn_nn'
1899 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
1921 SfaNI GCATCnnnnn'nnnn_
1923 PacI TTA_AT'TAA
1935 EaeI y'GGCC_r
1946 BpuEI CTTGAGnnnnnnnnnnnnnn_nn'
1947 SfaNI GCATCnnnnn'nnnn_
1961 SmlI C'TyrA_G
1975 Bme1580I G_kGCm'C
1975 Bsp1286I G_dGCh'C
1998 AloI GGAnnnnnnGTTCnnnnnnn_nnnnn'
2006 MnlI CCTCnnnnnn_n'
2009 BseMII CTCAGnnnnnnnn_nn'
2010 BspCNI CTCAGnnnnnnn_nn'
2017 Hpy188III TC'nn_GA
2026 BsaJI C'CnnG_G
2026 StyI C'CwwG_G
2030 AloI GAACnnnnnnTCCnnnnnnn_nnnnn'
2044 SspI AAT'ATT
2050 BsgI GTGCAGnnnnnnnnnnnnnn_nn'
2054 TspDTI ATGAAnnnnnnnnn_nn'
2062 NlaIV GGn'nCC
2065 Cac8I GCn'nGC
2069 MnlI CCTCnnnnnn_n'
2076 BslI CCnn_nnn'nnGG
2078 EcoO109I rG'GnC_Cy
2078 PspOMI G'GGCC_C
2080 NlaIV GGn'nCC
2082 ApaI G_GGCC'C
2082 BanII G_rGCy'C
2082 Bme1580I G_kGCm'C
2082 Bsp1286I G_dGCh'C
2085 BsaJI C'CnnG_G
2102 SspI AAT'ATT
2125 ApoI r'AATT_y
2126 HphI GGTGAnnnnnnn_n'
2139 TspDTI ATGAAnnnnnnnnn_nn'
2148 MboII GAAGAnnnnnnn_n'
2172 BsgI GTGCAGnnnnnnnnnnnnnn_nn'
2173 BbvI GCAGCnnnnnnnn'nnnn_
2175 FalI AAGnnnnnCTTnnnnnnnn_nnnnn'
2183 MboII GAAGAnnnnnnn_n'
2186 BstAPI GCAn_nnn'nTGC
2186 MwoI GCnn_nnn'nnGC
2186 AlwNI CAG_nnn'CTG
2187 HpyF10VI GCn_nnnnn'nGC
2195 MboII GAAGAnnnnnnn_n'
2196 PsiI TTA'TAA
2207 FalI AAGnnnnnCTTnnnnnnnn_nnnnn'
2209 BsmI GAATG_Cn'
2214 MboII GAAGAnnnnnnn_n'
2214 BbsI GAAGACnn'nnnn_
2227 PleI GAGTCnnnn'n_
2227 SfcI C'TryA_G
2228 MlyI GAGTCnnnnn'
2231 PstI C_TGCA'G
2237 TatI w'GTAC_w
2239 ScaI AGT'ACT
2241 BbvI GCAGCnnnnnnnn'nnnn_
2247 TatI w'GTAC_w
2255 MslI CAynn'nnrTG
2267 BsmFI GGGACnnnnnnnnnn'nnnn_
2273 TspDTI ATGAAnnnnnnnnn_nn'
2286 BpmI CTGGAGnnnnnnnnnnnnnn_nn'
2286 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
2289 BsaJI C'CnnG_G
2289 StyI C'CwwG_G
2290 BslI CCnn_nnn'nnGG
2301 BslI CCnn_nnn'nnGG
2306 BglII A'GATC_T
2306 BstYI r'GATC_y
2338 MlyI GAGTCnnnnn'
2338 PleI GAGTCnnnn'n_
2353 MnlI CCTCnnnnnn_n'
2356 FauI CCCGCnnnn'nn_
2368 BstF5I GGATG_nn'
2375 FokI GGATGnnnnnnnnn'nnnn_
2387 BseMII CTCAGnnnnnnnn_nn'
2388 BspCNI CTCAGnnnnnnn_nn'
2394 BbsI GAAGACnn'nnnn_
2396 BlpI GC'TnA_GC
2399 MboII GAAGAnnnnnnn_n'
2409 SexAI A'CCwGG_T
2435 MwoI GCnn_nnn'nnGC
2436 HpyF10VI GCn_nnnnn'nGC
2444 MnlI CCTCnnnnnn_n'
2449 MnlI CCTCnnnnnn_n'
2454 BsmAI GTCTCn'nnnn_
2477 MmeI TCCrACnnnnnnnnnnnnnnnnnn_nn'
2480 BbvI GCAGCnnnnnnnn'nnnn_
2492 MnlI CCTCnnnnnn_n'
2507 AlwI GGATCnnnn'n_
2526 BsmAI GTCTCn'nnnn_
2539 Hpy188III TC'nn_GA
2596 BanI G'GyrC_C
2598 BstF5I GGATG_nn'
2598 NlaIV GGn'nCC
2605 FokI GGATGnnnnnnnnn'nnnn_
2622 AccI GT'mk_AC
2623 BstZ17I GTA'TAC
2623 Hpy8I GTn'nAC
2625 AflIII A'CryG_T
2630 Hin4I GAynnnnnvTCnnnnnnnn_nnnnn'
2630 Hin4I GAbnnnnnrTCnnnnnnnn_nnnnn'
2640 BsaHI Gr'CG_yC
2641 ZraI GAC'GTC
2643 AatII G_ACGT'C
2644 AhdI GACnn_n'nnGTC
2660 BplI GAGnnnnnCTCnnnnnnnn_nnnnn'
2662 Hin4I GAynnnnnvTCnnnnnnnn_nnnnn'
2662 Hin4I GAbnnnnnrTCnnnnnnnn_nnnnn'
2692 BplI GAGnnnnnCTCnnnnnnnn_nnnnn'
2699 BslI CCnn_nnn'nnGG
2700 BbvI GCAGCnnnnnnnn'nnnn_
2707 BsrFI r'CCGG_y
2709 BsmI GAATG_Cn'
2712 BsiEI CG_ry'CG
2729 FalI AAGnnnnnCTTnnnnnnnn_nnnnn'
2733 BclI T'GATC_A
2742 SfaNI GCATCnnnnn'nnnn_
2748 BsaWI w'CCGG_w
2748 BspEI T'CCGG_A
2749 Hpy188III TC'nn_GA
2756 BlpI GC'TnA_GC
2757 BstF5I GGATG_nn'
2761 FalI AAGnnnnnCTTnnnnnnnn_nnnnn'
2764 FokI GGATGnnnnnnnnn'nnnn_
2769 BspCNI CTCAGnnnnnnn_nn'
2770 BseMII CTCAGnnnnnnnn_nn'
2782 AarI CACCTGCnnnn'nnnn_
2782 BspMI ACCTGCnnnn'nnnn_
2800 SfaNI GCATCnnnnn'nnnn_
2804 BbsI GAAGACnn'nnnn_
2805 BseYI C'CCAG_C
2809 MboII GAAGAnnnnnnn_n'
2810 TspDTI ATGAAnnnnnnnnn_nn'
2810 AlwI GGATCnnnn'n_
2814 BsaBI GATnn'nnATC
2815 BstF5I GGATG_nn'
2815 BstYI r'GATC_y
2822 FokI GGATGnnnnnnnnn'nnnn_
2869 Hin4I GAynnnnnvTCnnnnnnnn_nnnnn'
2882 BsiHKAI G_wGCw'C
2882 Bsp1286I G_dGCh'C
2901 Hin4I GAbnnnnnrTCnnnnnnnn_nnnnn'
2911 SfcI C'TryA_G
2915 PstI C_TGCA'G
2922 MmeI TCCrACnnnnnnnnnnnnnnnnnn_nn'
2925 BbvI GCAGCnnnnnnnn'nnnn_
2934 BsaJI C'CnnG_G
2934 BtgI C'CryG_G
2940 AccI GT'mk_AC
2941 Hpy8I GTn'nAC
2963 AlwI GGATCnnnn'n_
2974 BseRI GAGGAGnnnnnnnn_nn'
2982 BbvI GCAGCnnnnnnnn'nnnn_
2988 BseYI C'CCAG_C
2992 MnlI CCTCnnnnnn_n'
2995 MnlI CCTCnnnnnn_n'
2995 AlwNI CAG_nnn'CTG
3002 ApaLI G'TGCA_C
3003 MnlI CCTCnnnnnn_n'
3004 Hpy8I GTn'nAC
3006 Bme1580I G_kGCm'C
3006 BsiHKAI G_wGCw'C
3006 Bsp1286I G_dGCh'C
3007 BseMII CTCAGnnnnnnnn_nn'
3008 BspCNI CTCAGnnnnnnn_nn'
3009 MslI CAynn'nnrTG
3011 Tth111I GACn'n_nGTC
3029 BpmI CTGGAGnnnnnnnnnnnnnn_nn'
3029 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
3030 HphI GGTGAnnnnnnn_n'
3033 TspRI _nnCAsTGnn'
3036 BstEII G'GTnAC_C
3050 MboII GAAGAnnnnnnn_n'
3053 MnlI CCTCnnnnnn_n'
3063 EarI CTCTTCn'nnn_
3090 FokI GGATGnnnnnnnnn'nnnn_
3096 BpmI CTGGAGnnnnnnnnnnnnnn_nn'
3096 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
3103 BstF5I GGATG_nn'
3112 SfaNI GCATCnnnnn'nnnn_
3113 BslI CCnn_nnn'nnGG
3120 BstXI CCAn_nnnn'nTGG
3124 BanI G'GyrC_C
3126 NlaIV GGn'nCC
3127 Bme1580I G_kGCm'C
3127 Bsp1286I G_dGCh'C
3130 BtsI GCAGTG_nn'
3131 MmeI TCCrACnnnnnnnnnnnnnnnnnn_nn'
3137 TspRI _nnCAsTGnn'
3140 BseMII CTCAGnnnnnnnn_nn'
3141 BspCNI CTCAGnnnnnnn_nn'
3155 BsiHKAI G_wGCw'C
3155 Bsp1286I G_dGCh'C
3160 AleI CACnn'nnGTG
3160 MslI CAynn'nnrTG
3164 EaeI y'GGCC_r
3177 FokI GGATGnnnnnnnnn'nnnn_
3190 BstF5I GGATG_nn'
3207 NlaIV GGn'nCC
3233 MmeI TCCrACnnnnnnnnnnnnnnnnnn_nn'
3259 XmnI GAAnn'nnTTC
3263 TspDTI ATGAAnnnnnnnnn_nn'
3273 MnlI CCTCnnnnnn_n'
3277 MnlI CCTCnnnnnn_n'
3298 BsrI ACTG_Gn'
3299 NlaIV GGn'nCC
3303 BmrI ACTGGGnnnn_n'
3312 PleI GAGTCnnnn'n_
3312 EcoNI CCTnn'n_nnAGG
3313 MlyI GAGTCnnnnn'
3313 BsaJI C'CnnG_G
3313 StyI C'CwwG_G
3314 BslI CCnn_nnn'nnGG
3341 BstXI CCAn_nnnn'nTGG
3354 Hpy8I GTn'nAC
3372 Eco57I CTGAAGnnnnnnnnnnnnnn_nn'
3372 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
3382 FokI GGATGnnnnnnnnn'nnnn_
3395 BstF5I GGATG_nn'
3450 BslI CCnn_nnn'nnGG
3495 BsaJI C'CnnG_G
3501 AflII C'TTAA_G
3501 SmlI C'TyrA_G
3503 TspDTI ATGAAnnnnnnnnn_nn'
3508 TspDTI ATGAAnnnnnnnnn_nn'
3523 BseYI C'CCAG_C
3526 BsmAI GTCTCn'nnnn_
3549 AlwI GGATCnnnn'n_
3559 TspDTI ATGAAnnnnnnnnn_nn'
3565 FalI AAGnnnnnCTTnnnnnnnn_nnnnn'
3566 SfaNI GCATCnnnnn'nnnn_
3568 Hpy8I GTn'nAC
3573 MboII GAAGAnnnnnnn_n'
3586 SapI GCTCTTCn'nnn_
3586 EarI CTCTTCn'nnn_
3589 Hpy8I GTn'nAC
3592 BtsI GCAGTG_nn'
3597 FalI AAGnnnnnCTTnnnnnnnn_nnnnn'
3599 TspRI _nnCAsTGnn'
3618 SexAI A'CCwGG_T
3628 DraI TTT'AAA
3646 MwoI GCnn_nnn'nnGC
3647 HpyF10VI GCn_nnnnn'nGC
3653 BplI GAGnnnnnCTCnnnnnnnn_nnnnn'
3668 StuI AGG'CCT
3674 MwoI GCnn_nnn'nnGC
3675 HpyF10VI GCn_nnnnn'nGC
3679 MnlI CCTCnnnnnn_n'
3679 Cac8I GCn'nGC
3685 BplI GAGnnnnnCTCnnnnnnnn_nnnnn'
3699 TspGWI ACGGAnnnnnnnnn_nn'
3704 SfaNI GCATCnnnnn'nnnn_
3705 TspDTI ATGAAnnnnnnnnn_nn'
3719 MnlI CCTCnnnnnn_n'
3757 AccI GT'mk_AC
3758 BstZ17I GTA'TAC
3758 Hpy8I GTn'nAC
3764 BsaJI C'CnnG_G
3778 Hpy8I GTn'nAC
3785 TspRI _nnCAsTGnn'
3866 AloI GGAnnnnnnGTTCnnnnnnn_nnnnn'
3866 PpiI GAGnnnnnGTTCnnnnnnn_nnnnn'
3866 BsaXI GGAGnnnnnGTnnnnnnnnn_nnn'
3874 BseRI GAGGAGnnnnnnnn_nn'
3895 MnlI CCTCnnnnnn_n'
3896 BsaXI ACnnnnnCTCCnnnnnnn_nnn'
3898 AloI GAACnnnnnnTCCnnnnnnn_nnnnn'
3898 PpiI GAACnnnnnCTCnnnnnnnn_nnnnn'
3904 XmnI GAAnn'nnTTC
3909 SfcI C'TryA_G
3913 TspDTI ATGAAnnnnnnnnn_nn'
3923 Eco57I CTGAAGnnnnnnnnnnnnnn_nn'
3923 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
3952 TatI w'GTAC_w
3958 SfaNI GCATCnnnnn'nnnn_
3993 BtsI GCAGTG_nn'
4000 TspRI _nnCAsTGnn'
4004 NlaIV GGn'nCC
4012 AleI CACnn'nnGTG
4012 MslI CAynn'nnrTG
4014 BstXI CCAn_nnnn'nTGG
4023 BtsI GCAGTG_nn'
4030 TspRI _nnCAsTGnn'
4083 MwoI GCnn_nnn'nnGC
4084 HpyF10VI GCn_nnnnn'nGC
4088 MnlI CCTCnnnnnn_n'
4088 Cac8I GCn'nGC
4094 NspI r_CATG'y
4109 BsrI ACTG_Gn'
4147 AccI GT'mk_AC
4148 Hpy8I GTn'nAC
4154 MnlI CCTCnnnnnn_n'
4155 SspI AAT'ATT
4168 BseMII CTCAGnnnnnnnn_nn'
4169 BspCNI CTCAGnnnnnnn_nn'
4180 MnlI CCTCnnnnnn_n'
4187 PleI GAGTCnnnn'n_
4188 MlyI GAGTCnnnnn'
4190 EcoO109I rG'GnC_Cy
4190 PpuMI rG'GwC_Cy
4191 NlaIV GGn'nCC
4211 TspDTI ATGAAnnnnnnnnn_nn'
4230 TatI w'GTAC_w
4232 ScaI AGT'ACT
4261 XbaI T'CTAG_A
4262 Hpy188III TC'nn_GA
4279 HindIII A'AGCT_T
4290 MslI CAynn'nnrTG
4291 MnlI CCTCnnnnnn_n'
4302 BsmFI GGGACnnnnnnnnnn'nnnn_
4314 BsgI GTGCAGnnnnnnnnnnnnnn_nn'
4317 PsrI GAACnnnnnnTACnnnnnnn_nnnnn'
4320 SfaNI GCATCnnnnn'nnnn_
4325 BsgI GTGCAGnnnnnnnnnnnnnn_nn'
4328 HphI GGTGAnnnnnnn_n'
4336 Hpy8I GTn'nAC
4345 AflII C'TTAA_G
4345 SmlI C'TyrA_G
4347 AarI CACCTGCnnnn'nnnn_
4347 BspMI ACCTGCnnnn'nnnn_
4348 MwoI GCnn_nnn'nnGC
4349 PsrI GTAnnnnnnGTTCnnnnnnn_nnnnn'
4349 HpyF10VI GCn_nnnnn'nGC
4360 MlyI GAGTCnnnnn'
4360 PleI GAGTCnnnn'n_
4417 BsmAI GTCTCn'nnnn_
4422 TspRI _nnCAsTGnn'
4428 SwaI ATTT'AAAT
4428 DraI TTT'AAA
4429 ApoI r'AATT_y
4446 SfcI C'TryA_G
4473 Hpy8I GTn'nAC
4482 DraI TTT'AAA
4486 MnlI CCTCnnnnnn_n'
4520 BtsI GCAGTG_nn'
4527 TspRI _nnCAsTGnn'
4528 MnlI CCTCnnnnnn_n'
4546 BsrGI T'GTAC_A
4546 TatI w'GTAC_w
4547 Hpy8I GTn'nAC
4572 Hpy188III TC'nn_GA
4631 BceAI ACGGCnnnnnnnnnnnn'nn_
4644 EaeI y'GGCC_r
4653 BslI CCnn_nnn'nnGG
4654 Hpy188III TC'nn_GA
4677 TspRI _nnCAsTGnn'
4677 BsrI ACTG_Gn'
4694 BpmI CTGGAGnnnnnnnnnnnnnn_nn'
4694 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
4706 HincII GTy'rAC
4706 HpaI GTT'AAC
4706 Hpy8I GTn'nAC
4721 HphI GGTGAnnnnnnn_n'
4746 TaqII CACCCAnnnnnnnnn_nn'
4767 ApoI r'AATT_y
4823 Eco57I CTGAAGnnnnnnnnnnnnnn_nn'
4823 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
4840 TspDTI ATGAAnnnnnnnnn_nn'
4848 Bme1580I G_kGCm'C
4848 Bsp1286I G_dGCh'C
4877 Hpy188III TC'nn_GA
4896 PsiI TTA'TAA
4933 TspRI _nnCAsTGnn'
4933 BtsI GCAGTG_nn'
4971 SspI AAT'ATT
5001 SspI AAT'ATT
5024 PleI GAGTCnnnn'n_
5025 MlyI GAGTCnnnnn'
5029 SspI AAT'ATT
5082 DraI TTT'AAA
5092 BstF5I GGATG_nn'
5099 FokI GGATGnnnnnnnnn'nnnn_
5101 AflIII A'CryG_T
5102 BsaAI yAC'GTr
5102 PmlI CAC'GTG
5126 MfeI C'AATT_G
5133 HincII GTy'rAC
5133 Hpy8I GTn'nAC
5146 ApoI r'AATT_y
5146 EcoRI G'AATT_C
5155 MboII GAAGAnnnnnnn_n'
5160 BsrGI T'GTAC_A
5160 TatI w'GTAC_w
5162 Eco57I CTGAAGnnnnnnnnnnnnnn_nn'
5162 Eco57MI CTGrAGnnnnnnnnnnnnnn_nn'
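The position-ordered list above is the enzyme-ordered restriction table turned around, so that each cut position is followed by the enzymes that cut there. A minimal sketch of that inversion is given below, using three enzymes from the table. The function name `by_position` is hypothetical.

```python
from collections import defaultdict


def by_position(enzyme_cuts):
    """Invert {enzyme: [positions]} into a sorted [(position, [enzymes])]."""
    index = defaultdict(list)
    for enzyme, positions in enzyme_cuts.items():
        for position in positions:
            index[position].append(enzyme)
    return sorted(index.items())


# Positions copied from the restriction table for three enzymes.
cuts = {
    "ApoI": [683, 770, 2125, 4429, 4767, 5146],
    "EcoRI": [770, 5146],
    "MfeI": [5126],
}

for position, enzymes in by_position(cuts):
    print(position, ", ".join(enzymes))
```

Positions shared by several enzymes, such as 770 and 5146 for ApoI and EcoRI, appear once with both names, which reflects that every EcoRI site (GAATTC) is also an ApoI site (rAATTy).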
Enzymes that do not cut:
_________________________________________________________
Acc65I, AfeI, AgeI, AscI, AsiSI, AvaI, AvrII, BbvCI, BcgI, BcgI, BciVI, BfrBI
BglI, BmtI, Bpu10I, BsaI, BsiWI, BsrDI, BssHII, BstBI, Bsu36I, ClaI, DraIII, EagI
EciI, EcoRV, FseI, FspI, FspAI, KpnI, MluI, MscI, NaeI, NcoI, NdeI, NgoMIV, NheI
NotI, NruI, NsiI, PciI, PmeI, PshAI, PvuI, PvuII, RsrII, SacII, SalI, SanDI, SbfI
SfiI, SgrAI, SmaI, SpeI, SrfI, TaqII, XcmI, XhoI, XmaI
ORF Finder (Open Reading Frame Finder)
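An open reading frame is a run of codons that begins with a start codon (ATG) and continues in the same frame up to the first stop codon (TAA, TAG or TGA). The sketch below illustrates the idea only and is not the ORF Finder program: it scans the three forward frames, whereas ORF Finder also reports the three reverse-strand frames. The function name `find_orfs`, the `min_codons` threshold and the toy sequence are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for forward-strand ORFs.

    Coordinates are 1-based and inclusive; end includes the stop codon.
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3, frame + 1))
                start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence for illustration only; it is not taken from the report.
print(find_orfs("CCATGAAATGGTAACC"))
```

In the toy sequence the only complete frame is ATG AAA TGG TAA in frame 3, spanning bases 3 to 14.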
Map View (MapViewer, 2009)
Ref.
Electronic Forward PCR results (F-Electronic PCR, 2009)
Electronic Forward PCR results
UniSTS:21855
STS-N21003
Homo sapiens chromosome 4, loci LOC653882 and KIT
Pan troglodytes chromosome 4, locus KIT
Found by e-PCR in sequences from Homo sapiens, Pan troglodytes, Pongo abelii and Schistosoma japonicum.
Primer Information
Forward primer: ATTTTTCCGACAGCACTGAC
Reverse primer: GGACTTGAGGTTTATTCCTGAC
PCR product size: 125 (bp), Homo sapiens
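Electronic PCR looks for the forward primer on one strand and the reverse primer on the opposite strand of a template, and reports the distance between them as the product size. The sketch below shows the idea with exact matching only; the real e-PCR tool also tolerates mismatches and gaps. The two primers are those listed above. The template is a hypothetical placeholder built so that the primers lie 125 bp apart, and the function name `epcr` is an assumption.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def epcr(template, forward, reverse):
    """Return (start, end, size) of the first exact-match product, or None.

    Coordinates are 1-based and inclusive on the template's top strand.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    reverse_site = reverse_complement(reverse)
    hit = template.find(reverse_site, start + len(forward))
    if hit < 0:
        return None
    end = hit + len(reverse_site)
    return start + 1, end, end - start


FORWARD = "ATTTTTCCGACAGCACTGAC"
REVERSE = "GGACTTGAGGTTTATTCCTGAC"

# Hypothetical template: placeholder flanks and an 83-base placeholder
# insert, chosen only so that the product is 20 + 83 + 22 = 125 bp.
template = "GGGG" + FORWARD + "A" * 83 + reverse_complement(REVERSE) + "CCCC"

print(epcr(template, FORWARD, REVERSE))
```

Replacing the placeholder template with a real KIT sequence would give the product coordinates on that sequence, provided both primers match it exactly.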
Homo sapiens
Name: STS-N21003
Also known as: SHGC-67473 sts-N21003
Polymorphism info:
Cross References
Gene GeneID: 3815
Symbol: KIT
Description: v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
Position: 4q11-q12
Gene GeneID: 653882
Symbol: LOC653882
Description: similar to Mast/stem cell growth factor receptor precursor (SCFR) (Proto-oncogene tyrosine-protein kinase Kit) (c-kit) (CD117 antigen)
Position:
UniGene Hs.479754 V-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
Mapping Information
STS-N21003 Sequence Map: Chr 4|Celera
Position: 49661932-49662056 (bp)
STS-N21003 Sequence Map: Chr 4|HuRef
Position: 51512466-51512590 (bp)
STS-N21003 Sequence Map: Chr 4|Celera
Position: 53066732-53066856 (bp)
STS-N21003 Sequence Map: Chr 4
Position: 55564586-55564710 (bp)
sts-N21003 NCBI RH Map: Chr 4
Position: 619 (cR)
Lod score: 4.08
SHGC-67473 TNG Map: Chr 4
Position: 30139 (cR50000)
Lod score: 13.8
Reference Interval: 60
sts-N21003 GeneMap99-GB4 Map: Chr 4
Position: 315.03 (cR3000)
Lod score: 3.00
Reference Interval: D4S1577-D4S1594
Electronic PCR results
RefSeq mRNA (3)
XM_936229.1 150 .. 274 (125 bp)
NM_001093772.1 561 .. 685 (125 bp)
NM_000222.2 561 .. 685 (125 bp)
mRNA (4)
X06182.1 495 .. 619 (125 bp)
BC071593.1 550 .. 674 (125 bp)
EU826594.1 474 .. 598 (125 bp)
AK304031.1 532 .. 656 (125 bp)
Genomic RefSeqs (5 of 9)
NW_922095.1 164 .. 288 (125 bp)
NW_922162.1 2891706 .. 2891830 (125 bp)
AC_000047.1 49661932 .. 49662056 (125 bp)
AC_000047.1 53066732 .. 53066856 (125 bp)
NW_001838913.1 2909984 .. 2910108 (125 bp)
Genomic (5)
X69303.1 236 .. 360 (125 bp)
L04143.1 2364 .. 2488 (125 bp)
U63834.1 46673 .. 46797 (125 bp)
AC006552.7 184116 .. 184240 (125 bp)
AC092545.3 723 .. 847 (125 bp)
Working Draft phase 1 (from GenBank HTGS division) (1)
AC006553.10 100516 .. 100640 (125 bp)
ESTs (5 of 7)
H96865.1 171 .. 295 (125 bp)
N21003.1 174 .. 298 (125 bp)
BE081957.1 194 .. 318 (125 bp)
BF744451.1 212 .. 336 (125 bp)
CN414752.1 381 .. 505 (125 bp)
Whole Genome Shotgun sequences (5 of 7)
AADD01045773.1 17349 .. 17473 (125 bp)
AADC01040747.1 230491 .. 230615 (125 bp)
AADB02189580.1 11 .. 135 (125 bp)
AADB02006720.1 81824 .. 81948 (125 bp)
AADB02006679.1 164 .. 288 (125 bp)
Patent sequences (5 of 44)
AX195908.1 495 .. 619 (125 bp)
AX331941.1 495 .. 619 (125 bp)
AX335913.1 495 .. 619 (125 bp)
AX771405.1 2364 .. 2488 (125 bp)
CQ719901.1 510 .. 634 (125 bp)
Pan troglodytes
Name: STS-N21003
Polymorphism info:
Cross References
Gene GeneID: 461316
Symbol: KIT
Description: v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
Position:
Mapping Information
STS-N21003 Sequence Map: Chr 4
Position: 75900215-75900339 (bp)
Electronic PCR results
RefSeq mRNA (1)
XM_517285.2 777 .. 901 (125 bp)
Genomic RefSeqs (1)
NW_001234062.1 8734044 .. 8734168 (125 bp)
Whole Genome Shotgun sequences (2)
AADA01199678.1 10005 .. 10129 (125 bp)
AACZ02049805.1 58752 .. 58876 (125 bp)
Pongo abelii
Polymorphism info:
Electronic PCR results
ESTs (1)
CR767571.1 524 .. 648 (125 bp)
Schistosoma japonicum
Polymorphism info:
Electronic PCR results
ESTs (1)
CV751530.1 56 .. 180 (125 bp)
High Throughput cDNA sequences (1)
AY914890.1 56 .. 180 (125 bp)
Electronic Rev. PCR results (R-Electronic PCR, 2009)
1 ATTTTTCCGACAGCACTGAC GGACTTGAGGTTTATTCCTGAC 125
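The 125 bp product reported by e-PCR can be reproduced with a plain string search. The sketch below is only an illustration under stated assumptions, not the NCBI e-PCR algorithm: it accepts exact primer matches only, and the names `reverse_complement`, `find_amplicon` and `TEMPLATE` are hypothetical. The template is the 140-base stretch at positions 561 to 700 of NM_000222.2, copied from the sequence listed in the MSA section of this report. On this transcript the primer labelled "reverse" matches the given strand and the "forward" primer matches as its reverse complement, so both orientations are tried.

```python
# Primers as listed under "Primer Information" for STS-N21003.
FORWARD = "ATTTTTCCGACAGCACTGAC"
REVERSE = "GGACTTGAGGTTTATTCCTGAC"

# Positions 561-700 of NM_000222.2, copied from the MSA section.
TEMPLATE = (
    "GGACTTGAGGTTTATTCCTGACCCCAAGGCGGGCATCATGATCAAAAGTGTGAAACGCGCCTACCATCGG"
    "CTCTGTCTGCATTGTTCTGTGGACCAGGAGGGCAAGTCAGTGCTGTCGGAAAAATTCATCCTGAAAGTGA"
)

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_amplicon(template, primer_a, primer_b):
    """Return (start, end, size) of the exact-match product, 1-based inclusive.

    Either primer may match the given strand; the other one must then
    match downstream as its reverse complement. Returns None if no
    product is found.
    """
    for left, right in ((primer_a, primer_b), (primer_b, primer_a)):
        start = template.find(left)
        if start == -1:
            continue
        right_rc = reverse_complement(right)
        hit = template.find(right_rc, start + len(left))
        if hit != -1:
            end = hit + len(right_rc)
            return start + 1, end, end - start
    return None


print(find_amplicon(TEMPLATE, FORWARD, REVERSE))  # (1, 125, 125)
```

Because the template starts at position 561 of the transcript, adding 560 to the returned coordinates gives 561 .. 685, the interval listed for NM_000222.2 in the e-PCR results.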
Vecscreen (Vecscreen, 2009)
No significant similarity found.
GENSCAN Output (Genscan, 2009)
GENSCAN 1.0 Date run: 21-Dec-109 Time: 03:03:23
Sequence /tmp/12_21_09-03:03:23.fasta : 5283 bp : 42.31% C+G : Isochore 1 ( 0 - 43 C+G%)
Parameter matrix: HumanIso.smat
Predicted genes/exons:
Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------
1.01 Sngl + 181 3111 2931 0 0 73 53 2224 0.999 207.50
1.02 PlyA + 3453 3458 6 1.05
Suboptimal exons with probability > 1.000
Exnum Type S .Begin ...End .Len Fr Ph B/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------
NO EXONS FOUND AT GIVEN PROBABILITY CUTOFF
Predicted peptide sequence(s):
>/tmp/12_21_09-03:03:23.fasta|GENSCAN_predicted_peptide_1|976_aa
MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTD
PGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLV
DRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYH
RLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTIKDVSS
SVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSAN
VTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWE
DYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDR
LVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDS
SAFKHNGTVECKAYNDVGKTSAYFNFAFKGNNKEQIHPHTLFTPLLIGFVIVAGMMCIIV
MILTYKYLQKPMYEVQWKVVEEINGNNYVYIDPTQLPYDHKWEFPRNRLSFGKTLGAGAF
GKVVEATAYGLIKSDAAMTVAVKMLKPSAHLTEREALMSELKVLSYLGNHMNIVNLLGAC
TIGGPTLVITEYCCYGDLLNFLRRKRDSFICSKQEDHAEAALYKNLLHSKESSCSDSTNE
YMDMKPGVSYVVPTKADKRRSVRIGSYIERDVTPAIMEDDELALDLEDLLSFSYQVAKGM
AFLASKNCIHRDLAARNILLTHGRITKICDFGLARDIKNDSNYVVKGNARLPVKWMAPES
IFNCVYTFESDVWSYGIFLWELFSLGSSPYPGMPVDSKFYKMIKEGFRMLSPEHAPAEMY
DIMKTCWDADPLKRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSVRINSV
GSTASSSQPLLVHDDV
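The exon coordinates in the GENSCAN table and the length of the predicted peptide can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that the Begin and End coordinates are 1-based and inclusive and that the single exon (Sngl) ends with a stop codon; the helper names are hypothetical.

```python
def exon_length(begin, end):
    """Length in bp of an exon given 1-based inclusive coordinates."""
    return end - begin + 1


def peptide_length(cds_length, includes_stop=True):
    """Number of amino acids encoded by a CDS of the given length."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_length // 3
    # The stop codon is part of the CDS but encodes no residue.
    return codons - 1 if includes_stop else codons


length = exon_length(181, 3111)
print(length)                  # 2931, the Len column of exon 1.01
print(peptide_length(length))  # 976, as in the 976_aa peptide header
```

Under these assumptions the numbers agree with the output above: 2931 bp correspond to 977 codons, that is 976 residues plus one stop codon.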
Blast Result (NT-BLAST, 2009)
Rectangle View Of Blast Result
Slanted View Of Blast Result
Radial view of Blast Result
Force view of Blast Result
Sequences producing significant alignments:
Accession       Max score  Total score  Query coverage  E value  Max ident  Description
NM_000222.2     9585       9585         100%            0.0      100%       Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), transcript variant 1, mRNA
BC071593.1      9542       9542         99%             0.0      99%        Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog, mRNA (cDNA clone MGC:87427 IMAGE:4375615), complete cds
NM_001093772.1  9507       9507         100%            0.0      99%        Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), transcript variant 2, mRNA
XM_517285.2     9291       9291         99%             0.0      99%        PREDICTED: Pan troglodytes v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA
X06182.1        8870       9400         95%             0.0      100%       Human c-kit proto-oncogene mRNA
XM_001089071.1  8665       8665         99%             0.0      96%        PREDICTED: Macaca mulatta v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA
AK304031.1      6276       6276         66%             0.0      99%        Homo sapiens cDNA FLJ54320 complete cds, highly similar to Mast/stem cell growth factor receptor precursor (EC 2.7.10.1)
AB463344.1      5332       5332         56%             0.0      99%        Synthetic construct DNA, clone: pF1KB7018, Homo sapiens KIT gene for v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog, without stop codon, in Flexi system
AB097502.1      4649       4649         56%             0.0      95%        Callithrix jacchus c-kit mRNA, complete cds
NG_007456.1     4224       9664         99%             0.0      100%       Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT) on chromosome 4
AC092545.3      4224       8872         91%             0.0      100%       Homo sapiens BAC clone RP11-586A2 from 4, complete sequence
NM_001166484.1  4150       4866         98%             0.0      86%        Bos taurus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA
NM_001009837.3  4037       4037         75%             0.0      85%        Felis catus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA >gb|S76596.1| c-kit=receptor tyrosine kinase p145c-kit [cats, fetal head, mRNA, 4222 nt]
NM_001163866.1  3746       3746         58%             0.0      89%        Equus caballus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA
D45168.1        3707       3707         72%             0.0      84%        Capra hircus mRNA for caprine c-kit protein, complete cds
L04143.1        3657       9745         98%             0.0      100%       Human c-kit gene
X69316.1        3657       4235         41%             0.0      100%       H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 21
AF055037.1      3607       3607         56%             0.0      89%        Equus caballus tyrosine kinase receptor homolog (KIT) mRNA, partial cds
AJ224643.1      3605       3605         55%             0.0      89%        Equus caballus KIT mRNA (roan coloured Belgian horse; rn(TaqI-) allele)
AJ224644.1      3600       3600         55%             0.0      89%        Equus caballus KIT mRNA (roan coloured Belgian horse)
AJ224645.1      3594       3594         55%             0.0      89%        Equus caballus KIT mRNA (non-roan coloured North Swedish Trotter)
NM_001003181.1  3570       3570         60%             0.0      87%        Canis lupus familiaris v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA >gb|AF044249.1|AF044249 Canis familiaris receptor tyrosine kinase c-kit mRNA, complete cds
AY296484.1      3557       3557         56%             0.0      88%        Canis familiaris tyrosine kinase receptor c-KIT mRNA, complete cds
D16680.1        3557       3557         58%             0.0      88%        Bos primigenius mRNA for c-kit receptor, complete cds
AF099030.1      3552       3552         56%             0.0      88%        Canis familiaris KIT (c-kit) mRNA, complete cds
AF448148.1      3548       3548         56%             0.0      88%        Canis familiaris c-KIT mRNA, complete cds
AY874543.1      3520       3520         54%             0.0      89%        Equus caballus mast/stem cell growth factor receptor (KIT) mRNA, partial cds
AJ224642.1      3513       3610         55%             0.0      92%        Equus caballus KIT mRNA (roan coloured Belgian horse; rn(TaqI-) allele)
AY313776.1      3491       3491         56%             0.0      88%        Canis familiaris tyrosine kinase receptor (c-kit) mRNA, complete cds
FJ938289.1      3398       3398         72%             0.0      83%        Sus scrofa mast/stem cell growth factor receptor (KIT) mRNA, complete cds
NM_001044525.1  3374       3374         72%             0.0      83%        Sus scrofa v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA >dbj|AB250963.1| Sus scrofa KIT mRNA for mast/stem cell growth factor receptor, complete cds
DQ288944.1      3341       3341         57%             0.0      87%        Sus scrofa mast/stem cell growth factor receptor (KIT) mRNA, KIT*RC2 allele, partial cds
DQ288943.1      3330       3330         57%             0.0      86%        Sus scrofa mast/stem cell growth factor receptor (KIT) mRNA, KIT*RC1 allele, partial cds
AY876383.1      3328       3328         57%             0.0      87%        Bubalus bubalis proto-oncogene c-kit receptor-like mRNA, complete sequence
AJ223228.1      3301       3301         57%             0.0      86%        Sus scrofa mRNA for mast/stem cell growth factor receptor (KIT1*0101)
AJ223230.1      3290       3290         57%             0.0      86%        Sus scrofa mRNA for mast/stem cell growth factor receptor (KIT1*0202)
AJ223229.1      3278       3278         57%             0.0      86%        Sus scrofa mRNA for mast/stem cell growth factor receptor (KIT1*0201)
DQ314491.1      3203       3203         57%             0.0      86%        Bubalus bubalis tissue-type testis proto-oncogene c-kit receptor mRNA, complete cds
AY910688.1      2808       3331         52%             0.0      93%        Equus caballus mast/stem cell growth factor receptor (KIT) mRNA, KIT-SB1 allele, partial cds, alternatively spliced
AF263827.1      2743       2743         41%             0.0      89%        Bos taurus C-KIT protein mRNA, partial cds
AF263826.1      2743       2743         41%             0.0      89%        Bos taurus breed Hereford C-KIT protein mRNA, partial cds
EU247828.1      2715       2715         58%             0.0      82%        Rattus norvegicus strain BN Kit oncogene (KIT) mRNA, complete cds
EU247827.1      2700       2700         58%             0.0      82%        Rattus norvegicus strain ACI Kit oncogene (KIT) mRNA, complete cds
NM_022264.1     2700       2700         58%             0.0      82%        Rattus norvegicus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (Kit), mRNA >dbj|D12524.1|RATCKITPO Rattus norvegicus mRNA for c-kit receptor tyrosine kinase
NM_001122733.1  2654       3134         68%             0.0      83%        Mus musculus kit oncogene (Kit), transcript variant 1, mRNA
AK046795.1      2643       2643         57%             0.0      82%        Mus musculus 10 days neonate medulla oblongata cDNA, RIKEN full-length enriched library, clone:B830009P17 product:kit oncogene, full insert sequence
BC075716.1      2641       2641         56%             0.0      83%        Mus musculus kit oncogene, mRNA (cDNA clone MGC:78140 IMAGE:3673641), complete cds
BC026713.1      2641       3121         67%             0.0      83%        Mus musculus kit oncogene, mRNA (cDNA clone IMAGE:5008623)
X62491.1        2634       2634         58%             0.0      82%        R.rattus mRNA for c-kit receptor tyrosine kinase isoform
AK047010.1      2630       2630         57%             0.0      82%        Mus musculus 10 days neonate cerebellum cDNA, RIKEN full-length enriched library, clone:B930010O12 product:kit oncogene, full insert sequence
AJ223231.1      2604       2604         47%             0.0      86%        Sus scrofa mRNA for mast/stem cell growth factor receptor (KIT2*0201)
NM_021099.3     2571       3051         68%             0.0      83%        Mus musculus kit oncogene (Kit), transcript variant 2, mRNA
BC052457.1      2558       3038         67%             0.0      83%        Mus musculus kit oncogene, mRNA (cDNA clone MGC:63313 IMAGE:6830637), complete cds
AY536430.1      2558       2558         56%             0.0      82%        Mus musculus proto-oncogene c-kit (Kit) mRNA, complete cds
AY536431.1      2553       2553         56%             0.0      82%        Mus musculus mutant proto-oncogene c-kit (Kit) mRNA, complete cds
Y00864.1        2547       2949         64%             0.0      85%        Mouse c-kit mRNA
U63834.1        2429       9542         98%             0.0      100%       Human KIT protein and alternatively spliced KIT protein (KIT) gene, complete cds
AY829238.1      2414       2414         41%             0.0      86%        Bubalus bubalis c-kit protooncogene receptor mRNA, partial sequence
EU826594.1      2276       2276         23%             0.0      100%       Homo sapiens soluble KIT variant 1 (KIT) mRNA, complete cds, alternatively spliced
AK145462.1      2152       2598         53%             0.0      84%        Mus musculus cDNA, RIKEN full-length enriched library, clone:I0C0048O08 product:kit oncogene, full insert sequence
XM_001371004.1  2006       2006         51%             0.0      80%        PREDICTED: Monodelphis domestica similar to stem cell factor receptor (LOC100017501), mRNA
AF131209.1      1814       1814         49%             0.0      79%        Trichosurus vulpecula stem cell factor receptor (c-kit) mRNA, partial cds
X03711.1        1568       1568         21%             0.0      92%        Hardy-Zuckermann 4 feline sarcoma virus (H24-FeSV) kit oncogene
XM_001724747.1  1565       1565         16%             0.0      99%        PREDICTED: Homo sapiens similar to KIT protein (LOC652799), mRNA
AY184499.1      1513       1513         17%             0.0      96%        Macaca mulatta v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene-like protein (KIT) gene, partial sequence
DQ450844.1      1507       1507         20%             0.0      92%        Lama pacos mast/stem cell growth factor receptor mRNA, partial cds
AY927690.1      1354       1354         21%             0.0      88%        Cavia porcellus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene-like protein (KIT) mRNA, partial cds, alternatively spliced
AY927691.1      1293       1293         21%             0.0      88%        Cavia porcellus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene-like protein (KIT) mRNA, partial cds, alternatively spliced
AM420315.1      1125       4556         80%             0.0      97%        Equus caballus genomic BAC clone CH241-440E11, containing KIT gene
AF296696.1      1116       1116         18%             0.0      87%        Rattus norvegicus c-kit receptor mRNA, partial cds
AF323756.1      1098       1098         17%             0.0      88%        Mustela vison c-kit tyrosine kinase receptor-like mRNA, partial sequence
XM_936229.1     1096       1096         11%             0.0      99%        PREDICTED: Homo sapiens similar to v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (LOC653882), mRNA
AF296693.1      1061       1061         18%             0.0      86%        Rattus norvegicus c-kit receptor mRNA, partial cds
XM_002192817.1  959        959          24%             0.0      80%        PREDICTED: Taeniopygia guttata v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (LOC100221238), mRNA
NM_204361.1     948        948          24%             0.0      80%        Gallus gallus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA >dbj|D13225.1|CHKCKITP Gallus gallus mRNA for c-kit protein, complete cds
AJ438313.1      876        876          9%              0.0      99%        Homo sapiens partial mRNA for KIT protein
XM_001506782.1  832        832          24%             0.0      78%        PREDICTED: Ornithorhynchus anatinus similar to KIT protein (LOC100075317), mRNA
AY692084.1      737        737          11%             0.0      88%        Canis familiaris c-KIT protein (KIT) mRNA, partial cds
X65997.1        625        1027         19%             9e-175   85%        M.musculus c-kit mRNA for truncated tyrosine-kinase
EF472963.1      619        619          11%             4e-173   85%        Mus musculus strain CD1 KIT oncogene (Kit) mRNA, complete cds
AY914890.1      606        606          6%              3e-169   99%        Schistosoma japonicum unknown mRNA
AC006552.7      525        1448         14%             9e-145   100%       Homo sapiens chromosome 4 clone C0084L10 map 4p16, complete sequence
X69303.1        525        525          5%              9e-145   100%       H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 3
X69302.1        505        505          5%              1e-138   100%       H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 2
AK238625.1      473        473          6%              3e-129   91%        Sus scrofa mRNA, clone:THY010008F12, expressed in thymus
AF296694.1      462        462          15%             7e-126   77%        Rattus norvegicus c-kit receptor mRNA, partial cds
AC115853.8      457        1384         23%             3e-124   96%        Mus musculus chromosome 5, clone RP24-273B9, complete sequence
AM293661.1      392        392          6%              1e-104   87%        Ovis aries partial mRNA for c-kit (KIT gene)
X69309.1        363        363          3%              8e-96    100%       H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 9
X69306.1        357        357          3%              4e-94    100%       H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 6
AF296695.1      348        348          15%             2e-91    74%        Rattus norvegicus c-kit receptor-like protein mRNA, partial sequence
X69305.1        326        326          3%              1e-84    98%        H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 5
XM_001119799.1  320        320          3%              5e-83    98%        PREDICTED: Macaca mulatta similar to Mast/stem cell growth factor receptor precursor (SCFR) (Proto-oncogene tyrosine-protein kinase Kit) (c-kit) (CD117 antigen) (LOC723756), partial mRNA
S67773.1        287        287          2%              5e-73    100%       c-kit=proto-oncogene {promoter} [human, leukocytes, Genomic, 1306 nt]
X69301.1        287        287          2%              5e-73    100%       H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 1
X69311.1        283        283          2%              6e-72    100%       H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 14
AF228311.1      257        257          5%              4e-64    82%        Rattus norvegicus clone pSSCK5 c-kit receptor mRNA, partial cds
X69304.1        257        257          2%              4e-64    100%       H.sapiens KIT proto-oncogene for mast/stem cell growth factor receptor, exon 4
AF228309.1      252        252          5%              2e-62    81%        Rattus norvegicus clone pSSCK3 c-kit receptor mRNA, partial cds
AF228307.1      250        250          5%              6e-62    81%        Rattus norvegicus clone pSSCK1 c-kit receptor mRNA, partial cds
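Maximum identity alone can be misleading when reading this hit list, because a short local match may be nearly identical to the query while covering only a small part of it. The sketch below filters a handful of rows copied from the table above on both query coverage and maximum identity. The thresholds (90% and 99%) and the names `HITS` and `filter_hits` are illustrative assumptions, not settings used in this report.

```python
# (accession, query coverage %, max identity %) copied from the BLAST table.
HITS = [
    ("NM_000222.2", 100, 100),     # Homo sapiens KIT, transcript variant 1
    ("BC071593.1", 99, 99),        # Homo sapiens KIT cDNA clone
    ("XM_517285.2", 99, 99),       # Pan troglodytes KIT
    ("XM_001089071.1", 99, 96),    # Macaca mulatta KIT
    ("NM_001166484.1", 98, 86),    # Bos taurus KIT
    ("NM_204361.1", 24, 80),       # Gallus gallus KIT
    ("AY914890.1", 6, 99),         # Schistosoma japonicum unknown mRNA
]


def filter_hits(hits, min_coverage=90, min_identity=99):
    """Return accessions that pass both the coverage and identity cutoffs."""
    return [
        accession
        for accession, coverage, identity in hits
        if coverage >= min_coverage and identity >= min_identity
    ]


print(filter_hits(HITS))
# ['NM_000222.2', 'BC071593.1', 'XM_517285.2']
```

With these cutoffs the Schistosoma japonicum entry is excluded despite its 99% identity, because the match covers only 6% of the query.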
Fasta Result
MSA (Multiple Sequence Alignment)
>gi|148005048|ref|NM_000222.2| Homo sapiens v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), transcript variant 1, mRNA
TCTGGGGGCTCGGCTTTGCCGCGCTCGCTGCACTTGGGCGAGAGCTGGAACGTGGACCAGAGCTCGGATC
CCATCGCAGCTACCGCGATGAGAGGCGCTCGCGGCGCCTGGGATTTTCTCTGCGTTCTGCTCCTACTGCT
TCGCGTCCAGACAGGCTCTTCTCAACCATCTGTGAGTCCAGGGGAACCGTCTCCACCATCCATCCATCCA
GGAAAATCAGACTTAATAGTCCGCGTGGGCGACGAGATTAGGCTGTTATGCACTGATCCGGGCTTTGTCA
AATGGACTTTTGAGATCCTGGATGAAACGAATGAGAATAAGCAGAATGAATGGATCACGGAAAAGGCAGA
AGCCACCAACACCGGCAAATACACGTGCACCAACAAACACGGCTTAAGCAATTCCATTTATGTGTTTGTT
AGAGATCCTGCCAAGCTTTTCCTTGTTGACCGCTCCTTGTATGGGAAAGAAGACAACGACACGCTGGTCC
GCTGTCCTCTCACAGACCCAGAAGTGACCAATTATTCCCTCAAGGGGTGCCAGGGGAAGCCTCTTCCCAA
GGACTTGAGGTTTATTCCTGACCCCAAGGCGGGCATCATGATCAAAAGTGTGAAACGCGCCTACCATCGG
CTCTGTCTGCATTGTTCTGTGGACCAGGAGGGCAAGTCAGTGCTGTCGGAAAAATTCATCCTGAAAGTGA
GGCCAGCCTTCAAAGCTGTGCCTGTTGTGTCTGTGTCCAAAGCAAGCTATCTTCTTAGGGAAGGGGAAGA
ATTCACAGTGACGTGCACAATAAAAGATGTGTCTAGTTCTGTGTACTCAACGTGGAAAAGAGAAAACAGT
CAGACTAAACTACAGGAGAAATATAATAGCTGGCATCACGGTGACTTCAATTATGAACGTCAGGCAACGT
TGACTATCAGTTCAGCGAGAGTTAATGATTCTGGAGTGTTCATGTGTTATGCCAATAATACTTTTGGATC
AGCAAATGTCACAACAACCTTGGAAGTAGTAGATAAAGGATTCATTAATATCTTCCCCATGATAAACACT
ACAGTATTTGTAAACGATGGAGAAAATGTAGATTTGATTGTTGAATATGAAGCATTCCCCAAACCTGAAC
ACCAGCAGTGGATCTATATGAACAGAACCTTCACTGATAAATGGGAAGATTATCCCAAGTCTGAGAATGA
AAGTAATATCAGATACGTAAGTGAACTTCATCTAACGAGATTAAAAGGCACCGAAGGAGGCACTTACACA
TTCCTAGTGTCCAATTCTGACGTCAATGCTGCCATAGCATTTAATGTTTATGTGAATACAAAACCAGAAA
TCCTGACTTACGACAGGCTCGTGAATGGCATGCTCCAATGTGTGGCAGCAGGATTCCCAGAGCCCACAAT
AGATTGGTATTTTTGTCCAGGAACTGAGCAGAGATGCTCTGCTTCTGTACTGCCAGTGGATGTGCAGACA
CTAAACTCATCTGGGCCACCGTTTGGAAAGCTAGTGGTTCAGAGTTCTATAGATTCTAGTGCATTCAAGC
ACAATGGCACGGTTGAATGTAAGGCTTACAACGATGTGGGCAAGACTTCTGCCTATTTTAACTTTGCATT
TAAAGGTAACAACAAAGAGCAAATCCATCCCCACACCCTGTTCACTCCTTTGCTGATTGGTTTCGTAATC
GTAGCTGGCATGATGTGCATTATTGTGATGATTCTGACCTACAAATATTTACAGAAACCCATGTATGAAG
TACAGTGGAAGGTTGTTGAGGAGATAAATGGAAACAATTATGTTTACATAGACCCAACACAACTTCCTTA
TGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAGTTTTGGGAAAACCCTGGGTGCTGGAGCTTTCGGG
AAGGTTGTTGAGGCAACTGCTTATGGCTTAATTAAGTCAGATGCGGCCATGACTGTCGCTGTAAAGATGC
TCAAGCCGAGTGCCCATTTGACAGAACGGGAAGCCCTCATGTCTGAACTCAAAGTCCTGAGTTACCTTGG
TAATCACATGAATATTGTGAATCTACTTGGAGCCTGCACCATTGGAGGGCCCACCCTGGTCATTACAGAA
TATTGTTGCTATGGTGATCTTTTGAATTTTTTGAGAAGAAAACGTGATTCATTTATTTGTTCAAAGCAGG
AAGATCATGCAGAAGCTGCACTTTATAAGAATCTTCTGCATTCAAAGGAGTCTTCCTGCAGCGATAGTAC
TAATGAGTACATGGACATGAAACCTGGAGTTTCTTATGTTGTCCCAACCAAGGCCGACAAAAGGAGATCT
GTGAGAATAGGCTCATACATAGAAAGAGATGTGACTCCCGCCATCATGGAGGATGACGAGTTGGCCCTAG
ACTTAGAAGACTTGCTGAGCTTTTCTTACCAGGTGGCAAAGGGCATGGCTTTCCTCGCCTCCAAGAATTG
TATTCACAGAGACTTGGCAGCCAGAAATATCCTCCTTACTCATGGTCGGATCACAAAGATTTGTGATTTT
GGTCTAGCCAGAGACATCAAGAATGATTCTAATTATGTGGTTAAAGGAAACGCTCGACTACCTGTGAAGT
GGATGGCACCTGAAAGCATTTTCAACTGTGTATACACGTTTGAAAGTGACGTCTGGTCCTATGGGATTTT
TCTTTGGGAGCTGTTCTCTTTAGGAAGCAGCCCCTATCCTGGAATGCCGGTCGATTCTAAGTTCTACAAG
ATGATCAAGGAAGGCTTCCGGATGCTCAGCCCTGAACACGCACCTGCTGAAATGTATGACATAATGAAGA
CTTGCTGGGATGCAGATCCCCTAAAAAGACCAACATTCAAGCAAATTGTTCAGCTAATTGAGAAGCAGAT
TTCAGAGAGCACCAATCATATTTACTCCAACTTAGCAAACTGCAGCCCCAACCGACAGAAGCCCGTGGTA
GACCATTCTGTGCGGATCAATTCTGTCGGCAGCACCGCTTCCTCCTCCCAGCCTCTGCTTGTGCACGACG
ATGTCTGAGCAGAATCAGTGTTTGGGTCACCCCTCCAGGAATGATCTCTTCTTTTGGCTTCCATGATGGT
TATTTTCTTTTCTTTCAACTTGCATCCAACTCCAGGATAGTGGGCACCCCACTGCAATCCTGTCTTTCTG
AGCACACTTTAGTGGCCGATGATTTTTGTCATCAGCCACCATCCTATTGCAAAGGTTCCAACTGTATATA
TTCCCAATAGCAACGTAGCTTCTACCATGAACAGAAAACATTCTGATTTGGAAAAAGAGAGGGAGGTATG
GACTGGGGGCCAGAGTCCTTTCCAAGGCTTCTCCAATTCTGCCCAAAAATATGGTTGATAGTTTACCTGA
ATAAATGGTAGTAATCACAGTTGGCCTTCAGAACCATCCATAGTAGTATGATGATACAAGATTAGAAGCT
GAAAACCTAAGTCCTTTATGTGGAAAACAGAACATCATTAGAACAAAGGACAGAGTATGAACACCTGGGC
TTAAGAAATCTAGTATTTCATGCTGGGAATGAGACATAGGCCATGAAAAAAATGATCCCCAAGTGTGAAC
AAAAGATGCTCTTCTGTGGACCACTGCATGAGCTTTTATACTACCGACCTGGTTTTTAAATAGAGTTTGC
TATTAGAGCATTGAATTGGAGAGAAGGCCTCCCTAGCCAGCACTTGTATATACGCATCTATAAATTGTCC
GTGTTCATACATTTGAGGGGAAAACACCATAAGGTTTCGTTTCTGTATACAACCCTGGCATTATGTCCAC
TGTGTATAGAAGTAGATTAAGAGCCATATAAGTTTGAAGGAAACAGTTAATACCATTTTTTAAGGAAACA
ATATAACCACAAAGCACAGTTTGAACAAAATCTCCTCTTTTAGCTGATGAACTTATTCTGTAGATTCTGT
GGAACAAGCCTATCAGCTTCAGAATGGCATTGTACTCAATGGATTTGATGCTGTTTGACAAAGTTACTGA
TTCACTGCATGGCTCCCACAGGAGTGGGAAAACACTGCCATCTTAGTTTGGATTCTTATGTAGCAGGAAA
TAAAGTATAGGTTTAGCCTCCTTCGCAGGCATGTCCTGGACACCGGGCCAGTATCTATATATGTGTATGT
ACGTTTGTATGTGTGTAGACAAATATTTGGAGGGGTATTTTTGCCCTGAGTCCAAGAGGGTCCTTTAGTA
CCTGAAAAGTAACTTGGCTTTCATTATTAGTACTGCTCTTGTTTCTTTTCACATAGCTGTCTAGAGTAGC
TTACCAGAAGCTTCCATAGTGGTGCAGAGGAAGTGGAAGGCATCAGTCCCTATGTATTTGCAGTTCACCT
GCACTTAAGGCACTCTGTTATTTAGACTCATCTTACTGTACCTGTTCCTTAGACCTTCCATAATGCTACT
GTCTCACTGAAACATTTAAATTTTACCCTTTAGACTGTAGCCTGGATATTATTCTTGTAGTTTACCTCTT
TAAAAACAAAACAAAACAAAACAAAAAACTCCCCTTCCTCACTGCCCAATATAAAAGGCAAATGTGTACA
TGGCAGAGTTTGTGTGTTGTCTTGAAAGATTCAGGTATGTTGCCTTTATGGTTTCCCCCTTCTACATTTC
TTAGACTACATTTAGAGAACTGTGGCCGTTATCTGGAAGTAACCATTTGCACTGGAGTTCTATGCTCTCG
CACCTTTCCAAAGTTAACAGATTTTGGGGTTGTGTTGTCACCCAAGAGATTGTTGTTTGCCATACTTTGT
CTGAAAAATTCCTTTGTGTTTCTATTGACTTCAATGATAGTAAGAAAAGTGGTTGTTAGTTATAGATGTC
TAGGTACTTCAGGGGCACTTCATTGAGAGTTTTGTCTTGGATATTCTTGAAAGTTTATATTTTTATAATT
TTTTCTTACATCAGATGTTTCTTTGCAGTGGCTTAATGTTTGAAATTATTTTGTGGCTTTTTTTGTAAAT
ATTGAAATGTAGCAATAATGTCTTTTGAATATTCCCAAGCCCATGAGTCCTTGAAAATATTTTTTATATA
TACAGTAACTTTATGTGTAAATACATAAGCGGCGTAAGTTTAAAGGATGTTGGTGTTCCACGTGTTTTAT
TCCTGTATGTTGTCCAATTGTTGACAGTTCTGAAGAATTCTAATAAAATGTACATATATAAATCAAAAAA
AAAAAAAAAA
>gi|262050643|ref|NM_001166484.1| Bos taurus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA
CTTGGCGCTCGCGGCTCTGGGGGCTCGGCTTTGCCGCGCTCCCGGCACTCGGGCGAGAGCCGGAACGTGG
AACAGAGCTCCGGTCCCAGCGCAGCCACCGCGATGAGAGGCGCTCGCGGCGCCTGGGATTTCCTCTTCGT
TCTGCTGCTCCTGCTCCTCGTCCAGACAGGCTCTTCTCAGCCATCTGTGAGTCCAGGGGAACTGTCTCTA
CCATCTATCCACCCAGCAAAATCAGAGTTAATTGTCAGCGTTGGCGACGAGATTAGGCTGTTATGCACCG
ATCCAGGATTTGTCAAGTGGACTTTTGAGATCCTGGGTCAACTGAGTGAGAAAACAAACCCGGAATGGAT
CACCGAGAAAGCAGAGGCCACAAATACAGGCAATTACACGTGCACCAATAAAGGCGGCTTGAGCAGTTCC
ATTTATGTGTTTGTTAGAGACCCCGAGAAGCTTTTCCTGATTGACCTTCCCTTGTACGGGAAAGAAGAAA
ACGACACGCTGGTTCGCTGTCCCCTGACAGACCCCGAGGTGACCAATTACTCTCTTACGGGGTGTGAGGG
GAAACCTCTCCCCAAGGATTTGACATTTGTGGCCGACCCCAAGGCAGGTATCACAATCAGAAATGTGAAG
CGCGAGTACCATCGGCTCTGTCTGCACTGCTCAGCGAATCAGAGGGGCAAGTCCATGCTGTCGAAGAAAT
TCACTCTGAAAGTGCGGGCAGCCATCAAAGCTGTGCCAGTTGTGTCTGTGTCCAAAACCAGCTATCTTCT
CAGGGAAGGAGAGGAATTTGCAGTGACATGCTTGATTAAAGACGTGTCTAGTTCCGTGGACTCTATGTGG
ATAAAGGAAAACAGCCAGCAGACTAAAGCACAGATGAAGAAGAATAGCTGGCATCAGGGTGACTTCAGTT
ATCTCCGTCAGGAAAGGTTGACTATCAGCTCAGCAAGAGTGAATGATTCCGGTGTGTTCATGTGTTACGC
CAATAACACTTTTGGATCAGCAAATGTCACAACAACCTTAGAAGTAGTAGATAAAGGATTCATTAATATC
TTCCCTATGATGAACACAACAGTATTTGTAAATGATGGAGAGAATGTGGATCTGGTTGTTGAATATGAGG
CATATCCCAAACCTGTACACCGACAATGGATATATATGAACAGAACCTCCACTGATAAGTGGGACGATTA
TCCTAAGTCTGAAAATGAAAGTAATATCAGATACGTAAATGAACTTCATCTAACCAGATTAAAAGGGACT
GAAGGAGGCACTTACACATTTCACGTGTCCAATTCTGATGTCAATTCTTCCGTGACATTTAATGTTTACG
TGAACACAAAACCAGAAATCCTGACGCATGACAGGCTGGTGAATGGCATGCTACAGTGCGTGGCCGCAGG
GTTCCCGGAGCCAACCATCGATTGGTACTTTTGTCCAGGAACTGAGCAGAGGTGTTCTGTTCCCGTTGGG
CCAGTGGATGTACAGATCCAAAACTCATCTGTCTCACCATTTGGAAAACTAGTGGTTTATAGCACCATTG
ATGACAGCACATTCAAACACAATGGGACGGTGGAGTGCAGGGCTTATAACGATGTGGGCAAGAGTTCTGC
CTCTTTTAACTTTGCATTTAAAGGTAACAGCAAAGAACAAATCCATGCTCACACCCTGTTCACGCCGTTG
CTGATTGGTTTTGTGATCGCAGCTGGTTTAATGTGTATCTTCGTGATGATTCTTACATACAAATATTTGC
AGAAACCCATGTATGAAGTACAGTGGAAAGTTGTCGAGGAGATAAATGGAAACAATTATGTTTACATAGA
CCCAACACAACTTCCTTATGATCACAAATGGGAGTTTCCCAGGAACAGGCTGAGTTTTGGGAAAACCCTG
GGTGCTGGCGCCTTCGGGAAAGTTGTTGAGGCCACCGCTTATGGCTTAATTAAATCAGATGCAGCCATGA
CTGTTGCTGTCAAGATGCTCAAACCAAGCGCCCATTTAACCGAACGAGAAGCCCTAATGTCTGAACTCAA
AGTCTTGAGTTACCTCGGTAATCATATGAATATTGTGAATCTTCTGGGAGCGTGCACCATTGGAGGGCCC
ACCCTGGTCATTACAGAATATTGTTGCTATGGTGATCTTCTGAATTTTTTGAGAAGAAAACGTGATTCAT
TTATTTGCTCAAAGCAGGAAGATCACGCCGAAGTGGCGCTTTATAAGAACCTTCTTCATTCAAAGGAGTC
TTCCTGCAATGATAGTACTAATGAGTACATGGACATGAAACCTGGAGTTTCTTATGTTGTACCAACCAAG
GCAGACAAGAGGAGATCTGCAAGAATAGGTTCATACATAGAAAGAGACGTGACTCCTGCTATCATGGAAG
ATGATGAGCTGGCCCTGGACCTGGAGGACTTGCTGAGCTTTTCTTACCAGGTGGCAAAAGGCATGGCGTT
CCTTGCCTCAAAGAATTGTATTCATAGAGACTTGGCAGCCAGAAATATCCTCCTTACTCATGGTCGAATC
ACAAAGATTTGTGATTTTGGTCTAGCCAGAGACATCAAGAATGATTCTAATTATGTGGTCAAAGGAAACG
CTCGACTCCCTGTGAAGTGGATGGCACCAGAGAGTATTTTCAACTGTGTATACACATTTGAAAGTGATGT
CTGGTCCTATGGGATTTTTCTGTGGGAGCTGTTCTCTTTAGGAAGCAGCCCCTACCCTGGAATGCCAGTC
GATTCTAAGTTCTACAAGATGATCAAGGAAGGTTTCCGAATGCTCAGCCCCGAGCATGCACCTGCGGAAA
TGTATGACATCATGAAGACCTGCTGGGATGCTGATCCCTTGAAAAGGCCAACATTTAAGCAGATTGTGCA
GCTGATTGAGAAGCAGATCTCAGAGAGCACCAATCATATTTATTCCAACTTAGCAAACTGCAGCCCCCAC
CGGGAGAACCCCGCCGTGGACCATTCTGTGCGCATCAACTCTGTGGGCAGCAGCGCCTCCTCCACGCAGC
CTCTGCTTGTCCACGAAGATGTCTGAAGCAGTCTGCATCTGGGGGTCTCCTGACAACCCCGAACCCCCTC
TCTTTTGGTTTTCACAATGGTTCCTTTGTTTTCCTTCCACTCGAATCCTACTCCAGGGTAGTGGACACCC
TGATGTAATCCTGTCTTTATGAGCACCCTTTAGTGGCTGATGATTTTTGTCATCAGCTACCATCAAGCTG
CCTATGTTCCTAATAGCACACAAGCCCCTGCTAACAAGGAAACTTCAGACTTGGAGAAAGGGAGGGTGGG
ATGGACTAGACACACTGAGTCCTTCCCAAGGTTTCTCCAATTTCTATCTGAAAAATGTAGTTGATAGTTT
TGAGTACATAGCAGTAGTCACCTTCATTCCTCATAATCATCCATGATGGTATGATGATGTAGCAAGACTA
GAAGCTGAAACCTTATCCCTTGATGCGGAAAATAGAATGTTATTCAAAGGCCAGAAGCCTATGAATATCT
GGGCTCACGAAATCTAGTATTTCATGCTGGGAGTAAGACATAGGCCATGGAAAAATGTTCTCCAGGCCTG
AATAAAGACTGCTGGGCATGAGCCTTCGTGCTTCTGACCTGGTTTCTAAATCAAATTCACTAATAGTAGC
TGAATTGGAGAGATGGCCTCCCAGCCCACATTTTGTATATACTCATCTATAAATTGTATGCATTCACATA
TTTGAGGGGGGAAAACCCACAAGGTGTAGTTTCTGAATACAATTCTGGCTTGAGTCTGCTGTGTATAGGA
ATAACTGATGAGCCAGAACAAGTTTGAAGGAAACAGTGCTTTAAAAAAAAAAAATTATTGCCAAGCACAG
TCTTAACAAAAATCTCCTTTTGTAGCCGATGAACTTGCTCTAGAGATTGTCCAGAACAAGCCGACCAGCT
TCAGAATGTCATCGTGTTCAGTGGATTTGATGCCGTTTGAAAAAGTGATTTATTGCATGACTCTGGCAGG
AGTGACAACCAGTGCTATCTTAGTTTGGATTCTTCTGTAGCCAAAAATAAACTTTAGATTCAGCCTCCAT
CACAGGCATGTCCTGGACACCTGGCCAGTATCTATATAAGTGTGTATGTGTGTGTGTGTATGTATGCTTG
TAGACAAAATATTTTGGGAGTAGTCTTGCCTTGAGCCCAGAAGGGTCCTTCAGCACCCAAGAAGTAGCCT
GGCTTTCAGCAGCAGCCCAGCTTTTGTTTCACTTCACCTTGCTGTATAGGGAAGCTTACCAAAAGCCTAC
ATAGAGGAGTGCTTAAGTCCATATTTATTTGCAATCCACCTGCACTTAAGGCACTCTGTTACTTATACTC
AACATACTGTACCTGTTCCTTAGACCTTATTAATGCTACTGTCCTACTGAAACATATTTCAGTTCTATCT
ACCCTTTAGGCTGTAGCCTGGATATTATTCTTAGTGTTACCTGTTTGTTTGTTTGTTTTTTTTTTTTGCC
TCTACCCAATTATAAGAGGTCCATGGGTACATGGCAGGGTTTATGTTGTCTCGCAGGATTCAGGTATGCT
GCCCTCACGGTTTTCCCCTCCTAGGTGTCTTTGCTTTCATTTAGAGAACTGTGGCCATTTATCTGAAAGT
AACCATTTGCACTGGAGTTCTATGCTCTCGCACCTTTCCAAAGGTAACAGGTTGGGGCGTTTTGTTGCTA
ATGTAAGACGTTTTTGCTGTTTGCCACACTTTGTCTGAAAATTTTCTTTACGTTTCTATTGACTTTGGTT
ATAGTAAGAAAAGTGGTTGTTAATTGTAGATGTCTAGGTACTTCAGGGGCACTTCACTGAGAGTTTTGTC
TTGGATATTCTTGAAAGTTTATATTTTTATAATTTTTTTTTTTTTTACATCTGATGTTTCTTTGCAGTAG
CTTTATGTTTGAAATTATTTTGTGGCTTTTTTTTTTGTAAATATTGAAATGTAGCAATAATGTCTTTTGA
ATATTCCCAATCCCATGAGTCCTTGAAAATATTTTTTATATATACAGTAACTTTATGTGTAAATATGTAA
GCGGTGCAGGTTTAAAGGTTGTTTATGTTCTATGTGTTTTTTTCCTGATACGTTGTCCAACTGTTCACAG
TTCTAAAGAATTCTAATAAAAGTGTAAATATAAA
>gi|45383437|ref|NM_204361.1| Gallus gallus v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA
CATGGAGGGCGCGCACCTCGCCTGGGAGCTGGCGCACGCCGTGCTGCTGCTGAGCCTCATCCCCGCAGGT
GGTTCAGTGCCTCATGAAGAATCCTCGCTGGTTGTGAACAAAGGGGAGGAGTTAAGGCTGAAGTGCAATG
AGGAAGGACCCGTGACTTGGAATTTCCAGAACTCAGATCCATCGGCAAAAACAAGGATTTCCAATGAGAA
GGAATGGCACACCAAAAATGCAACAATCAGAGACATAGGCAGATATGAGTGCAAAAGCAAAGGGAGTATT
GTCAACTCTTTCTATGTTTTTGTTAAAGATCCAAATGTGCTCTTCCTTGTTGATTCTCTGATCTATGGGA
AAGAAGACAGTGACATCCTGCTGGTGTGCCCACTAACAGATCCAGATGTTTTAAACTTCACCCTGAGAAA
ATGCGATGGCAAACCTCTCCCCAAAAACATGACATTCATTCCCAATCCACAGAAGGGTATCATTATAAAG
AACGTACAGAGGTCATTCAAGGGCTGCTACCAGTGTTTGGCAAAGCATAATGGAGTTGAGAAAATATCAG
AGCACATTTTCCTGAATGTGAGACCAGTTCACAAAGCTCTTCCAGTCATTACCTTATCCAAAAGCTATGA
GCTTCTCAAAGAAGGGGAAGAATTTGAAGTTACATGCATAATCACGGATGTGGATAGCAGCGTAAAAGCT
AGTTGGATTTCTTACAAAAGTGCGATTGTTACAAGCAAAAGCAGAAATTTGGGTGATTACGGATACGAAA
GGAAATTAACATTGAACATCCGTTCAGTGGGAGTTAATGATTCTGGAGAATTCACATGCCAAGCAGAGAA
CCCTTTCGGAAAAACCAATGCCACGGTAACCTTGAAAGCACTAGCTAAAGGATTTGTCCGTTTGTTTGCA
ACAATGAATACCACAATTGACATAAATGCAGGACAAAATGGAAATTTAACAGTTGAATATGAGGCGTATC
CAAAACCAAAGGAAGAAGTCTGGATGTACATGAACGAAACATTGCAGAATTCATCGGACCATTATGTCAA
GTTCAAGACTGTGGGCAATAACAGTTATACAAGCGAACTTCACCTTACCCGATTAAAAGGAACAGAGGGA
GGCATTTACACATTTTTTGTGTCCAACTCAGATGCCAGCTCCTCTGTAACATTTAATGTCTACGTGAAAA
CAAAACCAGAGATCCTTACCTTGGATATGCTCGGCAATGACATTCTTCAGTGTGTGGCAACTGGATTCCC
AGCCCCTACCATTTACTGGTATTTTTGCCCAGGAACTGAACAGAGGTGTTTAGACTCACCAACGATATCC
CCCATGGATGTGAAAGTCAGTTACACAAACTCATCGGTGCCATCCTTTGAGCGGATCCTGGTCGAGAGCA
CTGTGAACGCCAGCATGTTCAAGAGCACTGGTACCATCTGCTGTGAGGCATCCAGCAATGGTGACAAGAG
CTCTGTTTTCTTTAACTTTGCTATTAAAGAGCAAATCCGTACCCACACCCTTTTCACACCTTTACTAATC
GCATTTGGGGTCGCCGCTGGACTGATGTGCATCATAGTCATGATCCTGGTGTACATATATTTGCAGAAAC
CCAAATATGAAGTCCAGTGGAAAGTTGTTGAAGAAATAAATGGAAACAACTATGTTTACATAGACCCAAC
GCAACTTCCTTATGATCACAAATGGGAGTTTCCTAGAAACCGGTTGAGTTTTGGTAAAACCCTTGGTGCT
GGAGCTTTTGGAAAGGTTGTTGAAGCCACTGCTTATGGCCTATTTAAATCTGATGCCGCTATGACAGTAG
CAGTGAAAATGTTGAAACCAAGCGCCCATTTAACTGAAAGAGAAGCCTTGATGTCAGAGCTAAAAGTTCT
CAGTTACCTTGGTAACCACATTAATATTGTGAATCTACTTGGAGCTTGCACTATTGGAGGGCCCACGCTG
GTCATTACAGAATATTGCTGCTATGGCGATCTCTTAAATTTCCTGAGGCGGAAGCGAGATTCATTCATTT
GTCCAAAGCATGAAGAGCACGCAGAAGCAGCTGTTTATGAGAACCTTTTGCACCAGGCAGAGCCCACAGC
GGATGCTGTCAATGAGTACATGGACATGAAACCAGGAGTGTCATATGCTGTTCCCCCAAAAGCTGATAAA
AAAAGGCCGGTGAAATCTGGATCTTACACCGATCAGGATGTTACCCTTTCCATGTTGGAAGATGACGAAC
TTGCTCTAGATGTTGAAGATCTATTAAGCTTCTCTTACCAGGTGGCAAAGGGCATGAGCTTCCTGGCCTC
TAAAAACTGCATTCATAGGGATCTGGCAGCAAGAAATATTCTTCTCACTCATGGTCGAATAACAAAAATC
TGTGACTTTGGTCTGGCAAGAGATATAAGGAATGACTCAAATTACGTGGTTAAAGGAAATGCTCGTCTCC
CTGTGAAGTGGATGGCACCTGAAAGCATTTTCAACTGCGTTTACACCTTCGAGAGTGATGTCTGGTCTTA
TGGAATATTGCTTTGGGAACTCTTCTCCTTAGGAAGCAGCCCTTACCCAGGGATGCCCGTGGACTCCAAG
TTCTATAAAATGATCAAGGAGGGATATCGGATGTTCAGCCCCGAGTGCTCACCCCCCGAAATGTACGACA
TAATGAAGAGTTGCTGGGATGCCGATCCCTTGCAGAGACCCACATTCAAACAGATCGTGCAGCTGATAGA
ACAGCAGCTTTCCGATAATGCCCCCCGGGTGTATGCAAACTTTTCCACTCCGCCTTCCACTCAAGGCAAT
GCTACAGATCATTCGGTGAGGATTAACTCAGTGGGTAGCAGCGCTTCATCTACTCAGCCCCTCCTGGTAC
GCGAAGATGTTTGAGTGGCATCTGGAAGAAGAGGAGCCCAGCTGTACTTGCTGTTTGTATTGTGTAGGGG
GTGAGAGAGGGAGGACTTCTATTTCTTTTACTTTTACTCACTACTACCACTGTGTAGCTGAAATGTTGTT
GTAATACTGTCCTTTACCACACACTGTTACACATAAACTTAATGTGCTGAGCACGGTTACAGGCATCATT
GCAATGTTAACTGAAGCTGTATATATTTTGCTATATTGTGTGTATTTGCAGTAGAGAGCCAGATCTGAAG
AAGAAGAACAAAAAAACACAAAAGAAACCCTGGGCCAGGGTGTTGAATGAGATGACTACAGATTGAACCA
AATCTGCAGCGGTCCTTCTCTGGATCCATGCTTGGGAAGCGTGTAGTCATTTTCCAGTTAATTTTGTTTT
GTTGAATTTAAGAAATAAACAGATATAAAGCAAAGTGCTTTGCAAACCCCTTTCCATATCCAGCCACACC
TGAGATGCTTTCCCCGGCAGACACTGCAGGTTGGAGGCCAGGAGGGACAGACAGTGCGTCCGTGGGCAAG
AAGTGGAAGCCAAGGGATGGACATGGGCAATACAACAGAGAACTTCCAGACACTCTGGGGCATGAGGCAG
ATCTGGAAGGGAAAGACCTCAGTCTTTATTTCCTACCTGGGCAGTCAGCAGTGCAGAGCAGTCAGTAGGT
TTCTGTGGGTGCCTGCACCACATCCCACTGGCATTCCACCACGGAGCATCCACCGCAGTCAGAATCGCAG
GGATGAGATGCTTGGTTTGCTAAGAGTCATGTCCAAAAGCATTCACCAAAGATAGGTGGGCTCTGCTGAG
CTGGTCAAATGACAGGAGGAGGGTTGCTTACAGTAAATGGAGTGTCAGAGGAGCAGAGGCCGTTTAACGA
GACATCTTATCTTGGCAGGAGCAGAAGCAAAAAATTCTATCCTGAGGGCCCTTTGCTTATCTTATGATAT
GCAGGTGAATATTTCCCACAGGCTGGCTTGGATAATGCTCTCAGGAAAGGGCTCTTCCTCCCACAAGTCA
TTTCATCTTTCTATCTCTGTAGTGGTGCTGAGATCCCAGAGAAGAACTTCACTGGGAAGCATCGAATTTT
TTAATGGAATAGAACAATGTATACCTTCGGTCTTGTTTGGAGGCAGATATGAGGCTGTGGGGTCAGTTTC
TGTTGGGCCTGGGATTGTGCTTAAGTATCTTATAATTCCTTTTCTTCCTGTTGGAAACACTGTTGGTTCT
TTTTTTTTTTTTTTTTTTAACCCTTGTTCTGTCGCCTTTAAATTAGTAAATGCGTCAAGAGGAGCATTTT
TCAGACAGAGAGTACCAGTTAAGTACCTATTTATTTACATTGCACTTAAGCACTCTGTAACTTATGTTTT
ACCAGGATTTTGGTTGAGTTTACATGCATTTTAAATGTGTAACTAACTGCCCAAAATGCAAGACGATTGT
AAATACACACACTGGTAGTACATTTTAGGTAGTGTTGAATACCACACTCTTTATATGCCTGTGCATTTGG
TTTATTGTTACAGGAATTCATCTTTCAGAGTTAGTCGAGAAGTAGCCATTTGCACTGAACTTACTTAGGC
TGTTGCACCTTTCCAAAGTTAGCCAGTTGTTTGGAGGCACTCTTATGCATACTTGTTGTTGAGCATTTTC
AGTTTAATGCTATAAAAGCTCTTTCGTAGTGTGTTGGCTTTCAGAGCAACTGTAGACAAACTAGTTTTAT
AATCATAGATGTCTTGGTACTTTATGGACATTTAGCTGAAAAAAGTTTGGGTTTTTTCTTTCTTAAAATT
TATATTTTTATAATTTGGGGGGGGTTGAAACTTATTTTGCAATGGCTTAGTGTTTGAATGGTTTTATGCA
TTGTTTTTGTAAATATGGAAATGTAGCAATAATGTCCTTTTGACTATTCTCAGTCCTTGAGTCTCAAAAA
TATTTATATATATATATATATGCAAAAACTATATGTATAAATATGTAAGTGTTCAAAAGTTTATGAAACA
CTTATGGATATGTTGCTCGGTTGTAGAATTGCAGTTCAGAAGAGTTCAAATAAATCTGTAAATATGTATT
CAGATGCTACCGATGTGTACTATGTTAAGATGGAATATTTTCATGTAAGTCTTAATAAATTCTTGGGTTG
AAGC
>gi|50423|emb|Y00864.1| Mouse c-kit mRNA
GAGCTCAGAGTCTAGCGCAGCCACCGCGATGAGAGGCGCTCGCGGCGCCTGGGATCTGCTCTGCGTCCTG
TTGGTCCTGCTCCGTGGCCAGACAGCCACGTCTCAGCCATCTGCAAGTCCAGGGGAGCCGTCTCCGCCAT
CCATCCATCCAGCACAATCAGAGTTAATAGTTGAAGCTGGCGACACCCTCAGCCTGACGTGCATTGATCC
CGACTTTGTCAGATGGACTTTCAAGACCTATTTCAATGAAATGGTTGAGAATAAAAAAAATGAATGGATC
CAGGAAAAAGCCGAGGCCACTCGCACGGGCACATACACGTGCAGCAACAGCAATGGCCTCACGAGTTCTA
TTTACGTGTTTGTTAGAGATCCTGCCAAACTTTTCCTGGTTGGCCTTCCCTTGTTTGGCAAAGAAGACAG
CGACGCGCTGGTCCGCTGCCCTCTGACAGACCCACAGGTGTCCAATTATTCCCTCATCGAGTGTGATGGG
AAATCTCTCCCCACGGACCTGACGTTTGTCCCAAACCCCAAGGCTGGCATCACCATCAAAAACGTGAAGC
GCGCCTACCACCGGCTCTGTGTCCGCTGTGCTGCTCAGCGTGACGGTACATGGCTGCATTCTGACAAATT
CACCCTCAAAGTGCGGGAAGCCATCAAGGCTATCCCTGTTGTGTCTGTGCCTGAAACAAGTCACCTCCTT
AAGAAAGGGGACACATTTACGGTGGTGTGCACCATAAAAGATGTGTCTACATCCGTGAACTCCATGTGGC
TAAAGATGAACCCTCAGCCTCAGCACATAGCCCAGGTAAAGCACAATAGCTGGCACCGGGGTGACTTCAA
TTATGAACGCCAGGAGACGCTGACTATCAGCTCGGCAAGAGTTGACGATTCTGGAGTGTTCATGTGTTAT
GCCAATAATACTTTTGGATCAGCAAATGTCACAACAACCTTGAAAGTAGTAGAAAAAGGATTCATCAACA
TCTCCCCTGTGAAGAACACTACAGTATTTGTAACCGATGGAGAAAACGTAGATTTGGTTGTTGAATACGA
GGCCTACCCCAAACCCGAGCACCAGCAGTGGATATATATGAACAGGACCTCGGCTAACAAAGGGAAGGAT
TATGTCAAATCTGATAACAAAAGCAACATCAGATATGTGAACCAACTTCGCCTGACCAGATTAAAAGGCA
CAGAAGGAGGCACTTATACCTTTCTGGTGTCCAACTCTGATGCCAGTGCTTCCGTGACATTCAACGTTTA
CGTGAACACAAAACCAGAAATCCTGACGTACGACAGGCTCATAAATGGCATGCTCCAGTGTGTGGCAGAG
GGATTCCCGGAGCCCACAATAGATTGGTATTTTTGTACAGGAGCAGAGCAAAGGTGTACCACTCCTGTCT
CACCAGTGGACGTACAGGTCCAGAATGTATCTGTGTCACCATTTGGAAAACTGGTGGTTCAGAGTTCCAT
AGACTCCAGCGTCTTCCGGCACAACGGCACGGTGGAGTGTAAGGCCTCCAACGATGTGGGCAAGAGTTCC
GCCTTCTTTAACTTTGCATTTAAAGAGCAAATCCAGGCCCACACTCTGTTCACGCCGCTGCTCATTGGCT
TTGTGGTCGCAGCTGGCGCGATGGGGATCATTGTGATGGTGCTCACCTACAAATATTTGCAGAAACCCAT
GTATGAAGTACAATGGAAGGTTGTCGAGGAGATAAATGGAAACAATTATGTTTACATAGACCCGACGCAA
CTTCCTTATGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAGTTTTGGAAAGACATTGGGAGCTGGTG
CCTTCGGGAAGGTCGTTGAGGCCACTGCATATGGCTTGATTAAGTCGGATGCTGCCATGACAGTTGCCGT
GAAGATGCTCAAACCAAGTGCCCATTTAACAGAAAGAGAGGCCCTAATGTCGGAACTGAAGGTCCTGAGC
TACCTGGGCAATCACATGAATATTGTGAACCTGCTTGGCGCATGCACGGTGGGAGGGCCCACCCTGGTCA
TTACAGAATATTGTTGCTATGGTGATCTTTTGAATTTTTTGAGAAGGAAGCGTGACTCGTTTATTTTCTC
AAAGCAAGAAGAGCAGGCAGAAGCGGCACTTTATAAGAACCTTCTGCACTCAACGGAGCCTTCCTGTGAC
AGTTCAAATGAATATATGGACATGAAGCCTGGCGTTTCCTACGTGGTGCCAACCAAGACAGACAAGAGGA
GATCCGCAAGAATAGACTCGTACATAGAAAGAGACGTGACTCCTGCCATCATGGAAGATGACGAGCTGGC
TCTGGACCTGGATGATTTGCTGAGCTTCTCCTACCAGGTGGCCAAGGCGATGGCGTTCCTCGCCTCCAAG
AATTGTATTCACAGAGATTTGGCAGCCAGGAATATCCTCCTCACTCACGGGCGGATCACAAAGATTTGCG
ATTTCGGGCTAGCCAGAGACATCAGGAATGATTCGAATTACGTGGTCAAAGGAAATGCACGACTGCCCGT
GAAGTGGATGGCACCAGAGAGCATTTTCAGCTGCGTGTACACATTTGAAAGTGATGTCTGGTCCTATGGG
ATTTTCCTCTGGGAGCTCTTCTCCTTAGGAAGCAGCCCCTACCCAGGGATGCCGGTCGACTCCAAGTTCT
ACAAGATGATCAAGGAAGGCTTCCGGATGGTCAGCCCGGAGCACGCGCCTGCCGAAATGTATGACGTCAT
GAAGACTTGCTGGGACGCTGACCCCTTGAAAAGGCCAACATTCAAGCAGGTTGTCCAACTTATTGAGAAG
CAGATCTCGGACAGCACCAAGCACATTTACTCCAACTTGGCAAACTGCAACCCCAACCCAGAGAACCCCG
TGGTGGTGGACCATTCCGTGAGGGTCAACTCGGTGGGCAGCAGCGCCTCTTCTACGCAGCCCCTGCTCGT
GCACGAAGATGCCTGAGCAGAAACCCAAGTCCAACAGGCTTTGCTGCTGTCTCCGACCCCGTCCTTCTGG
CTTCTGTGATGGTTACTTGGTTTCCCTTTGACTTGCATCCTATTCCAGGGTAGCGAGTTCCCCACCCCAC
CTCCAACCCCACTGTGATTCCGCCTTTACGAGCACACACTTTAGTGGCCGATGGCTTTTCTTTTCTGCCA
TCAGCCACCGTCCCGCTGCGAAGGTCCGAACTGTATGTATATATTTTCCCAATAGCAAAGTAGCTCCTAC
TGTAAACAGAAGGACTCCTCCTGCTTTAGAGGAGAAGGGAAGGGCGGGGTGAAACTGGATGCCCAGAGTT
CTTCCCCCAGTGCTCCCCTGAGTGTATTTGAAAAGTATGGCCAGTAGTTCACTTGAAGAATAGATGTAGT
CCCATTTGGCCCTGAGAGCCATCCTTAATGATGGGAGATATATGTAGCAAGACTAGAAAGCCAAGCCCTT
TGTGTAGAAAGCAGACCATTCTTAGAACAGAGGGCAACGGGGCATCGGAAGTCTGGTCACGCTAAGAAGA
CCGAGGCTGAGAAGGAACAAGCCAGGGGAAGCGTGAACAATGATGCTCTGCTCTGGGCTGCCGCTCGGGC
TTCTGTACAACTGACCTGGTTTCTCAGTACTTTGCTGTCTGGGAGTAGCATTGGAATCAAGGCCTCCTCC
CTAGTCAGCCTTTGTATATACTCATCTATACGTTGTATGCGTTCATACTTTGGAGGAGGGATTTCCCACA
AGCTTTCGTTTCTGTGTACAGCCCTGGATTAGACCTACTGTGTGTAAGAATAGATTAAGAGCCATACATA
TTTGAAGGAAACAGTTAAATGTTTTTTGGTTGTGGTTGTTGTTGTTGTTGTTTTAAAGAAAAAAATGTAT
ATGCTAAGCACAATCTTTATAAGACCTCTTAGCCAACATACTTGCTCTGTCTACACTTCGGAACAAGCCT
TCCATGTCAGAGTGGCTTTGCAGGCAGGAGAACTGAGGCTGTTTGAAAAGGTTACCACAGGATGGAGAAA
ACAGTGCAGTCCTGGTTTGGATTCTCACATAGCAGGGAGCACAAGTTAAACTCGACCTTTTATAGGCACG
TCCCGGACATCGGGCCTGTATCTATTCAAGTGTGTATGTGTGTGCATGCGTGTGTCTATGCGTGTGGGTG
AGTTGTGTTGGGAAACTTGCCCTGCATCCCTGAGGGTCCTCCTTCAGGACCCAAGACGTAACAGCTTCTG
TCACCGCTCCTGTCTCTCCAGTTTCCCTGCATGTCGCTCACTGTCTAGAATTTACTCAAAGCCGCCACAG
AGGCTTAGCGGAGTGAAGTGCCGAAGGACCTCTTTATTTGGAGTCCTCCTGTATTTAACAACACTCTTAT
CGTAGACCCATTCATTAGACCTTATGTAATGCTGCCAATCCAGGGAAACAGATTTAAAGTGTACCCCGTA
GACAGGGCCCAGAGGTTCCCTTGTCCTTGCCCTCCCCCACACCACCCATGATCACTGTCCAACATAAAGG
GTTCAGTGTGTTACGTGGTCATGTGTTGTCCTTACAGGATTCAGGTATGTTGCCTTCACGGTTTTCCCCA
CCCCCTCCTGCCCTTTATCCTTTAGGCCGTGTGGCCATGAACCTGGAAGAAGTGATCGTTTCGACTTGAG
TGCTACACTCTTGCACCTTTCCAAAGTAAGCTGGTTTGGAGGTCCTGTGGTCATGTACGAGACTGTCACC
AGTTACCGCGCTCTGTTTGAAACATGTCTTTGTATTCCTAATGACTTCAGTTAGAGTAAGGAGAATAGCT
GTTAATATGGATGTCAGGTACTTAAGGGGCCACACCATTGAGAATTTTGTCTTGGATATTCTTGAAAGTT
TATATTTTTATAATTTTTTTTACATCAGATGTCAGATGTTTCTTTCAGTTGCTTGATGTTTGGAATTATT
ATGTGGCTTTTTTTGTAAATATTGAAATGTAGCAATAATGTCTTTTGAATATTCCTGAGCCCATGAGTCC
CTGAAAATATTTTTTATATATACAGTAACTTTATGTGTAAATAATACGCTGTGCAAGTTTAAACATGTCA
CGTTACATGTGGGTTTTTTCTGATATGTTGTCCAACTGTTGACAGTTCTGAAGAATTC
>gi|114594737|ref|XM_517285.2| PREDICTED: Pan troglodytes v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT), mRNA
ATGAAGGAGCTGACAGCTCAGGCAGTCGCAGTCTGCCTGATTGTCCTAGATCTTCATCATGGCCCCTTTG
CAGGAGCCATAGAGACCCAGGGGGCCAGACGCCGCCGGGAAGAAGCGAGACCCGGGCGGGCGCGAGGGAG
GGGAGGCGAGGAGGGGCGTGGCCGGCGCGCAGAAGGAGGGCGCTGGGAGGAGGGGCTGCTGCTCGGCGCT
CGCGGCTCTGGGGGCTCGGCTTTGCCGCGCTCGCTGCACTTGGGCGAGAGCTGGAACGTGGACCAGAGCT
CGGATCCCATCGCAGCTACCGCGATGAGAGGCGCTCGCGGCGCCTGGGATTTTCTCTGCGTTCTGCTCCT
ACTGCTTCGCGTCCAGACAGGCTCTTCTCAACCATCTGTGAGTCCAGGGGAACCGTCTCCACCATCCATC
CATCCAGGAAAATCAGACTTAATAGTCCGCGTGGGCGACGAGATTAGGCTGTTATGCACTGATCCGGGCT
TTGTCAAATGGACTTTTGAGATCCTGGATGAAACGAATGAGAATAAGCAGAATGAATGGATCACGGAGAA
GGCAGAAGCCACCAACACTGGCAAATACACGTGCACCAACAAACACGGCTTAAGCAATTCCATTTATGTG
TTTGTTAGAGATCCTGCCAAGCTTTTCCTTGTTGACCGCTCCTTGTATGGGAAAGAAGACAACGACACGC
TGGTCCGCTGTCCTCTCACAGACCCAGAAGTGACCAATTATTCCCTCAAAGGGTGCCAGGGGAAGCCTCT
TCCCAAGGACTTGAGGTTTGTTCCTGACCCCAAGGCGGGCATCATGATCAAAAGTGTGAAACGCGCCTAC
CATCGGCTCTGTCTGCATTGTTCTGTGGACCAGGAGGGCAAGTCAGTGCTGTCGGAAAAATTCATCCTGA
AAGTGAGGCCAGCCTTCAAAGCTGTGCCTGTTGTGTCTGTGTCCAAAGCAAGCTATCTTCTTAGGGAAGG
GGAAGAATTCACAGTGACGTGCACAATAAAAGATGTGTCTAGTTCTGTGTACTCAACGTGGAAAAGAGAA
AACAGTCAGACTAAACTACAGGAGAAATATAATAGCTGGCATCACGGTGACTTCAATTATGAACGTCAGG
CAACGTTGACTATCAGTTCAGCGAGAGTTAATGATTCTGGAGTGTTCATGTGTTATGCCAATAATACTTT
TGGATCAGCAAATGTCACAACAACCTTGGAAGTAGTAGATAAAGGATTCATTAATATCTTCCCCATGATA
AACACTACAGTATTTGTAAATGATGGAGAAAATGTAGATTTGATTGTTGAATATGAAGCATTCCCCAAAC
CTGAACACCAGCAGTGGATCTATATGAACAGAACCTTCACTGATAAATGGGAAGATTATCCCAAGTCTGA
GAATGAAAGTAATATCAGATACGTAAGTGAACTTCATCTAACGAGATTAAAAGGCACCGAAGGAGGCACT
TACACATTCCTAGTGTCCAATTCTGACGTCAATGCTTCCATAGCATTTAATGTTTATGTGAATACAAAAC
CAGAAATCCTGACTTACGACAGGCTCGTGAATGGCATGCTCCAATGTGTGGCAGCAGGATTCCCAGAGCC
CACAATAGATTGGTATTTTTGTCCAGGAACTGAGCAGAGATGCTCTGCTTCTGTACTGCCAGTGGATGTG
CAGACACTAAACTCATCTGGGCCACCGTTTGGAAAGCTAGTGGTTCAGAGTTCTATAGATTCTAGTGCAT
TCAAGCACAATGGCACGGTTGAATGTAAGGCTTACAACGATGTGGGCAAGACTTCTGCCTATTTTAACTT
TGCATTTAAAGAGCAAATCCATCCCCACACCCTGTTCACTCCTTTGCTGATTGGTTTCGTAATCGTAGCT
GGCATGATGTGCATTATTGTGATGATTCTGACCTACAAATATTTACAGAAACCCATGTATGAAGTACAGT
GGAAGGTTGTTGAGGAGATAAATGGAAACAATTATGTTTACATAGACCCAACACAACTTCCTTATGATCA
CAAATGGGAGTTTCCCAGAAACAGGCTGAGTTTTGGGAAAACCCTGGGTGCTGGAGCTTTCGGGAAGGTT
GTTGAGGCAACTGCTTATGGCTTAATTAAGTCAGATGCGGCCATGACTGTCGCTGTGAAGATGCTCAAAC
CAAGTGCCCATTTAACAGAACGGGAAGCCCTCATGTCTGAACTCAAAGTCCTGAGTTACCTTGGTAATCA
CATGAATATTGTGAATCTACTTGGAGCCTGCACCATTGGAGGGCCCACCCTGGTCATTACAGAATATTGT
TGCTATGGTGATCTTTTGAATTTTTTGAGAAGAAAACGTGATTCATTTATTTGTTCAAAGCAGGAAGATC
ATGCAGAAGCTGCACTTTATAAGAATCTTCTGCATTCAAAGGAGTCTTCCTGCAGCGATAGTACTAATGA
GTACATGGACATGAAACCTGGAGTTTCTTATGTTGTCCCAACCAAGGCCGACAAAAGGAGATCTGCGAGA
ATAGGCTCATACATAGAAAGAGATGTGACTCCCGCCATCATGGAGGATGACGAGTTGGCCCTAGACTTAG
AAGACTTGCTGAGCTTTTCTTACCAGGTGGCAAAGGGCATGGCTTTCCTCGCCTCCAAGAATTGTATTCA
CAGAGACTTGGCAGCCAGAAATATCCTCCTTACTCATGGTCGGATCACAAAGATTTGTGATTTTGGTCTA
GCCAGAGACATCAAGAATGATTCTAATTATGTGGTTAAAGGAAACGCTCGACTACCTGTGAAGTGGATGG
CACCTGAAAGCATTTTCAACTGTGTATACACGTTTGAAAGCGACGTCTGGTCCTATGGGATTTTTCTTTG
GGAGCTGTTCTCTTTAGGAAGCAGCCCCTATCCTGGAATGCCGGTCGATTCTAAGTTCTACAAGATGATC
AAGGAAGGCTTCCGGATGCTCAGCCCTGAACACGCACCTGCTGAAATGTATGACATAATGAAGACTTGCT
GGGATGCAGATCCCCTAAAAAGACCAACATTCAAGCAAATTGTTCAGCTAATTGAGAAGCAGATTTCAGA
GAGCACCAATCATATTTACTCCAACTTAGCAAACTGCAGCCCCAACCGACAGAAGCCCGTGGTGGACCAT
TCTGTGCGGATCAATTCTGTCGGCAGCACCGCTTCCTCCTCCCAGCCTCTGCTTGTGCACGACGATGTCT
GAGCAGAATCAGTGTTTGGGTCACCCCTCCAGGAATGATCTCTTCTTTTGGCTTCCATGATGGTTATTTT
CTTTTCTTTCAACTTGCATCCAACTCCAGGATAGTGGGCACCCCACTGCAATCCTGTCTTTCTGAGCACA
CTTCAGTGGCCGATGATTTTTGTCATCAGCCACCATCCTATTGCAAAGGTTCCAACTGTATATATTCCCT
ATAGCAACGTAGCTTCTACCATGAACAGAAAACATTCTGATTTGGAAAAAGAGAGGGAGGTATGGCACTG
GGGGCCAGAGTCCTTTCCAAGGCTTCTCCAATTCTGCCCAAAAATATGGTTGATAGTTTACCTGAATAAA
TGGTAGTAATCACAGTTGGCCTTCAGAACCATCCATAGTAGTATGATGATACAAGATTAGAAGCTGAAAA
CCTAAGTCCTTTATGTGGAAAACAGAACATCATTAGAACAAAGGACAGAGTATGAACACCTGGGCTTAAG
AAATCTAGTATTTCATGCTGGGAATGAGACATAGGCCATGAAAAAAGTGATCCCCAAGTGTGAACAAAAG
ATGCTCTTCTGTGGACCACTGCATGAGCTTTTATACTACCGACCTGGTTTTTAAATAGAGTTTGCTATTA
GAGCATTGAATTGGAGAGAAGGCCTCCCTAGCCAGCACTTGTATATACGCATCTATAAATTGTCCGTGTT
CATACATTTGAGGGGAAAACACCATAAGGTTTCGTTTCTGTATACAACCCTGGCATTATGTCCACTGTGT
ATAGAAGTAGATTAAGAGCCATATAAGTTTGAAGGAAACAGTTAATACCATTTTTTAAGGAAACAATATA
ACCACAAAGCACAGTTTGAACAAAATCTCCTCTTTTAGCTGATGAACTTATTCTGTAGATTCTGTGGAAC
AAGCCTATCAGCTTCAGAATGGCATTGTACTCAATGGATTTGATGCTGTTTGACAAAGTTACTGATTCAC
TGCATGGCTCCCACAGGAGTGGGAAAACACTGCTATCTTAGTTTGGATTCTTACGTAGCAGGAAATAAAG
TATAGGTTTAGCCTCCTTCGCAGGCATGTCCTGGACACCGGGCCAGTATCTATATATGTGTATGTACGTT
TGTATGTGTGTAGACAAGTATTTGGAGGGGTATTTTTGCCCTGAGTCCAAGAGGGTCCTTTAGTACCTGA
AAAGTAACTTGGCTTTCATTATTAGTACTGTTTTTGTTTCTTTTCACATAGCTGTCTAGAGTAGCTTACC
AGAAGCTTCCATAGTGGTGCAGAGGAAGTGGAAGGCATCAGTCCCTATGTATTTGCAGTTCACCTGCACT
TAAGGCATTCTGTTATTTAGACTCATCTTACTGTACCTGTTCCTTAGACCTTCCATAATGCTACTGTCTC
ACTGAAACATTTAAATTTTACCCTTTAGACTGTAGCCTGGATATTCTTGTAGTTTACCTCTTTAAAAACA
AAACAAAACAAAAAACTCCCCTTCCTCACTGCCCAATATAAAAGGCAAATGTGTACATGGCAGAGTTTGT
GTGTTGTCTTGAAAGATTCAGGTATGTTGCCTTTATGGTTTCCCCCTTCTACATTTCTTAGACTACATTT
AGAGAACTGTGGCCGTTATCTGGAAGTAACCATTTGCACTGGAGTTCTATGCTCTCACACCTTTCCAAAG
TTAACAGATTTTGGGGTTGTGTTGTCACCCAAGAGATTGTTGTTTGCCATACTTTGTCTGAAAAATTCCT
TTGTGTTTCTATTGACTTCAATGATAGTAAGAAAAGTGGTTGTTAGTTATAGATGTCTAGGTACTTCAGG
GGCACTTCATTGAGAGTTTTGTCTTGGATATTCTTGAAAGTTTATATTTTTATAATTTTTTCTTACATCA
GATGTTTCTTTGCAGTGGCTTAATGTTTGAAATTATTTTGTGGCTTTTTTTGTAAATATTGAAATGTAGC
AATAATGTCTTTTGAATATTCCCAAGCCCACGAGTCCTTGAAAATATTTTTTATATATACAGTAACTTTA
TGTGTAAATACATAAGCGGCGTAAGTTTAAAGGATGTTGGTGTTCCACGTGTTTTATTCCTGTATGTTGT
CCAATTGTTGACAGTTCTGAAGAATTCTAATAAAATGTACATATATAAATCAAG
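The FASTA records listed above can be read programmatically before they are passed to an alignment tool. The following is a minimal sketch in plain Python, assuming the records are available as text with `>` description lines. The function names `parse_fasta` and `gc_content` and the shortened demo records are illustrative assumptions, not part of the original analysis.

```python
def parse_fasta(text):
    """Return {description: sequence} for FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; drop the leading '>'
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines are concatenated in order
            records[header].append(line.upper())
    return {h: "".join(parts) for h, parts in records.items()}


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        return 0.0
    return sum(seq.count(base) for base in "GC") / len(seq)


# Hypothetical toy input: only the first bases of two records above
demo = """>Y00864.1 Mouse c-kit mRNA
GAGCTCAGAG
TCTAGCGCAG
>NM_204361.1 Gallus gallus KIT mRNA
CATGGAGGGC
"""

records = parse_fasta(demo)
for name, seq in records.items():
    print(name, len(seq), round(gc_content(seq), 3))
```

Joining the wrapped sequence lines in the parser means that page breaks or line wraps inside a record do not affect the reported length.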
ClustalW (ClustalW, 2009)
Phylodraw (Phylogenetic Trees)
Phylogram_tree
radial_tree
rectangle_Cladogram_tree
rooted_tree
slanted_Cladogram_tree
unrooted_tree
Pair Distance
Label1 Label2 Distance
gi|148005048|ref|NM_000222.2| gi|114594737|ref|XM_517285.2| 0.011170
gi|148005048|ref|NM_000222.2| gi|262050643|ref|NM_001166484. 0.156400
gi|148005048|ref|NM_000222.2| gi|45383437|ref|NM_204361.1| 0.604780
gi|148005048|ref|NM_000222.2| gi|50423|emb|Y00864.1| 0.230490
gi|114594737|ref|XM_517285.2| gi|262050643|ref|NM_001166484. 0.156010
gi|114594737|ref|XM_517285.2| gi|45383437|ref|NM_204361.1| 0.604390
gi|114594737|ref|XM_517285.2| gi|50423|emb|Y00864.1| 0.230100
gi|262050643|ref|NM_001166484. gi|45383437|ref|NM_204361.1| 0.614480
gi|262050643|ref|NM_001166484. gi|50423|emb|Y00864.1| 0.240190
gi|45383437|ref|NM_204361.1| gi|50423|emb|Y00864.1| 0.623510
Average 0.347152 Variation 0.071059
Root Distance
Label Distance
gi|148005048|ref|NM_000222.2| 0.105880
gi|114594737|ref|XM_517285.2| 0.105490
gi|262050643|ref|NM_001166484. 0.115580
gi|45383437|ref|NM_204361.1| 0.498900
gi|50423|emb|Y00864.1| 0.124610
Average 0.190092 Variation 0.069124
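The "Average" rows of the two distance tables above can be reproduced directly from the listed values. The sketch below is only an arithmetic check, written in Python with the distances copied from the tables. The report does not state how "Variation" is defined, so that figure is not recomputed here.

```python
from statistics import mean

# Values copied from the "Pair Distance" table (10 pairs of 5 sequences)
pair_distances = [0.011170, 0.156400, 0.604780, 0.230490, 0.156010,
                  0.604390, 0.230100, 0.614480, 0.240190, 0.623510]

# Values copied from the "Root Distance" table (one per sequence)
root_distances = [0.105880, 0.105490, 0.115580, 0.498900, 0.124610]

n_sequences = len(root_distances)
expected_pairs = n_sequences * (n_sequences - 1) // 2  # unordered pairs

pair_mean = mean(pair_distances)
root_mean = mean(root_distances)

print(expected_pairs, len(pair_distances))
print(f"{pair_mean:.6f}")
print(f"{root_mean:.6f}")
```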
Genpept Format (Genpept, 2009)
NCBI Reference Sequence: NP_000213.1
v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog isoform 1 precursor [Homo sapiens]
• Comment
• Features
• Sequence
LOCUS NP_000213 976 aa linear PRI 13-DEC-2009
DEFINITION v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
isoform 1 precursor [Homo sapiens].
ACCESSION NP_000213
VERSION NP_000213.1 GI:4557695
DBSOURCE REFSEQ: accession NM_000222.2
KEYWORDS .
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
Catarrhini; Hominidae; Homo.
REFERENCE 1 (residues 1 to 976)
AUTHORS Ganesh,S.K., Zakai,N.A., van Rooij,F.J., Soranzo,N., Smith,A.V.,
Nalls,M.A., Chen,M.H., Kottgen,A., Glazer,N.L., Dehghan,A.,
Kuhnel,B., Aspelund,T., Yang,Q., Tanaka,T., Jaffe,A., Bis,J.C.,
Verwoert,G.C., Teumer,A., Fox,C.S., Guralnik,J.M., Ehret,G.B.,
Rice,K., Felix,J.F., Rendon,A., Eiriksdottir,G., Levy,D.,
Patel,K.V., Boerwinkle,E., Rotter,J.I., Hofman,A., Sambrook,J.G.,
Hernandez,D.G., Zheng,G., Bandinelli,S., Singleton,A.B., Coresh,J.,
Lumley,T., Uitterlinden,A.G., Vangils,J.M., Launer,L.J.,
Cupples,L.A., Oostra,B.A., Zwaginga,J.J., Ouwehand,W.H.,
Thein,S.L., Meisinger,C., Deloukas,P., Nauck,M., Spector,T.D.,
Gieger,C., Gudnason,V., van Duijn,C.M., Psaty,B.M., Ferrucci,L.,
Chakravarti,A., Greinacher,A., O'Donnell,C.J., Witteman,J.C.,
Furth,S., Cushman,M., Harris,T.B. and Lin,J.P.
TITLE Multiple loci influence erythrocyte phenotypes in the CHARGE
Consortium
JOURNAL Nat. Genet. 41 (11), 1191-1198 (2009)
PUBMED 19862010
REFERENCE 2 (residues 1 to 976)
AUTHORS Bodemer,C., Hermine,O., Palmerini,F., Yang,Y., Grandpeix-Guyodo,C.,
Leventhal,P.S., Hadj-Rabia,S., Nasca,L., Georgin-Lavialle,S.,
Cohen-Akenine,A., Launay,J.M., Barete,S., Feger,F., Arock,M.,
Catteau,B., Sans,B., Stalder,J.F., Skowron,F., Thomas,L.,
Lorette,G., Plantin,P., Bordigoni,P., Lortholary,O., Prost,Y.D.,
Moussy,A., Sobol,H. and Dubreuil,P.
TITLE Pediatric Mastocytosis Is a Clonal Disease Associated with D(816)V
and Other Activating c-KIT Mutations
JOURNAL J. Invest. Dermatol. (2009) In press
PUBMED 19865100
REMARK GeneRIF: Observational study of gene-disease association. (HuGE
Navigator)
Publication Status: Available-Online prior to print
REFERENCE 3 (residues 1 to 976)
AUTHORS Akagi,T., Shih,L.Y., Ogawa,S., Gerss,J., Moore,S.R., Schreck,R.,
Kawamata,N., Liang,D.C., Sanada,M., Nannya,Y., Deneberg,S.,
Zachariadis,V., Nordgren,A., Song,J.H., Dugas,M., Lehmann,S. and
Koeffler,H.P.
TITLE Single nucleotide polymorphism genomic arrays analysis of t(8;21)
acute myeloid leukemia cells
JOURNAL Haematologica 94 (9), 1301-1306 (2009)
PUBMED 19734423
REMARK GeneRIF: Observational study of gene-disease association. (HuGE
Navigator)
REFERENCE 4 (residues 1 to 976)
AUTHORS Kwon,J.E., Kang,H.J., Kim,S.H., Lee,Y.C., Hyung,W.J., Noh,S.H.,
Kim,N.K. and Kim,H.
TITLE Pathological characteristics of gastrointestinal stromal tumours
with PDGFRA mutations
JOURNAL Pathology 41 (6), 544-554 (2009)
PUBMED 19900103
REMARK GeneRIF: Observational study of gene-disease association. (HuGE
Navigator)
REFERENCE 5 (residues 1 to 976)
AUTHORS Stec,R., Grala,B., Maczewski,M., Bodnar,L. and Szczylik,C.
TITLE Chromophobe renal cell cancer--review of the literature and
potential methods of treating metastatic disease
JOURNAL J. Exp. Clin. Cancer Res. 28, 134 (2009)
PUBMED 19811659
REMARK GeneRIF: Overexpression of CD117 on cellular membranes of
chromophobe renal cell carcinoma could be a potential target for
kinase inhibitors
Publication Status: Online-Only
REFERENCE 6 (sites)
AUTHORS Lennartsson,J., Wernstedt,C., Engstrom,U., Hellman,U. and
Ronnstrand,L.
TITLE Identification of Tyr900 in the kinase domain of c-Kit as a
Src-dependent phosphorylation site mediating interaction with c-Crk
JOURNAL Exp. Cell Res. 288 (1), 110-118 (2003)
PUBMED 12878163
REFERENCE 7 (sites)
AUTHORS Thommes,K., Lennartsson,J., Carlberg,M. and Ronnstrand,L.
TITLE Identification of Tyr-703 and Tyr-936 as the primary association
sites for Grb2 and Grb7 in the c-Kit/stem cell factor receptor
JOURNAL Biochem. J. 341 (PT 1), 211-216 (1999)
PUBMED 10377264
REFERENCE 8 (sites)
AUTHORS Timokhina,I., Kissel,H., Stella,G. and Besmer,P.
TITLE Kit signaling through PI 3-kinase and Src kinase pathways: an
essential role for Rac1 and JNK activation in mast cell
proliferation
JOURNAL EMBO J. 17 (21), 6250-6262 (1998)
PUBMED 9799234
REFERENCE 9 (sites)
AUTHORS Kozlowski,M., Larose,L., Lee,F., Le,D.M., Rottapel,R. and
Siminovitch,K.A.
TITLE SHP-1 binds and negatively modulates the c-Kit receptor by
interaction with tyrosine 569 in the c-Kit juxtamembrane domain
JOURNAL Mol. Cell. Biol. 18 (4), 2089-2099 (1998)
PUBMED 9528781
REFERENCE 10 (sites)
AUTHORS Price,D.J., Rivnay,B., Fu,Y., Jiang,S., Avraham,S. and Avraham,H.
TITLE Direct association of Csk homologous kinase (CHK) with the
diphosphorylated site Tyr568/570 of the activated c-KIT in
megakaryocytes
JOURNAL J. Biol. Chem. 272 (9), 5915-5920 (1997)
PUBMED 9038210
REFERENCE 11 (sites)
AUTHORS Blume-Jensen,P., Wernstedt,C., Heldin,C.H. and Ronnstrand,L.
TITLE Identification of the major phosphorylation sites for protein
kinase C in kit/stem cell factor receptor in vitro and in intact
cells
JOURNAL J. Biol. Chem. 270 (23), 14192-14200 (1995)
PUBMED 7539802
REFERENCE 12 (sites)
AUTHORS Serve,H., Hsu,Y.C. and Besmer,P.
TITLE Tyrosine residue 719 of the c-kit receptor is essential for binding
of the P85 subunit of phosphatidylinositol (PI) 3-kinase and for
c-kit-associated PI 3-kinase activity in COS-1 cells
JOURNAL J. Biol. Chem. 269 (8), 6026-6030 (1994)
PUBMED 7509796
REFERENCE 13 (residues 1 to 976)
AUTHORS Spritz,R.A., Droetto,S. and Fukushima,Y.
TITLE Deletion of the KIT and PDGFRA genes in a patient with piebaldism
JOURNAL Am. J. Med. Genet. 44 (4), 492-495 (1992)
PUBMED 1279971
REFERENCE 14 (residues 1 to 976)
AUTHORS Giebel,L.B., Strunk,K.M., Holmes,S.A. and Spritz,R.A.
TITLE Organization and nucleotide sequence of the human KIT (mast/stem
cell growth factor receptor) proto-oncogene
JOURNAL Oncogene 7 (11), 2207-2217 (1992)
PUBMED 1279499
REFERENCE 15 (residues 1 to 976)
AUTHORS Andre,C., Martin,E., Cornu,F., Hu,W.X., Wang,X.P. and Galibert,F.
TITLE Genomic organization of the human c-kit gene: evolution of the
receptor tyrosine kinase subclass III
JOURNAL Oncogene 7 (4), 685-691 (1992)
PUBMED 1373482
REFERENCE 16 (residues 1 to 976)
AUTHORS Duronio,V., Welham,M.J., Abraham,S., Dryden,P. and Schrader,J.W.
TITLE p21ras activation via hemopoietin receptors and c-kit requires
tyrosine kinase activity but not tyrosine phosphorylation of p21ras
GTPase-activating protein
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 89 (5), 1587-1591 (1992)
PUBMED 1371879
REFERENCE 17 (residues 1 to 976)
AUTHORS Spritz,R.A., Giebel,L.B. and Holmes,S.A.
TITLE Dominant negative and loss of function mutations of the c-kit
(mast/stem cell growth factor receptor) proto-oncogene in human
piebaldism
JOURNAL Am. J. Hum. Genet. 50 (2), 261-269 (1992)
PUBMED 1370874
COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staff. The
reference sequence was derived from DC376760.1, X06182.1 and
BC071593.1.
Summary: This gene encodes the human homolog of the proto-oncogene
c-kit. C-kit was first identified as the cellular homolog of the
feline sarcoma viral oncogene v-kit. This protein is a type 3
transmembrane receptor for MGF (mast cell growth factor, also known
as stem cell factor). Mutations in this gene are associated with
gastrointestinal stromal tumors, mast cell disease, acute
myelogenous leukemia, and piebaldism. Multiple transcript variants
encoding different isoforms have been found for this gene.
[provided by RefSeq].
Transcript Variant: This variant (1) represents the longer
transcript and encodes the longer isoform (1).
Publication Note: This RefSeq record includes a subset of the
publications that are available for this gene. Please see the
Entrez Gene record to access additional publications.
FEATURES Location/Qualifiers
source 1..976
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="4"
/map="4q11-q12"
Protein 1..976
/product="v-kit Hardy-Zuckerman 4 feline sarcoma viral
oncogene homolog isoform 1 precursor"
/EC_number="2.7.10.1"
/note="mast/stem cell growth factor receptor;
proto-oncogene tyrosine-protein kinase Kit; soluble KIT
variant 1"
/calculated_mol_wt=109734
Region 226..292
/region_name="ig"
/note="Immunoglobulin domain; pfam00047"
/db_xref="CDD:109116"
Region 230..304
/region_name="Ig"
/note="Immunoglobulin domain; cd00096"
/db_xref="CDD:143165"
Region 311..411
/region_name="Ig4_SCFR"
/note="Fourth immunoglobulin (Ig)-like domain of stem cell
factor receptor (SCFR); cd05860"
/db_xref="CDD:143268"
Site order(380..383,386)
/site_type="other"
/note="dimerization interface"
/db_xref="CDD:143268"
Site 545
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[9]
Site 547
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[9]
Region 553..928
/region_name="PTKc_Kit"
/note="Catalytic Domain of the Protein Tyrosine Kinase,
Kit; cd05104"
/db_xref="CDD:133235"
Site 553
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[9]
Site 568
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[8]
/db_xref="HPRD:01287"
Site 570
/site_type="modified"
/experiment="experimental evidence, no additional details
recorded"
/note="dephosphorylation site"
/citation=[9]
/db_xref="HPRD:01475"
Site 570
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[8]
/db_xref="HPRD:01287"
Region 589..892
/region_name="STYKc"
/note="Protein kinase; unclassified specificity;
smart00221"
/db_xref="CDD:128517"
Site order(595..601,603,621,623,671,673,677,792,796..797,799,
810,828..832,841,876)
/site_type="active"
/db_xref="CDD:133235"
Site order(595..596,598..601,603,621,623,671,673,677,796..797,
799,810)
/site_type="other"
/note="ATP binding site"
/db_xref="CDD:133235"
Site 703
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[7]
/db_xref="HPRD:01287"
Site 721
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[10]
Site 730
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[12]
Site 741
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[11]
/db_xref="HPRD:01498"
Site 746
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[11]
/db_xref="HPRD:01498"
Site 747
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[12]
Site order(792,796,828..832,841,876)
/site_type="other"
/note="substrate binding site"
/db_xref="CDD:133235"
Site 809..830
/site_type="other"
/note="activation loop (A-loop)"
/db_xref="CDD:133235"
Site 821
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[11]
/db_xref="HPRD:01498"
Site 900
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[6]
/db_xref="HPRD:01819"
Site 936
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[7]
/db_xref="HPRD:01287"
Site 959
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/citation=[11]
/db_xref="HPRD:01498"
CDS 1..976
/gene="KIT"
/gene_synonym="C-Kit; CD117; PBT; SCFR"
/coded_by="NM_000222.2:88..3018"
/note="isoform 1 precursor is encoded by transcript
variant 1"
/db_xref="CCDS:CCDS3496.1"
/db_xref="GeneID:3815"
/db_xref="HGNC:6342"
/db_xref="HPRD:01287"
/db_xref="MIM:164920"
ORIGIN
1 mrgargawdf lcvlllllrv qtgssqpsvs pgepsppsih pgksdlivrv gdeirllctd
61 pgfvkwtfei ldetnenkqn ewitekaeat ntgkytctnk hglsnsiyvf vrdpaklflv
121 drslygkedn dtlvrcpltd pevtnyslkg cqgkplpkdl rfipdpkagi miksvkrayh
181 rlclhcsvdq egksvlsekf ilkvrpafka vpvvsvskas yllregeeft vtctikdvss
241 svystwkren sqtklqekyn swhhgdfnye rqatltissa rvndsgvfmc yanntfgsan
301 vtttlevvdk gfinifpmin ttvfvndgen vdliveyeaf pkpehqqwiy mnrtftdkwe
361 dypksenesn iryvselhlt rlkgteggty tflvsnsdvn aaiafnvyvn tkpeiltydr
421 lvngmlqcva agfpeptidw yfcpgteqrc sasvlpvdvq tlnssgppfg klvvqssids
481 safkhngtve ckayndvgkt sayfnfafkg nnkeqihpht lftplligfv ivagmmciiv
541 miltykylqk pmyevqwkvv eeingnnyvy idptqlpydh kwefprnrls fgktlgagaf
601 gkvveatayg liksdaamtv avkmlkpsah lterealmse lkvlsylgnh mnivnllgac
661 tiggptlvit eyccygdlln flrrkrdsfi cskqedhaea alyknllhsk esscsdstne
721 ymdmkpgvsy vvptkadkrr svrigsyier dvtpaimedd elaldledll sfsyqvakgm
781 aflaskncih rdlaarnill thgritkicd fglardiknd snyvvkgnar lpvkwmapes
841 ifncvytfes dvwsygiflw elfslgsspy pgmpvdskfy kmikegfrml spehapaemy
901 dimktcwdad plkrptfkqi vqliekqise stnhiysnla ncspnrqkpv vdhsvrinsv
961 gstasssqpl lvhddv
//
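The feature table of the GenPept record above can be cross-checked against its own ORIGIN block. The sketch below, in Python, looks up a few of the annotated phosphorylation sites in residues 541 to 600 (copied from the ORIGIN lines) and derives the protein length from the `coded_by` span. The helper name `residue_at` is illustrative, and the assumption that the final codon of the span is a stop codon is stated in the code.

```python
# Residues 541-600 of NP_000213.1, copied from the ORIGIN block above
SEGMENT_START = 541
segment = ("miltykylqk" "pmyevqwkvv" "eeingnnyvy"
           "idptqlpydh" "kwefprnrls" "fgktlgagaf")


def residue_at(position):
    """Return the residue at a 1-based protein position within the segment."""
    index = position - SEGMENT_START
    if not 0 <= index < len(segment):
        raise ValueError("position outside the copied segment")
    return segment[index].upper()


# Site positions taken from the FEATURES table
sites = [545, 547, 553, 568, 570]
for pos in sites:
    print(pos, residue_at(pos))

# coded_by="NM_000222.2:88..3018" is an inclusive nucleotide span
cds_start, cds_end = 88, 3018
cds_length = cds_end - cds_start + 1
codons = cds_length // 3
# Assumption: the last codon is the stop codon and encodes no residue
encoded_residues = codons - 1
print(cds_length, encoded_residues)
```

All five positions resolve to tyrosine, and the span length agrees with the 976 aa stated on the LOCUS line.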
Fasta Format of Protein (Protein-Fasta, 2009)
>gi|4557695|ref|NP_000213.1| v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog isoform 1 precursor [Homo sapiens]
MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEI
LDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTD
PEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKA
VPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSA
RVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIY
MNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDR
LVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVE
CKAYNDVGKTSAYFNFAFKGNNKEQIHPHTLFTPLLIGFVIVAGMMCIIVMILTYKYLQKPMYEVQWKVV
EEINGNNYVYIDPTQLPYDHKWEFPRNRLSFGKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKMLKPSAH
LTEREALMSELKVLSYLGNHMNIVNLLGACTIGGPTLVITEYCCYGDLLNFLRRKRDSFICSKQEDHAEA
ALYKNLLHSKESSCSDSTNEYMDMKPGVSYVVPTKADKRRSVRIGSYIERDVTPAIMEDDELALDLEDLL
SFSYQVAKGMAFLASKNCIHRDLAARNILLTHGRITKICDFGLARDIKNDSNYVVKGNARLPVKWMAPES
IFNCVYTFESDVWSYGIFLWELFSLGSSPYPGMPVDSKFYKMIKEGFRMLSPEHAPAEMYDIMKTCWDAD
PLKRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSVRINSVGSTASSSQPLLVHDDV
Amino-Acid composition
Protein: gi|4557695|ref|NP_000213.1| v-kit Hardy-Zuckerman 4 feline sarcoma viral
oncogene homolog isoform 1 precursor [Homo sapiens]
Length = 976 amino acids
Molecular Weight = 109858.88 Daltons
Amino Acid Number Mol%
Ala A 55 5.64
Cys C 24 2.46
Asp D 53 5.43
Glu E 58 5.94
Phe F 43 4.41
Gly G 55 5.64
His H 22 2.25
Ile I 52 5.33
Lys K 66 6.76
Leu L 83 8.50
Met M 24 2.46
Asn N 54 5.53
Pro P 50 5.12
Gln Q 25 2.56
Arg R 40 4.10
Ser S 79 8.09
Thr T 59 6.05
Val V 78 7.99
Trp W 14 1.43
Tyr Y 42 4.30
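The Mol% column above is each residue count divided by the sequence length, expressed as a percentage. The sketch below, in Python, recomputes it from the counts in the table. The `composition` helper is an illustrative addition showing how such counts could be obtained from a one-letter sequence.

```python
from collections import Counter

# Residue counts as listed in the composition table above
counts = {"A": 55, "C": 24, "D": 53, "E": 58, "F": 43, "G": 55, "H": 22,
          "I": 52, "K": 66, "L": 83, "M": 24, "N": 54, "P": 50, "Q": 25,
          "R": 40, "S": 79, "T": 59, "V": 78, "W": 14, "Y": 42}

# The counts should add up to the stated length of the protein
total = sum(counts.values())

# Mol% = 100 * count / total, rounded to two decimals as in the table
mol_percent = {aa: round(100 * n / total, 2) for aa, n in counts.items()}


def composition(seq):
    """Count residues in a one-letter protein sequence."""
    return Counter(seq.upper())


print(total)
print(mol_percent["A"], mol_percent["L"], mol_percent["W"])
print(composition("MRGARG"))
```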
ProtParam and ProtScale (Protparam, 2009)
User-provided sequence:
10 20 30 40 50 60
GIREFNPVKI THARDYZUCK ERMANFELIN ESARCOMAVI RALONCOGEN EHOMOLOGIS
70 80 90 100 110 120
OFORMPRECU RSORHOMOSA PIENSMRGAR GAWDFLCVLL LLLRVQTGSS QPSVSPGEPS
130 140 150 160 170 180
PPSIHPGKSD LIVRVGDEIR LLCTDPGFVK WTFEILDETN ENKQNEWITE KAEATNTGKY
190 200 210 220 230 240
TCTNKHGLSN SIYVFVRDPA KLFLVDRSLY GKEDNDTLVR CPLTDPEVTN YSLKGCQGKP
250 260 270 280 290 300
LPKDLRFIPD PKAGIMIKSV KRAYHRLCLH CSVDQEGKSV LSEKFILKVR PAFKAVPVVS
310 320 330 340 350 360
VSKASYLLRE GEEFTVTCTI KDVSSSVYST WKRENSQTKL QEKYNSWHHG DFNYERQATL
370 380 390 400 410 420
TISSARVNDS GVFMCYANNT FGSANVTTTL EVVDKGFINI FPMINTTVFV NDGENVDLIV
430 440 450 460 470 480
EYEAFPKPEH QQWIYMNRTF TDKWEDYPKS ENESNIRYVS ELHLTRLKGT EGGTYTFLVS
490 500 510 520 530 540
NSDVNAAIAF NVYVNTKPEI LTYDRLVNGM LQCVAAGFPE PTIDWYFCPG TEQRCSASVL
550 560 570 580 590 600
PVDVQTLNSS GPPFGKLVVQ SSIDSSAFKH NGTVECKAYN DVGKTSAYFN FAFKGNNKEQ
610 620 630 640 650 660
IHPHTLFTPL LIGFVIVAGM MCIIVMILTY KYLQKPMYEV QWKVVEEING NNYVYIDPTQ
670 680 690 700 710 720
LPYDHKWEFP RNRLSFGKTL GAGAFGKVVE ATAYGLIKSD AAMTVAVKML KPSAHLTERE
730 740 750 760 770 780
ALMSELKVLS YLGNHMNIVN LLGACTIGGP TLVITEYCCY GDLLNFLRRK RDSFICSKQE
790 800 810 820 830 840
DHAEAALYKN LLHSKESSCS DSTNEYMDMK PGVSYVVPTK ADKRRSVRIG SYIERDVTPA
850 860 870 880 890 900
IMEDDELALD LEDLLSFSYQ VAKGMAFLAS KNCIHRDLAA RNILLTHGRI TKICDFGLAR
910 920 930 940 950 960
DIKNDSNYVV KGNARLPVKW MAPESIFNCV YTFESDVWSY GIFLWELFSL GSSPYPGMPV
970 980 990 1000 1010 1020
DSKFYKMIKE GFRMLSPEHA PAEMYDIMKT CWDADPLKRP TFKQIVQLIE KQISESTNHI
1030 1040 1050 1060
YSNLANCSPN RQKPVVDHSV RINSVGSTAS SSQPLLVHDD V
Number of amino acids: 1061
Molecular weight: 118214.8
Theoretical pI: 6.83
Amino acid composition:
Ala (A) 61 5.7%
Arg (R) 49 4.6%
Asn (N) 60 5.7%
Asp (D) 54 5.1%
Cys (C) 28 2.6%
Gln (Q) 25 2.4%
Glu (E) 66 6.2%
Gly (G) 58 5.5%
His (H) 25 2.4%
Ile (I) 58 5.5%
Leu (L) 86 8.1%
Lys (K) 68 6.4%
Met (M) 29 2.7%
Phe (F) 46 4.3%
Pro (P) 53 5.0%
Ser (S) 84 7.9%
Thr (T) 60 5.7%
Trp (W) 14 1.3%
Tyr (Y) 43 4.1%
Val (V) 80 7.5%
Pyl (O) 11 1.0%
Sec (U) 2 0.2%
(B) 0 0.0%
(Z) 1 0.1%
(X) 0 0.0%
Total number of negatively charged residues (Asp + Glu): 120
Total number of positively charged residues (Arg + Lys): 117
Atom composition:
As there is at least one ambiguous position (B, Z or X) in the sequence
considered, the atomic composition cannot be computed.
Extinction coefficients:
Extinction coefficients are in units of M^-1 cm^-1, at 280 nm measured in
water.
Ext. coefficient 142820
Abs 0.1% (=1 g/l) 1.208, assuming ALL Cys residues appear as half
cystines
Ext. coefficient 141070
Abs 0.1% (=1 g/l) 1.193, assuming NO Cys residues appear as half cystines
Estimated half-life:
The N-terminal of the sequence considered is G (Gly).
The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (Escherichia coli, in vivo).
Instability index:
The instability index (II) is computed to be 39.21
This classifies the protein as stable.
Aliphatic index: 80.55
Grand average of hydropathicity (GRAVY): -0.269
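The ProtParam run above reports 1061 residues, 85 more than the 976-residue record, and the first 85 letters of the submitted sequence (GIREFNPVKI THARDYZUCK ...) spell out the FASTA description line, so the header appears to have been read as sequence. The following minimal Python sketch, under that assumption, drops `>` header lines before computing GRAVY and the aliphatic index. The Kyte & Doolittle values are those listed in this report; the aliphatic index is taken as X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)) with X in mole percent; the function names and the short example record are hypothetical.

```python
# Kyte & Doolittle hydropathy values, as listed in the ProtScale output of this report
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def read_fasta_sequence(text):
    """Join the sequence lines of a single-record FASTA string, skipping '>' header lines."""
    lines = [ln.strip() for ln in text.splitlines()]
    return "".join(ln for ln in lines if ln and not ln.startswith(">")).upper()

def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value over all residues."""
    return sum(KD[aa] for aa in seq) / len(seq)

def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))

# Hypothetical mini-record: a header line followed by the first 20 residues of the precursor
record = ">example header\nMRGARGAWDF\nLCVLLLLLRV\n"
seq = read_fasta_sequence(record)
print(len(seq), round(gravy(seq), 3), round(aliphatic_index(seq), 2))
```

With the counts reported above (Ala 61, Val 80, Ile 58, Leu 86 out of 1061 residues), the same formula gives 80.55, which matches the reported aliphatic index.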
ProtScale (ProtScale, 2009)
User-provided sequence:
10 20 30 40 50 60
MRGARGAWDF LCVLLLLLRV QTGSSQPSVS PGEPSPPSIH PGKSDLIVRV GDEIRLLCTD
70 80 90 100 110 120
PGFVKWTFEI LDETNENKQN EWITEKAEAT NTGKYTCTNK HGLSNSIYVF VRDPAKLFLV
130 140 150 160 170 180
DRSLYGKEDN DTLVRCPLTD PEVTNYSLKG CQGKPLPKDL RFIPDPKAGI MIKSVKRAYH
190 200 210 220 230 240
RLCLHCSVDQ EGKSVLSEKF ILKVRPAFKA VPVVSVSKAS YLLREGEEFT VTCTIKDVSS
250 260 270 280 290 300
SVYSTWKREN SQTKLQEKYN SWHHGDFNYE RQATLTISSA RVNDSGVFMC YANNTFGSAN
310 320 330 340 350 360
VTTTLEVVDK GFINIFPMIN TTVFVNDGEN VDLIVEYEAF PKPEHQQWIY MNRTFTDKWE
370 380 390 400 410 420
DYPKSENESN IRYVSELHLT RLKGTEGGTY TFLVSNSDVN AAIAFNVYVN TKPEILTYDR
430 440 450 460 470 480
LVNGMLQCVA AGFPEPTIDW YFCPGTEQRC SASVLPVDVQ TLNSSGPPFG KLVVQSSIDS
490 500 510 520 530 540
SAFKHNGTVE CKAYNDVGKT SAYFNFAFKG NNKEQIHPHT LFTPLLIGFV IVAGMMCIIV
550 560 570 580 590 600
MILTYKYLQK PMYEVQWKVV EEINGNNYVY IDPTQLPYDH KWEFPRNRLS FGKTLGAGAF
610 620 630 640 650 660
GKVVEATAYG LIKSDAAMTV AVKMLKPSAH LTEREALMSE LKVLSYLGNH MNIVNLLGAC
670 680 690 700 710 720
TIGGPTLVIT EYCCYGDLLN FLRRKRDSFI CSKQEDHAEA ALYKNLLHSK ESSCSDSTNE
730 740 750 760 770 780
YMDMKPGVSY VVPTKADKRR SVRIGSYIER DVTPAIMEDD ELALDLEDLL SFSYQVAKGM
790 800 810 820 830 840
AFLASKNCIH RDLAARNILL THGRITKICD FGLARDIKND SNYVVKGNAR LPVKWMAPES
850 860 870 880 890 900
IFNCVYTFES DVWSYGIFLW ELFSLGSSPY PGMPVDSKFY KMIKEGFRML SPEHAPAEMY
910 920 930 940 950 960
DIMKTCWDAD PLKRPTFKQI VQLIEKQISE STNHIYSNLA NCSPNRQKPV VDHSVRINSV
970
GSTASSSQPL LVHDDV
SEQUENCE LENGTH: 976
Using the scale Hphob. / Kyte & Doolittle, the individual values for the 20 amino acids are:
Ala: 1.800 Arg: -4.500 Asn: -3.500 Asp: -3.500 Cys: 2.500 Gln: -3.500
Glu: -3.500 Gly: -0.400 His: -3.200 Ile: 4.500 Leu: 3.800 Lys: -3.900
Met: 1.900 Phe: 2.800 Pro: -1.600 Ser: -0.800 Thr: -0.700 Trp: -0.900
Tyr: -1.300 Val: 4.200 Asx: -3.500 Glx: -3.500 Xaa: -0.490
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: -3.233
MAX: 3.589
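The profile summarised above is a sliding-window average: with a window of 9 and all weights equal to 1.00, each window's mean scale value is assigned to its central residue. A minimal Python sketch of that step is given below; the function name `window_profile` is an assumption, and only the first 20 residues of the precursor are used as input.

```python
# Kyte & Doolittle hydropathy values, as listed in the ProtScale output above
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def window_profile(seq, scale, window=9):
    """Return (center_position, mean_value) for every full window; positions are 1-based."""
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        mean = sum(scale[aa] for aa in chunk) / window  # uniform weights of 1.00
        profile.append((start + half + 1, mean))
    return profile

fragment = "MRGARGAWDFLCVLLLLLRV"  # residues 1-20 of the precursor
profile = window_profile(fragment, KD)
scores = [score for _, score in profile]
print(len(profile), round(min(scores), 3), round(max(scores), 3))
```

For this fragment the largest window mean is 32.3/9, about 3.589 (window FLCVLLLLL, centred on residue 14), which coincides with the MAX value reported above.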
Using the scale Molecular weight, the individual values for the 20 amino acids are:
Ala: 89.000 Arg: 174.000 Asn: 132.000 Asp: 133.000 Cys: 121.000 Gln: 146.000
Glu: 147.000 Gly: 75.000 His: 155.000 Ile: 131.000 Leu: 131.000 Lys: 146.000
Met: 149.000 Phe: 165.000 Pro: 115.000 Ser: 105.000 Thr: 119.000 Trp: 204.000
Tyr: 181.000 Val: 117.000 Asx: 132.500 Glx: 146.500 Xaa: 136.750
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: 102.111
MAX: 157.778
Using the scale Bulkiness, the individual values for the 20 amino acids are:
Ala: 11.500 Arg: 14.280 Asn: 12.820 Asp: 11.680 Cys: 13.460 Gln: 14.450
Glu: 13.570 Gly: 3.400 His: 13.690 Ile: 21.400 Leu: 21.400 Lys: 15.710
Met: 16.250 Phe: 19.800 Pro: 17.430 Ser: 9.470 Thr: 15.770 Trp: 21.670
Tyr: 18.030 Val: 21.570 Asx: 12.250 Glx: 14.010 Xaa: 15.368
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: 11.066
MAX: 20.359
Using the scale Polarity / Grantham, the individual values for the 20 amino acids are:
Ala: 8.100 Arg: 10.500 Asn: 11.600 Asp: 13.000 Cys: 5.500 Gln: 10.500
Glu: 12.300 Gly: 9.000 His: 10.400 Ile: 5.200 Leu: 4.900 Lys: 11.300
Met: 5.700 Phe: 5.200 Pro: 8.000 Ser: 9.200 Thr: 8.600 Trp: 5.400
Tyr: 6.200 Val: 5.900 Asx: 12.300 Glx: 11.400 Xaa: 8.325
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: 5.111
MAX: 11.422
Using the scale Recognition factors, the individual values for the 20 amino acids are:
Ala: 78.000 Arg: 95.000 Asn: 94.000 Asp: 81.000 Cys: 89.000 Gln: 87.000
Glu: 78.000 Gly: 84.000 His: 84.000 Ile: 88.000 Leu: 85.000 Lys: 87.000
Met: 80.000 Phe: 81.000 Pro: 91.000 Ser: 107.000 Thr: 93.000 Trp: 104.000
Tyr: 84.000 Val: 89.000 Asx: 87.500 Glx: 82.500 Xaa: 87.950
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: 80.444
MAX: 98.556
Using the scale Number of codon(s), the individual values for the 20 amino acids are:
Ala: 4.000 Arg: 6.000 Asn: 2.000 Asp: 2.000 Cys: 1.000 Gln: 2.000
Glu: 2.000 Gly: 4.000 His: 2.000 Ile: 3.000 Leu: 6.000 Lys: 2.000
Met: 1.000 Phe: 2.000 Pro: 4.000 Ser: 6.000 Thr: 4.000 Trp: 1.000
Tyr: 2.000 Val: 4.000 Asx: 2.000 Glx: 2.000 Xaa: 3.000
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: 1.889
MAX: 5.222
Using the scale Polarity / Zimmerman, the individual values for the 20 amino acids are:
Ala: 0.000 Arg: 52.000 Asn: 3.380 Asp: 49.700 Cys: 1.480 Gln: 3.530
Glu: 49.900 Gly: 0.000 His: 51.600 Ile: 0.130 Leu: 0.130 Lys: 49.500
Met: 1.430 Phe: 0.350 Pro: 1.580 Ser: 1.670 Thr: 1.660 Trp: 2.100
Tyr: 1.610 Val: 0.130 Asx: 26.540 Glx: 26.715 Xaa: 13.594
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: 0.111
MAX: 34.056
Using the scale Refractivity, the individual values for the 20 amino acids are:
Ala: 4.340 Arg: 26.660 Asn: 13.280 Asp: 12.000 Cys: 35.770 Gln: 17.560
Glu: 17.260 Gly: 0.000 His: 21.810 Ile: 19.060 Leu: 18.780 Lys: 21.290
Met: 21.640 Phe: 29.400 Pro: 10.930 Ser: 6.350 Thr: 11.010 Trp: 42.530
Tyr: 31.530 Val: 13.920 Asx: 12.640 Glx: 17.410 Xaa: 18.756
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: 6.780
MAX: 24.140
Using the scale Hphob. / Eisenberg et al., the individual values for the 20 amino acids are:
Ala: 0.620 Arg: -2.530 Asn: -0.780 Asp: -0.900 Cys: 0.290 Gln: -0.850
Glu: -0.740 Gly: 0.480 His: -0.400 Ile: 1.380 Leu: 1.060 Lys: -1.500
Met: 0.640 Phe: 1.190 Pro: 0.120 Ser: -0.180 Thr: -0.050 Trp: 0.810
Tyr: 0.260 Val: 1.080 Asx: -0.840 Glx: -0.795 Xaa: -0.000
Weights for window positions 1,..,9, using linear weight variation model:
1 2 3 4 5 6 7 8 9
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
edge center edge
MIN: -1.108
MAX: 1.037
GOR4 result for: UNK_152580 (GOR IV, 2009)
GOR secondary structure prediction method version IV: J. Garnier, J.-F. Gibrat, B. Robson, Methods in Enzymology, R.F. Doolittle Ed., vol. 266, 540-553 (1996)
10 20 30 40 50 60 70
| | | | | | |
MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEI
cccccccchhhhhhhhhhccccccccccccccccccccccccccceeeecccceeeeecccccccceeee
LDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTD
ecccccchhhhhhhhhhhcccccceeeecccccccceeeeecccchhhhhhcccccccccceeeeccccc
PEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKA
cccceeecccccccccccccccccccccceeeeeeeeeeeeeeeeccccccccchhhhhhhhhhcccccc
VPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSA
cceeeccchhhhhccccceeeeeeeeccccceeeccccccchhhhhhhhccccccccchhhhheeeeccc
RVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIY
cccccceeeeeecccccccccceeeeeccccceeeecccceeeeecccccceeeeeeccccccchhhhee
MNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDR
eeeeeecccccccccccccchhhhhhhhhhhhcccccceeeeeeccccchhhhheeeccccccceeeeee
LVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVE
eecceeeeeeccccccceeceeeccccccccccccccceeeecccccccceeeeecccccccccccceee
CKAYNDVGKTSAYFNFAFKGNNKEQIHPHTLFTPLLIGFVIVAGMMCIIVMILTYKYLQKPMYEVQWKVV
eeeeccccccchhhhhhhccccccccccccccccceeceeeeecccceeeeeecccccccceeeeeeeee
EEINGNNYVYIDPTQLPYDHKWEFPRNRLSFGKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKMLKPSAH
eeccccceeeecccccccccccccccccccccccccccccchhhhhhhhhhhhchhhhhhhhhhhccccc
LTEREALMSELKVLSYLGNHMNIVNLLGACTIGGPTLVITEYCCYGDLLNFLRRKRDSFICSKQEDHAEA
hhhhhhhhhhhhhhhhccccceeeeeeccccccccceeeeeecccccccccccccccceeecchhhhhhh
ALYKNLLHSKESSCSDSTNEYMDMKPGVSYVVPTKADKRRSVRIGSYIERDVTPAIMEDDELALDLEDLL
hhhhhhhccccccccccccccccccccceeeeccccccceeeeeceecccccchhhhhhhhhhhhhhhhh
SFSYQVAKGMAFLASKNCIHRDLAARNILLTHGRITKICDFGLARDIKNDSNYVVKGNARLPVKWMAPES
hhhhhhhhhhhhhhccccchhhhhhhhhhhhccceeeecccccccccccccceeeeccceeceeccccce
IFNCVYTFESDVWSYGIFLWELFSLGSSPYPGMPVDSKFYKMIKEGFRMLSPEHAPAEMYDIMKTCWDAD
eeeeeeeeccceeeeeeeeeeeecccccccccccccchhhhhhhhhhhhccccccceeeeeeeeeccccc
PLKRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSVRINSVGSTASSSQPLLVHDDV
cccccchhhhhhhhhhhccccccceeecccccccccccceeceeeeeeecccccccccceeeeeec
Sequence length : 976
GOR4 :
Alpha helix (Hh) : 202 is 20.70%
310 helix (Gg) : 0 is 0.00%
Pi helix (Ii) : 0 is 0.00%
Beta bridge (Bb) : 0 is 0.00%
Extended strand (Ee) : 256 is 26.23%
Beta turn (Tt) : 0 is 0.00%
Bend region (Ss) : 0 is 0.00%
Random coil (Cc) : 518 is 53.07%
Ambiguous states (?) : 0 is 0.00%
Other states : 0 is 0.00%
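The summary above is a per-state tally of the prediction string (h = helix, e = extended strand, c = coil) divided by the sequence length. A minimal Python sketch of that tally follows; the function name `state_percentages` and the short toy string are assumptions used only for illustration.

```python
from collections import Counter

def state_percentages(prediction):
    """Count per-residue states (h, e, c, t, ...) and return {state: (count, percent)}."""
    prediction = prediction.lower()
    counts = Counter(prediction)
    n = len(prediction)
    return {state: (c, round(100.0 * c / n, 2)) for state, c in counts.items()}

# Toy prediction string (hypothetical, ten residues)
print(state_percentages("cchhhheeec"))
```

Applied to the reported counts, 202/976, 256/976 and 518/976 give 20.70%, 26.23% and 53.07%, and the three counts sum to the full length of 976.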
SOPMA result for: UNK_153800 (SOPMA, 2009)
Geourjon, C. & Deléage, G., SOPMA: Significant improvement in protein secondary structure prediction by consensus prediction from multiple alignments. Cabios (1995) 11, 681-684
10 20 30 40 50 60 70
| | | | | | |
MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEI
hhhhhhhhhhhhhhhhheeecccccccccccccccccccccccceeeeettceeeeeeccccceeeeccc
LDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTD
ccccccccceeeeeehcccccccceeeeccccccceeeeeeccccceeeeccheeccttccceeeeeccc
PEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKA
ttcceeeeecccccccccceeeeccccccceehcchhccccceeeeeeccttccccccceeeeeeccccc
VPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSA
cceeeecccheeeettcceeeeeeccccccceeeeeccccccchhhhhhhccccccccccheeeeeeeec
RVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIY
cccttcceeeehhcccccchhheeeeehhtteeeecccccceeeecttcceeeeeeeeccccccceeeec
MNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDR
ccccccccchheeehcccccchhhhhheeeeecccttcceeeeecccccchheeeeeeeccccceeeehc
LVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVE
cctteeeeecttccccceeeeecccccccccccccccccccccceeecccccccchheeeeeccccheee
CKAYNDVGKTSAYFNFAFKGNNKEQIHPHTLFTPLLIGFVIVAGMMCIIVMILTYKYLQKPMYEVQWKVV
hhhhhhhccchheeeeecccccccccccccccchhhhhhhhhhhhhhhhhhhhhhhhttccchhhhheee
EEINGNNYVYIDPTQLPYDHKWEFPRNRLSFGKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKMLKPSAH
eecttcceeeecttcccccccccccccceeetceecccchhheehhhhttcccccchhhhhhhhhcttcc
LTEREALMSELKVLSYLGNHMNIVNLLGACTIGGPTLVITEYCCYGDLLNFLRRKRDSFICSKQEDHAEA
hhhhhhhhhhhhhhhhtccccheeeehhhccttcceeeeeecccthhhhhhhhhtthheeeccccccccc
ALYKNLLHSKESSCSDSTNEYMDMKPGVSYVVPTKADKRRSVRIGSYIERDVTPAIMEDDELALDLEDLL
hhhhhhccccccccccccccccccccccccccccccccccccccccccchhhhhhhhcccccccchhhhh
SFSYQVAKGMAFLASKNCIHRDLAARNILLTHGRITKICDFGLARDIKNDSNYVVKGNARLPVKWMAPES
hhhhhhhhhhhhhhctthhhhhhhhhheeeettceeeecccccccccccccheeeccccccceeecccth
IFNCVYTFESDVWSYGIFLWELFSLGSSPYPGMPVDSKFYKMIKEGFRMLSPEHAPAEMYDIMKTCWDAD
hhhhheehhhhhhhhhhhhhheeettcccccccccchhhhhhhhttcccccccccchhhhhhhhhhhccc
PLKRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSVRINSVGSTASSSQPLLVHDDV
ttcccchhhhhhhhhhhhhhhhhhhhhhhcccccccccccccccceecccccccccccceeeeccc
Sequence length : 976
SOPMA :
Alpha helix (Hh) : 244 is 25.00%
310 helix (Gg) : 0 is 0.00%
Pi helix (Ii) : 0 is 0.00%
Beta bridge (Bb) : 0 is 0.00%
Extended strand (Ee) : 228 is 23.36%
Beta turn (Tt) : 50 is 5.12%
Bend region (Ss) : 0 is 0.00%
Random coil (Cc) : 454 is 46.52%
Ambiguous states (?) : 0 is 0.00%
Other states : 0 is 0.00%
Parameters :
Window width : 17
Similarity threshold : 8
Number of states : 4
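GOR4 and SOPMA assign one state letter per residue to the same 976-residue sequence, so the two predictions can be compared position by position. The following Python sketch shows one simple way to do that; the function name `agreement` and the toy strings are hypothetical, and SOPMA's turn state (t) is treated here as a distinct state rather than merged with coil.

```python
def agreement(pred_a, pred_b):
    """Fraction of positions where two equal-length state strings carry the same letter."""
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must have the same length")
    same = sum(a == b for a, b in zip(pred_a, pred_b))
    return same / len(pred_a)

# Toy strings (hypothetical): the two predictions differ only at the last position
print(agreement("cchhe", "cchhc"))
```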
SignalP 3.0 Server - prediction results
Technical University of Denmark (SignalP, 2009)
Using neural networks (NN) and hidden Markov models (HMM) trained on
eukaryotes
>gi_4557695_ref_NP_000213.1_ v-kit Hardy-Zuckerman 4 feline sarcoma viral
oncogene homolog isoform 1 precursor _Homo sapiens_
SignalP-NN result:
# data
>gi_4557695_ref_NP_00 length = 70
# Measure Position Value Cutoff signal peptide?
max. C 26 0.871 0.32 YES
max. Y 26 0.873 0.33 YES
max. S 17 0.991 0.87 YES
mean S 1-25 0.922 0.48 YES
D 1-25 0.898 0.43 YES
# Most likely cleavage site between pos. 25 and 26: GSS-QP
SignalP-HMM result:
# data
>gi_4557695_ref_NP_000213.1_
Prediction: Signal peptide
Signal peptide probability: 1.000
Signal anchor probability: 0.000
Max cleavage site probability: 0.781 between pos. 25 and 26
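Both SignalP methods place the most likely cleavage site between positions 25 and 26 (GSS-QP). As a minimal sketch, the Python snippet below splits the N-terminal region at that position into the predicted signal peptide and the start of the mature chain; the function name `split_signal_peptide` is an assumption, and only the first 40 residues shown in this report are used.

```python
def split_signal_peptide(seq, cleavage_after):
    """Split a precursor into (signal_peptide, mature_chain) after a 1-based position."""
    return seq[:cleavage_after], seq[cleavage_after:]

# First 40 residues of the precursor as shown in this report
n_terminus = "MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIH"
signal, mature = split_signal_peptide(n_terminus, 25)
print(signal, mature)
```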
NetNGlyc 1.0 Server - prediction results
Technical University of Denmark (NetNGlyc, 2009)
Asn-Xaa-Ser/Thr sequons in the sequence output below are highlighted
in blue.
Asparagines predicted to be N-glycosylated are highlighted in
red.
Output for 'gi_4557695_ref_NP_000213.1_'
###########################################################################
#############
Warning: This sequence may not contain a signal peptide!!
Proteins without signal peptides are unlikely to be exposed to
the N-glycosylation machinery and thus may not be glycosylated
(in vivo) even though they contain potential motifs.
SignalP-NN euk predictions are as follows:
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ? D
?
SignalP output is explained at
http://www.cbs.dtu.dk/services/SignalP/output.html
###########################################################################
#############
Name: gi_4557695_ref_NP_000213.1_ Length: 976
MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETN
ENKQN 80
EWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKP
LPKDL 160
RFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTI
KDVSS 240
SVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINI
FPMIN 320
TTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVS
NSDVN 400
AAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQ
SSIDS 480
SAFKHNGTVECKAYNDVGKTSAYFNFAFKGNNKEQIHPHTLFTPLLIGFVIVAGMMCIIVMILTYKYLQKPMYEV
QWKVV 560
EEINGNNYVYIDPTQLPYDHKWEFPRNRLSFGKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKMLKPSAHLTERE
ALMSE 640
LKVLSYLGNHMNIVNLLGACTIGGPTLVITEYCCYGDLLNFLRRKRDSFICSKQEDHAEAALYKNLLHSKESSCS
DSTNE 720
YMDMKPGVSYVVPTKADKRRSVRIGSYIERDVTPAIMEDDELALDLEDLLSFSYQVAKGMAFLASKNCIHRDLAA
RNILL 800
THGRITKICDFGLARDIKNDSNYVVKGNARLPVKWMAPESIFNCVYTFESDVWSYGIFLWELFSLGSSPYPGMPV
DSKFY 880
KMIKEGFRMLSPEHAPAEMYDIMKTCWDADPLKRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSV
RINSV 960
GSTASSSQPLLVHDDV
...........................................................................
..... 80
.................................................N..............N..........
..... 160
...........................................................................
..... 240
..........................................N................N...............
....N 320
...............................N..............N............................
..... 400
..............................................................N............
..... 480
.....N.....................................................................
..... 560
...........................................................................
..... 640
...........................................................................
..... 720
...........................................................................
..... 800
...........................................................................
..... 880
...........................................................................
..... 960
................
1040
(Threshold=0.5)
----------------------------------------------------------------------
SeqName Position Potential Jury N-Glyc
agreement result
----------------------------------------------------------------------
gi_4557695_ref_NP_000213.1_ 130 NDTL 0.7108 (9/9) ++
gi_4557695_ref_NP_000213.1_ 145 NYSL 0.5366 (6/9) +
gi_4557695_ref_NP_000213.1_ 283 NDSG 0.5754 (8/9) +
gi_4557695_ref_NP_000213.1_ 293 NNTF 0.3596 (8/9) -
gi_4557695_ref_NP_000213.1_ 300 NVTT 0.7154 (9/9) ++
gi_4557695_ref_NP_000213.1_ 320 NTTV 0.6528 (9/9) ++
gi_4557695_ref_NP_000213.1_ 352 NRTF 0.7201 (9/9) ++
gi_4557695_ref_NP_000213.1_ 367 NESN 0.6378 (8/9) +
gi_4557695_ref_NP_000213.1_ 463 NSSG 0.5716 (7/9) +
gi_4557695_ref_NP_000213.1_ 486 NGTV 0.7471 (9/9) ++
gi_4557695_ref_NP_000213.1_ 819 NDSN 0.3409 (8/9) -
gi_4557695_ref_NP_000213.1_ 941 NCSP 0.1006 (9/9) ---
----------------------------------------------------------------------
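The candidate sites in the table above are Asn-Xaa-Ser/Thr sequons. A minimal Python sketch for locating such sequons is shown below; it assumes the usual convention that Xaa is any residue except proline, and it only finds the motifs (it does not reproduce the NetNGlyc neural-network potential or jury agreement). The function name `find_sequons` and the `offset` argument are hypothetical.

```python
import re

# N-X-S/T sequon with X != Pro; the lookahead lets overlapping sequons be reported
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def find_sequons(seq, offset=0):
    """Return (1-based position of Asn, context of up to 4 residues) for each sequon."""
    hits = []
    for match in SEQUON.finditer(seq):
        i = match.start()
        hits.append((i + 1 + offset, seq[i:i + 4]))
    return hits

# Residues 121-160 of the sequence above, hence offset=120
fragment = "DRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDL"
print(find_sequons(fragment, offset=120))
```

On this fragment the scan returns positions 130 (NDTL) and 145 (NYSL), the first two rows of the table above.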
NetOGlyc 3.1 Server - prediction results
Technical University of Denmark (NetOGlyc, 2009)
Name: gi_4557695_ Length: 976
MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETN
ENKQN
EWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKP
LPKDL
RFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTI
KDVSS
SVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINI
FPMIN
TTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVS
NSDVN
AAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQ
SSIDS
SAFKHNGTVECKAYNDVGKTSAYFNFAFKGNNKEQIHPHTLFTPLLIGFVIVAGMMCIIVMILTYKYLQKPMYEV
QWKVV
EEINGNNYVYIDPTQLPYDHKWEFPRNRLSFGKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKMLKPSAHLTERE
ALMSE
LKVLSYLGNHMNIVNLLGACTIGGPTLVITEYCCYGDLLNFLRRKRDSFICSKQEDHAEAALYKNLLHSKESSCS
DSTNE
YMDMKPGVSYVVPTKADKRRSVRIGSYIERDVTPAIMEDDELALDLEDLLSFSYQVAKGMAFLASKNCIHRDLAA
RNILL
THGRITKICDFGLARDIKNDSNYVVKGNARLPVKWMAPESIFNCVYTFESDVWSYGIFLWELFSLGSSPYPGMPV
DSKFY
KMIKEGFRMLSPEHAPAEMYDIMKTCWDADPLKRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSV
RINSV
GSTASSSQPLLVHDDV
_________________________..................................................
.....
...........................................................................
.....
...........................................................................
.....
...........................................................................
.....
...........................................................................
.....
...........................................................................
.....
...........................................................................
.....
..........................................................T................
.....
...........................................................................
.....
.............T.............................................................
.....
...........................................................................
.....
...........................................................................
.....
................
Name S/T Pos G-score I-score Y/N Comment
---------------------------------------------------------------------------
-
gi_4557695_ T 22 0.311 0.064 . -
gi_4557695_ S 24 0.248 0.028 . -
gi_4557695_ S 25 0.269 0.181 . -
gi_4557695_ S 28 0.352 0.509 . -
gi_4557695_ S 30 0.385 0.455 . -
gi_4557695_ S 35 0.461 0.297 . -
gi_4557695_ S 38 0.408 0.425 . -
gi_4557695_ S 44 0.263 0.043 . -
gi_4557695_ T 59 0.129 0.177 . -
gi_4557695_ T 67 0.117 0.077 . -
gi_4557695_ T 74 0.169 0.024 . -
gi_4557695_ T 84 0.190 0.080 . -
gi_4557695_ T 90 0.165 0.083 . -
gi_4557695_ T 92 0.175 0.050 . -
gi_4557695_ T 96 0.202 0.038 . -
gi_4557695_ T 98 0.183 0.047 . -
gi_4557695_ S 104 0.101 0.055 . -
gi_4557695_ S 106 0.084 0.020 . -
gi_4557695_ S 123 0.078 0.020 . -
gi_4557695_ T 132 0.169 0.024 . -
gi_4557695_ T 139 0.151 0.139 . -
gi_4557695_ T 144 0.194 0.088 . -
gi_4557695_ S 147 0.139 0.030 . -
gi_4557695_ S 174 0.072 0.030 . -
gi_4557695_ S 187 0.070 0.043 . -
gi_4557695_ S 194 0.069 0.021 . -
gi_4557695_ S 197 0.101 0.092 . -
gi_4557695_ S 215 0.139 0.084 . -
gi_4557695_ S 217 0.162 0.057 . -
gi_4557695_ S 220 0.178 0.032 . -
gi_4557695_ T 230 0.263 0.064 . -
gi_4557695_ T 232 0.210 0.054 . -
gi_4557695_ T 234 0.205 0.063 . -
gi_4557695_ S 239 0.162 0.052 . -
gi_4557695_ S 240 0.160 0.043 . -
gi_4557695_ S 241 0.152 0.030 . -
gi_4557695_ S 244 0.144 0.049 . -
gi_4557695_ T 245 0.209 0.072 . -
gi_4557695_ S 251 0.117 0.018 . -
gi_4557695_ T 253 0.177 0.028 . -
gi_4557695_ S 261 0.081 0.047 . -
gi_4557695_ T 274 0.164 0.071 . -
gi_4557695_ T 276 0.159 0.079 . -
gi_4557695_ S 278 0.093 0.039 . -
gi_4557695_ S 279 0.087 0.065 . -
gi_4557695_ S 285 0.116 0.031 . -
gi_4557695_ T 295 0.176 0.030 . -
gi_4557695_ S 298 0.106 0.075 . -
gi_4557695_ T 302 0.174 0.143 . -
gi_4557695_ T 303 0.166 0.019 . -
gi_4557695_ T 304 0.170 0.091 . -
gi_4557695_ T 321 0.141 0.040 . -
gi_4557695_ T 322 0.130 0.033 . -
gi_4557695_ T 354 0.201 0.018 . -
gi_4557695_ T 356 0.184 0.056 . -
gi_4557695_ S 365 0.111 0.032 . -
gi_4557695_ S 369 0.108 0.018 . -
gi_4557695_ S 375 0.127 0.054 . -
gi_4557695_ T 380 0.186 0.058 . -
gi_4557695_ T 385 0.153 0.064 . -
gi_4557695_ T 389 0.205 0.026 . -
gi_4557695_ T 391 0.168 0.053 . -
gi_4557695_ S 395 0.116 0.054 . -
gi_4557695_ S 397 0.103 0.019 . -
gi_4557695_ T 411 0.124 0.029 . -
gi_4557695_ T 417 0.126 0.075 . -
gi_4557695_ T 437 0.212 0.135 . -
gi_4557695_ T 446 0.322 0.059 . -
gi_4557695_ S 451 0.183 0.059 . -
gi_4557695_ S 453 0.206 0.116 . -
gi_4557695_ T 461 0.317 0.066 . -
gi_4557695_ S 464 0.207 0.083 . -
gi_4557695_ S 465 0.232 0.143 . -
gi_4557695_ S 476 0.199 0.027 . -
gi_4557695_ S 477 0.165 0.025 . -
gi_4557695_ S 480 0.152 0.052 . -
gi_4557695_ S 481 0.132 0.019 . -
gi_4557695_ T 488 0.226 0.048 . -
gi_4557695_ T 500 0.125 0.061 . -
gi_4557695_ S 501 0.082 0.031 . -
gi_4557695_ T 520 0.138 0.031 . -
gi_4557695_ T 523 0.124 0.057 . -
gi_4557695_ T 544 0.092 0.055 . -
gi_4557695_ T 574 0.145 0.294 . -
gi_4557695_ S 590 0.105 0.051 . -
gi_4557695_ T 594 0.164 0.051 . -
gi_4557695_ T 607 0.214 0.035 . -
gi_4557695_ S 614 0.140 0.050 . -
gi_4557695_ T 619 0.213 0.515 T -
gi_4557695_ S 628 0.126 0.035 . -
gi_4557695_ T 632 0.173 0.106 . -
gi_4557695_ S 639 0.085 0.171 . -
gi_4557695_ S 645 0.058 0.054 . -
gi_4557695_ T 661 0.113 0.122 . -
gi_4557695_ T 666 0.105 0.064 . -
gi_4557695_ T 670 0.091 0.056 . -
gi_4557695_ S 688 0.059 0.044 . -
gi_4557695_ S 692 0.054 0.045 . -
gi_4557695_ S 709 0.102 0.049 . -
gi_4557695_ S 712 0.114 0.030 . -
gi_4557695_ S 713 0.125 0.019 . -
gi_4557695_ S 715 0.129 0.064 . -
gi_4557695_ S 717 0.129 0.018 . -
gi_4557695_ T 718 0.227 0.026 . -
gi_4557695_ S 729 0.176 0.090 . -
gi_4557695_ T 734 0.223 0.517 T -
gi_4557695_ S 741 0.227 0.061 . -
gi_4557695_ S 746 0.178 0.020 . -
gi_4557695_ T 753 0.153 0.090 . -
gi_4557695_ S 771 0.066 0.070 . -
gi_4557695_ S 773 0.058 0.053 . -
gi_4557695_ S 785 0.065 0.059 . -
gi_4557695_ T 801 0.086 0.076 . -
gi_4557695_ T 806 0.099 0.076 . -
gi_4557695_ S 821 0.076 0.028 . -
gi_4557695_ S 840 0.111 0.018 . -
gi_4557695_ T 847 0.144 0.073 . -
gi_4557695_ S 850 0.080 0.032 . -
gi_4557695_ S 854 0.094 0.079 . -
gi_4557695_ S 864 0.141 0.055 . -
gi_4557695_ S 867 0.122 0.062 . -
gi_4557695_ S 868 0.118 0.029 . -
gi_4557695_ S 877 0.165 0.034 . -
gi_4557695_ S 891 0.090 0.052 . -
gi_4557695_ T 905 0.238 0.025 . -
gi_4557695_ T 916 0.168 0.031 . -
gi_4557695_ S 929 0.114 0.083 . -
gi_4557695_ S 931 0.091 0.027 . -
gi_4557695_ T 932 0.131 0.036 . -
gi_4557695_ S 937 0.133 0.053 . -
gi_4557695_ S 943 0.146 0.041 . -
gi_4557695_ S 954 0.259 0.050 . -
gi_4557695_ S 959 0.234 0.052 . -
gi_4557695_ S 962 0.210 0.034 . -
gi_4557695_ T 963 0.322 0.164 . -
gi_4557695_ S 965 0.238 0.026 . -
gi_4557695_ S 966 0.251 0.202 . -
gi_4557695_ S 967 0.262 0.076 . -
---------------------------------------------------------------------------
-
NetAcet 1.0 Server - prediction results
Technical University of Denmark (NetAcet, 2009)
seq2seq: entry name truncated from "gi_4557695_ref_NP_000213.1_" to
"gi_4557695_ref_NP_00"
#
# NetAcet 1.0 prediction results, 1 sequence
#
#
# Sequence # Context Score Acetylation
# -----------------------------------------------------------------
gi_4557695_ref_NP_00 3 G -MRGARG 0.464 .
NetPhos 2.0 Server - prediction results
Technical University of Denmark (NetPhos, 2009)
netphos: file Sequence invalid or empty
1025 gi_4557695_
MAVIRALXNCXGENEHXMXLXGISXFXRMPRECXRSXRHXMXSAPIENSMRGARGAWDFLCVLLLLLRVQTGSSQ
PSVSP 80
GEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNS
IYVFV 160
RDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHC
SVDQE 240
GKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGD
FNYER 320
QATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQ
QWIYM 400
NRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDRLVNGML
QCVAA 480
GFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKTSAYFNF
AFKGN 560
NKEQIHPHTLFTPLLIGFVIVAGMMCIIVMILTYKYLQKPMYEVQWKVVEEINGNNYVYIDPTQLPYDHKWEFPR
NRLSF 640
GKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKMLKPSAHLTEREALMSELKVLSYLGNHMNIVNLLGACTIGGPT
LVITE 720
YCCYGDLLNFLRRKRDSFICSKQEDHAEAALYKNLLHSKESSCSDSTNEYMDMKPGVSYVVPTKADKRRSVRIGS
YIERD 800
VTPAIMEDDELALDLEDLLSFSYQVAKGMAFLASKNCIHRDLAARNILLTHGRITKICDFGLARDIKNDSNYVVK
GNARL 880
PVKWMAPESIFNCVYTFESDVWSYGIFLWELFSLGSSPYPGMPVDSKFYKMIKEGFRMLSPEHAPAEMYDIMKTC
WDADP 960
LKRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSVRINSVGSTASSSQPLLVHDDV
1040
...................................S............S.....................T....
.S.S. 80
...S..S.............................................T.......T..Y...........
.Y... 160
...........S....................T.............................S............
..... 240
..S..S.................S..................T....SSS...T...............S.....
..Y.. 320
.......S........................T................T.........................
..... 400
....T.....Y..S...S...Y..................................Y........T.........
..... 480
.....T.............S........................S.................Y......S.Y...
..... 560
........................................................Y.Y................
...S. 640
.................Y.........T........S...T..................................
..... 720
................S..............Y.....S..SS.S.S...Y.......SY..........S....S
..... 800
......................................................T................Y...
..... 880
....................................S......................S........Y......
..... 960
.................S......Y.................S....S..S..............
1040
Phosphorylation sites predicted: Ser: 38 Thr: 14 Tyr: 17
Serine predictions
Name Pos Context Score Pred
_________________________v_________________
gi_4557695_ 24 LXGISXFXR 0.006 .
gi_4557695_ 36 ECXRSXRHX 0.833 *S*
gi_4557695_ 43 HXMXSAPIE 0.006 .
gi_4557695_ 49 PIENSMRGA 0.794 *S*
gi_4557695_ 73 VQTGSSQPS 0.102 .
gi_4557695_ 74 QTGSSQPSV 0.039 .
gi_4557695_ 77 SSQPSVSPG 0.990 *S*
gi_4557695_ 79 QPSVSPGEP 0.997 *S*
gi_4557695_ 84 PGEPSPPSI 0.543 *S*
gi_4557695_ 87 PSPPSIHPG 0.924 *S*
gi_4557695_ 93 HPGKSDLIV 0.032 .
gi_4557695_ 153 KHGLSNSIY 0.026 .
gi_4557695_ 155 GLSNSIYVF 0.009 .
gi_4557695_ 172 LVDRSLYGK 0.604 *S*
gi_4557695_ 196 VTNYSLKGC 0.113 .
gi_4557695_ 223 IMIKSVKRA 0.917 *S*
gi_4557695_ 236 CLHCSVDQE 0.003 .
gi_4557695_ 243 QEGKSVLSE 0.930 *S*
gi_4557695_ 246 KSVLSEKFI 0.988 *S*
gi_4557695_ 264 VPVVSVSKA 0.613 *S*
gi_4557695_ 266 VVSVSKASY 0.134 .
gi_4557695_ 269 VSKASYLLR 0.224 .
gi_4557695_ 288 IKDVSSSVY 0.908 *S*
gi_4557695_ 289 KDVSSSVYS 0.540 *S*
gi_4557695_ 290 DVSSSVYST 0.959 *S*
gi_4557695_ 293 SSVYSTWKR 0.287 .
gi_4557695_ 300 KRENSQTKL 0.464 .
gi_4557695_ 310 EKYNSWHHG 0.945 *S*
gi_4557695_ 327 TLTISSARV 0.005 .
gi_4557695_ 328 LTISSARVN 0.756 *S*
gi_4557695_ 334 RVNDSGVFM 0.252 .
gi_4557695_ 347 NTFGSANVT 0.007 .
gi_4557695_ 414 DYPKSENES 0.557 *S*
gi_4557695_ 418 SENESNIRY 0.907 *S*
gi_4557695_ 424 IRYVSELHL 0.238 .
gi_4557695_ 444 TFLVSNSDV 0.117 .
gi_4557695_ 446 LVSNSDVNA 0.352 .
gi_4557695_ 500 EQRCSASVL 0.936 *S*
gi_4557695_ 502 RCSASVLPV 0.014 .
gi_4557695_ 513 QTLNSSGPP 0.113 .
gi_4557695_ 514 TLNSSGPPF 0.013 .
gi_4557695_ 525 LVVQSSIDS 0.981 *S*
gi_4557695_ 526 VVQSSIDSS 0.287 .
gi_4557695_ 529 SSIDSSAFK 0.018 .
gi_4557695_ 530 SIDSSAFKH 0.077 .
gi_4557695_ 550 VGKTSAYFN 0.853 *S*
gi_4557695_ 639 RNRLSFGKT 0.969 *S*
gi_4557695_ 663 GLIKSDAAM 0.006 .
gi_4557695_ 677 MLKPSAHLT 0.574 *S*
gi_4557695_ 688 EALMSELKV 0.034 .
gi_4557695_ 694 LKVLSYLGN 0.122 .
gi_4557695_ 737 RKRDSFICS 0.997 *S*
gi_4557695_ 741 SFICSKQED 0.111 .
gi_4557695_ 758 NLLHSKESS 0.994 *S*
gi_4557695_ 761 HSKESSCSD 0.668 *S*
gi_4557695_ 762 SKESSCSDS 0.989 *S*
gi_4557695_ 764 ESSCSDSTN 0.862 *S*
gi_4557695_ 766 SCSDSTNEY 0.961 *S*
gi_4557695_ 778 KPGVSYVVP 0.507 *S*
gi_4557695_ 790 DKRRSVRIG 0.996 *S*
gi_4557695_ 795 VRIGSYIER 0.992 *S*
gi_4557695_ 820 EDLLSFSYQ 0.045 .
gi_4557695_ 822 LLSFSYQVA 0.130 .
gi_4557695_ 834 AFLASKNCI 0.013 .
gi_4557695_ 870 IKNDSNYVV 0.189 .
gi_4557695_ 889 MAPESIFNC 0.009 .
gi_4557695_ 899 YTFESDVWS 0.031 .
gi_4557695_ 903 SDVWSYGIF 0.022 .
gi_4557695_ 913 WELFSLGSS 0.022 .
gi_4557695_ 916 FSLGSSPYP 0.003 .
gi_4557695_ 917 SLGSSPYPG 0.953 *S*
gi_4557695_ 926 MPVDSKFYK 0.017 .
gi_4557695_ 940 FRMLSPEHA 0.995 *S*
gi_4557695_ 978 EKQISESTN 0.900 *S*
gi_4557695_ 980 QISESTNHI 0.313 .
gi_4557695_ 986 NHIYSNLAN 0.039 .
gi_4557695_ 992 LANCSPNRQ 0.162 .
gi_4557695_ 1003 VVDHSVRIN 0.942 *S*
gi_4557695_ 1008 VRINSVGST 0.977 *S*
gi_4557695_ 1011 NSVGSTASS 0.824 *S*
gi_4557695_ 1014 GSTASSSQP 0.063 .
gi_4557695_ 1015 STASSSQPL 0.230 .
gi_4557695_ 1016 TASSSQPLL 0.140 .
_________________________^_________________
Threonine predictions
Name Pos Context Score Pred
_________________________v_________________
gi_4557695_ 71 LRVQTGSSQ 0.551 *T*
gi_4557695_ 108 RLLCTDPGF 0.015 .
gi_4557695_ 116 FVKWTFEIL 0.010 .
gi_4557695_ 123 ILDETNENK 0.024 .
gi_4557695_ 133 NEWITEKAE 0.725 *T*
gi_4557695_ 139 KAEATNTGK 0.168 .
gi_4557695_ 141 EATNTGKYT 0.840 *T*
gi_4557695_ 145 TGKYTCTNK 0.017 .
gi_4557695_ 147 KYTCTNKHG 0.132 .
gi_4557695_ 181 EDNDTLVRC 0.095 .
gi_4557695_ 188 RCPLTDPEV 0.344 .
gi_4557695_ 193 DPEVTNYSL 0.547 *T*
gi_4557695_ 279 GEEFTVTCT 0.281 .
gi_4557695_ 281 EFTVTCTIK 0.026 .
gi_4557695_ 283 TVTCTIKDV 0.761 *T*
gi_4557695_ 294 SVYSTWKRE 0.732 *T*
gi_4557695_ 302 ENSQTKLQE 0.117 .
gi_4557695_ 323 ERQATLTIS 0.342 .
gi_4557695_ 325 QATLTISSA 0.409 .
gi_4557695_ 344 YANNTFGSA 0.155 .
gi_4557695_ 351 SANVTTTLE 0.039 .
gi_4557695_ 352 ANVTTTLEV 0.483 .
gi_4557695_ 353 NVTTTLEVV 0.687 *T*
gi_4557695_ 370 PMINTTVFV 0.549 *T*
gi_4557695_ 371 MINTTVFVN 0.051 .
gi_4557695_ 403 YMNRTFTDK 0.077 .
gi_4557695_ 405 NRTFTDKWE 0.912 *T*
gi_4557695_ 429 ELHLTRLKG 0.033 .
gi_4557695_ 434 RLKGTEGGT 0.395 .
gi_4557695_ 438 TEGGTYTFL 0.014 .
gi_4557695_ 440 GGTYTFLVS 0.048 .
gi_4557695_ 460 VYVNTKPEI 0.363 .
gi_4557695_ 466 PEILTYDRL 0.605 *T*
gi_4557695_ 486 FPEPTIDWY 0.582 *T*
gi_4557695_ 495 FCPGTEQRC 0.029 .
gi_4557695_ 510 VDVQTLNSS 0.181 .
gi_4557695_ 537 KHNGTVECK 0.037 .
gi_4557695_ 549 DVGKTSAYF 0.064 .
gi_4557695_ 569 IHPHTLFTP 0.026 .
gi_4557695_ 572 HTLFTPLLI 0.039 .
gi_4557695_ 593 VMILTYKYL 0.430 .
gi_4557695_ 623 YIDPTQLPY 0.067 .
gi_4557695_ 643 SFGKTLGAG 0.080 .
gi_4557695_ 656 VVEATAYGL 0.365 .
gi_4557695_ 668 DAAMTVAVK 0.671 *T*
gi_4557695_ 681 SAHLTEREA 0.992 *T*
gi_4557695_ 710 LGACTIGGP 0.137 .
gi_4557695_ 715 IGGPTLVIT 0.332 .
gi_4557695_ 719 TLVITEYCC 0.020 .
gi_4557695_ 767 CSDSTNEYM 0.006 .
gi_4557695_ 783 YVVPTKADK 0.390 .
gi_4557695_ 802 ERDVTPAIM 0.388 .
gi_4557695_ 850 NILLTHGRI 0.019 .
gi_4557695_ 855 HGRITKICD 0.612 *T*
gi_4557695_ 896 NCVYTFESD 0.045 .
gi_4557695_ 954 DIMKTCWDA 0.361 .
gi_4557695_ 965 LKRPTFKQI 0.328 .
gi_4557695_ 981 ISESTNHIY 0.063 .
gi_4557695_ 1012 SVGSTASSS 0.048 .
_________________________^_________________
Tyrosine predictions
Name Pos Context Score Pred
_________________________v_________________
gi_4557695_ 144 NTGKYTCTN 0.877 *Y*
gi_4557695_ 157 SNSIYVFVR 0.978 *Y*
gi_4557695_ 174 DRSLYGKED 0.178 .
gi_4557695_ 195 EVTNYSLKG 0.359 .
gi_4557695_ 228 VKRAYHRLC 0.150 .
gi_4557695_ 270 SKASYLLRE 0.239 .
gi_4557695_ 292 SSSVYSTWK 0.474 .
gi_4557695_ 308 LQEKYNSWH 0.271 .
gi_4557695_ 318 GDFNYERQA 0.562 *Y*
gi_4557695_ 340 VFMCYANNT 0.099 .
gi_4557695_ 386 LIVEYEAFP 0.048 .
gi_4557695_ 399 QQWIYMNRT 0.309 .
gi_4557695_ 411 KWEDYPKSE 0.805 *Y*
gi_4557695_ 422 SNIRYVSEL 0.899 *Y*
gi_4557695_ 439 EGGTYTFLV 0.376 .
gi_4557695_ 457 AFNVYVNTK 0.815 *Y*
gi_4557695_ 467 EILTYDRLV 0.070 .
gi_4557695_ 490 TIDWYFCPG 0.221 .
gi_4557695_ 543 ECKAYNDVG 0.946 *Y*
gi_4557695_ 552 KTSAYFNFA 0.719 *Y*
gi_4557695_ 594 MILTYKYLQ 0.022 .
gi_4557695_ 596 LTYKYLQKP 0.111 .
gi_4557695_ 602 QKPMYEVQW 0.420 .
gi_4557695_ 617 NGNNYVYID 0.829 *Y*
gi_4557695_ 619 NNYVYIDPT 0.877 *Y*
gi_4557695_ 627 TQLPYDHKW 0.046 .
gi_4557695_ 658 EATAYGLIK 0.560 *Y*
gi_4557695_ 695 KVLSYLGNH 0.059 .
gi_4557695_ 721 VITEYCCYG 0.158 .
gi_4557695_ 724 EYCCYGDLL 0.099 .
gi_4557695_ 752 EAALYKNLL 0.714 *Y*
gi_4557695_ 770 STNEYMDMK 0.981 *Y*
gi_4557695_ 779 PGVSYVVPT 0.740 *Y*
gi_4557695_ 796 RIGSYIERD 0.199 .
gi_4557695_ 823 LSFSYQVAK 0.068 .
gi_4557695_ 872 NDSNYVVKG 0.961 *Y*
gi_4557695_ 895 FNCVYTFES 0.107 .
gi_4557695_ 904 DVWSYGIFL 0.067 .
gi_4557695_ 919 GSSPYPGMP 0.308 .
gi_4557695_ 929 DSKFYKMIK 0.264 .
gi_4557695_ 949 PAEMYDIMK 0.853 *Y*
gi_4557695_ 985 TNHIYSNLA 0.871 *Y*
_________________________^_________________
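The phosphorylation tables above can be post-processed with a short script. The following is a minimal sketch, assuming each row has the five whitespace-separated columns shown (Name, Pos, Context, Score, Pred) and that a score of 0.5 or more corresponds to a predicted site, as the rows above suggest. The names `Site` and `parse_netphos` are illustrative and not part of NetPhos.

```python
import re
from typing import NamedTuple


class Site(NamedTuple):
    name: str
    pos: int
    context: str
    score: float
    predicted: bool


# One table row: name, position, 9-residue context, score, prediction flag.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\S+)\s+([01]\.\d+)\s+(\S+)$")


def parse_netphos(text: str) -> list[Site]:
    """Collect table rows, skipping rules, titles and any other line."""
    sites = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue
        name, pos, context, score, flag = m.groups()
        # A bare "." in the Pred column means "not predicted".
        sites.append(Site(name, int(pos), context, float(score), flag != "."))
    return sites


# A few rows copied from the tables above.
sample = """\
gi_4557695_ 737 RKRDSFICS 0.997 *S*
gi_4557695_ 741 SFICSKQED 0.111 .
gi_4557695_ 681 SAHLTEREA 0.992 *T*
gi_4557695_ 770 STNEYMDMK 0.981 *Y*
gi_4557695_ 292 SSSVYSTWK 0.474 .
"""

sites = parse_netphos(sample)
# Predicted sites, highest score first.
top = sorted((s for s in sites if s.score >= 0.5), key=lambda s: -s.score)
for s in top:
    print(s.pos, s.context[4], s.score)
```

Sorting by score gives a quick shortlist of the most confident serine, threonine and tyrosine sites without reading the full tables.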
Sulfinator (Sulfinator, 2009)
Input processed on Thu Dec 31 07:53:21 CET 2009:
E-cutoff value is 55
No hits found in: GI
SOSUI (SOSUI, 2009)
SOSUI Result
Query title : None
Total length : 1047 A. A.
Average of hydrophobicity : -0.268768
This amino acid sequence is of a MEMBRANE PROTEIN which have 1 transmembrane helix.
No.  N terminal  transmembrane region       C terminal  type     length
1    593         FTPLLIGFVIVAGMMCIIVMILT    615         PRIMARY  23
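As a rough cross-check of the SOSUI result above, the mean hydropathy of the predicted helix can be estimated by hand. The sketch below assumes the Kyte-Doolittle hydropathy scale, which may differ from the scoring SOSUI applies internally; `mean_hydropathy` is an illustrative helper, not a SOSUI function.

```python
# Kyte-Doolittle hydropathy index per amino acid (assumed scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average the per-residue hydropathy values over a sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Transmembrane region and boundaries as reported by SOSUI above.
SEGMENT = "FTPLLIGFVIVAGMMCIIVMILT"
N_TERM, C_TERM = 593, 615

# The reported length should equal the inclusive span of the boundaries.
print(len(SEGMENT), C_TERM - N_TERM + 1)
print(round(mean_hydropathy(SEGMENT), 2))
```

A strongly positive mean for the segment, against the negative average reported for the whole sequence, is consistent with a single hydrophobic membrane-spanning helix.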
Loading of raw sequence (SPDBV, 2009) :-
After Loading Templates:-
Coloring of raw sequence and Templates:-
Aligning of first Templates:-
Aligning of second Templates:-
Modeled protein before loop building:-
Ramachandran plot protein before loop building:-
Loop building (configuration table):-
Modelled protein after loop building:-
Ramachandran plot after loop building:-
Protein with H-Bonds:-
Protein with Side chains:-
Cavity Method (Cavity, 2009)
Protein with Molecule surface:-
Cavities:-
Q-Site Finder (Q-SiteFinder, 2009)
Predicted site 1
Site Volume: 1891 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-751, -732, -636)
Max Coords: (-713, -694, -602)
99 O ASN 680
100 CB ASN 680
101 CG ASN 680
103 ND2 ASN 680
104 N PHE 681
106 C PHE 681
107 O PHE 681
109 CG PHE 681
110 CD1 PHE 681
111 CD2 PHE 681
112 CE1 PHE 681
113 CE2 PHE 681
114 CZ PHE 681
116 CA LEU 682
117 C LEU 682
118 O LEU 682
119 CB LEU 682
120 CG LEU 682
121 CD1 LEU 682
122 CD2 LEU 682
123 N ARG 683
124 CA ARG 683
125 C ARG 683
126 O ARG 683
128 CG ARG 683
129 CD ARG 683
130 NE ARG 683
131 CZ ARG 683
132 NH1 ARG 683
133 NH2 ARG 683
134 N ARG 684
138 CB ARG 684
261 CA GLU 699
262 C GLU 699
263 O GLU 699
264 CB GLU 699
269 N ALA 700
270 CA ALA 700
271 C ALA 700
272 O ALA 700
274 N ALA 701
275 CA ALA 701
276 C ALA 701
277 O ALA 701
278 CB ALA 701
279 N LEU 702
280 CA LEU 702
281 C LEU 702
282 O LEU 702
287 N TYR 703
288 CA TYR 703
291 CB TYR 703
292 CG TYR 703
293 CD1 TYR 703
294 CD2 TYR 703
295 CE1 TYR 703
296 CE2 TYR 703
297 CZ TYR 703
298 OH TYR 703
304 CG LYS 704
305 CD LYS 704
306 CE LYS 704
307 NZ LYS 704
317 CA LEU 706
318 C LEU 706
319 O LEU 706
334 C HIS 708
335 O HIS 708
336 CB HIS 708
337 CG HIS 708
339 CD2 HIS 708
344 C SER 709
345 O SER 709
349 CA LYS 710
350 C LYS 710
351 O LYS 710
352 CB LYS 710
353 CG LYS 710
354 CD LYS 710
355 CE LYS 710
356 NZ LYS 710
357 N GLU 711
359 C GLU 711
360 O GLU 711
363 CD GLU 711
364 OE1 GLU 711
365 OE2 GLU 711
366 N SER 712
367 CA SER 712
380 C CYS 714
381 O CYS 714
382 CB CYS 714
383 SG CYS 714
384 N SER 715
385 CA SER 715
386 C SER 715
387 O SER 715
388 CB SER 715
389 OG SER 715
408 CB THR 718
409 OG1 THR 718
410 CG2 THR 718
416 CG ASN 719
417 OD1 ASN 719
418 ND2 ASN 719
538 CA LYS 735
539 C LYS 735
540 O LYS 735
541 CB LYS 735
542 CG LYS 735
543 CD LYS 735
546 N ALA 736
547 CA ALA 736
548 C ALA 736
549 O ALA 736
551 N ASP 737
555 CB ASP 737
556 CG ASP 737
557 OD1 ASP 737
558 OD2 ASP 737
580 CA ARG 740
581 C ARG 740
582 O ARG 740
583 CB ARG 740
590 N SER 741
591 CA SER 741
592 C SER 741
593 O SER 741
596 N VAL 742
597 CA VAL 742
598 C VAL 742
599 O VAL 742
600 CB VAL 742
601 CG1 VAL 742
602 CG2 VAL 742
603 N ARG 743
604 CA ARG 743
605 C ARG 743
606 O ARG 743
614 N ILE 744
615 CA ILE 744
618 CB ILE 744
619 CG1 ILE 744
621 CD1 ILE 744
695 CA PRO 754
696 C PRO 754
697 O PRO 754
698 CB PRO 754
972 O HIS 790
975 ND1 HIS 790
977 CE1 HIS 790
980 CA ARG 791
981 C ARG 791
982 O ARG 791
983 CB ARG 791
984 CG ARG 791
986 NE ARG 791
987 CZ ARG 791
989 NH2 ARG 791
990 N ASP 792
991 CA ASP 792
992 C ASP 792
993 O ASP 792
994 CB ASP 792
995 CG ASP 792
996 OD1 ASP 792
997 OD2 ASP 792
1000 C LEU 793
1001 O LEU 793
1010 CB ALA 794
1011 N ALA 795
1015 CB ALA 795
1017 CA ARG 796
1018 C ARG 796
1019 O ARG 796
1020 CB ARG 796
1021 CG ARG 796
1022 CD ARG 796
1023 NE ARG 796
1024 CZ ARG 796
1025 NH1 ARG 796
1026 NH2 ARG 796
1027 N ASN 797
1028 CA ASN 797
1031 CB ASN 797
1034 ND2 ASN 797
1127 CB CYS 809
1132 O ASP 810
1136 OD2 ASP 810
1154 C LEU 813
1155 O LEU 813
1156 CB LEU 813
1158 CD1 LEU 813
1160 N ALA 814
1161 CA ALA 814
1164 CB ALA 814
1178 C ASP 816
1179 O ASP 816
1180 CB ASP 816
1181 CG ASP 816
1183 OD2 ASP 816
1184 N ILE 817
1185 CA ILE 817
1187 O ILE 817
1188 CB ILE 817
1189 CG1 ILE 817
1190 CG2 ILE 817
1191 CD1 ILE 817
1194 C LYS 818
1195 O LYS 818
1211 C ASP 820
1212 O ASP 820
1218 CA SER 821
1219 C SER 821
1220 O SER 821
1221 CB SER 821
1222 OG SER 821
1223 N ASN 822
1224 CA ASN 822
1225 C ASN 822
1231 N TYR 823
1232 CA TYR 823
1235 CB TYR 823
1236 CG TYR 823
1238 CD2 TYR 823
1240 CE2 TYR 823
1243 N VAL 824
1245 C VAL 824
1246 O VAL 824
1247 CB VAL 824
1248 CG1 VAL 824
1249 CG2 VAL 824
1250 N VAL 825
1251 CA VAL 825
1255 CG1 VAL 825
1256 CG2 VAL 825
1257 N LYS 826
1261 CB LYS 826
1263 CD LYS 826
1265 NZ LYS 826
1271 CA ASN 828
1272 C ASN 828
1274 CB ASN 828
1275 CG ASN 828
1276 OD1 ASN 828
1278 N ALA 829
1279 CA ALA 829
1282 CB ALA 829
1306 CB PRO 832
1307 CG PRO 832
1316 N LYS 834
1317 CA LYS 834
1318 C LYS 834
1319 O LYS 834
1320 CB LYS 834
1325 N TRP 835
1326 CA TRP 835
1327 C TRP 835
1328 O TRP 835
1329 CB TRP 835
1335 CE3 TRP 835
1351 CB ALA 837
1360 CA GLU 839
1361 C GLU 839
1363 CB GLU 839
1368 N SER 840
1369 CA SER 840
1371 O SER 840
1372 CB SER 840
1373 OG SER 840
1406 SG CYS 844
1428 C THR 847
1429 O THR 847
1430 CB THR 847
1432 CG2 THR 847
1433 N PHE 848
1434 CA PHE 848
1437 CB PHE 848
1438 CG PHE 848
1439 CD1 PHE 848
1464 CG ASP 851
1465 OD1 ASP 851
1466 OD2 ASP 851
1468 CA VAL 852
1469 C VAL 852
1470 O VAL 852
1474 N TRP 853
1475 CA TRP 853
1476 C TRP 853
1477 O TRP 853
1481 CD2 TRP 853
1484 CE3 TRP 853
1486 CZ3 TRP 853
1487 CH2 TRP 853
1488 N SER 854
1489 CA SER 854
1492 CB SER 854
1493 OG SER 854
1515 CG1 ILE 857
1517 CD1 ILE 857
1538 CA TRP 860
1539 C TRP 860
1540 O TRP 860
1541 CB TRP 860
1542 CG TRP 860
1543 CD1 TRP 860
1544 CD2 TRP 860
1545 NE1 TRP 860
1546 CE2 TRP 860
1551 N GLU 861
1553 C GLU 861
1554 O GLU 861
1782 SD MET 889
1783 CE MET 889
2092 CG GLN 927
2093 CD GLN 927
2094 OE1 GLN 927
2095 NE2 GLN 927
Predicted site 2
Site Volume: 1230 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-726, -714, -642)
Max Coords: (-700, -690, -611)
1244 CA VAL 824
1245 C VAL 824
1246 O VAL 824
1250 N VAL 825
1251 CA VAL 825
1252 C VAL 825
1253 O VAL 825
1257 N LYS 826
1258 CA LYS 826
1259 C LYS 826
1261 CB LYS 826
1262 CG LYS 826
1266 N GLY 827
1267 CA GLY 827
1268 C GLY 827
1269 O GLY 827
1270 N ASN 828
1271 CA ASN 828
1277 ND2 ASN 828
1290 NE ARG 830
1291 CZ ARG 830
1292 NH1 ARG 830
1293 NH2 ARG 830
1298 CB LEU 831
1299 CG LEU 831
1300 CD1 LEU 831
1301 CD2 LEU 831
1395 C ASN 843
1396 O ASN 843
1397 CB ASN 843
1398 CG ASN 843
1399 OD1 ASN 843
1400 ND2 ASN 843
1423 CE2 TYR 846
1424 CZ TYR 846
1425 OH TYR 846
1431 OG1 THR 847
1525 CD2 PHE 858
1526 CE1 PHE 858
1527 CE2 PHE 858
1528 CZ PHE 858
1535 CD1 LEU 859
1536 CD2 LEU 859
1586 CA LEU 865
1588 O LEU 865
1590 CG LEU 865
1591 CD1 LEU 865
1604 CA SER 868
1605 C SER 868
1606 O SER 868
1607 CB SER 868
1608 OG SER 868
1609 N PRO 869
1610 CA PRO 869
1611 C PRO 869
1612 O PRO 869
1613 CB PRO 869
1614 CG PRO 869
1615 CD PRO 869
1629 CA PRO 871
1630 C PRO 871
1631 O PRO 871
1632 CB PRO 871
1633 CG PRO 871
1635 N GLY 872
1636 CA GLY 872
1637 C GLY 872
1638 O GLY 872
1639 N MET 873
1640 CA MET 873
1644 CG MET 873
1648 CA PRO 874
1649 C PRO 874
1650 O PRO 874
1651 CB PRO 874
1656 C VAL 875
1657 O VAL 875
1658 CB VAL 875
1659 CG1 VAL 875
1660 CG2 VAL 875
1662 CA ASP 876
1663 C ASP 876
1664 O ASP 876
1665 CB ASP 876
1666 CG ASP 876
1667 OD1 ASP 876
1668 OD2 ASP 876
1669 N SER 877
1670 CA SER 877
1671 C SER 877
1672 O SER 877
1673 CB SER 877
1675 N LYS 878
1676 CA LYS 878
1677 C LYS 878
1678 O LYS 878
1679 CB LYS 878
1680 CG LYS 878
1681 CD LYS 878
1682 CE LYS 878
1683 NZ LYS 878
1684 N PHE 879
1685 CA PHE 879
1686 C PHE 879
1687 O PHE 879
1688 CB PHE 879
1689 CG PHE 879
1690 CD1 PHE 879
1692 CE1 PHE 879
1695 N TYR 880
1696 CA TYR 880
1699 CB TYR 880
1718 C MET 882
1719 O MET 882
1720 CB MET 882
1721 CG MET 882
1723 CE MET 882
1725 CA ILE 883
1726 C ILE 883
1727 O ILE 883
1728 CB ILE 883
1729 CG1 ILE 883
1730 CG2 ILE 883
1731 CD1 ILE 883
1751 CA GLY 886
1752 C GLY 886
1753 O GLY 886
1754 N PHE 887
1755 CA PHE 887
1758 CB PHE 887
1760 CD1 PHE 887
1762 CE1 PHE 887
1780 CB MET 889
1781 CG MET 889
1799 CA PRO 892
1800 C PRO 892
1801 O PRO 892
1802 CB PRO 892
1805 N GLU 893
1806 CA GLU 893
1807 C GLU 893
1808 O GLU 893
1809 CB GLU 893
1810 CG GLU 893
1814 N HIS 894
1815 CA HIS 894
1818 CB HIS 894
1819 CG HIS 894
1820 ND1 HIS 894
1830 CA PRO 896
1831 C PRO 896
1833 CB PRO 896
1834 CG PRO 896
1836 N ALA 897
1837 CA ALA 897
1838 C ALA 897
1839 O ALA 897
1840 CB ALA 897
1842 CA GLU 898
1843 C GLU 898
1845 CB GLU 898
1846 CG GLU 898
1847 CD GLU 898
1848 OE1 GLU 898
1849 OE2 GLU 898
1850 N MET 899
1853 O MET 899
1854 CB MET 899
1855 CG MET 899
1874 CB ASP 901
1875 CG ASP 901
1876 OD1 ASP 901
1877 OD2 ASP 901
1881 O ILE 902
1888 C MET 903
1889 O MET 903
1894 N LYS 904
1895 CA LYS 904
1896 C LYS 904
1897 O LYS 904
1899 CG LYS 904
1901 CE LYS 904
1916 N TRP 907
1917 CA TRP 907
1920 CB TRP 907
1921 CG TRP 907
1922 CD1 TRP 907
1923 CD2 TRP 907
1925 CE2 TRP 907
1926 CE3 TRP 907
1928 CZ3 TRP 907
1937 OD2 ASP 908
2030 CA ILE 920
2031 C ILE 920
2032 O ILE 920
2033 CB ILE 920
2034 CG1 ILE 920
2053 N LEU 923
2054 CA LEU 923
2055 C LEU 923
2056 O LEU 923
2057 CB LEU 923
2058 CG LEU 923
2059 CD1 LEU 923
2060 CD2 LEU 923
2061 N ILE 924
2062 CA ILE 924
2064 O ILE 924
2065 CB ILE 924
2067 CG2 ILE 924
2069 N GLU 925
2070 CA GLU 925
2071 C GLU 925
2072 O GLU 925
2073 CB GLU 925
2074 CG GLU 925
2075 CD GLU 925
2076 OE1 GLU 925
2077 OE2 GLU 925
2080 C LYS 926
2087 N GLN 927
2088 CA GLN 927
2091 CB GLN 927
2092 CG GLN 927
2093 CD GLN 927
2094 OE1 GLN 927
2095 NE2 GLN 927
2115 CG GLU 930
Predicted site 3
Site Volume: 643 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-723, -718, -618)
Max Coords: (-697, -696, -593)
688 CA THR 753
690 O THR 753
691 CB THR 753
692 OG1 THR 753
693 CG2 THR 753
710 CB ILE 756
712 CG2 ILE 756
713 CD1 ILE 756
714 N MET 757
719 CG MET 757
720 SD MET 757
721 CE MET 757
743 CB ASP 760
744 CG ASP 760
745 OD1 ASP 760
746 OD2 ASP 760
758 C LEU 762
759 O LEU 762
760 CB LEU 762
766 C ALA 763
767 O ALA 763
768 CB ALA 763
769 N LEU 764
770 CA LEU 764
771 C LEU 764
772 O LEU 764
774 CG LEU 764
775 CD1 LEU 764
776 CD2 LEU 764
777 N ASP 765
790 CG LEU 766
792 CD2 LEU 766
806 CB ASP 768
807 CG ASP 768
808 OD1 ASP 768
809 OD2 ASP 768
811 CA LEU 769
812 C LEU 769
813 O LEU 769
814 CB LEU 769
815 CG LEU 769
816 CD1 LEU 769
817 CD2 LEU 769
819 CA LEU 770
820 C LEU 770
821 O LEU 770
822 CB LEU 770
823 CG LEU 770
824 CD1 LEU 770
825 CD2 LEU 770
836 CB PHE 772
838 CD1 PHE 772
840 CE1 PHE 772
847 CB SER 773
848 OG SER 773
849 N TYR 774
850 CA TYR 774
1058 CD2 LEU 800
1424 CZ TYR 846
1425 OH TYR 846
1435 C PHE 848
1436 O PHE 848
1438 CG PHE 848
1440 CD2 PHE 848
1442 CE2 PHE 848
1443 CZ PHE 848
1445 CA GLU 849
1446 C GLU 849
1447 O GLU 849
1448 CB GLU 849
1449 CG GLU 849
1450 CD GLU 849
1451 OE1 GLU 849
1452 OE2 GLU 849
1453 N SER 850
1454 CA SER 850
1455 C SER 850
1456 O SER 850
1458 OG SER 850
1472 CG1 VAL 852
1473 CG2 VAL 852
1898 CB LYS 904
1899 CG LYS 904
1900 CD LYS 904
1934 CB ASP 908
1935 CG ASP 908
1936 OD1 ASP 908
1937 OD2 ASP 908
1938 N ALA 909
1939 CA ALA 909
1940 C ALA 909
1941 O ALA 909
1942 CB ALA 909
1945 C ASP 910
1946 O ASP 910
1952 CA PRO 911
1955 CB PRO 911
1956 CG PRO 911
1958 N LEU 912
1987 CA PRO 915
1988 C PRO 915
1989 O PRO 915
1990 CB PRO 915
1991 CG PRO 915
1992 CD PRO 915
1994 CA THR 916
1995 C THR 916
1996 O THR 916
1997 CB THR 916
1998 OG1 THR 916
1999 CG2 THR 916
2015 CB LYS 918
2017 CD LYS 918
2019 NZ LYS 918
2024 CB GLN 919
2025 CG GLN 919
2026 CD GLN 919
2027 OE1 GLN 919
2028 NE2 GLN 919
2034 CG1 ILE 920
2169 OH TYR 936
2199 C ASN 941
2200 O ASN 941
2201 CB ASN 941
2202 CG ASN 941
2203 OD1 ASN 941
2204 ND2 ASN 941
2205 N CYS 942
2206 CA CYS 942
2210 SG CYS 942
Predicted site 4
Site Volume: 441 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-738, -732, -627)
Max Coords: (-721, -707, -595)
86 CD1 LEU 678
87 CD2 LEU 678
102 OD1 ASN 680
118 O LEU 682
124 CA ARG 683
127 CB ARG 683
129 CD ARG 683
130 NE ARG 683
131 CZ ARG 683
132 NH1 ARG 683
133 NH2 ARG 683
153 NZ LYS 685
169 CB ASP 687
170 CG ASP 687
171 OD1 ASP 687
172 OD2 ASP 687
249 CB HIS 697
250 CG HIS 697
251 ND1 HIS 697
252 CD2 HIS 697
253 CE1 HIS 697
254 NE2 HIS 697
264 CB GLU 699
265 CG GLU 699
401 O SER 717
405 CA THR 718
407 O THR 718
408 CB THR 718
410 CG2 THR 718
420 CA GLU 720
421 C GLU 720
422 O GLU 720
423 CB GLU 720
428 N TYR 721
429 CA TYR 721
432 CB TYR 721
433 CG TYR 721
434 CD1 TYR 721
435 CD2 TYR 721
436 CE1 TYR 721
437 CE2 TYR 721
438 CZ TYR 721
439 OH TYR 721
500 O TYR 730
501 CB TYR 730
502 CG TYR 730
503 CD1 TYR 730
504 CD2 TYR 730
505 CE1 TYR 730
506 CE2 TYR 730
507 CZ TYR 730
508 OH TYR 730
521 CG1 VAL 732
522 CG2 VAL 732
544 CE LYS 735
618 CB ILE 744
619 CG1 ILE 744
620 CG2 ILE 744
621 CD1 ILE 744
630 CB SER 746
631 OG SER 746
635 O TYR 747
645 CA ILE 748
650 CG2 ILE 748
651 CD1 ILE 748
652 N GLU 749
654 C GLU 749
655 O GLU 749
663 C ARG 750
664 O ARG 750
672 N ASP 751
673 CA ASP 751
674 C ASP 751
675 O ASP 751
680 N VAL 752
686 CG2 VAL 752
959 CB CYS 788
960 SG CYS 788
961 N ILE 789
963 C ILE 789
964 O ILE 789
965 CB ILE 789
985 CD ARG 791
988 NH1 ARG 791
1047 CB LEU 799
1048 CG LEU 799
1049 CD1 LEU 799
1050 CD2 LEU 799
1124 CA CYS 809
1125 C CYS 809
1127 CB CYS 809
1128 SG CYS 809
1129 N ASP 810
1130 CA ASP 810
1131 C ASP 810
1135 OD1 ASP 810
1137 N PHE 811
1140 O PHE 811
1149 CA GLY 812
1150 C GLY 812
1151 O GLY 812
1152 N LEU 813
1157 CG LEU 813
1158 CD1 LEU 813
1159 CD2 LEU 813
1162 C ALA 814
1164 CB ALA 814
1165 N ARG 815
1166 CA ARG 815
1169 CB ARG 815
1171 CD ARG 815
1183 OD2 ASP 816
1345 SD MET 836
1346 CE MET 836
Predicted site 5
Site Volume: 255 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-726, -720, -618)
Max Coords: (-704, -705, -602)
834 C PHE 772
835 O PHE 772
836 CB PHE 772
843 N SER 773
844 CA SER 773
845 C SER 773
846 O SER 773
847 CB SER 773
848 OG SER 773
874 CB VAL 776
875 CG1 VAL 776
876 CG2 VAL 776
877 N ALA 777
878 CA ALA 777
881 CB ALA 777
999 CA LEU 793
1000 C LEU 793
1001 O LEU 793
1002 CB LEU 793
1003 CG LEU 793
1004 CD1 LEU 793
1005 CD2 LEU 793
1006 N ALA 794
1007 CA ALA 794
1008 C ALA 794
1009 O ALA 794
1011 N ALA 795
1012 CA ALA 795
1030 O ASN 797
1033 OD1 ASN 797
1040 CG1 ILE 798
1041 CG2 ILE 798
1042 CD1 ILE 798
1121 CG2 ILE 808
1248 CG1 VAL 824
1249 CG2 VAL 824
1388 CD1 PHE 842
1390 CE1 PHE 842
1402 CA CYS 844
1403 C CYS 844
1404 O CYS 844
1405 CB CYS 844
1407 N VAL 845
1408 CA VAL 845
1410 O VAL 845
1411 CB VAL 845
1412 CG1 VAL 845
1413 CG2 VAL 845
1437 CB PHE 848
1438 CG PHE 848
1439 CD1 PHE 848
1440 CD2 PHE 848
1442 CE2 PHE 848
1449 CG GLU 849
1450 CD GLU 849
1451 OE1 GLU 849
1452 OE2 GLU 849
2013 C LYS 918
2020 N GLN 919
2021 CA GLN 919
2025 CG GLN 919
2026 CD GLN 919
2027 OE1 GLN 919
2028 NE2 GLN 919
2049 CG GLN 922
2050 CD GLN 922
2051 OE1 GLN 922
2052 NE2 GLN 922
2137 CG ASN 933
2139 ND2 ASN 933
2162 CB TYR 936
2163 CG TYR 936
2164 CD1 TYR 936
2165 CD2 TYR 936
2166 CE1 TYR 936
2167 CE2 TYR 936
2168 CZ TYR 936
2169 OH TYR 936
Predicted site 6
Site Volume: 161 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-747, -733, -611)
Max Coords: (-731, -718, -598)
2 CA VAL 668
3 C VAL 668
4 O VAL 668
11 O ILE 669
12 CB ILE 669
14 CG2 ILE 669
15 CD1 ILE 669
28 CG GLU 671
29 CD GLU 671
30 OE1 GLU 671
31 OE2 GLU 671
157 O ARG 686
162 CZ ARG 686
163 NH1 ARG 686
164 NH2 ARG 686
195 CG1 ILE 690
197 CD1 ILE 690
227 NE2 GLN 694
238 CA ASP 696
239 C ASP 696
240 O ASP 696
241 CB ASP 696
242 CG ASP 696
243 OD1 ASP 696
244 OD2 ASP 696
245 N HIS 697
246 CA HIS 697
247 C HIS 697
248 O HIS 697
255 N ALA 698
256 CA ALA 698
259 CB ALA 698
527 CB PRO 733
528 CG PRO 733
532 C THR 734
533 O THR 734
534 CB THR 734
535 OG1 THR 734
536 CG2 THR 734
537 N LYS 735
546 N ALA 736
550 CB ALA 736
605 C ARG 743
606 O ARG 743
607 CB ARG 743
609 CD ARG 743
614 N ILE 744
615 CA ILE 744
616 C ILE 744
622 N GLY 745
623 CA GLY 745
624 C GLY 745
626 N SER 746
Predicted site 7
Site Volume: 95 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-720, -716, -640)
Max Coords: (-707, -704, -626)
1276 OD1 ASN 828
1277 ND2 ASN 828
1730 CG2 ILE 883
1735 O LYS 884
1743 C GLU 885
1744 O GLU 885
1745 CB GLU 885
1750 N GLY 886
1751 CA GLY 886
1766 CA ARG 888
1767 C ARG 888
1768 O ARG 888
1769 CB ARG 888
1771 CD ARG 888
1776 N MET 889
1777 CA MET 889
1781 CG MET 889
1782 SD MET 889
2079 CA LYS 926
2080 C LYS 926
2082 CB LYS 926
2087 N GLN 927
2091 CB GLN 927
2096 N ILE 928
2101 CG1 ILE 928
2103 CD1 ILE 928
Predicted site 8
Site Volume: 95 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-722, -724, -623)
Max Coords: (-709, -712, -613)
902 CE MET 780
928 CA ALA 784
931 CB ALA 784
958 O CYS 788
966 CG1 ILE 789
1353 CA PRO 838
1354 C PRO 838
1355 O PRO 838
1356 CB PRO 838
1366 OE1 GLU 839
1375 CA ILE 841
1376 C ILE 841
1378 CB ILE 841
1380 CG2 ILE 841
1381 CD1 ILE 841
1382 N PHE 842
1383 CA PHE 842
1386 CB PHE 842
1388 CD1 PHE 842
2132 N ASN 933
2133 CA ASN 933
2136 CB ASN 933
2137 CG ASN 933
2138 OD1 ASN 933
2139 ND2 ASN 933
Predicted site 9
Site Volume: 55 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-750, -730, -615)
Max Coords: (-740, -719, -604)
1 N VAL 668
2 CA VAL 668
7 CG2 VAL 668
547 CA ALA 736
548 C ALA 736
549 O ALA 736
550 CB ALA 736
551 N ASP 737
552 CA ASP 737
553 C ASP 737
559 N LYS 738
560 CA LYS 738
562 O LYS 738
563 CB LYS 738
564 CG LYS 738
591 CA SER 741
592 C SER 741
593 O SER 741
594 CB SER 741
596 N VAL 742
597 CA VAL 742
603 N ARG 743
607 CB ARG 743
608 CG ARG 743
609 CD ARG 743
610 NE ARG 743
611 CZ ARG 743
612 NH1 ARG 743
613 NH2 ARG 743
Predicted site 10
Site Volume: 24 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-729, -708, -628)
Max Coords: (-721, -699, -619)
1227 CB ASN 822
1228 CG ASN 822
1229 OD1 ASN 822
1230 ND2 ASN 822
1254 CB VAL 825
1255 CG1 VAL 825
1256 CG2 VAL 825
1288 CG ARG 830
1584 OG SER 864
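The atom listings for the predicted sites above are easier to compare once they are reduced to residue lists. The following is a minimal sketch, assuming the plain-text layout shown in this section ("Predicted site N", the two volume lines, then one "atom-number atom-name residue residue-number" row per atom). The function `parse_sites` and its dictionary keys are illustrative and not part of Q-SiteFinder.

```python
import re

SITE = re.compile(r"^Predicted site (\d+)$")
VOLUME = re.compile(r"^(Site|Protein) Volume: (\d+) Cubic Angstroms$")
ATOM = re.compile(r"^(\d+)\s+(\S+)\s+([A-Z]{3})\s+(\d+)$")


def parse_sites(text: str) -> dict:
    """Map site number -> volumes and ordered list of unique residues."""
    sites = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if m := SITE.match(line):
            current = {"site_volume": None, "protein_volume": None, "residues": []}
            sites[int(m.group(1))] = current
        elif current is None:
            continue  # ignore anything before the first site header
        elif m := VOLUME.match(line):
            key = "site_volume" if m.group(1) == "Site" else "protein_volume"
            current[key] = int(m.group(2))
        elif m := ATOM.match(line):
            residue = (m.group(3), int(m.group(4)))
            if residue not in current["residues"]:
                current["residues"].append(residue)
    return sites


# Excerpt copied from predicted site 10 above.
sample = """\
Predicted site 10
Site Volume: 24 Cubic Angstroms
Protein Volume: 25273 Cubic Angstroms
Min Coords: (-729, -708, -628)
Max Coords: (-721, -699, -619)
1227 CB ASN 822
1228 CG ASN 822
1254 CB VAL 825
1288 CG ARG 830
1584 OG SER 864
"""

sites = parse_sites(sample)
site = sites[10]
fraction = site["site_volume"] / site["protein_volume"]
print(site["residues"], f"{fraction:.4%}")
```

Applied to the full listing, the same function gives one residue list per site, which is the form needed when choosing residues for docking.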
Wire Frame:-
Back Bone:-
Sticks:-
Space fill:-
Ball and Sticks:-
Ribbons:-
Strands:-
Individual Docking Print Screens (ArgusLab, 2009)
702 LEU - Bortezomib
680 ASN - Busulfan
681 PHE - Hydroxyurea
699 GLU - Ifosfamide
700 ALA - Leutrol
Similar Molecules Print Screens (ArgusLab, 2009)
1-methoxy-4-methylsulfonyloxy-butane
Dichlorphenyloxyurea
Glufosfamide
Ifosfamide mustard
Ifosfamide mustard
Mannogranol
Mesna
Methylolurea
Myleran
N,N-diethyl-N'-hydroxyurea
Velcade
ZILEUTON
FINAL MOLECULE IN ARGUSLAB :- DICHLORPHENYLOXYUREA (ArgusLab, 2009)
HIGHEST OCCUPIED MOLECULAR ORBITAL (HOMO) SURFACE
LOWEST UNOCCUPIED MOLECULAR ORBITAL (LUMO) SURFACE
ELECTROSTATIC POTENTIAL MAPPED DENSITY
FINAL MOLECULE IN HYPERCHEM
SINGLE POINT
GEOMETRY OPTIMIZATION
FINAL MOLECULE WITH QSAR PROPERTIES (QSAR, 2009):
QSAR PROPERTIES
Partial Charge (Net Charge) = 0.00 e
Surface Area (Approx) = 401.20 Å²
Surface Area (Grid) = 354.46 Å²
Volume = 533.85 Å³
Hydration Energy = -5.77 kcal/mol
Log P = 1.06
Refractivity = 43.13 Å³
Polarizability = 17.63 Å³
Mass = 215.00 amu
FINAL MOL IN CAChe (CAChe, 2009):
FINAL MOL AFTER UV-VISIBLE TRANSITION:
UV VISIBLE TRANSITION GRAPH:
FINAL MOL AFTER IR TRANSITION:
IR TRANSITION GRAPH:
ASN 680
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.83547
2. Dichlorphenyloxyurea -6.29333
3. Methylolurea -5.52455
4. Mannogranol -3.76273
5. N,N-diethyl-N'-hydroxyurea -2.17357
Elapsed time for calculation = 33 seconds
PHE 681
Summary of results in order of docking score (kcal/mol)
********************************************************
1. Dichlorphenyloxyurea -7.38248
2. mesna -7.08203
3. Methylolurea -5.54045
4. Mannogranol -2.87791
5. N,N-diethyl-N'-hydroxyurea -2.25558
LEU 682
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -7.49889
2. Dichlorphenyloxyurea -7.42679
3. Methylolurea -5.50996
4. Mannogranol -3.46054
5. N,N-diethyl-N'-hydroxyurea -2.06696
ARG 683
Summary of results in order of docking score (kcal/mol)
********************************************************
1. Dichlorphenyloxyurea -7.51376
2. mesna -7.20128
3. Methylolurea -5.51805
4. Mannogranol -3.89684
5. N,N-diethyl-N'-hydroxyurea -2.49132
ARG 684
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.77809
2. Dichlorphenyloxyurea -5.80198
3. Methylolurea -5.24809
4. Mannogranol -3.19356
5. N,N-diethyl-N'-hydroxyurea -2.00335
GLU 699
Summary of results in order of docking score (kcal/mol)
********************************************************
1. Dichlorphenyloxyurea -7.90479
2. mesna -6.30299
3. Methylolurea -5.01062
4. Mannogranol -2.10376
5. N,N-diethyl-N'-hydroxyurea -2.06706
ALA 700
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.17359
2. Dichlorphenyloxyurea -5.75419
3. Methylolurea -5.22776
4. Mannogranol -2.87899
5. N,N-diethyl-N'-hydroxyurea -1.70079
ALA 701
Summary of results in order of docking score (kcal/mol)
********************************************************
1. Dichlorphenyloxyurea -6.22009
2. mesna -6.09249
3. Methylolurea -4.98302
4. Mannogranol -2.67182
5. N,N-diethyl-N'-hydroxyurea -1.83188
LEU 702
Summary of results in order of docking score (kcal/mol)
********************************************************
1. Dichlorphenyloxyurea -6.09421
2. Methylolurea -4.99511
3. mesna -4.74083
4. Mannogranol -2.38363
5. N,N-diethyl-N'-hydroxyurea -1.82024
TYR 703
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.23621
2. Dichlorphenyloxyurea -6.1831
3. Methylolurea -5.33204
4. Mannogranol -2.44151
5. N,N-diethyl-N'-hydroxyurea -1.95901
LYS 704
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.42078
2. Dichlorphenyloxyurea -5.74163
3. Methylolurea -4.97795
4. Mannogranol -2.23422
5. N,N-diethyl-N'-hydroxyurea -2.05443
LEU 706
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.85786
2. Dichlorphenyloxyurea -6.17681
3. Methylolurea -5.39481
4. Mannogranol -2.70755
5. N,N-diethyl-N'-hydroxyurea -1.95179
HIS 708
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -5.99358
2. Dichlorphenyloxyurea -5.74579
3. Methylolurea -5.02035
4. Mannogranol -2.75884
5. N,N-diethyl-N'-hydroxyurea -1.94941
SER 709
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.03026
2. Dichlorphenyloxyurea -5.65809
3. Methylolurea -5.14525
4. Mannogranol -2.47257
5. N,N-diethyl-N'-hydroxyurea -1.38721
LYS 710
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.47998
2. Dichlorphenyloxyurea -6.19609
3. Methylolurea -4.99425
4. Mannogranol -2.26407
5. N,N-diethyl-N'-hydroxyurea -1.7373
GLU 711
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.41158
2. Methylolurea -5.14803
3. Dichlorphenyloxyurea -5.01222
4. Mannogranol -2.31393
5. N,N-diethyl-N'-hydroxyurea -1.42509
SER 712
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.43851
2. Methylolurea -4.97809
3. Dichlorphenyloxyurea -4.95741
4. Mannogranol -1.91657
5. N,N-diethyl-N'-hydroxyurea -1.6608
CYS 714
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.48801
2. Dichlorphenyloxyurea -5.99383
3. Methylolurea -5.26983
4. Mannogranol -1.89502
5. N,N-diethyl-N'-hydroxyurea -1.60012
SER 715
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.48798
2. Dichlorphenyloxyurea -5.93043
3. Methylolurea -5.13367
4. Mannogranol -2.53956
5. N,N-diethyl-N'-hydroxyurea -1.63789
THR 718
Summary of results in order of docking score (kcal/mol)
********************************************************
1. Dichlorphenyloxyurea -8.0916
2. mesna -6.37805
3. Methylolurea -5.00683
4. N,N-diethyl-N'-hydroxyurea -2.68111
5. Mannogranol -2.19805
ASN 719
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.54038
2. Dichlorphenyloxyurea -5.5934
3. Methylolurea -4.98243
4. Mannogranol -2.08938
5. N,N-diethyl-N'-hydroxyurea -1.95906
LYS 735
Summary of results in order of docking score (kcal/mol)
********************************************************
1. Dichlorphenyloxyurea -7.69478
2. mesna -6.44391
3. Methylolurea -5.10646
4. Mannogranol -2.05962
5. N,N-diethyl-N'-hydroxyurea -1.92899
ALA 736
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.20818
2. Dichlorphenyloxyurea -5.98871
3. Methylolurea -5.35323
4. Mannogranol -2.03904
5. N,N-diethyl-N'-hydroxyurea -1.64999
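The per-residue summaries above share a fixed layout (residue label, header line, ranked ligand lines), so they can be read programmatically. The following is a minimal sketch, assuming the text layout shown in this report; the names `parse_summaries` and `best_ligand` are hypothetical helpers introduced for illustration and are not part of the docking software.

```python
import re

# Sample text copied from two of the per-residue summaries above.
SAMPLE = """\
THR 718
Summary of results in order of docking score (kcal/mol)
********************************************************
1. Dichlorphenyloxyurea -8.0916
2. mesna -6.37805
3. Methylolurea -5.00683
4. N,N-diethyl-N'-hydroxyurea -2.68111
5. Mannogranol -2.19805
ALA 736
Summary of results in order of docking score (kcal/mol)
********************************************************
1. mesna -6.20818
2. Dichlorphenyloxyurea -5.98871
3. Methylolurea -5.35323
4. Mannogranol -2.03904
5. N,N-diethyl-N'-hydroxyurea -1.64999
"""

# A residue label is a three-letter code followed by a position, e.g. "THR 718".
RESIDUE = re.compile(r"^([A-Z]{3}) (\d+)$")
# A result line is "<rank>. <ligand name> <score>".
ENTRY = re.compile(r"^(\d+)\.\s+(.+?)\s+(-?\d+(?:\.\d+)?)$")


def parse_summaries(text):
    """Return {residue: [(ligand, score), ...]} in the order listed."""
    results = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = RESIDUE.match(line)
        if match:
            current = f"{match.group(1)} {match.group(2)}"
            results[current] = []
            continue
        match = ENTRY.match(line)
        if match and current is not None:
            results[current].append((match.group(2), float(match.group(3))))
    return results


def best_ligand(entries):
    """Return the (ligand, score) pair with the lowest docking score."""
    return min(entries, key=lambda entry: entry[1])


if __name__ == "__main__":
    for residue, entries in parse_summaries(SAMPLE).items():
        print(residue, best_ligand(entries))
```

Header and separator lines are skipped because they match neither pattern, so the same function could be applied to the full listing without preprocessing.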
CONCLUSION
The specific protein causing the disease, leukemia, was identified. Homology modeling of the
protein was done using SPDBV (Swiss-PdbViewer). Active site analysis was done with the
Q-SiteFinder method, and the active site amino acids were noted. Standard drugs available on
the market that target the protein were identified as follows:
1. Bortezomib
2. Busulfan
3. Hydroxyurea
4. Ifosfamide
5. Leutrol
Similar molecules (10-15) for these standard molecules were modeled using ArgusLab. A
database of all these molecules was created in Vega ZZ.
Virtual screening of these molecules was done through the protein database docking method.
The results show that Dichlorphenyloxyurea gives the lowest docking energy observed among
the amino acids of the potential active site (for example, -8.0916 kcal/mol at THR 718 and
-7.69478 kcal/mol at LYS 735). We therefore consider this ligand the potential lead molecule
for targeting the protein in the treatment of the disease.
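One way to make the lead selection explicit is to take, for each ligand, its lowest docking score over the residues and then pick the ligand with the lowest value overall. The sketch below is illustrative only: it uses the scores of three residues copied from the results section, and selecting by the single lowest score is an assumption made here, not necessarily the exact criterion applied in this project. The function names are hypothetical.

```python
# Docking scores (kcal/mol) copied from three of the per-residue summaries
# in this report; the remaining residues are omitted for brevity.
SCORES = {
    "THR 718": {
        "Dichlorphenyloxyurea": -8.0916,
        "mesna": -6.37805,
        "Methylolurea": -5.00683,
        "N,N-diethyl-N'-hydroxyurea": -2.68111,
        "Mannogranol": -2.19805,
    },
    "LYS 735": {
        "Dichlorphenyloxyurea": -7.69478,
        "mesna": -6.44391,
        "Methylolurea": -5.10646,
        "Mannogranol": -2.05962,
        "N,N-diethyl-N'-hydroxyurea": -1.92899,
    },
    "ALA 736": {
        "mesna": -6.20818,
        "Dichlorphenyloxyurea": -5.98871,
        "Methylolurea": -5.35323,
        "Mannogranol": -2.03904,
        "N,N-diethyl-N'-hydroxyurea": -1.64999,
    },
}


def best_score_per_ligand(scores):
    """Map each ligand to (lowest score, residue where it occurs)."""
    best = {}
    for residue, ligands in scores.items():
        for ligand, score in ligands.items():
            if ligand not in best or score < best[ligand][0]:
                best[ligand] = (score, residue)
    return best


def lead_candidate(scores):
    """Return (ligand, (score, residue)) for the lowest score overall."""
    best = best_score_per_ligand(scores)
    ligand = min(best, key=lambda name: best[name][0])
    return ligand, best[ligand]


if __name__ == "__main__":
    print(lead_candidate(SCORES))
```

Other criteria, such as the mean score across residues or the number of residues at which a ligand ranks first, could give a different ordering, so the choice of criterion should be stated alongside the result.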
REFERENCES
• (ArgusLab, 2009)
ArgusLab. (2009). Molecular modeling, graphics, and drug design program. (Biomed
Bioinformatics, Medwin Hospitals: Hyderabad, India).
• (CAChe, 2009)
CAChe. (2009). CAChe chemical tool. (Biomed Bioinformatics, Medwin Hospitals:
Hyderabad, India).
• (Cavity, 2009)
Cavity. (2009). SPDBV tool license agreement for the usage of Cavity viewer. (Biomed
Bioinformatics, Medwin Hospitals: Hyderabad, India), Available from Swiss-Pdb database.
• (ClustalW, 2009)
ClustalW. (2009). ClustalW tool for multiple sequence alignment. (European Molecular
Biology Laboratory: Hinxton, Cambridgeshire, UK), Available from EBI. Retrieved from
http://www.ebi.ac.uk/Tools/msa/clustalw2/
• (F-Electronic PCR, 2009)
F-Electronic PCR. (2009). F-electronic PCR tool. (The National Center for Biotechnology
Information: Bethesda, USA), Available from NCBI. Retrieved from
http://www.ncbi.nlm.nih.gov/projects/e-pcr/forward.cgi
• (Genscan, 2009)
Genscan. (2009). Genscan tool, identification of complete gene structures in genomic DNA.
(Stanford University), Available from The GENSCAN Web Server at MIT. Retrieved from
http://genes.mit.edu/GENSCAN.html
• (Genpept, 2009)
Genpept. (2009). Genpept tool for mast/stem cell growth factor receptor kit isoform 1
precursor [Homo sapiens]. (The National Center for Biotechnology Information: Bethesda,
USA), Available from NCBI. Retrieved from http://www.ncbi.nlm.nih.gov/protein/4557695
• (GOR IV, 2009)
GOR IV. (2009). GOR IV secondary structure prediction method. (Pôle BioInformatique
Lyonnais: Lyon, France), Available from NPSA. Retrieved from
http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html
• (Kumar, 2002)
Kumar, M. (2002). Bioinformatics. Retrieved from
http://dmohankumar.files.wordpress.com/2012/08/bioinformatics.pdf
• (levitra, 2002)
levitra. (2002). Cancer. Unpublished raw data, Org, UK. Retrieved from
http://europe-levitra.com/Cancer-articles-au-Cancer.html
• (MapViewer, 2009)
MapViewer. (2009). MapViewer tool. (The National Center for Biotechnology Information:
Bethesda, USA), Available from NCBI. Retrieved from
http://www.ncbi.nlm.nih.gov/projects/mapview/map_search.cgi?taxid=9606&build=104.0
• (NCBI, 2009)
NCBI. (2009). Gene responsible for chronic myelogenous leukemia. (The National Center for
Biotechnology Information: Bethesda, USA), Available from NCBI. Retrieved from
www.ncbi.nlm.nih.gov
• (NetAcet, 2009)
NetAcet. (2009). NetAcet 1.0 server. (Technical University of Denmark). Retrieved from
http://www.cbs.dtu.dk/services/NetAcet/
• (NetNGlyc, 2009)
NetNGlyc. (2009). NetNGlyc 1.0 server. (Technical University of Denmark). Retrieved from
http://www.cbs.dtu.dk/services/NetNGlyc/
• (NetOGlyc, 2009)
NetOGlyc. (2009). NetOGlyc 1.0 server. (Technical University of Denmark). Retrieved from
http://www.cbs.dtu.dk/services/NetOGlyc-3.1
• (NetPhos, 2009)
NetPhos. (2009). NetPhos 2.0 server. (Technical University of Denmark). Retrieved from
http://www.cbs.dtu.dk/services/NetPhos/
• (NT-BLAST, 2009)
NT-BLAST. (2009). Standard nucleotide BLAST. (The National Center for Biotechnology
Information: Bethesda, USA), Available from NCBI. Retrieved from
http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&BLAST_PROGRAMS=megaBlast&PAGE_TYPE=BlastSearch&SHOW_DEFAULTS=on&LINK_LOC=blasthome
• (ORF, 2009)
ORF. (2009). Open reading frame finder. (The National Center for Biotechnology
Information: Bethesda, USA), Available from NCBI. Retrieved from
http://www.ncbi.nlm.nih.gov/gorf/gorf.html
• (Protein-Fasta, 2009)
Protein-Fasta. (2009). Mast/stem cell growth factor receptor kit isoform 1 precursor [Homo
sapiens]. (The National Center for Biotechnology Information: Bethesda, USA), Available
from NCBI. Retrieved from http://www.ncbi.nlm.nih.gov/protein/4557695?report=fasta
• (ProtScale, 2009)
ProtScale. (2009). ProtScale tool. (SIB Swiss Institute of Bioinformatics), Available from
ExPASy-Expert Protein Analysis System. Retrieved from http://web.expasy.org/protscale/
• (Protparam, 2009)
Protparam. (2009). ProtParam tool by SIB Swiss Institute of Bioinformatics. (SIB Swiss
Institute of Bioinformatics), Available from ExPASy-Expert Protein Analysis System.
Retrieved from http://web.expasy.org/protparam/
• (QSAR, 2009)
QSAR. (2009). QSAR properties. (Biomed Bioinformatics, Medwin Hospitals: Hyderabad,
India).
• (Q-SiteFinder, 2009)
Q-SiteFinder. (2009). Q-SiteFinder ligand binding site prediction. (University of
Leeds). Retrieved from http://www.modelling.leeds.ac.uk/qsitefinder/
• (RasMol, 2009)
RasMol. (2009). RasMol and OpenRasMol molecular graphics visualisation tool. Available
from OpenRasMol. Retrieved from www.RasMol.org
• (R-Electronic PCR, 2009)
R-Electronic PCR. (2009). R-electronic PCR tool. (The National Center for Biotechnology
Information: Bethesda, USA), Available from NCBI. Retrieved from
http://www.ncbi.nlm.nih.gov/projects/e-pcr/reverse.cgi
• (SignalP, 2009)
SignalP. (2009). SignalP 4.1 server. (Technical University of Denmark). Retrieved from
http://www.cbs.dtu.dk/services/SignalP/
• (SOPMA, 2009)
SOPMA. (2009). SOPMA secondary structure prediction method. (Pôle BioInformatique
Lyonnais: Lyon, France), Available from NPSA. Retrieved from
http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html
• (SPDBV, 2009)
SPDBV. (2009). SPDBV tool license agreement for the usage of Swiss-PdbViewer. (Biomed
Bioinformatics, Medwin Hospitals: Hyderabad, India), Available from Swiss-Pdb database.
• (SOSUI, 2009)
SOSUI. (2009). SOSUI: Submit a protein sequence. Retrieved from
http://bp.nuap.nagoya-u.ac.jp/sosui/sosui_submit.html
• (Sulfinator, 2009)
Sulfinator. (2009). The Sulfinator tool. (SIB Swiss Institute of Bioinformatics), Available
from ExPASy-Expert Protein Analysis System. Retrieved from
http://web.expasy.org/sulfinator/
• (Vecscreen, 2009)
Vecscreen. (2009). VecScreen tool. (The National Center for Biotechnology Information:
Bethesda, USA), Available from NCBI. Retrieved from
http://www.ncbi.nlm.nih.gov/tools/vecscreen/
ABOUT AUTHOR
Forensic Expert & Investigator
Fellow International Science Congress Association
Fellow-SIFS INDIA
M.Sc. in Forensic Science (Cambridge, UK)
M.Sc. in Bioinformatics (Delhi, India)
Microbiologist + Pharma Chemistry (DAVV-Indore, India)
In October 2012, after returning from Cambridge (UK) to India, I started working with the
government-registered organization SIFS INDIA, where I deal with various types of forensic
services, such as Forensic Education, Forensic Investigation, Forensic Training, Forensic
Internship, Forensic Research, Security Services, and the Scientific Equipment Department.
At SIFS INDIA, I have gained experience on real cases in taking fingerprints from dead
bodies, criminals, and suspects, and from crime scenes, as well as for Police Clearance
Certificates (PCC) for private organizations, for visa and immigration purposes, and for the
FBI. In addition, I have given expert reports and expert opinions after analyzing many
handwriting, signature, and fingerprint cases, as well as MMS cases, motor vehicle
identification cases, and voice forgery recognition cases.
Sharma Mahesh
Forensic Expert & Investigator