DR. OBAIDUR RAHMAN
Bioinformatics
TOPIC 4
Protein BLAST: BLASTP
LECTURE TOPICS:
Protein BLAST (BLASTP) at the NCBI and ExPASy Websites
The genetic code
Amino acids and their overlapping properties
The BLASTP scoring matrix
BLASTN vs. BLASTP
Type      Query        Database
BLASTN    Nucleotide   Nucleotide
BLASTP    Protein      Protein
There are 64 possible triplets of the four nucleotides. How?
4 bases in the first position
4 bases in the second position
4 bases in the third position
4 x 4 x 4 = 64 codons
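The counting argument above can be checked by listing every triplet. This is a minimal sketch; the names `BASES` and `codons` are illustrative and do not come from the slides.

```python
from itertools import product

BASES = "TCAG"  # the four DNA bases; the order is only a convention

# One base per position, three positions: every ordered choice is one codon.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))   # number of triplets, i.e. 4 * 4 * 4
print(codons[:4])    # the first few codons in TCAG order
```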
How do 64 codons encode only 20 amino acids?
The genetic code is redundant: most amino acids are encoded by more than one codon. This is usually referred to as degeneracy.
A change in the third position generally does not change the amino acid.
But this is not true in all cases (e.g. 1).
Sometimes even a change in the first position does not affect the amino acid.
As a result, protein sequences appear to be more conserved than DNA sequences.
More information is found in a protein sequence alignment than in a DNA sequence alignment.
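To make degeneracy concrete, the sketch below counts how many codons encode each amino acid and how often a single-base change at each codon position leaves the amino acid unchanged. It assumes the standard genetic code (NCBI translation table 1) written as a 64-letter string in TCAG order; the function name `synonymous_fraction` is made up for this illustration.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons encode each amino acid (or stop)?
codons_per_residue = Counter(CODON_TABLE.values())


def synonymous_fraction(position):
    """Fraction of single-base changes at `position` (0, 1 or 2)
    that leave the encoded amino acid (or stop signal) unchanged."""
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        for base in BASES:
            if base == codon[position]:
                continue  # not a change
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == aa
    return same / total


print(codons_per_residue.most_common())
for pos in range(3):
    print("position", pos + 1, round(synonymous_fraction(pos), 3))
```

Leucine, serine and arginine each have six codons, while methionine and tryptophan have only one. Third-position changes are silent far more often than first-position changes, and second-position changes are almost never silent.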
Memorizing the genetic code:
• Most proteins begin with the codon ATG (methionine)
• Translation ends at one of the codons known as stop codons: TAA, TAG, TGA
• Because of degeneracy, organisms prefer some synonymous codons over others
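The start and stop rules in the list above can be sketched as a small translation routine. This is an illustration only: it assumes the standard genetic code, an input containing only A, C, G and T, and it reads from the first ATG it finds. The function name `translate_orf` and the test sequence are invented for the example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate_orf(dna):
    """Translate from the first ATG up to (not including) the first in-frame stop codon."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            break
        protein.append(CODON_TABLE[codon])
    return "".join(protein)


# Hypothetical sequence: two flanking bases, ATG GCC AAA TGA, two trailing bases.
print(translate_orf("ccATGGCCAAATGAtt"))  # MAK
```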
Amino acids Codon
Amino acid properties:
This arrangement, by polarity and charge, is one such grouping of the amino acids.
Classifying amino acids by polarity is important because their polarity affects which non-covalent interactions they can form.
And these interactions are largely what gives proteins their shape.
Protein structure
There are three main kinds of non-covalent interactions.
The weakest ones are van der Waals interactions, such as this one between an aliphatic isoleucine side chain and an aliphatic leucine side chain.
As illustrated in the energy diagram on the right, van der Waals interactions are weak and act only over short distances, although they are present between any pair of atoms in close proximity.
The distance at which the energy is minimal corresponds to the sum of the van der Waals radii of the two atoms, illustrated here by the transparent spheres.
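The shape of such an energy diagram can be sketched with the 12-6 Lennard-Jones potential. The slides do not name a model, so this is an assumption: Lennard-Jones is a common textbook approximation, and the well depth `epsilon` and minimum-energy distance `r_min` below are arbitrary illustrative values, not measured ones.

```python
def lennard_jones(r, epsilon=1.0, r_min=4.0):
    """12-6 Lennard-Jones energy at separation r.

    epsilon: depth of the energy well (arbitrary units, illustrative)
    r_min:   separation at which the energy is minimal (illustrative)
    """
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2 * ratio ** 6)


# Steep repulsion when too close, a shallow minimum at r_min,
# and an attraction that fades quickly with distance.
for r in (3.5, 4.0, 5.0, 8.0):
    print(r, round(lennard_jones(r), 3))
```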
The strongest non-covalent interactions are salt bridges between pairs of oppositely charged groups.
Here, a lysine side chain is paired with the C-terminal carboxylate of the protein.
Depending on the polarity of the environment, a salt bridge can provide more than 10 times the binding energy of a van der Waals interaction.
Finally, hydrogen bonds are two to five times stronger than van der Waals interactions, but they only occur between polar groups with permanent dipoles.
One of these polar groups acts as the hydrogen bond donor, and the other one as the hydrogen bond acceptor.
Here, you can see a hydrogen bond between backbone groups within an alpha helix.
Hydrogen bonds are unique because they are directional. They are strongest when the two dipoles are aligned.
In contrast, both van der Waals interactions and salt bridges are non-directional.
Let's survey a few of the amino acids.
First, the acidic amino acids: aspartate and glutamate.
They both have a carboxylic acid group at the end of their side chain.
Or rather a carboxylate group, because at physiological pH they are ionized and negatively charged.
Now glutamate is longer than aspartate by one methylene group.
And you might think that that's not very much. That's not a big difference.
But it actually makes a big difference, especially in the types of conformations or rotamers that each of these side chains can achieve.
Glutamate has many conformations that it can achieve, many more than the aspartate side chain.
And that means that it might be better able to position itself in exactly the right place to interact with a substrate or a ligand in the active site of an enzyme.
So although glutamate can optimally position itself, that can come at an entropic cost.
And that's because the conformational flexibility of its many rotamers is lost once it reaches its bound conformation, reducing its entropy.
Histidine is another interesting amino acid that's often found in the active sites of enzymes.
Now histidine has a pKa for the imidazole group of its side chain of about six, which means
that it can either be uncharged or charged at physiological pH, depending on its environment.
So on the left is a deprotonated or uncharged form of the histidine, whereas on the right is a
protonated and positively charged form of the histidine.
Now in the neutral state, the proton can actually be on either nitrogen atom of the imidazole
group.
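How much of the histidine side chain is charged at a given pH can be estimated with the Henderson-Hasselbalch relation. This is a rough sketch: the pKa of 6 is the approximate value quoted above, pH 7.4 is assumed here as a typical physiological value, and in a real protein the local environment can shift the pKa considerably. The function name `fraction_protonated` is illustrative.

```python
def fraction_protonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of the group that carries the proton."""
    return 1.0 / (1.0 + 10 ** (pH - pKa))


# Imidazole pKa of about 6 (from the text); the pH values are illustrative.
for pH in (5.0, 6.0, 7.4):
    print(pH, round(fraction_protonated(pH, pKa=6.0), 3))
```

At a pH equal to the pKa, half of the side chains are protonated. Because the pKa sits so close to physiological pH, a modest shift in the local environment is enough to switch histidine between its charged and uncharged forms.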
Histidine
These two neutral states have different hydrogen bonding
properties as suggested by the red and blue arrows here.
The transitions between the different states can be used to
shuttle protons in active sites.
So from histidine we saw that the pKas of the protein groups reflect their chemical
properties.
Several amino acids have polar groups that have pKas spanning a wide range of values,
as shown here.
As food for thought, consider why tyrosine would be so much more acidic than
threonine and serine, even though they're all alcohols. (Find the answer.)
So most amino acids are formed of carbon, nitrogen, oxygen, and hydrogen.
But two of them have a sulfur atom.
The first one is methionine, essentially a hydrophobic
residue.
It's very similar in size and shape to leucine,
shown here.
The second one is cysteine, which has a sulfhydryl
group.
And this group is interesting, because it is actually
quite reactive under physiological conditions.
One of the reactions that it can undertake is
to be oxidized to form a disulfide bond.
So cysteines can react to form these disulfide
bonds under oxidizing conditions.
And those are conditions that are often found
on the extracellular side.
Whereas inside cells, conditions tend to be
more on the reducing side, which means that
the cysteines will be found in the reduced free
form.
So I told you that the amino acids used by the ribosome, the natural amino acids, are L-amino acids.
Well, there are actually two exceptions to this.
Glycine has a hydrogen as its side chain, which means
that it has two hydrogens coming off of its alpha
carbon and is therefore not chiral.
But also, the small side chain means that it has fewer
conformational restrictions.
And that's going to be important in the process of
protein folding.
The second exception is proline.
Proline is a cyclic amino acid.
And that's because its side chain is actually covalently linked
to its amino group.
Now this linkage, this covalent linkage, means that proline is
actually more conformationally restricted than most amino acids.
And again, this interesting conformational property
comes into play when we think about protein folding.
Let's revisit how we can classify the amino
acids.
This Venn diagram shows some of the many ways to
classify amino acids.
For example, if we look at lysine, it is charged at physiological pH because its side chain
amino group carries a positive charge.
It can also readily form hydrogen bonds, and therefore it's also polar.
Why is lysine also classified as nonpolar?
Look at the structure again, and I'll give you a hint.
Now aside from the charged amino group at the end of its side chain, the rest of the side
chain is aliphatic or nonpolar.
So that means that lysine can sometimes act as nonpolar in certain situations.
So the sequence of amino acids that make up a protein is called its primary structure.
BLASTP & Scoring Matrix
Class Lab Protocol
Task 1. Retrieving protein sequence
Reference protein record: the DB source has a hyperlink to the DNA record encoding this protein.
The CDS section of an NCBI mRNA record: this contains the translation, i.e. the protein encoded by this mRNA.
THE RESULTS OF BLASTP
The program detected that the protein belongs to a larger family or “superfamily”: IGF (insulin-like growth factor), which includes insulin and many related sequences.
An E-value of 1e-52 is very small, and nobody would argue that this hit occurred by chance.
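To see why a value like 1e-52 rules out chance, assume it is the BLAST E-value, the number of hits with at least this score expected by chance. It can be approximated from the bit score S' as E = m x n x 2^(-S'), where m is the query length and n the total database length. The sketch below is a simplification: BLAST itself uses edge-corrected effective lengths, so reported values differ, and the numbers plugged in are hypothetical rather than taken from the lab results.

```python
def expect_value(bit_score, query_length, database_length):
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical search: a 150-residue query against 1e8 database residues.
for bits in (30, 50, 100, 200):
    print(bits, expect_value(bits, query_length=150, database_length=1e8))
```

Each extra bit of score halves the expected number of chance hits, which is why strong alignments reach E-values that are vanishingly small.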
Distant homologue
PAIRWISE BLAST
USING ExPASy Website