Homology assessment and molecular sequence alignment.
Chris Stewart and Ka Yi Ling
Genetics 677
Big picture
[Diagram linking Evolution (Divergent, Convergent), Homology (Orthologs, Paralogs), Analogy, and Systematics (Classical Phylogenetics, Molecular Phylogenetics)]
Homology
1. Equal in position and details in structure
2. Equal in developmental origin (i.e. cellular/tissue structure)
3. Logical and continual series of character state transformations
Figure from http://images-eu.amazon.com/images/P/0895262002.01.LZZZZZZZ.jpg
Homology
[Figure: tree relating A, B and C, with Speciation and Duplication events labeled]
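The speciation/duplication distinction on this slide can be sketched in code. This is a minimal illustration, assuming a gene tree whose internal nodes are already labeled with the event that produced them; the gene names and the tuple layout are hypothetical, not taken from the slides. Two genes are called orthologs when their most recent common ancestor is a speciation node, and paralogs when it is a duplication node.

```python
# Gene tree as nested tuples: (event, left, right); a leaf is a gene name.
# The gene names below are hypothetical placeholders.
GENE_TREE = (
    "duplication",
    ("speciation", "human_alpha", "mouse_alpha"),
    ("speciation", "human_beta", "mouse_beta"),
)


def leaves(node):
    """Return the set of gene names below a node."""
    if isinstance(node, str):
        return {node}
    _, left, right = node
    return leaves(left) | leaves(right)


def relationship(tree, gene1, gene2):
    """Classify two distinct genes by the event at their last common ancestor."""
    event, left, right = tree
    for child in (left, right):
        # Descend while a single subtree still contains both genes.
        if not isinstance(child, str) and {gene1, gene2} <= leaves(child):
            return relationship(child, gene1, gene2)
    return "orthologs" if event == "speciation" else "paralogs"


print(relationship(GENE_TREE, "human_alpha", "mouse_alpha"))  # orthologs
print(relationship(GENE_TREE, "human_alpha", "human_beta"))   # paralogs
```

In this toy tree, human_alpha and mouse_beta also come out as paralogs, because the duplication predates the speciation that separated the two species.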
Character
Trait from a group of organisms that has two or more independent states that can be evaluated
http://www.choose-life.org/Map_states_color.jpg
Parsimony
Working principle that prefers the least complex explanation for an observation
Figure from http://www.cartoonchurch.com/cartoons/large/simple-living-cartoon.gif
Classical phylogenetics
Method of parsimony analysis used to develop cladograms explaining evolutionary relationships.
Fig 1. Hypothetical cladogram
Constructing cladograms
[Figure: cladogram with taxa A to G]
Matrix
            Segmented  Jawed  Hair  Placenta  Multi-cell  Limbs
Cat             1        1     1       1          1         1
Kangaroo        1        1     1       0          1         1
Lizard          1        1     0       0          1         1
Salmon          1        1     0       0          1         0
Earthworm       1        0     0       0          1         0
Sponge          0        0     0       0          1         0
Amoeba          0        0     0       0          0         0
Possible cladograms
…which one do you pick?
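One way to choose between candidate cladograms is to count the character-state changes each one requires and prefer the shortest. The sketch below uses Fitch's small-parsimony counting on the seven-taxon matrix; the pairing of taxa with rows is an assumption based on the biology of each animal, and both candidate trees are made up for illustration.

```python
# Character matrix: 1 = present, 0 = absent.
# Columns: Segmented, Jawed, Hair, Placenta, Multi-cell, Limbs.
# Assumption: rows are paired with taxa by biology, not by slide order.
MATRIX = {
    "Cat":       (1, 1, 1, 1, 1, 1),
    "Kangaroo":  (1, 1, 1, 0, 1, 1),
    "Lizard":    (1, 1, 0, 0, 1, 1),
    "Salmon":    (1, 1, 0, 0, 1, 0),
    "Earthworm": (1, 0, 0, 0, 1, 0),
    "Sponge":    (0, 0, 0, 0, 1, 0),
    "Amoeba":    (0, 0, 0, 0, 0, 0),
}


def fitch(tree, column):
    """Return (possible ancestral states, changes) for one character.

    A tree is a taxon name or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {MATRIX[tree][column]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, column)
    right_states, right_changes = fitch(right, column)
    shared = left_states & right_states
    if shared:
        return shared, left_changes + right_changes
    # No shared state: one change is needed on this branch pair.
    return left_states | right_states, left_changes + right_changes + 1


def tree_length(tree):
    """Total number of changes over all characters (lower is more parsimonious)."""
    n_characters = len(next(iter(MATRIX.values())))
    return sum(fitch(tree, col)[1] for col in range(n_characters))


# Two hypothetical candidate cladograms.
NESTED = ("Amoeba", ("Sponge", ("Earthworm", ("Salmon", ("Lizard", ("Kangaroo", "Cat"))))))
SHUFFLED = ((("Cat", "Sponge"), ("Kangaroo", "Amoeba")), (("Lizard", "Earthworm"), "Salmon"))

print(tree_length(NESTED))    # 6: each character changes exactly once
print(tree_length(SHUFFLED))  # longer, so parsimony rejects it
```

Because every one of the six characters varies among the taxa, no tree can need fewer than six changes, so the nested tree is a most parsimonious answer for this matrix.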
Homoplasy & subjective characters
A faulty assignment of primary homology
Figure from: http://www.blackwellpublishing.com/ridley/images/analogies.jpg
Things to consider
• Auxiliary Principle
• Congruence Test
More things to consider
• Weighting
– Needs to respond to homoplasy
• Independent characters
http://ksuoncampus.com/2008/01/29/evolution-of-mario/
Molecular Phylogenetics
• Goal
– To infer process from pattern
• Why
– Not just the observables
– Alternative method to derive evolutionary relationships
Sequence alignment programs
Figure modified from http://bioinfo.ochoa.fib.es/docus/courses/Ali2005Filogenias/seq_analysis/images/SeqAnalFloChart.gif
Protein or Gene of interest
Molecular characters
• What can be used?
– Nucleotide sequences
– Protein sequences
– DNA
– RNA
– Protein
• NO single universally accepted recipe
Figure from http://www.ittc.ku.edu/bioinfo_seminar/images/wheel.gif
List of alignment software
Sequence Alignment
• Types
– Pairwise alignment
– Multiple sequence alignment
Figure from http://en.wikipedia.org/wiki/Sequence_alignment
Human Molecular Genetics, 2006, Vol. 15, Review Issue 1, R54
The nuts and bolts
1. Gene/protein of interest
2. Homolog search
3. Sequence alignment
4. Tree building
Figure from http://www.usingneuralnetworks.com/images/Face_But_Not_The_Name_Cartoon.jpg
Database Search: BLAST
• Basic Local Alignment Search Tool
• Input: protein or nucleotide sequences
• Default scoring matrix: BLOSUM62
• Other scoring matrices: PAM family, BLOSUM family
• Sites that use BLAST: NCBI, EBI, GenomeNet, PIR, DDBJ
Figure: NCBI alignment result site.
How does BLAST score an alignment?
Default matrix in BLAST 2.0
BLOSUM = BLOcks SUbstitution Matrix
Based on local alignments
BLOSUM62: contributions from proteins more than 62% identical are weighted to sum to one.
Scores: integer log-odds values, positive for substitutions seen more often than expected by chance and negative for those seen less often
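As a rough illustration of how a substitution matrix turns an alignment into a number, the sketch below sums per-column scores for an ungapped alignment. Only a handful of BLOSUM62 entries are included, and the function names are made up for this example. A real search uses the full 20 x 20 matrix plus gap penalties, which are left out here.

```python
# A small subset of BLOSUM62 scores; the full matrix has 20 x 20 entries.
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("I", "I"): 4, ("L", "L"): 4,
    ("K", "K"): 5, ("R", "R"): 5, ("W", "W"): 11,
    ("I", "L"): 2, ("K", "R"): 2, ("A", "W"): -3,
}


def pair_score(x, y):
    """Look up a symmetric substitution score (KeyError if not in the subset)."""
    if (x, y) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(x, y)]
    return BLOSUM62_SUBSET[(y, x)]


def score_ungapped(seq1, seq2):
    """Sum the substitution scores column by column; no gaps allowed."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped scoring needs sequences of equal length")
    return sum(pair_score(x, y) for x, y in zip(seq1, seq2))


print(score_ungapped("WIKA", "WLRA"))  # 11 + 2 + 2 + 4 = 19
print(score_ungapped("AW", "WA"))      # -3 + -3 = -6
```

Conservative replacements such as I/L and K/R still add to the total, while an unlikely pair such as A/W subtracts from it.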
How does BLAST score an alignment?
BLAST
The nuts and bolts
1. Gene/protein of interest
2. Homolog search
3. Sequence alignment
4. Tree building
Figure from http://www.usingneuralnetworks.com/images/Face_But_Not_The_Name_Cartoon.jpg
Homologene
Aligning Genes
Homologene scoring
The nuts and bolts
1. Gene/protein of interest
2. Homolog search
3. Sequence alignment
4. Tree building
Figure from http://www.usingneuralnetworks.com/images/Face_But_Not_The_Name_Cartoon.jpg
Pairwise alignment
• Algorithm: Needleman-Wunsch dynamic programming
• Global alignment
• DNA, protein
• Find positional primary homology
• Sites that use N-W: EBI server
Figure from http://ww2.cs.fsu.edu/~hui/research/scanalyze_tutorial/pics/registered_group.jpg
Needleman-Wunsch algorithm
Figure from Journal of Medical Physics 39 (2006) pg 29
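A minimal sketch of Needleman-Wunsch global alignment follows. It assumes a simple scoring scheme (match +1, mismatch -1, linear gap penalty -1) in place of a substitution matrix, and the function name is chosen for this example. The table is filled row by row, then an optimal alignment is recovered by tracing back from the bottom-right cell.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Traceback from the bottom-right corner, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # (2, 'ACGT', 'A-GT')
```

Because the alignment is global, every residue of both sequences appears in the output, with gaps inserted where needed. Each aligned column is a hypothesis of positional homology.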
Needleman-Wunsch algorithm
BLAST align
Multiple sequence alignment
• Example: ClustalW
• Progressive alignment
• Nucleotide and protein sequences
• Local or Global alignment
• Sites that use MSA: EBI, DDBJ, PBIL, EMBNet, GenomeNet
Figure from www.cs.umbc.edu
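Progressive alignment starts by deciding the order in which sequences are joined: the most similar sequences first, more distant ones later. The sketch below shows only that ordering step and is a deliberate simplification. It assumes equal-length toy sequences, a plain proportion-of-differences distance, and average-linkage clustering; ClustalW itself derives distances from pairwise alignments and builds its guide tree by neighbor joining.

```python
from itertools import combinations


def p_distance(s, t):
    """Fraction of positions that differ between two equal-length sequences."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(s, t)) / len(s)


def guide_tree(seqs):
    """Join the closest clusters first; return the join order as nested tuples."""
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in seqs]

    def dist(c1, c2):
        # Average distance over all pairs of members (average linkage).
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(p_distance(seqs[a], seqs[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical toy sequences.
TOY = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACCTTCGA", "s4": "TCCTTCGG"}
print(guide_tree(TOY))  # (('s1', 's2'), ('s3', 's4'))
```

A progressive aligner would then align s1 with s2, align s3 with s4, and finally align those two sub-alignments with each other, following the tree from the tips inward.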
ClustalW @ EBI
The nuts and bolts
1. Gene/protein of interest
2. Homolog search
3. Sequence alignment
4. Tree building
Figure from http://www.usingneuralnetworks.com/images/Face_But_Not_The_Name_Cartoon.jpg
SRD5A2 @ TreeFAM
Discussion
• Does sequence orthology relate to functional equivalence?
• Can paralogs be functionally related?
• Do unsequenced genomic regions affect the understanding of orthology and paralogy?
Figures from: http://www.ndpgenderequality.ie/images/cartoons/cartoon_large_intro.gif and http://www.faithmouse.com/cartoon567.jpg
Pros and cons
Pros Cons
• Best guess
• Parsimony
• Algorithms
• Speed vs accuracy
• Evolution vs religion
• Evolutionary history
• Find animal models
• Relation between structure and function
• Biological processes
Figure from: http://www.pbrainprojects.com/images/angel_devil.jpg
Assumptions, assumptions, assumptions
• If Xs is true then the tree is true…
Figure from: http://www.gdargaud.net/Humor/Pics/string_theory.png