Bioinformatics. Predrag Radivojac, Indiana University. Basics of Molecular Biology. Can we understand how cells function? Eukaryotic cell. Bioinformatics is multidisciplinary! What is Bioinformatics? Integrates: computer science, statistics, chemistry, physics, and molecular biology
Bioinformatics is multidisciplinary!
• What is Bioinformatics?
– Integrates: computer science, statistics, chemistry, physics, and molecular biology
– Goal: organize and store huge amounts of biological data and extract knowledge from it
• Major areas of research
– Genomics
– Proteomics
– Databases
• Practical discipline
Some major applications
· Drug design · Evolutionary studies · Genome characterization
Interesting Problems
• Sequence assembly
Goal: solve the puzzle, i.e. connect the pieces into one genomic sequence
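The "puzzle" can be illustrated with a toy greedy merge: repeatedly join the two fragments with the longest suffix/prefix overlap. This is only a minimal sketch under strong assumptions (error-free reads, a single strand, no repeats); the function names `overlap` and `greedy_assemble` and the example fragments are hypothetical, and the slides do not say which assembly method is meant.

```python
def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments):
    """Greedily merge fragments by largest overlap (toy heuristic, not a real assembler)."""
    # Remove exact duplicates, then fragments fully contained in another fragment.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlapping pair left; return the remaining contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three made-up overlapping fragments reassembled into one sequence.
print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"]))  # ['ATGGCGTACGTTA']
```

Greedy merging can produce wrong answers when the genome contains repeats, which is one reason the problem is listed as interesting.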
Interesting Problems
• Proteomics
[Figure: tandem mass spectrum. Header: S#: 1708, RT: 54.47, AV: 1, NL: 5.27E6, T: + c d Full ms2 638.00 [165.00 - 1925.00]. x-axis: m/z (200 to 2000); y-axis: relative abundance (0 to 100). Labeled peaks: 226.9, 326.0, 397.1, 425.0, 489.1, 524.9, 588.1, 589.2, 629.0, 687.3, 850.3, 851.4, 949.4, 1048.6, 1049.6.]
Mass spectrometry
Disease
www.cancer.gov
• Development of tools that can be used to understand and treat human disease
• Prediction of disease-associated genes
• Important from
– biological standpoint
– medical standpoint
– computational standpoint
• Background
– human genome
– low-throughput data
– high-throughput data
– ontologies for protein function at multiple levels
The Time is Right!
Loss/Gain of function and disease
Pauling et al. Science 110: 543 (1949). Chui & Dover. Curr Opin Pediatr, 13: 22 (2001).
Sickle Cell Disease:
• Autosomal recessive disorder
• E6V in HBB causes interaction with F85 and L88
• Formation of amyloid fibrils
• Abnormally shaped red blood cells, leading to sickle cell anemia
• Manifestation of the disease varies widely across patients
[Figure: hemoglobin structures, PDB entries 2hbs (with the E6V substitution marked) and 4hhb. Source: http://gingi.uchicago.edu/hbs2.html]
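The notation E6V reads "glutamate (E) at position 6 replaced by valine (V)". The sketch below applies such a substitution to a sequence string. The helper name `apply_substitution` is hypothetical, and the ten-residue string is assumed to be the N-terminus of the mature beta-globin chain (numbering without the initiator methionine); it is used here only for illustration.

```python
import re


def apply_substitution(sequence, mutation):
    """Apply a single amino acid substitution such as 'E6V' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"not a substitution in <ref><pos><alt> form: {mutation!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {sequence[pos - 1]}")
    return sequence[:pos - 1] + alt + sequence[pos:]


# Assumed first ten residues of mature beta-globin (illustrative only).
hbb_start = "VHLTPEEKSA"
print(apply_substitution(hbb_start, "E6V"))  # VHLTPVEKSA
```

Checking the reference residue before substituting catches off-by-one numbering mistakes, for example when a sequence still includes the initiator methionine.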
Proteins = chains of amino acids
• biomolecule, macromolecule
– more than 50% of the dry weight of cells is protein
• polymer of amino acids connected into linear chains
• strings of symbols
• machinery of life
– play a central role in the structure and function of cells
– regulate and execute many biological functions
a) amino acid b) amino acid chain
Introduction to Protein Structure by Branden and Tooze
• peptide bonds are planar and strong
• by rotating about the backbone bonds at each amino acid, the chain adopts its three-dimensional structure
Protein structure
Introduction to Protein Structure by Branden and Tooze
Protein function
• Multi-level phenomenon
– biochemical function
– biological function
– phenotypical function
• Example: kinase
– biochemical function: transferase
– biological function: cell cycle regulation
– phenotypical function: disease
• Function is everything that happens to or through a protein (Rost et al. 2003)
S113 of isocitrate dehydrogenase
Notation:
G = (V, E)
f: V → A, where A = {A, C, D, …, W, Y}
g: V → {−1, +1}
Residue neighborhood
Graphlets are small non-isomorphic connected graphs.
The different positions of the pivot vertex within a graphlet correspond to the graph-theoretical concept of automorphism orbits, or simply orbits.
Przulj et al. Bioinformatics 20: 3508 (2004).
2-graphlets: 01
3-graphlets: 011, 012
4-graphlets: 0111, 0112, 0122, 0123
Key insight: efficient combinatorial enumeration of graphlets / orbits over 7 disjoint cases, using breadth-first search
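One possible reading of the seven strings (01, 011, 012, 0111, 0112, 0122, 0123) is that each lists the sorted breadth-first distances from the pivot to the vertices of a graphlet. Under that assumption, the sketch below assigns every connected induced subgraph of up to four vertices containing the pivot to one of the seven cases. It is a brute-force illustration, not the efficient enumeration the slide refers to, and the names `bfs_levels` and `level_profiles` are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


def bfs_levels(adj, nodes, root):
    """BFS distances from `root` inside the subgraph induced by `nodes`.

    Returns None when the induced subgraph is not connected.
    """
    nodes = set(nodes)
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in nodes and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist if len(dist) == len(nodes) else None


def level_profiles(adj, pivot, max_size=4):
    """Count pivot-containing connected induced subgraphs by BFS distance profile."""
    others = [v for v in adj if v != pivot]
    counts = Counter()
    for k in range(1, max_size):  # number of vertices besides the pivot
        for rest in combinations(others, k):
            dist = bfs_levels(adj, (pivot,) + rest, pivot)
            if dist is not None:
                counts["".join(str(d) for d in sorted(dist.values()))] += 1
    return counts


# Path graph 0 - 1 - 2 - 3 with vertex 1 as the pivot.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(dict(level_profiles(path, 1)))
```

Because it tries every vertex subset, this version is only practical for small neighborhoods; the distance profile also does not distinguish orbits that share a profile, such as the middle of a three-vertex path and a triangle (both 011).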
[Figure: count vector indexed by labeled orbits: A, C, D, E, F, G, H, I, K, …, Y, AA, AC, AD, …]
01: |A|
o2: |A|^2
o5, o6, o11: |A|^3
o3, o4: ?
A = {0, 1}: 00, 01 = 10, 11 (3)
A = {0, 1, 2}: 00, 11, 22, 01 = 10, 02 = 20, 12 = 21 (6)
binomial (multinomial) coefficients
|A| = 20, dimensionality = 1,062,420
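The small examples above count unordered label pairs: k interchangeable vertices labeled from an alphabet of size n can be labeled in C(n + k - 1, k) distinct ways, which gives 3 for n = 2 and 6 for n = 3. The sketch below applies this to every rooted graphlet of up to four vertices. The list of orbits and their groups of interchangeable vertices is my own enumeration, not taken from the slides, and it includes the single-vertex graphlet; under those assumptions the total for |A| = 20 matches the stated dimensionality.

```python
from math import comb


def multisets(n, k):
    """Number of ways to label k interchangeable vertices from n symbols."""
    return comb(n + k - 1, k)


# (description, sizes of the groups of mutually interchangeable vertices).
# Assumed enumeration: the root is always its own group of size 1.
ORBITS = [
    ("single vertex", (1,)),
    ("edge", (1, 1)),
    ("3-path, end", (1, 1, 1)),
    ("3-path, middle", (1, 2)),
    ("triangle", (1, 2)),
    ("4-path, end", (1, 1, 1, 1)),
    ("4-path, middle", (1, 1, 1, 1)),
    ("star, leaf", (1, 1, 2)),
    ("star, center", (1, 3)),
    ("4-cycle", (1, 2, 1)),
    ("paw, tail end", (1, 1, 2)),
    ("paw, triangle vertex away from tail", (1, 1, 1, 1)),
    ("paw, triangle vertex attached to tail", (1, 1, 2)),
    ("diamond, degree-2 vertex", (1, 2, 1)),
    ("diamond, degree-3 vertex", (1, 1, 2)),
    ("4-clique", (1, 3)),
]


def labeled_orbit_dimensionality(n):
    """Total number of distinct labeled orbits for an alphabet of size n."""
    total = 0
    for _, groups in ORBITS:
        size = 1
        for g in groups:
            size *= multisets(n, g)
        total += size
    return total


print(multisets(2, 2), multisets(3, 2))   # 3 6
print(labeled_orbit_dimensionality(20))   # 1062420
```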
Graphlet kernel
Inner product between vectors of counts of labeled orbits:
K(x, y) = ⟨φ(x), φ(y)⟩ = Σ_i φ_i(x) φ_i(y)
where φ_i(x) is the number of times labeled orbit i occurs in the graph.
K is a kernel because matrices of inner products are symmetric and positive semidefinite (proof due to David Haussler).
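Since most of the roughly one million labeled orbits never occur around a given residue, the count vectors are sparse, and the inner product can be computed over the non-zero entries only. A minimal sketch, assuming each count vector is stored as a mapping from a labeled-orbit key to its count; the key format and the counts below are made up for illustration.

```python
from collections import Counter


def graphlet_kernel(counts_x, counts_y):
    """Inner product of two sparse count vectors stored as mappings."""
    # Iterate over the smaller mapping; keys missing from the other contribute 0.
    if len(counts_x) > len(counts_y):
        counts_x, counts_y = counts_y, counts_x
    return sum(c * counts_y.get(key, 0) for key, c in counts_x.items())


# Hypothetical counts, keyed by (orbit, labels of the vertices involved).
x = Counter({("01", "SA"): 2, ("01", "SG"): 1, ("011", "SAG"): 1})
y = Counter({("01", "SA"): 3, ("011", "SAG"): 2, ("012", "SAK"): 5})

print(graphlet_kernel(x, y))  # 2*3 + 1*2 = 8
print(graphlet_kernel(x, x))  # 4 + 1 + 1 = 6
```

The value is symmetric in its two arguments, as expected for an inner product.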