View
1
Download
0
Category
Preview:
Citation preview
Nearest-neighbor model in microarray hybridization
Enrico Carlon, KULeuven
Plon – May 9, 2011
Enrico Carlon, KULeuven () Nearest-neighbor model in microarray hybridization Plon – May 9, 2011 1 / 28
Outline
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 2 / 28
Outline
1 Hybridization properties from controlled experiments (Agilent)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 2 / 28
Outline
1 Hybridization properties from controlled experiments (Agilent)1 Breakdown of thermal equilibrium
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 2 / 28
Outline
1 Hybridization properties from controlled experiments (Agilent)1 Breakdown of thermal equilibrium2 Modeling non-equilibrium effects
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 2 / 28
Outline
1 Hybridization properties from controlled experiments (Agilent)1 Breakdown of thermal equilibrium2 Modeling non-equilibrium effects3 Exploring other arrays
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 2 / 28
Outline
1 Hybridization properties from controlled experiments (Agilent)1 Breakdown of thermal equilibrium2 Modeling non-equilibrium effects3 Exploring other arrays
2 Algorithms for biological data analysis (Affymetrix)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 2 / 28
Outline
1 Hybridization properties from controlled experiments (Agilent)1 Breakdown of thermal equilibrium2 Modeling non-equilibrium effects3 Exploring other arrays
2 Algorithms for biological data analysis (Affymetrix)1 AffILM (available at www.bioconductor.org)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 2 / 28
Nearest neighbor parameters ∆G37◦C (DNA/DNA)
Table of nearest neighbor parameters for the hybridization free energies ∆G37◦C
in 1M NaCl expressed in kcal/mol. The orientation is 5′-3′ for the upper strand
and 3′-5′ for the lower strand. Only 10 of the 16 parameters are independent.
AATT -1.00 AT
TA -0.88 ACTG -1.44 AG
TC -1.28
TAAT -0.58 TT
AA -1.00 TCAG -1.30 TG
AC -1.45
CAGT -1.45 CT
GA -1.28 CCGG -1.84 CG
GC -2.17
GACT -1.30 GT
CA -1.44 GCCG -2.24 GG
CC -1.84
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 3 / 28
Example of a calculation
G C A C −3’
C G T G −5’A
G
C
= 11.4 kcal/mole
T
0.881.451.451.84
2.241.30
A5’− T
3’− T
0.88
A
. . . plus boundary terms (effect of the double helix ends)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 4 / 28
Mismatches
Base pairings can occurr also for non-complementary bases.
Here: possible structure of a mismatchbetween G (left) and A (right)G·A forms two hydrogen bonds!
Table of nearest-neighbor parameters for AG mismatches at 37◦C in 1MNaCl (unit kcal/mol)
AATG 0.14 AG
TA 0.02 CAGG 0.03 CG
GA 0.11
GACG -0.25 GG
CA -0.52 TAAG 0.42 TG
AA 0.74
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 5 / 28
Hybridization in a biological experiment
Complex system with a large set of chemical reactions
Aim
Understand/characterize hybridization in microarrays
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 6 / 28
Hybridization in a biological experiment
Complex system with a large set of chemical reactions
Aim
Understand/characterize hybridization in microarraysEnrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 6 / 28
Hybridization in a biological experiment
Complex system with a large set of chemical reactions
Aim
Understand/characterize hybridization in microarrays
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 6 / 28
Hybridization in a biological experiment
Complex system with a large set of chemical reactions
Aim
Understand/characterize hybridization in microarrays
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 6 / 28
Hybridization in a biological experiment
Complex system with a large set of chemical reactions
Aim
Understand/characterize hybridization in microarraysEnrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 6 / 28
Hybridization in a biological experiment
Complex system with a large set of chemical reactions
Aim
Understand/characterize hybridization in microarraysEnrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 6 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAATCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAATCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAAGATTCTAACCCACCGTGACAACATT-5’ 1 mismatch
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAATCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAAGATTCTAACCCACCGTGACAACATT-5’ 1 mismatch
...
3’-CAAAAGCTTCTAACCCACCGTGACGACATT-5’ 1 mismatch
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAATCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAAGATTCTAACCCACCGTGACAACATT-5’ 1 mismatch
...
3’-CAAAAGCTTCTAACCCACCGTGACGACATT-5’ 1 mismatch
3’-CAAAAGCTTCTAACCCACCGTGACTACATT-5’ 1 mismatch
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAATCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAAGATTCTAACCCACCGTGACAACATT-5’ 1 mismatch
...
3’-CAAAAGCTTCTAACCCACCGTGACGACATT-5’ 1 mismatch
3’-CAAAAGCTTCTAACCCACCGTGACTACATT-5’ 1 mismatch
3’-CAAAAACTTCTCACCCACCGTGACAACATT-5’ 2 mismatches
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAATCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAAGATTCTAACCCACCGTGACAACATT-5’ 1 mismatch
...
3’-CAAAAGCTTCTAACCCACCGTGACGACATT-5’ 1 mismatch
3’-CAAAAGCTTCTAACCCACCGTGACTACATT-5’ 1 mismatch
3’-CAAAAACTTCTCACCCACCGTGACAACATT-5’ 2 mismatches
3’-CAAAAACTTCTGACCCACCGTGACAACATT-5’ 2 mismatches
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAATCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAAGATTCTAACCCACCGTGACAACATT-5’ 1 mismatch
...
3’-CAAAAGCTTCTAACCCACCGTGACGACATT-5’ 1 mismatch
3’-CAAAAGCTTCTAACCCACCGTGACTACATT-5’ 1 mismatch
3’-CAAAAACTTCTCACCCACCGTGACAACATT-5’ 2 mismatches
3’-CAAAAACTTCTGACCCACCGTGACAACATT-5’ 2 mismatches
3’-CAAAAACTTCTTACCCACCGTGACAACATT-5’ 2 mismatches
...
3’-CAAAAGCTTCTAACCCACTGTGACTACATT-5’ 2 mismatches
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Determination of ∆G from Agilent Custom Arrays
Sufficiently simple setup to avoid “spurious” effects
5’-GTTTTCGAAGATTGGGTGGCACTGTTGTAA-3’ target in solution
3’-CAAAAGCTTCTAACCCACCGTGACAACATT-5’ perfect match
3’-CAAAAACTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAACCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAATCTTCTAACCCACCGTGACAACATT-5’ 1 mismatch
3’-CAAAAGATTCTAACCCACCGTGACAACATT-5’ 1 mismatch
...
3’-CAAAAGCTTCTAACCCACCGTGACGACATT-5’ 1 mismatch
3’-CAAAAGCTTCTAACCCACCGTGACTACATT-5’ 1 mismatch
3’-CAAAAACTTCTCACCCACCGTGACAACATT-5’ 2 mismatches
3’-CAAAAACTTCTGACCCACCGTGACAACATT-5’ 2 mismatches
3’-CAAAAACTTCTTACCCACCGTGACAACATT-5’ 2 mismatches
...
3’-CAAAAGCTTCTAACCCACTGTGACTACATT-5’ 2 mismatches
total 1006 sequences, replicated 15 times = 15K array
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 7 / 28
Intensity vs. ∆Gsol
-8 -6 -4 -2 0-∆∆G
sol (kcal/mol)
100
101
102
103
104
105
I
Experiment 1
50 pM
500 pM
5000 pM
PM
1MM
2MM
Expected: Langmuir isotherm
I =Ace∆G/RT
1 + ce∆G/RT
data for 3 differentconcentrations (c)
lines: slope = 1/RT
scatter of the data
deviations from 1/RT already
far from chemical saturation
No fitting parameters!
∆Gsol is computed from tabulated literature data (parameters measured inhybridization in solution)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 8 / 28
Intensity vs. ∆Gsol
-∆∆G (arb. units)
I (a
rb. u
nits
)
slope 1/RT
A
Expected Langmuir behavior
Expected: Langmuir isotherm
I =Ace∆G/RT
1 + ce∆G/RT
data for 3 differentconcentrations (c)
lines: slope = 1/RT
scatter of the data
deviations from 1/RT already
far from chemical saturation
No fitting parameters!
∆Gsol is computed from tabulated literature data (parameters measured inhybridization in solution)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 8 / 28
Fitting free energy parameters from experimental data
-8 -6 -4 -2 0∆∆Gµarray
(kcal/mol)10
-5
10-4
10-3
10-2
10-1
I/I PM
*
Experiment 1
c = 50 pM
1006 data points and58 parameters
Fit to
I = A exp(∆G/RT )
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 9 / 28
Fitting free energy parameters from experimental data
-8 -6 -4 -2 0∆∆Gµarray
(kcal/mol)10
-5
10-4
10-3
10-2
10-1
I/I PM
*
Experiment 1
c = 50 pM
T = 850K
T = 338K = 65oC
1006 data points and58 parameters
Fit to
I = A exp(∆G/RT )Two temperatures!
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 9 / 28
Identical behavior for 4 different sequences
-8 -6 -4 -2 0-∆∆Gµarray
(kcal/mol)10
-5
10-4
10-3
10-2
10-1
I/I PM
*
-8 -6 -4 -2 0
10-4
10-2
Experiment 1
c = 50 pM
Texp
Teff
-8 -6 -4 -2 0-∆∆Gµarray
(kcal/mol)10
-5
10-4
10-3
10-2
10-1
I/I PM
*
-8 -6 -4 -2 0
10-4
10-2
Experiment 2
c = 50 pM
-8 -6 -4 -2 0-∆∆Gµarray
(kcal/mol)10
-5
10-4
10-3
10-2
10-1
I/I PM
*
-8 -6 -4 -2 0
10-4
10-2
Experiment 3
c = 50 pM
-8 -6 -4 -2 0-∆∆Gµarray
(kcal/mol)10
-5
10-4
10-3
10-2
10-1
I/I PM
*
-8 -6 -4 -2 0
10-4
10-2
Experiment 4
c = 50 pM
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 10 / 28
Comparison of free energy in the array and in solution
0 1 2 3 4 5∆∆G
sol
0
1
2
3
4
5
∆∆G
µarr
ay
CC mismatches
All Experiments
∆∆G: difference between perfectmatch hybridization andhybridization with an internalmismatch.Moderate correlation betweensolution and microarray free energies
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 11 / 28
Understanding the two regimes
k2
k−2
k−1
k1
configuration 1 configuration 2
Hybridization in solution:nucleation and fast zippering
Is zippering rapid inmicroarrays?Probably not!
The simplest description of along lived partially zippedstate is through a 3 statemodel
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 12 / 28
Understanding the two regimes
k2
k−2
k−1
k1
configuration 1 configuration 2
Hybridization in solution:nucleation and fast zippering
Is zippering rapid inmicroarrays?Probably not!
The simplest description of along lived partially zippedstate is through a 3 statemodel
Four parameters
Hybridization rates: k1, k2 (forward), k−1, k−2 (reverse)
Thermodynamics k−1/k1 = e−∆G′/RT k−2/k2 = e−(∆G−∆G′)/RT
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 12 / 28
Two vs. three state model
Two state modelkinetics
dθ
dt= ck1(1− θ)− k−1θ
Equilibrium θeq =ck1
k−1 + ck1=
cK
1 + cK=
ce∆G/RT
1 + ce∆G/RT
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 13 / 28
Two vs. three state model
Two state modelkinetics
dθ
dt= ck1(1− θ)− k−1θ
Equilibrium θeq =ck1
k−1 + ck1=
cK
1 + cK=
ce∆G/RT
1 + ce∆G/RT
Threestatemodel
dθ(1)
dt= ck1(1− θ(1) − θ(2)) + k−2θ
(2)− (k−1 + k2)θ
(1)
dθ(2)
dt= k2θ
(1)− k−2θ
(2)
Equilibrium(c → 0) θ(1)eq = c
k1k−1
= ce∆G′/RT θ(2)eq = ck1k2
k−1k−2= ce∆G/RT
Assumption:
k1, k2 independent on ∆G∆G′
≈ γ∆G (they should be monotonically linked)Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 13 / 28
Three state model
10 20 30 40-∆G(kcal/mol)
10-8
10-6
10-4
10-2
100
θ
Langmuir isotherm
intermediateregime
dynamical saturation
(b)chemical saturation
Langmuir isotherm
intermediateregime
dynamical saturation
(b)chemical saturation
c fixed at 5.10-12
, various times
Solid line:Equilibrium solution(t → ∞)
Dashes lines:Finite time behavior
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 14 / 28
Experiments (Agilent arrays)
-7 -6 -5 -4 -3 -2 -1 0101
102
103
104
105
I
a)
-7 -6 -5 -4 -3 -2 -1 0101
102
103
104
105
I
b)
-7 -6 -5 -4 -3 -2 -1 0
∆∆G (kcal/mol)10
1
102
103
104
105
I
c)
-7 -6 -5 -4 -3 -2 -1 0
∆∆G (kcal/mol)10
1
102
103
104
105
I
d)
t = 40min t = 3h24min
t = 17h t = 86h8min
kinetic model
Target sequenceL = 30Exper. to ≈ 86h
Temperature 65◦C
Solid line:three state modelDashed line:
Equilibrium isotherm
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 15 / 28
Experiments (Agilent arrays)
-7 -6 -5 -4 -3 -2 -1 0101
102
103
104
105
I
a)
-7 -6 -5 -4 -3 -2 -1 0101
102
103
104
105
I
b)
-7 -6 -5 -4 -3 -2 -1 0
∆∆G (kcal/mol)10
1
102
103
104
105
I
c)
-7 -6 -5 -4 -3 -2 -1 0
∆∆G (kcal/mol)10
1
102
103
104
105
I
d)
t = 40min t = 3h24min
t = 17h t = 86h8min
Target sequenceL = 25Exper. to ≈ 86h
Temperature 65◦C
Dashed line:Equilibrium isotherm
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 15 / 28
Experiments (Agilent arrays)
14 16 18 20 22 24 26 28 30-∆G
sol (kcal/mol)
100
102
104
106
I
50pM
500pM
5000pMequilibrium
region
non-equilibrium region
Target sequenceL = 30Exper. ≈ 17h
Temperature 55◦C
Dashed line:Three state model
Evidence of dynamicalsaturation
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 15 / 28
Experiments (High density arrays, Suzuki et al.)
15 20 25 30-∆G (kcal/mol)
10
103
105
I
1.4fM14fM140fM1.4pM14pM140pM1.4nM
140 targets andprobes with 1 or 2mismatches
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 16 / 28
Experiments (High density arrays, Suzuki et al.)
15 20 25 30-∆G (kcal/mol)
10
103
105
I
1.4fM14fM140fM1.4pM14pM140pM1.4nM
140 targets andprobes with 1 or 2mismatches
Solid line:Teff ≈ 1200K
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 16 / 28
Experiments (High density arrays, Suzuki et al.)
15 20 25 30-∆G (kcal/mol)
10
103
105
I
1.4fM14fM140fM1.4pM14pM140pM1.4nM
140 targets andprobes with 1 or 2mismatches
Solid line:Teff ≈ 1200K
Dashed line:Texp ≈ 45C
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 16 / 28
Experiments (High density arrays, Suzuki et al.)
15 20 25 30-∆G (kcal/mol)
10
103
105
I
1.4fM14fM140fM1.4pM14pM140pM1.4nM
140 targets andprobes with 1 or 2mismatches
Solid line:Teff ≈ 1200K
Dashed line:Texp ≈ 45C
Dynamical saturation
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 16 / 28
Experiments (High density arrays, Suzuki et al.)
15 20 25 30-∆G (kcal/mol)
10
103
105
I
1.4fM14fM140fM1.4pM14pM140pM1.4nM
140 targets andprobes with 1 or 2mismatches
Solid line:Teff ≈ 1200K
Dashed line:Texp ≈ 45C
Dynamical saturation... or depletion?
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 16 / 28
Experiments (High density arrays, Suzuki et al.)
10 15 20 25 30 35-∆G (kcal/mol)
10
103
105
I
14fM140fM1.4pM
140 targets andprobes with 1 or 2mismatches
Solid line:Teff ≈ 1200K
Dashed line:Texp ≈ 45C
Dynamical saturation... or depletion?Solid line: three statemodel
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 16 / 28
Experiments (Spotted array, Weckx et al.)
25 30 35 40-∆G
sol (kcal/mol)
102
103
104
105
I -
I 0 scannersaturation
slope 1/RTexp
PM1MM2MM3MM
Experiment with 4targets and probes upto three mismatches
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 17 / 28
Experiments (Spotted array, Weckx et al.)
25 30 35 40-∆G
sol (kcal/mol)
102
103
104
105
I -
I 0 scannersaturation
slope 1/RTexp
PM1MM2MM3MM
Experiment with 4targets and probes upto three mismatches
Large range of ∆G
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 17 / 28
Experiments (Spotted array, Weckx et al.)
25 30 35 40-∆G
sol (kcal/mol)
102
103
104
105
I -
I 0 scannersaturation
slope 1/RTexp
PM1MM2MM3MM
Experiment with 4targets and probes upto three mismatches
Large range of ∆G
Solid line:Texp ≈ 45C
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 17 / 28
Experiments (Spotted array, Weckx et al.)
25 30 35 40-∆G
sol (kcal/mol)
102
103
104
105
I -
I 0 scannersaturation
slope 1/RTexp
PM1MM2MM3MM
Experiment with 4targets and probes upto three mismatches
Large range of ∆G
Solid line:Texp ≈ 45C
Dashed line:scanner saturation
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 17 / 28
Practical consequences
Microarrays working in a non-equilibrium regime suffer from enhancedcross-hybridization
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 18 / 28
Practical consequences
Microarrays working in a non-equilibrium regime suffer from enhancedcross-hybridization
Example: two different sequences in solution binding to the same probe atthe surface. The first is perfect match and has concentration c, the secondcarries mismatches and has concentration c′.
I = cf(∆G) + c′f(∆G′)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 18 / 28
Practical consequences
Microarrays working in a non-equilibrium regime suffer from enhancedcross-hybridization
Example: two different sequences in solution binding to the same probe atthe surface. The first is perfect match and has concentration c, the secondcarries mismatches and has concentration c′.
I = cf(∆G) + c′f(∆G′)
We wish to have f(∆G) ≫ f(∆G′). In equilibrium
feq(∆G)
feq(∆G′)= e(∆G−∆G′)/RTexp
is maximized. . .
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 18 / 28
Practical consequences
Microarrays working in a non-equilibrium regime suffer from enhancedcross-hybridization
Example: two different sequences in solution binding to the same probe atthe surface. The first is perfect match and has concentration c, the secondcarries mismatches and has concentration c′.
I = cf(∆G) + c′f(∆G′)
We wish to have f(∆G) ≫ f(∆G′). In equilibrium
feq(∆G)
feq(∆G′)= e(∆G−∆G′)/RTexp
is maximized. . . while in non-equilibrium
f(∆G)
f(∆G′)= e(∆G−∆G′)/RTeff ≪
feq(∆G)
feq(∆G′)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 18 / 28
Application: Mutant identification (Jef’s talk)
-8 -6 -4 -2 010
0
101
102
103
I
c’ = 0
baseline
(a) slope 1/RT
-8 -6 -4 -2 010
0
101
102
103
c’/c = 0.01
(b)
-8 -6 -4 -2 0
∆∆G (kcal/mol)
100
101
102
103
I
c’/c = 0.03
(c)
-8 -6 -4 -2 0
∆∆G (kcal/mol)
100
101
102
103
c’/c = 0.1
(d)
Target:Mixture of 2 sequences
Wild type at cMutant at c′
Detection limitc′/c ≈ 0.01
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 19 / 28
Back to complex biological experiments (Affy Genechip)
AffyILM: Algorithm available from BioConductor
AffyILM
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 20 / 28
Back to complex biological experiments (Affy Genechip)
AffyILM: Algorithm available from BioConductor
AffyILM
. . . Langmuir hybridization (with Teff)
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 20 / 28
Back to complex biological experiments (Affy Genechip)
AffyILM: Algorithm available from BioConductor
AffyILM
. . . background subtraction
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 20 / 28
Back to complex biological experiments (Affy Genechip)
AffyILM: Algorithm available from BioConductor
AffyILM
. . . not taken into account (should affect 5% of targets)Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 20 / 28
Back to complex biological experiments (Affy Genechip)
AffyILM: Algorithm available from BioConductor
AffyILM
. . .mean-field approximationEnrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 20 / 28
1st - Estimating of the Background Noise
Motivation:
Strong signals(Isp ≫ I0)⇒ Gene expressed
Weak signals (Isp ≈ I0)⇒ Weakly expressedgene or backgroundnoise?
Aim
Calculation of background intensity for each probe on array
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 21 / 28
2nd - Specific signal: extended Langmuir model
θ: fraction of bound probesc: mRNA concentrationα: fraction “free” mRNA
θ =αce∆G/RT
1 + αce∆G/RT
α =1
1 + ce∆GR/RT ′
kin
α c c(1−α)
kout
Intensity: I = Aθ∆G, ∆GR: from data in solution
Parameters: A, T , T ′, c
Scaling variablex′ = αc exp(∆G/RT )Fitted effective temperature T ≈ 700K
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 22 / 28
Modeling Hybridization in solution
~~tijt
+ + + + ....25
1i
j^
Chemical equilibrium
Target conservation
[t][ti,j ]
[tti,j ]= e∆GR(i,j)/RT
[t] +∑
i,j
[tti,j ] = c
α =[t]
c=
1
1 +∑
i,j [ti,j ]e−∆GR(i,j)/RT
≈1
1 + ce−∆GR(1,25)/RTeff
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 23 / 28
Calibration data (spike-in)
≈ 300 data points (concentrations c = 0, 0.25, 0.5 . . . 1024 pM)
Scaling collapses (4 fitting parameters) RED : PM / GREEN : MM
12
3 5678 910
1112 1314
15161 23
4567
8 910
11121314
15161 23 45678 910
11121314
15161 2
34567
8 910
1112 1314
15161 23 4
5678 910
11
1213
14
15161 23
4
5678 910
1112 1314
1516
1 2345678 910
1112 1314
1516
12
3 45678 9
10
1112 1314
15161 2
3456
78 910
1112 1314
1512
345
678 91011
1213
14
15
161 23 45678
910
11
12131415
16
1 2
34
5678 9
1011
121314
15
16
1 23 45678 910
1112 1314
15161 23 45678 910
1112 1314
15161 2
3 45678 910
1112 1314
15161 2
34567
8 910
11121314
15161 2
3 45678 910
1112 1314
15161 2
34567
8 910
1112 1314
15161 23 4567
8 910
1112 1314
1516
10−4
10−2
100
102
104
x’
100
102
104
I(c)
−I(
0)
236
71023 6710
16
2
36710
16
23
6710
16
23 67
10
162367
102
36710
162
3
67
10
16
23 6710
27
10
162
3
6710 16
23 710
162
3 671016
2
3 671016
2
3 6710
16
2
3 671016
2
3 6710
16
2
3 6710
16
2
36710
16
Ax’/(1+x’)
1024_at 12
345
678
9101112
1314
15161
2345
678
910
1112
1314
151612
34
5678
910
111213
14
151612
345
678
9101112
1314
15161
2 3
45
678
910
1112
1314
15161
2 3
45
678
910
1112
1314
15161
2 3
45678
910
1112
1314
15161
2 3
45678
9101112
1314
15161
23
45
678
910
1112
1314
15161
23
45
678
9
10
1112
1314
15161
23
45
6
78
9
101112
1314
1516
12
3
45
6
78
9
101112
1314
15161
23
45
6
78
9
10
1112
1314
15
161
23
45
678
9
101112
1314
15
161
2
3
45
6
78
9
101112
1314
15
161
2
3
45
6
78
101112
1314
15
161
3
45
678
9101112
1314
15
1613
4
57
8
9
101112
1314
15161
3
4
578
101112
13
14
15
16
10−4
10−2
100
102
104
x’
100
102
104
I(c)
−I(
0)
1
2
5712
161
2
5712
161
2
5712
161
2
5712
161
2
5712
161
2
5712
161
2
5712
161
2
5712
161
2
5712
161
2
57
12
161
2
57
12
16
1
2
57
12
161
2
57
12
161
2
5712
161
2
57
12
161
2
57 16
12
57
12
16
7
16
12 5
7
1216
Ax’/(1+x’)
37777_at
x-axis: scaling variable x′ = αc exp(∆G/RT )y-axis: background subtracted intensities
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 24 / 28
Outliers
1 23
45
6 78
9
10111213141516
1 23
45
6 78
9
10111213141516 1 2
34
5
6 78
9
10111213141516 1 2
34
5
6 78
9
10111213141516
1 23
4
5
6 78
9
10111213141516
1 23
4
5
6 78
9
10111213141516 1 2
34
5
6 78
9
10111213141516
1 23
45
6 78
9
10111213141516
1 23
4
5
6 78
9
10111213141516
1 2
3
45
6 78
9
10111213141516
1 23
45
6 78
9
10111213141516
1 23
4
5
6 78
9
10111213141516
23
4
5
67813141516
123
45
6 7
8
11121415162356
78
11
13141516
1 2
3
56
711
1215
161
234
56
7
81011
1214
1516
1 23456 7
8
9
101112131415161 2
34
5
6 78
9
10111213141516
10−4
10−2
100
102
104
x’
100
102
104
I(c)
−I(
0)
12
4
59
1013
152
45
9
1013
15
1 2
4
5 9
1013
15
12
4
5 9
1013
15
1
24
5
101315
1
24
5
9
1013
15
1
24
59
1013
15
12
45
9
1013
15
24
5
101315
1 24
5
1013
151 24
5 9
1013
1512
4
5
9
10
13
1525
13
152
5 25
9132
5
91 2
45
10
15
12
4
5 9
101315
12
4
5 9
1013
15
Ax’/(1+x’)
1091_at
probe 9
1091 at9sequence alignment (GenBank)
������������������������
������������������������
������������������������������
������������������������������
Affymetrix
M65066.1
BC013368.2
AL833563.1 Dat
a ba
se
3’−CTG TCC TTG GTC CG C ATG GCT CGT T−5’
3’−CTG
3’−CTG
3’−CTG
TCC
TCC
TCC
TTG
TTG
TTG
GTC
GTC
GTC
CG
CG
CG
C ATG GCT
GCT
GCT
CGT
CGT
T−5’
T−5’
T−5’
CGTAGGCT
AGGCT
Error in the annotation:the correct sequence was submitted to GenBank in 2003, after themicroarray was designed.
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 25 / 28
affyILM (available at www.bioconductor.org)
Input: Experimental microarray data (Affymetrix)Output: Estimate of Concentration per each probe sequence
20 25 30 35 40
5010
020
050
010
0020
0050
0010
000
206158_s_at − RIri−CEP−A−1a−U133A.CEL
−RT ln K
Inte
nsity
+
++
+
+
+
+
+
+
+
+
median c = 200 pM
m.a.d. c = 69 pM
Probe no. conc [pM]
1 437.24
2 466.92
3 176.57
4 200.39
5 433.10
6 153.85
7 211.93
8 154.55
9 124.68
10 35.74
11 201.57
Median 200
M.A.D. 69
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 26 / 28
Collaborators
Institute for Theoretical Physics - KULeuvenM. Baiesi, F. Berger, A. Ferrantini, K. M. Kroll, J. C. WalterJ. Hooyberghs (& Flemish Institute for Technological Research, VITO)
Institute for Theoretical Physics - Utrecht UniversityG. T. Barkema, J. Klein Wolterink, G. Mulders
Micorarray Facility - Flemish Institute for Biotechnology (VIB)P. Van Hummelen, S. Weckx
Financial support fromFWO (Flemish Research Council), KULeuven, VITO, VIB
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 27 / 28
Recent Papers
G. Mulders, G. Barkema and E. CarlonInverse Langmuir method for oligonucleotide microarray analysisBMC Bioinformatics 10, 64 (2009)
J. Hooyberghs, M. Baiesi, A. Ferrantini and E. CarlonBreakdown of thermodynamic equilibrium for DNA hybridization in microarraysPhys. Rev. E 81, 012901 (2010).
J. Hooyberghs and E. CarlonHybridisation thermodynamic parameters allow accurate detection of pointmutations with DNA microarraysBiosens. and Bioel. 26, 1692 (2010)
J. C. Walter, K. M. Kroll, J. Hooyberghs and E. CarlonNon-Equilibrium Effects in DNA Microarrays: A Multiplatform StudyJ.Phys. Chem. B (in press). DOI: 10.1021/jp2014034
Downloadable from http://itf.fys.kuleuven.be/∼enrico/
Enrico Carlon, KULeuven (ITF, K.U.Leuven)Nearest-neighbor model in microarray hybridization Apr. 11, 2011 28 / 28
Recommended