Yfiler® Plus Sensitivity Study (early release kit). Sonja Klein, Mark Timken, Martin Buoncristiani. Cal DOJ Jan Bashinski DNA Lab.


DESCRIPTION

Presentation Outline:
* What is a sensitivity study?
* Pre-PCR sampling statistics
* Results of the empirical study compared to predicted results based on sampling statistics
* How the results might help with interpretation and provide guidance with addressing the SWGDAM Y STR guidelines

California Department of Justice (Cal DOJ)


Page 1: YFiler®Plus Sensitivity Study

Yfiler® Plus Sensitivity Study (early release kit)

Sonja Klein, Mark Timken, Martin Buoncristiani

Cal DOJ Jan Bashinski DNA Lab

Page 2: YFiler®Plus Sensitivity Study

Presentation Outline

• What is a sensitivity study?
• Pre-PCR sampling statistics
• Results of the empirical study compared to predicted results based on sampling statistics
• How the results might help with interpretation and provide guidance with addressing the SWGDAM Y STR guidelines

Page 3: YFiler®Plus Sensitivity Study

Yfiler® Plus from Applied Biosystems

• 27-locus Y STR kit
  – 3 known multi-copy loci
    • DYS389, DYS385, DYF387S1
• 25 uL reaction volume
• 30 cycles
• 1 ng target template
• 6-dye system (5 plus LIZ)
• 3500
  – 1.2 kV, 16 sec
  – 175 RFU Analytical Threshold

Page 4: YFiler®Plus Sensitivity Study

Yfiler® Plus Plot- 1 ng 10 new loci

Page 5: YFiler®Plus Sensitivity Study

What is a Sensitivity Study?

• Validation requirement
• Typically:
  – Replicate amps in a dilution series
    • Example: 2 ng to 16 pg (for a 1 ng target system)
  – Positive control DNA or other known single source samples (extracted, purified DNA)

Page 6: YFiler®Plus Sensitivity Study

Some Sensitivity Study Goals

• Is the target template appropriate for the system?
  – Cycle number, reaction volume, instrument/injection settings
  – Mid to low-mid range of CCD camera detection
• Over what range of input is signal linear?
• Over what range of input are all peaks likely to be detected?
• What stochastic effects are observed?
  – PHRs, dropout, (stutter)
  – Stochastic Threshold
    • Multi-copy loci for Y STRs
  – Probability of dropout (for probabilistic genotyping)
    • Null allele vs. drop-out for Y STRs

Page 7: YFiler®Plus Sensitivity Study

Sensitivity Study

• DNA Dilution Series

Template (pg)   # of replicates   Loci   # expected alleles
1000            2                 27     54
500             2                 27     54
250             4                 27     108
125             8                 27     216
62.5            14                27     378
31.25           14                27     378
15.6            14                27     378
7.8             14                27     378
Total           72 amps                  1944 alleles (loci)
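The totals in the dilution table can be re-derived with a few lines of Python. This is only a bookkeeping sketch; the dictionary layout and variable names are illustrative and not part of the study.

```python
# Dilution series from the slide: template (pg) -> number of replicate amps.
replicates = {1000: 2, 500: 2, 250: 4, 125: 8,
              62.5: 14, 31.25: 14, 15.6: 14, 7.8: 14}
LOCI = 27  # loci counted per amplification, as on the slide

# Expected alleles (loci) per template level, then the grand totals.
expected = {pg: n * LOCI for pg, n in replicates.items()}
total_amps = sum(replicates.values())
total_alleles = sum(expected.values())

for pg, n in replicates.items():
    print(f"{pg:>7} pg: {n:>2} amps, {expected[pg]:>3} expected alleles")
print(f"total: {total_amps} amps, {total_alleles} alleles (loci)")
```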

Page 8: YFiler®Plus Sensitivity Study

Linearity

[Figure: empirical avg PH (RFU, 0 to 12000) vs. Template (pg, 0 to 1000).]

Page 9: YFiler®Plus Sensitivity Study

Peak Height Dispersion

[Figure: empirical avg PH and individual empirical PHs (RFU, 0 to 12000) vs. Template (pg, 0 to 1000).]

Page 10: YFiler®Plus Sensitivity Study

Peak Height Relative Stdev

[Figure: Stdev PH / Avg PH * 100 (%CV, 0 to 120) vs. Template (pg) at 7.8, 15.6, 31.25, 62.5, 125, 250, 500, and 1000 pg.]

Page 11: YFiler®Plus Sensitivity Study

Sources of Peak-Height Variation

• Stochastic:
  – Pre-PCR allelic sampling
  – PCR synthesis
• Systematic:
  – Inter-locus imbalance
  – Preferential amplification
  – Cap-to-cap
  – Injection-to-injection
  – Degradation
  – Inhibition

Page 12: YFiler®Plus Sensitivity Study

Pre-PCR Stochastic Sampling Effects

• Publications pointing to pre-PCR sampling as the main source of peak height variance at low template amounts (for extracted DNA, i.e., dissociated alleles)
  – Walsh et al., CSH Genome Res. 1992
  – Taberlet et al., NAR 1996
  – Stenman and Orpana, Nature Biotechnology 2001
  – Gill et al., NAR 2005
  – Timken et al., FSIG 2014
    • The paper compared empirical results for 2372 with ID+ and MF to predicted results based on pre-PCR sampling alone

Page 13: YFiler®Plus Sensitivity Study

2001

“Assuming that the template molecules are evenly distributed in a solution of known concentration, the probability of a certain number of molecules to be present in an aliquot pipetted from this solution can be calculated according to the Poisson distribution”

Page 14: YFiler®Plus Sensitivity Study

Uniform Sampling (what we imagine) vs. Stochastic (Poisson) Sampling (what we get)

Assume a homogeneous DNA solution at 105.6 pg per 10 uL, i.e., 16 copies per 10 uL. This situation is depicted below as a uniform distribution of copies (black balls) in squares representing 10 uL volumes.

If we transfer 10 uL from this solution into a PCR tube, at a single locus with an average of 16 copies, what do stochastic sampling statistics (using the Poisson distribution) say that we’ll get?

[Figure: four squares, each holding 16 evenly spaced copies (uniform sampling), contrasted with four stochastic draws of 22, 11, 18, and 13 copies.]

Avg = 16 = λ, Stdev = √λ
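The contrast between uniform and stochastic sampling can be illustrated with a small simulation. This is a sketch only: the sampler (Knuth's multiplication method), the seed, and the number of simulated aliquots are choices made for this example and are not taken from the presentation.

```python
import math
import random
import statistics

def poisson_draw(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(2372)  # arbitrary fixed seed so the sketch is repeatable
LAMBDA = 16                # average copies per 10 uL aliquot, from the slide
draws = [poisson_draw(LAMBDA, rng) for _ in range(20_000)]

sample_mean = statistics.fmean(draws)
sample_stdev = statistics.pstdev(draws)
print("first four simulated aliquots:", draws[:4])
print(f"mean = {sample_mean:.2f} (expect about {LAMBDA})")
print(f"stdev = {sample_stdev:.2f} (expect about {math.sqrt(LAMBDA):.0f})")
```

Individual aliquots scatter around 16 copies, while the long-run mean and standard deviation settle near λ and √λ.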

Page 15: YFiler®Plus Sensitivity Study

Poisson Distribution

Poisson assumptions:
1. Alleles have the same average concentration (λ)
2. Are sampled equivalently
3. Are sampled independently

Poisson is a discrete probability distribution:

$$P(X \mid \lambda) = \frac{\lambda^{X} e^{-\lambda}}{X!}$$

Poisson properties:
1. The mean of the distribution is equal to λ
2. The variance is also equal to λ

• C.M. Grinstead, J.L. Snell, Introduction to Probability: Second Revised Edition, American Mathematical Society, Rhode Island, 1997.

Page 16: YFiler®Plus Sensitivity Study

Uniform Sampling (what we imagine) vs. Stochastic (Poisson) Sampling (what we get)


Avg = 16, λ = 16, Stdev = √λ

The probability that we actually sample exactly 16 copies according to the Poisson distribution is:

$$P(16 \mid 16) = \frac{16^{16} \times e^{-16}}{16!} = 0.0992$$

(~9.9% of the time exactly 16 copies will be sampled)

We can also calculate a cumulative probability. For example, the probability we sample 16 or fewer copies with an average of 16 (calculated in Excel) is:

P(16 or fewer) = POISSON(16,16,TRUE) = 0.566

Page 17: YFiler®Plus Sensitivity Study

Pre-PCR Sampling Statistics and Sensitivity Study with ID+/MF

Page 18: YFiler®Plus Sensitivity Study

Applying pre-PCR Sampling Statistics to Yfiler® Plus Sensitivity Study

• Test if Poisson sampling statistics can predict Yfiler® Plus peak height variance

Need:
  – Accurate average starting copy number (λ)
  – Signal proportional to template (linear)
  – Dissociated alleles (extracted DNA)

Page 19: YFiler®Plus Sensitivity Study

NIST SRM® 2372A

• NIST Standard Reference Material (SRM® 2372A)
  – Single source male (white blood cell) extract that was developed as a standard for Human DNA Quantitation*
  – 57 ng/uL concentration (known λ)
  – Most accurate template available

*Kline et al., Production and certification of NIST Standard Reference Material 2372 Human DNA Quantitation Standard, Anal. Bioanal. Chem. 394 (2009) 1183-1192.

Page 20: YFiler®Plus Sensitivity Study

Sensitivity Study 2372A

• DNA Dilution Series

Template (pg)   # of replicates   Loci   # expected alleles   Avg starting copy # (λ)
1000            2                 27     54                   151.5
500             2                 27     54                   75.8
250             4                 27     108                  37.9
125             8                 27     216                  18.9
62.5            14                27     378                  9.5
31.25           14                27     378                  4.7
15.6            14                27     378                  2.4
7.8             14                27     378                  1.2

Assuming 6.6 pg per diploid cell
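The average starting copy numbers follow from dividing the template mass by 6.6 pg, since a male diploid cell carries one Y chromosome and therefore one copy of each single-copy Y locus. A minimal sketch (variable names are illustrative):

```python
PG_PER_DIPLOID_CELL = 6.6  # assumption stated on the slide

templates_pg = [1000, 500, 250, 125, 62.5, 31.25, 15.6, 7.8]

# One Y chromosome per male diploid cell, so lambda = template mass / 6.6 pg.
avg_copies = {pg: pg / PG_PER_DIPLOID_CELL for pg in templates_pg}

for pg, lam in avg_copies.items():
    print(f"{pg:>7} pg -> lambda = {lam:.1f} copies")
```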

Page 21: YFiler®Plus Sensitivity Study

Linearity

RFU/copy = ~48

y = 47.66x - 33.196

[Figure: empirical avg PH (RFU, 0 to 12000) vs. Average Copy Number (0 to 160), with the linear fit above.]

Page 22: YFiler®Plus Sensitivity Study

Signal per starting copy #

[Figure: bar chart of average signal per starting copy (RFU, 0 to 60) vs. Template (pg) at 7.8, 15.6, 31.25, 62.5, 125, 250, 500, and 1000 pg. Bar labels as extracted: 49.8, 46.8, 50.2, 50.9, 41.2, 46.4, 41.2, 48.9.]

Page 23: YFiler®Plus Sensitivity Study

Detection Sensitivity Analytical Threshold (in copies)

$$AT_c = \frac{AT_{RFU}}{RFU/\text{copy}} = \frac{175}{48.9} = 3.6$$

For reference, 2372 with ID+ (25 uL volume, 28 cycles) on the 3500 at an AT of 175 RFU gave an ATc of 4.1 (175/42.4 = 4.1).

$$AT_{pg} = 3.6\ \text{copies} \times 6.6\ \text{pg/copy} \approx 24\ \text{pg}$$

Page 24: YFiler®Plus Sensitivity Study


For repeated 24 pg amps, on average, half the alleles will be above the AT and half will be below.
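The threshold conversion and the "half above, half below" statement can be checked numerically. The sketch below assumes the Normal approximation to the Poisson (mean λ, standard deviation √λ) that the presentation uses for dropout probabilities; the variable names are illustrative.

```python
from statistics import NormalDist

AT_RFU = 175.0       # analytical threshold on the 3500 (RFU)
RFU_PER_COPY = 48.9  # signal per starting copy used on the slide
PG_PER_COPY = 6.6    # pg of male genomic DNA per Y-locus copy

at_copies = AT_RFU / RFU_PER_COPY  # threshold expressed in starting copies
at_pg = at_copies * PG_PER_COPY    # template mass whose average sits at the AT

# When the average copy number equals the threshold, a symmetric Normal
# model puts half of the alleles below the AT and half above it.
model = NormalDist(mu=at_copies, sigma=at_copies ** 0.5)
p_below = model.cdf(at_copies)

print(f"ATc = {at_copies:.1f} copies, about {at_pg:.0f} pg")
print(f"P(below AT at that template) = {p_below:.2f}")
```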

Page 25: YFiler®Plus Sensitivity Study

STRBase http://www.cstl.nist.gov/strbase/ystrpos1.htm

Page 26: YFiler®Plus Sensitivity Study


Alleles on the Y chromosome are associated in terms of inheritance, but dissociated in terms of sampling from extracted DNA.

Page 27: YFiler®Plus Sensitivity Study

Sequences searched by using BLAT Human Genome Assembly (Feb. 2009)

STR          Y Location   Distance from previous locus
DYS393       3.131E+06
DYS456       4.271E+06    1.140E+06
DYS570       6.861E+06    2.590E+06
DYS576       7.053E+06    1.920E+05
DYS458       7.868E+06    8.148E+05
DYS449       8.218E+06    3.502E+05
DYS481       8.426E+06    2.080E+05
DYS627       8.650E+06    2.240E+05
DYS19        9.522E+06    8.719E+05
DYS391       1.410E+07    4.581E+06
DYS635       1.438E+07    2.772E+05
DYS437       1.447E+07    8.711E+04
DYS439       1.452E+07    4.826E+04
DYS389 I     1.461E+07    9.680E+04
DYS389 II    1.461E+07    (none listed)
DYS438       1.494E+07    3.257E+05
DYS390       1.727E+07    2.337E+06
DYS518       1.732E+07    4.512E+04
DYS533       1.839E+07    1.073E+06
Y GATA H4    1.874E+07    3.506E+05
DYS385 a     2.080E+07    2.058E+06
DYS385 b     2.084E+07    4.023E+04
DYS460       2.105E+07    2.090E+05
DYS392       2.263E+07    1.583E+06
DYS448       2.436E+07    1.731E+06
DYF387S1     2.593E+07    1.566E+06
DYF387S1     2.803E+07    2.100E+06

Page 28: YFiler®Plus Sensitivity Study


The minimum distance between the 27 Yfiler® Plus loci is approximately 40,000 bases (except for DYS389I/II). Fragments of high quality extracted DNA are ~10,000-25,000 bases long, so each Y locus should be dissociated from the others and follow sampling statistics (with the exception of DYS389I/II).
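The dissociation argument can be checked against the listed inter-locus distances. The sketch below transcribes the distances from the table (DYS389 I/II is excluded because no distance is listed for it) and compares the smallest gap with the upper end of the quoted fragment length; the data layout and names are illustrative.

```python
# Distance (bases) from the previous locus, as listed in the BLAT table.
gap_from_previous = [
    ("DYS456", 1.140e6), ("DYS570", 2.590e6), ("DYS576", 1.920e5),
    ("DYS458", 8.148e5), ("DYS449", 3.502e5), ("DYS481", 2.080e5),
    ("DYS627", 2.240e5), ("DYS19", 8.719e5), ("DYS391", 4.581e6),
    ("DYS635", 2.772e5), ("DYS437", 8.711e4), ("DYS439", 4.826e4),
    ("DYS389 I", 9.680e4), ("DYS438", 3.257e5), ("DYS390", 2.337e6),
    ("DYS518", 4.512e4), ("DYS533", 1.073e6), ("Y GATA H4", 3.506e5),
    ("DYS385 a", 2.058e6), ("DYS385 b", 4.023e4), ("DYS460", 2.090e5),
    ("DYS392", 1.583e6), ("DYS448", 1.731e6),
    ("DYF387S1 (first copy)", 1.566e6), ("DYF387S1 (second copy)", 2.100e6),
]
MAX_FRAGMENT_BASES = 25_000  # upper end of the quoted fragment size

closest_locus, smallest_gap = min(gap_from_previous, key=lambda item: item[1])
all_dissociated = smallest_gap > MAX_FRAGMENT_BASES

print(f"smallest listed gap: {smallest_gap:,.0f} bases (before {closest_locus})")
print("every listed gap exceeds the fragment size:", all_dissociated)
```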

Page 29: YFiler®Plus Sensitivity Study

Empirical vs. Predicted

• Results:
  – Comparison of the Yfiler® Plus empirical dropout rate to the dropout rate predicted by pre-PCR sampling statistics

Page 30: YFiler®Plus Sensitivity Study

Empirical Frequency of Dropout at 175 RFU

[Figure: empirical avg PH and individual empirical PHs (RFU, 0 to 2000) vs. Template (pg, 0 to 125), with the 175 RFU threshold marked.]

Empirical Dropout:

pg      # alleles < 175 RFU   Total expected alleles   Fr(D) < 175 RFU
7.8     357                   378                      0.94
15.6    295                   378                      0.78
31.25   139                   378                      0.37
62.5    28                    378                      0.074
125     0                     216                      0
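The empirical dropout frequency is the count of alleles below 175 RFU divided by the number of expected alleles at that template level. A short sketch recomputing the Fr(D) column (names are illustrative):

```python
# Template (pg) -> (alleles below 175 RFU, total expected alleles), from the slide.
dropout_counts = {
    7.8: (357, 378),
    15.6: (295, 378),
    31.25: (139, 378),
    62.5: (28, 378),
    125: (0, 216),
}

# Fr(D): fraction of expected alleles that fell below the analytical threshold.
fr_dropout = {pg: below / total for pg, (below, total) in dropout_counts.items()}

for pg, fr in fr_dropout.items():
    print(f"{pg:>6} pg: Fr(D) = {fr:.3f}")
```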

Page 31: YFiler®Plus Sensitivity Study

Probability of Dropout at ATc of 3.6

pg      copies (λ)   NORM.DIST(ATc, λ, sqrtλ, TRUE)
7.8     1.18         0.99
15.6    2.36         0.79
31.25   4.73         0.30
62.5    9.47         0.028
125     18.94        2.11E-04

The Poisson is a discrete probability distribution (defined on counts, e.g. 1, 2, 3, etc.), so to model the continuous peak height data we use a Normal approximation to the Poisson:

NORM.DIST(x, mean, stdev, TRUE)

For example, starting with 31.25 pg, P(D) = NORM.DIST(ATc, λ, sqrtλ, TRUE) = NORM.DIST(3.6, 4.7, 2.18, TRUE) = 0.306924
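The Excel calculation maps directly onto `statistics.NormalDist` in the Python standard library. The function name `p_dropout` is illustrative, and the sketch assumes λ = template / 6.6 pg as in the dilution table.

```python
from statistics import NormalDist

AT_COPIES = 3.6    # analytical threshold in copies (ATc)
PG_PER_COPY = 6.6  # pg per starting copy, as assumed in the dilution table

def p_dropout(template_pg: float, at_copies: float = AT_COPIES) -> float:
    """Normal approximation to the Poisson: P(starting copies < ATc)."""
    lam = template_pg / PG_PER_COPY
    return NormalDist(mu=lam, sigma=lam ** 0.5).cdf(at_copies)

predicted = {pg: p_dropout(pg) for pg in (7.8, 15.6, 31.25, 62.5, 125)}
for pg, p in predicted.items():
    print(f"{pg:>6} pg: P(D) = {p:.3g}")

# The slide's worked example uses the rounded inputs 4.7 and 2.18.
slide_example = NormalDist(mu=4.7, sigma=2.18).cdf(3.6)
print(f"NORM.DIST(3.6, 4.7, 2.18, TRUE) = {slide_example:.4f}")
```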

Page 32: YFiler®Plus Sensitivity Study

Fr(D) from Sensitivity Study vs. P(D) using λ and Sampling Statistics

[Figure: P(D) (Normal) or Fr(D) YFP (0 to 1) vs. Template (pg) in the PCR (0 to 140). Series: P(D) Normal at ATc 3.6; Fr(D) at AT 175 RFU.]

Page 33: YFiler®Plus Sensitivity Study

Fr(D) from Sensitivity Study vs. P(D) using λ and Sampling Statistics

[Figure: P(D) (Normal) or Fr(D) YFP (0 to 1) vs. Template (pg) in the PCR (0 to 140). Series: P(D) LR Analysis (1-18cells) at ATc 3.6; P(D) Normal at ATc 3.6; Fr(D) at AT 175 RFU.]

Logistic Regression:

$$P(D) = \frac{1}{1 + e^{-(a + bx)}}$$

For ATc 3.6, a = 3.029319 and b = −0.11497
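The logistic curve can be evaluated directly from the two coefficients. The slide does not define x; the sketch below assumes x is the template amount in pg, matching the x-axis of the plot, and that assumption should be confirmed before reusing the numbers.

```python
import math

A = 3.029319   # intercept reported for ATc 3.6
B = -0.11497   # slope reported for ATc 3.6

def p_dropout_logistic(x: float) -> float:
    """P(D) = 1 / (1 + exp(-(a + b*x))); x is ASSUMED to be template in pg."""
    return 1.0 / (1.0 + math.exp(-(A + B * x)))

# Input at which the fitted curve crosses P(D) = 0.5 (where a + b*x = 0).
x_half = -A / B

for x in (7.8, 15.6, 31.25, 62.5, 125):
    print(f"x = {x:>6}: P(D) = {p_dropout_logistic(x):.3g}")
print(f"P(D) = 0.5 at x = {x_half:.1f}")
```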

Page 34: YFiler®Plus Sensitivity Study

SWGDAM Interpretation Guidelines for Y-Chromosome STR Typing by Forensic DNA Laboratories (Jan. 9, 2014)

• 3.2.1 The laboratory should establish guidelines for the identification of such null alleles. As an example, this guideline may be based on experimental studies designed to distinguish a null allele from an undetected allele resulting from low template amounts, DNA degradation or inhibition.

• 5.2 The laboratory should establish a stochastic threshold for known multi-copy Y-STR loci based on empirical data derived within the laboratory and specific to the quantitation and amplification systems (e.g., kits) and the detection instrumentation used. It is noted that a stochastic threshold may be established by assessing peak height ratios across any multi-copy locus in a dilution series of DNA amplified in replicate. The RFU value above which it is reasonable to assume that, at a given locus, allelic dropout of a sister allele has not occurred constitutes a stochastic threshold.

Page 35: YFiler®Plus Sensitivity Study

Yfiler® Plus Plot, 31 pg

[Figure: Yfiler® Plus plot at 31 pg, with two positions marked "?".]

Page 36: YFiler®Plus Sensitivity Study

Probability of Dropout vs. 2372 Average Peak Height Could Aid Null Allele Assessment

[Figure: P(D) (Normal) or Fr(D) YFP (0 to 1) vs. 2372 Average Peak Height (RFU, 0 to 1000). Series: P(D) LR Analysis (1-18cells) at ATc 3.6; P(D) Normal at ATc 3.6; Fr(D) at AT 175 RFU.]

Page 37: YFiler®Plus Sensitivity Study

PHRs for Multi-copy Loci

• PHRs for Y kits
  – Very few multi-copy loci to measure PHRs from
    • For example, 10 amps with YFP yield only 10*2 = 20 PHRs, as compared to ~120 PHRs for 10 amps with ID+ (~12 het loci*10)
  – Can use Poisson distribution to predict PHR dispersion
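One way to sketch a Poisson-based PHR prediction is to treat the two alleles of a multi-copy locus as independent Poisson draws with the same λ and enumerate the ratio of the smaller to the larger count. This is an illustration under stated assumptions, not the presenters' method: it conditions on both alleles having at least 4 starting copies (the smallest whole count above the ATc of 3.6), and the 0.6 PHR cutoff and the λ values are arbitrary examples.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for Poisson(lam), computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def phr_fraction_at_least(lam: float, min_phr: float,
                          min_copies: int = 4, max_copies: int = 200) -> float:
    """P(PHR >= min_phr | both alleles start with >= min_copies copies).

    PHR is taken as min(a, b) / max(a, b) for two independent Poisson(lam)
    starting-copy counts a and b (an assumption of this sketch).
    """
    pmf = [poisson_pmf(k, lam) for k in range(max_copies + 1)]
    hits = total = 0.0
    for a in range(min_copies, max_copies + 1):
        for b in range(min_copies, max_copies + 1):
            weight = pmf[a] * pmf[b]
            total += weight
            if min(a, b) / max(a, b) >= min_phr:
                hits += weight
    return hits / total

fractions = {lam: phr_fraction_at_least(lam, 0.6) for lam in (9.5, 18.9, 37.9)}
for lam, frac in fractions.items():
    print(f"lambda = {lam:>4}: P(PHR >= 0.6) = {frac:.3f}")
```

As λ grows, the predicted PHRs tighten toward 1, which is the dispersion behavior the Poisson model is used to anticipate.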

Page 38: YFiler®Plus Sensitivity Study

Poisson Generated PHRs

Page 39: YFiler®Plus Sensitivity Study

Stochastic Threshold for Multi-copy Loci

• Stochastic Threshold
  – Can set an ST relative to P(D) risk using the logistic regression curve at the ATc

Page 40: YFiler®Plus Sensitivity Study

Logistic Regression (Semi-Log Plot) (Normal approx. of Poisson) for multi-copy ST

Page 41: YFiler®Plus Sensitivity Study

Logistic Regression (Semi-Log Plot) (Normal approx. of Poisson) for multi-copy ST

A "peak height" of 17 copies has a "sister allele" dropout probability of 1 in 1000 (using an ATc of 3.6).

17 copies × 48.9 RFU/copy = 831 RFU

Page 42: YFiler®Plus Sensitivity Study

Conclusions

• A sensitivity study was conducted with NIST SRM® 2372A and Yfiler® Plus on the 3500.
• Peaks, on average, were shown to be proportional to input template amount.
• Pre-PCR sampling statistics predicted empirical dropout rates at the analytical threshold (175 RFU, or an ATc of 3.6 copies).
• Amplification of an accurate standard can serve as a good starting point for characterizing a system's sensitivity and estimating dropout probabilities, ST, and PHR ranges.

Page 43: YFiler®Plus Sensitivity Study

Limitations

• Repeat in the presence of large amounts of female DNA to verify RFU per copy values
• Systematic signal differences
  – Run-to-run
  – Inter-color or inter-locus signal differences
• Predictions assume extracted, diluted DNA (100% dissociated model)
  – Poisson sampling will overestimate the variance if performing direct amps (associated alleles)