Outline
• Background
• Reading and managing data
  – Naming
  – Matrix
  – Plate view
• Formal quality control
  – Re-run guidelines
• Selection information
  – Specifically ANOVAs
• Plate Data
  – Common observable patterns
• Data View
  – Troubleshooting
• Heatmap view
  – Identify extremes
• LOD View
  – High LOD
• Z-score view
  – Both IPC and intensity
• IPC vs Intensity
  – A customer choice
• Overlapping correlations and protein-protein correlation
• “Hook” effects
  – How they reveal themselves
• Data under LOD
• Exporting
  – Available options
• Questions
Background
• The size of the average Olink project has increased
• Need for criteria for evaluating large projects
  – More than one plate equivalent
• Most recently, Olink has introduced intensity normalization and data under LOD in response to the larger projects
  – Intensity normalization since it is an improvement over IPC
  – Data under LOD since MaxLOD was not ideal
• Formal quality control needs to be plate based, ideally sample based.
  – That does not stop us from trying to deliver the best possible data.
Reading and managing data
• Unique sample naming
  – On all customer samples
  – On all plates
    • And panels
• Randomized sample manifest
  – Strive to facilitate
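The naming rule can be checked before a run. The following is a minimal sketch, assuming the manifest for one panel is available as (plate, sample name) pairs; the function name and the data layout are hypothetical and not part of NPX Manager.

```python
from collections import Counter

def find_duplicate_names(manifest):
    """Return sample names that occur more than once across all plates.

    `manifest` is a hypothetical list of (plate_id, sample_name) pairs
    for a single panel.
    """
    counts = Counter(name for _plate, name in manifest)
    return sorted(name for name, n in counts.items() if n > 1)

manifest = [
    ("Plate1", "S001"),
    ("Plate1", "S002"),
    ("Plate2", "S002"),  # same name reused on a second plate
    ("Plate2", "S003"),
]
duplicates = find_duplicate_names(manifest)
print(duplicates)
```

Any name returned here would need to be made unique before the data are read in.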
Formal quality control
• Rerun is always the first option
  – From incubation, unless there is a specific reason to think detection failed.
Selection information
• Plates and samples should add up to expectations
• ANOVAs measure the number of proteins significantly associated with technical parameters
  – But not the extent
• The degree of variation is estimated by Plate Data and Z-score
• CVs are interesting when randomization is in question
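To illustrate what "number of proteins significantly associated with a technical parameter" means, here is a rough sketch that counts proteins with a plate effect. It is not the NPX Manager implementation: the p-value is estimated with a label-permutation test (swapped in so the example needs only the standard library), and the protein names, values and 0.05 cutoff are assumptions.

```python
import random
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (one group per plate)."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p_value(groups, n_perm=2000, seed=0):
    """Estimate the p-value by shuffling plate labels (assumed approach)."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical NPX values: two plates, four samples each.
npx_by_plate = {
    "ProteinA": [[1.0, 1.1, 0.9, 1.0], [3.0, 3.1, 2.9, 3.0]],   # clear plate shift
    "ProteinB": [[1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 2.9, 4.05]],  # no plate shift
}
flagged = [name for name, groups in npx_by_plate.items()
           if permutation_p_value(groups) < 0.05]
print(len(flagged), flagged)
```

The count says how many proteins are affected, not how large the shift is, which is why the extent has to be judged from Plate Data and Z-score.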
Plate Data
• When a column or row effect is suspected, review of plate data is warranted
• First column effects are common
  – Pipette preconditioning
• Re-run is a judgement call, and at Olink it is made by the head of Analysis Service.

Plate Data: When it is not a difficult decision
• First row effects that do not normalize, and will affect data.
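A column effect can be screened for numerically before looking at the plate view. The sketch below compares each column median with the plate median; the small 4 x 3 layout, the values and the 0.5 threshold are all illustrative assumptions.

```python
from statistics import median

def column_offsets(plate):
    """Median of each column minus the plate median.

    `plate` is a list of rows; each row is a list of values, one per column.
    """
    plate_median = median(v for row in plate for v in row)
    n_cols = len(plate[0])
    return [median(row[c] for row in plate) - plate_median for c in range(n_cols)]

# Hypothetical plate for one assay: the first column reads about one unit low.
plate = [
    [4.0, 5.0, 5.1],
    [4.1, 5.1, 4.9],
    [3.9, 4.9, 5.0],
    [4.0, 5.0, 5.0],
]
offsets = column_offsets(plate)
# Column numbers are 1-based; the 0.5 cutoff is arbitrary.
flagged = [c + 1 for c, off in enumerate(offsets) if abs(off) > 0.5]
print(flagged)
```

The same function applied to the transposed plate would screen for row effects. With properly randomized samples a flagged column points to a technical cause rather than biology.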
Data View and Heatmap view
• Troubleshooting tools
  – Identifying extreme samples
    • Mislabeled negative controls
  – Interim analyses
[Figure: ICAM_Ct vs IGBP2_Ct]
Z-score view
• When the study is properly randomized, intensity normalization is superior to IPC normalization
  – But it is in the end a decision that should be made by the person responsible for the study (customer)
• Intensity normalization could disguise dilution errors.
  – Which is why it may be beneficial to look at IPC data
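Why intensity normalization can hide a dilution error is easiest to see with numbers. The sketch below assumes simplified definitions that are not taken from the slides: IPC normalization subtracts the plate's median IPC value, intensity normalization subtracts the plate's median sample value, and the IPC wells are unaffected by how the samples were diluted. All values are made up.

```python
from statistics import median

def ipc_normalize(samples, ipc_wells):
    """Subtract the plate's median IPC value (assumed definition)."""
    ref = median(ipc_wells)
    return [v - ref for v in samples]

def intensity_normalize(samples):
    """Subtract the plate's median sample value (assumed definition)."""
    ref = median(samples)
    return [v - ref for v in samples]

# One assay, two plates. On plate 2 every sample is shifted by -1 (log2 scale),
# as if the samples had been over-diluted; the IPC wells are unchanged.
plate1 = {"samples": [5.0, 6.0, 7.0], "ipc": [4.0, 4.0, 4.0]}
plate2 = {"samples": [4.0, 5.0, 6.0], "ipc": [4.0, 4.0, 4.0]}

ipc_medians = [median(ipc_normalize(p["samples"], p["ipc"])) for p in (plate1, plate2)]
int_medians = [median(intensity_normalize(p["samples"])) for p in (plate1, plate2)]
print(ipc_medians)  # plate 2 sits one unit lower: the shift stays visible
print(int_medians)  # both plates are centred on zero: the shift is gone
```

If the plates are randomized, removing the shift is the desired behaviour. If the shift comes from a handling error, the IPC-normalized data are where it shows.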
Overlapping correlations
Panel 1                          Panel 2                          Spearman  Name
Olink ONCOLOGY II(v.7002)        Olink IMMUNE RESPONSE(v.3201)    0.891864  AREG
Olink NEUROLOGY(v.8011)          Olink INFLAMMATION(v.3012)       0.914541  Beta-NGF
Olink CELL REGULATION(v.3701)    Olink CARDIOVASCULAR II(v.5003)  0.662383  BOC
Olink INFLAMMATION(v.3012)       Olink IMMUNE RESPONSE(v.3201)    0.954263  CCL11
Olink INFLAMMATION(v.3012)       Olink CARDIOVASCULAR II(v.5003)  0.88894   CCL3
Olink INFLAMMATION(v.3012)       Olink CARDIOVASCULAR II(v.5003)  0.990715  CXCL1
Olink IMMUNE RESPONSE(v.3201)    Olink CELL REGULATION(v.3701)    0.728117  DCBLD2
Olink ORGAN DAMAGE(v.3301)       Olink CELL REGULATION(v.3701)    0.379815  ENTPD6
Olink CELL REGULATION(v.3701)    Olink CARDIOVASCULAR II(v.5003)  0.957846  FGF-21
Olink INFLAMMATION(v.3012)       Olink CARDIOVASCULAR II(v.5003)  0.959302  FGF-21
Olink CARDIOVASCULAR II(v.5003)  Olink CELL REGULATION(v.3701)    0.957846  FGF-21
Olink INFLAMMATION(v.3012)       Olink CARDIOVASCULAR II(v.5003)  0.831894  FGF-23
Olink NEUROLOGY(v.8011)          Olink INFLAMMATION(v.3012)       0.641495  GDNF

• For projects involving multiple panels, correlations between overlapping assays can be a good tool for investigating results.
• Poor correlation is not necessarily a problem on its own.
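A Spearman coefficient like those in the table can be computed for any assay measured on two panels in the same samples. The sketch below uses only the standard library (average ranks for ties, then Pearson correlation on the ranks); the sample values are invented for illustration.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical values for one overlapping assay, same five samples on two panels.
panel1 = [1.2, 2.5, 3.1, 4.8, 5.0]
panel2 = [0.9, 2.7, 2.6, 4.1, 5.5]
rho = spearman(panel1, panel2)
print(round(rho, 3))
```

The function does not guard against constant input, where the rank variance is zero. An assay with most values near the bottom of its range on one panel can correlate poorly for that reason alone.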
Protein-Protein correlation
• Four assays with similar behavior indicate that there is a global source of variance among the samples.
• In this case, it is pre-analytical variation related to time on bench.
“Hook” effects and binary distributions
• Hook effects are common in special sample matrices
  – Cell lysates
  – Tissue homogenates
  – Other species
  – Take the form of a ceiling
• Some assays have special distributions in normal samples
  – Binary distributions
  – Long tailed
The LOD problem
• LOD is inherently plate-specific
• Variations in LOD may or may not transfer directly to samples
  – The example on the slide illustrates that the LOD from one plate does not necessarily translate.
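As a separate, hypothetical illustration of plate-specific LOD: the sketch below assumes LOD is estimated per plate as the mean of the negative controls plus three standard deviations. That definition is an assumption not stated in the slides, and all values are made up.

```python
from statistics import mean, stdev

def plate_lod(negative_controls, n_sd=3.0):
    """Assumed definition: mean of the negative controls plus n_sd standard deviations."""
    return mean(negative_controls) + n_sd * stdev(negative_controls)

# Hypothetical negative-control readings for one assay on two plates.
neg_ctrl = {"Plate1": [0.9, 1.0, 1.1], "Plate2": [1.2, 1.5, 1.8]}
lods = {plate: plate_lod(values) for plate, values in neg_ctrl.items()}

# The same sample value is above LOD on one plate and below it on the other.
sample_npx = 2.0
below = [plate for plate, lod in lods.items() if sample_npx < lod]
print(lods, below)
```

A single maximum LOD across plates would treat the value as below LOD everywhere, even though the noisier negative controls on one plate say nothing about the samples on the other.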
Data under LOD: why the switch was made
• If we assume protein concentrations are in reality normally distributed, data under LOD better represents that distribution.

Data under LOD: false positives
• Using data under LOD does not result in false positives.
• In contrast, unequal groupings coupled with a strict LOD threshold could.
• Un-truncated data also facilitates normalizations and combinations.
Export
• Make sure the final export has the expected number of rows and columns
• An incorrect number of rows is typically naming related
  – Too many or too few unique names.
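The dimension check can be automated once the export is loaded. The sketch below assumes a hypothetical layout with one record per sample, a "SampleID" key and one key per assay; it is not the actual NPX Manager export format.

```python
def check_export(rows, expected_samples, expected_assays):
    """Compare the export's dimensions with what the project should contain."""
    problems = []
    if len(rows) != expected_samples:
        problems.append(f"rows: expected {expected_samples}, got {len(rows)}")
    for row in rows:
        n_assays = len(row) - 1  # every key except "SampleID"
        if n_assays != expected_assays:
            problems.append(
                f"{row['SampleID']}: expected {expected_assays} assays, got {n_assays}"
            )
    return problems

# Hypothetical export: a trailing space turned one sample into two "unique" names.
export = [
    {"SampleID": "S1", "IL6": 1.0, "CXCL1": 2.0},
    {"SampleID": "S2", "IL6": 1.5, "CXCL1": 2.5},
    {"SampleID": "S2 ", "IL6": 1.4, "CXCL1": 2.4},
]
problems = check_export(export, expected_samples=2, expected_assays=2)
print(problems)
```

An empty list means the export matches expectations. A row-count mismatch is the cue to go back and compare sample names.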
Summary
• Formal quality control is quick
  – But is always one step behind
• A large portion of NPX Manager is focused on troubleshooting.
• Re-runs are almost always done on a plate basis.
  – Because the customer may have designed the plate for a particular purpose.
• Data under LOD is the new standard
Thank you for your time
Customer support – from start to finish
Internal investigations –performance qualification
Support – both external and internal
Development of analysis pipelines and internal databases
Statistical Analysis Plan
[Figures: volcano-plot example, boxplot example, heatmap example]

Example output table from a t-test
Protein  P-value    Adj.P  Case-Ctrl
PVRL4    3 × 10^-5  0.013  0.23
CSF.1    0.0001     0.029  0.12
SYND1    0.0001     0.030  0.13
TIM      0.0003     0.034  0.28

Example output table from an ANOVA
Protein  P-value    Adj.P  Pval A vs B  A-B   Pval A vs C  A-C   Pval B vs C  B-C
PVRL4    3 × 10^-5  0.013  0.03         0.80  3.2 × 10^-4  0.60  0.05         -0.20
CSF.1    0.0001     0.029  0.04         0.12  0.02         0.12  0.99         0.00
SYND1    0.0001     0.030  0.56         0.48  0.08         0.31  0.21         -0.17
TIM      0.0003     0.034  0.04         0.19  0.91         0.41  0.92         0.22

OUTPUT
OUTPUT FROM DIFFERENT TESTS
Each test is associated with a results table and a visualization. These will depend on which kind of test you are performing. A report summarizing the results from each test is also included.
T-TEST AND REGRESSION
The volcano plot takes its values from the results table. Each protein is represented by its p-value and difference. Proteins above the dashed line pass correction for multiple testing, while proteins above the solid line are significant only if multiple testing is not considered.
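The two lines of the volcano plot correspond to two thresholds: one on the raw p-value and one on the adjusted p-value. The slides do not say which correction is used; the sketch below assumes the Benjamini-Hochberg procedure and a 0.05 cutoff, with made-up protein names and p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

proteins = ["P1", "P2", "P3", "P4"]   # hypothetical names
p_values = [0.01, 0.04, 0.03, 0.20]   # hypothetical raw p-values
adjusted = benjamini_hochberg(p_values)

alpha = 0.05  # assumed cutoff for both lines
passes_correction = [n for n, a in zip(proteins, adjusted) if a < alpha]
nominal_only = [n for n, p, a in zip(proteins, p_values, adjusted)
                if p < alpha and a >= alpha]
print(passes_correction, nominal_only)
```

Proteins in `passes_correction` would sit above the dashed line, and those in `nominal_only` between the solid and dashed lines.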
ANOVA
The output is a results table plus boxplots or a heatmap, depending on the number of significant hits. The boxplots visualize the behaviour of significant proteins across all groups. If there are more than 25 significant proteins, the visualization takes the form of a heatmap instead.
CONFIDENTIAL
Our Unique Offer
• 1 µL sample
• 1,000 validated proteins in <15 µL
• Validation inspired by FDA
• Biostatistical and data analysis
• PEA for exceptional specificity
• Custom development
• Truly scalable - Data you can trust
• Reagent kit or service
• Our lab: CLIA, ISO, GCP (under development)