Data Quality Indicators (DQIs)
What are they, and how do they affect me? A US-EPA Approach
[Title graphic: the PARCCS acronym - Precision, Accuracy, Representativeness, Comparability, Completeness, Sensitivity]
DQIs Defined
DQIs are quantitative and qualitative measures of the principal quality attributes:
- Precision;
- Bias;
- Representativeness;
- Comparability;
- Completeness; and
- Sensitivity.
Quantitative DQIs: precision, bias, and sensitivity.
Qualitative DQIs: representativeness, comparability, and completeness.
The Hierarchy of Quality Terms
- DQOs: qualitative and quantitative study objectives
- Attributes: descriptive qualitative and quantitative aspects of collected data
- DQIs: indicators of the quality attributes
- MQOs: acceptance criteria for the quality attributes, measured by project DQIs
Precision
Precision is the measure of agreement among repeated measurements of the same property under identical or substantially similar conditions.
A precision DQI is a quantitative indicator of the random errors or fluctuations in the measurement process, e.g., the standard deviation or variance.
Bias
Bias is systematic or persistent distortion of a measurement process that causes error in one direction.
A bias DQI is a quantitative indicator of the magnitude of systematic error resulting from:
- biased sampling design;
- calibration errors;
- response factor shifts;
- unaccounted-for interferences; and
- chronic sample contamination.
e.g., instrument reads XX mg/L too high
Accuracy
Accuracy is composed of precision and bias. Accuracy is a measure of the overall agreement of a measurement to a known value:
- when random errors are tightly controlled, bias dominates the overall accuracy; and
- when random errors predominate, variance dominates the overall accuracy.
Influence of Bias and Imprecision on Overall Accuracy
[Target diagrams illustrating the four cases:]
- Imprecise and biased
- Imprecise and unbiased
- Precise and biased
- Precise and unbiased
Representativeness
Representativeness is the measure of the degree to which data suitably represent a characteristic of a population, parameter variations at a sampling point, a process condition, or an environmental condition.
Representativeness DQIs are qualitative and quantitative statements regarding the degree to which data reflect the true characteristics of a well-defined population.
e.g., these samples are representative of the surface soil to be found in a specific area of XX square meters.
Comparability
Comparability is a qualitative expression of the measure of confidence that two or more data sets may contribute to a common analysis.
A comparability DQI is a qualitative indicator of the similarity of attributes of data sets.
e.g., soil salinity or soil acidity data sets are comparable as they share a common preparation and analytical method operated under similar conditions.
Completeness
Completeness is a measure of the amount of valid data obtained from a measurement system, expressed as a percentage of the number of valid measurements that should have been collected.
The DQI for completeness is often expressed as a percentage, e.g., the percentage of valid samples for which data for all analytes of interest were reported.
Sensitivity
Sensitivity is the capability of a method or instrument to discriminate between measurement responses representing different levels of the variable of interest.
Sensitivity can be regarded as a detection limit, but this term is often used without defining what is intended (minimum detection or quantitation).
A sensitivity DQI describes the capability of measuring a constituent at low levels. A Practical Quantitation Level (PQL) describes the ability to quantify a constituent with known certainty.
e.g., a PQL of 0.05 mg/L for mercury represents the level where a precision of +/- 15% can be obtained.
Verification
Data verification refers to the procedures needed to ensure that a set of data is a faithful reflection of all the processes and procedures used to generate the data.
Verification involves the examination of objective evidence that the specified method, procedures, and contractual requirements were fulfilled.
Validation
Data validation is an analyte- and sample-matrix-specific process to determine the analytical quality of a specific data set.
Validation entails the inspection of data handling practices for deviations from consistency, the review of quality control (QC) information for deviations, assessment of deviations, and assignment of data qualification codes.
Validation can entail the examination of the data with respect to the QA Plan.
Integrity
Lack of integrity affects all aspects of data interpretation, especially data used for decision making.
Lack of integrity includes:
- manipulation of QC measurements;
- dry-labbing (complete falsification of data);
- manipulation of results during analysis;
- failure to conduct required analytical steps; and
- post-analysis alteration of results.
After Verification and Validation
The set of data is then analyzed by comparing the results to the original objectives. In many cases this is a comparison of the results to the DQOs using data quality assessment.
Data quality assessment is a five-step process:
1. Review of DQOs and sample design
2. Preliminary data review
3. Selection of statistical test
4. Verification of assumptions
5. Drawing conclusions from the data
But that is another course altogether!
Representativeness: Statistical and Conceptual Model-Based Approaches
Representativeness
Representativeness is the measure of the degree to which data suitably represent a characteristic of a population, parameter variations at a sampling point, a process condition, or an environmental condition.
Representativeness DQIs are qualitative and quantitative statements regarding the degree to which data reflect the true characteristics of a well-defined population.
What Does "Representativeness" Mean?
Very vaguely defined in working English:
- a "seal of approval" given by simple statement of the writer;
- there is an absence of biasing forces;
- it is a miniature or replica of the population;
- it is a typical or ideal case;
- there is wide coverage of the population;
- it enables good estimation;
- it is good enough for the purposes of the study; or
- it results from a statistically based sampling method.
Different Definitions of "Representativeness"
- "...expected to exhibit the average properties of the universe or whole"
- "...should be selected on the basis of spatial and temporal representativeness"
- "...samples should be representative of daily operations"
Achieving Representativeness Involves a Process
Planning, design, and assessment:
- careful attention to the measurement and analytical process;
- consideration of the size (amount of material) and method for sample collection and handling;
- determination of adequate type, location, timing, and number of samples to be taken; and
- a defensible approach for drawing inferences from sample data to the target population.
Sample design and measurement processes should minimize unintentional bias.
The Process Involves Evaluating Both Micro and Macro Scales
- Micro scale: how well measurements taken within a sampling unit reflect that unit (e.g., "parameter variations at a sampling point")
- Macro scale: the degree to which measurements from a set of sampling units reflect the population of interest (e.g., "accurately and precisely represent a characteristic of a population")
Micro Scale (Within-Sampling-Unit) Representativeness
- An appropriate quality system to ensure quality implementation and sample integrity;
- Carefully defined sampling units with correct sampling procedures and equipment;
- Adequate sample support (amount of material) to make inferences about the characteristics within the sampling unit; and
- Appropriate analytical methods (including sample preparation), designed to achieve MQOs for measurement precision, bias, and sensitivity.
What is a Sampling Unit?
A sampling unit (SU) can be defined as the portion of the environment for which a measurement has meaning for its intended use.
Defining SUs for a project allows us to communicate more clearly about components of total-study precision.
SUs can vary depending on the specific problem; they can be:
- as small as the physical sample itself;
- something encompassing multiple physical samples; or
- something much larger.
In classical survey design (e.g., an opinion survey) the SU is typically an individual.
Specifying Sampling Units
SUs are less well defined in other types of surveys (e.g., in a survey to determine soil salinity levels); in this case, a soil sample is much smaller than the area it represents - is the sampling unit the topsoil sample, a pedon, or the farm as a whole?
Consider how data will be used: an average over multiple units, the spatial distribution of units, or some combination?
Default SU definition: equivalent to the physical sample (soil, water, or plant specimen) taken.
Alternative SU definitions:
- units comprised of multiple samples to obtain enough of the medium to perform all desired analyses;
- units of a size adequate to collect multiple specimens (such as composite samples); or
- units defined to include a group of samples when individual samples are not the unit of interest.
Alternative Sampling Unit Definitions
Choice of Sampling Unit - What Does a Sample Represent?
[Illustration: a 7.5 cm core, a 1-ha area, or a small farm]
Sampling Theory: Within-SU Error
To what degree is heterogeneity within a sampling unit inherent?
Gy refers to this as the "constitution heterogeneity." No amount of mixing or homogenization can reduce it.
Constitution heterogeneity leads to fundamental error.
Fundamental errors are negligible for liquids and gases without suspended solids, but are significant in soil and any other solids.
Sampling Theory: Within-SU Error
What is the distribution and variance between small increments of the media?
Gy refers to this as "distribution heterogeneity," which reflects the distribution of groups of some number of neighboring fragments.
Grouping and segregation errors result from distribution heterogeneity; minimize these errors by taking more increments to form a sample of the required weight.
Heterogeneity of Pollutants Can Lead to Sampling Errors
- h1 = small-scale heterogeneity (random fluctuations)
- h2 = large-scale heterogeneity (trends, nonrandom, bias)
- h3 = cyclic phenomena
- h = h1 + h2 + h3
Each of these components of heterogeneity leads to errors.
Experiments to characterize these components (using variograms) allow one to optimize a design; a sketch of such a calculation follows.
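To make the variogram idea concrete, here is a minimal Python sketch of an empirical semivariogram computed from samples along a transect. The positions, soil-salinity values, lag distances, and tolerance are hypothetical illustration choices, not values from the study.

```python
# Minimal sketch: empirical semivariogram for samples along a transect,
# used to characterize small-scale, trend, and cyclic heterogeneity.
# Positions and concentrations are hypothetical illustration data.
import numpy as np

def semivariogram(positions, values, lags, tol):
    """gamma(h) = 0.5 * mean[(z_i - z_j)^2] over pairs separated by ~h."""
    positions = np.asarray(positions, dtype=float)
    values = np.asarray(values, dtype=float)
    gammas = []
    for h in lags:
        d = np.abs(positions[:, None] - positions[None, :])      # pair distances
        sq = (values[:, None] - values[None, :]) ** 2             # squared differences
        mask = np.triu(np.abs(d - h) <= tol, k=1)                 # pairs within tol of lag h
        gammas.append(0.5 * sq[mask].mean() if mask.any() else np.nan)
    return np.array(gammas)

# hypothetical soil-salinity readings every 10 m along a 200 m transect
rng = np.random.default_rng(0)
x = np.arange(0, 200, 10.0)
z = 2.0 + 0.01 * x + rng.normal(0, 0.3, x.size)   # trend plus random noise

lags = [10, 20, 40, 80]
print(dict(zip(lags, np.round(semivariogram(x, z, lags, tol=5.0), 3))))
```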
Controlling Sampling Errors
- Ensure the field sampling protocol does not distort or bias samples: it should be capable of ensuring all parts of the media (e.g., all particle sizes) have the same probability of being included in the increment obtained to form a sample.
- Ensure the laboratory subsamples represent all the particle-size fractions: subsamples must be large enough (optimal sample weight) to accommodate the range of particle sizes.
- Samples and subsamples should be comprised of as many correctly obtained increments as possible.
Questions Raised by Sampling Theory Related to "Within-SU Error"
- What is the correct scale at which to sample?
- What is the correct protocol for obtaining increments to form samples of the media of interest?
Questions Raised by Sampling Theory Related to "Within-SU Error" (cont.)
- Pilot studies are needed to determine the nature of the heterogeneity. If, for example, soil salinity areas are highly clustered on a scale smaller than the scale of real concern, small grabs will reveal varied results.
- If homogenization and sub-sampling do not remove clustering, representation of a sampling unit from a single sample will not be achievable.
- Sampling protocols should be selected that do not alter the characteristics of the media (e.g., particle-size composition).
Classical Statistical Approach
Define the population of interest:
- spatial and temporal boundaries; and
- sampling units.
Develop a statistical sampling plan:
- in a probability-based design, every sampling unit has a known probability of inclusion.
Evaluate the process for drawing inferences from data:
- how well the sampling units selected represent the population under study;
- how data will be used to estimate target population parameters such as the mean and variance; and
- how well the sampled population provides information on the subject in question.
Strategies for Improving Within-SU Representativeness
- Use within-sampling-unit replication: averaging n replicates reduces the variability of the average by a factor of 1/sqrt(n) (see the sketch below).
- Use within-sampling-unit compositing: increasing the number of increments in the sample reduces the variability of the unit average.
- Increase the sample support area or volume: expanding the definition of what area or volume the analytical measurement will represent can alleviate small-scale (or short-term) fluctuations.
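The 1/sqrt(n) effect of replication can be checked numerically. The sketch below simulates sampling units measured with a hypothetical within-unit standard deviation of 0.15 and compares the observed standard error of the n-replicate average with sigma/sqrt(n); the numbers are illustrative only.

```python
# Minimal sketch: the standard error of a within-sampling-unit average
# shrinks by a factor of 1/sqrt(n) as replicates (or increments) are added.
import numpy as np

rng = np.random.default_rng(1)
sigma_within = 0.15          # hypothetical within-sampling-unit std deviation
true_value = 2.0

for n in (1, 4, 9, 16):
    # simulate many sampling units, each averaged over n replicates
    means = rng.normal(true_value, sigma_within, size=(10_000, n)).mean(axis=1)
    print(f"n = {n:2d}  observed SE = {means.std(ddof=1):.4f}  "
          f"theory sigma/sqrt(n) = {sigma_within / np.sqrt(n):.4f}")
```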
Statistical Strategies for Improving Between-Sampling-Unit Representativeness
Statistical sampling schemes (one of which is sketched below):
- simple random sampling
- systematic (grid) sampling
- stratified random sampling
- ranked set sampling
- cluster sampling
- between-sampling-unit composite sampling
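As a concrete illustration of one scheme from the list, here is a minimal sketch of stratified random sampling of sampling units. The stratum names, unit labels, and per-stratum allocations are hypothetical.

```python
# Minimal sketch: stratified random sampling of sampling units.
# Each unit within a stratum has a known, equal chance of selection,
# which is what makes the design probability-based.
import random

random.seed(42)
strata = {                      # stratum -> list of candidate sampling units (hypothetical)
    "floodplain": [f"FP-{i:02d}" for i in range(1, 21)],
    "upland":     [f"UP-{i:02d}" for i in range(1, 31)],
}
allocation = {"floodplain": 4, "upland": 6}   # samples per stratum (hypothetical)

plan = {s: sorted(random.sample(units, allocation[s])) for s, units in strata.items()}
for stratum, chosen in plan.items():
    print(stratum, chosen)
```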
Balanced Design to Achieve Representativeness
Understand the relative contribution of within-sampling-unit and between-sampling-unit variance; focus on the components of variance to which the total variability is most sensitive:
- more samples to lower between-sampling-unit variance; or
- more precise measurements to lower within-sampling-unit variance.
Assessing Representativeness
Evaluating existing data:
- representativeness affects the degree to which a data set can be used for a purpose other than originally intended;
- use of a checklist promotes a thorough evaluation of the attributes of representativeness; and
- use of quality assessment samples such as duplicates, splits, or other replicates can assist in answering questions about within-sampling-unit representativeness.
Important Attributes (Micro-level)
- Was a rationale provided to support the selection of sampling equipment and handling procedures? The correct choice of equipment and handling procedures directly affects the degree to which the increments and samples reflect the characteristics of the matrix.
- Was a rationale provided to support the selection of analytical methods? The choice of sample preparation and analytical instrument is critical.
- Were samples collected from all selected sampling units? Incomplete sampling, if biased due to the lack of completeness, can lead to incorrect conclusions.
Important Attributes (Macro-level)
- Were study objectives adequately defined using the DQO process or an equivalent planning process? The intended use of the data provides the context for evaluating representativeness.
- Was the population of interest clearly defined? Probability-based designs require the population to be defined as a set of sampling units.
- Was the statistical basis for the sampling plan explained (number of samples and their allocation)? Representativeness hinges on an adequate number of samples, and different sample allocation approaches can maximize effectiveness.
Precision Indicators Reflective of the Data Collection Life Cycle
[Diagram: Planning - Implementation - Assessment]
Precision
"Precision is the measure of agreement among repeated measurements of the same property under identical or substantially similar conditions."
Properties in soil studies include:
- the concentration of a constituent, say nitrogen; or
- a physical measurement (e.g., grain size) of the soil media.
A precision DQI is a quantitative indicator of the random errors or fluctuations in the measurement process.
Common Indicators of Precision
- Range: the difference between the largest and smallest values.
- Variance or standard deviation: a statistical measure of the spread of data calculated from two or more measured values; the standard deviation is the square root of the variance.
- Relative range: the range divided by the mean of the data set.
- Relative standard deviation (CV): the standard deviation calculated from two or more values divided by the mean of those values.
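These indicators are straightforward to compute; a minimal sketch using illustrative replicate values follows.

```python
# Minimal sketch: range, variance, standard deviation, relative range,
# and relative standard deviation (CV) for a set of replicate measurements.
import statistics

replicates = [22.0, 20.0, 23.5, 21.2]         # illustrative repeated results, ppm

mean = statistics.mean(replicates)
rng_ = max(replicates) - min(replicates)       # range
var = statistics.variance(replicates)          # sample variance (n - 1)
sd = statistics.stdev(replicates)              # standard deviation = sqrt(variance)

print(f"range               = {rng_:.2f}")
print(f"variance            = {var:.3f}")
print(f"standard deviation  = {sd:.3f}")
print(f"relative range      = {rng_ / mean:.3f}")
print(f"relative std dev/CV = {100 * sd / mean:.1f} %")
```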
Framework for Evaluating Indicators of Precision
A simple model allows us to evaluate the components and indicators of total-study variability.
Within-sampling-unit variability:
- measurement process;
- small-scale variability; and
- sample acquisition.
Between-sampling-unit variability:
- inherent spatial variability; and
- sampling design error.
Simple Total-Study Variability Model
[Diagram: Total-Study Variability = Within-Sampling-Unit Variability + Between-Sampling-Unit Variability]
[Diagram detail: within-sampling-unit variability comprises small-scale variability (within unit) and sample collection and measurement process variability; between-sampling-unit variability comprises inherent spatial variability (among units) and sampling design error]
Sampling Units
A sampling unit (SU) can be defined as the portion of the natural environment (soil, water, plant) for which a measurement has meaning for its intended use.
Defining SUs for a soil, water, or plant sampling project allows us to communicate more clearly about components of total-study precision.
SUs can vary depending on the specific problem; they can be:
- as small as the physical sample itself;
- something encompassing multiple physical samples; or
- something much larger.
In classical survey design (e.g., an opinion survey) the SU is typically an individual. SUs are less well defined in other types of surveys (e.g., in a survey to determine soil salinity levels); in this case, a soil sample is much smaller than the individual - is the sampling unit the soil sample, the pedon, the farm, or the project?
Consider how data will be used: an average over multiple units, the spatial distribution of units, or some combination?
Specifying SUs
Default SU definition: equivalent to the physical sample taken.
Alternative SU definitions:
- units comprised of multiple samples to allow for obtaining enough of the medium to perform all desired analyses;
- units of a size adequate to collect multiple samples (such as collocated samples); or
- units uniquely defined to measure properties of interest when a sample is not the unit of interest, nearby samples are highly correlated, or there is an explicit desire to control the precision within the unit.
Alternative Sampling Unit Definitions
Evaluating Sampling Unit Definitions
Defining SUs larger than the physical sample has some potential benefits; it:
- clarifies whether collocated samples should be treated as additional field samples or as replicates;
- forces us to consider the scale at which measurements have meaning; and
- facilitates a more comprehensive consideration of the sources of error affecting our understanding of properties of interest, and the sources of variability affecting individual measurements.
Most study designs do not account for within-sampling-unit variability in any explicit way; tradeoffs between fewer precise measurements versus more imprecise measurements begin to address the issue.
Sampling Theory Raises Important Questions Related to Within-SU Error
- What is the correct scale at which to sample?
- What is the correct protocol for obtaining samples?
- Pilot studies are needed to determine the nature of the heterogeneity: if the concentration of analytes is highly clustered on a scale smaller than the scale of real concern, small grabs will reveal varied results.
- If homogenization and sub-sampling do not remove clustering, representation of a sampling unit from a single sample will not be achievable.
- Sampling protocols should be selected that do not alter the characteristics of the media (e.g., particle-size composition).
Components of Within-Sampling-Unit Precision
[Diagram: within-sampling-unit variability comprises small-scale variability (within unit) and sample collection and measurement process variability; the latter splits into within-sample variability (inherent small-scale variability, subsampling or homogenization) and measurement method imprecision (sample handling and preparation, analytical instrument)]
QA Samples Used to Evaluate Components of Total-Study Variability
Total Within-Sampling-Unit Precision Pyramid
Example of Total-Study Variability Components

QC Sample            | Components of Variability Captured                                                | Estimated Standard Deviation
Instrument replicate | Instrument response (IR)                                                          | 0.046
Laboratory replicate | IR + subsampling and extraction/digestion (S&E)                                   | 0.11
Laboratory split     | IR + S&E + lab homogenization (LH)                                                | 0.12
Field split          | IR + S&E + LH + sample handling (SH)                                              | 0.12
Collocated samples   | IR + S&E + LH + SH + field sample acquisition and small-scale variability (A&SV)  | 0.15
Field samples        | IR + S&E + LH + SH + A&SV + between-sampling-unit variability                     | 1.11

* Data are from an actual lab/field study at a New Mexico laboratory in the USA.
A Simple Additive Variance Model
σ²t = σ²b + σ²w
σ²t = σ²b + σ²m + σ²s
where:
- t = total study
- w = within-sampling-unit
- b = between-sampling-unit
- m = measurement
- s = small-scale variability (e.g., field duplicate)
- σ² = variance
Visualizing the Contribution of Components of Total-Study Variance
[Diagram: total-study variability (field samples, σt) partitioned into between-sampling-unit variability (σb) and within-sampling-unit variability (σw, estimated from collocated samples)]
Visualizing the Within-Sampling-Unit Components of Variability
[Diagram: within-sampling-unit variability σw = 0.15 (collocated samples), with contributions from small-scale variability (~0.10), measurement variability (lab replicates, 0.11), sample-preparation variability (~0.10), and analytical-instrument variability (instrument replicates, 0.046)]
Calculating Variance
- Total variance: estimated from all samples.
- Within-sampling-unit variance: estimated from duplicates (e.g., field splits) or from multiple replicates.
- Between-sampling-unit variance: the total variance minus the within-sampling-unit variance.
Calculating Components of Variance from an Existing Data Set

Sample ID | Arsenic (ppm) | Cadmium (ppm) | Lead (ppm) | Sample Type
99-7510   | 3.9  | <0.07 | 22    |
99-7510   | 2.2  | <0.06 | 20    | lab replicate
99-7511   | 3.1  | <0.07 | 104   |
99-7512   | 2.6  | 5.3   | 37.6  |
99-7512a  | 2.0  | 5.9   | 34.8  | field split
99-7513   | 2.4  | <0.07 | 782   |
99-7513   | 4.6  | <0.07 | 829   | lab replicate
99-7514   | 2.5  | 0.47  | 35.9  |
99-7514a  | 3.1  | <0.07 | 37.7  | field split
99-7515   | 4.4  | 1.4   | 17.5  |
99-7515a  | 2.9  | 1.6   | 28.3  | field split
99-7516   | 3.2  | 4.5   | 55.2  |
99-7517   | 2.8  | 5.1   | 921   |
99-7517   | 3.5  | 5.6   | 902   | lab replicate
Standard deviation | 0.78 | 2.5 | 390 |
Calculating Components of Variance from an Existing Data Set (cont.)
- Within-sampling-unit variance for lead may be calculated using the field split data.
- Based on all samples, the total-study variance for lead is estimated from the ordinary sample variance.
- Between-sampling-unit variance for lead is then calculated as the difference between the total and within-sampling-unit variances.
A sketch of these calculations follows.
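The slide's formula images are not reproduced above, so the sketch below is a plausible reconstruction of the arithmetic using the lead (ppm) values from the data table: within-sampling-unit variance from the field-split duplicate pairs, total variance as the ordinary sample variance of all reported results, and between-sampling-unit variance by subtraction. It reproduces the values shown in the next slide (about 21.3, 148,897, and 148,876).

```python
# Plausible reconstruction of the variance-component arithmetic for lead.
# s_w^2 = sum(d_i^2) / (2k) over k duplicate pairs; s_t^2 = sample variance
# of all reported results; s_b^2 = s_t^2 - s_w^2.
import statistics

lead_all = [22, 20, 104, 37.6, 34.8, 782, 829, 35.9, 37.7,
            17.5, 28.3, 55.2, 921, 902]                      # all reported lead results
field_splits = [(37.6, 34.8), (35.9, 37.7), (17.5, 28.3)]    # field-split duplicate pairs

s2_within = sum((a - b) ** 2 for a, b in field_splits) / (2 * len(field_splits))
s2_total = statistics.variance(lead_all)                     # sample variance (n - 1)
s2_between = s2_total - s2_within

print(f"within-unit  s^2 = {s2_within:,.1f}")                # ~21.3
print(f"total-study  s^2 = {s2_total:,.0f}")                 # ~148,897
print(f"between-unit s^2 = {s2_between:,.0f}")               # ~148,876
```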
Using Indicators of Variance

     | Arsenic | Cadmium | Lead
s²w  | 0.25    | 0.13    | 21.3
s²b  | 0.36    | 6.01    | 148,876
s²t  | 0.61    | 6.14    | 148,897

For cadmium and lead, the total variability is dominated by between-sampling-unit variability (i.e., site heterogeneity). Arsenic is probably near background - most of its variability comes from the measurement process.
Establishing MQOs
- Decomposing total-study variance facilitates the identification of the relative importance of the components of total error; this exercise also helps determine what kind of QA samples to employ.
- Total-study variance estimates are plugged directly into sample-size calculations.
- Individual Measurement Quality Objectives (MQOs) should be established for the components of variance that primarily drive the total variability.
- MQOs on specific measurement components must reflect the requirements for total-study error.
Strategies for Reducing Within-Sampling-Unit Variance
- Replication
- Small-scale compositing
- Increasing sample support
- A more precise measurement method
Bias: Analysis and Prevention
Bias
Bias = measured result - true value
Relative bias = (measured result - true value) / true value
When dealing with recovery rates:
Recovery = 1 + (measured result - true value) / true value
expressed as a percentage.
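A minimal sketch of these bias and recovery expressions, applied to a hypothetical spiked-sample result (the 0.050 and 0.046 values are made up for illustration):

```python
# Minimal sketch: bias, relative bias, and percent recovery.
def bias(measured, true):
    return measured - true

def relative_bias(measured, true):
    return (measured - true) / true

def percent_recovery(measured, true):
    return 100.0 * (1.0 + (measured - true) / true)   # equals 100 * measured / true

true_value = 0.050     # spike concentration (hypothetical)
measured = 0.046       # laboratory result (hypothetical)

print(f"bias          = {bias(measured, true_value):+.4f}")
print(f"relative bias = {relative_bias(measured, true_value):+.1%}")
print(f"recovery      = {percent_recovery(measured, true_value):.0f} %")
```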
Principal Causes of Bias
- Incomplete data
- Analytical: calibration error, sample contamination, matrix effects, interferences
- Sampling: incorrect location identification, judgmental sampling scheme
Bias Due to Incompleteness
Example: the objective is to estimate the percentage of correctly documented permits for exemption. Data were obtained by asking permit holders to respond; 60% responded to the request. Of these responses, 70% were correctly documented. Does the 40% non-response rate really matter?
True percentage = (respondent fraction x their percentage) + (non-respondent fraction x their percentage)
Bias = non-respondent fraction x (difference in percentages)
- If non-responses were 70% correctly documented: bias = 0%, correct estimate is 70%.
- If non-responses were 50% correctly documented: bias = 8%, correct estimate is 62%.
- If non-responses were 30% correctly documented: bias = 16%, correct estimate is 54%.
- If non-responses were 10% correctly documented: bias = 24%, correct estimate is 46%.
Bias Due to Incomplete Response
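The non-response arithmetic above can be reproduced directly; the sketch below recomputes the true percentage and the bias for each assumed non-respondent documentation rate.

```python
# Minimal sketch: bias introduced by non-response, reproducing the example above.
response_rate = 0.60
respondent_pct = 70.0          # % correctly documented among respondents

for nonrespondent_pct in (70.0, 50.0, 30.0, 10.0):
    true_pct = (response_rate * respondent_pct
                + (1 - response_rate) * nonrespondent_pct)
    bias = (1 - response_rate) * (respondent_pct - nonrespondent_pct)
    print(f"non-respondents {nonrespondent_pct:>4.0f}% correct -> "
          f"true = {true_pct:.0f}%, bias = {bias:.0f}%")
```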
Calibration Errors Leading to Bias
Matrix Effects Leading to Bias
The composition of the matrix can influence both preparation and analysis. Non-ideal chemical behavior influences samples differently than standards.
Method of Standard Addition (MSA)
[Plot: instrument response versus added standard, extrapolated to the concentration in the sample]
Atomic absorption interferences:
- spectral: cannot resolve the analyte from other species; and
- chemical: chemical processes alter the absorption characteristics of the analyte.
Possible resolutions: successive serial dilutions, matrix modification, MSA.
Interferences Leading to Bias
Sample Handling Errors Leading to Bias
Loss of sample during collection or storage; inadequate preservation (acid, darkness, cooling, excessive holding time).
Example: metals require acidification to prevent or minimize precipitation and adsorption to the sample container.
Why Bias, Why Not Accuracy?
Accuracy includes both precision (random error that could be positive or negative for each individual reading) and bias (systematic error that is either positive or negative for all readings).
Accuracy (mean square error) = variance + bias²
Precision is estimated through replicate measurements. Bias is estimated by comparison of the mean of replicate measurements to a known standard. Without standards, bias cannot be estimated with confidence; only a reduction in bias is possible.
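The decomposition "accuracy (mean square error) = variance + bias²" can be demonstrated with simulated measurements of a known standard; the bias and standard deviation used below are arbitrary illustration values.

```python
# Minimal sketch: mean square error equals variance plus bias squared.
import numpy as np

rng = np.random.default_rng(7)
true_value = 10.0
bias, sigma = 0.5, 0.3                       # hypothetical systematic and random error

x = true_value + bias + rng.normal(0.0, sigma, 100_000)

mse = np.mean((x - true_value) ** 2)         # mean square error about the true value
decomposed = np.var(x) + (np.mean(x) - true_value) ** 2

print(f"MSE = {mse:.4f}, variance + bias^2 = {decomposed:.4f}")
```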
Bias Hidden as Variability
[Scatter plots of data sets A and B on a scale of 0 to 50, with the population mean marked at 38.5]
Is data set A or B a better representation of the population?
Both data sets have similar variability. Data set B is a biased representation of the population of interest.
Bias Hidden as Variability (cont.)
[The same scatter plots of data sets A and B, repeated]
Sensitivity: Discerning the Signal in the Noise
[Illustration: instrument response versus concentration]
Sensitivity
Sensitivity is the capability of a method or instrument to discriminate between measurement responses representing different levels of the variable of interest.
The term "detection limit" is often used without consideration of what is really meant; there are several sensitivity DQIs, including IDLs, MDLs, and PQLs.
A sensitivity DQI describes the capability of measuring a constituent at low levels. A PQL describes the ability to quantify a constituent with known certainty.
e.g., a PQL of 0.05 mg/L for mercury represents the level where a precision of +/- 15% can be obtained.
Calibration Standards
Samples containing the analytes of interest are generally prepared in a clean matrix to develop a relationship between concentration and instrument response. This can cause a problem if the matrix under investigation interacts differently with the analyte than a clean matrix does.
The relationship between concentration and instrument response is normally used to predict the unknown concentration found in samples of interest.
Calibration allows for the determination of theoretical detection and quantification limits.
Calibration Standards - Graphically
[Graph: instrument response for six calibration levels - 1, 2, 3, 3.5, 4, and 4.5 µg/L - each response an average of multiple runs. Where are the quantification levels?]
Calibration Curve from Standards
By graphing the calibration standards we can see three regions of interest in the relationship:
- below the linear range;
- the linear range; and
- above the linear range.
[Graph: instrument response versus concentration (µg/L)]
Relationship of Instrument Calibration Curve and Analyte Detection/Quantification
[Diagram: instrument response versus concentration, divided along the concentration axis at the IDL, MDL, PQL, and LOL into regions of unknown identification and quantitation, less certain identification, less certain quantification, and known quantification]
IDL = instrument detection limit; MDL = method detection limit; PQL = practical quantitation limit; LOL = limit of linearity
Commonly Used Sensitivity Indicators

Sensitivity Indicator | Numerical Definition | Definition | Common Use
Instrument Detection Limit (IDL) | Usually 3 times the instrument noise level | Lowest value the instrument can distinguish from zero | Provides the basis for determining an MDL
Method Detection Limit (MDL) | MDL = t(n-1, 0.99) x s, where s = standard deviation; for 7 aliquots, t(n-1, 0.99) = 3.14 | Defined in 40 CFR Part 136, Appendix B | Determines the theoretical detection limit
Practical Quantitation Limit (PQL) | PQL = 5 x MDL or PQL = 10 x MDL (more precisely defined as the lowest standard on the instrument calibration curve) | "the lowest concentration of an analyte that can be reliably measured within specified limits of precision and accuracy during routine laboratory operating conditions" | Provides a numerical lower limit for critical data
Reporting Limit (RL) | Laboratory defined (often RL = PQL) | Lowest value reported by the laboratory without a "J" flag | Laboratory basis for data reporting
MDL: Controls the Type I Error
USEPA definition of the MDL: "the concentration where we have 99% confidence that the value of the analyte concentration is greater than zero."
By accepting a 1% chance of a Type I error (false positive), we define the MDL as:
MDL = t x Sd
Type I error (false positive): concluding that the analyte is present when in fact it is absent (zero).
MDL: Does Not Control the Type II Error
If the MDL is chosen as the reporting limit, then by default there is a 50% probability of a Type II (false negative) error. This means a sample that is truly at the MDL will be considered below the MDL 50% of the time.
Type II error (false negative): concluding that the analyte is absent when in fact it is present.
PQL Relationship to MDL
- The PQL has multiple definitions: generally 5-10 times the MDL, or the lowest point on the calibration curve.
- Requiring 5-10 times the MDL usually provides precision of less than 20% RSD (quantification).
- If the PQL is 5 times the MDL, and the MDL is about 3 times the standard deviation, then the PQL is approximately 15 times the standard deviation.
Quality Associated with Calibration Regions
[Diagram: instrument signal (in standard deviation units) versus analyte concentration, starting from a matrix/method blank at zero analyte concentration. Regions of high uncertainty, certain detection, less certain quantification, and certain quantification are separated by the approximate MDL level (near the LOD, about 3 standard deviations) and the approximate PQL level (near the LOQ)]
LOD = limit of detection; LOQ = limit of quantitation; σ = population standard deviation
Project Management Perspective
There are major differences between the various sensitivity DQIs (RL, IDL, MDL, and PQL). Even when you specify a particular indicator, it is very important to get a precise definition, a description of the process, and the formula, to know what it really means!
It is important to know whether the indicator reflects the detection limit in a clean matrix or in an actual sample.
The PQL is usually the most useful indicator when selecting an analytical method or laboratory.
It is important to specify what results should be reported and how:
- all results above the MDL should be reported;
- values that fall between the MDL and PQL should be "flagged" to indicate uncertainty in the value; and
- values below the MDL should be reported as non-detects, with the MDL included.
Project Management Perspective (cont.)
General Advice to Lab/Project Managers
- Have a very specific understanding of the indicator(s) you choose to use.
- Do not make any assumptions regarding the adequacy of a method, or of a data set, based on detection limits that have not been carefully defined.
- The MDL and PQL should be reported, as well as all concentrations above the MDL.
- Avoid censoring data at the PQL or RL. Get all values down to the MDL, but have the values between the MDL and PQL flagged (they are estimates and there is no confidence in their exact concentration).
What Drives the "Detection Limits"?
Regulatory requirements:
- primary drinking water quality requirements;
- risk-based goals (nutrients, toxic elements); and
- irrigation water quality requirements.
Background values for comparison:
- project-wide; or
- site-specific.
Examples of Other Sensitivity Indicators
- Detection limit (DL): commonly seen in regulations; no rigorous definition.
- Limit of detection (LOD): similar to the MDL, but with a modified statistical formula (set at 3 times the standard deviation).
- Reliable detection limit (RDL): the level where detection is extremely likely (set at 6 times the standard deviation).
- Limit of quantification (LOQ): set at 10 times the standard deviation.
- Contract-required detection limits (CRDL) and contract-required quantification limits (CRQL).
Controversies
- The definition of the MDL provided by USEPA is statistically incorrect/confusing. However, it remains the most widely documented DQI for sensitivity and one of the simplest ways to calculate a detection limit.
- There are too many definitions and indicators: DL, MDL, CRDL, PQL, LOQ, ML, ...
- Censoring of low-level values results in lost information: by not reporting data below a sensitivity indicator (e.g., the PQL), information that could potentially be used in decision making is lost.
- The correct approach depends on how the data will be interpreted and used.
Techniques for Lowering the Method Detection Limit
- Use more sample material.
- If possible, improve detector sensitivity.
- Reduce interferences:
  - during sample preparation; and
  - through instrumentation: selective AA (graphite furnace versus flame), matrix modifiers in AA to alter volatility.
Case Study 1: Calculation of the MDL
Preliminary sampling has resulted in a request for a lower detection limit for the data obtained. What are the options?
- more sample material;
- reduced interferences; or
- a more sensitive detector.
We elect to use a more sensitive detector: a graphite furnace AAS instead of flame AAS (more sensitive).
Initial Calibration of the More Sensitive Detector (graphite-furnace-equipped AAS)

Concentration (µg/L) | Response | Standard Deviation | Average | Average/Std Dev
0.02 | 383  |     |      |
0.02 | 178  |     |      |
0.02 | 400  | 124 | 320  | 2.6
0.03 | 451  |     |      |
0.03 | 500  |     |      |
0.03 | 754  | 163 | 568  | 3.5
0.05 | 1448 |     |      |
0.05 | 1178 |     |      |
0.05 | 1220 |     |      |
0.05 | 1089 | 153 | 1234 | 8.1

Evaluate the calibration data:
- the calculated MDL must be less than the spike level;
- the spike level should not be greater than 10 times the calculated MDL (prefer a spike at 1-5 times the MDL); and
- optional: pick a spike at a level where the signal/noise ratio (average/std dev) is 2.5-5.0.
Select the Spiking Level
Based on the initial calibration curve and review of the low-level standards, we chose to spike at 0.05 µg/L.

Aliquot | Instrument Response | Conc (µg/L)
1 | 1331 | 0.04997
2 | 1052 | 0.04193
3 | 1066 | 0.04234
4 | 1245 | 0.04751
5 | 1069 | 0.04244
6 | 1138 | 0.04443
7 | 1267 | 0.04812
8 | 1325 | 0.04981

Average (µg/L): 0.04582
Std Dev: 0.00342
MDL: 0.010
PQL: 0.05
Calculating the MDL
MDL = t(n-1, 0.99) x Sd
For 8 aliquots: t(n-1, 0.99) = 2.998
MDL = 2.998 x 0.0034 = 0.010
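The same arithmetic, written as a small Python sketch using the eight aliquot concentrations from the case study; the one-sided 99% Student's t quantile is taken from scipy.

```python
# Sketch of the MDL arithmetic: MDL = t(n-1, 0.99) * s for the spiked aliquots.
import statistics
from scipy import stats

aliquot_conc = [0.04997, 0.04193, 0.04234, 0.04751,
                0.04244, 0.04443, 0.04812, 0.04981]   # ug/L, from the table above

n = len(aliquot_conc)
s = statistics.stdev(aliquot_conc)                 # sample standard deviation
t99 = stats.t.ppf(0.99, df=n - 1)                  # 2.998 for n = 8

mdl = t99 * s
print(f"mean = {statistics.mean(aliquot_conc):.5f}, s = {s:.5f}")
print(f"t(n-1, 0.99) = {t99:.3f},  MDL = {mdl:.3f},  PQL ~ 5*MDL = {5 * mdl:.3f}")
```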
Some Common Mistakes
- Miscalculation of the MDL:
  - using the population (n) standard deviation instead of the sample (n - 1) standard deviation in the statistical calculations;
  - picking the wrong t statistic (df = number of samples - 1); or
  - using fewer than 7 aliquots.
- Spiking too high or too low: MDL < spike < 10 times the MDL (preferably spike at 1-5 times the MDL).
- Selecting a method based on reported MDLs without considering whether those detection limits are achievable in the true matrix.
Mistakes Are Common Everywhere
In a study by the Wisconsin DNR (1993), 23 of 56 labs were found to have incorrectly calculated the MDL. A 1998 survey found 26% of submitted lab results in Wisconsin were incorrect; only 17% of the laboratories (122 in total) reported data that met the criteria for all analytes. For this study 2,313 MDLs were reported.
Question the MDL Study Results
What are the problems with the following MDL results?

Mercury MDL study: spike = 0.2; results = 0.32, 0.31, 0.34, 0.33, 0.35, 0.34, 0.35; s = 0.0151; MDL = 0.047

Zinc MDL study: spike = 0.05; results = 0.0511, 0.0516, 0.0511, 0.0507, 0.0512, 0.0505, 0.0520; s = 0.00051; MDL = 0.0016; spike/MDL = 31
Question the MDL Study Results
- Mercury MDL study: recovery of the spiked samples is approximately 160%, indicating quantification bias in this range.
- Zinc MDL study: the spiked level is 31 times the MDL; it should be 10 times or less.
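Both checks can be scripted; the sketch below recomputes the percent recovery and the spike-to-MDL ratio for the two studies and reproduces the numbers quoted above.

```python
# Minimal sketch: sanity checks on an MDL study - spike recovery and spike/MDL ratio.
import statistics

def check_mdl_study(name, spike, results, t99):
    s = statistics.stdev(results)
    mdl = t99 * s
    recovery = 100 * statistics.mean(results) / spike
    print(f"{name}: recovery = {recovery:.0f}%, MDL = {mdl:.4f}, "
          f"spike/MDL = {spike / mdl:.0f}")

t99_7 = 3.143   # one-sided 99% Student's t for 7 aliquots (6 df)
check_mdl_study("mercury", 0.2,  [0.32, 0.31, 0.34, 0.33, 0.35, 0.34, 0.35], t99_7)
check_mdl_study("zinc",    0.05, [0.0511, 0.0516, 0.0511, 0.0507, 0.0512, 0.0505, 0.0520], t99_7)
```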
Conclusions
- Detection limit language is loose: regulations specify required DLs - do they mean IDLs, MDLs, or PQLs?
- USEPA has defined an MDL that is widely used, but statisticians debate its precise formulation.
- Procedural implementation is not standardized: labs routinely take shortcuts (e.g., they do not rerun at appropriate concentrations); miscalculations are common; and matrix effects are seldom considered or reported.
- Know what you want and how to communicate it with all parties in the process (lab-client).
Comparability: Using More Than One Data Source
Comparability
Comparability is a qualitative expression of the measure of confidence that two or more data sets may contribute to a common analysis.
A comparability DQI is a qualitative indicator of the similarity of attributes of data sets.
Common Indicators of Comparability
- Field attributes: matrix composition; sample collection method; time/season of sampling.
- Qualitative analytical attributes: sample preparation; analytical method; detection limit determination/reporting.
- Quantitative analytical attributes: spread or variability of data; commonalities in central tendency.
Comparability of New and Existing Data
Combining existing data:
- the comparability of data sets generated at different times or by different organizations must be evaluated; and
- the evaluation should establish whether two data sets can be considered equivalent with respect to the measurement of a specific variable or group of variables.
Gathering new data:
- new data must be collected so that they are comparable to existing data for key characteristics.
Existing Data Example
A regional government agency and a local Woreda town have both collected data. Are the two data sets comparable?
- both data sets were collected during the summer of 2009;
- some of the data collected were from very similar locations;
- several common chemicals were analyzed by both; and
- equivalent analytical methods were used.
Existing Data Example (cont.)
Compare: Sample Design
Are all data equally representative of the population of interest?
- random implies an equal probability of selection;
- biased implies an unequal probability of selection; and
- combining data from multiple design types may limit the usability of the data.
Compare: Temporal/Spatial Consistency
Are the data representative of the same population?
- data should come from similar time frames and locations;
- data from different times or locations are fine if equivalency across space/time is reasonable to assume; and
- data from different temporal or spatial zones may be used for trend analyses if they are comparable in other ways.
Compare: Sample Collection Methods
- Field methods: type of sampling instrument; sample collection procedure; and field splits, duplicates, or composites.
- Sample handling: filtering; preservatives or other special requirements; and time of sample preparation and time of analysis.
Compare: Variables of Interest
- Variables for grouping data - location, date, media, etc.
- Variables for determining comparability - particle size, total organic compounds, percent moisture, etc.
- Variables for analysis - reported concentration, depth to groundwater, etc.
Compare: Units of Measurement
All data sets should have units that are convertible to a common metric or SI unit:
- mg/kg, ng/g, and mg/g - acceptable;
- pCi/g and mg/L - not acceptable (unless additional information is available for the conversion).
Comparability Summary Table - Field Attributes

Data Attribute           | Data Set #1 (City: Mota OARD) | Data Set #2 (BoARD)    | Remarks
Sample collection method | Composited topsoil sample     | Grab sample            |
Matrix                   | Soil                          | Soil                   |
Sample handling          | unknown                       | Followed approved SOPs |
Sampling event           | Oct 2009                      | Sept 2008              |
Compare: Sample Preparation
Sample preparation methods should be consistent:
- sample handling times; and
- sample preparation (e.g., saturated paste vs. 1:2.5 H2O).
Laboratories sometimes differ:
- was the same laboratory used for all analyses?
- are there noticeable differences between laboratory operations and techniques?
Compare: Analytical Methods
Analytical methods should be consistent:
- Are the methods documented, and how do they perform for the intended elements under investigation?
- Are the same methods used for all analyses?
- Are the same method options used for all analyses, and how do they compare to each other?
- Are the methods capable of selecting the right analyte or fraction?
- Can the required MDL/PQL be achieved consistently?
Compare: Detection Limits
- Are the same types of detection limits reported for all data (e.g., MDL, PQL, etc.)?
- How do the detection limits compare between data sets?
- How do the detection limits compare to detected values in the data sets?
- Are the detection limits acceptable for use in decision making?
Compare: Quality Control
- Quality control of data entry: are results reported into the database in a similar manner?
- Qualification and/or validation of data: is similar QA/QC information available from all laboratories, and are the criteria consistent for data qualification or validation?
Comparability Summary Table - Analytical Attributes

Data Attribute                        | Data Set #1 (City: White Springs)            | Data Set #2 (Federal Agency) | Remarks
Sample preparation                    | unknown                                      | 1:2.5 H2O                    |
Analytical method                     | unknown                                      | Method 34-b                  |
Analytical method option              | unknown                                      | UV Spectrophotometer         |
Detection level                       | 680 - 1240                                   | 1 - 3.4                      |
Units                                 | µg/kg (ppb)                                  | mg/kg (ppm)                  | Can be converted to match
Fields of interest that were reported | Missing QC data and sample collection method | All desired fields           | Locations in different coordinates: conversion possible
Criteria for exclusion of samples     | None                                         | Rosner's test for outliers   | No data excluded based on these rules
Statistical Comparability
- Variance
- Mean or median
- Distribution
Example of Temporal Differences
Three related sites, with samples collected from 1993 to 1998; changes in site conditions were not expected over time.
Conclusion: Temporal Differences
- Thresholds of interest are 30 mg/kg and 500 mg/kg, and there are data sets outside this region; this may affect comparisons of a statistical nature.
- Combining data from across years for joint analysis is probably not appropriate (compare the 1996 data with the 1998 data, the latter being much greater in magnitude).
- The cause of the differences (bias) should be investigated.
Completeness: Analysis and Prevention
Completeness
Completeness % = 100 x (number of valid measurements / total number of measurements)
e.g., 12 field samples were collected, but at the laboratory two were found to be contaminated by a foreign substance and were rejected:
Completeness % = 100 x 10 (valid samples) / 12 (total samples) = 83%
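A trivial sketch of the same calculation:

```python
# Minimal sketch: completeness percentage for the 12-sample example above.
def completeness_pct(valid, total):
    return 100.0 * valid / total

print(f"{completeness_pct(10, 12):.0f} %")   # 83 %
```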
Causes of Incompleteness
- Loss through contamination
- Invalidation due to violation of QA protocols
- Restriction due to time constraints in obtaining samples
- Loss through QC calibration mistakes
- Physical loss through sample destruction
- Errors in field collection techniques
- Insufficient physical sample material for analysis
Qualitative Incompleteness
If the missing data are randomly dispersed throughout the study and the variability is less than expected, the DQOs established at the outset of the project may still be achieved.
If the missing data are not randomly dispersed but affect either the target area for inference, or predominantly affect one element of interest, then even a high degree of completeness may be inadequate for decision-making purposes.
The Consequences of Incompleteness
Simple sampling scheme:
- Random missing data: loss of statistical power and a general weakening of confidence in conclusions.
- Clustered or systematic missing data: loss of statistical power and weaker confidence in conclusions; loss of the assumption of representativeness.
Consequences of Incompleteness (cont.)
Complex or statistical sampling scheme:
- Random missing data: loss of statistical power and a weakening of confidence in conclusions; possible severe compromising of results depending on where the missing data occurred.
- Clustered or systematic missing data: severe compromising of results; probable abandonment of the study.
Corrective Actions for Incompleteness
- Minor incompleteness in a simple random sample: additional sample collection to achieve DQOs; analysis of the effect on false decision errors for the DQOs.
- Minor incompleteness in a complex sample: additional sample collection to achieve DQOs; statistical imputation (repair schemes) necessary.
- Minor incompleteness in sample analyses: additional re-analyses from existing samples.
- Major incompleteness in sample analyses: additional sample collection to achieve DQOs; statistical correlation analysis of surrogates.
Corrective Actions for Incompleteness (cont.)
- Incompleteness due to failure of QC requirements: statistical readjustment of the data; expert opinion from the QA manager.
- Incompleteness due to violated QA protocols: expert opinion from the QA manager.
- Incompleteness due to lack of representativeness: expert opinion on the potential impact on DQO decisions.
Data Verification, Data Validation, and Data Integrity
Definitions of Verification and Validation
- DATA VERIFICATION: evaluating the completeness, correctness, and conformance or compliance of a specific data set against the method, procedural, or contract requirements (a lab staff activity).
- DATA VALIDATION: confirming that the particular requirements for a specific intended use are fulfilled by examination and provision of objective evidence (a third-party activity); a sample-specific process extending data evaluation beyond method, procedural, or contractual compliance to determine the analytical quality of a specified data set.
Definition of Integrity
- DATA INTEGRITY: assurance that individuals and organizations involved in the collection and analysis of laboratory measurements have performed their activities in accordance with appropriate ethical standards, and that the resulting data sets are free from deliberate falsification.
Goals of Data Verification
- To document the steps that were taken to collect or analyze laboratory measurements.
- To confirm compliance with applicable technical requirements.
Who is Responsible for Verification?
Verification is part of an organization's management process, and multiple parties are involved:
- those involved with the data should document their actions;
- peers should observe and confirm performance of tasks;
- supervisors/managers should review processes and outputs; and
- verification can also be external (performed by an independent party) or internal (performed by a specially assembled group or team).
The Use of a Graded Approach
There is no "one size fits all" formula for verification; verification steps may vary according to project needs. Variables can include:
- the amount and type of documentation;
- the extent of observation by peers;
- the number of levels of management review; and
- the role of third parties.
Typical Verification Activities
- Confirm that the practices required by method, procedural, or contract requirements were actually employed;
- Perform an independent double-check of calculations;
- Confirm that field or lab notes are consistent with formal reports;
- Assess analytical and field uncertainty; and
- Check the operation of the Laboratory Information Management System (LIMS).
DQIs to Consider in Verification
- Completeness: Were the desired number of samples/data points obtained, along with the corresponding metadata?
- Comparability: Can it be confirmed that all samples were analyzed using comparable methods?
- Representativeness: Has the planned sampling design been followed appropriately?
Verification Outputs
- Documented field report;
- Verified laboratory data package; and
- Verified and documented data.
What Verification Does/Doesn't Achieve
Does:
- document performance; and
- confirm compliance with SOPs, methods, contract provisions, or similar requirements.
Doesn't:
- resolve data quality issues; or
- determine data usability.
Goals of Data Validation
- An independent scrutiny of the processes for collecting and analyzing laboratory-related activities and measurements; and
- An evaluation and qualification of the resulting data with respect to their intended use.
Who Should Do Data Validation?
- An independent party (individual or organization) reporting to the data user.
- If both field and laboratory validation are pursued, they are likely to be conducted by different parties.
Graded Approach for Validation
There is no "one size fits all" method of validation; the depth of validation depends on the goals of the project and the eventual use of the data. Variables can include:
- the percentage of data validated;
- double-checking of calculations;
- formal "flags" versus narrative evaluations of data; and
- the extent of scrutiny of pre-analytical steps (e.g., sample preparation).
Typical Validation Activities
- Review records for consistency and completeness;
- Evaluate analytical method performance;
- Inspect data packages and draw conclusions about data quality;
- Assign or review data qualification codes; and
- Review data against project requirements.
DQIs to Consider in Validation
- Sensitivity: Were the desired detection limits achieved?
- Bias: Do recoveries meet performance goals?
- Precision: Were the performance goals established in the QA Project Plan met?
Validation Outputs
- Validation reports for the data user or data usability assessor; and
- Documented, qualified (flagged) data.
What Validation Does/Doesn't Achieve
Does:
- provide independent confirmation that requirements for a specific intended use were fulfilled; and
- evaluate the quality (versus MQOs) of data points/data sets.
Doesn't:
- resolve data usability issues.
Integrity: a Vital Issue
- Every component of the data collection process is subject to manipulation;
- Lapses in integrity can lead to dire consequences; and
- Past history has demonstrated that individual and organizational integrity cannot simply be assumed.
Consequences of a Lack of Integrity
- Delays in completion of projects;
- Cost overruns/rework;
- Loss of public credibility by affected agencies/institutions;
- Continuing risks to the environment or natural resources; and
- Potential civil or criminal penalties for offenders.
Improper Field Activities
- Failure to collect required samples or specimens from the proper locations;
- Tampering with collected samples; and
- Misrepresentation of field procedures and documentation.
Improper Laboratory Activities
- Failure to analyze samples;
- Manipulation of samples prior to analysis;
- Failure to conduct required analytical steps;
- Manipulation of results during or after analysis; and
- Post-analysis alteration of records.
Falsification is Difficult to Detect
- Its purpose is to "disguise" records such as data packages - to hide itself from scrutiny;
- Some practices are legitimate in certain cases but improper in others;
- Falsification can be conducted by isolated individuals or on an organization-wide basis; and
- Once one type of falsification can be detected, other types can take its place.
Detection Tools
- Data validation;
- Site audits of management and technical systems;
- Data audits;
- Review of data tapes/disks;
- Performance evaluation (PE) and split sampling programs; and
- Review of the LIMS audit trail when available.
BUT, no single tool can guarantee detection.
Warning Signs of a Breakdown in Integrity
- Inconsistencies in dates and times;
- Evidence of "special handling" of QC or PE samples;
- Indications of instrument manipulation; and
- Evidence of retroactive editing of data.
Is Lack of Integrity Universal?
- No, but serious violations do occur;
- Be conscious of the possibility;
- The lab industry is aware of the problem and is putting effort into ethics training; and
- Most soil, plant, and water analysis data from laboratories are the result of proper analyses with appropriate QA protocols; only a few are not.
Summary
PARCCS & VVI
Quantitative DQIs:
- Precision
- Bias (Accuracy)
- Sensitivity
Qualitative DQIs:
- Representativeness
- Comparability
- Completeness
Associated DQIs:
- Verification
- Validation
- Integrity
Precision
Precision is the measure of agreement among repeated measurements of the same property under identical or substantially similar conditions. A precision DQI is a quantitative indicator of the random errors or fluctuations in the measurement process.
Bias
Bias is systematic or persistent distortion of a measurement process that causes error in one direction. A bias DQI is a quantitative indicator of the magnitude of systematic error.
Sensitivity
Sensitivity is the capability of a method or instrument to discriminate between measurement responses representing different levels of the variable of interest. A sensitivity DQI describes the capability of measuring a constituent at low levels.
Comparability
Comparability is a qualitative expression of the measure of confidence that two or more data sets may contribute to a common analysis. A comparability DQI is a qualitative indicator of the similarity of attributes of data sets.
Completeness
Completeness is a measure of the amount of valid data obtained from a measurement system, expressed as a percentage of the number of valid measurements that should have been collected. The DQI for completeness is often expressed as a percentage.
Representativeness
Representativeness is the measure of the degree to which data suitably represent a characteristic of a population, parameter variations at a sampling point, a process condition, or an environmental condition. Representativeness DQIs are qualitative and quantitative statements regarding the degree to which data reflect the true characteristics of a well-defined population.
Verification
Data verification refers to the procedures needed to ensure that a set of data is a faithful reflection of all the processes and procedures used to generate the data.
Validation
Data validation is an analyte and sample matrix-specific process to determine the analytical quality of a specific data set.
Integrity
Lack of integrity affects all aspects of data interpretation, especially data used for decision making.
What are the Most Important DQIs?
Representativeness: If the samples are not representative of the environmental conditions, how much reliability can be given to the results of the analyses? The samples must be representative of the field, and the analytical work performed must be representative of the sample sent for analysis.
Integrity: If the samples have been tampered with, how much reliability can be given to the results of the analyses? The samples must have come from where they were planned to come from, and the analytical work must have been performed properly.
Questions?