Data Quality Indicators (DQIs)
What are they, and how do they affect me? A US-EPA Approach
[Title graphic: the PARCCS acronym - Precision, Accuracy, Representativeness, Comparability, Completeness, Sensitivity]
DQIs Defined
DQIs are quantitative and qualitative measures of the principal quality attributes:
- Precision;
- Bias;
- Representativeness;
- Comparability;
- Completeness; and
- Sensitivity.
Quantitative DQIs: precision, bias, and sensitivity.
Qualitative DQIs: representativeness, comparability, and completeness.
The Hierarchy of Quality Terms
- DQOs: qualitative and quantitative study objectives
- Attributes: descriptive qualitative and quantitative aspects of collected data
- DQIs: indicators of the quality attributes
- MQOs: acceptance criteria for the quality attributes, measured by project DQIs
Precision
Precision is the measure of agreement among repeated measurements of the same property under identical or substantially similar conditions.
A precision DQI is a quantitative indicator of the random errors or fluctuations in the measurement process, e.g., the standard deviation or variance.
Bias
Bias is systematic or persistent distortion of a measurement process that causes error in one direction.
A bias DQI is a quantitative indicator of the magnitude of systematic error resulting from:
- biased sampling design;
- calibration errors;
- response factor shifts;
- unaccounted-for interferences; and
- chronic sample contamination.
e.g., instrument reads XX mg/L too high
Accuracy
Accuracy is composed of precision and bias. Accuracy is a measure of the overall agreement of a measurement to a known value:
- when random errors are tightly controlled, bias dominates the overall accuracy; and
- when random errors predominate, variance dominates the overall accuracy.
Influence of Bias and Imprecision on Overall Accuracy
[Target diagrams illustrating the four cases:]
- Imprecise and biased
- Imprecise and unbiased
- Precise and biased
- Precise and unbiased
Representativeness
Representativeness is the measure of the degree to which data suitably represent a characteristic of a population, parameter variations at a sampling point, a process condition, or an environmental condition.
Representativeness DQIs are qualitative and quantitative statements regarding the degree to which data reflect the true characteristics of a well-defined population.
e.g., these samples are representative of the surface soil to be found in a specific area of XX square meters.
Comparability
Comparability is a qualitative expression of the measure of confidence that two or more data sets may contribute to a common analysis.
A comparability DQI is a qualitative indicator of the similarity of attributes of data sets.
e.g., soil salinity or soil acidity data sets are comparable as they share a common preparation and analytical method operated under similar conditions.
Completeness
Completeness is a measure of the amount of valid data obtained from a measurement system, expressed as a percentage of the number of valid measurements that should have been collected.
The DQI for completeness is often expressed as a percentage, e.g., the percentage of valid samples for which data for all analytes of interest were reported.
Sensitivity
Sensitivity is the capability of a method or instrument to discriminate between measurement responses representing different levels of the variable of interest.
Sensitivity can be regarded as a detection limit, but this term is often used without defining what is intended (minimum detection or quantitation).
A sensitivity DQI describes the capability of measuring a constituent at low levels. A Practical Quantitation Level (PQL) describes the ability to quantify a constituent with known certainty.
e.g., a PQL of 0.05 mg/L for mercury represents the level where a precision of +/- 15% can be obtained.
Verification
Data verification refers to the procedures needed to ensure that a set of data is a faithful reflection of all the processes and procedures used to generate the data.
Verification involves the examination of objective evidence that the specified method, procedures, and contractual requirements were fulfilled.
Validation
Data validation is an analyte- and sample-matrix-specific process to determine the analytical quality of a specific data set.
Validation entails the inspection of data handling practices for deviations from consistency, the review of quality control (QC) information for deviations, assessment of deviations, and assignment of data qualification codes.
Validation can entail the examination of the data with respect to the QA Plan.
Integrity
Lack of integrity affects all aspects of data interpretation, especially data used for decision making.
Lack of integrity includes:
- manipulation of QC measurements;
- dry-labbing (complete falsification of data);
- manipulation of results during analysis;
- failure to conduct required analytical steps; and
- post-analysis alteration of results.
After Verification and Validation
The set of data is then analyzed by comparing the results to the original objectives. In many cases this is a comparison of the results to the DQOs using data quality assessment.
Data quality assessment is a five-step process:
1. Review of DQOs and sample design
2. Preliminary data review
3. Selection of statistical test
4. Verification of assumptions
5. Drawing conclusions from the data
But that is another course altogether!
Representativeness: Statistical and Conceptual Model-Based Approaches
Representativeness
Representativeness is the measure of the degree to which data suitably represent a characteristic of a population, parameter variations at a sampling point, a process condition, or an environmental condition.
Representativeness DQIs are qualitative and quantitative statements regarding the degree to which data reflect the true characteristics of a well-defined population.
What Does "Representativeness" Mean?
Very vaguely defined in working English:
- a "seal of approval" given by simple statement of the writer;
- there is an absence of biasing forces;
- it is a miniature or replica of the population;
- it is a typical or ideal case;
- there is wide coverage of the population;
- it enables good estimation;
- it is good enough for the purposes of the study; or
- it results from a statistically based sampling method.
Different Definitions of "Representativeness"
- "...expected to exhibit the average properties of the universe or whole"
- "...should be selected on the basis of spatial and temporal representativeness"
- "...samples should be representative of daily operations"
Achieving Representativeness Involves a Process
Planning, design, and assessment:
- careful attention to the measurement and analytical process;
- consideration of the size (amount of material) and method for sample collection and handling;
- determination of adequate type, location, timing, and number of samples to be taken; and
- a defensible approach for drawing inferences from sample data to the target population.
Sample design and measurement processes should minimize unintentional bias.
The Process Involves Evaluating Both Micro and Macro Scales
- Micro scale: how well measurements taken within a sampling unit reflect that unit (e.g., "parameter variations at a sampling point")
- Macro scale: the degree to which measurements from a set of sampling units reflect the population of interest (e.g., "accurately and precisely represent a characteristic of a population")
Micro Scale (Within-Sampling-Unit) Representativeness
- An appropriate quality system to ensure quality implementation and sample integrity;
- Carefully defined sampling units with correct sampling procedures and equipment;
- Adequate sample support (amount of material) to make inferences about the characteristics within the sampling unit; and
- Appropriate analytical methods (including sample preparation), designed to achieve MQOs for measurement precision, bias, and sensitivity.
What is a Sampling Unit?
A sampling unit (SU) can be defined as the portion of the environment for which a measurement has meaning for its intended use.
Defining SUs for a project allows us to communicate more clearly about components of total-study precision.
SUs can vary depending on the specific problem; they can be:
- as small as the physical sample itself;
- something encompassing multiple physical samples; or
- something much larger.
In classical survey design (e.g., an opinion survey) the SU is typically an individual.
Specifying Sampling Units
SUs are less well defined in other types of surveys (e.g., in a survey to determine soil salinity levels); in this case, a soil sample is much smaller than the area it represents - is the sampling unit the topsoil sample, a pedon, or the farm as a whole?
Consider how data will be used: an average over multiple units, the spatial distribution of units, or some combination?
Default SU definition: equivalent to the physical sample (soil, water, or plant specimen) taken.
Alternative SU definitions:
- units comprised of multiple samples to obtain enough of the medium to perform all desired analyses;
- units of a size adequate to collect multiple specimens (such as composite samples); or
- units defined to include a group of samples when individual samples are not the unit of interest.
Alternative Sampling Unit Definitions
Choice of Sampling Unit - What Does a Sample Represent?
[Illustration: a 7.5 cm core, a 1-ha area, or a small farm]
Sampling Theory: Within-SU Error
To what degree is heterogeneity within a sampling unit inherent?
Gy refers to this as the "constitution heterogeneity." No amount of mixing or homogenization can reduce it.
Constitution heterogeneity leads to fundamental error.
Fundamental errors are negligible for liquids and gases without suspended solids, but are significant in soil and any other solids.
Sampling Theory: Within-SU Error
What is the distribution and variance between small increments of the media?
Gy refers to this as "distribution heterogeneity," which reflects the distribution of groups of some number of neighboring fragments.
Grouping and segregation errors result from distribution heterogeneity; minimize these errors by taking more increments to form a sample of the required weight.
Heterogeneity of Pollutants Can Lead to Sampling Errors
- h1 = small-scale heterogeneity (random fluctuations)
- h2 = large-scale heterogeneity (trends, nonrandom, bias)
- h3 = cyclic phenomena
- h = h1 + h2 + h3
Each of these components of heterogeneity leads to errors.
Experiments to characterize these components (using variograms) allow one to optimize a design; a sketch of such a calculation follows.
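To make the variogram idea concrete, here is a minimal Python sketch of an empirical semivariogram computed from samples along a transect. The positions, soil-salinity values, lag distances, and tolerance are hypothetical illustration choices, not values from the study.

```python
# Minimal sketch: empirical semivariogram for samples along a transect,
# used to characterize small-scale, trend, and cyclic heterogeneity.
# Positions and concentrations are hypothetical illustration data.
import numpy as np

def semivariogram(positions, values, lags, tol):
    """gamma(h) = 0.5 * mean[(z_i - z_j)^2] over pairs separated by ~h."""
    positions = np.asarray(positions, dtype=float)
    values = np.asarray(values, dtype=float)
    gammas = []
    for h in lags:
        d = np.abs(positions[:, None] - positions[None, :])      # pair distances
        sq = (values[:, None] - values[None, :]) ** 2             # squared differences
        mask = np.triu(np.abs(d - h) <= tol, k=1)                 # pairs within tol of lag h
        gammas.append(0.5 * sq[mask].mean() if mask.any() else np.nan)
    return np.array(gammas)

# hypothetical soil-salinity readings every 10 m along a 200 m transect
rng = np.random.default_rng(0)
x = np.arange(0, 200, 10.0)
z = 2.0 + 0.01 * x + rng.normal(0, 0.3, x.size)   # trend plus random noise

lags = [10, 20, 40, 80]
print(dict(zip(lags, np.round(semivariogram(x, z, lags, tol=5.0), 3))))
```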
Controlling Sampling Errors
- Ensure the field sampling protocol does not distort or bias samples: it should be capable of ensuring all parts of the media (e.g., all particle sizes) have the same probability of being included in the increment obtained to form a sample.
- Ensure the laboratory subsamples represent all the particle-size fractions: subsamples must be large enough (optimal sample weight) to accommodate the range of particle sizes.
- Samples and subsamples should be comprised of as many correctly obtained increments as possible.
Questions Raised by Sampling Theory Related to "Within-SU Error"
- What is the correct scale at which to sample?
- What is the correct protocol for obtaining increments to form samples of the media of interest?
Questions Raised by Sampling Theory Related to "Within-SU Error" (cont.)
- Pilot studies are needed to determine the nature of the heterogeneity. If, for example, soil salinity areas are highly clustered on a scale smaller than the scale of real concern, small grabs will reveal varied results.
- If homogenization and sub-sampling do not remove clustering, representation of a sampling unit from a single sample will not be achievable.
- Sampling protocols should be selected that do not alter the characteristics of the media (e.g., particle-size composition).
Classical Statistical Approach
Define the population of interest:
- spatial and temporal boundaries; and
- sampling units.
Develop a statistical sampling plan:
- in a probability-based design, every sampling unit has a known probability of inclusion.
Evaluate the process for drawing inferences from data:
- how well the sampling units selected represent the population under study;
- how data will be used to estimate target population parameters such as the mean and variance; and
- how well the sampled population provides information on the subject in question.
Strategies for Improving Within-SU Representativeness
- Use within-sampling-unit replication: averaging n replicates reduces the variability of the average by a factor of 1/sqrt(n) (see the sketch below).
- Use within-sampling-unit compositing: increasing the number of increments in the sample reduces the variability of the unit average.
- Increase the sample support area or volume: expanding the definition of what area or volume the analytical measurement will represent can alleviate small-scale (or short-term) fluctuations.
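The 1/sqrt(n) effect of replication can be checked numerically. The sketch below simulates sampling units measured with a hypothetical within-unit standard deviation of 0.15 and compares the observed standard error of the n-replicate average with sigma/sqrt(n); the numbers are illustrative only.

```python
# Minimal sketch: the standard error of a within-sampling-unit average
# shrinks by a factor of 1/sqrt(n) as replicates (or increments) are added.
import numpy as np

rng = np.random.default_rng(1)
sigma_within = 0.15          # hypothetical within-sampling-unit std deviation
true_value = 2.0

for n in (1, 4, 9, 16):
    # simulate many sampling units, each averaged over n replicates
    means = rng.normal(true_value, sigma_within, size=(10_000, n)).mean(axis=1)
    print(f"n = {n:2d}  observed SE = {means.std(ddof=1):.4f}  "
          f"theory sigma/sqrt(n) = {sigma_within / np.sqrt(n):.4f}")
```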
Statistical Strategies for Improving Between-Sampling-Unit Representativeness
Statistical sampling schemes (one of which is sketched below):
- simple random sampling
- systematic (grid) sampling
- stratified random sampling
- ranked set sampling
- cluster sampling
- between-sampling-unit composite sampling
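As a concrete illustration of one scheme from the list, here is a minimal sketch of stratified random sampling of sampling units. The stratum names, unit labels, and per-stratum allocations are hypothetical.

```python
# Minimal sketch: stratified random sampling of sampling units.
# Each unit within a stratum has a known, equal chance of selection,
# which is what makes the design probability-based.
import random

random.seed(42)
strata = {                      # stratum -> list of candidate sampling units (hypothetical)
    "floodplain": [f"FP-{i:02d}" for i in range(1, 21)],
    "upland":     [f"UP-{i:02d}" for i in range(1, 31)],
}
allocation = {"floodplain": 4, "upland": 6}   # samples per stratum (hypothetical)

plan = {s: sorted(random.sample(units, allocation[s])) for s, units in strata.items()}
for stratum, chosen in plan.items():
    print(stratum, chosen)
```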
Balanced Design to Achieve Representativeness
Understand the relative contribution of within-sampling-unit and between-sampling-unit variance; focus on the components of variance to which the total variability is most sensitive:
- more samples to lower between-sampling-unit variance; or
- more precise measurements to lower within-sampling-unit variance.
Assessing Representativeness
Evaluating existing data:
- representativeness affects the degree to which a data set can be used for a purpose other than originally intended;
- use of a checklist promotes a thorough evaluation of the attributes of representativeness; and
- use of quality assessment samples such as duplicates, splits, or other replicates can assist in answering questions about within-sampling-unit representativeness.
Important Attributes (Micro-level)
- Was a rationale provided to support the selection of sampling equipment and handling procedures? The correct choice of equipment and handling procedures directly affects the degree to which the increments and samples reflect the characteristics of the matrix.
- Was a rationale provided to support the selection of analytical methods? The choice of sample preparation and analytical instrument is critical.
- Were samples collected from all selected sampling units? Incomplete sampling, if biased due to the lack of completeness, can lead to incorrect conclusions.
Important Attributes (Macro-level)
- Were study objectives adequately defined using the DQO process or an equivalent planning process? The intended use of the data provides the context for evaluating representativeness.
- Was the population of interest clearly defined? Probability-based designs require the population to be defined as a set of sampling units.
- Was the statistical basis for the sampling plan explained (number of samples and their allocation)? Representativeness hinges on an adequate number of samples, and different sample allocation approaches can maximize effectiveness.
Precision Indicators Reflective of the Data Collection Life Cycle
[Diagram: Planning - Implementation - Assessment]
Precision
"Precision is the measure of agreement among repeated measurements of the same property under identical or substantially similar conditions."
Properties in soil studies include:
- the concentration of a constituent, say nitrogen; or
- a physical measurement (e.g., grain size) of the soil media.
A precision DQI is a quantitative indicator of the random errors or fluctuations in the measurement process.
Common Indicators of Precision
- Range: the difference between the largest and smallest values.
- Variance or standard deviation: a statistical measure of the spread of data calculated from two or more measured values; the standard deviation is the square root of the variance.
- Relative range: the range divided by the mean of the data set.
- Relative standard deviation (CV): the standard deviation calculated from two or more values divided by the mean of those values.
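These indicators are straightforward to compute; a minimal sketch using illustrative replicate values follows.

```python
# Minimal sketch: range, variance, standard deviation, relative range,
# and relative standard deviation (CV) for a set of replicate measurements.
import statistics

replicates = [22.0, 20.0, 23.5, 21.2]         # illustrative repeated results, ppm

mean = statistics.mean(replicates)
rng_ = max(replicates) - min(replicates)       # range
var = statistics.variance(replicates)          # sample variance (n - 1)
sd = statistics.stdev(replicates)              # standard deviation = sqrt(variance)

print(f"range               = {rng_:.2f}")
print(f"variance            = {var:.3f}")
print(f"standard deviation  = {sd:.3f}")
print(f"relative range      = {rng_ / mean:.3f}")
print(f"relative std dev/CV = {100 * sd / mean:.1f} %")
```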
Framework for Evaluating Indicators of Precision
A simple model allows us to evaluate the components and indicators of total-study variability.
Within-sampling-unit variability:
- measurement process;
- small-scale variability; and
- sample acquisition.
Between-sampling-unit variability:
- inherent spatial variability; and
- sampling design error.
Simple Total-Study Variability Model
[Diagram: Total-Study Variability = Within-Sampling-Unit Variability + Between-Sampling-Unit Variability]
[Diagram detail: within-sampling-unit variability comprises small-scale variability (within unit) and sample collection and measurement process variability; between-sampling-unit variability comprises inherent spatial variability (among units) and sampling design error]
Sampling Units
A sampling unit (SU) can be defined as the portion of the natural environment (soil, water, plant) for which a measurement has meaning for its intended use.
Defining SUs for a soil, water, or plant sampling project allows us to communicate more clearly about components of total-study precision.
SUs can vary depending on the specific problem; they can be:
- as small as the physical sample itself;
- something encompassing multiple physical samples; or
- something much larger.
In classical survey design (e.g., an opinion survey) the SU is typically an individual. SUs are less well defined in other types of surveys (e.g., in a survey to determine soil salinity levels); in this case, a soil sample is much smaller than the individual - is the sampling unit the soil sample, the pedon, the farm, or the project?
Consider how data will be used: an average over multiple units, the spatial distribution of units, or some combination?
Specifying SUs
Default SU definition: equivalent to the physical sample taken.
Alternative SU definitions:
- units comprised of multiple samples to allow for obtaining enough of the medium to perform all desired analyses;
- units of a size adequate to collect multiple samples (such as collocated samples); or
- units uniquely defined to measure properties of interest when a sample is not the unit of interest, nearby samples are highly correlated, or there is an explicit desire to control the precision within the unit.
Alternative Sampling Unit Definitions
Evaluating Sampling Unit Definitions
Defining SUs larger than the physical sample has some potential benefits; it:
- clarifies whether collocated samples should be treated as additional field samples or as replicates;
- forces us to consider the scale at which measurements have meaning; and
- facilitates a more comprehensive consideration of the sources of error affecting our understanding of properties of interest, and the sources of variability affecting individual measurements.
Most study designs do not account for within-sampling-unit variability in any explicit way; tradeoffs between fewer precise measurements versus more imprecise measurements begin to address the issue.
Sampling Theory Raises Important Questions Related to Within-SU Error
- What is the correct scale at which to sample?
- What is the correct protocol for obtaining samples?
- Pilot studies are needed to determine the nature of the heterogeneity: if the concentration of analytes is highly clustered on a scale smaller than the scale of real concern, small grabs will reveal varied results.
- If homogenization and sub-sampling do not remove clustering, representation of a sampling unit from a single sample will not be achievable.
- Sampling protocols should be selected that do not alter the characteristics of the media (e.g., particle-size composition).
Components of Within-Sampling-Unit Precision
[Diagram: within-sampling-unit variability comprises small-scale variability (within unit) and sample collection and measurement process variability; the latter splits into within-sample variability (inherent small-scale variability, subsampling or homogenization) and measurement method imprecision (sample handling and preparation, analytical instrument)]
QA Samples Used to Evaluate Components of Total-Study Variability
Total Within-Sampling-Unit Precision Pyramid
Example of Total-Study Variability Components

QC Sample            | Components of Variability Captured                                                | Estimated Standard Deviation
Instrument replicate | Instrument response (IR)                                                          | 0.046
Laboratory replicate | IR + subsampling and extraction/digestion (S&E)                                   | 0.11
Laboratory split     | IR + S&E + lab homogenization (LH)                                                | 0.12
Field split          | IR + S&E + LH + sample handling (SH)                                              | 0.12
Collocated samples   | IR + S&E + LH + SH + field sample acquisition and small-scale variability (A&SV)  | 0.15
Field samples        | IR + S&E + LH + SH + A&SV + between-sampling-unit variability                     | 1.11

* Data are from an actual lab/field study at a New Mexico laboratory in the USA.
A Simple Additive Variance Model
σ²t = σ²b + σ²w
σ²t = σ²b + σ²m + σ²s
where:
- t = total study
- w = within-sampling-unit
- b = between-sampling-unit
- m = measurement
- s = small-scale variability (e.g., field duplicate)
- σ² = variance
Visualizing the Contribution of Components of Total-Study Variance
[Diagram: total-study variability (field samples, σt) partitioned into between-sampling-unit variability (σb) and within-sampling-unit variability (σw, estimated from collocated samples)]
Visualizing the Within-Sampling-Unit Components of Variability
[Diagram: within-sampling-unit variability σw = 0.15 (collocated samples), with contributions from small-scale variability (~0.10), measurement variability (lab replicates, 0.11), sample-preparation variability (~0.10), and analytical-instrument variability (instrument replicates, 0.046)]
Calculating Variance
- Total variance: estimated from all samples.
- Within-sampling-unit variance: estimated from duplicates (e.g., field splits) or from multiple replicates.
- Between-sampling-unit variance: the total variance minus the within-sampling-unit variance.
Calculating Components of Variance from an Existing Data Set

Sample ID | Arsenic (ppm) | Cadmium (ppm) | Lead (ppm) | Sample Type
99-7510   | 3.9  | <0.07 | 22    |
99-7510   | 2.2  | <0.06 | 20    | lab replicate
99-7511   | 3.1  | <0.07 | 104   |
99-7512   | 2.6  | 5.3   | 37.6  |
99-7512a  | 2.0  | 5.9   | 34.8  | field split
99-7513   | 2.4  | <0.07 | 782   |
99-7513   | 4.6  | <0.07 | 829   | lab replicate
99-7514   | 2.5  | 0.47  | 35.9  |
99-7514a  | 3.1  | <0.07 | 37.7  | field split
99-7515   | 4.4  | 1.4   | 17.5  |
99-7515a  | 2.9  | 1.6   | 28.3  | field split
99-7516   | 3.2  | 4.5   | 55.2  |
99-7517   | 2.8  | 5.1   | 921   |
99-7517   | 3.5  | 5.6   | 902   | lab replicate
Standard deviation | 0.78 | 2.5 | 390 |
Calculating Components of Variance from an Existing Data Set (cont.)
- Within-sampling-unit variance for lead may be calculated using the field split data.
- Based on all samples, the total-study variance for lead is estimated from the ordinary sample variance.
- Between-sampling-unit variance for lead is then calculated as the difference between the total and within-sampling-unit variances.
A sketch of these calculations follows.
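The slide's formula images are not reproduced above, so the sketch below is a plausible reconstruction of the arithmetic using the lead (ppm) values from the data table: within-sampling-unit variance from the field-split duplicate pairs, total variance as the ordinary sample variance of all reported results, and between-sampling-unit variance by subtraction. It reproduces the values shown in the next slide (about 21.3, 148,897, and 148,876).

```python
# Plausible reconstruction of the variance-component arithmetic for lead.
# s_w^2 = sum(d_i^2) / (2k) over k duplicate pairs; s_t^2 = sample variance
# of all reported results; s_b^2 = s_t^2 - s_w^2.
import statistics

lead_all = [22, 20, 104, 37.6, 34.8, 782, 829, 35.9, 37.7,
            17.5, 28.3, 55.2, 921, 902]                      # all reported lead results
field_splits = [(37.6, 34.8), (35.9, 37.7), (17.5, 28.3)]    # field-split duplicate pairs

s2_within = sum((a - b) ** 2 for a, b in field_splits) / (2 * len(field_splits))
s2_total = statistics.variance(lead_all)                     # sample variance (n - 1)
s2_between = s2_total - s2_within

print(f"within-unit  s^2 = {s2_within:,.1f}")                # ~21.3
print(f"total-study  s^2 = {s2_total:,.0f}")                 # ~148,897
print(f"between-unit s^2 = {s2_between:,.0f}")               # ~148,876
```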
Using Indicators of Variance

     | Arsenic | Cadmium | Lead
s²w  | 0.25    | 0.13    | 21.3
s²b  | 0.36    | 6.01    | 148,876
s²t  | 0.61    | 6.14    | 148,897

For cadmium and lead, the total variability is dominated by between-sampling-unit variability (i.e., site heterogeneity). Arsenic is probably near background - most of its variability comes from the measurement process.
Establishing MQOs
- Decomposing total-study variance facilitates the identification of the relative importance of the components of total error; this exercise also helps determine what kind of QA samples to employ.
- Total-study variance estimates are plugged directly into sample-size calculations.
- Individual Measurement Quality Objectives (MQOs) should be established for the components of variance that primarily drive the total variability.
- MQOs on specific measurement components must reflect the requirements for total-study error.
Strategies for Reducing Within-Sampling-Unit Variance
- Replication
- Small-scale compositing
- Increasing sample support
- A more precise measurement method
Bias: Analysis and Prevention
Bias
Bias = measured result - true value
Relative bias = (measured result - true value) / true value
When dealing with recovery rates:
Recovery = 1 + (measured result - true value) / true value
expressed as a percentage.
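A minimal sketch of these bias and recovery expressions, applied to a hypothetical spiked-sample result (the 0.050 and 0.046 values are made up for illustration):

```python
# Minimal sketch: bias, relative bias, and percent recovery.
def bias(measured, true):
    return measured - true

def relative_bias(measured, true):
    return (measured - true) / true

def percent_recovery(measured, true):
    return 100.0 * (1.0 + (measured - true) / true)   # equals 100 * measured / true

true_value = 0.050     # spike concentration (hypothetical)
measured = 0.046       # laboratory result (hypothetical)

print(f"bias          = {bias(measured, true_value):+.4f}")
print(f"relative bias = {relative_bias(measured, true_value):+.1%}")
print(f"recovery      = {percent_recovery(measured, true_value):.0f} %")
```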
Principal Causes of Bias
- Incomplete data
- Analytical: calibration error, sample contamination, matrix effects, interferences
- Sampling: incorrect location identification, judgmental sampling scheme
Bias Due to Incompleteness
Example: the objective is to estimate the percentage of correctly documented permits for exemption. Data were obtained by asking permit holders to respond; 60% responded to the request. Of these responses, 70% were correctly documented. Does the 40% non-response rate really matter?
True percentage = (respondent fraction x their percentage) + (non-respondent fraction x their percentage)
Bias = non-respondent fraction x (difference in percentages)
- If non-responses were 70% correctly documented: bias = 0%, correct estimate is 70%.
- If non-responses were 50% correctly documented: bias = 8%, correct estimate is 62%.
- If non-responses were 30% correctly documented: bias = 16%, correct estimate is 54%.
- If non-responses were 10% correctly documented: bias = 24%, correct estimate is 46%.
Bias Due to Incomplete Response
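The non-response arithmetic above can be reproduced directly; the sketch below recomputes the true percentage and the bias for each assumed non-respondent documentation rate.

```python
# Minimal sketch: bias introduced by non-response, reproducing the example above.
response_rate = 0.60
respondent_pct = 70.0          # % correctly documented among respondents

for nonrespondent_pct in (70.0, 50.0, 30.0, 10.0):
    true_pct = (response_rate * respondent_pct
                + (1 - response_rate) * nonrespondent_pct)
    bias = (1 - response_rate) * (respondent_pct - nonrespondent_pct)
    print(f"non-respondents {nonrespondent_pct:>4.0f}% correct -> "
          f"true = {true_pct:.0f}%, bias = {bias:.0f}%")
```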
Calibration Errors Leading to Bias
Matrix Effects Leading to Bias
The composition of the matrix can influence both preparation and analysis. Non-ideal chemical behavior influences samples differently than standards.
Method of Standard Addition (MSA)
[Plot: instrument response versus added standard, extrapolated to the concentration in the sample]
Atomic absorption interferences:
- spectral: cannot resolve the analyte from other species; and
- chemical: chemical processes alter the absorption characteristics of the analyte.
Possible resolutions: successive serial dilutions, matrix modification, MSA.
Interferences Leading to Bias
Sample Handling Errors Leading to Bias
Loss of sample during collection or storage; inadequate preservation (acid, darkness, cooling, excessive holding time).
Example: metals require acidification to prevent or minimize precipitation and adsorption to the sample container.
Why Bias, Why Not Accuracy?
Accuracy includes both precision (random error that could be positive or negative for each individual reading) and bias (systematic error that is either positive or negative for all readings).
Accuracy (mean square error) = variance + bias²
Precision is estimated through replicate measurements. Bias is estimated by comparison of the mean of replicate measurements to a known standard. Without standards, bias cannot be estimated with confidence; only a reduction in bias is possible.
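The decomposition "accuracy (mean square error) = variance + bias²" can be demonstrated with simulated measurements of a known standard; the bias and standard deviation used below are arbitrary illustration values.

```python
# Minimal sketch: mean square error equals variance plus bias squared.
import numpy as np

rng = np.random.default_rng(7)
true_value = 10.0
bias, sigma = 0.5, 0.3                       # hypothetical systematic and random error

x = true_value + bias + rng.normal(0.0, sigma, 100_000)

mse = np.mean((x - true_value) ** 2)         # mean square error about the true value
decomposed = np.var(x) + (np.mean(x) - true_value) ** 2

print(f"MSE = {mse:.4f}, variance + bias^2 = {decomposed:.4f}")
```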
Bias Hidden as Variability
[Scatter plots of data sets A and B on a scale of 0 to 50, with the population mean marked at 38.5]
Is data set A or B a better representation of the population?
Both data sets have similar variability. Data set B is a biased representation of the population of interest.
Bias Hidden as Variability (cont.)
[The same scatter plots of data sets A and B, repeated]
Sensitivity: Discerning the Signal in the Noise
[Illustration: instrument response versus concentration]
Sensitivity
Sensitivity is the capability of a method or instrument to discriminate between measurement responses representing different levels of the variable of interest.
The term "detection limit" is often used without consideration of what is really meant; there are several sensitivity DQIs, including IDLs, MDLs, and PQLs.
A sensitivity DQI describes the capability of measuring a constituent at low levels. A PQL describes the ability to quantify a constituent with known certainty.
e.g., a PQL of 0.05 mg/L for mercury represents the level where a precision of +/- 15% can be obtained.
Calibration Standards
Samples containing the analytes of interest are generally prepared in a clean matrix to develop a relationship between concentration and instrument response. This can cause a problem if the matrix under investigation interacts differently with the analyte than a clean matrix does.
The relationship between concentration and instrument response is normally used to predict the unknown concentration found in samples of interest.
Calibration allows for the determination of theoretical detection and quantification limits.
Calibration Standards - Graphically
[Graph: instrument response for six calibration levels - 1, 2, 3, 3.5, 4, and 4.5 µg/L - each response an average of multiple runs. Where are the quantification levels?]
Calibration Curve from Standards
By graphing the calibration standards we can see three regions of interest in the relationship:
- below the linear range;
- the linear range; and
- above the linear range.
[Graph: instrument response versus concentration (µg/L)]
Relationship of Instrument Calibration Curve and Analyte Detection/Quantification
[Diagram: instrument response versus concentration, divided along the concentration axis at the IDL, MDL, PQL, and LOL into regions of unknown identification and quantitation, less certain identification, less certain quantification, and known quantification]
IDL = instrument detection limit; MDL = method detection limit; PQL = practical quantitation limit; LOL = limit of linearity
Commonly Used Sensitivity Indicators

Sensitivity Indicator | Numerical Definition | Definition | Common Use
Instrument Detection Limit (IDL) | Usually 3 times the instrument noise level | Lowest value the instrument can distinguish from zero | Provides the basis for determining an MDL
Method Detection Limit (MDL) | MDL = t(n-1, 0.99) x s, where s = standard deviation; for 7 aliquots, t(n-1, 0.99) = 3.14 | Defined in 40 CFR Part 136, Appendix B | Determines the theoretical detection limit
Practical Quantitation Limit (PQL) | PQL = 5 x MDL or PQL = 10 x MDL (more precisely defined as the lowest standard on the instrument calibration curve) | "the lowest concentration of an analyte that can be reliably measured within specified limits of precision and accuracy during routine laboratory operating conditions" | Provides a numerical lower limit for critical data
Reporting Limit (RL) | Laboratory defined (often RL = PQL) | Lowest value reported by the laboratory without a "J" flag | Laboratory basis for data reporting
MDL: Controls the Type I Error
USEPA definition of the MDL: "the concentration where we have 99% confidence that the value of the analyte concentration is greater than zero."
By accepting a 1% chance of a Type I error (false positive), we define the MDL as:
MDL = t x Sd
Type I error (false positive): concluding that the analyte is present when in fact it is absent (zero).
MDL: Does Not Control the Type II Error
If the MDL is chosen as the reporting limit, then by default there is a 50% probability of a Type II (false negative) error. This means a sample that is truly at the MDL will be considered below the MDL 50% of the time.
Type II error (false negative): concluding that the analyte is absent when in fact it is present.
PQL Relationship to MDL
- The PQL has multiple definitions: generally 5-10 times the MDL, or the lowest point on the calibration curve.
- Requiring 5-10 times the MDL usually provides precision of less than 20% RSD (quantification).
- If the PQL is 5 times the MDL, and the MDL is about 3 times the standard deviation, then the PQL is approximately 15 times the standard deviation.
Quality Associated with Calibration Regions
[Diagram: instrument signal (in standard deviation units) versus analyte concentration, starting from a matrix/method blank at zero analyte concentration. Regions of high uncertainty, certain detection, less certain quantification, and certain quantification are separated by the approximate MDL level (near the LOD, about 3 standard deviations) and the approximate PQL level (near the LOQ)]
LOD = limit of detection; LOQ = limit of quantitation; σ = population standard deviation
Project Management Perspective
There are major differences between the various sensitivity DQIs (RL, IDL, MDL, and PQL). Even when you specify a particular indicator, it is very important to get a precise definition, a description of the process, and the formula, to know what it really means!
It is important to know whether the indicator reflects the detection limit in a clean matrix or in an actual sample.
The PQL is usually the most useful indicator when selecting an analytical method or laboratory.
It is important to specify what results should be reported and how:
- all results above the MDL should be reported;
- values that fall between the MDL and PQL should be "flagged" to indicate uncertainty in the value; and
- values below the MDL should be reported as non-detects, with the MDL included.
Project Management Perspective (cont.)
General Advice to Lab/Project Managers
- Have a very specific understanding of the indicator(s) you choose to use.
- Do not make any assumptions regarding the adequacy of a method, or of a data set, based on detection limits that have not been carefully defined.
- The MDL and PQL should be reported, as well as all concentrations above the MDL.
- Avoid censoring data at the PQL or RL. Get all values down to the MDL, but have the values between the MDL and PQL flagged (they are estimates and there is no confidence in their exact concentration).
What Drives the "Detection Limits"?
Regulatory requirements:
- primary drinking water quality requirements;
- risk-based goals (nutrients, toxic elements); and
- irrigation water quality requirements.
Background values for comparison:
- project-wide; or
- site-specific.
Examples of Other Sensitivity Indicators
- Detection limit (DL): commonly seen in regulations; no rigorous definition.
- Limit of detection (LOD): similar to the MDL, but with a modified statistical formula (set at 3 times the standard deviation).
- Reliable detection limit (RDL): the level where detection is extremely likely (set at 6 times the standard deviation).
- Limit of quantification (LOQ): set at 10 times the standard deviation.
- Contract-required detection limits (CRDL) and contract-required quantification limits (CRQL).
Controversies
- The definition of the MDL provided by USEPA is statistically incorrect/confusing. However, it remains the most widely documented DQI for sensitivity and one of the simplest ways to calculate a detection limit.
- There are too many definitions and indicators: DL, MDL, CRDL, PQL, LOQ, ML, ...
- Censoring of low-level values results in lost information: by not reporting data below a sensitivity indicator (e.g., the PQL), information that could potentially be used in decision making is lost.
- The correct approach depends on how the data will be interpreted and used.
Techniques for Lowering the Method Detection Limit
- Use more sample material.
- If possible, improve detector sensitivity.
- Reduce interferences:
  - during sample preparation; and
  - through instrumentation: selective AA (graphite furnace versus flame), matrix modifiers in AA to alter volatility.
Case Study 1: Calculation of the MDL
Preliminary sampling has resulted in a request for a lower detection limit for the data obtained. What are the options?
- more sample material;
- reduced interferences; or
- a more sensitive detector.
We elect to use a more sensitive detector: a graphite furnace AAS instead of flame AAS (more sensitive).
Initial Calibration of the More Sensitive Detector (graphite-furnace-equipped AAS)

Concentration (µg/L) | Response | Standard Deviation | Average | Average/Std Dev
0.02 | 383  |     |      |
0.02 | 178  |     |      |
0.02 | 400  | 124 | 320  | 2.6
0.03 | 451  |     |      |
0.03 | 500  |     |      |
0.03 | 754  | 163 | 568  | 3.5
0.05 | 1448 |     |      |
0.05 | 1178 |     |      |
0.05 | 1220 |     |      |
0.05 | 1089 | 153 | 1234 | 8.1

Evaluate the calibration data:
- the calculated MDL must be less than the spike level;
- the spike level should not be greater than 10 times the calculated MDL (prefer a spike at 1-5 times the MDL); and
- optional: pick a spike at a level where the signal/noise ratio (average/std dev) is 2.5-5.0.
Select the Spiking Level
Based on the initial calibration curve and review of the low-level standards, we chose to spike at 0.05 µg/L.

Aliquot | Instrument Response | Conc (µg/L)
1 | 1331 | 0.04997
2 | 1052 | 0.04193
3 | 1066 | 0.04234
4 | 1245 | 0.04751
5 | 1069 | 0.04244
6 | 1138 | 0.04443
7 | 1267 | 0.04812
8 | 1325 | 0.04981

Average (µg/L): 0.04582
Std Dev: 0.00342
MDL: 0.010
PQL: 0.05
Calculating the MDL
MDL = t(n-1, 0.99) x Sd
For 8 aliquots: t(n-1, 0.99) = 2.998
MDL = 2.998 x 0.0034 = 0.010
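The same arithmetic, written as a small Python sketch using the eight aliquot concentrations from the case study; the one-sided 99% Student's t quantile is taken from scipy.

```python
# Sketch of the MDL arithmetic: MDL = t(n-1, 0.99) * s for the spiked aliquots.
import statistics
from scipy import stats

aliquot_conc = [0.04997, 0.04193, 0.04234, 0.04751,
                0.04244, 0.04443, 0.04812, 0.04981]   # ug/L, from the table above

n = len(aliquot_conc)
s = statistics.stdev(aliquot_conc)                 # sample standard deviation
t99 = stats.t.ppf(0.99, df=n - 1)                  # 2.998 for n = 8

mdl = t99 * s
print(f"mean = {statistics.mean(aliquot_conc):.5f}, s = {s:.5f}")
print(f"t(n-1, 0.99) = {t99:.3f},  MDL = {mdl:.3f},  PQL ~ 5*MDL = {5 * mdl:.3f}")
```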
Some Common Mistakes
- Miscalculation of the MDL:
  - using the population (n) standard deviation instead of the sample (n - 1) standard deviation in the statistical calculations;
  - picking the wrong t statistic (df = number of samples - 1); or
  - using fewer than 7 aliquots.
- Spiking too high or too low: MDL < spike < 10 times the MDL (preferably spike at 1-5 times the MDL).
- Selecting a method based on reported MDLs without considering whether those detection limits are achievable in the true matrix.
Mistakes Are Common Everywhere
In a study by the Wisconsin DNR (1993), 23 of 56 labs were found to have incorrectly calculated the MDL. A 1998 survey found 26% of submitted lab results in Wisconsin were incorrect; only 17% of the laboratories (122 in total) reported data that met the criteria for all analytes. For this study 2,313 MDLs were reported.
Question the MDL Study Results
What are the problems with the following MDL results?

Mercury MDL study: spike = 0.2; results = 0.32, 0.31, 0.34, 0.33, 0.35, 0.34, 0.35; s = 0.0151; MDL = 0.047

Zinc MDL study: spike = 0.05; results = 0.0511, 0.0516, 0.0511, 0.0507, 0.0512, 0.0505, 0.0520; s = 0.00051; MDL = 0.0016; spike/MDL = 31
Question the MDL Study Results
- Mercury MDL study: recovery of the spiked samples is approximately 160%, indicating quantification bias in this range.
- Zinc MDL study: the spiked level is 31 times the MDL; it should be 10 times or less.
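Both checks can be scripted; the sketch below recomputes the percent recovery and the spike-to-MDL ratio for the two studies and reproduces the numbers quoted above.

```python
# Minimal sketch: sanity checks on an MDL study - spike recovery and spike/MDL ratio.
import statistics

def check_mdl_study(name, spike, results, t99):
    s = statistics.stdev(results)
    mdl = t99 * s
    recovery = 100 * statistics.mean(results) / spike
    print(f"{name}: recovery = {recovery:.0f}%, MDL = {mdl:.4f}, "
          f"spike/MDL = {spike / mdl:.0f}")

t99_7 = 3.143   # one-sided 99% Student's t for 7 aliquots (6 df)
check_mdl_study("mercury", 0.2,  [0.32, 0.31, 0.34, 0.33, 0.35, 0.34, 0.35], t99_7)
check_mdl_study("zinc",    0.05, [0.0511, 0.0516, 0.0511, 0.0507, 0.0512, 0.0505, 0.0520], t99_7)
```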
Conclusions
- Detection limit language is loose: regulations specify required DLs - do they mean IDLs, MDLs, or PQLs?
- USEPA has defined an MDL that is widely used, but statisticians debate its precise formulation.
- Procedural implementation is not standardized: labs routinely take shortcuts (e.g., they do not rerun at appropriate concentrations); miscalculations are common; and matrix effects are seldom considered or reported.
- Know what you want and how to communicate it with all parties in the process (lab-client).
Comparability: Using More Than One Data Source
Comparability
Comparability is a qualitative expression of the measure of confidence that two or more data sets may contribute to a common analysis.
A comparability DQI is a qualitative indicator of the similarity of attributes of data sets.
Common Indicators of Comparability
- Field attributes: matrix composition; sample collection method; time/season of sampling.
- Qualitative analytical attributes: sample preparation; analytical method; detection limit determination/reporting.
- Quantitative analytical attributes: spread or variability of data; commonalities in central tendency.
Comparability of New and Existing Data
Combining existing data:
- the comparability of data sets generated at different times or by different organizations must be evaluated; and
- the evaluation should establish whether two data sets can be considered equivalent with respect to the measurement of a specific variable or group of variables.
Gathering new data:
- new data must be collected so that they are comparable to existing data for key characteristics.
Existing Data Example
A regional government agency and a local Woreda town have both collected data. Are the two data sets comparable?
- both data sets were collected during the summer of 2009;
- some of the data collected were from very similar locations;
- several common chemicals were analyzed by both; and
- equivalent analytical methods were used.
Existing Data Example (cont.)
Compare: Sample Design
Are all data equally representative of the population of interest?
- random implies an equal probability of selection;
- biased implies an unequal probability of selection; and
- combining data from multiple design types may limit the usability of the data.
Compare: Temporal/Spatial Consistency
Are the data representative of the same population?
- data should come from similar time frames and locations;
- data from different times or locations are fine if equivalency across space/time is reasonable to assume; and
- data from different temporal or spatial zones may be used for trend analyses if they are comparable in other ways.
Compare: Sample Collection Methods
- Field methods: type of sampling instrument; sample collection procedure; and field splits, duplicates, or composites.
- Sample handling: filtering; preservatives or other special requirements; and time of sample preparation and time of analysis.
Compare: Variables of Interest
- Variables for grouping data - location, date, media, etc.
- Variables for determining comparability - particle size, total organic compounds, percent moisture, etc.
- Variables for analysis - reported concentration, depth to groundwater, etc.
Compare: Units of Measurement
All data sets should have units that are convertible to a common metric or SI unit:
- mg/kg, ng/g, and mg/g - acceptable;
- pCi/g and mg/L - not acceptable (unless additional information is available for the conversion).
Comparability Summary Table - Field Attributes

Data Attribute           | Data Set #1 (City: Mota OARD) | Data Set #2 (BoARD)    | Remarks
Sample collection method | Composited topsoil sample     | Grab sample            |
Matrix                   | Soil                          | Soil                   |
Sample handling          | unknown                       | Followed approved SOPs |
Sampling event           | Oct 2009                      | Sept 2008              |
Compare: Sample Preparation
Sample preparation methods should be consistent:
- sample handling times; and
- sample preparation (e.g., saturated paste vs. 1:2.5 H2O).
Laboratories sometimes differ:
- was the same laboratory used for all analyses?
- are there noticeable differences between laboratory operations and techniques?
Compare: Analytical Methods
Analytical methods should be consistent:
- Are the methods documented, and how do they perform for the intended elements under investigation?
- Are the same methods used for all analyses?
- Are the same method options used for all analyses, and how do they compare to each other?
- Are the methods capable of selecting the right analyte or fraction?
- Can the required MDL/PQL be achieved consistently?
Compare: Detection Limits
- Are the same types of detection limits reported for all data (e.g., MDL, PQL, etc.)?
- How do the detection limits compare between data sets?
- How do the detection limits compare to detected values in the data sets?
- Are the detection limits acceptable for use in decision making?
Compare: Quality Control
- Quality control of data entry: are results reported into the database in a similar manner?
- Qualification and/or validation of data: is similar QA/QC information available from all laboratories, and are the criteria consistent for data qualification or validation?
Comparability Summary Table - Analytical Attributes

Data Attribute                        | Data Set #1 (City: White Springs)            | Data Set #2 (Federal Agency) | Remarks
Sample preparation                    | unknown                                      | 1:2.5 H2O                    |
Analytical method                     | unknown                                      | Method 34-b                  |
Analytical method option              | unknown                                      | UV Spectrophotometer         |
Detection level                       | 680 - 1240                                   | 1 - 3.4                      |
Units                                 | µg/kg (ppb)                                  | mg/kg (ppm)                  | Can be converted to match
Fields of interest that were reported | Missing QC data and sample collection method | All desired fields           | Locations in different coordinates: conversion possible
Criteria for exclusion of samples     | None                                         | Rosner's test for outliers   | No data excluded based on these rules
Statistical Comparability
- Variance
- Mean or median
- Distribution
Example of Temporal Differences
Three related sites, with samples collected from 1993 to 1998; changes in site conditions were not expected over time.
Conclusion: Temporal Differences
- Thresholds of interest are 30 mg/kg and 500 mg/kg, and there are data sets outside this region; this may affect comparisons of a statistical nature.
- Combining data from across years for joint analysis is probably not appropriate (compare the 1996 data with the 1998 data, the latter being much greater in magnitude).
- The cause of the differences (bias) should be investigated.
Completeness: Analysis and Prevention
Completeness
Completeness % = 100 x (number of valid measurements / total number of measurements)
e.g., 12 field samples were collected, but at the laboratory two were found to be contaminated by a foreign substance and were rejected:
Completeness % = 100 x 10 (valid samples) / 12 (total samples) = 83%
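A trivial sketch of the same calculation:

```python
# Minimal sketch: completeness percentage for the 12-sample example above.
def completeness_pct(valid, total):
    return 100.0 * valid / total

print(f"{completeness_pct(10, 12):.0f} %")   # 83 %
```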
Causes of Incompleteness
- Loss through contamination
- Invalidation due to violation of QA protocols
- Restriction due to time constraints in obtaining samples
- Loss through QC calibration mistakes
- Physical loss through sample destruction
- Errors in field collection techniques
- Insufficient physical sample material for analysis
Qualitative Incompleteness
If the missing data are randomly dispersed throughout the study and the variability is less than expected, the DQOs established at the outset of the project may still be achieved.
If the missing data are not randomly dispersed but affect either the target area for inference, or predominantly affect one element of interest, then even a high degree of completeness may be inadequate for decision-making purposes.
The Consequences of Incompleteness
Simple sampling scheme:
- Random missing data: loss of statistical power and a general weakening of confidence in conclusions.
- Clustered or systematic missing data: loss of statistical power and weaker confidence in conclusions; loss of the assumption of representativeness.
Consequences of Incompleteness (cont.)
Complex or statistical sampling scheme:
- Random missing data: loss of statistical power and a weakening of confidence in conclusions; possible severe compromising of results depending on where the missing data occurred.
- Clustered or systematic missing data: severe compromising of results; probable abandonment of the study.
Corrective Actions for Incompleteness
- Minor incompleteness in a simple random sample: additional sample collection to achieve DQOs; analysis of the effect on false decision errors for the DQOs.
- Minor incompleteness in a complex sample: additional sample collection to achieve DQOs; statistical imputation (repair schemes) necessary.
- Minor incompleteness in sample analyses: additional re-analyses from existing samples.
- Major incompleteness in sample analyses: additional sample collection to achieve DQOs; statistical correlation analysis of surrogates.
Corrective Actions for Incompleteness (cont.)
- Incompleteness due to failure of QC requirements: statistical readjustment of the data; expert opinion from the QA manager.
- Incompleteness due to violated QA protocols: expert opinion from the QA manager.
- Incompleteness due to lack of representativeness: expert opinion on the potential impact on DQO decisions.
Data Verification, Data Validation, and Data Integrity
Definitions of Verification and Validation
- DATA VERIFICATION: evaluating the completeness, correctness, and conformance or compliance of a specific data set against the method, procedural, or contract requirements (a lab staff activity).
- DATA VALIDATION: confirming that the particular requirements for a specific intended use are fulfilled by examination and provision of objective evidence (a third-party activity); a sample-specific process extending data evaluation beyond method, procedural, or contractual compliance to determine the analytical quality of a specified data set.
Definition of Integrity
- DATA INTEGRITY: assurance that individuals and organizations involved in the collection and analysis of laboratory measurements have performed their activities in accordance with appropriate ethical standards, and that the resulting data sets are free from deliberate falsification.
Goals of Data Verification
- To document the steps that were taken to collect or analyze laboratory measurements.
- To confirm compliance with applicable technical requirements.
Who is Responsible for Verification?
Verification is part of an organization's management process, and multiple parties are involved:
- those involved with the data should document their actions;
- peers should observe and confirm performance of tasks;
- supervisors/managers should review processes and outputs; and
- verification can also be external (performed by an independent party) or internal (performed by a specially assembled group or team).
The Use of a Graded Approach
There is no "one size fits all" formula for verification; verification steps may vary according to project needs. Variables can include:
- the amount and type of documentation;
- the extent of observation by peers;
- the number of levels of management review; and
- the role of third parties.
Typical Verification Activities
- Confirm that the practices required by method, procedural, or contract requirements were actually employed;
- Perform an independent double-check of calculations;
- Confirm that field or lab notes are consistent with formal reports;
- Assess analytical and field uncertainty; and
- Check the operation of the Laboratory Information Management System (LIMS).
DQIs to Consider in Verification
- Completeness: Were the desired number of samples/data points obtained, along with the corresponding metadata?
- Comparability: Can it be confirmed that all samples were analyzed using comparable methods?
- Representativeness: Has the planned sampling design been followed appropriately?
Verification Outputs
- Documented field report;
- Verified laboratory data package; and
- Verified and documented data.
What Verification Does/Doesn't Achieve
Does:
- document performance; and
- confirm compliance with SOPs, methods, contract provisions, or similar requirements.
Doesn't:
- resolve data quality issues; or
- determine data usability.
Goals of Data Validation
- An independent scrutiny of the processes for collecting and analyzing laboratory-related activities and measurements; and
- An evaluation and qualification of the resulting data with respect to their intended use.
Who Should Do Data Validation?
- An independent party (individual or organization) reporting to the data user.
- If both field and laboratory validation are pursued, they are likely to be conducted by different parties.
Graded Approach for Validation
There is no "one size fits all" method of validation; the depth of validation depends on the goals of the project and the eventual use of the data. Variables can include:
- the percentage of data validated;
- double-checking of calculations;
- formal "flags" versus narrative evaluations of data; and
- the extent of scrutiny of pre-analytical steps (e.g., sample preparation).
Typical Validation Activities
- Review records for consistency and completeness;
- Evaluate analytical method performance;
- Inspect data packages and draw conclusions about data quality;
- Assign or review data qualification codes; and
- Review data against project requirements.
DQIs to Consider in Validation
- Sensitivity: Were the desired detection limits achieved?
- Bias: Do recoveries meet performance goals?
- Precision: Were the performance goals established in the QA Project Plan met?
Validation Outputs
- Validation reports for the data user or data usability assessor; and
- Documented, qualified (flagged) data.
What Validation Does/Doesn't Achieve
Does:
- provide independent confirmation that requirements for a specific intended use were fulfilled; and
- evaluate the quality (versus MQOs) of data points/data sets.
Doesn't:
- resolve data usability issues.
Integrity: a Vital Issue
- Every component of the data collection process is subject to manipulation;
- Lapses in integrity can lead to dire consequences; and
- Past history has demonstrated that individual and organizational integrity cannot simply be assumed.
Consequences of a Lack of Integrity
- Delays in completion of projects;
- Cost overruns/rework;
- Loss of public credibility by affected agencies/institutions;
- Continuing risks to the environment or natural resources; and
- Potential civil or criminal penalties for offenders.
Improper Field Activities
- Failure to collect required samples or specimens from the proper locations;
- Tampering with collected samples; and
- Misrepresentation of field procedures and documentation.
Improper Laboratory Activities
- Failure to analyze samples;
- Manipulation of samples prior to analysis;
- Failure to conduct required analytical steps;
- Manipulation of results during or after analysis; and
- Post-analysis alteration of records.
Falsification is Difficult to Detect
- Its purpose is to "disguise" records such as data packages - to hide itself from scrutiny;
- Some practices are legitimate in certain cases but improper in others;
- Falsification can be conducted by isolated individuals or on an organization-wide basis; and
- Once one type of falsification can be detected, other types can take its place.
Detection Tools
- Data validation;
- Site audits of management and technical systems;
- Data audits;
- Review of data tapes/disks;
- Performance evaluation (PE) and split sampling programs; and
- Review of the LIMS audit trail when available.
BUT, no single tool can guarantee detection.
Warning Signs of a Breakdown in Integrity
- Inconsistencies in dates and times;
- Evidence of "special handling" of QC or PE samples;
- Indications of instrument manipulation; and
- Evidence of retroactive editing of data.
Is Lack of Integrity Universal?
- No, but serious violations do occur;
- Be conscious of the possibility;
- The lab industry is aware of the problem and is putting effort into ethics training; and
- Most soil, plant, and water analysis data from laboratories are the result of proper analyses with appropriate QA protocols; only a few are not.
Summary
PARCCS & VVI
Quantitative DQIs:
- Precision
- Bias (Accuracy)
- Sensitivity
Qualitative DQIs:
- Representativeness
- Comparability
- Completeness
Associated DQIs:
- Verification
- Validation
- Integrity
Precision
Precision is the measure of agreement among repeated measurements of the same property under identical or substantially similar conditions. A precision DQI is a quantitative indicator of the random errors or fluctuations in the measurement process.
Bias
Bias is systematic or persistent distortion of a measurement process that causes error in one direction. A bias DQI is a quantitative indicator of the magnitude of systematic error.
Sensitivity
Sensitivity is the capability of a method or instrument to discriminate between measurement responses representing different levels of the variable of interest. A sensitivity DQI describes the capability of measuring a constituent at low levels.
Comparability
Comparability is a qualitative expression of the measure of confidence that two or more data sets may contribute to a common analysis. A comparability DQI is a qualitative indicator of the similarity of attributes of data sets.
Completeness
Completeness is a measure of the amount of valid data obtained from a measurement system, expressed as a percentage of the number of valid measurements that should have been collected. The DQI for completeness is often expressed as a percentage.
Representativeness
Representativeness is the measure of the degree to which data suitably represent a characteristic of a population, parameter variations at a sampling point, a process condition, or an environmental condition. Representativeness DQIs are qualitative and quantitative statements regarding the degree to which data reflect the true characteristics of a well-defined population.
Verification
Data verification refers to the procedures needed to ensure that a set of data is a faithful reflection of all the processes and procedures used to generate the data.
Validation
Data validation is an analyte and sample matrix-specific process to determine the analytical quality of a specific data set.
Integrity
Lack of integrity affects all aspects of data interpretation, especially data used for decision making.
What are the Most Important DQIs?
Representativeness: If the samples are not representative of the environmental conditions, how much reliability can be given to the results of the analyses? The samples must be representative of the field, and the analytical work performed must be representative of the sample sent for analysis.
Integrity: If the samples have been tampered with, how much reliability can be given to the results of the analyses? The samples must have come from where they were planned to come from, and the analytical work must have been performed properly.
Questions?