jonas-ranstam
How to collect relevant data
Databases
Data mining
Registers
Questionnaires
Clinical research data
Patient CRF
Physician, nurse, monitor
CRF Database
Data manager, systems programmer
Database Report
Statistician, statistical programmer
Anecdotal evidence
(Case reports)
Evidence-based medicine
(The Cochrane collaboration 1993)
Cohort study of smoking and lung cancer (1954) (Bradford Hill)
Case-control study of smoking and lung cancer (1950) (Bradford Hill)
Randomised clinical trial of streptomycin and tuberculosis (1948) (Bradford Hill)
Trial registration (2005)
Ethical Review Act (2004)
EU Directive (2001)
CONSORT (1996)
ICH GCP (1996)
WHO CIOMS (1993)
Vancouver Convention (1978)
Declaration of Helsinki (1964)
Nuremberg Code (1949)
Why statistics?
1. Describing data (statistics in plural)
2. Interpreting uncertain data (statistics in singular)
Statistics in singular
Two kinds of uncertainty
1. Uncertainty of measurement
2. Uncertainty of sampling
1. Uncertainty of measurement
The precision of the used measurement instrument.
The precision of the Finapres non-invasive blood pressure monitor is, on average, 12.1 mm Hg.
The same phenomenon affects biochemical and other common analyses performed in laboratories.
Statistical analyses are different. They relate to sampling uncertainty, which affects all results.
2. Uncertainty of sampling
Assume that the cumulative 10-year revision rate of the Oxford knee prosthesis is 8% and that two groups of 100 patients receiving the prosthesis are randomly selected and followed over time.
One of the groups is given bisphosphonates. Does the treatment affect the revision rate?
[Figure: patients with knee prostheses, marked as revised or not revised. Bisphosphonates: 6% revised. Placebo: 12% revised.]
Hypothesis test
H0: The two samples represent the same population
H1: The two samples represent different populations
P-value
12/100 vs. 6/100, Fisher's exact test p = 0.22
This means that the chance of getting at least the observed difference in revision rate between two random samples from the same population is 22%.
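The p-value above can be checked by hand. The following is a minimal sketch of a two-sided Fisher's exact test using only the Python standard library; the function name `fisher_exact_two_sided` is an illustrative choice, not something from the slides, and a statistics package would normally be used instead.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are outcomes (revised, not revised).
    """
    row1, row2 = a + b, c + d          # group sizes
    events = a + c                     # total number of revisions
    total = row1 + row2

    def weight(k: int) -> int:
        # Number of tables with k events in group 1 and all margins fixed.
        return comb(events, k) * comb(total - events, row1 - k)

    k_min = max(0, events - row2)
    k_max = min(events, row1)
    observed = weight(a)
    # Sum the tables that are no more likely than the observed one.
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return float(Fraction(extreme, comb(total, row1)))


# 12 of 100 revised on placebo, 6 of 100 revised on bisphosphonates.
p_value = fisher_exact_two_sided(12, 88, 6, 94)
print(f"p = {p_value:.2f}")  # prints: p = 0.22
```

Integer arithmetic is used for the comparison so that equally likely tables (here 6 and 12 revisions in the first group) are not separated by rounding.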
[Figure: 375 randomly ordered patients, of whom 30 (8%) are revised within 10 years. Labels: 6% revised; 12% revised.]
Sampling uncertainty
1. Individual effects vary between subjects. Different samples of subjects lead to different findings.
2. The between-subject variation in the population can be estimated using the information in the sample.
3. The probability of observing such an effect in the sample through sampling uncertainty alone can be calculated.
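These points can be illustrated by simulation. The sketch below assumes the 8% revision rate and groups of 100 from the knee prosthesis example; the function name, the number of simulations and the seed are arbitrary illustrative choices. It repeatedly draws two groups from the same population and counts how often they differ by at least 6 revisions. This is an unconditional simulation, not the same calculation as Fisher's exact test, so the share need not equal the p-value quoted earlier.

```python
import random


def simulate_revision_gap(rate=0.08, group_size=100, min_gap=6,
                          n_simulations=5000, seed=1):
    """Draw two groups from one population and return the share of draws in
    which the revision counts differ by at least `min_gap` through sampling
    alone."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    hits = 0
    for _ in range(n_simulations):
        group_a = sum(rng.random() < rate for _ in range(group_size))
        group_b = sum(rng.random() < rate for _ in range(group_size))
        if abs(group_a - group_b) >= min_gap:
            hits += 1
    return hits / n_simulations


share = simulate_revision_gap()
print(f"Share of simulated pairs differing by >= 6 revisions: {share:.2f}")
```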
P-value
The probability of observing an effect at least as large as the one seen if only sampling variability were at work.
P-values are often misunderstood
They cannot
- describe clinical relevance (they depend on sample size)
- show that a difference “does not exist”, because n.s. is absence of evidence, not evidence of absence
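The dependence on sample size can be shown numerically. The sketch below keeps the observed rates at 12% and 6% and changes only the group size; the group size of 1000 is hypothetical. It uses a pooled two-proportion z-test, a large-sample approximation chosen for brevity, not the Fisher's exact test used in the example above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_1, n_1, events_2, n_2):
    """Two-sided p-value from a pooled two-proportion z-test
    (a large-sample approximation, not Fisher's exact test)."""
    p_1, p_2 = events_1 / n_1, events_2 / n_2
    pooled = (events_1 + events_2) / (n_1 + n_2)
    se = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same observed rates (12% vs. 6%), different hypothetical group sizes.
for n in (100, 1000):
    p = two_proportion_p_value(12 * n // 100, n, 6 * n // 100, n)
    print(f"n = {n} per group: p = {p:.4f}")
```

The clinical difference is identical in both runs; only the p-value changes.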
Confidence intervals are better than p-values
In contrast to p-values they do
- describe clinical relevance
- show when a difference “does not exist”
because they present lower and upper limits of potential clinical effects/differences
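As a sketch of what such limits look like for the knee prosthesis example (12/100 vs. 6/100), the code below computes a Wald confidence interval for the difference between two proportions. The Wald method is an assumption made for simplicity; the slides do not say which interval method to use, and other methods behave better with small counts.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_1, n_1, events_2, n_2, level=0.95):
    """Wald confidence interval for the difference between two proportions."""
    p_1, p_2 = events_1 / n_1, events_2 / n_2
    diff = p_1 - p_2
    se = sqrt(p_1 * (1 - p_1) / n_1 + p_2 * (1 - p_2) / n_2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


diff, lower, upper = risk_difference_ci(12, 100, 6, 100)
print(f"difference = {diff:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

The interval runs from just below zero to about 14 percentage points, so the data are compatible both with no effect and with a large one.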
P-value and confidence interval
[Figure: confidence intervals drawn on an effect axis, with 0 and the range of clinically significant effects marked]
P-value     Conclusion from confidence intervals
p < 0.05    Statistically and clinically significant effect
p < 0.05    Statistically, but not necessarily clinically, significant effect
n.s.        Inconclusive
n.s.        Neither statistically nor clinically significant effect
p < 0.05    Statistically significant reversed effect
p < 0.05    Statistically but not clinically significant effect
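The conclusions named above can be written as a small decision rule. This is one possible reading of the figure, not a rule stated in the slides: it assumes that positive effects are beneficial, that a single threshold (`clinical_threshold`, a hypothetical parameter) marks the smallest clinically significant effect, and that the interval limits are compared with 0 and with that threshold.

```python
def classify_interval(lower, upper, clinical_threshold):
    """Label a confidence interval for an effect where positive values are
    beneficial and effects >= clinical_threshold are clinically relevant."""
    if clinical_threshold <= 0 or lower > upper:
        raise ValueError("need clinical_threshold > 0 and lower <= upper")
    if upper < 0:
        return "Statistically significant reversed effect"
    if lower > 0:
        if lower >= clinical_threshold:
            return "Statistically and clinically significant effect"
        if upper < clinical_threshold:
            return "Statistically but not clinically significant effect"
        return ("Statistically, but not necessarily clinically, "
                "significant effect")
    # The interval includes zero: not statistically significant.
    if upper >= clinical_threshold:
        return "Inconclusive"
    return "Neither statistically nor clinically significant effect"


# Interval from the 12% vs. 6% example, with a hypothetical threshold of
# 5 percentage points.
print(classify_interval(-0.019, 0.139, 0.05))  # prints: Inconclusive
```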
Statistical inference
- random errors (precision)
- systematic errors (validity)
Systematic errors (bias)
Internal validity
- Selection bias
- Information bias
- Confounding bias
External validity
- Representativity
- Protopathic bias
Collect data to facilitate statistical inference
- External validity: representativity (inclusion/exclusion)
- Internal validity: randomization and blinding; statistical modelling for bias adjustment
- Precision: width of confidence intervals; significance and power; sample size
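Randomization, listed under internal validity above, can be prepared in advance as an allocation list. The sketch below uses permuted blocks, which is only one of several possible schemes and is not named in the slides; the arm labels, block size and seed are illustrative assumptions.

```python
import random


def block_randomization(n_subjects, arms=("bisphosphonates", "placebo"),
                        block_size=4, seed=2024):
    """Return an allocation list built from shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the list can be regenerated
    allocations = []
    while len(allocations) < n_subjects:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_subjects]


schedule = block_randomization(200)
print(schedule[:8])
print({arm: schedule.count(arm) for arm in sorted(set(schedule))})
```

With 200 subjects and blocks of 4, each arm receives exactly 100 subjects, and the groups never differ by more than 2 during enrolment.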
Collect data to facilitate statistical inference
Remember
1. Source of subjects
2. Inclusion/exclusion
3. Random sampling (study design)
4. Modelling (include confounders in the database)
5. Number of subjects to include
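For point 5, a rough sample size can be computed before data collection starts. The sketch below uses the standard normal-approximation formula for comparing two proportions with equal group sizes; the 5% significance level and 80% power are conventional defaults assumed here, not values given in the slides.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p_1, p_2, alpha=0.05, power=0.80):
    """Subjects needed per group to detect p_1 vs. p_2 with a two-sided test
    (normal approximation, equal group sizes, no continuity correction)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_1 * (1 - p_1) + p_2 * (1 - p_2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_1 - p_2) ** 2)


# Revision rates from the knee prosthesis example: 12% vs. 6%.
n_per_group = sample_size_two_proportions(0.12, 0.06)
print(n_per_group)
```

Under these assumptions roughly 350 patients per group are needed, which suggests why groups of 100 left the example inconclusive.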
Collect data in a safe manner
- Monitoring of data collection process
- Audit trail
- GCP (Documented quality assurance system)
- EU Clinical Trials Directive 2001
- ICH-GCP (USA, EU and Japan)
Store data in a safe database
Excel worksheet
Relational (SQL) database
Clear variable definitions (string, numeric, date)
Defined variable codes (numeric)
Documented revisions/updating
Coordination
Authentic research database
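The points above can be sketched with SQLite from the Python standard library. All table and column names (`patient`, `sex_code`, `revision_log`) are hypothetical, and the trigger is only a minimal stand-in for a documented revision history; a real research database would need a fuller design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for illustration
conn.execute("PRAGMA foreign_keys = ON")  # enforce the code table
conn.executescript("""
CREATE TABLE sex_code (               -- defined numeric codes with labels
    code  INTEGER PRIMARY KEY,
    label TEXT NOT NULL
);
INSERT INTO sex_code VALUES (1, 'male'), (2, 'female');

CREATE TABLE patient (                -- clear variable definitions
    patient_id   INTEGER PRIMARY KEY,
    sex          INTEGER NOT NULL REFERENCES sex_code(code),
    surgery_date TEXT NOT NULL CHECK (surgery_date IS date(surgery_date)),
    revised      INTEGER NOT NULL CHECK (revised IN (0, 1))
);

CREATE TABLE revision_log (           -- documented revisions/updating
    log_id     INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL,
    old_value  INTEGER,
    new_value  INTEGER,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER log_revised_update AFTER UPDATE OF revised ON patient
BEGIN
    INSERT INTO revision_log (patient_id, old_value, new_value)
    VALUES (OLD.patient_id, OLD.revised, NEW.revised);
END;
""")

conn.execute("INSERT INTO patient VALUES (1, 2, '2005-03-14', 0)")
conn.execute("UPDATE patient SET revised = 1 WHERE patient_id = 1")
print(conn.execute(
    "SELECT patient_id, old_value, new_value FROM revision_log").fetchall())
```

Unlike a worksheet, the database rejects an undefined code, a malformed date or an out-of-range value at entry, and every change to `revised` leaves a trace.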
Analyze and report in a safe way
- Methodologically correct
- ICH/FDA/EMEA Guidelines
- ICMJE Guidelines (“Vancouver convention”)
- Study registration (ICMJE + WHO)
- Results registration (ICMJE)
ICMJE
Scientists have an ethical obligation to submit creditable research results for publication. Moreover, as the persons directly responsible for their work, researchers should not enter into agreements that interfere with their access to the data and their ability to analyze them independently, and to prepare and publish manuscripts.
ICMJE
Authors should identify individuals who provide writing or other assistance and disclose the funding source for this assistance.
Editors should ask corresponding authors to declare whether they had assistance with study design, data collection, data analysis, or manuscript preparation. If such assistance was available, the authors should disclose the identity of the individuals who provided this assistance and the entity that supported it in the published article.
Recommendations
- Use a professional statistics package
- Use double data entry
- Use variable and value labels
- Use written programmes
- Include lots of comments in the programmes
- Use independent programme validation
- Do not hardcode
- Back up data regularly
- Comply with local archiving instructions
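Double data entry, recommended above, means keying the same records twice and resolving every disagreement against the source document. A minimal sketch of the comparison step follows; the record layout (a dictionary per patient id) and the function name are assumptions made for illustration.

```python
def compare_double_entry(first_entry, second_entry):
    """Return (record_id, field, first_value, second_value) for every
    mismatch between two independently keyed copies of the same records."""
    discrepancies = []
    for record_id in sorted(set(first_entry) | set(second_entry)):
        first = first_entry.get(record_id, {})
        second = second_entry.get(record_id, {})
        for field in sorted(set(first) | set(second)):
            if first.get(field) != second.get(field):
                discrepancies.append(
                    (record_id, field, first.get(field), second.get(field)))
    return discrepancies


# Hypothetical records keyed by patient id.
entry_a = {1: {"sex": 2, "revised": 0}, 2: {"sex": 1, "revised": 1}}
entry_b = {1: {"sex": 2, "revised": 0}, 2: {"sex": 1, "revised": 0}}
print(compare_double_entry(entry_a, entry_b))  # prints: [(2, 'revised', 1, 0)]
```

Keeping the check in a written, commented programme rather than doing it by eye follows the other recommendations in the list.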