How to collect relevant data: databases, data mining, registers, questionnaires

Malmo 11.11.2008

Page 1: Malmo 11.11.2008

How to collect relevant data

Databases

Data mining

Registers

Questionnaires

Page 2: Malmo 11.11.2008

Clinical research data

Patient CRF

Physician, nurse, monitor

CRF Database

Data manager, systems programmer

Database Report

Statistician, statistical programmer

Page 4: Malmo 11.11.2008

Anecdotal evidence

(Case reports)

Evidence-based medicine

(The Cochrane collaboration 1993)

Cohort study of smoking and lung cancer (1954) (Bradford Hill)

Case-control study of smoking and lung cancer (1950) (Bradford Hill)

Randomised clinical trial of streptomycin and tuberculosis (1948) (Bradford Hill)

Page 5: Malmo 11.11.2008

Trial registration (2005)

Swedish Ethical Review Act (2004)

EU directive (2001)

CONSORT (1996)

ICH GCP (1996)

WHO CIOMS (1993)

Vancouver convention (1978)

Declaration of Helsinki (1964)

Nuremberg Code (1949)

Page 6: Malmo 11.11.2008

Why statistics?

1. Describing data (statistics in plural)

2. Interpreting uncertain data (statistics in singular)

Page 7: Malmo 11.11.2008

Statistics in singular

Two kinds of uncertainty

1. Uncertainty of measurement

2. Uncertainty of sampling

Page 8: Malmo 11.11.2008

1. Uncertainty of measurement

The precision of the measurement instrument used.

The precision of the Finapres non-invasive blood pressure monitor is, on average, 12.1 mm Hg.

The same phenomenon affects biochemical and other common analyses performed in laboratories.

Statistical analyses are different. They relate to sampling uncertainty, which affects all results.

Page 9: Malmo 11.11.2008

2. Uncertainty of sampling

Assume that the cumulative 10-year revision rate of the Oxford knee prosthesis is 8% and that two groups of 100 patients receiving the prosthesis are randomly selected and followed over time.

One of the groups is given bisphosphonates. Does the treatment affect the revision rate?

Patients with knee prostheses

Not revised

Revised

Page 10: Malmo 11.11.2008

Bisphosphonates: 6% revised

Placebo: 12% revised

Page 11: Malmo 11.11.2008

Bisphosphonates: 6% revised

Placebo: 12% revised

Hypothesis test

H0: The two samples represent the same population

H1: The two samples represent different populations

Page 12: Malmo 11.11.2008

P-value

12/100 vs. 6/100, Fisher's exact test p = 0.22

This means that the chance of getting a difference in revision rate at least as large as the one observed, between two random samples from the same population, is 22%.
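The p-value on this slide can be reproduced with a few lines of standard-library Python. This is a minimal sketch, assuming the usual two-sided definition of Fisher's exact test (summing the probabilities of all tables that are no more likely than the observed one); the function name `fisher_exact_two_sided` is chosen here for illustration and is not from the slides.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d      # group sizes
    col1 = a + c                   # total number of events (revisions)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Unnormalised hypergeometric weight of every table with the same margins,
    # kept as exact integers so the comparison below has no rounding issues.
    weights = {k: comb(row1, k) * comb(row2, col1 - k)
               for k in range(lowest, highest + 1)}
    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    return float(Fraction(extreme, sum(weights.values())))


# 12 of 100 revised on placebo, 6 of 100 revised on bisphosphonates
p_value = fisher_exact_two_sided(12, 88, 6, 94)
print(f"p = {p_value:.2f}")  # p = 0.22
```

A professional statistics package would normally be used for this; the sketch only shows what the test computes.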

Page 13: Malmo 11.11.2008

375 randomly ordered patients, of whom 30 (8%) are revised within 10 years

Page 15: Malmo 11.11.2008

6% revised

12% revised

Sampling uncertainty

Page 16: Malmo 11.11.2008

Sampling uncertainty

1. Individual effects vary between subjects. Different samples of subjects lead to different findings.

2. The between-subject variation in the population can be estimated using the information in the sample.

3. The probability of observing an effect at least as large as the one in the sample, when only sampling uncertainty is at work, can be calculated.
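Sampling uncertainty can also be made visible by simulation. The sketch below assumes the population from the earlier slide (375 patients, 30 of them revised), repeatedly draws two groups of 100 without replacement, and counts how often the groups differ by at least 6 revisions although no treatment effect exists. The function name, the number of repetitions and the seed are illustrative choices, not part of the slides.

```python
import random


def share_of_large_differences(n_sims=10_000, min_difference=6, seed=1):
    """Share of simulated studies in which two random groups of 100
    differ by at least `min_difference` revisions purely by chance."""
    rng = random.Random(seed)
    population = [1] * 30 + [0] * 345   # 1 = revised within 10 years
    hits = 0
    for _ in range(n_sims):
        drawn = rng.sample(population, 200)   # sampling without replacement
        revised_a = sum(drawn[:100])
        revised_b = sum(drawn[100:])
        if abs(revised_a - revised_b) >= min_difference:
            hits += 1
    return hits / n_sims


share = share_of_large_differences()
print(f"Share of simulated studies with a difference of 6 or more: {share:.3f}")
```

The simulated share need not equal the Fisher p-value, because the exact test conditions on the observed total number of revisions while the simulation fixes the population rate at 8%.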

Page 17: Malmo 11.11.2008

P-value

The probability of observing an effect at least as large as the one seen when only sampling variability is at work.

Page 18: Malmo 11.11.2008

P-values are often misunderstood

They cannot

- describe clinical relevance (they depend on sample size)

- show that a difference “does not exist”, because n.s. is absence of evidence, not evidence of absence

Page 20: Malmo 11.11.2008

Confidence intervals are better than p-values

In contrast to p-values they do

- describe clinical relevance

- show when a difference “does not exist”

because they present lower and upper limits of potential clinical effects/differences
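As an illustration, a confidence interval for the revision example (12/100 vs. 6/100) can be computed as follows. This is a sketch that assumes the simple normal-approximation (Wald) interval for a difference between two proportions; the slides do not say which interval method the author had in mind, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Difference in proportions (a minus b) with a Wald confidence interval."""
    p_a, p_b = events_a / n_a, events_b / n_b
    difference = p_a - p_b
    standard_error = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return difference, difference - z * standard_error, difference + z * standard_error


# placebo (12 of 100 revised) minus bisphosphonates (6 of 100 revised)
difference, lower, upper = risk_difference_ci(12, 100, 6, 100)
print(f"difference = {difference:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

Under these assumptions the interval runs from about -1.9 to about 13.9 percentage points: it includes zero, but it also includes differences large enough to matter clinically, which the p-value alone does not reveal.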

Page 21: Malmo 11.11.2008

P-value and confidence interval

[Figure: confidence intervals plotted along an effect axis, with 0 and the region of clinically significant effects marked]

P-value: Conclusion from confidence intervals

- p < 0.05: Statistically and clinically significant effect

- p < 0.05: Statistically, but not necessarily clinically, significant effect

- n.s.: Inconclusive

- n.s.: Neither statistically nor clinically significant effect

- p < 0.05: Statistically significant reversed effect

- p < 0.05: Statistically but not clinically significant effect
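The conclusions listed on this slide can be written as a small decision rule. The sketch below is one possible reading of the figure, not the author's definition: it assumes a positive effect is the beneficial direction, that `clinical_threshold` is the smallest clinically significant effect (a hypothetical parameter that the investigator must choose), and that the labels depend only on where the interval limits fall relative to 0 and to that threshold.

```python
def interpret_interval(lower, upper, clinical_threshold):
    """Label a confidence interval using the categories from the slide."""
    if lower > upper or clinical_threshold <= 0:
        raise ValueError("need lower <= upper and a positive clinical threshold")
    if upper < 0:
        return "statistically significant reversed effect"
    if lower > 0:
        # The interval excludes 0, so the effect is statistically significant.
        if lower >= clinical_threshold:
            return "statistically and clinically significant effect"
        if upper < clinical_threshold:
            return "statistically but not clinically significant effect"
        return "statistically, but not necessarily clinically, significant effect"
    # The interval includes 0, so the result is not statistically significant.
    if upper < clinical_threshold:
        return "neither statistically nor clinically significant effect"
    return "inconclusive"


# Illustrative interval of -1.9 to 13.9 percentage points, with an assumed
# smallest clinically significant difference of 5 percentage points.
label = interpret_interval(-0.019, 0.139, clinical_threshold=0.05)
print(label)  # inconclusive
```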

Page 22: Malmo 11.11.2008

Statistical inference

- random errors (precision)

- systematic errors (validity)

Page 23: Malmo 11.11.2008

Systematic errors (bias)

Internal validity

- Selection bias

- Information bias

- Confounding bias

External validity

- Representativity

- Protopathic bias

Page 24: Malmo 11.11.2008

Collect data to facilitate statistical inference

- External validity: representativity (inclusion/exclusion)

- Internal validity: randomization and blinding, statistical modelling for bias adjustment

- Precision: width of confidence intervals, significance and power, sample size

Page 25: Malmo 11.11.2008

Collect data to facilitate statistical inference

Remember

1. Source of subjects

2. Inclusion/exclusion

3. Random sampling (study design)

4. Modelling (include confounders in the database)

5. Number of subjects to include
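For the last point, a rough sample size can be worked out before data collection starts. The sketch below assumes the standard normal-approximation formula for comparing two proportions, a two-sided 5% significance level and 80% power; these design values and the function name are illustrative assumptions, and the 12% and 6% revision rates are borrowed from the earlier example.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of subjects per group needed to detect p1 vs. p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


n_per_group = subjects_per_group(0.12, 0.06)
print(n_per_group)  # 354
```

With these assumptions, 100 patients per group, as in the earlier example, would be too few to detect a halving of the revision rate reliably.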

Page 26: Malmo 11.11.2008

Collect data in a safe manner

- Monitoring of data collection process

- Audit trail

- GCP (Documented quality assurance system)

- EU Clinical Trials Directive 2001

- ICH-GCP (USA, EU and Japan)

Page 28: Malmo 11.11.2008

Store data in a safe database

Excel worksheet

Relational (SQL) database

Clear variable definitions (string, numeric, date)

Defined variable codes (numeric)

Documented revisions/updating

Coordination

Page 29: Malmo 11.11.2008

Authentic research database

Page 30: Malmo 11.11.2008

Analyze and report in a safe way

- Methodologically correct

- ICH/FDA/EMEA Guidelines

- ICMJE Guidelines (“Vancouver convention”)

- Study registration (ICMJE + WHO)

- Results registration (ICMJE)

Page 37: Malmo 11.11.2008

ICMJE

Scientists have an ethical obligation to submit creditable research results for publication. Moreover, as the persons directly responsible for their work, researchers should not enter into agreements that interfere with their access to the data and their ability to analyze them independently, and to prepare and publish manuscripts.

Page 38: Malmo 11.11.2008

ICMJE

Authors should identify individuals who provide writing or other assistance and disclose the funding source for this assistance.

Editors should ask corresponding authors to declare whether they had assistance with study design, data collection, data analysis, or manuscript preparation. If such assistance was available, the authors should disclose the identity of the individuals who provided this assistance and the entity that supported it in the published article.

Page 39: Malmo 11.11.2008

Recommendations

- Use a professional statistics package

- Use double data entry

- Use variable and value labels

- Use written programmes

- Include lots of comments in the programmes

- Use independent programme validation

- Do not hardcode

- Back up data regularly

- Comply with local archiving instructions
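Several of these recommendations (double data entry, written programmes, comments) can be combined in a short script. The sketch below assumes each data entry pass is available as a dictionary keyed by a record identifier; the field names `sex` and `revised` and the numeric codes are hypothetical examples, not taken from any real database.

```python
def compare_entries(first_entry, second_entry):
    """Return (record_id, field, first_value, second_value) for every
    disagreement between two independent data entry passes."""
    discrepancies = []
    for record_id in sorted(set(first_entry) | set(second_entry)):
        first_record = first_entry.get(record_id, {})
        second_record = second_entry.get(record_id, {})
        for field in sorted(set(first_record) | set(second_record)):
            first_value = first_record.get(field)
            second_value = second_record.get(field)
            if first_value != second_value:
                discrepancies.append((record_id, field, first_value, second_value))
    return discrepancies


# Two passes over the same two CRFs; the second operator mistyped one value.
first_pass = {101: {"sex": 1, "revised": 0}, 102: {"sex": 2, "revised": 1}}
second_pass = {101: {"sex": 1, "revised": 0}, 102: {"sex": 2, "revised": 0}}

discrepancies = compare_entries(first_pass, second_pass)
print(discrepancies)  # [(102, 'revised', 1, 0)]
```

Every reported discrepancy is then resolved against the paper CRF, and the correction is documented rather than edited in place.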