9 - How to Read & Critique an Article


How to appraise evidence & research papers

Presented by Dr Khamis Elessi

Evaluate your information

Everything written has at least some bias or point of view. You need to evaluate how much that bias affects the content of the article.

– Who is the author?
• Did the author have any authority in what they wrote? What credentials do they have?

– Why was the article written?
• Many articles and websites were written to present specific arguments or theories. Make sure you know if the information you are using was written for a specific purpose.

– Where was it published?
• Was it published in a peer-reviewed, scholarly, or otherwise authoritative journal? Or merely on someone’s personal website?

– When was it published?
• Obvious, yes. But make sure that the website you use is not outdated.

Simple steps for appraisal of a research paper

At first:
• Scan the abstract for a few seconds
• Briefly assess the study design
• Briefly assess the statistical precision of the results
• Are the authors’ conclusions of interest?
• Formulate a brief summary

Then:
• Critically appraise the methods section for validity
• Critically appraise the results section (especially the tables and figures) for validity & relevance
• Draw your own conclusions about applicability

Step 3: Critically appraise the evidence (cont.)

There are 4 issues in the critical appraisal of any paper:

• Validity: the extent to which the results are free from bias and the study measures what it set out to measure, i.e. outcomes that are considered clinically important.

• Relevance: the extent to which the research paper matches your needs. Not all papers are “good”, and not all papers/theses are “interesting” for you.

• Reliability (consistency): the extent to which the results are similar across different analyses within the study and in agreement with evidence from other studies.

• Importance & significance of the results (analysed in light of the type of study).

General points when appraising evidence

Critique requires some knowledge of different study types:
– Assuming that it is a well-designed study, check for appropriate sample size, randomisation, treatment allocation, statistics used, etc.
– Hierarchy of evidence: meta-analysis of RCTs > RCT > cohort > case-control > case series > case report.
– Retrospective studies are weaker than prospective studies.

Critique requires basic knowledge of essential terms of biostatistics:
– Sensitivity, specificity, prevalence, likelihood ratios
– Absolute risk reduction, relative risk reduction, odds ratios, number needed to treat, number needed to harm (a computational sketch of these effect measures follows below)
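The slides list these effect measures without formulas; as a hedged illustration only, the minimal Python sketch below (hypothetical counts, not from any study cited here) shows how they are derived from a 2×2 trial table.

```python
# Hypothetical 2x2 trial table (illustrative numbers only):
#                   Bad outcome   No bad outcome
# Treatment group       a=10           b=90
# Control group         c=20           d=80
a, b, c, d = 10, 90, 20, 80

eer = a / (a + b)            # experimental (treatment) event rate
cer = c / (c + d)            # control event rate

arr = cer - eer              # absolute risk reduction
rrr = arr / cer              # relative risk reduction
rr = eer / cer               # relative risk
odds_ratio = (a * d) / (b * c)
nnt = 1 / arr                # number needed to treat (round up in practice)
# Number needed to harm (NNH) is calculated the same way, using an adverse outcome.

print(f"CER={cer:.2f}  EER={eer:.2f}  ARR={arr:.2f}  RRR={rrr:.0%}")
print(f"RR={rr:.2f}  OR={odds_ratio:.2f}  NNT={nnt:.0f}")
```

Sensitivity, specificity, and predictive values are illustrated separately in the diagnostic-test section further below.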

Critical appraisal of evidence – cont.

• Check validity of the evidence
– Internal validity
– External validity

• Relevance of the evidence
– Did they measure something patients care about?
– Is the population similar (enough) to mine?

• Importance of the evidence
– Magnitude of effect or clinical significance?
– P values, confidence intervals, relative risk or absolute risk reduction

Internal Validity

It is the extent to which the observed difference in outcomes between the two comparison groups can be attributed to the intervention rather than other factors.

It depends on study design, blinding, randomisation, freedom from bias, sample size, appropriate statistics, etc.

Appraisal of Internal Validity

• Was assignment of patients to treatments randomised?
• Were groups similar at the start of the trial?
• Were groups treated similarly, apart from the experimental treatment?
• Were all participants accounted for in the conclusions?
• Were all participants analysed in the groups to which they were randomised (intention-to-treat analysis)?
• Were participants and clinicians kept “blind” to the treatment received?

Appraise study design

• A quality case-control study is more meaningful than a flawed RCT

External Validity

External validity is the validity of generalised (causal) conclusions or inferences in scientific studies. In simple terms, it is the degree to which the study conclusions would hold for other persons in other places and at other times.

Appraisal of External Validity

 Do the inclusion and exclusion criteria make sense?
 What proportion of the screened population was recruited?
 Can the results be reasonably applied to a definable group of patients in a particular clinical setting?
 Where were the participants recruited from (primary care / referral centre)?
 To whom do the results of this trial apply?
 Are the results generalisable beyond the trial setting?

Reliability (consistency)

• Reliability refers to the consistency of a measure.

• A measure is said to have high reliability if it produces consistent results under consistent conditions.

• For example, measurements of people’s height and weight are often extremely reliable.[1]


How to assess reliability

• Intra-rater
– Have the same person rate the “case” more than once.
• Inter-rater
– Have different people rate the “case”.
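The slides do not name a specific agreement statistic; as one commonly used (assumed, not from the slides) way to put numbers on inter-rater reliability, here is a minimal Python sketch computing simple percent agreement and Cohen's kappa for two raters.

```python
# Hypothetical ratings of the same 10 "cases" by two raters (1 = finding present, 0 = absent).
rater_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

n = len(rater_a)
observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n   # percent agreement

# Agreement expected by chance, from each rater's marginal frequencies.
p_a1, p_b1 = sum(rater_a) / n, sum(rater_b) / n
expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)

kappa = (observed - expected) / (1 - expected)                 # Cohen's kappa
print(f"Observed agreement = {observed:.2f}, Cohen's kappa = {kappa:.2f}")
```

Intra-rater reliability can be quantified the same way, with the two lists holding the same rater's first and second ratings.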


Valid and Reliable

[Figure: target diagrams illustrating “reliable and valid”, “reliable but NOT valid”, and “NOT reliable or valid” — a measure can’t be valid unless it is reliable.]

Bias

• Allocation (Selection) Bias – a systematic error in creating intervention groups, causing them to differ with respect to prognosis. Failure of randomisation; systematic differences in the comparison groups.

• Performance Bias – systematic differences in the interventions received by the two groups.

• Attrition Bias – systematic differences in withdrawals from the trial.

• Measurement (Detection) Bias – failure of blinding; systematic differences in outcome assessment.

• Reporting Bias – selective reporting of results; drawing conclusions that are not supported by the results.

• Publication Bias – failure to publish (systematic reviews).

• Funding Bias – company or sponsor interests.

Performance Bias

• Confounding: a situation in which the estimated intervention effect is biased because of some difference between the comparison groups apart from the planned interventions – such as baseline characteristics, prognostic factors, or concomitant interventions.
• Contamination: provision of the intervention to the control group.
• Compliance: poor compliance with the allocated intervention.
• Co-interventions: provision of unintended additional interventions to either group.
• Note: both contamination and compliance problems tend to bias the result towards no effect; co-interventions may bias in either direction.

Attrition Bias: Loss to follow-up

• The loss-to-follow-up (drop-out) rate should not exceed the outcome event rate and should be equal in all groups.
• Rough guide: ≤5% – OK; >20% – validity doubtful.
• Intention-to-treat analysis maintains the randomisation and analyses all participants, despite the drop-outs.

Summary: “Analyses…compare all simvastatin-allocated versus all placebo-allocated participants. These ‘intention-to-treat’ comparisons…”
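As a hedged, purely illustrative sketch of the same idea (hypothetical participants, not data from the quoted trial), the Python below contrasts an intention-to-treat tally, which keeps everyone in their randomised arm, with a per-protocol tally that drops withdrawals, and applies the rough 5% / 20% drop-out guide above.

```python
# Hypothetical participants: (randomised arm, completed trial?, bad outcome?)
participants = [
    ("treatment", True,  False), ("treatment", True,  True),
    ("treatment", False, False), ("treatment", True,  False),
    ("control",   True,  True),  ("control",   True,  True),
    ("control",   False, True),  ("control",   True,  False),
]

def event_rate(rows):
    return sum(outcome for _, _, outcome in rows) / len(rows)

for arm in ("treatment", "control"):
    itt = [p for p in participants if p[0] == arm]        # intention-to-treat: everyone randomised
    per_protocol = [p for p in itt if p[1]]                # per-protocol: completers only
    dropout = 1 - len(per_protocol) / len(itt)
    verdict = "OK" if dropout <= 0.05 else "doubtful" if dropout > 0.20 else "caution"
    print(f"{arm}: ITT rate={event_rate(itt):.2f}, "
          f"per-protocol rate={event_rate(per_protocol):.2f}, "
          f"drop-out={dropout:.0%} ({verdict})")
```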

Measurement (Detection) Bias

• Best: double-blind – both patient and investigator are unaware of the treatment allocation.
• Blinding is less important if the outcome is objective (e.g. death), but critical if the outcome is subjective.
• Blinding is impossible for some comparisons, e.g. medical vs surgical intervention.

What are the possible causes of an “effect” in an RCT?

• Bias
• Placebo
• Chance
• Real effect

Placebo Effect

• A placebo is a false or fake intervention (drug, manoeuvre, procedure).

• You can only know the size of a placebo effect if a placebo has been used!

• For example, in depression we know that antidepressant therapy is not much more effective than placebo.

• However, because of the medical attention and follow-up received, it is quite likely that placebo is better than no treatment.

Could the treatment effect have arisen by chance?

p-values
• A statistical test of the “null” hypothesis (i.e. that the intervention had no effect).
• How large was the treatment effect?
• If p < 0.05 then the result is statistically significant (i.e. an effect this large would occur by chance less than 5% of the time).
• The smaller the p-value, the less likely the effect is to have occurred by chance.
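As an assumed illustration only (hypothetical counts; SciPy is an added dependency, not something the slides reference), here is a chi-square test of the null hypothesis for a 2×2 trial table.

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = treatment / control, columns = bad outcome / no bad outcome.
table = [[10, 90],
         [20, 80]]

# correction=False: no Yates continuity correction, to keep the arithmetic simple.
chi2, p_value, dof, expected = chi2_contingency(table, correction=False)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
print("statistically significant at the 5% level" if p_value < 0.05
      else "not statistically significant at the 5% level")
```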

Confidence interval (CI): an interval within which the population parameter (the ‘true’ value) is expected to lie with a given degree of certainty (e.g. 95%, 99%).
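A minimal sketch (normal-approximation formula, the same hypothetical counts as above) of a 95% confidence interval around an absolute risk reduction; if the interval excludes 0 (no difference), the effect is statistically significant at the 5% level.

```python
import math

e1, n1 = 10, 100      # hypothetical: events / total on treatment
e2, n2 = 20, 100      # hypothetical: events / total on control

p1, p2 = e1 / n1, e2 / n2
arr = p2 - p1                                              # absolute risk reduction
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)    # standard error of the difference
low, high = arr - 1.96 * se, arr + 1.96 * se               # 95% CI (z = 1.96)

print(f"ARR = {arr:.3f}, 95% CI = ({low:.3f} to {high:.3f})")
```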

Appraisal of significance / chance

• Were all the outcomes studied important?
• Was sub-group analysis pre-planned?
• Could the treatment effect have arisen by chance?
• How large was the treatment effect (p value)?


Measures of Quality of a Diagnostic Test / Outcome Measure

• Sensitivity
• Specificity
• Accuracy
• Predictive value (positive and negative)

The higher these numbers, the better the test (a computational sketch follows below).
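A hedged, minimal sketch (hypothetical 2×2 counts against the gold standard, not data from the slides) showing how these measures are calculated:

```python
# Hypothetical comparison of a new test against the gold standard.
tp, fp = 90, 15     # test positive: with disease / without disease
fn, tn = 10, 85     # test negative: with disease / without disease

sensitivity = tp / (tp + fn)      # positives detected among the diseased
specificity = tn / (tn + fp)      # negatives among the disease-free
accuracy    = (tp + tn) / (tp + fp + fn + tn)
ppv         = tp / (tp + fp)      # positive predictive value
npv         = tn / (tn + fn)      # negative predictive value
# Note: PPV and NPV also depend on disease prevalence in the tested population.

print(f"Sensitivity={sensitivity:.2f}  Specificity={specificity:.2f}  "
      f"Accuracy={accuracy:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```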


The “Gold Standard”

• The definitive diagnostic technique.
• Often expensive, elaborate, or difficult to perform.
• However, we are always looking for faster, cheaper, better ways to diagnose disease (and to determine treatment).


Sensitivity

• The proportion of people with the disease (by the gold standard) who have a positive test result, i.e. how often we get positive results in people with the condition.
• Relates the gold standard to the new test.
• A sensitive test rarely misses people with the disease.
• Sensitive tests should be selected when there is an important penalty for missing disease (e.g. cancer diagnosis, AIDS).


Specificity

• How often we get negative results in people without the condition (i.e. few false positives).
• A specific test will rarely misclassify people without the disease as diseased.
• Specific tests are used to “rule in” a diagnosis that has been suggested by other tests.
• Ideally, these measures should both be 100%, but they rarely are! More often there will be some false positives and some false negatives.

Appraising Applicability

• Is my patient similar to the study population?
• Is the treatment feasible in my clinical setting?
• Will the potential treatment benefits outweigh the potential harms of treatment for my patient?
