Important definitions in statistics

ABOUBAKR ELNASHAR

Benha University Hospital, Egypt

Sensitivity

Probability that the test is positive when the disease is present (the true positive rate).

Specificity

Probability that the test is negative when the disease is absent (the true negative rate).
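As a minimal sketch, the two definitions can be computed from the four cells of a 2x2 table of test result against disease status. The function names and the counts below are hypothetical and chosen only for illustration.

```python
def sensitivity(tp, fn):
    """Proportion of diseased subjects whose test is positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of disease-free subjects whose test is negative."""
    return tn / (tn + fp)


# Hypothetical counts: 100 diseased subjects and 200 disease-free subjects.
tp, fn = 90, 10   # diseased: test positive / test negative
tn, fp = 160, 40  # disease-free: test negative / test positive

print(sensitivity(tp, fn))  # 0.9
print(specificity(tn, fp))  # 0.8
```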

Systematic review

A qualitative report that synthesises the evidence.

Meta-analysis

Quantitative analysis of the results of a systematic review.

Precision

A description of a level of measurement that yields consistent results when repeated. It is associated with the concept of "random error", a form of observational error that leads to measurable values being inconsistent when repeated.

Precision or positive predictive value

The proportion of true positives among all positive results (both true positives and false positives).
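A short sketch of this proportion, using hypothetical counts of true and false positives (the function name is an assumption made for illustration):

```python
def positive_predictive_value(tp, fp):
    """Proportion of positive test results that are true positives."""
    return tp / (tp + fp)


# Hypothetical counts: 90 true positives and 40 false positives.
ppv = positive_predictive_value(90, 40)
print(round(ppv, 3))
```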

Accuracy

Two definitions:

A level of measurement with no inherent limitation (i.e. free of systematic error, another form of observational error).

ISO definition: a level of measurement that yields true (no systematic errors) and consistent (no random errors) results.

Accuracy

Used as a statistical measure of how well a binary classification test correctly identifies or excludes a condition.

Accuracy is the proportion of true results (both true positives and true negatives) among the total number of cases examined.

To make the context clear by the semantics, it is often referred to as the "Rand accuracy" or "Rand index". It is a parameter of the test.

Accuracy may be determined from sensitivity and specificity, provided prevalence is known, using the equation:

accuracy = (sensitivity)(prevalence) + (specificity)(1 - prevalence)
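A minimal Python sketch of this calculation, under the usual reading that sensitivity is weighted by prevalence and specificity by (1 - prevalence). It checks that the weighted sum matches accuracy computed directly from counts. The counts and function names are hypothetical.

```python
import math


def accuracy_from_counts(tp, tn, fp, fn):
    """Proportion of true results among all cases examined."""
    return (tp + tn) / (tp + tn + fp + fn)


def accuracy_from_rates(sens, spec, prevalence):
    """Accuracy as a prevalence-weighted sum of sensitivity and specificity."""
    return sens * prevalence + spec * (1 - prevalence)


# Hypothetical counts: 100 diseased and 200 disease-free subjects.
tp, fn, tn, fp = 90, 10, 160, 40
total = tp + fn + tn + fp

sens = tp / (tp + fn)
spec = tn / (tn + fp)
prevalence = (tp + fn) / total

direct = accuracy_from_counts(tp, tn, fp, fn)
via_rates = accuracy_from_rates(sens, spec, prevalence)
print(math.isclose(direct, via_rates))  # True
```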

The accuracy paradox for predictive analytics states that predictive models with a given level of accuracy may have greater predictive power than models with higher accuracy. It may be better to avoid the accuracy metric in favor of other metrics such as precision and recall.

In situations where the minority class is more important, the F-measure may be more appropriate, especially in situations with very skewed class imbalance.
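The paradox can be illustrated with a small sketch on made-up numbers: 1000 subjects, of whom only 10 have the condition. Hypothetical model A calls everyone negative, and hypothetical model B detects most cases at the cost of some false positives. The F-measure is taken here as the F1 score (harmonic mean of precision and recall), which is one common choice.

```python
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Model A (hypothetical): always predicts "negative".
acc_a = accuracy(tp=0, tn=990, fp=0, fn=10)
f1_a = f1_score(tp=0, fp=0, fn=10)

# Model B (hypothetical): finds 8 of the 10 cases, with 30 false positives.
acc_b = accuracy(tp=8, tn=960, fp=30, fn=2)
f1_b = f1_score(tp=8, fp=30, fn=2)

# A has the higher accuracy but never identifies a case; B has the higher F1.
print(acc_a > acc_b, f1_b > f1_a)  # True True
```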

Another useful performance measure is the balanced accuracy, which avoids inflated performance estimates on imbalanced datasets.

It is defined as the arithmetic mean of sensitivity and specificity, or the average accuracy obtained on either class:

balanced accuracy = (sensitivity + specificity) / 2
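A brief sketch of the arithmetic mean described above, applied to a hypothetical test that labels every subject negative in a sample of 10 diseased and 990 disease-free subjects. Plain accuracy would be 0.99 for this test, while balanced accuracy is only 0.5.

```python
def balanced_accuracy(tp, tn, fp, fn):
    """Arithmetic mean of sensitivity and specificity."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return (sens + spec) / 2


# Hypothetical "always negative" test: sensitivity 0, specificity 1.
print(balanced_accuracy(tp=0, tn=990, fp=0, fn=10))  # 0.5
```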

Confidence interval

A way of expressing certainty about the findings from a study or group of studies, using statistical techniques.

A confidence interval describes a range of possible effects (of a treatment or intervention) that is consistent with the results of a study or group of studies.

Interpretation: I am 95% confident that the true value lies between the lower and upper limits of the range.

If the range crosses 1 (for a ratio measure such as relative risk or odds ratio), the result is not statistically significant.

95% CI (1.05-1.15) = I am 95% confident that the risk is between 1.05 and 1.15.

A wide confidence interval indicates a lack of certainty or precision about the true size of the clinical effect and is seen in studies with too few patients.

Where confidence intervals are narrow they indicate more precise estimates of effects and a larger sample of patients studied.

It is usual to interpret a ‘95%’ confidence interval as the range of effects within which we are 95% confident that the true effect lies.
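The link between sample size and interval width can be sketched as follows. The example assumes a ratio of two risks and uses the common log-based normal approximation for its 95% interval, which is one of several possible methods. All counts are hypothetical, with the same risks in both studies and only the sample size changed.

```python
import math


def risk_ratio_ci(events_1, n_1, events_2, n_2, z=1.96):
    """Ratio of two risks with an approximate 95% CI (log method)."""
    ratio = (events_1 / n_1) / (events_2 / n_2)
    se_log = math.sqrt(1 / events_1 - 1 / n_1 + 1 / events_2 - 1 / n_2)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical small study: 12/100 events versus 10/100 events.
small = risk_ratio_ci(12, 100, 10, 100)
# Hypothetical large study with the same risks: 1200/10000 versus 1000/10000.
large = risk_ratio_ci(1200, 10000, 1000, 10000)

for name, (ratio, lower, upper) in (("small", small), ("large", large)):
    crosses_one = lower <= 1 <= upper
    print(name, round(ratio, 2), round(lower, 2), round(upper, 2), crosses_one)
```

Both studies give the same point estimate, but the small study yields a wide interval that crosses 1, while the large study yields a narrow interval that does not.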

In a case-control study

It is better to have more controls than cases.

In clinical studies

It is better for the numbers of cases and controls to be the same.

For numerical data: t test

For percentages: chi-square test
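As an illustration of the chi-square test for comparing percentages, the sketch below computes the Pearson chi-square statistic for a 2x2 table without continuity correction, using only the standard library. The p-value uses the fact that, with one degree of freedom, the upper tail probability equals erfc(sqrt(chi2 / 2)). The table values and function name are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], with its p-value for 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical data: 12% of 100 patients in one group had the outcome,
# versus 10% of 100 patients in the other group.
chi2, p_value = chi_square_2x2(12, 88, 10, 90)
print(round(chi2, 3), p_value > 0.05)
```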

Relative risk

A summary measure which represents the ratio of the risk of a given event or outcome (e.g. an adverse reaction to the drug being tested) in one group of subjects compared to another group.

When the ‘risk’ of the event is the same in the two groups the relative risk is 1.

In a study comparing two treatments, a relative risk of 2 would indicate that patients receiving one of the treatments had twice the risk of an undesirable outcome compared with those receiving the other treatment.

Relative risk is sometimes used as a synonym for risk ratio.
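A minimal sketch of the ratio, with hypothetical event counts in two treatment groups (the function name is an assumption for illustration):

```python
def relative_risk(events_a, total_a, events_b, total_b):
    """Risk in group A divided by risk in group B."""
    return (events_a / total_a) / (events_b / total_b)


# Hypothetical trial: 20 of 100 patients on treatment A had the outcome,
# versus 10 of 100 patients on treatment B.
print(relative_risk(20, 100, 10, 100))  # 2.0

# Equal risks in the two groups give a relative risk of 1.
print(relative_risk(10, 100, 10, 100))  # 1.0
```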

RR

RR = 1: no association

RR < 1: negative association

RR > 1: positive association

RR = 2, i.e. the risk is doubled

RR = 5, i.e. the risk is 5 times as high

RR = 0.5, i.e. negative association and the risk is halved

OR

Interpreted in the same way as RR.

Odds ratio

Odds are a way of representing probability, especially familiar from betting.

Odds ratios provide an estimate (usually with a confidence interval) for the effect of a treatment.

Odds are used to convey the idea of ‘risk’, and an odds ratio of one between two treatment groups would imply that the risks of an adverse outcome were the same in each group.

For rare events the odds ratio and the relative risk (which uses actual risks and not odds) will be very similar.
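The last point can be checked numerically. The sketch below compares the two measures for a hypothetical rare event and a hypothetical common event, each with the risk doubled in one group. Function names and counts are made up for illustration.

```python
def relative_risk(events_a, total_a, events_b, total_b):
    return (events_a / total_a) / (events_b / total_b)


def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds of the event in group A divided by the odds in group B."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Rare event: 2 per 1000 versus 1 per 1000.
rr_rare = relative_risk(2, 1000, 1, 1000)
or_rare = odds_ratio(2, 1000, 1, 1000)

# Common event: 400 per 1000 versus 200 per 1000.
rr_common = relative_risk(400, 1000, 200, 1000)
or_common = odds_ratio(400, 1000, 200, 1000)

print(round(rr_rare, 3), round(or_rare, 3))      # nearly identical
print(round(rr_common, 3), round(or_common, 3))  # clearly different
```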

Term          Numerical ratio      Colloquial equivalent

Very common   1/1 to 1/10          A person in family

Common        1/10 to 1/100        A person in street

Uncommon      1/100 to 1/1000      A person in village

Rare          1/1000 to 1/10,000   A person in small town

Very rare     <1/10,000            A person in large town

Source: Royal College of Obstetricians and Gynaecologists

Incidence

The rate of new (or newly diagnosed) cases of the disease.

It is generally reported as the number of new cases occurring within a period of time (e.g., per month, per year).

It is more meaningful when the incidence rate is reported as a fraction of the population at risk of developing the disease (e.g., per 100,000 or per million population).
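A small sketch of reporting incidence as a fraction of the population at risk. The figures and the function name are hypothetical.

```python
def incidence_rate(new_cases, population_at_risk, per=100_000):
    """New cases in the period, scaled to a standard population size."""
    return new_cases * per / population_at_risk


# Hypothetical: 50 new cases in one year among 250,000 people at risk.
print(incidence_rate(50, 250_000))                 # 20.0 per 100,000 per year
print(incidence_rate(50, 250_000, per=1_000_000))  # 200.0 per million per year
```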

The accuracy of incidence data depends upon the accuracy of diagnosis and reporting of the disease.

In some cases (including ESRD) it may be more appropriate to report the rate of treatment of new cases since these are known, whereas the actual incidence of untreated cases is not.

Incidence rates can be further categorized according to different subsets of the population, e.g., by gender, by racial origin, by age group or by diagnostic category.

Prevalence

The actual number of cases alive with the disease, either during a period of time (period prevalence) or at a particular date in time (point prevalence).

Period prevalence provides the better measure of the disease load since it includes all new cases and all deaths between two dates.

Point prevalence only counts those alive on a particular date.

Prevalence is also most meaningfully reported as the number of cases as a fraction of the total population at risk and can be further categorized according to different subsets of the population.
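The difference between the two measures can be sketched with made-up numbers. The example assumes a fixed population and counts every case that existed at any time in the period towards period prevalence, including those who died during it.

```python
def point_prevalence(cases_alive_on_date, population):
    """Cases alive on one date, as a fraction of the population."""
    return cases_alive_on_date / population


def period_prevalence(cases_at_start, new_cases_in_period, population):
    """All cases present at any time in the period, as a fraction of the population."""
    return (cases_at_start + new_cases_in_period) / population


# Hypothetical year: 200 cases at the start, 50 new cases, 30 deaths among cases.
population = 10_000
cases_at_start, new_cases, deaths = 200, 50, 30

at_year_end = point_prevalence(cases_at_start + new_cases - deaths, population)
over_year = period_prevalence(cases_at_start, new_cases, population)
print(round(at_year_end, 4), round(over_year, 4))
```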

Controlled clinical trial (CCT)

A study testing a specific drug or other treatment involving two (or more) groups of patients with the same disease.

One (the experimental group) receives the treatment that is being tested, and the other (the comparison or control group) receives an alternative treatment, a placebo (dummy treatment) or no treatment.

The two groups are followed up to compare differences in outcomes to see how effective the experimental treatment was.

A CCT where patients are randomly allocated to treatment and comparison groups is called a randomised controlled trial.

Meta-analysis

Results from a collection of independent studies (investigating the same treatment) are pooled, using statistical techniques to synthesise their findings into a single estimate of a treatment effect.

Where studies are not compatible, e.g. because of differences in the study populations or in the outcomes measured, it may be inappropriate or even misleading to statistically pool results in this way.
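One common pooling technique, shown here only as an illustrative sketch and not necessarily the method intended above, is the fixed-effect inverse-variance average: each study's estimate is weighted by the inverse of its squared standard error. The study estimates below are hypothetical log risk ratios.

```python
import math


def pooled_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted average and its standard error."""
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return pooled, pooled_se


# Hypothetical log risk ratios and standard errors from three studies.
log_rr = [0.10, 0.25, 0.18]
se = [0.20, 0.10, 0.15]

pooled, pooled_se = pooled_fixed_effect(log_rr, se)
print(round(math.exp(pooled), 2))  # pooled risk ratio
print(pooled_se < min(se))         # True: pooling narrows the uncertainty
```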

Systematic review

A review in which evidence from scientific studies has been identified, appraised and synthesised in a methodical way according to predetermined criteria.

May or may not include a meta-analysis.

Cochrane Collaboration

An international organisation in which people find, appraise and review specific types of studies called randomised controlled trials.

The Cochrane Database of Systematic Reviews contains regularly updated reviews on a variety of health issues and is available electronically as part of the Cochrane Library.

Cochrane Library

The Cochrane Library consists of a regularly updated collection of evidence-based medicine databases including the Cochrane Database of Systematic Reviews (reviews of randomised controlled trials prepared by the Cochrane Collaboration).

The Cochrane Library is available on CD-ROM and the Internet.

Cohort

A group of people sharing some common characteristic (e.g. patients with the same disease), followed up in a research study for a specified period of time.

Cohort study

An observational study that takes a group (cohort) of patients and follows their progress over time in order to measure outcomes such as disease or mortality rates and make comparisons according to the treatments or interventions that patients received.

Thus within the study group, subgroups of patients are identified (from information collected about patients) and these groups are compared with respect to outcome, e.g. comparing mortality between one group that received a specific treatment and one group which did not (or between two groups that received different levels of treatment).

Cohorts can be assembled in the present and followed into the future (a ‘concurrent’ or ‘prospective’ cohort study) or identified from past records and followed forward from that time up to the present (a ‘historical’ or ‘retrospective’ cohort study).

Because patients are not randomly allocated to subgroups, these subgroups may be quite different in their characteristics and some adjustment must be made when analysing the results to ensure that the comparison between groups is as fair as possible.