How to Practise Evidence-based Surgery


AD Shearman

CP Shearman

Abstract

Every year an estimated 234 million major surgical procedures are undertaken

worldwide. In 2009–10, 4.8 million hospital admissions involved

surgical input in England alone and around four in five adults are likely

to have an operation in their lifetime. Despite these enormous numbers,

objective evidence for the indications and benefits (or otherwise) of

surgical procedures is often lacking. Lack of robust research into surgical

disease and treatments has been criticized, and less than 2% of national

funding for health research involves surgical projects. This seems

surprising, as inappropriate surgical treatments can be hazardous for

the patient and costly to the healthcare system.

The demand for evidence-based clinical practice is increasing, driven by

public and professional expectations. The scarcity of high-quality studies

across many different fields of surgery has led to ambiguity in the

management of many common surgical conditions, with widely varying

clinical outcomes in different geographical areas. Surgical treatments

are costly and need to be justified, not only on clinical benefit, but on

their cost effectiveness compared to other treatments.

This article outlines the various approaches that have been adopted to

evaluate evidence of benefit for surgical treatments, and their application

in a clinical setting. The components of evidence-based medicine and the

GRADE method of evaluating quality of evidence are explored. The importance

of taking into consideration cost-effectiveness and patient attitudes

to treatment is also discussed.

Keywords Evidence-based medicine; meta-analyses; randomized

controlled trials; surgical research; systematic review

Background

"I think your solution is just; but why think, why not try the

experiment?" John Hunter (1728–1794).

John Hunter made important contributions to surgical science

by carrying out experiments and observations on animals and

humans. It has been argued that surgeons have subsequently

failed to adopt modern and more appropriate research tools, such

as randomized controlled trials, thus falling behind other

specialties in evaluating treatments for their patients.1

Wound care is a good example of an area where there is vast

expenditure on complex dressings, but little, if any, high-quality

evidence for their role. Not only is lack of evidence frustrating

when planning treatment, it can potentially damage or prevent

the adoption of an effective therapy. Surgeons managing people

with complications of diabetic foot disease have enthusiastically

adopted topical negative pressure therapy. However, this treatment

is expensive and may delay discharge from hospital.

Although some studies have shown an increased rate of wound

healing, at present there is no evidence of any benefit to the

patient in terms of complete wound healing. Purchasers of

healthcare are increasingly looking for evidence of clinical and

cost-effectiveness, and without it are unlikely to support the

treatment.2 Therefore, if topical negative pressure therapy truly is

a major improvement in diabetic foot care, without evidence, it

could be lost to patients.

On the other hand, some of the most complex surgical treatments

have been successfully studied and current practice is now

based on the study findings. In 1985 over 100,000 carotid

endarterectomies were performed in the USA alone costing over

1 billion dollars. It was estimated that a third of these cases were

inappropriate and a further third were of dubious benefit. The

European Carotid Surgery Trial (ECST) and the North American

Symptomatic Carotid Endarterectomy Trial (NASCET) published

their results in 1991. Based on a combined total of 3730 patients

the studies clearly demonstrated benefit, in terms of stroke

reduction, in those patients with severe carotid disease offered

surgery in combination with best medical treatment when

compared to best medical treatment alone. This has allowed clear

criteria to be developed to select patients who will gain the most

benefit from carotid endarterectomy,3 and the procedure is now

recognized as a significant tool in reducing stroke in the National

Stroke Strategy.

Evidence-based medicine (EBM)

Although the concept had been discussed before, the practice of

evidence-based medicine (EBM) emerged in the 1980s and 1990s,

in an attempt to encourage clinicians to critically appraise

evidence more thoroughly when treating patients. It represented

a generalized dissatisfaction with the reliance on expert opinion

and individual approaches to treatments, resulting in a wide

range of therapies offered to patients with the same condition.

EBM is the conscientious, explicit, and judicious use of current

best evidence in making decisions about individual patients.4

The principle of EBM is to identify the best available evidence,

evaluate it and use it in combination with clinical expertise to aid

the treatment of patients. This approach has been successfully

utilized by groups such as the Scottish Intercollegiate Guidelines

Network (SIGN) and the National Institute for Health and Clinical

Excellence (NICE), both of which are resourced to carry out

detailed evaluations of therapies that can inform national

behaviour. This approach relieves individual clinicians of

carrying out detailed analysis themselves, uses the most detailed

evaluation tools possible and is regularly updated.

The practice of EBM can be divided into four key stages.

The research question

Having identified the clinical problem to be studied, a clear

research question must be designed so that current evidence can

clearly be identified in the literature. If the research question is

AD Shearman BSc MBBS MRCS is a Core Surgical Trainee in the London

Deanery, UK. Conflicts of interest: none declared.

CP Shearman BSc MBBS FRCS MS is Professor of Vascular Surgery at the

University of Southampton, UK. Conflicts of interest: none declared.

DEVELOPING LEADERSHIP IN SURGICAL TRAINING

SURGERY 30:9 481 2012 Published by Elsevier Ltd.
http://dx.doi.org/10.1016/j.mpsur.2012.06.005

vague or too broad, then the returns from literature searches will

be vast and unhelpful. The PICO structure is commonly used to

design the question:

• The Population that is of interest, for example people with diabetes and foot wounds and arterial disease

• The Intervention that is to be examined, for example topical negative pressure therapy after digit amputation

• The Comparison, for example standard healing by secondary intention

• The Outcome, for example complete wound healing.

Using this approach allows a literature search to be conducted,

which is likely to identify key information in the field. It is also

a useful exercise when reading a research paper to identify exactly

what question the authors were trying to answer; this is often not

very clear!
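The PICO structure above can be sketched as a small data structure that is turned into a Boolean search string. This is only an illustration: the class `PicoQuestion`, the helper `build_query` and the synonym lists are hypothetical names chosen here, the terms are not validated MeSH headings, and the exact query syntax accepted by each database differs.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PicoQuestion:
    """Search terms for each PICO element (illustrative, not validated MeSH)."""
    population: Sequence[str]
    intervention: Sequence[str]
    comparison: Sequence[str]
    outcome: Sequence[str]


def or_group(terms: Sequence[str]) -> str:
    """Join synonyms for one PICO element with OR, quoting each phrase."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(question: PicoQuestion) -> str:
    """Combine the non-empty PICO elements with AND."""
    elements = [
        question.population,
        question.intervention,
        question.comparison,
        question.outcome,
    ]
    return " AND ".join(or_group(terms) for terms in elements if terms)


# Example based on the diabetic foot question used in the text.
question = PicoQuestion(
    population=["diabetic foot"],
    intervention=["topical negative pressure", "negative pressure wound therapy"],
    comparison=[],  # left empty: comparator terms often narrow a search too far
    outcome=["wound healing"],
)
query = build_query(question)
print(query)
```

Leaving an element empty simply drops it from the query, which is one way to broaden a search that returns too few articles.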

Searching the existing evidence

It is wise to start by looking at resources such as NICE and SIGN. If

a topic has been covered, then the methodology and analysis

will be described in detail including evidence that was not

included. The Cochrane Library is a collection of six databases

of high-quality, independently collected evidence that includes

systematic reviews and a registry of randomized controlled

trials.

To search published evidence, key words from the research

question should be matched to medical subject headings

(MeSH) in a database such as Embase or Medline. Medline

can be accessed via PubMed, which is hosted by the US National

Library of Medicine of the National Institutes of Health. Embase

(Excerpta Medica database) contains biomedical and pharmaceutical

data and is available through NHS Evidence. Depending

on the key words used for the search, a number of articles will be

identified which can then be filtered further. Reviewing the

abstracts allows irrelevant articles to be discarded and the

remainder can then be critically reviewed. A typical search may

produce 2000 to 3000 returns, of which approximately 2 to

30 may be relevant and warrant further analysis.

Critical appraisal of the evidence and its application to the

clinical environment

There is a hierarchy of evidence values based on the study design

(Table 1). Although randomized controlled trials (RCTs) are

generally held to be the best study design individually, they may

have limitations and combining a series of RCTs in a systematic

review or combining the data sets from a number of RCTs in

a meta-analysis produces more powerful evidence (Box 1).

A systematic review identifies relevant studies for a specific

research question from the literature, assesses the quality of

evidence produced and combines the results of those of high

quality. If the raw data are clear, or the authors can help verify them,

the results from a series of studies may be combined mathematically

to provide a more comprehensive meta-analysis. This

has the advantage of increasing the numbers of subjects and

hence the power of the study to detect a real difference.

However, this approach is based on the assumption that all the

study populations are similar and the methodology used was the

same in all the reports included. The results can be graphically

represented as forest plots, illustrating the size and the confidence

intervals of the effects seen (Table 2).

In a cohort study one group who has received an intervention

is compared to another group who did not. There is no randomization

and a number of factors such as case selection and surgeon

preference for a treatment may confound the results. However,

prospective cohort studies can be very important in surgical

disease, for example identifying the significance of micrometastasis

in lymph nodes removed during breast surgery.5

In a case-controlled study, a patient group is compared retrospectively

to a group matched for a number of variables such as age

or sex. A case series is a selected number of patients (not usually

consecutive) who have a condition or treatment. There is no

comparator group and often the observations reflect the opinions of

the authors. Case series, however, may be helpful in identifying

a particular area that needs further research. Individual case

reports have no role in changing practice, but may be useful

educationally to raise awareness of an unusual condition.

Different types of study design, arranged in order of impact and grouped as combined, experimental, observational or opinion

| Study design | Type | Description |
|---|---|---|
| Systematic reviews and meta-analysis | Combined | A compilation of matched studies, usually RCTs, which are compiled or combined mathematically (known as a meta-analysis) to provide a high-impact answer to a specific research question |
| Randomized controlled trials (RCT) | Experimental | A group of matched individuals who are randomized prospectively to receive one or more therapies or not. Outcomes are compared between groups. RCTs are particularly powerful when patients and observers do not know if they are receiving/performing the intervention or not (blinding) |
| Cohort studies (observational) | Observational | One group in a population that is exposed to the treatment or condition is compared to those who have not been exposed |
| Case-control studies | Observational | Identify factors that may contribute to a condition by comparing subjects with a disease of interest to members of the same population without it |
| Case series | Observational | A collection of similar cases which are of interest and may lead to a change in practice |
| Case reports | Observational | A single case of a previously unreported occurrence which may lead to reflection on clinical practice |
| Expert opinion | Opinion | A group of individuals who are considered experts in their field provide guidelines based on their experience |

Table 1





If there is no evidence available then expert opinion can be

used. Although this can be very biased, it is possible to extract

more value out of the process by challenging groups of experts

with a series of clinically relevant questions and debating them

until an agreement is reached: a Delphi consensus.

Registry data are increasingly being used to inform surgical

practice. The National Joint Registry and National Vascular

Database are good examples of this.

Application of the conclusions including integration with

clinical management

Having assessed the quality of the evidence as outlined above

perhaps the most difficult and often controversial step is applying

this to clinical practice. Traditional methods of evaluating

evidence such as systematic review rely heavily on the quality of

the studies they are synthesized from. However, methods of

assessing quality have previously only been related to the type

of study undertaken, with RCTs being considered as the only

valid source of evidence. This is particularly difficult in surgery

where the quality of studies may be considered to be poor.

The GRADE (Grading of Recommendations Assessment,

Development and Evaluation) system is a comprehensive and

systematic tool, which can be used to assess the quality of

evidence in the context of producing clinical guidelines. It does

this by considering all possible factors that might influence

whether or not a certain test or intervention is recommended.

These factors or outcomes are designated critical, important or

unimportant.

Following a literature search, the evidence is scored based on

study design, but also on the clinical significance and magnitude

of effect of the study intervention. The study is interrogated for

confounding factors such as publication bias or commercial

interest that may influence the results. Inconsistency compared

to other studies (if the result is very different from similar

studies), indirectness (the benefit is unclear as PICO was not

clear) and imprecision (wide confidence intervals or clinically

unimportant effects) are negative factors. Conversely, publications

of poor design (for example observational studies) may be

upgraded where there is a large magnitude of effect, where all

plausible confounding factors would in fact reduce the observed

effect, or where there is a dose-response gradient. A GRADE

grid, which is relatively simple to use, gives a standardized score

of the quality of data.6

Key points to look for when assessing a randomized controlled trial

• Was a method of randomization used and was this satisfactory and not biased?

• Were the patient groups similar at baseline?

• Was the patient blinded to the intervention?

• Was the team caring for the patient blinded to the treatment?

• Was the outcome assessor blinded to the treatment?

• Was the dropout rate described and was it acceptable (

Box 1

Forest plots

| Study | Treatment (n/N) | Control (n/N) | Weight (%) | Odds ratio (M–H, fixed, 95% CI) |
|---|---|---|---|---|
| Study 1; 2006 | 720/6498 | 750/6500 | 49.9 | 0.98 (0.68–1.28) |
| Study 2; 2008 | 989/8542 | 1014/8569 | 37.1 | 0.96 (0.66–1.26) |
| Study 3; 2010 | 200/3000 | 270/3000 | 13.0 | 0.52 (0.21–0.83) |
| Total | 18040 | 18069 | 100 | 0.94 (0.73–1.15) |

[Forest plot: odds ratio (M–H, fixed, 95% CI) for each study on a horizontal axis from 0 to 2.0, with the line of no effect at 1.0]

In this example the number of events in a treatment group is compared with the number in a control group who received placebo. Studies identified as being of high quality are listed with the number of events that occurred in each arm of the study. The vertical line passing through one indicates no effect. Each study is displayed as a box, the size of which represents the sample size. The horizontal line passing through the box represents the 95% confidence interval. If the box appears on the favours treatment side but the confidence interval crosses the no effect line then there is a chance that treatment is not effective. Pooled results are represented by a diamond, of which the right and left extent indicates the confidence interval. Weighting indicates the influence of the individual study on the pooled results. The odds ratio is a measure of the size of the effect and the Mantel–Haenszel (M–H) statistic is a method of determining pooled odds ratios.

Table 2
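The pooled odds ratio described in the legend to Table 2 can be sketched with the fixed-effect Mantel–Haenszel estimator, using the event counts printed in that table. The names `Study`, `odds_ratio` and `mantel_haenszel_or` are chosen here for illustration only. The figures in Table 2 appear to be illustrative, so values recomputed from the counts need not match the printed odds ratios exactly, and confidence intervals are not calculated in this sketch.

```python
from typing import NamedTuple, Sequence


class Study(NamedTuple):
    label: str
    events_treatment: int
    n_treatment: int
    events_control: int
    n_control: int


def cells(study: Study):
    """Return the 2x2 table cells (a, b, c, d) for one study."""
    a = study.events_treatment                       # treated, event
    b = study.n_treatment - study.events_treatment   # treated, no event
    c = study.events_control                         # control, event
    d = study.n_control - study.events_control       # control, no event
    return a, b, c, d


def odds_ratio(study: Study) -> float:
    """Odds ratio for a single study: (a*d) / (b*c)."""
    a, b, c, d = cells(study)
    return (a * d) / (b * c)


def mantel_haenszel_or(studies: Sequence[Study]) -> float:
    """Fixed-effect pooled odds ratio: sum(a*d/n) / sum(b*c/n)."""
    numerator = 0.0
    denominator = 0.0
    for study in studies:
        a, b, c, d = cells(study)
        n = study.n_treatment + study.n_control
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Event counts as printed in Table 2.
STUDIES = [
    Study("Study 1; 2006", 720, 6498, 750, 6500),
    Study("Study 2; 2008", 989, 8542, 1014, 8569),
    Study("Study 3; 2010", 200, 3000, 270, 3000),
]

pooled = mantel_haenszel_or(STUDIES)
for study in STUDIES:
    print(f"{study.label}: OR = {odds_ratio(study):.2f}")
print(f"Pooled (M-H, fixed): OR = {pooled:.2f}")
```

Because each study contributes in proportion to its size, the two large studies dominate the pooled estimate, which is the behaviour the Weight column in Table 2 is meant to convey.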

Once the GRADE score has been determined for each outcome,

the overall quality of evidence is graded on a 1–4 scale (Table 3).

The overall quality grade feeds into the decision to formulate

a recommendation for or against the intervention in question,

along with the balance of desirable and undesirable outcomes of

alternative treatments, variability in values and patient preference

and cost. Having considered these four variables, a recommendation

can be made for the intervention. These are stratified into

either strong or weak recommendations for and against the

intervention.
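The scoring logic described in this section can be sketched as follows. This is a deliberately simplified illustration and not the GRADE working group's procedure: the starting scores (4 for randomized evidence, 2 for observational evidence), the one-point steps, the restriction of upgrades to non-randomized evidence, and the names `EvidenceProfile` and `grade_quality` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Quality levels from Table 3 mapped to the 1-4 scale (mapping assumed here).
LEVELS = {4: "High", 3: "Moderate", 2: "Low", 1: "Very low"}


@dataclass
class EvidenceProfile:
    randomized: bool
    # Negative factors named in the text (each lowers the score by one).
    publication_or_commercial_bias: bool = False
    inconsistency: bool = False
    indirectness: bool = False
    imprecision: bool = False
    # Factors that may upgrade studies of weaker design.
    large_effect: bool = False
    confounding_would_reduce_effect: bool = False
    dose_response: bool = False


def grade_quality(profile: EvidenceProfile) -> str:
    """Return a Table 3 quality level for one outcome (simplified sketch)."""
    # Assumption: randomized evidence starts high, observational starts low.
    score = 4 if profile.randomized else 2
    score -= sum([
        profile.publication_or_commercial_bias,
        profile.inconsistency,
        profile.indirectness,
        profile.imprecision,
    ])
    # Assumption: upgrades are applied only to non-randomized evidence.
    if not profile.randomized:
        score += sum([
            profile.large_effect,
            profile.confounding_would_reduce_effect,
            profile.dose_response,
        ])
    # Keep the result on the 1-4 scale.
    return LEVELS[max(1, min(4, score))]


rct = EvidenceProfile(randomized=True, imprecision=True)
cohort = EvidenceProfile(randomized=False, large_effect=True)
print(grade_quality(rct), grade_quality(cohort))
```

In practice each judgement is made by a guideline panel rather than by a flag, so the sketch is best read as a summary of which factors push the grade down or up.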

Other factors which must be considered

Cost-effectiveness

All healthcare budgets have finite limits, so not only should

surgical treatments be clinically effective, they should also be cost-

effective and fairly distributed. This does not mean that cheap

treatments are best, but that the health gain obtained

needs to be assessed against the cost of gaining it.

A common way of doing this is to determine the cost of the

Quality Adjusted Life Years (QALYs) gained by the treatment.

Quality of life can be determined using a number of tools that

assess the various domains of a subject's life. A QALY is the life

expectancy multiplied by the quality of life of the subject. For

example, if a treatment was to give a patient 10 years of extra life,

but their quality of life was 50% of what would be expected, the

number of QALYs gained would be five. The costs of treatment

can be determined and the cost per QALY calculated. This

approach not only allows comparisons of different therapies, but

also allows the best use of health resources. An expensive treatment

that gains a high number of QALYs might be better than

a less expensive option, which although safe, does not confer the

same long-term benefit to the patient. QALYs are used by NICE in

this way when evaluating treatments.7 The threshold for a treatment

to be supported on cost-effectiveness grounds is usually

£20,000 (£30,000 if the health gain is considered very important)

per QALY. As an example, when abdominal aortic aneurysm

screening was first evaluated the cost per QALY at 5 years was

£36,000 and therefore not supported. However, at 10 years, due to

the increased survival, the cost per QALY was £8000 and so

deemed acceptable on cost-effectiveness grounds.8
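The QALY arithmetic in this section can be written out as a short sketch. The function names are chosen here for illustration, the treatment cost of 40,000 is a hypothetical figure that does not come from the article, and the thresholds are the 20,000 and 30,000 per QALY quoted above.

```python
def qalys_gained(extra_life_years: float, quality_weight: float) -> float:
    """QALYs = years of life gained x quality of life (0 = dead, 1 = full health)."""
    if not 0.0 <= quality_weight <= 1.0:
        raise ValueError("quality_weight must lie between 0 and 1")
    return extra_life_years * quality_weight


def cost_per_qaly(total_cost: float, qalys: float) -> float:
    """Cost of treatment divided by the QALYs it gains."""
    if qalys <= 0:
        raise ValueError("qalys must be positive")
    return total_cost / qalys


def is_supported(cost_per_qaly_value: float, very_important: bool = False) -> bool:
    """Compare against the usual threshold, or the higher one for very important gains."""
    threshold = 30_000 if very_important else 20_000
    return cost_per_qaly_value <= threshold


# Worked example from the text: 10 extra years at 50% quality of life.
gained = qalys_gained(10, 0.5)

# Hypothetical treatment cost, used only to show the division.
example_cost_per_qaly = cost_per_qaly(40_000, gained)

# Aneurysm screening figures quoted in the text.
supported_early = is_supported(36_000)
supported_at_10_years = is_supported(8_000)
print(gained, example_cost_per_qaly, supported_early, supported_at_10_years)
```

The aneurysm screening example shows why the time horizon matters: the same programme fails the threshold on early follow-up and passes it once longer-term survival is counted.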

Patients' attitudes

Treatments not only need to be highly effective in treating

the surgical condition, they must also be acceptable to the

patient. This aspect of surgical research has had little attention

paid to it until recently with the use of Patient Reported

Outcome Measures (PROMs). Also techniques such as standard gamble

can be used to assess how much risk or inconvenience a patient

will accept in exchange for a gain in health benefit. An example

would be patient willingness to travel further for specialist care

to gain a better outcome.

Evidence-based medicine applied to surgery

There are inherent problems with the practice of EBM in surgery.

These include the ability of the surgeon to be able to formulate

relevant questions and design the study. Patients undergoing

surgery often have a range of co-morbidities which affect the

surgical procedure (e.g. obesity). Standardizing surgical technique

across a number of surgeons can be difficult, and outcomes

of surgery are dependent on complex interactions of team

experience, case volume and facilities. Many surgical treatments

are well established and it may be perceived to be difficult to

randomize patients to other treatments or in some cases no

treatment. Surgical techniques take time to learn and to reach a high

level of performance, so there may be a vested interest for the

surgeon to continue with the procedure rather than having to

learn a new procedure at which they may not be so competent.

However, many patients now quite rightly demand to know the

benefits and risks of the surgery they are due to undergo, and it is

the duty of the surgeon to be able to make these available. There

have also been a number of established treatments that have been

found to be ineffective on careful study. Examples include radical

surgery for breast cancer, which was shown to offer no advantage

compared to less invasive treatments and the surgical repair of

abdominal aortic aneurysms measuring less than 5.5 cm in

diameter, shown to carry more risk than careful observation.

Many of the problems of conducting research in surgery can be

overcome by robust study design, close coordination of the study

participants and careful attention to standardizing surgical

procedures. The National Institute for Health Research has recently

launched an initiative with the Royal College of Surgeons of

England to improve the volume and quality of surgical research.

With the increasing demand for evidence of clinical benefit and

cost effectiveness there has probably never been a better time to

undertake high-quality surgical research.

Conclusion

All surgical treatments carry an element of risk for the patients and

are generally expensive. With increased public knowledge and

awareness of surgical outcomes and cost-effectiveness, it is

imperative to be able to demonstrate clear evidence supporting the

treatments offered by surgeons to their patients. If this cannot be

done, then at the very least the procedure will not be supported by

healthcare commissioners and at the worst the surgeon may be

causing unintentional harm, if the technique turns out not to be

effective.

GRADE definitions of quality of evidence levels. Based on this, recommendations can be made which are either strong or weak

| GRADE | Definition |
|---|---|
| High | Further research is very unlikely to change confidence in estimate of effect |
| Moderate | Further research is likely to have an important impact on confidence in estimate of effect and may change estimate |
| Low | Further research is very important and is very likely to have an important impact on the confidence in the estimate of effect and is likely to change estimate |
| Very low | Any estimate of effect is uncertain |

Table 3

REFERENCES

1 Horton R. Surgical research or comic opera? Questions but few answers. Lancet 1996; 347: 984–5.

2 NICE Clinical Guideline 119. Diabetic foot problems. Inpatient management of diabetic foot problems. www.nice.org.uk/guidelineCG119.

3 Naylor AR, Rothwell PM, Bell PRF. Overview of results and secondary analyses from the European and North American randomised trials of endarterectomy for symptomatic carotid stenosis. Eur J Vasc Endovasc Surg 2003; 26: 115–29.

4 Sackett DL, Rosenberg WMC, Muir Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn't. Br Med J 1996; 312: 71–2.

5 Merkow RP, Ko CY. Evidence based medicine in surgery. The importance of both experimental and observational study designs. J Am Med Assoc 2011; 306: 436–7.

6 Jaeschke R, Guyatt GH, Dellinger P, et al. Use of GRADE grid to reach decisions on clinical practice guidelines when consensus is elusive. Br Med J 2008; 337: 327–37.

7 National Institute for Health and Clinical Excellence. Guide to the methods of technology appraisal. www.nice.org.uk/media/B52/A7/TAMethodology.

8 Multicentre aneurysm screening group. Multicentre aneurysm screening study (MASS): cost effectiveness analysis of screening for abdominal aortic aneurysms based on four year results from randomised controlled trial. Br Med J 2002; 325: 1135–42.

FURTHER READING

Bhandari M, Deveraux PJ, Montorini V, Cina C, Tandan V, Guyatt GH. Users' guide to the surgical literature: how to use a systematic literature review and meta-analysis. Can J Surg 2004; 47: 60–7.

Grade working group. Grading quality of evidence and strength of recommendations. Br Med J 2004; 328: 1490–8.

Hsu C-C, Sandford BA. The Delphi Technique: making sense of consensus. Practical Assess Res Eval 2007; 12: 1–8, http://pareoline.net/getvn.asp?v=12&n=10.

Persad G, Wertheimer A, Emanuel EJ. Principles for allocation of scarce medical interventions. Lancet 2009; 373: 423–31.

Practice points

• Evidence-based medicine allows surgeons to use the best evidence in making decisions for their patients

• Lack of high-quality study designs in surgery has led to criticism, which must be addressed

• The GRADE system scores quality of evidence across a range of relevant factors and allows strong and weak recommendations to be made

• Careful study design can overcome many of the initial barriers to research into surgical techniques

• Treatments should be assessed on their clinical and cost-effectiveness as well as their acceptability to patients

