University of Groningen Demystification of commonly used measurements in paediatrics Bekhof, Jolita IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below. Document Version Publisher's PDF, also known as Version of record Publication date: 2014 Link to publication in University of Groningen/UMCG research database Citation for published version (APA): Bekhof, J. (2014). Demystification of commonly used measurements in paediatrics [S.l.]: s.n. Copyright Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons). Take-down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum. Download date: 16-05-2018




Demystification of commonly used measurements in paediatrics

Jolita Bekhof


ISBN: 978-90-367-6989-1

Layout & Printing: Optima Grafische Communicatie, Rotterdam, The Netherlands

©Copyright Jolita Bekhof, Zwolle, 2014. The research presented in this thesis was conducted at Isala, Amalia Children’s Center, in Zwolle, The Netherlands.


Demystification of commonly used measurements in paediatrics

Doctoral thesis

to obtain the degree of doctor at the University of Groningen

on the authority of the Rector Magnificus, Prof. Dr. E. Sterken,

and in accordance with the decision of the College voor Promoties.

The public defence will take place on

Wednesday 18 June 2014 at 16:15

by

Jolita Bekhof

born on 6 June 1971 in Leeuwarden


Supervisors
Prof. dr. P.L.P Brand
Prof. dr. J.H. van den Horn - Kok

Co-supervisor
Dr. H.L.M. van Straaten

Assessment committee
Prof. dr. A.A.E. Verhagen
Prof. dr. M. Offringa
Prof. dr. E.E.S. Nieuwenhuis


Contents

Introduction and outline of the thesis

Chapter 1 Semi-quantitative measurement of glucosuria in neonates
Validity and interobserver agreement of reagent strips for measurement of glucosuria.
Reliability of reagent strips for measurement of glucosuria in a neonatal intensive care setting.

Chapter 2 Early diagnosis of late onset neonatal sepsis in preterms
Clinical signs to identify late-onset sepsis in preterm infants.
Glucosuria as an early marker of late onset sepsis in preterm infants.

Chapter 3 Fluid balance charts in neonates
Reliability of the fluid balance in neonates.
Usefulness of the fluid balance: a randomised controlled trial in neonates.

Chapter 4 Clinical assessment of dyspnoea in children
Systematic review: insufficient validation of clinical scores for the assessment of acute dyspnoea in wheezing children.
Large interobserver and intraobserver variation in clinical assessment of dyspnoea in wheezing children.

Chapter 5 Viral tests for cohort isolation in children hospitalised for bronchiolitis
Co-infection in children hospitalised for bronchiolitis: role of roomsharing.
Roomsharing in hospitalised children with bronchiolitis.

General discussion and future perspectives

Summary

Dutch summary for the layperson (Nederlandse samenvatting voor de leek)

List of co-authors

List of publications

Acknowledgements (Dankwoord)

About the author


Introduction and outline of the thesis


Introduction

“Not everything that can be counted counts, and not everything that counts can be counted”

Albert Einstein

Measurements are essential in medical practice. Throughout their training, doctors are taught the adage “meten is weten” (the knowledge is in the numbers). In this thesis we explore this adage within the field of paediatrics by investigating four specific aspects of measurements: what a given measurement actually measures (validity); how precise the measurement is (accuracy, reliability); the degree of variation within and between the individuals performing the measurement (intraobserver and interobserver agreement); and why we should perform the measurement in the first place (utility).

While measurements can take place at two different levels – either for diagnostic purposes or to evaluate a treatment effect – the studies in this thesis mainly focus on diagnostic measurements. For such diagnostic measurements to be useful, they should have sufficient validity, reliability and utility. Although this may be self-evident, many diagnostic measurements are either incompletely validated or not validated at all, and reliability is hardly ever perfect. As a result, each and every measurement in medicine is surrounded by a certain degree of uncertainty. Not only is uncertainty around measurements difficult to explain to patients – and in paediatrics to their parents – but doctors also tend to be uncomfortable admitting uncertainty to themselves.1,2

When making a diagnosis, the most important instruments at a physician’s disposal are taking the patient’s history and performing a physical examination. Nevertheless, when uncertainty around a diagnosis remains after thorough history taking and physical examination, a physician will often perform additional testing, which is sometimes excessive and not indicated. The choice of additional tests and the extent to which physicians order them varies considerably, depending on experience and specific knowledge, as well as on personal characteristics and self-confidence. Throughout the history of medicine, tradition and training have generated many myths, which may prompt physicians to perform unnecessary testing based on rationalisations such as “We always do it this way” or “I was taught to…”.3 Another conventional justification for ordering diagnostic testing is providing reassurance to the patient, although the reassuring effect is doubtful in patients with a low probability of serious disease.4,5

This thesis addresses a number of diagnostic measurements commonly performed in different areas of paediatric practice, thereby emphasising that – apart from additional tests – every part of the history taking and physical examination can be viewed as a diagnostic test.6 We attempt to provide answers to questions regarding the validity and reliability of the measurement, the variability within and between users, and whether the measurement is useful in improving patient care. This introductory chapter first discusses the general aspects of uncertainty in medicine. This is followed by a brief explanation of the methodological issues concerning diagnostic research and the clinimetric evaluation of measurements, and the way in which evidence-based medicine may help doctors to deal with this feeling of uncertainty. Finally, in the outline of the thesis, we provide background information concerning the different areas in which our questions were explored.

Uncertainty in medicine

“Medicine is a science of uncertainty and an art of probability”

William Osler

In general, people dislike ambiguity and try to avoid risks and uncertainty. When patients consult a doctor, they expect their doctor to give clear and unequivocal advice.7 However, although medicine strives to be a rigorous science, uncertainty is inherent in medical practice. “The only thing certain about ambiguity is that it is a fixed part of medicine”.7 For example, it is hard to explain to patients that a positive test result does not always confirm the presence of the disease and that – on the other hand – a negative test result does not rule out the possibility that one might still have the disease. Few diagnostic measurements have been rigorously studied and even if they have, diagnostic tests seldom show perfect accuracy. Most of the time, therefore, we cannot give patients (or ourselves!) clear-cut answers, and 100% certainty is seldom obtained.
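The point that a positive result does not confirm disease, and a negative result does not exclude it, can be made concrete with Bayes' theorem. The following is a minimal sketch; the function name and all numbers are hypothetical and chosen only for illustration, not taken from the studies in this thesis.

```python
def post_test_probability(pretest, sensitivity, specificity, positive=True):
    """Probability of disease after a test result, using Bayes' theorem.

    pretest      -- probability of disease before testing (0..1)
    sensitivity  -- P(test positive | disease)
    specificity  -- P(test negative | no disease)
    positive     -- True for a positive result, False for a negative one
    """
    if positive:
        true_pos = sensitivity * pretest
        false_pos = (1 - specificity) * (1 - pretest)
        return true_pos / (true_pos + false_pos)
    false_neg = (1 - sensitivity) * pretest
    true_neg = specificity * (1 - pretest)
    return false_neg / (false_neg + true_neg)


# Hypothetical test: 90% sensitive, 90% specific, 10% pretest probability.
p_after_positive = post_test_probability(0.10, 0.90, 0.90, positive=True)
p_after_negative = post_test_probability(0.10, 0.90, 0.90, positive=False)
print(f"after a positive result: {p_after_positive:.2f}")
print(f"after a negative result: {p_after_negative:.3f}")
```

With these assumed values, a positive result raises the probability of disease to only about one in two, and a negative result still leaves a small residual probability, which is the uncertainty described above.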

Physicians have to deal with a wide range of diagnostic uncertainties: “Does this patient have disease X?”, “How can I be sure my diagnosis is correct?”, “To what risk am I exposing my patient by performing this invasive diagnostic test?” One of the ways physicians deal with diagnostic uncertainties is by performing additional tests or by combining data from history and physical examination in scores or scales, as if quantitative results from a computer, machine or calculation yield a higher level of certainty. However, normal results may give false reassurance and there is no absolute guarantee that a result, positive or negative, is correct.

There is increasing awareness that excessive testing and procedures are a problem, resulting in increased cost and sometimes harm to the patient.8,9 In the USA, the American Board of Internal Medicine Foundation recently started the Choosing Wisely campaign, an initiative set up to reduce waste and overuse in healthcare. Since the campaign was launched in 2012, more than 130 tests and procedures have been called into question by 25 medical specialty societies (www.choosingwisely.org).

The awareness of the potentially harmful effects of additional diagnostic measurements is also increasing in the Netherlands. A number of recent papers in Medisch Contact, the journal of the Royal Netherlands Medical Society (KNMG), introduced the term “VOMIT” – Victim Of Modern Imaging Technology – to describe patients who had undergone a radiological investigation (judged to be uncalled for in retrospect) that led to further unnecessary investigations and even to unnecessary surgery or treatment.10-13

Evidence-based medicine

“The process of diagnosis is complex and poses many challenges to practicing clinicians, attempting to practice in an evidence based manner”

Straus, 2006 (reference 14)

Evidence-based medicine (EBM) is one of the ways of dealing with uncertainties in medical practice. EBM as defined by Sackett is “The integration of best research evidence with clinical expertise and patient values” (Figure 1).15 The development of EBM has been a major advance in healthcare and it can help when making choices in healthcare. Moreover, EBM provides several ways of quantifying and communicating uncertainty. However, most of the available evidence comes from studies done in the domain of therapy. Since the evaluation of most diagnostic tests is often far less rigorous than that of new drugs, evidence-based decision making in the diagnostic area remains a challenging task for practising physicians.16 Apart from the need to investigate new diagnostic tests, it is striking to note that many commonly used existing tests have not been seriously evaluated.

Figure 1. Principles of evidence-based medicine, showing the five steps. Starting from a patient dilemma: (1) formulate a structured question; (2) literature search; (3) appraisal of literature; (4) application in practice; (5) evaluate the process.


Since 2005, the paediatricians at Isala, a teaching hospital in Zwolle (a city of about 120,000 inhabitants), have been implementing evidence-based practice, which includes a weekly EBM meeting during which a critically appraised topic is presented and discussed. The questions explored in this thesis were raised during these EBM meetings. For example, in cases where the conclusion of a critically appraised topic was that there was “absence of evidence”, we would formulate a research question that was suitable for investigation in our own patient population. Although the medical topics addressed in this thesis are not necessarily either cutting-edge or new, they all originated from our own paediatric practice, emphasising the clinical relevance.

Diagnostic and clinimetric research

“Much effort is directed towards optimising doctor-patient communication and avoiding misunderstandings. The language of everyday diagnostic reasoning as it routinely occurs among doctors in teaching hospitals could benefit from similar attention”

Matt Bianchi (reference 17)

Making a diagnosis involves balancing evidence. Several differential diagnoses may have to be considered and the final diagnosis seldom relies entirely on a single test.7,14 Moreover, it is important to realise that apart from laboratory tests, radiology and other diagnostic tests, all patient features and thus all elements of the clinical history and physical examination qualify as diagnostic tests.6 As a result, evaluating the value of diagnostic tests or measurements is complex. Moreover, it can be addressed in different ways, depending on the nature of the diagnostic research question.15 The fact that the field of diagnostic research is developing strongly and that international consensus on the methods for assessing diagnostic tests is lacking16,18,19 further increases the complexity of this part of clinical decision making.

There are various ways to look at diagnostic research. One such method divides diagnostic research into four phases, as shown in Table 1.15,16,20 The first phase deals with the question of whether the test results in affected patients differ from those in healthy persons, and whether patients with a certain test result are more likely to have the target disease.15,19 This phase is especially important when evaluating a new diagnostic test. In the next phase, diagnostic accuracy is determined through case-control studies, which include the assessment of the predictive value in patients with suspected disease.16 The accuracy of a diagnostic test is its ability to correctly identify patients with or without the disease, referred to as sensitivity and specificity, respectively.21 Sensitivity relates to the number of false negatives (a negative result in the presence of disease) and specificity to the number of false positives (a positive result in the absence of disease).7 In such diagnostic accuracy studies, an independent, blind comparison of the test results with a reference standard, using a study population consisting of patients suspected of having the target disease, is critical for adequate validity.15,22 Once the specificity and sensitivity of a test have been established, the next question is whether tested patients are better off than similar but untested patients.16 These studies are necessary to evaluate the beneficial and harmful effects of implementing a diagnostic test and usually require a randomised controlled trial.16 In the fourth and final phase, the effects of introducing a diagnostic test into routine clinical practice are assessed by surveillance or cohort studies.
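As a small illustration of how sensitivity and specificity follow from a comparison of test results with a reference standard, the sketch below computes them from a two-by-two table. The class name and the counts are hypothetical and serve only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from comparing an index test with a reference standard."""
    true_pos: int   # test positive, disease present
    false_pos: int  # test positive, disease absent
    false_neg: int  # test negative, disease present
    true_neg: int   # test negative, disease absent

    def sensitivity(self) -> float:
        # Proportion of diseased patients correctly identified.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Proportion of non-diseased patients correctly identified.
        return self.true_neg / (self.true_neg + self.false_pos)

    def positive_predictive_value(self) -> float:
        # Proportion of positive results that are correct.
        return self.true_pos / (self.true_pos + self.false_pos)

    def negative_predictive_value(self) -> float:
        # Proportion of negative results that are correct.
        return self.true_neg / (self.true_neg + self.false_neg)


# Hypothetical study of 200 patients suspected of having the target disease.
table = TwoByTwo(true_pos=45, false_pos=20, false_neg=5, true_neg=130)
print(f"sensitivity {table.sensitivity():.2f}")
print(f"specificity {table.specificity():.2f}")
print(f"PPV {table.positive_predictive_value():.2f}")
print(f"NPV {table.negative_predictive_value():.2f}")
```

Unlike sensitivity and specificity, the two predictive values depend on how common the disease is in the study population, which is one reason the choice of study population matters for validity.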

Table 1. Four phases in the architecture of diagnostic research

Phase | Aim | Study design
Phase 1 | Determining the normal range of values | Observational studies in healthy people
Phase 2 | Determining the diagnostic accuracy | Case-control study in patients with suspected disease
Phase 3 | Determining the clinical consequences of a diagnostic test | Randomised trial
Phase 4 | Determining the effects of introducing a diagnostic test | Surveillance, cohort studies

Adapted from Gluud and Sackett.16,18

Apart from diagnostic accuracy and validity, other aspects are important when measurements are being evaluated. Since diagnosis is seldom predicted by a single symptom or sign, numerous composite scores have been developed – comprising combinations of several different clinical signs and symptoms and sometimes also laboratory values – to help physicians in making a diagnosis. The methodology used to evaluate such scores or scales is called clinimetrics, and it is used to investigate reliability or agreement, accuracy, measurement error and utility, for example. Clinimetrics is a methodological discipline that focuses on the quality of measurement in medical research.23

Despite the fact that diagnostic accuracy or clinimetric studies are commonly used to evaluate newly available diagnostic tests, it is important to realise that many routinely used measurements – in particular elements from history taking or physical examination – have never been evaluated according to the clinimetric principles mentioned above. Below, in the outline of the thesis, we describe how we have used different study designs to explore our research questions.

Outline of the thesis

The main goal of this thesis was to explore the validity, reliability, accuracy and utility of commonly used measurements in paediatrics. The clinical topics addressed in this thesis all arose from the day-to-day practice of paediatricians at Isala, a large teaching hospital in Zwolle, The Netherlands. In the following paragraphs we briefly introduce these topics and discuss the design of each study in relation to the architecture of diagnostic research as presented in Table 1.

Chapter 1 Semi-quantitative measurement of glucosuria in neonates

In this chapter, we present a study we performed to investigate the validity, reliability and utility of a simple diagnostic tool frequently used in many areas of medicine – not only in the field of paediatrics – namely, semi-quantitative measurement of glucosuria. Disturbed glucose homeostasis is frequently encountered in prematurely born neonates. In this patient category with a limited amount of blood (the circulating blood volume in an infant weighing 1000 grams is approximately 80 ml), it is vitally important to restrict blood sampling. This reduces the need for blood transfusion and avoids infant discomfort as much as possible. The bedside measurement of glucosuria with a reagent strip is non-invasive and cheap, making it a potentially feasible and suitable method for assessing glucose balance in premature neonates. Despite the widespread use and availability of this semi-quantitative measurement of glucosuria, data on the validity and reliability of this method are sparse. We wanted to establish the validity of this semi-quantitative measurement compared to the quantitative laboratory measurement, and whether results obtained by different professionals can be used interchangeably. Finally, we assessed whether the specific circumstances in the setting of a neonatal intensive care unit (NICU) influenced the measurement, by examining the impact of the way in which urine is collected from diapers and of the temperature in the environment (incubators).

To clarify these issues, we performed an observational study in an experimental setting, in which we used 300 experimentally derived urine samples tested under NICU circumstances – assessed independently by three different nurses – and compared the results of the reagent strips with the urinary glucose concentrations measured quantitatively in the laboratory.
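Agreement between observers who read the same strips in categories is often summarised with Cohen's kappa, which corrects the observed agreement for agreement expected by chance. The sketch below is only an illustration of that statistic for two observers; this introduction does not state which agreement statistic the study used, and the strip readings shown are invented.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two observers rating the same samples."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers must rate the same non-empty set")
    n = len(ratings_a)
    # Observed agreement: proportion of samples given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each observer's category proportions.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented readings of eight urine samples by two nurses (hypothetical data).
nurse_1 = ["neg", "neg", "1+", "2+", "1+", "neg", "2+", "3+"]
nurse_2 = ["neg", "1+", "1+", "2+", "1+", "neg", "3+", "3+"]
kappa = cohens_kappa(nurse_1, nurse_2)
print(f"kappa = {kappa:.2f}")
```

Because strip categories are ordered, a weighted kappa that penalises large disagreements more than adjacent-category disagreements may be more appropriate in practice; the unweighted form is shown here for brevity.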

Chapter 2 Early diagnosis of late-onset neonatal sepsis in preterm infants

Late-onset sepsis is a major health problem in neonatology, with considerable morbidity and mortality. Early and accurate diagnosis is difficult because symptoms are subtle and nonspecific. While late diagnosis may lead to mortality and morbidity, overtreatment may lead to unnecessary hospital stay and use of antibiotics. The unnecessary use of empiric antibiotics should be minimised to limit growing microbial resistance and to avoid any possible harmful effects on gastrointestinal immunity and allergy. Many attempts have been made to improve the early and accurate diagnosis of late-onset neonatal sepsis (LONS) in preterm neonates. In our neonatal intensive care unit, when deciding whether or not LONS was present in prematurely born neonates suspected of having this condition, the presence of glucosuria was one of the clinical factors considered, although a literature search yielded no studies to support this policy. For this reason, we explored the predictive value of various clinical symptoms, including glucosuria, as markers for LONS. We did this by performing an observational study in 350 premature infants during a two-year period (2005-2007). We prospectively collected daily measurements on glucosuria in all infants – with the treating physicians blinded to the results – and followed these children for the occurrence of septic episodes, thus creating a case-control study within this prospective cohort.

Chapter 3 Fluid balance charts in neonates

Recording a patient’s fluid balance is a time-honoured tradition, frequently used in many areas in medical practice, but rarely investigated for its reliability and utility. During daily ward rounds, we were often confronted with the results of fluid balances being inconsistent with daily weight changes or with the clinical condition of patients. Such discrepancies sometimes prompted additional investigations (occasionally including extra laboratory tests), as well as recalculations and lengthy discussions during ward rounds, which usually ended with the conclusion that the recordings of fluid balance must have been inaccurate. These experiences prompted us to explore whether fluid balance recording is reliable, and whether it improves the health outcomes of the patients involved. To answer these questions, in 2009-2010 we performed a randomised controlled trial in 172 neonates admitted to our high-care neonatal unit. In one group, physicians were blinded to the fluid balance data, while in the other group fluid balance data were made available to attending physicians. We also investigated reliability in this study population by comparing the data on daily fluid balance with that on daily changes in body weight.
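One common way to compare two measurements of the same quantity, such as recorded fluid balance and change in body weight, is to look at the mean difference (bias) and the 95% limits of agreement. The sketch below illustrates that approach only; the introduction does not specify the analysis used in the study, the data are invented, and it assumes that 1 ml of fluid corresponds to roughly 1 g of body weight.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    # 95% limits: bias plus or minus 1.96 standard deviations of the differences.
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Invented daily values for five infants (hypothetical data).
fluid_balance_ml = [20, -15, 35, 5, -10]
weight_change_g = [10, -30, 60, -5, 0]
bias, lower, upper = limits_of_agreement(fluid_balance_ml, weight_change_g)
print(f"bias {bias:.1f}, limits of agreement {lower:.1f} to {upper:.1f}")
```

A bias close to zero with wide limits, as in this invented example, would indicate that the two measurements agree on average but can differ substantially for an individual patient on a given day.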

Chapter 4 Clinical assessment of dyspnoea in children

Dyspnoea (difficulty in breathing) in children is one of the most commonly encountered problems in paediatrics. Evaluation of the severity of dyspnoea is important in clinical decision making and evaluation of treatment. Because pulmonary function tests are not readily available in the emergency department – particularly not for preschool children – the severity of dyspnoea is primarily assessed by physical examination. Since no single clinical finding accurately reflects dyspnoea severity, in children it is usually assessed using composite dyspnoea scores, which comprise a range of clinical findings. When we were developing a clinical practice guideline for patients hospitalised with dyspnoea – for example with a diagnosis of asthma or viral wheeze – we wanted to use a well-validated dyspnoea score. We therefore performed a systematic review that assessed the validity of all published dyspnoea scores. We also explored the intraobserver and interobserver variability of the individual clinical items used in existing dyspnoea scores by asking several professionals to assess video recordings of 27 dyspnoeic children.


Chapter 5 Viral tests for cohort isolation in children hospitalised for bronchiolitis

Every winter, the paediatric wards in many countries, including those in the Netherlands, are crowded with infants with bronchiolitis, commonly caused by the respiratory syncytial virus (RSV). This epidemic causes major logistic problems because it is common practice to separate patients infected with RSV from those infected with other viruses (cohort isolation). This is done to avoid nosocomial co-infection, which is assumed to be associated with more severe disease. For this purpose, almost all patients with bronchiolitis undergo a rapid diagnostic RSV test. Confronted with a shortage of hospital rooms for isolating patients during the peak season, we investigated whether such cohort isolation was really necessary, thus questioning the use of expensive viral testing. As a proof of principle, during the winter season of 2011-2012, we first performed an observational cohort study in 48 patients who only shared rooms during the first day of admission, pending the results of the rapid RSV test. In the following season (2012-2013) we continued this study, including 65 patients who shared rooms during their entire stay in hospital.

General discussion and future perspectives

In this chapter we summarise our main findings in relation to the currently available literature. We discuss pitfalls and implications for clinical and research practice and offer suggestions for future research.

References

1. Ghosh AK. Dealing with medical uncertainty: a physician’s perspective. Minn Med 2004;87:48-51.
2. Hayward R. Balancing certainty and uncertainty in clinical medicine. Dev Med Child Neurol 2006;48:74-77.
3. Redberg R, Katz M, Grady D. Diagnostic tests: another frontier for less is more: or why talking to your patient is a safe and effective method of reassurance. Arch Intern Med 2011;171:619.
4. Rolfe A, Burton C. Reassurance after diagnostic testing with a low pretest probability of serious disease. Systematic review and meta-analysis. JAMA Intern Med 2013;173:407-416.
5. Kroenke K. Diagnostic testing and the illusory reassurance of normal results. JAMA Intern Med 2013;173:416-417.
6. Gill CJ, Sabin L, Schmid CH. Why clinicians are natural Bayesians. BMJ 2005;330:1080.
7. Draper R. Coping with uncertainty in primary care. 2010, © EMIS document ID 1541, version 23. www.patient.co.uk/doctor/coping-with-uncertainty-in-primary-care.
8. Moynihan R, Doust J, Henry D. Preventing overdiagnosis: how to stop harming the healthy. BMJ 2012;344:e3502.
9. Grady D, Redberg RF. Less is more: how less health care can result in better health. Arch Intern Med 2010;170:749-50.
10. Levi M. Kweek bewustzijn bij dokters over wat dingen werkelijk kosten. Medisch Contact 2013:1709.
11. Lincke, Del Canho, Akker v/d C. Overbehandeling is vaak onvermijdelijk. Medisch Contact 46/2012:2602-2604.
12. Willems H. Overdiagnostiek: bedreiging en kostenpost. Medisch Contact 31/2012:1846.
13. Hayward R. VOMIT (victims of modern imaging technology) – an acronym for our times. BMJ 2003;326:1273.
14. Straus SE. Bridging the gaps in evidence based diagnosis. BMJ 2006;333:405-406.
15. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ 1996;312:71-72.
16. Gluud C, Gluud L. Evidence based diagnostics. BMJ 2005;330:724-726.
17. Bianchi MT, Alexander BM. Evidence based diagnosis: does the language reflect the theory? BMJ 2006;333:442-5.
18. Sackett D, Haynes RB. The architecture of diagnostic research. BMJ 2002;324:539-541.
19. Knottnerus JA, Buntinx F. The evidence base of clinical diagnosis, theory and methods of diagnostic research. Second edition 2009 Wiley Blackwell, BMJ Books.
20. Ferrante di Ruffano L, Hyde CJ, McCaffery KJ, Bossuyt PMM, Deeks JJ. Assessing the value of diagnostic tests: a framework for designing and evaluating trials. BMJ 2012;344:e686.
21. De Groot JAH, Bossuyt PMM, Reitsma JB, Rutjes AWS, Dendukuri N, Janssen KJM, Moons KGM. Verification problems in diagnostic accuracy studies: consequences and solutions. BMJ 2011;343:d4770.
22. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Lijmer JG, Moher D, Rennie D, de Vet HCW for the STARD group. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Clin Chem 2003;49:1-6. www.consort-statement.org/stardstatement.htm/
23. De Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in medicine: a practical guide. Cambridge: Cambridge University Press; 2011.


Chapter 1

Semi-quantitative measurement of glucosuria in neonates


Validity and interobserver agreement of reagent strips for measurement of glucosuria

Jolita Bekhof, Boudewijn Kollen, Liesbeth Groot-Jebbink, Corrie Deiman, Sjef van de Leur, Irma van Straaten

Scand J Clin Lab Invest 2011 May;71:248-52.


Abstract

Background: Measurement of glucosuria by means of a visually readable reagent test strip is frequently used in a wide variety of clinical settings. The aim of this study was to evaluate the validity and reliability of this semi-quantitative measurement of glucosuria compared to laboratory measurement of glucose concentrations in urine.

Methods: Reagent test strips (Combur3Test®, Roche) from 375 artificially supplemented samples of urine, covering a wide range of glucose concentrations, were independently read by 3 different observers. Scores of the strips were categorized as 0, 1+, 2+, 3+ or 4+, in ascending degree of glucosuria. Results of the test strips were compared to the quantitative measurement of urinary glucose concentration in the laboratory.

Results: 21.7% of reagent strip readings were discordant with the laboratory measurements (p<0.001). Under- or overestimation of the degree of glucosuria occurred predominantly in categories 1+ and 2+. In category “0” only 5.1% of the readings were incorrect. The interobserver agreement was very good, with 85% overall agreement and multirater Kappa 0.81. Interobserver scores of the reagent strips never deviated more than 1 category from each other.

Conclusion: The validity as well as the interobserver agreement for the semi-quantitative measurement of glucosuria using reagent strips is moderate, but sufficient for excluding glucosuria. However, it is too imprecise for an accurate quantitative measurement. It might only be valuable in settings where automated readings are not available or suitable.


Introduction

Glucosuria is frequently measured in a wide variety of clinical settings, for example in diabetes, renal disease or critical illness.1 Although monitoring of glucosuria in diabetes is no longer routinely recommended, urine glucose measurements may still be advocated to check for inappropriate use of blood examinations or for those patients unwilling to use blood sampling.2,3 Moreover, in less favourable economic situations, urine glucose monitoring is better than no monitoring.3,4 In newborns undergoing intensive care, hyperglycemia is also a common problem, especially in very low birth weight infants.5 In this patient category, with only limited availability of blood due to very low body weights (circulating blood volume in an infant weighing 1000 gram is approximately 80 mL), it is of major importance to restrict blood sampling in order to avoid the need for blood transfusions.6

Urinalysis can be performed quantitatively in the laboratory by spectrophotometry or semi-quantitatively by using reagent strips (dipsticks) that are interpreted visually. Testing urine for glucosuria using a reagent strip relies on a chemical reaction (glucose-oxidase/peroxidase reaction) between the reagent on the strip and the glucose in the urine. The degree of glucosuria is determined by the change in color of the test strip after briefly dipping the strip in urine. The analysis of the strip can be performed (semi-)automatically or by visual reading. Logically, (semi-)automated reading of strips rules out human subjectivity in identifying the colors and will in general be preferred over visual reading of the reagent strips. However, automated readers are neither available nor suitable in all settings. The use of visually read reagent strips for measurement of glucosuria is quick, easy and economical and thus frequently applied as a point-of-care test.1 The advantage of having direct bedside results makes it suitable for use in hospital as well as outpatient settings. Moreover, in neonates, in whom collection of urine is impeded by the limited amounts available and the use of diapers, direct bedside visual reading of reagent strips is more suitable.7 In low birth weight infants it is often not feasible to collect sufficient amounts of urine. As a consequence, in order to enable urinalysis, reagent strips are pressed directly on the wet diaper.

Despite its widespread use and availability, reports on the validity and reliability of this semi-quantitative method are sparse and show varying results.8,9,10,11,12 Reports on interobserver variability, which is relevant since the reagent strips are read visually, are also lacking.

Validity is defined as the extent to which a test measures what it is intended to measure. Reliability is defined as the extent to which a test is reproducible or consistent in different settings, for example by different observers.13 The aim of this study is to determine the validity and interobserver agreement of the measurement of glucosuria by visually read reagent strips compared with laboratory-performed urinalysis.


Methods

We used Combur reagent strips (Roche) to test artificially supplemented (contrived) urine, intended to simulate pathological specimens. Discriminatory power is defined as the ability of a test to discriminate among subjects, i.e. test results should be spread along the entire possible range.13 Hence, five different quantities of glucose (Merck, Darmstadt, Germany), corresponding with the midrange glucose concentrations provided by the package insert of Combur strips, were added to freshly collected urine of healthy volunteers. A total of 375 urine samples, 75 of each of 5 different glucose concentrations, were thus contrived. Glucose concentrations of the contrived urine samples were as follows: specimen A 0 mmol/L, specimen B 2.8 mmol/L, specimen C 5.5 mmol/L, specimen D 17 mmol/L and specimen E 55 mmol/L. According to the package insert of Combur strips, the corresponding ranges were 0.6-5.0 mmol/L for score 1+, 3.3-7.7 mmol/L for score 2+, 14.8-19.2 mmol/L for score 3+ and 52.8-57.2 mmol/L for score 4+.

The reagent strips were dipped in the urine in a randomised order of glucose concentrations. All strips were independently read between 60 and 120 seconds after contact with the urine sample, by 3 nurses at the same time, who were unaware of the randomisation order. The observers were blinded to the scores of the others by dividers, practically eliminating visual or any other form of contact between observers. The glucose concentration of each urine sample was also directly measured in the laboratory by the Gluco-quant glucose/HK method (Roche Diagnostics, Mannheim, Germany) on a Modular P800 (Hitachi, Tokyo, Japan). This is an enzymatic hexokinase method in which the rate of NADPH formation is measured photometrically. The laboratory staff was not aware of the reagent strip results.

Statistical methods

To assess whether the reagent strips scores were in agreement with the true glucose concentrations, we compared the test results of 1125 (3 times the scores of the 375 urine samples) categorized ratings of the reagent strips with the categorized ranges of the glucose concentrations measured in the laboratory, using Chi-square tests.

Interrater agreement was evaluated by calculating raw agreement and agreement adjusted for chance (weighted Kappa for agreement between two observers and Fleiss’ multi-rater Kappa for agreement amongst all 3 observers). The rationale for using these two methods is as follows. Raw agreement, the proportion of reagent strip ratings in which 2 observers rate the same category of glucosuria, can be misleading. In particular, if two observers both make a high or low proportion of positive ratings for a category, raw agreement will be high even if the observers are just guessing. That is, their agreement will be high simply by chance. High agreement by chance tends to occur when two raters believe the prevalence of a clinical entity of interest is high or low in the population under study.14 Because of this problem with raw agreement, we calculated chance-corrected agreement using the Kappa statistic. Kappa is defined as the observed agreement beyond expected (i.e. chance) agreement, divided by the maximum agreement possible beyond chance. Weighting of the Kappa statistic allows partial credit for discordant observations, with greater weight given when the categories assigned by the different observers are closer together.15 Given the ordinal nature of the semi-quantitative test strip score, a weighted Kappa using quadratic weights was used to quantify reliability. Weighted Kappa was calculated for the 3 observer pairs using http://faculty.vassar.edu/lowry/kappa.html. Fleiss’ multi-rater Kappa for chance-corrected agreement between 3 observers was calculated using http://justusrandolph.net/kappa/.16
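The weighted Kappa described above can be sketched in a few lines of Python. This is an illustrative implementation of Cohen's weighted Kappa with quadratic disagreement weights, not the code behind the online calculator used in the study. The function name `weighted_kappa` and the variable names are assumptions made for this example. The counts are the Observer 1 versus Observer 3 cross-tabulation from Figure 2. Because the exact settings of the calculator are not described, the coefficient computed here should not be expected to match the published value digit for digit.

```python
def weighted_kappa(table, power=2):
    """Cohen's weighted Kappa for a square table of counts.

    table[i][j] is the number of strips rated category i by the first
    observer and category j by the second. power=2 gives quadratic
    weights, power=1 gives linear weights.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) ** power  # zero on the diagonal
            observed += weight * table[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / n**2
    return 1.0 - observed / expected


# Observer 1 (rows) versus Observer 3 (columns), categories 0 to 4+ (Figure 2).
obs1_vs_obs3 = [
    [66, 5, 0, 0, 0],
    [0, 59, 1, 0, 0],
    [0, 20, 48, 2, 0],
    [0, 0, 13, 77, 3],
    [0, 0, 0, 6, 75],
]

n_strips = sum(sum(row) for row in obs1_vs_obs3)
raw_agreement = sum(obs1_vs_obs3[i][i] for i in range(5)) / n_strips
kappa_w = weighted_kappa(obs1_vs_obs3)

print(f"raw agreement: {raw_agreement:.3f}")
print(f"quadratic weighted Kappa: {kappa_w:.3f}")
```

Because every disagreement in this table is only one category apart, the quadratic weights penalise it lightly, which is why the weighted Kappa is much closer to 1 than the raw agreement.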

We interpreted Kappa coefficients as follows: values of less than 0, poor; 0 to 0.2, slight; 0.2 to 0.4, fair; 0.4 to 0.6, moderate; 0.6 - 0.8, substantial agreement; and 0.8 - 1.0 represent almost perfect agreement.17
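The interpretation bands listed above can be written as a small lookup. This is only a sketch: the function name `interpret_kappa` is hypothetical, and because the text gives touching boundaries (for example "0 to 0.2" and "0.2 to 0.4"), the choice to assign a boundary value to the lower band is an assumption.

```python
def interpret_kappa(kappa):
    """Map a Kappa coefficient to the verbal label used in this study.

    Boundary values (0.2, 0.4, ...) are assigned to the lower band;
    that tie-breaking rule is an assumption, not stated in the text.
    """
    if kappa < 0:
        return "poor"
    bands = [
        (0.2, "slight"),
        (0.4, "fair"),
        (0.6, "moderate"),
        (0.8, "substantial"),
        (1.0, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("Kappa cannot exceed 1")


for value in (-0.1, 0.15, 0.5, 0.75, 0.811):
    print(value, interpret_kappa(value))
```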

Figure 1. Agreement between reagent strips and “true” glucose concentrations as measured in the laboratory. The “true glucose category” is the categorized measurement of the glucose concentration as measured in the laboratory by spectrophotometry. The “glucosticks” provide the associated result of the visually read reagent strip. The width of the stripes represents the number of observations.


Results

Figure 1 shows the results of the level of agreement between the categorized scores of the reagent strips and the categorized test results of the “true” glucose concentrations as measured in the laboratory.

Table 1 presents the percentages of reagent strips predicting the correct true glucose concentrations. Of 1125 glucosticks, 21.7% were discordant with the laboratory measurements (p<0.001). Under- or overestimation of the degree of glucosuria by the reagent strips occurred predominantly in categories 1+ and 2+, with 34.2% and 38.5% of sticks, respectively, incorrectly predicting the degree of true glucosuria within these categories. In category “0” only 5.1% of readings were incorrect. The maximum urinary glucose concentration incorrectly categorized as “0” (negative for glucosuria) was 2.55 mmol/L.

The difference in the visual assessment scores of the reagent strips was never more than 1 category. Multi-rater Kappa was 0.811, representing almost perfect agreement, with 85% overall agreement.
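For readers unfamiliar with the multi-rater statistic, the following is a minimal sketch of Fleiss' fixed-marginal Kappa. It is not the code of the online calculator used in the study, and the per-strip ratings of the three observers are not given in the text, so the `toy_counts` data below are purely hypothetical and serve only to show the input format.

```python
def fleiss_kappa(counts):
    """Fleiss' Kappa for several raters.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Per-subject agreement: share of rater pairs that agree.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall share of ratings per category.
    shares = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: 4 strips, 3 observers, categories 0 to 4+.
toy_counts = [
    [3, 0, 0, 0, 0],  # all three observers read "0"
    [0, 2, 1, 0, 0],  # two read "1+", one read "2+"
    [0, 0, 3, 0, 0],
    [0, 0, 0, 1, 2],
]
toy_kappa = fleiss_kappa(toy_counts)
print(f"Fleiss' Kappa for the toy data: {toy_kappa:.3f}")
```

Unlike the weighted Kappa, this statistic treats the categories as nominal, so a disagreement of one category counts the same as a disagreement of four.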

Table 1. Diagnostic accuracy of reagent strips for glucose for each rating category

Result of reagent strip   Concordance with “true” category   Underestimating 1 category    Overestimating 1 category
for glucose               (measured in the laboratory)       (“true” category 1 higher)    (“true” category 1 lower)
0                         205/216 (94.9%)                    11/216 (5.1%)                 Not applicable
+                         144/219 (65.8%)                    55/219 (25.1%)                20/219 (9.1%)
++                        123/200 (61.5%)                    7/200 (3.5%)                  70/200 (35.0%)
+++                       196/255 (76.9%)                    12/255 (4.7%)                 47/255 (18.4%)
++++                      213/235 (90.6%)                    Not applicable                22/235 (9.4%)

Each of 5 categories of glucose concentration was represented by 225 urine samples
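The summary percentages quoted in the text can be recomputed from the counts in Table 1. The sketch below is only an arithmetic check; the dictionary layout and variable names are assumptions made for this example.

```python
# Counts from Table 1, keyed by reagent strip result:
# (concordant, underestimating, overestimating, total readings)
table1 = {
    "0":    (205, 11, 0, 216),
    "+":    (144, 55, 20, 219),
    "++":   (123, 7, 70, 200),
    "+++":  (196, 12, 47, 255),
    "++++": (213, 0, 22, 235),
}

# Every row should add up to its stated total.
rows_consistent = all(c + u + o == t for c, u, o, t in table1.values())

total_readings = sum(t for _, _, _, t in table1.values())
total_concordant = sum(c for c, _, _, _ in table1.values())
overall_discordant_pct = 100 * (1 - total_concordant / total_readings)

# Discordant share within each strip category.
discordant_pct = {
    category: 100 * (u + o) / t for category, (c, u, o, t) in table1.items()
}

print(f"readings: {total_readings}")
print(f"overall discordant: {overall_discordant_pct:.1f}%")
for category, pct in discordant_pct.items():
    print(f"{category:>4}: {pct:.1f}% discordant")
```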

Figure 2 presents raw data on agreement between the three observer pairs. The interobserver agreement, as presented by percentage of agreement and Kappa coefficients, was very good (Table 2).

Discussion

Principal findings

In this study the validity of the semi-quantitative measurement of glucosuria by visually read reagent strips was moderate, with more than one-fifth of reagent strip readings incorrectly predicting the laboratory measurement. Under- or overestimation of the degree of glucosuria occurred predominantly in categories 1+ and 2+. In category “0” only 5.1% of readings were incorrect. Thus, the validity of the reagent strips for ruling out glucosuria is fairly good. The discriminative power between the lower categories of glucosuria (1+ and 2+) is low, probably because the differences in concentration between the lower categories are smaller than those between the higher categories. This is confirmed by product information from the manufacturer, which indicates overlapping ranges for glucose concentrations between the lower categories 1+ (0.6-5.0 mmol/L) and 2+ (3.3-7.7 mmol/L).

We found a high degree of interobserver agreement in the assessment of the reagent strips, demonstrated by high Kappa values. The scores of the different observers never deviated more than one category from each other, which seems fairly adequate for clinical use.

Observer 1 (rows) versus Observer 2 (columns)

           0     +    ++   +++  ++++  Total
0         71     0     0     0     0     71
+          8    51     1     0     0     60
++         0    24    44     2     0     70
+++        0     0    25    67     1     93
++++       0     0     0     9    72     81
Total     79    75    70    78    73    375

Observer 1 (rows) versus Observer 3 (columns)

           0     +    ++   +++  ++++  Total
0         66     5     0     0     0     71
+          0    59     1     0     0     60
++         0    20    48     2     0     70
+++        0     0    13    77     3     93
++++       0     0     0     6    75     81
Total     66    84    62    85    78    375

Observer 2 (rows) versus Observer 3 (columns)

           0     +    ++   +++  ++++  Total
0         66    13     0     0     0     79
+          0    66     9     0     0     75
++         0     5    50    15     0     70
+++        0     0     3    69     6     78
++++       0     0     0     1    72     73
Total     66    84    62    85    78    375

Figure 2. Raw data of agreement in rating glucosuria with reagent strips between the three observer pairs

Table 2. Agreement between raters for test results of the reagent strips

                 Rater 1 versus 2   Rater 1 versus 3   Rater 2 versus 3
Raw agreement    83.7%              86.7%              86.1%
Kappa            0.942              0.952              0.827


Strengths and weaknesses of the study

The strength of our study is the use of a large number of samples covering the complete range of glucose concentrations. This is of vital importance when using a Kappa value to evaluate interobserver agreement, in order to prevent misleadingly high or low Kappa values. Another strength is that we thoroughly investigated interobserver variability, with adequate blinding of the observers to each other's ratings to prevent bias in this respect.

This study also has several limitations. The use of predefined, categorized glucose concentrations scored under optimally controlled study conditions may affect the generalizability of these results to real-life clinical practice. In clinical settings, the reagent strip is read under variable circumstances, with varying light and temperature, by staff with varying levels of training. In this study, all dippings of the reagent strips were done by the same person, strictly according to the manufacturer’s instructions, under controlled circumstances. Another concern that is relevant for clinical practice is the influence of medication on the results of the reagent strip. For example, high doses of salicylates or levodopa may produce false negative results, and high doses of ascorbic acid or the presence of ketones may depress the development of color on the reagent strip.1 Other factors that may threaten validity in real-life clinical practice are the timing of reading the strips (reagent strips are designed to react progressively, producing colour changes along the strip at specified times), the use of strips after the expiry date and improper storage of the strips.1 Since automated urinary strip readers will not be affected by the above-mentioned factors, automated reading is likely to be a more valid method.18

In summary, the semi-quantitative measurement of glucosuria using visually read reagent strips seems to be a valid method to rule out glucosuria and might be valuable in neonatal intensive care settings and in situations where automated readings are not available.

Acknowledgement

The authors gratefully acknowledge Carin Bunkers, Annelies Vogelszang and Gerrie Veneklaas for their help with the assessment of the reagent strips.

References

1. Wilson LA. Urinalysis. Nurs Stand 2005;19:51-4.

2. American Diabetes Association. Standards of medical care in diabetes - 2010. Diabetes Care. 2010;33 Suppl 1:S11-61.

3. European Confederation of Laboratory Medicine. European urinalysis guidelines. Scand J Clin Lab Invest Suppl 2000;S231:1-86.

4. Feleke Y, Abdulkadir J. Urine glucose testing: another look at its relevance when blood glucose monitoring is unaffordable. Ethiop Med J 1998;36:93-9.

5. Mitanchez D. Glucose regulation in preterm newborn infants. Horm Res 2007;68:265-71.

6. Madsen LP, Rasmussen MK, Bjerregaard LL, Nøhr SB, Ebbesen F. Impact of blood sampling in very preterm infants. Scand J Clin Lab Invest 2000;60:125-32.

7. Wilkins BH. Renal function in sick very low birth weight infants: 4. glucose excretion. Arch Dis Child 1992;67:1162-5.

8. Kirkland JA, Morgan HG. An assessment of routine hospital urine testing for protein and glucose. Scot Med J 1961;6:5513-9.

9. Assa S. Evaluation of urinalysis methods used in 35 Israeli laboratories. Clin Chem 1977;23:126-8.

10. Gupta RC, Goyal A, Singh PP. Reliability of urinalysis for glucose. 1982;28:1724.

11. Olesen H, Mortensen H, Mølsted-Pedersen L. More on reliability of reagent-strips for glucose. Clin Chem. 1983;29:212.

12. Winkens RA, Leffers P, Degenaar CP, Houben AW. The reproducibility of urinalysis using multiple reagent test strips. Eur J Clin Chem Clin Biochem 1991;29:813-8.

13. Knottnerus JA, Buntinx F. The evidence base of clinical diagnosis: theory and methods of diagnostic research. Second edition Wiley-Blackwell, BMJ books. Oxford 2009.

14. Meade MO, Cook RJ, Guyatt GH, Groll R, Kachura JR, Bedard M, Cook DJ, Slutsky AS, Stewart TE. Interobserver variation in interpreting chest radiographs for the diagnosis of acute respiratory distress syndrome. Am J Resp Crit Care Med 2000;161:85-90.

15. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 1968;70:213-20.

16. Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull 1971;76:378-82.

17. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-74.

18. Penders J, Fiers T, Delanghe JR. Quantitative evaluation of urinalysis test strips. Clin Chem. 2002;48:2236-41.

19. Goldstein DE, Little RR, Lorenz RA, Malone JI, Nathan DM, Peterson CM, American Diabetes Association. Tests of glycemia in diabetes. Diabetes Care 2003;26 Suppl 1:S106-8.


Reliability of reagent strips for semi-quantitative measurement of glucosuria in a neonatal intensive care setting

Jolita Bekhof, Boudewijn Kollen, Sjef van de Leur, Joke Kok, Irma van Straaten

Accepted for publication in Pediatrics & Neonatology


Abstract

Background: In preterm infants, measurement of glucosuria using a visually readable reagent strip is frequently applied, for example in monitoring of total parenteral nutrition, during sepsis or during corticosteroid use. However, specific circumstances in a neonatal intensive care unit (NICU) setting, such as the use of diapers and the high temperature in incubators, could affect its reliability.

Objectives: To evaluate the reliability of the semi-quantitative measurement of glucosuria under the specific circumstances of a NICU setting.

Methods: A total of 900 assessments of artificially glucose-supplemented urine samples, kept in diapers, were performed under the following varying circumstances: environmental temperature (21 and 34 degrees Celsius), different periods of contact time of the urine in the diaper, and two different methods of collecting urine from the diaper. Each reagent strip was read independently by three observers. Scores of the test strips were categorized as 0, 1+, 2+, 3+ or 4+, in ascending degree of glucosuria.

Results: Agreement was excellent under all different circumstances (temperature κw 0.92; method of urine collection κw 0.88; time p 0.266). Interobserver reliability was very good (multirater Kappa 0.81). The deviation between the different circumstances was seldom larger than one category (2.9%). Reagent strip readings were concordant with the true urinary glucose concentrations in 79.0% of assessments. The discordance was never larger than one category.

Conclusion: The reliability of the semi-quantitative measurement of glucosuria in newborns using reagent strips is good, even under the circumstances of a NICU setting. Changes in rating of reagent strips of more than one category are most likely beyond measurement error.


Introduction

Disturbed glucose homeostasis is frequently encountered in preterm neonates.1-5 Measuring glucosuria in very prematurely born infants can be useful, for example in monitoring the administration of total parenteral nutrition,6 insulin7 or corticosteroids,8 during sepsis,9 or for evaluation of renal function.10-11 Measurement of glucosuria can be performed quantitatively in the laboratory by spectrophotometry or semi-quantitatively by using reagent strips that are interpreted visually at the bedside. The use of reagent strips for urinalysis is quick, easy and economical and thus frequently applied as a point-of-care test.12 Previously, we showed good inter-rater reliability of reagent strips for measurement of glucosuria (Kappa 0.81).13

However, the circumstances in a neonatal intensive or high care unit differ substantially from those described in the manufacturer’s product information. Contact of the urine with diapers and the high environmental humidity and temperature in the incubator are factors that may influence the results of the reagent strips. Another major issue in urinalysis in (premature) neonates is the minimal quantity of urine available. Often it is not possible to collect a single large sample of urine to dip the strip in, as recommended by the manufacturer’s product information. As a consequence, reagent strips are often used by pressing the strip directly on the wet diaper.14 To our knowledge, data on the reliability of reagent strips for the measurement of glucosuria in neonatology are lacking.

The aim of this study is to determine the reliability of the measurement of glucosuria by visually read reagent strips under circumstances specific to a neonatal intensive care unit.

Methods

We used Combur reagent strips (Combur3Test®, Roche) to test artificially supplemented (contrived) urine, intended to simulate pathological specimens. Scores of the reagent strips are categorized as 0, 1+, 2+, 3+ and 4+, in ascending degree of glucosuria, resulting in five different colors of the test strip (webappendix 1). After 60 seconds of contact between the urine and the test strip, the color of the test strip is compared to a standardized color scale, corresponding with the five different categories of glucosuria. According to the product information of Combur strips, the corresponding ranges were 0.6-5.0 mmol/l for score 1+, 3.3-7.7 mmol/l for score 2+, 14.8-19.2 mmol/l for score 3+ and 52.8-57.2 mmol/l for score 4+. Five different quantities of glucose, corresponding with the midrange glucose concentrations (0, 2.8, 5.5, 17.0 and 55.0 mmol/l, respectively), were added to freshly collected urine of healthy volunteers.


A total of 300 urine samples were used, consisting of 60 samples of each of 5 different glucose concentrations. The urine samples were distributed over disposable diapers with gauze inserts. Half of the diapers were kept at a room temperature of 21 degrees Celsius (°C) and half in an incubator at 34 °C. The reagent strips were read from urine collected by two different non-invasive methods: the “diaper-strip” and the “drip-strip”. The “drip-strip method” uses a piece of cloth (gauze) in the diaper. After compression of the wetted cloth, urine can be evacuated with a syringe and subsequently applied to the reagent strip. Gloves were worn during the procedure. Alternatively, reagent strips are frequently interpreted after pressing the strip in a wet diaper for a few seconds. This method will be referred to as the “diaper-strip method”. The reagent strips were applied in a randomized order of glucose concentrations. Reagent strips were read at five different times of contact between the diaper and urine (0, 30, 60, 120 and 180 minutes). All reagent strips were independently read by three different observers (intensive care neonatology nurses with at least 5 years of experience), between 60 and 120 seconds after contact with the urine sample.

Analysis of urinary glucose in the laboratory was done by the Gluco-quant glucose/HK method (Roche Diagnostics, Mannheim, Germany) on a Modular P800 (Hitachi, Tokyo, Japan). This is an enzymatic hexokinase method in which the rate of NADPH formation is measured photometrically. Urine collected by the diaper-strip method was used for this purpose. The laboratory staff was not aware of the reagent strip results.

Statistical methods

Assessments of agreement between the results of the reagent strips were quantified by calculating percentage agreement and weighted Kappa (κw) using quadratic weights (http://faculty.vassar.edu/lowry/kappa.html). We interpreted Kappa statistics as follows: poor (κ 0-0.40), fair (κ 0.41-0.75), or excellent (κ 0.76-1.0).15

We used non-parametric tests for the analysis of the effect of time spent in the diaper on the urine samples (Friedman’s ANOVA) and the effect of environmental temperature (Wilcoxon) on glucose concentrations.

All analyses, except for calculation of weighted Kappa, were performed in SPSS, version 18.0.

Results

Results of visual reading showed excellent agreement with the true concentration as measured by the laboratory (Table 1: κw 0.934, 95%CI 0.93-0.94, raw agreement 79.0%, 95%CI 76.2-81.6). Readings over- and underestimated the true degree of glucosuria to a similar extent (8.7% too low vs. 12.3% too high). Agreement was lowest for categories 1+ and 2+ (67.2% and 63.5% respectively, versus 94.0%, 77.9% and 88.6% for categories 0, 3+ and 4+ respectively).


As we showed in a previous report,13 interobserver reliability was good (85% overall agreement and multirater Kappa 0.81). Interobserver scores of the reagent strips never deviated more than one category from each other.

Table 1. Agreement in rating of glucosuria through reagent strips compared to true urinary glucose concentrations as measured by the laboratory.

Reagent strip (rows) versus true glucosuria (columns)

           0     +    ++   +++  ++++  Total
0        173    11     0     0     0    184
+          7   121    52     0     0    180
++         0    48    94     6     0    148
+++        0     0    34   152     9    195
++++       0     0     0    22   171    193
Total    180   180   180   180   180    900

κw 0.934 (95%CI 0.928-0.940)
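The raw agreement and the over- and underestimation percentages reported in the text follow directly from the counts in Table 1. The sketch below recomputes them; the variable names are assumptions made for this example, and the per-category agreement is taken relative to the reagent strip row totals, which is the denominator that reproduces the percentages quoted in the text.

```python
categories = ["0", "+", "++", "+++", "++++"]

# Rows: reagent strip result. Columns: true glucosuria category.
strip_vs_true = [
    [173, 11, 0, 0, 0],
    [7, 121, 52, 0, 0],
    [0, 48, 94, 6, 0],
    [0, 0, 34, 152, 9],
    [0, 0, 0, 22, 171],
]

k = len(categories)
n = sum(sum(row) for row in strip_vs_true)
concordant = sum(strip_vs_true[i][i] for i in range(k))

# Strip reads higher than the true category: row index above column index.
strip_too_high = sum(
    strip_vs_true[i][j] for i in range(k) for j in range(k) if i > j
)
# Strip reads lower than the true category: row index below column index.
strip_too_low = sum(
    strip_vs_true[i][j] for i in range(k) for j in range(k) if i < j
)

# Share of correct readings within each reagent strip category.
agreement_by_strip = {
    categories[i]: 100 * strip_vs_true[i][i] / sum(strip_vs_true[i])
    for i in range(k)
}

print(f"raw agreement: {100 * concordant / n:.1f}%")
print(f"too high: {100 * strip_too_high / n:.1f}%")
print(f"too low: {100 * strip_too_low / n:.1f}%")
for category, pct in agreement_by_strip.items():
    print(f"{category:>4}: {pct:.1f}% correct")
```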

Temperature

Results of the comparisons between the two different environmental temperatures are presented in Table 2. The agreement between the reagent strips used at the two different environmental temperatures was excellent (κw 0.921, 95%CI 0.909-0.933, raw agreement 74.9%, 95%CI 70.6-78.8). Samples kept at room temperature were rated higher than those kept in the incubator (higher in 21.6%, lower in 3.6%). This difference was significantly influenced by the time the urine spent in the diaper (p 0.027, Friedman): ratings of reagent strips of urine samples at room temperature were higher than ratings of samples at incubator temperature in 15.6%, 20.0%, 18.9%, 28.9% and 24.4% of assessments after 0, 30, 60, 120 and 180 minutes, respectively.

Table 2 Agreement in rating of glucosuria through reagent strips between two environmental temperatures

                         Room temperature
Incubator temperature    0      +      ++     +++    ++++   Total
0                        83     15     0      0      0      98
+                        3      55     48     0      0      106
++                       0      4      36     20     0      60
+++                      0      0      4      76     14     94
++++                     0      0      0      5      87     92
Total                    86     74     88     101    101    450

κw 0.921 (95%CI 0.909-0.933)
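The direction of disagreement between the two temperatures follows directly from the off-diagonal cells of Table 2. The sketch below, with illustrative function names, counts how often the room-temperature rating (columns) was higher or lower than the incubator rating (rows).

```python
# Cross-tabulation from Table 2: rows = incubator temperature rating,
# columns = room temperature rating, categories 0, +, ++, +++, ++++.
TABLE_2 = [
    [83, 15, 0, 0, 0],
    [3, 55, 48, 0, 0],
    [0, 4, 36, 20, 0],
    [0, 0, 4, 76, 14],
    [0, 0, 0, 5, 87],
]


def disagreement_direction(matrix):
    """Return (fraction with column rating higher, fraction with column rating lower)."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    higher = sum(matrix[i][j] for i in range(k) for j in range(k) if j > i)
    lower = sum(matrix[i][j] for i in range(k) for j in range(k) if j < i)
    return higher / total, lower / total


room_higher, room_lower = disagreement_direction(TABLE_2)
print(f"room temperature rated higher: {room_higher * 100:.1f}%")
print(f"room temperature rated lower: {room_lower * 100:.1f}%")
```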

Agreement with the true glucose concentration was excellent for both environmental temperatures (room: κw 0.919 with 78.0% raw agreement, incubator: κ 0.949 with 80.0% raw agreement). The true glucose concentration was overestimated in 17.3% of samples at room temperature compared to 7.3% in the incubator, and underestimated in 4.7% of samples at room temperature and in 12.7% of samples in the incubator. The glucose concentration measured in the laboratory in samples retrieved through the diaper stick method was also higher at room temperature than at incubator temperature (mean difference 0.93 mmol/l, 95%CI 0.52-1.35 mmol/l; Wilcoxon, p<0.001). Still, all measured laboratory concentrations, from both room and incubator samples, remained within the ranges corresponding to the categories of the reagent strip as provided by the manufacturer.

Urine collection method

Results of agreement between the different urine collection methods are presented in Table 3, showing good agreement between the methods (κw 0.877, 95%CI 0.864-0.890, raw agreement 62.2%, 95%CI 58.9-66.7%). Drip-strip readings were higher than diaper-strip readings in 37.8% of assessments, and lower in 0.0%. The diaper-strip method shows better agreement with the true glucose concentration than the drip-strip method (diaper-strip: κw 0.965 with 82.9% raw agreement, drip-strip: κ 0.905 with 75.1% raw agreement). Overestimating the true glucose concentration occurred in 24.4% of drip-strips compared to 0.002% in diaper-strips. Underestimating the true glucose concentration was found in 0.004% of drip-strips and in 16.9% of diaper-strips.

Table 3 Agreement in rating of glucosuria through reagent strips between two methods of urine collection from the diaper

                 Drip-strip
Diaper-strip     0      +      ++     +++    ++++   Total
0                83     15     3      0      0      101
+                0      36     83     10     0      129
++               0      0      16     30     0      46
+++              0      0      0      63     29     92
++++             0      0      0      0      82     82
Total            83     51     102    103    111    450

κw 0.877 (95%CI 0.864-0.890)

Time

Accuracy of the reagent strips showed statistically significant differences in categories 2+ and 4+; however, no trend with time is seen (Table 4), meaning that the accuracy was not worse or better the longer the urine stayed in the diaper. The maximum change over time in reagent strip readings was one category.

The glucose concentration measured by the laboratory in urine collected through the drip-strip method did not change significantly over the 180 minutes spent in the diaper (Friedman’s ANOVA, p = 0.266).

Table 4 Agreement for glucosuria between visually read reagent strips and “true” glucosuria (laboratory measurement) after different periods of contact (n=900)

“True” glucose    Time (minutes)
concentration     0        30       60       120      180      p*
0                 91.7%    100%     100%     94.4%    94.4%    0.061
+                 77.8%    63.9%    58.3%    72.2%    63.9%    0.184
++                44.4%    63.9%    44.4%    47.2%    61.1%    0.003
+++               83.3%    83.3%    77.8%    88.9%    88.9%    0.056
++++              75.0%    92.7%    100%     100%     100%     <0.001
Total             74.4%    80.6%    76.1%    80.6%    82.7%    <0.001

* Friedman’s ANOVA

Discussion

The urinary glucose concentration as measured by the reagent strips is not influenced by the time spent in the diaper, and only slightly influenced by the temperature of the incubator and by the method of urine collection. The variability largely remains restricted to only one category difference, and is seen mainly in category 1+ and 2+. This can be explained by the fact that the ranges of glucosuria of the categories 1+ and 2+ are overlapping (0.6-5.0 mmol/l for category 1+, 3.3-7.7 mmol/l for category 2+). We therefore conclude that the use of reagent strips for semi-quantitative measurement is reliable enough for use in NICU-circumstances, especially when categories 1+ and 2+ are taken together as one category. To our knowledge this is the first study to address the reliability of reagent strips for measurement of glucosuria in a NICU-setting.
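The effect of taking categories 1+ and 2+ together can be illustrated by collapsing the corresponding rows and columns of Table 1 and recomputing the raw agreement. This is a sketch only: the merged figure is derived here from the published cell counts and is not a result reported in the text, and the function names are illustrative.

```python
# Cross-tabulation from Table 1: rows = reagent strip reading,
# columns = true glucosuria, categories 0, +, ++, +++, ++++.
TABLE_1 = [
    [173, 11, 0, 0, 0],
    [7, 121, 52, 0, 0],
    [0, 48, 94, 6, 0],
    [0, 0, 34, 152, 9],
    [0, 0, 0, 22, 171],
]


def raw_agreement(matrix):
    """Fraction of ratings on the diagonal (identical category)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def merge_categories(matrix, first, second):
    """Collapse category `second` into category `first` (rows and columns)."""
    groups = [[i] for i in range(len(matrix)) if i != second]
    for group in groups:
        if group[0] == first:
            group.append(second)
    return [
        [sum(matrix[i][j] for i in row_group for j in col_group) for col_group in groups]
        for row_group in groups
    ]


merged = merge_categories(TABLE_1, first=1, second=2)  # indices of + and ++
print(f"raw agreement, five categories: {raw_agreement(TABLE_1):.3f}")
print(f"raw agreement, + and ++ merged: {raw_agreement(merged):.3f}")
```

After merging, the only remaining disagreements are those between 0 and the merged category, and between the merged category, 3+ and 4+.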

The two factors that slightly influenced the measurement of glucosuria were temperature and the collection method. We purposely used room temperature as a reference standard and chose the maximum temperature used in our NICU department, arguing that this would reveal the maximum possible difference in results. Since we found only a slight, not clinically relevant influence of these two “extreme” temperatures, we can conclude that smaller differences in temperature will even more certainly not influence the results. Concerning the method of urine collection: the reagent strips can best be used by pressing the strip against the wetted diaper (the diaper method). This is the most reliable and simple method; the contents of the diaper do not appear to influence the glucose concentrations, which is in accordance with earlier reports in the literature.14,16,17 When too little urine is available, for example in case of extremely absorbent diapers, the urine can be collected by putting a gauze in the diaper. The urine collected by this drip-strip method tends to overestimate the degree of glucosuria. This finding may be explained by the fact that more water could have evaporated from the gauze compared to urine collected in the diaper. Still, the largest part of the variation in reagent strip results can be explained by the inter-observer reliability (κw 0.81).13 This inter-observer variation is a factor that cannot be neglected, since in daily clinical practice it is unavoidable that different professionals use the reagent strip.

Strength and weaknesses of the study

The strength of our study is the use of a large number of samples covering the complete range of glucose concentrations. In particular when using a Kappa value for the evaluation of reliability, it is of vital importance to be cautious in case of low or high prevalences.18 By using an equal distribution of all degrees of glucosuria we aimed to prevent misleadingly high or low Kappa values. Another strength is the fact that we studied the reliability under specific NICU circumstances, which had not been reported earlier.

This study also has limitations. The use of predefined, categorized glucose concentrations scored under optimal controlled study conditions may affect the generalizability of our results to real-life clinical practice. In clinical settings, the reagent strip is read under variable circumstances, varying in room light and, importantly, by well and less well trained staff. Future studies using urine samples from patients in real-time clinical practice with a wide spectrum of glucosuria are needed to elucidate this matter.

In summary, the semi-quantitative measurement of glucosuria using visually read reagent strips seems to be a reliable and simple bedside method, suitable for use in a NICU setting. Changes in rating of reagent strips of more than one category are almost certainly beyond measurement error. Moreover, this measurement error may be largely avoided when categories 1+ and 2+ are taken together as one category.

Acknowledgements

The authors gratefully acknowledge Liesbeth Groot-Jebbink, Corrie Deiman, Carin Bunkers, Annelies Vogelszang and Gerrie Veneklaas for their help with the assessment of the reagent strips.

References

1. Hey E. Hyperglycaemia and the very preterm baby. Semin Fetal Neonatal Med 2005;10:377-87.
2. Platt MW, Deshpande S. Metabolic adaptation at birth. Semin Fetal Neonatal Med 2005;10:341-350.
3. Farrag HM, Cowett RM. Glucose homeostasis in the micropremie. Clin Perinatol 2000;27:1-22.
4. Beardshaw K. Measurement of glucose levels in the newborn. Early Hum Dev 2010;86:263-7.
5. Beardsall K, Vanhaesebrouck S, Ogilvy-Stuart AL, Vanhole C, Palmer CR, Ong K, et al. Prevalence and determinants of hyperglycemia in very low birth weight infants: cohort analyses of the NIRTURE study. J Pediatr 2010;157:715-9.
6. Falcao MC, Leone CR, Ramos JL. Is glucosuria a reliable indicator of adequacy of glucose infusion rate in preterm infants? Sao Paulo Med J 1999;117:19-24.
7. Decaro MH, Vain NE. Hyperglycaemia in preterm neonates: What to know, what to do. Early Hum Dev 2011;87S:S19-22.
8. Mitanchez D. Glucose regulation in preterm newborn infants. Horm Res 2007;68:265-71.
9. Manzoni P, Castagnola E, Mostert M, Sala U, Galetto P, Gomirato G. Hyperglycaemia as a possible marker of invasive fungal infection in preterm neonates. Acta Paediatrica 2006;95:486-93.
10. Coulthard MG, Edmund N. Renal processing in well and sick neonates. Arch Dis Child Fetal Neonatal Ed 1999;81:92-8.
11. Drukker A, Guignard JP. Renal aspects of the term and preterm infant: a selective update. Curr Opinion Pediatr 2002;14:175-182.
12. Wilson LA. Urinalysis. Nurs Stand 2005;19:51-4.
13. Bekhof J, Kollen BJ, Groot-Jebbink LJM, Deiman C, Van de Leur SJCM, Van Straaten HLM. Validity and interobserver agreement of reagent strips for measurement of glucosuria. Scand J Clin Lab Invest 2011;71:248-52.
14. Burke N. Alternative methods for newborn urine sample collection. Pediatr Nurs 1995;21:546-49.
15. Fleiss JL. Measuring nominal scale agreement among many raters. Psychological Bulletin 1971;76:378-82.
16. Muratore C, Dhanireddy R. Urine collection from disposable diapers in premature infants: biochemical analysis. Clin Pediatr (Phila) 1993;32:314-5.
17. Roberts SB, Lucas A. Measurements of urinary constituents and output using disposable napkins. Arch Dis Child 1985;60:1021-4.
18. Hallgren KA. Computing inter-rater reliability for observational data: an overview and tutorial. Tutor Quant Methods Psychol 2012;8:23-34.

Chapter 2

Early diagnosis of late onset neonatal sepsis in preterms

Clinical signs to identify late-onset sepsis in preterm infants

Jolita Bekhof, Hans Reitsma, Joke Kok, Irma Van Straaten

Eur J Pediatr 2013;172:501-8.

Abstract

Late-onset neonatal sepsis (LOS) is an important cause of morbidity and mortality in preterm infants. Since presenting symptoms may be non-specific and subtle, early and correct diagnosis is challenging. We aimed to develop a nomogram based on clinical signs, to assess the likelihood of LOS in preterms with suspected infection without the use of laboratory investigations. We performed a prospective cohort study in 142 preterm infants <34 weeks admitted to the NICU with suspected infection. During 187 episodes 21 clinical signs were assessed. LOS was defined as blood-culture proven and/or clinical sepsis, occurring after 3 days of age. Logistic regression was used to develop a nomogram to estimate the probability of LOS being present in individual patients. LOS was found in 48% of 187 suspected episodes. Clinical signs associated with LOS were: increased respiratory support (OR 3.6; 95%CI 1.9-7.1), capillary refill (OR 2.2; 95%CI 1.1-4.5), gray skin (OR 2.7; 95%CI 1.4-5.5) and central venous catheter (OR 4.6; 95%CI 2.2-10.0) (AUC 0.828; 95%CI 0.764-0.892).

Conclusion: Increased respiratory support, capillary refill, gray skin and central venous catheter are the most important clinical signs suggestive of LOS in preterms. Clinical signs that were too non-specific to be useful in excluding or diagnosing LOS were temperature instability, apnoea, tachycardia, dyspnoea, hyper- and hypothermia, feeding difficulties and irritability.

Introduction

Late-onset neonatal sepsis (LOS), defined as neonatal sepsis occurring after three days of age, is an important cause of morbidity and mortality in preterm infants.2,3,19 Early and correct diagnosis of LOS is a challenging task. Especially in preterm infants the presenting signs are often very subtle and non-specific. Furthermore, as microbiological culture results are not available within 48 hours, early identification of a genuine sepsis is a major problem. Considering the possibly devastating consequences of missing LOS, providers often have a low threshold for starting antibiotic therapy. However, unnecessary use of empirically started broad-spectrum antibiotics should be minimized because of growing antibiotic resistance and possible harmful effects on gastro-intestinal immunity and allergy.4,12

Although many authors state that clinical signs are unreliable in the diagnosis of LOS in neonates,8,14 good quality studies addressing the value of clinical signs are sparse, especially in preterm neonates.5,9,11,14,18 Furthermore, most studies relied on blood-culture proven sepsis as definite outcome measure. However, sepsis-like episodes with false-negative blood-cultures (i.e. clinical sepsis) are frequently encountered in preterm infants because of the limitation in number of blood-cultures taken and quantity of blood drawn.6,13

In the era of sophisticated laboratory techniques, much emphasis has gone to the value of haematological and biochemical markers in the diagnosis of LOS.10,15,16 Hence, clinical judgement might unjustly be undervalued. Still, it remains of great importance to investigate the diagnostic value of clinical judgement.

The aim of this study was to evaluate the value of various clinical signs in identifying both blood-culture proven as well as clinical LOS in preterm neonates in a NICU setting, without the use of laboratory investigations. In addition we wished to develop a nomogram, consisting of clinical signs, to assist in decision-making for treatment in preterm neonates suspected of LOS.

Methods

Patients and data collection

A prospective cohort study was undertaken at our level-III neonatal intensive care unit (NICU) in Zwolle, the Netherlands, from July 2005 until November 2007. Eligible patients were all patients with a postconceptional age < 34 weeks who were more than 72 hours of postnatal age and had not received antibiotic therapy in the preceding 24 hours. Patients were followed until a corrected gestational age of 35 weeks or until discharge to another hospital before 35 weeks.

An episode of suspected infection was defined as a clinical suspicion of infection by the attending neonatologist. The clinical suspicion of infection ranged from very mild to very severe. Each patient with mild to severe suspicion of infection was included irrespective of the prescription of antibiotics. In this way inclusion of a wide spectrum of disease severity was aimed for in order to minimise the chance of overestimating the diagnostic accuracy of the clinical signs. In case of a very mild suspicion of infection, where no antibiotics were started nor blood-cultures were taken, the episode was evaluated for the occurrence of LOS over the following 3 days. We reasoned that in case of withholding antibiotics, without further aggravation of clinical symptoms, no clinically relevant bloodstream infections would be missed.

At the onset of each episode, data on clinical signs were assessed in a standardised way. Before the start of antibiotic treatment blood for blood cultures (1-3 ml), C-reactive protein (CRP) and full blood count was drawn.

Clinical signs and symptoms

Based on the literature5,8,10,7 and clinical experience a total of 14 clinical signs were assessed: pallor or gray skin colour, capillary refill time > 2 seconds,20 dyspnoea (grunting, nasal flaring and/or chest retractions), tachypnoea (respiratory rate > 60/min during > 1 hour), need for increased respiratory support (intensifying the mode, i.e. low flow, CPAP or endotracheal ventilation, and/or the degree of respiratory support), increasing need for supplemental oxygen, tachycardia (pulse > 180/min during > 1 hour), temperature instability (difference in body temperature > 0.5 °C within 24 hours), hyperthermia (rectal temperature > 38.0 °C), hypothermia (rectal temperature < 36.0 °C), feeding difficulties (vomiting or gastric aspirates > 50% of feed volume), increasing frequency of apnoea, bradycardia and/or cyanotic spells, lethargy and irritability. The clinicians and research nurses prospectively assessed these signs at the onset of each episode using a standardised form. Furthermore, the following 7 risk factors were noted: gestational age at birth, birth weight, sex, central venous catheter (CVC) or removal of a CVC in the preceding 24 hours, mechanical ventilation, actual weight and postnatal age.

Laboratory investigations

Blood samples for C-reactive protein, leucocytes with differential count and blood culture were taken at the onset of clinical symptoms signalling suspected infection. C-reactive protein was measured by immunoturbidimetry on a Roche modular P instrument using the C-reactive protein latex Tina-quant® assay (Roche Diagnostics). CRP > 10 mg/l was judged to be indicative of sepsis. Leucocytes and differential count were measured by flow cytometry on a Cell-Dyn 4000 machine (Abbott). Leucocytosis was defined as a leucocyte count ≥ 25 × 10^9/l and leucopenia as a leucocyte count ≤ 5 × 10^9/l.

Blood cultures were drawn before the start of antibiotic therapy (1-3 ml in 40 ml culture vials: Bactec™). A positive blood culture with organisms regarded as commensals (predominantly coagulase-negative Staphylococcus) was defined as contamination. However, a positive blood culture with skin commensals was defined as proven sepsis when the same organism was found in at least 2 blood cultures and/or signs of catheter-related sepsis were present (i.e. inflammation of the skin at the site of line insertion).

Final diagnosis of LOS

The outcome of each episode was classified in 3 mutually exclusive categories: blood-culture proven sepsis, clinical sepsis and rejected sepsis. The classification was made by the researchers based on the course of the episode after the start of antibiotics and laboratory values (CRP, full blood count). Blood-culture proven sepsis was defined as an episode with a positive non-contaminated blood culture. Infants were classified as having clinical sepsis in case of a strong clinical suspicion of infection despite negative blood cultures, as defined by the attending neonatologist, or in case of raised CRP (> 10 mg/l), leucocytosis or leucopenia or haematological markers. Rejected sepsis was defined as an episode with a negative blood culture and/or an episode with a favourable course where no blood culture was done and no antibiotics were started in case of low CRP and normal haematological markers. In all our analyses and results, LOS was defined as blood-culture proven and/or clinical sepsis.
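The three-way classification can be sketched as a small decision function. This is a deliberate simplification and not the researchers' actual procedure, which also weighed the clinical course: the `Episode` fields and the `classify` function are hypothetical names, contamination is reduced to a single flag, and the laboratory thresholds are the ones given in the Laboratory investigations section (CRP > 10 mg/l, leucocytes ≥ 25 or ≤ 5 × 10^9/l).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Episode:
    """Hypothetical record of one episode of suspected infection."""
    culture_positive: Optional[bool]  # None when no blood culture was taken
    contaminated: bool                # commensal growth judged to be contamination
    strong_clinical_suspicion: bool   # as judged by the attending neonatologist
    crp_mg_l: float
    leucocytes_e9_l: float


def classify(episode: Episode) -> str:
    """Assign one of the three mutually exclusive outcome categories."""
    if episode.culture_positive and not episode.contaminated:
        return "proven sepsis"
    abnormal_lab = (
        episode.crp_mg_l > 10
        or episode.leucocytes_e9_l >= 25
        or episode.leucocytes_e9_l <= 5
    )
    if episode.strong_clinical_suspicion or abnormal_lab:
        return "clinical sepsis"
    return "rejected sepsis"


example = Episode(
    culture_positive=False,
    contaminated=False,
    strong_clinical_suspicion=False,
    crp_mg_l=32.0,
    leucocytes_e9_l=12.0,
)
print(classify(example))
```

Checking the blood culture first keeps the categories mutually exclusive: an episode with a genuine positive culture is never downgraded by normal laboratory values.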

Statistical analysis

We used logistic regression to examine the association between clinical signs and the presence or absence of LOS. Some patients (n=38) experienced more than one episode of suspected infection. To avoid giving undue extra weight to the risk factors of these patients, we evaluated patient-specific risk factors (gestational age, birth weight and sex) only for the first episode. Odds ratios (OR) and 95% confidence intervals (95%CI) were used to quantify the strength of these associations.

Our variable selection and modelling approach was based on the following steps. Simultaneously fitting all 14 clinical signs and 7 risk factors in a single model would lead to overfitting (ratio of variables to number of events = 1:4) and unpredictable results.17 Therefore, we classified all signs into four clinically coherent groups, so that the signs within each group shared common features related to pathophysiology. These four groups were: respiratory, circulatory, general symptoms and risk factors.

Within each group we applied backward logistic regression to select only those signs that were significantly associated with LOS using a p-value <0.1 as criterion to stay. We then built a multivariable model with the selected signs from each group. Model evaluation consisted of receiver operating characteristics analysis (AUC) and Hosmer-Lemeshow goodness-of-fit-test.

Bootstrapping was used to correct for possible overoptimistic results of the final model. Bootstrapping is an internal validation technique, where many repeated samples are drawn with replacement from the data set at hand. Bootstrapping generates an estimate of how well the model might fit in a new study population. In other words bootstrapping estimates the expected optimism in model performance or shrinkage of the model.17
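The resampling idea can be sketched as follows. This is an illustration under stated assumptions, not the analysis that was run in SPSS: the data are synthetic, and the fitted "model" is a simple stand-in (one weight per sign, equal to the difference in sign prevalence between cases and non-cases) rather than logistic regression. All function names are hypothetical. Optimism is estimated as the average difference between the AUC in each bootstrap sample and the AUC of that same bootstrap-fitted model in the original data.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_weights(rows, labels):
    """Stand-in model (not logistic regression): prevalence difference per sign."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    weights = []
    for j in range(len(rows[0])):
        in_pos = sum(r[j] for r, y in zip(rows, labels) if y == 1)
        in_neg = sum(r[j] for r, y in zip(rows, labels) if y == 0)
        weights.append(in_pos / n_pos - in_neg / n_neg)
    return weights


def predict(weights, rows):
    """Score each row as the weighted sum of its signs."""
    return [sum(w * x for w, x in zip(weights, r)) for r in rows]


def bootstrap_optimism(rows, labels, n_boot=200, seed=0):
    """Mean of (AUC in bootstrap sample) - (AUC of same model in original data)."""
    rng = random.Random(seed)
    n = len(rows)
    differences = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # AUC is undefined without both outcome classes
        weights = fit_weights(b_rows, b_labels)
        apparent = auc(predict(weights, b_rows), b_labels)
        tested = auc(predict(weights, rows), labels)
        differences.append(apparent - tested)
    return sum(differences) / len(differences)


# Synthetic data: 187 episodes, 4 binary signs, roughly 48% cases (all assumed).
data_rng = random.Random(1)
labels = [1 if data_rng.random() < 0.48 else 0 for _ in range(187)]
rows = [[1 if data_rng.random() < (0.6 if y else 0.3) else 0 for _ in range(4)] for y in labels]

apparent_auc = auc(predict(fit_weights(rows, labels), rows), labels)
optimism = bootstrap_optimism(rows, labels)
print(f"apparent AUC {apparent_auc:.3f}, optimism {optimism:.3f}, "
      f"corrected AUC {apparent_auc - optimism:.3f}")
```

Subtracting the estimated optimism from the apparent AUC gives the optimism-corrected estimate of performance in a new population.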

Finally, a nomogram was developed to visualize the predictive strength of the different clinical signs and risk factors in a single diagram. This nomogram allows readers to calculate an expected risk of LOS based on the specific profile of a patient. The number of points for each predictor was based on the regression coefficients of the reduced multiple regression model. The total number of points derived from the presence or absence of all predictors was used to calculate the expected probability of LOS.
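How regression coefficients translate into nomogram points and a probability can be sketched as below. The odds ratios are those quoted in the abstract, treated here as if they were the coefficients of the reduced model. The intercept is not given in the text, so `HYPOTHETICAL_INTERCEPT` is an invented placeholder and the resulting probabilities are illustrative only. Scaling the largest coefficient to 100 points is one common convention and is also an assumption.

```python
import math

# Odds ratios of the four predictors as quoted in the abstract.
ODDS_RATIOS = {
    "increased respiratory support": 3.6,
    "capillary refill time > 2 s": 2.2,
    "pallor/gray skin": 2.7,
    "central venous catheter": 4.6,
}

# Assumption: the model intercept is not reported in the text.
HYPOTHETICAL_INTERCEPT = -2.0


def nomogram_points(odds_ratios, max_points=100):
    """Scale each log odds ratio so that the strongest predictor gets max_points."""
    coefficients = {name: math.log(value) for name, value in odds_ratios.items()}
    largest = max(coefficients.values())
    return {name: round(max_points * b / largest) for name, b in coefficients.items()}


def probability(present, odds_ratios, intercept):
    """Logistic probability for the set of predictors that are present."""
    linear = intercept + sum(math.log(odds_ratios[name]) for name in present)
    return 1.0 / (1.0 + math.exp(-linear))


points = nomogram_points(ODDS_RATIOS)
for name, value in points.items():
    print(f"{name}: {value} points")

none_present = probability([], ODDS_RATIOS, HYPOTHETICAL_INTERCEPT)
all_present = probability(list(ODDS_RATIOS), ODDS_RATIOS, HYPOTHETICAL_INTERCEPT)
print(f"no predictors present: {none_present:.2f}")
print(f"all predictors present: {all_present:.2f}")
```

Because points are proportional to the log odds ratios, summing points on the nomogram is equivalent to summing the terms of the linear predictor before applying the logistic transformation.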

Analyses were performed using the statistical package SPSS (PASW) version 18.0.

Consent and ethical approval

The study was approved by the local medical ethical committee of our hospital. Written informed consent was obtained from the parents.

Results

During the 2-year study period, a total of 319 eligible patients were admitted to our NICU, of whom 142 experienced one or more episodes of suspected infection. A total of 187 episodes of suspected infection occurring in 142 patients were evaluated. Basic characteristics of included patients are presented in Table 1; inclusion and classification of episodes is presented in Figure 1.

Table 1. Characteristics of the study population: patients with suspected infection. Data from patients who experienced more than one episode of suspected infection are from their first episode.

                                                     Patients with suspected infection (n=142)
Gestational age in weeks+days (mean, ± SD)           29+6 (± 2+1)
Birthweight in gram (mean, ± SD)                     1207 (± 351)
Male sex (n, %)                                      79 (56%)
Died during admission (n, %)                         2 (1.4%)
Age at onset of suspected infection (median, IQR)    10* (7-15)
Follow-up in days (median, IQR)                      24* (14-35)

SD standard deviation, IQR inter quartile range

[Figure 1: flow chart. 319 patients eligible; 142 patients included (evaluated for suspected late-onset sepsis); 187 episodes of suspected late-onset sepsis included, with laboratory values abnormal in 65 and normal in 122. Final diagnosis: proven sepsis (50), clinical sepsis (39), rejected sepsis (98).]

Figure 1. Inclusion and classification of episodes of suspected late-onset sepsis.

Final diagnosis

A final diagnosis of LOS was made in 89 (48%) out of the 187 episodes of suspected infection. Twenty-six per cent (n=50) of the episodes were classified as proven sepsis and in 21% (n=39) clinical sepsis was judged to be present (Figure 1 and Table 2). Of the 39 episodes of clinical sepsis, in 46% (n=17) laboratory values were normal (low CRP and haematological markers) and the diagnosis clinical sepsis was made on the basis of the severity and the course of clinical signs alone. Median CRP in the clinical sepsis group was 11 mg/l (range 0-86). A positive blood culture was found in 56 episodes, of which 6 were considered contaminated. Among the 50 episodes of proven sepsis Staphylococcus epidermidis was the most common isolate (50%), followed by Bacillus cereus (28%), Staphylococcus aureus (12%), gram negatives (6%), Streptococcus (2%) and Candida (2%).

Table 2. Clinical signs and risk factors in episodes of suspected infection and associated definite outcome

                                            Frequency of signs                                      Univariate analysis
                                            Rejected      Clinical      Proven        All           Proven sepsis                 Clinical and proven sepsis
                                            sepsis        sepsis        sepsis        episodes
                                            n=98          n=39          n=50          n=187         OR†     95% CI       p        OR†     95% CI      p
Respiratory symptoms
Apnoea, bradycardia and/or cyanotic spells  52 (53.1%)    23 (59.0%)    33 (66.0%)    108 (57.8%)   1.61    0.82-3.18    0.17     1.50    0.84-2.70   0.17
Tachypnoea                                  46 (46.9%)    15 (38.5%)    29 (58.0%)    90 (48.1%)    1.61    0.83-3.11    0.16     1.11    0.62-1.96   0.73
Increased respiratory support               27 (27.6%)    23 (59.0%)    32 (64.0%)    82 (43.9%)    3.25    1.64-6.41    0.001    4.25    2.30-7.87   <0.001
Increased O2-requirement                    16 (16.3%)    14 (35.9%)    17 (34.0%)    47 (25.1%)    1.72    0.84-3.50    0.14     2.74    1.37-5.47   0.004
Dyspnoea                                    16 (16.3%)    8 (20.5%)     8 (16.0%)     32 (17.1%)    0.94    0.39-2.26    0.88     1.12    0.53-2.41   0.77
Circulatory symptoms
Pallor/gray skin                            37 (37.8%)    27 (69.2%)    30 (60.0%)    94 (50.3%)    1.81    0.93-3.50    0.08     2.94    1.62-5.33   <0.001
Capillary refill time > 2 sec               25 (25.5%)    17 (43.6%)    28 (56.0%)    70 (37.4%)    2.86    1.46-5.60    0.01     2.99    1.61-5.53   <0.001
Tachycardia                                 27 (27.6%)    10 (25.6%)    21 (42.0%)    58 (31.0%)    1.89    0.96-3.73    0.07     1.41    0.76-2.62   0.28
General symptoms
Temperature instability                     66 (67.3%)    24 (61.5%)    41 (82.0%)    131 (70.1%)   2.17    0.97-4.89    0.06     1.31    0.70-2.47   0.40
Lethargy                                    28 (28.6%)    13 (33.3%)    32 (64.0%)    73 (39.0%)    4.30    2.16-8.58    <0.001   2.56    1.40-4.68   0.002
Hyperthermia                                13 (13.3%)    7 (17.9%)     8 (16.0%)     28 (15.0%)    1.11    0.45-2.74    0.82     1.33    0.59-2.97   0.49
Feeding intolerance                         13 (13.3%)    5 (12.8%)     9 (18.0%)     27 (14.4%)    1.46    0.60-3.53    0.40     1.22    0.54-2.76   0.63
Irritability                                13 (13.3%)    3 (7.7%)      6 (12.0%)     22 (11.8%)    0.97    0.36-2.64    0.96     0.74    0.30-1.82   0.51
Hypothermia                                 0 (0%)        1 (2.6%)      1 (2.6%)      2 (1.1%)      2.63    0.16-42.92   0.50     **      **          0.99
Risk factors
Gestational age (in weeks + days)           29+4 (2+1)    28+3 (2+5)    29+5 (2+1)    -             1.02    0.92-1.12    0.76     0.94    0.86-1.04   0.24
Birthweight                                 1207 (351)    1091 (354)    1220 (341)    -             1.09*** 0.92-1.30    0.31     0.93*** 0.79-1.08   0.33
Male sex                                    41 (57%)      16 (62%)      22 (50%)      -             0.73    0.36-1.49    0.38     0.90    0.46-1.74   0.75
CVC in last 24 hrs                          17 (17%)      9 (23%)       31 (61%)      57 (31%)      6.92    3.37-14.19   <0.001   3.89    1.99-7.60   <0.001
Weight at episode in gram                   1339 (315)    1223 (334)    1188 (289)    1274 (318)    0.89    0.80-0.99    0.04     0.87    0.79-0.96   0.004
Age at episode in days                      17 (12)       20 (13)       10 (6)        16 (11)       0.90    0.85-1.00    <0.001   0.98    0.95-1.00   0.08
Ventilation                                 5 (5%)        7 (18%)       4 (8%)        16 (9%)       0.86    0.27-2.81    0.81     2.62    0.87-7.87   0.09

Mean for continuous data with standard deviation in parentheses, number for categorical data with percentages in parentheses. Clinical signs and risk factors included in final multiple regression analysis model in italic. Patient characteristics: gestational age, birth weight and sex are from the first episode (rejected sepsis n=72, clinical sepsis n=26, proven sepsis n=44). CVC, Central venous catheter.
* Univariate analysis
** Regression analysis not performed because of too low number
*** Birthweight per 100 gram

Clinical signs and risk factors for late-onset sepsis

Table 2 shows data on clinical signs and risk factors for the different outcome groups. Clinical signs and risk factors associated with LOS at the p < 0.05 level were: weight at the episode, CVC, respiratory insufficiency, lethargy, capillary refill, pallor/gray skin and increased oxygen requirements. Clinical signs that showed no significant association with LOS were: temperature instability, apnoea, tachypnoea, tachycardia, dyspnoea, hyper- and hypothermia, feeding difficulties and irritability.

Results of multiple regression analysis

Selection of variables

After backward elimination within each of the 4 clinically coherent categories separately, the following variables remained significantly associated with LOS: respiratory signs: increased respiratory support; circulatory signs: capillary refill time and pallor/gray skin; general signs: lethargy; risk factors: weight at episode and a central venous catheter.

Multiple regression model

The remaining variables were entered in a multiple regression analysis and the corresponding results are presented in Table 3. Performance of this “full” model for predicting LOS was good, with AUC 0.80 (95%CI 0.74-0.87; p<0.001) and Hosmer-Lemeshow goodness-of-fit test p=0.438.

The expected optimism in model performance evaluated by bootstrapping was small (e.g. a decrease in AUC from 0.80 to 0.71, a shrinkage factor of 0.89).

In the next step we applied backward elimination (p<0.1 as criterion to stay) to see whether variables could be excluded without a relevant loss in performance. Four variables remained significantly associated with LOS: increased respiratory support, capillary refill time, pallor/gray skin and a central venous catheter (Table 3). Labelling infants as low risk for sepsis when all four factors are absent, which was the case in 36 infants, would miss 3 infants (8.3%) with clinical and proven sepsis and 1 infant (2.8%) with proven sepsis. Sensitivity of the presence of one or more of these 4 factors for LOS was 97% and specificity 37%.
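The percentages in the paragraph above can be reproduced with simple arithmetic. The sketch below is only a consistency check: it assumes that the 89 LOS events mentioned in the Strengths and weaknesses section form the denominator for sensitivity, which the text does not state explicitly. The variable names are illustrative.

```python
# Counts taken from the text: 36 infants had none of the four factors,
# 3 of them had LOS (clinical or proven) and 1 had proven sepsis.
low_risk_total = 36
missed_los = 3
missed_proven = 1

# Assumption: the 89 observed LOS events reported later in the chapter
# are the denominator used for the reported sensitivity.
total_los = 89


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


miss_rate_los = percent(missed_los, low_risk_total)
miss_rate_proven = percent(missed_proven, low_risk_total)
sensitivity = percent(total_los - missed_los, total_los)

print(f"Missed LOS among low-risk infants: {miss_rate_los:.1f}%")
print(f"Missed proven sepsis among low-risk infants: {miss_rate_proven:.1f}%")
print(f"Sensitivity of one or more factors present: {sensitivity:.1f}%")
```

Under this assumption the result agrees with the reported sensitivity of 97%. Specificity cannot be recomputed here because the number of episodes without LOS is not given in this passage.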


A nomogram was constructed based on the reduced model and is shown in Figure 2. The value of each predictor corresponds to a score. The scores for all predictors are summed to a total score, which is then translated into a probability for LOS.

Table 3. Results from multivariable logistic regression analysis

Full model | Clinical and proven sepsis: OR | 95% CI | P | Proven sepsis: OR | 95% CI | P
CVC in last 24 hours | 4.39 | 2.02-9.52 | <0.001 | 7.13 | 3.15-16.16 | <0.001
Increased respiratory support | 3.33 | 1.67-6.63 | <0.001 | 2.16 | 0.97-4.84 | 0.06
Pallor/gray skin | 2.66 | 1.29-5.48 | 0.008 | 1.25 | 0.52-2.97 | 0.62
Capillary refill time > 2 seconds | 2.13 | 1.03-4.42 | 0.04 | 2.32 | 1.00-5.37 | 0.05
Weight at episode < 1200 g | 1.72 | 0.87-3.40 | 0.12 | 1.75 | 0.80-3.85 | 0.16
Lethargy | 1.14 | 0.55-2.36 | 0.73 | 2.61 | 1.14-6.01 | 0.02

Reduced model | Clinical and proven sepsis: OR | 95% CI | P | Proven sepsis: OR | 95% CI | P
CVC in last 24 hours | 4.63 | 2.16-9.95 | <0.001 | 7.10 | 3.19-15.78 | <0.001
Increased respiratory support | 3.63 | 1.85-7.11 | <0.001 | 2.33 | 1.05-5.15 | 0.04
Pallor/gray skin | 2.73 | 1.35-5.52 | 0.005 | Not in this model | |
Capillary refill time > 2 seconds | 2.20 | 1.09-4.48 | 0.029 | 2.46 | 1.11-5.48 | 0.03
Lethargy | Not in this model | | | 2.78 | 1.25-6.21 | 0.01

OR, odds ratio; CI, confidence interval; CVC, central venous catheter
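A nomogram of this kind is usually drawn by giving each predictor a number of points proportional to its regression coefficient, which is the natural logarithm of its odds ratio. The sketch below illustrates that idea with the reduced-model odds ratios for clinical and proven sepsis. It is not the authors' nomogram: the scaling (strongest predictor set to 100 points) is an assumed convention, and the names `odds_ratios`, `points` and `total_points` are hypothetical. Because the model intercept is not reported, the final step from total points to a probability cannot be reproduced here.

```python
import math

# Odds ratios of the reduced model for clinical and proven sepsis (Table 3).
odds_ratios = {
    "CVC in last 24 hours": 4.63,
    "Increased respiratory support": 3.63,
    "Pallor/gray skin": 2.73,
    "Capillary refill time > 2 seconds": 2.20,
}

# Logistic regression coefficients are the natural logs of the odds ratios.
betas = {name: math.log(value) for name, value in odds_ratios.items()}

# Assumed convention: the strongest predictor is worth 100 points and the
# others are scaled in proportion to their coefficients.
max_beta = max(betas.values())
points = {name: round(100 * beta / max_beta) for name, beta in betas.items()}


def total_points(present_features):
    """Sum the points of the features that are present in a patient."""
    return sum(points[name] for name in present_features)


for name, value in points.items():
    print(f"{name}: {value} points")

example = ["CVC in last 24 hours", "Pallor/gray skin"]
print("Example total:", total_points(example))
```

The point values printed here will not necessarily match the axes of Figure 2, which may use a different scale.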

Difference between clinical sepsis and proven sepsis

When the analysis was repeated for the associations with solely proven sepsis, instead of clinical and proven sepsis, the same clinical signs stayed in the final model, except for pallor and/or gray skin (Table 3). Performance of these models for predicting proven sepsis was also good, with AUC 0.84 (95%CI 0.78-0.90; p<0.001) and Hosmer-Lemeshow goodness-of-fit test p=0.319 for the full model, and AUC 0.83 (95%CI 0.76-0.89; p<0.001) and Hosmer-Lemeshow goodness-of-fit test p=0.174 for the reduced model.

Discussion

Principal findings

This study shows that most clinical symptoms in isolation have only moderate predictive value for identifying LOS in preterm infants suspected of infection. The strongest predictive signs were: increased respiratory support, capillary refill time > 2 seconds, pallor or gray skin colour and a central venous catheter in the 24 hours preceding the onset of suspected infection.


Combining several clinical signs in a nomogram augments the predictive value for identifying LOS. This nomogram allows users to calculate an expected risk of LOS in an individual patient with suspected infection, based on the specific profile of the patient.

Clinical signs that were too nonspecific to be useful in excluding or diagnosing LOS were temperature instability, apnoea, tachypnoea, tachycardia, dyspnoea, hyper- and hypothermia, feeding difficulties and irritability. Even when two or more of these nonspecific signs occur simultaneously, the risk of either clinical or proven sepsis is hardly changed (data not shown).

Figure 2. Nomogram for prediction of LOS in preterms suspected of infection with use of clinical signs and risk factors. LOS is defined as clinical and/or blood-culture proven sepsis. Instructions: Determine how many points the patient receives for each feature using the upper part of the nomogram. Sum the points for all features. Locate this sum score on the “total points” axis. Draw a straight line to the lower axis “probability sepsis” to find the estimated probability of that patient having LOS (clinical and/or proven LOS).

The lack of clinical relevance of body temperature in diagnosing LOS in preterms might be attributable to the use of incubators in this specific patient population. When changes in body temperature are observed, the environmental temperature in the incubator will be adjusted before serious hypo- or hyperthermia can occur. The need for and magnitude of temperature adjustment of the incubator is probably a more valuable item to measure.

The marginal diagnostic value of general respiratory signs (apnoea, dyspnoea, tachypnoea) might be explained by the high prevalence in preterms of non-infectious respiratory problems due to lung immaturity or bronchopulmonary dysplasia. The same might hold true for the minor clinical relevance of other general symptoms such as feeding intolerance and irritability.

Another interesting finding of this study is the difference in observed frequency of clinical signs between clinical and proven sepsis. In this study, pallor and/or gray skin colour in particular was strongly associated with clinical sepsis compared to blood-culture proven sepsis. A gray skin colour is obviously considered a serious sign of sepsis by the medical team. The extent to which this assumption of medical workers is correct, however, cannot be answered by this study. Future studies looking at, for example, viral cultures might further elucidate the issue of blood-culture negative sepsis in preterm infants.

Strengths and weaknesses

The strength of our study is that the data were prospectively collected in a population of solely preterm infants. The latter is important because clinical symptoms in preterm infants may have different clinical relevance compared to term infants.8 Several presumed signs of LOS can also be caused by prematurity itself, such as temperature instability, apnoea and feeding intolerance. Temperature problems, for example, will be more indicative of infectious disease in term infants than in preterms.

Another strength of our study is that we did not only evaluate the predictive aspects of blood-culture proven sepsis, but also clinical blood-culture negative sepsis. Certainly in clinical practice clinical sepsis is frequently encountered and cannot be simply ignored.

There are certain limitations to our study. One of the major problems is the definition of clinical sepsis. The clinical signs we evaluated all contributed to a certain extent to the final diagnosis of clinical sepsis. This situation results in what is called incorporation bias. Since the symptoms and the diagnosis of clinical sepsis will be positively correlated in our study, misclassification may have resulted in overestimation of accuracy.1


A serious threat when using multiple regression models, as we did in this study, is overfitting. Overfitting results in overly optimistic models: when the model is used in new patients, the performance is often worse than expected. To minimize the problem of overfitting, an adequate sample size, with enough events (i.e. LOS) relative to the number of potential predictors, is very important. Taken to the extreme, if the number of predictors equals the number of events, the model will fit perfectly, even if all predictors are entirely unrelated to the outcome variable.17 In general it is assumed that a minimum of 10 to 15 events per predictor variable will allow good estimates. In the end, 6 predictors were used for multiple logistic regression, which seems fairly adequate considering the 89 observed events of LOS (ratio 1:14).
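The events-per-variable check described above is a one-line calculation. The following sketch uses the two numbers from the text; the threshold of 10 is the lower bound of the rule of thumb mentioned there, and the variable names are illustrative.

```python
# Numbers reported in the text.
events = 89       # observed LOS events
predictors = 6    # predictors entered in the multiple logistic regression

# Rule of thumb from the text: at least 10 to 15 events per predictor.
MIN_EVENTS_PER_VARIABLE = 10

events_per_variable = events / predictors
is_adequate = events_per_variable >= MIN_EVENTS_PER_VARIABLE

print(f"Events per variable: {events_per_variable:.1f}")
print("Meets the lower bound of the rule of thumb:", is_adequate)
```

The exact ratio is slightly under 15 events per predictor, which the text reports as 1:14.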

Clinical and research implications

The most important predictive signs for identifying LOS in preterm infants are: increased respiratory support, capillary refill, pallor/gray skin and a central venous catheter in the 24 hours preceding the episode of suspected infection. Our nomogram based on a combination of these clinical signs may predict LOS in preterms suspected of infection, even before ordering additional laboratory investigations. We would like to emphasize that this model needs external validation in new patient groups. Still, the nomogram might be helpful in deciding on the duration of antibiotic therapy, for example in situations where no blood culture is available. Moreover, the start of antibiotics could be postponed in case of low risk, under close monitoring of clinical symptoms. Furthermore, research evaluating the reliability and inter- and intra-observer variation of the more subjective symptoms is warranted.

Acknowledgements

We wish to thank L.J.M. Groot-Jebbink and C.M. Bunkers, research nurses, for their vital assistance in collecting and managing the data.

References

1. Babyak MA. What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models. Psychosom Med 2004;66:411-421.

2. Berger A, Salzer HR, Weninger M, Sageder B, Aspöck C. Septicaemia in an Austrian neonatal intensive care unit: a 7-year analysis. Acta Paediatrica 1998;87:1066-1069.

3. Brodie SB, Sands KE, Gray JE, Parker RA, Goldmann A, Davis RB, Richardson DK. Occurrence of nosocomial bloodstream infections in six neonatal intensive care units. Pediatr Infect Dis J 2000;19:56-65.


4. Conroy ME, Shi HN, Walker WA. The long-term health effects of neonatal microbial flora. Curr Opin Allergy Clin Immunol 2009;9:197-201.

5. Fanaroff AA, Korones SB, Wright LL, Verter J, Poland RL, Bauer CR, Tyson JE, Philips JB 3rd, Edwards W, Lucey JF, Catz CS, Shankaran S, Oh W. Incidence, presenting features, risk factors and significance of late onset septicaemia in very low birth weight infants. Pediatr Infect Dis J 1998;17:593-598.

6. Fischer JE. Physicians’ ability to diagnose sepsis in newborns and critically ill children. Pediatr Crit Care Med 2005;6:S120-S125.

7. Franz AR, Bauer K, Schalk A, Garland SM, Bowman ED, Rex K, Nyholm C, Norman M, Bougatef A, Kron M, Mihatsch WA, Pohlandt F; International IL-8 Study Group. Measurement of interleukin 8 in combination with C - reactive protein reduced unnecessary antibiotic therapy in newborn infants: a multi-centre, randomized, controlled trial. Pediatrics 2004;114:1-8.

8. Gerdes JS. Clinicopathologic approach to the diagnosis of neonatal sepsis. Clinics in perinatology 1991;18:361-381.

9. Kudawla M, Dutta S, Narang A. Validation of a clinical score for the diagnosis of late onset neonatal septicemia in babies weighing 1000-2500 g. J Trop Pediatr 2008;54:66-69.

10. Mahieu LM, De Muynck AO, De Dooy JJ, Laroche SM, Van Acker KJ. Prediction of nosocomial sepsis in neonates by means of a computer-weighted bedside scoring system (NOSEP). Crit Care Med 2000;28:2026-2033.

11. Modi N, Doré CJ, Saraswatula A, Richards M, Bamford KB, Coello R, Holmes A. A case definition for national and international neonatal bloodstream infection surveillance. Arch Dis Child Fetal Neonatal Ed 2009;94:F8-F12.

12. Neu J. Perinatal and neonatal manipulation of the intestinal microbiome: a note of caution. Nutr Rev 2007;65:282-285.

13. Ohlin A. What is neonatal sepsis? Acta Paediatrica 2011;100:7-8.

14. Ohlin A, Björkqvist M, Montgomery SM, Schollin J. Clinical signs and CRP values associated with blood culture results in neonates evaluated for suspected sepsis. Acta Paediatr 2010;99:1635-1640.

15. Okascharoen C, Hui C, Cairnie J, Morris AM, Kirpalani H. External validation of bedside prediction score for diagnosis of late-onset neonatal sepsis. J Perinatol 2007;27:496-501.

16. Okascharoen C, Sirinavin S, Thakkinstian A, Kitayaporn D, Supapanachart S. A bedside prediction-scoring model for late-onset neonatal sepsis. J Perinatol 2005;25:778-783.

17. Reitsma JB, Rutjes AW, Khan KS, Coomarasamy A, Bossuyt PM. A review of solutions for diagnostic accuracy studies with an imperfect or missing reference standard. J Clin Epidemiol 2009;62:797-806.

18. Singh S, Dutta S, Narang A. Predictive clinical scores for diagnosis of late onset neonatal septicemia. J Trop Pediatr 2003;49:235-239.

19. Stoll BJ, Hansen N, Fanaroff AA, Wright LL, Waldemar AC, Ehrenkranz RA, Lemons LA, Donovan EF, Stark AR, Tyson JE, Oh W, Bauer CR, Korones SB, Shankaran S, Laptook AR, Stevenson DK, Papile L-A, Poole WK. Late-onset sepsis in very low birth weight neonates: the experience of the NICHD Neonatal Research Network. Pediatrics 2002;110:285-291.

20. Tibby SM, Hatherill M, Murdoch IA. Capillary refill and core-peripheral temperature gap as indicators of haemodynamic status in paediatric intensive care patients. Arch Dis Child 1998;80:163-166.


Glucosuria as an early marker of late onset sepsis in preterm infants

Jolita Bekhof, Boudewijn Kollen, Joke Kok, Irma Van Straaten

Submitted


Abstract

Background: Early and accurate diagnosis of late-onset neonatal sepsis (LONS) in preterm infants is difficult since presenting signs are subtle and non-specific. Because neonatal sepsis may be accompanied by glucose intolerance and glucosuria, we hypothesized that glucosuria may be associated with LONS in preterms at an early stage.

Objective: To evaluate the association of glucosuria and LONS in preterms.

Methods: We prospectively measured glucosuria semi-quantitatively at least 8 times a day in 316 preterms (gestational age <34 weeks) and followed the patients for occurrence of LONS. Attending physicians were blinded to glucosuria results. We assessed the predictive value of glucosuria for clinical LONS (clinical signs of sepsis but negative blood culture) and blood culture-proven LONS using logistic regression analysis.

Results: Glucosuria was found in 65.8% of patients; sepsis was suspected 157 times in 123 patients. LONS was found in 47.1% of 157 suspected episodes (clinical sepsis in 20.4% and culture-proven sepsis in 26.7%). Glucosuria was associated with lower birth weight (OR per 100 g birth weight 0.80; 95%CI 0.75-0.85), lower gestational age (OR per week of gestational age 0.64; 95%CI 0.56-0.73) and with suspected LONS (OR 2.39; 95%CI 1.93-2.95). Glucosuria was more common in infants with confirmed LONS (70.3%) than in those in whom sepsis was ruled out (44.6%, p=0.007). After correction for gestational age, birth weight and postnatal age, this association weakened and was no longer significant (OR 1.36; 95%CI 0.99-1.85, p=0.055). An increase in glucosuria in the 24 hours before onset of symptoms was associated with confirmed LONS (OR 2.52, 95%CI 1.13-5.59) with sensitivity 30.4% and specificity 85.2% (LR 2.05).

Conclusion: Glucosuria in preterms is positively associated with the occurrence of LONS. An increase in glucosuria is associated with the occurrence of LONS within 24 hours; however, this association is too weak to be of diagnostic value.


Introduction

Late-onset neonatal sepsis (LONS), mostly defined as neonatal sepsis occurring after three days of age, is an important cause of morbidity and mortality in preterm infants.1 Early and correct diagnosis of LONS in preterm infants is challenging because presenting signs are subtle and non-specific.2 Several screening tools have been investigated to improve early diagnosis of LONS, such as a combination of clinical signs,3-6 haematological biomarkers,5,7-9 changes in microcirculation,10 heart rate monitoring11,12 and use of the central-peripheral temperature gradient.13

Disturbed glucose homeostasis is frequently seen in premature neonates. Hyperglycaemia has been found in 25 to 80% of premature newborns, depending on their gestational age and birth weight.14,15 Because of inadequate hepatic and diminished pancreatic insulin secretory responsiveness, the risk of hyperglycaemia increases in stressful episodes, such as sepsis.16,17 Together with a lower renal threshold, such glycaemic instability may produce glucosuria in preterm infants, even in the absence of hyperglycaemia.18

Although it has been recognized that hyperglycaemia may be an important early sign in neonatal sepsis,17,19,20 monitoring blood glucose levels requires repeated blood sampling, with its disadvantages of discomfort, stress and the risk of iatrogenic anaemia and blood transfusions. To our knowledge, the usefulness of non-invasive measurement of glucosuria as an early sign of neonatal sepsis has not been prospectively investigated.

Therefore the aim of this study was to evaluate the diagnostic value of (changes in) glucosuria in the early detection of LONS in preterm infants.

Methods

Patients

This study was part of a prospective cohort study investigating clinical signs and symptoms in preterms with suspected LONS, undertaken from July 2005 until November 2007, at our level-III neonatal intensive care unit (NICU) in Zwolle, the Netherlands.21 Eligible patients included all patients with a postconceptional age < 34 weeks, > 72 hours after birth who had not been on antibiotic therapy for the last 24 hours. Patients were followed until a corrected gestational age of 35 weeks or until discharge to other hospitals before 35 weeks.21

Glucosuria measurement

We prospectively collected data on glucosuria and the occurrence of late-onset sepsis on a daily basis. Combur reagent strips (Combur3Test®, Roche) were used to test urine samples for glucosuria, using a semi-quantitative scoring system of 0 (no glucosuria), 1+ (urine glucose level 0.6-5.0 mmol/L), 2+ (3.3-7.7 mmol/L), 3+ (14.8-19.2 mmol/L) and 4+ (52.8-57.2 mmol/L), according to the product information. Reagent strips were pressed into the wetted diaper directly or applied to urine collected by needle and syringe from a piece of gauze in the diaper. Glucosuria was assessed at least 8 times each day of admission in each infant until a corrected age of 35 weeks. Measurements of glucosuria were performed by the attending nurse and were collected only for research purposes. Treating clinicians were unaware of glucosuria results.

Definition

A change in glucosuria was defined as an increase of at least one category; categories 1+ and 2+ were pooled since we previously showed that these categories cannot be reliably distinguished, because the glucosuria ranges that can be found in these two categories overlap.22
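The pooling rule in this definition can be written as a small lookup. The sketch below is one possible reading of it: strip readings are coded 0 to 4, readings 1+ and 2+ share one pooled level, and an increase is any rise in pooled level. The names `POOLED_LEVEL` and `is_increase` are hypothetical, and which two readings within the 24-hour window are compared is not specified in the text.

```python
# Strip readings coded as 0 (none), 1 (+), 2 (++), 3 (+++), 4 (++++).
# Readings 1+ and 2+ are pooled into a single level because their
# glucose ranges overlap.
POOLED_LEVEL = {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}


def is_increase(before, after):
    """Return True if glucosuria rose by at least one pooled category."""
    return POOLED_LEVEL[after] > POOLED_LEVEL[before]


print(is_increase(0, 1))  # from none to 1+: counts as an increase
print(is_increase(1, 2))  # from 1+ to 2+: same pooled level, no increase
print(is_increase(2, 3))  # from 2+ to 3+: counts as an increase
```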

An episode of suspected sepsis was defined as a clinical suspicion by the attending neonatologist. At the onset of each episode, data on clinical signs were assessed in a standardised way, by means of a form on which the physician filled in the clinical signs on which the suspicion of LONS was based, and routine laboratory investigations, including blood cultures, C-reactive protein (CRP) and full blood count, were performed.22

The outcome of each episode was classified in 3 mutually exclusive categories: blood-culture proven sepsis, clinical (blood-culture negative) sepsis and rejected sepsis. The classification was made by the researchers based on the course of the episode after the start of antibiotics. Blood-culture proven sepsis was defined as an episode with positive non-contaminated blood-culture. Infants were classified as having clinical sepsis in case of a negative blood culture despite strong clinical suspicion of sepsis as defined by the attending neonatologist or raised CRP (>10 mg/l) or positive haematological markers. Rejected sepsis was defined as an episode with negative blood culture (plus CRP < 10 and negative haematological markers) or an episode in which no blood culture was performed and no antibiotics were started. In all our analyses and results, LONS was defined as blood-culture proven and/or clinical sepsis, unless stated otherwise.
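The three outcome categories can be expressed as a decision function. This is a sketch of the written definitions only, not the researchers' actual procedure, which also took the clinical course into account. Several choices are assumptions: a contaminated culture is treated like a negative one, a CRP of exactly 10 mg/l counts as not raised, and combinations the text does not cover return "unclassified". The function and parameter names are hypothetical.

```python
from typing import Optional


def classify_episode(
    culture_positive: Optional[bool],  # None means no blood culture was taken
    contaminated: bool,
    strong_clinical_suspicion: bool,
    crp_mg_l: float,
    haematology_positive: bool,
    antibiotics_started: bool,
) -> str:
    """Classify a suspected sepsis episode into one of the outcome groups."""
    if culture_positive is None:
        # The text only covers: no culture and no antibiotics -> rejected.
        if not antibiotics_started:
            return "rejected sepsis"
        return "unclassified"  # assumption: not described in the text

    if culture_positive and not contaminated:
        return "blood-culture proven sepsis"

    # Negative culture (assumption: contaminated cultures are handled here too).
    if strong_clinical_suspicion or crp_mg_l > 10 or haematology_positive:
        return "clinical sepsis"
    return "rejected sepsis"


print(classify_episode(True, False, False, 3.0, False, True))
print(classify_episode(False, False, False, 25.0, False, True))
print(classify_episode(False, False, False, 2.0, False, True))
```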

Statistical analysis

Associations of the degree of glucosuria with patient characteristics and LONS were analysed by ANOVA or chi-square test as appropriate.

The association between the presence of glucosuria, as well as an increase in glucosuria, and the occurrence of LONS was analysed in the group of patients with suspected infection using multivariable logistic regression analysis, adjusting for birth weight, gestational age and postnatal age, as these are known risk factors for glucosuria. Furthermore, we used a multivariable model based on backward regression analysis to correct for some major clinical signs that were shown to be the strongest predictors of LONS in this study population, as we reported earlier, i.e. increased respiratory support, lengthened capillary refill, grey skin and the presence or recent removal of a central venous catheter.21

Analyses were performed using SPSS version 20.0.

Consent and ethical approval

The study was approved by the local medical ethical committee of our hospital. Written informed consent was obtained from the parents.

Results

During the 2-year study period, a total of 316 of 360 eligible patients < 34 weeks were included. Reasons for exclusion were: short admission < 1 day (n=7), antibiotics during the entire admission (n=31) or missing measurements of glucosuria (n=6). A total of 187 episodes of suspected sepsis occurred in 142 patients. After excluding 30 episodes with missing glucosuria measurements, we evaluated 157 episodes of suspected infection in 123 patients. Patient characteristics and their suspected sepsis episodes are presented in Tables 1 and 2.

Table 1. Characteristics of the study population.

Characteristic | All included patients (n=316) | Patients with suspected infection (n=123)
Gestational age, weeks+days | 30+3 (2+1) | 29+1 (2+1)
Birth weight, g | 1431 (481) | 1187 (345)
Male sex, n (%) | 181 (57.3%) | 70 (56.9%)
Died during admission, n (%) | 4 (1.3%) | 1 (0.8%)
Age at onset of suspected infection, days | na | 12 (7-18)
Follow-up, days | 16 (9-16) | 24 (14-35)

Mean (standard deviation) for gestational age and birth weight. For age at onset and follow-up in days, median (p25-p75) is given because of skewed distribution. Data from patients who experienced more than one episode of suspected infection are from their first episode.

Glucosuria

Glucosuria at any moment during the study period, in varying degrees, was found in the majority (65.8%) of patients (+ or ++ in 40.2%, +++ in 17.7% and ++++ in 7.9%). Glucosuria was negatively associated with gestational age (OR per week of gestational age 0.64; 95%CI 0.56-0.73, p<0.001) and birth weight (OR per 100 g birth weight 0.80; 95%CI 0.75-0.85, p<0.001). Glucosuria was found in 92.2% of patients with gestational age ≤ 30 weeks, as compared to 57.7% in gestational age 31-32 weeks and 21.9% in 33-34 weeks (p=0.001). Glucosuria was more common in patients with birth weight < 1200 g (90.8%) than in those > 1200 g (52.7%, p<0.001).

Glucosuria was equally common in males and females (63.5% versus 68.8%; p=0.321).

Sepsis diagnosis

A diagnosis of LONS was made in 74 (47.1%) of the 157 suspected sepsis episodes: 32 (20.4%) clinical sepsis and 42 (27%) culture-proven sepsis. In the culture-proven sepsis episodes the following micro-organisms were found: Staphylococcus epidermidis (57.1%), Staphylococcus aureus (14.3%), Bacillus cereus (19.0%), gram-negative strains (5.0%), Streptococcus (2.3%) and Candida (2.3%).

Association of glucosuria with late onset sepsis

The 108 patients who never had glucosuria less often had a suspected sepsis episode (13.9%) than the 208 patients who ever had glucosuria (60.1%, OR 9.34; 95%CI 5.06-17.22, p<0.001). This association remained significant after correction for gestational age and birth weight (OR 4.19; 95%CI 2.11-8.32, p<0.001).
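The unadjusted odds ratio above can be reproduced from a 2x2 table. The sketch below back-calculates the cell counts from the reported group sizes and percentages (an assumption, since the raw counts are not given) and uses the standard log odds ratio method for the confidence interval. The function name `odds_ratio_ci` is illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval on the log scale.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Assumption: counts back-calculated from the reported percentages.
with_glucosuria = 208
without_glucosuria = 108
suspected_with = round(0.601 * with_glucosuria)        # 60.1% of 208
suspected_without = round(0.139 * without_glucosuria)  # 13.9% of 108

or_value, ci_low, ci_high = odds_ratio_ci(
    suspected_with,
    with_glucosuria - suspected_with,
    suspected_without,
    without_glucosuria - suspected_without,
)
print(f"OR {or_value:.2f}; 95%CI {ci_low:.2f}-{ci_high:.2f}")
```

With these back-calculated counts the result agrees with the reported unadjusted estimate. The adjusted odds ratio cannot be reproduced this way because it requires the patient-level data.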

Table 2. Characteristics of included suspected sepsis episodes (n=157)

Respiratory symptoms
(Increase in) apnea, bradycardia and/or cyanotic spells | 56.1%
Increased respiratory support | 45.2%

Circulatory symptoms
Pallor/gray skin | 48.4%
Capillary refill time > 2 sec | 39.5%
Tachycardia | 32.5%

General symptoms
Temperature instability | 71.3%
Lethargy | 41.4%
Irritability | 10.8%

Laboratory values
CRP (mg/l) | 2 (0-12)
Leucocytes (10E9/l) | 12.5 (8.9-17.9)
Rods (%) | 0 (0-3)
IT-ratio (%) | 0 (0-5.1)

Risk factors
CVC in last 24 hours | 31.2%
Mechanical ventilation | 8.9%

Percentages are given, except for “Laboratory values”, where mean (p25-p75) is presented. CVC, central venous catheter.


Glucosuria in the period before onset of LONS

Suspected sepsis episodes were commonly preceded by glucosuria between 48 and 24 hours (46.5%), or in the 24 hours before onset of clinical suspicion (56.7% of episodes). Glucosuria 48-24 hours before onset was not associated with confirmed LONS, defined as clinical and proven sepsis (OR 1.08, 95%CI 0.77-1.51, p=0.705), nor with solely culture-proven sepsis (OR 0.84, 95%CI 0.56-1.26, p=0.401). The degree of glucosuria in the 24 hours before clinical suspicion was positively associated with confirmed LONS (OR 2.95, 95%CI 1.50-5.77, p=0.002), but not with culture-proven sepsis (OR 1.23, 95%CI 0.91-1.66, p=0.185, Table 3). After adjustment for gestational age, birth weight and postnatal age, the association between the degree of glucosuria in the 24 hours before the episode and confirmed LONS was no longer statistically significant (OR 1.36, 95%CI 0.99-1.85, p=0.055). The diagnostic value of the presence of glucosuria in the 24-hour period before suspected LONS is presented in Table 4.

Table 3. Association of glucosuria in the 24-hour period before clinical suspicion of infection and the final diagnosis of late-onset neonatal sepsis (n=157)

Glucosuria | Prevalence | Rejected infection (n=83) | Clinical sepsis (n=32) | Proven sepsis, positive blood culture (n=42) | P (ANOVA)
Maximum glucosuria 24 hours before onset of suspected infection | | | | | 0.031
0 | 68 (43.3%) | 46 (67.7%) | 9 (13.2%) | 13 (19.1%) |
+ | 35 (22.3%) | 15 (42.9%) | 8 (22.9%) | 12 (34.3%) |
++ | 34 (21.7%) | 14 (41.2%) | 10 (29.4%) | 10 (29.4%) |
+++ | 15 (9.6%) | 6 (40.0%) | 2 (13.3%) | 7 (46.7%) |
++++ | 5 (3.1%) | 2 (40.0%) | 3 (60.0%) | 0 (0%) |
Increase in glucosuria* | 33 (22.0%) | 12 (36.4%) | 8 (24.2%) | 13 (39.4%) | 0.017

*Data on change in glucosuria in the 24-hour period before onset of clinical suspicion of sepsis were missing in 7 of 157 episodes

Table 4. Diagnostic value of glucosuria in LONS

157 episodes of suspected LONS; confirmed clinical and culture-proven LONS n=74 (47.1%)

Test | Prevalence | Sensitivity | Specificity | PPV | NPV | LR+ | LR-
Presence of glucosuria | 56.7% | 70.3% | 55.4% | 58.4% | 67.6% | 1.6 | 0.54
Increase in glucosuria | 22.0% | 30.4% | 85.2% | 63.6% | 59.4% | 2.1 | 0.82

LONS, late-onset neonatal sepsis; PPV, positive predictive value; NPV, negative predictive value; LR, likelihood ratio
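The values in the "Presence of glucosuria" row follow from a 2x2 table that can be assembled from the counts in Table 3. The sketch below does this under the assumption that "presence" means any reading of + or higher in the 24 hours before onset and that confirmed LONS combines the clinical and proven sepsis columns. The function name `diagnostic_values` is illustrative.

```python
def diagnostic_values(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Counts from Table 3 (row "0" is no glucosuria, the other rows are summed).
fn = 9 + 13   # LONS (clinical + proven) without glucosuria
tn = 46       # rejected infection without glucosuria
tp = 74 - fn  # LONS with glucosuria
fp = 83 - tn  # rejected infection with glucosuria

presence = diagnostic_values(tp, fn, fp, tn)
for name, value in presence.items():
    print(f"{name}: {value:.3f}")
```

The row for an increase in glucosuria is not recomputed here, because 7 episodes had missing data and their distribution over the outcome groups is not reported.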

Increase in glucosuria in the period before onset of LONS

A change in glucosuria in the 24 hours before the onset of suspected LONS was found in 48 (32%) of episodes (decrease in 10%, no change in 68%, one category increase in 20%, two categories increase in 3%; Table 3).


An increase in glucosuria in the 24 hours before onset of symptoms was associated with confirmed clinical and culture-proven LONS (OR 2.52, 95%CI 1.13-5.59) but not with solely culture-proven sepsis (OR 2.26, 95%CI 0.99-5.19, p=0.05). After adjustment for gestational age, birth weight and postnatal age, this association weakened and was no longer significant (adjusted OR 2.19; 95%CI 0.96-4.97, p=0.061). The diagnostic value of the increase in glucosuria in the 24-hour period before suspected LONS is presented in Table 4.

Comparison of glucosuria with other clinical signs and symptoms in LONS

Correction for some major clinical signs that were shown to be the strongest predictors of LONS in this study population, as we reported earlier,21 weakened the association of an increase in glucosuria to non-significant (OR 1.6; 95%CI 0.61-4.00, p=0.356). When putting all the signs in a multivariable model, using backward logistic regression analysis, the increase in glucosuria was left out of the final model, meaning that this factor does not add to the predictive value of the combination of the other signs and symptoms.

Discussion

Principal findings

This study shows that glucosuria is very common in preterm infants, with increasing prevalence with lower gestational age, lower birth weight, and with the occurrence of late-onset sepsis. Although an increase in the degree of glucosuria in the 24 hours before clinical suspicion of sepsis is associated with confirmed LONS, the diagnostic value is only marginal. This is probably because glucosuria is mainly associated with gestational age and weight, which are well-known and stronger risk factors for the occurrence of LONS. The measurement of glucosuria does not seem to be of additive value in detecting LONS when compared to the stronger clinical signs, such as increased respiratory support or severe respiratory or circulatory symptoms.21 Because glucosuria was more strongly associated with confirmed LONS (either clinical sepsis or culture-proven sepsis) than with solely culture-proven LONS, it appears that (an increase in) glucosuria can possibly be regarded as a sign of stress and not as a specific marker of infectious sepsis. This is in accordance with earlier reports of the association of hyperglycaemia with LONS.14,15,17

Our study confirms earlier observations that preterm infants have an increased risk of glucosuria and that glucosuria is negatively associated with gestational age and birth weight.14,20,23 Our study is the first, however, to assess the value of the presence and degree of glucosuria as an early marker in detecting LONS in preterm infants.

Strengths and weaknesses

The strength of our study is that the data were prospectively collected in a well-defined population of preterm infants suspected of late-onset sepsis. This is important because both clinical symptoms and inflammatory responses may differ between preterm and term infants.24 Attending physicians and researchers who classified the sepsis episodes were strictly blinded to glucosuria results. This is crucial to prevent incorporation bias and thus overestimation of diagnostic accuracy or associations.25

We consider it an additional strength that we included not only blood-culture proven sepsis, but also clinical blood-culture negative sepsis, reflecting daily practice in neonatal intensive care wards.

An important weakness of our study was the lack of a strict definition of suspicion of sepsis. We left the treating physician to judge whether the signs and symptoms gave rise to clinical suspicion of LONS, realizing that this will lead to a certain inter-physician variation. Another point of criticism is that we only analysed the glucosuria results in suspected episodes of LONS and thus did not further analyse the predictive value of glucosuria. This means that we do not know how often preterm infants experience a change in degree of glucosuria without subsequent clinical suspicion of sepsis. It would be interesting to know whether daily monitoring of glucosuria adds value to the monitoring of clinical findings and vital parameters for early detection of LONS.

In conclusion, we demonstrate that changes in the degree of glucosuria occur early in the course of LONS in preterm infants, but that this marker is only of marginal diagnostic value.

acknowledgements

We wish to thank L.J.M. Groot-Jebbink and C.M. Bunkers, research nurses, for their vital assistance in collecting and managing the data.

references

1. Stoll BJ, Hansen N, Fanaroff AA, et al. Late-onset sepsis in very low birth weight neonates: the experience of the NICHD Neonatal Research Network. Pediatrics 2002;110:285-291.

2. Fanaroff AA, Korones SB, Wright LL, Verter J, Poland RL, Bauer CR, et al. Incidence and presenting features, risk factors and significance of late onset septicaemia in very low birth weight infants. Pediatr Infect Dis 1998;17:593-8.

3. Mahieu LM, De Muynck AO, De Dooy JJ, et al. Prediction of nosocomial sepsis in neonates by means of a computer-weighted bedside scoring system (NOSEP). Crit Care Med 2000;28:2026-2033.

4. Okascharoen C, Hui C, Cairnie J, et al. External validation of bedside prediction score for diagnosis of late-onset neonatal sepsis. J Perinatol 2007;27:496-501.

5. Ohlin A, Bjorkqvist, Montgomery SM, Schollin J. Clinical signs and CRP-values associated with blood culture results in neonates evaluated for suspected sepsis. Acta Paediatrica 2010;99:1635-1640.

6. Bekhof J, Reitsma JB, Kok JH, Van Straaten IH. Clinical signs to identify late-onset sepsis in preterm infants. Eur J Pediatr. 2013;172:501-8.

7. Da Silva O, Ohlsson A, Kenyon C. Accuracy of leucocyte indices and C-reactive protein for diagnosis of neonatal sepsis: a critical review. Pediatr Infect Dis J 1995;14:362-366.

8. Gonzalez BE, Mercado CK, Johnson L, Brodsky NL, Bhandari V. Early markers of late-onset sepsis in premature neonates: clinical, haematological and cytokine profile. J Perinatal Med 2003;31:60-68.

9. Caldas J, Marba S, Blotta M, Carlil R, Morais S, Oliviera R. Accuracy of white blood cell count, C-reactive protein, interleukin-6 and tumor necrosis factor alpha for diagnosing late onset neonatal sepsis. J Pediatr 2008;84:536-42.

10. Weidlich K, Kroth J, Nussbaum C, Hiedl S, Bauer A, Christ F, Genzel-Boroviczeny O. Changes in microcirculation as early markers for infection in preterm infants- an observational prospective study. Pediatr Res 2009;66:461-465.

11. Griffin MP, Lake D, Bissonette E, Harrell F, O’Shea TM, Randall J. Heart rate characteristics: novel physiomarkers to predict neonatal infection and death. Pediatrics 2005;116:1070-4.

12. Moorman JR, Carlo WA, Kattwinkel J, Schelonka RL, Porcelli PJ, Navarrete CT, et al. Mortality reduction by heart rate characteristic monitoring in very low birth weight neonates: a randomized trial. J Pediatr 2011;159:900-6.

13. Leante-Castellanos JL, LLoreda-Garcia JM, Garcia-Gonzalez A, LLopis-Bano C, Fuentes-Gutierrez C, Alonso-Galeggo JA, Martinez-Gimeno A. Central-peripheral temperature gradient: an early diagnostic sign of late-onset sepsis in very low birth weight infants. J Perinat Med 2012;40:571-576.

14. Hey E. Hyperglycaemia and the very preterm baby. Semin Fetal Neonatal Med 2005;10:377-87.

15. Beardsall K, Vanhaesebrouck S, Ogilvy-Stuart AL, Vanhole C, Palmer CR, Ong K, Weissenbruch van M, Midgley P, Thompson M, Thio M, Cornette L, Ossuetta I, Iglesias I, Theyskens C, Jong de M, Gill B, Ahluwalia JS, Zegher de F, Dunger DB. Prevalence and determinants of hyperglycemia in very low birth weight infants: cohort analyses of the NIRTURE study. J Pediatr 2010;157:715-9.

16. Farrag HM, Cowett RM. Glucose homeostasis in the micropremie. Nutrition and metabolism of the micropremie 2000;27:1-22.

17. Manzoni P, Castagnola E, Mostert M, Sala U, Galetto P, Gomirato G. Hyperglycemia as a possible marker of invasive fungal infection in preterm neonates. Acta Paediatrica 2006;95:486-493.

18. Wilkins BH. Renal function in sick very low birthweight infants: 4. Glucose excretion. Arch Dis Child 1992;67:1162-1165.

19. Kao LS, Morris BH, Lally KP, Stewart CD, Huseby V, Kennedy KA. Hyperglycemia and morbidity and mortality in extremely low birth weight infants. J Perinatol 2006;26:730-736.

20. Ogilvy-Stuart AL, Beardsall K. Management of hyperglycaemia in the preterm infant. Arch Dis Child Fetal Neonatal Ed 2010;95:126-131.

21. Bekhof J, Reitsma JB, Kok JH, Van Straaten HLM. Clinical signs to identify late-onset sepsis in preterm infants. Eur J Pediatr 2013;172:501-8.

22. Bekhof J, Kollen BJ, Groot-Jebbink LJ, Deiman C, van de Leur SJ, van Straaten HL. Validity and interobserver agreement of reagent strips for measurement of glucosuria. Scand J Clin Lab Invest 2011;71:248-52.

23. Decaro MH, Vain NE. Hyperglycaemia in preterm neonates: What to know, what to do. Early Hum Dev 2011;87S:S19-22.

24. Gerdes JS. Clinicopathologic approach to the diagnosis of neonatal sepsis. Clinics in perinatology 1991;18:361-381.

25. Reitsma JB, Rutjes AW, Khan KS, Coomarasamy A, Bossuyt PM. A review of solutions for diagnostic accuracy studies with an imperfect or missing reference standard. J Clin Epidemiol 2009;62:797-806.

Chapter 3

Fluid balance charts in neonates

Reliability of the fluid balance in neonates

Yvette van Asperen, Paul Brand, Jolita Bekhof

Acta Paediatr 2012;101:479-83

abstract

Aim: To assess the reliability of fluid balance charts in neonates.

Methods: An observational study in 170 non-breastfed neonates who required continuous monitoring on a high care unit but were not critically ill. The fluid balance was compared to daily body weight changes using Bland-Altman analysis. Differences of more than 20% of daily fluid intake were considered clinically relevant.

Results: The mean gestational age was 36+2 weeks (SD 18.7 days) and mean birth weight 2782 g (SD 749 g). The mean difference between 394 fluid balances over 24 hours (in ml) and daily weight changes (in g) was -12.1 (limits of agreement -128.1 to 103.8). In 40% of comparisons the difference with daily weight change was more than 20% of daily fluid intake.

Conclusion: Fluid balance charts both over- and underestimate body weight changes in an unpredictable pattern and are therefore unreliable as a single measure of fluid status in neonates.

introduction

Fluid balances are widely used in several clinical settings. In neonatal high-care, keeping a fluid balance is accepted as part of routine care.1,2,3 Accurately keeping a fluid balance is a challenging and time-consuming task for the nursing staff.3 Problems in estimating the quantities of body fluids leaking into bed sheets and diapers, insensible losses and errors in calculations threaten the accuracy of the fluid balance results.4,5 In neonates insensible water loss is particularly difficult to estimate because it is influenced in an uncertain way by various factors such as ambient humidity, gestational age, respiratory rate, and phototherapy.1,6 Furthermore the fluid intake of breastfed infants is uncertain because there is no reliable and practical method to assess milk intake in breastfeeding infants.7

Despite its widespread use, studies evaluating the reliability of the fluid balance are sparse, and those performed in adults show poor accuracy.4,5,8,9 No studies to date have evaluated the reliability of fluid balance recording in neonates. Therefore, we performed an observational study in sick neonates to investigate the reliability of fluid balances by comparing fluid balance data to daily changes in body weight, which we consider the most accurate estimation of fluid status.

methods

Patients

From June 2009 through March 2010, we conducted this study in our neonatal high care ward. Patients admitted to this high care ward (approximately 500 per year) are sick, but non-critically ill neonates with gestational age > 32 weeks or moderate disease severity, requiring continuous monitoring of cardiorespiratory status without the need for intensive care. Whenever possible, infants are breastfed; however, in this high-care population, most infants (or their mothers) are too sick to breastfeed. The majority of infants receive tube feeding or bottle feeding (when possible with expressed mother’s milk).

Each neonate initially admitted to the neonatal high care ward was eligible for inclusion in the study. We chose only to include patients who were not breastfed, because there is no reliable method for assessing milk intake in breastfed neonates.7 Patients transferred from the neonatal intensive care unit were excluded, because these patients are well stabilized and routine fluid balance keeping in this patient category is not common practice in our clinic.

Assessment of daily fluid balance and body weight

In all included patients, fluid input and output, fluid balance chart, and body weight were assessed daily during the first three days of admission to the ward. In the majority of patients this will be the first 3 days after birth.10,5 Insensible water loss was calculated as 30 ml/kg/day for neonates born at a gestational age between 32 and 37 weeks, and 20 ml/kg/day for neonates born after 37 weeks.11 Patients were weighed daily without clothes, diaper and monitor or pulse oximeter wires, before feeding between 6:00 and 9:00 am.
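The calculation described above can be sketched as follows. This is a minimal illustration and not the ward's actual charting procedure: the function names, the choice to treat a gestational age of exactly 37 weeks as term, and the example values are assumptions made for the sketch.

```python
def insensible_loss_ml(weight_kg: float, gestational_age_weeks: float) -> float:
    """Estimated insensible water loss over 24 hours, in ml."""
    if gestational_age_weeks < 32:
        raise ValueError("the rates in the text only cover gestational age >= 32 weeks")
    # 30 ml/kg/day between 32 and 37 weeks, 20 ml/kg/day after 37 weeks.
    # Assumption: exactly 37 weeks is treated as term.
    rate_ml_per_kg = 30.0 if gestational_age_weeks < 37 else 20.0
    return rate_ml_per_kg * weight_kg


def fluid_balance_ml(intake_ml: float, output_ml: float,
                     weight_kg: float, gestational_age_weeks: float) -> float:
    """24-hour fluid balance: intake minus measured output minus insensible loss."""
    return intake_ml - output_ml - insensible_loss_ml(weight_kg, gestational_age_weeks)


# Hypothetical neonate: 2.5 kg, 35 weeks, 300 ml intake, 200 ml measured output.
balance = fluid_balance_ml(300.0, 200.0, 2.5, 35)
```

In this hypothetical case the insensible loss is 75 ml, so the charted balance is 25 ml; that value would then be compared with the weight change over the same 24 hours.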

Statistical analysis

The correlation between fluid balance and daily body weight changes was assessed using Pearson correlation coefficients. We assumed 1 milliliter of fluid to correspond with 1 gram of body weight. The agreement of the fluid balance with daily body weight changes was assessed by a Bland-Altman plot.12 The Bland-Altman plot is the preferred method for assessing the agreement between two different measurements of a continuous measure. It places the average of the weight change and the fluid balance on the x-axis, and the difference between the weight change and the fluid balance on the y-axis. The plot gives the mean difference between the two methods and the limits of agreement (mean difference ± 1.96 SD), allowing the reader to judge the clinical relevance of the agreement.12
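The Bland-Altman quantities described above can be sketched as follows. The analyses in this study were performed in SPSS; this Python fragment is only an illustration, and the function name and the example values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(weight_change_g, fluid_balance_ml):
    """Return (mean difference, lower limit, upper limit) of agreement.

    Differences are weight change minus fluid balance, assuming 1 ml = 1 g.
    """
    if len(weight_change_g) != len(fluid_balance_ml):
        raise ValueError("both series must have the same length")
    diffs = [w - f for w, f in zip(weight_change_g, fluid_balance_ml)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up values for illustration only.
weight_change = [-40.0, 10.0, -25.0, 30.0]
fluid_balance = [-10.0, 0.0, -45.0, 60.0]
bias, lower, upper = bland_altman(weight_change, fluid_balance)
```

The limits of agreement describe the range in which about 95% of individual differences are expected to fall, which is why they are much wider than a confidence interval around the mean difference.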

Before the start of the study, we defined differences between the two methods of more than 20 percent of the daily fluid intake (corresponding to approximately 50 ml, or a 50 g change in body weight) to be clinically relevant. We used t-tests to assess the influence of potential confounding factors (mentioned in table 2) on the difference between the fluid balance and body weight change, and ANOVA for the influence of postnatal age.
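The predefined threshold can be expressed as a simple check per comparison. The sketch below is illustrative only; the function name and the example values are hypothetical.

```python
def clinically_relevant(weight_change_g: float, fluid_balance_ml: float,
                        intake_ml: float, threshold: float = 0.20) -> bool:
    """True if the two methods differ by more than `threshold` of daily intake."""
    if intake_ml <= 0:
        raise ValueError("daily fluid intake must be positive")
    return abs(weight_change_g - fluid_balance_ml) > threshold * intake_ml


# Hypothetical day with 250 ml intake, so the threshold is 50 ml (or g).
flags = [
    clinically_relevant(-30.0, 10.0, 250.0),  # difference of 40: within threshold
    clinically_relevant(-30.0, 40.0, 250.0),  # difference of 70: exceeds threshold
]
share_relevant = sum(flags) / len(flags)
```

Applying such a check to every daily comparison gives the proportion of clinically relevant differences.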

To rule out the possibility of human errors in calculating the fluid balance, we entered all recorded data on fluid intake and output in a spreadsheet program and recalculated the fluid balance electronically.

All p values mentioned are two-sided. Analyses were performed with Statistical Package for the Social Sciences (SPSS) software, version 18.0.

Ethical approval

Written informed consent was obtained from the parents or caretakers of the participants. The study was approved by the institutional ethics review board.

results

Figure 1 presents the enrolment of patients. During the study period 381 patients were admitted to the neonatal high care ward. Ultimately, fluid balance calculations and body weight changes were compared on 394 days in 170 patients. Table 1 shows the baseline characteristics of included patients. Of 170 included patients, 136 (80%) were included on the first day, 17 (10%) on the second day, 6 (4%) on the third day and 11 (6%) beyond the third day after birth.

381 patients admitted during 9-months study period
    164 not eligible (fluid balance not routinely indicated)
        70 transferred from neonatal intensive care
        60 medium care
        34 other (observation for peri-operative complications, post-vaccination, social indication, etc.)
    217 eligible
        47 missed inclusion
            10 no parental consent
            37 missed (referral within 24 hours, data not collected)
        170 included in prospective cohort

Figure 1. Patient enrolment

The correlation coefficient between fluid balance (in ml) and body weight change (in g) was 0.64 over 24 hours (p<0.001) and 0.925 over 3 days (p<0.001). The mean difference when subtracting the 24-hour fluid balance (in ml) from the daily change in body weight (in g) was -12.1 (limits of agreement -128.1 to 103.8). In 158 comparisons (40%), the difference between the fluid balance and change in body weight exceeded 20% of the daily fluid intake, our predefined threshold of clinical relevance. Complete data on weight change and fluid balance on all 3 days were available in 85 patients. The mean difference between the total of weight differences and the total of fluid balances over 3 days was -44.5 (limits of agreement -246.2 to 157.2). In 21.2% of comparisons the total difference over 3 days was more than 5% of birth weight. Fluid balances both over- and underestimated body weight changes in an unpredictable pattern (Figure 2).

Table 1. Characteristics of patients

Characteristics                     n=170
Male sex                            100 (59%)
Gestational age (weeks+days)        36+2 (14+5)
    Preterm (< 37 weeks)            106 (62%)
Body weight (g)                     2782 (749)
    < 1800 g                        14 (8%)
Respiratory support #               47 (28%)
Phototherapy                        30 (18%)
Incubator                           145 (85%)
Tube feeding                        117 (69%)

Data are mean (SD) or number (%). # Nasal continuous positive airway pressure or supplemental oxygen through nasal cannula.

[Figure 2: Bland-Altman plot. x-axis: average of weight difference and fluid balance (ml); y-axis: difference between weight difference and fluid balance (ml).]

Figure 2. Bland-Altman plot for agreement between daily fluid balance and difference in body weight. The fluid balance includes correction for insensible water losses. Every dot represents a measurement (n=394). Mean difference between daily difference in body weight and the fluid balance is -12.1 (limits of agreement -128.1 to 103.8).

The difference between fluid balance and body weight change was significantly influenced by regurgitation/vomiting and prematurity: in patients with regurgitation/vomiting, agreement between fluid balance and daily weight change was lower (p=0.03), whilst in premature patients with gestational age < 36 weeks, agreement was higher (p<0.01). The imprecision of the fluid balance was not significantly related to respiratory support (p=0.06) or to phototherapy (p=0.53). Fever and tachypnoea were present on only a very small number of days, precluding meaningful statistical analysis (Table 2). Postnatal age did not influence the difference between weight change and fluid balance: differences on day 1, 2 and 3 were -17.3 ml, -7.8 ml and -24.1 ml, respectively (ANOVA p=0.11).

Table 2. Factors that might influence the reliability of the fluid balance. Discrepancy in daily body weight changes and fluid balance in patients with and without potential confounding factors.

                                        Factor present            Factor absent
Potential confounding factor            n     Mean (SD)           n     Mean (SD)           95% CI              p value
Tachypnoea (>80/min >1hr)               16    *                   378   -12.08 (59.88)      *                   *
Respiratory support #                   46    6.65 (72.60)        345   -15.03 (56.46)      -0.64 to 44.01      0.06
Frequent defecation (≥3 times a day)    263   -13.71 (58.70)      131   -8.20 (57.19)       -12.94 to 23.96     0.56
Phototherapy                            67    -19.59 (56.99)      327   -10.62 (59.58)      -24.57 to 6.62      0.26
Fever (>38 °C)                          7     *                   387   -12.92 (57.96)      *                   *
Premature                               274   -2.1 (53.54)        120   -35.08 (64.98)      19.66 to 46.31      <0.01
Vomiting/regurgitation                  125   -21.88 (58.46)      269   -7.62 (59.06)       -26.79 to -1.73     0.03

The total number of comparisons between daily body weight changes and fluid balance was 394. n = number of comparisons; mean = mean difference between fluid balance and daily body weight change, obtained by subtracting the fluid balance (in ml) from the change in body weight (in g); SD = standard deviation; CI = 95% confidence interval; p value from t-test. * Due to small numbers, mean, SD and t-test not applicable. # Nasal continuous positive airway pressure or supplemental oxygen through nasal cannula.

Recalculating the fluid balance data with a spreadsheet program revealed 179 human calculation errors (45%). The differences from the hand-recorded fluid balances were small, with a median difference of 0.0 ml (interquartile range -8.0 to 0.00, p=0.11). Moreover, the agreement of the fluid balance with daily body weight changes did not improve when spreadsheet-calculated data were used. The mean difference between the calculated fluid balance (in ml) and body weight change (in g) was slightly larger than that observed for the hand-recorded fluid balance data (-19.3, limits of agreement -131.7 to 93.2).

discussion

This study shows that the fluid balance kept on a neonatal high-care ward agrees poorly with daily changes in body weight, with more than 40% of the comparisons showing clinically relevant differences. Our findings confirm earlier reports in adults showing poor agreement of fluid balance with changes in body weight,4,5,13 central venous monitoring14 or bioelectrical impedance measurements.15

Although the concept of estimating fluid status by recording fluid in- and output seems logical, the inaccuracy of the fluid balance raises serious doubt about the usefulness of keeping a fluid balance in clinical practice. The numerous assessments of fluid volumes during a day are all subject to measurement errors. Although measuring fluid intake in sick newborns appears to be straightforward, spilling of milk while feeding and regurgitation or vomiting may reduce the precision of such intake recordings. In our study, vomiting significantly influenced the imprecision of the fluid balance. Similarly, problems may occur in estimating the quantities of fluid output. Urinary output is usually measured by weighing diapers before and after micturition. This does not take spilling of urine into bedsheets into account. In addition, the scales used to weigh diapers are unlikely to be sensitive enough to pick up small weight changes of the diaper.7 Defecation might be another factor increasing inaccuracy of the fluid balance. It is uncertain how much fluid is lost through defecation; however, frequent defecation (>3 times a day) did not significantly influence the imprecision of the fluid balance in our study (p=0.56).

The insensible losses may be influenced by several factors. In this study, tachypnoea and fever were only present in a small number of comparisons, limiting our ability to study their influence on the imprecision of the fluid balance. Respiratory support and phototherapy did not significantly influence the difference between the fluid balance and weight change, possibly because it is routine on our ward to nurse infants with respiratory support or phototherapy in incubators, ensuring humidification. The significant difference in imprecision of the fluid balance between premature and term infants is remarkable. A possible explanation might be that the (calculated) correction for insensible water loss is more accurate in premature infants (30 ml/kg/day) than in term infants (20 ml/kg/day).11

We acknowledge the following limitations of our study. Firstly, it could be argued that using another measure of fluid status, such as central venous pressure monitoring or body impedance measurements, would have been valuable. Because we wanted to assess the reliability of the fluid balance in clinical practice, we purposely chose (trends in) body weight as comparison, since this is most frequently used in clinical practice, and is probably the most adequate and simple method to estimate total fluid balance over time in neonates. Secondly, this study was not performed in infants requiring neonatal intensive care, in whom recording a fluid balance may have the largest clinical impact and usefulness. We deliberately chose a high-care population, in whom recording all required data (including daily body weight measurements) would be most practical and feasible. This means that we cannot draw conclusions about the usefulness of keeping a fluid balance in critically ill patients. It can be argued that urinary output and trends in fluid balance and body weight remain indispensable in critically ill patients. Thirdly, the frequent use of incubators in our population might threaten generalizability of the results. For reasons of optimal humidification and facilitation of close observation, it is common practice in our hospital to prefer an incubator over a cot or open radiant table; however, this policy is merely a matter of tradition and is at least debatable. Since use of an open radiant table augments the insensible losses, the reliability of the fluid balance might even be diminished in this situation.

The main strengths of our study are that we examined the reliability of fluid balances as they are currently recorded in many neonatal wards and the large size of our study group.

In conclusion, this study demonstrates the imprecision of fluid balance in neonates. Fluid balance charts both over- and underestimate body weight changes in non-critically ill neonates in an unpredictable pattern and are therefore unreliable as a single measure of fluid status.

references

1. Aggarwal R, Deorari AK, Paul VK. Fluid and electrolyte management in term and preterm neonates. Indian J Pediatr 2001;68:1139-42.

2. Scales K, Pilsworth J. The importance of fluid balance in clinical practice. Nurs Stand 2008;22:50-7.

3. Tsang LD, DeMarini S, Rath LL. Electrolytes, vitamins, trace minerals. In: Kenner C, Wright Lott J. Neonatal nursing 3rd ed. Philadelphia: WB Saunders, 2003:409-24.

4. Chung LH, Chong S, French P. The efficiency of fluid balance charting: an evidence-based management project. J Nurs Manag 2002;10:103-13.

5. Daffurn K, Hillman KM, Bauman A, Lum M, Crispin C, Ince L. Fluid balance charts: do they measure up? Br J Nurs 1994;3:816-20.

6. Grunhagen DJ, de Boer MG, de Beaufort AJ, Walther FJ. Transepidermal water loss during halogen spotlight phototherapy in preterm infants. Pediatr Res 2002;51:402-5.

7. Savenije OE, Brand PL. Accuracy and precision of test weighing to assess milk intake in newborn infants. Arch Dis Child Fetal Neonatal Ed.2006;91:F330-32.

8. Eastwood GM. Evaluating the reliability of recorded fluid balance to approximate body weight change in patients undergoing cardiac surgery. Heart Lung 2006;35:27-33.

9. Mank A, Semin-Goossens A, Lelie J, Bakker P, Vos R. Monitoring hyperhydration during high-dose chemotherapy: body weight or fluid balance? Acta Haematol 2003;109:163-8.

10. Brand PLP. Test weighing for term and premature infants is an accurate procedure: author’s reply. Arch Dis Child Fetal Neonatal Ed 2007;92:F328.

11. Davis ID, Avner ED. Fluid and Electrolyte Management. In: Martin RJ, Fanaroff AA, Walsh MC. Neonatal-perinatal medicine: diseases of the fetus and infant 8th ed. Mosby, 2005.

12. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-10.

13. Meier PP, Engstrom JL. Test weighing for term and preterm infants is an accurate procedure. Arch Dis Child Fetal Neonatal Ed 2007;92:F155-6.

14. Craig J, Mathieu S. Is central venous pressure monitoring appropriate for assessment of perioperative fluid balance? Br J Hosp Med (Lond) 2006;67:108.

15. Nenchev N, Hatib F, Daskalov I. Monitoring relative fluid balance alterations in haemodialysis of diabetic patients by electrical impedance. Physiol Meas 1998;19:35-52.

Usefulness of the fluid balance: A randomised controlled trial in neonates

Jolita Bekhof, Yvette van Asperen, Paul Brand

J Paediatr Child Health 2013;49:486-92

abstract

Aim: To assess the effects of fluid balance charts in neonates with moderate disease severity on duration of hospitalisation and medical interventions.

Methods: Randomised, controlled trial in a neonatal ward in a general teaching hospital in the Netherlands between June 2009 and March 2010. A total of 170 neonates with moderate disease severity, requiring continuous monitoring of vital parameters (mean gestational age 36+2 weeks (SD 2+5 days), mean birth weight 2782 g (SD 749 g)), participated. In the control group (n=86), attending physicians could access all fluid balance data, whilst these data were blacked out in the intervention group (n=84). Primary outcome was length of hospital stay. Secondary outcomes were percentage weight loss, interventions based on the fluid status, unblinding of fluid balance data and incident reporting.

Results: Length of hospital stay did not differ significantly between the intervention and the control group (median 9 versus 8 days; ratio of geometric means 1.25, 95% CI 0.99 to 1.57; p=0.06). We found no significant differences in secondary outcomes.

Conclusions: Routinely keeping fluid balances in neonates with moderate disease severity does not affect duration of hospitalisation or medical treatment.

Trial registration: ClinicalTrials.gov, number NCT00962754.

introduction

Assessment of fluid status is commonly performed in a wide variety of clinical settings, especially in critically ill patients. Several methods of examining the fluid status are used: physical examination aimed at identifying clinical symptoms of overhydration or dehydration, laboratory investigations (blood or urine osmolarity and electrolytes), more invasive haemodynamic measurements (central venous pressure monitoring) or bioelectrical impedance measurements. The most commonly used method, however, is keeping a fluid balance. This includes meticulously measuring fluid in- and output, taking insensible water losses into account.1 Accurately keeping a fluid balance chart is a challenging and labour-intensive task. Despite its widespread use, only a few studies have examined its reliability, most in adults, and these studies showed poor accuracy.2-7 Earlier we published a study which confirmed poor accuracy of fluid balances in neonates.8 Moreover, the clinical usefulness of keeping a fluid balance is unknown. To our knowledge, no studies to date have assessed whether keeping a fluid balance improves patient care, and if so, at what cost.

In intensive care settings, keeping a fluid balance is commonly accepted as part of routine care.9 No studies to date have evaluated the usefulness of fluid balance recording in neonates. Therefore we undertook a randomised controlled trial to evaluate the effects of fluid balance recording in sick neonates on duration of hospitalisation, weight loss and medical interventions aimed at the patient's fluid status. For safety reasons we chose a patient group with moderate disease severity as a proof of principle before studying more critically ill patients in a Neonatal Intensive Care unit. We also analysed the time spent keeping and calculating a fluid balance.

Materials and methods

Trial design

From June 2009 through March 2010, we conducted a randomised controlled trial in neonates with moderate disease severity (Patients), comparing the situation where the physician has access to all fluid balance data (Control) with the situation where the fluid balance data were blacked out (Intervention). The primary outcome was the duration of hospitalisation (Outcome).

Patients

Setting

The study was conducted in the Neonatal High Care ward of the Isala Klinieken, a large general teaching hospital in Zwolle, the Netherlands. Patients admitted to this 12-bed High Care ward are neonates with moderate disease severity requiring continuous monitoring of cardiorespiratory status without the need for intensive care. The population primarily consists of preterm infants (gestational age 32-35 weeks), low birth weight infants (birth weight 1200-1800 grams), infants needing non-invasive respiratory support (nasal continuous positive airway pressure or supplemental oxygen through a nasal cannula) and infants treated for neonatal hyperbilirubinaemia, hypoglycaemia or (suspected) neonatal sepsis. The majority of patients, including (near) term newborns, are nursed in a humidified (50%) incubator. The choice of humidified incubators is mainly based on the nurses' preference to observe the sick babies in our ward without clothes or blankets, especially during the first 1-2 days after birth. Moreover, our ward has access to only a limited number of open beds with radiant heaters. Neonates needing mechanical ventilation or inotropic support are admitted to the Neonatal Intensive Care unit, as are preterm neonates born at less than 32 weeks gestational age or with a birth weight of less than 1200 grams. The Neonatal High Care ward receives approximately 500 admissions per year.

Eligibility

Each neonate initially admitted to the Neonatal High Care ward was eligible for inclusion in the study, because it was routine practice in our clinic to record the fluid balance in these patients. Patients transferred from the Neonatal Intensive Care unit were excluded, because these patients are well stabilised and routine fluid balance keeping in this patient category is not common practice in our clinic.

Study intervention

In all included patients, fluid input and output, the fluid balance chart and body weight were assessed daily during the first three days of admission to the ward, which for the majority of patients were the first three days after birth. In the control group, all fluid balance data were available to the attending physician. In the intervention group, the data on fluid intake, fluid output and fluid balance charts were masked; daily body weight measurements and the number of wet diapers remained visible. As part of routine care, diapers of admitted neonates are changed every 3 hours. The attending physician could unblind the fluid balance data when this was judged to be clinically necessary.

Procedures (assessment of daily fluid balance and body weight)

The fluid balance was calculated every 24 hours by subtracting daily fluid output and calculated insensible water losses from daily fluid intake. Fluid intake was calculated by adding the amounts of enteral feeding, intravenous fluids and oral or intravenous medication. Urinary output and defaecation were determined by weighing diapers. When a diaper was soiled with stool, the weight of the full diaper was recorded as defaecation and added to the daily output. Fluid output was not adjusted for vomiting because such estimation has been found to be unreliable in previous studies.6 Insensible water loss was calculated as 30 ml/kg/day for neonates born at a gestational age between 32 and 37 weeks and 20 ml/kg/day for neonates born after 37 weeks.10 Patients were weighed daily without clothes, diaper and pulse oximeter, before feeding, between 6:00 and 9:00 am. A single infant weighing scale was used, which was calibrated monthly according to hospital and manufacturer guidelines.
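The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the ward's actual charting procedure: the function and parameter names are hypothetical, diaper weights are assumed to be net of the dry diaper with 1 g counted as 1 ml, and the handling of a gestational age of exactly 37 weeks (treated here as term) and of infants below 32 weeks (rejected) is an assumption, since the text does not specify it.

```python
def insensible_water_loss_ml(weight_kg: float, gestational_age_weeks: float) -> float:
    """Estimated insensible water loss over 24 hours, in ml."""
    if gestational_age_weeks < 32:
        # Assumption: the rule in the text is only stated from 32 weeks onwards.
        raise ValueError("no insensible-loss rate given below 32 weeks")
    # 30 ml/kg/day below 37 weeks, 20 ml/kg/day from 37 weeks (boundary is assumed).
    rate_ml_per_kg = 30.0 if gestational_age_weeks < 37 else 20.0
    return rate_ml_per_kg * weight_kg


def daily_fluid_balance_ml(enteral_ml, intravenous_ml, medication_ml,
                           urine_ml, stool_ml, weight_kg, gestational_age_weeks):
    """24-hour fluid balance: intake minus measured output minus insensible loss."""
    intake = enteral_ml + intravenous_ml + medication_ml
    output = urine_ml + stool_ml  # from weighed diapers; vomiting is not counted
    return intake - output - insensible_water_loss_ml(weight_kg, gestational_age_weeks)


if __name__ == "__main__":
    # Hypothetical infant: 2.5 kg, gestational age 35 weeks.
    print(daily_fluid_balance_ml(300, 100, 5, 200, 30, 2.5, 35))
```

For the hypothetical infant in the example, intake is 405 ml, measured output 230 ml and estimated insensible loss 75 ml, giving a positive balance of 100 ml.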

Outcome measures

The primary outcome measure was length of hospital stay. Secondary outcomes included the maximum percentage of weight loss during the study period, medical actions relevant to the fluid status, unblinding of the fluid balance data by the attending physician and adverse clinical incident reporting. The maximum percentage of weight loss was computed as the difference between birth weight and the lowest weight during the study period (i.e. the first 3 days after birth) as a proportion of birth weight (maximum percentage weight loss = [birth weight - lowest weight postpartum] / birth weight × 100%). Adverse clinical incidents are recorded electronically in a dedicated system on the hospital's intranet.11 We judged medical actions to be relevant to the fluid status when they were aimed at influencing the fluid status, for example when the physician prescribed intravenous fluid boluses or diuretics. Situations in which fluid intake deviated from our local protocol by more than 10 ml/kg per day were also considered relevant. The primary study investigator (YvA) attended daily ward rounds during the week, making sure that each of these interventions was recorded for study purposes. Unblinding of the fluid balance data in the intervention group by the treating physicians and reported incidents were also used to detect any potential disadvantage of not using the fluid balance.11
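The two numerical definitions in this paragraph can be written out as follows. This is a sketch under stated assumptions: the function names are hypothetical, weights are taken in grams, and the protocol deviation is treated as two-sided (more than 10 ml/kg per day above or below the protocol), which the text does not state explicitly.

```python
def max_percentage_weight_loss(birth_weight_g: float, daily_weights_g: list) -> float:
    """Largest weight loss in the study period as a percentage of birth weight.

    A negative result means the infant never dropped below birth weight.
    """
    lowest = min(daily_weights_g)
    return 100.0 * (birth_weight_g - lowest) / birth_weight_g


def deviates_from_protocol(actual_ml_per_kg: float, protocol_ml_per_kg: float,
                           threshold_ml_per_kg: float = 10.0) -> bool:
    """True when daily intake differs from the protocol by more than the threshold."""
    # Assumption: deviations in either direction count.
    return abs(actual_ml_per_kg - protocol_ml_per_kg) > threshold_ml_per_kg


if __name__ == "__main__":
    # Hypothetical infant: birth weight 3000 g, weights on days 1 to 3.
    print(max_percentage_weight_loss(3000, [2950, 2850, 2900]))
    print(deviates_from_protocol(92, 80))
```

In the example the lowest weight is 2850 g, a loss of 150 g or 5% of birth weight, and an intake of 92 ml/kg against a protocol value of 80 ml/kg counts as a deviation.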

The time spent on recording a fluid balance was assessed in 10 patients by following the nurse and timing the recording with a stopwatch.

Randomisation and blinding

Randomisation was performed following a computer-generated list of random numbers prepared by an independent statistician. No stratification was performed. Concealment of allocation was ensured by recording the randomisation codes in opaque, sealed, numbered envelopes. On admission the nurse opened the randomisation envelopes in consecutive order. Informed consent was obtained from the parents as soon as possible. We chose to randomise the patients before informed consent was given, because we did not want to bother the parents directly after the birth. We made sure informed consent was obtained at a convenient moment, before the first ward round in the morning when fluid balances were reported to the attending physician. Inherent to the study design, parents were blinded to the study intervention but the treating nurse and the treating physician were not. The study investigators performing the statistical analyses (YvA/JB) were not aware which of the two groups was the intervention group.

Sample size

We determined the required sample size for our study from patient admission data from 2005. These admission data showed a positively skewed (skewed-to-the-right) distribution of the length of hospital stay (median 11 days, range 2-43 days). Since our primary outcome measure, length of hospital stay, did not show a normal distribution, logarithmic transformation was performed, giving a geometric mean length of hospital stay of 10.7 days (SD 2.09 days). We calculated that 85 patients in each group would give a power of at least 85% to detect a difference of two days or more in the geometric mean length of hospital stay at a two-sided significance level of 0.05. To account for potential patient withdrawal and missing data, we aimed to include and randomise 190 patients.

Statistical analysis

The difference in length of hospital stay was assessed by obtaining mean differences with 95% confidence intervals from a linear regression model incorporating gestational age, birth weight and early discharge with tube feeding, since these factors are known to be associated with length of hospital stay. Gestational age and birth weight are, understandably, inversely associated with length of hospital stay. Early discharge with tube feeding, on the other hand, shortens length of hospital stay. In our clinic, preterm neonates can be discharged early with tube feeding under close supervision of a nurse.12

As the length of hospital stay was log transformed before analysis, the difference between the two groups is expressed as a ratio of geometric means for length of hospital stay. In addition we calculated the difference in median length of hospital stay with 95% confidence intervals.13 The length of hospital stay was also shown graphically using Kaplan-Meier survival estimates.
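To illustrate why a comparison on the log scale yields a ratio of geometric means, the sketch below computes an unadjusted ratio with an approximate confidence interval. It is not the authors' analysis (they used a regression model in SPSS): the function name is hypothetical, the interval uses a normal approximation rather than a t distribution, and the data are invented. Natural logarithms are used here; the ratio is the same for any base.

```python
import math
from statistics import NormalDist, mean, variance


def geometric_mean_ratio(group_a, group_b, level=0.95):
    """Ratio of geometric means (a / b) with an approximate confidence interval."""
    log_a = [math.log(x) for x in group_a]
    log_b = [math.log(x) for x in group_b]
    # The difference of the mean logs is the log of the ratio of geometric means.
    diff = mean(log_a) - mean(log_b)
    # Standard error of that difference, allowing unequal variances.
    se = math.sqrt(variance(log_a) / len(log_a) + variance(log_b) / len(log_b))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Back-transform the estimate and both interval limits.
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)


if __name__ == "__main__":
    # Made-up lengths of stay in days, for illustration only.
    intervention = [4, 7, 9, 12, 20]
    control = [3, 6, 8, 10, 15]
    ratio, lower, upper = geometric_mean_ratio(intervention, control)
    print(f"ratio {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A ratio of 1 means equal geometric mean length of stay in both groups; an interval that includes 1 corresponds to a non-significant difference on the log scale.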

All p values mentioned are two-sided. Analyses were performed with Statistical Package for the Social Sciences (SPSS) software, version 18.0.

This trial was registered with ClinicalTrials.gov, number NCT00962754.

Ethical approval

Written informed consent was obtained from the parents or caretakers of the participants. The study was approved by the institutional ethics review board.

Results

Patients

Overall, 287 neonates were screened for eligibility, of whom 190 were enrolled and randomised (figure 1). After randomisation, 14 patients were excluded for failure to meet eligibility criteria or refusal of parental consent. These 14 patients were not sicker than included patients (mean gestational age 36+1 weeks, birth weight 2705 g, 2/14 had respiratory support). Six envelopes used for study group assignment were opened but not assigned to a patient. The remaining 170 patients were included in the intention-to-treat analysis (figure 1).

Figure 1. Enrolment of patients

Flow of patients: 287 patients were assessed for eligibility; 97 were not enrolled (70 transferred from neonatal intensive care, 27 missed inclusions) and 190 were randomised.

Control group (with fluid balance data): 95 allocated; 2 did not meet inclusion criteria, 5 withdrew consent and 2 envelopes were not assigned to a patient; 86 included in the intention-to-treat analysis.

Intervention group (no fluid balance data): 95 allocated; 2 did not meet inclusion criteria, 5 withdrew consent and 4 envelopes were not assigned to a patient; 84 included in the intention-to-treat analysis.

Table 1 shows the baseline characteristics of included patients. Overall mean gestational age was 36+2 weeks (SD 2+5 weeks) and mean birth weight 2782 g (SD 749 g). We observed a notable difference in gestational age, with more preterm infants in the intervention group (73%) than in the control group (49%). This was accompanied by a difference in birth weight (intervention 2614 g versus control 2947 g) and in the number of patients discharged early with tube feeding (intervention 26% versus control 16%).

Table 1. Baseline characteristics of patients

| Characteristics | Intervention group (no fluid balance) (n=84) | Control group (with fluid balance) (n=86) |
|---|---|---|
| Male sex | 51 (61%) | 49 (57%) |
| Gestational age (weeks+days) | 35+5 (2+4) | 37+0 (2+5) |
| • Preterm (<37 weeks) | 61 (73%) | 42 (49%) |
| Mean birth weight (g) | 2614 (713) | 2947 (749) |
| • <1800 g | 10 (12%) | 4 (5%) |
| Breastfeeding | 24 (29%) | 34 (40%) |
| Respiratory support (supplemental O2, low flow or nasal CPAP) | 28 (33%) | 20 (23%) |
| Phototherapy | 10 (12%) | 20 (23%) |
| Incubator | 73 (87%) | 68 (79%) |
| Early discharge home with tube feeding† | 22 (26%) | 14 (16%) |

Data are mean (SD) or number (%). CPAP is continuous positive airway pressure. † In our clinic, preterm neonates can be discharged early with tube feeding under close supervision of a nurse, shortening hospital stay.12

Primary outcome

Median length of hospital stay in the intervention group was nine days (range 2-37 days) compared to eight days (range 2-27 days) in the control group (table 2), with a ratio of geometric means of 1.25 (95% CI 0.99 to 1.57; p=0.06). The Kaplan-Meier estimates showed no significant difference in length of hospital stay (p=0.07) (figure 2). After adjustment for gestational age, birth weight and early discharge with tube feeding, the ratio of geometric mean length of hospital stay was 0.98 (95% CI 0.81 to 1.19; p=0.84) (table 2). This means that the length of hospital stay in the intervention group could be up to 19% shorter or up to 19% longer than in the control group.
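The arithmetic behind these figures can be checked with a short sketch. The helper names are hypothetical; the inputs are the rounded log10 means from table 2 and the adjusted confidence limits quoted above. Because the published log means are rounded, the ratio recomputed from them (about 1.23) presumably differs slightly from the reported 1.25 for that reason.

```python
def geometric_mean_from_log10(log10_mean: float) -> float:
    """Back-transform a mean of log10 values to a geometric mean."""
    return 10.0 ** log10_mean


def percent_change(ratio: float) -> float:
    """Express a ratio of geometric means as a percentage difference."""
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    intervention = geometric_mean_from_log10(0.93)  # log10 mean, intervention group
    control = geometric_mean_from_log10(0.84)       # log10 mean, control group
    print(round(intervention, 2), round(control, 2))
    print(round(intervention / control, 2))
    # Adjusted 95% confidence limits of the ratio, as percentages.
    print(round(percent_change(0.81)), round(percent_change(1.19)))
```

The back-transformed values agree with the geometric means of 8.51 and 6.92 days in table 2, and the confidence limits of 0.81 and 1.19 correspond to a stay that is 19% shorter or 19% longer.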

Table 2. Length of hospital stay (primary outcome)

| Duration | Intervention group (no fluid balance) (n=84) | Control group (with fluid balance) (n=86) | Difference (95% interval) | p value |
|---|---|---|---|---|
| Median (days) | 9 | 8 | 1 (0-4) | |
| Range of distribution (min-max) | 2-37 | 2-27 | | |
| Log10 mean ± SD | 0.93 ± 0.35 | 0.84 ± 0.31 | | |
| Geometric mean (days) | 8.51 | 6.92 | | |
| Ratio of geometric means (95% CI) | 1.25 (0.99-1.57) | NA | | 0.06 |
| Adjusted ratio of geometric means (95% CI)† | 0.98 (0.81-1.19) | NA | | 0.84 |

The duration of hospitalisation was log transformed before analysis, since this variable was positively skewed. The effect of the intervention refers to the ratio of geometric means for the duration of hospitalisation. The duration of hospital stay of one patient in the control group was excluded because there was an exceptional non-medical reason for the long hospital stay (55 days). † A linear regression model was used to adjust for factors influencing the duration of hospital stay (gestational age, birth weight and early discharge with tube feeding). NA denotes not applicable.

Figure 2. Kaplan-Meier estimates of the proportion of neonates remaining in the hospital (log-rank test, p=0.07). Vertical axis: neonates admitted to the ward (%); horizontal axis: duration of hospitalisation (days). The solid line represents patients without fluid balance (intervention) and the dashed line patients with fluid balance (control).

Secondary outcomes

No differences in secondary outcomes were found (table 3). In the intervention group three incidents were reported: the lamp for phototherapy should have been replaced earlier, a blood transfusion was given two days after the haemoglobin result showed anaemia, and a wrong medication order was made. In the control group five incidents were reported: the hot-water bottle had not been replaced (2 infants), wrong positioning of a biliblanket, and a wrong medication order (2 infants). Some of the incidents might have lengthened the hospitalisation; however, none of the incidents was related to the intervention.

In one case the attending physician unblinded the fluid balance data because of oedema in the patient. The fluid balance information did not give new insights, nor did it influence medical treatment.

Time investment of fluid balance keeping

The mean time spent on a daily fluid balance was 7.2 minutes per neonate per day (SD 0.39).

Discussion

This study shows that keeping a fluid balance does not influence length of hospital stay or medical interventions in neonates with moderate disease severity. We found a difference in median length of hospital stay of one day in favour of the control group, with a 95% interval of 0 to 4 days. There is therefore no clinically relevant difference in the length of hospital stay, with the uncertainty that the stay in the intervention group could be anywhere from equal to that in the control group to 4 days longer.

To our knowledge, this is the first randomised controlled trial examining the clinical usefulness of the fluid balance. Our findings confirm earlier reports in different patient categories showing poor agreement of fluid balance with changes in body weight,5 central venous monitoring14 or bioelectrical impedance measurements.15

Table 3. Secondary outcomes

| Outcome | Intervention group (no fluid balance) (n=84) | Control group (with fluid balance) (n=86) | Difference (95% interval) | p value |
|---|---|---|---|---|
| Percentage weight loss (SD) | 5.0% (3.2) | 4.5% (3.1) | 0.41% (-1.4 to 5.5) | 0.90* |
| Medical actions relevant to fluid status† | | | | |
| • Intravenous fluid bolus | 1 | 3 | | 0.621§ |
| • Diuretics | 0 | 0 | | NA |
| • Deviation from fluid protocol (per day) | 61/187 (33%) | 53/192 (28%) | 5% (-2.3 to 14.3) | 0.29$ |
| Unblinding of fluid balance data | 1 | NA | | |
| Incident reporting (which might influence hospital stay)‡ | 5 | 2 | | 0.28§ |

† Number of protocol deviations per day. ‡ None of the incidents was related to the study intervention. * T-test, § Fisher exact test, $ Chi-square test. NA denotes not applicable.

Although the theoretical concept of estimating the fluid status by recording fluid input and output seems logical and sound, the lack of usefulness of the fluid balance may be largely explained by its inaccuracy. Earlier we reported on the accuracy of the fluid balance in this study population, where we found that 40% of the fluid balances showed clinically relevant differences from daily body weight measurements.8 The numerous assessments of fluid volumes during a day are all subject to measurement error. For example, in breastfed neonates there is no reliable method to assess milk intake.16 Similar problems occur in estimating the quantities of fluid output, such as body fluids leaking into bed sheets and insensible losses. Furthermore, dry nappies can absorb moisture in the humidified environment of an incubator.17

To rule out the possibility of human calculation errors in adding and subtracting totals when calculating the fluid balance, we recalculated all fluid balance data using a spreadsheet program, but this did not influence agreement with the change in body weight (median difference 0 ml, interquartile range -8.0 to 0 ml, p=0.11).8

A remarkable finding was the substantial deviation from the feeding and fluid protocol in both groups: fluid intake deviated by more than 10 ml/kg from the standard protocol on 33% of the observed days in the intervention group and 28% in the control group. One explanation could be that our fluid protocol is too strict. Especially in self-drinking neonates, the actual fluid intake will often be individually adjusted, for example when breastfeeding on demand. Since deviation from our fluid protocol occurred equally in both groups, this finding is unlikely to have influenced the main results.

Strengths and limitations of the study

The main strengths of our study include the design as a randomised controlled trial, the meticulous attention to avoiding revelation of the fluid balance data to the physicians in the intervention group, and the use of measurements and techniques readily available and commonly used in clinical practice.

A drawback of our study was the lack of a potentially more reliable assessment method of fluid status, such as central venous pressure monitoring or body impedance measurements. We purposely chose body weight as the comparison, since body weight and the fluid balance are the two methods most frequently used in clinical practice, in particular in neonates.

Another potential limitation of our study is the moderate disease severity in our patients. Medical interventions aimed at influencing the fluid status, apart from minor adjustments in fluid intake, were rare in our cohort. It may be argued that keeping a fluid balance is more useful in sicker newborns admitted to a neonatal intensive care unit. For safety reasons, we deliberately chose to perform this study in neonates with moderate disease severity as a proof of principle before embarking on a similar project in more critically ill patients in a Neonatal Intensive Care unit.

A further limitation of our study is the choice of the primary outcome. The median duration of hospitalisation in our study was eight days, and it may not be realistic to expect keeping a fluid balance during the first three days to affect such a relatively long hospital stay. This primary outcome might be more relevant if the study were repeated in a sicker group of infants, where additional outcomes such as chronic lung disease, duration of mechanical ventilation and incidence of persistent ductus arteriosus may also be useful. Some may suggest that percentage weight loss or electrolyte levels, for example serum sodium levels, would have been better primary outcomes. Although it was a secondary outcome measure, we showed that keeping a fluid balance does not influence the percentage weight loss. For ethical reasons we preferred not to take extra blood samples for study purposes only, since it is not common practice in our ward to take daily blood samples in the patient group we studied. The other secondary outcome measures occurred too rarely to serve as a valid, practical primary outcome measure.

A final limitation of our study is the unequal distribution of premature infants between the two study groups (table 1). The overrepresentation of preterm infants in the group in which fluid balance data were unavailable would have put this group at an increased risk of lengthened hospital stay and unfavourable secondary outcomes. The fact that none of these outcomes differed significantly between the two groups is therefore reassuring. In addition, when we adjusted for gestational age, the difference in hospital stay was close to zero, which supports the lack of usefulness of routinely keeping a fluid balance in sick newborns. In retrospect, stratification for gestational age before randomisation could have avoided these demographic differences between the two groups.

Conclusions

In conclusion, this study shows a lack of clinical usefulness of routinely keeping a fluid balance in newborns with moderate disease severity. Therefore, fluid balance charts are not useful as a routine procedure in moderately sick neonates. We hope our results encourage colleagues caring for other patient populations, for example in a neonatal intensive care setting, to test the generalisability of our findings by investigating whether the time-consuming and error-prone ritual of fluid balance keeping really helps to improve patient care.

References

1. Scales K, Pilsworth J. The importance of fluid balance in clinical practice. Nurs Stand. 2008;22:50-7.

2. Eastwood GM. Evaluating the reliability of recorded fluid balance to approximate body weight change in patients undergoing cardiac surgery. Heart Lung. 2006;35:27-33.

3. Wise LC, Mersch J, Racioppi J, Crosier J, Thompson C. Evaluating the reliability and utility of cumulative intake and output. J Nurs Care Qual. 2000;14:37-42.

4. Chung LH, Chong S, French P. The efficiency of fluid balance charting: an evidence-based management project. J Nurs Manag. 2002;10:103-13.

5. Mank A, Semin-Goossens A, Lelie J, Bakker P, Vos R. Monitoring hyperhydration during high-dose chemotherapy: body weight or fluid balance? Acta Haematol. 2003;109:163-8.

6. Daffurn K, Hillman KM, Bauman A, Lum M, Crispin C, Ince L. Fluid balance charts: do they measure up? Br J Nurs. 1994;3:816-20.

7. Centraal Begeleidingsorgaan voor Intercollegiale Toetsing (CBO) en Verpleegkundig Wetenschappelijke Raad. Sense and nonsense of the fluid balance [in Dutch]. Ned Tijdschr Geneeskd. 1994;138:2124.

8. van Asperen Y, Brand PLP, Bekhof J. Reliability of the fluid balance in neonates. Acta Paediatr. 2012;101:479-83.

9. Tsang LD, DeMarini S, Rath LL. Electrolytes, vitamins, trace minerals. In: Kenner C, Wright LJ, editors. Comprehensive neonatal nursing. 3rd ed. Philadelphia: WB Saunders; 2003:409-24.

10. Koolen AMP, Semmekrot BA, Sijstermans JMJ. Fluid and electrolytes. In: Lafeber HN, van Zoeren-Grobben D, van Beek RHT, eds. Workbook enteral and parenteral feeding in neonates [in Dutch]. 2nd ed. Amsterdam: VU Uitgeverij; 2004:17-23.

11. Snijders C, van Lingen RA, Klip H, Fetter WP, van der Schaaf TW, Molendijk HA. Specialty-based, voluntary incident reporting in neonatal intensive care: description of 4846 incident reports. Arch Dis Child Fetal Neonatal Ed. 2009;94:F210-F215.

12. Meerlo-Habing ZE, Kosters-Boes EA, Klip H, Brand PLP. Early discharge with tube feeding at home for preterm infants is associated with longer duration of breast feeding. Arch Dis Child Fetal Neonatal Ed. 2009;94:F294-F297.

13. Campbell MJ, Gardner MJ. Medians and their differences. In: Altman DG, Machin D, Bryant TN, Gardner MJ, eds. Statistics with confidence. 2nd ed. Bristol: BMJ books; 2000:36-44.

14. Craig J, Mathieu S. Is central venous pressure monitoring appropriate for assessment of perioperative fluid balance? Br J Hosp Med (Lond). 2006;67:108.

15. Nenchev N, Hatib F, Daskalov I. Monitoring relative fluid balance alterations in haemodialysis of diabetic patients by electrical impedance. Physiol Meas. 1998;19:35-52.

16. Savenije OE, Brand PL. Accuracy and precision of test weighing to assess milk intake in newborn infants. Arch Dis Child Fetal Neonatal Ed. 2006;91:F330-F332.

17. Amey M, Butchard N, Hanson L, Kinross D, Mannion M, Parsons J et al. Cautionary tales from the neonatal intensive care unit: diapers may mislead urinary output estimation in extremely low birthweight infants. Pediatr Crit Care Med. 2008;9:76-9.

Chapter 4

Clinical assessment of dyspnoea in children

Systematic review: Insufficient validation of clinical scores for the assessment of acute dyspnoea in wheezing children

Jolita Bekhof, Roelien Reimink, Paul Brand

Paediatr Respir Rev. 2014;15:98-112

Abstract

Background: A reliable, valid, and easy-to-use assessment of the degree of wheeze-associated dyspnoea is important to provide individualised treatment for children with acute asthma, wheeze or bronchiolitis.

Objective: To assess validity, reliability, and utility of all available paediatric dyspnoea scores.

Methods: Systematic review. We searched Pubmed, Cochrane library, National Guideline Clearinghouse, Embase and Cinahl for eligible studies. We included studies describing the development or use of a multivariate score, assessing two or more clinical symptoms and signs, for the assessment of severity of dyspnoea in an acute episode of asthma, wheeze or bronchiolitis in children aged 0-18 years. Study selection and data extraction were done independently by two reviewers. We assessed validity, reliability and utility of the retrieved dyspnoea scores using a total of 15 quality criteria.

Results: We selected 60 articles describing 36 dyspnoea scores. Fourteen scores were judged unsuitable for clinical use, because of insufficient face validity, use of items unsuitable for children, difficult scoring system or because complex auscultative skills are needed, leaving 22 possibly useful scores. The median number of quality criteria that could be assessed was 7 (range 6-11). The median number of positively rated quality criteria was 3 (range 1-5).

Conclusion: None of the many dyspnoea scores has been sufficiently validated to allow for clinically meaningful use in children with acute wheeze. Proper validation of existing scores is warranted to allow paediatric professionals to make a well balanced decision on the use of the dyspnoea score most suitable for their specific purpose.

Introduction

Acute exacerbations of wheeze-associated respiratory disease account for many hospital admissions and medical costs in children.1,2 In order to assess the severity of such an exacerbation, and to monitor the effects of treatment, clinicians and researchers need a valid and reliable measurement of the degree of dyspnoea and airway narrowing. In adults, lung function measurements are most commonly used for the latter purpose, but these are not suitable as a routine procedure in acute wheeze in children. First, because the best standardised pulmonary function test, the expiratory flow-volume curve,3 depends on active cooperation of the child, it is unsuitable for use in preschool children, who constitute the largest group of children presenting with acute severe wheeze.4 Second, even in school-aged children and adolescents, the severity of dyspnoea and unavailability of lung function equipment in the emergency room may hinder the assessment of lung function in acute wheeze or asthma. Third, the assessment and follow-up of severity of dyspnoea is made not only by physicians, but also by nurses and allied health professionals, who cannot always be expected to have experience with lung function measurements. Therefore, clinical assessment remains of key importance in determining the severity of dyspnoea in acute wheeze, asthma exacerbations or bronchiolitis in children of all ages.

Because single clinical signs do not correlate well with the degree of dyspnoea and airway narrowing in acute wheeze,5 paediatric dyspnoea scores usually comprise a combination of clinical symptoms and signs. Over the past decades, a large number of such clinical scores have been developed, most commonly for use in clinical trials.6,7 To be clinically useful, a dyspnoea score needs to be valid and reliable.8-10 Two earlier reviews focused only on asthma scores in preschool children, and concluded that available scores were poorly validated.6,7 These reviews, however, did not cover the full paediatric age range, and several new scores have been developed since they were published. Therefore, we performed a systematic review to evaluate the measurement properties and degree of validation of all paediatric severity scores for wheeze-associated dyspnoea available in the literature, across the entire paediatric age range.

Methods

Definitions

In the health measurement literature, different definitions are used, and terminology may be confusing.8 We therefore explain and define the terms as they are used in this review (Table 1).


Table 1. Quality criteria for measurement properties of paediatric dyspnoea scores

| Property | Definition | Rating | Quality criteria |
|---|---|---|---|
| Validity | | | |
| Face validity | Qualitative judgement of whether the score is a good measurement of dyspnoea.7 | + | At least 3 of the following items were part of the score: 1. respiratory and/or heart rate, 2. oxygen saturation or cyanosis, 3. work of breathing, retractions or use of muscles or dyspnoea, 4. wheezing or auscultatory findings, 5. mental status |
| | | ± | 2 of the above-mentioned items |
| | | - | 1 item |
| Content validity* | Appropriate representation of the concept dyspnoea by the items in the score (i.e., clear description of the development process of the score).10 | + | Clear description is provided for measurement aim, target population, and item selection and reduction |
| | | ? | Potential methodological shortcomings |
| | | - | No clear description |
| Construct validity* | Extent to which the score relates to other measures, consistent with theoretically derived prespecified hypotheses concerning dyspnoea.10 | + | Specific hypotheses were formulated and at least 75% of the results are in correspondence with these hypotheses in subgroups of at least 50 patients |
| | | ? | Less than 50 patients, or potential methodological shortcomings, or no MIC |
| | | - | Less than 75% of the hypotheses are confirmed |
| | | 0 | No information |
| Criterion-concurrent validity* | Criterion validity refers to the extent to which a score relates to the gold standard of the phenomenon. Because a gold standard of dyspnoea is unavailable, this is replaced by concurrent validity, the degree of agreement with other measurements of dyspnoea.10 | + | Valid comparison (oxygen saturation, laboratory findings or pulmonary function tests) and correlation > 0.70 |
| | | ? | Doubts about gold standard |
| | | - | Correlation < 0.70 |
| | | 0 | No information |
| Reliability | | | |
| Measurement error* | Absolute measurement error, usually expressed as smallest detectable change (SDC), i.e. the smallest within-person change in score which can be interpreted as real change above measurement error.10 | + | MIC > SDC or LOA < MIC.10 |
| | | ? | Potential methodological shortcomings or no MIC |
| | | - | SDC or LOA ≥ MIC |
| | | 0 | No information |
| Inter-observer reliability* | Degree to which different users obtain the same result when using the score on the same patients at the same time | + | ICC or weighted kappa > 0.70 in at least 50 patients |
| | | ? | Pearson correlation > 0.70, or < 50 patients |
| | | - | ICC or kappa < 0.70 |
| | | 0 | No information |
| Intra-observer reliability* | Similarity of results when the score is repeated by the same user on the same patient under similar conditions | + | ICC or weighted kappa > 0.70 in at least 50 patients.10 |
| | | ? | Pearson correlation coefficient > 0.70, or < 50 patients |
| | | - | ICC or kappa < 0.70 |
| | | 0 | No information |
| Internal consistency* | Correlation between items of the score.9,10 | + | Factor analysis performed and Cronbach's alpha 0.70-0.95 |
| | | ? | No factor analysis or potential methodological shortcomings |
| | | - | Cronbach's alpha < 0.70 or > 0.95 |
| | | 0 | No information |
| Responsiveness* | Ability of the score to detect change over time.8 | + | Guyatt's RR > 1.96 or AUC ≥ 0.70 |
| | | ? | Potential methodological shortcomings |
| | | - | RR ≤ 1.96 or AUC < 0.70 |
| | | 0 | No information |
| Utility | | | |
| Suitability | Suitability for use in children | + | No invasive techniques or items which may be difficult to obtain in young children (e.g. pulsus paradoxus, information on speech not specified for infants) |
| | | ± | As in +, with information on speech specified for infants |
| | | - | Use of invasive techniques or items which may be difficult to obtain in young children |
| Age span | Coverage of the entire paediatric age span | + | Evaluated from infancy (< 2 years) to puberty (> 12 years) |
| | | - | Evaluated in a smaller age span |
| Ease of scoring | Complexity of scoring system | + | < 4 categories per item.46 |
| | | ± | 4 categories per item |
| | | - | > 4 categories per item or complex calculations needed |
| Auscultation skills | Feasibility in clinical practice by different health care providers | + | No auscultation skills required |
| | | ± | No complex auscultation skills required (no inspiratory:expiratory ratio) |
| | | - | Complex auscultation skills required |
| Floor or ceiling effect* | Unequal distribution of score results.10 | + | < 15% of patients with lowest or highest possible score in at least 50 patients |
| | | ? | Potential methodological shortcomings or < 50 patients |
| | | - | > 15% of patients with lowest or highest possible score |
| | | 0 | No information |
| Interpretability* | Clinical meaningfulness | + | Mean scores and SD given in at least 4 relevant subgroups and MIC determined |
| | | ? | Potential methodological shortcomings or < 4 subgroups or no MIC determined |
| | | 0 | No information |

+ positive rating; ± indeterminate; - negative rating; ? unclear or potential methodological shortcomings; 0 no information available.

(Potential) methodological shortcomings = description of design or methods of the study not clear, or study group < 50 persons.

* Items that together form the Terwee checklist.10

SDC smallest detectable change; MIC minimal important change; LOA limits of agreement; ICC intraclass correlation coefficient; SD standard deviation; RR Guyatt's responsiveness ratio; AUC area under the receiver operating characteristic curve
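The rating rules in Table 1 are mechanical enough to write down as code. The Python sketch below encodes two of them, face validity and inter- or intra-observer reliability. The function and argument names are illustrative and not part of the review. Two choices are assumptions, because Table 1 does not specify them: a coefficient of exactly 0.70 is treated as negative, and an ICC or kappa at or below 0.70 is rated negative regardless of sample size.

```python
from typing import Optional


def rate_face_validity(n_item_groups: int) -> str:
    """Rate face validity from how many of the 5 item groups in Table 1 a score covers."""
    if n_item_groups >= 3:
        return "+"
    if n_item_groups == 2:
        return "±"
    return "-"


def rate_observer_reliability(
    statistic: Optional[str],
    value: Optional[float],
    n_patients: Optional[int],
) -> str:
    """Rate inter- or intra-observer reliability following Table 1.

    statistic is "icc", "kappa" (weighted kappa) or "pearson";
    None means that nothing was reported.
    """
    if statistic is None or value is None:
        return "0"  # no information
    if statistic == "pearson":
        # Table 1 never awards "+" for a Pearson correlation.
        # Assumption: a Pearson correlation of 0.70 or lower is also rated "?".
        return "?"
    if value <= 0.70:
        return "-"  # assumption: applies regardless of sample size
    if n_patients is None or n_patients < 50:
        return "?"  # adequate coefficient, but too few (or unknown number of) patients
    return "+"
```

For example, `rate_observer_reliability("kappa", 0.82, 30)` gives `"?"`, because fewer than 50 patients were studied.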


We use the term “dyspnoea score” to comprise any clinical score which has been developed to assess the severity of dyspnoea in acute wheeze, asthma or bronchiolitis in children.

Dyspnoea scores can be used for discriminative, predictive and/or evaluative purposes. A discriminative score is used to rate the severity of dyspnoea at a single point in time. Predictive scores aim to predict the outcome of an acute exacerbation, and evaluative scores assess the changes over time, usually in response to treatment.8

Clinical scores need to be valid, reliable, and easy to use. Validity describes whether the instrument measures what it is supposed to, i.e. the correct signal. Validity includes face, content, construct and criterion (or concurrent) validity (Table 1). Reliability (also known as reproducibility or accuracy) refers to the extent to which patients can be distinguished from each other, despite measurement error.9 Reliability can be evaluated by assessing measurement error, inter- and intra-observer reliability, internal consistency and responsiveness (Table 1). Utility refers to usability, that is, suitability for use in children, the ease of scoring, the skills needed to use the score, floor and ceiling effects, and interpretability.7,10
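As a worked illustration of the measurement error criterion in Table 1, the Python sketch below compares the smallest detectable change (SDC) with the minimal important change (MIC). It assumes the formula SDC = 1.96 × √2 × SEM from the Terwee criteria,10 where SEM is the standard error of measurement. The function names and the numbers in the usage note are hypothetical.

```python
import math
from typing import Optional


def smallest_detectable_change(sem: float) -> float:
    """SDC for an individual patient, assuming SDC = 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2) * sem


def rate_measurement_error(mic: Optional[float], sem: Optional[float]) -> str:
    """Rate measurement error following Table 1 (None means not reported)."""
    if sem is None:
        return "0"  # no information on measurement error
    if mic is None:
        return "?"  # measurement error known, but no MIC to compare it with
    return "+" if mic > smallest_detectable_change(sem) else "-"
```

With a hypothetical SEM of 1.0 points, the SDC is about 2.8 points, so a MIC of 3 points would be rated positive and a MIC of 2 points negative.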

Literature search and study selection

We searched PubMed, the Cochrane Library, National Guideline Clearinghouse, Embase and CINAHL (November 2011) for eligible studies using the terms “wheeze”, “asthma”, “score”, “scale”, “index”, “assessment”, “validity”, “validation” and “reliability”, limited to children aged 0-18 years when possible. The full search strategy for each database is available in appendix 1 of the online repository. References of selected articles and their “related articles” in PubMed were screened for additional relevant publications. We used no language restrictions. Eligible titles and abstracts (if available) were screened independently by two reviewers for inclusion (JB, RR). The final decision on inclusion in the review was made by consensus.

Inclusion and exclusion criteria

Studies that described the development or use of a multivariate score, assessing two or more clinical symptoms and signs, for the assessment of severity of dyspnoea in an acute episode of wheeze, asthma or bronchiolitis were considered eligible for inclusion in the review if they included children in the age range of 0 to 18 years. Scores that were designed for discriminative or evaluative purposes were selected for further review. Scores aimed at assessing asthma control or quality of life in children with asthma, and those including historical parameters (for example questionnaires for parents, medication use, and number of exacerbations), pulmonary function parameters, or laboratory parameters were excluded. We also excluded predictive scores.


Data extraction

Two reviewers (JB, RR) extracted the data independently by using a data extraction form specifically designed for this study (online appendix 2). We extracted data on the individual items of the score, the scaling of the items, the calculation of the final (total) score, the number and age range of the children in which the score was used, the setting (emergency or outpatient department or hospitalized patients), and the clinical condition of the patients (wheeze, acute asthma, bronchiolitis). In addition, we recorded the purpose (discriminative, predictive or evaluative), the development process of the score and the use of the score in clinical trials. We assessed whether item selection and reduction were based on clinical observation, expert opinion, and theory, or on previously reported scores.

Methodological quality assessment of dyspnoea scores

Based on recent reviews of clinimetric measurements, we assessed each retrieved clinical asthma score on a total of 15 methodological quality criteria: 4 on validity, 5 on reliability, and 6 on utility.6,7 Each of these 15 criteria was rated as positive (+), intermediate (±), negative (-), unclear (?) or no information (0) using the criteria listed in table 1.

Results

Search characteristics

Our search yielded 1014 possibly relevant titles, most of which evaluated scores for quality of life or control of asthma using medical history data or a parent-completed questionnaire. The selection procedure of included articles is depicted in figure 1. A total of 65 articles, describing a total of 36 different asthma scores, were selected for review of the full-text articles, including 60 original articles and 5 reviews.6,7 Two reviews evaluated dyspnoea scores6,7 and 3 reviews described studies concerning the treatment of asthma or bronchiolitis referring to articles using clinical scores. Van der Windt et al6 evaluated 16 scores for use in preschool children, 14 of which were developed for evaluative or discriminative purposes. This review assessed 6 methodological criteria but did not specify how these were rated.6 Birken et al7 reviewed 10 asthma scores for use in preschool children on 10 methodological quality criteria, and found that only 2 scores could be evaluated on more than 4 criteria. Overall, validity was considered to be insufficient for most scores in both reviews. Responsiveness of the scores was only assessed in medication trials, and recorded as some degree of improvement after medication.7

The majority of the selected original articles (31 out of 60) did not aim to validate a score, but only described the use of dyspnoea scores in clinical trials.
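The counts reported for the selection process can be checked with simple arithmetic. The short Python sketch below uses the numbers given in figure 1; the variable names are illustrative.

```python
# Counts taken from figure 1 (selection of articles).
identified_in_databases = 957
identified_from_other_sources = 57
screened = identified_in_databases + identified_from_other_sources

duplicates_removed = 17
not_relevant = 910
assessed_full_text = screened - duplicates_removed - not_relevant

not_retrievable = 1
excluded = {
    "no composite score used": 11,
    "no full description of the score": 4,
    "blood gas analysis included": 3,
    "unclear reference": 3,
}
included = assessed_full_text - not_retrievable - sum(excluded.values())

print(screened, assessed_full_text, included)  # 1014 87 65
```

The result agrees with the 1014 titles screened and the 65 articles (60 original articles and 5 reviews) included for full review.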


General characteristics of the scores

Appendix 2 (online) summarizes the characteristics of the 36 retrieved scores. The mean number of items recorded per score was 5 (SD 1.7; range 2 to 9), and the mean maximum possible result of the scores was 12 (SD 5.4; range 3 to 27). Accessory muscle use was included in almost all scores (35 scores, 97%), followed by wheeze in 34 (94%), respiratory rate in 26 scores (72%), cyanosis and mental status/activity level both in 17 (47%), dyspnoea in 10 (28%) and inspiratory breath sounds or air entry in 9 (25%). Other items were used infrequently: inspiratory to expiratory ratio in 6, heart rate in 5, speech impairment in 4, grunting in 4, nasal flaring in 3, rales or rhonchi in 3, and cough, liver and spleen size, pulsus paradoxus, nebulisations, intravenous infusion and resonance each in one score.

Records identified through database checking (n=957)
Additional records found through other sources, not found in the databases mentioned in the methods (n=57)
Records screened on title and/or abstract (n=1014)
    Duplicates removed (n=17)
    Articles not relevant (n=910)
Full-text articles assessed for eligibility (n=87)
    Article could not be retrieved (Japanese) (n=1)72
    Articles excluded (n=21): no composite score used (11), no full description of the score (4), blood gas analysis included (3), unclear reference (3)
Articles included for full review (n=65): 60 original articles, 5 reviews

Figure 1. Flowchart of selection process of articles included in the review

The development process of the score was described for 21 scores (58%). Only three scores were developed using a formal procedure for item selection and reduction (CAS, PRAM, RAD). Another 5 scores used merely expert opinion, and 14 scores were based on previous scores.

Forty-one percent of the studies (24/59; one study did not mention patient numbers) included fewer than 50 patients, which is below the minimal number of subjects required for adequate assessment of the quality criteria.10 Only 11 scores were developed to be used in a wide age range from infancy to puberty. Fourteen scores were used only in the outpatient ward or emergency department, 14 only in hospitalised patients, and 8 in both settings.

The clinical condition of the patients was described as asthma in 16 scores, bronchiolitis in 11, wheeze in 4, broncho-obstructive syndrome in 2, and both asthma and bronchiolitis in 3 scores. Ten scores were only described for use in medication trials, without any description of development or measurement properties of the score.

Methodological characteristics of the scores

Figure 2 summarizes the result of the methodological quality assessment of the retrieved scores, using the criteria described in table 1; a detailed description of this assessment of each score is presented in appendix 3. None of the scores could be assessed on all 15 quality criteria. The number of quality criteria that could be assessed ranged from 6 to 11, with a mean of 7.6 (SD 1.5). The mean number of positively rated quality criteria was 2.9 (SD 1.1; range 1-5). Assessment of the criteria measurement error, intra-observer reliability, and floor and ceiling effect was impossible because no information on these criteria was given in any of the studies. In addition, quality assessment of construct validity, responsiveness, and interpretability was hampered by either absence of sufficient information or methodological shortcomings of the studies.

Validity

Face validity was good in almost all scores. The only exception was the RDAI, which contains only 2 items: wheeze and retractions. Content validity was poor in all scores, apart from the three scores which were developed using a formal procedure of item development and reduction (PRAM, CAS and RAD). Criterion validity was described for 16 scores, but received satisfactory ratings in only 2 (good correlations for comparison of CSGS-1 with blood gas analysis and for CSS-2 with SaO2).

Dyspnoea scores retrieved (n=36)

None of the scores could be adequately assessed on the following 7 criteria because of insufficient information: construct validity, measurement error, intra-observer reliability, responsiveness, floor and ceiling effect, interpretability.

Exclusion of 14 scores because of insufficient rating on essential quality criteria:
    Face validity inadequate: RDAI (refs 58-67)
    Suitability for use in children insufficient: CAES-1 (refs 24,25), CSS-3 (ref 6)
    Ease of scoring insufficient: BS-4 (ref 23), CSGS-1 (ref 30) and CSGS-2 (ref 31), RDAI (refs 58-67), RDI (refs 68,69), SS (ref 70)
    Difficult auscultation skills needed: BS-1 (ref 20), CAS (ref 28), MPIS (ref 43), PASS (refs 44,45), PIS (refs 48-54), RA (ref 57)

Scores possibly suitable (n=22), by sum of positively rated quality criteria:
    5 points: PRAM (refs 45-47)
    4 points: ASS (refs 14-19), RAD (ref 45), CAES-2 (ref 26)
    3 points: AS (ref 13), CS (ref 29), BS-2 (ref 21), CSS-2 (ref 34), BS-3 (ref 22), CSS-4 (ref 35), CAES-3 (ref 25), CSS-7 (refs 38-40)
    2 points: CSS-1 (refs 32,33), EDRCH (ref 42), CSS-5 (ref 36), PS-1 (ref 55), CSS-6 (ref 37), PS-2 (ref 56), CSS-8 (ref 41), SOIS (ref 71), EDRAR (ref 42)
    1 point: CAES-4 (ref 27)

Content validity adequate: PRAM, RAD
Evaluated in entire age span of childhood: AS, ASS, CAES-2, CS, PRAM
Interobserver reliability adequate: BS-2, CSS-7, PRAM
Criterion validity adequate: CSS-2
Internal consistency adequate: none

Figure 2. Flowchart of quality assessment of dyspnoea scores on 15 quality criteria

Reliability

Interobserver reliability was described for 16 scores, and rated positively in 8 (BS-2, CAS, CSS-7, PASS, PRAM, PIS, RA, RDAI). Internal consistency was calculated with Cronbach's alpha for only three scores (CAS, PRAM, PIS), with values of 0.86, 0.71 and 0.84, respectively. Although these values indicated adequate internal consistency, we rated this criterion as indeterminate because no use of factor analysis was reported.10 Although many studies reported improvement in scores after treatment, none of them quantified responsiveness by calculating an area under the curve or Guyatt's responsiveness ratio.
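For readers less familiar with Cronbach's alpha, the Python sketch below shows how it could be computed from item-level data and how the internal consistency rule of Table 1 would then be applied. It uses the standard formula alpha = k/(k-1) × (1 - sum of item variances / variance of the total score), with k the number of items. The function names are hypothetical, and letting an out-of-range alpha take precedence over a missing factor analysis is an assumption, since Table 1 does not state which rule wins.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; each row holds one patient's item scores."""
    k = len(rows[0])
    # Sample variance of each item across patients.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the total score across patients.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def rate_internal_consistency(alpha: float, factor_analysis_done: bool) -> str:
    """Rate internal consistency following Table 1."""
    if alpha < 0.70 or alpha > 0.95:
        return "-"  # assumption: checked before the factor analysis rule
    return "+" if factor_analysis_done else "?"
```

Applied to the reported values of 0.86, 0.71 and 0.84 without factor analysis, `rate_internal_consistency` returns `"?"` in each case, in line with the rating described above.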

Utility

Suitability for use in children was judged to be good in all scores, except for CAES-1 (because it includes blood gas analysis) and CSS-3 (includes pulsus paradoxus). Intermediate ratings were given to AS and CAES-4 (because these scores include information on speech). The scoring system was rated as “easy-to-use” in 11 scores. Five scores were judged to have insufficient utility because of too complex rating scales or calculations. In all but one score (BS-4), auscultation was included. Six scores included inspiratory to expiratory ratio, which we judged to be a difficult auscultation skill.

Discussion

This study shows that none of the many scores used for assessment of the degree of wheeze-associated dyspnoea in children with acute asthma, wheeze or bronchiolitis has been sufficiently validated to allow clinically meaningful use in children. Although many scores have adequate face validity and are easy to use in children, important deficits were noted in all scores across the three methodological quality domains of validity, reliability, and utility. Out of the 15 methodological quality criteria considered to be important for clinical scores,7,10 the paediatric dyspnoea scores reported in the literature provided sufficient data on an average of only 7 criteria, and received positive ratings on a median of only 3 out of these 7 criteria (Figure 2). There is, therefore, insufficient evidence to trust the currently available paediatric dyspnoea scores to provide an accurate reflection of disease severity, either for research purposes or for clinical use.

Although face validity is usually adequate, content and construct validity have been insufficiently addressed in almost all scores, primarily because these scores have been developed in an ad hoc fashion for use in clinical trials. Only three scores (CAS, PRAM and RAD) were developed using more formal methods, ensuring adequate content validity. Criterion validity was described for almost half of the scores, but received satisfactory ratings in only 2 (CSGS-1 and CSS-2). Information on reliability was lacking for the large majority of scores. If available, it was usually limited to an assessment of interobserver reliability between two observers, and this was found to be adequate in 8 out of 16 scores. No study assessed reliability between more than two observers, which is remarkable given the large number of medical staff and allied health personnel involved in the assessment and follow-up of children with acute severe dyspnoea, in particular when hospitalized. Even more striking is the lack of data on responsiveness, which was absent for 22 scores. Although many studies reported improvement in scores after treatment, none of them quantified responsiveness by calculating an area under the curve or Guyatt's responsiveness ratio.

Responsiveness is important because it drives clinical decisions (when to increase or taper down medication, when to discharge). Although most scores were suitable for use in children, a number of scores were judged to be impractical or impossible to use in daily clinical practice because they involved complicated measurements (pulsus paradoxus) or calculations. Auscultation skills, in different degrees, were needed in the large majority of scores, limiting the use of these scores to personnel with specific training in these skills.

We acknowledge the following main weaknesses of our study. First, we included scoring systems used for different diagnostic categories (asthma, wheeze, and bronchiolitis). Although this may be viewed as an unjust comparison of measurement instruments for different disorders, the overlap in items recorded between the scores (Appendix 2) was considerable. Apparently, all these clinical scores aim at measuring the degree of dyspnoea in children, irrespective of the underlying disease causing airway narrowing. Second, the importance of the utility items we assessed may be questioned. For example, with information technology even complicated scores can be calculated easily and reliably. Similarly, although auscultation requires specific skills, it can be argued that an accurate assessment of dyspnoea in children is impossible without chest auscultation. It should be noted, however, that the poor methodological quality of the available scores was not limited to utility items, but almost always included validity and reliability issues. Finally, the relative importance of each of the validity and reliability items is debatable. Unfortunately, current health assessment literature does not allow a clear-cut definition of the relative merits of each of these issues.10

The main strengths of this study include the systematic nature of the literature review, the adherence to guidelines for such systematic reviews,11 and the detailed analysis and description of 15 methodological quality items of all available paediatric dyspnoea scores.10

We would like to emphasize that the limited methodological quality of the paediatric asthma and dyspnoea scores that we evaluated does not necessarily mean that these existing scores are not suitable for use in clinical practice and research. The main reason for the poor methodological quality that we report is that studies on existing asthma scores provided insufficient data to enable quality scoring (absence of evidence). In other words, proper comprehensive validation has simply not been performed for any of these scores.

In conclusion, none of the many available paediatric dyspnoea or asthma scores has been sufficiently validated to allow for clinically meaningful use in children with acute asthma, wheeze or bronchiolitis, in particular not for evaluative purposes. Instead of developing new scores, we consider it more useful to validate existing dyspnoea scores in a more comprehensive and proper way. This will allow clinicians and researchers to make a well-balanced decision on the use of the dyspnoea score most suitable for their specific purpose. It is preferable that only one, or at least fewer, scores are used. The large number of different scores currently in use does not favour widespread use and hampers comparisons between different studies and patient groups.

Abbreviations of the scores with references

AS asthma score;12 ASS asthma severity score;13-18 BS bronchiolitis score (original names: BS-2 respiratory scale; BS-3 clinical scoring; BS-4 severity score);19-22 CAES clinical asthma evaluation score (original names: CAES-2 clinical severity score or modified clinical asthma score; CAES-3 modified clinical asthma score);23-26 CAS clinical asthma score;27 CS clinical score;28 CSGS clinical symptom grading system;29,30 CSS clinical scoring system (original names: CSS-2 modified Tal’s clinical score; CSS-4 clinical score);5,31-40 EDRAR escala de dificultad respiratoria de la Argentina;41 EDRCH escala de dificultad respiratoria de Chile;41 MPIS modified pulmonary index score;42 PASS pediatric asthma severity score;43,44 PRAM paediatric or preschool respiratory assessment measure;44-46 PIS pulmonary index score;47-53 PS pulmonary score;44,45 RA respiratory assessment;56 RAD respiratory rate – accessory muscle use – decreased breath sounds;44 RDAI respiratory distress assessment instrument;57-66 RDI respiratory distress index;67,68 SS severity score;69 SOIS severity of illness score.70

Acknowledgements

Mirell Papenhuijzen, clinical librarian, is acknowledged for her assistance in the literature search.


References

1. Keeley D, McKean M. Asthma and other wheezing disorders in children. Clin Evid 2005;14:238-262.

2. Bisgaard H, Szefler S. Prevalence of asthma-like symptoms in young children. Pediatr Pulmonol 2007;42:723-728.

3. Global Strategy for Asthma Management and Prevention, Global Initiative for Asthma (GINA) 2010. Available from: http://www.ginasthma.org

4. Ducharme FM, Davis GM. Measurement of respiratory resistance in the emergency department; feasibility in young children with acute asthma. Chest 1997;111:1519-1525.

5. Kerem E, Canny G, Tibshirani R, et al. Clinical-physiologic correlates in acute asthma of childhood. Pediatrics 1991;87:481-486.

6. Windt van der DAW, Nagelkerke AF, Bouter LM, Dankert-Roelse JE, Veerman AJP. Clinical scores for acute asthma in pre-school children. A review of the literature. J Clin Epidemiol 1994;47:635-646.

7. Birken CS, Parkin PC, Macarthur C. Asthma severity scores for preschoolers displayed weaknesses in reliability, validity, and responsiveness. J Clin Epidemiol 2004;57:1177-1181.

8. Guyatt GH, Kirschner B, Jaeschke R. Measuring health status: What are the necessary properties? J Clin Epidemiol 1992;45:1341-1345.

9. Streiner DL, Norman GR. Health measurement scales. A practical guide to their development and use. New York: Oxford University Press 2003:126-152, 172-212.

10. Terwee CB, Bot SDM, Boer de MR, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007;60:34-42.

11. Mokkink LB, Terwee CB, Stratford PW, et al. Evaluation of the methodological quality of systematic reviews of health status measurement instruments. Qual Life Res 2009;18:313-333.

12. Qureshi F, Pestian J, Davis P, Zaritsky A. Effect of nebulized ipratropium on the hospitalization rates of children with asthma. N Engl J Med 1998;339:1030-1035.

13. Conway SP, Littlewood JM. Admission to hospital with asthma. Arch Dis Child 1985;60:636-639.

14. Dawson KP. The severity of acute asthma attacks in children admitted to hospital. Aust Paediatr J 1987;23:167-168.

15. Bishop J, Carlin J, Nolan T. Evaluation of the properties and reliability of a clinical scoring severity scale for acute asthma in children. J Clin Epidemiol 1992;45:71-76.

16. Dawson KP. The asthma clinical score and oxygen saturation. Aust Clin Rev 1991;11:20-21.

17. Yung M, South M, Byrt T. Evaluation of an asthma severity score. J Paediatr Child Health 1996;32:261-264.

18. Zar HJ, Streun S, Levin M, Weinberg EG, Swingler GH. Randomised controlled trial of the efficacy of a metered dose inhaler with bottle spacer for bronchodilator treatment in acute lower airway obstruction. Arch Dis Child 2007;92:142-146.

19. Dabbous IA, Tkachyk JS, Stamm SJ. A double blind study on the effects of corticosteroids in the treatment of bronchiolitis. Pediatrics 1966;37:477-484.

20. Gajdos V, Beydon N, Bommenel L, et al. Inter-observer agreement between physicians, nurses, and respiratory therapists for respiratory clinical evaluation in bronchiolitis. Pediatr Pulmonol 2009;44:754-762.

21. Kristjansson S, Lodrup Carlsen KC, Wennergren G, Strannegard I-L, Carlsen K-H. Nebulised racemic adrenaline in the treatment of acute bronchiolitis in infants and toddlers. Arch Dis Child 1993;69:650-654.


22. Wainwright C, Altamirano L, Cheney M, et al. A multicenter, randomized double-blind, controlled trial of nebulized epinephrine in infants with acute bronchiolitis. N Engl J Med 2003;349:27-35.

23. Wood DW, Downes JJ, Lecks HI. Clinical scoring system for the diagnosis of respiratory failure. Am J Dis Child 1972;123:227-228.

24. Angelilli ML, Thomas R. Inter-rater evaluation of a clinical scoring system in children with asthma. Ann Allergy Asthma Immunol 2002;88:209-214.

25. Hurwitz ME, Burney RE, Howatt WE, Crowley D, Mackenzie JR. Clinical scoring does not accurately assess hypoxemia in pediatric asthma patients. Ann Emerg Med 1984;13:1040-1043.

26. Obata T, Kimura Y, Iikura Y. Relationship between arterial blood gas tensions and a clinical score in asthmatic children. Annals of Allergy 1992;68:530-532.

27. Parkin PC, Macarthur C, Saunders NR, Diamond SA, Winders PM. Development of a clinical asthma score for use in hospitalized children between 1 and 5 years of age. J Clin Epidemiol 1996;49:821-882.

28. Liu LL, Gallaher MM, Davis RL, Rutter CM, Lewis TC, Marcuse EK. Use of a respiratory clinical score among different providers. Pediatr Pulmonol 2004;37:243-248.

29. Oberger E, Engstrom I. Blood gases and acid-base balance in children with bronchial asthma. Lung 1978;155:111-122.

30. Wennergren G, Engstrom I, Bjure J. Transcutaneous oxygen and carbon dioxide levels and a clinical symptom scale for monitoring the acute asthmatic state in infants and young children. Acta Paediatr Scand 1986;75:465-469.

31. Tal A, Bavilski C, Yohai D, Bearman JE, Gorodischer R, Moses SW. Dexamethasone and salbutamol in the treatment of acute wheezing in infants. Pediatrics 1983;71:13-18.

32. Chong Neto HJ, Chong-Silva DC, Marani DM, Kuroda F, Olandosky M, Noronha L. [Different inhaler devices in acute asthma attacks: a randomized, double-blind, placebo-controlled study]. J Pediatr (Rio J) 2005;81:298-304. [Article in Portuguese]

33. Pavon D, Castro-Rodriguez JA, Rubilar L, Girardi G. Relation between pulse oximetry and clinical score in children with acute wheezing less than 24 months of age. Pediatr Pulmonol 1999;27:423-427.

34. Bentur L, Kerem E, Canny G, et al. Response of acute asthma to a beta 2 agonist in children less than two years of age. Ann Allergy 1990;65:122-126.

35. Bentur L, Canny GJ, Shields MD, et al. Controlled trial of nebulized albuterol in children younger than 2 years of age with acute asthma. Pediatrics 1992;89:133-137.

36. Berger I, Argaman Z, Schwartz SB, et al. Efficacy of corticosteroids in acute bronchiolitis: short-term and long-term follow-up. Pediatr Pulmonol 1998;26:162-166.

37. De Boeck K, Van der Aa N, Van Lierde S, Corbeel L, Eeckels R. Respiratory syncytial virus bronchiolitis: a double-blind dexamethasone efficacy study. J Pediatr 1997;131:919-921.

38. Goebel J, Estrada B, Quinonez J, Nagji Noorkarim, Sanford D, Boerth RC. Prednisolone plus albuterol versus albuterol alone in mild to moderate bronchiolitis. Clin Pediatr 2000;39:213-220.

39. Teeratakulpisarn J, Limwattananon C, Tanupattarachai S, Limwattananon S, Teeratakulpisarn S, Kosalaraksa P. Efficacy of dexamethasone injection for acute bronchiolitis in hospitalized children: a randomized, double-blind, placebo-controlled trial. Pediatr Pulmonol 2007;42:433-439.

40. Bentur L, Shoseyov D, Feigenbaum D, Gorichovsky Y, Bibi H. Dexamethasone inhalations in RSV bronchiolitis: a double-blind, placebo-controlled study. Acta Paediatrica 2005;94:866-871.

41. Coarasa A, Giugno H, Cutri A, et al. [Validation of a clinical prediction tool to evaluate severity in children with wheezing]. Arch Argent Pediatr 2010;108:116-123. [Article in Spanish]

42. Carroll CL, Sekaran AK, Lerer TJ, Schramm CM. A modified pulmonary index score with predictive value for pediatric asthma exacerbations. Ann Allergy Asthma Immunol 2005;94:355-359.

43. Gorelick MH, Stevens MW, Schultz TR, Scribano PV. Performance of a novel clinical score, the Pediatric Asthma Severity Score (PASS), in the evaluation of acute asthma. Acad Emerg Med 2004;11:10-18.

44. Arnold DH, Gebretsadik T, Abramo TJ, Moons KG, Sheller JR, Hartert TV. The RAD score: a simple acute asthma severity score compares favourably to more complex scores. Ann Allergy Asthma Immunol 2011;107:22-28.

45. Chalut DS, Ducharme FM, Davis GM. The Preschool Respiratory Assessment Measure (PRAM): a responsive index of acute asthma severity. J Pediatr 2000;137:762-768.

46. Ducharme FM, Chalut D, Plotnick L, et al. The Pediatric Respiratory Assessment Measure: a valid clinical score for assessing acute asthma severity from toddlers to teenagers. J Pediatr 2008;152:476-480.

47. Pierson WE, Bierman W, Stamm SJ, VanArsdel PP. Double-blind trial of aminophylline in status asthmaticus. Pediatrics 1971;48:642-646.

48. Bierman CW, Pierson WE. The pharmacologic management of status asthmaticus in children. Pediatrics 1974;54:245-247.

49. Becker AB, Nelson NA, Simons FE. The pulmonary index. Assessment of a clinical score for asthma. Am J Dis Child 1984;138:574-576.

50. Tal A, Levy N, Bearman J. Methylprednisolone therapy for acute asthma in infants and toddlers: a controlled clinical trial. Pediatrics 1990;86:350-356.

51. Barnett PLJ, Caputo GL, Baskin M, Kuppermann N. Intravenous versus oral corticosteroids in the management of acute asthma in children. Ann Emerg Med 1997;29:212-217.

52. Scarfone RJ, Loiselle JM, Joffe MD, et al. A randomized trial of magnesium in the emergency department treatment of children with asthma. Ann Emerg Med 2000;36:572-578.

53. Hsu P, Lam LT, Browne G. The pulmonary index score as a clinical assessment tool for acute childhood asthma. Ann Allergy Asthma Immunol 2010;105:425-459.

54. Smith SR, Baty JD, Hodge D. Validation of the pulmonary score: an asthma severity score for children. Acad Emerg Med 2002;9:99-104.

55. Wang EEL, Milner RA, Navas L, Maj H. Observer agreement for respiratory signs and oximetry in infants hospitalized with lower respiratory infections. Am Rev Respir Dis 1992;145:106-109.

56. Stevens MW, Gorelick MH, Schultz T. Interrater agreement in the clinical evaluation of acute pediatric asthma. J Asthma 2003;40:311-315.

57. Lowell DI, Lister G, Von Koss H, McCarthy P. Wheezing in infants: the response to epinephrine. Pediatrics 1987;79:939-945.

58. Klassen TP, Rowe PC, Sutcliffe T, Ropp LJ, McDowell IW, Li MM. Randomized trial of salbutamol in acute bronchiolitis. J Pediatr 1991;118:807-811.

59. Menon K, Sutcliffe T, Klassen TP. A randomized trial comparing the efficacy of epinephrine with salbutamol in the treatment of acute bronchiolitis. J Pediatr 1995;126:1004-1007.

60. Klassen TP, Sutcliffe T, Watters LK, Wells GA, Allen UD, Li MM. Dexamethasone in salbutamol-treated inpatients with acute bronchiolitis: A randomized, controlled trial. J Pediatr 1997;130:191-196.

61. Patel H, Platt RW, Pekeles GS, Ducharme FM. A randomized, controlled trial of the effectiveness of nebulized therapy with epinephrine compared with albuterol and saline in infants hospitalized for acute viral bronchiolitis. J Pediatr 2002;141:818-824.

62. Kuyucu S, Unal S, Kuyucu N, Yilgor E. Additive effects of dexamethasone in nebulized salbutamol or L-epinephrine treated infants with acute bronchiolitis. Pediatr Int 2004;46:539-544.

63. Corneli HM, Zorc JJ, Mahajan P, et al. Bronchiolitis Study Group of the Pediatric Emergency Care Applied Research Network (PECARN). A multicenter, randomized, controlled trial of dexamethasone for bronchiolitis. N Engl J Med 2007;357:331-339.

64. Plint AC, Johnson DW, Patel H, et al.; Pediatric Emergency Research Canada (PERC). Epinephrine and dexamethasone in children with bronchiolitis. N Engl J Med 2009;360:2079-2089.

65. Mesquita M, Castro-Rodriguez JA, Heinichen L, Farina E, Iramain R. Single oral dose of dexamethasone in outpatients with bronchiolitis: a placebo controlled trial. Allergol et Immunopathol 2009;37:62-67.

66. Tinsa F, Ben Rhouma A, Ghaffari H, et al. A randomized, controlled trial of nebulized terbutaline for the first acute bronchiolitis in infants less than 12-months-old. Tunis Med 2009;87:200-203.

67. Alario AJ, Lewander WJ, Dennehy P, Seifer R, Mansell AL. The efficacy of nebulized metaproterenol in wheezing infants and young children. Am J Dis Child 1992;146:412-418.

68. Alario AJ, Lewander WJ, Dennehy P, Seifer R, Mansell AL. The relationship between oxygen saturation and the clinical assessment of acutely wheezing infants and children. Pediatr Emerg Care 1995;11:331-334.

69. Goh A, Chay OM, Foo AL, Ong EK. Efficacy of bronchodilators in the treatment of bronchiolitis. Singapore Med J 1997;38:326-328.

70. Gadomski AM, Lichenstein R, Horton L, King J, Keane V, Permutt T. Efficacy of albuterol in the management of bronchiolitis. Pediatrics 1994;93:907-912.

71. Mitsui S, Kembo T. Management of severe asthma attack. Sogo Rinsho 1971;20:2500-2504.

72. Schuh S, Canny G, Reisman JJ, et al. Nebulized albuterol in acute bronchiolitis. J Pediatr 1990;117:633-637.

Appendix 1 Search strategy paediatric dyspnoea scores

PubMed: [asthma OR wheeze] AND [validity OR validation OR reliability] AND [score OR scale OR assessment OR index].

Cochrane Library: (asthma OR wheeze) AND child AND (scale OR index OR score) AND (validation OR validation OR reliability)

CINAHL: ((MH “Asthma”) AND ((MH “Reliability”) OR (MH “Reliability and validity+”)) AND ((“index”) OR (MH “Scales”) OR (MH “Clinical Assessment Tools+”)))

Embase: ((‘asthma’/exp OR ‘wheezing’/exp) AND (‘validity’/exp OR ‘reliability’/exp) AND (‘clinical assessment’/exp OR (score OR scale OR assessment OR index))) AND ([newborn]/lim OR [infant]/lim OR [preschool]/lim OR [school]/lim OR [child]/lim OR [adolescent]/lim)

National Guideline Clearinghouse: Asthma, limit child
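
The PubMed strategy listed in this appendix combines three concept groups: terms within a group are alternatives (OR) and all groups are required (AND). A minimal sketch of that structure follows, assuming plain free-text terms and round parentheses for grouping; the helper names `any_of` and `all_of` are hypothetical and not part of any database interface, and database-specific field tags, subject headings and limits are not modelled.

```python
def any_of(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(groups):
    """Require every concept group by joining the groups with AND."""
    return " AND ".join(any_of(group) for group in groups)


# Concept groups copied from the PubMed strategy in this appendix.
PUBMED_GROUPS = [
    ["asthma", "wheeze"],
    ["validity", "validation", "reliability"],
    ["score", "scale", "assessment", "index"],
]

query = all_of(PUBMED_GROUPS)
print(query)
```

Keeping the groups as lists makes it easier to see which concept a term belongs to when the same strategy has to be adapted for another database.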

Appendix 2 Description of dyspnoea scores and articles selected for review

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AS | RF, SaO2, wheeze, retractions, dyspnoea | 15 (min. 5) | no information | discriminative, evaluative | Qureshi 1998 [13] | 434 | 2-18 years | ED | acute asthma | RCT of nebulized ipratropium, interobserver reliability | improvement in AS not quantified; interobserver reliability r 0.98 in 98 patients |
| ASS | wheeze, accessory muscle, HF | 9 | no information | discriminative | Conway 1985 [14] | 110 | 0-14 years | IP | acute asthma | descriptive study about children admitted to the hospital with asthma | 70% had a score < 7 |
| | | | | | Dawson 1987 [15] | 126 | 6 months-13 years | IP | acute asthma | assessment of asthma severity | interobserver reliability not quantified |
| | | | | | Bishop 1992 [16] | 60 | 6 months-17 years | ED | acute wheeze | comparison with SaO2, admission, interobserver reliability | SaO2 poor agreement (not quantified); κw 0.63 |
| | | | | | Dawson 1991 [17] | 41 | 3-13 years | | asthma | comparison with SaO2 | r 0.76 |
| | | | | | Yung 1996 [18] | 107 | 0-19 years | ED and IP | asthma | comparison with SaO2 and FEV1, interobserver reliability | SaO2 r -0.45; FEV1 r -0.51; κw 0.82 |
| | | | | | Zar 2007 [19] | 400 | 2 months-5 years | ED | acute lower airway obstruction | RCT of a metered dose inhaler with bottle spacer | median change in clinical score was equal in both groups: -2 (p25 -3, p75 -1) |
| BS-1 | cyanosis, activity, cough, RF, retractions, resonance, wheeze, I:E ratio, liver and spleen | 27 | no information | discriminative, evaluative | Dabbous 1966 [20] | 9 | 1-18 months | IP | bronchiolitis | RCT corticosteroids in 53 patients, 15 comparisons in 9 patients pilot to assess interobserver reliability | not quantified |
| BS-2 | RF, retractions, wheeze | 12 | no information | discriminative, evaluative | Gajdos 2009 [21] | 180 | 0-15 months | IP | bronchiolitis | assessment of interobserver reliability | κw 0.72 in 180 observer pairs. Bland-Altman plot (no limits of agreement given) |

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BS-3 | RF, recessions, auscultatory sounds, colour, general condition | 10 | no information | discriminative, evaluative | Kristjansson 1993 [22] | 29 | 2-18 months | IP | bronchiolitis | RCT nebulised adrenaline | not quantified |
| BS-4 | recessions (intercostal, subcostal, substernal), tracheal tug, nasal flaring, SaO2, RF | 7 | no information | discriminative | Wainwright 2003 [23] | 194 | <12 months | IP | bronchiolitis | RCT epinephrine | not quantified |
| CAES-1 | PO2 or cyanosis, inspiratory breath sounds, accessory muscle use, wheeze, cerebral function | 10 | clinical criteria based on earlier publications | discriminative, evaluative, predictive | Wood 1972 [24] | 18 | ? | IP | status asthmaticus | comparison with blood gas analysis (PaCO2 and PaO2) | r (PaCO2) 0.69; r (PaO2) -0.44 |
| | | | | | Angelilli 2002 [25] | 17 | 6-17 years | IP | acute asthma | interobserver reliability | multirater κ (4 observers) not specified for total score; for individual items < 0.70; oxygenation 0.759 |
| CAES-2 | inspiratory breath sounds, accessory muscle use, wheeze, cerebral function | 8 | based on existing score (CAES-1) | discriminative | Hurwitz 1984 [26] | 38 | 2-13 years | ED | acute asthma | comparison with PaO2 | r -0.15 |
| CAES-3 | FiO2, SaO2, inspiratory breath sounds, accessory muscle use, wheeze, cerebral function | 11 | based on existing score (CAES-1) | discriminative | Angelilli 2002 [25] | 17 | 6-17 years | IP | acute asthma | interobserver reliability | multirater κ (4 observers) not specified for total score; for individual items < 0.70; oxygenation 0.759 |
| CAES-4 | dyspnoea, wheeze, rales, speech impairment, cyanosis, mental status | 10 | based on existing score (Mitsui et al. in Sogo Rinsho 1971, Japanese) | discriminative | Obata 1991 [27] | 32 | < 5 years | ED | episodic asthma | comparison with arterial blood gas | r -0.67 |

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CAS | RF, wheeze, indrawing, dyspnea, I:E ratio | 10 | formal item selection and reduction based on clinical observation, and other clinical scores | discriminative, evaluative | Parkin 1996 [28] | 58 | 1-5 years | IP | acute asthma | development and validation of the score | correlation with oxygen saturation r -0.31; κw 0.82-0.89; Cronbach’s α in 30 patients 0.86 (no factor analysis reported); construct validity good in 30 patients |
| CS | RF, retractions, dyspnea (with feeding, activity, vocalizations/speech), wheeze | 12 (min. 1) | based on existing scores and expert opinion | evaluative | Liu 2004 [29] | 55 | 1 month-19 years | IP | asthma, bronchiolitis or wheeze | assessment of interobserver reliability | κw 0.62 in 165 observer pairs |
| CSGS-1 | rhonchi, wheeze, dyspnoea, retractions, fatigue, anxiety | 5 | no information | discriminative | Oberger 1978 [30] | 47 | 1-17 years | IP, OP | asthma | comparison with PaO2 | r -0.86 (p<0.01) |
| CSGS-2 | RF, rhonchi, wheeze, retractions, fatigue, anxiety, cyanosis, breath sounds, consciousness | 6 | based on existing score (CSGS-1) | evaluative | Wennergren 1986 [31] | 10 | 6-23 months | IP | acute asthma | relation with PaO2 and PaCO2 | correlation not specified |
| CSS-1 | RF, wheeze, cyanosis, accessory muscle use | 12 | based on existing score (PIS) | evaluative | Tal 1983 [32] | 32 | 1-12 months | IP | wheeze associated viral illness (bronchiolitis) | RCT of dexamethasone and salbutamol | improvement in score not quantified |
| | | | | | Chong Neto 2005 [33] | 40 | 6-18 years | ED | acute asthma | RCT of different methods of administration of bronchodilators | improvement in clinical score not quantified |
| CSS-2 | RF, wheeze, cyanosis, accessory muscle use | 12 | based on existing score (CSS-1) | evaluative | Pavon 1999 [34] | 138 | 1-24 months | OP | acute wheeze | comparison with SaO2 | r -0.76 |

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CSS-3 | HF, RF, pulsus paradoxus, dyspnoea, accessory muscle use, wheeze | 6 | no information | discriminative | Kerem 1991 [6] | 71 | 5-17 years | ED | acute asthma | comparison with SaO2 and FEV1 | r (SaO2) 0.49; r (FEV1) 0.52 |
| CSS-4 | RF, HF, wheeze, accessory muscle use, dyspnoea | 5 | no information | evaluative | Bentur 1990 [34 5] | 43 | 5 months-2 years | ED | acute asthma | trial albuterol | CSS improved in 23/43 patients, from a mean score 3.75 to 2.80 |
| CSS-5 | HF, RF, wheeze, accessory muscle use | 12 | based on a 2-item existing score [72] | evaluative | Bentur 1992 [36] | 28 | 3-24 months | ED | acute asthma | trial of nebulized albuterol | decrease of 2.9 points and 0.4 points in albuterol resp. placebo group |
| CSS-6 | accessory muscle use, wheeze, RF | 9 | based on existing score (CSS-1) | evaluative | Berger 1998 [37] | 38 | 1-18 months | ED | bronchiolitis | RCT of corticosteroids | ±80% decrease of 2 points after 3 days in both groups |
| CSS-7 | RF, wheeze, SaO2, accessory muscle use | 12 | based on existing score (CSS-1) | evaluative | De Boeck 1997 [38] | 29 | 0-24 months | IP | bronchiolitis | RCT of dexamethasone | improvement in clinical score equal in both groups, not quantified |
| | | | | | Goebel 2009 [39] | 51 | 0-23 months | IP, ED | bronchiolitis | RCT of prednisolone plus albuterol versus albuterol | improvement after 2 days in treatment group, score 4.5 to 2.7 |
| | | | | | Teeratakulpisarn 2007 [40] | 174 | 1-24 months | ED, OP | bronchiolitis | RCT dexamethasone injection and interobserver reliability | kappa 0.7 (p<0.01) |
| CSS-8 | RF, wheeze, accessory muscle use, general condition, SaO2 | 10 | based on existing score (CSS-1, CSS-5, PS-2) | evaluative | Bentur 2005 [41] | 61 | 3-12 months | ED | bronchiolitis | RCT of inhaled dexamethasone | improvement in clinical score equal in both groups, not quantified |
| EDRAR | HF, RF, wheeze, costal retractions | 12 | no information | discriminative | Coarasa 2010 [42] | 200 | < 2 years | IP | bronchial obstructive syndrome | prediction of hypoxemia (SaO2 ≤91%) | r (SaO2) -0.49 (p<0.001). EDRAR ≥5: sensitivity 100% and specificity 54%. AUC 0.90 |

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EDRCH | RF, wheeze, cyanosis, costal retractions | 12 | no information | discriminative | Coarasa 2010 [42] | 200 | < 2 years | IP | bronchial obstructive syndrome | prediction of hypoxemia (SaO2 ≤91%) | EDRCH ≥5: sensitivity 56% and specificity 93%. AUC 0.88 |
| MPIS | SaO2, accessory muscle use, inhalation-exhalation ratio, wheeze, HF, RF | 18 | based on existing score (PIS) | discriminative, evaluative, predictive | Carroll 2005 [43] | 30 | mean age 7.6 years (SD 5.5 years) | IP | status asthmaticus (8 on intensive care) | predictive value for ICU admission, assessment of validity and reliability | r (days of oxygen supplementation) 0.78; r (days of continuous albuterol) 0.76; r (length of hospitalisation) 0.70; interobserver reliability r >0.94. Bland-Altman (no limits of agreement given). AUC >0.82. Score ≥12: sensitivity 88% and specificity 91% |
| PASS | wheeze, air entry, work of breathing (accessory muscles, retractions), prolongation of expiration, RF, mental status | 10 | based on other studies and expert opinion | evaluative | Gorelick 2004 [44] | 852 | 1-18 years | ED | acute asthma | development and validation of the score (reliability, validity and responsiveness, comparison with SaO2 and peak flow) | interobserver reliability: kappa 0.79-0.83; PASS and peak flow r 0.22-0.42; PASS and SaO2 r 0.28-0.47; AUC for hospitalization 0.82 |
| | | | | | Arnold 2011 [45] | 536 | 5-17 years | ED | acute asthma | validation and comparison of 3 different scores and comparison with FEV1 | coefficient of variation in a multivariable linear regression model using age, sex, race, the GINA severity score and the PASS with FEV1: R2 0.434; this model with change in FEV1: R2 0.109 |

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PRAM | retractions (suprasternal), scalene muscle contraction, air entry, wheeze, SaO2 | 12 | clinical observations, multivariate analysis | discriminative, evaluative, predictive | Chalut 2000 [46] | 217 | 3-6 years | ED | acute asthma | development and validation, comparison of respiratory resistance with forced oscillation technique (RfO8) | correlation with professional judgement 0.50, RfO8 0.32 (physicians) and 0.15 (nurses). MIC 3 points/12. Severe ≥9 |
| | | | | | Ducharme 2008 [47] | 782 | 2-17 years | ED and IP | acute asthma | external validation (association with admission, responsiveness, internal consistency and interobserver reliability) | association for admission r 0.5 (p<0.001); Guyatt’s RR 0.7; Cronbach’s α 0.71; interobserver reliability κw 0.78 in 254 observer pairs |
| | | | | | Arnold 2011 [45] | 536 | 5-17 years | ED | acute asthma | validation and comparison of 3 different scores and comparison with FEV1 | coefficient of variation in a multivariable linear regression model using age, sex, race, the GINA severity score and the PRAM with FEV1: R2 0.462; this model with change in FEV1: R2 0.106 |

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PIS | RF, wheeze, I:E ratio, accessory muscle use | 12 | no information | discriminative, evaluative, predictive | Pierson 1971 [48] | 23 | 5-18 | IP | acute asthma | RCT aminophylline | PIS comparable in both groups before medication; no quantitative data given |
| | | | | | Bierman 1974 [49] | - | - | - | acute asthma | description of a treatment protocol | no results |
| | | | | | Becker 1984 [50] | 40 | 6-17 years | ED | acute asthma | comparison with pulmonary function | r (FEV1) -0.28 and 0.52; improvement of score after treatment with β-adrenergic drug not quantified |
| | | | | | Tal 1990 [51] | 74 | 7-54 months | ED | acute asthma | trial methylprednisolone | decrease of 2.5 and 0.82 points (max possible score 12) in methylprednisolone resp. placebo group |
| | | | | | Barnett 1997 [52] | 49 | 18 months-18 years | ED | acute asthma | RCT intravenous versus oral corticosteroids and interrater reliability in 10% of observations, 3 observers | kappa 0.94-0.97; improvement of PIS from 8.6 to 4 after 4 hours in both groups |
| | | | | | Scarfone 2010 [53] | 54 | 1-18 years | ED | acute asthma | RCT magnesium, assessment of interobserver reliability of PIS, sample size not mentioned | ICC 0.79-0.94, maximum difference between observers 1 point. No difference in score change between both groups |
| | | | | | Hsu 2010 [54] | 65 | 1-12 years | ED | acute asthma | validation of the score and comparison with NAGC (including pulmonary function) | Cronbach’s α 0.835; AUC 0.896 for predicting non-mild cases. PIS median value 3.0, 7.0 and 8.0 for mild, moderate and severe NACG categories |

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PS-1 | RF, wheeze, accessory muscle use | 9 | based on existing score (PIS) | discriminative, evaluative | Smith 2002 [55] | 46 | 5-17 years | ED | acute asthma | comparison with peak flow, interobserver reliability | r -0.44 to -0.67; ICC 0.53-0.63 |
| PS-2 | RF, wheeze, accessory muscle use | 12 | no information | discriminative | Wang 1992 [56] | 56 | 0-2 years | IP | bronchiolitis or pneumonia | comparison with SaO2, interobserver reliability | r (SaO2) -0.04; interobserver reliability r 0.68 |
| RA | work of breathing, wheeze, decreased air entry, prolonged expiration, breathlessness, RF | 24 | expert opinion | discriminative | Stevens 2003 [57] | 115 | 1-16 years | ED and IP | acute wheeze | interobserver reliability | kappa 0.82 |
| RAD | RF, accessory muscle, decreased breath sounds | 3 | expert opinion | evaluative, discriminative | Arnold 2011 [45] | 536 | 5-17 years | ED | acute asthma | validation and comparison of 3 different scores and comparison with FEV1 | coefficient of variation in a multivariable linear regression model using age, sex, race, the GINA severity score and the RAD with FEV1: R2 0.426; this model with change in FEV1: R2 0.139 |

| Name | Items | Maximum score | Development | Purpose | Reference | n of patients | Age | Setting | Clinical characteristics | Description of the study | Results |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RDAI | wheeze, retractions | 17 | expert opinion | evaluative | Lowell 1987 [58] | 30 | 0-2 years | ED | bronchiolitis | RCT epinephrine, interobserver reliability | improvement not quantified; κw 0.9 and 0.64 |
| | | | | | Klassen 1991 [59] | 83 | 0-24 months | ED | bronchiolitis | RCT salbutamol, interrater reliability | no difference at 60 min in both groups (salbutamol 8.75 to 5.0; placebo 8.0 to 6.25); κw 0.94 (61 observations in 22 patients) |
| | | | | | Menon 1995 [60] | 42 | 6 weeks-1 year | ED | bronchiolitis | RCT epinephrine versus salbutamol | improvement in score comparable in both groups (not quantified) |
| | | | | | Klassen 1997 [61] | 61 | 6 weeks-15 months | IP | bronchiolitis | RCT dexamethasone | mean change in score: dexamethasone 1.6; placebo 1.4 |
| | | | | | Patel 2002 [62] | 149 | <12 months | IP | bronchiolitis | RCT nebulized epinephrine versus albuterol | no effect of epinephrine on time to reach score < 4 |
| | | | | | Kuyucu 2004 [63] | 69 | 2-21 months | IP | bronchiolitis | RCT dexamethasone | improvement in score from 7.4 to 4.1 after 120 minutes, no differences between groups |
| | | | | | Corneli 2007 [64] | 600 | 2-12 months | ED | bronchiolitis | RCT dexamethasone | improvement in score comparable in both groups (-5.3 in dexamethasone; -4.8 in placebo after 4 hours) |
| | | | | | Plint 2009 [65] | 800 | 6 weeks-12 months | ED | bronchiolitis | RCT epinephrine and dexamethasone | improvement in score after 60 minutes better in epinephrine (-2.5) than in placebo (or dexamethasone -1.7) |
| | | | | | Mesquita 2009 [66] | 65 | 2-24 months | ED | bronchiolitis | RCT dexamethasone | change in score 10 to 8 after 60 minutes, no difference between groups |
| | | | | | Tinsa 2009 [67] | | 3-12 months | IP | bronchiolitis | RCT terbutaline | score decreased with time in both groups, not quantified |

Page 125: Demystification of commonly used measurements … of commonly used measurements in ... Demystification of commonly used measurements in paediatrics ... the most important instruments

124 Chapter 4 : Clinical assessment of dyspnoea in children

nam

eite

ms

max

i m

um

scor

e

deve

lop

men

tPu

rpos

ere

fer

ence

sn

of

patie

nts

age

sett

ing

clin

ical

ch

arac

ter

istic

s

desc

riptio

n of

the

stud

yre

sults

RDI

colo

ur, w

heez

e,

acce

ssor

y mus

cle,

flarin

g, g

runt

ing,

di

stre

ssfu

lnes

s

not c

lear

no in

form

atio

ndi

scrim

inat

ive,

ev

alua

tive

Alar

io 1

99268

741-

36

mon

ths

OP

acut

e w

heez

eRC

T of

neb

ulise

d m

etap

rote

reno

l and

co

mpa

rison

with

SaO

2

scor

e 23

±3 b

efor

e an

d 16

±4 a

fter

met

apro

tere

nol (

p<0.

01),

r (Sa

O2)

-0

.36

Alar

io 1

99569

741-

36

mon

ths

OP

acut

e w

heez

eco

mpa

rison

with

SaO

2r -

0.36

p<0

,01

SSRR

, ret

ract

ions

, cr

epita

tions

, whe

eze,

ox

ygen

requ

irem

ent,

nebu

lisat

ion,

in

trave

nous

infu

sion

15no

info

rmat

ion

eval

uativ

eGo

h 19

9770

120

<2 ye

ars

IPbr

onch

iolit

isRC

T sa

lbut

amol

and

ip

atro

pium

brom

ide

impr

ovem

ent i

n sc

ore

in 4

day

s in

all g

roup

s fro

m 6

.7 to

3.2

SOIS

grun

ting,

nas

al fl

arin

g,

supr

acla

vicu

lar

retra

ctio

ns, in

terc

osta

l re

tract

ions

, air

entry

, ai

r hun

ger, d

urat

ion

of w

heez

e, lo

catio

n of

whe

eze,

gen

eral

ap

pear

ance

27Ga

dom

ski

1994

71

88<1

5 m

onth

sED

and

O

Pw

heez

eRC

T al

bute

rol

impr

ovem

ent,

but n

ot q

uant

ified

Page 126: Demystification of commonly used measurements … of commonly used measurements in ... Demystification of commonly used measurements in paediatrics ... the most important instruments

Systematic review: Insufficient validation of clinical scores for the assessment of acute dyspnoea in wheezing children 125

4

Appendix 3. Quality of reported criteria for measurement properties of paediatric dyspnoea scores

Criteria rated, in column order: validity (face, content, construct, criterium-concurrent); reliability (measurement error, inter-observer reliability, intra-observer reliability, internal consistency); responsiveness; utility (suitable in children age, ease of scoring, auscultation, floor or ceiling effect, interpretability); sum of positively rated criteria.

| Score | Ratings (in extracted order) | Sum of positively rated criteria |
|---|---|---|
| AS | + - 0 ? 0 ? 0 0 0 ± + 0 ? | 3 |
| ASS | + - 0 0 0 ? 0 0 ? + + + ± 0 ? | 4 |
| BS-1 | + - 0 0 0 ? 0 0 ? + - 0 0 | 2 |
| BS-2 | + - 0 0 0 + 0 0 0 + ± 0 0 | 3 |
| BS-3 | + - 0 0 0 0 0 0 ? + - + ± 0 0 | 3 |
| BS-4 | + - 0 0 0 0 0 0 0 + - - + 0 ? | 3 |
| CAES-1 | + - 0 - 0 0 0 0 ? - - 0 0 | 2 |
| CAES-2 | + - 0 - 0 0 0 0 0 + + 0 0 | 4 |
| CAES-3 | + - 0 0 0 ? 0 0 0 + - 0 0 | 3 |
| CAES-4 | + - 0 - 0 0 0 0 0 ± - ± ± 0 ? | 1 |
| CAS | + + ? - 0 + 0 ? ? + - + - 0 0 | 5 |
| CS | + - 0 0 0 - 0 0 0 + + ± ± 0 0 | 3 |
| CSGS-1 | + - 0 + 0 0 0 0 ? + + 0 0 | 4 |
| CSGS-2 | + - 0 ? 0 0 0 0 0 + - 0 0 | 2 |
| CSS-1 | + - 0 0 0 0 0 0 ? + - ± ± 0 0 | 2 |
| CSS-2 | + - 0 + 0 0 0 0 0 + - ± ± 0 0 | 3 |
| CSS-3 | + - 0 - 0 0 0 0 0 - - 0 0 | 2 |
| CSS-4 | + - 0 0 0 0 0 0 ? + - 0 0 | 3 |
| CSS-5 | + - 0 0 0 0 0 0 ? + - ± ± 0 0 | 2 |
| CSS-6 | + - 0 0 0 0 0 0 ? + - ± ± 0 0 | 2 |
| CSS-7 | + - 0 0 0 + 0 0 0 + - ± ± 0 0 | 3 |
| CSS-8 | + - 0 0 0 0 0 0 0 + - ± ± 0 0 | 2 |
| EDRAR | + - 0 - 0 0 0 0 0 + - ± ± 0 0 | 2 |
| EDRCH | + - 0 0 0 0 0 0 0 + - ± ± 0 0 | 2 |
| MPIS | + - 0 0 0 ? 0 0 0 + + ± - 0 0 | 4 |
| PASS | + - ? ? 0 + 0 0 ? + + + - 0 ? | 5 |
| PRAM | + + ? ? 0 + 0 ? ? + ± 0 0 | 5 |
| PIS | + - ? - 0 + 0 + 0 + - 0 ? | 5 |
| PS-1 | + - ? - 0 - 0 0 0 + ± 0 0 | 2 |
| PS-2 | + - 0 - 0 - 0 0 0 + ± 0 0 | 2 |
| RA | + - 0 0 0 + 0 0 0 + + ± - 0 0 | 4 |
| RAD | + + 0 ? 0 0 0 0 ? + - + ± 0 0 | 4 |
| RDAI | ± - 0 0 0 + 0 0 ? + - - ± 0 0 | 2 |
| RDI | + - 0 0 0 0 0 0 0 + - - ± 0 0 | 2 |
| SS | + - 0 0 0 0 0 0 0 + - ± ± 0 0 | 2 |
| SOIS | + - 0 0 0 0 0 0 0 + - ± ± 0 0 | 2 |

Large inter and intra-observer variation of clinical assessment of dyspnoea in wheezing children

Jolita Bekhof, Roelien Reimink, Ine-Marije Bartels, Hendriekje Eggink, Paul Brand

Submitted


abstract

Objective: To determine intra- and inter-observer variation in clinical assessment of the severity of dyspnoea in children and to compare this variability to its change after therapy.

Study design and setting: We recorded 27 acutely wheezing children aged 0-8 years on video before and after treatment with inhaled bronchodilators. Nine observers independently assessed these video recordings by scoring wheeze, prolonged expirium, retractions, nasal flaring, mental status and a general assessment of dyspnoea on a Likert scale (0-10). Assessment was repeated after two weeks to evaluate intra-observer variation.

Results: We analysed 972 observations. Intra-observer reliability was highest for supraclavicular retractions (kappa 0.84) and fair to substantial for other items (kappa 0.39-0.65). Inter-observer reliability was highest for subcostal retractions (kappa 0.46) and <0.4 for other items. The Smallest Detectable Change of the dyspnoea score (>3 points) was larger than the Minimal Important Change (<1 point), meaning that in 69% of observations a clinically important change after treatment cannot be distinguished from measurement error.

Conclusion: Intra-observer variation is modest, and inter-observer variation is large for most clinical findings in dyspnoeic children. The measurement error induced by this variation is too large to distinguish potentially clinically relevant changes in dyspnoea after treatment in two thirds of observations.


introduction

Acute dyspnoea is one of the most common reasons for emergency room visits and hospitalizations of children.1 It is most commonly caused by lower airway obstruction and accompanied by wheeze.2 Evaluating the severity of dyspnoea in these children is important in clinical decision making and evaluation of treatment. The severity of dyspnoea is primarily assessed by clinical findings, because pulmonary function tests are not readily available in the emergency department, particularly not for preschool children, the largest group presenting with acute dyspnoea.3

Like any clinical measurement, the usefulness of the clinical assessment of dyspnoea is strongly determined by its reliability.4-6 Variation within and between professionals assessing the degree of dyspnoea may importantly influence the reliability of this clinical judgement. Since assessment of response to treatment, and follow-up of severity of dyspnoea is often performed by different professionals, knowledge of the inter-observer variation of the clinical assessment of dyspnoeic children is important.5 Only a few studies assessed the inter-observer variation of clinical findings in dyspnoeic children, demonstrating substantial variability between observers (webappendix 1). The small number of observers in these studies (two in most studies; four in one study) limits their applicability in clinical practice, where the number of health care professionals involved in assessing dyspnoea in children may be considerably larger.7-14 Intra-observer variation has never been studied for these clinical findings. Although variation in dyspnoea assessment within and between observers may hinder identification of improvement in dyspnoea after (bronchodilator) therapy, the extent to which this occurs has not been studied to date.

The aim of this study was to determine the intra- and inter-observer variation of common clinical findings in children with acute severe dyspnoea and wheeze and to compare the variability of clinical dyspnoea assessment to its change after bronchodilator therapy.

methods

Study design and setting

We performed an observational study using a crossed design, meaning that all observers performed two repeated assessments (before and after treatment) of all patients. We consecutively enrolled children aged 0 to 8 years presenting to the emergency department in the period September 2009 to September 2010 with acute dyspnoea and wheeze. The included patients are a convenience sample, since the persons (RR, ND, IMB) who recorded the videos had to be available. After undressing, the patient’s head and chest were recorded on digital video for 2-3 minutes, before and 15 to 30 minutes after inhaling nebulized salbutamol (2.5 mg for patients aged < 4 years and 5 mg for ages ≥ 4 years). Patients with transcutaneous oxygen saturation < 93% were given supplemental oxygen by nasal cannula before recordings were made. Oxygen saturation and heart rate were measured by pulse oximetry, which was visible on the video. The video included sound recording.

The study was approved by the hospital’s ethical review board (09.0536n), and written informed consent was obtained from the parents.

Assessment of video recordings

The video recordings of all patients were assessed independently by five experienced consultant paediatricians and four masters of advanced paediatric nursing, after receiving written and verbal instructions. These observers all had at least five years of experience in assessing dyspnoeic children. All assessments were repeated in random order by the same observers after an interval of at least two weeks to assess intra-observer variation. Observers did not receive any clinical information regarding the patients, nor did they know whether the video was recorded before or after bronchodilator treatment. Each observer recorded the presence and severity of the following clinical signs on a structured case report form: wheeze, prolonged expirium, subcostal retractions, intercostal retractions, jugular retractions, supraclavicular retractions, nasal flaring, and mental status. These items were chosen because they represent the items in all composite dyspnoea scores in the literature.15 Each item was rated as a binary variable (absent or present). Observers were also requested to give an overall assessment of the degree of dyspnoea on a Likert scale ranging from 0 (no dyspnoea) to 10 (very severe dyspnoea requiring mechanical ventilation), which we will call the dyspnoea score.

Terminology and definitions

To avoid confusion in the terminology in clinical measurement analysis, two terms should be explained: measurement error and reliability. Repeated measurements show variation such as variations within and between assessors and spontaneous variation within patients. The “standard error of measurements” (SEM) represents the magnitude of this measurement error.6 The SEM is interpreted as the standard deviation around a single measurement, describing the spread of repeated measurements in the unit of the measurement.6 Reliability is the degree to which the measurement is free from measurement error.16 It is expressed as the proportion of the true difference between patients, compared to the total variance.16

Kappa and the intraclass correlation coefficient (ICC) are reliability parameters, expressing how well patients can be distinguished from each other despite the measurement error.6 Kappa and ICC range from 0 (totally unreliable) to 1 (perfect reliability). In general a kappa or ICC > 0.70 represents sufficient reliability.6

Reliability depends on the relative magnitude of measurement error and variation between patients. If measurement error is small in comparison to variation between patients, the reliability parameter approaches 1. Large measurement error and relatively small variation between patients yield low reliability parameter values.6
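The relationship described above can be sketched numerically. The following is a minimal illustration, assuming reliability is computed as the between-patient variance divided by the total variance (between-patient variance plus error variance); the function name and the example variances are hypothetical and are not taken from the study.

```python
def reliability(var_between_patients: float, var_error: float) -> float:
    """Proportion of the total variance that reflects true differences between patients."""
    total = var_between_patients + var_error
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_between_patients / total


# Hypothetical variances, chosen only to show the two situations in the text.
print(round(reliability(4.0, 0.25), 2))  # small error relative to patient variation
print(round(reliability(0.5, 2.0), 2))   # large error, little variation between patients
```

With the same measurement error, a more homogeneous patient group therefore yields a lower reliability parameter.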

Clinical relevance: smallest detectable change (SDC) and minimal important change (MIC)

In addition to reliability parameters, clinical measurements of dyspnoea should also be examined on their ability to identify change in the degree of dyspnoea in the individual patient over time, either spontaneously or as a result of treatment. This is expressed as the smallest detectable change (SDC), the smallest within-person change which can be interpreted as real change above measurement error. The SDC should be compared to the minimal important change (MIC),5 the smallest change in the measurement which the clinician or patient perceives as important.6 When the MIC exceeds the SDC, the measurement has good clinical value, because clinically relevant changes can be distinguished from measurement error.6 Conversely, if the MIC is smaller than the SDC, the clinical usefulness of the measurement is limited because changes larger than the MIC but smaller than the SDC cannot be distinguished from measurement error.

Statistical analysis

We excluded observations with missing or “don’t know” or “unsure” responses. Percentiles of heart rate and respiratory rate were compared to reference values for age and categorized as <p1, p1-10, p10-25, p25-50, p50-75, p75-90, p90-99, and >p99.17 Tachypnoea and tachycardia were defined as respiratory and heart rates ≥ p90.

Calculation of measurement error (SEM)

Quantifying the measurement error in the units of measurement, or comparing it with change over time or response to treatment, is only possible for continuous variables (dyspnoea score). We calculated the SEM due to variation within observers by using the formula $\mathrm{SEM} = SD_{\mathrm{difference}} / \sqrt{2}$, where $SD_{\mathrm{difference}}$ is the standard deviation of the differences between the repeated observations.6 The SEM due to variation between observers was calculated by using the pooled SD of the mean scores of the different observers using the formula $\mathrm{SEM} = SD_{\mathrm{pooled}} \times \sqrt{1 - \mathrm{ICC}_{\mathrm{agreement}}}$. $SD_{\mathrm{pooled}}$ was calculated as $\sqrt{(SD^2_{\mathrm{observer1}} + SD^2_{\mathrm{observer2}} + \text{etc.})/2}$.6
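The two SEM formulas can be illustrated with a short sketch. It assumes the sample standard deviation (n - 1 denominator) for the differences; the function names and the repeated scores are hypothetical. The last call plugs in the overall SD (1.8) and inter-observer ICC (0.42) reported in Table 2 as a rough consistency check, on the assumption that the overall SD approximates the pooled SD.

```python
import math
import statistics


def sem_within_observer(first, second):
    """SEM = SD of the paired differences divided by sqrt(2)."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least two scores")
    differences = [a - b for a, b in zip(first, second)]
    return statistics.stdev(differences) / math.sqrt(2)


def sem_between_observers(sd_pooled, icc_agreement):
    """SEM = pooled SD multiplied by sqrt(1 - ICC for agreement)."""
    return sd_pooled * math.sqrt(1 - icc_agreement)


# Hypothetical dyspnoea scores given twice by one observer to four recordings.
print(round(sem_within_observer([4, 5, 6, 3], [5, 5, 4, 3]), 2))

# SD and ICC as listed in Table 2; the result rounds to about 1.4.
print(round(sem_between_observers(1.8, 0.42), 2))
```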

Calculation of reliability parameters (ICC and Kappa)

For the dichotomous clinical findings (present or absent) we calculated Kappa and percentage agreement. Kappa can be calculated in different ways, depending on number of observers, variable distribution (dichotomous, categorical, continuous) and statistical package.18 For intra-observer reliability we calculated Cohen’s Kappa in SPSS, using Siegel and Castellan’s calculation.19 For inter-observer reliability we calculated Cohen’s kappa for each observer pair, and used their arithmetic mean to calculate multirater Kappa, i.e. Light’s Kappa.20
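The pairwise approach to multirater kappa can be sketched as follows. This is an illustration only: it uses the standard Cohen's kappa with each rater's own marginal proportions, not the Siegel and Castellan variant computed in SPSS for the study, and the three raters and their binary ratings are hypothetical.

```python
from itertools import combinations


def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters scoring the same cases."""
    n = len(rater1)
    if n == 0 or n != len(rater2):
        raise ValueError("both raters must score the same non-empty set of cases")
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    categories = set(rater1) | set(rater2)
    # Chance agreement from each rater's own marginal proportions.
    p_expected = sum(
        (rater1.count(c) / n) * (rater2.count(c) / n) for c in categories
    )
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


def light_kappa(all_raters):
    """Light's kappa: arithmetic mean of Cohen's kappa over all rater pairs."""
    pairs = list(combinations(all_raters, 2))
    if not pairs:
        raise ValueError("need at least two raters")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ratings (1 = sign present, 0 = absent) for six recordings.
rater_a = [1, 1, 0, 0, 1, 0]
rater_b = [1, 1, 0, 1, 1, 0]
rater_c = [1, 0, 0, 0, 1, 0]

print(round(cohen_kappa(rater_a, rater_b), 2))
print(round(light_kappa([rater_a, rater_b, rater_c]), 2))
```

With nine observers, as in this study, the mean would be taken over 36 observer pairs.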

Kappa values were categorized as follows: less than 0, poor; 0 to 0.2, slight; 0.2 to 0.4, fair; 0.4 to 0.6, moderate; 0.6 to 0.8, substantial; and 0.8 to 1.0, almost perfect agreement.21

For the continuous variable (dyspnoea score), we calculated the intra-class correlation coefficient (ICC) in SPSS, using two-way-mixed, absolute agreement and single measures calculations.
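The categories above can be written as a small lookup. The text does not say to which category a value lying exactly on a boundary belongs; this sketch assumes that each upper bound is inclusive, which is an assumption rather than something stated in the source. The function name is hypothetical.

```python
def kappa_category(kappa: float) -> str:
    """Map a kappa value to its verbal label (upper bounds assumed inclusive)."""
    if kappa > 1:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0:
        return "poor"
    if kappa <= 0.2:
        return "slight"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "almost perfect"


# Kappa values reported in this study for individual clinical signs.
for value in (0.84, 0.65, 0.46, 0.39, 0.10):
    print(value, kappa_category(value))
```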

Calculation of the minimal important change (MIC) and the smallest detectable change (SDC)

We used the visual anchor based MIC-distribution to calculate the MIC of the dyspnoea score in our study population.6 This approach uses an external criterion, or “anchor”, a well-accepted and readily interpretable measurement instrument to determine what patients or their clinicians consider important improvement. We used two anchors: the clinical judgement of the consultant paediatrician who had assessed the patient in the emergency department and the difference in respiratory rate percentile categories before and after bronchodilator. The SDC was calculated by multiplying the standard deviation of the change in the dyspnoea score after treatment in the stable group of patients (defined by the anchor as “not importantly changed”) by a factor of 1.96, which corresponds to $1.96 \times \sqrt{2} \times \mathrm{SEM}$.6 Detailed explanation and calculation of the MIC and SDC are given in webappendix 2.
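The relation between SEM, SDC and MIC can be sketched as below. The SEM of 1.0 and the MIC of 1.0 are illustrative inputs; the study itself derived the SDC from the standard deviation of change in the stable group, so the number printed here is not the study's SDC. Function names are hypothetical.

```python
import math


def smallest_detectable_change(sem: float) -> float:
    """SDC = 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2) * sem


def interpret_change(change: float, mic: float, sdc: float) -> str:
    """Classify an observed change in score against the MIC and the SDC."""
    size = abs(change)
    if size < mic:
        return "smaller than the MIC"
    if size < sdc:
        return "important, but not distinguishable from measurement error"
    return "important and detectable"


sdc = smallest_detectable_change(1.0)  # illustrative SEM of 1.0 point
print(round(sdc, 2))
for change in (0.5, 2.0, 3.5):
    print(change, interpret_change(change, mic=1.0, sdc=sdc))
```

When the MIC is smaller than the SDC, as in this sketch, the middle category is non-empty: changes in that range matter clinically but cannot be separated from measurement error.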

All analyses were performed in SPSS, version 20.0.

results

We included and video-recorded 27 patients twice, before and after bronchodilator therapy. Each of these 54 recordings was assessed by nine observers on two occasions, resulting in a total of 972 assessments. Characteristics of included patients are listed in Table 1. Overall, patients had mild to moderate dyspnoea. None of the patients required intensive care or mechanical ventilation.

Table 2 shows the mean values, intra- and inter-observer reliability and the measurement error (SEM) of the dyspnoea score.

Table 3 shows the prevalence, intra- and inter-observer reliability, and percentage agreement of the different binary clinical findings. Reliability within observers was fair to almost perfect, and reliability between observers was slight to moderate. Inter-observer reliability was considerably lower than intra-observer reliability for each item.

The MIC for the dyspnoea score as calculated in this study population, was 1 (0.5) for both anchors. The SDC in the dyspnoea score, again calculated in this study population, was 3.2 for the clinician’s judgement and 3.3 for the change in respiratory rate. For both anchors, therefore, the SDC was considerably larger than the MIC, the consequences of which are expressed in Figure 1. In only 5.8% of observations in our study, the change in dyspnoea-score was both statistically significant and clinically relevant (webappendix 2).

Table 1. Patient characteristics

| Characteristic | N=27 |
|---|---|
| Age in months, median (range) | 19 (3-85) |
| Male gender, n (%) | 17 (63%) |
| Hospitalization, n (%) | 20 (74%) |
| Diagnosis, n (%) | |
| Acute asthma | 7 (26%) |
| Exacerbation of episodic viral wheeze | 13 (48%) |
| Bronchiolitis | 6 (22%) |
| Other* | 1 (4%) |
| Clinical findings before treatment (“live” assessment) | |
| Tachypnoea**, n (%) | 20 (74%) |
| Transcutaneous oxygen saturation ≤ 92%, n (%) | 9 (33%) |
| Tachycardia**, n (%) | 18 (67%) |

* patient was finally diagnosed with foreign body aspiration
** ≥ P90 for age according to Fleming et al.17

Table 2. Intra- and inter-observer variation of continuous measures of dyspnoea in children

| Results of 972 assessments | Mean (SD) | Intra-observer variation: ICC | Intra-observer variation: SEM | Inter-observer variation: ICC | Inter-observer variation: SEM |
|---|---|---|---|---|---|
| Dyspnoea score (0-10) | 4.1 (1.8) | 0.70 | 1.0 | 0.42 | 1.4 |

SD standard deviation; ICC intraclass correlation coefficient; SEM standard error of measurements


Table 3. Intra- and inter-observer variation of categorical measures of dyspnoea in children

| 972 assessments | Prevalence | Intra-observer agreement | Intra-observer kappa* (95% CI) | Inter-observer agreement | Inter-observer kappa** (95% CI) |
|---|---|---|---|---|---|
| Subcostal retractions | 68.0% | 84.6% | 0.65 (0.58-0.73) | 70.4% | 0.46 (0.44-0.49) |
| Prolonged expirium | 61.1% | 81.6% | 0.61 (0.47-0.75) | 61.6% | 0.12 (0.19-0.28) |
| Jugular retractions | 57.4% | 78.8% | 0.57 (0.48-0.66) | 65.1% | 0.29 (0.25-0.30) |
| Wheeze | 42.6% | 78.5% | 0.56 (0.48-0.64) | 68.0% | 0.36 (0.33-0.39) |
| Intercostal retractions | 42.0% | 76.2% | 0.49 (0.39-0.58) | 67.6% | 0.27 (0.24-0.30) |
| Nasal flaring | 22.8% | 88.6% | 0.63 (0.52-0.74) | 83.0% | 0.31 (0.26-0.37) |
| Supraclavicular retractions | 21.2% | 96.0% | 0.84 (0.75-0.93) | 79.3% | 0.37 (0.32-0.42) |
| Mental state affected | 4.4% | 94.9% | 0.39 (0.16-0.62) | 92.4% | 0.10 (0.07-0.21) |

* Cohen’s kappa (Siegel & Castellan)19
** Multirater Kappa (Light)20

discussion

This study is unique because it examines both within- and between-observer variation amongst more than two observers of physical examination signs of dyspnoea in children, which are being used in all published composite dyspnoea severity scoring systems. The results of our study show fair to good intra-observer reliability but poor inter-observer reliability of the clinical assessment of dyspnoea. Due to this variation within and between observers, the smallest detectable change exceeded the minimally important effect of treatment in 69.4% of observations in our study, obscuring the detection of a clinically important improvement in dyspnoea after treatment.

Figure 1. Interpretation of change in dyspnoea score after treatment, explaining the relevance of the MIC being smaller than the SDC

The poor inter-observer reliability we found is in accordance with the existing literature (see webappendix 1).15 To our knowledge, however, no earlier studies examined the clinical relevance of this variation between observers in terms of the ability to detect clinically relevant improvement after treatment, partly because the clinimetric methodology to assess this is relatively new.5,6

Our findings imply that in clinical practice, assessment of the severity of dyspnoea in children is not interchangeable between professionals. The results of our study, therefore, argue for great caution in interpreting the effect of a trial treatment with bronchodilator, which is recommended in clinical guidelines for infants with bronchiolitis22 and in young children with acute severe wheeze,23 in particular when the assessment of dyspnoea before and after bronchodilator is being performed by different observers. But even when the same professional assesses the degree of dyspnoea before and after bronchodilator, the considerable intra-observer variation (table 3) should be taken into account. If clinical dyspnoea scoring is being used in clinical trials of young children, the number of different observers should be presented and discussed, because of the large variation between observers (table 3). Although most published clinical dyspnoea scores report validation against lung function,24 we recently described that no clinical dyspnoea scoring system available to date has been sufficiently validated for routine clinical use.15 Even in large trials with considerable clinical impact, the degree of intra- and inter-observer reliability is not formally studied.24 Our results suggest that clinical dyspnoea scoring systems require further validation testing and assessment of variation between and within observers. Further studies are needed to assess whether training can reduce the amount of variation we observed in this study (table 3).

We also postulate that the use of more objective parameters, such as oxygen saturation and lung function assessments with acceptably small measurement error, will give less variable and thus more reliable assessments of dyspnoea in children.

Strengths and limitations

The major strengths of our study include the measurement of intra- as well as inter-observer reliability, the use of a large group of observers in a crossed design, and the assessment of the clinical impact of reliability of these clinical signs of dyspnoea by computing measurement error.

We acknowledge the following weaknesses of our study. The use of video recordings clearly has limitations. The video recordings were relatively short (2-3 minutes), which may have led to less accurate ratings or missed observations and may have decreased the likelihood of detecting subtle signs on physical examination. For our study purposes, however, which included the assessment of variation between and within multiple observers, video recordings were considered to be the only feasible method. The lack of chest auscultation could also be viewed as a weakness, because this helps physicians to detect lower airway obstruction. However, previous studies have shown poor association between wheeze severity on auscultation and the degree of airway obstruction and hypoxaemia.4, 25 Furthermore, leaving out auscultation in the assessment of dyspnoea severity in children reflects clinical practice where many assessments are being made by health care professionals who have not been trained in chest auscultation. We examined a limited number of patients for feasibility reasons, to avoid observer fatigue and boredom when assessing the videos. The reliability of the dyspnoea score and the individual items may have been greater if we had included more children with (very) severe dyspnoea. However, we argued that health care professionals are reasonably accurate in identifying severe dyspnoea, reducing the need for a scoring system to identify this. A clinical dyspnoea scoring system is potentially most useful in identifying and following up mild to moderate dyspnoea in clinical practice, which our study population represented.

A limitation of the Kappa statistic is its bias in case of low or high prevalence. Although we used a bias-correcting version of Kappa (Siegel & Castellan), no formal adjustment for the influence of prevalence on Kappa was made in our analyses.20 This may have played a role in explaining the low Kappa value for abnormal mental state, which had a low prevalence (6% and 2.9% before and after treatment, respectively).

conclusion

The measurement error induced by inter-observer variability of clinical signs of dyspnoea in children is considerable and cannot be distinguished from a possibly relevant effect of therapy in two thirds of patients. The poor inter-observer reliability of clinical dyspnoea assessment in children limits its usefulness in clinical practice and research, and highlights the need to use more objective measurements in these patients.

references

1. Bisgaard H, Szefler S. Prevalence of asthma-like symptoms in young children. Pediatr Pulmonol 2007;42:723-728.

2. Keeley D, McKean M. Asthma and other wheezing disorders in children. Clin Evid 2005;14:238-262.


3. Bishop J, Carlin J, Nolan T. Evaluation of the properties and reliability of a clinical scoring severity scale for acute asthma in children. J Clin Epidemiol 1992;45:71-76.

4. Kerem E, Canny G, Reisman J, Bentur L, Levison H, Tibshirani R, Schuh S. Clinical-physiologic correlation in acute asthma of childhood. Pediatrics 1991;87:481-486.

5. Streiner DL, Norman GR. Health measurement scales. A practical guide to their development and use. New York: Oxford University Press 2003:126-152, 172-212.

6. De Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in medicine: a practical guide. Cambridge: Cambridge University Press, 2011.

7. Angelilli ML, Thomas R. Inter-rater evaluation of a clinical scoring system in children with asthma. Ann Allergy Asthma Immunol 2002;88:209-214.

8. Parkin PC, MacArthur C, Saunders NR, Diamond SA, Winders PM. Development of a clinical asthma score for use in hospitalized children between 1 and 5 years of age. J Clin Epidemiol 1996;49:821-825.

9. Yung M, South M, Byrt T. Evaluation of an asthma severity score. J Paediatr Child Health 1996;32:261-264.

10. Gajdos V, Beydon N, Bommenel L, Pellegrino B, de Pontual L, Bailleux S, Labrune P, Bouyer J. Inter-observer agreement between physicians, nurses, and respiratory therapists for respiratory clinical evaluation in bronchiolitis. Pediatr Pulmonol 2009;44:754-762.

11. Wang EEL, Milner RA, Navas L, Maj H. Observer agreement for respiratory signs and oximetry in infants hospitalized with lower respiratory infections. Am Rev Respir Dis 1992;145:106-109.

12. Liu LL, Gallaher MM, Davis RL, Rutter CM, Lewis TC, Marcuse EK. Use of a respiratory clinical score among different providers. Pediatr Pulmonol 2004;37:243-248.

13. Ducharme FM, Chalut D, Plotnick L, et al. The Pediatric Respiratory Assessment Measure: a valid clinical score for assessing acute asthma severity from toddlers to teenagers. J Pediatr 2008;152:476-480.

14. Walsh P, Gonzales A, Satar A, Rothenberg SJ. The Interrater reliability of a validated bronchiolitis severity assessment tool. Pediatric Emergency Care 2006;22:316-320.

15. Bekhof J, Reimink R, Brand PLP. Systematic review: insufficient validation of clinical scores for the assessment of acute dyspnoea in wheezing children. Paediatric Respir Rev, in press. doi 10.1016/j.prrv.2013.08.004

16. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de Vet HC. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol 2010 Jul;63:737-745.

17. Fleming S, Thompson M, Stevens R, Heneghan C, Plüddemann A, Maconochie I, Tarassenko L, Mant D. Normal ranges of heart rate and respiratory rate in children from birth to 18 years of age: a systematic review of observational studies. Lancet 2011;377:1011-1018.

18. Hallgren KA. Computing inter-rater reliability for observational data: an overview and tutorial. Tutor Quant Methods Psychol 2012;8:23-34.

19. Siegel S, Castellan NJ. Nonparametric statistics for the behavioural sciences. New York: McGraw-Hill; 1988:284-291.

20. Light RJ. Measures of response agreement for qualitative data: some generalizations and alternatives. Psychological Bulletin 1971;76:356-377.

21. Fleiss JL. Measuring nominal scale agreement among many raters. Psychological Bulletin 1971;76:378-382.


22. Diagnosis and management of bronchiolitis. American Academy of Pediatrics Subcommittee on Diagnosis and Management of Bronchiolitis. Pediatrics 2006;118:1774-1793.

23. Brand PL, Baraldi E, Bisgaard H, Boner AL, Castro-Rodriguez JA, Custovic A, de Blic J, de Jongste JC, Eber E, Everard ML, Frey U, Gappa M, Garcia-Marcos L, Grigg J, Lenney W, Le Souëf P, McKenzie S, Merkus PJ, Midulla F, Paton JY, Piacentini G, Pohunek P, Rossi GA, Seddon P, Silverman M, Sly PD, Stick S, Valiulis A, van Aalderen WM, Wildhaber JH, Wennergren G, Wilson N, Zivkovic Z, Bush A. Definition, assessment and treatment of wheezing disorders in preschool children: an evidence-based approach. Eur Respir J 2008;32:1096-1110.

24. Panickar J, Lakhanpaul M, Lambert PC, Kenia P, Stephenson T, Smyth A, Grigg J. Oral prednisolone for preschool children with acute virus-induced wheezing. N Engl J Med 2009;360:329-338.

25. Commey JOO, Levison H. Physical signs in childhood asthma. Pediatrics 1976;58:537-541.


Web appendix 1. Interobserver reliability of clinical findings reported in the literature

Study characteristics

Study        Observers  Patients  Age           Diagnosis                          Statistic
Angelilli7   4          17        6-17 years    asthma                             Fleiss multirater kappa
Bishop3      2          60        ½-17 years    wheeze                             weighted kappa
Ducharme13   2          254       2-17 years    asthma                             weighted kappa
Gajdos10     2          180       ½-15 months   bronchiolitis                      weighted kappa
Liu12        2          55        0-19 years    wheeze                             weighted kappa
Parkin8      2          58        1-5 years     asthma                             weighted kappa
Wang11       2          56        < 2 years     lower respiratory tract infection  Fleiss multirater kappa
Walsh14      2          164       < 18 months   bronchiolitis                      weighted kappa
Yung9        2          107       0-19 years    asthma                             weighted kappa

Reported kappa values per clinical finding, in study column order (only the studies that assessed the finding contribute a value)

Accessory muscle use or indrawing: 0.53, 0.56, 0.8, 0.77, 0.39, 0.79, 0.25, 0.30, 0.76
Air entry: 0.32, 0.72
Dyspnoea: 0.53, 0.63
Heart rate: 0.51, 0.67
Oxygen saturation: 0.76, 0.81
Inspiratory to expiratory ratio: 0.45
Mental state/general condition: 0.7, 0.48
Respiratory rate percentiles: 0.81, 0.36, 0.85, 0.38
Nasal flaring: 0.54
Wheeze: 0.70, 0.51, 0.70, 0.73, 0.31, 0.19, 0.67
Work of breathing: 0.61


Web appendix 2: Explanation of the visual anchor-based method for calculation of the minimal important change (MIC)

Clinimetric textbooks recommend using different anchors to assess minimal important change.6 We used two anchors: the clinical judgement of the consultant paediatrician who had assessed the patient in the emergency department, and the difference in respiratory rate percentiles before and after bronchodilator. Clinicians rated the change in the child’s condition after bronchodilator as worse, no change, slightly improved, or markedly improved. Respiratory rate percentile scores were categorized as <p1, p1-10, p10-25, p25-50, p50-75, p75-90, p90-99, and >p99.17 Changes in respiratory rate percentile scores were categorized as two or one categories worse, no change, or one, two or three categories improved. Table A presents the distribution of the dyspnoea score compared to the different anchors. For each of these anchor changes, we calculated the corresponding mean (SD) change in the dyspnoea score (Table A).
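As an illustration only, the categorisation of respiratory rate percentiles and the resulting change in categories could be sketched as follows. The function names, the handling of values that fall exactly on a category boundary, and the convention that a lower percentile after bronchodilator counts as improvement are assumptions made for this sketch; they are not specified in the text.

```python
from bisect import bisect_right

# Upper boundaries of the percentile categories named in the text.
BOUNDS = [1, 10, 25, 50, 75, 90, 99]
LABELS = ["<p1", "p1-10", "p10-25", "p25-50", "p50-75", "p75-90", "p90-99", ">p99"]


def category(percentile):
    """Return the index (0-7) of the percentile category.

    Assumption: a value exactly on a boundary falls in the higher category.
    """
    return bisect_right(BOUNDS, percentile)


def categories_improved(before, after):
    """Number of categories improved (positive) or worsened (negative).

    Assumption: a lower respiratory rate percentile after treatment is improvement.
    """
    return category(before) - category(after)


# Hypothetical example: p95 before and p60 after bronchodilator.
print(LABELS[category(95)], "->", LABELS[category(60)], categories_improved(95, 60))
```

In this hypothetical example the child moves from p90-99 to p50-75, which would be counted as two categories improved.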

The next step to obtain the MIC value is to determine the cut-off point on the different anchors, so the observations can be divided in two parts: observations which improved importantly and those which did not.6

The cut-off points on the anchors were laid between the categories “no change” and “slightly improved” for the judgement of the clinician. The cut-off point for the respiratory rate as anchor was laid between “1 category improved” and “2 categories improved”.

Subsequently, for each change in dyspnoea score, we calculated the proportion of observations classified as “improved” or “not importantly changed” according to the anchor (Tables B and C). Thereafter, sensitivity and specificity were calculated for each change in dyspnoea score. The optimal cut-off on the receiver operating characteristic (ROC) curve (i.e. the change in dyspnoea score for which the sum of 1-sensitivity and 1-specificity is smallest, thus with the fewest false negative and false positive classifications) is considered the MIC.6
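A minimal sketch of this last step, using the counts per change in dyspnoea score as we read them from Table B (anchor: clinician's assessment). The function name and the convention that a change above the cut-off is classified as improved are assumptions for illustration; intermediate values depend on how that convention treats each row, so they need not coincide with every printed table entry.

```python
# Observations per change in dyspnoea score, as read from Table B.
improved = {-5: 0, -4: 0, -3: 1, -2: 3, -1: 9, 0: 30, 1: 47,
            2: 27, 3: 28, 4: 8, 5: 7, 6: 0, 7: 1}
not_improved = {-5: 2, -4: 3, -3: 9, -2: 31, -1: 63, 0: 90, 1: 75,
                2: 30, 3: 13, 4: 4, 5: 1, 6: 2, 7: 0}


def mic_by_roc(improved, not_improved):
    """Return (cut-off, error), where error = (1 - se) + (1 - sp) is smallest.

    Candidate cut-offs lie halfway between adjacent score changes.
    Assumption: a change above the cut-off is classified as improved.
    """
    total_pos = sum(improved.values())
    total_neg = sum(not_improved.values())
    scores = sorted(improved)
    best = None
    for low, high in zip(scores, scores[1:]):
        cutoff = (low + high) / 2
        se = sum(n for s, n in improved.items() if s > cutoff) / total_pos
        sp = sum(n for s, n in not_improved.items() if s < cutoff) / total_neg
        error = (1 - se) + (1 - sp)
        if best is None or error < best[1]:
            best = (cutoff, error)
    return best


cutoff, error = mic_by_roc(improved, not_improved)
print(cutoff, round(error, 3))
```

With these counts the smallest sum is found at a cut-off of 0.5, the same cut-off at which the sum in Table B is smallest.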


Web appendix 2, Table A. Distribution of the dyspnoea scores as compared to the different anchors

Change in dyspnoea scores before and after treatment

Clinician's assessment
  Anchor category             Mean change (SD change)   Number of patients   Number of observations
  markedly improved           1.5 (1.6)                 9                    161
  slightly improved           0.3 (1.7)                 8                    144
  no change                   -0.7 (1.6)                10                   179
  Dichotomised
  importantly improved        1.5 (1.6)                 9                    161
  not importantly improved    0.1 (1.6)                 18                   323

Respiratory rate*
  Anchor category             Mean change (SD change)   Number of patients   Number of observations
  3 categories improved       -1.4 (1.5)                1                    18
  2 categories improved       1.3 (1.8)                 6                    107
  1 category improved         0.5 (1.7)                 7                    126
  no change                   0.5 (1.6)                 9                    161
  1 category worse            -0.6 (1.6)                3                    54
  2 categories worse          0.4 (1.0)                 1                    18
  Dichotomised
  importantly improved        0.9 (2.0)                 7                    125
  not importantly improved    0.4 (1.6)                 20                   359

*Difference in number of percentile categories (percentile categories: <p1; p1-10; p10-25; p25-50; p50-75; p75-90; p90-99; >p99)17


Table B. Determination of the MIC by using the ROC curve with the anchor: clinician's (live) assessment

imp = importantly improved; not = not importantly improved; Se = sensitivity; Sp = specificity; Sum = (1-Se) + (1-Sp)

Change in        imp   imp          not   not          ROC
dyspnoea score   n     proportion   n     proportion   cut-off   Se       Sp       1-Se     1-Sp     Sum
-5               0     0            2     0.0062       -5        1        0.0062   0        0.9938   0.9938
-4               0     0            3     0.0093       -4.5      1        0.0155   0        0.9845   0.9845
-3               1     0.0062       9     0.0279       -3.5      0.9999   0.0434   0.0001   0.9566   0.9567
-2               3     0.0186       31    0.0960       -2.5      0.9937   0.1394   0.0063   0.8606   0.8669
-1               9     0.0559       63    0.1950       -1.5      0.9751   0.3344   0.0249   0.6656   0.6905
0                30    0.1863       90    0.2786       -0.5      0.9192   0.613    0.0808   0.387    0.4678
1                47    0.2919       75    0.2322       0.5       0.7329   0.8452   0.2671   0.1548   0.4219
2                27    0.1677       30    0.0929       1.5       0.441    0.9381   0.559    0.0619   0.6209
3                28    0.1739       13    0.0402       2.5       0.2733   0.9783   0.7267   0.0217   0.7484
4                8     0.0497       4     0.0124       3.5       0.0994   0.9906   0.9006   0.0094   0.91
5                7     0.0435       1     0.0031       4.5       0.0497   0.9937   0.9503   0.0063   0.9566
6                0     0            2     0.0062       5.5       0.0062   0.9999   0.9938   0.0001   0.9939
7                1     0.0062       0     0            7         0.0062   1        0.9938   0        0.9938
total            161                323


Table C. Determination of the MIC by using the ROC curve with the anchor: respiratory rate in percentiles

imp = importantly improved; not = not importantly improved; Se = sensitivity; Sp = specificity; Sum = (1-Se) + (1-Sp)

Change in        imp   imp          not   not          ROC
dyspnoea score   n     proportion   n     proportion   cut-off   Se      Sp      1-Se    1-Sp    Sum
7                0     0            1     0.003        7         0       1       1       0       1
6                0     0            2     0.006        5.5       0       0.997   1       0.003   1.003
5                6     0.048        2     0.006        4.5       0.048   0.991   0.952   0.009   0.961
4                5     0.040        7     0.019        3.5       0.088   0.985   0.912   0.015   0.927
3                15    0.12         26    0.072        2.5       0.208   0.966   0.792   0.034   0.826
2                17    0.136        40    0.111        1.5       0.344   0.894   0.656   0.106   0.762
1                37    0.296        85    0.237        0.5       0.64    0.783   0.36    0.217   0.577
0                22    0.176        98    0.273        -0.5      0.816   0.546   0.184   0.454   0.638
-1               8     0.064        64    0.178        -1.5      0.88    0.273   0.12    0.727   0.847
-2               9     0.072        25    0.070        -2.5      0.952   0.095   0.048   0.905   0.953
-3               3     0.024        7     0.019        -3.5      0.976   0.025   0.024   0.975   0.999
-4               2     0.016        1     0.003        -4.5      0.992   0.006   0.008   0.994   1.002
-5               1     0.008        1     0.003        -5.5      1       0.003   0       0.997   0.997
total            125   1            359   1


Chapter 5

Viral tests for cohort isolation in children hospitalised for bronchiolitis


Co-infections in children hospitalised for bronchiolitis: role of roomsharing

Jolita Bekhof, Joline Bakker, Roelien Reimink, Mirjam Wessels, Veerle Langenhorst, Paul Brand, Gijs Ruijs

J Clin Med Res 2013;5:426-31


Abstract

Background: Bronchiolitis is a major cause for hospitalisation in young children during the winter season, with respiratory syncytial virus (RSV) as the main causative virus. Apart from standard hygiene measures, cohorting of RSV-infected patients separately from RSV-negative patients is frequently applied to prevent cross-infection, although evidence to support this practice is lacking.

Objectives: To evaluate the risk of room sharing between RSV-positive and RSV-negative patients.

Study design: We performed a prospective observational cohort study in children <2 years hospitalised with acute bronchiolitis. During the first day of admission, patients shared one room, pending results of virological diagnosis (PCR). When diagnostic results were available, RSV-positive and RSV-negative patients were separated. Standard hygienic measures (gowns, gloves, masks, hand washing) were used in all patients.

Results: We included 48 patients (83% RSV-positive). Co-infection was found in nine patients at admission, and two during hospitalisation (23%). The two patients with acquired co-infection had been nursed in a single room during the entire admission. None of 37 patients sharing a room with other bronchiolitis patients (20 with patients with a different virus) were co-infected during admission. Disease severity in co-infection was not worse than in mono-infection.

Conclusion: One in five patients with bronchiolitis was co-infected, but co-infection acquired during admission was rare and was not associated with more severe disease. Room sharing between RSV-positive and RSV-negative patients (on the first day of admission) did not influence the risk of co-infection, suggesting that cohorting of RSV-infected patients separate from non-RSV-infected patients may not be indicated.


Introduction

Acute bronchiolitis is a major cause for hospitalisation in young children during the winter season.1,2 Human respiratory syncytial virus (RSV) is the most frequently identified virus; however, with the use of new and highly sensitive molecular amplification methods, the role of other viral pathogens in bronchiolitis has been increasingly recognized. Disease severity has been shown to vary across a range of respiratory viruses, and double viral infection is relatively common, occurring in about 10-30% of hospitalised patients.3-7 There is no consensus, however, on the impact of such co-infection on disease severity:5 some studies showed more severe disease in co-infected children,8-14 while others did not.15-21 Most hospitals perform routine viral testing to identify and isolate RSV-infected infants, with the aim of reducing the risk of nosocomial cross-infection of other patients.22,23,24 However, no good evidence is available on how effective this approach is in preventing nosocomial cross-infections among admitted patients with the clinical diagnosis of bronchiolitis.

Because of limited isolation facilities, patients with bronchiolitis admitted to our pediatric ward initially share a room, pending the results of virological diagnosis. We hypothesize that contact isolation measures and maintaining enough distance between the beds in a shared room should be sufficient in preventing cross-infection, since the major route of transmission of respiratory viruses is by close contact with infected secretions and not by small-particle aerosol.24,25

Objectives

The purpose of this study was to determine the incidence of cross-infection in children hospitalised for bronchiolitis, when patients with RSV share the same room with patients with bronchiolitis infected with another virus during the first day of admission.

Study design

The study was conducted at our 30-bed pediatric ward. From December 2011 through March 2012, all eligible infants younger than two years of age hospitalised for acute bronchiolitis were prospectively enrolled. Bronchiolitis was defined as acute respiratory disease, accompanied by coryza, cough, inspiratory crackles and/or expiratory wheezing on auscultation. Infants with chronic lung disease, congenital heart disease and Down’s syndrome were excluded.

We prospectively collected demographic and clinical information, including the presence and number of roommates, the virological diagnosis of the patient and roommates, and a daily dyspnoea score assessed by an independent researcher who was unaware of the virological diagnosis (Table 1).26

A nasopharyngeal aspirate was collected for virological diagnosis by direct immunochromatographic antigen detection (RespiFinder TwoStep kit, Pathofinder) immediately at admission, every fourth day during admission, and five to seven days after discharge.27,28

All patients with bronchiolitis were treated with standard hygienic measures. Medical and nursing personnel wore gowns, gloves and masks during patient contact and washed their hands before and after patient contact. Parents and visitors were asked to wash hands before leaving the room. On the first day of admission, pending the results of the RSV-PCR, patients shared a two- or four-bed room, with beds separated by at least 1.5 metres. Cohorting of RSV-infected patients commenced as soon as the result of RSV-PCR was known, generally within one day after admission.

Table 1. Dyspnoea score

Each item is scored 0, 1 or 2.

Respiratory rate: 0 = normal (< 40/min); 1 = slightly increased (40-60/min); 2 = clearly increased (> 60/min)
Oxygen saturation: 0 = ≥ 95% in room air; 1 = 92-94% in room air; 2 = < 92% in room air, or need for supplemental oxygen
Wheezing: 0 = none; 1 = audible with stethoscope; 2 = audible without stethoscope
Retractions: 0 = none; 1 = mild-moderate; 2 = severe
General condition: 0 = not affected (alert/quietly sleeping); 1 = moderately affected (irritable or agitated); 2 = severely affected (lethargic, poor feeding)

Adapted from Kristjansson26
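As a worked illustration of Table 1, the total score could be computed as in the sketch below. The function name, argument names and category strings are hypothetical choices made for this example; only the thresholds and the five items come from the table.

```python
# Category-to-points mappings taken from the rows of Table 1.
WHEEZING = {"none": 0, "with stethoscope": 1, "without stethoscope": 2}
RETRACTIONS = {"none": 0, "mild-moderate": 1, "severe": 2}
GENERAL_CONDITION = {"not affected": 0, "moderately affected": 1, "severely affected": 2}


def dyspnoea_score(resp_rate, spo2, supplemental_oxygen,
                   wheezing, retractions, general_condition):
    """Sum of the five items of Table 1, each scored 0-2 (maximum 10)."""
    # Respiratory rate per minute: < 40, 40-60, > 60.
    if resp_rate < 40:
        rate_points = 0
    elif resp_rate <= 60:
        rate_points = 1
    else:
        rate_points = 2

    # Oxygen saturation (%) in room air; supplemental oxygen always scores 2.
    if supplemental_oxygen or spo2 < 92:
        oxygen_points = 2
    elif spo2 < 95:
        oxygen_points = 1
    else:
        oxygen_points = 0

    return (rate_points + oxygen_points + WHEEZING[wheezing]
            + RETRACTIONS[retractions] + GENERAL_CONDITION[general_condition])


# Hypothetical patient: 50 breaths/min, SpO2 93% in room air, wheeze audible
# with stethoscope, mild-moderate retractions, alert.
print(dyspnoea_score(50, 93, False, "with stethoscope", "mild-moderate", "not affected"))
```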

Statistical analysis

The chi-square test was used to compare categorical data and the Mann-Whitney U test to compare continuous data, because of skewed distributions. Statistical analyses were performed using Statistical Package for the Social Sciences (SPSS) version 19.

This study is registered at clinicaltrials.gov (NCT01441466).

Results

Of the 84 patients with bronchiolitis hospitalised during the 11-week study period, 36 were excluded for the following reasons: cardiac disease (2), chronic lung disease with home oxygen (2), Down’s syndrome (3), no parental consent (12), age > two years (7), missed inclusion (6), missing nasal wash specimen at admission (3). A total of 48 patients completed the study (Table 2).


Table 2. Patient characteristics (n=48)

Age, months                            3.2 (1.8-9.7)
Male                                   26 (54.2%)
Birth characteristics
  Gestational age, weeks               38.5 (37.8-40.1)
  Preterm birth (<37 weeks)            2 (4.2%)
  Birth weight, gram                   3420 (3120-3740)
Environmental factors
  Day care attendance                  16 (33.3%)
  Siblings                             39 (81.2%)
Disease severity
  Length of hospitalization (days)     1.9 (1.6-4.0)
  Oxygen supplementation               30 (62.5%)
  Tube feeding                         20 (41.7%)
  Highest dyspnoea score               3.0 (2.0-4.8)
  Mechanical ventilation               3 (6.2%)

Data are presented as median and interquartile range in parentheses, or number and percentage in parentheses. Highest possible dyspnoea score: 10.

Table 3. Distribution of viral pathogens

Virus                At admission (n=48)   At discharge (n=48)   After discharge (n=44)
Mono-infections
  RSV-A              7 (14.5%)             6 (12.5%)             2 (4.5%)
  RSV-B              25 (52.1%)            19 (39.6%)            5 (11.4%)
  hMPV               2 (4.2%)              1 (2.1%)              1 (2.3%)
  RhV                3 (6.3%)              3 (6.3%)              3 (6.8%)
  CoV                0                     2 (4.2%)              1 (2.3%)
  AdV                0                     0                     3 (6.8%)
Co-infections
  RSV-A and hMPV     1 (2.1%)              1 (2.1%)              0
  RSV-B and PIF      1 (2.1%)              0                     0
  RSV-B and AdV      0                     1 (2.1%)              0
  RSV-B and RhV      4 (8.3%)              4 (8.3%)              1 (2.3%)
  RSV-B and CoV      1 (2.1%)              0                     0
  RSV-B and hMPV     1 (2.1%)              1 (2.1%)              0
  CoV and PIF        1 (2.1%)              1 (2.1%)              0
No virus             2 (4.2%)              9 (18.8%)             28 (63.6%)

Number with percentage in parentheses


The distribution of viral pathogens is shown in Table 3; RSV was the main pathogen, detected in 83% of patients. Co-infection was found in 11 (22.9%) patients, nine of whom were already co-infected at admission, while two acquired co-infection during admission.

Of all included patients, 37 (77.1%) had shared a room with other bronchiolitis patients, 20 of whom (54.1%) had shared a room with a patient infected with a different virus. The two patients who acquired co-infection during admission had never shared a room with another patient. None of the bronchiolitis patients sharing rooms were infected with another virus during admission.

Co-infected patients did not suffer from more severe disease than patients infected with a single virus, but, although not statistically significant, disease severity tended to be higher in RSV-infected patients compared to RSV-negative patients (Table 4).

Table 4. Comparison of disease severity between mono- versus co-infected patients and RSV-infected versus RSV-uninfected patients (n=48)

                                  Mono- versus co-infection                     RSV-infected versus RSV-uninfected
                                  Co-infection     Mono-infection   p-value    RSV-infected     RSV-uninfected   p-value
                                  n=11             n=37                        n=40             n=8
Age, months                       4.3 (2.2-11.4)   3.2 (1.6-9.4)    0.413      3.3 (1.8-9.8)    3.0 (1.6-8.5)    0.740
Length of hospitalization, days   2.0 (1.7-3.4)    1.9 (1.2-4.2)    0.864      2.5 (1.6-4.4)    1.8 (1.2-1.9)    0.162
Oxygen supplementation            6 (54.6%)        24 (64.9%)       0.535      25 (62.5%)       5 (62.5%)        1.000
Tube feeding                      4 (36.4%)        16 (43.2%)       0.681      18 (45.0%)       2 (25%)          0.295
Highest dyspnoea score (0-10)     3.0 (2.0-4.0)    3.0 (1.5-5.0)    0.654      3.0 (2.0-5.0)    2.5 (1.0-4.0)    0.285
Mechanical ventilation            1 (9.1%)         2 (5.4%)         0.658      3 (7.5%)         0 (0%)           0.424

Data are presented as median and interquartile range in parentheses, or number and percentage in parentheses, as appropriate. P value: Mann-Whitney U test for continuous variables, χ2 test for dichotomous variables.
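To make the comparison of dichotomous variables concrete, the sketch below computes an uncorrected Pearson chi-square test for a 2x2 table, using the oxygen supplementation counts from Table 4 (6 of 11 co-infected versus 24 of 37 mono-infected patients). The analyses in the study were run in SPSS; this pure-Python function, its name, and the choice to omit a continuity correction are assumptions for illustration.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: co-infection, mono-infection; columns: oxygen given, no oxygen given.
chi2, p_value = pearson_chi_square_2x2(6, 5, 24, 13)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out at approximately 0.535, in line with the value reported for oxygen supplementation in Table 4.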

Discussion

This study showed that nosocomially acquired co-infection is rare, even when RSV-positive and RSV-negative patients share a room during the first day of hospital admission. Furthermore, co-infection was not associated with more severe disease. The small size of our study precludes firm conclusions; however, these findings suggest that separating RSV-infected from RSV-negative patients with bronchiolitis may not be indicated. Cohorting of patients with bronchiolitis as one group, irrespective of viral diagnosis, may suffice.

Our finding that cohorting of RSV-infected patients may not add to the prevention of co-infection is supported by the fact that the main route of transmission of respiratory viruses is through direct contact, with only a minor role for aerosol transmission.24,25 Therefore, we stress that strict adherence to other hygienic measures by medical staff and patients’ relatives is clearly of crucial importance.23,24 Hand washing is the single most important procedure in the prevention of nosocomial infections, yet it remains the most violated of all infection control procedures.23,24 It is conceivable that placing children in a cohort generates considerable peer and parental pressure to ensure that measures such as hand washing are followed.

Our results may also imply that routinely performing virological diagnostic testing is not needed in children with bronchiolitis. The diagnosis of bronchiolitis is a clinical diagnosis and for this purpose further diagnostic testing is not needed.29 Since cohorting of RSV-infected patients is the most important reason for virological testing in bronchiolitis, health care expenses can be reduced by omitting the routine use of these tests, provided that influenza, a serious and treatable infection, is excluded.

This does not exclude the potential usefulness of rapid broad range viral testing in specific circumstances, for example in young febrile infants, where rapid broad range viral testing might reduce the need for invasive sepsis workup, or in case of unclear clinical presentation (apnoea without respiratory signs) or for surveillance purposes.

Our findings add to the current controversy concerning this issue. We realise that the small numbers in our study limit solid comments on this subject and that no definite conclusions can be drawn. Another important limitation is that we only evaluated the risk of room sharing during the first 24 hours of admission. It is quite possible that prolonged sharing of rooms increases the incidence of cross-infections. For practical and safety reasons, we deliberately chose to perform the study under these specific circumstances as a proof of principle, before embarking on a similar project with room sharing during the entire admission.

We conclude that, with standard hygiene control measures, the risk of nosocomially acquired co-infection is low, and does not appear to be related to room sharing between RSV-positive and RSV-negative patients (during the first day of admission). These findings argue against routine cohorting of RSV-infected bronchiolitis patients and against routinely carrying out broad range virological testing of infants hospitalised for bronchiolitis. However, a larger study, applying room sharing during the entire admission, is needed before definite conclusions can be drawn.

Ethical approval

Written informed consent from the parents was obtained before inclusion. The study was approved by the institutional ethical review board.


References

1. Smyth RL, Openshaw PJM. Bronchiolitis. Lancet 2006;368:312-322.

2. Bush A, Thomson AH. Acute bronchiolitis. BMJ 2007;335:1037-1041.

3. Paranhos-Baccala G, Komurian-Pradel F, Richard N, Vernet G, Lina B, Floret D. Mixed respiratory virus infection. J Clin Virol 2008;43:407-410.

4. Schuh S. Update on management of bronchiolitis. Curr Opin Pediatr 2011;23:110-114.

5. Tregoning JS, Schwarze J. Respiratory viral infections in infants: causes, clinical symptoms, virology, and immunology. Clin Microbiol Rev 2010;23:74-98.

6. Ray CG, Minnich LL, Holberg CJ, Shehab ZM, Wright AL, Barton LL, et al. Respiratory syncytial virus-associated lower respiratory tract illnesses: possible influence of other agents. Pediatr Infect Dis J 1993;12:15-19.

7. Drews AL, Atmar RL, Glezen WP, Baxter BD, Piedra PA, Greenberg SB. Dual respiratory virus infections. Clin Infect Dis 1997;25:1421-1429.

8. Greensill J, McNamara PS, Dove W, Flanagan B, Smyth RL, Hart CA. Human metapneumovirus in severe respiratory syncytial virus bronchiolitis. Emerg Infect Dis 2003;9:372-375.

9. Semple MG, Cowell A, Dove W, Greensill J, McNamara PS, Halfhide C, et al. Dual infection of infants by human metapneumovirus and human respiratory syncytial virus is strongly associated with severe bronchiolitis. J Infect Dis 2005;191:382-386.

10. Richard N, Komurian-Pradel F, Javouhey E, Perret M, Rajoharison A, Bagnaud A, et al. The impact of dual viral infection in infants admitted to a pediatric intensive care unit associated with severe bronchiolitis. Pediatr Infect Dis J 2008;27:213-217.

11. Foulogne V, Guyon G, Rodiere M, Segondy M. Human metapneumovirus infection in young children hospitalized with respiratory tract disease. Pediatr Infect Dis J 2006;25:4632-4635.

12. König B, König W, Arnold R, Werchau H, Ihorst G, Forster J. Prospective study of human metapneumovirus infection in children less than 3 years of age. J Clin Microbiol 2004;42:4632-4635.

13. Franz A, Adams O, Willems R, Bonzel L, Neuhausen N, Schweizer-Krantz S, et al. Correlation of viral load of respiratory pathogens and co-infections with disease severity in children hospitalized for lower respiratory tract infection. J Clin Virol 2010;48:239-245.

14. Aberle JH, Aberle SW, Pracher E, Huter HP, Kuni M, Popow-Kraupp T. Single versus dual respiratory virus infections in hospitalized infants: impact on clinical course of disease and interferon-gamma response. Pediatr Infect Dis J 2005;24:605-610.

15. Lazar I, Weibel C, Dziura J, Ferguson D, Landry ML, Kahn JS. Human metapneumovirus and severity of respiratory syncytial virus disease. Emerg Infect Dis 2004;10:1318-1320.

16. Van Woensel JB, Bos AP, Lutter R, Rossen JW, Schuurman R. Absence of human metapneumovirus co-infection in cases of severe respiratory syncytial virus infection. Pediatr Pulmonol 2006;41:872-874.

17. García-García ML, Calvo C, Pérez-Breña P, De Cea JM, Acosta B, Casas I. Prevalence and clinical characteristics of human metapneumovirus infections in hospitalized infants in Spain. Pediatr Pulmonol 2006;41:863-871.

18. Wolf DG, Greenberg D, Kalkstein D, Shemer-Avni Y, Givon-Lavi N, Saleh N. Comparison of human metapneumovirus, respiratory syncytial virus and influenza A virus lower respiratory tract infections in hospitalized young children. Pediatr Infect Dis J 2006;25:320-324.

19. Wilkesmann A, Schildgen O, Eis-Hubinger AM, Geikowski T, Glatzel T, Lentze MJ. Human metapneumovirus infections cause similar symptoms and clinical severity as respiratory syncytial virus infections. Eur J Pediatr 2006;25:320-324.


20. Brand HK, de Groot R, Galama JM, et al. Infection with multiple viruses is not associated with increased disease severity in children with bronchiolitis. Pediatr Pulmonol. 2012 ;47:393-400.

21. Martin ET, Kuypers J, Wald A, Englund JA. Multiple versus single virus respiratory infections: viral load and clinical disease severity in hospitalized children. Influenza Other Respi Viruses 2012;6,71-77.

22. Krasinski K, LaCouture R, Holzman RS, Waithe E, Bonk S, Hanna B. Screening for respiratory syncytial virus and assignment to a cohort at admission to reduce nosocomial transmission. J Pediatr 1990;116,894-898.

23. Madge P, Paton JY, McColl JH, Mackie PLK. Prospective controlled study of four infection-control procedures to prevent nosocomial infection with respiratory syncytial virus. Lancet 1992;340,1079-1083.

24. Hall CB. Nosocomial respiratory syncytial virus infections: the “Cold War” has not ended. Clin Infect Dis 2000;31,590-596.

25. Hall C, Douglas RJ. Modes of transmission of respiratory syncytial virus. J Pediatr 1981;99,100-103.

26. Kristjansson S, Lodrup Carlsen KC, Wennergren G, Strannegard IL, Carlsen KH. Nebulised racemic adrenaline in the treatment of acute bronchiolitis in infants and toddlers. Arch Dis Child 1993;69,650-654.

27. Lessler J, Reich NG, Brookmeyer R, Perl TM, Nelson KE, Cummings DA. Incubation periods of acute respiratory viral infections: a systematic review. Lancet Infect Dis 2009;9,291-300.

28. Reijans M, Dingemans G, Klaassen CH, Meis JF, Keijdener J, Mulders B, et al. RespiFinder: a new multiparameter test to differentially identify fifteen respiratory viruses. J Clin Microbiol 2008;46,1232-1240.

29. Doan Q, Enarson P, Kissoon N, Klassen TP, Johnson DW. Rapid viral diagnosis for acute febrile respiratory illness in children in the Emergency Department. Cochrane Database of Syst Rev 2012;5,CD006452.


Roomsharing in hospitalised children with bronchiolitis

Jolita Bekhof, Mirjam Wessels, Roelien Reimink, Veerle Langenhorst, Lesla Bruijnesteijn, Paul Brand, Gijs Ruijs

Work in progress


Chapter 5: Viral tests for cohort isolation in children hospitalised for bronchiolitis

Abstract

Background: In infants hospitalized for bronchiolitis, cohorting of RSV-positive patients separately from RSV-negative patients is a commonly applied procedure to prevent nosocomially acquired cross-infections. We previously described that cross-infections did not occur when patients shared a room on the first day of admission, irrespective of the causative viral agent.

Objective: To determine the incidence (and clinical severity) of cross-infections in patients admitted with bronchiolitis, when patients with different causative viral agents share a room during the entire admission course.

Methods: A prospective cohort of all infants < 2 years old, hospitalized with bronchiolitis during the winter season 2012-2013. Patients shared a 2 to 4-bed hospital room during the entire admission, irrespective of virological diagnosis. Standard contact hygienic measures were applied in all patients (gowns, gloves and hand-washing).

Results: We included 65 patients in the period December 2012 to March 2013 (94% RSV-positive), 56 of whom (86%) shared a room with another bronchiolitis patient, including 18 (28%) who shared a room with a patient infected with a different virus. On admission, 10 patients (15%) were co-infected with another virus. During admission, 8 patients (12%) were cross-infected with a different virus. Only one of these cross-infected patients acquired a virus that had been identified in one of their roommates. We found no significant differences in disease severity between patients infected with single or multiple viruses, nor between RSV-positive and RSV-negative patients.

Conclusion: Co-infection amongst patients with bronchiolitis is common. Roomsharing does not seem to play a meaningful role in the transmission of viruses between patients with bronchiolitis sharing one room.


Introduction

Acute bronchiolitis is an important cause of hospitalization in young children, especially during the winter season.1,2 Human respiratory syncytial virus (RSV) is the most frequently identified virus, detected in 70-85% of hospitalized infants during the winter epidemic, followed by rhinovirus (RV), human metapneumovirus (hMPV), adenovirus (AdV), influenza virus (IV), and parainfluenza virus (PIV).1-5 Traditionally, RSV was considered to be associated with more severe disease and need for hospitalization.5 Since the availability of new molecular amplification methods, the role of other viral pathogens, including co-infections, in bronchiolitis has been given more attention. Studies using multiplex polymerase chain reaction (PCR) have shown varying degrees of bronchiolitis severity for a range of respiratory viruses.3,5 In these studies, co-infection was relatively common, occurring in about 10-30% of hospitalized patients.3,5-7 There is no consensus, however, on the impact of such co-infection on disease severity:5 some studies showed more severe disease in co-infected children,8-14 while others did not.15-21 Since diagnosis of co-infection does not lead to specific therapy and it is unclear how it predicts severity or length of disease,5 there is debate about whether RSV-testing of infants with bronchiolitis changes clinical management or outcome.1 Still, many hospitals perform routine viral testing to identify and isolate RSV-infected infants, with the aim of reducing the risk of nosocomial cross-infection of other patients.22 It has been shown that the combination of contact isolation measures (gowns, gloves and hand washing) and cohorting of RSV-infected patients lowers the risk of nosocomial cross-infection of other patients.23,24 However, no good evidence is available on how effective this approach is in preventing nosocomial cross-infections amongst patients with bronchiolitis. Such a policy would be justified if it could prevent cross-infection and if co-infection leads to more severe disease than single infection.1 Since it is unclear which isolation measures are most effective, it is advised that the choice of infection control measures is decided by individual institutions depending on the patients, the type of ward, and the benefit relative to cost.24

We hypothesized that contact isolation measures and maintaining enough distance between the beds in a shared room should be sufficient to prevent cross-infection, since the major route of transmission of respiratory viruses is by close contact with infected secretions and not by small-particle aerosol.24,25 Earlier we showed that room sharing between RSV-positive and RSV-negative patients on the first day of admission did not influence the risk of co-infection.26 The purpose of the present study was to determine the incidence of cross-infection and the severity of single versus co-infection in children hospitalized for bronchiolitis, when patients with bronchiolitis share the same room irrespective of the causative virus, during the entire course of admission.


Materials and methods

Patients

The study was conducted at the pediatric ward (30 beds) of Isala, a general teaching hospital in Zwolle, The Netherlands. We plan to continue the study during 3 to 4 winter seasons to allow inclusion of a sufficiently large number of patients. The results presented here are the preliminary results of the first season. From December 2012 through March 2013, all eligible infants younger than 2 years of age hospitalised for acute bronchiolitis were prospectively enrolled. Bronchiolitis was defined as acute respiratory disease, accompanied by coryza, cough, and inspiratory crackles and/or expiratory wheezing on auscultation. Infants with chronic lung disease, congenital heart disease and Down’s syndrome were excluded.

The study was approved by the medical ethical committee. Written informed consent from the parents was obtained before inclusion.

Clinical data collection

We prospectively collected the following demographic and clinical information: age, medical history, presence and number of roommates, virological diagnosis of the patient and their roommates, oxygen supplementation, apnoea, tube feeding, intensive care admission, and daily dyspnoea score assessed by an independent researcher.27 This person was not aware of the virological diagnosis.

Virological diagnosis

A nasopharyngeal aspirate was collected for virological diagnosis immediately at admission, every fourth day during admission, at discharge, and 4-7 days after discharge, accounting for an incubation period of 4-7 days in most respiratory viral pathogens.28 The nasopharyngeal aspirate was collected by placing a catheter through one of the nostrils into the nasopharynx. Then 1-2 ml of saline was instilled and aspirated. The specimen was stored with saline, transported to the laboratory and frozen at -85 °C for later analysis. All specimens were analyzed batchwise with the RespiFinder TwoStep kit (Pathofinder, Maastricht, The Netherlands) for the purpose of this study.29 Results of this multiplex PCR were not made available to the attending physicians, but were only collected for research purposes. Neither the parents of the patients nor the nurses and doctors were aware of the virological diagnosis.

Treatment protocol and hygienic measures

All patients were managed according to our local bronchiolitis practice guideline. Infants younger than 3 months of age were monitored for at least 3 days for apnoea and oxygen saturation; oxygen was supplemented when saturation fell below 92%.


Patients were discharged when respiratory support was no longer needed and their oral intake was at least two-thirds of their normal intake.

All patients with bronchiolitis were treated with standard contact hygienic measures. Medical and nursing personnel wore gowns and gloves during patient contact and washed their hands before and after each patient contact. Parents and visitors were asked to wash their hands before leaving the room. Patients were nursed together in a 2-bed or 4-bed room, with beds separated by at least 1.5 metres.

Statistical analysis

The chi-square test was used to compare categorical data and the Mann-Whitney U test to compare continuous data, because of skewed distributions. Statistical analyses were performed using Statistical Package for the Social Sciences (SPSS) version 20.
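To make the rank-based comparison concrete, the following is a minimal sketch of a two-sided Mann-Whitney U test in plain Python. It uses the normal approximation with a tie correction, which is an assumption on our part: the study itself used SPSS, and the exact variant SPSS applied is not stated in the text. The function name `mann_whitney_u` and the example values are hypothetical and are not data from this study.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the observations, remembering which sample each one came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all observations identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - mean_u) / sqrt(var_u)
    return u_x, erfc(abs(z) / sqrt(2))


# Hypothetical lengths of stay in days, NOT data from this study.
co_infected = [2, 3, 4, 4, 6, 9]
mono_infected = [3, 4, 4, 5, 5, 7, 8]
u, p = mann_whitney_u(co_infected, mono_infected)
print(f"U = {u}, p = {p:.3f}")
```

With groups as small as those in a single-season cohort, an exact test may be preferable to the normal approximation used in this sketch.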

This study is registered at clinicaltrials.gov (NCT01441466).

Results

Of the 94 patients with bronchiolitis hospitalized during the study period, two were excluded because of relevant background disease (cardiac disease), 17 because parents declined consent, and four because they were older than two years of age. Four patients were missed, in one patient no nasal wash specimen was obtained immediately at admission, and in one patient data were not collected. A total of 65 patients completed the study. Demographic and clinical data are given in Table 1.

Table 1 Patient characteristics (n=65)

| Characteristic | Value |
|---|---|
| Age, months | 3.5 (1.0-6.3) |
| Male | 43 (66%) |
| Birth characteristics | |
| Gestational age, weeks | 38 (37-40) |
| Preterm birth (<37 weeks) | 7 (11%) |
| Birth weight, gram | 3250 (2770-3770) |
| Environmental factors | |
| Day care attendance (missing data in 6) | 48 (81%) |
| Siblings (missing data in 5) | 51 (85%) |
| Disease severity | |
| Length of hospitalization, days | 4 (3-6) |
| Oxygen supplementation | 53 (82%) |
| Tube feeding | 43 (66%) |
| Highest dyspnoea score (0-10) | 4 (3-5.5) |
| Mechanical ventilation | 2 (3%) |

Data are presented as median and interquartile range in parentheses, or number and percentage in parentheses. Highest possible dyspnoea score 10.

RSV was the major pathogen, detected in 61 patients (94%) on admission. No virus was detected at admission in two patients; influenza virus was found in one, and human metapneumovirus in combination with parainfluenza virus in another. Co-infection on admission was found in 10 patients (15%): one patient was infected with metapneumovirus in combination with parainfluenza virus, and nine were infected with RSV in combination with adenovirus (n=1), rhinovirus (n=3), or coronavirus (n=4). One patient had three viruses on admission (RSV, adenovirus and parainfluenza virus).

Cross-infection, defined as co-infection acquired during admission, was found in 8 patients (12%), all of whom were initially infected with RSV and co-infected with rhinovirus (n=5), enterovirus (n=5), and influenza (n=1).

Of all included patients, 56 (86%) had shared a room with other bronchiolitis patients, 18 of whom had shared a room with a patient infected with a different virus. Of the eight patients who acquired co-infection during admission, two had never shared a room with another patient, four had shared a room with patients with the same virus, and two had shared a room with patients with another virus. Only one of the latter cross-infected patients was cross-infected with a virus identified in one of the roommates. The other seven cross-infected patients had never shared a room with a patient in whom the virus they were cross-infected with had been identified.

Co-infected patients did not suffer from more severe disease than patients infected with a single virus, and disease severity was equal in RSV-infected patients compared to RSV-negative patients (Table 2).

Table 2 Comparison of disease severity between mono- versus co-infected patients and RSV-infected versus RSV-uninfected patients (n=65)

| | Co-infection (n=18) | Mono-infection (n=47) | p-value | RSV-infected (n=60) | RSV-uninfected (n=5) | p-value |
|---|---|---|---|---|---|---|
| Age, months | 3 (1-6) | 5 (2-7) | 0.168 | 3 (1-6) | 6 (4-13) | 0.083 |
| Length of hospitalization, days | 4 (1-12) | 4 (3-6) | 0.750 | 4 (3-6) | 3 (1.5-7.5) | 0.450 |
| Oxygen supplementation | 13 (72%) | 40 (85%) | 0.231 | 50 (83%) | 3 (60%) | 0.196 |
| Days of oxygen supplementation | 2.5 (0-4) | 3 (1-4) | 0.667 | 3 (1-4) | 2 (0-6) | 0.283 |
| Tube feeding | 13 (72%) | 30 (64%) | 0.522 | 40 (67%) | 3 (60%) | 0.762 |
| Days of tube feeding | 3.5 (0-5) | 2 (0-13) | 0.317 | 2 (0-4) | 2 (0-6) | 0.990 |
| Highest dyspnoea score (0-10) | 4 (4-6) | 4 (4-5) | 0.633 | 4.5 (3-6) | 4 (2-4.5) | 0.194 |

Data are presented as median and interquartile range in parentheses, or number and percentage in parentheses, as appropriate. P-value: Mann-Whitney U test for continuous variables, χ² test for dichotomous variables.
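As a worked illustration of the χ² comparison for dichotomous variables, the sketch below recomputes the oxygen supplementation comparison between co-infected and mono-infected patients from the counts in Table 2 (13 of 18 versus 40 of 47). It assumes a Pearson χ² test without continuity correction, which is our assumption; under it, the result comes out close to the reported p-value of 0.231. The function name `chi_square_2x2` is illustrative only.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    observed = [[a, b], [c, d]]
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Oxygen supplementation (Table 2): 13 of 18 co-infected, 40 of 47 mono-infected.
stat, p = chi_square_2x2(13, 18 - 13, 40, 47 - 40)
print(f"chi2 = {stat:.2f}, p = {p:.3f}")
```

One expected cell count in this table is below 5, so an exact test (for example Fisher's exact test) could reasonably be preferred; the sketch only shows how the χ² figure can be reproduced from the published counts.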


Discussion

This study showed that co-infection, including nosocomially acquired cross-infection, is frequently seen in infants admitted for bronchiolitis. However, cross-infection between patients sharing a room is rare, occurring in only one patient in this study population. Furthermore, co-infection was not associated with more severe disease. Our findings therefore suggest that roomsharing of patients with bronchiolitis, irrespective of the virological diagnosis, is safe.

We want to emphasize that these are preliminary results, and firm conclusions are limited by the relatively small number of patients. We will continue this study for another three to four winter seasons, to obtain a sufficiently large number of patients, and to account for the variability in viral prevalence and virulence between different years.
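The effect of the small sample on precision can be illustrated with a confidence interval around an observed proportion. The sketch below computes a 95% Wilson score interval for the one cross-infection out of 65 patients that could be linked to a roommate. This calculation is not an analysis reported by the authors; the choice of the Wilson method and the function name `wilson_interval` are our own assumptions, used only to show how wide the uncertainty is at this sample size.

```python
from math import sqrt


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    p = events / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# One roommate-linked cross-infection among 65 included patients.
low, high = wilson_interval(1, 65)
print(f"observed {1 / 65:.1%}, 95% interval {low:.1%} to {high:.1%}")
```

The upper bound of such an interval is several times the observed proportion, which is consistent with the authors' caution that larger samples are needed.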

We did not address sources of cross-infection other than the roommates. In future studies we will investigate viral carriership in medical personnel and visitors of admitted patients with bronchiolitis to further elaborate on this issue.

We cautiously conclude that roomsharing of patients with bronchiolitis is not an important cause of nosocomially acquired cross-infection. These findings suggest that roomsharing of infants with bronchiolitis, irrespective of virological diagnosis, is a safe procedure, provided that standard hygiene measures are guaranteed. This conclusion implies that routinely performing virological tests for the purpose of cohorting may not be necessary. Larger study samples are warranted before firm conclusions can be drawn.

References

1. Smyth RL, Openshaw PJM. Bronchiolitis. Lancet 2006;368:312-322.

2. Bush A, Thomson AH. Acute bronchiolitis. BMJ 2007;335:1037-1041.

3. Paranhos-Baccalà G, Komurian-Pradel F, Richard N, et al. Mixed respiratory virus infection. J Clin Virol 2008;43:407-410.

4. Schuh S. Update on management of bronchiolitis. Curr Opin Pediatr 2011;23:110-114.

5. Tregoning JS, Schwarze J. Respiratory viral infections in infants: causes, clinical symptoms, virology, and immunology. Clin Microbiol Rev 2010;23:74-98.

6. Ray CG, Minnich LL, Holberg CJ, et al. Respiratory syncytial virus-associated lower respiratory tract illnesses: possible influence of other agents. Pediatr Infect Dis J 1993;12:15-19.

7. Drews AL, Atmar RL, Glezen WP, et al. Dual respiratory virus infections. Clin Infect Dis 1997;25:1421-1429.

8. Greensill J, McNamara PS, Dove W, et al. Human metapneumovirus in severe respiratory syncytial virus bronchiolitis. Emerg Infect Dis 2003;9:372-375.

9. Semple MG, Cowell A, Dove W, et al. Dual infection of infants by human metapneumovirus and human respiratory syncytial virus is strongly associated with severe bronchiolitis. J Infect Dis 2005;191:382-386.


10. Richard N, Komurian-Pradel F, Javouhey E, et al. The impact of dual viral infection in infants admitted to a pediatric intensive care unit associated with severe bronchiolitis. Pediatr Infect Dis J 2008;27:213-217.

11. Foulongne V, Guyon G, Rodiere M, et al. Human metapneumovirus infection in young children hospitalized with respiratory tract disease. Pediatr Infect Dis J 2006;25:4632-4635.

12. Konig B, Konig W, Arnold R, et al. Prospective study of human metapneumovirus infection in children less than 3 years of age. J Clin Microbiol 2004;42:4632-4635.

13. Franz A, Adams O, Willems R, et al. Correlation of viral load of respiratory pathogens and co-infections with disease severity in children hospitalized for lower respiratory tract infection. J Clin Virol 2010;48(4):239-245.

14. Aberle JH, Aberle SW, Pracher E, et al. Single versus dual respiratory virus infections in hospitalized infants: impact on clinical course of disease and interferon-gamma response. Pediatr Infect Dis J 2005;24:605-610.

15. Lazar I, Weibel C, Dziura J, et al. Human metapneumovirus and severity of respiratory syncytial virus disease. Emerg Infect Dis 2004;10:1318-1320.

16. Van Woensel JB, Bos AP, Lutter R, et al. Absence of human metapneumovirus co-infection in cases of severe respiratory syncytial virus infection. Pediatr Pulmonol 2006;41:872-874.

17. García-García ML, Calvo C, Pérez-Breña P, et al. Prevalence and clinical characteristics of human metapneumovirus infections in hospitalized infants in Spain. Pediatr Pulmonol 2006;41:863-871.

18. Wolf DG, Greenberg D, Kalkstein D, et al. Comparison of human metapneumovirus, respiratory syncytial virus and influenza A virus lower respiratory tract infections in hospitalized young children. Pediatr Infect Dis J 2006;25:320-324.

19. Wilkesmann A, Schildgen O, Eis-Hubinger AM, et al. Human metapneumovirus infections cause similar symptoms and clinical severity as respiratory syncytial virus infections. Eur J Pediatr 2006;25:320-324.

20. Brand HK, de Groot R, Galama JM, et al. Infection with multiple viruses is not associated with increased disease severity in children with bronchiolitis. Pediatr Pulmonol 2012;47:393-400.

21. Martin ET, Kuypers J, Wald A, et al. Multiple versus single virus respiratory infections: viral load and clinical disease severity in hospitalized children. Influenza Other Respi Viruses 2012;6:71-77.

22. Krasinski K, LaCouture R, Holzman RS, et al. Screening for respiratory syncytial virus and assignment to a cohort at admission to reduce nosocomial transmission. J Pediatr 1990;116:894-898.

23. Madge P, Paton JY, McColl JH, et al. Prospective controlled study of four infection-control procedures to prevent nosocomial infection with respiratory syncytial virus. Lancet 1992;340:1079-1083.

24. Hall CB. Nosocomial respiratory syncytial virus infections: the “Cold War” has not ended. Clin Infect Dis 2000;31:590-596.

25. Hall C, Douglas RJ. Modes of transmission of respiratory syncytial virus. J Pediatr 1981;99:100-103.

26. Bekhof J, Bakker J, Reimink R, Wessels M, Langenhorst V, Brand PLP, Ruijs JHM. Co-infections in children hospitalised for bronchiolitis: role of roomsharing. J Clin Med Res 2013, In press. doi:10.4021/jocmr1556w

27. Kristjansson S, Lodrup Carlsen KC, Wennergren G, et al. Nebulised racemic adrenaline in the treatment of acute bronchiolitis in infants and toddlers. Arch Dis Child 1993;69:650-654.

28. Lessler J, Reich NG, Brookmeyer R, et al. Incubation periods of acute respiratory viral infections: a systematic review. Lancet Infect Dis 2009;9:291-300.

29. Reijans M, Dingemans G, Klaassen CH, et al. RespiFinder: a new multiparameter test to differentially identify fifteen respiratory viruses. J Clin Microbiol 2008;46:1232-1240.


General discussion and future perspectives


In this thesis we evaluated various diagnostic measurements in paediatrics – across a wide age range, from neonates to older children – aimed at clarifying (demystifying) these commonly used procedures. We translated our aim into the following research questions:

1. What is the validity and interobserver reliability of the semi-quantitative measurement of glucosuria using reagent strips? Is this method reliable and feasible under the specific setting of a neonatal intensive care unit (NICU)?

2. What is the diagnostic value of various clinical signs, including glucosuria, in identifying late-onset neonatal sepsis (LONS) in prematurely born neonates?

3. What is the reliability and usefulness of fluid balance charts kept for moderately ill neonates admitted to a high-care neonatal unit?

4. What is the validity, reliability and utility of clinical findings including current composite paediatric dyspnoea scores, when assessing severity of dyspnoea in wheezing children?

5. What is the benefit of identifying and isolating RSV-infected infants hospitalised for bronchiolitis, for the purpose of reducing the risk of nosocomial cross-infection among patients with bronchiolitis? In other words: what is the incidence and clinical severity of cross-infections in patients admitted with bronchiolitis, when patients with different causative viral agents share a room during their stay in hospital?

In general we conclude that the adage “meten is weten” (the knowledge is in the numbers) still holds true, provided that one knows what is being measured, how the measurement is performed, by whom, and for what reason. The main findings, strengths, limitations, and clinical and research implications of each of the five research topics are discussed below, followed by some final, summarising reflections on the role played by evidence-based medicine (EBM) when investigating this theme.

Chapter 1: Semi-quantitative measurement of glucosuria in neonates

In this chapter, we present a study we performed to investigate the validity, reliability and utility of semi-quantitative measurement of glucosuria, which can be used to monitor glucose homeostasis, a parameter frequently disturbed in prematurely born neonates. We conducted an observational study, in which we used 300 experimentally derived urine samples tested under the specific circumstances of a neonatal intensive care unit – assessed independently by three different nurses – and compared the results of the reagent strips with the urinary glucose concentrations measured quantitatively in the laboratory.

Main findings

The validity of the semi-quantitative measurement of glucosuria using visually read reagent strips was moderate, with more than one-fifth of reagent strip readings incorrectly predicting the laboratory measurement. Measurement using the reagent strips gave the urine samples a score in one of five categories (0 to 4+). Under- or overestimation of the degree of glucosuria occurred predominantly in the categories 1+ and 2+. The discriminative power between the lower categories of glucosuria (1+ and 2+) was low. When the categories 1+ and 2+ were taken together as a single category, the false ratings were halved (from 21.7% to 10.6%). In category 0, only 5.1% of readings were incorrect, i.e. the validity of the reagent strips for ruling out glucosuria is fairly good.
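The effect of pooling categories 1+ and 2+ on the proportion of incorrect readings can be sketched as follows. The readings below are hypothetical and chosen only to show the mechanism; they are not the study data, and the function name `error_rate` is illustrative.

```python
def error_rate(strip, lab, merge=None):
    """Share of strip readings that disagree with the laboratory category.

    `merge` optionally maps categories onto a pooled category, e.g. {2: 1}
    treats 2+ and 1+ as one category before comparing.
    """
    merge = merge or {}
    pairs = [(merge.get(s, s), merge.get(l, l)) for s, l in zip(strip, lab)]
    return sum(s != l for s, l in pairs) / len(pairs)


# Hypothetical categories (0 to 4, standing for 0 to 4+), NOT data from this study.
strip_reading = [0, 1, 2, 2, 1, 3, 4, 2, 1, 0]
lab_category = [0, 2, 1, 2, 1, 4, 4, 2, 2, 0]

raw = error_rate(strip_reading, lab_category)
pooled = error_rate(strip_reading, lab_category, merge={2: 1})
print(f"incorrect readings: {raw:.0%} with five categories, {pooled:.0%} with 1+ and 2+ pooled")
```

Pooling removes only those errors that are confusions between 1+ and 2+; misclassifications involving other categories remain, which is why the error rate in the study was halved rather than eliminated.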

The interobserver agreement for glucosuria assessment using reagent strips was high (Kappa 0.81). The maximum difference between observers was one category, which seems acceptable for clinical use. The degree of glucosuria as measured by reagent strips is not influenced by the time spent by the urine in the diaper, and only slightly by incubator temperature and urine collection method.
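For readers less familiar with the Kappa statistic, the sketch below computes Cohen's kappa for two observers, that is, the observed agreement corrected for the agreement expected by chance. The text does not state which kappa variant was used for the three nurses (a weighted or multi-rater kappa is possible), so the two-rater unweighted form shown here is an assumption. The ratings are hypothetical and the function name `cohen_kappa` is illustrative.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same samples."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / n ** 2
    if p_expected == 1:  # both raters used a single identical category throughout
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical strip categories (0 to 4) from two nurses, NOT data from this study.
nurse_a = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
nurse_b = [0, 0, 1, 2, 2, 2, 3, 3, 4, 3]
kappa = cohen_kappa(nurse_a, nurse_b)
print(f"kappa = {kappa:.2f}")
```

Because chance agreement depends on how the samples are spread over the categories, kappa is only informative when the full range of categories is represented, which is the point made about sample coverage in the strengths of this study.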

Strengths and weaknesses

The main strength of our study is the use of a large number of samples that cover the complete range of glucose concentrations, which is important for obtaining reliable Kappa values.1 Interobserver variation was thoroughly investigated by adequate blinding of the observers. This is the first study to examine the reliability of glucosuria reagent strips when used in the specific setting of a neonatal intensive care unit.

The main limitation lies in the generalisability of the results. While our study was performed under controlled study conditions (standardised light and temperature, single investigator), in real-life clinical practice the reagent strip is read under variable circumstances (differences in ambient light and temperature) and by both experienced and inexperienced staff members. Future studies using urine samples from patients in real-life clinical practice with a wide spectrum of glucosuria are needed to elucidate this matter.


Clinical and research implications

• Glucosuria can be assessed reliably and simply in a semi-quantitative way using visually read reagent strips in a NICU setting, provided that categories 1+ and 2+ are taken together as a single category.

• Changes in glucosuria of more than one category are likely to be real (change exceeds measurement error).

Future studies should assess interobserver reliability under real life circumstances.

Chapter 2: Early diagnosis of late-onset sepsis in preterm infants

In this second chapter, we explored the predictive value of various clinical symptoms, including glucosuria, as markers for late-onset neonatal sepsis (LONS) in preterm infants. We did this by performing an observational study in 350 premature infants during a two-year period (2005-2007). We prospectively collected daily measurements on glucosuria in all infants – blinded to treating physicians – and followed these children for the occurrence of sepsis episodes, thus creating a case-control study within this prospective cohort.

Main findings

Most of the individual clinical symptoms had only moderate predictive value for identifying LONS in preterm infants suspected of infection. The clinical signs that had the highest predictive value for LONS were as follows: increased respiratory support, capillary refill time > 2 seconds, pallor or grey skin colour, and a central venous catheter in the 24 hours preceding the onset of suspected infection. When we represented these clinical signs graphically in a nomogram, their predictive value for identifying LONS improved. This nomogram allows users to calculate the expected risk of LONS in individual patients with suspected infection. Other clinical signs that were too non-specific to be useful in excluding or confirming LONS included glucosuria, temperature instability, apnoea, tachypnoea, tachycardia, dyspnoea, hyper- and hypothermia, feeding difficulties and irritability. Although glucosuria was associated with LONS, the association was too weak to be of diagnostic value, probably because glucosuria was also strongly associated with gestational age, birth weight and postnatal age, which are well-known risk factors for LONS.

Strengths and weaknesses

The main strength of this study was the prospective and blind method of data collection and the fact that we evaluated both clinical and blood-culture-proven sepsis. Clinically confirmed but culture-negative sepsis is frequently encountered in clinical practice. This was the first study to examine the association between glucosuria and LONS.

A major limitation was incorporation bias: some clinical signs were used both in the definition of suspected sepsis and in the reference standard for clinically confirmed sepsis. Incorporation bias increases the risk of misclassification and overestimates test accuracy.2

Clinical and research implications

• Glucosuria, temperature instability, apnoea, tachycardia, dyspnoea, feeding difficulties or irritability are too non-specific to be of diagnostic value when identifying LONS.

• Clinical signs that increase the likelihood of LONS in preterm infants include the following, particularly when represented together in a nomogram:

o increased respiratory support,

o capillary refill > 2 seconds,

o pallor/grey skin,

o a central venous catheter in the 24 hours preceding the episode of suspected infection.

• This nomogram may help to decide on duration of antibiotic therapy, for example in situations where no blood culture is available. Moreover, it may be possible to postpone the start of antibiotics in cases of low risk, under close monitoring of clinical symptoms.
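A nomogram is typically a graphical rendering of a regression model. The sketch below shows how such a model could turn the four signs listed above into a predicted risk, assuming a logistic model. The weights and intercept are placeholders invented purely for illustration; they are not the coefficients of the nomogram developed in this study, and the names `HYPOTHETICAL_WEIGHTS`, `HYPOTHETICAL_INTERCEPT` and `predicted_risk` are assumptions as well.

```python
import math

# PLACEHOLDER log-odds weights: invented for illustration only.
# They are NOT the coefficients behind the nomogram described in the thesis.
HYPOTHETICAL_WEIGHTS = {
    "increased_respiratory_support": 1.0,
    "capillary_refill_over_2s": 1.0,
    "pallor_or_grey_skin": 1.0,
    "central_venous_catheter_24h": 1.0,
}
HYPOTHETICAL_INTERCEPT = -2.0  # placeholder baseline log-odds


def predicted_risk(signs):
    """Return a risk between 0 and 1 from a dict of sign name -> bool."""
    linear = HYPOTHETICAL_INTERCEPT + sum(
        weight
        for name, weight in HYPOTHETICAL_WEIGHTS.items()
        if signs.get(name, False)
    )
    # Logistic function maps log-odds to a probability.
    return 1.0 / (1.0 + math.exp(-linear))


risk_none = predicted_risk({})
risk_two = predicted_risk(
    {"increased_respiratory_support": True, "pallor_or_grey_skin": True}
)
risk_all = predicted_risk({name: True for name in HYPOTHETICAL_WEIGHTS})
```

With these placeholder values the predicted risk rises with each additional sign present, which is the behaviour a nomogram of this kind lets a clinician read off graphically.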

For the future: Clearly, this model needs external validation in new patient groups. Further research to evaluate the reliability and interobserver and intraobserver variation for the more subjective symptoms is also warranted. It might also be useful to further analyse our data to evaluate the usefulness of daily monitoring of glucosuria in the early detection of LONS.

Chapter 3: Fluid balance charts in neonates

In this chapter we investigated the reliability and utility of the fluid balance in neonates. In 2009-2010 we performed a randomised controlled trial in 172 neonates admitted to our high-care neonatal unit. In one group, physicians were blinded to the fluid balance data, while in the other group fluid balance data were made available to attending physicians. We also investigated reliability in this study population by comparing the data on daily fluid balance with that on daily changes in body weight.

Main findings

The agreement between fluid balance data and daily weight changes in neonates was poor, with differences of >20% of daily fluid intake (our predefined threshold for clinical relevance) occurring in 40% of fluid balance charts. Fluid balance charts are imprecise, both over- and underestimating body weight changes in an unpredictable pattern, and are therefore unreliable measures of fluid status in sick neonates. In the randomised trial, keeping a fluid balance chart did not influence length of hospitalisation, degree of weight loss or degree of medical interventions by the attending physician.
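The comparison between fluid balance and weight change can be made concrete with a small sketch. It assumes that the fluid balance is intake minus output, that 1 mL of fluid corresponds to roughly 1 g of body weight, and that a chart is flagged when the absolute difference exceeds 20% of that day's intake. The exact calculation used in the study is not given in this section, so the formula, the function name `is_relevant_discrepancy` and all numbers in `charts` are hypothetical.

```python
def is_relevant_discrepancy(intake_ml, output_ml, weight_change_g, threshold=0.20):
    """Flag a chart whose fluid balance disagrees with the weight change.

    Assumes 1 mL of fluid weighs about 1 g (illustrative simplification).
    """
    fluid_balance_ml = intake_ml - output_ml
    discrepancy = abs(fluid_balance_ml - weight_change_g)
    return discrepancy > threshold * intake_ml


# Hypothetical daily charts: (intake in mL, output in mL, weight change in g).
charts = [
    (300, 250, 40),   # balance +50, weight +40: difference 10, limit 60
    (300, 200, 10),   # balance +100, weight +10: difference 90, limit 60
    (400, 380, -70),  # balance +20, weight -70: difference 90, limit 80
    (250, 240, 20),   # balance +10, weight +20: difference 10, limit 50
    (350, 300, 45),   # balance +50, weight +45: difference 5, limit 70
]

flags = [is_relevant_discrepancy(*chart) for chart in charts]
proportion_discrepant = sum(flags) / len(flags)
```

Note that the discrepancy can arise in either direction, which mirrors the finding that the charts both over- and underestimate weight change.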

Strengths and weaknesses

The main strengths of this study include the randomised controlled trial design, the meticulous attention paid to avoid revealing the fluid balance data to the physicians in the intervention group, and the close reflection of clinical practice, using measurements and techniques commonly used in neonatal wards.

The main limitation was the moderate disease severity in our patients. Keeping a fluid balance might be more useful in sicker newborns, for example in a neonatal intensive care setting.

Clinical and research implications

• Because of their imprecision, keeping fluid balance charts in sick neonates with moderate disease severity is not useful.

• This study is an example of how a time-consuming and error-prone tradition appears to be of little use in improving patient care.

Further studies are needed to investigate the usefulness of keeping a fluid balance in the setting of a neonatal intensive care unit.

Chapter 4: Clinical assessment of dyspnoea in children

In this chapter we investigated the clinical assessment of dyspnoeic children. We performed a systematic review that assessed the validity of all published composite dyspnoea scores. We also explored the intraobserver and interobserver variability of the individual clinical items used in existing composite dyspnoea scores by asking several professionals to assess video recordings of 27 dyspnoeic children.

Main findings

Of the 36 paediatric dyspnoea scores published in the literature, 14 were unsuitable for clinical use because of either insufficient face validity, use of items unsuitable for children, or difficult scoring systems. Although the other 22 scores were easy to use, information on reliability was sparse, and none of the scores had been sufficiently validated to allow for clinically meaningful use in children with acute dyspnoea or wheeze, in particular not when the purpose of using the score is evaluation of treatment effect.

We found intraobserver variation between the individual items of the composite scores to be modest. However, in two-thirds of observations the interobserver variation was too large to allow detection of any clinically relevant changes in dyspnoea in response to treatment.
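Agreement between two observers on a categorical item is often summarised with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below illustrates that statistic only; this section does not state which agreement measure was used in the study, and the ratings, category coding (0, 1, 2) and the function name `cohens_kappa` are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same subjects."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must score the same non-empty set of subjects")
    n = len(ratings_a)
    # Proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical scores for one item (e.g. a retraction grade 0-2) in ten recordings.
rater_a = [0, 1, 1, 2, 0, 1, 2, 2, 0, 1]
rater_b = [0, 1, 2, 2, 0, 0, 2, 1, 0, 1]
kappa = cohens_kappa(rater_a, rater_b)
```

In this invented example the raters agree on 7 of 10 recordings, but part of that agreement would be expected by chance, so kappa is noticeably lower than 0.7.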


Strengths and weaknesses

The limited methodological validation of paediatric dyspnoea scores does not necessarily exclude their suitability for use in clinical practice and research. However, the collection of additional data, thereby enabling quality scoring, will improve the acceptability and usefulness of existing dyspnoea scores. The main limitation of our study was the use of video recordings without auscultation, which may have limited the accuracy of the observations.

Clinical and research implications

• None of the numerous paediatric dyspnoea scores currently available have been sufficiently validated to allow clinically meaningful use in children with acute dyspnoea, in particular not for the purpose of evaluating effects of treatment.

• Intraobserver reliability of the clinical assessment of dyspnoea in children is fair.

• The large interobserver variation in the assessment of dyspnoeic children obscures detection of clinically important improvement, limiting its usefulness in clinical practice and research.

Future studies should not be aimed at developing new scores, but at validating existing dyspnoea scores more comprehensively, using live assessments and evaluating the added value of auscultation. Interventions to reduce observer variation (training and education) should be examined. Future studies should also make a direct comparison between the physician’s general impression of dyspnoea and the results of the composite scores.

Chapter 5: Viral tests for cohort isolation in children hospitalised for bronchiolitis

During the annual winter epidemic of bronchiolitis, most commonly caused by respiratory syncytial virus (RSV), rapid diagnostic RSV tests are frequently applied in order to separate patients infected with RSV from those infected with other viruses (cohort isolation). We investigated whether such cohort isolation was really necessary, thus questioning the use of expensive viral testing. As a proof of principle, during the winter season of 2011-2012 we first performed an observational cohort study in 48 patients who only shared rooms during the first day of admission, pending the results of the rapid RSV test. In the following season (2012-2013) we continued this study, including 65 patients who shared rooms during their entire stay in hospital.

Main findings

Although co-infection (including nosocomial cross-infection) was frequently seen in infants admitted for bronchiolitis, cross-infection between patients sharing a room was uncommon, and co-infection was not associated with more severe disease. Our findings therefore suggest that it might be safe for patients with bronchiolitis to share a room, irrespective of the causative virus.


Strengths and weaknesses

The major strength of the study is its originality: the effects of room sharing on cross-infection have not previously been reported. The main limitations include the fact that we did not address potential sources of cross-infection other than roommates, and the small sample size.

Clinical and research implications

• Room sharing among patients with bronchiolitis appears to be safe when appropriate hygiene measures are taken during contact with patients.

• Routine performance of virological tests for cohorting purposes may not be necessary.

Future studies should include larger study samples and should assess the source of co-infections (family, visitors or medical staff).

Summarising reflections: is the knowledge in the numbers? And how has evidence-based medicine shaped this thesis?

“There is no safety in numbers, or in anything else.”

James Thurber (New Yorker, 1939)

In this thesis, we evaluated five frequently used measurements in paediatrics, and found that the majority appeared to be imprecise and of limited clinical use. This calls into question the overall usefulness of measurements in paediatric clinical practice. Many measurements have a strong tradition in medicine in general, including paediatrics. From the perspective of medicine as a scientific discipline, measurements such as the ones evaluated in this thesis are commonly viewed as being “objective”, as opposed to the more “subjective” data obtained by history taking.3 Both healthcare professionals and patients (or their parents) tend to prefer such “objective” measurements, under the assumption that “the knowledge is in the numbers” (meten is weten).4,5,6,7 But the fact that such measures or procedures are commonly used, and have been used for ages, does not mean they are valid, precise or useful. In fact, it is quite striking to note that many of these measurements have been widely accepted as part of routine clinical practice, despite the fact that their validity, precision, and usefulness have never been examined.8,9

Based on the results of this thesis, we conclude that the adage “meten is weten” (the knowledge is in the numbers) only holds true if one knows what is being measured, how the measurement is performed, by whom, and for what reason. For example, in our study of fluid balance data we found that the measurement error in neonates is large: the fluid balance is simply not accurate when compared with the daily fluid intake and changes in body weight. Together with the results of the randomised controlled trial, which showed no effect of fluid balance keeping on patient health outcome, these findings led to our conclusion that routine recording of fluid balance is not useful in sick neonates. Nevertheless, if in other patient groups (e.g. adults) the measurement error for fluid balance charts is relatively small when compared with the total fluid intake, this same measurement could still be sufficiently reliable to be of use in other specific situations or patient groups.

Why should we investigate time-honoured routine measurements?

With many innovations in the field of medical therapies and diagnostic technologies, it seems somewhat anachronistic to investigate time-honoured, routine measurement procedures. One therefore might question our reasons for investigating these topics in the first place and how we arrived at the research questions explored in this thesis. One way of answering this question is to turn it around: why would we thoroughly test new therapies and diagnostic technologies if we already accept and use many older therapies and diagnostic procedures that have not been sufficiently investigated for validity, reliability and utility? Instead of investing limited research resources on developing new technologies and measurements, is it not important to investigate the utility and validity of procedures already in common use, so that we can avoid using measurements and techniques that are insufficiently valid, precise, or useful?

An important stimulus for us to investigate such commonly used measurements in paediatrics was the fact that, at the time, we were implementing evidence-based practice (EBP) in our paediatric department. This approach to clinical practice stimulated a self-critical attitude, leading us to question not only new tests, but also our day-to-day routines. The implementation of EBP is not without its hurdles, however. Firstly, for most doctors and medical departments, a paradigm shift is required before we can easily recognise and admit the many gaps in our knowledge, analogous to the way in which we deal with medical errors.10 Secondly, doctors must above all be self-critical and recognise for themselves any questions that they might have. And finally, if doctors are to admit publicly that they do not know something, or they are to question department routine, then the environment must also be sufficiently safe. We feel that having a self-critical and curious attitude is essential for evidence-based practice, and that it is this attitude that has stimulated us to question not only new medical developments, but also existing ones.

Implementing evidence-based practice

As mentioned above, the basic strategy of critically evaluating one’s own clinical habits, traditions and routine clinical procedures is a key component of the application of EBM in practice. In 2005, the paediatric department decided to further develop its clinical practice into an approach that included EBP. The reason for this decision was that we felt that critically evaluating our own practices and habits would help to reduce unnecessary variations in diagnostic and therapeutic strategies between paediatricians in commonly encountered clinical problems in our department. We also thought that, as a result, this would help to improve health outcomes for patients, improve patient safety, and reduce health expenditure in our department.11,12,13

To achieve and implement evidence-based practice, we developed a project plan covering 5 years. The project plan consisted of three parts: training, implementation of EBM activities, and development and maintenance of clinical practice guidelines. For the training programme, one paediatrician from our department (JB) completed a core course in clinical epidemiology (EMGO Institute, VU University, Amsterdam); a further two paediatricians participated in several international EBM courses, some of them teaching courses (Centre for Evidence-based Medicine, University of Oxford; UCL Institute of Child Health, London; and McMaster University, Hamilton). We subsequently developed a two-day, interactive, basic workshop on EBM skills, specifically targeted at Dutch hospital-based medical practice, and a trial version of the workshop was held for the entire consultant staff of our paediatric department. This enabled all consultants and nurse practitioners to become competent in basic EBM skills. At the same time, these skills were being introduced into the weekly routine of our department (the second part of our project plan), a process involving various activities during which the five steps of EBM were practised on a weekly basis. An overview of the weekly schedule is provided in Table 1.

The third and final part of our project comprised the development of clinical practice guidelines for approximately 100 topics, including the 20 most commonly encountered clinical problems, both on our paediatric ward and at our paediatric outpatient clinic. For this purpose, all paediatricians systematically searched the literature on these clinical problems in their specific area of expertise, including published evidence-based guidelines, and either summarised and adapted the literature or retrieved existing guidelines, both for local purposes. This generated a library of dozens of local evidence-based guidelines which were sufficiently concise and specifically adapted to the local situation. This library was used to encourage optimal application in our department. If consultants or junior doctors using the developed guidelines encountered problems in practice when applying them to patients they had seen, the guidelines were reviewed and discussed within the team during the weekly “guideline discussion” meeting.

All of the issues critically reviewed and analysed in this thesis originated from discussions within our paediatric team during one of the EBM activities (Table 1).

Paradigm shift is necessary for successful implementation of EBP

During the first phase of this project, we discovered that several clinicians in our department were struggling with the newly introduced custom of critically questioning both one’s own and each other’s ways of addressing clinical problems, and of discussing the reasons for these differences between doctors. This taught us that the first prerequisite for successfully implementing EBP is to ensure that the working (and learning) climate is both self-critical and safe.14 After all, the first step of EBM is to formulate a structured, answerable, clinically relevant question, as shown in Figure 1.11 However, the tradition in medicine has long been one of avoiding raising critical questions in public because this would either acknowledge that one does not know something, or suggest that one is critical about a colleague’s knowledge or skills.15 Since childhood we have been taught to raise our hands in class when we know the answer to the teacher’s question, but not the other way around. Throughout our youth and training this emphasis on showing knowledge and skills is reinforced, perhaps even more so now that “excellence” is increasingly being sought in students. Knowing is hot, while ignorance is not. Nevertheless, the first step in EBM is to recognise and admit that one does not know or is not sure of something. This requires a paradigm shift which is still difficult for many physicians, even though it has become generally accepted that “to err is human”10 and that all doctors make mistakes from time to time.3

Table 1. Weekly schedule of the paediatric department at Isala, Zwolle, showing when the five EBM steps were put into effect.

Columns: EBM steps (Figure 1); daily ward rounds, weekly grand round or morning reports; weekly CAT* meeting (15 minutes); biweekly journal club (60 minutes); weekly guideline meeting (15 minutes).

EBM step 1
Daily ward rounds, weekly grand round or morning reports: Formulation of structured, clinically relevant questions. These questions are recorded on a dedicated intranet website.
Weekly CAT* meeting: The resident** or staff member who is scheduled to present the weekly PICO chooses a PICO from the website or formulates a question him/herself.
Biweekly journal club: In a quarterly meeting, staff members democratically choose recent journal articles or topics to be discussed in the following month’s journal club.

EBM step 2
Weekly CAT* meeting: The resident** or staff member who is scheduled to present the weekly PICO searches the medical literature; support from the clinical librarian is optional.
Biweekly journal club: All staff members and residents** can bring recent articles to the above-mentioned quarterly meetings.
Weekly guideline meeting: Staff members search the literature to develop a guideline on the most common clinical problems in their specific area of expertise.

EBM step 3
Weekly CAT* meeting: The resident** or staff member scheduled for the PICO meeting appraises the literature and presents the results.
Biweekly journal club: The article is extensively appraised in plenary by residents and staff members in a structured, interactive manner.
Weekly guideline meeting: Staff members appraise the literature on the most common clinical problems in their specific area of expertise and draw up guidelines or adjust national/international guidelines to the local situation.

EBM step 4
Daily ward rounds, weekly grand round or morning reports: Applying the results.
Weekly CAT* meeting: At the end of the PICO meeting, the implications for our clinical practice in Zwolle are discussed.
Biweekly journal club: At the end of the journal club, the implications for our clinical practice in Zwolle are discussed.
Weekly guideline meeting: Applying the results: are we following our guidelines? Are they up-to-date? Do we encounter barriers in the use of the guidelines?

EBM step 5
The process is evaluated annually in a meeting overview which forms part of the department’s annual report. The process is also discussed during the staff members’ evaluative meetings, which are held twice a year.

* CAT: Critically Appraised Topic, i.e. a method used to address the question of interest, formulated as a PICO (PICO is a literature search method and stands for: Patient/Population - Intervention - Comparison - Outcome).
** Not only residents participate, but also medical students in the final phase of their studies, and nurse practitioners.

Figure 1. Principles of evidence-based practice, showing the five steps. [Diagram labels: Patient dilemma; 1 Formulate structured question; 2 Literature search; 3 Appraisal of literature; 4 Application in practice; 5 Evaluate the process.]

It should also be noted that although medicine is a scientific discipline in which it is essential to raise questions and ask why and how, the many uncertainties that doctors deal with in busy daily clinical practice lead them to rely heavily on experience, patterns and routine for most of their actions.3,16,17 The high reliance on experience and routine and the time constraints of a busy day in the hospital discourage clinicians from asking critical questions about their own and their colleagues’ behaviour.

Diagnostic studies and the limitations of EBM

While developing research questions that were based on identifying gaps in our knowledge of the topics that originated from our EBM meetings, we discovered that the EBM methodology is better developed in the therapeutic field than in the field of diagnostic studies. While most if not all clinicians are familiar and feel comfortable with the methodology of the randomised controlled trial, the great majority of doctors have limited knowledge of the methodology of diagnostic studies.18 While diagnostic studies are difficult to perform and diagnostic reasoning is complex,19,20 understanding diagnostic methodology may help to clarify and demystify our diagnostic reasoning. Doctors are well-trained in making diagnoses, and have several ways of arriving at certain diagnoses, ranging from spot diagnosis and pattern recognition to probabilistic reasoning and the use of clinical prediction rules.21 Although the terms specificity and sensitivity cross our path during our medical training, when experienced consultants are asked to calculate the post-test likelihood that a patient has a disease (given a certain disease prevalence and a specified test sensitivity and specificity) most are unable to do so.22 Given that diagnostic studies are far less frequently published than therapeutic studies, most doctors have little experience in appraising diagnostic studies and they therefore find it difficult to appraise the validity and results of these studies. In our department, the implementation and weekly practice of EBM activities enabled us to become more familiar with diagnostic studies and the methodology behind them, which encouraged us not only to question the day-to-day use of tests in our own practice, but also to study them critically.
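The post-test calculation mentioned above follows directly from Bayes' theorem and can be sketched in a few lines. The prevalence, sensitivity and specificity used here are illustrative values chosen for the example, not figures from this thesis, and the function name `post_test_probability` is an assumption.

```python
def post_test_probability(prevalence, sensitivity, specificity, positive=True):
    """Probability of disease after a dichotomous test result (Bayes' theorem)."""
    if positive:
        true_positive = prevalence * sensitivity
        false_positive = (1 - prevalence) * (1 - specificity)
        return true_positive / (true_positive + false_positive)
    false_negative = prevalence * (1 - sensitivity)
    true_negative = (1 - prevalence) * specificity
    return false_negative / (false_negative + true_negative)


# Illustrative numbers only: 10% prevalence, 90% sensitivity, 80% specificity.
after_positive = post_test_probability(0.10, 0.90, 0.80, positive=True)
after_negative = post_test_probability(0.10, 0.90, 0.80, positive=False)
```

With these illustrative inputs, a positive result raises the probability of disease from 10% to about one in three, while a negative result lowers it to roughly 1.4%. Even a seemingly good test therefore leaves considerable uncertainty after a positive result when the prevalence is low.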

Closing remarks

All of the questions addressed in this thesis emerged from one of our departmental EBM meetings. For this reason, we believe that EBP has two main advantages. Firstly, it leads to improved healthcare by giving the patient the best available treatment, based on solid methodological evidence which takes into account both the doctors’ experience and the patient’s preferences. Secondly, it has beneficial effects on the self-critical attitude of doctors, which helps to raise interesting research questions that are relevant for clinical practice. We hope that our experience in this area, reflected in the studies in this thesis, helps other physicians and medical teams to adopt EBM skills and to develop evidence-based practice.


References

1. Hallgren KA. Computing inter-rater reliability for observational data: an overview and tutorial. Tutor Quant Methods Psychol 2012;8:23-34.
2. Reitsma JB, Rutjes AW, Khan KS, Coomarasamy A, Bossuyt PM. A review of solutions for diagnostic accuracy studies with an imperfect or missing reference standard. J Clin Epidemiol 2009;62:797-806.
3. Sanders L. Every patient tells a story. Broadway Books, New York, 2009.
4. Porter T. Trust in numbers: The pursuit of objectivity in science and public life. Princeton University Press, 1996.
5. Van Maanen H. Goochelen met getallen: Cijfers en statistiek in krant en wetenschap. Boom, Amsterdam, 2009.
6. Dewdney AK. 200% of nothing: An eye-opening tour through the twists and turns of math abuse and innumeracy. John Wiley & Sons, Inc., New York, 1993. P109-120.
7. Gigerenzer G. Reckoning with risk: learning to live with uncertainty. Penguin Books Ltd., London, 2002.
8. Sackett DL. The rational clinical examination. A primer on the precision and accuracy of the clinical examination. JAMA 1992;267:2638-44.
9. De Jongh TOH, Zaat JOM. Fysische diagnostiek: Een serie over de waarde van lichamelijk onderzoek. Ned Tijdschr Geneeskd 2010;154:A2650.
10. Kohn LT, Corrigan JM, Donaldson MS. To Err Is Human: Building a Safer Health System. Washington, DC: National Academy Press; 1999.
11. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ 1996;312:71-72.
12. Grol R, Grimshaw J. From best evidence to best practice: effective implementation of change in patients’ care. Lancet 2003;362:1225-1230.
13. Guyatt G, Drummond R, Meade MO, Cook DJ. Users’ guide to the medical literature: a manual for evidence-based practice. 2nd Ed. McGraw-Hill, New York, 2002.
14. Sutkin G, Wagner E, Harris I, Schiffer R. What makes a good clinical teacher in medicine? Review of the literature. Acad Med 2008;83:452-456.
15. Ingram JR, Anderson EJ, Pugsley L. Difficulty giving feedback on underperformance undermines the educational value of multi-source feedback. Med Teach 2013;35:838-846.
16. Groopman J. How doctors think. Houghton Mifflin Company, Boston, 2007.
17. Van Vugt PJH. Beslissen is menselijk: een overzicht met een accent op geneeskundige aspecten. Eburon, Delft, 2007.
18. Richardson WS. We should overcome the barriers to evidence-based clinical diagnosis! J Clin Epidemiol 2007;60:217-227.
19. Bianchi MT, Alexander BM. Evidence based diagnosis: does the language reflect the theory? BMJ 2006;333:442-445.
20. Gluud C, Gluud L. Evidence based diagnostics. BMJ 2005;330:724-726.
21. Heneghan C, Glasziou P, Thompson M, Rose P, Balla J, Lasserson D, Scott C, Perera R. Diagnostic strategies used in primary care. BMJ 2009;338:b946.
22. Gigerenzer G, Edwards A. Simple tools for understanding risks: from innumeracy to insight. BMJ 2003;327:741-744.


Summary


Measurements are essential in medical practice. Throughout their training, doctors are taught the adage “meten is weten” (the knowledge is in the numbers). In this thesis we evaluated various diagnostic measurements in paediatrics, across a wide age range from neonates to older children, with the aim of clarifying (demystifying) these commonly used procedures.

In the introductory chapter we discuss the general aspects of uncertainty in medicine, since uncertainty is a major reason for performing diagnostic tests in the first place. We emphasise that apart from additional tests, such as laboratory or radiology investigations, every part of history taking and physical examination can be viewed as a diagnostic test. This is followed by a brief explanation of methodological issues concerning the architecture of diagnostic research and the clinimetric evaluation of measurements. Furthermore, we explain how evidence-based medicine may help doctors to deal with this feeling of uncertainty.

In the first chapter we present an experimental study in which we evaluated the validity and interobserver reliability of the semi-quantitative measurement of glucosuria using reagent strips in neonates. We conclude that glucosuria can be assessed reliably and simply by visually read reagent strips in the specific setting of a neonatal intensive care unit, provided that category 1+ and 2+ are taken together as one category. Changes in glucosuria of more than one category (of five categories) are likely to exceed the measurement error.

In the second chapter we investigate the diagnostic value of various clinical signs, including glucosuria, in identifying late-onset neonatal sepsis (LONS) in prematurely born neonates, through an observational study. We present a nomogram that may help to identify LONS in premature infants. This nomogram graphically depicts the likelihood of LONS depending on the presence of the following signs and symptoms: increased respiratory support, capillary refill > 2 seconds, pallor and/or grey skin, or the presence of a central venous catheter in the 24 hours preceding the episode of suspected infection. Glucosuria, temperature instability, apnoea, tachycardia, dyspnoea, feeding difficulties and irritability were all too non-specific to be of diagnostic value in LONS.

In the third chapter we show - from a randomised controlled trial - that fluid balance charts are imprecise and not useful for routine use in moderately ill neonates.


The fourth chapter is about the clinical assessment of severity of dyspnoea in children. We present a systematic review, which shows that none of the numerous dyspnoea scores in the current literature has been sufficiently validated to allow for clinically meaningful use in children with acute dyspnoea. In an observational study we found large interobserver variation in the assessment of dyspnoeic children, obscuring the detection of clinically important improvement, limiting its usefulness in clinical practice and research.

In the fifth chapter we present two observational studies investigating the benefit of identifying and isolating RSV-infected infants hospitalised for bronchiolitis, for the purpose of reducing the risk of nosocomial cross-infection. We looked for the occurrence of nosocomial cross-infections when patients with different causative viral agents share a room during their stay in hospital. We found that room sharing of children with bronchiolitis appears to be safe when proper standard hygiene measures are taken.

In the last chapter we summarize our research questions and main findings, and discuss the strengths, limitations, and clinical and research implications of each of the five research topics. This is followed by some final reflections on the role played by evidence-based medicine (EBM) in investigating this theme.

In general we conclude that the adage “meten is weten” (the knowledge is in the numbers) still holds true, provided that one knows what is being measured, how the measurement is performed, by whom, and for what reason.


Summary for the layperson (Nederlandse samenvatting voor de leek)


This thesis is about a number of measurements that are frequently performed in my field, paediatrics. Doctors are raised with the adage “meten is weten” (literally: measuring is knowing). In this thesis I try to subject this claim to critical scrutiny: is measuring really knowing?

Why do doctors like measuring so much?

Apart from the fact that doctors are trained with “meten is weten”, several other reasons can be given: uncertainty about, for example, the diagnosis; reassurance of the patient (or of the doctor); but also an unconditional trust in numbers, especially when they come from advanced machines. In addition, measurements are often performed simply because we are used to doing them. This is how we were taught; they are routine procedures.

Why is it wrong to measure something unnecessarily, or to measure too much?

The measurement itself can be harmful, certainly when it is invasive. In addition, every measurement comes at a cost. Moreover, every measurement involves false-positive results. A false-positive result means that healthy people are labelled as ill or abnormal. A false-positive result can lead to healthy people receiving unnecessary treatment or being exposed to follow-up investigations. Research has meanwhile shown that, when the pre-test probability of disease is low, performing additional tests does not lead to less anxiety or fewer symptoms in the patient than when no additional tests are performed. In our field, paediatrics, the pre-test probability of disease is generally low: most children are healthy, because most diseases in our Western world are diseases of affluence that mainly affect adults. That is why taking a critical look at “meten is weten” is especially important in paediatrics.

Evidence-based medicine

On the Amalia children’s ward of Isala, a large teaching hospital in Zwolle, we actively started implementing evidence-based medicine (EBM) in our practice in 2005. This means that we try to keep looking critically at our own practice. We regularly ask ourselves whether we are doing things right, and we hold weekly meetings about this in which a clinical question is addressed in a structured way (we call this a CAT, a critically appraised topic). If we cannot find a satisfactory answer to a question in the available literature, we often formulate a research question and examine whether we can answer it with a study in our own practice. The topics covered in this thesis all arose from these EBM meetings and the studies that followed from them.

This thesis

In this thesis I examine the reliability and usefulness of five commonly used measurements or procedures in paediatrics:
1. The semi-quantitative determination of glucose in urine using test strips.
2. Clinical symptoms in the detection of serious infections in preterm infants.
3. The fluid balance in sick newborns.
4. The assessment of the severity of breathlessness in children.
5. The use of viral tests in bronchiolitis.

Reliability of test strips for determining the amount of glucose in urine (glucosuria)

In the first chapter we examine the reliability of test strips for determining the amount of glucose in urine (glucosuria). The test strip changes colour after contact with urine, and the degree of colour change is a measure of the amount of glucose in the urine. The result is reported in five categories: 0, 1+, 2+, 3+ and 4+. We had three nurses independently assess 900 urine samples. The urine samples were stored under different conditions (room versus incubator temperature, short or long stay in the nappy) and the urine was obtained from the nappy in different ways (the strip was pressed directly onto the wet nappy, or the urine was squeezed with a syringe from a gauze pad that had been placed in the nappy). The test strip results were compared with the urinary glucose concentration as measured quantitatively in the laboratory. The results showed that the test strip result agreed well with the glucose values determined in the laboratory, except in categories 1+ and 2+. If these two categories are regarded as a single category, the reliability of the test strip for determining glucosuria is good. This reliability is not affected by the specific conditions in a neonatal intensive care unit (temperature, contact with the nappy) or by the method of urine collection. The test strips are therefore suitable for measuring glucosuria in neonates in a neonatal intensive care unit.

The value of clinical symptoms, including glucosuria, for the early recognition of a serious infection (sepsis) in preterm infants

In the second chapter we investigated the value of various clinical symptoms, including glucosuria, for the early recognition of a serious infection (sepsis) in preterm infants (gestational age < 32 weeks). Serious infections are an important cause of death in these children, which is why we want to detect these infections as quickly as possible so that antibiotic treatment can be started in time. Unfortunately, it can be difficult to recognise these infections early; the symptoms of an infection in preterm infants are not very specific, which means that the symptoms can also occur without a (serious) infection being present. This is especially true of early sepsis. To determine the value of the various symptoms that might point to an infection, we measured glucosuria daily in more than 300 preterm infants using the test strips mentioned above. We then looked at which children were suspected of having a serious infection. In the 142 children in whom such a suspicion arose, we examined which clinical symptoms were seen and whether a serious infection was actually present, for example when a bacterium was found in the baby’s blood. The results showed that an increased need for respiratory support, a prolonged capillary refill (poor blood flow to the skin), a grey skin colour and the presence of a central venous line (a drip in a large blood vessel) were the most important signs of a serious infection. Glucosuria, temperature instability, apnoea (failing to keep breathing), tachycardia (fast heart rate), dyspnoea (breathlessness), feeding problems and irritability proved too non-specific either to demonstrate or to rule out an infection. The features that we can measure “objectively” (glucosuria, body temperature, apnoeas on the monitor, heart rate) therefore proved to be of insufficient value in predicting serious infections in preterm infants.

Reliability and usefulness of the fluid balance in sick newborns

In chapter three we examined the reliability and usefulness of keeping a daily fluid balance in sick newborns. A fluid balance is kept for many patients to get an impression of their fluid status. For this purpose, everything the patient takes in (feeds, but also infusions and medication) and everything the patient excretes (urine, stools, vomit, sweat, etc.) is recorded over 24 hours. If a fluid balance is kept properly, the result of the balance (the amount of fluid going into the patient minus the amount of fluid the patient excretes) should correspond to the change in the baby’s body weight over the same period. For example: if the fluid balance is +40 ml, you would expect the baby to have gained 40 g (after all, 1 ml of water weighs 1 g). Until a few years ago it was customary on our neonatal high-care ward to keep a fluid balance for every admitted newborn during the first 5 days of admission. This takes the nurses a great deal of time.

In the study in chapter three, the fluid balance was compared with the daily change in weight in 170 newborns admitted to the high-care ward for newborns. The fluid balance and the weight change showed poor agreement. In 40% of the fluid balances the difference from the weight gain or loss was more than 50 ml (corresponding to 20% of the daily fluid intake), a difference that we had defined as clinically relevant for the study. In a follow-up study these 170 patients were randomly assigned to two groups: in one group the doctor had access to the fluid balance, in the other group the doctor did not. This randomised, blinded study showed no difference between the two groups: length of stay and weight loss did not differ, nor was there a difference in treatment aimed at fluid management (for example, use of diuretics). We therefore conclude that the fluid balance cannot be measured accurately in moderately ill newborns and also had no influence on clinically relevant outcomes for this group of children. Keeping a fluid balance is therefore not useful in these patients. On the basis of these results we have stopped routinely keeping a fluid balance for admitted newborns on our neonatal high-care ward. This saves the paediatric nurses a great deal of (useless) work.
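The comparison between fluid balance and weight change can be illustrated with a small sketch. It assumes, as the text does, that 1 ml of fluid weighs about 1 g, and it uses the 50 ml threshold that the study defined as clinically relevant. The record layout, the function names and all example values are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass

# Threshold (in ml) that the study defined as a clinically relevant discrepancy
CLINICALLY_RELEVANT_ML = 50


@dataclass
class DailyRecord:
    intake_ml: float        # feeds, infusions, medication over 24 hours
    output_ml: float        # urine, stools, vomit, etc. over 24 hours
    weight_change_g: float  # weight at end of period minus weight at start


def fluid_balance_ml(record: DailyRecord) -> float:
    """Fluid balance: what goes in minus what comes out."""
    return record.intake_ml - record.output_ml


def discrepancy_ml(record: DailyRecord) -> float:
    """Absolute difference between fluid balance and weight change.

    Grams and millilitres are compared directly (1 ml of water is about 1 g).
    """
    return abs(fluid_balance_ml(record) - record.weight_change_g)


def fraction_relevant(records, threshold_ml=CLINICALLY_RELEVANT_ML) -> float:
    """Fraction of daily records whose discrepancy exceeds the threshold."""
    flagged = [r for r in records if discrepancy_ml(r) > threshold_ml]
    return len(flagged) / len(records)


# Hypothetical example values, for illustration only
records = [
    DailyRecord(250, 210, 40),   # balance +40 ml, gain 40 g: discrepancy 0
    DailyRecord(260, 200, -10),  # balance +60 ml, loss 10 g: discrepancy 70
    DailyRecord(240, 230, 30),   # balance +10 ml, gain 30 g: discrepancy 20
    DailyRecord(255, 180, 10),   # balance +75 ml, gain 10 g: discrepancy 65
    DailyRecord(250, 240, -20),  # balance +10 ml, loss 20 g: discrepancy 30
]

print(fraction_relevant(records))
```

In this made-up sample two of the five daily records differ from the weight change by more than 50 ml; the first record reproduces the +40 ml and 40 g example from the text.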

Clinical assessment of breathless children

Chapter 4 is about the clinical assessment of acutely breathless children. Breathlessness is a common problem in childhood. Lung function testing is not possible in (young) children, certainly not in acute situations. For this reason doctors and nurses often use dyspnoea scores, in which various clinical symptoms are added together. Using a systematic literature review we examined whether the existing dyspnoea scores had been studied sufficiently for reliability (validity) and usability. We found 36 different dyspnoea scores, and none of them had been sufficiently validated! Next, using video recordings of 27 breathless children in our hospital, we looked at the variation between nine different paediatricians and paediatric nurses in assessing the severity of breathlessness. It turned out that the variation within the same assessor was acceptable, but that the variation between different assessors was very large. This variation between paediatricians and paediatric nurses was so large that in two thirds of the situations no distinction could be made between the measurement error resulting from this variation and a possible effect of treatment. Because the severity of breathlessness in children during a hospital admission is always assessed by different doctors and nurses, this large variation between assessors is a major problem. In effect, we cannot properly assess the severity of breathlessness in these children, or the changes in it that occur from day to day, partly as a result of our treatment. Follow-up research will have to show whether better training of doctors and nurses can solve this problem.

Viral tests in children with bronchiolitis

In the last chapter we examined the usefulness of viral tests in children admitted to hospital with bronchiolitis (a viral infection of the lower airways that mainly causes breathlessness and poor feeding in young children, and for which some children need extra oxygen or even admission to an intensive care unit). In the winter season large numbers of children are admitted with bronchiolitis, usually caused by the RS virus. Because of its contagiousness, it is customary to nurse children with an RS infection separately from children without the RS virus, to prevent children with bronchiolitis from acquiring a double infection (co-infection). For this reason a (rapid) RS test is often performed on admission. This habit rests on the assumption that children with a co-infection have a more severe course of illness than children with only one virus. In an observational study we investigated whether it can be safe to nurse children with bronchiolitis together in one room. In the winter season of 2011-2012 the children with bronchiolitis shared a room during the first day of admission. Of the 48 children with bronchiolitis, two (4%) acquired a co-infection during admission. Both children had been alone in a room throughout the admission, and therefore could not have acquired the co-infection from another child in the same room. None of the 37 children who shared a room developed a co-infection. In the winter season of 2012-2013 the patients with bronchiolitis shared a room with other bronchiolitis patients throughout the admission. Of the 65 children included in this study, eight (12%) developed a co-infection. In only one of these eight children could the co-infection be traced to a roommate. The children with a co-infection were not more ill than the children with a mono-infection.

We conclude that it is very probably safe to nurse children with bronchiolitis in one room, regardless of the causative virus. Performing viral tests with the aim of distinguishing children with RS infections from children without RS infections is therefore unnecessary. On our children’s ward we no longer perform these tests routinely (but only in the context of follow-up research). No longer performing these tests yields a considerable cost saving for the children’s ward.

Measuring is knowing, but with uncertainty!

In summary, the adage “meten is weten” therefore usually does not hold in paediatrics. In fact, it holds only on condition that you know what you are measuring, how, by whom and why! In addition, this thesis shows that it can be useful to scrutinise commonly used, routine measurements. We want to emphasise that our evidence-based practice has played an important role in creating a self-critical and curious attitude, and in bringing about this thesis. With this we want to underline the importance of such an evidence-based practice. It makes working as a paediatrician more interesting and more enjoyable, and in this way we can make a small contribution to the development of our field.


List of Co-authors


Yvette (Y) van Asperen, MD
Beatrix Children’s Hospital, University Medical Center Groningen, Groningen

Joline (JF) Bakker, MSc
Isala, Zwolle

Ine Marije (IM) Bartels, MD
Department of General medicine, University Medical Center Groningen, Groningen

Paul (PLP) Brand, Prof, MD
Amalia Children’s Center, Isala, Zwolle

Lesla (ES) Bruijnesteijn
Laboratory for Clinical Microbiology, Isala, Zwolle

Wieke (H) Eggink, MD
Department of Neurology, University Medical Center Groningen, Groningen

Liesbeth (JM) Groot-Jebbink
Amalia Children’s Center, Isala, Zwolle

Boudewijn (BJ) Kollen, PhD
Department of General Practice, University Medical Center Groningen, Groningen

Joke (JH) Kok, Prof, MD
Department of Neonatology, Amsterdam Medical Center, Amsterdam

Veerle (VJ) Langenhorst, MD
Amalia Children’s Center, Isala, Zwolle

Sjef (JJCM) van de Leur
Department of Clinical Chemistry, Isala, Zwolle

Roelien (R) Reimink, MANP
Amalia Children’s Center, Isala, Zwolle

Hans (JB) Reitsma, MD, PhD
Julius Center for Health Sciences and Primary Care, University Medical Center, Utrecht

Gijs (GJHM) Ruijs, MD, PhD
Laboratory for Clinical Microbiology, Isala, Zwolle

Irma (HLM) van Straaten, MD, PhD
Amalia Children’s Center, Isala, Zwolle

Mirjam (MM) Wessels, MANP
Amalia Children’s Center, Isala, Zwolle


List of publications


International (PubMed)

1. Bekhof J, Kollen BJ, Leur van der SJCM, Kok JH, Straaten HLM. Reliability of reagent strips for measurement of glucosuria in a neonatal intensive care setting. Accepted for publication in Pediatr Neonatol.

2. Bekhof J, Reimink R, Brand PLP. Systematic review: insufficient validation of clinical scores for the assessment of acute dyspnoea in wheezing children. Paediatr Respir Rev. 2014;15:98-112

3. Bekhof J, Bakker J, Reimink R, Wessels M, Langenhorst V, Brand PL, Ruijs JHM. Co-infection in children hospitalized for bronchiolitis: role of roomsharing. J Clin Res Med. 2013;5:426-31.

4. van Ijsselmuiden MN, Bekhof J, Peek AM. A neonate with the ‘blueberry muffin syndrome’. Ned Tijdschr Geneeskd. 2013;157:A6460.

5. Bekhof J, van Asperen Y, Brand PL. Usefulness of the fluid balance: a randomised controlled trial in neonates. J Paediatr Child Health. 2013;49:486-92.

6. Bekhof J, Reitsma JB, Kok JH, Van Straaten IH. Clinical signs to identify late-onset sepsis in preterm infants. Eur J Pediatr. 2013;172:501-8.

7. van Asperen Y, Brand PL, Bekhof J. Reliability of the fluid balance in neonates. Acta Paediatr. 2012 May;101:479-83.

8. Bekhof J, Kollen BJ, Groot-Jebbink LJ, Deiman C, van de Leur SJ, van Straaten HL. Validity and interobserver agreement of reagent strips for measurement of glucosuria. Scand J Clin Lab Invest. 2011 May;71:248-52.

9. Leen WG, Klepper J, Verbeek MM, Leferink M, Hofste T, van Engelen BG, Wevers RA, Arthur T, Bahi-Buisson N, Ballhausen D, Bekhof J, van Bogaert P, Carrilho I, Chabrol B, Champion MP, Coldwell J, Clayton P, Donner E, Evangeliou A, Ebinger F, Farrell K, Forsyth RJ, de Goede CG, Gross S, Grunewald S, Holthausen H, Jayawant S, Lachlan K, Laugel V, Leppig K, Lim MJ, Mancini G, Marina AD, Martorell L, McMenamin J, Meuwissen ME, Mundy H, Nilsson NO, Panzer A, Poll-The BT, Rauscher C, Rouselle CM, Sandvig I, Scheffner T, Sheridan E, Simpson N, Sykora P, Tomlinson R, Trounce J, Webb D, Weschke B, Scheffer H, Willemsen MA. Glucose transporter-1 deficiency syndrome: the expanding clinical and genetic spectrum of a treatable disorder. Brain. 2010 Mar;133(Pt 3):655-70.

10. Emmen E van, Roord STA, Brouwer AFJ, Kuiters GRR, Bekhof J. Puistjes en blaasjes bij pasgeborenen. Ned Tijdschr Geneeskd 2007;151:277-83.

11. Baatenburg de Jong R, Bekhof J, Langenhorst V, Zwart P, Roorda RJ. Ontwikkelingsachterstand bij borstgevoede kinderen door ontoereikend dieet van de moeder. Ned Tijdschr Geneeskd. 2006;150:465-9.

12. Crone MR, van Spronsen FJ, Oudshoorn K, Bekhof J, van Rijn G, Verkerk PH. Behavioural factors related to metabolic control in patients with phenylketonuria. J Inherit Metab Dis 2005;28:627-37

13. Bekhof J, Norbruis OF, Scheenstra R, Weerd de W. Actief beleid bij een kind dat een knoopbatterij heeft ingeslikt. Ned Tijdschr Geneeskd. 2005;149:163-7.

14. Baatenburg de Jong R, Bekhof J, Roorda RJ, Zwart P. Severe nutritional vitamin deficiency in a breast-fed infant of a vegan mother. Eur J Pediatr. 2005;164:259-60.

15. Bekhof J, De Langen R, Verkade HJ. Icterus prolongatus reden voor laboratoriumdiagnostiek, ook bij borstgevoede zuigelingen. Ned Tijdschr Geneeskd. 2005;149:613-7

16. Bekhof J, van Rijn M, Sauer PJJ, Ten Vergert EM, Reijngoud DJ, van Spronsen FJ. Plasma phenylalanine in patients with phenylketonuria self managing their diet. Arch Dis Child. 2005;90:163-4


17. Bekhof J, Norbruis O, Scheenstra R, Dikkers F, De Langen R, de Weerd W. Babies and batteries. Lancet. 2004 Aug 21;364:708.

18. van Rijn M, Bekhof J, Dijkstra T, Smit PG, Modderman P, van Spronsen FJ. A different approach to breast-feeding of the infant with phenylketonuria. Eur J Pediatr. 2003 May;162:323-6.

19. Bekhof J, van Spronsen FJ, Crone MR, van Rijn M, Oudshoorn CG, Verkerk PH. Influence of knowledge of the disease on metabolic control in phenylketonuria. Eur J Pediatr. 2003 Jun;162:440-2.

20. Spaapen LJ, Bakker JA, Velter C, Loots W, Rubio-Gozalbo ME, Forget PP, Dorland L, De Koning TJ, Poll-The BT, Ploos van Amstel HK, Bekhof J, Blau N, Duran M, Rubio-Gonzalbo ME. Tetrahydrobiopterin-responsive phenylalanine hydroxylase deficiency in Dutch neonates. J Inherit Metab Dis. 2001 Jun;24:352-8.

21. van Spronsen FJ, van Rijn M, Bekhof J, Koch R, Smit PG. Phenylketonuria: tyrosine supplementation in phenylalanine-restricted diets. Am J Clin Nutr. 2001 Feb;73:153-7.

22. de Koning TJ, Nikkels PG, Dorland L, Bekhof J, De Schrijver JE, van Hattum J, van Diggelen OP, Duran M, Berger R, Poll-The BT. Congenital hepatic fibrosis in 3 siblings with phosphomannose isomerase deficiency. Virchows Arch. 2000 Jul;437:101-5.

Other

1. Reimink R, Nijboer CM, Langenhorst VL, Norbruis OF, Bekhof J. Acute gastro-enteritis bij kinderen: Implementatie van een richtlijn gebaseerd op snelle enterale rehydratie met ORS. Tijdschr Kindergeneeskd. 2013;5(81):119-125.

2. Draaisma E, Bekhof J. Partiële wisseltransfusie bij de polycythemische pasgeborene: effectiviteit en veiligheid. Praktische pediatrie, Nascholingstijdschrift over kindergeneeskunde. 2013;3:168-172.

3. Tiemersma S, Bekhof J. Bij welk gewicht kunnen prematuren in de wieg. Praktische Pediatrie, Nascholingstijdschrift over kindergeneeskunde. 2011;3:194-197.

4. Bekhof J, Brand PLP, Boluyt N. Bij welke saturatiegrens is opname geïndiceerd bij kinderen met een (RSV-)bronchiolitis? Praktische Pediatrie, Nascholingstijdschrift over kindergeneeskunde. 2010;3:200-3.

5. Schakel W, Bekhof J. Prematuren geboren na 36 weken zwangerschapsduur: 48 uur observatie op de kraamafdeling is voldoende. Tijdschr Kindergeneesk. 2010;78:3-6

6. Bekhof J. Evidence-based medicine in de kliniek: van theorie naar praktijk. Nederlands Tijdschrift voor Obstetrie & Gynaecologie. Maart 2010;123:50-52.

7. Bekhof J, Boluyt N, Boere-Bonekamp M. Het effect van helmredressietherapie bij positionele plagiocephalie. Tijdschrift voor Jeugdgezondheidszorg. Dec 2009 41(6):115-118

8. Bekhof J, Boluyt N, Boere-Bonekamp M. Het effect van helmredressietherapie bij positionele plagiocephalie. Praktische Pediatrie, Nascholingstijdschrift over kindergeneeskunde. 2009;2:140-143.

9. Bekhof J. Pasgeborenen met meconiumhoudend vruchtwater: Hoe lang is observatie nodig? Praktische pediatrie, Nascholingstijdschrift voor kindergeneeskunde. 2008;1:79.

10. van Rijn M, Bekhof J, Dijkstra T, Smit GPA, Modderman P, van Spronsen FJ. Borstvoeding: ook voor het kind met fenylketonurie. Tijdschr Kindergeneeskd. 2002;70:195-9.


Acknowledgements (Dankwoord)


You do not do research alone. Moreover, as a medical specialist with a full-time job you can only complete a PhD project if the people around you give you the room to do so. Since this PhD research took quite a long time, I owe thanks to many different people. I sincerely hope that I have not forgotten anyone in these acknowledgements.

Most important of all to me is the “home front”: Marco, Noor and Sara. But, as is more or less customary in thesis acknowledgements, I mention you only at the very end (although this way you rightly get mentioned twice!).

I carried out my research entirely on the children’s ward of Isala; the questions arose from the shop floor in Zwolle and we conducted the research with patients from Zwolle. I therefore owe many thanks to my colleagues, the department of paediatrics. When I came to work in Zwolle as a resident in training, I soon thought: “I hope I can work here as a paediatrician later.” Besides the fact that the work itself at Isala is very rewarding for a paediatrician, thanks to the wide range of patients combined with the broad expertise, the atmosphere within the department is an enormous motivator. I was attracted from the very start by the critical attitude that is, as it were, “inborn” in the Zwolle paediatrician, or otherwise at least seriously contagious. This attitude stimulated me to do research even during my training period at Isala. Loes, Danielle, Dorien, Gert, Pieter, Obbe, Angelien, Eelco, Eric, Paul, Veerle, Hans and, more recently, Francis and Sarah, my fellow paediatricians, and also my fellow neonatologists: thank you for your unfailing support.

Paul, you are a great example to me and I greatly appreciate that you are willing to be my first supervisor (promotor). You are a friendly, hard-working and pleasant colleague, bursting with an energy and ambition that radiate onto our whole department. I have always found your way of doing research into “ordinary” practical matters in medicine very inspiring. You are someone with boundless, sometimes enviable energy. You reply to e-mails so quickly that I started to worry if I had not heard from you after 12 hours or so. I am very glad that you have not taken your ambitions to an academic hospital, I am proud to work with you, and I hope that together we can knock over many more sacred cows (“heilige huisjes”).

Irma, it is wonderful that you are willing to be my co-supervisor (co-promotor). After all, we started the glucosuria study together. I have always seen you as one of the critical neonatologists, and I learned a lot from you during my time on the NICU, not only about medicine but also from the way you speak with parents during the often very distressing moments with their sick child. Thank you for that.

Joke Kok, thank you for being willing to help me with this PhD research. Although we have not been able to work together much, I am grateful for your positive commitment and warm guidance.

Hans Reitsma, physician-epidemiologist, thank you for your help with the study of clinical predictors of sepsis in preterm infants. You introduced me thoroughly to the methodology of regression analysis and gave us the idea for the nomogram, for which I thank you. I also want to thank Boudewijn Kollen, epidemiologist, now working at the UMCG, for his help with the statistics.

Of course I also want to sincerely thank the members of the reading committee, Prof. Dr. M. Offringa, Prof. Dr. E.E.S. Nieuwenhuis and Prof. Dr. E.A. Verhagen, for the trouble they took to assess my thesis.

My paranymphs, Veerle Langenhorst and Hans van Unen. How wonderful and reassuring that you are willing to stand by my side on this day.

Veerle, from the moment you arrived in Zwolle, wearing your African bead necklace and calling out at every opportunity "is this investigation really necessary?" and "do you know what an IV drip costs?", with your "maybe-we-should-just-wait-and-see" attitude, I was a fan of yours. I think it is fantastic that together with you I have been able to shape EBM within our department (and beyond). You are an extremely original and independent thinker and an example of my proposition that "doing research is not necessary to be a good doctor".

Hans, to me you stand for the atmosphere within our department. Even more than the content of the work, the working atmosphere is the most important thing for my enjoyment of work, and I think also for the functioning of a department or partnership. I greatly value your positive contribution to the good atmosphere within our department. In addition, "Anima Sana in Corpore Sano" applies to you like no other, and you regularly act as a coach during running races. With your mantras such as "If the wind bothers you while running, it bothers your opponent too..." I have been able to pick up my pace during many a run. I hope that, besides our contact at work, we may run many more races together among the Sallandse Toppers.

Eric de Groot, how lovely to share an office with you. Your calm presence and enormous ability to put things in perspective have helped me through many an emotional storm. On the one hand it is a pity that we will not be celebrating our PhD parties together on the Agnietenberg; on the other hand I find our argument "the more often you can celebrate, the better" very strong. I wish you every success with the final stretch of completing your thesis!

Roelien Reimink and Mirjam Wessels, two of our nurse specialists. What would we as a paediatric department do without you? I greatly appreciate your commitment to both patient care and research. Without you I could never have brought this research to a successful conclusion. To anyone who still has reservations about nurse practitioners or nurse specialists, I want to say emphatically: do not hesitate, because it means better care! I sometimes see you sigh when I come up with yet another new idea, but I still hope that together we can implement many more innovations and so bring about real improvement at the patient's bedside!

M3 students: (at the time) Yvette van Asperen, Marjolein Breedveld, Nynke Doorenbos, Joline Bakker, Ine-Marije Bartels and Wieke Eggink, the medical interns who did their M3, or research internship, with me and were thereby of indispensable value to this research. Thank you for your contribution!

Liesbeth Groot Jebbink and Carin Bunkers, research nurses, and also Corrie Deiman, Gerrie Veneklaas, Sjef van de Leur and Annelies Vogelsang, thank you for your indispensable help in carrying out the study on the reliability of the glucose sticks. I want to thank the paediatric nurses of Isala's NICU and paediatric department for measuring glucose in the nappies and for their help with the fluid balance and bronchiolitis studies. Mirell Papenhuijzen, medical information specialist, thank you for your help with the literature searches. Gijs Ruijs, medical microbiologist, and Lesla Bruijnesteijn, molecular biologist, thank you for your active contribution to unravelling the cross-infections in bronchiolitis.

Mum and Dad, at home you always called me "the professor". Although I have not attained that title, I am the first Bekhof (within "our tribe") with a doctoral title. Thank you for the warm nest, a genuine "less-talk-more-work" attitude and the unconditional trust you have given me from an early age (and still give)!

Dear, dearest Marco, Noor and Sara. How intensely happy I am with you, my wonderful husband and beautiful, sweet girls. Noor, you asked me so often "when that little book will finally be finished"; here it is, entirely for you.


About the author


After receiving her medical degree at the University of Groningen in 1997, Jolita Bekhof attended the residency programme in paediatrics at the UMCG in Groningen and Isala Clinic in Zwolle. From 2004 until 2005 she worked as a neonatology fellow at Isala's NICU, and subsequently started her current job as a general paediatrician at Isala.

The critical and strongly innovative character of the paediatric department at Isala was the ideal stimulus for her interest in research and evidence-based medicine (EBM). Her main drive is to continuously improve (paediatric) health care, and she considers EBM a suitable way to achieve this, one well suited to her character. Following the postgraduate epidemiology programme at the EMGO-institute in Amsterdam and several (inter)national courses in (teaching) EBM enabled her to implement EBM in Isala's paediatric department. The studies in this thesis all originated from discussions with our paediatric team during one of the EBM activities.
