
HYDROLOGICAL PROCESSES
Hydrol. Process. 21, 985–988 (2007)
Published online 27 February 2007 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/hyp.6639

On not undermining the science: coherence, validation and expertise. Discussion of Invited Commentary by Keith Beven, Hydrological Processes, 20, 3141–3146 (2006)

Jim Hall*, Enda O’Connell and John Ewen
School of Civil Engineering and Geosciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK

*Correspondence to: Jim Hall, School of Civil Engineering and Geosciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK. E-mail: [email protected]

Received 19 October 2006
Accepted 9 November 2006

Keith Beven in his recent Invited Commentary ‘On undermining science?’ (Beven, 2006a) called for a discussion on the topic of uncertainty in the pages of HP Today. This is a topic that we have enjoyed debating with Professor Beven in the past and we are happy to respond to his call. His Commentary argued that uncertainty analysis does not undermine science and, by the same token, we argue that subjecting uncertainty analysis to vigorous scrutiny and challenge should not undermine it. In short, we are supporters of uncertainty analysis in practically all its forms and so, by challenging aspects of Beven’s commentary, we are not seeking to undermine his aim, which we interpret as conscientious reporting of uncertainties and of the methodological assumptions used in calculating those uncertainties.

We agree that hydrologists should be challenged to apply more rigour and consistency in their reporting of uncertainties. Beven suggests that there is now only one defence against the evaluation of uncertainty in the prediction of environmental models, and that is computational cost. Yet, as he goes on to point out, even GCMs are now being run in ensemble mode, and advances in parallel and grid computing are weakening even the computational defence. Uncertainty analysis of very complex models will never be easy. Since this is the case, as much effort should go into developing computational methods and reduced-complexity versions (be they data-based emulators, reduced-form mechanistic models or hybrids thereof), which do permit uncertainty analysis, as is invested in development of the models themselves. At the moment, uncertainty analysis is too often an afterthought for model builders. At the last count, the Hadley Centre for Climate Change Research employed 173 scientists, most of them developing and running GCMs. Fewer than 10 were involved in uncertainty analysis!
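To make the reduced-complexity route concrete, the following short Python sketch (our own illustration; the expensive_model function, the quadratic emulator and the assumed parameter prior are purely hypothetical) shows the basic pattern: a handful of runs of an expensive model are used to fit a cheap emulator, which is then sampled by Monte Carlo to approximate the predictive distribution at a fraction of the computational cost.

import numpy as np

rng = np.random.default_rng(42)

# Stand-in for an expensive simulator with one uncertain parameter;
# in practice each call might take hours of CPU time.
def expensive_model(theta):
    return 3.0 * np.exp(-0.5 * theta) + 0.1 * theta**2

# Step 1: a small design of 'expensive' runs.
design = np.linspace(0.0, 4.0, 9)
runs = np.array([expensive_model(t) for t in design])

# Step 2: fit a cheap emulator (here a simple quadratic response surface).
emulator = np.poly1d(np.polyfit(design, runs, deg=2))

# Step 3: Monte Carlo on the emulator under an assumed parameter prior.
theta_samples = rng.normal(loc=2.0, scale=0.5, size=100_000)
predictions = emulator(theta_samples)
print("5th-95th percentile of emulated output:",
      np.percentile(predictions, [5, 95]))

The same pattern applies with Gaussian-process emulators or reduced-form mechanistic models; the point is that the uncertainty analysis is designed alongside the model rather than bolted on afterwards.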

Decision-making and Coherence

Beven’s Commentary is concerned with the effect of uncertainty analysis on decision-makers who make use of hydrological science. He asks ‘Will showing the results of such analyses to users and stakeholders undermine their confidence in the science if the uncertainty bounds are large?’ It seems to us that rather more careful attention needs to be paid to the needs of decision-makers when implementing uncertainty analysis (Hall, 2003). Uncertainty analysis should be preceded by careful consideration of the ways in which uncertainty estimates will be used to inform decision-making, which will guide the choice of uncertainty representation. Exploratory modelling to inform contested policy questions of environmental management requires a quite different approach to, for example, supporting flood warning decisions, though we are inclined to concur with Beven that, even in the latter case, decision-makers and members of the public have a greater capacity to cope with probabilistic information than is often assumed.

Governments are emphasizing uncertainty in their procedures for environmental management and capital investment. For example, UK Treasury (HM Treasury, 2003) guidance for decision-makers in government states: ‘An expected value is a useful starting point for understanding the impact of risk between different options. But however well risks are identified and analysed, the future is inherently uncertain. So it is also essential to consider how future uncertainties can affect the choice between options’. Increasingly, decision-makers may need and/or ask for a full display of the magnitudes and the sources of uncertainties before making an informed judgment (Pate-Cornell, 1996).

Careful analysis of the decision context can actually permit some uncertainties to be put to one side as being irrelevant, so that effort can be focussed upon critical thresholds that determine the ordering between decision options. It can help target uncertainty analysis. Yet an emphasis on decision-making also provides some rather exacting criteria for uncertainty representation. A decision-maker expects uncertainty estimates that imply that in the long run they will avoid sure loss of utility; in other words, they expect the probability calculus to be coherent. This is in addition to the usual scientific requirements for transparency and repeatability of method. The requirement for coherence might seem to be self-evident, but in practice it provides rather tight constraints on the admissible ways of representing uncertainty.
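To make ‘sure loss’ concrete, here is a minimal worked example of the classical Dutch book argument; the numbers are purely illustrative and are ours rather than the Commentary’s. Suppose an analyst reports P(F) = 0.7 for a flood occurring and P(not F) = 0.5 for it not occurring. Acting on these incoherent probabilities, the analyst would pay 0.7 for a bet returning 1 if the flood occurs and 0.5 for a bet returning 1 if it does not. Exactly one bet pays out, so the net position is

\[
  1 - (0.7 + 0.5) = -0.2,
\]

a guaranteed loss of 0.2 whatever happens. Coherence, here simply the requirement that the probabilities of an exhaustive, mutually exclusive pair of events sum to one, is exactly what rules such sure losses out.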

Is GLUE Undermining Statistics?

The introduction of the GLUE methodology in 1992 (Beven and Binley, 1992) represents a landmark in both the practice and philosophy of uncertainty in hydrological science. The method is convenient to apply and, as a multitude of publications and citations attest, it has been widely used in research and practice. GLUE has stood the test of time, but has also been subject to cogent criticism, most recently by Mantovan and Todini (2006). The two criticisms given below stand out, though much more attention has been paid to the former than to the latter:

• The use of ‘informal’ likelihood functions (for example, Nash-Sutcliffe efficiency) and the setting of likelihoods to zero for ‘non-behavioural’ models

• The routine use of uniform priors to represent prior ignorance.

It is the first of these that most upsets orthodox Bayesians, who are themselves sometimes guilty of the uncritical use of conjugate or ‘non-informative’ priors. If a formal likelihood is employed in GLUE (with appropriate normalization), then Bayes theorem is recovered (Beven and Smith, 2006). Mantovan and Todini (2006) illustrate the incoherence of employing informal likelihoods and set out formal desiderata for a likelihood function or, more generally, an approach to learning from observations (the first of which is illustrated in the sketch following the list):

1. The likelihood function should yield the same results, starting from the same prior uncertainty, irrespective of the order in which independent data are processed and irrespective of whether the data are analysed successively or in combination.

2. The expected value of acquiring additional data should always be non-negative.

3. As the number of observations of a variable increases, the relative frequency with which the variable is observed to exceed some value should converge to the predictive probability of the variable exceeding that value.
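Desideratum 1 is automatic when a formal likelihood is used, because Bayesian updating is multiplicative; the following standard derivation (in our notation) makes the point explicit. Processing independent observations y_1 then y_2 gives

\[
  p(\theta \mid y_1, y_2) \propto p(y_2 \mid \theta)\, p(\theta \mid y_1)
  \propto p(y_1 \mid \theta)\, p(y_2 \mid \theta)\, p(\theta),
\]

which is symmetric in y_1 and y_2 and identical to processing both observations in a single batch. An informal measure that is not a product of per-observation terms, for example a goodness-of-fit score computed over the whole record and rescaled to sum to one, need not have this property, and that is the root of the incoherence identified by Mantovan and Todini (2006).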

Abandoning these formal requirements for a likelihood function, which actually stem from the more primitive concept of exchangeability (Bernardo and Smith, 1994), will lead to paradoxical results. Beven and Smith (2006) retort by arguing that use of formal likelihood functions may be based on strong and perhaps untenable assumptions about the nature of model structural error. They demonstrate that incautious use of formal likelihoods can, in the presence of model structural error, lead to inaccurate prediction and uncertainty estimates. However, the use of an informal likelihood function is not assumption-free. There is an implied statistical model underlying any function that is used to generate probabilities from data. Typically, the simpler the likelihood function, the stronger the statistical choice (e.g. independence gives a product form). If we are applying a simple likelihood function (be it formal or informal) to complex processes, we should be wary of the fact that strong and perhaps unwarranted assumptions are being made. The question is whether that statistical model has properties that we would judge to be desirable. Some of these, like the ones listed above, can be set down a priori, while others require careful reflection upon the data at our disposal and the extent to which our models provide meaningful information about processes in the real world.
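To make the ‘implied statistical model’ point concrete, the short Python sketch below (our own toy example with a synthetic data set, not the procedure of any of the cited studies) weights the same parameter ensemble in two ways: with a GLUE-style informal measure based on Nash-Sutcliffe efficiency and a behavioural threshold, and with a formal Gaussian likelihood that assumes independent errors of known variance. The two weightings will, in general, rank and spread the ensemble differently; choosing between them is precisely a choice of statistical model, whether made implicitly or explicitly.

import numpy as np

rng = np.random.default_rng(0)

# Toy 'hydrological' model: exponential recession with one parameter k.
t = np.arange(30.0)
def model(k):
    return 10.0 * np.exp(-k * t)

# Synthetic observations from a 'true' parameter plus noise of known sigma.
k_true, sigma = 0.15, 0.5
obs = model(k_true) + rng.normal(0.0, sigma, size=t.size)

# Monte Carlo ensemble of candidate parameters from a uniform prior.
k_ens = rng.uniform(0.01, 0.5, size=5000)
resid = np.array([model(k) for k in k_ens]) - obs

# (a) GLUE-style informal measure: Nash-Sutcliffe efficiency, with
#     non-behavioural simulations (NSE <= 0.7) given zero weight.
nse = 1.0 - np.sum(resid**2, axis=1) / np.sum((obs - obs.mean())**2)
w_informal = np.where(nse > 0.7, nse, 0.0)
w_informal /= w_informal.sum()

# (b) Formal Gaussian likelihood assuming independent N(0, sigma^2) errors.
loglik = -0.5 * np.sum((resid / sigma)**2, axis=1)
w_formal = np.exp(loglik - loglik.max())
w_formal /= w_formal.sum()

# Compare the two posterior-like distributions for k.
for name, w in [("informal (NSE)", w_informal), ("formal (Gaussian)", w_formal)]:
    mean = np.sum(w * k_ens)
    sd = np.sqrt(np.sum(w * (k_ens - mean)**2))
    print(f"{name:18s} mean k = {mean:.3f}, sd = {sd:.3f}")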

Obviously, model structural uncertainty cannot be ignored, a fact that is now recognized in modern Bayesian approaches to computer model calibration (Kennedy and O’Hagan, 2001). In Kennedy and O’Hagan’s structure, posterior distributions of model calibration parameters, the parameters of a random function representing model inadequacy and observation error, are computed at the same time. Applying this approach in practice is fraught with difficulties. There is a delicate balance to be struck if the model inadequacy function is to represent the complexity of model structural error and yet also be reasonably identifiable from available data. However, this framework does provide a convenient structure for marshalling prior insights about the sources and nature of errors. For example, there is often reasonable prior knowledge (from instrument tests and ground truth) about observation errors. A structure is required that can accommodate this knowledge and not confound it with other insights that modellers may have about reasonable ranges of calibration parameters and the likely effects of model structural uncertainty. The extension of the Bayesian framework to consider multiple model structures (Draper, 1995) may well be justified where there is some prospect of the considerable effort involved yielding improved predictions.
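For readers unfamiliar with the Kennedy and O’Hagan (2001) structure, a simplified statement of it (in our notation, omitting some of the original’s generality) is

\[
  z_i = \eta(x_i, \theta) + \delta(x_i) + \varepsilon_i,
  \qquad \varepsilon_i \sim N(0, \sigma^2),
\]

where z_i is an observation at input x_i, \eta is the simulator run at the ‘best’ calibration parameters \theta, \delta is the model inadequacy (discrepancy) function, typically given a Gaussian-process prior, and \varepsilon_i is observation error. The posterior over \theta, the parameters of \delta and \sigma^2 is computed jointly, which is why independent prior knowledge of observation error helps prevent \theta and \delta being confounded.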

An appeal to a rejectionist philosophy (Beven, 2006b) is not in our opinion entirely convincing. Surely, a situation in which ‘all models are rejected’ will undermine the credibility of the scientists in the eyes of decision-makers! We embark upon a modelling activity because we believe that our model is in some sense informative about the world. We should of course be open to having our beliefs revised in the light of observations, leading us to propose alternative model structures. Yet, on the other hand, our prior beliefs about our chosen model and how it is informative about reality must at least stand for something if there is to be justification in our being commissioned as experts by a decision-maker in the first place. Apparent lack of predictive accuracy of a model can be reflected in the uncertainty estimate associated with the predictive distribution rather than in a blank refusal to predict at all.

Validation of Uncertainty Estimates

In answering the question ‘Are the uncertainties being overestimated?’ Beven (2006a) catalogues the limitations of available observations, in particular, the absence of repeat experiments in space (catchment to catchment, plot to plot) or time (event to event, calibration period to calibration period). Certainly, we are far from being in the enviable situation of numerical weather prediction, where dense streams of observations have enabled prediction uncertainties to be driven down in recent years. However, the absence of repeat experiments (which, strictly speaking, do not occur in numerical weather prediction either) does not preclude statistical analysis. Statistics requires some regularity in the process being observed, not necessarily repeat experiments. Nor should a scarcity of measurements be used as an excuse for not carrying out validation. Calibration must be followed by validation (i.e. testing), and so must uncertainty estimation. Without validation, calibration is worthless, and so is uncertainty estimation.

For split-sample calibration/validation testing, validation is needed to ensure that it is the trends in the responses that are being fitted in calibration, not the errors, and this requires a proper allocation of data between calibration and validation. Validation is also needed, however, because the data do not contain full information about how the catchment will respond in the future. The same arguments apply to uncertainty estimation. It is not sensible to use all the available data to ‘improve’ the uncertainty modelling. Some should be used to test the uncertainty estimates.
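One simple way of testing uncertainty estimates on held-out data, sketched below in Python as our own minimal illustration rather than a procedure from the cited papers, is to check the empirical coverage of the predictive intervals: if the nominal 90% bounds are well calibrated, roughly 90% of validation-period observations should fall inside them.

import numpy as np

def interval_coverage(obs, lower, upper):
    """Fraction of held-out observations falling inside predictive bounds."""
    obs, lower, upper = map(np.asarray, (obs, lower, upper))
    return ((obs >= lower) & (obs <= upper)).mean()

# Hypothetical validation-period flows and 5%-95% predictive bounds.
obs_valid = np.array([12.1, 8.4, 15.3, 9.9, 22.7, 11.0])
lower_bound = np.array([10.0, 7.0, 11.0, 8.5, 15.0, 9.0])
upper_bound = np.array([16.0, 11.5, 18.0, 12.0, 21.0, 14.0])

cov = interval_coverage(obs_valid, lower_bound, upper_bound)
print(f"Empirical coverage of nominal 90% intervals: {cov:.0%}")
# Coverage well below 90% suggests the uncertainty is underestimated;
# coverage well above suggests the bounds are wider than necessary.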

As Beven argues, useful data are a very scarce commodity. This means that great care must be exercised in decisions about how to use data in the uncertainty estimation process. An alternative paradigm to the data-intensive approach to calibration espoused by Beven and co-researchers is provided by Ewen and Parkin (1996), who promote the use of process-based models for uncertainty estimation without reference to observed output data, and then use the available data to validate the uncertainty estimates. This helps to build some confidence in the model’s capacity to predict at ungauged sites and in the more challenging situations described by Hall and Anderson (2002).

Expertise and Subjectivity

We trust that this discussion has progressed the debate on uncertainty that was called for by Professor Beven in his Commentary. We share Professor Beven’s cautious optimism about the prospects for uncertainty analysis. Failure to improve upon the extent and quality of uncertainty analysis will mean that hydrologists are not living up to the requirements of decision-makers, set out at the outset of this discussion. Nor, as Professor Beven argues, will hydrologists be satisfying the reasonable standards of scientific practice. Those standards not only require honest reporting of uncertainties but also transparency in the methodology for calculating uncertainties, which should be based on rational principles. Yet an appeal to rationality does not mean that we can escape the subjective content of any uncertainty analysis. Scientists are valued by decision-makers because they have expert knowledge that they are willing to express implicitly as model choices or explicitly as (subjective) judgements about the world. Different scientists have different knowledge, so decision-makers should be prepared to tolerate the fact that different scientists provide different uncertainty estimates in situations where relevant data are scarce. This is a fact of life that climate policy makers are having to come to terms with, at a time when every few months sees the publication of a new (and different) pdf for climate sensitivity. Given the complexity of hydrological processes, the scarcity of relevant observations and the timescales of systematic change in hydrological systems, there is little prospect of this subjectivity being eliminated from uncertainty estimates. Nor is it necessarily likely to be the case that uncertainty estimates will be narrowed in future. What we should, however, hope for is that the estimates are based on sound and convergent principles as well as transparent and repeatable methods. Moreover, an increased emphasis on validation of uncertainty estimates should mean that decision-makers can be more confident that the numbers they are using will provide a sound basis for decision-making.


References

Bernardo JM, Smith AFM. 1994. Bayesian Theory. John Wiley and Sons: New York.

Beven K. 2006a. On undermining the science? Hydrological Processes 20: 3141–3146.

Beven K. 2006b. A manifesto for the equifinality thesis. Journal of Hydrology 320: 18–36.

Beven KJ, Binley AM. 1992. The future of distributed models: model calibration and uncertainty prediction. Hydrological Processes 6: 279–298.

Draper D. 1995. Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society B 57: 45–97.

Ewen J, Parkin G. 1996. Validation of catchment models for predicting land-use and climate change impacts: 1. Method. Journal of Hydrology 175: 583–594.

Hall JW. 2003. Handling uncertainty in the hydroinformatic process. Hydroinformatics 5(4): 215–232.

Hall JW, Anderson MG. 2002. Handling uncertainty in extreme or unrepeatable hydrological processes – the need for an alternative paradigm. Hydrological Processes 16(9): 1867–1870.

HM Treasury. 2003. The Green Book: Appraisal and Evaluation in Central Government, Treasury Guidance. The Stationery Office: London.

Kennedy MC, O’Hagan A. 2001. Bayesian calibration of computer models (with discussion). Journal of the Royal Statistical Society Series B 63(3): 425–464.

Mantovan P, Todini E. 2006. Hydrological forecasting uncertainty assessment: incoherence of the GLUE methodology. Journal of Hydrology 330(1–2): 368–381.

Pate-Cornell ME. 1996. Uncertainties in risk analysis: six levels of treatment. Reliability Engineering and System Safety 54: 95–111.
