14
Hierarchical Bayesian modeling of multisite daily rainfall occurrence: Rainy season onset, peak, and end C. H. R. Lima 1 and U. Lall 1 Received 28 September 2008; revised 7 March 2009; accepted 13 May 2009; published 23 July 2009. [1] A quantitative definition of and ability to predict the onset and duration of the dominant rainfall season in a region are important for agricultural and natural resources management. In this paper, a methodology based on an analysis of daily rainfall occurrence is proposed and applied to define the onset and end of the rainfall season in northeast Brazil. Multiple rainfall gauges are considered simultaneously, and a hierarchical Bayesian framework is used for parameter estimation. The proposed model can be used to identify the rainy season onset by finding the first day of year in which the estimated probability of rainfall is greater than a specified number, e.g., 0.5. The determination of the end of the rainy season follows a similar procedure. Thus, the rainfall season can be defined as a period during which rainfall is most likely to occur. Monotonic trends are identified for the maximum probability of daily rainfall across the region, suggesting evidence of climate change. However, only the southern stations in the region exhibit a trend in the corresponding date. An examination of the correlations of dates of maximum rainfall probability with leading climate indicators leads to a promising direction for 1–3 month ahead forecasts of onset, which will be very useful for effective planning. Particularly, El Nin ˜o events in December are found to be associated with delays in the oncoming rainy season onset over a large part of southern northeast Brazil. Negative anomalies in the tropical South Atlantic sea surface temperature between January and March can anticipate the rainy season onset in the northern region and along the eastern coast of northeast Brazil. Citation: Lima, C. H. R., and U. Lall (2009), Hierarchical Bayesian modeling of multisite daily rainfall occurrence: Rainy season onset, peak, and end, Water Resour. Res., 45, W07422, doi:10.1029/2008WR007485. 1. Introduction [2] The understanding of the interannual variability as well as the forecast of the onset and end of the rainy season with some months or weeks in advance is as important as the prediction of rainfall amounts, especially for agriculture and hydroelectricity, and for related areas such as banks and local governments. It is also of scientific interest in under- standing factors that influence the regional hydrologic cycle. In regions with rain-fed agriculture, such as northeast Brazil, the timing of the onset and end of the rainy season becomes even more important than the rainfall distribution or amount, as pointed out by Ingram et al. [2002] and Sultan et al. [2005]. For instance, by knowing the best period to seed a specific crop, farmers can mitigate losses by anticipating or delaying the planting of specific varieties or by selecting another more appropriate crop. A formal and reproducible definition of onset and end of the rainy season emerges as the first challenge in such analysis. A variety of methods have been proposed in recent years, but still there is no agreement among researchers of a method that can be effectively applied everywhere. It turns out that the different mechanisms that drive the rainy season (e.g., monsoon systems versus cold fronts) have led to the development of different models for each specific region. [3] A simple method to define the start of the rainy season is to take the first day of year in which the total daily rainfall is greater than a threshold value, usually subjectively chosen. For instance, Stern [1982] has used a Markov chain model for the total daily rainfall in order to estimate the probability distribution of the rainy season onset. However, in regions where the wet season rainfall is mainly driven by deep convection, an isolated and unusual event of high rainfall may mask the onset of the rainy season. As an alternative method, the pentad, i.e., 5-day average rainfall, associated with a different threshold is used to define the start of the rainy season. Subjective selections of threshold values lead to a lack of robust conclusions unless sensitivity to these choices is investigated. [4] As an alternative to the rainfall pentad, especially in regions with few rainfall observations, such as the oceans, an outgoing longwave radiation (OLR) pentad [e.g., Wu and Wang, 2000], which is indicative of the presence of clouds and consequently the probability of rainfall, has been used. Since the OLR signal is a proxy, a combination of the rainfall and OLR pentads [Marengo et al., 2001] has also been applied. Alternative methods to define the timing of the rainy season have also been based on the accumulated 1 Department of Earth and Environmental Engineering, Columbia University, New York, New York, USA. Copyright 2009 by the American Geophysical Union. 0043-1397/09/2008WR007485$09.00 W07422 WATER RESOURCES RESEARCH, VOL. 45, W07422, doi:10.1029/2008WR007485, 2009 Click Here for Full Articl e 1 of 14

Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

Hierarchical Bayesian modeling of multisite daily rainfall

occurrence: Rainy season onset, peak, and end

C. H. R. Lima1 and U. Lall1

Received 28 September 2008; revised 7 March 2009; accepted 13 May 2009; published 23 July 2009.

[1] A quantitative definition of and ability to predict the onset and duration of thedominant rainfall season in a region are important for agricultural and natural resourcesmanagement. In this paper, a methodology based on an analysis of daily rainfalloccurrence is proposed and applied to define the onset and end of the rainfall season innortheast Brazil. Multiple rainfall gauges are considered simultaneously, and a hierarchicalBayesian framework is used for parameter estimation. The proposed model can beused to identify the rainy season onset by finding the first day of year in which theestimated probability of rainfall is greater than a specified number, e.g., 0.5. Thedetermination of the end of the rainy season follows a similar procedure. Thus, the rainfallseason can be defined as a period during which rainfall is most likely to occur. Monotonictrends are identified for the maximum probability of daily rainfall across the region,suggesting evidence of climate change. However, only the southern stations in the regionexhibit a trend in the corresponding date. An examination of the correlations of dates ofmaximum rainfall probability with leading climate indicators leads to a promisingdirection for 1–3 month ahead forecasts of onset, which will be very useful for effectiveplanning. Particularly, El Nino events in December are found to be associated with delaysin the oncoming rainy season onset over a large part of southern northeast Brazil.Negative anomalies in the tropical South Atlantic sea surface temperature between Januaryand March can anticipate the rainy season onset in the northern region and along theeastern coast of northeast Brazil.

Citation: Lima, C. H. R., and U. Lall (2009), Hierarchical Bayesian modeling of multisite daily rainfall occurrence: Rainy season

onset, peak, and end, Water Resour. Res., 45, W07422, doi:10.1029/2008WR007485.

1. Introduction

[2] The understanding of the interannual variability aswell as the forecast of the onset and end of the rainy seasonwith some months or weeks in advance is as important asthe prediction of rainfall amounts, especially for agricultureand hydroelectricity, and for related areas such as banks andlocal governments. It is also of scientific interest in under-standing factors that influence the regional hydrologiccycle. In regions with rain-fed agriculture, such as northeastBrazil, the timing of the onset and end of the rainy seasonbecomes even more important than the rainfall distributionor amount, as pointed out by Ingram et al. [2002] andSultan et al. [2005]. For instance, by knowing the bestperiod to seed a specific crop, farmers can mitigate losses byanticipating or delaying the planting of specific varieties orby selecting another more appropriate crop. A formal andreproducible definition of onset and end of the rainy seasonemerges as the first challenge in such analysis. A variety ofmethods have been proposed in recent years, but still thereis no agreement among researchers of a method that can beeffectively applied everywhere. It turns out that the different

mechanisms that drive the rainy season (e.g., monsoonsystems versus cold fronts) have led to the developmentof different models for each specific region.[3] A simple method to define the start of the rainy

season is to take the first day of year in which the totaldaily rainfall is greater than a threshold value, usuallysubjectively chosen. For instance, Stern [1982] has used aMarkov chain model for the total daily rainfall in order toestimate the probability distribution of the rainy seasononset. However, in regions where the wet season rainfallis mainly driven by deep convection, an isolated andunusual event of high rainfall may mask the onset ofthe rainy season. As an alternative method, the pentad,i.e., 5-day average rainfall, associated with a differentthreshold is used to define the start of the rainy season.Subjective selections of threshold values lead to a lack ofrobust conclusions unless sensitivity to these choices isinvestigated.[4] As an alternative to the rainfall pentad, especially in

regions with few rainfall observations, such as the oceans,an outgoing longwave radiation (OLR) pentad [e.g., Wu andWang, 2000], which is indicative of the presence of cloudsand consequently the probability of rainfall, has been used.Since the OLR signal is a proxy, a combination of therainfall and OLR pentads [Marengo et al., 2001] has alsobeen applied. Alternative methods to define the timing ofthe rainy season have also been based on the accumulated

1Department of Earth and Environmental Engineering, ColumbiaUniversity, New York, New York, USA.

Copyright 2009 by the American Geophysical Union.0043-1397/09/2008WR007485$09.00

W07422

WATER RESOURCES RESEARCH, VOL. 45, W07422, doi:10.1029/2008WR007485, 2009ClickHere

for

FullArticle

1 of 14

watercenter
Stamp
Page 2: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

rainfall [Liebmann and Marengo, 2001; Grantz et al.,2007], cumulative percentage mean rainfall [Odekunle,2004] and the daily rainfall probability [Garbutt et al.,1981; Odekunle, 2004]. Examples of climate variables usedto define the onset of the rainy season include the shift ofthe ITCZ [Sultan et al., 2005], vertically integrated moisturetransport [Fasullo and Webster, 2003], global daily normal-ized precipitable water [Zeng and Lu, 2004] and themeridional gradient of tropospheric temperature [Goswamiand Xavier, 2005]. Of these, some have investigated climateteleconnections associated with delays or advances in theonset of the rainy season by looking at extreme climateevents and their effects on the rainy season onset or bydoing composite analysis. No one has attempted to developan operational forecast model for the onset and end of therainy season.[5] In this paper we propose a model to identify the onset

and end of the rainy season based on the seasonally varyingrainfall occurrence as done in a similar context (on the basisof the probability of rain instead) by some authors [e.g.,Garbutt et al., 1981; Odekunle, 2004], but here we considerdata from multiple rainfall sites and we allow the single sitemodel parameters to vary across years; that is, we considermodel parameters that are nonstationary in time and space.The parameter estimates are made under a hierarchicalBayesian framework. This allows a formal uncertaintyanalysis for model structure and parameters, including theinterpretation of differences across sites and the relationshipof the estimated parameters and exogenous climate varia-bles. Examples of Bayesian models in hydrology andclimate include the modeling of monthly maximum tem-perature [Wikle et al., 1998], rainfall [Gaudard et al., 1999],surface winds [Berliner et al., 1999; Wikle et al., 2001], air-sea interaction [Berliner et al., 2003], a variety of hydro-logic applications [Parent et al., 1998], rainfall extremes[Coles and Tawn, 1996] and amounts [Sanso and Guenni,1999] and hydrologic models [Thiemann et al., 2001; Frostet al., 2007]. Wikle et al. [1998] pointed out that Bayesianinference is particularly useful when the model to beconstructed involves the estimation of many parametersand the data are nonstationary in space and time, which isthe case for most rainfall data. We apply our model toevaluate the daily rainfall occurrence at 504 rainfall siteslocated in northeast Brazil. Annually varying parametersallow us to assess temporal trends and climate teleconnec-tions associated with delays and advances in the rainyseason. This analysis sets up a potential capacity forforecasting the temporal aspects of the rainy season.[6] Background information on northeast Brazil, its asso-

ciated climate regime and the source of data are presentednext. Section 3 describes the hierarchical Bayesian modelproposed to identify the onset of the rainy season. Theresults obtained are provided in sections 4 and 5, followedby a discussion and summary.

2. Northeast Brazil and Rainfall Data

[7] The northeast region of Brazil, hereafter referred to asNordeste, is characterized by high climate variability, rain-fed and irrigated agriculture and medium to high levels ofpoverty, particularly concentrated in the central region oftencalled the dry polygon. Floods and severe droughts alternateacross years in the Nordeste. Hastenrath [1994] points out

that the Nordeste is one of the regions of the world with thehighest social and economical impacts due to severedroughts (such as those of 1877, 1915, 1958, 1983, 1993and 1997). On the other hand, the tropical location ofNordeste and the influence of the Atlantic and Pacific seasurface temperature (SST) on rainfall lead to some of thehighest skills for seasonal climate predictions. Motivated bythis high rainfall predictability and by the importance ofpredictions for the economical and social developments ofthe region, many researchers and institutions have proposeddynamical and statistical forecasting models for the rainfallover Nordeste [e.g., Hastenrath, 1990; Folland et al., 2001;Moura and Hastenrath, 2004; Sun et al., 2006].[8] The rainfall season of Nordeste shows a spatial

variability that is associated with different climate mecha-nisms. The main rainy season occurs between March andApril [Strang, 1972] and is most associated with thedisplacement of the Intertropical Convergence Zone (ITCZ)[e.g., Moura and Shukla, 1981; Hastenrath, 1994]. Severalstudies have demonstrated that the variation of the inter-hemispheric SST gradient has a significant impact on theposition and intensity of the ITCZ, which in turn influencesthe climate of Nordeste and also over Northwest Africa.This tropical Atlantic gradient is usually defined as thedifference between North (5�N–25�N) and South (5�S–25�S) Atlantic SSTs. The main rainfall period in Nordestecoincides with negative values of this gradient (TropicalSouth Atlantic (SAI) warmer than Tropical North Atlantic(NAI)), which is associated to a displacement of the ITCZ toits southern most position. The end of the Nordeste mainrainfall season is marked by the return of the ITCZ to itslocation in the North hemisphere [Uvo et al., 1998]. On theother hand, an anomalous positive gradient (NAI warmerthan SAI) forces anomalous fields of meridional windcomponent and an anomalously far northerly ITCZ location,which reduces Nordeste rainfall. Moura and Shukla [1981]suggested also that the interhemispheric SST gradient withwarm waters to the North and cold waters to the Southcorresponds to a Hadley cell circulation in the atmospherewith descending branch over Nordeste.[9] In southern Nordeste the main rainy season is

between December and February [Strang, 1972; Chu,1983; Hastenrath, 1994]. It is modulated by cold frontsystems from the Southern hemisphere that enter into theregion during the year [Kousky, 1979; Hastenrath, 1994].Along the eastern coast of Nordeste the principal rainfallseason is from April to July. It is associated with southeast-erly winds that propagates perpendicular to the shoreline[Rao et al., 1993] and the land-sea breeze system [Kousky,1980].[10] ENSO events also play a big role on the spatiotem-

poral variability of Nordeste rainfall, with negative rainfallanomalies in northern Nordeste during extreme El Ninoevents. The drier than normal conditions are in part due tothe enhancement of atmospheric subsidence over Nordestecaused by an eastward displacement of the Walker cellduring El Nino years [Ropelewski and Halpert, 1987; Uvoet al., 1998; Giannini et al., 2001, 2004; Kayano andAndreoli, 2006], with its ascending branch along the eastcoast of the equatorial Pacific, dominated by anomalouswarm SST, and the descending branch over tropical Atlantic.Opposite patterns in the general circulation and SST are

2 of 14

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING W07422

Page 3: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

associated with positive rainfall anomalies during La Ninayears [Kayano and Andreoli, 2006]. Conversely, the southernNordeste experiences opposite effects of ENSO. Theenhancement of rainfall over southern Nordeste duringEl Nino events often occurs during the rainy season(December–May) of the year subsequent to the El Ninoyear. One of the potential causes of such an increase inrainfall is associated with stronger subtropical westerlies inEl Nino years, which in turn favors the intensification ofmesoscale convection centers in South Brazil [Ropelewskiand Halpert, 1987; Diaz et al., 1998; Grimm et al., 1998,2000; Coelho et al., 2002; Grimm, 2004; Cardoso and Dias,2006]. Anomalous circulations southeast of Brazil andsouthwest of South America that favor baroclinic instabil-ities and changes in rainfall patterns over southern Brazilhave also been linked to ENSO events [Grimm et al., 1998,2000]. Although these studies depict a general pattern andassociated climate teleconnections with the Nordeste rainyseason, they did not analyze or forecast delays in the onsetof the rainy season as well as the interannual variability ofthe onset on daily and weekly scales.[11] The set of daily rainfall data used in our investiga-

tions is composed of 504 stations, as shown in Figure 1, andcovers the period from 1940 to 2000. Most stations have afull period of record from 1960 to 1980. The averagenumber of available years for each station is 45, with afew stations having less than 30 years of data. None haveless than 20. We consider 1 year as complete if there aredata available on at least 200 days. For the seasonalityanalysis, we first obtain the rainfall occurrence time seriesof all sites by classifying the rainfall amounts data into wet(y = 1) and dry (y = 0) days. We define a wet state when therainfall amount exceeds 0.254 mm and a dry state other-wise. This threshold value is commonly used [e.g., Katz,1977] to censor low rainfall values at the measurement

precision and our model results were not sensitive to modestchanges in this value. The rainfall data was obtainedthrough the NOAA NCDC Global Daily ClimatologyNetwork and can be obtained from the International ResearchInstitute for Climate and Society at http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.

3. Methodology

3.1. Main Model

[12] Our primary assumption is that rainfall occurrencecan be used to characterize the rainfall season onset,duration and end, in a manner similar to that of Garbuttet al. [1981] and Odekunle [2004]. This assumption isreasonable for Nordeste, where the rainfall season is pre-dominantly in the first half of the year, while there is little orno rainfall occurrence in the second half. We note that thedefinition we propose can easily address seasonality asso-ciated with different target amounts by varying the thresholdused to define wet or dry states. If extreme rainfall eventswere of primary interest, e.g., for defining a flood season,then a higher rainfall threshold could be used to classify thedata into wet and dry days.[13] The daily rainfall occurrence time series are binary

data (1 = rain, 0 = dry) with a marked seasonal cycle,usually unimodal or bimodal. Let yst(n) be a Bernoullirandom variable of rainfall station s, taking the value 1when there is rainfall on the nth day of year t and 0otherwise. A natural candidate model for this randomvariable is the logistic applied to a Fourier series, represent-ing the probability of rainfall as a function of seasonalcomponents. This is similar to the models derived for thedaily probability of rainfall [Coe and Stern, 1982; Garbuttet al., 1981]:

yst nð Þ � Bin 1; pst nð Þð Þ; ð1Þ

where

pst nð Þ ¼ logit�1 ast þXk

bstk sin wknð Þ þ cstk cos wknð Þ !

ð2Þ

is the probability of rain at station s for the nth day of year t,wk is the angular frequency of the kth Fourier harmonic andthe inverse of the logit function is defined as

logit�1 xð Þ ¼ exp xð Þ1þ exp xð Þ : ð3Þ

This guarantees that pst(n) lies between 0 and 1.[14] Coe and Stern [1982] suggested for daily rainfall

models that the number of Fourier harmonics to be kept in(1) is usually between two and three [Coe and Stern, 1982].The power spectrum densities (not shown here) of therainfall occurrence time series of most rainfall stationsin Nordeste show a significant peak at a frequency of�0.0027 d�1, which corresponds to the annual cycle. Noother frequency appears to be significant. Therefore, forthe sake of simplicity and in order to reduce the number

Figure 1. Location of the 504 rainfall stations in northeastBrazil that are used in this work. The three black circleslabeled as 1 (northern), 2 (eastern), and 3 (southern) arerepresentative stations of the three regimes of rainfall seasonin northeast Brazil. Longitude and latitude are shown alongthe x and y axes, respectively.

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING

3 of 14

W07422

Page 4: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

of parameters in the model, we consider only the firstFourier harmonic to represent the annual cycle:

yst nð Þ � Bin 1; logit�1 ast þ bst sin wnð Þ þ cst cos wnð Þð Þ� �

; ð4Þ

where the angular frequency w of an annual cycle is givenby w = 2p

365= 0.0172 rad d�1. Implicitly, we are assuming

that, given the Fourier parameters, the random variablesyst(n) in (4) are conditionally independent in time (day-to-day and year-to-year dependence) and space.[15] Rainfall lag correlations across years are negligible

and hence are not accounted in (4). A Markov chain model[e.g., Coe and Stern, 1982] could be considered for the dailyrainfall process. However, in the work presented here asimpler model is used.

3.2. Parameter Estimates and Hierarchical BayesianModeling

[16] In general, enough data is available to estimate thetime varying Fourier parameters in (4). At least 200 days areavailable to estimate only three parameters, therefore max-imum likelihood estimates (MLE) using standard statistical

software would potentially lead to robust estimates. How-ever, one faces two additional problems: the distribution ofthe available data along the year and data quality. Forinstance, Figure 2 displays the maximum likelihood esti-mates of (4) for station 3 (see location in Figure 1). Clearlythe intercept parameter ast when t = 1959 is most likely anoutlier, also suggested by a simple c2 outlier test. Figure 2dshows many missing days in the beginning of the hydrolog-ical year 1959, which results in a larger than usual confidenceinterval for the Fourier parameters (Figures 2a–2c) andmodel outputs (thick line in Figure 2d) that significantlydiffer from the output (thin line in Figure 2d) obtained usingthe time averaged (across years) Fourier parameters, whichis indicative of the average probability of rainfall. Inaddition to this, some stations (not shown here) show veryunusual distribution of rainfall occurrence (e.g., a singlerainy day in 1 year) along the year, most probably due todata quality, which in turn significantly increases parameteruncertainties and model analysis. One approach to system-atically assess and reduce parameter uncertainties in thiscontext is to estimate the parameters under a hierarchicalBayesian framework [Gelman et al., 2004], where more

Figure 2. (a, b, c): Maximum likelihood estimates (MLE) of the Fourier harmonics in (4) for station 3(see location in Figure 1). The shaded region shows the 95% confidence interval for the estimates.(d) Fitted curve using the MLE for year 1959 (thick solid line) and the time-averaged (across years) MLE(thin solid line) and Bayesian (dashed line) point estimates (based on the expected value of the posteriordistribution of parameters) for year 1959. The dots show observations (1, rain; 0, dry).

4 of 14

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING W07422

Page 5: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

reliable model outputs are obtained (e.g., dashed line inFigure 2d).[17] In the first stage of the hierarchical Bayesian model,

we assume that, for each station s, the 3 Fourier coefficientsast, bst and cst are sampled from a Gaussian process whichhas the mean value for these coefficients, As, Bs, Cs,respectively, with a corresponding 3 � 3 covariance matrixSs. This way we account for the temporal variability of thecoefficients at each station. Mathematically, we can repre-sent this through a prior distribution:

astbstcst

0@

1A � N

As

Bs

Cs

;Ss

0@

1A: ð5Þ

The large interannual variability of Nordeste rainfall and theobservation that rainfall is essentially uncorrelated acrossyears motivated a simpler model where correlation of ast, bstand cst across years is neglected.[18] In the second stage of the hierarchical model we

define priors for As, Bs and Cs, which represent temporalaverages of the coefficients defined in (4). Note that onlyyears in which there is data available are used to estimatethese hyperparameters. By defining prior distributions foreach of those parameters, we aggregate more informationand constraint the estimates, reducing their variance [Gelmanand Hill, 2007]. Our prior information here is the existenceof a spatial structure in seasonality and hence among thecoefficients. We adopt an approach similar to that of Wikleet al. [1998], where the coefficients are modeled as simplelinear spatial trends:

As � N a0 þ a1 longs þ a2 lats;s2A

� �; ð6Þ

Bs � N b0 þ b1 longs þ b2 lats;s2B

� �; ð7Þ

Cs � N g0 þ g1 longs þ g2 lats;s2C

� �; ð8Þ

where longs and lats refer to the longitude and latitude ofrain site s, respectively. A further justification of the use oflinear spatial trends is presented in section 3.2.2.[19] In a more comprehensive approach, one could con-

sider a full spatial model, such as the vector regressionmodel used by Wikle et al. [1998], which would explicitlyaccount for the spatial dependence across rainfall sites.However, it would require the estimation of a variogramand its associated parameters (see Cressie [1993] for somedefinitions) for each hyperparameter and year in the record.As we discuss in section 3.2.2, the residuals from the linearspatial trend models (6), (7) and (8) show no spatialcorrelation for distances greater than �500 km, which isthe spatial scale of most interest in this work. Therefore, wechose to work with a simplified version of this model,which implicitly accounts for the spatial dependence of therain probability across sites through the Fourier parameters.

[20] For the covariance matrix of (5) we propose amultivariate Jeffreys prior density [Gelman et al., 2004],since the number of dimensions d = 3 is relatively small:

p Ssð Þ / Ssj j� dþ1ð Þ=2: ð9Þ

In the last stage of the hierarchical model, since we do nothave any prior information of the regression coefficients in(6), (7) and (8), we specify independent uniform priors:

p a0;a1;a2;b0;b1;b2; g0; g1; g2ð Þ / 1: ð10Þ

For the scale parameters in (6), (7) and (8), since there isenough data available (504 data points correspondent to theFourier coefficients of the 504 sites) to estimate them, wecan just assume independent uniform priors as suggested byGelman [2005]:

p sA;sB;sCð Þ / 1: ð11Þ

3.2.1. Parameter Estimation[21] The estimation of the parameters described in equa-

tions (4), (5), (6), (7) and (8) is done using simulationmethods. From (4) we can write the likelihood function ofthe binary data as

p yjqð Þ ¼YSs¼1

YTst¼1

Y366n¼1

Binðyst nð Þj1;

logit�1 ast þ bst sin wnð Þ þ cst cos wnð Þð Þ; ð12Þ

where q refers to all parameters being estimated, S is thetotal number of rainfall stations and Ts is the total number ofyears of station s.[22] The Bayes’ rule gives the posterior density for q:

p qjyð Þ ¼ p q; yð Þp yð Þ / p yjqð Þ � p qð Þ; ð13Þ

where p(q) is referred as the prior distribution of theparameters.[23] Substituting the prior of the parameters as defined in

(5) to (11), and the likelihood function (12) into (13), yieldsthe joint posterior distribution

p qjyð Þ /YS

s¼1

YTs

t¼1

Y366

n¼1Binðyst nð Þj1;

logit�1 ast þ bst � sin wnð Þ þ cst � cos wsð Þð Þ

� Nast

bst

cst

0B@

�������As

Bs

Cs

;Ss

1CA � N Asja0 þ a1 longs þ a2 lats;s2

A

� �

� N Bsjb0 þ b1 longs þ b2 lats;s2B

� �� N Csjg0 þ g1 longs þ g2 lats;s2

C

� �� Ssj j�2: ð14Þ

[24] Equation (14) involves the estimation of severalparameters with nonconjugate prior distributions. The inte-gral over all parameters cannot be directly solved. In thiscase, we have to turn to other methods. We adopt here thewidely used Markov chain Monte Carlo (MCMC) methodto draw values of q from its posterior distribution (14).

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING

5 of 14

W07422

Page 6: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

In particular, we combine the Gibbs sampler and theMetropolis algorithm for simulating from the posteriordistribution [Gelman et al., 2004]. We apply the Gibbssampler for conditionally conjugate parameters (i.e., whenprior and posterior are in the same family of distributions)and when there is a closed form for the conditional posteriordistribution, so that it is possible to draw samples directlyfrom it. When the model is not conditionally conjugate andthere is no closed form for the conditional posterior, wedraw posterior samples using the Metropolis algorithm.Details about the conditional posterior distributions areshown in the appendix. A MCMC simulation was run withfive chains to verify the convergence of the results (or themixing) on the basis of the methodology suggested byGelman et al. [2004].3.2.2. Spatial Variability of Fourier Coefficients[25] An important step in building the hierarchical

Bayesian model is to identify the spatial structure amongthe coefficients defined in (6), (7) and (8) in order to defineprior distributions. We use here the concept of samplevariogram [Cressie, 1993; Pebesma, 2004] to evaluate thespatial structure and check whether or not the linear trend

suggested in (6), (7) and (8) is reasonable to model thespatial pattern across the Fourier parameters.[26] Figure 3 shows the sample variogram of the time

averaged coefficients corresponding to As, Bs and Cs

assuming isotropy. A linear trend as well as a quadratictrend are fit on the latitude and longitude coordinates andthe variogram of the residuals from this fit is analyzed. Forall coefficients, a great amount of the semivariance isreduced with the addition of a linear trend. A quadratictrend does not provide any further reduction in the semi-variance. The residuals of the linear trend fit still show aspatial dependence structure related to smaller-scale varia-tions that may require further modeling. A model thataddresses this residual structure would lead to a change inthe estimates of uncertainty of the parameters defined in (6),(7) and (8). However, the complexity added to the modelwith one more stage of spatial variance modeling may notjustify the increment of uncertainty reduction gained. More-over, the linear trend residuals displayed in Figure 3 showthat most of the residual correlation is limited to less than500 km and in the current paper we are more interested inanalyzing temporal trends and climate teleconnections over

Figure 3. (left) Sample variograms for the (a) As, (b) Bs, and (c) Cs coefficients and (middle) residualsof a linear spatial trend and (right) residuals of a quadratic spatial trend fitted on the As, Bs, and Cs

coefficients.

6 of 14

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING W07422

Page 7: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

relatively larger areas. Therefore, a simple linear trend wasused to model the main spatial structure of those parameters.

4. Features of the Rainfall Season Cycle

[27] In order to illustrate how the rainfall occurrencemodel can be used to draw inferences about the rainfallseason cycle, we select three benchmark stations (Figure 1)representative of the three different regimes of rainfallseasonality in Nordeste and evaluate the models outputsfor those stations using the Bayesian estimates.[28] On the basis of an areal average of 67 stations in

North Nordeste over the February–May period [Secretariade Infraestrutua Hıdrica, 2000], we select 1968, 1972 and1980 as benchmark years of wet, dry and normal conditions,respectively. As in some regions of Nordeste the rainfallseason starts in November, we chose to work with ahydrological year instead of the usual calendar year. Thestart of the hydrological year is taken to be the 274th day ofyear (1 October for non–leap years) and the end on the

273rd day of the following calendar year. The average andthe 95% interval of the rainfall probability for the selectedstations and those years are presented in Figure 4. Shifts inthe onset, peak and end of the rainy season can be easilyobserved in Figure 4. This methodology also allows us toidentify an anomalous wet season, which is usually markedby an increase in the number of rainy days and consequentlyin the maximum probability of rainfall. The converseapplies to the identification of dry years. For instance,Figure 4 shows an early peak in the rainy season of thestation 1 in 1980 as well as a low rainfall total for thisperiod. The wet season in 1968 is marked by high rainfallprobabilities. The late peak during 1972 is also illustrated.Similarly, Figure 4 depicts a dry 1968 and an earlier rainfallpeak in 1980 for station 2.[29] The average and the 95% interval of the rainfall

mean probability obtained from the Bayesian estimates ofAs, Bs and Cs for the three selected stations is shown inFigure 5. The proposed model fits well the daily probabilityon the basis of the observed frequency of rain days. The

Figure 4. Daily probability of rainfall for the three selected stations during 1968, 1972, and 1980obtained from the Bayesian estimates of the time-varying Fourier coefficients. The red line is the fittedmodel. The black dots indicate occurrence (at 1) or absence (at 0) of rainfall. The horizontal dashed lineshows the 0.5 probability of rainfall, while the vertical dashed line presents the day of year of in whichthe maximum probability of rainfall is achieved. The blue lines show the 95% Bayesian interval of theaverage probability of rainfall.

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING

7 of 14

W07422

Page 8: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

daily probability model for the northern and eastern stationsshows values that are under 0.5. The maximum of thosestations occurs around March and May, respectively. Thesouthern station presents the rainy season between Novemberand February. All these results agree with the literature, asdetailed in section 2.[30] Another more intuitive way to look at the spatial

distribution of As, Bs and Cs is to analyze the maximumprobability of rainfall Pmax and the day of year Jmax inwhich it occurs. We can easily derive the expression forPmax for each rainfall station s:

Pmaxs ¼ logit�1 As þ

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiB2s þ C2

s

q� �: ð15Þ

[31] By solving the equation

Js ¼ arctanBs

Cs

� �� 3652p

¼ arctanBs

Cs

� �� 1w; ð16Þ

we find J smax:

Jmaxs ¼

J ; Bs

Cs> 0;Bs > 0

J þ 3652; Bs

Cs> 0;Bs � 0

J þ 3652; Bs

Cs� 0;Bs > 0

J þ 365; Bs

Cs� 0;Bs � 0:

ð17Þ

[32] Note that Psmax and J s

max are random variablesobtained for each sample of the posterior distribution ofAs, Bs and Cs. Figure 6a illustrates the spatial distribution ofthe expected value of Ps

max, i.e., the Bayesian point estimate

Figure 5. Historical frequency of rainfall days (black line) for the three selected stations and associatedfitted model (red line) with Bayesian estimates of Fourier coefficients. The horizontal dashed line showsthe 0.5 probability of rainfall, while the vertical dashed line presents the day of year in which themaximum probability of rainfall is achieved. The blue lines shows the 95% Bayesian interval ofthe average probability of rainfall. The rainy season onset can be defined as occurring when p > p*,where p is the daily probability of rainfall and p* can be any choice that the user makes.

8 of 14

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING W07422

Page 9: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

of Psmax. The maximum probability of rainfall is very low in

the central region of Nordeste. This is consistent with theliterature, which refers to this area as the dry polygon. Theprobability reaches the highest values along the coast andtoward southern Nordeste.[33] The day of year of maximum probability of rain

depicted in Figure 6b is also in agreement with previousfindings. In southern Nordeste the rainfall peak is aroundJanuary [Strang, 1972], then the displacement of the ITCZbrings maximum rainfall to northern Nordeste around April[Moura and Shukla, 1981; Hastenrath, 1994] and finally thepropagation of southeasterly winds perpendicular to theshoreline [Rao et al., 1993] is associated with the rainfallalong the Nordeste coast in June.

5. Temporal Trends and Climate Teleconnections

[34] We assess trends in the time series of the maximumprobability of rainfall ps,t

max,

pmaxs;t ¼ logit�1 as;t þ

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffib2s;t þ c2s;t

q� �; ð18Þ

and in the day of year js,tmax in which it takes place by fitting

a linear curve on those time series. Note again that ps,tmax and

js,tmax are random variables and posterior estimates of themwere obtained for each posterior sample of as,t, bs,t and cs,t.The expected value of ps,t

max and js,tmax are used as point

estimation values in a weighted linear fit on the timevariable, where the uncertainty is accounted by usingweights defined as the inverse of the variance of ps,t

max andjs,tmax.[35] The spatial distribution of the regression slope of

ps,tmax (not shown here) suggests a positive trend in the

maximum probability of rainfall over northern Nordesteand a decreasing trend in some rainfall stations along theeast coast. In order to be field significant at the 95%confidence level, the spatial map should present at least21 locations (out of 504) with significant slopes, accordingto the method proposed by Livezey and Chen [1983]. Themap of ps,t

max exhibits 202 significant slopes, which is a largenumber and strongly indicates a field significant trend. Thespatial distribution of the regression slope of js,t

max (notshown) has a spatial pattern in eastern Nordeste, with atendency to later days of maximum probability of rainfall.The map has 98 stations with statistically significant slopes.[36] Another way to look at temporal trends is by

assessing the spatial average of variables that share similarrainfall patterns. We apply the k means technique [Hastie etal., 2001] to define spatial clusters on the basis of the day ofyear of maximum probability of rainfall J s

max (Figure 6). Weset k = 3, given the three rainfall regimes of Nordestereported in the literature. The clusters identified are shownin Figure 7. Note that they correspond to the spatial patternsof rainfall over the northern, eastern and southern regions ofNordeste. Figure 8 shows the spatial average of the maxi-mum probability of rainfall ps,t

max and day of year in each it isachieved ( js,t

max) across the rainfall stations within eachcluster. Monotonic trends in the maximum probability ofrainfall can be observed in all three curves. Evidences of

Figure 6. (a) Spatial distribution of maximum probabilityof daily rainfall Ps

max obtained from Bayesian pointestimates and (b) spatial distribution of day of year J s

max

in which Psmax is achieved.

Figure 7. Clustering based on the Bayesian estimates ofthe average day of year of maximum probability of rainfall.

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING

9 of 14

W07422

Page 10: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

monotonic trends in the day of year of maximum probabil-ity of rainfall is observed only for the southern stations.[37] Previous work indicated that the Pacific and the

Atlantic sea surface temperature (SST) play a big role onthe Nordeste Rainfall. We analyze climate teleconnectionsby examining temporal correlations between the day ofmaximum probability of rainfall, js,t

max, and four indices ofSST: NINO3, representative of El Nino events, the NorthAtlantic index (NAI), the South Atlantic index (SAI) and

the difference NAI minus SAI, which is indicative of themeridional position of the Atlantic intertropical Conver-gence Zone (ITCZ). We use the extended SST data [Kaplanet al., 1998; Reynolds and Smith, 1994] provided by TheInternational Research Institute for Climate Prediction (IRI),through http://iridl.ldeo.columbia.edu/SOURCES/.Indices/.nino/.EXTENDED/.NINO3/. Climate teleconnections asso-ciated with the interannual variability of the maximumprobability of rainfall are not evaluated given the extensive

Figure 8. (top) Time series of maximum probability of rainfall for the three clusters depicted in Figure 7.(bottom) Time series of day of year of maximum probability of rainfall for the same clusters. Shown fromleft to right are northern, eastern, and southern clusters. A linear curve is fit on the data.

10 of 14

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING W07422

Page 11: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

literature about remote climate influences on Nordesterainfall [e.g., Moura and Shukla, 1981; Rao et al., 1993;Uvo et al., 1998; Giannini et al., 2001; Kayano andAndreoli, 2006].[38] The lagged correlations of northern stations js,t

max andthe four climate indices (Figure 9) suggest that NAI minusSAI has a meaningful influence on js,t

max in the months thatprecede the main rainfall season. Such association isexpected because of the well known role of NAI minusSAI on Nordeste rainfall. The scatterplot in Figure 9illustrates a variability of about 2 months in js,t

max, whereearlier onsets of the rainfall season are associated withpositive NAI minus SAI anomalies in January. Significanteffects of SAI are also observed in March, when the averageprobability of rainfall in northern Nordeste achieves itsmaximum value, as showed in Figure 6b. Therefore, fore-casts of js,t

max could be issued in January using NAI minusSAI as predictor and updated in March on the basis of theSAI anomalies. The influence of NINO3 is not welldetermined. An analysis (not shown here) using weightedcorrelation coefficients defined as the inverse of the vari-ance from the Bayesian simulations shows similar results.[39] The js,t

max of the eastern cluster of stations is signif-icantly affected by SAI in March (figure not shown here).The influence of the tropical south Atlantic SST on easternNordeste rainfall have already been reported in the literature[Kousky, 1980]. Positive anomalies are associated with laterrainfall season onsets. The lag time between the peak in thecorrelation (March) and the timing of maximum probabilityof rainfall (June) suggests the possibility of long lead time

forecasts for the onset of the rainfall season in easternNordeste.[40] In southern Nordeste the NINO3 index turns out to

be the best climate predictor for the onset of the rainfallseason (figure not shown here). Positive anomalies ofNINO3 in December are associated with advances in thepeak of rainfall. La Nina years yield to earlier rainy seasononsets but with higher variability. Correlations with Atlanticindexes are not statistically significant before the timing ofthe rainy season.[41] Figure 10 further illustrates the spatial pattern of the

correlations of js,tmax and the potential climate predictors

discussed above and suggested in the scatterplot ofFigure 9. The positive correlation of NINO3 and the rainfallseason onset in southern Nordeste appears clearly. Thetropical Atlantic SST interhemispheric gradient (NAI minusSAI) has a significant influence on the rainy season onset innorthern Nordeste. SAI is linked with the rainfall onsetalong the east coast and over northern Nordeste.

6. Summary and Discussion

[42] In the present paper we analyzed daily rainfalloccurrence in northeast Brazil. A hierarchical Bayesianmodel that uses common information to constrain parameterestimates was developed. The three main regimes of rainfallin Nordeste usually reported in the literature were correctlyidentified on a daily time scale with an estimate of theassociated uncertainties.

Figure 9. (a) Correlation of climate indexes and day of year of maximum probability of rainfall for thenorthern cluster of stations, as shown in Figure 7. The vertical solid line indicates the month of theaverage maximum probability of rainfall, as shown in Figure 5. Correlations above (below) the top(bottom) dashed line are statistically significant at the 95% confidence level. (b) Scatterplot of a potentialclimate predictor and the day of year of maximum probability of rainfall. The black line shows a linearcurve fitted on the data.

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING

11 of 14

W07422

Page 12: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

[43] The estimates of daily rainfall probability can beused to define attributes of rainfall seasonal timing. Forinstance, in southern Nordeste, we could define the onset ofthe rainy season when the probability of rainfall crosses 0.5and the end when it crosses it again and becomes lowerthan 0.5. Looking at Figure 5, for example, we coulddeclare the rainfall onset of station 3 on the 326th day ofyear (22 November for non–leap years) and the end on the54th day (23 February) of the following calendar year, witha total duration of 95 days. The variability around theseestimates would be ±7 days according to the uncertainty ofthe coefficients. On the other hand, in regions where thehistorical rainfall probability is usually lower than 0.5, wecould outline the onset and end of the rainy season on thebasis of a specific rainfall season duration defined a priori.For instance, let us assume that a specific rain-fed croplocated close to station 2 needs water for 30 days. On thebasis of the day of maximum probability of rainfall, asillustrated in Figure 5, we would say that the best period forplanting is between the 142nd day of year (22 May) and the171st day (20 June). In this case, the uncertainty of thisperiod is only ±1 day.[44] Field significance of trends in the maximum proba-

bility of daily rainfall showed a positive trend for northernNordeste. We note that this trend is likely to be associatedwith the recent (after the 1980s) positive trend in the SAI(not shown here) and more southward position of the ITCZas reported by some authors [e.g., Rao et al., 2006].Suggestive trends of earlier onsets in the rainfall seasonover eastern and southern Nordeste were observed but theassociated correlations are weak. An examination of whetherthese trends may be related to natural or anthropogenicfactors is a subject for climate model–based investigations.[45] Correlation maps identified three basic spatial pat-

terns of climate teleconnections. Positive anomalies in thetropical Atlantic SST gradient (NAI minus SAI) in Januaryprior to the rainfall season tend to anticipate its onset over alarge area of northern Nordeste, where the reduced rainfallis due to the more northern ITCZ meridional position duringpositive anomaly events. Later rainy season onsets overeastern and northern Nordeste are also highly associatedwith positive anomalies in the tropical South Atlantic SSTin March. El Nino events lead to later rainy season onsets ina large area of southern Nordeste. Anomalies of rainfall inthis region are associated with an intensification of thesubtropical westerly jets and consequently the increase ofmesoscale activities, as indicated in section 2.[46] The high correlations found suggest that the onset of

the rainy season in Nordeste could be potentially forecastone to four months ahead of time with reasonable accuracy.Future studies will focus on the development of a real timeforecast model for the onset and end of the Nordeste rainyseason with the inclusion of climate indexes as priordistributions for the model parameters. We also plan toexpand this work for other regions of Brazil and includeother climate variables such as pressure fields and thesubtropical Atlantic SST. The model structure will bereviewed to consider a Markov chain or other model forpersistence in daily rainfall and to more extensively con-sider the spatial correlation structure in the parameterestimates. The scale of the region that is best to model

Figure 10. Correlation map of day of year of maximumprobability of rainfall and (top) previous December NINO3(Dec(�1)), (middle) current January NAI minus SAI (Jan(0)),and (bottom) current March SAI (Mar(0)). Solid circles arestatistically significant at the 95% confidence level.

12 of 14

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING W07422

Page 13: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

and/or the selection of relatively homogeneous regions foranalysis will be part of these extensions to the current work.

Appendix A: Bayesian Model

[47] From the joint posterior distribution (14), the condi-tional posterior distribution of the intercept parameter in (4)can be written as

p astj�ð Þ /Y365

n¼1Binðyst nð Þj1;

logit�1 ast þ bst � sin wnð Þ þ cst � cos wnð Þð Þ

� Nast

bst

cst

0B@

�������As

Bs

Cs

;Ss

1CA; ðA1Þ

where p(astj�) indicates the posterior distribution of astconditional on the data and remaining parameters.[48] The conditional posterior distribution (A1) is not

conjugate and it has no closed form; that is, we cannotsample directly from it. Therefore, we use the Metropolisalgorithm inside a Gibbs sampler to estimate the posteriordistribution of ast. Details of the algorithm can be found inwork by Gelman et al. [2004]. Note that p(bstj�) and p(cstj�)have similar conditional posterior distributions. The condi-tional posterior of the first set of hyperparameters (5) can bewritten as

p As;Bs;Csj�ð Þ /YTnt¼1

N

ast

bst

cst

0B@

�������As

Bs

Cs

;Ss

1CA

� NAs

Bs

Cs

0B@

�������a0 þ a1longs þ a2 lats

b0 þ b1longs þ b2 lats ;Ss

g0 þ g1 longs þ g2 lats

1CA; ðA2Þ

where Ss is the diagonal matrix:

Ss ¼s2A 0 0

0 s2B 0

0 0 s2C

0@

1A: ðA3Þ

[49] Completing the quadratic of (A2) [Gelman et al.,2004] yields the following joint conditional posterior dis-tribution of As, Bs, Cs:

p As;Bs;Csj�ð Þ / exp � 1

2Ps � msð ÞTL�1

s Ps � msð Þ� �

¼ N Psjms;Lsð Þ; ðA4Þ

where

ms ¼ S�1s þ TsS�1

s

� ��1S�1

s m0 þ TsS�1s m1

� �;

L�1s ¼ S�1

s þ TsS�1s ðA5Þ

Ps ¼As

Bs

Cs

0B@

1CA ; m0 ¼

a0 þ a1longs þ a2 lats

b0 þ b1longs þ b2 lats

g0 þ g1 longs þ g2 lats

0B@

1CA ;

m1 ¼as�

bs�

cs�

0B@

1CA: ðA6Þ

[50] For the covariance matrix Ss we have the followingconditional posterior distribution:

p Ssj�ð Þ /YTst¼1

N

astbstcst

0@

������As

Bs

Cs

;Ss

1A � Ssj j2; ðA7Þ

which can be written as [Gelman et al., 2004]

p Ssj�ð Þ ¼ Inv-WishartTs�1 SsjSsð Þ; ðA8Þ

where Ss is the sum of squares matrix:

Ss ¼XTst¼1

ast bst cst½ � � As Bs Cs½ �ð ÞT ast bst cst½ � � As Bs Cs½ �ð Þ:

ðA9Þ

[51] The joint conditional posterior distribution of a =[a0 a1 a2] is given by

p aj�ð Þ /YSs¼1

N Asja0 þ a1 longs þ a2 lats;s2A

� �; ðA10Þ

which yields, after completing the square, the followingdensity:

p aj�ð Þ ¼ N a;Vas2A

� �; ðA11Þ

where a = (XTX)�1 XTy, Va = (XTX)�1, y = [A1 A2, . . . AS]T

and

X ¼

1 long1 lat12 long2 lat2: : :S longS latS

0BB@

1CCA:

[52] Similar posterior distributions are obtained for b =[b0 b1 b2] and g = [g0 g1 g2].[53] The same technique yields also the distribution of the

variance parameter:

p s2Aj�

� �/YS

s¼1N Asja0 þ a1 longs þ a2 lats; s2

A

� �¼ Inv-c2

�s2AjS � 1;

1

S � 1

XS

s¼1As � a0 � a1 longs � a2 latsð Þ2

�:

ðA12Þ

The conditional posterior distributions of sB and sC areobtained similarly.

[54] Acknowledgments. The first author thanks CAPES andFulbright for financial support through grant 1650/041. We thank AndrewGelman for several suggestions and discussion regarding several aspects ofthis paper.

ReferencesBerliner, L. M., J. A. Royle, C. K. Wikle, and R. F. Milliff (1999), Bayesianmethods in the atmospheric sciences, in Bayesian Statistics 6, edited byJ. M. Bernado et al., pp. 83–100, Oxford Univ. Press, London.

Berliner, L. M., R. F. Milliff, and C. K. Wikle (2003), Bayesian hierarchicalmodeling of air-sea interaction, J. Geophys. Res., 108(C4), 3104,doi:10.1029/2002JC001413.

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING

13 of 14

W07422

Page 14: Click Here Full Article Hierarchical Bayesian modeling of ...water.columbia.edu/files/2011/11/2008WR007485_final.pdf[Coles and Tawn, 1996] and amounts [Sanso and Guenni, 1999] and

Cardoso, A. O., and P. L. S. Dias (2006), The relationship between ENSOand Parana River flow, Adv. Geosci., 6, 189–193.

Chu, P.-S. (1983), Diagnostic studies of rainfall anomalies in northeastBrazil, Mon. Weather Rev., 111, 1655–1664.

Coe, R., and R. D. Stern (1982), Fitting models to daily rainfall data,J. Appl. Meteorol., 21, 1024–1031.

Coelho, C. A. S., C. B. Uvo, and T. Ambrizzi (2002), Exploring the impactsof the tropical Pacific SST on the precipitation patterns over SouthAmerica during ENSO periods, Theor. Appl. Climatol., 71, 185–197.

Coles, S. G., and J. A. Tawn (1996), A Bayesian analysis of extreme rainfalldata, Appl. Stat., 45(4), 463–478.

Cressie, N. A. C. (1993), Statistics for Spatial Data, 900 pp., John Wiley,New York.

Diaz, A. F., C. D. Studzinski, and C. R. Mechoso (1998), Relationshipsbetween precipitation anomalies in Uruguay and southern Brazil and seasurface temperature in the Pacific and Atlantic oceans, J. Clim., 11, 251–271.

Fasullo, J., and P. J. Webster (2003), A hydrological definition of Indianmonsoon onset and withdrawal, J. Clim., 16, 3200–3211.

Folland, C. K., A. W. Colman, D. P. Rowell, and M. K. Davey (2001),Predictability of northeast Brazil rainfall and real-time forecast wkill,1987–98, J. Clim., 14, 1937–1958.

Frost, A. J., M. A. Thyer, R. Srikanthan, and G. Kuczera (2007), A generalBayesian framework for calibrating and evaluating stochastic models ofannual multi-site hydrological data, J. Hydrol., 340, 129–148.

Garbutt, D. J., R. D. Stern, M. D. Dennett, and J. Elston (1981), A compar-ison of the rainfall climate of eleven places in West Africa using a two-part model for daily rainfall, Arch. Meteorol. Geophys. Bioklimatol., Ser.B, 29, 137–155.

Gaudard, M., M. Karson, E. Linder, and D. Sinha (1999), Bayesian spatialprediction, Environ. Ecol. Stat., 6, 147–171.

Gelman, A. (2005), Prior distribution for variance parameters in hierarch-ical models, Bayesian Anal., 1(1), 1–19.

Gelman, A., and J. Hill (2007), Data Analysis Using Regression and Multi-level/Hierarchical Models, Cambridge Univ. Press, Cambridge, U. K.

Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin (2004), BayesianData Analysis, Chapman and Hall, Boca Raton, Fla.

Giannini, A., J. C. H. Chiang, M. A. Cane, Y. Kushnir, and R. Seager(2001), The ENSO teleconnection to the tropical Atlantic Ocean: Con-tributions of the remote and local SSTs to rainfall variability in thetropical Americas, J. Clim., 14, 4530–4544.

Giannini, A., R. Saravanan, and P. Chang (2004), The preconditioning roleof tropical Atlantic variability in the development of the ENSO telecon-nection: Implications for the prediction of Nordeste rainfall, Clim. Dyn.,22, 839–855.

Goswami, B. N., and P. K. Xavier (2005), ENSO control on the south Asianmonsoon through the length of the rainy season, Geophys. Res. Lett., 32,L18717, doi:10.1029/2005GL023216.

Grantz, K., B. Rajagopalan, M. Clark, and E. Zagona (2007), Seasonalshifts in the North American monsoon, J. Clim., 20, 1923–1935,doi:10.1175/JCLI4091.1.

Grimm, A. M. (2004), How do La Nina events disturb the summer mon-soon system in Brazil?, Clim. Dyn., 22, 123–138, doi:10.1007/s00382-003-0368-7.

Grimm, A. M., S. E. T. Ferraz, and J. Gomes (1998), Precipitation anoma-lies in southern Brazil associated with El Nino and La Nina events,J. Clim., 11, 2863–2880.

Grimm, A. M., V. R. Barros, and M. E. Doyle (2000), Climate variability insouthern South America associated with El Nino and La Nina events,J. Clim., 13, 35–58.

Hastenrath, S. (1990), Prediction of northeast Brazil rainfall anomalies,J. Clim., 3, 893–904.

Hastenrath, S. (1994), Climate Dynamics of the Tropics, Atmos. Sci. Libr.,vol. 8, Kluwer Acad., Dordrecht, Netherlands.

Hastie, T., R. Tibshirani, and J. Friedman (2001), The Elements of Statis-tical Learning, Springer, New York.

Ingram, K. T., M. C. Roncoli, and P. H. Kirshen (2002), Opportunities andconstraints for farmers of West Africa to use seasonal precipitation fore-casts with Burkina Faso as a case study, Agric. Syst., 74, 331–349.

Kaplan, A., M. Cane, Y. Kushnir, A. Clement, M. Blumenthal, andB. Rajagopalan (1998), Analyses of global sea surface temperature1856–1991, J. Geophys. Res., 103, 18,567–18,589.

Katz, R. W. (1977), Precipitation as a chain-dependent process, J. Appl.Meteorol., 16(7), 671–676.

Kayano, M. T., and R. V. Andreoli (2006), Relationships between rainfallanomalies over northeastern Brazil and the El Nino–Southern Oscilla-tion, J. Geophys. Res., 111, D13101, doi:10.1029/2005JD006142.

Kousky, V. E. (1979), Frontal influences on northeast Brazil, Mon. WeatherRev., 107, 1140–1153.

Kousky, V. E. (1980), Diurnal rainfall variations in northeast Brazil, Mon.Weather Rev., 108, 488–498.

Liebmann, B., and J. A. Marengo (2001), Interannual variability of therainy season and rainfall in the Brazilian Amazon Basin, J. Clim., 14,4308–4318.

Livezey, R. E., and W. Y. Chen (1983), Statistical field significance andits determination by Monte Carlo techniques, Mon. Weather Rev., 111,46–59.

Marengo, J. A., B. Leibmann, V. E. Kousky, N. P. Filizola, and I. C. Wainer(2001), Onset and end of the rainy season in the Brazilian Amazon Basin,J. Clim., 14, 833–852.

Moura, A. D., and S. Hastenrath (2004), Climate prediction for Brazil’sNordeste: Performance of empirical and numerical modeling methods,J. Clim., 17, 2667–2672.

Moura, A. D., and L. Shukla (1981), On the dynamics of droughts innortheast Brazil: Observations, theory and numerical experiments witha general circulation model, J. Atmos. Sci., 38, 2653–2675.

Odekunle, T. O. (2004), Rainfall and the length of the growing season inNigeria, Int. J. Climatol., 24, 467–479.

Parent, E., P. Hubert, B. Bobee, and J. Miquel (Eds.) (1998), Statistical andBayesian Methods in Hydrological Sciences, UNESCO Press, Paris.

Pebesma, E. J. (2004), Multivariate geostatistics in S: The gstat package,Comput. Geosci., 30, 683–691.

Rao, V. B., M. C. D. Lima, and S. H. Franchito (1993), Seasonal andinterannual variations of rainfall over eastern northeast Brazil, J. Clim.,6, 1754–1763.

Rao, V. B., E. Giarolla, M. T. Kayano, and S. H. Franchito (2006), Is therecent increasing trend of rainfall over northeast Brazil related sub-Saharandrought?, J. Clim., 19, 4448–4453.

Reynolds, R. W., and T. M. Smith (1994), Improved global sea surfacetemperature analyses, J. Clim., 7, 929–948.

Ropelewski, C. F., and M. S. Halpert (1987), Global and regional scaleprecipitation patterns associated with the El Nino/Southern Oscillation,Mon. Weather Rev., 115, 1606–1626.

Sanso, B., and L. Guenni (1999), Venezuelan rainfall data analysed byusing a Bayesian space-time model, Appl. Stat., 48(3), 345–362.

Secretaria de Infraestrutua Hıdrica (2000), Transpostion of Sao Franciscowaters (in Portuguse), technical report, Minist. da Integracao Nac.,Brasilia.

Stern, R. D. (1982), Computing a probability distribution for the start of therains from a Markov chain model for precipitation, J. Appl. Meteorol.,21, 420–423.

Strang, D. M. G. D. (1972), Climatological analysis of rainfall normals innortheast Brazil, Tech. Rep. IAE-M 02/72, Cent. Tecnol. Aerosp., SaoJose dos Campos, Brazil.

Sultan, B., C. Baron, M. Dingkuhn, B. Sarr, and S. Janicot (2005), Agri-cultural impacts of large-scale variability of the West African mosoon,Agric. For. Meteorol., 128, 93–110.

Sun, L., D. Moncunill, H. Li, A. D. Moura, F. D. D. S. Filho, and S. Zebiak(2006), An operational dynamical downscaling prediction system forNordeste Brazil and the 2002–04 real time forecast evaluation, J. Clim.,19, 1990–2007.

Thiemann, M., M. Trosset, H. Gupta, and S. Sorooshian (2001), Bayesianrecursive parameter estimation for hydrologic models, Water Resour.Res., 37(10), 2521–2535.

Uvo, C. B., C. A. Repelli, S. E. Zebiak, and Y. Kushnir (1998), Therelationships between tropical Pacific and Atlantic SST and northeastBrazil monthly precipitation, J. Clim., 11, 551–561.

Wikle, C. K., L. M. Berliner, and N. Cressie (1998), Hierarchical Bayesianspace-time models, Environ. Ecol. Stat., 85, 117–154.

Wikle, C. K., R. F. Milliff, D. Nychka, and L. M. Berliner (2001), Spatio-temporal hierarchical Bayesian modeling: Tropical ocean surface winds,J. Am. Stat. Assoc., 96(454), 382–397.

Wu, R., and B. Wang (2000), Interannual variability of summer monsoononset over the western North Pacific and the underlying processes,J. Clim., 13, 2483–2501.

Zeng, X., and E. Lu (2004), Globally unified monsson onset and retreatindexes, J. Clim., 17, 2241–2248.

����������������������������U. Lall and C. H. R. Lima, Department of Earth and Environmental

Engineering, Columbia University, New York, NY 10027, USA. ([email protected]; [email protected])

14 of 14

W07422 LIMA AND LALL: HIERARCHICAL BAYESIAN MODELING W07422