Effect of discretization reﬁnementleo.ugr.es/pgm2012/proceedings/posters/vogel_flood_poster.pdf · Kristin Vogel 1, Carsten Riggelsen 1, Bruno Merz 2, Heidi Kreibich 2, Frank Scherbaum

Flood Damage and Influencing Factors: A Bayesian Network Perspective

Kristin Vogel 1, Carsten Riggelsen 1, Bruno Merz 2, Heidi Kreibich 2, Frank Scherbaum 1

1 Institute of Earth and Environmental Science, Potsdam University; Germany

2 GFZ - GeoForschungszentrum Potsdam; Germany

contact: [email protected]

1. Flood risk assessment

• Classical approaches relate flood damage for a certain class of objects to inundation depth.

• Single and joint effects of other parameters (e.g. inundation duration, quality of warning) are largely unknown and widely neglected in damage assessments.

• A dataset of 1135 (partly incomplete) observations of 29 variables, collected after the 2002 and 2005/2006 floods on the Elbe and Danube, offers a data mining opportunity: learning a Bayesian Network that reveals interactions ignored so far.

Dresden 2002

2. Bayesian Network

• A Bayesian Network (BN) describes a joint probability distribution by decomposing it into a product of (local) conditional probability distributions according to a directed acyclic graph, which encodes the conditional independences.

• The graph structure, DAG, and the parameters, Θ, can be learned from data, d, and are chosen here as the maximum a posteriori (MAP) estimate of the joint posterior: P(DAG,Θ|d) ∝ P(d|DAG,Θ)P(Θ,DAG).
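As a minimal sketch of the factorization described above, the following Python snippet evaluates a joint probability as a product of local conditional probabilities along a DAG. The four binary variables, the arcs, and all probability values are hypothetical and chosen only for illustration; they are not the network learned from the flood data.

```python
from itertools import product

# Hypothetical toy DAG (NOT the network learned in the poster):
#   warning -> precaution -> loss <- depth
parents = {
    "warning": (),
    "depth": (),
    "precaution": ("warning",),
    "loss": ("precaution", "depth"),
}

# Conditional probability tables: cpt[var][parent_values] = P(var = 1 | parents).
# All numbers are made up for illustration.
cpt = {
    "warning": {(): 0.6},
    "depth": {(): 0.3},
    "precaution": {(0,): 0.2, (1,): 0.7},
    "loss": {(0, 0): 0.4, (0, 1): 0.9, (1, 0): 0.1, (1, 1): 0.5},
}


def joint_probability(assignment):
    """Return P(assignment) as the product of local conditional probabilities."""
    prob = 1.0
    for var, pa in parents.items():
        p_one = cpt[var][tuple(assignment[p] for p in pa)]
        prob *= p_one if assignment[var] == 1 else 1.0 - p_one
    return prob


def total_probability():
    """Sum the joint over all 2**4 assignments; a valid factorization gives 1."""
    names = list(parents)
    return sum(
        joint_probability(dict(zip(names, values)))
        for values in product((0, 1), repeat=len(names))
    )
```

Because every local table is a proper conditional distribution, the product sums to one over all assignments, which is what makes the DAG factorization a valid joint distribution.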

3. Automatic Discretization

• Since we do not want to make assumptions about the functional form of the (conditional) distribution functions, we need to discretize continuous variables for BN learning.

• Λ, a set of interval boundaries for all continuous variables, defines the discretization that "bins" the continuous data, dc, into a discrete version, d.

• Instead of discretizing prior to BN learning, we aim to optimize the BN and the discretization simultaneously, using an extension of the BN MAP score:

P(DAG,Θ,Λ|dc) ∝ P(dc|d,Λ)P(d|DAG,Θ,Λ)P(DAG,Θ,Λ).

• The score also regularizes the number of intervals and network arcs.
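The binning step alone, i.e. how a set of interval boundaries maps continuous data dc to a discrete version d, can be sketched as follows. The variable name, the depth values, and the boundaries are hypothetical; the snippet does not implement the score optimization that selects Λ.

```python
import numpy as np


def discretize(dc, boundaries):
    """Map continuous values to interval indices 0..len(boundaries).

    A value x gets index i when boundaries[i-1] <= x < boundaries[i],
    with index 0 below the first boundary and len(boundaries) above the last.
    """
    edges = np.sort(np.asarray(boundaries, dtype=float))
    return np.digitize(np.asarray(dc, dtype=float), edges)


# Hypothetical water depths (cm) and hypothetical interval boundaries.
water_depth = np.array([5.0, 48.0, 120.0, 260.0, 75.0])
lam_depth = [50.0, 100.0, 200.0]
d = discretize(water_depth, lam_depth)
```

With three boundaries the variable takes four states. Adding boundaries refines the discretization but leaves fewer observations per state, which is the trade-off the regularization in the score has to balance.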

4. A Single Continuous Target

• An originally continuous target variable (e.g. relative loss of building) is rediscretized at a very fine resolution to obtain an almost continuous approximation of the conditional densities.

• The number of realizations per state decreases significantly for the target variable, leading to unreliable parameter estimates if the maximum likelihood estimator is used.

• A Gaussian kernel density estimator is used instead, exploiting the observations not only of the state of interest, but also of neighbouring states.
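One way to picture the difference between the two estimators is the sketch below, which smooths the per-state counts of a finely discretized variable with a Gaussian kernel over the state centers. The poster does not specify the estimator in detail, so the function names, the bandwidth, and the example counts are assumptions made for illustration only.

```python
import numpy as np


def maximum_likelihood_probabilities(counts):
    """Relative frequencies per state; states without observations get 0."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()


def kernel_smoothed_probabilities(counts, centers, bandwidth):
    """Gaussian-kernel estimate: each state borrows mass from its neighbours.

    counts    -- number of observations falling into each fine state
    centers   -- representative value (e.g. midpoint) of each state
    bandwidth -- kernel standard deviation (an assumed tuning parameter)
    """
    counts = np.asarray(counts, dtype=float)
    centers = np.asarray(centers, dtype=float)
    diffs = (centers[:, None] - centers[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)   # weights[j, i]: influence of state i on state j
    smoothed = weights @ counts
    return smoothed / smoothed.sum()    # normalize over the states


# Hypothetical counts for five fine states of one parent configuration.
example_counts = [0, 3, 0, 1, 0]
example_centers = np.arange(5.0)
p_ml = maximum_likelihood_probabilities(example_counts)
p_kde = kernel_smoothed_probabilities(example_counts, example_centers, bandwidth=1.0)
```

In this toy case the maximum likelihood estimate assigns zero probability to the three empty states, while the kernel estimate gives every state a positive probability that decays with distance from the observed states.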

[Figure: Effect of discretization refinement. Conditional probability of building loss (y-axis) against log loss (x-axis), for good precaution and warning and for bad precaution and warning. Shaded histograms show conditional densities for the coarse automatic discretization. Lines show the corresponding conditional densities for the refined discretization.]

5. Results

[Figure: box plots of RMSE (left) and correlation coefficient (right) for sd-f, FL and BN. Comparison of prediction performance of the BN with currently used approaches (sd-f: stage-damage function, which depends only on water depth and object class; FL: FLEMOps+r, a model developed from the same data set). 100 bootstrap samples, each with 100 events, are drawn from the dataset; the building loss prediction is quantified by root mean squared error (left) and Pearson correlation coefficient (right).]
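The bootstrap evaluation described in the caption can be sketched as below. The poster gives only the number of samples and events, so the resampling details, the function name, and the synthetic data are assumptions; in particular, the sketch scores a fixed set of predictions and does not retrain any model.

```python
import numpy as np


def bootstrap_scores(observed, predicted, n_samples=100, sample_size=100, seed=0):
    """Return per-sample RMSE and Pearson correlation over bootstrap samples.

    Each bootstrap sample draws `sample_size` events with replacement.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rng = np.random.default_rng(seed)
    rmse = np.empty(n_samples)
    corr = np.empty(n_samples)
    for b in range(n_samples):
        idx = rng.integers(0, len(observed), size=sample_size)
        obs, pred = observed[idx], predicted[idx]
        rmse[b] = np.sqrt(np.mean((pred - obs) ** 2))
        corr[b] = np.corrcoef(obs, pred)[0, 1]
    return rmse, corr


# Synthetic stand-in data (NOT the flood dataset): relative losses in [0, 1]
# and noisy predictions of them.
_rng = np.random.default_rng(1)
observed_loss = _rng.uniform(0.0, 1.0, size=500)
predicted_loss = observed_loss + _rng.normal(0.0, 0.1, size=500)
rmse_scores, corr_scores = bootstrap_scores(observed_loss, predicted_loss)
```

The two returned arrays hold one score per bootstrap sample, which is the kind of input a box plot per model would summarize.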

• Box 2 "Bayesian Network" shows the network learned from the collected flood data using the automatic discretization approach. The learned BN reveals and confirms non-trivial interactions.

• The performance of the BN (with a refined discretization of the building loss variable) in predicting the building loss is compared to flood damage assessment approaches currently used in Germany (see picture to the left).

• Even though the BN is not designed for an optimal prediction of the target variable distribution (but for the joint distribution), the quality of its building loss predictions is comparable to that of existing procedures. Moreover, the BN has the benefit of capturing uncertainty and allowing reasoning under it.
