FLOOD RISK ASSESSMENT AND MAPPING USING GEOGRAPHICAL
INFORMATION SYSTEM: CASE STUDY IN KUALA LUMPUR
HADI
AEA090705
THESIS SUBMITTED IN FULFILMENT OF THE REQUIREMENTS FOR
THE DEGREE OF BACHELOR OF ARTS AND SOCIAL SCIENCE
DEPARTMENT OF GEOGRAPHY
FACULTY OF ARTS AND SOCIAL SCIENCE
UNIVERSITY OF MALAYA
SESSION 2011/2012
ABSTRACT
The last decade has seen an increase in the frequency and magnitude of hydrometeorological
hazards in Malaysia. Flash floods in particular are considered a major recurring hazard in
Malaysia, especially in urban areas where development activities and people's livelihoods are closely
interconnected with rivers. Rapid and uncontrolled development in Kuala Lumpur, the Malaysian
capital, especially in floodplain areas, has increased the risk associated with flash floods. The risk
encompasses many dimensions of livelihood, and there is a growing need for effective integrated
flood risk management in Kuala Lumpur. One important component of the integrated approach is
the assessment of flood risk, which results from the action of a hazard on the vulnerable population and
exposed elements. The risk varies in space and time and is therefore best assessed using
a geographical approach. A Geographical Information System (GIS) provides powerful capabilities to
facilitate such spatial assessment, a digital environment to store, process, and manage the
large amount of spatial and non-spatial data involved, and cartographic capabilities to produce high
quality flood risk maps that communicate the risk information. This study demonstrated the
integration of GIS in flood risk assessment and mapping using two spatial models: a statistical (binary
logistic regression) model and an index model. The logistic regression generated an equation that predicts
the probability of flood occurrence from the statistically significant predictor variables, which was used to produce
a flood probability map. The index model conceptualized and quantified risk as a function of hazard,
vulnerability, and coping capacity, using weighted overlay analysis, to produce a flood risk index
map. These maps can assist policy makers and planners in planning land use
development and flood disaster management, while informing local residents of the flood danger level
at their houses.
METHODOLOGY
CHART 3.3 FLOOD PREDICTION LOGISTIC REGRESSION MODELLING

SCENARIO 1
1. Geocode Flood Events' Location and Digitize OCCURRENCE Points
2. Create Buffer for OCCURRENCE Points
3. Randomly Generate NON-OCCURRENCE Sample Points Outside Buffer Zones

SCENARIO 2
1. Using Flood Area (Polygon) Year 2008
2. Randomly Generate OCCURRENCE Sample Points Inside Flood Area
3. Randomly Generate NON-OCCURRENCE Sample Points Outside Flood Area

BOTH SCENARIOS
4. Overlay with IV** raster layers: Elevation, Slope, Curvature, Planar Curvature, Profile Curvature,
   Population Density, Drainage Density, Distance to River, Road Density, Rainfall*,
   Land Use (Built Up and Non Built Up), Flow Accumulation, Distance to Mitigation Site*, Subbasin Area
5. Intersect Points with All IV Layers to Extract Rasters' Values from All Layers at All
   OCCURRENCE and NON-OCCURRENCE Sample Points
6. Tabulate Extracted Rasters' Values into an OCCURRENCE Points IV Table and a
   NON-OCCURRENCE Points IV Table
7. Assign DV*** value = 1 (OCCURRENCE table) and DV*** value = 0 (NON-OCCURRENCE table)
8. Open and Combine Tables into a Single Dataset in SPSS
9. Run Backward Stepwise Binary Logistic Regression
10. Interpret Result and Construct Predictive Equation with Significant IVs and Their Coefficients
11. Map Flood Probability for The Whole Study Area using GIS Raster Calculator

Notes:
* Used in Scenario 2
** Independent Variables
*** Dependent Variable. New column created with value all 0 for NON-OCCURRENCE table, all 1 for OCCURRENCE table.
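
The Scenario 1 sampling step can be illustrated with a minimal sketch. It assumes projected coordinates in metres and uses simple rejection sampling; the function name, buffer distance, study-area bounds, and coordinates are hypothetical and are not taken from the thesis, which performed this step with GIS tools.

```python
import numpy as np

def sample_non_occurrence(occ_xy, bounds, buffer_m, n, seed=0):
    """Draw n random points inside bounds that lie farther than buffer_m
    from every occurrence point (simple rejection sampling)."""
    rng = np.random.default_rng(seed)
    xmin, ymin, xmax, ymax = bounds
    kept = []
    while len(kept) < n:
        cand = rng.uniform([xmin, ymin], [xmax, ymax], size=(n, 2))
        # Distance from every candidate to every occurrence point.
        d = np.linalg.norm(cand[:, None, :] - occ_xy[None, :, :], axis=2)
        # Keep only candidates outside all buffer zones.
        kept.extend(cand[d.min(axis=1) > buffer_m])
    return np.array(kept[:n])

# Hypothetical occurrence points and study-area extent (metres).
occ = np.array([[1000.0, 1000.0], [4000.0, 2500.0]])
non_occ = sample_non_occurrence(occ, bounds=(0, 0, 5000, 5000), buffer_m=500, n=50)
```

For Scenario 2 the same idea applies, with the distance test replaced by a point-in-polygon test against the 2008 flood area.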
FIGURE 3.4 MULTIPLE LAYER OVERLAY FOR LOGISTIC REGRESSION MODELLING
[Figure: sample OCCURRENCE and NON-OCCURRENCE points overlaid on the stacked raster layers:
slope, distance to river, elevation, drainage density, road density, rainfall, subbasin area,
distance to mitigation site, curvature, planar curvature, profile curvature, land use (built up
or non built up), and population density. The Intersect Point Tool extracts the raster value from
all overlaid layers at each sample point location to generate the independent variables.]
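
The extraction step in Figure 3.4 can be sketched as follows. This is an illustration only, assuming north-up rasters that share one grid; the function names, the tiny 4 x 4 rasters, and the coordinates are hypothetical, and the thesis itself used the GIS Intersect Point Tool rather than custom code.

```python
import numpy as np

def extract_at_points(raster, origin, cell_size, xy):
    """Return the raster cell value under each (x, y) point.
    origin is the (x, y) of the raster's upper-left corner."""
    x0, y0 = origin
    cols = np.floor((xy[:, 0] - x0) / cell_size).astype(int)
    rows = np.floor((y0 - xy[:, 1]) / cell_size).astype(int)
    return raster[rows, cols]

def build_table(layers, origin, cell_size, xy, dv):
    """One row per point: extracted IV values followed by the DV column."""
    values = [extract_at_points(r, origin, cell_size, xy) for r in layers.values()]
    return np.column_stack(values + [np.full(len(xy), dv)])

# Hypothetical 4 x 4 rasters with 30 m cells; upper-left corner at (0, 120).
layers = {
    "slope": np.arange(16, dtype=float).reshape(4, 4),
    "distriv": np.full((4, 4), 250.0),
}
occ_xy = np.array([[45.0, 100.0]])   # OCCURRENCE sample point
non_xy = np.array([[100.0, 10.0]])   # NON-OCCURRENCE sample point

# Combined dataset: DV = 1 for occurrence rows, DV = 0 for non-occurrence rows.
table = np.vstack([
    build_table(layers, (0.0, 120.0), 30.0, occ_xy, dv=1),
    build_table(layers, (0.0, 120.0), 30.0, non_xy, dv=0),
])
```

The resulting array has one column per independent variable and a final dependent-variable column, which is the shape of dataset a logistic regression routine expects.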
CHART 3.4 FLOOD RISK INDEX MODELLING

RISK
  Hazard
    Historical Data: Water Depth, Frequency of Occurrence
    Binary Logistic Regression Model: Flood Occurrence Probability
  Exposure
    Flood Area Up to Year 2000
  Vulnerability
    Physical Vulnerability: Distance to River
    People Vulnerability: Population Density, % People Aged Below 10 Years Old, % People Aged 65 and Over
    Socioeconomic Vulnerability: Land Use Type, Road Density
  Coping Capacity
    Distance to rescue station
    Distance to shelter
    Distance to hospital
    Distance to main road
    Distance to warning sign board
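
A weighted overlay of the kind used in the index model can be sketched as below. The function name, the score scale, and the weights are assumptions made for illustration; the thesis does not state its weights in this section.

```python
import numpy as np

def weighted_overlay(layers, weights):
    """Weighted sum of equally shaped score rasters.
    Weights are normalised so that they sum to 1."""
    w = np.array([weights[k] for k in layers], dtype=float)
    w /= w.sum()
    stack = np.stack([layers[k] for k in layers]).astype(float)
    return np.tensordot(w, stack, axes=1)

# Hypothetical reclassified score rasters for people vulnerability.
scores = {
    "popdens": np.array([[1, 5]]),
    "aged_65_over": np.array([[3, 3]]),
}
# Hypothetical weights (not from the thesis).
people_vulnerability = weighted_overlay(scores, {"popdens": 1, "aged_65_over": 3})
```

Each input raster is assumed to have been reclassified to a common score scale beforehand, so that the weighted sum is comparable across layers.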
TABLE 4.1 DATABASE SUMMARY

1 Administrative
  Source: Digitized from the Kuala Lumpur Local District Map in the 2010 Population and Housing
    Census Report published by the Department of Statistics Malaysia
  Uses: Main base map (for population layer)
  Entity type: Polygon
  Data model: Vector
  Attributes: Local district name, area.

2 Land Use Kuala Lumpur 2008
  Source: GIS Unit, JPF*, DBKL**
  Uses: Characterisation of social geography in the study area; socioeconomic vulnerability
    assessment in flood risk index modelling; extraction of river features for physical distance
    analysis to assess physical vulnerability; independent variable for flood probability (binary
    logistic regression) modelling; raster analysis mask; flood impact estimation.
  Entity type: Polygon
  Data model: Vector
  Attributes: FID, location, ID, and land use classes, including:
    1. Industry
    2. Institution
    3. Open area
    4. Recreation
    5. Religious use
    6. Residential
    7. Public facilities
    8. Community facilities
    9. KTM train rail
    10. Commercial
    11. Cemetery
    12. Agriculture, fishery, forestry
    13. Land reserve for electricity line
    14. School
    15. Squatter
    16. River and water bodies
    17. Terminal

3 Topography
  Source: 30m x 30m Digital Elevation Model (DEM) downloaded from USGS website, clipped for study
    area, converted to raster (.img).
  Uses: To derive slope, aspect, curvature, planar curvature, profile curvature, flow direction,
    flow accumulation (drainage), contour, watershed, and 3D view of study area. The DEM
    derivatives are parameters in both index and logistic regression modelling.
  Entity type: Surface
  Data model: Raster
  Attributes: Elevation

4 Roads (2008)
  Source: Provided by JPF DBKL
  Uses: To estimate road density raster for logistic regression; extract main road for distance
    analysis as factor in coping capacity evaluation; flood economic impact estimation
  Entity type: Polyline
  Data model: Vector
  Attributes: Reference no., strategic, area, parliament, hierarchy, hierarchy no., status, status
    code, length, width, geometric area, from (origin), to (destination)

5 Congested Road, Morning (AM) and Afternoon (PM)
  Source: Digitized from image provided by JPB*** DBKL
  Uses: Distance analysis for coping capacity
  Entity type: Polyline
  Data model: Vector
  Attributes: Road name

6 Infrastructure
  Layers and sources:
    Fire and rescue station: addresses obtained from bomba.gov.my, geocoded and point digitized
      on location coordinate
    Public hospital and clinic: addresses obtained from moh.gov.my, geocoded and digitized as
      point feature
    Multipurpose hall and community center: addresses obtained from jkpdbkl.com, geocoded and
      point digitized on location coordinate
    Flood mitigation work location: digitized from paper map in Kuala Lumpur Flood Problem and
      Solution Report (DBKL, 2005)
    Flood electronic warning sign board: digitized from paper map in Kuala Lumpur Flood Problem
      and Solution Report (DBKL, 2005)
  Uses: Proximity analysis for coping capacity assessment; distance to mitigation site is an input
    in logistic regression analysis.
  Entity type: Point
  Data model: Vector
  Attributes: Name, address, x-, y- coordinate

7 Subbasin Area
  Source: Digitized from image in KL City Plan 2020 Report (DBKL)
  Uses: Hydrology base map, shows small catchments
  Entity type: Polygon
  Data model: Vector
  Attributes: Name of draining stream/river, area.

8 Flood Points
  Source: Geocoded and digitized from DID Malaysia report on flood events in Kuala Lumpur from
    2004 to 2011
  Uses: Sample OCCURRENCE points (Dependent Variable/binary response value = 1) for logistic
    regression; flood historical hazard assessment.
  Entity type: Point
  Data model: Vector
  Attributes: Location, date and year of event, depth, frequency

9 Flood Area
  Source: Digitized from flood areal extent paper map prepared by DID Malaysia
  Uses: Sample OCCURRENCE points (DV=1) for logistic regression; flood exposure identification
  Entity type: Polygon
  Data model: Vector
  Attributes: Year (until 2000, 2008, and 3 March 2009), Location, Area

10 Rainfall (2009)
  Source: Daily station reading in MET*** website
  Uses: Create rainfall distribution surface by interpolating maximum value at reading stations,
    as one independent variable for logistic regression modelling
  Entity type: Point
  Data model: Vector
  Attributes: Date, station location coordinate, rainfall reading

11 Population
  Source: Data entry from 2010 Population and Housing Census
  Uses: To derive population density, percentage age group <15 and 65> for people vulnerability
    assessment
  Entity type: Polygon
  Data model: Vector
  Attributes: Local district name, total population, population by ethnic group (Malaysian,
    Non-Malaysian, Bumiputera, Malay, other Bumiputera, Chinese, Indian, others), population by
    age group (0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59,
    60-64, 65-69, 70-74, 75 and over), population by gender (Male and Female), households,
    living quarters, population density.
RESULTS
The main results of both the flood occurrence probability logistic regression modelling and the
risk index modelling are presented in the following maps. Maps are used to communicate the flood
risk information to the target stakeholders. A risk map should be produced with proper cartographic
design so that it is easy to read and conveys the information effectively to all map readers.
Results of Binary Logistic Regression Test
Table 5.1d Variables in the Equation

                                                                   95% C.I. for Exp(B)
Step 7a          B        S.E.     Wald      df    Sig.    Exp(B)    Lower     Upper
distriv          -.002    .001     10.104    1     .001    .998      .997      .999
slope            -.026    .014     3.336     1     .068    .974      .948      1.002
popdens          .000     .000     4.640     1     .031    1.000     .999      1.000
Density          .077     .034     5.147     1     .023    1.080     1.010     1.154
@10landuse(1)    .735     .338     4.727     1     .030    2.086     1.075     4.049
Constant         1.802    .993     3.295     1     .070    6.060

a. Variable(s) entered on step 1: fill, subbasi, distriv, flowacc, slope, popdens, plancur, profile,
drainde, Density, @10landuse.
Table 5.1d above is the main result of the binary logistic regression analysis, after seven
iteration steps that sequentially eliminated statistically insignificant predictor variables. As the
note on variable(s) entered on step 1 shows, 'curvatu' was not included because SPSS found
it redundant with the profile and planar curvature variables. Table 5.1d suggests that
of the 12 independent variables tested, 5 were found 'significant'. They are 'distriv' (distance to river),
'slope', 'popdens' (population density), 'density' (road density) and '@10landuse(1)' (built-up).
However, the results need further interpretation in terms of their significance.
The significance value (Sig.) of the Wald statistic is one way to assess the significance of the 5
variables. 'Distriv', 'popdens', 'density', and '@10landuse(1)' were statistically significant at the 95%
confidence level, with Sig. values less than 0.05. 'Slope' is not significant in this case
because its Sig. value (0.068) is more than 0.05.
Another way to judge the predictors is the Exp(B) value, which is the
odds ratio. An Exp(B) equal to 1 indicates that an increase in the corresponding variable does not change
the odds of the outcome occurring. An Exp(B) value of less than 1 (more than 1) indicates that a one
unit increase in the value of the corresponding variable leads to a drop (an increase) in the odds.
Interpreted in this way, 'popdens' has a negligible effect, as a one unit increase in its value leaves the
odds of flooding essentially unchanged. 'Distriv' and 'slope' cause a slight drop in the odds of flooding,
while 'density' causes a slight increase in the odds. A stronger predictor is
'@10landuse'; the presence of built-up land use more than doubles the odds, that is, the odds of flooding are
more than 2 times those in 'non-built up' areas.
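
As a check on this reading of Exp(B), the odds ratio and its approximate 95% confidence interval can be recomputed from the B and S.E. columns. This is a sketch under the usual Wald assumption (interval = exp(B ± 1.96 × S.E.)); because Table 5.1d reports B and S.E. rounded to three decimals, the recomputed bounds agree with the table only approximately.

```python
import math

def odds_ratio(b, se, z=1.96):
    """Return Exp(B) and its approximate 95% confidence bounds."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# B and S.E. as reported (rounded) in Table 5.1d.
reported = [("density", 0.077, 0.034), ("@10landuse(1)", 0.735, 0.338)]
for name, b, se in reported:
    or_, lower, upper = odds_ratio(b, se)
    print(f"{name}: Exp(B)={or_:.3f}, 95% C.I.=({lower:.3f}, {upper:.3f})")
```

An interval that excludes 1, as for both variables here, corresponds to a Sig. value below 0.05.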
Beyond the significance tests, the "B" values in Table 5.1d are the logistic coefficients from
which the predictive equation was constructed as below:

$$\operatorname{logit}(y) = 1.802 - 0.002\,\text{distriv} - 0.026\,\text{slope} + 0.077\,\text{density} + 0.735\,\text{@10landuse(1)}$$

$$y = \frac{e^{1.802 - 0.002\,\text{distriv} - 0.026\,\text{slope} + 0.077\,\text{density} + 0.735\,\text{@10landuse(1)}}}{1 + e^{1.802 - 0.002\,\text{distriv} - 0.026\,\text{slope} + 0.077\,\text{density} + 0.735\,\text{@10landuse(1)}}}$$

where y is the probability of flood occurrence, distriv is distance to river, slope is slope
steepness, density is road density, and @10landuse(1) is the built-up land use category. Note that the
variable 'popdens', with a 'B' value of 0.000, can be left out of the equation because it makes almost
no contribution to the prediction.
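
The predictive equation can be applied cell by cell, which is what the GIS raster calculator step does. The sketch below uses the coefficients reported in Table 5.1d (Step 7); the function name and the sample input values are hypothetical, and the units of each raster are assumed to match those used when the model was fitted.

```python
import numpy as np

# Coefficients as reported in Table 5.1d; popdens (B = .000) is left out.
CONSTANT = 1.802
COEF = {"distriv": -0.002, "slope": -0.026, "density": 0.077, "builtup": 0.735}

def flood_probability(distriv, slope, density, builtup):
    """Logistic transform of the linear predictor; accepts scalars or arrays."""
    z = (CONSTANT
         + COEF["distriv"] * np.asarray(distriv, dtype=float)
         + COEF["slope"] * np.asarray(slope, dtype=float)
         + COEF["density"] * np.asarray(density, dtype=float)
         + COEF["builtup"] * np.asarray(builtup, dtype=float))
    return 1.0 / (1.0 + np.exp(-z))  # same as e^z / (1 + e^z)

# Hypothetical cells: near vs far from a river, both built up.
p = flood_probability(distriv=[50, 1500], slope=[2, 2], density=[5, 5], builtup=[1, 1])
```

Because the coefficient of distriv is negative, the cell farther from the river receives the lower probability.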
The overall performance of the model is given by the classification table (Table 5.1e) and the model
summary (Table 5.1f). The classification table shows that 67.9% of the cases were correctly classified.
However, the Nagelkerke R square indicates that only 27.6% of the variation in the dependent variable can
be explained by the independent variables.
Table 5.1e Classification Table^a

                                  Predicted
                                  Flood             Percentage
Observed                          No       Yes      Correct
Step 7    Flood    No             62       33       65.3
                   Yes            28       67       70.5
          Overall Percentage                        67.9

a. The cut value is .500
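
The percentages in the classification table follow directly from the four cell counts, as this short sketch shows (variable names are illustrative).

```python
# Counts from Table 5.1e (rows: observed, columns: predicted).
true_neg, false_pos = 62, 33   # observed "No"
false_neg, true_pos = 28, 67   # observed "Yes"

total = true_neg + false_pos + false_neg + true_pos
pct_no_correct = 100 * true_neg / (true_neg + false_pos)
pct_yes_correct = 100 * true_pos / (false_neg + true_pos)
pct_overall = 100 * (true_neg + true_pos) / total

print(round(pct_no_correct, 1), round(pct_yes_correct, 1), round(pct_overall, 1))
```

The counts also show that the dataset contains 95 non-occurrence and 95 occurrence points, 190 cases in total.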
Table 5.1f Model Summary

Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
7       219.325b             .207                    .276
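
The two pseudo R square values can be reproduced from the -2 log likelihood. The sketch below assumes the baseline is an intercept-only model fitted to the 190 cases (95 flooded, 95 not flooded) implied by the classification table; that baseline is an assumption, since the step 0 output is not shown in this section.

```python
import math

n_yes, n_no = 95, 95            # from the row totals of the classification table
n = n_yes + n_no
dev_model = 219.325             # -2 log likelihood at step 7

# -2 log likelihood of an intercept-only model (assumed baseline).
p = n_yes / n
dev_null = -2 * (n_yes * math.log(p) + n_no * math.log(1 - p))

cox_snell = 1 - math.exp(-(dev_null - dev_model) / n)
nagelkerke = cox_snell / (1 - math.exp(-dev_null / n))

print(round(cox_snell, 3), round(nagelkerke, 3))
```

Nagelkerke's measure rescales Cox & Snell's by its maximum attainable value, which is why it is the larger of the two.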
Map 5.1 FLOOD OCCURRENCE PROBABILITY MAP BASED ON BINARY LOGISTIC
REGRESSION ANALYSIS SCENARIO A
Map 5.1 shows flood probability as predicted by the significant predictors from binary
logistic regression scenario A, which used OCCURRENCE points of flooding locations from 2004
to 2011. The significant predictor variables are distance to river, slope, road density, and land use.
The computed probability was classified into five categories with equal class intervals: very low
(0.00 – 0.20), low (0.21 – 0.40), moderate (0.41 – 0.60), high (0.61 – 0.80), and very high (0.81 and
above). The model predicts that a vast area, from Taman Ibu Kota in the northeast through the city
centre local district of Bandar Kuala Lumpur to Taman Overseas Union in the south, has a
very high probability of flooding. Flooding is very unlikely to occur in the areas near
Taman Bukit Maluri, Jinjang Utara, Taman Kepong, and Taman Cheras.
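
The five-class equal-interval scheme can be sketched with NumPy as follows. The function name and sample values are illustrative; the class breaks are those stated in the text.

```python
import numpy as np

LABELS = ["very low", "low", "moderate", "high", "very high"]
BREAKS = [0.2, 0.4, 0.6, 0.8]   # upper bounds of the first four classes

def classify_probability(prob):
    """Return the class index (0 to 4) for each probability value.
    right=True places a value equal to a break in the lower class."""
    return np.digitize(prob, BREAKS, right=True)

sample = np.array([0.0, 0.2, 0.21, 0.6, 0.8, 0.81, 1.0])
classes = classify_probability(sample)
names = [LABELS[i] for i in classes]
```

With these breaks, 0.20 falls in "very low" and 0.81 in "very high", matching the class limits listed above.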
Map 5.2 FLOOD OCCURRENCE PROBABILITY MAP BASED ON BINARY LOGISTIC
REGRESSION ANALYSIS SCENARIO B
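$$\operatorname{logit}(y) = 11.348 - 0.53\,\text{rf11max} - 0.002\,\text{distriv} + 0.690\,\text{plancur} - 0.021\,\text{fill} + 0.621\,\text{lu10(1)} - 0.206\,\text{drainde}$$

$$y = \frac{e^{11.348 - 0.53\,\text{rf11max} - 0.002\,\text{distriv} + 0.690\,\text{plancur} - 0.021\,\text{fill} + 0.621\,\text{lu10(1)} - 0.206\,\text{drainde}}}{1 + e^{11.348 - 0.53\,\text{rf11max} - 0.002\,\text{distriv} + 0.690\,\text{plancur} - 0.021\,\text{fill} + 0.621\,\text{lu10(1)} - 0.206\,\text{drainde}}}$$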
Map 5.2 shows the predicted probability of flood occurrence based on binary logistic
regression scenario B, which used sample OCCURRENCE points from the available flood areal extent
for 2008. The significant variables are rainfall, distance to river, planar curvature, elevation,
land use, and drainage density. In both scenarios, distance to river and land use (built up or non-
built up) remain important factors that determine the likelihood of flooding. An even
wider area was classified as having a very high chance of flooding. Moderate and low flood probabilities
were predicted near Bukit Kiara, Universiti Malaya, Ulu Klang, Taman Cheras, and Sungai Besi.
Map 5.3 LEVEL OF TOTAL VULNERABILITY TO FLOOD HAZARD
Map 5.4 COPING CAPACITY LEVEL
Map 5.3 shows that vulnerability is generally low (green). The highly vulnerable (pink)
areas were identified mainly in Batu, Setapak, and Petaling. Map 5.4 shows that the majority of the
study area has high coping capacity, with low coping capacity detected in the western part of Batu,
Ulu Klang, and the southern part of Petaling.
The combination of predominantly very high hazard (scenario A probability in this case) and the
more variable but generally low vulnerability results, in map 5.5, in a mixed distribution pattern of
low, moderate, and high values of (hazard x vulnerability). Pink areas are areas with high
vulnerability as identified in map 5.3.
Map 5.5 HAZARD X VULNERABILITY
Finally, taking into account the coping capacity element, which is predominantly high, results in
risk map 5.6. The middle area, where the hazard level is high, has low risk due to high coping capacity.
High risk areas appear in highly vulnerable areas with lower coping capacity, especially in Setapak and
Petaling. Very high flood risk was observed on the western side of Taman Overseas
Union.
Map 5.6 FLOOD RISK = HAZARD X VULNERABILITY / COPING CAPACITY
Risk Score    Risk Level    Cell Count    %
0             Very Low      2511          6.14
1             Low           11990         29.34
2             Moderate      15752         38.54
3             High          9214          22.55
4 and 5       Very High     1402          3.43
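
The risk formula of Map 5.6 and the percentage column of the table above can be sketched as follows. The function names and the sample cell values are hypothetical; how the thesis rescaled the raw index to risk scores 0 to 5 is not stated in this section, so that step is not reproduced here.

```python
import numpy as np

def risk_index(hazard, vulnerability, coping_capacity):
    """Cell-by-cell risk = hazard x vulnerability / coping capacity."""
    return hazard * vulnerability / coping_capacity

def class_percentages(counts):
    """Share of cells in each risk class, in percent (two decimals)."""
    counts = np.asarray(counts, dtype=float)
    return np.round(100 * counts / counts.sum(), 2)

# Hypothetical single-cell example of the risk formula.
example = risk_index(np.array([0.8]), np.array([2.0]), np.array([4.0]))

# Cell counts per risk level as reported in the table above.
cell_counts = [2511, 11990, 15752, 9214, 1402]
shares = class_percentages(cell_counts)
```

The counts sum to 40,869 cells, and the computed shares reproduce the reported percentage column.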