

A Comparison of Multiple Algorithms for Deriving Regional-Scale Biomass Maps with Airborne Lidar Metrics and Multispectral Datasets

Nian-Wei Ku and Sorin C. Popescu

Lidar Applications for the Study of Ecosystems with Remote Sensing Laboratory (LASERS), Department of Ecosystem Science and Management, Texas A&M University

Abstract

Aboveground biomass maps show the spatial distribution of aboveground biomass across rangelands and forests. This research investigated approaches for generating such maps. We used three approaches to generate regional-scale aboveground biomass maps from different combinations of airborne lidar metrics and a multispectral dataset. The first and second approaches, stepwise regression and the least absolute shrinkage and selection operator (LASSO), established regression relationships between the in-situ aboveground biomass samples and the lidar metrics and multispectral dataset, and these relationships were used to build a biomass map. The third approach supplied the airborne lidar metrics and multispectral dataset to the random forest algorithm to generate an aboveground biomass map. The goal of this study is to identify the most suitable approach for deriving aboveground biomass with airborne lidar and multispectral remote sensing technologies. Stepwise regression and LASSO produced similar results: with both, the lidar metrics and multispectral dataset have limitations in explaining the variance in aboveground biomass. The random forest approach, however, produced feasible estimates of aboveground biomass at the regional scale. In summary, the results indicate that the random forest approach is the most reliable and reasonable of the three for generating an aboveground biomass map. Moreover, the point density of the airborne lidar data constrains the accuracy of the map, and the NAIP image dataset was useful for creating an acceptable aboveground biomass map when lidar data are difficult or expensive to acquire.

Methods and Materials

Airborne lidar and multispectral data preprocessing

The list of lidar metrics

The aboveground biomass map made by Stepwise Regression

Results

Discussions and Conclusion

Contact

Nian-Wei Ku (Tony)

Lidar Applications for the Study of Ecosystems with Remote Sensing Laboratory (LASERS)

Department of Ecosystem Science and Management

Texas A&M University

[email protected]

Selected References

Methods and Materials

Results

Study area

Figure: (A) false-color NAIP image (near-infrared, red, and green) of the Smith Walker research unit with the study transect locations; (B) classified image of the study area.

Image Layers

Segmentation (algorithm: multiresolution segmentation)

Criteria:

o Layer weight

o Scale parameter

o Color vs. shape

o Compactness vs. smoothness

Supervised Classification

Classes:

o Grass

o Mesquite

o Non-vegetation

o Water

Final Classification Map

Post-Classification Accuracy Assessment:

o Overall accuracy

o Kappa coefficient

The figure (above) is a conceptual flowchart of the object-oriented classification with the multiresolution segmentation algorithm.
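The two accuracy measures named in the flowchart, overall accuracy and the kappa coefficient, can both be computed from a confusion matrix. The sketch below is only an illustration of the standard definitions: the function name `accuracy_and_kappa` and the example counts are hypothetical and are not the poster's classification results.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i
    that the classifier assigned to class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n_classes)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for the classes grass, mesquite, non-vegetation, water.
example = [
    [40, 5, 3, 2],
    [4, 35, 1, 0],
    [2, 1, 30, 2],
    [0, 0, 1, 24],
]
overall, kappa = accuracy_and_kappa(example)
print(f"overall accuracy = {overall:.3f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement that the class proportions alone would produce, which is why it is usually reported next to overall accuracy.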

Woody plant biomass field measurements and data processing

The woody plant aboveground biomass field data were collected during the leaf-off season, in December 2008, March 2009, and December 2009, from 18 study transects in the Smith Walker research unit.

Woody plant aboveground biomass was estimated with the following allometric equation, which relates the basal stem diameter, measured 5 cm to 15 cm above the ground, to the total woody plant mass (Ansley et al., 2010).

$$Y = 0.34\,x^{1.73}$$

where Y is the total woody plant mass (kg) and x is the basal stem diameter (cm).
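As a worked illustration of the allometric equation, the sketch below evaluates Y = 0.34 x^1.73 for a few stems. The function name `woody_biomass_kg` and the diameters are hypothetical, and summing the per-stem values into a transect total is an assumption made for the example, not a description of the authors' processing.

```python
def woody_biomass_kg(basal_diameter_cm):
    """Total woody plant mass (kg) from basal stem diameter (cm): Y = 0.34 * x**1.73."""
    if basal_diameter_cm < 0:
        raise ValueError("basal stem diameter must be non-negative")
    return 0.34 * basal_diameter_cm ** 1.73


# Hypothetical basal stem diameters (cm), not field data from the study.
diameters_cm = [2.5, 4.0, 7.5, 10.0]
for d in diameters_cm:
    print(f"x = {d:5.1f} cm -> Y = {woody_biomass_kg(d):6.2f} kg")

# Assumption for illustration: a transect total is the sum over its stems.
transect_total_kg = sum(woody_biomass_kg(d) for d in diameters_cm)
print(f"transect total: {transect_total_kg:.2f} kg")
```

Because the exponent is greater than 1, mass grows faster than diameter: doubling the diameter multiplies the estimated mass by 2^1.73, a factor of roughly 3.3.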

Flowchart of the data preprocessing (boxes recovered from the figure):

Inputs:

o 2010 lidar point cloud data

o 2010 NAIP imagery

NAIP imagery:

o 1 m blue, green, red, and near-infrared bands

o Create 1 m NDVI

o Convert NAIP to polygon

o Clip the polygon to one-by-one cells

Lidar point cloud:

o Create 1 m DEM

o Generate point cloud data without elevation

o Extract point cloud data between 0 and 7 m

o Create 1 m CHM

o Calculate lidar metrics at each cell as a CSV table

o Generate layers from each metric

o Final 27 lidar metric layers
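The "Create 1 m NDVI" step can be illustrated with the standard definition NDVI = (NIR - Red) / (NIR + Red). The sketch below uses plain Python lists in place of raster bands. The function names, the tiny example grids, and the choice to return 0.0 where both bands are zero are assumptions for illustration and do not describe the GIS workflow used for the poster.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        # Assumption: a pixel with no signal in either band is reported as 0.0.
        return 0.0
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() cell by cell to two equally sized 2-D grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2 x 2 digital numbers for the near-infrared and red bands.
nir_band = [[200, 180], [90, 60]]
red_band = [[100, 60], [90, 120]]
for row in ndvi_grid(nir_band, red_band):
    print([round(value, 3) for value in row])
```

Values near 1 indicate dense green vegetation, values near 0 indicate bare ground, and negative values typically correspond to water or other non-vegetated surfaces.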

Index | Variable

1 | Canopy height model (CHM)

2 | Minimum height (minimum value for cell)

3 | Maximum height (maximum value for cell)

4 | Mean height (mean value for cell)

5 | Mode height, the height with the most returns (mode value for cell)

6 | Standard deviation of cell values

7 | Variance of cell values

8 | Coefficient of variation for cell

9 | Interquartile range (IQR)

10 | Skewness computed for cell

11 | Kurtosis computed for cell

12 | 1st percentile value for cell

13 | 5th percentile value for cell

14 | 10th percentile value for cell

15 | 20th percentile value for cell

16 | 25th percentile value for cell

17 | 30th percentile value for cell

18 | 40th percentile value for cell

19 | 50th percentile value for cell

20 | 60th percentile value for cell

21 | 70th percentile value for cell

22 | 75th percentile value for cell

23 | 80th percentile value for cell

24 | 90th percentile value for cell

25 | 95th percentile value for cell

26 | 99th percentile value for cell

27 | Generalized mean for the 2nd power, p = 2 (height quadratic mean)

28 | Generalized mean for the 3rd power, p = 3 (height cubic mean)
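Most of the metrics listed above are simple summary statistics of the return heights that fall inside one cell. The sketch below shows one way to compute them in plain Python. It is not the software used for the poster, and several details are assumptions: percentiles use linear interpolation, variance uses the population form, kurtosis is the non-excess form, and the mode is left out because it depends on a height binning the poster does not specify.

```python
import math


def percentile(sorted_values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    if len(sorted_values) == 1:
        return sorted_values[0]
    position = (len(sorted_values) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def generalized_mean(values, p):
    """Power mean: p = 2 is the quadratic mean, p = 3 the cubic mean."""
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def cell_metrics(heights):
    """Summary metrics for the (non-negative) return heights in one cell."""
    h = sorted(heights)
    n = len(h)
    mean = sum(h) / n
    variance = sum((v - mean) ** 2 for v in h) / n  # population form (assumption)
    std = math.sqrt(variance)
    third = sum((v - mean) ** 3 for v in h) / n
    fourth = sum((v - mean) ** 4 for v in h) / n
    metrics = {
        "min": h[0],
        "max": h[-1],
        "mean": mean,
        "std": std,
        "variance": variance,
        "cv": std / mean if mean else float("nan"),
        "iqr": percentile(h, 75) - percentile(h, 25),
        "skewness": third / std ** 3 if std else 0.0,
        "kurtosis": fourth / std ** 4 if std else 0.0,  # non-excess (assumption)
        "quadratic_mean": generalized_mean(h, 2),
        "cubic_mean": generalized_mean(h, 3),
    }
    for q in (1, 5, 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 95, 99):
        metrics[f"p{q:02d}"] = percentile(h, q)
    return metrics


# Hypothetical return heights (m) within one 1 m cell.
print(cell_metrics([0.2, 0.4, 1.1, 2.5, 3.0, 3.2, 4.8]))
```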

The aboveground biomass map made by LASSO

The aboveground biomass map made by Random Forests

Stepwise Regression

Data | Coefficients | MSE | Adj. R² | R²

NAIP | Intercept: 1.14; NDVI: 18.46 | 13.23 | 0.34 | 0.34

Lidar | Intercept: 0.97; CHM: 1.69; Skewness: 0.39; Kurtosis: -0.40; 50th percentile: 0.25 | 9.58 | 0.52 | 0.52

NAIP & Lidar | Intercept: 0.83; CHM: 1.74; Kurtosis: -0.24; 50th percentile: 0.14 | 9.66 | 0.52 | 0.52
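Each row of the stepwise regression table defines a linear predictor: the intercept plus the sum of each coefficient times its variable. The sketch below applies the lidar-only coefficients reported above to one cell. The predictor values are hypothetical, the names `STEPWISE_LIDAR` and `predict_linear` are introduced only for this example, and the poster does not state the units of the response or whether any transformation was applied, so the result should be read as the model output rather than as a verified biomass value.

```python
# Coefficients copied from the lidar-only row of the stepwise regression table.
STEPWISE_LIDAR = {
    "intercept": 0.97,
    "chm": 1.69,
    "skewness": 0.39,
    "kurtosis": -0.40,
    "p50": 0.25,
}


def predict_linear(coefficients, predictors):
    """Intercept plus the sum of coefficient * predictor value."""
    value = coefficients["intercept"]
    for name, coefficient in coefficients.items():
        if name != "intercept":
            value += coefficient * predictors[name]
    return value


# Hypothetical metric values for a single 1 m cell.
cell = {"chm": 2.0, "skewness": 0.5, "kurtosis": 2.0, "p50": 1.0}
print(f"model output: {predict_linear(STEPWISE_LIDAR, cell):.2f}")
```

Applying the same function to every cell of the metric layers would produce a map in the spirit of the stepwise regression panels.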

LASSO

Data | Coefficients | MSE | Adj. R² | R²

NAIP | Intercept: 6.13; NDVI: 5.43; Red: -0.03 | 12.87 | 0.36 | 0.36

Lidar | Intercept: 1.12; CHM: 1.36; Variance: 0.04 | 10.08 | 0.50 | 0.50

NAIP & Lidar | Intercept: 2.4; Red: -0.01; CHM: 1.2; Variance: 0.16 | 9.66 | 0.52 | 0.52
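LASSO keeps only a few variables in each model above because its L1 penalty shrinks the remaining coefficients exactly to zero (Tibshirani, 1996). The sketch below illustrates that mechanism with cyclic coordinate descent and soft-thresholding. It is a teaching example and not the software used for the poster: the objective scaling (1/(2n) times the squared error plus `lam` times the L1 norm), the function names, and the toy data are all assumptions.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; return 0.0 when |rho| <= lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - b0 - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of responses. Returns (intercept, coefficients).
    """
    n, p = len(y), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Center the data so the intercept can be recovered at the end.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [value - y_mean for value in y]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            scale = 0.0
            for i in range(n):
                # Prediction from every feature except j (partial residual).
                others = sum(Xc[i][k] * beta[k] for k in range(p) if k != j)
                rho += Xc[i][j] * (yc[i] - others)
                scale += Xc[i][j] ** 2
            rho /= n
            scale /= n
            beta[j] = soft_threshold(rho, lam) / scale if scale > 0 else 0.0
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


# Toy data: y depends only on the first column; the second is irrelevant.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
y = [1.0, 3.0, 5.0, 7.0]
for lam in (0.0, 0.5, 100.0):
    print(lam, lasso_coordinate_descent(X, y, lam))
```

With `lam = 0` the fit reduces to ordinary least squares, a moderate penalty shrinks the useful coefficient while holding the irrelevant one at zero, and a very large penalty removes every variable so that only the intercept remains.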

Random Forests

Data | MSE | Pseudo R²

NAIP | 12.03 | 0.41

Lidar | 9.39 | 0.54

NAIP & Lidar | 7.68 | 0.62
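The poster does not define the pseudo R² reported for random forests. A common definition, assumed here, is one minus the mean squared error divided by the variance of the observed response, with the predictions typically taken from out-of-bag samples. The sketch below shows that calculation on hypothetical numbers; the function names are introduced only for this example.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observations and predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def pseudo_r_squared(observed, predicted):
    """1 - MSE / Var(observed), using the population variance (assumption)."""
    mean_obs = sum(observed) / len(observed)
    variance = sum((o - mean_obs) ** 2 for o in observed) / len(observed)
    return 1.0 - mean_squared_error(observed, predicted) / variance


# Hypothetical observed and predicted biomass values, not data from the study.
observed = [2.0, 5.0, 9.0, 12.0, 4.0]
predicted = [3.0, 4.5, 8.0, 10.0, 5.5]
print(f"MSE = {mean_squared_error(observed, predicted):.2f}")
print(f"pseudo R^2 = {pseudo_r_squared(observed, predicted):.2f}")
```

Under this definition a model that always predicts the mean of the observations scores 0, and a perfect model scores 1.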

For each approach, the figures (above) are the aboveground biomass maps made from (A) NAIP imagery, (B) lidar data, and (C) the combination of NAIP and lidar.

The airborne lidar data give all three approaches enough information to separate woody plants from grass in the aboveground biomass maps.

With the random forests approach, the combination of NAIP and lidar yields the lowest MSE (7.68) and the highest pseudo R-squared (0.62).

The random forests approach performs better than the other two approaches for every combination of remote sensing variables.

Although the NAIP-based aboveground biomass estimates have a higher MSE and a lower R-squared than the lidar-based estimates in all three approaches, they accurately separate bare ground from grass and woody plants.

In contrast, the lidar-based aboveground biomass estimates do not identify bare ground as well.

The point density of the airborne lidar data constrains the accuracy of the map, and the NAIP image dataset was useful for creating an acceptable aboveground biomass map when lidar data are difficult or expensive to acquire.
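The comparison between approaches can be checked directly against the MSE values reported in the three results tables. The sketch below copies those values and finds the approach with the lowest MSE for each dataset; the names `REPORTED_MSE` and `best_approach_per_dataset` are introduced only for this example. The poster does not state whether the three MSE values were computed on the same footing (for example, training error versus out-of-bag error), so the ranking is only as comparable as the reported numbers.

```python
# MSE values copied from the stepwise regression, LASSO, and random forests tables.
REPORTED_MSE = {
    "stepwise regression": {"NAIP": 13.23, "Lidar": 9.58, "NAIP & Lidar": 9.66},
    "LASSO": {"NAIP": 12.87, "Lidar": 10.08, "NAIP & Lidar": 9.66},
    "random forests": {"NAIP": 12.03, "Lidar": 9.39, "NAIP & Lidar": 7.68},
}


def best_approach_per_dataset(mse_table):
    """Map each dataset to the approach with the lowest reported MSE."""
    datasets = list(next(iter(mse_table.values())).keys())
    return {
        dataset: min(mse_table, key=lambda approach: mse_table[approach][dataset])
        for dataset in datasets
    }


for dataset, approach in best_approach_per_dataset(REPORTED_MSE).items():
    print(f"{dataset}: lowest MSE with {approach}")
```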

• Ansley, R. James, Mustafa Mirik, and Michael J. Castellano. "Structural biomass partitioning in regrowth and undisturbed mesquite (Prosopis glandulosa): implications for bioenergy uses." GCB Bioenergy 2.1 (2010): 26-36.

• Breiman, Leo. "Random forests." Machine Learning 45.1 (2001): 5-32.

• Tibshirani, Robert. "Regression shrinkage and selection via the lasso." Journal of the Royal Statistical Society. Series B (Methodological) (1996): 267-288.