Reason for Doing Cluster Analysis

• Identify similar and dissimilar aerosol monitoring sites so that we can test the ability of the Causes of Haze Assessment methods to explain the similarities and differences.

Objectives of the Cluster Analysis

• Group the sites with similar chemical speciation together. Discuss the common properties in the clusters and the difference between the clusters, as well as the major factors that decide the grouping.

• Discuss the different causes of the regional haze for different groups of locations.

• Identify the anomalous sites (nearby sites in different groups, distant sites in the same group, etc.). Discuss the differences in aerosol properties at those sites and the major chemical components that contribute to the differences. Interpret the possible reasons for each anomaly.

• Investigate the temporal stability of the clusters. Discuss the temporal trend of the aerosol properties and the possible reasons for that.

• Obtain information for further detailed data analysis of specific regions and locations.

Cluster Analysis Methodology (Tree Clustering)

• Elements in the cluster analysis

– Quarterly averaged concentrations of major chemical species, normalized to the mean values of the whole network:

– Coarse Mass

– Elemental Carbon

– Organic Carbon

– Sulfate

– Nitrate

– Fine Soil Mass

• Euclidean distance:

– $\mathrm{Distance}(x, y) = \left[\sum_i (x_i - y_i)^2\right]^{1/2}$

• Linkage rule:

– Weighted pair-group average: the distance between two clusters is calculated as the average distance between all pairs of objects in the two different clusters, weighted by the size of the respective clusters.
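The elements and distance measure listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes that each feature (one species in one quarter) is divided by its mean over all sites. The site names and concentration values are hypothetical, and only one quarter is shown to keep the example short.

```python
import math

def normalize_to_network_mean(site_values):
    """Divide every feature by its mean over all sites in the network.

    Assumption: normalization is applied feature by feature.
    """
    sites = list(site_values)
    n_features = len(site_values[sites[0]])
    means = [
        sum(site_values[s][j] for s in sites) / len(sites)
        for j in range(n_features)
    ]
    return {
        s: [site_values[s][j] / means[j] for j in range(n_features)]
        for s in sites
    }

def euclidean_distance(x, y):
    """Distance(x, y) = [sum_i (x_i - y_i)^2]^(1/2)."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Hypothetical values, ordered as: coarse mass, EC, OC, sulfate, nitrate, fine soil.
raw = {
    "SITE_A": [4.0, 0.2, 1.0, 1.5, 0.5, 0.8],
    "SITE_B": [8.0, 0.4, 2.0, 0.5, 0.3, 1.6],
    "SITE_C": [6.0, 0.3, 3.0, 1.0, 0.4, 0.6],
}
normalized = normalize_to_network_mean(raw)
print(euclidean_distance(normalized["SITE_A"], normalized["SITE_B"]))
```

Normalizing first keeps species with large absolute concentrations, such as coarse mass, from dominating the distance.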

Join together objects into successively larger clusters, using some measure of similarity or distance. The higher the level of aggregation, the less similar are the members in the respective class.
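The joining procedure can be illustrated with a small pure-Python sketch. It takes the between-cluster distance to be the mean Euclidean distance over all cross-cluster pairs of objects, which is one reading of the linkage description in these slides. The exact weighting applied by the authors' software is not specified here, so treat this as an assumption. The function names and sample points are hypothetical.

```python
import math
from itertools import combinations

def euclidean_distance(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def pair_average(points, a, b):
    """Mean distance over all pairs with one object in cluster a and one in b."""
    total = sum(euclidean_distance(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))

def tree_clustering(points):
    """Merge the two closest clusters until one remains.

    Returns the merge history as (cluster_a, cluster_b, linkage_distance).
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: pair_average(points, pair[0], pair[1]),
        )
        merges.append((a, b, pair_average(points, a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges

def cut_tree(n_points, merges, max_distance):
    """Apply only the merges whose linkage distance is at most max_distance."""
    clusters = {(i,) for i in range(n_points)}
    for a, b, distance in merges:
        if distance > max_distance:
            break
        clusters -= {a, b}
        clusters.add(tuple(sorted(a + b)))
    return sorted(clusters)

# Hypothetical normalized profiles: two tight pairs that are far apart.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
merges = tree_clustering(points)
print(cut_tree(len(points), merges, max_distance=2.0))
```

Cutting the merge history at a chosen linkage distance is how a fixed set of clusters can be read off a tree diagram.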

[Figure: Tree diagram for 141 cases (weighted pair-group average, Euclidean distances). Axes: site codes; linkage distance, 0 to 20.]

Cluster Tree Diagram Based on Normalized Quarterly Average (1/2001-10/2002) Major Aerosol Chemical Speciation

27 Clusters Based on Tree Diagram

[Figure: average concentrations by cluster for clusters A to Z and 1. Legend: otherfm, AvgOfsoilf:Obs_Value, AvgOfso4f:Obs_Value, AvgOfno3f:Obs_Value, AvgOfocf:Obs_Value, AvgOfecf:Obs_Value, PM10.]

[Figure: CM/FM, EC/OC, and Sulfate/FM by cluster for clusters A to Z and 1. Axes: CM/FM; EC/OC, Sulfate/FM.]

Elevation Adjustment

• Objective

– Eliminate the influence of elevation on air density.

• Methodology

– All concentrations are normalized to ground level (elevation = 0) based on the elevation of each site and the air density profile of the standard atmosphere.
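One way to carry out such an adjustment is sketched below. The slides do not give the formula, so this example assumes that a measured concentration is divided by the ratio of standard-atmosphere air density at the site elevation to the density at sea level. The constants are the usual International Standard Atmosphere values for the troposphere, and the function names are hypothetical.

```python
# International Standard Atmosphere constants (valid below about 11 km).
T0 = 288.15      # sea-level temperature, K
LAPSE = 0.0065   # temperature lapse rate, K/m
G = 9.80665      # gravitational acceleration, m/s^2
M = 0.0289644    # molar mass of dry air, kg/mol
R = 8.31446      # universal gas constant, J/(mol K)

def density_ratio(elevation_m):
    """Air density at the given elevation divided by sea-level density."""
    return (1.0 - LAPSE * elevation_m / T0) ** (G * M / (R * LAPSE) - 1.0)

def adjust_to_ground_level(concentration, elevation_m):
    """Scale a concentration measured at elevation to ground-level density.

    Assumption: the adjustment is a simple division by the density ratio.
    """
    return concentration / density_ratio(elevation_m)

# Hypothetical site at 2000 m measuring 1.0 ug/m3 of sulfate.
print(density_ratio(2000.0))
print(adjust_to_ground_level(1.0, 2000.0))
```

Under this assumption, a site at 2000 m has its concentrations scaled up by roughly a fifth, while a sea-level site is left unchanged.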

23 Clusters Based on Elevation Adjusted Aerosol Chemical Speciation

[Figure: average concentrations by cluster for clusters A to W. Legend: otherfm, AvgOfsoilf:Obs_Value, AvgOfso4f:Obs_Value, AvgOfno3f:Obs_Value, AvgOfocf:Obs_Value, AvgOfecf:Obs_Value, PM10.]

[Figure: CM/FM, EC/OC, and Sulfate/FM by cluster for clusters A to W. Axes: CM/FM; EC/OC, Sulfate/FM.]

Examples of Anomalies

[Figure: monitoring sites labeled by cluster letter.]

• Saguaro West and Saguaro National Monument

• Yosemite National Park and Hoover

• Washington D.C. and Arendtsville

Difference of Aerosol Chemical Speciation Between SAGU1(R) and SAWE1(S) for Quarter 1 – 4

Both sites have relatively high soil concentrations. Aerosol loading is generally higher at Saguaro West than at Saguaro National Monument. Most of the difference is caused by soil and coarse mass.

[Figure: quarterly average concentrations at SAGU1 and SAWE1, Quarters 1 to 4. Legend: AvgOfsoilf:Obs_Value, AvgOfso4f:Obs_Value, AvgOfno3f:Obs_Value, AvgOfocf:Obs_Value, AvgOfecf:Obs_Value, AvgOfcm:Obs_Value.]

[Figure: contribution of each species (cm, ec, oc, no3, so4, soil) to the difference of two sites, by quarter.]

Map of SAGU and SAWE

Difference of Aerosol Chemical Speciation Between YOSE1(G) and HOOV1(L) for Quarter 1 – 4

During Quarter 1, Hoover has much more soil and coarse mass in its aerosol than Yosemite National Park, while Yosemite National Park has more OC/EC and nitrate throughout the year.

[Figure: quarterly average concentrations at YOSE1 and HOOV1, Quarters 1 to 4. Legend: AvgOfsoilf:Obs_Value, AvgOfso4f:Obs_Value, AvgOfno3f:Obs_Value, AvgOfocf:Obs_Value, AvgOfecf:Obs_Value, AvgOfcm:Obs_Value.]

[Figure: contribution of each species (cm, ec, oc, no3, so4, soil) to the difference of two sites, by quarter.]

Map of YOSE and HOOV

Difference of Aerosol Chemical Speciation Between AREN1(I) and WASH1(M) for Quarter 1 – 4

During the cold seasons (Quarters 1 and 4), the aerosol loading at Washington D.C. is much higher than at Arendtsville. EC is the major chemical component contributing to the difference between Washington and Arendtsville (20% and 19% in Quarters 1 and 2, respectively).

[Figure: quarterly average concentrations at AREN1 and WASH1, Quarters 1 to 4. Legend: AvgOfsoilf:Obs_Value, AvgOfso4f:Obs_Value, AvgOfno3f:Obs_Value, AvgOfocf:Obs_Value, AvgOfecf:Obs_Value, AvgOfcm:Obs_Value.]

[Figure: contribution of each species (cm, ec, oc, no3, so4, soil) to the difference of two sites, by quarter.]

Cluster Temporal Stability Analysis

• Objective

– Investigate the temporal trend of the aerosol properties and the possible reasons for it.

• Methodology

– For a given site, in each year when aerosol measurements are available, calculate the distance between the site and each cluster obtained from the 2001-2002 aerosol data.

– Assign the site to the cluster with the shortest distance for the given year.
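The two steps above amount to a nearest-cluster assignment, sketched below. The slides do not say how the distance between a site and a cluster is measured. This example assumes each cluster is represented by the mean profile (centroid) of its member sites and uses the Euclidean distance to it. The profiles, years, and function names are hypothetical. The cluster labels B and K are borrowed only for illustration.

```python
import math

def euclidean_distance(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def centroid(profiles):
    """Mean profile of a cluster's member sites (assumed cluster representation)."""
    n = len(profiles)
    return [sum(p[j] for p in profiles) / n for j in range(len(profiles[0]))]

def assign_cluster(site_profile, centroids):
    """Return the label of the cluster with the shortest distance to the site."""
    return min(
        centroids,
        key=lambda label: euclidean_distance(site_profile, centroids[label]),
    )

def yearly_assignments(profiles_by_year, centroids):
    """Assign a cluster for every year in which the site has data."""
    return {
        year: assign_cluster(profile, centroids)
        for year, profile in sorted(profiles_by_year.items())
    }

# Hypothetical normalized profiles (two features only, for brevity).
centroids = {
    "B": centroid([[0.8, 1.4], [1.0, 1.6]]),   # more sulfate-like feature
    "K": centroid([[1.8, 0.6], [2.0, 0.8]]),   # more soil-like feature
}
site_by_year = {1999: [1.7, 0.9], 1997: [1.0, 1.3]}
print(yearly_assignments(site_by_year, centroids))
```

Years without measurements are simply absent from the result, which matches the gaps in the stability table.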

Cluster Stability (Examples)

Cluster assignment for each year from 1988 to 2001, listed in chronological order for the years with available data:

ACAD1: A H B B B B B A A A A A A
BAND1: L D D D A D D D D D D D D D
BIBE1: K B B B D B B B K B B K K K
BRCA1: D D D D D D D D D D D D D D
CANY1: L D D D D D D D D D D D D D
CHIR1: L D D D D D D D L D D D D E
CRLA1: C C C D D C C A C A C C G
DENA1: A A D A A A A A A A A A A
DOSO1: H H H H H H H H H
GICL1: D D D C D G D
GLAC1: C N B B B B B C C B C C C C
GRBA1: D D D D D D D C D
GRCA2: D D D D
GRSM1: H H H H H H H H H H H H H H
GUMO1: K K K K D D Q K R D K K K K
INGA1: D D D D D D D D D D D D
LYBR1: A H B A A A A A A
MEVE1: D D D D A D D D D D D K D D
MOOS1: A A A A A A A
MORA1: B B B B B A A A C C C C A A
MOZI1: D D D D L C D
OKEF1: H H B H B B B H H H
PEFO1: L D D D D D D D D D D L D D
PINN1: F N F F F F F F F F F F F F
PORE1: E E E E E E E E E E E E E E
REDW1: B A A A A A A A A A A C A A
ROMO1: D D D D D D D D D D D
SAGO1: V V V V V V V V V J V V V V
SHEN1: H H H H H H H I H H H H H H
SHRO1: H H H H H H H
THSI1: C C C C C C C
TONT1: E D D B D D D E D D E D D
UPBU1: I I I I I I I I I
VIIS1: Q Q Q Q
WASH1: M M M M M M M M M M M M M
YOSE1: L D F D F D C F C D G A G

Big Bend National Park (BIBE1)

Cluster K has more soil and coarse mass, while cluster B has more EC/OC and sulfate. The relative contributions of dust and anthropogenic sources determine whether BIBE1 falls in cluster B or K.

Location 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001-2002
BIBE1 K B B B D B B B K B B K K K

[Figure: quarterly average concentrations for clusters B and K, Quarters 1 to 4. Legend: AvgOfsoilf:Obs_Value, AvgOfso4f:Obs_Value, AvgOfno3f:Obs_Value, AvgOfocf:Obs_Value, AvgOfecf:Obs_Value, AvgOfcm:Obs_Value.]

[Figure: contribution of each species (cm, ec, oc, no3, so4, soil) to the difference of two sites, by quarter.]

Gila Wilderness (GICL1)

The shift of Gila Wilderness from cluster D to cluster G is caused by a dramatic increase in OC/EC concentrations in Quarter 3, which may be due to specific pollution events.

Location 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001-2002
GICL1 D D D C D G D

[Figure: quarterly average concentrations for clusters D and G, Quarters 1 to 4. Legend: AvgOfsoilf:Obs_Value, AvgOfso4f:Obs_Value, AvgOfno3f:Obs_Value, AvgOfocf:Obs_Value, AvgOfecf:Obs_Value, AvgOfcm:Obs_Value.]

[Figure: contribution of each species (cm, ec, oc, no3, so4, soil) to the difference of two sites, by quarter.]

Wildfires surrounding the Gila Wilderness in 2000

Acadia National Park (ACAD1)

Aerosol loading at ACAD1, especially the coarse mass concentration, has decreased, moving the site from Cluster B to Cluster A in recent years.

Location 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001-2002
ACAD1 A H B B B B B A A A A A A

[Figure: quarterly average concentrations for clusters A and B, Quarters 1 to 4. Legend: AvgOfsoilf:Obs_Value, AvgOfso4f:Obs_Value, AvgOfno3f:Obs_Value, AvgOfocf:Obs_Value, AvgOfecf:Obs_Value, AvgOfcm:Obs_Value.]

[Figure: contribution of each species (cm, ec, oc, no3, so4, soil) to the difference of two sites, by quarter.]

Yosemite National Park (YOSE1)

A very unstable site. Compared with D, A has much less soil and coarse mass, especially in Quarter 2; C has more OC in Quarters 3 and 4 and less soil and CM in Quarters 1 and 2; F has much more nitrate during Quarters 2 to 4; G has more EC/OC in Quarter 3.

Location 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001-2002
YOSE1 L D F D F D C F C D G A G

[Figure: comparison of clusters A, C, F, and G with cluster D by species (cm, ec, oc, no3, so4, soil) and quarter. Legend: A/D, C/D, F/D, G/D.]

Summary of Preliminary Results

• Air density adjustments explain why data from some sites differ from others

• Anomalous sites are often caused by exposure to nearby sources

• Unstable sites seem to result from changing source influence (smoke from fires, longer-term emission trends, etc.)

Next Steps

• Identify groups of sites with similar worst 20% visibility days and try to understand source implications by

– Examining seasonality of worst days

– Identifying responsible species

– Characterizing meteorology during worst days

– Examining anomalous and unstable sites