
Cushing; LTER ASM QA-QC, Las Cruces, Jan 31-Feb 1, 2007

Grasslands ANPP Data Integration*
JRN, SEV, SGS

1. Project History & BG (ANPP for Grasslands)
2. Data Integration to date: Issues, Questions
3. Data Analysis to date: Issues, Questions
4. ASM Workshop Feedback: Next Steps (May – June, 2007)
5. QA/QC Wish List

* Judy Cushing, Ken Ramsey, Nicole Kaplan, Kristin Vanderbilt

Lee Zeman, Carri Le Roy, Anne Fiala, Judith Kruger, Alan Knapp, Dan Milchunas, Esteban Muldavin


Project History

Are DataBank concepts transferable beyond the canopy?

Can database components help the IMs?

1. Luquillo (Eda) – data visualization
2. Cross-site analysis of NPP (JRN, SEV, SGS).

• Compare production & species richness, using g/m2 per species per quadrat & number of species per quadrat.

• Compare biomass over areas of ecological interest using summary statistics (mean, median, mode, and standard deviation) of g/m2 over biomes at each site.
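The two comparisons above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library; the record layout (site, quadrat, species, g/m2) and every value in `records` are hypothetical and are not taken from the integrated database.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical records: (site, quadrat, species, g_per_m2). All values are made up.
records = [
    ("JRN", "q1", "sp_a", 12.0),
    ("JRN", "q1", "sp_b", 3.0),
    ("JRN", "q2", "sp_a", 8.0),
    ("SEV", "q1", "sp_a", 5.0),
    ("SEV", "q1", "sp_c", 2.0),
    ("SEV", "q1", "sp_d", 1.0),
]


def per_quadrat(records):
    """Return {(site, quadrat): (total g/m2, number of species)}."""
    weight = defaultdict(float)
    species = defaultdict(set)
    for site, quadrat, sp, grams in records:
        weight[(site, quadrat)] += grams
        species[(site, quadrat)].add(sp)
    return {key: (weight[key], len(species[key])) for key in weight}


def site_summary(quadrats, site):
    """Mean, median and standard deviation of quadrat totals at one site."""
    totals = [total for (s, _), (total, _) in quadrats.items() if s == site]
    # stdev needs at least two values; report 0.0 for a single quadrat.
    sd = stdev(totals) if len(totals) > 1 else 0.0
    return {"mean": mean(totals), "median": median(totals), "sd": sd}


quadrats = per_quadrat(records)
jrn = site_summary(quadrats, "JRN")
```

Production and richness come out of the same pass over the records, so the two measures can be compared quadrat by quadrat.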

Original Goals (Eco-informatics/CS)
• Published ecology data integration case study
• Proof of concept for DataBank integration
• Use of CLIO for ecology data integration
• Example of data integration and use of site databases at LTER
• Sample ontology for data integration

Adjusted Goals (Ecology): to know we have done it ‘right’…something of value to the ecologists….


Grasslands Biomass Data Integration Schema

[Diagram: entity-relationship schema centered on ANPP (weight), linked to location, LTER site, LTER subsite, vegzone, species, date/season, TRT?, Year? and unit, with a Site1LocationMap table. The 1/m cardinality labels cannot be assigned to specific links from the extracted text.]


Scientific Background: Modeling Annual Aboveground Productivity in Grasslands

Data Inputs
• Precipitation
• Wind Speed
• Radiation
• Soil Type

Measurements of Biomass
• Satellite imagery
• Flux Tower array
• Plot level harvest

Computational Model (Parameter: Biome Type)

Output: Productivity or Carbon Flow

[Diagram: inputs and biomass measurements feed the computational model; remaining labels are "Productivity" and "Soil & veg type".]


Methods for Above Ground NPP (Collection of Productivity Data)

• Satellite Imagery
• Flux Tower
• Plot Level Harvest

Spatial scales shown on the slide: 100 ha; 10 – 100 km; 0.25 m2 (plot level harvest).


Collection Methods for Above Ground Net Primary Productivity at SGS

Plot Level Harvest (0.25 m2)

Site: SGS

Sampling Design:

6 Sub-Sites: esa, swale, mid-slope, ridge, section 25, and owl creek

3 plots: (called transects) at each sub-site

5 sub-plots: (called plots) at each plot

Total of 90 ¼ m2 sub-plots harvested

Harvest Methods:

Clip at crown-level, except for shrubs.

Plots are clipped by species.

Samples are dried in an oven at 55 °C and weighed in the lab.
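The SGS sampling design above implies a simple count and a unit conversion, sketched here. The constant and function names are illustrative assumptions; only the sub-site names, the 3 × 5 layout, and the 0.25 m2 sub-plot area come from the slide.

```python
# Sub-site names as listed on the slide.
SUB_SITES = ["esa", "swale", "mid-slope", "ridge", "section 25", "owl creek"]
PLOTS_PER_SUB_SITE = 3    # called "transects" at SGS
SUB_PLOTS_PER_PLOT = 5    # called "plots" at SGS
SUB_PLOT_AREA_M2 = 0.25

# Total number of harvested sub-plots and the ground area they cover.
n_sub_plots = len(SUB_SITES) * PLOTS_PER_SUB_SITE * SUB_PLOTS_PER_PLOT
total_area_m2 = n_sub_plots * SUB_PLOT_AREA_M2


def grams_per_m2(dry_weight_g, area_m2=SUB_PLOT_AREA_M2):
    """Scale an oven-dry weight clipped from one sub-plot to g/m2."""
    return dry_weight_g / area_m2
```

For example, a hypothetical 10 g of dry biomass clipped from one sub-plot corresponds to `grams_per_m2(10.0)`, i.e. 40 g/m2.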


Our Test-Case Integration: What’s in the integrated database?

• Aboveground net primary productivity, measured or calculated in autumn.

• Three LTERs: Sevilleta, Jornada, SGS

• NPP by species by plot

• Data from grasslands only: nothing from Sevilleta’s Piñon-Juniper woodland

• Contextual information on species and plots.


Integrated schema (table: fields)

NPP observation: year, species, weight, plot
species: family, c. path, form, com. name, sci. name
plot: area?, study site
Study site: LTER, easting, northing, elevation, vegetation type
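The four-table layout above could be expressed in SQL roughly as follows. This is a sketch under assumptions, not the project's actual database: the SQLite engine, the surrogate `id` keys, the column types, and the inserted rows are all invented for illustration, and `c_path` simply mirrors the slide's "c. path" label without interpreting it.

```python
import sqlite3

# Column names follow the slide; ids, types and constraints are assumptions.
SCHEMA = """
CREATE TABLE study_site (
    id INTEGER PRIMARY KEY,
    lter TEXT NOT NULL,
    easting REAL,
    northing REAL,
    elevation REAL,
    vegetation_type TEXT
);
CREATE TABLE plot (
    id INTEGER PRIMARY KEY,
    area REAL,  -- the slide marks this field "area?", so it may be missing
    study_site_id INTEGER NOT NULL REFERENCES study_site(id)
);
CREATE TABLE species (
    id INTEGER PRIMARY KEY,
    family TEXT,
    c_path TEXT,
    form TEXT,
    common_name TEXT,
    scientific_name TEXT
);
CREATE TABLE npp_observation (
    year INTEGER NOT NULL,
    species_id INTEGER NOT NULL REFERENCES species(id),
    weight REAL NOT NULL,
    plot_id INTEGER NOT NULL REFERENCES plot(id)
);
"""


def create_database(path=":memory:"):
    """Create an empty integrated database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
# Made-up rows, only to show how the tables join.
conn.execute("INSERT INTO study_site VALUES (1, 'SGS', NULL, NULL, NULL, 'Grassland')")
conn.execute("INSERT INTO plot VALUES (1, 0.25, 1)")
conn.execute("INSERT INTO species VALUES (1, NULL, NULL, 'Grass', NULL, 'hypothetical sp.')")
conn.execute("INSERT INTO npp_observation VALUES (2000, 1, 4.2, 1)")

# Total weight per LTER and year, reached through plot and study site.
rows = conn.execute(
    """
    SELECT s.lter, o.year, SUM(o.weight)
    FROM npp_observation AS o
    JOIN plot AS p ON p.id = o.plot_id
    JOIN study_site AS s ON s.id = p.study_site_id
    GROUP BY s.lter, o.year
    """
).fetchall()
```

Keeping species and study-site context in their own tables means an observation row stays small, and cross-site queries become joins rather than per-site special cases.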


Size of the integrated database

• 1093 species
• 44080 NPP measurements
• 1065 plots
• Covers 1989 - 2004

[Figure: number of plots and years covered (1989-2004) for Sevilleta, Jornada, and SGS. Plot counts shown: 735, 240, 90.]


Database Creation

The integrated schema (NPP observation, species, plot, study site) draws on:
• Published LTER NPP data
• Published LTER site metadata (species list, study protocol)
• USDA PLANTS database
• Conversation with ecologists
• Conversation with IMs


Integrating Dominant Vegetation Type

Jornada              Sevilleta              SGS
-                    Blue grama grassland   Grassland
Creosote bush scrub  Larrea core            -
Grassland            Black grama grassland  -
Tarbush flats        -                      -
Mesquite dunes       -                      -
Playa                -                      -

The three LTERs have overlapping vegetation types. Try: cross-site comparison of productivity by equivalent vegetation type.
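A cross-site comparison by equivalent vegetation type needs an explicit crosswalk from each site's term to a shared class. The sketch below is one way to write that down. The shared class labels are hypothetical, the pairings reflect one reading of the vegetation-type table, and the ANPP values are made up.

```python
from collections import defaultdict
from statistics import mean

# (LTER, site-specific vegetation type) -> shared class.
# Class labels are hypothetical; pairings are one reading of the slide's table.
EQUIVALENT_TYPE = {
    ("JRN", "Grassland"): "black grama grassland",
    ("SEV", "Black grama grassland"): "black grama grassland",
    ("JRN", "Creosote bush scrub"): "creosote shrubland",
    ("SEV", "Larrea core"): "creosote shrubland",
    ("SEV", "Blue grama grassland"): "blue grama grassland",
    ("SGS", "Grassland"): "blue grama grassland",
}


def mean_anpp_by_class(observations):
    """observations: iterable of (lter, vegetation_type, anpp_g_per_m2).

    Returns {(shared class, lter): mean ANPP}. Vegetation types with no
    cross-site equivalent (e.g. Playa) are left out of the comparison.
    """
    grouped = defaultdict(list)
    for lter, veg_type, anpp in observations:
        shared = EQUIVALENT_TYPE.get((lter, veg_type))
        if shared is not None:
            grouped[(shared, lter)].append(anpp)
    return {key: mean(values) for key, values in grouped.items()}


# Made-up observations.
observations = [
    ("JRN", "Grassland", 100.0),
    ("JRN", "Grassland", 120.0),
    ("SEV", "Black grama grassland", 80.0),
    ("JRN", "Playa", 50.0),
]
result = mean_anpp_by_class(observations)
```

Keying the crosswalk on the (LTER, term) pair matters because the same word, such as "Grassland", means different things at different sites.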


Integrating growth forms

Sevilleta Jornada SGS Integrated

Tree Tree

Succulent Leaf succulent Succulent Succulent

Stem succulent

Shrub shrub Shrub Shrub

Sub-shrub Sub-shrub Sub-shrub

Herb Fern Forb Forb

Forb

Sedge

Herbaceous vine Herb

Grass Grass Grass Grass

A lowest-common-denominator classification for growth forms across 3 LTERs.
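A lowest-common-denominator classification amounts to a lookup from each site's term to the integrated term. The sketch below is one reading of the table above; in particular, placing Fern, Sedge, and Herbaceous vine under Forb is an assumption based on where those rows sit in the table, and the function name is illustrative.

```python
# Site-level growth form (lower-cased) -> integrated growth form.
# Assumption: Fern, Sedge and Herbaceous vine are folded into Forb.
INTEGRATED_FORM = {
    "tree": "Tree",
    "succulent": "Succulent",
    "leaf succulent": "Succulent",
    "stem succulent": "Succulent",
    "shrub": "Shrub",
    "sub-shrub": "Sub-shrub",
    "herb": "Forb",
    "fern": "Forb",
    "forb": "Forb",
    "sedge": "Forb",
    "herbaceous vine": "Forb",
    "grass": "Grass",
}


def integrate_form(site_form):
    """Map a site-specific growth form to the integrated class.

    Matching ignores case and extra whitespace; unknown forms raise
    ValueError so they are reviewed rather than silently dropped.
    """
    key = " ".join(site_form.lower().split())
    try:
        return INTEGRATED_FORM[key]
    except KeyError:
        raise ValueError(f"no integrated growth form for {site_form!r}") from None
```

Raising on an unmapped form is a deliberate choice here: a new category at one site should trigger a decision by an ecologist, not a default.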


Integrating species

• Integrating species is very difficult

• 156 species found in > 1 LTER

• Species are constantly reclassified, so a timeline was constructed using author and reference.

• USDA Plants used to fill in missing species information.


[Venn diagram: species counts by LTER]
JRN: 203
SEV: 660
SGS: 41
JRN + SEV: 126
SGS + SEV: 14
JRN + SGS: 5
JRN + SEV + SGS: 11

[Figure: timeline of Gilia reclassification, 1997-2002, tracing the names Gilia mexicana, G. sinuata, G. flavocincta, and G. ophthalmoides from year to year. The year-by-year layout is not recoverable from the extracted text.]
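The overlap counts on this slide can be checked with a few lines of arithmetic. The sketch assumes the seven numbers are exclusive regions of a Venn diagram (for example, that "JRN: 203" means species found only at Jornada); the slide does not state this, and the names below are illustrative.

```python
# Region counts as read from the slide, ASSUMED to be exclusive Venn regions.
REGIONS = {
    frozenset({"JRN"}): 203,
    frozenset({"SEV"}): 660,
    frozenset({"SGS"}): 41,
    frozenset({"JRN", "SEV"}): 126,
    frozenset({"SGS", "SEV"}): 14,
    frozenset({"JRN", "SGS"}): 5,
    frozenset({"JRN", "SEV", "SGS"}): 11,
}


def shared_species(regions):
    """Number of species found at more than one LTER."""
    return sum(count for sites, count in regions.items() if len(sites) > 1)


def species_at(regions, site):
    """Number of species recorded at one LTER, shared or not."""
    return sum(count for sites, count in regions.items() if site in sites)


shared = shared_species(REGIONS)
per_site = {site: species_at(REGIONS, site) for site in ("JRN", "SEV", "SGS")}
all_regions = sum(REGIONS.values())
```

`shared` evaluates to 156, matching the "156 species found in > 1 LTER" bullet. Under the same reading the seven regions sum to 1060, fewer than the 1093 species reported for the database; the slides do not say how the remainder is accounted for.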


Sample analysis – by family

[Figure: bar chart by plant family for SGS, Sevilleta, and Jornada; axis from 0 to 160.]


Issues 1: Species Codes

• Codes are site specific, e.g., “ACNE” can mean Acacia neovernicosa or Acalypha neomexicana.
• Over time, species differentiate, e.g., Bothriochloa saccharoides and Bothriochloa laguroides.
• LTER sites update at different times.
• Some LTERs use subspecies & varieties; some do not distinguish below species level.

We integrated species across two dimensions.

1. Updated older data with newer species codes using “authority” (author, publication). Jornada’s species database change log listed date of switch to a new “authority” – it was a BIG help!

2. The three separate updated species lists were merged with the official USDA species list.

For species diversity queries, we’re treating all subspecies as a single species.
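The two integration steps and the subspecies rule can be sketched as below. Everything concrete here is hypothetical: the site labels `SITE_A` and `SITE_B`, the change-log entry, and the helper names are invented, and the slides do not say which site uses "ACNE" for which species.

```python
# Codes are site specific, so lookups are keyed on (site, code).
# Which site uses which meaning is NOT given in the slides; labels are hypothetical.
SITE_CODES = {
    ("SITE_A", "ACNE"): "Acacia neovernicosa",
    ("SITE_B", "ACNE"): "Acalypha neomexicana",
}

# Hypothetical change log: (site, old name) -> (year of switch, new name).
CHANGE_LOG = {
    ("SITE_A", "Genus oldname"): (1999, "Genus newname"),
}


def current_name(site, name, change_log=CHANGE_LOG):
    """Follow a site's renames until the name is no longer in its change log."""
    seen = set()
    while (site, name) in change_log and name not in seen:
        seen.add(name)  # guard against a circular rename chain
        _, name = change_log[(site, name)]
    return name


def species_level(name):
    """Collapse subspecies and varieties: keep genus and specific epithet only."""
    return " ".join(name.split()[:2])
```

With this layout, `current_name("SITE_A", "Genus oldname")` resolves to the newer name, while the same name at `SITE_B` is left alone because that site's log has no such entry. That is the reason a per-site change log with switch dates is so useful.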


Issues (cont)

2. Form. Different LTERs use different categories as forms, e.g., all non-woody leafy herbs might be classed as forb, or separated into herbaceous vine and herb….

3. Timing. Biomass was measured at different times of year. We took only fall measurements, but….

4. Plot Organization, Size. We did not combine hierarchies, but just used data at plot level.

5. Site types. Each research area is classified as a site type, but different terms are used, e.g., JRN Grassland = SEV black grama.


Analysis Questions

Which analyses are OK if data are missing for one site for one year?

Surprising result: JRN ANPP is higher

Not surprising result: SEV has highest species diversity.


Analysis Summary

JRN, SEV, SGS Plant Communities

All photos shamelessly taken from various websites

• Grassland LTER Synthesis 1999, Knapp & Smith 2001
• Average differences by LTER site and dominant vegetation type


Analysis

• NPP – Total net primary productivity in a 1 m2 plot

• Species richness: Number of different species present in a 1 m2 plot

• Community analyses weighted by NPP or by species Presence/Absence

• Indicator Species Analysis

• Correlations with Environmental Variables – still organizing data
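The first three measures in this list all derive from one plot-by-species table. The sketch below builds that table twice, once weighted by NPP and once as presence/absence, from hypothetical observation tuples; the function name and all values are illustrative assumptions.

```python
def community_matrices(observations):
    """Build plot-by-species matrices from (plot, species, npp) tuples.

    Returns (species, npp_rows, presence_rows): the sorted species list plus
    two dicts mapping each plot to a row aligned with that list.
    """
    observations = list(observations)
    species = sorted({sp for _, sp, _ in observations})
    column = {sp: i for i, sp in enumerate(species)}
    npp_rows = {}
    for plot, sp, value in observations:
        row = npp_rows.setdefault(plot, [0.0] * len(species))
        row[column[sp]] += value
    presence_rows = {
        plot: [1 if value > 0 else 0 for value in row]
        for plot, row in npp_rows.items()
    }
    return species, npp_rows, presence_rows


# Made-up observations for two plots.
obs = [("p1", "sp_a", 3.0), ("p1", "sp_b", 1.5), ("p2", "sp_b", 2.0)]
species, npp_rows, presence_rows = community_matrices(obs)

total_npp = {plot: sum(row) for plot, row in npp_rows.items()}       # NPP per plot
richness = {plot: sum(row) for plot, row in presence_rows.items()}   # species per plot
```

The NPP-weighted rows and the presence/absence rows are the two alternative inputs a community analysis or ordination would take.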



Grassland Community Analysis: Ordination

Significant differences among LTER sites (A = 0.0945, P < 0.0001)

[Figure: ordination plot, Axis 2 vs. Axis 3, points coded by LTER site (Jornada, SGS, Sevilleta). Based on Presence/Absence.]


Indicator Species Analysis

23 Indicator Species for Jornada LTER

29 Indicator Species for Shortgrass Steppe LTER

32 Indicator Species for Sevilleta LTER


Extensions: May and June Workshops

• Add more years, look at trends through time
• Add more sites: KNZ, two S. African sites, ….
• Move to a “big-iron” database….
• Compute biomass by species (are these data available?)
• Compute Presence/Absence by species (except SGS?)
• Do cover-based ordinations (except SGS?)
• Correlate ANPP with env. variables: precip, temp, soil texture, soil type, elevation, AET, soil moisture, PAR, soil temp

Identify standard analyses (derived data), à la Trends?

INVESTIGATE & DOCUMENT THE “CAN’T DOs”:
• Relative frequency, diversity, species abundances, species richness (based on SGS methods)



Revisit Simplifying Assumptions

Differences in data collection or methodologies ….

Differences in ANPP calculation …. data result from regressions particular to each site.

Differences in Plot Size: ANPP probably scales up…. Species Richness (# species per plot) probably doesn’t….

Differences in plot designation ….
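The plot-size point can be made concrete with a small example. Pooling sub-plots leaves ANPP per m2 well defined (weight and area both add), but pooled richness is the size of the union of species, which is neither the sum nor the mean of the per-sub-plot counts. The data and function names below are made up for illustration.

```python
def pooled_anpp_g_per_m2(sub_plots, sub_plot_area_m2):
    """ANPP of the pooled area: total weight divided by total area."""
    total_weight = sum(sum(weights.values()) for weights in sub_plots)
    return total_weight / (len(sub_plots) * sub_plot_area_m2)


def pooled_richness(sub_plots):
    """Richness of the pooled area: number of distinct species in the union."""
    return len(set().union(*(weights.keys() for weights in sub_plots)))


# Four hypothetical 0.25 m2 sub-plots: {species: dry weight in g}.
sub_plots = [
    {"sp_a": 2.0, "sp_b": 1.0},
    {"sp_a": 3.0},
    {"sp_c": 0.5},
    {"sp_a": 1.5, "sp_c": 2.0},
]

anpp = pooled_anpp_g_per_m2(sub_plots, 0.25)
richness_each = [len(weights) for weights in sub_plots]
richness_pooled = pooled_richness(sub_plots)
```

Here the pooled 1 m2 holds 3 species, while the four sub-plots hold 2, 1, 1 and 2. A richness value measured on 0.25 m2 therefore cannot simply be multiplied up to compare with a 1 m2 plot at another site.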


QA/QC* Wish List

How do we automate integration and mark-up (even a little)?
• Our integration done by hand… not feasible….
• Tracking was ad hoc….

How do we track and distribute changes to data?
• Species and species family changes (SEEK?)
• Assignments to Form

How do we document differences among data:
• Methodology and plot differences, e.g., Sub-plot based analysis is below the scale of interest. Statistical n becomes 3 or 5, typical of ecological data, but ok once all years of data are analyzed

How do we determine (and fix) critical ecology issues:
• Under- or over-estimates, e.g., “SGS 14-17% under-estimate of cool seasons based on C14 data.”


* Las Cruces, Jan 31-Feb 1, 2007.