LTER IM Town Hall Panel 2007

Challenges in Integrating Diverse Data for Ecological Synthesis

Special Roles & Responsibilities for Information Managers

Judy Cushing
The Evergreen State College
Olympia, WA
[email protected]

www.evergreen.edu/bdei NSF EIA-0310659, EIA-0131952

http://canopy.evergreen.edu/canopydb NSF DBI-0417311, DBI-0319309, …

www2.evergreen.edu/quantecology

Challenges in Integrating Diverse Data
Lessons Learned from the Grasslands Data Integration (GDI) Project*

Integrate Above-Ground Net Primary Productivity (ANPP) data with its drivers (contextual data) for cross-site comparisons (Ecological Synthesis), past and future

(come visit our poster!)

Ecologists: Christine Laney (JRN), Alan Knapp (SGS), Daniel Milchunas (SGS), Esteban Muldavin (SEV)

Information Managers: Jincheng Gao (KNZ), Nicole Kaplan (SGS), Ken Ramsey (JRN), Mark Servilla (NET), Kristin Vanderbilt (SEV)

Computer Scientists and Data Analysts: Judy Cushing, Carri LeRoy, Juli Mallett, Lee Zeman

What’s in the GDI Database?

• recorded or calculated annual aboveground NPP values from 5 LTERs: Jornada, Sevilleta, SGS, Konza, Kruger

• 4,126,700 grams, over 20 years in 1697 plots

[Figure: timeline by site (KNZ, KRG, SGS, SEV, JRN); time axis 1980 to 2005]

[Figure: Plots per LTER (SGS, SEV, JRN, KRG, KNZ); values shown: 105, 240, 735, 536, 79]

What did we Find?

1. Ecology: environmental drivers of ANPP; ANPP-based grassland community composition.

2. Preliminary definition & provision of contextual data – Ecotrends ++….

3. Information Management: species table fixes, ideas for better experimental design documentation, scripting for data integration…. CHANGE LOGS WERE ESSENTIAL; USDA PLANTS DB

4. CS – case study on Data Integration; need for TOOLS: PASTA-LIKE SERVICE & TAXONOMIC CONCEPT SERVICE
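The information-management findings (species table fixes, scripted integration, change logs) might look roughly like the sketch below. It is a minimal illustration, not the GDI project's actual scripts: the `ChangeLog` class, the `harmonize_species` function, the column names, and all species codes and values are invented for the example. The idea shown is that every code fix is recorded, and unresolved codes are flagged rather than silently changed.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Collects one entry per modification so every fix can be audited."""
    entries: list = field(default_factory=list)

    def record(self, site, row_id, column, old, new, reason):
        self.entries.append(
            {"site": site, "row_id": row_id, "column": column,
             "old": old, "new": new, "reason": reason}
        )


def harmonize_species(rows, code_map, log):
    """Return copies of rows with site-specific species codes mapped to shared codes.

    Rows whose (site, code) pair is missing from code_map keep their code
    and are logged as unresolved, so nothing is silently altered or dropped.
    """
    result = []
    for row in rows:
        key = (row["site"], row["species"])
        new_row = dict(row)  # copy: the input rows are never mutated
        if key in code_map:
            shared = code_map[key]
            if shared != row["species"]:
                new_row["species"] = shared
                log.record(row["site"], row["row_id"], "species",
                           row["species"], shared, "mapped to shared code")
        else:
            log.record(row["site"], row["row_id"], "species",
                       row["species"], row["species"], "unresolved code")
        result.append(new_row)
    return result


# Invented example data: codes and values are placeholders, not GDI records.
CODE_MAP = {
    ("SGS", "GRASS_A"): "SP001",
    ("SEV", "GR-A"): "SP001",
    ("JRN", "SP001"): "SP001",
}
ROWS = [
    {"row_id": 1, "site": "SGS", "species": "GRASS_A", "anpp_g_m2": 10.0},
    {"row_id": 2, "site": "SEV", "species": "GR-A", "anpp_g_m2": 12.5},
    {"row_id": 3, "site": "JRN", "species": "SP001", "anpp_g_m2": 8.0},
    {"row_id": 4, "site": "KNZ", "species": "???", "anpp_g_m2": 30.0},
]

log = ChangeLog()
clean = harmonize_species(ROWS, CODE_MAP, log)
for entry in log.entries:
    print(entry)
```

In practice the code map would come from a reviewed species table (for example, one checked against the USDA PLANTS DB), and the log would be stored alongside the integrated data.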

ANPP vs. Precip

[Figure: ANPP (g m-2) vs. growing season total precip (mm); series: JRN, SEV, SGS, KNZ, KRG]

No climate data yet

[Figure: four panels plotting ANPP (g m-2) and PDSI by year: Shortgrass (1981–2007), Jornada (1988–2006), Sevilleta (1998–2005), and SGS / SEV / JRN (1981–2007)]

r = 0.608 r = 0.631

r = 0.329 r = 0.196
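For readers who want to reproduce this kind of number, here is a minimal sketch of a correlation between yearly ANPP and PDSI. It assumes the r values above are Pearson correlation coefficients, which the slide does not state. The function name `pearson_r` and the yearly values are invented for illustration and are not GDI data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Invented yearly values for illustration only (not GDI data).
pdsi = [-2.0, -1.0, 0.0, 1.0, 2.0]
anpp = [40.0, 55.0, 50.0, 70.0, 85.0]
print(round(pearson_r(pdsi, anpp), 3))
```

Both series must be aligned by year before the call; years missing from either series would need to be dropped first.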


CART Model: Classification and Regression Tree Model, R2 = 0.642!!

Variables included in model: LTER, year, PDSI, NH4, NO3, absTmax, absTmin, Tmax, Tmin, Tmean, Precip
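To make the regression-tree idea concrete, here is a small pure-Python sketch: it repeatedly picks the split that minimizes the summed squared error of the two child groups, then reports R2 on the training data. This is not the model behind the slide. The function names, the two predictors (precip and PDSI), and every data value are invented, and categorical variables such as LTER are not handled.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets, min_leaf):
    """Return (total_sse, feature, threshold) for the best valid split, or None."""
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2  # midpoint between neighboring values
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            total = sse(left) + sse(right)
            if best is None or total < best[0]:
                best = (total, f, threshold)
    return best


def build_tree(rows, targets, max_depth=2, min_leaf=2):
    """Grow a regression tree as nested dicts; leaves hold the mean target."""
    node_mean = sum(targets) / len(targets)
    if max_depth == 0:
        return {"leaf": node_mean}
    split = best_split(rows, targets, min_leaf)
    if split is None or split[0] >= sse(targets):
        return {"leaf": node_mean}  # no split reduces the error
    _, feature, threshold = split
    left = [(r, t) for r, t in zip(rows, targets) if r[feature] <= threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [t for _, t in left],
                           max_depth - 1, min_leaf),
        "right": build_tree([r for r, _ in right], [t for _, t in right],
                            max_depth - 1, min_leaf),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return that leaf's mean."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Invented data: each row is [precip_mm, pdsi]; targets are ANPP in g m-2.
rows = [[150, -2.0], [200, -1.0], [250, 0.5], [300, 1.0],
        [600, -1.5], [700, 0.0], [800, 1.5], [900, 2.0]]
anpp = [40, 50, 60, 70, 200, 240, 300, 320]

tree = build_tree(rows, anpp, max_depth=2, min_leaf=2)
r2 = r_squared(anpp, [predict(tree, r) for r in rows])
print(tree["feature"], tree["threshold"], round(r2, 3))
```

An R2 computed on the training data, as here, tends to be optimistic; a held-out set or cross-validation gives a fairer estimate.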

Lesson 1
What you (IMs) do is important

• ANPP – a critical ecological measure (indicator?)
• You (Kristin, Ken, Nicole) made GDI happen….
• It’s a collaborative & interdisciplinary project – and not a technology problem….
  • IMs
  • Computer Scientists
  • Ecologists
  • Statistician (Data Analyst)
• You know the issues, physically possess the data for important ecological & scientific DB problems, e.g., global climate change, resource management

Lesson 2
The GDI DB should be dynamic – Not Static

A static data warehouse is an oxymoron, as is “Museum of Innovation”

• More years, future years
• Current data – further refined
• More sites, different ecosystems

Lesson 3
Volume Matters….

More sites, more years, more trouble….

• More species codes
• Differences in experimental design
• Cross-site comparison highlights data anomalies
• High volumes make a qualitative difference
• A good data structure* matters even more….

* Ask me why GIS has not been a priority to illustrate my field datasets….

Lesson 4
Information Managers Critical

Computer Science in Crisis….

There won’t be enough CS graduates … to do all the jobs … even today….

NSF’S ICER (CPATH) INITIATIVE
INTEGRATIVE COMPUTING EDUCATION & RESEARCH

1. CS content changed (changing!) radically….
2. No uniform agreement on the core…
3. Graduates lack a systems approach….
4. Dwindling pipeline….
5. US industry [& science] competitiveness threatened….

NSF’S ICER (CPATH) INITIATIVE
NSF asked: Why is CS in crisis? What can be done?

Northwest Region: http://www.evergreen.edu/icer

Improve the quality of computing education ….
Attract more people ….
Improve retention….
Strengthen interdisciplinary connections….
Improve CS educational research ….

Google asked: What can industry do?

I ask: What should the LTER IMs do?

Lesson 4 (cont)
Computer Science in Crisis….

My charge on this panel: IMs typically come from “the sciences” (essential)

Yet their tasks are programming & managing software projects.

What skills or tools are essential for IMs? … As an educator, which are effectively learned on-the-job, and which require formal training?

Tools are learned on the job,
Skills through practice.
(but should be demonstrable before hiring)

Concepts require (some) formal training….
(there is a handful of critical concepts?)

Lesson 4 (cont)
What CS is needed to do the GDI?

• Concepts
  • Formal Languages & Parsing
  • Data Structures
• Abilities
  • See patterns (and non-patterns)
  • Learn new technology fast; see when the tools won’t do it
  • Build new technology, services….
• Skills (tools)
  • Scripting Languages, Database tools and SQL

But, CS is not enough… needed an interdisciplinary team…. historical perspective, ecology vision, statistical expertise

Future tools – PASTA-like & TAXONOMIC SERVICES, contextual data provision (ClimDB, EcoTrends)

Questions?

Judy Cushing
[email protected]

www.evergreen.edu/bdei
http://canopy.evergreen.edu/canopydb
www2.evergreen.edu/quantecology