LTER IM Town Hall Panel 2007
Challenges in Integrating Diverse Data for Ecological Synthesis
Special Roles & Responsibilities for Information Managers
Judy Cushing
The Evergreen State College
Olympia [email protected]
www.evergreen.edu/bdei NSF EIA-0310659, EIA-0131952
http://canopy.evergreen.edu/canopydb NSF DBI-0417311, DBI-0319309, …
www2.evergreen.edu/quantecology
Challenges in Integrating Diverse Data: Lessons Learned from the Grasslands Data Integration (GDI) Project*
Integrate Above-Ground Net Primary Productivity (ANPP) data with its drivers (contextual data) for cross-site comparisons (Ecological Synthesis), past and future
(come visit our poster!)
Ecologists: Christine Laney (JRN), Alan Knapp (SGS), Daniel Milchunas (SGS), Esteban Muldavin (SEV)
Information Managers: Jincheng Gao (KNZ), Nicole Kaplan (SGS), Ken Ramsey (JRN), Mark Servilla (NET), Kristin Vanderbilt (SEV)
Computer Scientists and Data Analysts: Judy Cushing, Carri LeRoy, Juli Mallett, Lee Zeman
What’s in the GDI Database?
• Recorded or calculated annual aboveground NPP values from 5 LTERs: Jornada, Sevilleta, SGS, Konza, Kruger
• 4,126,700 grams, over 20 years in 1697 plots
[Figure: years of data coverage per site, 1980 to 2005. Sites: KNZ, KRG, SGS, SEV, JRN]
[Figure: Plots per LTER. Values shown: 105, 240, 735, 536, 79. Legend: SGS, SEV, JRN, KRG, KNZ]
What Did We Find?
1. Ecology: environmental drivers of ANPP; ANPP-based grassland community composition.
2. Preliminary definition & provision of contextual data – EcoTrends ++….
3. Information Management: species table fixes, ideas for better experimental design documentation, scripting for data integration…. CHANGE LOGS WERE ESSENTIAL; USDA PLANTS DB
4. CS: case study on Data Integration; need for TOOLS: PASTA-LIKE SERVICE & TAXONOMIC CONCEPT SERVICE
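The species table fixes and change logs mentioned in item 3 can be pictured with a small script. This is only a sketch under stated assumptions, not the GDI project's actual code: the record fields, the `CROSSWALK` table, the `harmonize` function, and the species codes are all hypothetical placeholders and have not been checked against the USDA PLANTS database.

```python
# Hypothetical crosswalk: (site, local species code) -> shared code.
# The codes below are illustrative placeholders only.
CROSSWALK = {
    ("SGS", "bogr"): "BOGR2",
    ("JRN", "boer"): "BOER4",
}


def harmonize(records, crosswalk):
    """Return (cleaned_records, change_log) without mutating the input.

    Each record is a dict with keys: site, plot, year, species, anpp.
    Every code that gets rewritten is recorded in the change log so the
    fix can be reviewed or reversed later.
    """
    cleaned, change_log = [], []
    for rec in records:
        new_code = crosswalk.get((rec["site"], rec["species"]))
        if new_code is None or new_code == rec["species"]:
            # No mapping known (or nothing to change): keep the record as is.
            cleaned.append(dict(rec))
            continue
        cleaned.append({**rec, "species": new_code})
        change_log.append({
            "site": rec["site"],
            "plot": rec["plot"],
            "year": rec["year"],
            "old": rec["species"],
            "new": new_code,
        })
    return cleaned, change_log


# Made-up example rows, not GDI data.
records = [
    {"site": "SGS", "plot": "p1", "year": 1999, "species": "bogr", "anpp": 42.0},
    {"site": "JRN", "plot": "p7", "year": 1999, "species": "XXXX", "anpp": 10.0},
]
cleaned, change_log = harmonize(records, CROSSWALK)
for entry in change_log:
    print(entry)
```

Keeping the original rows untouched and writing every substitution to a log is one way to make cross-site fixes auditable, which is the point the slide makes about change logs.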
ANPP vs. Precip
[Figure: ANPP (g m⁻²) vs. growing season total precip (mm), by site: JRN, SEV, SGS, KNZ, KRG]
No climate data yet
[Figure: ANPP (g m⁻²) and PDSI by year, in four panels: Shortgrass (1981 to 2007), Jornada (1988 to 2006), Sevilleta (1998 to 2005), and SGS / SEV / JRN (1981 to 2007)]
r = 0.608 r = 0.631
r = 0.329 r = 0.196
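The r values above summarize how closely ANPP tracks PDSI. Assuming they are Pearson correlation coefficients (the slides do not say which statistic was used), a minimal sketch of the calculation follows. The function name `pearson_r` and the two short series are made up for illustration and are not GDI data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Synthetic, illustrative values only (one entry per year).
pdsi = [-2.0, -1.0, 0.0, 1.0, 2.0]
anpp = [40.0, 55.0, 60.0, 80.0, 90.0]  # g m^-2
r = pearson_r(pdsi, anpp)
print(f"r = {r:.3f}")
```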
CART Model: Classification and Regression Tree Model, R² = 0.642!!
Variables included in model: LTER, year, PDSI, NH4, NO3, absTmax, absTmin, Tmax, Tmin, Tmean, Precip
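A regression tree is grown by repeatedly choosing the split on a predictor that most reduces the squared error of the response. The sketch below shows that single step for one predictor, plus the R² calculation. It is a hedged illustration, not the model from the slide: the function names (`sse`, `best_split`, `r_squared`) and the precipitation and ANPP values are hypothetical.

```python
from statistics import fmean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, sse_after_split) for the best binary split on xs.

    The threshold is None if no split improves on the unsplit data.
    """
    pairs = sorted(zip(xs, ys))
    best_threshold, best_sse = None, sse(ys)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical predictor values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best_sse:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_sse = total
    return best_threshold, best_sse


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    return 1 - ss_res / sse(ys)


# Synthetic, illustrative values only.
precip = [100, 200, 300, 600, 700, 800]  # mm
anpp = [50, 60, 55, 200, 210, 190]       # g m^-2

threshold, _ = best_split(precip, anpp)
low_mean = fmean(y for x, y in zip(precip, anpp) if x <= threshold)
high_mean = fmean(y for x, y in zip(precip, anpp) if x > threshold)
predictions = [low_mean if x <= threshold else high_mean for x in precip]
r2 = r_squared(anpp, predictions)
print(threshold, round(r2, 3))
```

A full CART model repeats this search over every predictor and then recurses into each branch; the sketch stops after one split to keep the idea visible.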
Lesson 1: What you (IMs) do is important
• ANPP – a critical ecological measure (indicator?)
• You (Kristin, Ken, Nicole) made GDI happen….
• It’s a collaborative & interdisciplinary project – and not a technology problem….
  • IMs
  • Computer Scientists
  • Ecologists
  • Statistician (Data Analyst)
• You know the issues, physically possess the data for important ecological & scientific DB problems, e.g., global climate change, resource management
Lesson 2: The GDI DB should be dynamic – Not Static
A static data warehouse is an oxymoron, as is “Museum of Innovation”
• More years, future years
• Current data – further refined
• More sites, different ecosystems
Lesson 3: Volume Matters….
More sites, more years, more trouble….
• More species codes
• Differences in experimental design
• Cross-site comparison highlights data anomalies
• High volumes make a qualitative difference
• A good data structure* matters even more….
* Ask me why GIS has not been a priority to illustrate my field datasets….
Lesson 4: Information Managers Critical
Computer Science in Crisis….
There won’t be enough CS graduates … to do all the jobs … even today….
NSF’S ICER (CPATH) INITIATIVE: INTEGRATIVE COMPUTING EDUCATION & RESEARCH
1. CS content changed (changing!) radically….
2. No uniform agreement on the core…
3. Graduates lack a systems approach….
4. Dwindling pipeline….
5. US industry [& science] competitiveness threatened….
NSF’S ICER (CPATH) INITIATIVE
NSF asked: Why is CS in crisis? What can be done?
Northwest Region: http://www.evergreen.edu/icer
Improve the quality of computing education ….
Attract more people ….
Improve retention….
Strengthen interdisciplinary connections….
Improve CS educational research ….
Google asked: What can industry do?
I ask: What should the LTER IMs do?
Lesson 4 (cont): Computer Science in Crisis….
My charge on this panel: IMs typically come from “the sciences” (essential),
yet their tasks are programming & managing software projects.
What skills or tools are essential for IMs? … As an educator, which are effectively learned on-the-job, and which require formal training?
Tools are learned on the job,
Skills through practice.
(but should be demonstrable before hiring)
Concepts require (some) formal training….
(there is a handful of critical concepts?)
Lesson 4 (cont): What CS is needed to do the GDI?
• Concepts
  • Formal Languages & Parsing
  • Data Structures
• Abilities
  • See patterns (and non-patterns)
  • Learn new technology fast; see when the tools won’t do it
  • Build new technology, services….
• Skills (tools)
  • Scripting Languages, Database tools and SQL
But CS is not enough… needed an interdisciplinary team…. historical perspective, ecology vision, statistical expertise
Future tools – PASTA-like & TAXONOMIC SERVICES,
Contextual data provision (ClimDB, EcoTrends)
Questions?
Judy [email protected]
www.evergreen.edu/bdei
http://canopy.evergreen.edu/canopydb
www2.evergreen.edu/quantecology