Page 1: Some Thoughts About the Need for Cyberinfrastructure

• The geosciences are a strongly data-driven discipline, and large data sets are often developed by researchers and government agencies.
• The complexity of the fundamental scientific questions being addressed requires a variety of data, combined through integrative and innovative approaches, if we are to find solutions.
• Geoscientists have a tradition of sharing data, but being willing to share data if asked, or even maintaining an obscure website, accomplishes little. Also, as a community, we have no mechanism to share the work that has been done when a third party cleans up, reorganizes, or embellishes an existing database.
• We waste a large amount of human capital in duplicative efforts and fall further behind by having no mechanism for existing databases to grow and evolve via community input.

Page 2: Some Definitions

Data Set: A relatively raw compilation of data (standards, formats, completeness may be questionable)

Data Base: A mature data compilation that has been “cleaned”, standardized with input from the scientific community, and formatted for use by others (not based on proprietary software, e.g., ORACLE)

Data System: A linked and organized set of data bases including public domain software (not platform dependent) and procedures to analyze the data

Page 3: Goals of the gravity data system effort

Construct an updated gravity database that will have excellent spatial coverage and quality that will serve a diverse community of users, while being simple to access and flexible enough to meet the range of applications and changing needs of users.

Create a structure to provide efficient updating and to encourage additions of new data by users (i.e., craft a living database)

Construct a robust toolbox of public domain software.

Link this specialized data system into an emerging broad national (and North American) geoscience data system.

Page 4: Software Development Efforts

1. Advanced technique to remove duplicate measurements and to provide for merging of new data without creating new duplicates (new high quality data will be time-stamped)

2. Advanced technique to detect erroneous values

3. Web interface to the data system (26 students in a software engineering capstone class are involved)

4. Mapping software (GUI for GMT)

5. Modeling software (graphical interface for 2.5D models)

6. Digital processing (published USGS package - a start)

7. Data reduction equations (community consensus)
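Item 1 above (duplicate removal and merging of new data) could be sketched roughly as follows. This is a minimal illustration, not the project's actual technique, which the slides do not describe. Everything specific here is an assumption: the `Station` record layout, the rule that two records are duplicates when their coordinates round to the same grid cell and their observed gravity agrees within `tol_mgal`, and the choice to keep the more recently time-stamped record.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Station:
    """Hypothetical record layout for one gravity measurement."""
    lat: float        # latitude, degrees
    lon: float        # longitude, degrees
    gravity: float    # observed gravity, mGal
    stamp: datetime   # time stamp attached when the record was added


def cell(st, decimals=4):
    """Assumed duplicate key: coordinates rounded to a fixed grid."""
    return (round(st.lat, decimals), round(st.lon, decimals))


def merge(existing, incoming, decimals=4, tol_mgal=0.1):
    """Merge incoming records into existing ones without adding duplicates.

    Returns (merged, conflicts). A conflict is a pair of records that fall
    in the same grid cell but disagree by more than tol_mgal; those are
    left for a person (or an error-detection step) to resolve.
    """
    merged = {}
    conflicts = []
    for st in list(existing) + list(incoming):
        key = cell(st, decimals)
        kept = merged.get(key)
        if kept is None:
            # First record seen at this location.
            merged[key] = st
        elif abs(kept.gravity - st.gravity) <= tol_mgal:
            # Same place, same value: a duplicate. Keep the newer record.
            if st.stamp > kept.stamp:
                merged[key] = st
        else:
            # Same place, different value: do not guess, report it.
            conflicts.append((kept, st))
    return list(merged.values()), conflicts
```

Because the result is keyed by location, merging the same incoming file twice leaves the station list unchanged, which is the "no new duplicates" property the item asks for. Rounding to a grid is the simplest possible key; two points very close together can still land in neighboring cells, so a production system would more likely use a distance-based search.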

Page 5: GDRP1 - welcome

Page 6: Map search

Page 7: GDRP3 - search results

Page 8: From Geological Survey of Canada website

Page 9: GDRP2 - base stations

Page 10: Base station description

Page 11: Base station picture

Page 12: GDRP4 - upload

NSF

Page 13: Status and Products

Web-based data system

Gravity database (U.S. first pass done; TCs done/online soon)

Base stations (U.S. stations available online)

Educational material (tutorial done)

Standards (committee consensus, implementation underway)

Links to other related sites (done)

Software to manipulate data (subgrids, profiles, filters, etc.) (partly done)

Modeling software (2.5D done)

North American grids

North American maps