GDS – The GrADS/DODS Server
Jim Kinter
Center for Ocean-Land-Atmosphere Studies (COLA)
NVODS Workshop, 10 September 2003
GrADS-DODS Server (GDS)*: Joe Wielgosz, Brian Doty, Jennifer Adams, James Gallagher, Daniel Holloway
*(server-side integration of GrADS and DODS, now OPeNDAP)
GrADS: Jennifer Adams, Reinhard Budich, Luigi Calori, Brian Doty, Wesley Ebisuzaki, Mike Fiorino, Tom Holt, Don Hooper, Jim Kinter, Steve Lord, Gary Love, Karin Meier, Matt Munnich, Uwe Schulzweida, Arlindo da Silva, Michael Timlin, Pedro Tsai, Brian Wilkinson, Katja Winger (and others)
Ultimate Goal
Data Interoperability, Data Distribution, Distributed Analysis that is:
• integrated
• analytical
• quantitative
• intuitive (for domain scientists)
• format-independent
• data type-independent (e.g. grids, stations, etc.)
• supporting rapid, domain-relevant subsetting
Grid Analysis and Display System
INTEGRATED USER INTERFACE
Maps, Charts, Animations
Expressions, Functions of Original Variables
General slices of {4D Grids, In Situ Obs, Images}
User Definable, Extensible
Arbitrary Domains Optimized for Typical Geophysical Queries
Accessing, Subsetting
Analyzing
Visualizing
Interactive, Quantitative
GrADS – A Tool for Geophysics
• “Natural” user interface for scientific computation and graphics production
– Used at O(10^2) laboratories worldwide
– Used by O(10^3) scientists worldwide
– E.g., 2002 J. Climate: over ½ of all figures (and computations?) produced using GrADS
• Handles many geophysical data formats in “native” mode
– Widely used for analysis and display of data from the National Weather Service and other WMO sources
GrADS Analysis Model
ENABLES VERY SOPHISTICATED ANALYSIS TASKS IN A HIGHLY ENCAPSULATED WAY
Scientists only need to specify:
• dimension constraint
• list of data sets
• analysis expression
This approach to geophysical data analysis, despite its apparent simplicity, is extremely powerful.
What the GDS can do
Make GrADS-readable datasets, both gridded and in situ, accessible across the network to a diverse range of clients
Perform server-side analysis and comparisons against other distributed datasets
[Diagram: servers and clients connected via OPeNDAP]
Servers: GDS and other OPeNDAP servers (including a GDS holding comparison data), serving GRIB, HDF4, netCDF, BUFR, and GrADS binary data
Clients: web browser, LAS, Unidata IDV, ncBrowse, GrADS, Ferret, Matlab, IDL, and more...
GrADS-DODS Analysis Server
[Architecture diagram]
Server
• Datasets in any format supported by GrADS: GRIB, HDF, NetCDF, GrADS binary, GrADS station, BUFR, etc.
• Result cache: holds temporary data (uploaded, generated by a previous operation, or transferred directly from another server) for use in remote analysis
• GrADS batch mode: performs analysis operations
• Interface code: manages sessions, translates dataset names
• Java servlet: supports extended request types for analysis, upload
• DODS server libraries
Internet
• DODS data, metadata, and requests
• Encapsulated Analysis Requests
Client
• DODS client libraries
• GrADS, Ferret, Matlab, IDL, etc.
• Data appears to client as local file, in a standard format (i.e., NetCDF, etc.)
Data Interoperability Example: Data from Two Servers
sdfopen http://cola8...
set gxout shaded
set time jul1980
d p
sdfopen http://cdc...
set gxout contour
d prate.2*86400*31
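
The factor `86400*31` in the display command `d prate.2*86400*31` can be read as a unit conversion. The sketch below assumes, which the slide does not state, that `prate` is a mean precipitation rate in kg m^-2 s^-1 (numerically, mm of water per second); 86400 is the number of seconds per day and 31 the number of days in July. The function name and the sample rate are illustrative only.

```python
SECONDS_PER_DAY = 86400
DAYS_IN_JULY = 31


def monthly_total_mm(rate_kg_m2_s, days=DAYS_IN_JULY):
    """Convert a mean precipitation rate (kg m^-2 s^-1) to a monthly total (mm).

    1 kg of water spread over 1 m^2 is a layer 1 mm deep, so the rate in
    kg m^-2 s^-1 equals mm/s; multiplying by the seconds in the month
    gives the accumulated depth.
    """
    return rate_kg_m2_s * SECONDS_PER_DAY * days


# Hypothetical sample value, not taken from any dataset.
sample_rate = 5.0e-5
sample_total = monthly_total_mm(sample_rate)
print("monthly total (mm):", round(sample_total, 2))
```

Under that assumption, the contoured field would be a July precipitation total in mm rather than an instantaneous rate.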
Example: Analysis at the Server
sdfopen http://cola8.iges.org:9090/dods/_expr_{ssta,z5a}{tmave(maskout(aave(... }{-180:0,0:90,500:500,jan1950,dec1990}
set gxout shaded
display result
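
The server-side analysis URL above has three brace-delimited parts after `_expr_`: the list of data sets, the analysis expression, and the dimension constraint. A minimal sketch of assembling such a URL follows. The helper name `build_expr_url` is hypothetical, the expression is left as the placeholder `EXPRESSION` because the slide elides it, and the constraint string is copied from the slide as-is rather than interpreted.

```python
def build_expr_url(base, datasets, expression, domain):
    """Assemble an analysis URL of the form
    base/_expr_{datasets}{expression}{domain}.

    base       -- server root, e.g. "http://cola8.iges.org:9090/dods"
    datasets   -- list of dataset names, joined with commas
    expression -- analysis expression, passed through unchanged
    domain     -- dimension constraint string, passed through unchanged
    """
    return "{}/_expr_{{{}}}{{{}}}{{{}}}".format(
        base, ",".join(datasets), expression, domain
    )


url = build_expr_url(
    "http://cola8.iges.org:9090/dods",
    ["ssta", "z5a"],
    "EXPRESSION",  # placeholder: the slide elides the real expression
    "-180:0,0:90,500:500,jan1950,dec1990",
)
print(url)
```

A client would then pass the resulting string to `sdfopen`, as in the example, and display the variable `result`.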
GDS in production, well-received
Positive response from:
• COLA scientists
• GrADS user community - research, corporate, hobbyists
• NOAA/CIRES CDC (earliest adopters outside COLA)
Some public GDS servers: (google on "grads dods server")
COLA Public Data Server: cola8.iges.org:9191
COLA Monsoon Data Server: monsoondata.org
NOAA/CIRES CDC: www.cdc.noaa.gov/dods
FNMOC / GODAE: usgodae.org
NCEP (NOMADS): nomad2.ncep.noaa.gov
GFDL (NOMADS): data1.gfdl.noaa.gov
NASA / GSWP: voda.gsfc.nasa.gov:9090
NASA / LIS: lis1.sci.gsfc.nasa.gov:9090
NASA / NSIPP: beta.gsfc.nasa.gov:9090
IPRC: aprdc.soest.hawaii.edu:9090
CSAG (South Africa): www.csag.uct.ac.za:9090
plus activity at centers in France, Britain (BADC), Italy (CINECA) and Japan...
COLA GDS User Categories
Report from Jennifer Adams (2 years of statistics):
1. Users who have automated their access
2. Users who come regularly, but not automatically
3. Project-oriented users who access intensely but only in the short term
4. Users who download data and treat GDS as a subsetting FTP server (this may be a subset of category 3)
5. Casual/curious users who are just looking, not downloading
6. Robots (e.g., Google; these don’t really count, even though they are "unique IPs")
incentive?
how?
need more robot-friendly content on top page?
[Figure: Hits on All COLA GDS Sites, Sep-01 to Jun-03; axis ticks 10,000 to 10,000,000]
[Figure: Bytes Served by All COLA GDS Sites, Sep-01 to Jun-03; axis ticks 1,000,000,000 to 100,000,000,000]
[Figure: Hits on All COLA GDS, Jan 2002 to Jul 2003; axis ticks 10^4 to 10^6]
[Figure: Bytes on All COLA GDS, Jan 2002 to Jul 2003; axis ticks 1 GB to 100 GB]
Desktop Weather Forecasting
[Diagram components: NCEP Global Weather Forecasts; COLA GrADS-DODS Server; Region-Specific Lateral BCs; WWW; PC-Based Regional NWP]
PILOT PROJECT WITH US NWS IN SOUTH AFRICA, VIETNAM
What's new in 1.2
New data type support
Station data - GrADS format and BUFR
Remote OPeNDAP data
Subsampling ("striding") for gridded data
Core code refactored
– Anagram - generic data server framework
– Swappable, reusable modules
– Designed for efficiency - streaming I/O
XML-based configuration, with more flexibility in:
– Dataset loading
– Logging
– Security
– Resource management
Improved web interface
– Custom links to help, home, dataset info
– URL-based administration interface
Scales better to 1000's of datasets
– Organizes data catalog into directories
– Faster startup and smarter caching
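
The subsampling ("striding") item above can be illustrated with the OPeNDAP (DAP2) array constraint syntax, in which each dimension is selected as `[start:stride:stop]` with an inclusive stop index. The sketch below only builds such a constraint string and lists the indices it would select. The variable name `ssta` and the index ranges are hypothetical, and the slides do not show how GDS 1.2 itself exposes striding to GrADS users.

```python
def hyperslab(start, stop, stride=1):
    """Return one DAP2 dimension selector; stop is inclusive."""
    if stride == 1:
        return "[{}:{}]".format(start, stop)
    return "[{}:{}:{}]".format(start, stride, stop)


def constraint(var, dims):
    """Build a constraint such as var[0:0][0:2:72] from (start, stop[, stride]) tuples."""
    return var + "".join(hyperslab(*d) for d in dims)


def selected_indices(start, stop, stride=1):
    """Indices picked by [start:stride:stop]."""
    return list(range(start, stop + 1, stride))


# Hypothetical request: first time step, every second latitude and longitude.
ce = constraint("ssta", [(0, 0), (0, 72, 2), (0, 143, 2)])
print(ce)
print(selected_indices(0, 10, 5))
```

Striding by 2 in both horizontal dimensions cuts the transferred grid to roughly a quarter of its original size, which is the point of subsampling on the server.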
GDS – What’s new and interesting for NVODS
• GrADS 1.9
– beta release by 10/31/2003; production by 12/31/2003
– much more robust support for in situ data
– DTYPE station via OPeNDAP
– DTYPE BUFR (WMO station data format)
– Handling GRIB-2 (WMO gridded data format)
– new interface for netCDF/HDF for non-COARDS-compliant data sets (via GrADS descriptor file)
• GDS 1.2 – Anagram
– framework for building servers
– set of reusable classes
– documented on www.iges.org/grads/gds
– white paper in preparation (Joe Wielgosz)
Administrator-friendly
Complete online documentation: http://www.iges.org/grads/gds/doc
Stable and fast: the COLA public GDS currently handles > 1.5 million hits/month
[Chart: GDS Hits, in thousands (axis 0 to 2,500), for motherlode, dataportal, monsoon, cola8 9191, cola8 9090]
Install in minutes (really!)
– No root privileges needed
– Cross-platform Java and ANSI C
Easy to configure
– Edit one (simple) XML file, and make updates on-the-fly
Secure
– Restrict dataset access & resource usage by IP address
And more...
– Automatic scans for new datasets
– Detailed logging
– Graceful handling of heavy loads
– Easily integrated with Apache...
<gds>
  <catalog temp_size_limit="1000">
    <data>
      <dataset name="test" file="testdata/big_endian.ctl" format="ctl" />
      <datadir name="/mnt/data1" suffix=".ctl"/>
    </data>
  </catalog>
  <log mode="rotate" frequency="monthly" file="log/gds.log" level="info" />
  <grads>
    <invoker grads_bin="/users/joew/bin/grads"/>
  </grads>
  <servlet>
    <filter-abuse enabled="true" hits="1000" timeout="24" />
    <filter-overload enabled="true" limit="20" />
    <filter-analysis enabled="true" />
  </servlet>
  <mapper>
    <service-admin enabled="true" auth="open-sesame" />
  </mapper>
  <privilege_mgr default="public">
    <ip_range mask="127.0.0.1" privilege="private" />
    <privilege name="public">
      <deny path="private_data" />
    </privilege>
    <privilege name="private" />
  </privilege_mgr>
</gds>
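
As a rough illustration of what the sample configuration expresses, the sketch below parses a trimmed copy of it with Python's standard `xml.etree.ElementTree` and reads out the dataset entry and the privilege rules. This is not how GDS itself loads its configuration; the helper `privilege_for` is hypothetical and matches the `mask` attribute by exact string only, which may be narrower than what the server supports.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample configuration shown above.
CONFIG = """
<gds>
  <catalog temp_size_limit="1000">
    <data>
      <dataset name="test" file="testdata/big_endian.ctl" format="ctl" />
      <datadir name="/mnt/data1" suffix=".ctl"/>
    </data>
  </catalog>
  <privilege_mgr default="public">
    <ip_range mask="127.0.0.1" privilege="private" />
    <privilege name="public">
      <deny path="private_data" />
    </privilege>
    <privilege name="private" />
  </privilege_mgr>
</gds>
"""

root = ET.fromstring(CONFIG)

# Explicitly listed datasets: (name, file) pairs.
datasets = [(d.get("name"), d.get("file")) for d in root.iter("dataset")]

# Privilege rules: default level, per-IP overrides, and denied paths per level.
manager = root.find("privilege_mgr")
default_privilege = manager.get("default")
ip_privileges = {r.get("mask"): r.get("privilege") for r in manager.iter("ip_range")}
denied_paths = {
    p.get("name"): [d.get("path") for d in p.findall("deny")]
    for p in manager.iter("privilege")
}


def privilege_for(ip):
    """Hypothetical lookup: exact-match the IP against the masks, else use the default."""
    return ip_privileges.get(ip, default_privilege)


print(datasets)
print(privilege_for("127.0.0.1"), privilege_for("10.0.0.1"))
print(denied_paths)
```

Read this way, the sample grants the local host the "private" privilege, while every other address falls back to "public", which is denied the `private_data` path.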
GDS Enables …
• Sharing data: Enterprise-wide; Internet-wide --- data-format independent
• Data interoperability: Consistent metadata for many data types
• Distributed analysis: Saves scientists’ time*, reduces network load; improves interactivity
• Automation of analysis techniques: Analysis techniques can be captured in the form of scripts and provided on server and/or client
“Data-portal changed my life.” – Ben Kirtman (COLA)