NEAR REAL TIME VISUALIZATION OF USGS INSTANTANEOUS DATA: INTEGRATION OF OPEN SOURCE DATA TURBINE IN CUAHSI HIS

Thomas Whitenack, David Ryan, David Valentine, Ilya Zaslavsky, Matt Rodriguez

SAN DIEGO SUPERCOMPUTER CENTER


Page 1

SAN DIEGO SUPERCOMPUTER CENTER

NEAR REAL TIME VISUALIZATION OF USGS INSTANTANEOUS DATA:

INTEGRATION OF OPEN SOURCE DATA TURBINE IN CUAHSI HIS

Thomas Whitenack

David Ryan, David Valentine, Ilya Zaslavsky, Matt Rodriguez

Page 2

USGS Instantaneous water data services

• 15-minute intervals
• 10,000+ sites (7,000+ have discharge)
• Up to 60 days of data available
• http://waterservices.usgs.gov/WOF/InstantaneousValues
• Data provided using CUAHSI WaterML
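As a rough illustration of consuming such a response, the sketch below pulls (timestamp, value) pairs out of a WaterML-like document. The sample XML is a hand-written, simplified stand-in: the namespace, values, and element layout are assumptions, not output captured from the USGS service. The parser matches elements by local name so that it does not depend on a particular WaterML namespace version.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a WaterML response (assumption:
# real responses carry far more metadata and may differ in layout).
SAMPLE_WATERML = """\
<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
  <timeSeries>
    <values>
      <value dateTime="2011-05-01T00:00:00.000-07:00">123.0</value>
      <value dateTime="2011-05-01T00:15:00.000-07:00">125.5</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def extract_values(waterml_text):
    """Return a list of (dateTime string, float value) pairs in document order."""
    root = ET.fromstring(waterml_text)
    pairs = []
    for elem in root.iter():
        if local_name(elem.tag) == "value" and "dateTime" in elem.attrib:
            pairs.append((elem.attrib["dateTime"], float(elem.text)))
    return pairs


if __name__ == "__main__":
    for stamp, value in extract_values(SAMPLE_WATERML):
        print(stamp, value)
```

Matching on the local name keeps the sketch tolerant of namespace changes, at the cost of ignoring the rest of the response structure.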

Page 3

Open Source Data Turbine (Ring Buffered Network Bus)

• DataTurbine is a robust open-source streaming data middleware system, designed for sensor-based systems.
• Co-developed by our UCSD / Calit2 colleagues.
• Solution for accessing both streaming and static data, from different vendor systems, via a common interface.
• Released under the Apache 2.0 open source license.
• Provides high-performance data streaming: 10+ MB/sec, 1,000 frames/sec.
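The "ring buffered" idea in the slide title can be illustrated with a minimal sketch: a fixed-capacity buffer in which new frames overwrite the oldest ones. This is a generic ring buffer written for illustration, not DataTurbine's implementation; the class and method names are hypothetical.

```python
from collections import deque


class RingBuffer:
    """Fixed-capacity buffer of (timestamp, value) frames; oldest frames drop first."""

    def __init__(self, capacity):
        self._frames = deque(maxlen=capacity)

    def put(self, timestamp, value):
        # When the buffer is full, deque(maxlen=...) discards the oldest frame.
        self._frames.append((timestamp, value))

    def since(self, timestamp):
        """Return buffered frames whose timestamp is at or after the given one."""
        return [frame for frame in self._frames if frame[0] >= timestamp]

    def __len__(self):
        return len(self._frames)


if __name__ == "__main__":
    buf = RingBuffer(capacity=3)
    for t in range(5):            # write 5 frames into a 3-frame buffer
        buf.put(t, t * 10.0)
    print(buf.since(0))           # only the 3 newest frames remain
```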

Page 4

Open Source DataTurbine

• Supported by NASA SBIR, 15 years in development

• Supports multiple types of streams: real-time monitoring, video and multimedia, telemetry, instant messages, etc.

• Scalable: DataTurbine servers can be interconnected to handle large streams

• Can manipulate the streams: fast forward or slow motion playback (TiVo-like)
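To make the TiVo-like playback idea concrete, here is a small sketch that maps recorded timestamps to wall-clock offsets at a chosen speed. It only illustrates the time scaling involved; the function name and parameters are hypothetical and say nothing about how DataTurbine implements playback.

```python
def playback_schedule(timestamps, speed=1.0):
    """Map recorded timestamps (seconds) to wall-clock offsets for replay.

    speed > 1 is fast forward; 0 < speed < 1 is slow motion.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not timestamps:
        return []
    start = timestamps[0]
    return [(t - start) / speed for t in timestamps]


if __name__ == "__main__":
    one_hour = [0, 900, 1800, 2700]                  # 15-minute samples
    print(playback_schedule(one_hour, speed=900))    # one sample per second
    print(playback_schedule(one_hour, speed=0.5))    # takes twice as long
```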

Page 5

Goal of Integrating Data Turbine with CUAHSI HIS

• Get the two systems to work together.
• Maintain an up-to-date view of a large volume of near real time data, in house.
• Store data locally beyond the 60 days it is made available.
• Enable viewing of the NWIS Instantaneous data in the Realtime Data Viewer (RDV).

Page 6

Challenges of Project

• Integrate CUAHSI HIS with the Data Turbine
• CUAHSI HIS perspective:
  – Consuming WaterML from a Java environment.
  – Obtain and store NWIS 15-minute data beyond 60 days.
• Data Turbine perspective:
  – CUAHSI data presented unusual challenges: pulling data, and time stamps have to be set for each value.
  – 7,000 “channels” needed to be organized for the RDV client.
  – Visualizing / navigating mass volumes of data.
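The per-value time stamp requirement can be sketched as follows: each observation's ISO 8601 timestamp is converted to epoch seconds so that every value carries its own time. The function names are hypothetical, and the sketch assumes timestamps carry an explicit UTC offset, as in the sample strings in the usage lines.

```python
from datetime import datetime


def to_epoch_seconds(iso_stamp):
    """Convert an ISO 8601 timestamp with a UTC offset to epoch seconds."""
    parsed = datetime.fromisoformat(iso_stamp)
    if parsed.tzinfo is None:
        # Refuse naive timestamps rather than guess a time zone.
        raise ValueError("timestamp has no UTC offset: " + iso_stamp)
    return parsed.timestamp()


def frames_for_channel(pairs):
    """Turn (ISO timestamp, value) pairs into parallel time and data lists."""
    times = [to_epoch_seconds(stamp) for stamp, _ in pairs]
    data = [value for _, value in pairs]
    return times, data


if __name__ == "__main__":
    pairs = [
        ("2011-05-01T00:00:00.000-07:00", 123.0),
        ("2011-05-01T00:15:00.000-07:00", 125.5),
    ]
    times, data = frames_for_channel(pairs)
    print(times[1] - times[0], data)   # the two samples are 900 seconds apart
```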

Page 7

CUAHSI –> Data Turbine

Page 8

OSDT Custom Source

• Each source is a separate connection.
• 7,000 sources was too many for OSDT.
• Sources can have multiple channels and sub-channels.
• Sites were organized by state and county to make it navigable.
• 50 GB disk cache: ~1 year of 15-minute data for 7,000 sites.
• Cycling through 7,000+ getValues requests takes ~18 hours for the first iteration, or upon restart.
• Subsequent iterations can still complete in under 8 hours.
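One way to picture the state and county organization is the sketch below, which groups sites into one source per state and names each channel by county, site, and variable. The one-source-per-state layout, the channel naming pattern, and the site records are all assumptions made for illustration; the slides do not spell out the actual scheme.

```python
from collections import defaultdict

# Hypothetical site records; real entries would come from the NWIS site catalog.
SITES = [
    {"state": "CA", "county": "San Diego", "site": "SITE_A"},
    {"state": "TX", "county": "Travis", "site": "SITE_B"},
    {"state": "CA", "county": "Orange", "site": "SITE_C"},
]


def channel_name(site, variable="discharge"):
    """Build a county/site/variable channel path (assumed naming pattern)."""
    return "/".join([site["county"], site["site"], variable])


def group_by_state(sites):
    """Return {state: sorted channel names}, i.e. one source per state."""
    sources = defaultdict(list)
    for site in sites:
        sources[site["state"]].append(channel_name(site))
    return {state: sorted(names) for state, names in sources.items()}


if __name__ == "__main__":
    for state, channels in sorted(group_by_state(SITES).items()):
        print(state, channels)
```

Grouping this way turns thousands of per-site connections into a few dozen sources, each holding many channels.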

Page 9

Realtime Data Viewer (RDV)

Page 10

OSDT Custom “Sink”

• Is essentially a custom client connection to DataTurbine (RDV is a sink process).
• Pulls data and writes it to SQL batch files for batch inserts.
• Used to update local ODM instance of NWIS instantaneous data.
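The batch-file step might look roughly like the sketch below, which turns rows pulled from a sink into multi-row INSERT statements. The table name and the four-column subset are simplified assumptions, not the full ODM schema, and the helper names are hypothetical. Values are coerced to numeric and timestamp types before formatting so that no free text reaches the SQL.

```python
from datetime import datetime, timezone

# Assumed, simplified column subset for illustration only.
COLUMNS = ("SiteID", "VariableID", "DateTimeUTC", "DataValue")


def format_row(site_id, variable_id, epoch_seconds, value):
    """Format one observation as a SQL VALUES tuple."""
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return "({:d}, {:d}, '{}', {!r})".format(
        int(site_id), int(variable_id),
        stamp.strftime("%Y-%m-%d %H:%M:%S"), float(value))


def batch_insert_sql(rows, table="DataValues", batch_size=1000):
    """Yield one multi-row INSERT statement per batch of rows."""
    header = "INSERT INTO {} ({}) VALUES".format(table, ", ".join(COLUMNS))
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        body = ",\n".join(format_row(*row) for row in chunk)
        yield header + "\n" + body + ";"


if __name__ == "__main__":
    rows = [(1, 2, 0, 12.5), (1, 2, 900, 13.0), (1, 2, 1800, 13.5)]
    for statement in batch_insert_sql(rows, batch_size=2):
        print(statement)
```

Each yielded statement could be appended to a batch file and loaded in one pass rather than issuing one insert per value.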

Page 11

Conclusions

• CUAHSI HIS WaterML can be used successfully in Java / non-Windows environments.
• Displaying near real time data in RDV is very fast, and RDV is a valuable visualization tool.
• Data Turbine is designed to ingest much more data than this.
  – Capable of 10 MB/second; we’re feeding it < 1K/second.
• Updating 7,000+ data channels worked, but is well beyond what the OSDT developers had in mind when designing it.
• Organizing 7,000+ channels in a viewer display presents organizational challenges.
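A back-of-the-envelope check of the data rates quoted in the slides, as a short sketch. It assumes exactly 7,000 sites reporting every 15 minutes and 8 bytes per raw value (a float64 with no timestamp or protocol overhead); both are simplifying assumptions, so the results are order-of-magnitude estimates only.

```python
SITE_COUNT = 7000
SAMPLES_PER_DAY = 24 * 60 // 15      # 96 samples per site per day
SECONDS_PER_DAY = 86400
BYTES_PER_VALUE = 8                  # assumption: float64, no overhead


def values_per_second(sites=SITE_COUNT):
    """Average number of new values arriving per second across all sites."""
    return sites * SAMPLES_PER_DAY / SECONDS_PER_DAY


def cache_bytes_per_value(cache_gb=50, sites=SITE_COUNT, days=365):
    """Disk cache budget per stored value for the quoted cache size."""
    return cache_gb * 1e9 / (sites * SAMPLES_PER_DAY * days)


if __name__ == "__main__":
    rate = values_per_second()
    print("values/sec:", round(rate, 2))
    print("raw bytes/sec:", round(rate * BYTES_PER_VALUE, 1))
    print("cache bytes/value:", round(cache_bytes_per_value(), 1))
```

Under these assumptions the feed averages under 8 values per second, a few tens of bytes per second of raw data, which is consistent with the "< 1K/second" figure. A 50 GB cache holding one year of data works out to roughly 200 bytes per stored value.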

Page 12

Questions?

[email protected]

• http://www.dataturbine.org