NOAA Big Data Project
AWS Earth Observations in the Cloud event
2015 Nov 10
Jeff de La Beaujardière, PhD
NOAA Data Management Architect
NOAA has "Big Data"
• 10 satellites
• 150+ weather radars
• 3 buoy networks
• 200+ tide gauges
• human observers
• animal telemetry
• 17 ships
• 10 aircraft
• Many numerical models
(slide adapted from "NOAA 101" briefing)
NOAA data are unique, valuable, irreplaceable, and collected at public expense
2015-09-24
Vision for NOAA Data Management
All NOAA environmental data shall be Discoverable, Accessible, Usable, and Preserved for all types of users and applications.
Traditional Data Services Approach (pre-Cloud)
[Diagram. Data sources: Satellite, Radar, Buoy, Ship, Sonar, Surveys, Models, ROV/UAV. Other labels: data services layer; Data Access Services; Data Search & Discovery Services; Data Documentation; Compatible Formats and Vocabularies; shared standards; Data.gov and Other Portals; User Tools; Decision Support Tools; Scientific Software; Value-Adding Reseller; Numerical Models.]
Traditional Data Services Approach (pre-Cloud)
[Diagram. Data sources, each labeled with its own "data access": Satellite, Radar, Buoy, Ship, Sonar, Surveys, ROV/UAV, Models. Other labels: Data Discovery; User Facilities; User Hardware (×4); copy of data (×4).]
Conceptual Overview of NOAA Big Data Project
[Diagram. Inputs: Earth Observations, Model Outputs. Agency Service Tier: Access Services, Catalog, Metadata, Formatting. Cloud IaaS (Infrastructure as a Service) provider(s): working copy of data. Application & product providers: Custom Product/App #1, #2, #3. New customers & lines of business: Customer 1, Customer 2, Customer 3. Other labels: agency security boundary; agency-provided services; master copy of data; integration functions; analysis functions.]
Premise of NOAA Big Data Project
• There is additional value in NOAA data that has not yet been realized because of access & infrastructure difficulties.
• If NOAA data were accessible in the Cloud, alongside computing capability, private enterprise might generate new value-added products, services, and lines of business.
• Private enterprise might be willing to support the cost of transferring and storing large datasets because of these new lines of business.
• Self-sustaining partnerships are possible
5 BDP CRADAs announced April 2015
CRADA = Cooperative Research and Development Agreement
NOAA BDP Data Alliance Concept
[Diagram. Labels: NOAA data; Agency Service Tier; multiple Partner nodes.]
Project Lifecycle
• CRADAs are valid for three years
  – Research activity
  – Annual renewal possible
  – Collaborators or NOAA may choose to terminate early
• Iterative approach:
  – Start small to minimize collaborators’ investment costs
  – Use initial datasets to establish baselines and demonstrate concepts
  – Allow market-driven demand from collaborators for datasets
1st BDP Dataset: NEXRAD L2
• NEXRAD = Next-generation Radar
  – Level 2 = reflectivity data from 150+ stations
1st BDP Dataset: NEXRAD L2
• Copied all NEXRAD Level 2 processed data to AWS S3
  – Data from 1991-2015 archived at NCEI
  – Near-real-time IDD/LDM feed for current data
  – 270+ TB compressed (over 1 PB uncompressed)
  – Availability on AWS announced 2015 Oct 27
• https://aws.amazon.com/noaa-big-data/nexrad/
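As a rough illustration of what cloud-hosted access to this dataset can look like, the sketch below builds a date/station key prefix and an S3 list-request URL. The bucket name `noaa-nexrad-level2`, the `YYYY/MM/DD/STATION/` key layout, and the example station ID `KTLX` are assumptions that do not come from these slides; check the AWS page above for current details. The code only constructs strings and makes no network request.

```python
from datetime import date
from urllib.parse import urlencode

# Assumption: public bucket name; it is not given in the slides.
ASSUMED_BUCKET = "noaa-nexrad-level2"


def nexrad_prefix(station: str, day: date) -> str:
    """Return the assumed key prefix YYYY/MM/DD/STATION/ for one radar and day."""
    return f"{day:%Y/%m/%d}/{station.upper()}/"


def list_url(station: str, day: date, bucket: str = ASSUMED_BUCKET) -> str:
    """Build an S3 ListObjectsV2 (REST API) URL for the given prefix."""
    query = urlencode({"list-type": "2", "prefix": nexrad_prefix(station, day)})
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


if __name__ == "__main__":
    # KTLX is used here only as an example station identifier.
    print(list_url("KTLX", date(2015, 10, 27)))
```

Fetching such a URL would return an XML object listing only if the bucket permits anonymous listing; a client library such as boto3 is the more usual route for real use.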
Examples of Possible Future Datasets for BDP
• Multi-Radar/Multi-Sensor
• Geostationary Satellite
• Numerical Models
• LIDAR
The Role of NOAA in a Big Data world
• Ensure free and open access to all data, regardless of market interest
• Provide authoritative data, metadata, information, forecasts, warnings, and analyses
• Perform research to improve sensors, numerical models, and algorithms
• Ensure long-term preservation of master copy of data
• Perform scientific data stewardship as an unbiased, objective partner
• Provide expertise and skills to support proper use and application of data
These roles will continue, and even gain importance, with Big Data, Cloud, and other IT advances.
Initial Lessons Learned
• Much behind-the-scenes work coordinating details of data: sources, formats, status, etc.
• Need active monitoring of data feeds
• Much work communicating with external collaborators
• Differing expectations of private sector vs government on how fast things will happen
  – Reading everything off robotic tape is slow
• Aggregating even a single dataset in the Cloud can generate great interest!
Closing Thoughts
• Cloud technology will introduce significant changes for science agencies
• Government science will continue to have major roles & responsibilities
• Public/private partnerships can leverage each participant's expertise
Questions?
Jeff de La Beaujardière
[email protected]
http://orcid.org/0000-0002-1001-9210