Next Generation Cloud Based Ingest & Processing Framework (I&PF) for Environmental Data

Josh Leaverton, Peter MacHarrie, Dan Beall (Solers, Inc.)
Dr. Shay Strong (OmniEarth, Inc.)

2017 AMS Annual Meeting
Rich Baker, Chief Architect, Solers, Inc.
Email: [email protected]
Phone: 240-790-3338

Copyright © 2017 by Solers, Inc.




Cloud Based Ingest & Processing Framework (I&PF)


OBJECTIVES:

• Enable fast/easy integration of data sources, product algorithms, and data consumers within a cloud based workflow (or “data pipeline”) framework

• Provide easy to use web-based user interfaces for discovery and access (for end users), as well as workflow monitoring and management (for algorithm developers and system operators/admins)

• Provide RESTful web services for other developers, scientists, etc. to discover and access the ingested/processed data and metadata, for use in other research / engineering initiatives (e.g., developing a new product algorithm)

Uses Readily Available Open Source Technologies and Commercial Amazon Cloud Services
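
The slides do not document the RESTful discovery and access interface itself, so the following is only a minimal sketch of what a client-side search request could look like. The base URL, the endpoint path, and every parameter name (`dataType`, `startTime`, `endTime`, `bbox`) are assumptions made for illustration, not the framework's actual API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL and parameter names; the real I&PF API is not
# described in the slides.
BASE_URL = "https://ipf.example.com/api/v1/search"


def build_search_url(data_type, start_time, end_time, bbox=None):
    """Return a GET URL that searches the inventory by data type, time
    range, and an optional (west, south, east, north) bounding box."""
    params = {
        "dataType": data_type,
        "startTime": start_time,
        "endTime": end_time,
    }
    if bbox is not None:
        # Encode the bounding box as a comma-separated list of degrees.
        params["bbox"] = ",".join(str(value) for value in bbox)
    return BASE_URL + "?" + urlencode(params)


url = build_search_url(
    "ATMS",
    "2016-06-01T00:00:00Z",
    "2016-06-02T00:00:00Z",
    bbox=(-125.0, 32.0, -114.0, 42.0),
)
print(url)
```

A developer building a new product algorithm could issue such a request to discover candidate input granules and then fetch the files through whatever access links the service returns.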


Cloud Based I&PF High-Level Architecture


Cloud Based I&PF Apache NiFi Workflow Engine


Cloud Based I&PF 3 NOAA Proof of Concept Use Cases (Solers IR&D)

NOAA S-NPP ATMS and MIRS
• Ingests and inventories Suomi National Polar-orbiting Partnership (S-NPP) Advanced Technology Microwave Sounder (ATMS) granules
• Generates Microwave Integrated Retrieval System (MIRS) products from the ATMS granules
• Makes ATMS granules and MIRS products searchable and accessible

NOAA Nexrad II Weather Radar
• Ingests and inventories NOAA Nexrad II Weather Radar data sets that were published on Amazon S3 as part of the NOAA Big Data Project
• Makes NOAA Nexrad II Weather Radar data sets searchable and accessible

MIRS / Nexrad II Blended Product
• Leverages the available MIRS products and NOAA Nexrad II Weather Radar data sets to produce a new blended product that combines the MIRS snow/water data with the Nexrad II radar data over mountainous regions
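
To make "ingests and inventories" concrete for the Nexrad II use case, here is a hedged sketch of extracting an inventory record from an S3 object key. The key layout (`YYYY/MM/DD/SITE/SITEYYYYMMDD_HHMMSS_V06.gz`), the bucket name, and the metadata field names are assumptions for illustration; the slides do not specify what metadata the framework extracts.

```python
import json
import re
from datetime import datetime, timezone

# Assumed key layout: YYYY/MM/DD/SITE/SITEYYYYMMDD_HHMMSS_Vnn(.gz)
KEY_PATTERN = re.compile(
    r"^\d{4}/\d{2}/\d{2}/(?P<site>[A-Z0-9]{4})/"
    r"(?P=site)(?P<stamp>\d{8}_\d{6})_V\d{2}(?:\.gz)?$"
)


def extract_metadata(bucket, key, size_bytes):
    """Build a JSON-serializable inventory record from an object key."""
    match = KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"unrecognized key layout: {key}")
    scan_time = datetime.strptime(
        match.group("stamp"), "%Y%m%d_%H%M%S"
    ).replace(tzinfo=timezone.utc)
    return {
        "dataType": "NEXRAD_II",  # hypothetical type label
        "site": match.group("site"),
        "scanTime": scan_time.isoformat(),
        "location": f"s3://{bucket}/{key}",
        "sizeBytes": size_bytes,
    }


record = extract_metadata(
    "example-bucket",  # placeholder bucket name
    "2016/06/01/KTLX/KTLX20160601_000234_V06.gz",
    1234567,
)
print(json.dumps(record, indent=2))
```

A record like this is the kind of JSON document that could be indexed so the radar files become searchable by site and time.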


Cloud Based I&PF NOAA Proof of Concept Architecture


Cloud-Based I&PF NOAA Proof of Concept NOAA Data Ingest, Processing, and Inventory

[Data flow diagram. The labels recovered from it are listed below; the diagram numbers its flows (1), (2a), (2b), (3), (4), (5), and (6).]

Components
• I&PF Workflow Engine (Apache NiFi on Amazon EC2)
• I&PF Data Storage (Amazon S3)
• I&PF In-Memory Cache (Amazon ElastiCache (Redis))
• I&PF Metadata Repository (Amazon Elasticsearch)
• I&PF User Portal and Discovery and Access (Web Services)

Workflow steps
• Ingest Files (extracts metadata)
• Index Metadata in Elasticsearch
• Check for PG Triggers (data types that trigger a PG algorithm to be executed)
• Check for Completed Job Specifications (all required input data received for a PG trigger)
• Execute MIRS Algorithm (executes the algorithm leveraging all required input data)
• Ingest Generated Products (extracts metadata)

Data exchanged
• S-NPP ATMS Granules (HDF5 Files), S-NPP ATMS Granules (Full Orbit)
• Nexrad II Radar Data (Gzipped Files)
• MIRS Products (NetCDF4 Files), Generated Products
• Metadata (JSON String)
• Job Specification (JSON String), Job (JSON String)
• Production Rule Queries and Results (JSON String)
• Inventory Queries and Results (JSON String)
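
The trigger and job specification checks in this workflow can be sketched as follows. This is an illustrative simplification, not the framework's implementation: the rule contents, the data type names (`ATMS-SDR`, `ATMS-GEO`), and the JSON field names are hypothetical, a Python set stands in for the metadata repository, and a dict stands in for the in-memory cache.

```python
import json

# Hypothetical production rule: which data type triggers an algorithm and
# which inputs must all be present before it can run.
PRODUCTION_RULES = {
    "MIRS": {"trigger": "ATMS-SDR", "required": {"ATMS-SDR", "ATMS-GEO"}},
}

inventory = set()  # (granule_id, data_type) pairs; stands in for the metadata repository
pending = {}       # open job specifications; stands in for the in-memory cache


def on_ingest(granule_id, data_type):
    """Record an ingested file and return the jobs (JSON strings) that are
    now ready to execute."""
    inventory.add((granule_id, data_type))

    # Check for triggers: a trigger data type opens a job specification.
    for algorithm, rule in PRODUCTION_RULES.items():
        if data_type == rule["trigger"]:
            pending[f"{algorithm}:{granule_id}"] = {
                "algorithm": algorithm,
                "granuleId": granule_id,
                "required": sorted(rule["required"]),
            }

    # Check for completed job specifications: all required inputs received.
    jobs = []
    for key in list(pending):
        spec = pending[key]
        if all((spec["granuleId"], t) in inventory for t in spec["required"]):
            jobs.append(json.dumps(pending.pop(key)))
    return jobs


first = on_ingest("granule-001", "ATMS-GEO")   # no trigger yet, nothing to run
second = on_ingest("granule-001", "ATMS-SDR")  # trigger arrives, inputs complete
print(first, second)
```

Keeping open job specifications in a cache means each ingest event only has to re-check a small set of pending entries rather than rescan the whole inventory.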


NOAA Data/Products Made Available in the Cloud-Based I&PF: Ingested S-NPP ATMS Granules


NOAA Data/Products Made Available in the Cloud-Based I&PF: Ingested Nexrad II Weather Radar Data


NOAA Data/Products Made Available in the Cloud-Based I&PF: Generated MIRS Products


NOAA Data/Products Made Available in the Cloud-Based I&PF: MIRS / Nexrad II Blended Product


Cloud Based I&PF OmniEarth Commercial Project

OmniEarth Overview
• OmniEarth utilizes large satellite imagery sets combined with advanced machine learning algorithms to classify land cover for purposes of determining outdoor water budgets at the parcel level
• These budgets aid water agencies in drought-ridden communities in the US to best target water over-users

Solers’ OmniEarth Commercial Project
• Solers has partnered with OmniEarth’s Data Scientists to help them utilize the Cloud Based I&PF in order to automate their (previously manual) satellite imagery ingest and land classification algorithm processing activities for their commercial Water Resource Management product

Water Resource Management Information: http://water.omniearth.net


Cloud Based I&PF OmniEarth Commercial Project Architecture


Cloud Based I&PF OmniEarth Commercial Project Satellite Imagery Ingest and Land Classification Processing

[Data flow diagram. The labels recovered from it are listed below; the diagram numbers its flows (1) through (5).]

Components
• I&PF Workflow Engine (Apache NiFi on Amazon EC2)
• Temporary File Storage and Persistent File Storage (Amazon S3 and Amazon EFS)
• Queuing Service (Amazon SQS)
• OmniEarth Land Classification Algorithm Workflow (Amazon EC2 GPU VM), shown as multiple instances with Auto-Scaling
• Automated Algorithm Provisioning (Amazon CodeDeploy)
• OmniEarth Algorithm Repository (GitHub)

Workflow steps
• Initiate Regional Ingest (Web Service)
• Ingest Regional Satellite Imagery Tiles (extracts metadata)
• Generate Multi-Zoom-Level Raster Tiles (executes algorithm)
• Land Classification Algorithm Workflow VM Provisioning
• Retrieve Latest Land Classification Algorithm (GitHub Request)
• Deploy Latest Land Classification Algorithm (AWS CodeDeploy Request)
• Periodic Status Updates (Web Service)

Data exchanged
• Regional Satellite Imagery Tiles (Web Service Requests and GeoTIFF Files)
• Multi-Zoom-Level Raster Tiles (GeoTIFF Files)
• Multi-Zoom-Level Raster Tile Information (JSON String)
• Land Classified Imagery Tiles (GeoTIFF Files)
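
The queue-driven hand-off between tile generation and the land classification workers can be sketched as below. This is a local illustration only: Python's `queue.Queue` stands in for the queuing service, `classify` is a placeholder rather than OmniEarth's algorithm, and the message fields, file paths, and scaling rule are all assumptions.

```python
import json
import queue

tile_queue = queue.Queue()  # local stand-in for the queuing service


def enqueue_tiles(region, tile_paths):
    """Publish one JSON message per raster tile awaiting classification."""
    for path in tile_paths:
        tile_queue.put(json.dumps({"region": region, "tile": path}))


def classify(tile_path):
    """Placeholder for the classification step; returns the output path."""
    return tile_path.replace("/raster/", "/classified/")


def drain(max_messages):
    """Worker loop: take up to max_messages from the queue and classify
    the tile named in each message."""
    outputs = []
    for _ in range(max_messages):
        try:
            message = json.loads(tile_queue.get_nowait())
        except queue.Empty:
            break
        outputs.append(classify(message["tile"]))
        tile_queue.task_done()
    return outputs


def desired_workers(backlog, tiles_per_worker=100, max_workers=10):
    """Toy scaling rule: one worker per tiles_per_worker queued tiles,
    capped at max_workers, and none when the queue is empty."""
    if backlog == 0:
        return 0
    return min(max_workers, -(-backlog // tiles_per_worker))  # ceiling division


enqueue_tiles(
    "region-a",
    ["/raster/z12/1.tif", "/raster/z12/2.tif", "/raster/z13/3.tif"],
)
workers = desired_workers(tile_queue.qsize())
done = drain(2)
print(workers, done, tile_queue.qsize())
```

Decoupling the two stages through a queue lets the number of GPU workers follow the backlog instead of being fixed in advance, which is the role auto-scaling plays in the diagram.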


Cloud Based I&PF OmniEarth Commercial Project Automation At Scale, Without Quality Reduction


Solers’ Cloud-Based I&PF workflow automation produces land classified imagery tiles for OmniEarth, with the same level of precision and accuracy as those produced manually

Produced Manually by OmniEarth’s Data Scientists

Produced by Solers’ Cloud Based I&PF Workflow Automation


Cloud Based I&PF OmniEarth Commercial Project Outcomes and Benefits for OmniEarth


Automation and Efficiency
• Automates their previously manual satellite data ingest and land classification processing activities
• Reduces the time to perform these activities by an order of magnitude (days to hours)
• Allows OmniEarth’s Data Scientists to focus on improving their algorithms and training models, instead of manually running and watching over the satellite data ingest and land classification algorithm execution

Tailored to the Data Scientist Needs
• Simple web service interface to initiate the workflow, based upon customer needs (e.g., specific regions of interest)
• Periodic monitoring/alerting of workflow status using a tool that is already heavily used and familiar to them (Slack)
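
As an illustration of the Slack status updates mentioned above, the sketch below builds a JSON body of the form accepted by a Slack incoming webhook (an object with a `text` field). The message wording, the region name, and the progress counters are hypothetical; the slides do not show the actual alert format, and the sketch deliberately stops short of sending anything.

```python
import json


def build_status_message(region, tiles_done, tiles_total):
    """Return a JSON body describing classification progress for a region."""
    # Integer percentage; guard against an empty workload.
    percent = 100 * tiles_done // tiles_total if tiles_total else 0
    text = (
        f"Region {region}: {tiles_done}/{tiles_total} "
        f"tiles classified ({percent}%)"
    )
    return json.dumps({"text": text})


payload = build_status_message("region-a", 150, 600)
print(payload)
```

In a running workflow, a body like this would be sent with an HTTP POST to a webhook URL configured for the team's channel, on whatever schedule the periodic status step uses.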


Cloud Based I&PF Future Utilities and Benefits

Development, integration, and test environment for Government (e.g., NOAA, NASA) satellite ground systems
• Perform R&D and Cal/Val of new product algorithms for multiple satellites/platforms
• Scalable cloud-based framework that avoids on-premise infrastructure costs (pay just for the services that you need/use)
• Automation at scale with interfaces tailored to science algorithm developer and data scientist needs, helping to reduce the Research to Operations (R2O) timeline

Ingest and processing framework for commercial small satellite startup companies
• Enable them to quickly get their satellite data ingested, processed, and available to users via a scalable cloud-based workflow or “data pipeline” framework, without requiring on-premise infrastructure


AMS 2017 Theme: Observations Lead The Way


Your view on the greatest observational needs for your discipline in general
• Promotion of cloud-based platforms/frameworks for NOAA (FedRAMP approved, such as AWS GovCloud) to perform value-added capabilities with available observation data sets that are being published to cloud storage services (such as S3) as part of ongoing cloud-related initiatives (such as the NOAA Big Data Project)
• Index/catalog the data for discovery and access, and leverage cloud services to perform other value-added capabilities such as product generation, data assimilation, re-processing, etc. (more than just storing the data)


Questions
