A First Look at San Francisco’s New ETL Job Platform

Samuel Valdez
Janine Heiser

Agenda

• About Us
• Some Context
• A Very Brief History
• The ETL Job Platform
• Workflow
• Looking Ahead
• Questions

About Us

City and County of San Francisco

Department of Technology
http://sfgov.org/dt/

Samuel Valdez

Enterprise GIS Engineer

San Francisco Enterprise GIS Program (SFGIS)

Janine Heiser

Open Data Engineer

DataSF

Some Context

DataSF (open data program)

Our mission is to empower use of the City’s data. Our vision is that the City’s data is understood, documented, and of high quality.

(With some help from the Department of Technology.)

Learn more at http://datasf.org/

Automated Data Publishing

“... goal of increasing the number and timeliness of datasets on [SF OpenData]...”

SF OpenData (Socrata)

A Very Brief History

Why is SFGIS involved?
• Well-established legacy
• Traditionally shared (spatial) data
• Most open data was spatial
• Enterprise perspective and relationships
• Technical skills, tools, and experience

Generation 0
• Predecessor platform
• FME Server-based
• Deployed around 2011 to support the EAS
• Several key ETL jobs
• Project champion
• Organic evolution
• Learning experience

The ETL Job Platform

Platform Architecture I

“ETL Job” Services

Platform Architecture II

Platform Architecture III

Workflow

“Requirements”
• Few workspace authors
• Need a safe place to try out workspaces
• A few hundred ETL jobs (100-500?)
• “Simple” ETL jobs
• Use best practices
• And more

Workflow I

Workflow II

Workflow III

Looking Ahead

Looking Ahead x 2

Possible future talks
• Version control
• Supporting services, including geo
• Job scheduling
• Business continuity
• Operations and maintenance
• And more

Possible platform evolution
• Scale out
• More value-added services
• Support streaming data
• Third-party scheduler (not FME-centric)
• Asynchronous ETL job execution
• And more?

Questions?

Thank you!

Jeff Johnson – SFGIS & OD Manager
Joy Bonaguro – Chief Data Officer
Miguel Gamiño – City CIO
Aaron Koning – FME Server Product Mgr

Thank you!

Samuel.Valdez@sfgov.org
Janine.Heiser@sfgov.org

http://datasf.org/
http://sfgov.org/dt/
