FME Data Transformation for the Geographic Support System Initiative

Jay E. Spurlin
Software Architect and Development Manager for the GSS-I Feature Source Evaluation software system

April 8, 2013


U.S. Census Bureau

• The Census Bureau serves as the leading source of quality data about the nation's people and economy. We honor privacy, protect confidentiality, share our expertise globally, and conduct our work openly. We are guided on this mission by our strong and capable workforce, our readiness to innovate, and our abiding commitment to our customers.


Geography Division

• The Geography Division plans, coordinates, and administers all geographic and cartographic activities needed to facilitate the Census Bureau's statistical programs throughout the US and its territories. We manage the Census Bureau's programs to continuously update features, boundaries and geographic entities in TIGER and the Master Address File (MAF). We also conduct research into geographic concepts, methods, and standards needed to facilitate the Census Bureau's data collection and dissemination programs.


GSS-I

• In support of the 2020 Decennial Census, the Census Bureau is evaluating what areas should be targeted for a traditional, on-the-ground address canvassing operation and in which areas a traditional canvassing operation is not necessary.

• The task the Census Bureau is undertaking is determining how to decide which areas should be considered for targeting
– GEO has evaluated the MAF/TIGER database and assigned quality indicators to each of the census tracts
– A Targeted Address Canvassing strategy has been developed that contains an inventory of criteria for evaluation


GSS-I

• The Geographic Partnership program is now underway.
– GEO is receiving both address and spatial data from invited partners
• This data is at the state, county, and local level.
• The data is being evaluated and integrated with the MAF/TIGER database.
• The next step is to determine what level of feedback we can give to the partners about their data.

• GEO is also working with statisticians on predictive modeling to help determine where to target.

• The combination of the evaluation of the current MAF/TIGER database, the partner data, and the predictive modeling will contribute to the recommendation on which areas of the country should be considered for targeting.


The Geographic Partnership Program

• A partner provides a set of source files
• The source files are moved inside the Census firewall via a secure web-exchange module
• The content inventory of the files undergoes initial verification
• The files are preserved, as supplied, for later reference
• A more detailed content assessment is done, including verification that the files meet the minimum guidelines for content and metadata
• The files are prepared for automated processing, including re-projection and mapping to a standardized schema
• A series of (mostly) automated checks is run, which provides metrics about the data in the files
• An interactive review is conducted, in which the files and their associated metrics are reviewed and a decision is made on how to capture any new data
• Any data that are not useful for updating the MAF/TIGER database get removed from the files
• Features or addresses are added or modified, using either an automated conflate and review process or an interactive update process


Feature Source Evaluation Software

• A number of MAF/TIGER spatial layers will be extracted for the extent of the partner entity

• An analyst will use the supplied data and metadata to map the provided source schema to a standardized schema, and the supplied road centerline file will be converted to an ArcSDE layer, re-projected, and the name and MTFCC mappings applied

• The feature names in the source file will be standardized to the parsed, MAF/TIGER naming conventions

• The standardized feature names will be checked to see if any contain illegal characters or prohibited or generic names

• A topological check will be run, to gauge the topological stability of the source file

• A completeness / change detection check will be run to attempt to identify areas in the source file that contain features not found in MAF/TIGER

• A comparison will be run between the universe of feature names in the source file and the universe of feature names found in MAF/TIGER within the extent of the entity

• All intersections that meet the requirements for CE95 assessment will be identified
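The check on standardized feature names described above could be sketched roughly as follows. This is only an illustration: the allowed-character pattern and the prohibited and generic name lists are hypothetical placeholders, since the slides do not give the actual MAF/TIGER rules.

```python
import re

# Hypothetical rule sets; the real MAF/TIGER rules are not given in the slides.
ALLOWED_CHARS = re.compile(r"^[A-Za-z0-9 .'\-/]+$")
PROHIBITED_NAMES = {"UNKNOWN", "UNNAMED", "NO NAME"}
GENERIC_NAMES = {"ROAD", "STREET", "ALLEY", "DRIVEWAY"}


def check_feature_name(name):
    """Return a list of problem labels for one standardized feature name."""
    cleaned = name.strip()
    if not cleaned:
        return ["empty"]
    problems = []
    # Any character outside the (assumed) allowed set counts as illegal.
    if not ALLOWED_CHARS.match(cleaned):
        problems.append("illegal_characters")
    upper = cleaned.upper()
    if upper in PROHIBITED_NAMES:
        problems.append("prohibited_name")
    if upper in GENERIC_NAMES:
        problems.append("generic_name")
    return problems
```

Returning labels rather than a single pass/fail flag lets the results be counted per problem type, which fits the idea of reporting metrics about the source file.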


Previous FME Technology Architecture

• FME Workspaces were developed using FME Workbench 2012 on desktop workstations, running 32-bit Windows XP Service Pack 3
• FME Server 2012 (FME Engine only), on batch servers running Linux Redhat Enterprise 5 connected to a SAN (Storage Area Network)

[Architecture diagram. Components shown: Linux Batch Server; FME Server (command line invocation of FME Engines); Perl and shell scripts; Cronacle job-queueing system; Oracle Run-Time Client; Shapefiles on SAN; MAF/TIGER (Oracle Database)]


New FME Technology Architecture

• FME Workspaces are developed using FME Workbench 2012 SP3 on desktop workstations, running 32-bit Windows XP Service Pack 3
• FME Server 2012 SP3 (FME Server Console), on batch servers running Linux Redhat Enterprise 5
• FME Server 2012 SP3, on Windows server, with SAN (Storage Area Network) disk(s) mounted via Samba

[Architecture diagram. Components shown: Linux Batch Server; Perl and shell scripts; FME Server Console (remote job submission to FME Server); Cronacle job-queueing system; Shapefiles on SAN; MAF/TIGER (Oracle Database); Windows Web Server; ArcGIS for Server; FME Server (full installation); Oracle Run-Time Client; ArcSDE Geodatabase]


Cross-walking (Transmogrification)


Topology Check

• The Topology Check workspace compiles a number of topology and tolerance based metrics:
– Gaps – endpoints within 5 meters of any line segment
– Overshoots – line segments extending less than 5 meters beyond an intersection
– Tiny Features – features with a total length less than 5 meters
– Floating Features – features or connected sets of features that are not connected to the rest of the road network
– Exact Duplicates – features whose geometry and name are identical to another feature
– Coincident – features whose geometry overlaps with another feature
– Crossing – features that cross but do not intersect at a node
– Multi-part – features that consist of multiple geometry parts
– Cutbacks – features containing angles less than 25 degrees
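Two of the metrics listed above, Tiny Features and Cutbacks, can be illustrated with a small standalone sketch. It is not the FME workspace itself. It assumes each feature is a single-part polyline given as (x, y) tuples in a projected coordinate system with units of meters, and the function names are hypothetical.

```python
import math

TINY_LENGTH_M = 5.0       # from the slide: total length less than 5 meters
CUTBACK_ANGLE_DEG = 25.0  # from the slide: angles less than 25 degrees


def line_length(coords):
    """Total length of a polyline given as a list of (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def interior_angles(coords):
    """Angle in degrees at each interior vertex between its two adjoining segments."""
    angles = []
    for prev, mid, nxt in zip(coords, coords[1:], coords[2:]):
        v1 = (prev[0] - mid[0], prev[1] - mid[1])
        v2 = (nxt[0] - mid[0], nxt[1] - mid[1])
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            continue  # skip repeated vertices, which define no angle
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        # Clamp to [-1, 1] to guard against floating-point drift.
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))))
    return angles


def is_tiny(coords):
    return line_length(coords) < TINY_LENGTH_M


def has_cutback(coords):
    return any(a < CUTBACK_ANGLE_DEG for a in interior_angles(coords))
```

A straight continuation measures 180 degrees under this definition, so only sharp reversals fall below the 25 degree threshold.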


Completeness / Change Detection Check

• The MAF/TIGER road centerline features and the feature source file road centerline features will be compared using an FME workspace.

• The MAF/TIGER features will be Buffered to a distance of 15 meters, then “overlaid” with the source file features.

• Any source file feature parts that fall outside of the Buffer areas will be chained together, and the total length of difference (and of each part) will be reported as an evaluation metric.
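The idea behind this check can be sketched without a GIS library. The code below is a simplified approximation, not a true buffer-and-overlay: instead of clipping source geometry against buffer polygons, it treats a source segment as outside the buffer when its midpoint lies more than 15 meters from every MAF/TIGER segment. Coordinates are assumed to be projected meters, source segments are assumed to be short relative to the buffer distance, and all function names are hypothetical.

```python
import math

BUFFER_M = 15.0  # buffer distance from the slide


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.dist(p, a)  # degenerate segment
    # Project p onto the segment and clamp to its endpoints.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def distance_to_lines(p, lines):
    """Distance from p to the nearest segment of any polyline (inf if none)."""
    return min(
        (point_segment_distance(p, a, b)
         for line in lines
         for a, b in zip(line, line[1:])),
        default=math.inf,
    )


def length_outside_buffer(source_line, reference_lines, buffer_m=BUFFER_M):
    """Sum the lengths of source segments whose midpoint is outside the buffer."""
    total = 0.0
    for a, b in zip(source_line, source_line[1:]):
        mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        if distance_to_lines(mid, reference_lines) > buffer_m:
            total += math.dist(a, b)
    return total
```

For example, with a reference road along the x-axis, a source line running 5 meters away contributes nothing, while one running 40 meters away contributes its full length to the difference metric.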


CE95 Qualifying Intersection Identification

• Qualifying intersections must meet the following criteria:
– Must consist of three roads (a “T” intersection) or four roads (an “X” intersection)
– Must consist of only secondary roads or local roads
– Must meet at 90 or 180 degree angles, with a 15 degree plus/minus tolerance
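The three criteria above can be expressed as a small predicate. This is a sketch under stated assumptions, not the workspace's actual logic: each road leg is assumed to be given as an outward bearing in degrees plus an MTFCC code, the angle test is assumed to apply to the angles between adjacent legs, and S1200 and S1400 are assumed to be the codes used for secondary and local roads.

```python
# Assumed MTFCC codes for secondary (S1200) and local (S1400) roads.
ALLOWED_CLASSES = {"S1200", "S1400"}
TOLERANCE_DEG = 15.0  # from the slide: plus/minus 15 degrees


def adjacent_angles(bearings):
    """Angles between neighboring legs, going once around the intersection."""
    ordered = sorted(b % 360.0 for b in bearings)
    count = len(ordered)
    return [(ordered[(i + 1) % count] - ordered[i]) % 360.0 for i in range(count)]


def near_right_or_straight(angle):
    """True when the angle is within tolerance of 90 or 180 degrees."""
    return any(abs(angle - target) <= TOLERANCE_DEG for target in (90.0, 180.0))


def is_qualifying(legs):
    """legs: list of (bearing_deg, mtfcc) tuples for roads meeting at one node."""
    if len(legs) not in (3, 4):
        return False  # must be a "T" or an "X" intersection
    if any(mtfcc not in ALLOWED_CLASSES for _, mtfcc in legs):
        return False  # only secondary or local roads
    return all(near_right_or_straight(a)
               for a in adjacent_angles([bearing for bearing, _ in legs]))
```

Under this reading, a "T" yields adjacent angles near 90, 90, and 180 degrees, an "X" yields four angles near 90 degrees, and a "Y" with three 120 degree angles is rejected.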


Thank You!

Questions?

For more information:
Jay E. Spurlin
[email protected]
U.S. Census Bureau
http://www.census.gov/geo/www/gss/