2005 Ohio GIS Conference September 21-23, 2005 Marriott North Hotel Columbus, Ohio

Geoprocessing for Animal Premises ID

Luanne Hendricks

State of Ohio OIT/GISSC Intern

Columbus State Community College

Overview

• Objective

• Source Data & Desired Outputs

• Timeline

• Tools and Automation

• Process

• Statistics

• Observations

Objective

Input: source data from County Auditors

Geoprocessing

Output: normalized parcel data; unique AG owners

Output - Deliverables

• Normalized Parcel/Point Geodata
– agricultural (100 <= LUC <= 199)
– dairy (LUC = 103, 113)
– residential (510 <= LUC <= 520, LUC = 560)

• Normalized Tabular Data (Access DB)
– Table of unique ag owners with owner_id
– Table of parcel data with owner_id

• Time Estimate to regenerate data annually
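The land use code (LUC) ranges in the deliverables list can be written as a small classifier. This is a minimal sketch, assuming LUC is available as an integer; the function name `classify_luc` is hypothetical and not taken from the project. Because the dairy codes fall inside the agricultural range, a parcel can belong to more than one category.

```python
def classify_luc(luc):
    """Return the set of deliverable categories a land use code falls into."""
    categories = set()
    if 100 <= luc <= 199:
        categories.add("agricultural")
    if luc in (103, 113):
        # Dairy codes are a subset of the agricultural range.
        categories.add("dairy")
    if 510 <= luc <= 520 or luc == 560:
        categories.add("residential")
    return categories
```

For example, `classify_luc(103)` yields both "agricultural" and "dairy", while a code outside every range yields an empty set.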

Example: Locate Residential Parcels of Ag Land Owners

Example: Select Parcels owned by Owner ID = 2894

Owner to Parcel Table Example
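The two example queries named above can be sketched against an owner table and a parcel table linked by owner_id. The project used an Access database; this sketch substitutes Python's built-in `sqlite3` as a stand-in, and the table names, column names other than owner_id, and all rows are hypothetical.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with made-up owner and parcel rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, owner_name TEXT);
        CREATE TABLE parcel (parcel_id TEXT PRIMARY KEY,
                             owner_id INTEGER, luc INTEGER);
    """)
    conn.executemany("INSERT INTO owner VALUES (?, ?)",
                     [(2894, "OWNER A"), (2895, "OWNER B")])
    conn.executemany("INSERT INTO parcel VALUES (?, ?, ?)", [
        ("P-001", 2894, 101),   # agricultural
        ("P-002", 2894, 510),   # residential, same owner as P-001
        ("P-003", 2895, 511),   # residential, owner has no ag parcel
    ])
    return conn

def parcels_for_owner(conn, owner_id):
    """Select the parcels owned by one owner id."""
    rows = conn.execute(
        "SELECT parcel_id FROM parcel WHERE owner_id = ? ORDER BY parcel_id",
        (owner_id,))
    return [row[0] for row in rows]

def residential_parcels_of_ag_owners(conn):
    """Select residential parcels whose owner also owns an ag parcel."""
    rows = conn.execute("""
        SELECT p.parcel_id
        FROM parcel AS p
        WHERE (p.luc BETWEEN 510 AND 520 OR p.luc = 560)
          AND p.owner_id IN (SELECT owner_id FROM parcel
                             WHERE luc BETWEEN 100 AND 199)
        ORDER BY p.parcel_id
    """)
    return [row[0] for row in rows]
```

With the demo rows, owner 2894 holds one agricultural and one residential parcel, so only the residential parcel of that owner is returned by the second query.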

Source Data – Quantity/Quality

• Large volume of data
– approx. 5 million source records
– some counties had 40-50 fields of data
– approx. 5 GB of data

• Multiple source files per county

• Parcel, Point, CAMA data

• Non-standardized data fields

• Variable completeness

Example: Non-Normalized Source vs. Normalized Output

Processing – High Level View

Data Collection from Counties

Normalize Source Data

Generate Owner Ids for Parcel Records

Generate Owner Table

Match Dairy Addresses to Parcel Table

Create Project for User

Timeline

First Pass

Effort: Several PT HC (approx. 1 FT HC)

Tasks: Data Collection & Geocoding; Normalizing; Owner IDs; Dairy Match; Create Project

Months: January, February, March, April, May

Second Pass

Effort: 1 PT HC; 1 FT HC

Tasks: Identify Original Source used; Manual Normalizing; Automation; Normalizing; Owner IDs; Dairy Match; Project

Months: May, June, July, August, Sept.

Need Automation Strategy

• Need to automate process for:
– Repeatability
– Ease of modification
– Testability
– Traceability

• ...As well as speed

Tools and Processing Tasks

ArcToolBox (Model Builder; script development in Python and VBScript)
– Pre-Normalization: joining source files, adding key id, copying to working directory
– Pre-Owner ID Generation: address standardization, rejoining data file to shapefile

MS Access (VBA, queries, SQL, form interface)
– Normalization
– Owner ID & Owner Table
– (Dairy Match)

Processing Detail - Example

Pre-normalization steps in Model Builder for a county with two source files, a shapefile and a CAMA file, that need to be joined. After these steps the county is ready for normalization in Access. Slightly different steps are needed for point files and for counties with a single source parcel shapefile.

Processing Detail - Example Continued

Model Builder has limitations: you can’t loop through these steps for a list of counties. But the model can be converted to a script and coded to process a list. Additional field-name mapping steps are needed because the geoprocessing object is “coarse-grained”.

Loop through county list.

Delete temporary table view & layer.

Get fields; make field map.
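The scripted loop described above can be sketched in plain Python. This is only an outline under stated assumptions: `join_county` and `cleanup` are hypothetical callables standing in for the geoprocessing steps exported from the model (joining the files, building the field map, deleting the temporary table view and layer); no actual ArcGIS call is shown.

```python
def process_counties(counties, join_county, cleanup):
    """Run the pre-normalization step for each county.

    join_county and cleanup are hypothetical callables supplied by the
    caller. Returns the list of counties whose processing failed.
    """
    failed = []
    for county in counties:
        try:
            join_county(county)
        except Exception as exc:
            # Record the failure and keep going with the next county.
            print("%s failed: %s" % (county, exc))
            failed.append(county)
        finally:
            # Temporary table view and layer are removed either way.
            cleanup(county)
    return failed
```

Running cleanup in a `finally` clause keeps one bad county from leaving temporary objects behind that would block the next iteration.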

Example of Geoprocessing Tool Limitations

When you join fields in the geoprocessing environment and create a new Feature Layer shapefile, field names become [original layer name].[field name], truncated to 10 characters. Renaming is not done automatically for you, as it is when you join and create a new layer manually in ArcMap.
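The effect of the 10-character truncation can be illustrated without any GIS software. The sketch below only models the naming rule as described on this slide; how the real tool treats the separator or resolves duplicate names is not modeled, and the layer name "parcels" and both function names are hypothetical.

```python
def joined_field_names(layer_name, field_names, max_len=10):
    """Map each source field to '<layer>.<field>' cut to max_len characters."""
    return {f: ("%s.%s" % (layer_name, f))[:max_len] for f in field_names}

def find_collisions(name_map):
    """Return truncated names that more than one source field maps to."""
    seen = {}
    for field, truncated in name_map.items():
        seen.setdefault(truncated, []).append(field)
    return {t: fields for t, fields in seen.items() if len(fields) > 1}
```

With a layer called "parcels", the fields OWNNAM1 and OWNADD1 both shorten to the same 10 characters, which is why an explicit field map back to the intended names is needed.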

Python Script Example

Access Form Interface Used for Normalization

Example: Non-Normalized Source vs. Normalized Output

Normalization Mapping Table
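The idea of a normalization mapping table can be sketched as a per-county dictionary from source field names to normalized field names. The normalized names OWNNAM1, OWNADD1, MAILADD1, SITEADD and LUC appear elsewhere in this presentation; the source field names, the constant names and the function name below are hypothetical.

```python
# Normalized output schema (field names taken from the presentation).
TARGET_FIELDS = ("OWNNAM1", "OWNADD1", "MAILADD1", "SITEADD", "LUC")

# Hypothetical mapping for one county: source field -> normalized field.
COUNTY_A_MAP = {"OWNER": "OWNNAM1", "OWN_ADDR": "OWNADD1", "LANDUSE": "LUC"}

def normalize_record(source_record, field_map, target_fields=TARGET_FIELDS):
    """Rename source fields to the normalized schema.

    Target fields with no source value are left as empty strings, and
    source fields that are not in the mapping are dropped.
    """
    normalized = dict.fromkeys(target_fields, "")
    for source_name, target_name in field_map.items():
        value = source_record.get(source_name)
        if value is not None:
            normalized[target_name] = str(value).strip()
    return normalized
```

Keeping the mapping as data rather than code means a new county only needs a new mapping, not a new routine.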

Processing – Owner IDs

Data Collection from Counties

Normalize Source Data

Generate Owner Ids for Parcel Records

Generate Owner Table

Match Dairy Addresses to Parcel Table

Create Project for User

Owner ID and Owner Table Generation

Standardized vs. Un-standardized

Owner ID Algorithm

• Aggregate on Lastname, Firstname

• Standardize addresses

• For each Lastname,Firstname group, choose the address - OWNADD1, MAILADD1, or SITEADD, that produces the best set of matches
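The three steps above can be sketched as follows. This is one possible reading, not the project's implementation: "best set of matches" is interpreted here as the address field in which the most records of a name group share an address with another record, the standardization is a crude stand-in, and the LASTNAME and FIRSTNAME field names and all function names are hypothetical.

```python
from collections import defaultdict

ADDRESS_FIELDS = ("OWNADD1", "MAILADD1", "SITEADD")
_ABBREVIATIONS = {"STREET": "ST", "ROAD": "RD", "AVENUE": "AVE", "DRIVE": "DR"}

def standardize(address):
    """Crude stand-in for address standardization."""
    if not address:
        return ""
    words = address.upper().replace(".", "").replace(",", " ").split()
    return " ".join(_ABBREVIATIONS.get(w, w) for w in words)

def best_address_field(records):
    """Pick the field where the most records repeat an already seen address."""
    def matches(field):
        filled = [a for a in (standardize(r.get(field)) for r in records) if a]
        return len(filled) - len(set(filled))
    # max() keeps the first field on ties, so OWNADD1 is preferred.
    return max(ADDRESS_FIELDS, key=matches)

def assign_owner_ids(records):
    """Return one owner id per record, in the order the records were given."""
    groups = defaultdict(list)
    for index, rec in enumerate(records):
        name = (rec["LASTNAME"].strip().upper(), rec["FIRSTNAME"].strip().upper())
        groups[name].append(index)
    owner_ids = [None] * len(records)
    ids_by_key = {}
    for name, indexes in groups.items():
        field = best_address_field([records[i] for i in indexes])
        for i in indexes:
            key = name + (standardize(records[i].get(field)),)
            # New (name, address) combinations get the next sequential id.
            owner_ids[i] = ids_by_key.setdefault(key, len(ids_by_key) + 1)
    return owner_ids
```

Records with the same name but different standardized addresses in the chosen field receive different owner ids, so two unrelated people who share a name are not merged.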

Statistics

ORIG_REC = Total AG + Total Residential

NOAD = # Records with no address information

ADD_REC = Total # of AG + Total Residential associated with more than 1 parcel

FINL_REC = Total # of AG + Total Residential associated with at least one AG parcel

OWNR = # of Records in the Owner Table

NMD_AG = Aggregate of OWNNAM1/MAILADD1 and OWNADD1/MAILADD1, as a sanity check and to compare how effective the processing was

Testing

• Use Statistics
– Numbers make sense
– Numbers add up, e.g.:
  • All records in Parcel table assigned an owner id
  • # Records in Owner Table = # Aggregated on Owner Id in PCL table

• Visual Inspection
– Visually inspect how Owner Ids were assigned
– Create shapefile and view data in project
– Spot check source vs. processed data in shapefiles
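The two "numbers add up" checks can be automated. A minimal sketch, assuming the parcel and owner tables have been read into lists of dictionaries; the key names `parcel_id` and `owner_id` and the function name are hypothetical.

```python
def check_owner_tables(parcel_rows, owner_rows):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # Check 1: every parcel record has an owner id.
    missing = [p["parcel_id"] for p in parcel_rows if p.get("owner_id") is None]
    if missing:
        problems.append("parcels without owner_id: %s" % ", ".join(missing))
    # Check 2: owner table row count equals distinct owner ids in parcel table.
    distinct_ids = {p["owner_id"] for p in parcel_rows
                    if p.get("owner_id") is not None}
    if len(owner_rows) != len(distinct_ids):
        problems.append(
            "owner table has %d rows but parcel table has %d distinct owner ids"
            % (len(owner_rows), len(distinct_ids)))
    return problems
```

Returning messages instead of raising on the first failure lets a single run report every county-level discrepancy at once.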

Status

• 53 counties normalized

• 40 counties have owner ids/owner table

• Dairy matching - to do

• Final project – to do

Example Project – Work in Progress

Observations and Conclusions (1)

• After initial development, Automation speeds process

• For example, using Form Interface to normalize:

Data Normalization    Time      Data Volume
Manual (1st pass)     6 days    1X (Ag only)
Auto (2nd pass)       1 day     5X (Ag + Res)
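The normalization figures above imply a throughput gain that can be worked out directly. The sketch assumes that "1X" and "5X" are relative data volumes and that the day counts measure comparable effort; the function name is hypothetical.

```python
def relative_throughput(days_before, volume_before, days_after, volume_after):
    """Ratio of volume handled per day after automation vs. before."""
    # Cross-multiplied form of (volume_after / days_after) divided by
    # (volume_before / days_before), which avoids intermediate rounding.
    return (volume_after * days_before) / (volume_before * days_after)

# Manual: 1X in 6 days. Automated: 5X in 1 day.
gain = relative_throughput(6, 1, 1, 5)
```

Under those assumptions `gain` is 30, that is, roughly thirty times as much data normalized per day.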

Observations and Conclusions (2)

• Automation:
– speeds process after initial development investment
– enables repeatability of process
– makes modification and redo less painful
– increases data consistency
– reduces errors
– accurately documents process
– increases future capability to do similar processing – tools are reusable

• Automation is cost effective

Observations and Conclusions (3)

• This job would be easier if:
– Data was maintained in small standard components:
  • Last Name, First Name, MI as separate fields
  • Address components – SiteNum, SiteDir, SiteStr
  • There was a standard for field names of components
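When addresses arrive as a single string, the components named above have to be recovered by parsing. The sketch below shows a deliberately simple pattern for that; the regular expression and function name are hypothetical, and real county data would need far more cases (unit numbers, rural routes, post office boxes).

```python
import re

# Number, optional one- or two-letter direction, then the street text.
_ADDRESS_RE = re.compile(
    r"^\s*(?P<SiteNum>\d+)\s+"
    r"(?:(?P<SiteDir>N|S|E|W|NE|NW|SE|SW)\s+)?"
    r"(?P<SiteStr>.+?)\s*$",
    re.IGNORECASE,
)

def split_site_address(address):
    """Split an address string into SiteNum, SiteDir and SiteStr.

    Returns None when the string does not start with a house number.
    """
    match = _ADDRESS_RE.match(address or "")
    if match is None:
        return None
    return {name: value.upper()
            for name, value in match.groupdict(default="").items()}
```

Even this small pattern has to guess, for instance whether a leading "E" is a direction or a street name, which is the kind of ambiguity that separately maintained fields would remove.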
