
Introduction to Geographic Information Systems, Fall 2013 (INF 385T-28620)
Geodatabases

Dr. David Arctur
Research Fellow, Adjunct Faculty
University of Texas at Austin

Lecture 4
September 19, 2013

Outline

Tables
Geocodes
Data table joins
Spatial joins
Spatial data formats
Geodatabases
Calculating geometry

INF385T(28620) – Fall 2013 – Lecture 4

TABLES

Two kinds of tables in ArcGIS
Feature attribute table of map layer
– Attribute data is part of map layers
Data table with geocodes (such as census IDs)
– Can add as table to ArcMap
– Can join to map layer to add more attributes to layer
– Join via same geocode values in both the data table and map layer’s attribute table
– Census data example: too many census variables to supply already in feature attribute table, so download custom table and join to appropriate polygon layer

Data table format
Rectangular table with one value per cell
Columns (fields) are attributes
Rows are observations (records)

Data table format
First row must have column names that are self-documenting labels
– E.g., Shape, POP2000
First character of attribute name must be a letter
Remaining characters can be any letter, digit, or the underscore character (but no blanks)
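The naming rule above can be checked mechanically. The following is a minimal sketch, assuming plain ASCII letters; the function name is hypothetical and this is not an ArcGIS API.

```python
import re

# Rule from the slide: a letter first, then letters, digits, or underscores.
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_valid_field_name(name: str) -> bool:
    """Return True if name satisfies the slide's naming rule."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None

print(is_valid_field_name("POP2000"))   # True
print(is_valid_field_name("2000POP"))   # False: starts with a digit
print(is_valid_field_name("POP 2000"))  # False: contains a blank
```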

Data table format
All additional rows of a data table must contain only attribute values (raw data)
None of the rows can be sums, averages, or other statistics for raw data rows

Primary keys
Each table has a primary key attribute with two properties
– Each value is unique
– There are no null values
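Both primary key properties can be tested before attempting a join. This is a small sketch in plain Python, assuming the table is a list of dictionaries; the field names and population values are made up for illustration.

```python
def is_primary_key(rows, field):
    """Return True if every row has a non-null, unique value in `field`."""
    seen = set()
    for row in rows:
        value = row.get(field)
        if value is None or value in seen:
            return False  # null or duplicate value breaks the key
        seen.add(value)
    return True

# Hypothetical data table keyed by tract ID.
tracts = [
    {"TRACT_ID": "1917", "POP2000": 2500},
    {"TRACT_ID": "1918", "POP2000": 3100},
]
print(is_primary_key(tracts, "TRACT_ID"))  # True
```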

Field calculator
Add computed columns in ArcGIS
– ArcGIS does not have the query capacity of relational database packages to compute new columns on the fly
– So, must create permanent new columns
Full range of computation
– Can add, multiply, etc.
– Has numeric and text functions
– Can concatenate text values

Field calculator (numeric)


Field calculator (text)
Concatenate house number and street fields
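The concatenation itself is a one-line text operation. The sketch below shows the idea in plain Python rather than inside the Field Calculator dialog; the function name is hypothetical, and the sample address is the one used in the parcel example of this lecture.

```python
def full_address(house_number, street):
    """Join a house number and a street name into one text value."""
    return f"{house_number} {street}".strip()

print(full_address(1690, "Seaton St"))  # 1690 Seaton St
```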

External table file formats for import into ArcGIS
Plain ASCII text with comma-separated values (.csv)
– Very transportable format, very large files
– Each table record is a row terminated with a line-break character (invisible, nonprinting value)
– Has values separated by a delimiter, usually a comma
– For data values that contain the delimiter, enclose the value in double quotes
– Sometimes columns get wrong data type on import (use double quotes to force text data type for digits, say for house numbers)
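The quoting rules above can be tried out with Python's standard csv module. This is a sketch only: the owner name is invented, and whether a given importer treats quoted digits as text depends on that importer.

```python
import csv
import io

rows = [
    ["HouseNum", "Street", "Owner"],
    ["1690", "Seaton St", "Smith, Jane"],  # owner value contains the delimiter
]

# QUOTE_NONNUMERIC wraps every text value in double quotes, including the
# house number, because it is stored here as a string rather than a number.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back recovers the original values, comma included.
parsed = list(csv.reader(io.StringIO(csv_text)))
```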

External table file formats for import into ArcGIS
Excel (.xls, .xlsx)
– Excel 2003, up to 65,536 rows and 256 columns
– Excel 2007, up to 1,048,576 rows and 16,384 columns
dBase database table (.dbf)
– Legacy format
– ArcMap truncates field names to first 10 characters
– dBase IV has maximum of 255 columns
– Can open dBase file in Excel but cannot save dBase from Excel
Microsoft Access database (.mdb)
– Up to 2 GB file size
– See following for other limits: http://www.databasedev.co.uk/access_specifications.html

GEOCODES

Geocodes (2000)
Federal Information Processing Standards (FIPS)
– Developed by the National Institute of Standards and Technology
– Codes for place-names throughout the world
  – Countries
  – States/provinces
  – Counties
  – Metropolitan statistical areas (MSAs)
  – Cities
  – Places: Indian reservations, airports, and post offices in the US
See http://www.genesys-sampling.com/pages/Template2/site2/61/default.aspx for additional geocodes.

Geocodes: hierarchical
FIPS codes (political boundaries)
– Country: US
– State: 42 (Pennsylvania)
– County: 003 (Allegheny)
– Minor civil division: 4200361000 (Pittsburgh)
Census codes (statistical boundaries)
– Tract: 1917
– Block group: 003
– Block: 005 (US420031917003005)
Local government cadastral data (legal boundaries)
– Parcel block & lot number: 0096-P-00210000000 (1690 Seaton St, Pittsburgh, PA 15226)
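The full block code on this slide is the concatenation of the codes at each level, largest area first. A minimal sketch using the slide's own values follows; the function name is hypothetical, and the codes are kept as text so that leading zeros survive.

```python
def build_block_geocode(country, state, county, tract, block_group, block):
    """Concatenate hierarchy levels, largest area first, into one geocode."""
    return "".join([country, state, county, tract, block_group, block])

code = build_block_geocode("US", "42", "003", "1917", "003", "005")
print(code)  # US420031917003005

# Every code inside Allegheny County, Pennsylvania shares the same prefix.
print(code.startswith("US42003"))  # True
```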


World and US


US and state 42

State 42 and county 003


County 003 and municipality 61000

Municipality 61000 and tract 1917


Tract 1917 and block group 003

Block group 003 and block 005

Geocodes (2010)
ANSI Codes
– American National Standards Institute codes
– Replace the Federal Information Processing Standards (FIPS)
– The entities covered include:
  – States and statistically equivalent entities
  – Counties and statistically equivalent entities
  – Named populated and related location entities (such as places and county subdivisions)
  – American Indian and Alaska Native areas
See http://www.census.gov/geo/www/ansi/ansi.html

DATA TABLE JOINS

Review: Table joins
Puts two tables together, on the fly, to make one table
– One-to-one join (e.g., join state attribute data to state shapefile by StateName)
– One-to-many join (e.g., join code table to feature attribute table to add code description; many records can use the same code value)
Each table in a join must have key attribute for matching
– Must have same values and data types for key in both tables
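A one-to-many join of a code table to a feature attribute table can be sketched with a dictionary lookup. The tables, field names, and values below are hypothetical and only illustrate the matching on a shared key.

```python
# Feature attribute table: many rows may share the same code value.
parcels = [
    {"PARCEL": "A1", "LANDUSE": "R"},
    {"PARCEL": "A2", "LANDUSE": "C"},
    {"PARCEL": "A3", "LANDUSE": "R"},
]
# Code table: one row per code, so LANDUSE is its primary key.
codes = [
    {"LANDUSE": "R", "DESCRIPTION": "Residential"},
    {"LANDUSE": "C", "DESCRIPTION": "Commercial"},
]

def join_tables(left, right, key):
    """Attach the matching right-hand row to each left-hand row by key."""
    lookup = {row[key]: row for row in right}
    # Rows without a match keep only their own attributes.
    return [{**row, **lookup.get(row[key], {})} for row in left]

joined = join_tables(parcels, codes, "LANDUSE")
for row in joined:
    print(row["PARCEL"], row["DESCRIPTION"])
```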

Example join

+ =


Problems with joins
Field types are different (e.g., one is numeric and one is text)
– Text values left-align while numeric values right-align

Solution
Create a new field of the same type and use Field Calculator

Solution
Both tables now have the same field types

Problems with joins
Data format varies
– Must remove dashes
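Both join problems (mismatched field types and stray dashes) come down to normalizing the key before matching. The helper below is a sketch with a hypothetical name; it assumes keys are integers or text, and the optional width restores leading zeros lost when a code was stored as a number.

```python
def normalize_key(value, width=0):
    """Convert a key to text, strip dashes, and left-pad with zeros."""
    text = str(value).replace("-", "")
    return text.zfill(width)  # width=0 leaves the text unchanged

print(normalize_key(42003) == normalize_key("42003"))  # True
print(normalize_key(3, width=3))                       # 003
print(normalize_key("0096-P-00210000000"))             # 0096P00210000000
```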

SPATIAL JOINS

Spatial joins

Joins using shape (not attribute field)
Enables data aggregation (counting or summing points by polygon)
Common spatial joins
– Points to polygons (counts)
– Polygons to points (adds text)
– Points to points (distances)

Points to polygons
How many businesses are in each neighborhood?
Start with:
– Business points
– Neighborhood polygons

Points to polygons
Right-click neighborhoods > Joins and Relates > Join

Spatial join result
New polygon layer with count of points (number of architects and engineers)

Spatial join result
Show as a choropleth map with labels, or table

Neighborhood Name            Count
Central Business District    53
Southside Flats              14
Shadyside                    9
Bloomfield                   8
Lower Lawrenceville          8
North Shore                  8
Squirrel Hill South          6
Strip District               6
Point Breeze                 4
Squirrel Hill North          4
Garfield                     3
South Oakland                3
Friendship                   2
North Oakland                2
Carrick                      2
Central Lawrenceville        2
East Allegheny               2
Mount Washington             2
East Liberty                 1
Central Northside            1
Westwood                     1
Banksville                   1
Brookline                    1
Perry North                  1
Highland Park                1
Larimer                      1
Allegheny West               1
Middle Hill                  1
Bluff                        1
Southside Slopes             1
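Conceptually, a points-to-polygons join tests each point against each polygon and tallies the hits. The sketch below uses a standard ray-casting test on made-up square neighborhoods; it is not how ArcGIS implements the join, and it ignores holes and points lying exactly on a boundary.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) toward +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_points_by_polygon(points, polygons):
    """Return {polygon name: number of points falling inside it}."""
    counts = {name: 0 for name in polygons}
    for x, y in points:
        for name, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[name] += 1
                break  # each point is counted in at most one polygon
    return counts

# Hypothetical neighborhoods and business locations.
neighborhoods = {
    "West": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "East": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
businesses = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0), (5.0, 5.0)]
counts = count_points_by_polygon(businesses, neighborhoods)
print(counts)  # {'West': 2, 'East': 1}
```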

Polygons to points
What neighborhood is a business in?
Start with:
– Business points
– Neighborhood polygons

Polygons to points

Right-click business points > Joins and Relates > Join


Spatial join result
Point shapefile with neighborhood data on each business

Points to points
How close is the nearest bus stop to a business?
Start with:
– Business points
– Bus stop points

Points to points
Right-click business points > Joins and Relates > Join

Result
Distance field added to new layer of businesses and stops joined
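The distance field produced by a points-to-points join is, in effect, the distance from each business to its closest stop. A brute-force sketch follows, assuming planar (projected) coordinates in a common unit and invented locations; it needs Python 3.8 or later for math.dist.

```python
import math

def nearest_stop(business, stops):
    """Return (distance, stop) for the stop closest to the business point."""
    return min((math.dist(business, stop), stop) for stop in stops)

# Hypothetical coordinates in projected map units.
businesses = [(0.0, 0.0), (10.0, 10.0)]
bus_stops = [(3.0, 4.0), (10.0, 12.0)]

results = [nearest_stop(b, bus_stops) for b in businesses]
for business, (distance, stop) in zip(businesses, results):
    print(business, "->", stop, round(distance, 2))
```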

SPATIAL DATA FORMATS

Esri legacy format: Coverage
Folder with multiple files
Can have points, lines, and/or polygons
Has several intermediate data products (topology) to speed up processing (now calculated on the fly)

Esri legacy format: Shapefile
Multiple files, all with the same name but different file extensions
No intermediate data products, but has indices to speed data processing
Widely used to share spatial data files

Shapefiles
ArcView native format
Minimum files
– .shp: stores feature geometry
– .shx: stores index of features
– .dbf: stores attribute data
Additional files
– .prj: projection data
– .xml: metadata
– .sbn and .sbx: store additional indices
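Because a shapefile is really a set of sibling files, it is worth confirming that the three minimum files are present before sharing or loading one. This is a sketch with a hypothetical function name and file path; it assumes lowercase extensions, which matters on case-sensitive file systems.

```python
from pathlib import Path

# The three files listed above as the minimum for a usable shapefile.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")

def missing_shapefile_parts(shp_path):
    """Return the required extensions that have no file next to shp_path."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]

# "tracts.shp" is a placeholder path; an empty list means nothing is missing.
print(missing_shapefile_parts("tracts.shp"))
```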

CAD drawings
CAD software
– Autodesk AutoCAD (.dwg, .dxf)
– Bentley MicroStation (.dgn)
Often used by engineering companies
Better digitizing precision

CAD drawings


GEODATABASES

Geodatabases

A geodatabase is a container used to hold a collection of datasets (GIS features, tables, raster images, and other objects)

Country layer

Graticule layer

World.gdb


Enterprise geodatabases
Practically unlimited size and multiple simultaneous users
Use enterprise data management systems
Store spatial datasets in a number of DBMSs: IBM DB2, Microsoft SQL Server, Oracle, or Postgres

Personal geodatabase

Parallels enterprise geodatabase but on PC

Stores datasets in a Microsoft Access .mdb file

– Limited to 2 GB
– Much overhead in space and extra structure
– Tempting to apply one’s own Access skills, but needs ArcGIS Catalog utility for manipulation

File geodatabase
An Esri replacement for shapefiles
– Vector and raster map layers
– Other objects (tables)
Stores one or more datasets in a folder of files with .gdb extension
Can be up to 1 TB in size
Can be used across platforms
Can be compressed and encrypted for read-only, secure use

View geodatabases
Cannot identify names in Windows Explorer
Must use ArcCatalog

Non-Esri vector formats
Interoperability

Ability of different vendors’ hardware and software to share data

Driven by the Internet with standards evolving for open data access (International Organization for Standardization, Open Geospatial Consortium, US Federal Geographic Data Committee)

Over 110 vector file formats available in ArcGIS Data Interoperability extension (http://www.esri.com/library/fliers/pdfs/data-interop-formats.pdf)


KML (Keyhole Markup Language)

XML schema for Internet-based maps
– Originally created by Keyhole, Inc. for satellite images; Keyhole was purchased by Google and its viewer became Google Earth
– Provides a set of features (points, lines, polygons, images, text, etc.) with lat/long coordinates plus altitude for 3D viewing
– KMZ is zipped KML and associated files, needed for upload to Google Maps
Portability
– Can import and export KML/KMZ via ArcToolbox in ArcGIS
– Can upload to Google Maps from your computer
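Since KML is plain XML, a small file can be written with any XML library. The sketch below builds a one-point document with Python's standard library; the function name is hypothetical and the coordinates are only approximate, illustrative values for the UT Austin campus. Note that KML lists longitude before latitude.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # KML 2.2 namespace

def point_placemark_kml(name, lon, lat, alt=0.0):
    """Build a minimal KML document containing one point Placemark."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML orders coordinates as longitude,latitude,altitude.
    coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
    coords.text = f"{lon},{lat},{alt}"
    return ET.tostring(kml, encoding="unicode")

doc = point_placemark_kml("UT Austin", -97.7341, 30.2849)
print(doc)
```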

X,y data
Point data table with x and y attributes
Increasingly popular to include x and y with data
Commonly used for GPS data

CALCULATING GEOMETRY

Point centroids
When displaying or analyzing small polygons it is often better to use point centroids
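For reference, the centroid of a simple polygon can be computed directly from its vertices with the standard area-weighted formula. This is a sketch on an invented rectangle, assuming planar coordinates and a ring with no holes; ArcGIS may place centroids differently in some cases.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    # Centroid = sum / (6 * area) = sum / (3 * area2)
    return cx / (3.0 * area2), cy / (3.0 * area2)

parcel = [(0, 0), (4, 0), (4, 2), (0, 2)]  # hypothetical 4 x 2 rectangle
print(polygon_centroid(parcel))  # (2.0, 1.0)
```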

Calculate x,y fields

Add new x and y fields in the attribute table


Calculate x,y fields
Calculate geometry for x field, repeat for y

X,y field results
Results are x and y values based on map properties (e.g., Long/Lat or x,y feet)

Export table with x,y values


Add x,y data table


Export features
X,y events should be exported as permanent shapefile or feature class

Count point centroids
Population can be spatially joined to buffer around polluting companies
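With centroids standing in for polygons, the buffer join reduces to a distance test. The sketch below sums the population of centroids that fall within a circular buffer; the site location, centroid coordinates, populations, and radius are all invented, and planar coordinates are assumed.

```python
import math

def population_within_buffer(site, centroids, radius):
    """Sum the population of centroids no farther than radius from site."""
    return sum(pop for (x, y, pop) in centroids
               if math.dist(site, (x, y)) <= radius)

# Hypothetical block centroids as (x, y, population).
block_centroids = [(1.0, 0.0, 100), (0.0, 2.0, 50), (5.0, 5.0, 999)]
print(population_within_buffer((0.0, 0.0), block_centroids, radius=2.0))  # 150
```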

Other geometry calculations
Area
Perimeter
Length
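These three measures follow directly from vertex coordinates. A minimal sketch is given below, assuming planar (projected) coordinates so the results come out in map units; the function names and the sample shapes are hypothetical, and math.dist requires Python 3.8 or later.

```python
import math

def polygon_area(ring):
    """Shoelace formula: area of a simple polygon from its vertices."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(ring):
    """Sum of edge lengths, including the closing edge."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))

def line_length(path):
    """Sum of segment lengths along an open line."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_area(rectangle))        # 8.0
print(polygon_perimeter(rectangle))   # 12.0
print(line_length([(0, 0), (3, 4)]))  # 5.0
```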

Summary

Tables
Geocodes
Data table joins
Spatial joins
Spatial data formats
Geodatabases
Calculating geometry
