Data.gov Review of New and Existing Applications Brand K. Niemann, Rich W. LaValley, Dr. W. Chris Hardy Presentation to the Data Architecture Subcommittee (DAS) September 10, 2009 Advanced Concepts and Integrated Systems (ACIS) SAIC

Page 1

Data.gov Review of New and Existing Applications

Brand K. Niemann, Rich W. LaValley, Dr. W. Chris Hardy

Presentation to the Data Architecture Subcommittee (DAS)

September 10, 2009

Advanced Concepts and Integrated Systems (ACIS)

SAIC

Page 2

© 2008 Science Applications International Corporation. All rights reserved. SAIC and the SAIC logo are registered trademarks of Science Applications International Corporation in the U.S. and/or other countries.

Overview

• Review Data Sets

• Review and Demonstrate New and Existing Applications

• Feedback and Comments

Summary

Tools Data

COTS & GOTS, Desktop & Web

Review a variety of applications from a variety of sources

Page 3

Review Datasets and Application Sources

Data Sources

• http://www.data.gov/details/92 (June 2009)

Application Sources

• http://data-gov.tw.rpi.edu/wiki/Main_Page

• http://wiki.sunlightlabs.com/Main_Page

• http://www.gov2expo.com/gov2expo2009

By the Numbers

• Data Sources (www.data.gov): 788

• Data-gov.tw.rpi.edu (http://data-gov.tw.rpi.edu/wiki/Demos): Applications: 11; Converted to RDF: 16

• Apps for America 2 (http://sunlightlabs.com/): 46

• Gov2.0 Expo (http://www.gov2expo.com/gov2expo2009/public/schedule/presentations): 35 (5 Categories)

• Other: https://analyzethe.us/, Palantir Government

Page 4

See also Data.Gov dashboard

http://spreadsheets.google.com/pub?key=tchvwRko8_bEQ9c36b33fOA&gid=10

The Giant Warehouse of Data

Page 5

Page 6

File Type Contributed

Page 7

Influence the kind of applications that are developed.

Page 8

Review the Challenge (Open Government)

Give us the tools and we can do it ourselves. Lend your hand and your coding skills (Tim O’Reilly)

http://blip.tv/file/2552824

1. Be an Organizer

2. Volunteer skills, developers – parse a state – 50 states

3. Provide Specific Results, Work together

4. Visualize Data

(Clay Johnson, Sunlight Labs)

http://blip.tv/file/2075676

5. Visually explore and interact with data to facilitate sense making (DAS, 9/10/2009)

Page 9

Age of Visualization and Analysis

Emerging Trends in Data Visualization, July 30, 2009, DM Radio

http://www.information-management.com/dmradio/-10015788-1.html

Heat Maps, Tag Clouds, Concepts Layers Widgets, Dashboards, Sliders, Filters

Page 10

View of data over time is a story

Heat Maps, Tag Clouds, Concepts Layers Widgets, Dashboards, Sliders, Filters

Page 11

View of data over time is a story

http://www.smartmoney.com/investing/bonds/the-living-yield-curve-7923/

The Yield Curve

Page 12

View of data over time is a story (temporal and geospatial characteristics)

http://www.palantirtech.com/government/analysis-blog/uncovering-a-bot-net-exploring-router-data-using-palantir

Page 13

Efficient Access of Data Sources

• Data Imaging
    • Direct, ad hoc extraction of selected data elements from a native file
    • Representation of the content of the data extracted as an integer matrix
        • Dates become integers in YYYYMMDD format
        • Time becomes number of seconds after midnight
        • Character names/descriptions assigned index values in table
        • Numerical values expressed as integers with understood base

• Benefits
    • Minimal overhead in configuration for data handling
    • Significant compression of working files without loss of content
    • Substantial acceleration of data retrieval and analysis capabilities achieved by:
        • Reduction of tests to integer (=1 word) compares
        • Exploiting matrix-based processing efficiencies
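The integer encoding rules on this slide can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the presenters' implementation: the names `encode_date`, `encode_time`, `IndexTable`, and `encode_scaled`, and the scale factor of 100 used for the "understood base", are all hypothetical.

```python
from datetime import date, time


def encode_date(d):
    """Date as a single integer in YYYYMMDD form."""
    return d.year * 10000 + d.month * 100 + d.day


def encode_time(t):
    """Time of day as the number of seconds after midnight."""
    return t.hour * 3600 + t.minute * 60 + t.second


class IndexTable:
    """Conversion table: each distinct string gets a small integer index."""

    def __init__(self):
        self._index = {}   # string -> integer code
        self.values = []   # integer code -> string

    def encode(self, text):
        if text not in self._index:
            self._index[text] = len(self.values)
            self.values.append(text)
        return self._index[text]

    def decode(self, code):
        return self.values[code]


def encode_scaled(value, scale=100):
    """Numeric value as an integer; scale=100 (hundredths) is an assumed base."""
    return round(value * scale)


# One illustrative record: date, time, a character field, a numeric field.
names = IndexTable()
row = [
    encode_date(date(2007, 1, 7)),
    encode_time(time(7, 0, 0)),
    names.encode("00-08-74-36-37-21"),
    encode_scaled(15.45),
]
```

Every field of `row` is now a plain integer, so a filter such as "all records on a given date" reduces to a single integer comparison, and the original strings remain recoverable through `names.decode`.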

Page 14

Efficient Access of Data Sources

Date/Time of X-mission

Router Serial Number

Fault Code

Example: 479,921 records / 69,588,548 bytes

1/7/2007 7:00:00 6 1 1/7/2007 6:29:00 00-08-74-36-37-21 29.224.42.199 80 1545 9S26

1/1/2007 4:00:00 3 2 1/1/2007 3:36:00 00-08-74-52-83-98 129.224.42.199 443 1970 4Z55

1/2/2007 14:01:00 14 3 1/2/2007 14:01:00 00-08-74-52-73-79 129.251.240.179 443 1073 2J89

1/7/2007 0:01:00 0 1 1/7/2007 0:01:00 00-08-74-52-83-98 129.251.240.179 80 7095 4Q66

1/7/2007 22:00:00 21 1 1/7/2007 21:01:00 00-08-74-06-36-24 129.3.1.91 22 2014 7X44

1/5/2007 8:01:00 8 6 1/5/2007 8:01:00 00-08-74-08-66-92 129.3.1.91 443 5821 5G49

1/7/2007 13:00:00 12 1 1/7/2007 12:31:00 00-08-74-52-73-79 129.40.42.144 22 1605 4Z55

1/7/2007 18:00:00 17 1 1/7/2007 17:36:00 00-08-74-52-73-79 129.40.42.144 443 922 4Z55

1/5/2007 12:01:00 12 6 1/5/2007 12:01:00 00-08-74-52-83-98 129.66.124.144 80 2825 3D39

1/5/2007 6:01:00 6 6 1/5/2007 6:01:00 00-08-74-06-36-24 129.9.137.79 21 3653 1G29

. . .
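A per-router tally over rows like those above might be sketched as follows. The column positions are assumptions, since the slide labels only some fields: the seventh whitespace-separated field is taken to be the router serial number and the last field the fault code. The names `build_image` and `count_by_router` are hypothetical.

```python
from collections import Counter

# First four sample rows, copied from the slide.
SAMPLE_ROWS = """\
1/7/2007 7:00:00 6 1 1/7/2007 6:29:00 00-08-74-36-37-21 29.224.42.199 80 1545 9S26
1/1/2007 4:00:00 3 2 1/1/2007 3:36:00 00-08-74-52-83-98 129.224.42.199 443 1970 4Z55
1/2/2007 14:01:00 14 3 1/2/2007 14:01:00 00-08-74-52-73-79 129.251.240.179 443 1073 2J89
1/7/2007 0:01:00 0 1 1/7/2007 0:01:00 00-08-74-52-83-98 129.251.240.179 80 7095 4Q66
"""

SERIAL_FIELD = 6   # assumed position of the router serial number
FAULT_FIELD = 10   # assumed position of the fault code
FIELD_COUNT = 11


def build_image(text):
    """Return (image, serials, faults): integer rows plus conversion tables."""
    serial_ids, fault_ids = {}, {}
    image = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != FIELD_COUNT:
            continue  # skip blank or malformed lines
        # setdefault assigns the next free index to a string seen for the first time.
        serial = serial_ids.setdefault(fields[SERIAL_FIELD], len(serial_ids))
        fault = fault_ids.setdefault(fields[FAULT_FIELD], len(fault_ids))
        image.append((serial, fault))
    return image, list(serial_ids), list(fault_ids)


def count_by_router(image, serials):
    """Count records per router using only the integer codes, then decode."""
    counts = Counter(serial for serial, _fault in image)
    return {serials[code]: n for code, n in counts.items()}


image, serials, faults = build_image(SAMPLE_ROWS)
router_counts = count_by_router(image, serials)
```

The counting loop touches only small integers; the serial-number strings are looked up once at the end, which is the kind of saving the slide attributes to integer compares.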

Data Elements of Interest: 21,596,488 bytes

Image Generation: 44.5 secs

Image Size:

- 7,678,736 bytes plus 12,592 bytes in conversion tables
- 9:1 compression over total data set
- 2.8:1 compression of data sought

Query for Error Counts by Router:

- Direct: more than 1 minute
- Matrix-Based: 9.7 secs
- Image-Based: 1.2 secs
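The compression ratios and the relative query time can be checked from the figures reported on this slide. The short Python calculation below assumes the image size includes the conversion tables; leaving them out gives the same rounded ratios.

```python
# Figures as reported on the slide.
total_bytes = 69_588_548          # full data set
sought_bytes = 21_596_488         # data elements of interest
image_bytes = 7_678_736 + 12_592  # image plus conversion tables (assumed to be summed)

ratio_total = total_bytes / image_bytes
ratio_sought = sought_bytes / image_bytes

matrix_secs = 9.7
image_secs = 1.2
speedup_vs_matrix = matrix_secs / image_secs

print(f"{ratio_total:.1f}:1 compression over total data set")
print(f"{ratio_sought:.1f}:1 compression of data sought")
print(f"image-based query about {speedup_vs_matrix:.0f}x faster than matrix-based")
```

The direct query is reported only as "more than 1 minute", so its ratio to the image-based time is a lower bound of 60 / 1.2 = 50x.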

Page 15

-- Neighborhood to Live

Name Source

Crime in the US 1998-2007

FBI Tableau 5 Application

Data.gov

New Application

State

--Related

Are you Safe?

http://www.areyousafedc.com/

Existing Application

City

--Related

Every Block

http://dc.everyblock.com/crime/by-offense/theft/

Existing Application

City

--Related

Density of firearms / Death Rate

http://www.datamasher.org/mash-ups/test-123#table-tab

Existing Application

State

Page 16

-- Purchasing a Car, Planning a Vacation

Name Source

Fuel Efficient Cars

www.fueleconomy.gov

Heat Map Explorer (COTS)

New Application

Federal

Hurricane data (1990 – 2006)

-- Related www.nhc.noaa.gov/

Tableau 5 (COTS)

New Application

Federal

See other examples

http://www.tableausoftware.com/learning/examples

Page 17

Discussion and Feedback

Page 18

-- Other Backup

Name Source

World Copper Smelters

http://tin.er.usgs.gov/copper/output/copper-fLD.kml

Data.gov

Existing Application

World Copper Smelters.bmp

USGS Oil and Gas Assessment Database

http://energy.cr.usgs.gov/oilgas/wep

Data.gov

Existing Application

World Petroleum Assessment.bmp

Page 19

-- Emerging Technologies Backup