Data Wrangling

Page 1: Data Wrangling

Challenges

• Extracting data from PDF files is not easy.

• The fonts a document uses may not be installed on the computer.

• Software

Page 2: Data Wrangling

Open Data

Page 3: Data Wrangling

Why Open Data?

• Open data, especially open government data, is a terrific resource that is as yet largely untouched.

• There are many areas where we can expect open data to be of value.

• www.wheredoesmymoneygo.com

• Tux tree

Page 4: Data Wrangling

What is a data wrangler?

• A data wrangler is the person who performs the wrangling.

• Data wrangling is, loosely, the process of manually converting or mapping data from one "raw" form into another format. It may include further munging and data visualization.
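As a rough illustration of converting a "raw" form into another format, the sketch below turns a few lines of extracted text into CSV rows. The sample lines, the dot-leader layout, and the names `RAW_LINES`, `ROW_PATTERN`, `wrangle`, and `to_csv` are assumptions made for this example; they do not come from the slides.

```python
import csv
import io
import re

# Hypothetical raw lines, as they might come out of a PDF text extraction.
RAW_LINES = [
    "Department of Health ........ 1,200,000",
    "Education   ........   950,500",
    "Total spending (see note 3)",
    "Transport ..... 310,000",
]

# A label, a run of dot leaders, then a number with thousands separators.
ROW_PATTERN = re.compile(r"^(?P<label>.+?)\s*\.{2,}\s*(?P<amount>[\d,]+)$")


def wrangle(lines):
    """Return (label, amount) tuples for lines that match ROW_PATTERN."""
    rows = []
    for line in lines:
        match = ROW_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that are not data rows
        amount = int(match.group("amount").replace(",", ""))
        rows.append((match.group("label").strip(), amount))
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["department", "amount"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = wrangle(RAW_LINES)
csv_text = to_csv(rows)
print(csv_text)
```

Real extracted text is usually messier than this, so the pattern would need adjusting for each source document.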

Page 5: Data Wrangling

Why Put Data in CSV?
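One possible reason, offered here as an assumption since the slide gives only its title, is that CSV is plain text that common tools can read without special software. The sketch below reads a small CSV with Python's standard library and sums one column. The sample data and the names `CSV_TEXT` and `total_amount` are hypothetical.

```python
import csv
import io

# Hypothetical CSV content; in practice this would be a file on disk.
CSV_TEXT = """department,amount
Health,1200000
Education,950500
Transport,310000
"""


def total_amount(csv_file):
    """Sum the 'amount' column of a CSV file-like object."""
    reader = csv.DictReader(csv_file)
    return sum(int(row["amount"]) for row in reader)


total = total_amount(io.StringIO(CSV_TEXT))
print(total)
```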

Page 6: Data Wrangling

Software

• ABBYY FineReader

• Cometdocs

• Tabula

Page 7: Data Wrangling