Getting Data into ActiveData for Excel
A Methodology for Requesting and Converting Various File Types into an Excel Format

Self Study Course
DEMO VERSION

By: Michelle Shein and Richard B. Lanza

NOTE: AFTER INSTALL BOOK AND PRACTICE FILES CAN BE FOUND AT:
C:\Program Files\ActiveData\Getting Data\

© 2008 InformationActive Inc., Michelle Shein and Richard Lanza


Getting Data Into ActiveData for Excel DEMO VERSION (ONLY CHAPTERS LISTED AS DEMO APPEAR)

Table of Contents

Copyright Page
Purpose of the Publication / Learning Objectives
    Self Study Roadmap
About the Authors
Introduction
DEMO Chapter 1 - How to Request Data for Analysis
    Making Arrangements with the Client to Obtain Data
    Transferring the Client’s Data
    Verifying the Data Received from the Client
DEMO Chapter 2 - File Formats and Data Types
    Data Files Included With This Publication
    Excel Versions
Chapter 3 - Importing Delimited Text Data Into Excel
    Importing Tab Delimited Text
    Saving the Excel File
    Importing Comma Delimited Text
    Importing Text with Other Delimiter and Qualifiers
DEMO Chapter 4 - Importing Fixed Width Data into Excel
    Saving the Excel File
Chapter 5 - Importing a Text File by Opening it in Excel
    Opening a .csv File
    Saving the Excel File
    Opening a .txt File
    Saving the Excel File
Chapter 6 - Getting Access Data into Excel
    Copying Access Data into Excel
    Exporting Access Data into Excel
Chapter 7 - Connecting to External Data from Excel
    Connecting to a Text File and Refreshing Data
    Connecting to Access Data from Excel
    Refreshing the Imported Data
    Refreshing Data Using a Toolbar
    The Data Connection Wizard
Chapter 8 - Converting Data from a .pdf Format to Excel
Chapter 9 - PDF Conversions and Data Masking Using Import Wizard
    Converting PDF Files
    About Data Masking
Chapter 10 - Formatting Imported Excel Data
    Changing Column Width
    Repositioning Imported Data for Alignment Consistency
    Changing Date Formats
    Converting Numbers Stored as Text to Numbers
    Converting Dates Stored as Text to Dates
Chapter 11 - ActiveData’s Formatting and Data Conversion Features
    Combining Columns
    Splitting a Column
Conclusion

THIS IS A DEMO COPY OF THE BOOK. FOR A COMPLETE COPY OF ALL THE TEXT, PLEASE PURCHASE THE ENTIRE GETTING DATA BOOK AT www.informationactive.com


Copyright Page

© InformationActive Inc, Michelle Shein and Richard B. Lanza

No part of this publication may be reproduced in any form without permission in writing from InformationActive Inc, Michelle Shein and Richard B. Lanza.

Limitation of Liability / Disclaimer of Warranty

The authors have used their best efforts in preparing this publication and are not responsible for any errors or omissions. They make no representations or warranties with respect to the accuracy or completeness of the contents of this document and specifically disclaim any implied warranties of merchantability or fitness for any particular purpose, and shall in no event be liable for any loss of profit or any other financial or commercial damage, including, but not limited to, special, incidental, consequential, or other damages.

ActiveData for Excel is the trademark of InformationActive Inc.; ACL, Audit Command Language, and Access Command Language are trademarks of ACL Services Ltd.; IDEA is the trademark of Caseware IDEA Ltd.; Excel and Access are the trademarks of Microsoft. All other trademarks are the property of their respective owners.


Purpose of the Publication / Learning Objectives

The purpose of this course is to assist auditors, fraud examiners, and management in importing data files into Excel so that they can be analyzed by ActiveData for Microsoft Excel. This course is not expected to explain ActiveData or Microsoft Excel concepts at length but rather to provide guidance as to which of the product’s features can be used to import data files for analysis. For more information on the use of audit software, and countless ways of applying audit software to your business, please see www.auditsoftware.net. If you would like to provide feedback on the document, we welcome and encourage it as we plan to complete later versions. Please provide your feedback via Email at [email protected] or [email protected].

Self Study Roadmap

This self study guidebook has been organized to build your knowledge of how to get data into Excel for the purpose of analyzing the data. It is suggested that the guide be completed in the order established in the table of contents. The steps explained throughout this guide use the sample data files provided to help you become proficient in the various methods of getting data into Excel.

After you have completed this guide you may wish to purchase the self study guide: Financial Statement Auditing, Fraud Detection and Cash Recovery Using ActiveData for Excel. The purpose of that course is to assist auditors, fraud examiners, and management in implementing data analysis routines using ActiveData for Microsoft Excel. It is hoped that through the dissemination of this information more analysis will be done using audit software to prevent and proactively detect organizational inefficiency, ineffectiveness, and fraud.

These guides can be purchased through the ActiveData site: www.InformationActive.com

For any other information please contact InformationActive Inc.: +1 613-569-4675 x184


About the Authors

Michelle Shein is a highly skilled instructor with over twenty-five years of technical training experience. With her proficiency in both teaching and the use of desktop PC products, she has taught auditors discovery skills to uncover fraud using Microsoft Access and Excel.

Ms. Shein is the President of PR1OR1TY Computer Training & Services, Inc. Since 1990 the training corporation has been providing training services and PC consulting to corporate clients helping to build the PC skills of many corporate teams. Ms. Shein has taught for numerous clients including: Morgan Stanley, Merrill Lynch, AICPA, Chubb, Kraft, Nabisco, Comcast, Toys R Us, AIG, AT&T, Bank of New York, Johnson & Johnson, Ciba Gigy, Sandoz, Novartis, Pfizer, King Pharmaceuticals, UCB Pharmaceuticals, Barr Labs, PF Labs, Dress Barn, Bell Core, Telcordia and Avon.

As a professional PC trainer for numerous years, Ms. Shein has taught classes in many of the popular PC desktop products. She has specialized in teaching Microsoft Project, Microsoft Excel and Microsoft Access users, as well as in developing Access applications for her clients’ data storage and analysis needs.

Ms. Shein earned Bachelor’s and Master’s degrees in education from the State University of New York in Fredonia, New York. She has used her educational and psychology background in developing rewarding training sessions for both the advanced learner and PC user as well as for the reluctant learner and novice PC user.

Other products Ms. Shein and Mr. Lanza have co-authored are the ACFE Access Training – Auditing Payables for Fraud CD series, as well as the guides Fraud Detection and Cash Recovery using ActiveData for Excel and Fraud Detection and Cash Recovery using ActiveData for Office.

Michelle Shein can be reached through the following means:
Phone: +1-352-751-4139
E-mail: [email protected]

Rich Lanza (CPA, CFE, PMP), president of Audit Software Professionals, has a decade and a half of experience in the audit and assurance technology field and has become one of its leading authorities. Rich helps companies save millions (in their respective currency) each year using his technology tools.

Based on his years in the trenches at several Fortune 500 firms, CPA firms, and medium-sized businesses, Rich brings personalized coaching expertise in automating report systems enabling organizations to generate cash recoveries, stop profit leaks, resolve control issues, and implement process improvements.

He is a frequent and popular speaker at industry events, such as the conferences and seminars of the Institute of Internal Auditors, and a prolific writer. He is a columnist for many professional journals, the author of 13 publications, and co-author of The Buyer’s Guide to Audit Software, the first comprehensive look at this exploding market. Rich is an ACL software expert as well as an ActiveData for Excel expert, and is highly proficient in using Microsoft Excel and Access in engagements.

From a volunteer perspective, Rich is a committee member of the NYSSCPA Technology Assurance Committee and the IIA's Board of Research and Education Advisors, and also serves as President of the North Jersey IIA Chapter.

A graduate of PACE University, he is a member of the American Institute of Certified Public Accountants (AICPA), the Association of Certified Fraud Examiners (ACFE), and the Project Management Institute (PMI®). Rich was recently awarded the prestigious Outstanding Achievement in Commerce Award from the Association of Certified Fraud Examiners.


Introduction

A common issue users highlight when using data analysis software such as ActiveData for Excel and ActiveData for Office is how to “get the data in”. ActiveData and other similar types of software are extremely powerful tools; however, they must have data formatted in a specific way to perform their functions. Data must be in rows and columns (usually starting in the first “cell” of the grid). This is commonly referred to as “tabular” data.

One of the great successes of the ActiveData product line is its ease of use. The fact that the software utilizes standard Microsoft software and protocols to accomplish many of its tasks makes it very accessible to users, and this has been one of the key reasons there has been such a rapid uptake of the software. In fact, many users have a relatively easy time obtaining data in a format that ActiveData can use, because many programs have utilities for exporting tabular data into Excel or other data types that are easily read by Excel (which can be used by both ActiveData for Excel and ActiveData for Office).

The goal of this book is to provide the reader with a number of tools and techniques for getting tabular data into Excel in cases where the data import is not clean. It covers items from basic formatting tips through the use of ActiveData to clean the data. There is also discussion of cost-effective third-party products that can be used for more complex tasks such as importing PDFs and more intricate “data masking”.

This book is not a definitive guide to all situations for getting data into tabular format; however, it is hoped that by using this guide users will have a very strong foundation for addressing a number of common and more complex situations that may arise with data importing and formatting.


1 - How to Request Data for Analysis

An important first step in data analysis is acquiring the data. If that data is being acquired from another party (either within your company or from an external source), it is important to develop a process for getting the data in a usable format.

NOTE: For using data with ActiveData you need to have your data in rows and columns, preferably with the column headers in the first row for identification. This is generically known as “tabular data”.

It is suggested that, prior to requesting data, all expected reports be identified so that only one request is made of the client. Getting data can be broken into the following logical process steps:

Making Arrangements with the Client to Obtain Data

Transferring the Client’s Data

Verifying the Data Received from the Client

Making Arrangements with the Client to Obtain Data

You should meet with the appropriate client personnel (generally the primary contact for the audit and a key contact in information systems) to make arrangements to obtain the data. Matters to be discussed include:

Specific data needed

Types of files needed. Common file types include: Comma delimited format, Tab delimited format, Microsoft Access format, and of course, the Microsoft Excel format.

Record layout of the file (The user should arrange to get copies of the record layout which is a simple definition of each data field and where the fields are positioned in the data file).

Timing of the transfer.

Method of transfer.

Arrangements for verification information.


The results of this discussion should be formalized into a request letter. A sample letter that might be used by an external auditor is shown below. This letter can be revised depending on the individual circumstances.

Mr. X
IS Manager
ABC Company

Dear Mr. X:

As part of our investigation, we will be performing certain tests in the X audit area using data extraction software. As we discussed today, we require the X file be available for us on X/X/XXXX. We believe the following fields are required from the file for the period X/X/XXXX to XX/XX/XXXX:

List Fields Here

If you believe, after looking at the reports we expect to process (Appendix A), that we will need more data fields besides those listed above, please provide these fields in the file extraction. Also, if it would be easier, we can receive the entire files from which we can extract and define our desired fields.

We will need this file in an appropriate format for importing into Microsoft Excel. Therefore, any of the following file formats will be acceptable: tab delimited ASCII, comma delimited ASCII, fixed-length ASCII, Microsoft Access and Microsoft Excel. To assist in downloading the file to our PC, we prefer that the file be provided on a CD-ROM or e-mailed to us as a file.

We would like to receive the first 100 records of the data file printed out, as well as a record count for the file. We will be using this information to confirm the proper transfer of the data to our system.

Please contact us if you are unclear as to the source or significance of any of the items requested. Thank you for your assistance.

Sincerely,
Mr. Y

Page 2 of Request Letter

Appendix A - Expected Reports To Produce

Report Name          Expected Completion Date
List reports here    List desired report completion date


Transferring the Client’s Data

There are many ways to transfer data to your computer for analysis, depending on the client’s system architecture. Examples of possible data transfer methods include:

DVD or CD

E-mail file attachments

USB Memory Stick

Data Tape

FTP or network transfers

Web storage (e.g., www.xdrive.com)

The first three methods are more likely to be used for small PC systems. The last three methods are more likely to be used on larger systems (LANs, minicomputers, or mainframes). However, since we will be using Microsoft Excel in our processing, the files will likely be moderately sized (up to about 1 million rows for Excel 2007), and therefore one of the first three methods will likely suffice.
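Before settling on a transfer method, it can help to confirm that the file will fit on a single worksheet. The following is a minimal Python sketch, not part of ActiveData or Excel. The row limits are Excel's published worksheet limits (65,536 rows for Excel 2003, 1,048,576 rows for Excel 2007); the function names and the sample data are hypothetical and used only for illustration.

```python
import io

# Worksheet row limits of Excel itself (not an ActiveData setting).
EXCEL_ROW_LIMITS = {
    "Excel 2003": 65_536,
    "Excel 2007": 1_048_576,
}

def count_rows(lines):
    """Count the rows in an iterable of text lines (header row included)."""
    return sum(1 for _ in lines)

def fits_in_excel(row_count, version="Excel 2007"):
    """Return True if row_count rows fit on one worksheet of the given version."""
    return row_count <= EXCEL_ROW_LIMITS[version]

# Hypothetical stand-in for a client file: one header row plus 70,000 records.
sample = io.StringIO("InvoiceNo\n" + "\n".join(str(n) for n in range(70_000)) + "\n")
rows = count_rows(sample)
print(rows, fits_in_excel(rows, "Excel 2003"), fits_in_excel(rows, "Excel 2007"))
```

With a real file, the same count could be taken by passing an open file object to count_rows in place of the in-memory sample.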

Verifying the Data Received from the Client

It is generally good practice to verify client data before processing it. There are two reasons for this. First, the user can confirm that the data file received from the client is complete and accurate. Second, the user can ensure that the data has been read correctly by Microsoft Excel. Verification of client data is generally accomplished through one or more of the following procedures:

Obtain a printout of the first 100 rows and match “on screen” to the data file.

Compute totals for key data fields (e.g., invoice amount) and agree them to control totals supplied by the client.

Calculate totals or statistics of the file (such as file size) to determine if the relative size of the activity appears reasonable.

Check the sequence (such as, check numbers, inventory part numbers, or invoice numbers) for gaps and/or duplicates.

Select a sample of data items and trace the information to client records.

Any exceptions, unreconciled amounts, or other indications of problems should be resolved before applying the automated procedures.
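Several of the verification procedures above (record count, control total, and the check for gaps and duplicates in a sequence) can also be run outside Excel as a cross-check. The following is a minimal Python sketch under stated assumptions: the file is comma delimited with a header row, and the field names InvoiceNo and Amount and the sample records are hypothetical, not taken from the practice files.

```python
import csv
import io
from collections import Counter
from decimal import Decimal

def verify_extract(text, amount_field, sequence_field):
    """Summarize a delimited extract for comparison with the client's control figures."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # Control total for the key amount field, kept exact with Decimal.
    total = sum((Decimal(r[amount_field]) for r in rows), Decimal("0"))
    # Sequence check: numbers that repeat, and numbers missing from the range.
    numbers = [int(r[sequence_field]) for r in rows]
    counts = Counter(numbers)
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    gaps = []
    if numbers:
        gaps = [n for n in range(min(numbers), max(numbers) + 1) if n not in counts]
    return {
        "record_count": len(rows),
        "total": total,
        "duplicates": duplicates,
        "gaps": gaps,
    }

# Hypothetical sample data: invoice 1002 appears twice and 1003 is missing.
SAMPLE = """InvoiceNo,Amount
1001,250.00
1002,75.50
1002,75.50
1004,10.25
"""
result = verify_extract(SAMPLE, "Amount", "InvoiceNo")
print(result)
```

The record count and total would then be agreed to the figures supplied by the client, and any duplicates or gaps followed up before further analysis.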


2 - File Formats and Data Types

A file format is a particular way to encode information for storage in a computer file. Applications save stored data in files in formats that vary from application to application. One way of identifying a file’s format is to recognize the saved file’s extension. The three or four characters after the period in a file name are known as the filename extension. The extension alone cannot determine the exact format of the file; for example, a .txt file may be tab delimited, comma delimited, or fixed width.

We will be working with several text files that contain data with various delimiters. A delimiter, such as a comma, tab or space character, separates the data found in the file.

Within a file you will also find various data types. The three general types that we work with in Excel are: Text, Numbers and Dates. In addition there are the following data types: General, Currency, Time, Percents, Fractions and Scientific. When bringing data into Excel, not only do you need to be aware of the file format for the purpose of determining how to bring in the data, but you may also need to convert the data’s type.

The chapters that follow in this guide will be working with the following data formats:

TXT – Text File

XLS – Excel Spreadsheet File

MDB – Access Database File

PDF – Adobe Acrobat Portable Document Format or Netware Printer Definition File

CSV – Comma separated values file
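Because the extension does not reveal which delimiter a text file uses, it can be useful to inspect the first few lines before importing. The following is a minimal Python sketch of one simple heuristic, offered as an illustration rather than as how Excel or ActiveData detect delimiters: it picks the candidate character that appears the same non-zero number of times on every sampled line. The function name and sample lines are hypothetical, and the heuristic ignores text qualifiers (such as quoted fields that contain commas).

```python
def guess_delimiter(sample_lines, candidates=(",", "\t", " ")):
    """Return the candidate that occurs equally often (and at least once) on every line, else None."""
    lines = [line.rstrip("\r\n") for line in sample_lines if line.strip()]
    best = None
    best_count = 0
    for delim in candidates:
        # A real delimiter should split every line into the same number of fields.
        counts = {line.count(delim) for line in lines}
        if len(counts) == 1:
            count = counts.pop()
            if count > best_count:
                best, best_count = delim, count
    return best

# Hypothetical sample lines for illustration.
comma_lines = ["InvoiceNo,Vendor,Amount\n", "1001,Acme,250.00\n"]
tab_lines = ["InvoiceNo\tVendor\tAmount\n", "1001\tAcme Supply\t250.00\n"]
print(repr(guess_delimiter(comma_lines)), repr(guess_delimiter(tab_lines)))
```

If the function returns None, the file may be fixed width or may use a delimiter that is not in the candidate list, so it should be opened in a text editor and inspected by eye.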

Data Files Included With This Publication

The following data files are included with this course and are used with the various exercises. Note that each chapter lists the files that are needed for the chapter’s exercises.

Data – Tab Delimited.txt

Data – Comma Delimited.txt

Data – Space Delimited.txt

Data – Fixed Width.txt

Data – Unicode Text.txt

Data.csv

Data.mdb

Data – PDF Report.pdf

Data – Sample.txt

Formatting Practice.xls

Excel Versions

The examples shown within the main text of the chapters relate to Excel 2003 (which is similar to Excel XP (2002) and Excel 2000). Excel 2007 tends to have a different look and feel for various functions. When Excel 2007 diverges from the steps that are shown in the main text, this will be highlighted in the sidebars on the right side of the page.


DEMO CHAPTER

4 - Importing Fixed Width Data into Excel

Data arranged in columns is referred to as fixed width: each field occupies the same character positions on every line, typically padded with spaces. Fixed width data can be imported into Excel using Excel’s Text Import Wizard.

Step One: Have the Practice.xls file open and select a blank sheet. If there isn’t a blank sheet available in your workbook, add another sheet.

To add an additional worksheet to a workbook: Right mouse click on a sheet tab, select Insert from the pop up menu, click on the Worksheet button in the Insert dialog box and then select OK.

Step Two: Start the Text Import Wizard by selecting: Data from the Excel menu, Import External Data, and then Import Data.

Step Three: In the Select Data Source dialog box, select the file: Data – Fixed Width.txt. You may need to select All Files (*.*) from the Files of type pull down in order to view the file we wish to import. Click Open to open the file in Excel.

Sample Data Files:
Data – Fixed Width.txt
Practice.xls

Excel 2007

Excel 2007 also uses the Excel Text Import Wizard to import a fixed width file. The instructions are identical once the Wizard is displayed.

Step One: Open the file Practice.xls and select a blank worksheet. If a blank worksheet isn’t available, follow the directions above under “To add an additional worksheet to a workbook”.

Step Two: From the Excel 2007 ribbon, select Data and then, from the Get External Data group, select From Text.

Step Three: The Import Text File dialog box will display. Excel will look only for text files at this point. Find the file: Data – Fixed Width.txt and select it.

Step Four: Click Import to open the first of the Text Import Wizard dialog boxes. At this point the instructions are the same for both versions of Excel. You can continue with the Step Four directions below.


Step Four: In the first Text Import Wizard dialog box, make sure the data type Fixed width is selected. This is the selection for text that is already in columns. We will start importing at row 1 and keep the file origin as the default 437 : OEM United States. Notice the data displayed in the Preview of File box. Verify that this is the data you want to import, then click Next.
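The file origin setting tells Excel which character encoding to use when reading the bytes of the text file, and the start row tells it how many lines to skip. The following is a minimal Python sketch of the same two ideas, using Python's built-in cp437 codec (code page 437, OEM United States). It is an illustration only, not what Excel runs internally; the function name and the sample bytes are hypothetical.

```python
def decode_lines(raw_bytes, encoding="cp437", start_row=1):
    """Decode raw file bytes and return the lines from start_row (1-based) onward."""
    text = raw_bytes.decode(encoding)
    return text.splitlines()[start_row - 1:]

# Hypothetical two-line file; byte 0x82 is an accented "e" in code page 437.
raw = b"Name      City\r\nJos\x82      Montr\x82al\r\n"

lines = decode_lines(raw)                   # import everything, starting at row 1
data_only = decode_lines(raw, start_row=2)  # skip the header row
wrong_origin = decode_lines(raw, encoding="cp1252")

print(lines[1])
print(wrong_origin[1])  # same bytes read with a different file origin
```

Choosing the wrong file origin does not usually stop the import, but accented and special characters can come through as the wrong symbols, which is why the preview is worth checking at this step.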


Step Five: In the second Text Import Wizard dialog box, notice the columns of data displayed in the Data preview box. The lines with arrows signify a column break. If the column break lines do not line up the way you would like them to, you can adjust them.

To create a break line, click at the desired position on the preview box ruler.

To delete a break line, double click on the line.

To move a break line, click and drag it to a desired position.

Test this out and then click Next to go on to the last import dialog box.
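Each break line in the wizard corresponds to a character position at which every line of the file is cut. The following is a minimal Python sketch of that idea, assuming a hypothetical three-column layout (an invoice number, a vendor name, and an amount) with breaks at positions 6 and 22; the layout and values are illustrative and are not taken from Data – Fixed Width.txt.

```python
def split_fixed_width(line, breaks):
    """Split one line at the given column-break positions and trim the padding."""
    edges = [0, *sorted(breaks), len(line)]
    return [line[start:end].strip() for start, end in zip(edges, edges[1:])]

# Build hypothetical fixed width lines: 6 + 16 + 6 characters per line.
sample = [
    f"{'1001':<6}{'Acme Supply':<16}{'250.00':>6}",
    f"{'1002':<6}{'Baker & Co':<16}{'75.50':>6}",
]

breaks = [6, 22]  # equivalent to two break lines on the preview ruler
for line in sample:
    print(split_fixed_width(line, breaks))
```

Moving a break line in the wizard is the same as changing one of the numbers in breaks: a break placed too far left or right cuts a field in two, which is what the Data preview box lets you spot before importing.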

Step Six: In the third Text Import Wizard dialog box, check whether there are any columns for which you wish to change the displayed data format. To do this, select a column by clicking on its column header and then select a data type.

To convert a column of all currency number characters to Excel’s currency format, select General.

To convert a column of all number characters to Excel’s text format, select Text.

To convert a column of all date characters to Excel’s date format, select Date and then select the date type in the Date box.

If a column contains a mix of formats, such as alphabetical and numeric characters, Excel converts the column to General. If Excel doesn’t convert a column to the format that you want, you can convert the data after you import it. After you have made your format changes, click the Finish button to go on to the next step.
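The choice of column data format matters because the same characters can be read as text, as a number, or as a date. The following is a minimal Python sketch of the three conversions, for illustration only; it does not reproduce Excel's own conversion rules. The function names, the accepted currency notation, and the default month/day/year date order are assumptions made for the example.

```python
from datetime import date, datetime
from decimal import Decimal

def to_number(text):
    """Convert currency-style text such as '$1,250.00' or '(75.50)' to a Decimal."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    # Assumption: parentheses mark a negative amount, as in many accounting reports.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    value = Decimal(cleaned)
    return -value if negative else value

def to_date(text, fmt="%m/%d/%Y"):
    """Convert date text to a date; fmt plays the role of the wizard's date type."""
    return datetime.strptime(text.strip(), fmt).date()

# Keeping a column as text preserves leading zeros that a number would lose.
account_as_text = "00123"
account_as_number = int("00123")

print(to_number("$1,250.00"), to_number("(75.50)"))
print(to_date("03/15/2008"), to_date("15/03/2008", "%d/%m/%Y"))
print(account_as_text, account_as_number)
```

The last line shows why identifiers such as account or part numbers are often best imported as Text: once converted to a number, the leading zeros cannot be recovered.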


Step Seven: In the Import Data dialog box, which appears after you click Finish, let Excel know whether you want to import the new data into the Existing worksheet or into a New worksheet. If you choose the default of the Existing worksheet, then indicate in which cell you want the import process to begin. You can use the “range select” button to select a new location, or keep the current A1 cell location by leaving it and clicking OK. For this exercise we are already on a new blank sheet, so all we have to do is click OK.


Your data is now imported into an Excel workbook. Review the data to make sure it is valid and in the correct data format. Data can be reformatted or aligned if the columns do not appear consistent or correctly formatted. Formatting imported data is discussed in Chapter 10.

Saving the Excel File

After an import is complete, it is always good practice to save the Excel file. We will continue to save this file as Practice.xls with the menu selection File, Save.

