Reading and Writing Text Files in Java, John Lamertina (Deitel Java 5.0 Chp 14, 19, 29), April 2007

Page 1: Reading and Writing Text Files

Reading and Writing Text Files in Java

John Lamertina (Deitel Java 5.0 Chp 14, 19, 29) April 2007

Page 2: Reading and Writing Text Files

Content
- Reading and Writing Data Files (chp 14)
- String Tokenizer to Parse Data (chp 29)
- Comma Separated Value (CSV) Files – an exercise which applies:
  - Multi-dimensional arrays (chp 7)
  - Exception Handling (chp 13)
  - Files (chp 14)
  - ArrayList Collection (chp 19)
  - Tokenizer (chp 29)

Page 3: Reading and Writing Text Files

Reading & Writing Files 3

Data Hierarchy
- Field – a group of characters or bytes that conveys meaning
- Record – a group of related fields
- File – a group of related records
- Record key – identifies a record as belonging to a particular person or entity – used for easy retrieval of specific records
- Sequential file – file in which records are stored in order by the record-key field

Page 4: Reading and Writing Text Files

Java Streams and Files
- Each file is a sequential stream of bytes
- Operating system provides mechanism to determine end of file
  - End-of-file marker
  - Count of total bytes in file
- Java program processing a stream of bytes receives an indication from the operating system when program reaches end of stream

Page 5: Reading and Writing Text Files

File - Object - Stream

Java opens file by creating an object and associating a stream with it

Standard streams – each stream can be redirected
- System.in – standard input stream object, can be redirected with method setIn
- System.out – standard output stream object, can be redirected with method setOut
- System.err – standard error stream object, can be redirected with method setErr
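
A minimal sketch of redirecting a standard stream with setOut, as described above. The class name RedirectDemo and the helper captureStdout are illustrative and do not come from the slides; the lambda in main assumes Java 8 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class RedirectDemo {

    // Runs the action while System.out points at an in-memory buffer,
    // then restores the original stream and returns the captured text.
    static String captureStdout(Runnable action) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer, true));   // redirect standard output
        try {
            action.run();
        } finally {
            System.out.flush();
            System.setOut(original);                    // always restore the real stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String captured = captureStdout(() -> System.out.print("hello"));
        System.out.println("Captured: " + captured);
    }
}
```

setIn and setErr follow the same pattern with an InputStream and a PrintStream respectively.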

Page 6: Reading and Writing Text Files

Classes related to Files
- java.io classes
  - FileInputStream and FileOutputStream – byte-based I/O
  - FileReader and FileWriter – character-based I/O
  - ObjectInputStream and ObjectOutputStream – used for input and output of objects or variables of primitive data types
  - File – useful for obtaining information about files and directories
- Classes Scanner and Formatter
  - Scanner – can be used to easily read data from a file
  - Formatter – can be used to easily write data to a file

Page 7: Reading and Writing Text Files

File Class
Common File methods
- exists – returns true if file exists where it is specified
- isFile – returns true if File is a file, not a directory
- isDirectory – returns true if File is a directory
- getPath – returns file path as a string
- list – retrieves contents of a directory
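
The File methods above can be sketched as follows. The class name FileInfoDemo and the helper describe are illustrative names, not part of the slides.

```java
import java.io.File;

class FileInfoDemo {

    // Builds a one-line description of a path using exists, isFile,
    // isDirectory, getPath and list.
    static String describe(File f) {
        if (!f.exists()) {
            return f.getPath() + " does not exist";
        }
        if (f.isFile()) {
            return f.getPath() + " is a file";
        }
        if (f.isDirectory()) {
            String[] names = f.list();   // null if the directory cannot be read
            int count = (names == null) ? 0 : names.length;
            return f.getPath() + " is a directory with " + count + " entries";
        }
        return f.getPath() + " exists but is neither a file nor a directory";
    }

    public static void main(String[] args) {
        String path = (args.length > 0) ? args[0] : ".";
        System.out.println(describe(new File(path)));
    }
}
```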

Page 8: Reading and Writing Text Files

Write with Formatter Class
- Formatter class can be used to open a text file for writing
  - Pass name of file to constructor
  - If file does not exist, will be created
  - If file already exists, contents are truncated (discarded)
  - Use method format to write formatted text to file
  - Use method close to close the Formatter object (if method not called, OS normally closes file when program exits)

Example: see figure 14.7 (p 686-7)

Page 9: Reading and Writing Text Files

Possible Exceptions
- SecurityException – occurs when opening file using Formatter object, if user does not have permission to write data to file
- FileNotFoundException – occurs when opening file using Formatter object, if file cannot be found and new file cannot be created
- NoSuchElementException – occurs when invalid input is read in by a Scanner object
- FormatterClosedException – occurs when an attempt is made to write to a file using an already closed Formatter object
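
The Formatter steps and the exceptions listed above might be combined as in this sketch. It is not the textbook figure; the class WriteTextFile, the method writeRecords and the file name clients.txt are assumptions made for illustration.

```java
import java.io.FileNotFoundException;
import java.util.Formatter;
import java.util.FormatterClosedException;

class WriteTextFile {

    // Writes one "id name" line per record (ids and names are assumed to have
    // the same length). Returns true if the file was written, false otherwise.
    static boolean writeRecords(String fileName, int[] ids, String[] names) {
        Formatter output = null;
        try {
            output = new Formatter(fileName);   // creates the file or truncates it
            for (int i = 0; i < ids.length; i++) {
                output.format("%d %s%n", ids[i], names[i]);
            }
            return true;
        } catch (SecurityException e) {
            System.err.println("No write permission for " + fileName);
            return false;
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open or create " + fileName);
            return false;
        } catch (FormatterClosedException e) {
            System.err.println("Formatter was already closed");
            return false;
        } finally {
            if (output != null) {
                output.close();                 // flushes and releases the file
            }
        }
    }

    public static void main(String[] args) {
        int[] ids = {100, 200};
        String[] names = {"Jones", "Smith"};
        System.out.println(writeRecords("clients.txt", ids, names) ? "written" : "failed");
    }
}
```

Closing in a finally block avoids relying on the operating system to close the file at program exit.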

Page 10: Reading and Writing Text Files

Read with Scanner Class
- Scanner object can be used to read data sequentially from a text file
  - Pass File object representing file to be read to Scanner constructor
  - FileNotFoundException occurs if file cannot be found
  - Data read from file using same methods as for keyboard input – nextInt, nextDouble, next, etc.
  - IllegalStateException occurs if attempt is made to read from closed Scanner object

Example: see Figure 14.11 (p 690-1)
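
A minimal sketch of sequential reading with Scanner, separate from the textbook figure. It assumes each record is an integer id followed by a one-word name; the class ReadTextFile, the method readRecords and the file name clients.txt are illustrative.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Scanner;

class ReadTextFile {

    // Reads "id name" pairs until the input is exhausted and returns them
    // as "id:name" strings. Returns what was read so far if an error occurs.
    static List<String> readRecords(String fileName) {
        List<String> records = new ArrayList<String>();
        Scanner input = null;
        try {
            input = new Scanner(new File(fileName));
            while (input.hasNext()) {
                int id = input.nextInt();      // same methods as keyboard input
                String name = input.next();
                records.add(id + ":" + name);
            }
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open " + fileName);
        } catch (NoSuchElementException e) {
            // Also covers InputMismatchException, a subclass
            System.err.println("Malformed record in " + fileName);
        } finally {
            if (input != null) {
                input.close();
            }
        }
        return records;
    }

    public static void main(String[] args) {
        for (String record : readRecords("clients.txt")) {
            System.out.println(record);
        }
    }
}
```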

Page 11: Reading and Writing Text Files

String Tokenizer 11

Tokens: Fields of a Record

- Tokenization breaks a statement, sentence, or line of data into individual pieces
- Tokens are the individual pieces
  - Words from a sentence
  - Keywords, identifiers, operators from a Java statement
  - Individual data items or fields of a record (that were separated by white space, tab, new line, comma, or other delimiter)

Page 12: Reading and Writing Text Files

String Classes

- Class java.lang.String
- Class java.lang.StringBuffer
- Class java.util.StringTokenizer

Page 13: Reading and Writing Text Files

StringTokenizer

- Breaks a string into component tokens
- Default delimiters: " \t\n\r\f" (space, tab, new line, return, or form feed)
- Specify other delimiter(s) at construction or in method nextToken:

    String delimiter = " ,\n";
    StringTokenizer tokens = new StringTokenizer(sentence, delimiter);

  -or-

    String newDelimiterString = "|,";
    tokens.nextToken(newDelimiterString);
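
A small sketch of switching delimiters partway through a string with nextToken(String). The class DelimiterSwitchDemo and the sample record are invented for illustration. Note that the delimiter set passed to nextToken stays in effect for later calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterSwitchDemo {

    // Reads the first two tokens with the delimiters " ,\n", then switches
    // to "|," for the rest of the line.
    static List<String> tokenize(String line) {
        List<String> result = new ArrayList<String>();
        StringTokenizer tokens = new StringTokenizer(line, " ,\n");

        // First two fields use the delimiters given at construction
        for (int i = 0; i < 2 && tokens.hasMoreTokens(); i++) {
            result.add(tokens.nextToken());
        }

        // Switch the delimiter set; hasMoreTokens here still tests the old set,
        // so this assumes a non-delimiter character remains under the new set
        if (tokens.hasMoreTokens()) {
            result.add(tokens.nextToken("|,"));
        }

        // The new delimiters remain the default for the remaining calls
        while (tokens.hasMoreTokens()) {
            result.add(tokens.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("John Smith,red|green|blue"));
        // prints [John, Smith, red, green, blue]
    }
}
```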

Page 14: Reading and Writing Text Files

Example 29.18

    import java.util.Scanner;
    import java.util.StringTokenizer;

    public class TokenTest {

        public static void main (String[] args) {
            Scanner scan = new Scanner(System.in);
            System.out.println("Enter a sentence to tokenize and press Enter:");
            String sentence = scan.nextLine();

            // default delimiter is " \t\n\r\f"
            String delimiter = " ,\n";
            StringTokenizer tokens = new StringTokenizer(sentence, delimiter);
            System.out.printf("Number of elements: %d\n", tokens.countTokens());

            System.out.println("The tokens are:");
            while (tokens.hasMoreTokens())
                System.out.println(tokens.nextToken());
        }
    }

(Refer to p 1378)

Page 15: Reading and Writing Text Files

Comma Separated Values 15

Comma Separated Value (CSV) Data Files
- Fields are separated by commas
- For data exchange between disparate systems
- Pseudo standard used by Microsoft Excel and other systems

Page 16: Reading and Writing Text Files

CSV File Format Rules
1. Each record is one line.
2. Fields are separated by comma delimiters.
3. Leading and trailing white space in a field is ignored unless the field is enclosed in double quotes.
4. First record in a CSV may be a header of field names. A CSV application needs some boolean indication of whether first record is a header.
5. Empty fields are indicated by consecutive comma delimiters. Thus every record should have the same number of delimiters.
6. Fields with embedded commas must be enclosed in double quotes.

For more information: http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
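
Rules 2, 3, 5 and 6 can be sketched as a character-by-character scan of one record. This is only one possible approach, not the method prescribed by the slides: the class CsvLineParser and its methods are illustrative names, and escaped quotes inside a quoted field are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class CsvLineParser {

    // Splits one CSV record into fields. Commas separate fields, unquoted
    // fields are trimmed, and commas inside double quotes are kept as data.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;    // currently inside a quoted section
        boolean wasQuoted = false;   // current field contained a quoted section

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (!inQuotes && !wasQuoted && current.toString().trim().isEmpty()) {
                    current.setLength(0);   // drop white space before the opening quote
                }
                inQuotes = !inQuotes;
                wasQuoted = true;
            } else if (c == ',' && !inQuotes) {
                fields.add(finish(current, wasQuoted));
                current.setLength(0);
                wasQuoted = false;
            } else if (!inQuotes && wasQuoted && Character.isWhitespace(c)) {
                // skip white space after the closing quote
            } else {
                current.append(c);
            }
        }
        fields.add(finish(current, wasQuoted));   // last field has no trailing comma
        return fields;
    }

    // Quoted fields keep their white space; unquoted fields are trimmed (rule 3).
    private static String finish(StringBuilder field, boolean wasQuoted) {
        String text = field.toString();
        return wasQuoted ? text : text.trim();
    }

    public static void main(String[] args) {
        for (String field : parseLine("1, \"Smith, Jr.\" ,, PA")) {
            System.out.println("[" + field + "]");
        }
    }
}
```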

Page 17: Reading and Writing Text Files

CSV Format vs StringTokenizer

StringTokenizer with a comma delimiter will read most CSV files, but does not account for empty fields or a quoted field with embedded commas:
- Empty fields in a CSV file are indicated by consecutive commas. Example:
    123, John ,, Doe (Middle Name field is blank)
- Fields with embedded commas are enclosed in quotes. Example:
    456 , "King , the Gorilla" , Kong
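
The empty-field limitation can be seen in a short comparison. StringTokenizer treats consecutive delimiters as one, so the blank middle name disappears; String.split with a negative limit is shown here as one alternative that keeps it. The class EmptyFieldDemo and its methods are illustrative names. Neither approach handles quoted fields with embedded commas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class EmptyFieldDemo {

    // Tokenizes on commas; consecutive commas are collapsed, so empty fields are lost.
    static List<String> withTokenizer(String record) {
        List<String> fields = new ArrayList<String>();
        StringTokenizer tokens = new StringTokenizer(record, ",");
        while (tokens.hasMoreTokens()) {
            fields.add(tokens.nextToken().trim());
        }
        return fields;
    }

    // Splits on commas; the limit of -1 keeps empty fields, including trailing ones.
    static List<String> withSplit(String record) {
        List<String> fields = new ArrayList<String>();
        for (String field : record.split(",", -1)) {
            fields.add(field.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String record = "123, John ,, Doe";
        System.out.println(withTokenizer(record));   // prints [123, John, Doe]
        System.out.println(withSplit(record));       // prints [123, John, , Doe]
    }
}
```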

Page 18: Reading and Writing Text Files

Exercise Part 1
- Develop and test classes to read and write CSV data files, satisfying the first four "CSV File Format Rules" (listed on a previous slide). Your completed classes must:
  - Handle the usual possible file exceptions
  - Read CSV-formatted data from one or more files into a single array
  - Print the data array
  - Write data from the array to a single file in CSV format
- Test your CSV reader to read and print sample files:
  - TestFile1.csv
  - TestFile2.csv
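
One possible starting point for the writing step, offered as a sketch rather than a reference solution. The class CsvWriterSketch, its methods and the output file name combined.csv are assumptions; quoting fields that contain commas goes slightly beyond the first four rules.

```java
import java.io.FileNotFoundException;
import java.util.Formatter;

class CsvWriterSketch {

    // Joins the fields of one record with commas. A field containing a comma
    // is wrapped in double quotes so it can be read back as a single field.
    static String toCsvLine(String[] fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            String field = fields[i];
            if (field.indexOf(',') >= 0) {
                line.append('"').append(field).append('"');
            } else {
                line.append(field);
            }
        }
        return line.toString();
    }

    // Writes each row of the array as one line of the named file.
    static void write(String fileName, String[][] data) throws FileNotFoundException {
        Formatter output = new Formatter(fileName);
        try {
            for (String[] record : data) {
                output.format("%s%n", toCsvLine(record));
            }
        } finally {
            output.close();
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        String[][] data = {
            {"ID", "FName", "LName"},
            {"456", "King , the Gorilla", "Kong"}
        };
        write("combined.csv", data);
    }
}
```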

Page 19: Reading and Writing Text Files

Multi-dimensional Arrays
- Java implements multi-dimensional arrays as arrays of 1-dimensional arrays.
- Rows can actually have different numbers of columns. Example:

    int b[][];
    b = new int[ 2 ][ ];    // create 2 rows
    b[ 0 ] = new int[ 5 ];  // create 5 columns for row 0
    b[ 1 ] = new int[ 3 ];  // create 3 columns for row 1

(Refer to p 311-315)

Page 20: Reading and Writing Text Files

Array Dimension: Length

Recall that for a one-dimensional array:

    int a[ ] = new int[ 10 ];
    int size = a.length;

For a two-dimensional array:

    int b[][] = new int[ 10 ][ 20 ];
    int size1 = b.length;       // number of rows
    int size2 = b[ i ].length;  // number of cols for i-th row
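
Because rows may differ in length, a loop over a two-dimensional array should use each row's own length rather than a fixed column count. A minimal sketch, with the illustrative class name JaggedArrayDemo:

```java
class JaggedArrayDemo {

    // Counts all elements by visiting each row and using that row's length.
    static int countElements(int[][] table) {
        int total = 0;
        for (int row = 0; row < table.length; row++) {   // table.length = number of rows
            total += table[row].length;                   // columns in this particular row
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] b = new int[2][];
        b[0] = new int[5];   // row 0 has 5 columns
        b[1] = new int[3];   // row 1 has 3 columns
        System.out.println(countElements(b));   // prints 8
    }
}
```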

Page 21: Reading and Writing Text Files

TestFile1.csv

    987, Thomas ,Jefferson,7 Estate Ave.,Loretto, PA, 15940
    413, Martha,Washington,1600 Penna Ave,Washington, DC,20002
    123, Martin , Martina ,777 Williams Ct.,Smallville, PA,15990
    990, Shelby, Roosevelt,15 Jackson Pl,NYC,NY, 12345

TestFile2.csv

    ID, FName, LName, StreetAddress, City, State, Zip
    123, John ,Dozer,120 Main st.,Loretto, PA, 15940
    107, Jane,Washington,220 Hobokin Ave.,Philadelphia, PA,0911
    123, William , Adams ,120 Jefferson St.,Johnstown, PA,15904
    451, Brenda, Bronson,127 Terrace Road,Barrows,AK, 99789
    729, Brainfield,Blanktowm, PA, 16600

Page 22: Reading and Writing Text Files

Exercise Part 2
- Develop an application that uses your CSV reader and writer classes
- Read the test files (or create your own test files) and perform data validity checks by displaying an appropriate error message and the offending record(s):
  - If any fields are missing
  - If extra fields are found
  - If any records have duplicate IDs
  - If any record has an invalid zip code (i.e. not exactly 5 digits)
- Write all records to a single CSV file (i.e. concatenate the multiple test files in a single file)
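
Two of the validity checks above could be sketched as below. This is a hedged starting point, not the expected solution: the class RecordChecks and its methods are illustrative, the ID is assumed to be in column 0, and fields are trimmed before checking.

```java
import java.util.HashSet;
import java.util.Set;

class RecordChecks {

    // A zip code is valid only if it is exactly 5 digits (after trimming).
    static boolean isValidZip(String zip) {
        return zip != null && zip.trim().matches("\\d{5}");
    }

    // Returns the first ID (column 0) that appears more than once,
    // or null if all IDs are distinct. Rows with no fields are skipped.
    static String firstDuplicateId(String[][] data) {
        Set<String> seen = new HashSet<String>();
        for (String[] record : data) {
            if (record.length == 0) {
                continue;
            }
            String id = record[0].trim();
            if (!seen.add(id)) {      // add returns false if the ID was already present
                return id;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(isValidZip("15940"));   // prints true
        System.out.println(isValidZip("0911"));    // prints false
    }
}
```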

Page 23: Reading and Writing Text Files

Exercise Part 3 (extra credit)

Extend your classes to be fully compliant with the “CSV File Format Rules”.

Hint: Review some existing CSV Java libraries online.

Page 24: Reading and Writing Text Files

Hints 1.a

CSVFile

    - boolean hasHeaderRow;
    - String fileName;
    - Scanner input;
    - List<String> records;
    - String data[][];
    - int numRecords;
    - int maxNumFields;

    + CSVFile(String fileName)
    + CSVFile(boolean headerRow, String fileName)
    + boolean getHasHeaderRow()
    + String getFileName()
    + int getNumRecords()
    + int getMaxNumFields()
    + void getData(String a[][])
    + void openFile()
    + void readRecords()
    + void parseFields()
    + void printData()

Page 25: Reading and Writing Text Files

Hints 1.b

    import java.io.File;
    import java.util.Scanner;
    import java.io.FileNotFoundException;
    import java.lang.IllegalStateException;
    import java.util.NoSuchElementException;
    import java.util.List;
    import java.util.ArrayList;
    import java.util.StringTokenizer;

Page 26: Reading and Writing Text Files

Hints 1.c

    public void readRecords() {
        // Read all lines (records) from the file into an ArrayList
        records = new ArrayList<String>();
        try {
            while (input.hasNext())
                records.add( input.nextLine() );
        ...

    public void openFile() {
        try {
            input = new Scanner(new File(fileName));
        } catch (FileNotFoundException fileNotFound) {
        ...

Page 27: Reading and Writing Text Files

Hints 1.d

    public void parseFields() {
        String delimiter = ",\n";

        // Create two-dimensional array to hold data (see Deitel, p 313-315)
        int rows = records.size();    // #rows for array = #lines in file
        data = new String[rows][];    // create the rows for the array
        int row = 0;

        for (String record : records) {
            StringTokenizer tokens = new StringTokenizer(record, delimiter);
            int cols = tokens.countTokens();
            data[row] = new String[cols];    // create columns for current row
            int col = 0;
            while (tokens.hasMoreTokens()) {
                data[row][col] = tokens.nextToken();
                col++;
            }
            row++;    // advance to the next row (not shown on the original slide)
        }
    }

Page 28: Reading and Writing Text Files

Hints 1.e

    public static void main (String[] args) {
        CSVFile file1 = new CSVFile(true,"TestFile1.csv");
        file1.openFile();
        file1.readRecords();
        file1.parseFields();
        file1.printData();

        // Caller allocates the array, then getData fills it
        String fileData[][] =
            new String[file1.getNumRecords()][file1.getMaxNumFields()];
        file1.getData(fileData);
    }

Page 29: Reading and Writing Text Files
Page 30: Reading and Writing Text Files

CSV Libraries
- http://ostermiller.org/utils/CSV.html
- http://opencsv.sourceforge.net/