Java I/O

Java I/O. Agenda Introducing I/O. What Is a Stream? Network Streams Filter Streams Data Streams Reader and Writers Streams in Memory Object Serialization

Agenda

Introducing I/O. What Is a Stream?

Network Streams

Filter Streams

Data Streams

Reader and Writers

Streams in Memory

Object Serialization

What Is a Stream?

- A stream is an ordered sequence of bytes of undetermined length.

- Input streams move bytes of data into a Java program from some generally external source.

- Output streams move bytes of data from Java to some generally external target.

- The word stream is derived from an analogy with a stream of water. An input stream is like a siphon that sucks up water; an output stream is like a hose that sprays out water.

Partial Byte Stream Inheritance Hierarchies

The InputStream and OutputStream classes

BufferedInputStream

ByteArrayInputStream

DataInputStream

FileInputStream

FilterInputStream

PushbackInputStream

ObjectInputStream

PipedInputStream

SequenceInputStream

BufferedOutputStream

ByteArrayOutputStream

DataOutputStream

FileOutputStream

FilterOutputStream

PrintStream

ObjectOutputStream

PipedOutputStream

ASCII

ASCII, the American Standard Code for Information Interchange, is a seven-bit character set.

Thus it defines 2^7, or 128, different characters whose numeric values range from 0 to 127.

ASCII characters 0-31 and character 127 are nonprinting control characters.

Characters 32-47 are various punctuation and space characters.

Characters 48-57 are the digits 0-9.

Characters 58-64 are another group of punctuation characters.

Characters 65-90 are the capital letters A-Z.

Characters 91-96 are a few more punctuation marks.

Characters 97-122 are the lowercase letters a-z.

Characters 123-126 are a few remaining punctuation symbols.

Unicode

Java is one of the first programming languages to explicitly address the need for non-English text.

It does this by adopting Unicode as its native character set. All Java chars and strings are given in Unicode.

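Each Java char is a two-byte, unsigned number with a value between 0 and 65,535. This provides enough space for characters from most of the world's scripts in current use; Unicode characters beyond this range are represented in Java as a pair of chars (a surrogate pair).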

Not all Java environments can display all Unicode characters. The biggest problem is the lack of fonts. Few computers have fonts for all the scripts Java supports. Even web browsers that can handle Chinese, Cyrillic, Arabic, Japanese, or other non-Roman scripts in HTML don't necessarily support those same scripts in applets.

Unicode – Java compilers

Many Java compilers assume that source files are written in ASCII and that the only Unicode characters present are Unicode escapes. During a single-pass preprocessing phase, the compiler converts each raw ASCII character or Unicode escape sequence to a two-byte Unicode character it stores in memory. Only after preprocessing is complete and the ASCII file has been converted to in-memory Unicode is the file actually compiled.

UTF-8

The downside is that Unicode is far from the most efficient encoding possible. In a file containing mostly English text, the high bytes of almost all the characters will be 0.

These bytes can occupy as much as half of the file. If you're sending data across the network, Unicode data can take twice as long.

A more efficient encoding can be achieved for files that are composed primarily of ASCII text by encoding the more common characters in fewer bytes.

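UTF-8 is one such format. The variant that Java's data streams use (writeUTF() and readUTF(), known as modified UTF-8) encodes the non-null ASCII characters in a single byte, characters between 128 and 2047 and ASCII null in two bytes, and the remaining two-byte characters in three bytes. Standard UTF-8 differs in that it encodes ASCII null in one byte and uses four bytes for characters above 65,535.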
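As a rough illustration of the size difference, the following sketch compares how many bytes the same text occupies in UTF-8 and in a two-byte-per-char encoding. The class and method names are invented for this example; it uses the standard (not modified) UTF-8 charset and deliberately avoids the null character.

```java
import java.nio.charset.StandardCharsets;

class EncodingSizes {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies as two-byte UTF-16 units (no byte-order mark).
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String english = "Hello";
        String greek = "\u03B1\u03B2\u03B3"; // three Greek letters, each between 128 and 2047
        System.out.println(utf8Length(english) + " vs " + utf16Length(english)); // prints "5 vs 10"
        System.out.println(utf8Length(greek) + " vs " + utf16Length(greek));     // prints "6 vs 6"
    }
}
```

For mostly ASCII text the UTF-8 form is about half the size; for text in the 128-2047 range the two encodings come out even.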

The InputStream class

public abstract int read() throws IOException

public int read(byte[] data) throws IOException

public int read(byte[] data, int offset, int length) throws IOException

public long skip(long n) throws IOException

public int available() throws IOException

public void close() throws IOException

public synchronized void mark(int readlimit)

public synchronized void reset() throws IOException

public boolean markSupported()

Marking and Resetting

public synchronized void mark(int readLimit)

public synchronized void reset() throws IOException

public boolean markSupported()

It's often useful to be able to read a few bytes and then back up and reread them. The mark() method places a bookmark at the current position in the stream. You can rewind the stream to this position later with reset() as long as you haven't read more than readLimit bytes.

The only two input stream classes in java.io that always support marking are BufferedInputStream and ByteArrayInputStream.

However, other input streams, such as DataInputStream, may support marking if they're chained to a buffered input stream first.
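A minimal sketch of marking and resetting, assuming an in-memory source wrapped in a BufferedInputStream. The class name MarkResetDemo and the method peekTwice are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class MarkResetDemo {
    // Reads two bytes, rewinds to the mark, and reads the same two bytes again.
    static String peekTwice(InputStream raw) throws IOException {
        InputStream in = new BufferedInputStream(raw);
        StringBuilder seen = new StringBuilder();
        if (in.markSupported()) {
            in.mark(2); // we promise to read at most 2 bytes before calling reset()
            seen.append((char) in.read()).append((char) in.read());
            in.reset(); // back to the marked position
        }
        seen.append((char) in.read()).append((char) in.read());
        return seen.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {'a', 'b', 'c', 'd'};
        System.out.println(peekTwice(new ByteArrayInputStream(data))); // prints "abab"
    }
}
```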

An Efficient Stream Copier

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopier {

    public static void main(String[] args) throws IOException {
        copy(System.in, System.out);
    }

    // Copies everything from in to out, 1,024 bytes at a time.
    public static void copy(InputStream in, OutputStream out)
            throws IOException {
        byte[] buffer = new byte[1024];
        while (true) {
            int bytesRead = in.read(buffer);
            if (bytesRead == -1) break; // end of stream
            out.write(buffer, 0, bytesRead);
        }
    }
}

Network Streams

Network I/O relies primarily on the basic InputStream and OutputStream methods, which you can wrap with any higher-level stream that suits your needs: buffering, cryptography, compression, or whatever your application requires. Java's URL, URLConnection, Socket, and ServerSocket classes are all fertile sources of streams.

The socket represents a reliable connection for the transmission of data between two hosts. It isolates you from the details of packet encodings, lost and retransmitted packets, and packets that arrive out of order. A socket performs four fundamental operations:

Connect to a remote machine

Send data

Receive data

Close the connection
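The four operations can be sketched against a throwaway echo server on the local machine, so the example needs no outside network. Everything here other than the java.net and java.io classes is an assumption made for illustration: the class name, the echo behavior, and the use of an ephemeral port. It relies on InputStream.transferTo() and readAllBytes(), which require Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class LoopbackSocketDemo {
    // Sends a message to a local echo server and returns whatever comes back.
    static String roundTrip(String message) throws IOException, InterruptedException {
        InetAddress local = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, local)) { // port 0 = any free port
            // Server side: accept one connection and echo its bytes back.
            Thread echo = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    peer.getInputStream().transferTo(peer.getOutputStream());
                } catch (IOException e) {
                    System.err.println(e);
                }
            });
            echo.start();

            String reply;
            // 1. Connect to the remote machine (here, the local echo server).
            try (Socket socket = new Socket(local, server.getLocalPort())) {
                // 2. Send data.
                OutputStream out = socket.getOutputStream();
                out.write(message.getBytes(StandardCharsets.UTF_8));
                socket.shutdownOutput(); // tell the server no more data is coming
                // 3. Receive data.
                InputStream in = socket.getInputStream();
                reply = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            } // 4. Close the connection (try-with-resources does this).
            echo.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello, socket"));
    }
}
```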

The URLTyper program

The program connects to the specified URLs, downloads the data, and copies it to System.out.

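import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;

public class URLTyper {

    public static void main(String[] args) {
        for (int i = 0; i < args.length; i++) {
            // Label each document when more than one URL is given.
            if (args.length > 1) System.out.println(args[i] + ":");
            // try-with-resources closes the stream even if the copy fails.
            try (InputStream in = new URL(args[i]).openStream()) {
                StreamCopier.copy(in, System.out);
            } catch (MalformedURLException e) {
                System.err.println(e);
            } catch (IOException e) {
                System.err.println(e);
            }
        }
    }
}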

Filter Streams

- Filter input streams read data from a preexisting input stream like a FileInputStream and have an opportunity to work with or change the data before it is delivered to the client program.

- Filter output streams write data to a preexisting output stream such as a FileOutputStream and have an opportunity to work with or change the data before it is written onto the underlying stream.
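As a small sketch of a filter that changes data on its way to the underlying stream, the hypothetical UpperCaseOutputStream below converts lowercase ASCII letters to uppercase. It is not part of java.io; only FilterOutputStream and ByteArrayOutputStream are.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class UpperCaseOutputStream extends FilterOutputStream {
    UpperCaseOutputStream(OutputStream out) {
        super(out);
    }

    // FilterOutputStream routes its array-based write() methods through
    // write(int), so overriding this one method filters every byte.
    @Override
    public void write(int b) throws IOException {
        out.write(b >= 'a' && b <= 'z' ? b - 32 : b);
    }

    // Writes text through the filter into an in-memory stream and returns the result.
    static String demo(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream filtered = new UpperCaseOutputStream(sink)) {
            filtered.write(text.getBytes(StandardCharsets.US_ASCII));
        }
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo("Hello, io")); // prints "HELLO, IO"
    }
}
```

Writing one byte at a time is slow for large arrays; a production filter would usually also override write(byte[], int, int).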

The Filter Stream Subclasses

- BufferedInputStream and BufferedOutputStream

These streams keep an internal byte array as a buffer. Data is read from the underlying stream, or written to it, in blocks; subsequent accesses go straight to the buffer. This improves performance in many situations. Buffered input streams also allow the reader to back up and reread data.

PrintStream

The java.io.PrintStream class, which System.out and System.err are instances of, allows very simple printing of primitive values, objects, and string literals.

It uses the platform's default character encoding to convert characters into bytes.

PushbackInputStream

The java.io.PushbackInputStream class provides a pushback buffer so a program can "unread" the last several bytes read. The next time data is read from the stream, the unread bytes are reread.
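A minimal sketch of unreading, assuming the common case where a parser peeks at one byte and then puts it back. The class and method names are invented for this example, and readAllBytes() requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

class PushbackDemo {
    // Peeks at the first byte, pushes it back, then reads the whole input.
    static String readAfterPeek(byte[] data) throws IOException {
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(data))) {
            int first = in.read();
            if (first != -1) {
                in.unread(first); // the default pushback buffer holds one byte
            }
            // The unread byte is delivered again before the rest of the stream.
            return new String(in.readAllBytes(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "42 apples".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readAfterPeek(data)); // prints "42 apples"
    }
}
```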

Print Streams

System.out and System.err are instances of the java.io.PrintStream class. This is a subclass of FilterOutputStream that converts numbers and objects to text.

This makes System.out easy to use in quick and dirty hacks and simple examples, while simultaneously making it unsuitable for production code, which should use the java.io.PrintWriter class or Log4j instead.

Multitarget Output Streams

Two unusual filter output streams direct their data to multiple underlying streams; the TeeOutputStream class that follows is one example.

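import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class TeeOutputStream extends FilterOutputStream {

    OutputStream out1;
    OutputStream out2;

    public TeeOutputStream(OutputStream stream1, OutputStream stream2) {
        super(stream1);
        out1 = stream1;
        out2 = stream2;
    }

    // Every byte goes to both underlying streams.
    public synchronized void write(int b) throws IOException {
        out1.write(b);
        out2.write(b);
    }

    // Without this override, flush() would reach only the first stream.
    public synchronized void flush() throws IOException {
        out1.flush();
        out2.flush();
    }
}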

Data Streams

Data streams read and write strings, integers, floating-point numbers, and other data that's commonly presented at a higher level than mere bytes.

The DataInputStream and DataOutputStream classes read and write the primitive Java data types (boolean, int, double, etc.) and strings in a particular, well-defined, platform-independent format.

DataInputStream

The DataInputStream class reads primitive Java data types and strings in a machine-independent way. Its read methods are declared final (the abstract versions belong to the DataInput interface it implements), and readLine() is deprecated.

public final short readShort() throws IOException

public final int readUnsignedShort() throws IOException

public final char readChar() throws IOException

public final int readInt() throws IOException

public final long readLong() throws IOException

public final float readFloat() throws IOException

public final double readDouble() throws IOException

public final String readLine() throws IOException

public final String readUTF() throws IOException
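To show the platform-independent format in action, the sketch below writes an int, a double, and a string with DataOutputStream and reads them back in the same order with DataInputStream. The class name and the use of in-memory byte array streams are choices made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataRoundTrip {
    // Writes three values, reads them back, and returns them joined by spaces.
    static String roundTrip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream dout = new DataOutputStream(bytes)) {
            dout.writeInt(42);        // 4 bytes, big-endian
            dout.writeDouble(3.5);    // 8 bytes
            dout.writeUTF("stream");  // 2-byte length followed by the encoded characters
        }
        try (DataInputStream din =
                 new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            // Values must be read in exactly the order they were written.
            int i = din.readInt();
            double d = din.readDouble();
            String s = din.readUTF();
            return i + " " + d + " " + s;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip()); // prints "42 3.5 stream"
    }
}
```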

Data Stream : Echo program

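import java.io.DataInputStream;
import java.io.IOException;

public class Echo {

    public static void main(String[] args) {
        try {
            DataInputStream din = new DataInputStream(System.in);
            while (true) {
                // readLine() is deprecated because it does not convert bytes to
                // characters properly; BufferedReader.readLine() is the usual replacement.
                String theLine = din.readLine();
                if (theLine == null) break; // end of stream
                if (theLine.equals(".")) break; // . on line by itself
                System.out.println(theLine);
            }
        }
        catch (IOException e) {System.err.println(e);}
    }
}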

Thread Safety

Never allow two threads to share a stream! The principle is most obvious for filter streams, but it applies to regular streams as well.

Although writing or reading a single byte can be treated as an atomic operation, many programs will not be happy to read and write individual bytes. They'll want to read or write a particular group of bytes and will not react well to being interrupted.

Sequence Input Streams

The java.io.SequenceInputStream class connects multiple input streams together in a particular order:

public class SequenceInputStream extends InputStream

Reads from a SequenceInputStream first read all the bytes from the first stream in the sequence, then all the bytes from the second stream in the sequence, then all the bytes from the third stream, and so on. When the end of one of the streams is reached, that stream is closed; the next data comes from the next stream. Of course, this assumes that the streams in the sequence are in fact finite.

Sequence Input Streams: Constructors

public SequenceInputStream(Enumeration e)

public SequenceInputStream(InputStream in1, InputStream in2)

The first constructor creates a sequence out of all the elements of the Enumeration e. This assumes all objects in the enumeration are input streams. If this isn't the case, a ClassCastException will be thrown the first time a read is attempted from an object that is not an InputStream.

The second constructor creates a sequence input stream that reads first from in1, then from in2.

Sequence Input Streams: Example

try {
    URL u1 = new URL("http://java.sun.com/");
    URL u2 = new URL("http://www.altavista.com");
    SequenceInputStream sin = new SequenceInputStream(u1.openStream(), u2.openStream());
}
catch (IOException e) { //...
}
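The same idea can be tried without a network connection. This sketch chains two in-memory streams and reads them as one; the class and method names are made up, and readAllBytes() requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;

class SequenceDemo {
    // Reads two in-memory streams back to back through one SequenceInputStream.
    static String concatenate(String first, String second) throws IOException {
        InputStream in1 = new ByteArrayInputStream(first.getBytes(StandardCharsets.UTF_8));
        InputStream in2 = new ByteArrayInputStream(second.getBytes(StandardCharsets.UTF_8));
        try (InputStream sin = new SequenceInputStream(in1, in2)) {
            // All of in1 is delivered first, then all of in2.
            return new String(sin.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(concatenate("first,", "second")); // prints "first,second"
    }
}
```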

Byte Array Streams

It's sometimes convenient to use stream methods to manipulate data in byte arrays. For example, you might receive an array of raw bytes that you want to interpret as double-precision, floating-point numbers.

The quickest way to do this is to use a DataInputStream. However, before you can create a data input stream, you first need to create a raw, byte-oriented stream. This is what the java.io.ByteArrayInputStream class gives you.
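A sketch of that approach, assuming the raw bytes hold big-endian eight-byte doubles (the format DataInputStream expects). The class name and the ByteBuffer used to fabricate sample input are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

class DoublesFromBytes {
    // Interprets a raw byte array as a run of big-endian doubles.
    static double[] decode(byte[] raw) throws IOException {
        double[] values = new double[raw.length / Double.BYTES];
        try (DataInputStream din =
                 new DataInputStream(new ByteArrayInputStream(raw))) {
            for (int i = 0; i < values.length; i++) {
                values[i] = din.readDouble();
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Sample input: two doubles packed into 16 bytes (ByteBuffer is big-endian by default).
        byte[] raw = ByteBuffer.allocate(16).putDouble(1.5).putDouble(-2.25).array();
        System.out.println(Arrays.toString(decode(raw))); // prints "[1.5, -2.25]"
    }
}
```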

Piped Streams

The java.io.PipedInputStream class and java.io.PipedOutputStream class provide a convenient means to move streaming data from one thread to another.

PipedInputStream

public class PipedInputStream extends InputStream

The PipedInputStream class has two constructors:

public PipedInputStream()

public PipedInputStream(PipedOutputStream source) throws IOException

The no-argument constructor creates a piped input stream that is not yet connected to a piped output stream. The second constructor creates a piped input stream that's connected to the piped output stream source.

PipedOutputStream

public class PipedOutputStream extends OutputStream

The PipedOutputStream class also has two constructors:

public PipedOutputStream()

public PipedOutputStream(PipedInputStream sink) throws IOException

The no-argument constructor creates a piped output stream that is not yet connected to a piped input stream. The second constructor creates a piped output stream that's connected to the piped input stream sink.

Piped Streams - implementation

We can create them both unconnected, then use one or the other's connect() method to link them:

PipedInputStream pin = new PipedInputStream();

PipedOutputStream pout = new PipedOutputStream();

pin.connect(pout);

The piped input stream also has four protected fields and one protected method that are used to implement the piping:

protected static final int PIPE_SIZE

protected byte[] buffer

protected int in

protected int out

protected synchronized void receive(int b) throws IOException

Piped Streams – How it works

When a client class invokes a write() method in the piped output stream class, the write() method invokes the receive() method in the connected piped input stream to place the data in the byte array buffer.

Data is always written at the position in the buffer given by the field in and read from the position in the buffer given by the field out.
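A minimal sketch of two threads communicating through a pipe: one thread writes into the PipedOutputStream while the calling thread reads from the connected PipedInputStream. The class and method names are invented for this example, and readAllBytes() requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

class PipeDemo {
    // Moves a message from a writer thread to the calling thread through a pipe.
    static String transfer(String message) throws IOException, InterruptedException {
        PipedInputStream pin = new PipedInputStream();
        PipedOutputStream pout = new PipedOutputStream(pin); // connected at construction

        Thread writer = new Thread(() -> {
            // Closing the output side is what lets the reader see end of stream.
            try (OutputStream out = pout) {
                out.write(message.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                System.err.println(e);
            }
        });
        writer.start();

        String received;
        try (InputStream in = pin) {
            // Blocks until the writer has written everything and closed its end.
            received = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        writer.join();
        return received;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transfer("through the pipe"));
    }
}
```

The reader and the writer run in different threads on purpose: using both ends of a pipe from a single thread risks deadlock once the buffer fills or empties.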

Piped Streams – blocking situations

There are two possible blocking situations here.

The first occurs if the writing thread tries to write data while the reading thread's input buffer is full. When this occurs, the output stream enters an infinite loop in which it repeatedly waits for one second until some thread reads some data out of the buffer and frees up space. If this is likely to be a problem for your application, make the buffer larger: subclass PipedInputStream or, in Java 6 and later, pass a larger size to the PipedInputStream(int pipeSize) constructor.

The second possible block is when the reading thread tries to read and no data is present in the buffer. In this case, the input stream enters an infinite loop in which it repeatedly waits for one second until some thread writes some data into the buffer.

Readers and Writers

The difference between readers and writers and input and output streams is that streams are fundamentally byte based, while readers and writers are fundamentally character based.

Where an input stream reads a byte, a reader reads a character; where an output stream writes a byte, a writer writes a character.
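The difference shows up as soon as the data contains a non-ASCII character. In this sketch the same five bytes are counted once through an input stream and once through a reader; the class and method names are made up, and UTF-8 is assumed as the encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ByteVsCharDemo {
    // Counts how many bytes an input stream delivers.
    static int countBytes(byte[] data) throws IOException {
        try (InputStream in = new ByteArrayInputStream(data)) {
            int n = 0;
            while (in.read() != -1) n++;
            return n;
        }
    }

    // Counts how many characters a reader delivers from the same bytes.
    static int countChars(byte[] data) throws IOException {
        try (Reader reader = new InputStreamReader(
                 new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int n = 0;
            while (reader.read() != -1) n++;
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        // "caf" plus e-acute, which UTF-8 encodes in two bytes.
        byte[] data = "caf\u00E9".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data) + " bytes, " + countChars(data) + " chars"); // prints "5 bytes, 4 chars"
    }
}
```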

Partial Character Stream Inheritance Hierarchies

Object Serialization

Object serialization saves an object's state in a sequence of bytes so that the object can be reconstituted from those bytes at a later time.

Object serialization was first used in the context of Remote Method Invocation (RMI) and later for JavaBeans.

The java.io.ObjectOutputStream class provides a writeObject() method you can use to write a Java object onto a stream.

The java.io.ObjectInputStream class has a readObject() method you can use to read an object from a stream.
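A minimal round-trip sketch, assuming a small hypothetical Point class that implements java.io.Serializable and in-memory byte array streams in place of a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationDemo {
    // Hypothetical example class; any class implementing Serializable would do.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serializes a Point to bytes and reconstitutes a new Point from them.
    static Point roundTrip(Point original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oout = new ObjectOutputStream(bytes)) {
            oout.writeObject(original);
        }
        try (ObjectInputStream oin =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Point) oin.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println(copy.x + "," + copy.y); // prints "3,4"
    }
}
```

The object that comes back is a new instance with the same state, not the original object.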
