File handling is an integral part of nearly all programming projects. Files provide the
means by which a program stores data, accesses stored data, or shares data. As a
result, there are very few applications that don’t interact with a file in one form or
another. Although no aspect of file handling is particularly difficult, a great many classes,
interfaces, and methods are involved. Being able to effectively apply them to your projects
is the mark of a professional.
It is important to understand that file I/O is a subset of Java’s overall I/O system.
Furthermore, Java’s I/O system is quite large. This is not surprising given that it supports
two distinct I/O class hierarchies: one for bytes and one for characters. It contains classes
that enable a byte array, a character array, or a string to be used as source or target of I/O
operations. It also provides the ability to set or obtain various attributes associated with a
file itself, such as its read/write status, whether it is a directory, or whether it is hidden. You
can even obtain a list of the files within a directory.
Despite its size, Java’s I/O system is surprisingly easy to use. One reason for this is its
well-thought-out design. By structuring the I/O system around a carefully crafted set of
classes, this very large API is made manageable. Once you understand how to use the core
classes, it’s easy to learn its more advanced capabilities. The I/O system’s consistency
makes it easy to maintain or adapt code, and its rich functionality provides solutions to
most file handling tasks.
The core of Java’s I/O system is packaged in java.io. It has been included with Java
since version 1.0, and it contains the classes and interfaces that you will most often use
when performing I/O operations, including those that operate on files. Simply put, when
you need to read or write files, java.io is the package that you will normally turn to. As a
result, all of the recipes in this chapter use its capabilities in one form or another.
Another package that includes file handling classes is java.util.zip. The classes in
java.util.zip can create a compressed file, or decompress a file. These classes build on the
functionality provided by the I/O classes defined in java.io. Thus, they are integrated
into Java’s overall I/O strategy. Three recipes demonstrate the use of data compression when
handling files.
This tutorial provides several recipes that demonstrate file handling. It begins by
describing several fundamental operations, such as reading and writing bytes or characters.
It then shows various techniques that help you utilize and manage files.
Here are the recipes contained in this chapter:
• Read Bytes from a File
• Write Bytes to a File
• Buffer Byte-Based File I/O
• Read Characters from a File
• Write Characters to a File
• Buffer Character-Based File I/O
• Read and Write Random-Access Files
• Obtain File Attributes
• Set File Attributes
• List a Directory
• Compress and Decompress Data
• Create a ZIP file
• Decompress a ZIP file
• Serialize Objects
Beginning with version 1.4, Java began providing an additional approach to I/O called NIO
(which stands for New I/O). It creates a channel-based approach to I/O and is packaged in
java.nio. The NIO system is not intended to replace the stream-based I/O classes found in
java.io. Instead, NIO supplements them. Because the focus of this chapter is stream-based I/O,
no NIO-based recipes are included. The interested reader will find a discussion of NIO (and of
I/O in general) in my book Java: The Complete Reference.
An Overview of File Handling
In Java, file handling is a special case of a larger concept, because file I/O is
tightly integrated into Java’s overall I/O system. In general, if you understand one part of
the I/O system, it’s easy to apply that knowledge to another situation. There are two
aspects of the I/O system that make this possible. The first is that Java’s I/O system
is built on a cohesive set of class hierarchies, at the top of which are abstract classes that
define much of the basic functionality shared by all specific concrete subclasses. The second
is the stream. The stream ties together the file system because all I/O operations occur
through one. Because of the importance of the stream, we will begin this overview of Java’s
file handling capabilities there.
Streams
A stream is an abstraction that either produces or consumes information. A stream is linked to
a physical device by the I/O system. All streams behave in the same manner, even if the actual
physical devices they are linked to differ. Thus, the same I/O classes and methods can
be applied to different types of devices. For example, the same methods that you use to write
to the console can also be used to write to a disk file or to a network connection. The core
Java streams are implemented within class hierarchies defined in the java.io package. These
are the streams that you will usually use when handling files. However, some other packages
also define streams. For example, java.util.zip supplies streams that create and operate on
compressed data.
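To make the device-independence point concrete, here is a minimal sketch. The class name StreamTargets and the helper writeGreeting( ) are hypothetical names chosen for illustration. The same method writes to the console and to an in-memory byte array, because both destinations are OutputStreams.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one method, several stream destinations.
class StreamTargets {
  // Writes a short ASCII message to whatever output stream it is given.
  static void writeGreeting(OutputStream out) throws IOException {
    out.write("Hello, stream\n".getBytes(StandardCharsets.US_ASCII));
    out.flush();
  }

  public static void main(String[] args) {
    try {
      // Destination 1: the console.
      writeGreeting(System.out);

      // Destination 2: an in-memory byte array.
      ByteArrayOutputStream memory = new ByteArrayOutputStream();
      writeGreeting(memory);
      System.out.println("Bytes captured in memory: " + memory.size());
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```

Passing a FileOutputStream or a socket's output stream to writeGreeting( ) would work the same way, since the method depends only on the abstract OutputStream type.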
Modern versions of Java define two types of streams: byte and character. (The original
1.0 version of Java defined only byte streams, but character streams were quickly added.)
Byte streams provide a convenient means for handling input and output of bytes. They are
used, for example, when reading or writing binary data. They are especially helpful when
working with files. Character streams are designed for handling the input and output of
characters, which streamlines internationalization.
The fact that Java defines two different types of streams makes the I/O system quite
large because two separate class hierarchies (one for bytes, one for characters) are needed.
The sheer number of I/O classes can make the I/O system appear more intimidating than it
actually is. For the most part, the functionality of byte streams is paralleled by that of the
character streams.
One other point: at the lowest level, all I/O is still byte-oriented. The character-based
streams simply provide a convenient and efficient means for handling characters.
The Byte Stream Classes
Byte streams are defined by two class hierarchies: one for input and one for output. At the
top of these are two abstract classes: InputStream and OutputStream. InputStream defines
the characteristics common to byte input streams, and OutputStream describes the behavior
of byte output streams. The methods specified by InputStream and OutputStream are
shown in Tables 3.1 and 3.2. Several subclasses are derived from InputStream and
OutputStream, each offering different functionality. These classes are shown in Table 3.3.
Of the byte-stream classes, two are directly related to files: FileInputStream and
FileOutputStream. Because these are concrete implementations of InputStream and
OutputStream, they can be used any place an InputStream or an OutputStream is needed.
For example, an instance of FileInputStream can be wrapped in another byte stream class,
such as a BufferedInputStream. This is one reason why Java’s stream-based approach to I/O
is so powerful: it enables the creation of a fully integrated class hierarchy.
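The wrapping just described might look like the following sketch. BufferedByteRead and countBytes( ) are hypothetical names, and the try-with-resources statement assumes Java 7 or later; it stands in for an explicit call to close( ).

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo: wrap a FileInputStream in a BufferedInputStream.
class BufferedByteRead {
  // Counts the bytes in a file, reading them through a buffer.
  static long countBytes(File file) throws IOException {
    // Closing the wrapper also closes the FileInputStream it wraps.
    try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
      long count = 0;
      while (in.read() != -1) {
        count++;
      }
      return count;
    }
  }

  public static void main(String[] args) {
    try {
      // Create a small temporary file to read back.
      File temp = File.createTempFile("bytes", ".dat");
      temp.deleteOnExit();
      try (FileOutputStream out = new FileOutputStream(temp)) {
        out.write(new byte[] { 1, 2, 3, 4, 5 });
      }
      System.out.println("Bytes read: " + countBytes(temp));
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```

Because the variable is declared as an InputStream, the loop in countBytes( ) does not need to know that a file or a buffer is involved.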
The Character Stream Classes
Character streams are defined by using class hierarchies that are different from the byte
streams. The character stream hierarchies are topped by these two abstract classes: Reader
and Writer. Reader is used for input, and Writer is used for output. Tables 3.4 and 3.5 show
the methods defined by these classes. Concrete classes derived from Reader and Writer
operate on Unicode character streams. In general, the character-based classes parallel the
byte-based classes. The character stream classes are shown in Table 3.6.
Of the character-stream classes, two are directly related to files: FileReader and FileWriter.
Because these are concrete implementations of Reader and Writer, they can be used any place
a Reader or Writer is needed. For example, an instance of FileReader can be wrapped in a
BufferedReader to buffer input operations.
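As an illustration of this wrapping, the sketch below reads a text file line by line. BufferedLineRead and readLines( ) are hypothetical names. The code assumes Java 7 or later for try-with-resources, and it relies on the default character encoding used by FileReader and FileWriter.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: wrap a FileReader in a BufferedReader.
class BufferedLineRead {
  // Returns the lines of a text file, without their line terminators.
  static List<String> readLines(File file) throws IOException {
    List<String> lines = new ArrayList<String>();
    try (BufferedReader in = new BufferedReader(new FileReader(file))) {
      String line;
      // readLine() returns null at the end of the file.
      while ((line = in.readLine()) != null) {
        lines.add(line);
      }
    }
    return lines;
  }

  public static void main(String[] args) {
    try {
      File temp = File.createTempFile("lines", ".txt");
      temp.deleteOnExit();
      try (FileWriter out = new FileWriter(temp)) {
        out.write("first line\nsecond line\n");
      }
      for (String line : readLines(temp)) {
        System.out.println(line);
      }
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```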
Table 3.1 The Methods Defined by InputStream
| Method | Description |
| int available( ) throws IOException | Returns the number of bytes of input currently available for reading. |
| void close( ) throws IOException | Closes the input source. |
| void mark(int numBytes) | Places a mark at the current point in the input stream that will remain valid until numBytes bytes are read. Not all streams implement mark( ). |
| boolean markSupported( ) | Returns true if mark( )/reset( ) are supported by the invoking stream. |
| abstract int read( ) throws IOException | Returns an integer representation of the next available byte of input. –1 is returned when the end of the file is encountered. |
| int read(byte buffer[ ]) throws IOException | Attempts to read up to buffer.length bytes into buffer and returns the actual number of bytes that were successfully read. –1 is returned when the end of the file is encountered. |
| int read(byte buffer[ ], int offset, int numBytes) throws IOException | Attempts to read up to numBytes bytes into buffer starting at buffer[offset], returning the number of bytes successfully read. –1 is returned when the end of the file is encountered. |
| void reset( ) throws IOException | Resets the input pointer to the previously set mark. Not all streams support reset( ). |
| long skip(long numBytes) throws IOException | Ignores (that is, skips) numBytes bytes of input, returning the number of bytes actually ignored. |
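Several of the InputStream methods can be exercised with a short sketch. InputStreamMethods and explore( ) are hypothetical names. The example uses a ByteArrayInputStream as the input source because it supports mark( ) and reset( ); a stream that does not support them would simply skip that part.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo of available(), read(), mark(), skip(), and reset().
class InputStreamMethods {
  // Calls several InputStream methods and records what each returns.
  static String explore(InputStream in) throws IOException {
    StringBuilder log = new StringBuilder();
    log.append("available=").append(in.available());
    log.append(" first=").append(in.read());
    if (in.markSupported()) {
      in.mark(16);                                   // remember this position
      log.append(" skipped=").append(in.skip(2));    // jump over two bytes
      log.append(" next=").append(in.read());
      in.reset();                                    // return to the mark
      log.append(" afterReset=").append(in.read());
    }
    return log.toString();
  }

  public static void main(String[] args) {
    byte[] data = { 10, 20, 30, 40, 50 };
    try {
      System.out.println(explore(new ByteArrayInputStream(data)));
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```

For the five bytes shown, the program prints available=5 first=10 skipped=2 next=40 afterReset=20.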
Table 3.2 The Methods Specified by OutputStream
| Method | Description |
| void close( ) throws IOException | Closes the output stream. |
| void flush( ) throws IOException | Finalizes the output state so that any buffers are cleared. That is, it flushes the output buffers. |
| abstract void write(int b) throws IOException | Writes the low-order byte of b to the output stream. |
| void write(byte buffer[ ]) throws IOException | Writes a complete array of bytes to the output stream. |
| void write(byte buffer[ ], int offset, int numBytes) throws IOException | Writes a subrange of numBytes bytes from the array buffer, beginning at buffer[offset]. |
Table 3.3 The Byte Stream Classes
| Byte Stream Class | Description |
| BufferedInputStream | Buffered input stream. |
| BufferedOutputStream | Buffered output stream. |
| ByteArrayInputStream | Input stream that reads from a byte array. |
| ByteArrayOutputStream | Output stream that writes to a byte array. |
| DataInputStream | An input stream that contains methods for reading Java’s standard data types. |
| DataOutputStream | An output stream that contains methods for writing Java’s standard data types. |
| FileInputStream | Input stream that reads from a file. |
| FileOutputStream | Output stream that writes to a file. |
| FilterInputStream | Implements InputStream and allows the contents of another stream to be altered (filtered). |
| FilterOutputStream | Implements OutputStream and allows the contents of another stream to be altered (filtered). |
| InputStream | Abstract class that describes stream input. |
| OutputStream | Abstract class that describes stream output. |
| PipedInputStream | Input pipe. |
| PipedOutputStream | Output pipe. |
| PrintStream | Output stream that contains print( ) and println( ). |
| PushbackInputStream | Input stream that allows bytes to be returned to the stream. |
| RandomAccessFile | Supports random-access file I/O. |
| SequenceInputStream | Input stream that is a combination of two or more input streams that will be read sequentially, one after the other. |
A superclass of FileReader is InputStreamReader. It translates bytes into characters.
A superclass of FileWriter is OutputStreamWriter. It translates characters into bytes. These
classes are necessary because all files are, at their foundation, byte-oriented.
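The translation between characters and bytes can be observed directly with a small sketch. CharsetBridge, encode( ), and decode( ) are hypothetical names, and the explicit UTF-8 encoding is an assumption made for the example. In-memory byte arrays stand in for a file so that the byte count is easy to inspect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical demo: characters become bytes on output and bytes become
// characters on input.
class CharsetBridge {
  // Translates characters into UTF-8 bytes.
  static byte[] encode(String text) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    Writer out = new OutputStreamWriter(bytes, "UTF-8");
    out.write(text);
    out.close(); // flushes any characters still held by the encoder
    return bytes.toByteArray();
  }

  // Translates UTF-8 bytes back into characters.
  static String decode(byte[] data) throws IOException {
    Reader in = new InputStreamReader(new ByteArrayInputStream(data), "UTF-8");
    StringBuilder text = new StringBuilder();
    int ch;
    while ((ch = in.read()) != -1) {
      text.append((char) ch);
    }
    in.close();
    return text.toString();
  }

  public static void main(String[] args) {
    try {
      String text = "h\u00e9llo"; // five characters, one of them non-ASCII
      byte[] data = encode(text);
      System.out.println("Characters: " + text.length() + ", bytes: " + data.length);
      System.out.println("Round trip intact: " + text.equals(decode(data)));
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```

The five-character string occupies six bytes in UTF-8, which shows that a character count and a byte count are not interchangeable.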
The RandomAccessFile Class
The stream classes just described operate on files in a strictly sequential fashion. However, Java also allows you to access the contents of a file in non-sequential order. To do this, you
will use RandomAccessFile, which encapsulates a random-access file. RandomAccessFile
is not derived from InputStream or OutputStream. Instead, it implements the interfaces DataInput and DataOutput (which are described shortly). RandomAccessFile supports
random access because it lets you change the location in the file at which the next read or
write operation will occur. This is done by calling its seek( ) method.
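Here is a minimal sketch of non-sequential access. SeekDemo and writeThenReadAt( ) are hypothetical names, and try-with-resources assumes Java 7 or later. Each int occupies four bytes, so the value at a given index begins at byte offset 4 * index.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical demo: jump directly to one value inside a file.
class SeekDemo {
  // Writes the values as 4-byte ints, then reads back the one at index.
  static int writeThenReadAt(File file, int[] values, int index) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
      for (int value : values) {
        raf.writeInt(value);
      }
      raf.seek(4L * index); // move the file pointer to the desired int
      return raf.readInt();
    }
  }

  public static void main(String[] args) {
    try {
      File temp = File.createTempFile("random", ".dat");
      temp.deleteOnExit();
      int[] values = { 100, 200, 300, 400 };
      System.out.println("Value at index 2: " + writeThenReadAt(temp, values, 2));
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```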
Table 3.4 The Methods Defined by Reader
| Method | Description |
| abstract void close( ) throws IOException | Closes the input source. |
| void mark(int numChars) throws IOException | Places a mark at the current point in the input stream that will remain valid until numChars characters are read. Not all streams support mark( ). |
| boolean markSupported( ) | Returns true if mark( )/reset( ) are supported on this stream. |
| int read( ) throws IOException | Returns an integer representation of the next available character from the input stream. –1 is returned when the end of the file is encountered. |
| int read(char buffer[ ]) throws IOException | Attempts to read up to buffer.length characters into buffer and returns the actual number of characters that were successfully read. –1 is returned when the end of the file is encountered. |
| abstract int read(char buffer[ ], int offset, int numChars) throws IOException | Attempts to read up to numChars characters into buffer starting at buffer[offset], returning the number of characters successfully read. –1 is returned when the end of the file is encountered. |
| boolean ready( ) throws IOException | Returns true if input is pending. Otherwise, it returns false. |
| void reset( ) throws IOException | Resets the input pointer to the previously set mark. Not all streams support reset( ). |
| long skip(long numChars) throws IOException | Skips over numChars characters of input, returning the number of characters actually skipped. |
The File Class
In addition to the classes that support file I/O, Java provides the File class, which encapsulates information about a file. This class is extremely useful when manipulating a file itself (rather
than its contents) or the file system of the computer. For example, using File you can determine
if a file is hidden, set a file’s date, set a file to read-only, list the contents of a directory, or create
a new directory, among many other things. Thus, File puts the file system under your control.
This makes File one of the most important classes in Java’s I/O system.
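A few of these capabilities are shown in the sketch below. FileInfoDemo and describe( ) are hypothetical names. The program works on a temporary file so that it does not disturb existing data.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical demo: query and change properties of a file itself.
class FileInfoDemo {
  // Summarizes a few attributes of the given file or directory.
  static String describe(File f) {
    return f.getName() + " [exists=" + f.exists()
        + ", directory=" + f.isDirectory()
        + ", hidden=" + f.isHidden() + "]";
  }

  public static void main(String[] args) {
    try {
      File file = File.createTempFile("notes", ".txt");
      System.out.println(describe(file));

      // Change an attribute, then query it.
      System.out.println("Set read-only succeeded: " + file.setReadOnly());
      System.out.println("Can write now: " + file.canWrite());

      // List the directory that holds the file.
      File dir = file.getParentFile();
      String[] names = dir.list(); // null if dir cannot be listed
      System.out.println(dir + " holds "
          + (names == null ? 0 : names.length) + " entries");

      // Clean up: restore write access, then delete the file.
      file.setWritable(true);
      file.delete();
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```

The value reported by canWrite( ) depends on the platform and on the privileges of the user running the program, so the output is not shown here.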
Table 3.5 The Methods Defined by Writer
| Method | Description |
| Writer append(char ch) throws IOException | Appends ch to the end of the invoking output stream. Returns a reference to the stream. |
| Writer append(CharSequence chars) throws IOException | Appends chars to the end of the invoking output stream. Returns a reference to the stream. |
| Writer append(CharSequence chars, int begin, int end) throws IOException | Appends a subrange of chars, specified by begin and end, to the end of the output stream. Returns a reference to the stream. |
| abstract void close( ) throws IOException | Closes the output stream. |
| abstract void flush( ) throws IOException | Finalizes the output state so that any buffers are cleared. That is, it flushes the output buffers. |
| void write(int ch) throws IOException | Writes the character in the low-order 16 bits of ch to the output stream. |
| void write(char buffer[ ]) throws IOException | Writes a complete array of characters to the output stream. |
| abstract void write(char buffer[ ], int offset, int numChars) throws IOException | Writes a subrange of numChars characters from the array buffer, beginning at buffer[offset], to the output stream. |
| void write(String str) throws IOException | Writes str to the output stream. |
| void write(String str, int offset, int numChars) throws IOException | Writes a subrange of numChars characters from the string str, beginning at the specified offset. |
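Because each append( ) method returns a reference to the stream, calls can be chained. The following sketch shows this along with two forms of write( ). WriterAppendDemo and build( ) are hypothetical names, and a StringWriter is used as the destination so that the result can be examined.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical demo of the append() and write() methods of Writer.
class WriterAppendDemo {
  // Builds a short string using several Writer methods.
  static String build() throws IOException {
    Writer out = new StringWriter();
    // append() returns the Writer, so the calls can be chained.
    // The last call appends characters 0 through 7 of "handling!".
    out.append("File").append(' ').append("handling!", 0, 8);
    out.write(" I/O", 0, 4); // write four characters starting at offset 0
    out.write('.');          // write a single character
    out.flush();
    return out.toString();
  }

  public static void main(String[] args) {
    try {
      System.out.println(build());
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```

The program displays File handling I/O. as its output.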
The I/O Interfaces
Java’s I/O system includes the following interfaces (which are packaged in java.io):
| Closeable | DataInput | DataOutput |
| Externalizable | FileFilter | FilenameFilter |
| Flushable | ObjectInput | ObjectInputValidation |
| ObjectOutput | ObjectStreamConstants | Serializable |
Those used either directly or indirectly by the recipes in this chapter are DataInput,
DataOutput, Closeable, Flushable, FileFilter, FilenameFilter, ObjectInput, and
ObjectOutput.
The DataInput and DataOutput interfaces define a variety of read and write methods,
such as readInt( ) and writeDouble( ), that can read and write Java’s primitive data types.
They also specify read( ) and write( ) methods that parallel those specified by InputStream
and OutputStream. All operations are byte-oriented. RandomAccessFile implements the
DataInput and the DataOutput interfaces. Thus, random-access file operations in Java are
byte-oriented.
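The byte-oriented nature of these interfaces can be seen in the following sketch. DataRoundTrip and encode( ) are hypothetical names. DataOutputStream and DataInputStream serve as the implementations of DataOutput and DataInput, and an in-memory byte array takes the place of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo: write primitive values as bytes, then read them back.
class DataRoundTrip {
  // Encodes an int followed by a double into their binary form.
  static byte[] encode(int count, double price) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutput out = new DataOutputStream(bytes);
    out.writeInt(count);     // 4 bytes
    out.writeDouble(price);  // 8 bytes
    return bytes.toByteArray();
  }

  public static void main(String[] args) {
    try {
      byte[] data = encode(7, 2.5);
      System.out.println("Encoded size: " + data.length + " bytes");

      // Values must be read in the same order in which they were written.
      DataInput in = new DataInputStream(new ByteArrayInputStream(data));
      System.out.println("count=" + in.readInt() + ", price=" + in.readDouble());
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```

The encoded form is 12 bytes long: four for the int and eight for the double.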
Table 3.6 The Character Stream Classes
| Character Stream Class | Description |
| BufferedReader | Buffered input character stream. |
| BufferedWriter | Buffered output character stream. |
| CharArrayReader | Input stream that reads from a character array. |
| CharArrayWriter | Output stream that writes to a character array. |
| FileReader | Input stream that reads from a file. |
| FileWriter | Output stream that writes to a file. |
| FilterReader | Filtered reader. |
| FilterWriter | Filtered writer. |
| InputStreamReader | Input stream that translates bytes to characters. |
| LineNumberReader | Input stream that counts lines. |
| OutputStreamWriter | Output stream that translates characters to bytes. |
| PipedReader | Input pipe. |
| PipedWriter | Output pipe. |
| PrintWriter | Output stream that contains print( ) and println( ). |
| PushbackReader | Input stream that allows characters to be returned to the input stream. |
| Reader | Abstract class that describes character stream input. |
| StringReader | Input stream that reads from a string. |
| StringWriter | Output stream that writes to a string. |
| Writer | Abstract class that describes character stream output. |
The Closeable and Flushable interfaces are implemented by several of the I/O classes. They provide a uniform way of specifying that a stream can be closed or flushed. The
Closeable interface defines only one method, close( ), which is shown here:
void close( ) throws IOException
This method closes an open stream. Once closed, the stream cannot be used again. All
I/O classes that open a stream implement Closeable.
The Flushable interface also specifies only one method, flush( ), which is shown here:
void flush( ) throws IOException
Calling flush( ) causes any buffered output to be physically written to the underlying device. This interface is implemented by the I/O classes that write to a stream.
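The effect of flush( ) can be observed with a small sketch. FlushCloseDemo and bufferedLengths( ) are hypothetical names. A BufferedWriter is wrapped around a StringWriter so that the destination can be inspected before and after the flush. The example assumes that the text is shorter than the BufferedWriter's internal buffer.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.Flushable;
import java.io.IOException;
import java.io.StringWriter;

// Hypothetical demo: buffered output reaches its target only when flushed.
class FlushCloseDemo {
  // Returns the target's length before and after the call to flush().
  static int[] bufferedLengths(String text) throws IOException {
    StringWriter target = new StringWriter();
    BufferedWriter out = new BufferedWriter(target);
    out.write(text);

    int before = target.toString().length(); // text is still in the buffer

    Flushable flushable = out; // any Flushable can be handled this way
    flushable.flush();
    int after = target.toString().length();  // text has now been delivered

    Closeable closeable = out; // any Closeable can be handled this way
    closeable.close();
    return new int[] { before, after };
  }

  public static void main(String[] args) {
    try {
      int[] lengths = bufferedLengths("abc");
      System.out.println("Before flush: " + lengths[0]
          + ", after flush: " + lengths[1]);
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```

For the string "abc", the target holds 0 characters before the flush and 3 afterward.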
FileFilter and FilenameFilter are used to filter directory listings.
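As a brief illustration, the sketch below uses a FilenameFilter to list only the files whose names end in .txt. TextFileFilter, FilterDemo, and listTextFiles( ) are hypothetical names, and the choice of extension is arbitrary.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Arrays;

// Hypothetical filter that accepts only names ending in ".txt".
class TextFileFilter implements FilenameFilter {
  public boolean accept(File dir, String name) {
    return name.endsWith(".txt");
  }
}

// Hypothetical demo: obtain a filtered directory listing.
class FilterDemo {
  // Returns the sorted names of the .txt files in dir.
  static String[] listTextFiles(File dir) {
    String[] names = dir.list(new TextFileFilter());
    if (names == null) {
      return new String[0]; // dir is not a directory or cannot be read
    }
    Arrays.sort(names);     // list() does not guarantee any order
    return names;
  }

  public static void main(String[] args) {
    // List the current directory unless another one is named.
    File dir = new File(args.length > 0 ? args[0] : ".");
    for (String name : listTextFiles(dir)) {
      System.out.println(name);
    }
  }
}
```

A FileFilter works in a similar way, except that its accept( ) method receives a File object and is used with listFiles( ).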
The ObjectInput and ObjectOutput interfaces are used when serializing (saving and restoring) objects.
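A minimal sketch of saving and restoring an object follows. Score, SerializeDemo, and roundTrip( ) are hypothetical names. ObjectOutputStream and ObjectInputStream supply the implementations of ObjectOutput and ObjectInput, and a byte array stands in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class whose objects can be serialized.
class Score implements Serializable {
  private static final long serialVersionUID = 1L;
  final String player;
  final int points;

  Score(String player, int points) {
    this.player = player;
    this.points = points;
  }
}

// Hypothetical demo: save an object as bytes, then restore it.
class SerializeDemo {
  // Returns a copy of obj that was rebuilt from its serialized form.
  static Object roundTrip(Serializable obj)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutput out = new ObjectOutputStream(bytes);
    out.writeObject(obj);
    out.close();

    ObjectInput in = new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray()));
    try {
      return in.readObject();
    } finally {
      in.close();
    }
  }

  public static void main(String[] args) {
    try {
      Score copy = (Score) roundTrip(new Score("Ann", 12));
      System.out.println(copy.player + " scored " + copy.points);
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    } catch (ClassNotFoundException exc) {
      System.out.println("Class not found: " + exc);
    }
  }
}
```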
The Compressed File Streams
In java.util.zip, Java provides a very powerful set of specialized file streams that handle the compression and decompression of data. All are subclasses of either InputStream or OutputStream, described earlier. The compressed file streams are shown here:
| DeflaterInputStream | Reads data, compressing the data in the process. |
| DeflaterOutputStream | Writes data, compressing the data in the process. |
| GZIPInputStream | Reads a GZIP file. |
| GZIPOutputStream | Writes a GZIP file. |
| InflaterInputStream | Reads data, decompressing the data in the process. |
| InflaterOutputStream | Writes data, decompressing the data in the process. |
| ZipInputStream | Reads a ZIP file. |
| ZipOutputStream | Writes a ZIP file. |
Using the compressed file streams, it is possible to automatically compress data while
writing to a file or to automatically decompress data when reading from a file. You can also
create compressed files that are compatible with the standard ZIP or GZIP formats, and you
can decompress files in those formats.
The actual compression is provided by the Inflater and Deflater classes, also packaged
in java.util.zip. They use the ZLIB compression library. You won’t usually need to deal with
these classes directly when compressing or decompressing files because their default
operation is sufficient.
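The following sketch compresses and then decompresses data in the GZIP format. GzipRoundTrip, compress( ), and decompress( ) are hypothetical names. To keep the example self-contained, byte arrays are used as the source and target; wrapping a FileOutputStream or FileInputStream instead would apply the same technique to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo: compress while writing, decompress while reading.
class GzipRoundTrip {
  // Returns the GZIP-compressed form of data.
  static byte[] compress(byte[] data) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    GZIPOutputStream gz = new GZIPOutputStream(sink);
    try {
      gz.write(data);
    } finally {
      gz.close(); // completes the compressed output
    }
    return sink.toByteArray();
  }

  // Returns the original data recovered from its GZIP form.
  static byte[] decompress(byte[] compressed) throws IOException {
    GZIPInputStream gz =
        new GZIPInputStream(new ByteArrayInputStream(compressed));
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try {
      byte[] buffer = new byte[1024];
      int n;
      while ((n = gz.read(buffer)) != -1) {
        sink.write(buffer, 0, n);
      }
    } finally {
      gz.close();
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) {
    try {
      byte[] original = new byte[1000];
      Arrays.fill(original, (byte) 'a'); // highly repetitive data
      byte[] packed = compress(original);
      byte[] restored = decompress(packed);
      System.out.println("Original: " + original.length
          + " bytes, compressed: " + packed.length + " bytes");
      System.out.println("Restored intact: " + Arrays.equals(original, restored));
    } catch (IOException exc) {
      System.out.println("I/O error: " + exc);
    }
  }
}
```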
Tips For Handling Errors
File I/O poses a special challenge when it comes to error handling. There are two reasons
for this. First, I/O failures are a very real possibility when reading or writing files. Despite
the fact that computer hardware (and the Internet) is much more reliable than in the past, it
still fails at a fairly high rate, and any such failure must be handled in a manner consistent
with the needs of your application. The second reason that error handling presents a
challenge when working with files is that nearly all file operations can generate one or more
exceptions. This means that nearly all file handling code must take place within a try block.
The most common I/O exception is IOException. This exception can be thrown by
many of the constructors and methods in the I/O system. As a general rule, it is generated
when something goes wrong when reading or writing data, or when opening a file. Other
common I/O-related exceptions, such as FileNotFoundException and ZipException, are
subclasses of IOException.
There is another common exception related to file handling: SecurityException. Many constructors or methods will throw a SecurityException if the invoking application does
not have permission to access a file or perform a specific operation. You will need to handle
this exception in a manner appropriate to your application. For simplicity, the examples in
this chapter do not handle security exceptions, but it may be necessary for your applications
to do so.
Because so many constructors and methods can generate an IOException, it is not
uncommon to see code that simply wraps all I/O operations within a single try block and
then catches any IOException that may occur. While adequate for experimenting with file
I/O or possibly for simple utility programs that are for your own personal use, this approach
is not usually suitable for commercial code. This is because it does not let you easily deal
individually with each potential error. Instead, for detailed control, it is better to put each
operation within its own try block. This way, you can precisely report and respond to the
error that occurred. This is the approach demonstrated by the examples in this chapter.
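In outline, placing each operation in its own try block might look like the following sketch. SeparateTryBlocks and readFirstByte( ) are hypothetical names. The method returns a message rather than displaying it so that each outcome is easy to distinguish.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical demo: open, read, and close are handled individually.
class SeparateTryBlocks {
  // Reads the first byte of a file and reports precisely what happened.
  static String readFirstByte(String fileName) {
    FileInputStream fin;

    // Step 1: open the file.
    try {
      fin = new FileInputStream(fileName);
    } catch (FileNotFoundException exc) {
      return "Cannot open " + fileName;
    }

    // Step 2: read from the file.
    String result;
    try {
      int b = fin.read();
      result = (b == -1) ? "File is empty" : "First byte: " + b;
    } catch (IOException exc) {
      result = "Error reading " + fileName;
    }

    // Step 3: close the file, even if the read failed.
    try {
      fin.close();
    } catch (IOException exc) {
      result = "Error closing " + fileName;
    }
    return result;
  }

  public static void main(String[] args) {
    if (args.length != 1) {
      System.out.println("Usage: java SeparateTryBlocks file");
      return;
    }
    System.out.println(readFirstByte(args[0]));
  }
}
```

Because the open, read, and close steps have separate handlers, the message identifies the step that failed.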
Another way that IOExceptions are sometimes handled is by throwing them out of the
method in which they occur. To do this you must include a throws IOException clause in
the method’s declaration. This approach is fine in some cases, because it reports an I/O
failure back to the caller. However, in other situations it is a dissatisfying shortcut because it
causes all users of the method to handle the exception. The examples in this chapter do not
use this approach. Rather, they handle all IOExceptions explicitly. This allows each error
handler to report precisely the error that occurred.
If you do handle IOExceptions by throwing them out of the method in which they occur,
you must take extra care to close any files that have been opened by the method. The easiest
way to do this is to wrap your method’s code in a try block and then use a finally clause to
close the file(s) before the method returns.
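For comparison, here is a sketch of a method that throws IOException to its caller but still closes the file by using a finally clause. ThrowingReader and sumBytes( ) are hypothetical names.

```java
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical demo: pass IOException to the caller, but always close.
class ThrowingReader {
  // Adds up the byte values in a file. Any I/O error goes to the caller.
  static int sumBytes(String fileName) throws IOException {
    FileInputStream fin = new FileInputStream(fileName);
    try {
      int sum = 0;
      int b;
      while ((b = fin.read()) != -1) {
        sum += b;
      }
      return sum;
    } finally {
      // Executes whether the try block returns normally or read() throws.
      fin.close();
    }
  }

  public static void main(String[] args) {
    if (args.length != 1) {
      System.out.println("Usage: java ThrowingReader file");
      return;
    }
    try {
      System.out.println("Sum of bytes: " + sumBytes(args[0]));
    } catch (IOException exc) {
      // The caller decides how to respond to the failure.
      System.out.println("I/O error: " + exc);
    }
  }
}
```

The file is opened before the try block. If the constructor fails, there is nothing to close, and the exception passes directly to the caller.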
In the examples in this chapter, any I/O exceptions that do occur are handled by simply
displaying a message. While this approach is acceptable for the example programs, real
applications will usually need to provide a more sophisticated response to an I/O error. For
example, you might want to give the user the ability to retry the operation, specify an
alternative operation, or otherwise gracefully handle the problem. Preventing the loss or
corruption of data is a primary goal. Part of being a great programmer is knowing how to
effectively manage the things that might go wrong when an I/O operation fails.
One final point: a common mistake that occurs when handling files is forgetting to close
a file when you are done with it. Open files use system resources. Thus, there are limits to the
number of files that can be open at any one time. Closing a file also ensures that any data
written to the file is actually written to the physical device. Therefore, the rule is very simple:
if you open a file, close the file. Although files are typically closed automatically when an
application ends, it’s best not to rely on this because it can lead to sloppy programming and
bad habits. It is better to explicitly close each file, properly handling any exceptions that
might occur. For this reason, all files are explicitly closed by the examples in this chapter,
even when the program is ending.