
Java Tutorial 8 - File IO

Modern program design uses a stream metaphor for information flow. Information streams from sources (inputs) through applications (programs) into sinks (outputs). Storage reservoirs (files) are used when needed. Java provides standard IO streams, file streams, data streams, object streams for complex structures such as records and trees, and pipe streams for communication between threads. The IO classes live in the java.io package, so import java.io.* (or import the individual classes) to use them.

Standard IO Streams

The standard input and output streams use the console by default but can be redirected by the operating system to programs, files, printers or other devices using the symbols | (ie pipe), <, > and >>. The standard error stream is redirected with 2> (Bourne Shell). Filter apps assume standard io streams and pipes.

Java applications can force redirection with the System.setIn(InputStream xx), System.setOut(PrintStream xx) and System.setErr(PrintStream xx) methods. The System.in, System.out and System.err objects read from and write to the standard streams. The method System.in.read() returns the next byte of input as an integer (0 to 255, which is the ASCII code for plain ASCII text) or -1 at end of input. It can throw an IOException. The output methods are print(string), println(string) and printf(format, object_list) [using C-like format syntax].

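public static void main (String args[]) throws java.io.IOException
{ int ch;
  System.out.println("Enter text: ");
  // parentheses are needed because != binds tighter than =
  while ((ch=System.in.read())!='\n' && ch!=-1) { // stop at newline or end of input
    System.out.println(ch); // shows it reads ASCII
    System.out.println((char)ch); }
}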
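
Redirection from inside a program can be sketched as follows. This is a minimal illustration rather than part of the original tutorial: the class name RedirectDemo and the helper capture() are hypothetical, and an in-memory buffer stands in for a file or other device.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class RedirectDemo {
    // Runs the task with System.out temporarily redirected into memory
    // and returns whatever the task printed. (Hypothetical helper.)
    static String capture(Runnable task) {
        PrintStream original = System.out;            // remember the console stream
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer, true)); // setOut() takes a PrintStream
        try {
            task.run();
        } finally {
            System.out.flush();
            System.setOut(original);                  // always restore the console
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String text = capture(() -> System.out.print("hello"));
        System.out.println("Captured: " + text);
    }
}
```

Restoring the original stream in a finally block keeps later output on the console even if the task throws.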

There are alternative ways to handle user interaction. Refer to the Scanner class for stdio streams, file IO for basic file management, or file choosers for visual file interfaces using Swing objects.

File Management

File management is the manner in which files are monitored and controlled for I/O access. The File class provides a constructor to create a file handle. This file handle is then used by various File class methods to access the properties of a specific file or by file stream constructors to open files. The File class also provides a platform-dependent directory and file separator symbol as either File.separator (a String) or File.separatorChar (a char). Simple file constructors either use hardcoded names or take a value from the command line such as:

File simple=new File("sample.dat"); // simple name in current dir
File path=new File("subfolder/sample.dat"); // using relative path
File hard=new File(File.separator + "sample.dat"); // in root dir
File soft=new File(args[0]); // entered as part of command line

Accessor methods: getAbsolutePath(), getCanonicalPath(), getName(), getPath(), getParent(), lastModified(), length(), list() [returns array of String], listFiles() [returns array of File objects].

Mutator methods: delete(), deleteOnExit(), mkdir(), mkdirs(), renameTo(), setLastModified(), setReadOnly().

Boolean methods: canRead(), canWrite(), exists(), isAbsolute(), isDirectory(), isFile(), isHidden(). The comparison method compareTo() returns an int rather than a boolean.

Here is a routine to establish the current directory path:
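
The original listing is not reproduced on this page, so the following is only a minimal sketch of one common approach, not the author's routine. The class name CurrentDir and the method currentDirectory() are hypothetical. It relies on the documented behaviour that an empty pathname resolves against the user.dir system property.

```java
import java.io.File;

class CurrentDir {
    // Returns the absolute path of the current working directory.
    // An empty abstract pathname is resolved against the "user.dir" property.
    static String currentDirectory() {
        return new File("").getAbsolutePath();
    }

    public static void main(String[] args) {
        System.out.println("Current directory: " + currentDirectory());
    }
}
```

getCanonicalPath() could be used instead when symbolic links and ".." elements need to be resolved, at the cost of handling an IOException.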

To limit what is returned by the list() method, apply a filter using the FilenameFilter interface. accept(File dir, String name) is the only method the interface declares. A program showing the use of FilenameFilter is:
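
The original program is not reproduced on this page. As a stand-in, here is a minimal sketch that lists only the files with a given extension. The class name ListByExtension and the helper endsWith() are hypothetical, and the ".java" extension is just an example.

```java
import java.io.File;
import java.io.FilenameFilter;

class ListByExtension {
    // Builds a filter that accepts only names ending in the given extension.
    static FilenameFilter endsWith(final String extension) {
        return new FilenameFilter() {
            @Override
            public boolean accept(File dir, String name) {
                return name.toLowerCase().endsWith(extension.toLowerCase());
            }
        };
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dir.list(endsWith(".java"));
        if (names == null) {              // list() returns null for a non-directory
            System.err.println("Not a directory: " + dir);
            return;
        }
        for (String name : names) {
            System.out.println(name);
        }
    }
}
```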

Note: More sophisticated GUI techniques for selecting files include the Swing JFileChooser class and its AWT cousin FileDialog. These classes also include file filters, checks for existence and the ability to access the data.

File Streams

File streams are primitive streams whose sources or destinations are files. Both byte [8-bit] (FileInputStream / FileOutputStream) and character [16-bit] (FileReader/FileWriter) datatypes can be used. Design new apps with character streams to allow Unicode material to be processed correctly. The constructors are:

FileReader streamer=new FileReader(fileObj)
FileWriter streamer=new FileWriter(filePath[,append_flag]) // only output streams take an append flag

Streams are opened when constructed. Always use close() on opened output files to make sure that file buffers are completely written. The read() method returns an int: the byte or character that was read, or -1 at end of file.
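
A character-stream copy built from these pieces might look like the sketch below. It is an illustration only: the class name CopyChars and its copy() method are hypothetical, and the unbuffered one-character-at-a-time loop is chosen for clarity, not speed.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

class CopyChars {
    // Copies a text file one character at a time and returns the character count.
    static long copy(String from, String to) throws IOException {
        long count = 0;
        // try-with-resources calls close() on both streams, flushing the writer
        try (FileReader in = new FileReader(from);
             FileWriter out = new FileWriter(to)) { // new FileWriter(to, true) would append
            int ch;
            while ((ch = in.read()) != -1) {        // -1 signals end of file
                out.write(ch);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CopyChars source target");
            return;
        }
        System.out.println(copy(args[0], args[1]) + " characters copied");
    }
}
```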

Oracle's tutorial provides a CopyBytes app that shows how to use file streams. However there are more efficient ways of handling most types of data. File streams should be wrapped and buffered in data streams. Refer to fileCopy() for a very efficient network file duplication method.

Data Streams

Data streams are streams whose sources and destinations are other streams. They are known as wrappers because they wrap the primitive file stream object mechanism inside a more powerful one.

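InputStreamReader() and OutputStreamWriter() are bridges between byte streams and character streams. They decode bytes into Unicode characters, or encode characters into bytes, using either a named character encoding or the platform default.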
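
As a rough sketch of that conversion, the round trip below encodes a string to UTF-8 bytes and decodes it again. The class name CharsetBridge and both helper methods are hypothetical, and in-memory byte arrays stand in for files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class CharsetBridge {
    // Encodes characters into UTF-8 bytes through an OutputStreamWriter.
    static byte[] encodeUtf8(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            out.write(text);
        }
        return bytes.toByteArray();
    }

    // Decodes UTF-8 bytes back into characters through an InputStreamReader.
    static String decodeUtf8(byte[] bytes) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            int ch;
            while ((ch = in.read()) != -1) {
                text.append((char) ch);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = encodeUtf8("caf\u00e9");
        System.out.println(encoded.length + " bytes, decoded: " + decodeUtf8(encoded));
    }
}
```

Naming the encoding explicitly avoids surprises when the same file is read on a platform with a different default.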

DataInputStream() and DataOutputStream() streams can be used to read/write primitive data types. Some useful methods are: read(), readXX(), write() and writeXX() where XX is a primitive data type. Data streams can also be buffered so that more than a single 8/16 bit quantity is processed at a time. The basic buffered streams are BufferedInputStream(), BufferedOutputStream(), BufferedReader() and BufferedWriter().

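Note: Java does not provide an EOF() method. When an EOF event occurs read() returns an integer -1 and readLine() returns null, so test for those values. The readXX() methods of DataInputStream have no spare value to signal end of file, so they throw an EOFException instead. Use an exception handler to catch the EOFException and handle it explicitly.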
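
The sketch below shows a buffered data stream being written and then read back until EOFException ends the loop. It is a minimal illustration under assumed names: the class DataStreamDemo and the methods writeInts() and sumInts() are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class DataStreamDemo {
    // Writes each value as a 4-byte int through a buffered data stream.
    static void writeInts(File file, int... values) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            for (int v : values) {
                out.writeInt(v);
            }
        }
    }

    // Reads ints until the data runs out and returns their sum.
    static int sumInts(File file) throws IOException {
        int sum = 0;
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            while (true) {
                sum += in.readInt();   // throws EOFException at end of file
            }
        } catch (EOFException endOfFile) {
            // expected: this is how readInt() reports end of file
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("ints", ".dat");
        file.deleteOnExit();
        writeInts(file, 1, 2, 3);
        System.out.println("Sum: " + sumInts(file));
    }
}
```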

copyline.java uses buffered 16-bit data streams to read and copy lines to a new file. Many utilities rely on text files which are often best handled one line at a time. Set STRIP=true to remove blank lines from file. Use copyline as a start point for your own utility by altering the lines between the readLine() and write() methods.
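
The general shape of such a line-copy utility can be sketched as below. This is not the author's copyline.java: the class name CopyLinesSketch and its copy() method are hypothetical, and only the STRIP flag is borrowed from the description above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

class CopyLinesSketch {
    static final boolean STRIP = true;   // true removes blank lines

    // Copies text line by line and returns the number of lines written.
    static int copy(Reader source, Writer target) throws IOException {
        BufferedReader in = new BufferedReader(source);
        BufferedWriter out = new BufferedWriter(target);
        int copied = 0;
        String line;
        while ((line = in.readLine()) != null) {   // null signals end of file
            if (STRIP && line.trim().isEmpty()) {
                continue;                          // skip blank lines
            }
            // per-line processing for your own utility would go here
            out.write(line);
            out.newLine();
            copied++;
        }
        out.flush();                               // push buffered text to the target
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CopyLinesSketch source target");
            return;
        }
        try (Reader in = new FileReader(args[0]);
             Writer out = new FileWriter(args[1])) {
            System.out.println(copy(in, out) + " lines copied");
        }
    }
}
```

Taking a Reader and a Writer rather than file names lets the same method work on files, strings or network streams.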

Stream Tokenization

The StreamTokenizer class can be used to read tokens directly from a file stream! This makes some utilities more efficient because they can work with individual tokens rather than lines of text (ie. the line has already been parsed). The constructor requires a Reader object (such as a FileReader) as a parameter. resetSyntax() clears the default character classes so that custom delimiters can be set. Whitespace is defined with the whitespaceChars(iStart,iEnd) method. Valid word characters are defined with the wordChars(iStart,iEnd) method. Unfortunately this selection by range limits usefulness of the class! The eolIsSignificant(true) method allows newlines to be detected. The ttype field contains the type of token scanned by nextToken(). The token scanned sits in either nval (numeric) or sval (string). copytoken.java uses a tokenized 16-bit token stream to read and copy tokens to a new file.
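
A minimal sketch of the token loop follows. It is not the author's copytoken.java: the class name TokenDemo and the describe() method are hypothetical, and the default syntax table is used apart from making end of line significant.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

class TokenDemo {
    // Labels each token with its type, separated by single spaces.
    static String describe(Reader source) throws IOException {
        StreamTokenizer st = new StreamTokenizer(source);
        st.eolIsSignificant(true);               // report newlines as tokens
        StringBuilder sb = new StringBuilder();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            switch (st.ttype) {
                case StreamTokenizer.TT_WORD:
                    sb.append("word:").append(st.sval);    // text sits in sval
                    break;
                case StreamTokenizer.TT_NUMBER:
                    sb.append("num:").append(st.nval);     // numbers sit in nval (a double)
                    break;
                case StreamTokenizer.TT_EOL:
                    sb.append("eol");
                    break;
                default:
                    sb.append("char:").append((char) st.ttype); // ordinary character
            }
            sb.append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describe(new StringReader("total 42\nok")));
    }
}
```

Note that nval is always a double, so integer input such as 42 comes back as 42.0.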

Note: Whitespace is minimized by tokenization which makes it a great method for compressing HTML source files into a server copy. Use copytoken as a start point and add your own utility between the read and write operations.

Scanner class objects are constructed with a stream as the parameter (eg. Scanner(inStream) or Scanner(System.in)). The class has the methods: next() and nextXxx() [where Xxx is a primitive type like Int], hasNext(), hasNextXxx(), useDelimiter(reg_exp), useRadix(int) [defaults to base 10], findInLine(reg_exp) and skip(). If nextXxx() gets a token that does not match the Xxx type, it throws an InputMismatchException.

Scanner input=new Scanner (System.in);  // set up stream
System.out.println("Enter positive integer:"); // prompt
int num=input.nextInt();   // fetch response

Random Access Files

Random access files allow files to be accessed at a specific point in the file. They can also be opened in read/write mode which allows updating of a current file. The constructor is RandomAccessFile(File fileObject, String accessMethod) where the access method is usually either "r" or "rw". The seek(long position) method moves the file position pointer. The pointer is advanced automatically on each read or write. The getFilePointer() method returns the current file position pointer. The file size can be adjusted with setLength(). Normal i/o methods are used for access.
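
Seeking to a record and updating it in place can be sketched as follows. The class name RandomAccessDemo, its methods and the fixed 4-byte record layout are all assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class RandomAccessDemo {
    static final int RECORD_SIZE = 4;   // assumed layout: one int per record

    // Replaces the file contents with the given values.
    static void writeRecords(File file, int... values) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);            // truncate any old contents
            for (int v : values) {
                raf.writeInt(v);         // pointer advances 4 bytes per write
            }
        }
    }

    // Reads the record at the given index without touching the others.
    static int readRecord(File file, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek((long) index * RECORD_SIZE);
            return raf.readInt();
        }
    }

    // Overwrites one record in place.
    static void updateRecord(File file, int index, int value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek((long) index * RECORD_SIZE);
            raf.writeInt(value);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("records", ".dat");
        file.deleteOnExit();
        writeRecords(file, 10, 20, 30);
        updateRecord(file, 1, 99);
        System.out.println("Record 1 is now " + readRecord(file, 1));
    }
}
```

Fixed-size records are what make the position arithmetic this simple; variable-length records would need an index.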

copyrandom.java is a working random access file io system that uses byte streams to read and copy binary data to a new file. It demonstrates file creation and accessing but does not show either the ability to access specific points in the file or file updating.

mirror.java shows a very simple use of the seek() method. The source file is read backwards and each byte written to a new file. This is one of the simplest ways to scramble data, although it is obfuscation rather than secure encryption.
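
The backwards read can be sketched as below. This is not the author's mirror.java: the class name MirrorSketch and its mirror() method are hypothetical. One seek per byte is slow on large files, so treat it as a demonstration of seek() only.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;

class MirrorSketch {
    // Writes the bytes of source to target in reverse order.
    static void mirror(File source, File target) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(source, "r");
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target))) {
            for (long pos = in.length() - 1; pos >= 0; pos--) {
                in.seek(pos);            // jump to the byte we want
                out.write(in.read());    // read() then advances the pointer by one
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java MirrorSketch source target");
            return;
        }
        mirror(new File(args[0]), new File(args[1]));
    }
}
```

Running the program twice, feeding the output back in, restores the original file.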

RandomAccess.java is a more complete example that uses a graphical user interface (GUI) to alter file contents. Since it extends the GenericApplication class, that file must be compiled first. Once both files are compiled, test with java RandomAccess xxx where xxx is the filename.

Tutorial Source Code

Obtain source for Concord, concordance, mirror, RandomAccess here.



JR's HomePage | Comments [jatutor8.htm:2016 02 17]