
RandomAccessFile

Sequential Text Files

Files are used to store data on a disk drive. Without them, any data we typed in would be lost as soon as we shut down the computer. An easy way to create a data file is to open a text editor (e.g. Notepad) and type the data, like this bank-account file:

Arnie
12.25
Barbie
123.50
Chaplin
999.99
Dufus
0.01
Ernie
2500.00
Great, Alexander the
123456789.00
......

This is a sequential text file. It is called a text file because it is clearly readable, containing no special formatting or strange characters. It is sequential because a computer must read all the data in order when searching for something - there is no way to "jump around" in the file. Although this makes sequential text files inefficient, they are very flexible - you can type anything you want using any format. For example, there is no restriction on the size of a name or the size of a number. The example above is alphabetical, so a person looking for "Yoyo" would jump straight to the end. But if the file were not alphabetical, and contained all the words in a novel (in order), even a person would need to search sequentially to find a specific word. A computer always reads a text file sequentially: the lines have different lengths, so it cannot know where a particular line starts without reading everything in front of it.
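The sequential search described above can be sketched in Java as follows. This is only an illustration: the class name SequentialSearch and the method findMoney are invented for this example, and the sketch assumes the file always contains a name line followed by a money line.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;

class SequentialSearch {
    // Reads name/money line pairs from the top of the file until the name matches.
    // Returns the money, or NaN if the name is not in the file.
    static double findMoney(File textFile, String wanted) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(textFile))) {
            String name;
            while ((name = in.readLine()) != null) {
                String money = in.readLine();
                if (money == null) {
                    break; // the file ended in the middle of a record
                }
                if (name.equals(wanted)) {
                    return Double.parseDouble(money.trim());
                }
            }
        }
        return Double.NaN;
    }

    public static void main(String[] args) throws IOException {
        // Build a small sample file so the sketch runs on its own.
        File f = File.createTempFile("bank", ".txt");
        f.deleteOnExit();
        try (PrintWriter out = new PrintWriter(f)) {
            out.println("Arnie");
            out.println("12.25");
            out.println("Ernie");
            out.println("2500.00");
        }
        System.out.println(findMoney(f, "Ernie")); // every line before Ernie is read first
    }
}
```

To find Ernie, the loop has to read every line in front of Ernie's record. With millions of records that is a lot of reading for one answer.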

Random Access

If we are willing to use clear, inflexible structures in our files, we can make it possible for the computer to jump around in the file by counting bytes.

We break the file up into records - one for each customer. Then we decide what the maximum size of a record should be. For example, we can allocate 30 bytes (characters) for each name and 20 bytes for each number, giving each record a fixed size of 50 bytes.

Each record is divided up into fields - in this case, a NAME field of 30 bytes and a MONEY field of 20 bytes.

After making these decisions, we can say exactly where any record in the file starts. If we number the records from 0, record #750 starts at:

==> 750 * 50 = 37500 bytes

Assuming the computer can count bytes, it can jump to byte #37500 and then read that record, without reading the 750 records (#0 to #749) in front of it. This won't work in a text file, where the records and fields are variable length. It only works if we use fixed length fields and records.
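The position calculation is a single multiplication. Here is a minimal sketch, assuming 50-byte records numbered from 0 (the names RecordOffset and offsetOf are made up for this example):

```java
class RecordOffset {
    static final int RECORD_SIZE = 50; // 30-byte NAME field + 20-byte MONEY field

    // Byte position where a record starts, counting records from 0.
    static long offsetOf(long recordNumber) {
        return RECORD_SIZE * recordNumber;
    }

    public static void main(String[] args) {
        System.out.println(offsetOf(0));   // 0
        System.out.println(offsetOf(750)); // 37500
    }
}
```

The result is a long rather than an int, because byte positions in a large file can exceed the int range (about 2 billion).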

Jumping around in the file is potentially more efficient than reading sequentially. In addition to this efficiency in reading data, it also enables us to write data much more efficiently. For example, to change Ernie's money, the computer can jump directly to that spot in the file and write a new number there. In a text file, the only way to write new data is to copy all the data into arrays in the computer's memory, then change an item, and then write all the data back onto the disk.

The increased efficiency of Random Access becomes more important with large amounts of data. For example, there are 80 million people in Germany and probably an equal number of telephones. So the telephone customer database contains 80 million records. If one piece of data changes, copying 80 million records could require 50 x 80 million bytes, or approximately 4000 megabytes (4 GB) of memory. That might not fit into the memory at all, making it impossible to change the file if it is supposed to all be in an array at once.

RandomAccessFile

Java provides the RandomAccessFile class for creating and manipulating random-access data files. It does not specify the size of a record or the number of fields - this is controlled by the program.

SEEK

The most important command is seek, which jumps around in the file. Despite the name, this is not a search command. It simply jumps to a specific position in the file. This is not possible in a text file.
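A small experiment shows what seek does. The sketch below is not part of the Bank program. It fills a temporary file with ten double values (8 bytes each), then jumps straight to one of them to read it and to overwrite it. The class and method names are invented for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class SeekDemo {
    static final int DOUBLE_SIZE = 8; // writeDouble always writes 8 bytes

    // Fills the file with count doubles: 0.0, 1.5, 3.0, ...
    static void fill(File f, int count) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(f, "rw")) {
            file.setLength(0); // start with an empty file
            for (int i = 0; i < count; i++) {
                file.writeDouble(i * 1.5);
            }
        }
    }

    // Jumps directly to value number n (counting from 0) and reads it.
    static double readNth(File f, int n) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(f, "r")) {
            file.seek((long) DOUBLE_SIZE * n);
            return file.readDouble();
        }
    }

    // Jumps directly to value number n and replaces it, leaving the rest untouched.
    static void overwriteNth(File f, int n, double value) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(f, "rw")) {
            file.seek((long) DOUBLE_SIZE * n);
            file.writeDouble(value);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("seekdemo", ".dat"); // arbitrary temporary file
        f.deleteOnExit();
        fill(f, 10);
        System.out.println(readNth(f, 6)); // 9.0
        overwriteNth(f, 6, 100.0);
        System.out.println(readNth(f, 6)); // 100.0
    }
}
```

Neither readNth nor overwriteNth touches the other nine values, and the file stays 80 bytes long after the overwrite.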

READ and WRITE UTF

The input and output commands for Strings are .readUTF() and .writeUTF(String) . UTF stands for Unicode Transformation Format. If you want lots of details, try this link: http://en.wikipedia.org/wiki/UTF-8 . Java writes each String as a 2-byte length followed by the characters in a slightly modified form of UTF-8. UTF supports Unicode characters in a standard way, so lots of computer software can read and write UTF successfully.
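Here is a short round trip with these two commands, as a sketch on a temporary file (the class name UtfRoundTrip is invented for this example). It writes two Strings one after the other, jumps back to the start, and reads them again.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class UtfRoundTrip {
    // Writes two strings one after the other, then reads them back in the same order.
    static String[] roundTrip(File f, String first, String second) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(f, "rw")) {
            file.setLength(0);     // start with an empty file
            file.writeUTF(first);
            file.writeUTF(second);
            file.seek(0);          // back to the beginning
            String a = file.readUTF();
            String b = file.readUTF();
            return new String[] { a, b };
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("utfdemo", ".dat"); // arbitrary temporary file
        f.deleteOnExit();
        String[] result = roundTrip(f, "Arnie", "Barbie");
        System.out.println(result[0] + " / " + result[1]); // Arnie / Barbie
    }
}
```

readUTF knows where "Arnie" ends because writeUTF stored the length in front of the text. The second readUTF therefore starts exactly at "Barbie".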

FIELD SIZES

Since the file has fixed size fields, the program must control the size of the data before writing. If a program writes 50 characters into a 30 character field, the extra characters spill over into the next field or the next record and damage the data stored there. The RandomAccessFile methods will let you make this mistake - they do not check the size of the data. So your program must check the data size before writing.
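One possible way to do this check is to measure how many bytes writeUTF would produce and shorten the text until it fits. The sketch below does the measuring with a DataOutputStream, whose writeUTF uses the same format. The names FieldSize, utfSize and fit are invented for this example, and chopping characters off the end is just one policy; a real program might prefer to reject the input.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class FieldSize {
    // Number of bytes writeUTF produces for this text, including the 2-byte length prefix.
    static int utfSize(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        }
        return buffer.size();
    }

    // Drops characters from the end until the encoded text fits into fieldBytes.
    static String fit(String text, int fieldBytes) throws IOException {
        if (fieldBytes < 2) {
            throw new IllegalArgumentException("field is too small for the length prefix");
        }
        String result = text;
        while (utfSize(result) > fieldBytes) {
            result = result.substring(0, result.length() - 1);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(utfSize("Arnie"));  // 7 = 2 + 5
        System.out.println(utfSize("\u03c0")); // 4 = 2 + 2 (Greek letter pi needs 2 bytes)
        System.out.println(fit("Great, Alexander the", 12)); // Great, Ale
    }
}
```

Measuring bytes instead of counting characters matters as soon as the text contains characters that need more than one byte.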

SEEK FIRST

Programs should always seek to a specific position before reading or writing. If the program is writing two fields - name and money - into the file, there should be two seek commands, one before each write command. This is shown in the following sample program.

Bank Sample Program

Here is a sample program that writes and reads bank data in a RandomAccessFile.

//== Create a RandomAccessFile ==
// Creates a RandomAccessFile with names and salaries (money)
// The program allocates 50 bytes for each record -
//    42 bytes for the name field
//     8 bytes for the salary field
// The commands .writeUTF and .readUTF use the following system:
//    First two bytes tell the length of the following string,
//     and the following bytes contain the string (1 char per byte for plain English text)
// So there are 42 bytes for the name = 2 bytes for length, 40 bytes for data.
// The double value occupies 8 bytes.
// If shorter strings are recorded, the extra bytes are empty (wasted) and ignored.
//==================================================================================

import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import java.io.*;

public class Bank extends EasyApp

{ public static void main(String[] args)
   { new Bank(); }

   Button bWrite = addButton("Write Record",30,30,100,50,this);
   Button bRead   = addButton("Read Record",130,30,100,50,this);

   public void actionPerformed(ActionEvent evt)
   {
       Object source = evt.getSource();
       if (source == bWrite) { writeRecord();}
       if (source == bRead) { readRecord(); }
   }

   public void writeRecord()
   {
       // try-with-resources closes the file even if an exception is thrown
       try (RandomAccessFile file = new RandomAccessFile("bank.dat","rw"))
       {
           long pos = inputLong("Type the record number for saving the data:");

           String name = input("Type the customer's name:");
           // 40 characters fit the 42-byte field only if each char needs 1 byte
           if (name.length() > 40) { name = name.substring(0,40);}

           double money = inputDouble("Type the customer's money:");

           file.seek(50*pos);          // start of the record = name field
           file.writeUTF(name);
           file.seek(50*pos + 42);     // money field follows the 42-byte name field
           file.writeDouble(money);
       }
       catch (IOException ex)
       { output(ex.toString());
       }
   }

   public void readRecord()
   {
       // "r" = read only; throws FileNotFoundException if bank.dat does not exist
       try (RandomAccessFile file = new RandomAccessFile("bank.dat","r"))
       {
           long pos = inputLong("Type the record number for reading the data:");

           file.seek(50*pos);          // start of the record = name field
           String name = file.readUTF();
           file.seek(50*pos + 42);     // money field follows the 42-byte name field
           double money = file.readDouble();

           output("Record #" + pos + " = " + name + " : " + money);
       }
       catch (IOException ex) { output(ex.toString());}
   }
}

Notice the following details:

new RandomAccessFile
    Creates an object to use for accessing the file.
    In "rw" mode it creates a new, empty file if the file does not exist.
    In "r" mode a missing file causes a FileNotFoundException instead.
    You may need to use a complete path rather than just the file name.

readDouble, writeDouble
    Notice that the money is a double number, and the RandomAccessFile provides
    commands for reading and writing double numbers.

SEEK before each access
    Directly before any .read or .write command, there is a SEEK command.
    SEEK uses a long number to specify the byte location.
    SEEK requires a calculation for the byte number - normally like this:
        file.seek( recordSize * recordNumber + fieldOffset )

try..catch..
    File operations can throw an IOException, so they must be enclosed in a
    try..catch.. error handler (or the method must declare throws IOException).

File Size
    If you try to write a record using a large record number,
    the file will automatically expand to provide a space for the record.
    For example, if you write record #1 million, the file will expand to 50 MegaBytes.
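The automatic expansion is easy to observe. The following sketch uses the same 50-byte layout as the Bank program, but on a temporary file and with invented names (FileGrowth, writeAt). It writes a single record far beyond the end of an empty file and reports the resulting file size.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class FileGrowth {
    static final int RECORD_SIZE = 50;  // same layout as the Bank program
    static final int MONEY_OFFSET = 42; // money follows the 42-byte name field

    // Writes one record and returns the file size afterwards.
    static long writeAt(File f, long recordNumber, String name, double money) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(f, "rw")) {
            file.seek(RECORD_SIZE * recordNumber);
            file.writeUTF(name);
            file.seek(RECORD_SIZE * recordNumber + MONEY_OFFSET);
            file.writeDouble(money);
            return file.length();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("growth", ".dat"); // starts as an empty file
        f.deleteOnExit();
        System.out.println(writeAt(f, 1000, "Zed", 1.0)); // 50050 = 1000 * 50 + 50
    }
}
```

Records 0 to 999 were never written, yet the file now has room for them. Do not assume those records contain anything useful until the program has written them.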

Counting Bytes

Counting the bytes in a RandomAccessFile is tricky. The arithmetic is not so difficult (see SEEK above). The problem is knowing exactly how many bytes are actually used by various data types. The following chart shows the .write commands and the corresponding number of bytes required.

*write command*	*bytes occupied*
.writeInt(int)	4
.writeDouble(double)	8
.writeChar(char)	2 (this is not UTF)
.writeLong(long)	8
.writeByte(byte)	1
.writeFloat(float)	4
.writeBoolean(boolean)	1
.writeUTF(String)	String.length() + 2 bytes ***

In the sample Bank program, the name field is limited to 40 characters. But the program allows 42 bytes in the file. UTF Strings are written with a 2 byte prefix that tells how long the String is. So the UTF String actually occupies 42 bytes instead of 40.

*** Calculating UTF storage space is actually more complex. UTF does not always use 1 byte per character - it uses 1, 2, or 3 bytes per character, depending on the character. "Normal" English characters (those with ASCII codes below 128) require one byte per character. So the calculation above is fine as long as you have pure English language data. If the text might contain some Greek letters or special math symbols, then these characters will take more than one byte of storage each. If you are unsure and you don't mind wasting disk storage space, allocate 3 times as much space as you actually need, and you won't have any problems.
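If you do not trust the chart, you can let the computer count. The sketch below (class and method names invented for this example) writes one value with each command into a temporary file and uses getFilePointer() to see how far each write moved the file position.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

class ByteCount {
    // Returns the number of bytes written by each command, in the order of the calls below.
    static long[] measure(File f) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(f, "rw")) {
            file.setLength(0); // start with an empty file
            long[] end = new long[9]; // file position after each write
            file.writeInt(1);         end[0] = file.getFilePointer();
            file.writeDouble(2.5);    end[1] = file.getFilePointer();
            file.writeChar('A');      end[2] = file.getFilePointer();
            file.writeLong(3L);       end[3] = file.getFilePointer();
            file.writeByte(4);        end[4] = file.getFilePointer();
            file.writeFloat(5.5f);    end[5] = file.getFilePointer();
            file.writeBoolean(true);  end[6] = file.getFilePointer();
            file.writeUTF("Arnie");   end[7] = file.getFilePointer();
            file.writeUTF("\u20ac");  end[8] = file.getFilePointer(); // one Euro sign

            long[] sizes = new long[end.length];
            for (int i = 0; i < end.length; i++) {
                sizes[i] = end[i] - (i == 0 ? 0 : end[i - 1]);
            }
            return sizes;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("bytecount", ".dat"); // arbitrary temporary file
        f.deleteOnExit();
        System.out.println(Arrays.toString(measure(f))); // [4, 8, 2, 8, 1, 4, 1, 7, 5]
    }
}
```

The last two numbers show the UTF rule: "Arnie" needs 2 + 5 bytes, while a single Euro sign needs 2 + 3 bytes.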

In general, there is nothing wrong with allocating a bit of extra space. For example, if you are writing a 20 character String, an int and a double, you calculate:

==> (20 + 2) + 4 + 8 = 34

You can allocate 40 bytes per record (or even 50), in case you miscounted.
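As a sketch, such a record could be laid out like this. The field meanings (a name, an account number, the money) and all names in the code are assumptions made for this example. The String is assumed to contain plain English characters only. The three fields need (20 + 2) + 4 + 8 bytes, and each record gets 40 bytes.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class RecordLayout {
    static final int RECORD_SIZE   = 40; // more than the fields need, the rest is spare
    static final int NAME_OFFSET   = 0;  // writeUTF: 2-byte length + up to 20 bytes
    static final int NUMBER_OFFSET = 22; // writeInt: 4 bytes
    static final int MONEY_OFFSET  = 26; // writeDouble: 8 bytes

    static void write(RandomAccessFile file, long record, String name, int number, double money)
            throws IOException {
        if (name.length() > 20) { name = name.substring(0, 20); } // assumes 1 byte per char
        long start = RECORD_SIZE * record;
        file.seek(start + NAME_OFFSET);
        file.writeUTF(name);
        file.seek(start + NUMBER_OFFSET);
        file.writeInt(number);
        file.seek(start + MONEY_OFFSET);
        file.writeDouble(money);
    }

    static String read(RandomAccessFile file, long record) throws IOException {
        long start = RECORD_SIZE * record;
        file.seek(start + NAME_OFFSET);
        String name = file.readUTF();
        file.seek(start + NUMBER_OFFSET);
        int number = file.readInt();
        file.seek(start + MONEY_OFFSET);
        double money = file.readDouble();
        return name + " " + number + " " + money;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("layout", ".dat"); // arbitrary temporary file
        f.deleteOnExit();
        try (RandomAccessFile file = new RandomAccessFile(f, "rw")) {
            write(file, 3, "Barbie", 42, 123.5);
            write(file, 1, "Arnie", 7, 12.25);
            System.out.println(read(file, 3)); // Barbie 42 123.5
            System.out.println(read(file, 1)); // Arnie 7 12.25
        }
    }
}
```

Keeping the offsets in named constants means a miscounted field size only has to be corrected in one place.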

Practice - Add More Features

countCustomers()
    Read through the entire file and count the records that contain a name that isn't blank.
   Assume there are exactly 1000 records in the file, so record 999 is not blank.

showAllCustomers()
    Read through the entire file and print the name and money for each customer
    Assume there are exactly 1000 records in the file, so record 999 is not blank.

PIN number
Banks normally have security to prevent people from accessing other people's data. Often this uses a 4-digit PIN (Personal ID Number). To add a PIN to this database, each record must become larger. This can be written as a UTF String - that's the simplest way. Remember that this requires 6 bytes for 4 characters - 2 extra bytes for the String length.

Add the PIN code to the program so that it is required before each access. That means both reading and writing data should input the PIN code and check it against the code stored in the file.

Charges
Banks charge fees for various services.

monthlyFee
   Reads through the entire file and subtracts 50.00 from each customer as a monthly fee.
   If the customer has less than 50.00 EU, resulting in a negative balance,
     the method should print a warning message. This would print the account number,
     the customer's name and the current balance.

interest(double rate)
   Adds interest to each customer's money. For example, if rate is 0.5 (meaning 0.5%), it is calculated like this:
        newMoney = money * (1 + rate / 100);
     So   500 EU ---> 500 * (1.005) = 502.50

monthlyUpdate()
   Reads through the entire file, subtracts 50.00 for the monthly fee and then adds
   0.5% interest. Like this:

   Current balance = 500
   Subtract 50.00 = 450.00
   Add 0.5% = 452.25 = New Balance
   Write the new balance back into the file

Searching
The program is fine as long as the customers and the bank employees know the record number for each customer. Otherwise, it is impossible to access the correct record. For example, what if Madonna wants to put some money in the bank? Maybe she already has a record, so it would be silly to make a new one.

The program needs some searching methods. Write some of the following:

nameSearch(String customer)
    searches sequentially for the record containing a name matching customer

moneySearch(double min, double max)
   searches for all records where the money is between min and max
    For example, we could find all the rich people
   by running moneySearch(1000000,9e99);
