8.4 Serialization

Objects are serialized in a number of situations in Java. The two main reasons to serialize objects are to transfer objects and to store them.

There are several ways to improve the performance of serialization and deserialization. First, fields that are transient do not get serialized, saving both space and time. You can consider implementing readObject( ) and writeObject( ) (see java.io.Serializable documentation) to override the default serialization routine; it may be that you can produce a faster serialization routine for your specific objects. If you need this degree of control, you are better off using the java.io.Externalizable interface (the reason is illustrated shortly). Overriding the default serialization routine in this way is generally only worth doing for large or frequently serialized objects. The tight control this gives you may also be necessary to correctly handle canonicalized objects (to ensure objects remain canonical when deserializing them).
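As a minimal sketch of the first point, the two hypothetical classes below differ only in whether a cache field is marked transient; the helper reports how many bytes default serialization produces for each. The class and method names are illustrative assumptions, not part of the tuning tests in this section.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
  // Hypothetical class: every field is written by default serialization
  static class WithCache implements Serializable {
    int id = 42;
    byte[] cache = new byte[1024];
  }

  // Same fields, but the cache is skipped because it is transient
  static class WithTransientCache implements Serializable {
    int id = 42;
    transient byte[] cache = new byte[1024];
  }

  // Returns the number of bytes produced by default serialization
  static int serializedSize(Object o) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(bytes);
    out.writeObject(o);
    out.close();
    return bytes.size();
  }

  public static void main(String[] args) throws IOException {
    System.out.println("all fields:      " + serializedSize(new WithCache()));
    System.out.println("transient cache: " + serializedSize(new WithTransientCache()));
  }
}
```

A transient field comes back as null (or zero) after deserialization, so this only suits state that can be rebuilt or is not needed on the receiving side.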

To transfer objects across networks, it is worth compressing the serialized objects. For large amounts of data, the transfer overhead tends to swamp the costs of compressing and decompressing the data. For storing to disk, it is worth serializing multiple objects to different files rather than to one large file. The granularity of access to individual objects and subsets of objects is often improved as well.
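One way to compress serialized objects, sketched here under the assumption that GZIP from java.util.zip is an acceptable codec, is to wrap the underlying stream before handing it to the object stream. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressedSerialization {
  // Serializes obj through a GZIP stream and returns the compressed bytes
  static byte[] toCompressedBytes(Serializable obj) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes));
    out.writeObject(obj);
    out.close(); // closing also finishes the GZIP stream and writes its trailer
    return bytes.toByteArray();
  }

  // Reverses toCompressedBytes( )
  static Object fromCompressedBytes(byte[] data)
      throws IOException, ClassNotFoundException {
    ObjectInputStream in = new ObjectInputStream(
        new GZIPInputStream(new ByteArrayInputStream(data)));
    try {
      return in.readObject();
    } finally {
      in.close();
    }
  }
}
```

For a network transfer, the byte-array streams would be replaced by the socket's streams; whether the compression pays for itself depends on the data and the link, so it is worth measuring.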

It is also possible to serialize objects in a separate thread for storage and network transfers, letting the serialization execute in the background. For objects whose state can change between serializations, consider using transaction logs or change logs (logs of the differences in the objects since they were last fully serialized) rather than reserializing the whole object. This works much like full and incremental backups. You need to maintain the changes somewhere, of course, so it makes the objects more complicated, but this complexity can have a really good payback in terms of performance: consider how much faster an incremental backup is compared to a full backup.
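The change-log idea can be sketched with a hypothetical two-field object that remembers which fields were modified since the last write. The Account class, its dirty flags, and the one-byte mask format are all assumptions made for illustration; a real object would need a scheme suited to its own structure.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

class ChangeLogSketch {
  static class Account {
    private int balance;
    private int points;
    private boolean balanceDirty;
    private boolean pointsDirty;

    void setBalance(int b) { balance = b; balanceDirty = true; }
    void setPoints(int p) { points = p; pointsDirty = true; }
    int balance() { return balance; }
    int points() { return points; }

    // Full serialization: every field is written
    void writeFull(DataOutput out) throws IOException {
      out.writeInt(balance);
      out.writeInt(points);
      balanceDirty = pointsDirty = false;
    }

    void readFull(DataInput in) throws IOException {
      balance = in.readInt();
      points = in.readInt();
    }

    // Incremental serialization: a bit mask, then only the changed fields
    void writeChanges(DataOutput out) throws IOException {
      int mask = (balanceDirty ? 1 : 0) | (pointsDirty ? 2 : 0);
      out.writeByte(mask);
      if (balanceDirty) out.writeInt(balance);
      if (pointsDirty) out.writeInt(points);
      balanceDirty = pointsDirty = false;
    }

    // Applies a change record on top of a previously read full copy
    void applyChanges(DataInput in) throws IOException {
      int mask = in.readByte();
      if ((mask & 1) != 0) balance = in.readInt();
      if ((mask & 2) != 0) points = in.readInt();
    }
  }
}
```

With only two fields the saving is trivial; the payback grows with the size of the object relative to the part that changes.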

It is worthwhile to spend some time on a basic serialization tuning exercise. I chose a couple of fairly simple objects to serialize, but they are representative of the sorts of issues that crop up in serialization.

class Foo1 implements Serializable
{
  int one;
  String two;
  Bar1[  ] four;
  
  public Foo1(  )
  {
    two = new String("START");
    one = two.length(  );
    four = new Bar1[2];
    four[0] = new Bar1(  );
    four[1] = new Bar1(  );
  }
}
  
class Bar1 implements Serializable
{
  float one;
  String two;
  public Bar1(  )
  {
    two = new String("hello");
    one = 3.14F;
  }
}

Note that I have given the objects default initial values for the tuning tests. The defaults assigned to the various String variables are forced to be unique for every object by making them new Strings. Without doing this, the compiler assigns the identical String to every object. That alters the timings: only one String is written on output, and when created on input, all other String references reference the same string by identity. (Java serialization can maintain relative identity of objects for objects that are serialized together.) Using identical Strings would make the serialization tests quicker and would not be representative of normal serializations.

Test measurements are easily skewed by rewriting previously written objects. Previously written objects are not converted and written out again; instead, only a reference to the original object is written. Writing this reference can be faster than writing out the object again. The speed is even more skewed on reading since only one object gets created. All the other references refer to the same uniquely created object.
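The effect is easy to observe. The sketch below (class and method names are hypothetical) writes the same object repeatedly to one stream and reports the stream size, then checks whether two reads return the same instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BackReferenceDemo {
  // Writes obj to a single stream 'copies' times and returns the stream size
  static int sizeOfRepeatedWrites(Serializable obj, int copies) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(bytes);
    for (int i = 0; i < copies; i++)
      out.writeObject(obj);
    out.close();
    return bytes.size();
  }

  // Writes obj twice, reads twice, and reports whether both reads
  // returned the very same instance
  static boolean readsShareIdentity(Serializable obj)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(bytes);
    out.writeObject(obj);
    out.writeObject(obj);
    out.close();
    ObjectInputStream in = new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray()));
    Object first = in.readObject();
    Object second = in.readObject();
    in.close();
    return first == second;
  }

  public static void main(String[] args) throws Exception {
    int[] data = new int[1000];
    System.out.println("written once:  " + sizeOfRepeatedWrites(data, 1));
    System.out.println("written twice: " + sizeOfRepeatedWrites(data, 2));
    System.out.println("same instance: " + readsShareIdentity(data));
  }
}
```

If a test genuinely needs to write the same object more than once in full, calling ObjectOutputStream.reset( ) between writes discards the stream's record of previously written objects.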

Early in my career, I was given the task of testing the throughput of an object database. The first tests registered a fantastically high throughput until we realized we were storing just a few objects once, and all the other objects we thought we were storing were only references to those first few.

The Foo objects each contain two Bar objects in an array to make the overall objects slightly more representative of real-world objects. I'll make a baseline using the standard serialization technique:

    OutputStream ostream;
    if (toDisk)
      ostream = new FileOutputStream("t.tmp");
    else
      ostream = new ByteArrayOutputStream(  );
    ObjectOutputStream wrtr = new ObjectOutputStream(ostream);
  
    long time = System.currentTimeMillis(  );
    //write objects: time only the 3 lines for serialization output
    wrtr.writeObject(lotsOfFoos);
    wrtr.flush(  );
    wrtr.close(  );
    System.out.println("Writing time: " + 
            (System.currentTimeMillis(  )-time));
  
    InputStream istream;
    if (toDisk)
      istream = new FileInputStream("t.tmp");
    else
      istream = new ByteArrayInputStream(
        ((ByteArrayOutputStream) ostream).toByteArray(  ));
    ObjectInputStream rdr = new ObjectInputStream(istream);
  
    time = System.currentTimeMillis(  );
    //read objects: time only the 2 lines for serialization input
    Foo1[  ] allFoos = (Foo1[  ]) rdr.readObject(  );
    rdr.close(  );
    System.out.println("Reading time: " + 
            (System.currentTimeMillis(  )-time));

As you can see, I provide for running tests either to disk or purely in memory. This allows you to break down the cost into separate components. The actual values revealed that 95% of the time is spent in the serialization. Less than 5% is the actual write to disk (of course, the relative times are system-dependent, but these results are probably representative).

When measuring, I used a pregrown ByteArrayOutputStream so that there were no effects from allocating the byte array in memory. Furthermore, to eliminate extra memory copying and garbage-collection effects, I reused the same ByteArrayOutputStream, and indeed the same byte array from that ByteArrayOutputStream object for reading. The byte array is accessible by subclassing ByteArrayOutputStream and providing an accessor to the ByteArrayOutputStream.buf instance variable.
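A subclass of the kind described might look like the following minimal sketch. The class name and the rawBuffer( ) accessor are assumptions; only the protected buf field comes from ByteArrayOutputStream itself.

```java
import java.io.ByteArrayOutputStream;

class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
  // Pregrow the internal array so that no reallocation happens during timing
  ExposedByteArrayOutputStream(int initialCapacity) {
    super(initialCapacity);
  }

  // Returns the internal array itself, not a copy.
  // Only the first size( ) bytes hold valid data.
  byte[] rawBuffer() {
    return buf;
  }
}
```

The written bytes can then be read back without copying by using new ByteArrayInputStream(out.rawBuffer( ), 0, out.size( )), and calling reset( ) on the output stream rewinds it for the next run while keeping the same array.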

The results of this first test for JDK 1.2.2^[9] are shown in the following chart:

^[9] Table 8-3 lists the full results of tests with a variety of VMs. I have used the 1.2 results for discussion in this section, and the results are generally applicable to the other VMs tested.

                        Writing (serializing)   Reading (deserializing)
Standard serialization  100%                    164%

I have normalized the baseline measurements to 100% for the byte array output (i.e., serializing the collection of Foos). On this scale, the reading (deserializing) takes 164%. This is not what I expected, because I am used to the idea that "writing" takes longer than "reading." Thinking about exactly what is happening, you can see that for the serialization you take the data in some objects and write that data out to a stream of bytes, which basically accesses and converts objects into bytes. But for the deserializing, you access elements of a byte array and convert these to other object and data types, including creating any required objects. Added to the fact that the serializing procedures are much more costly than the actual (disk) writes and reads, it is now understandable that deserialization is likely to be the more intensive, and consequently slower, activity.

Considering exactly what the ObjectInputStream and ObjectOutputStream must do, I realize that they are accessing and updating internal elements of the objects they are serializing, without knowing beforehand anything about those objects. This means there must be a heavy usage of the java.lang.reflect package, together with some internal VM access procedures (since the serializing can reach private and protected fields and methods).^[10] All this suggests that you should improve performance by taking explicit control of the serializing.

^[10] The actual code is difficult and time-consuming to work through. It was written in parts as one huge iterated/recursed switch, probably for performance reasons.

Alert readers might have noticed that Foo and Bar have constructors that initialize the object, and may be wondering if deserializing could be sped up by changing the constructors to avoid the unnecessary overhead there. In fact, the deserialization uses internal VM access to create the objects without going through the constructor, similar to cloning the objects. Serialization does require the nearest non-serializable superclass to have an accessible no-arg constructor, and that constructor is run, but the constructors of the Serializable classes themselves are not called during deserialization.
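That the constructor of a Serializable class is bypassed can be checked with a small experiment. In this sketch, the Tracked class and the call counter are hypothetical and exist only to make the behavior visible.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ConstructorBypassDemo {
  // Counts how many times the Tracked constructor has run
  static int constructorCalls = 0;

  static class Tracked implements Serializable {
    String label;
    Tracked() {
      constructorCalls++;
      label = "set by constructor";
    }
  }

  // Serializes and then deserializes t using the default mechanism
  static Tracked roundTrip(Tracked t) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(bytes);
    out.writeObject(t);
    out.close();
    ObjectInputStream in = new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray()));
    Tracked copy = (Tracked) in.readObject();
    in.close();
    return copy;
  }

  public static void main(String[] args) throws Exception {
    Tracked original = new Tracked();
    original.label = "changed";
    Tracked copy = roundTrip(original);
    System.out.println("constructor calls: " + constructorCalls);
    System.out.println("copy label: " + copy.label);
  }
}
```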

To start with, the Serializable interface supports two methods that allow classes to handle their own serializing. So the first step is to try these methods. Add the following two methods to Foo:

  private void writeObject(java.io.ObjectOutputStream out)
    throws IOException
  {
    out.writeUTF(two);
    out.writeInt(one);
    out.writeObject(four);
  }
  private void readObject(java.io.ObjectInputStream in)
    throws IOException, ClassNotFoundException
  {
    two = in.readUTF(  );
    one = in.readInt(  );
    four = (Bar2[  ]) in.readObject(  );
  }

Bar needs the equivalent two methods:

  private void writeObject(java.io.ObjectOutputStream out)
    throws IOException
  {
    out.writeUTF(two);
    out.writeFloat(one);
  }
  private void readObject(java.io.ObjectInputStream in)
    throws IOException, ClassNotFoundException
  {
    two = in.readUTF(  );
    one = in.readFloat(  );
  }

The following chart shows the results of running the test with these methods added to the classes:

                                               Writing (serializing)   Reading (deserializing)
Standard serialization                         100%                    164%
Customized read/writeObject( ) in Foo and Bar  140%                    150%

We have improved the reads but made the writes worse. I expected an improvement for both, and I cannot explain why the writes are worse (other than perhaps that the ObjectOutputStream class may have suboptimal performance for this method-overriding feature; the 1.4 VM does show a speedup for both the writes and reads, suggesting that the class has been optimized in that version). Instead of analyzing what the ObjectOutputStream class may be doing, let's try further optimizations.

Examining and manipulating objects during serialization takes more time than the actual conversion of data to or from streams. Considering this, and looking at the customized serializing methods, you can see that the Foo methods simply pass control back to the default serializing mechanism to handle the embedded Bar objects. It may be worth handling the serializing more explicitly. For this example, I'll break encapsulation by accessing the Bar fields directly (although going through accessors and updators or calling serialization methods in Bar would not make much difference in time here). I redefine the Foo serializing methods as:

  private void writeObject(java.io.ObjectOutputStream out)
    throws IOException
  {
    out.writeUTF(two);
    out.writeInt(one);
    out.writeUTF(four[0].two);
    out.writeFloat(four[0].one);
    out.writeUTF(four[1].two);
    out.writeFloat(four[1].one);
  }
  private void readObject(java.io.ObjectInputStream in)
    throws IOException, ClassNotFoundException
  {
    two = in.readUTF(  );
    one = in.readInt(  );
    four = new Bar3[2];
    four[0] = new Bar3(  );
    four[1] = new Bar3(  );
    four[0].two = in.readUTF(  );
    four[0].one = in.readFloat(  );
    four[1].two = in.readUTF(  );
    four[1].one = in.readFloat(  );
  }

The Foo methods now handle serialization for both Foo and the embedded Bar objects, so the equivalent methods in Bar are now redundant. The following chart illustrates the results of running the test with these altered methods added to the classes (Table 8-3 lists the full results of tests with a variety of VMs):

                                                    Writing (serializing)   Reading (deserializing)
Standard serialization                              100%                    164%
Customized read/writeObject( ) in Foo and Bar       140%                    150%
Customized read/writeObject( ) in Foo handling Bar  38%                     58%

Now this gives a clearer feel for the costs of dynamic object examination and manipulation.

Given the overhead the serializing I/O classes incur, it has now become obvious that the more serializing you handle explicitly, the better off you are. This being the case, the next step is to ask the objects explicitly to serialize themselves rather than going through the ObjectInputStream and ObjectOutputStream to have them in turn ask the objects to serialize themselves.

The readObject( ) and writeObject( ) methods must be defined as private according to the Serializable interface documentation, so they cannot be called directly. You must either wrap them in another public method or copy the implementation to another method so you can access them directly. But in fact, java.io provides a third alternative. The Externalizable interface also provides support for serializing objects using ObjectInputStream and ObjectOutputStream. But Externalizable defines two public methods rather than the two private methods required by Serializable. So you can just change the names of the two methods: readObject(ObjectInputStream) becomes readExternal(ObjectInput), and writeObject(ObjectOutputStream) becomes writeExternal(ObjectOutput). You must also redefine Foo as implementing Externalizable instead of Serializable. All of these are simple changes, but to be sure that nothing untoward has happened as a consequence, rerun the tests (as good tuners should for any changes, even minor ones). The following chart shows the new test results:

                                                     Writing (serializing)   Reading (deserializing)
Standard serialization                               100%                    164%
Customized read/writeObject( ) in Foo handling Bar   38%                     58%
Foo made Externalizable, using last methods renamed  28%                     44%

Remarkably, the times are significantly faster. This probably reflects the improvement you get from being able to compile and execute a line such as:

((Externalizable) someObject).writeExternal(this)

in the ObjectOutputStream class, rather than having to go through java.lang.reflect and the VM internals to reach the private writeObject( ) method. This example also shows that you are better off making your classes Externalizable rather than Serializable if you want to control your own serializing.
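The renamed methods can be sketched as follows. This is an illustrative reconstruction rather than the exact test class: the outer class name and the roundTrip( ) helper are assumptions, and the nested classes are declared public because Externalizable deserialization needs a public no-arg constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableSketch {
  // Plain data holder; Foo writes and reads its fields directly
  public static class Bar {
    float one = 3.14F;
    String two = new String("hello");
  }

  public static class Foo implements Externalizable {
    int one;
    String two;
    Bar[] four;

    // Unlike Serializable, this constructor IS run on deserialization
    public Foo() {
      two = new String("START");
      one = two.length();
      four = new Bar[] { new Bar(), new Bar() };
    }

    public void writeExternal(ObjectOutput out) throws IOException {
      out.writeUTF(two);
      out.writeInt(one);
      for (int i = 0; i < four.length; i++) {
        out.writeUTF(four[i].two);
        out.writeFloat(four[i].one);
      }
    }

    public void readExternal(ObjectInput in) throws IOException {
      two = in.readUTF();
      one = in.readInt();
      four = new Bar[] { new Bar(), new Bar() };
      for (int i = 0; i < four.length; i++) {
        four[i].two = in.readUTF();
        four[i].one = in.readFloat();
      }
    }
  }

  // Sends a Foo through the standard object streams and back
  static Foo roundTrip(Foo foo) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(bytes);
    out.writeObject(foo);
    out.close();
    ObjectInputStream in = new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray()));
    Foo copy = (Foo) in.readObject();
    in.close();
    return copy;
  }
}
```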

The drawback to controlling your own serializing is a significantly higher maintenance cost, as any change to the class structure also requires changes to the two Externalizable methods (or the two methods supported by Serializable). In some cases (as in the example presented in this tuning exercise), changes to the structure of one class actually require changes to the Externalizable methods of another class. The example presented here requires that if the structure of Bar is changed, the Externalizable methods in Foo must also be changed to reflect the new structure of Bar. Here, you can avoid the dependency between the classes by having the Foo serialization methods call the Bar serialization methods directly. But the general fragility of serialization, when individual class structures change, still remains.
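A sketch of that delegation, with hypothetical class names and both classes made Externalizable, might look like this. Foo no longer needs to know which fields Bar contains.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

class DelegatingExternalizableSketch {
  public static class Bar implements Externalizable {
    float one;
    String two = "hello";

    public Bar() { }

    public void writeExternal(ObjectOutput out) throws IOException {
      out.writeUTF(two);
      out.writeFloat(one);
    }

    public void readExternal(ObjectInput in) throws IOException {
      two = in.readUTF();
      one = in.readFloat();
    }
  }

  public static class Foo implements Externalizable {
    int one;
    String two = "START";
    Bar[] four = { new Bar(), new Bar() };

    public Foo() { }

    public void writeExternal(ObjectOutput out) throws IOException {
      out.writeUTF(two);
      out.writeInt(one);
      // Each Bar writes itself: a direct method call, no reflection
      for (Bar bar : four)
        bar.writeExternal(out);
    }

    public void readExternal(ObjectInput in) throws IOException {
      two = in.readUTF();
      one = in.readInt();
      // Reuse the Bar instances created by the field initializer
      for (Bar bar : four)
        bar.readExternal(in);
    }
  }
}
```

A change to Bar's fields is then confined to Bar's own two methods, although the serialized format still changes, so previously stored data remains a concern.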

You changed the methods in the first place to provide public access to the methods in order to access them directly. Let's continue with this task. Now, for the first time, you will change actual test code, rather than anything in the Foo or Bar classes. The new test looks like:

    OutputStream ostream;
    if (toDisk)
      ostream = new FileOutputStream("t.tmp");
    else
      ostream = new ByteArrayOutputStream(  );
    ObjectOutputStream wrtr = new ObjectOutputStream(ostream);
  
    //The old version of the test just ran the next
    //commented line to write the objects
    //wrtr.writeObject(lotsOfFoos);
  
    long time = System.currentTimeMillis(  );
    //This new version writes the size of the array,
    //then each object explicitly writes itself
    //time these five lines for serialization output
    wrtr.writeInt(lotsOfFoos.length);
    for (int i = 0; i < lotsOfFoos.length ; i++)
      lotsOfFoos[i].writeExternal(wrtr);
    wrtr.flush(  );
    wrtr.close(  );
    System.out.println("Writing time: " + 
        (System.currentTimeMillis(  )-time));
  
    InputStream istream;
    if (toDisk)
      istream = new FileInputStream("t.tmp");
    else
      istream = new ByteArrayInputStream(
        ((ByteArrayOutputStream) ostream).toByteArray(  ));
    ObjectInputStream rdr = new ObjectInputStream(istream);
  
    //The old version of the test just ran the next
    //commented line to read the objects
    //Foo1[  ] allFoos = (Foo1[  ]) rdr.readObject(  );
  
    time = System.currentTimeMillis(  );
    //This new version reads the size of the array and creates
    //the array, then each object is explicitly created and
    //reads itself. read objects - time these ten lines to
    //the close(  ) for serialization input
    int len = rdr.readInt(  );
    Foo[  ] allFoos = new Foo[len];
    Foo foo;
    for (int i = 0; i < len ; i++)
    {
      foo = new Foo(  );
      foo.readExternal(rdr);
      allFoos[i] = foo;
    }
    rdr.close(  );
    System.out.println("Reading time: " + 
        (System.currentTimeMillis(  )-time));

This test bypasses the serialization overhead completely. You are still using the ObjectInputStream and ObjectOutputStream classes, but really only to write out basic data types, not for any of their object-manipulation capabilities. If you didn't require those specific classes because of the required method signatures, you could have happily used DataInputStream and DataOutputStream classes for this test. The following chart shows the test results.

                                                          Writing (serializing)   Reading (deserializing)
Standard serialization                                    100%                    164%
Foo made Externalizable, using last methods renamed       28%                     44%
Foo as last test, but read/write called directly in test  3.9%                    25%

If you test serializing to and from the disk, you find that the disk I/O now takes nearly one-third of the total test times. Because disk I/O is now a significant portion of the total time, the CPU is now underworked, and you can even gain some speedup by serializing in several threads, i.e., you can evenly divide the collection into two or more subsets and have each subset serialized by a separate thread (I leave that as an exercise for you).
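One possible shape for that exercise is sketched below, with Strings standing in for the Foo objects and each half of the array serialized to its own byte array by its own thread. The class and method names are hypothetical, and for disk output each task would write to a separate file instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSerializationSketch {
  // Serializes items[from..to) into its own byte array:
  // an element count followed by the elements
  static byte[] serializeRange(String[] items, int from, int to) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(to - from);
    for (int i = from; i < to; i++)
      out.writeUTF(items[i]);
    out.close();
    return bytes.toByteArray();
  }

  // Splits the array into two halves and serializes each half on its own thread
  static byte[][] serializeInTwoThreads(final String[] items)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      final int mid = items.length / 2;
      Callable<byte[]> firstHalf = () -> serializeRange(items, 0, mid);
      Callable<byte[]> secondHalf = () -> serializeRange(items, mid, items.length);
      Future<byte[]> first = pool.submit(firstHalf);
      Future<byte[]> second = pool.submit(secondHalf);
      return new byte[][] { first.get(), second.get() };
    } finally {
      pool.shutdown();
    }
  }
}
```

The objects must not be modified while the worker threads are reading them, and any gain depends on how much of the time is really spent waiting on I/O.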

Note that since you are now explicitly creating objects by calling their constructors, the instance variables in Bar are being set twice during deserialization, once at the creation of the Bar instance in Foo.readExternal( ), and again when reading in the instance variable values and assigning those values. Normally you should move any Bar initialization out of the no-arg constructor to avoid redundant assignments.

Is there any way of making the deserializing faster? Well, not significantly, if you need to read in all the objects and use them all immediately. But more typically, you need only some of the objects immediately. In this case, you can use lazily initialized objects to speed up the deserializing phase (see also Section 4.6.2). The idea is that instead of combining the read with the object creation in the deserializing phase, you decouple these two operations. So each object reads in just the bytes it needs, but does not convert those bytes into objects or data until that object is actually accessed. To test this, add a new instance variable to Foo to hold the bytes between reading and converting to objects or data. You also need to change the serialization methods. I will drop support for the Serializable and Externalizable interfaces since we are now explicitly requiring the Foo objects to serialize and deserialize themselves, and I'll add a second stream to store the size of the serialized Foo objects. Foo now looks like:

class Foo5
{
  int one;
  String two;
  Bar5[  ] four;
  byte[  ] buffer;
  
  //empty constructor to optimize deserialization
  public Foo5(  ){  }
  //And constructor that creates initialized objects
  public Foo5(boolean init)
  {
    this(  );
    if (init)
      init(  );
  }
  public void init(  )
  {
    two = new String("START");
    one = two.length(  );
    four = new Bar5[2];
    four[0] = new Bar5(  );
    four[1] = new Bar5(  );
  }
  
  //Serialization method
  public void writeExternal(MyDataOutputStream out, DataOutputStream outSizes)
    throws IOException
  {
    //Get the amount written so far so that we can determine
    //the extra we write
    int size = out.written(  );
  
    //write out the Foo
    out.writeUTF(two);
    out.writeInt(one);
    out.writeUTF(four[0].two);
    out.writeFloat(four[0].one);
    out.writeUTF(four[1].two);
    out.writeFloat(four[1].one);
  
    //Determine how many bytes I wrote
    size = out.written(  ) - size;
  
    //and write that out to our second stream
    outSizes.writeInt(size);
  }
  public void readExternal(InputStream in, DataInputStream inSizes)
    throws IOException
  {
    //Determine how many bytes I consist of in serialized form
    int size = inSizes.readInt(  );
  
    //And read me into a byte buffer
    buffer = new byte[size];
    int len;
    int readlen = in.read(buffer);
  
    //be robust and handle the general case of partial reads
    //and incomplete streams
    if (readlen == -1)
      throw new IOException("expected more bytes");
    else
      while(readlen < buffer.length)
      {
        len = in.read(buffer, readlen, buffer.length-readlen);
        if (len < 1)
          throw new IOException("expected more bytes");
        else
          readlen += len;
      }
  }
  
  //This method does the deserializing of the byte buffer to a 'real' Foo
  public void convert(  )
    throws IOException
  {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer));
    two = in.readUTF(  );
    one = in.readInt(  );
    four = new Bar5[2];
    four[0] = new Bar5(  );
    four[1] = new Bar5(  );
    four[0].two = in.readUTF(  );
    four[0].one = in.readFloat(  );
    four[1].two = in.readUTF(  );
    four[1].one = in.readFloat(  );
    buffer = null;
  }
}
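The listing above depends on a MyDataOutputStream class that exposes the number of bytes written so far. A minimal sketch of such a class, assuming nothing beyond a single written( ) accessor over the protected DataOutputStream.written field, might look like this:

```java
import java.io.DataOutputStream;
import java.io.OutputStream;

class MyDataOutputStream extends DataOutputStream {
  MyDataOutputStream(OutputStream out) {
    super(out);
  }

  // DataOutputStream.written is a protected counter of the
  // bytes written to this stream so far
  int written() {
    return written;
  }
}
```

The public DataOutputStream.size( ) method returns the same counter, so it could be used instead of a subclass.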

As you can see, I have chosen to use DataInputStreams and DataOutputStreams since they are all that's needed. I also use a subclass of DataOutputStream called MyDataOutputStream. The class adds only one method, MyDataOutputStream.written( ), to provide access to the DataOutputStream.written instance variable so that you have access to the number of bytes written. The timing tests are essentially the same as before, except that you change the stream types and add a second stream for the sizes of the serialized objects (e.g., to file t2.tmp, or a second pair of byte-array input and output streams). The following chart shows the new times:

                                                          Writing (serializing)   Reading (deserializing)
Standard serialization                                    100%                    164%
Foo as last test, but read/write called directly in test  3.9%                    25%
Foo lazily initialized                                    17%                     4%

We have lost out on the writes because of the added complexity, but improved the reads considerably. The cost of the Foo.convert( ) method has not been factored in, but the strategy illustrated here is for cases where you need to run only that convert method on a small subset of the deserialized objects, and so the extra overhead should be small. This technique works well when transferring large groups of objects across a network.
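To keep the conversion invisible to callers, the accessors can trigger it on first use. The sketch below shows the idea with a hypothetical two-field LazyRecord class rather than Foo itself; it is not thread-safe as written.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class LazyRecord {
  private byte[] buffer; // serialized form, kept until the first access
  private String name;
  private int count;

  LazyRecord(byte[] serialized) {
    buffer = serialized;
  }

  String getName() {
    ensureConverted();
    return name;
  }

  int getCount() {
    ensureConverted();
    return count;
  }

  boolean isConverted() {
    return buffer == null;
  }

  // Converts the stored bytes to field values once, on first use
  private void ensureConverted() {
    if (buffer == null)
      return;
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer));
      name = in.readUTF();
      count = in.readInt();
    } catch (IOException e) {
      throw new IllegalStateException("corrupt serialized form", e);
    }
    buffer = null; // release the bytes once they are no longer needed
  }
}
```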

For the case in which you need only a few objects out of many serialized objects that have been stored on disk, another strategy is even more efficient. This strategy uses techniques similar to the example just shown. One file (the data file) holds the serialized objects. A second file (the index file) holds the offset of the starting byte of each serialized object in the first file. For serializing, the only difference to the example is that when writing out the objects, the full DataOutputStream.written instance variable is added to the index file as the writeExternal( ) method is entered, instead of writing the difference between successive values of DataOutputStream.written. A moment's thought should convince you that this provides the byte offset into the data file.
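The writing side, together with the matching random-access read, can be sketched as follows. Strings stand in for the serialized objects, the class and method names are hypothetical, and DataOutputStream.size( ) is used to obtain the running byte count.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

class IndexedStoreSketch {
  // Writes each item to the data file, and the byte offset at which
  // it starts to the index file (one 4-byte int per item)
  static void writeAll(String[] items, File data, File index) throws IOException {
    try (DataOutputStream out = new DataOutputStream(
             new BufferedOutputStream(new FileOutputStream(data)));
         DataOutputStream idx = new DataOutputStream(
             new BufferedOutputStream(new FileOutputStream(index)))) {
      for (int i = 0; i < items.length; i++) {
        idx.writeInt(out.size()); // bytes written so far = offset of this item
        out.writeUTF(items[i]);
      }
    }
  }

  // Reads item n only: look up its offset in the index file,
  // then seek to that offset in the data file
  static String readOne(int n, RandomAccessFile data, RandomAccessFile index)
      throws IOException {
    index.seek(4L * n);
    int offset = index.readInt();
    data.seek(offset);
    return data.readUTF();
  }
}
```

Both RandomAccessFile objects are passed in already opened so that they can stay open across many lookups.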

With this technique, deserializing is straightforward. You enter the index file and skip to the correct index for the object you want in the data file (e.g., for the object at array index 54, skip 54 x 4 = 216 bytes from the start of the index file). The serialized int at that location holds the byte offset into the data file, so you deserialize that int. Then you enter the data file, skipping to the specified offset, and deserialize the object there. (This is also the first step in building your own database: the next steps are normally to waste time and effort before realizing that you can more easily buy a database that does most of what you want.) This "index file-plus-data file" strategy works best if you leave the two files open and skip around the files, rather than repeatedly opening and closing the files every time you want to deserialize an object. The strategy illustrated in this paragraph does not work as well for transferring serialized objects across a network. For network transfers, a better strategy is to limit the objects being transferred to only those that are needed.^[11]

Table 8-3 shows the tunings of the serialization tests, normalized to the JDK 1.2 standard serialization test. Each entry is a pair giving write/read timings. The test name in parentheses refers to the method name executed in the tuning.io.SerializationTest class.

^[11] You could transfer index files across the network, then use those index files to precisely identify the objects required and limit transfers to only those identified objects.

Table 8-3. Timings (in write/read pairs) of the serialization tests with various VMs

                                                                   1.1.8         1.2.2         1.3.1         1.3.1-server  1.4.0         1.4.0-Xint
Standard serialization (test1a)                                    929%/1848%    100%/164%     112%/120%     68%/144%      99%/100%      700%/593%
Customized write/readObject( ) in Foo and Bar (test2a)             406%/612%     140%/150%     139%/145%     113%/178%     91%/93%       556%/486%
Customized write/readObject( ) in Foo handling Bar (test3a)        43%/132%      38%/58%       41%/53%       37%/71%       29%/37%       201%/234%
Foo made Externalizable, using last methods renamed (test4a)       28%/92%       28%/44%       25%/40%       32%/57%       29%/39%       187%/222%
Foo as last test, but write/read called directly in test (test4c)  2.9%/97%      3.9%/25%      4.9%/19%      19%/30%       7.6%/18%      75%/132%
Foo lazily initialized (test5a)                                    16%/3.4%      17%/4%        12%/2.6%      61%/6.7%      13%/2.4%      105%/18%