
5.3 Conversions to Strings

Generally, the JDK methods that convert objects and data types to strings are suboptimal, both in terms of performance and the number of temporary objects used in the conversion procedure. In this section, we consider how to optimize these conversions.

5.3.1 Converting longs to Strings

Let's start by looking at conversion of long values. In the JDK, this is achieved with the Long.toString( ) method. Bear in mind that you typically add a converted value to a StringBuffer (explicitly, or implicitly with the + concatenation operator). So it would be nice to avoid the two intermediate temporary objects created while converting the long, i.e., the one char array inside the conversion method, and the returned String object that is used just to copy the chars into the StringBuffer.

Avoiding the temporary char array is difficult to do because most fast methods for converting numbers start with the low digits in the number, and you cannot add to the StringBuffer from the low to the high digits unless you want all your numbers coming out backwards.
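To make the difficulty concrete, here is a minimal sketch of the conventional low-digit-first approach. It is an illustration only, not the JDK source; the class and method names are hypothetical. The digits come out lowest first, so they have to be written backwards into a temporary array before they can be used.

```java
// Sketch of a conventional conversion: digits are produced lowest first,
// so they are filled in from the end of a temporary char array.
final class LowDigitsFirst {
    static String toDecimal(long value) {
        if (value == Long.MIN_VALUE) {
            return "-9223372036854775808"; // cannot be negated without overflow
        }
        boolean negative = value < 0;
        if (negative) {
            value = -value;
        }
        char[] buf = new char[20];           // temporary object: 19 digits plus sign
        int pos = buf.length;
        do {
            buf[--pos] = (char) ('0' + (int) (value % 10)); // lowest remaining digit
            value /= 10;
        } while (value != 0);
        if (negative) {
            buf[--pos] = '-';
        }
        // second temporary object: the String copied out of the array
        return new String(buf, pos, buf.length - pos);
    }
}
```

Both the char array and the returned String are discarded as soon as the characters have been copied into the StringBuffer.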

However, with a little work, you can get to a method that is fast and obtains the digits in order. The following code works by determining the magnitude of the number first, then successively stripping off the highest digit, as shown.

//Up to radix 36
private static final char[  ] charForDigit = {
  '0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f','g','h',
  'i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'
};
  
public static void append(StringBuffer s, long i)
{
  if (i < 0)
  {
    //convert negative to positive numbers for later algorithm
    if (i == Long.MIN_VALUE)
    {
      //cannot make this positive due to integer overflow,
      //so treat it specially
      s.append("-9223372036854775808");
      return;
    }
    //otherwise append the minus sign, and make the number positive
    s.append('-');
    i = -i;
  }
  //Get the magnitude of the long
  long mag = l_magnitude(i);
  long c;
  while ( mag > 1 )
  {
    //The highest digit
    c = i/mag;
    s.append(charForDigit[(int) c]);
    //remove the highest digit
    c *= mag;
    if ( c <= i)
      i -= c;
    //and go down one magnitude
    mag /= 10;
  }
  //The remaining magnitude is one digit large
  s.append(charForDigit[(int) i]);
}
  
private static long l_magnitude(long i)
{
    if (i < 10L) return 1;
    else if (i < 100L) return 10L;
    else if (i < 1000L) return 100L;
    else if (i < 10000L) return 1000L;
    else if (i < 100000L) return 10000L;
    else if (i < 1000000L) return 100000L;
    else if (i < 10000000L) return 1000000L;
    else if (i < 100000000L) return 10000000L;
    else if (i < 1000000000L) return 100000000L;
    else if (i < 10000000000L) return 1000000000L;
    else if (i < 100000000000L) return 10000000000L;
    else if (i < 1000000000000L) return 100000000000L;
    else if (i < 10000000000000L) return 1000000000000L;
    else if (i < 100000000000000L) return 10000000000000L;
    else if (i < 1000000000000000L) return 100000000000000L;
    else if (i < 10000000000000000L) return 1000000000000000L;
    else if (i < 100000000000000000L) return 10000000000000000L;
    else if (i < 1000000000000000000L) return 100000000000000000L;
    else return  1000000000000000000L;
}

When compared to executing the plain StringBuffer.append(long), the algorithm listed here is generally quicker (see Table 5-1) and creates two fewer objects. With further tuning it can be made faster still, and quicker than the JDK conversion on every VM measured, but I'll leave the more complicated tuning to the next section.

Table 5-1. Time taken to append a long to a StringBuffer

VM                         1.1.8  1.2.2  1.3.1  1.3.1-server  1.4.0  1.4.0-server  1.4.0-Xint
JDK long conversion        104%   100%   116%   157%          116%   100%          306%
Optimized long conversion  115%   89%    121%   107%          115%   95%           310%

There are several things to note about possible variations of this algorithm. First, although the algorithm here is specifically radix 10 (decimal), it is easy to change to any radix. To do this, the reduction in magnitude in the loop has to go down by the radix value, and the l_magnitude( ) method has to be altered. For example, for radix 16 (hexadecimal), the statement mag /= 10 becomes mag /= 16, and the magnitude method for radix 16 looks like this:

private static long l_magnitude16(long i)
{
    if (i < 16L) return 1;
    else if (i < 256L) return 16L;
    else if (i < 4096L) return 256L;
    else if (i < 65536L) return 4096L;
    else if (i < 1048576L) return 65536L;
    else if (i < 16777216L) return 1048576L;
    else if (i < 268435456L) return 16777216L;
    else if (i < 4294967296L) return 268435456L;
    else if (i < 68719476736L) return 4294967296L;
    else if (i < 1099511627776L) return 68719476736L;
    else if (i < 17592186044416L) return 1099511627776L;
    else if (i < 281474976710656L) return 17592186044416L;
    else if (i < 4503599627370496L) return 281474976710656L;
    else if (i < 72057594037927936L) return 4503599627370496L;
    else if (i < 1152921504606846976L) return 72057594037927936L;
    else return 1152921504606846976L;
}
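If you need several radices, one option is to compute the magnitude in a loop rather than writing a comparison chain for each radix. The following is a minimal sketch under that assumption; the class and method names are hypothetical, it handles non-negative values only, and the loop is likely to be slower than the hardcoded chains shown above.

```java
// Sketch: highest-digit-first conversion for any radix from 2 to 36.
final class RadixAppend {
    private static final char[] DIGITS =
        "0123456789abcdefghijklmnopqrstuvwxyz".toCharArray();

    static void appendUnsigned(StringBuffer s, long value, int radix) {
        if (value < 0 || radix < 2 || radix > 36) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        // Find the largest power of the radix that does not exceed the value.
        // The test guarantees mag * radix <= value, so mag cannot overflow.
        long mag = 1;
        while (value / mag >= radix) {
            mag *= radix;
        }
        // Strip off the highest digit, then go down one magnitude.
        while (mag > 0) {
            long digit = value / mag;
            s.append(DIGITS[(int) digit]);
            value -= digit * mag;
            mag /= radix;
        }
    }
}
```

For example, appending 255 with radix 16 adds the characters "ff" to the buffer.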

Second, because we are working through the digits in written order, this algorithm is suitable for writing directly to a stream or writer (such as a FileWriter) without the need for any temporary objects. This is potentially a large gain, enabling writes to files without generating intermediate temporary strings.
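As an illustration, the same digit-by-digit loop can write to a java.io.Writer instead of a StringBuffer. This is a sketch only: the class and method names are hypothetical, and the magnitude is found with a simple loop rather than the comparison chain used earlier.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Sketch: write a long to any Writer, highest digit first, with no
// intermediate String or char array.
final class WriterAppend {
    static void write(Writer out, long value) throws IOException {
        if (value < 0) {
            if (value == Long.MIN_VALUE) {
                out.write("-9223372036854775808"); // cannot be negated
                return;
            }
            out.write('-');
            value = -value;
        }
        long mag = 1;
        while (value / mag >= 10) {
            mag *= 10;
        }
        while (mag > 0) {
            long digit = value / mag;
            out.write('0' + (int) digit); // Writer.write(int) writes one character
            value -= digit * mag;
            mag /= 10;
        }
    }

    // Convenience wrapper for trying the method out with a StringWriter.
    static String toText(long value) {
        StringWriter w = new StringWriter();
        try {
            write(w, value);
        } catch (IOException e) {
            throw new RuntimeException(e.toString());
        }
        return w.toString();
    }
}
```

In practice you would pass a buffered writer, since writing one character at a time to an unbuffered FileWriter is expensive in its own right.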

Finally, if you want formatting added in, the algorithm is again suitable because you proceed through the number in written order, and also because you have the magnitude at the start. (You can easily create another method, similar to l_magnitude( ), that returns the number of digits in the value.) You can put in a comma every three digits as the number is being written (or apply whatever internationalized format is required). This saves you from having to write out the number first in a temporary object and then add formatting to it. For example, if you are using integers to fake fixed-place floating-point numbers, you can insert a point at the correct position without resorting to temporary objects.
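A minimal sketch of the comma-grouping idea follows. The names are hypothetical, the comma is hardcoded, and negative values are rejected for brevity. The digit count is tracked alongside the magnitude so that a separator can be inserted whenever the number of digits still to come is a multiple of three.

```java
// Sketch: append a non-negative long with a comma every three digits,
// writing directly into the StringBuffer.
final class GroupedAppend {
    static void appendGrouped(StringBuffer s, long value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values not handled in this sketch");
        }
        long mag = 1;
        int digitsLeft = 1;
        while (value / mag >= 10) {
            mag *= 10;
            digitsLeft++;
        }
        while (mag > 0) {
            long digit = value / mag;
            s.append((char) ('0' + digit));
            value -= digit * mag;
            mag /= 10;
            digitsLeft--;
            // A separator goes before each remaining complete group of three.
            if (digitsLeft > 0 && digitsLeft % 3 == 0) {
                s.append(',');
            }
        }
    }
}
```

Appending 1234567 this way produces "1,234,567" without creating an unformatted string first.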

5.3.2 Converting ints to Strings

While the previous append( ) version is suitable to use for ints by overloading, it is much more efficient to create another version specifically for ints. This is because int arithmetic is optimal and considerably faster than the long arithmetic being used. Although earlier versions of the JDK (before JDK 1.1.6) used an inefficient conversion procedure for ints, from 1.1.6 onward Sun targeted the conversion (for radix 10 integers only) and speeded it up by an order of magnitude. To better this already optimized performance, you need every optimization available.

There are three changes you can make to the long conversion algorithm already presented. First, you can change everything to use ints. This gives a significant speedup (more than a third faster than the long conversion). Second, you can inline the "magnitude" method. And finally, you can unroll the loop that handles the digit-by-digit conversion. In this case, the loop can be completely unrolled since there are at most 10 digits in an int.

The resulting method is a little long-winded:

public static void append(StringBuffer s, int i)
{
  if (i < 0)
  {
    if (i == Integer.MIN_VALUE)
    {
      //cannot make this positive due to integer overflow
      s.append("-2147483648");
      return;
    }
    s.append('-');
    i = -i;
  }
  int mag;
  int c;
  if (i < 10)                       //one digit
    s.append(charForDigit[i]);
  else if (i < 100)                 //two digits
    s.append(charForDigit[i/10])
     .append(charForDigit[i%10]);
  else if (i < 1000)                //three digits
    s.append(charForDigit[i/100])
     .append(charForDigit[(c=i%100)/10])
     .append(charForDigit[c%10]);
  else if (i < 10000)               //four digits
    s.append(charForDigit[i/1000])
     .append(charForDigit[(c=i%1000)/100])
     .append(charForDigit[(c%=100)/10])
     .append(charForDigit[c%10]);
  else if (i < 100000)              //five digits
    s.append(charForDigit[i/10000])
     .append(charForDigit[(c=i%10000)/1000])
     .append(charForDigit[(c%=1000)/100])
     .append(charForDigit[(c%=100)/10])
     .append(charForDigit[c%10]);
  else if (i < 1000000)             //six digits
    ... //I'm sure you get the idea
  else if (i < 10000000)            //seven digits
    ... //so just keep doing the same, but more
  else if (i < 100000000)           //eight digits
    ... //because my editor doesn't like wasting all this space
  else if (i < 1000000000)          //nine digits
    ... //on unnecessary repetitions
  else
    {
        //ten digits
        s.append(charForDigit[i/1000000000]);
        s.append(charForDigit[(c=i%1000000000)/100000000]);
        s.append(charForDigit[(c%=100000000)/10000000]);
        s.append(charForDigit[(c%=10000000)/1000000]);
        s.append(charForDigit[(c%=1000000)/100000]);
        s.append(charForDigit[(c%=100000)/10000]);
        s.append(charForDigit[(c%=10000)/1000]);
        s.append(charForDigit[(c%=1000)/100]);
        s.append(charForDigit[(c%=100)/10]);
        s.append(charForDigit[c%10]);
    }
}

In the first edition of this book, I compared this implementation to executing StringBuffer.append(int) with earlier VM versions (1.1.6, 1.2.0, 1.3.0, and HotSpot 1.0). The algorithm listed here ran in less time for all the VMs, and created two fewer objects[3] (see Table 5-2). This algorithm still has a smaller impact on garbage creation, digits are iterated in order so you can write to a stream, and it is easier to alter for formatting without using temporary objects. Note that the long conversion method can also be improved using two of the three techniques we used for the int conversion method: inlining the magnitude method and unrolling the loop.

[3] If the StringBuffer.append(int) used the algorithm shown here, it would be faster for all JDK versions measured in this chapter, as the characters could be added directly to the char buffer without going through the StringBuffer.append(char) method.

However, the comparison against the latest versions of the various VMs now shows a completely different story (see Table 5-3). Sun has continued to optimize, especially object creation and garbage collection in the VM, as well as the conversion algorithm. The improvement in garbage collection is obvious if you run the comparison test with the -verbosegc parameter. With garbage collection being reported, the much larger volume of garbage slows down the JDK conversion relative to the proprietary algorithm. Without -verbosegc, the extra temporary objects are still overhead, but not as significant as with earlier VMs.

It is also instructive to see what Sun has done to the algorithm to make the conversion faster. The source of the 1.3.1/1.4.0 Integer.toString(int) method is almost unrecognizable from earlier implementations. One optimization is to reduce the number of temporary objects created by using a privileged String constructor that accepts a passed char array rather than creating a new one. But the major algorithmic optimization is that multiplications have been changed to bit-shifts. For example, instead of multiplying by 100, three bit-shifts are used:

//These are all equivalent operations
q_times_100 = q * 100;
q_times_100 = (q * 64) + (q * 32) + (q * 4);
q_times_100 = ((q << 6) + (q << 5) + (q << 2));

One operation has been replaced with three, but with optimized generated native code on most modern CPUs, the bit-shifts would operate in parallel and are significantly faster than the multiplication. The only VM in Table 5-3 on which the JDK conversion is slower than the algorithm I presented is the interpreted VM (1.4.0-Xint), which supports the analysis that the bit-shifts are crucial. Whether you can produce an algorithm that is even faster than the latest JDK one by also using bit-shifting is best left to another time.
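To show where such a shifted multiplication fits into digit extraction, here is a small sketch written in the spirit of the description above; it is not copied from the JDK source, and the names are hypothetical. The remainder after dividing by 100 is obtained by subtracting the shifted product instead of using a second division or a multiplication.

```java
// Sketch: obtain the last two decimal digits of a non-negative int using
// one division plus shifts and adds.
final class ShiftDemo {
    // Equivalent to q * 100, because 100 = 64 + 32 + 4.
    static int times100(int q) {
        return (q << 6) + (q << 5) + (q << 2);
    }

    // Equivalent to i % 100 for non-negative i.
    static int lastTwoDigits(int i) {
        int q = i / 100;
        return i - times100(q);
    }
}
```

Whether this is actually faster than a plain multiplication depends on the VM and the CPU, so it is worth measuring before adopting it.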

Table 5-2. Time taken to append an int to a StringBuffer (from the first edition)

VM                        1.2   1.3  HotSpot 1.0  1.1.6
JDK int conversion        100%  61%  89%          148%
Optimized int conversion  84%   60%  81%          111%

Table 5-3. Time taken to append an int to a StringBuffer (current version)

VM                        1.1.8  1.2.2  1.3.1  1.3.1-server  1.4.0  1.4.0-server  1.4.0-Xint
JDK int conversion        148%   100%   51%    45%           58%    40%           498%
Optimized int conversion  172%   107%   96%    122%          94%    83%           402%

5.3.3 Converting bytes, shorts, chars, and booleans to Strings

You can use the int conversion method for bytes and shorts (using overloading). You can make byte conversion even faster using a String array as a lookup table for the 256 byte values. The conversion of bytes and shorts to Strings in the JDK appears not to have been tuned to as high a standard as radix 10 ints (up to JDK 1.4). This means that the int conversion algorithm shown previously, when applied to bytes and shorts, is significantly faster than the JDK conversions and does not produce any temporary objects.
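The lookup-table idea for bytes can be sketched as follows. The class and method names are hypothetical; the table is built once, using the JDK conversion, and every later conversion is a single array access.

```java
// Sketch: canonical strings for all 256 byte values, built once.
final class ByteStrings {
    private static final String[] TABLE = new String[256];

    static {
        for (int i = 0; i < 256; i++) {
            TABLE[i] = Integer.toString(i - 128); // index 0 holds "-128"
        }
    }

    // Returns the shared String for this byte; no object is created.
    static String toString(byte b) {
        return TABLE[b + 128];
    }

    static void append(StringBuffer s, byte b) {
        s.append(TABLE[b + 128]);
    }
}
```

The cost is the memory for 256 small strings, which stay alive for the lifetime of the class.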

When it comes to using the other data types, there is no need to handle booleans in any special way: the Boolean.toString( ) already uses canonical strings. And there is obviously nothing in particular that needs to be done for chars (apart from making sure you add them to strings as characters, not numbers).
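The caveat about chars is easy to trip over, because arithmetic on a char produces an int. A short sketch (the class name is hypothetical) shows the difference:

```java
// Sketch: a char expression that has been promoted to int is appended
// as a number, not as a character.
final class CharAppend {
    static String demo() {
        StringBuffer s = new StringBuffer();
        char c = 'a';
        s.append(c);               // appends the character a
        s.append('|');
        s.append(c + 1);           // char + int is an int: appends 98
        s.append('|');
        s.append((char) (c + 1));  // cast back to char: appends b
        return s.toString();       // "a|98|b"
    }
}
```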

5.3.4 Converting floats to Strings

Converting floating-point numbers to strings turns out to be hideously under-optimized in every version of the JDK up to 1.4 (and maybe beyond). Looking at the JDK code and comments, it seems that no one has yet got around to tuning these conversions. Floating-point numbers can be converted using similar optimizations to the number conversions previously addressed. You need to check for and handle the special cases separately. You then scale the floats into an integer value and use the previously defined int conversion algorithm to convert to characters in order, ensuring that you format the decimal point at the correct position. Values between 0.001 and 10,000,000 are handled differently because they are printed without exponents; all other floats are printed with exponents. Finally, it would be possible to overload the float and double case, but it turns out that if you do this, the float does not convert as well (in correctness or speed), so it is necessary to duplicate the algorithms for the float and double cases.

Note that the printed values of floats and doubles are, in general, only representative of the underlying value. This is true both for the JDK algorithms and the conversions here. There are times when the string representation comes out differently for the two implementations, and neither is actually more accurate. The algorithm used by the JDK prints the minimum number of digits possible, while maintaining uniqueness of the printed value with respect to the other floating-point values adjacent to the value being printed. The algorithm presented here prints the maximum number of digits (not including trailing zeros) regardless of whether some digits are not needed to distinguish the number from other numbers. For example, the Float.MIN_VALUE is printed by the JDK as "1.4E-45" whereas the algorithm here prints it as "1.4285714E-45". Because of the limitations in the accuracy of numbers, neither printed representation is more or less accurate compared to the underlying floating-point number actually held in Float.MIN_VALUE (e.g., assigning both "1.46e-45F" and "1.45e-45F" to a float results in Float.MIN_VALUE being assigned). Note that the code that follows shortly uses the previously defined append( ) method for appending longs to StringBuffers. Also note that the dot character has been hardcoded as the decimal separator character here for clarity, but it is straightforward to change for internationalization.
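The point about Float.MIN_VALUE can be checked with a few lines using only standard JDK calls; the class name is hypothetical. Both decimal strings parse to the same float, because Float.MIN_VALUE is the nearest representable value to each of them.

```java
// Sketch: several different decimal strings denote the same float.
final class MinValueDemo {
    static boolean bothParseToMinValue() {
        return Float.parseFloat("1.46e-45") == Float.MIN_VALUE
            && Float.parseFloat("1.45e-45") == Float.MIN_VALUE;
    }

    // The JDK prints the shortest string that still identifies the value.
    static String jdkString() {
        return Float.toString(Float.MIN_VALUE); // "1.4E-45"
    }
}
```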

This method of converting floats to strings has the same advantages as those mentioned previously for integral types (i.e., it is printed in digit order, no temporary objects are generated, etc.). The double conversion (see the next section) is similar to the float conversion, with all the same advantages. In addition, both algorithms are several times faster than the JDK conversions.

Normally, when you print out floating-point numbers, you print in a defined format with a specified number of digits. The default floating-point toString( ) methods cannot format floating-point numbers; you must first create the string, then format it afterwards. The algorithm presented here could easily be altered to handle formatting floating-point numbers without using any intermediate strings. This algorithm is also easily adapted to handle rounding up or down; it already detects which side of the "half" value the number is on:

public static final char[  ] NEGATIVE_INFINITY = 
      {'-','I','n','f','i','n','i','t','y'};
public static final char[  ] POSITIVE_INFINITY = 
      {'I','n','f','i','n','i','t','y'};
public static final char[  ] NaN = {'N','a','N'};
private static final int floatSignMask = 0x80000000;
private static final int floatExpMask  = 0x7f800000;
private static final int floatFractMask= ~(floatSignMask|floatExpMask);
private static final int floatExpShift = 23;
private static final int floatExpBias = 127;
//change dot to international character where this is used below
public static final char[  ] DOUBLE_ZERO = {'0','.','0'};
public static final char[  ] DOUBLE_ZERO2 = {'0','.','0','0'};
public static final char[  ] DOUBLE_ZERO0 = {'0','.'};
public static final char[  ] DOT_ZERO = {'.','0'}; 
private static final float[  ] f_magnitudes = {
 1e-44F, 1e-43F, 1e-42F, 1e-41F, 1e-40F,
 1e-39F, 1e-38F, 1e-37F, 1e-36F, 1e-35F, 1e-34F, 1e-33F, 1e-32F, 1e-31F, 1e-30F,
 1e-29F, 1e-28F, 1e-27F, 1e-26F, 1e-25F, 1e-24F, 1e-23F, 1e-22F, 1e-21F, 1e-20F,
 1e-19F, 1e-18F, 1e-17F, 1e-16F, 1e-15F, 1e-14F, 1e-13F, 1e-12F, 1e-11F, 1e-10F,
 1e-9F, 1e-8F, 1e-7F, 1e-6F, 1e-5F, 1e-4F, 1e-3F, 1e-2F, 1e-1F,
 1e0F, 1e1F, 1e2F, 1e3F, 1e4F, 1e5F, 1e6F, 1e7F, 1e8F, 1e9F,
 1e10F, 1e11F, 1e12F, 1e13F, 1e14F, 1e15F, 1e16F, 1e17F, 1e18F, 1e19F,
 1e20F, 1e21F, 1e22F, 1e23F, 1e24F, 1e25F, 1e26F, 1e27F, 1e28F, 1e29F,
 1e30F, 1e31F, 1e32F, 1e33F, 1e34F, 1e35F, 1e36F, 1e37F, 1e38F
};
  
public static void append(StringBuffer s, float d)
{
  //handle the various special cases
  if (d == Float.NEGATIVE_INFINITY)
    s.append(NEGATIVE_INFINITY);
  else if (d == Float.POSITIVE_INFINITY)
    s.append(POSITIVE_INFINITY);
  else if (d != d)
    s.append(NaN);
  else if (d == 0.0)
  {
    //can be -0.0, which is stored differently
    if ( (Float.floatToIntBits(d) & floatSignMask) != 0)
      s.append('-');
    s.append(DOUBLE_ZERO);
  }
  else
  {
    //convert negative numbers to positive
    if (d < 0)
    {
      s.append('-');
      d = -d;
    }
    //handle 0.001 up to 10000000 separately, without exponents
    if (d >= 0.001F && d < 0.01F)
    {
      long i = (long) (d * 1E12F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      s.append(DOUBLE_ZERO2);
      appendFractDigits(s, i,-1);
    }
    else if (d >= 0.01F && d < 0.1F)
    {
      long i = (long) (d * 1E11F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      s.append(DOUBLE_ZERO);
      appendFractDigits(s, i,-1);
    }
    else if (d >= 0.1F && d < 1F)
    {
      long i = (long) (d * 1E10F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      s.append(DOUBLE_ZERO0);
      appendFractDigits(s, i,-1);
    }
    else if (d >= 1F && d < 10F)
    {
      long i = (long) (d * 1E9F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      appendFractDigits(s, i,1);
    }
    else if (d >= 10F && d < 100F)
    {
      long i = (long) (d * 1E8F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      appendFractDigits(s, i,2);
    }
    else if (d >= 100F && d < 1000F)
    {
      long i = (long) (d * 1E7F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      appendFractDigits(s, i,3);
    }
    else if (d >= 1000F && d < 10000F)
    {
      long i = (long) (d * 1E6F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      appendFractDigits(s, i,4);
    }
    else if (d >= 10000F && d < 100000F)
    {
      long i = (long) (d * 1E5F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      appendFractDigits(s, i,5);
    }
    else if (d >= 100000F && d < 1000000F)
    {
      long i = (long) (d * 1E4F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      appendFractDigits(s, i,6);
    }
    else if (d >= 1000000F && d < 10000000F)
    {
      long i = (long) (d * 1E3F);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      appendFractDigits(s, i,7);
    }
    else
    {
      //Otherwise the number has an exponent
      int magnitude = magnitude(d);
      long i;
      if (magnitude < -35)
        i = (long) (d*1E10F / f_magnitudes[magnitude + 45]);
      else
        i = (long) (d / f_magnitudes[magnitude + 44 - 9]);
      i = i%100 >= 50 ? (i/100) + 1 : i/100;
      appendFractDigits(s, i, 1);
      s.append('E');
      append(s,magnitude);
    }
  }
}
  
private static int magnitude(float d)
{
  return magnitude(d,Float.floatToIntBits(d));
}
  
private static int magnitude(float d, int floatToIntBits)
{
  int magnitude = 
    (int) ((((floatToIntBits & floatExpMask) >> floatExpShift)
                 - floatExpBias) * 0.301029995663981);
  
  if (magnitude < -44)
    magnitude = -44;
  else if (magnitude > 38)
    magnitude = 38;
  
  if (d >= f_magnitudes[magnitude+44])
  {
    while(magnitude < 39 && d >= f_magnitudes[magnitude+44])
      magnitude++;
    magnitude--;
    return magnitude;
  }
  else
  {
    while(magnitude > -45 && d < f_magnitudes[magnitude+44])
      magnitude--;
    return magnitude;
  }
}
private static void appendFractDigits(StringBuffer s, long i, int decimalOffset)
{
  long mag = l_magnitude(i);
  long c;
  while ( i > 0 )
  {
    c = i/mag;
    s.append(charForDigit[(int) c]);
    decimalOffset--;
    if (decimalOffset == 0)
      s.append('.'); //change to use international character
    c *= mag;
    if ( c <= i)
      i -= c;
    mag = mag/10;
  }
  if (i != 0)
    s.append(charForDigit[(int) i]);
  else if (decimalOffset > 0)
  {
    s.append(ZEROS[decimalOffset]); //ZEROS[n] is a char array of n 0's
    decimalOffset = 1;
  }
  
  decimalOffset--;
  if (decimalOffset == 0)
    s.append(DOT_ZERO);
  else if (decimalOffset == -1)
    s.append('0');
}
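The first step of the magnitude( ) method above is worth isolating: it takes the binary exponent from the float's bit pattern and multiplies it by log10(2), roughly 0.30103, to get a first guess at the decimal exponent. The following sketch shows only that estimate; the class and method names are hypothetical, and the mask, shift, and bias are the float constants defined earlier, written as literals.

```java
// Sketch: estimate the decimal exponent of a float from its binary exponent.
final class MagnitudeEstimate {
    static int estimate(float f) {
        int bits = Float.floatToIntBits(f);
        // 8 exponent bits, shifted down and with the bias of 127 removed
        int binaryExponent = ((bits & 0x7f800000) >> 23) - 127;
        // 2^n is about 10^(n * log10(2)); the cast truncates toward zero
        return (int) (binaryExponent * 0.301029995663981);
    }
}
```

For 1000F the binary exponent is 9, so the estimate is 2 rather than 3. That is why magnitude( ) follows the estimate with a short walk through the f_magnitudes table to find the exact value.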

The conversion times compared to the JDK conversions are shown in Table 5-4. Note that if you are formatting floats, the JDK conversion requires additional steps and so takes longer. However, the method shown here is likely to take even less time, as you normally print fewer digits that require fewer loop iterations.

Table 5-4. Time taken to append a float to a StringBuffer

VM                          1.1.8  1.2.2  1.3.1  1.3.1-server  1.4.0  1.4.0-server  1.4.0-Xint
JDK float conversion        128%   100%   85%    60%           117%   66%           472%
Optimized float conversion  55%    62%    47%    44%           49%    29%           144%

5.3.5 Converting doubles to Strings

The double conversion is almost identical to the float conversion, except that the doubles extend over a larger range. The differences are the following constants used in place of the corresponding float constants:

private static final long  doubleSignMask = 0x8000000000000000L;
private static final long  doubleExpMask  = 0x7ff0000000000000L;
private static final long  doubleFractMask= ~(doubleSignMask|doubleExpMask);
private static final int  doubleExpShift = 52;
private static final int  doubleExpBias = 1023;
private static final double[] d_magnitudes = {
  //as f_magnitudes[] except doubles extending
  //from 1e-323D to 1e308D inclusive
  ...
};

The last section of the append( ) method is:

      int magnitude = magnitude(d);
      long i;
      if (magnitude < -305)
        i = (long) (d*1E18 / d_magnitudes[magnitude + 324]);
      else
        i = (long) (d / d_magnitudes[magnitude + 323 - 17]);
      i = i%10 >= 5 ? (i/10) + 1 : i/10;
      appendFractDigits(s, i, 1);
      s.append('E');
      append(s,magnitude);

and the magnitude methods are:

private static int magnitude(double d)
{
  return magnitude(d,Double.doubleToLongBits(d));
}
private static int magnitude(double d, long doubleToLongBits)
{
  int magnitude = 
    (int) ((((doubleToLongBits & doubleExpMask) >> doubleExpShift)
                - doubleExpBias) * 0.301029995663981);
  
  if (magnitude < -323)
    magnitude = -323;
  else if (magnitude > 308)
    magnitude = 308;
  
  if (d >= d_magnitudes[magnitude+323])
  {
    while(magnitude < 309 && d >= d_magnitudes[magnitude+323])
      magnitude++;
    magnitude--;
    return magnitude;
  }
  else
  {
    while(magnitude > -324 && d < d_magnitudes[magnitude+323])
      magnitude--;
    return magnitude;
  }
}

The conversion times compared to the JDK conversions are shown in Table 5-5. As with floats, formatting doubles with the JDK conversion requires additional steps and would consequently take longer, but the method shown here takes even less time, as you normally print fewer digits that require fewer loop iterations.

Table 5-5. Time taken to append a double to a StringBuffer

VM                           1.1.8  1.2.2  1.3.1  1.3.1-server  1.4.0  1.4.0-server  1.4.0-Xint
JDK double conversion        117%   100%   94%    76%           95%    87%           761%
Optimized double conversion  22%    17%    19%    17%           20%    14%           64%

5.3.6 Converting Objects to Strings

Converting Objects to Strings is also inefficient in the JDK. For a generic object, the toString( ) method is usually implemented by calling the toString( ) method of each embedded object, then combining the resulting strings in some way. For example, Vector.toString( ) calls toString( ) on all its elements, joins the generated substrings with a comma and a space, and encloses the result in square brackets.

Although this conversion is generic, it usually creates a huge number of unnecessary temporary objects. If the JDK had taken the "printOn: aStream" paradigm from Smalltalk, the temporary objects used would be significantly reduced. This paradigm basically allows any object to be appended to a stream. In Java, it looks something like:

public String toString(  )
{
  StringBuffer s =new  StringBuffer(  );
  appendTo(s);
  return s.toString(  );
}
  
public void appendTo(StringBuffer s)
{
  //The real work of converting to strings. Any embedded
  //objects would have their 'appendTo(  )' methods called,
  //NOT their 'toString(  )' methods.
  ...
}

This implementation allows far fewer objects to be created in converting to strings. In addition, as StringBuffer is not a stream, this implementation becomes much more useful if you use a java.io.StringWriter and change the appendTo( ) method to accept any Writer, for example:

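public String toString()
{
  java.io.StringWriter s = new java.io.StringWriter();
  try
  {
    appendTo(s);
  }
  catch (java.io.IOException e)
  {
    //Not expected from a StringWriter, but Writer methods declare it
    throw new RuntimeException(e.toString());
  }
  return s.getBuffer().toString();
}
  
public void appendTo(java.io.Writer s) throws java.io.IOException
{
  //The real work of converting to strings. Any embedded
  //objects would have their 'appendTo()' methods called,
  //NOT their 'toString()' methods.
  ...
}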

This implementation allows the one appendTo( ) method to write out any object to any streamed writer object. Unfortunately, this implementation is not supported by the Object class, so you need to create your own framework of methods and interfaces to support this implementation. I find that I can use an Appendable interface with an appendTo( ) method, and then write toString( ) methods that check for that interface:

public interface Appendable
{
  //Writer methods throw IOException, so the interface must declare it
  public void appendTo(java.io.Writer s) throws java.io.IOException;
}
  
public class SomeClass
  implements Appendable
{
  Object[] embeddedObjects;
  
  ...
  
  public String toString()
  {
    java.io.StringWriter s = new java.io.StringWriter();
    try
    {
      appendTo(s);
    }
    catch (java.io.IOException e)
    {
      //Not expected from a StringWriter, but Writer methods declare it
      throw new RuntimeException(e.toString());
    }
    return s.getBuffer().toString();
  }
  public void appendTo(java.io.Writer s) throws java.io.IOException
  {
    //The real work of converting to strings. Any embedded
    //objects would have their 'appendTo()' methods called,
    //NOT their 'toString()' methods.
    for (int i = 0; i < embeddedObjects.length; i++)
      if (embeddedObjects[i] instanceof Appendable)
        ( (Appendable) embeddedObjects[i]).appendTo(s);
      else
        s.write(embeddedObjects[i].toString());
  }
}

In addition, you can extend this framework even further to override the appending of frequently used classes such as Vector, allowing a more efficient conversion mechanism that uses fewer temporary objects:

public class AppenderHelper
{
  final static String NULL = "null";
  final static String OPEN = "[";
  final static String CLOSE = "]";
  final static String MIDDLE = ", ";
  
  public void appendCheckingAppendable(Object o, java.io.Writer s)
    throws java.io.IOException
  {
    //Use more efficient Appendable interface if possible,
    //and NULL string if appropriate
    if (o == null)
      s.write(NULL);
    else if (o instanceof Appendable)
      ( (Appendable) o).appendTo(s);
    else
      s.write(o.toString());
  }
  
  public void appendVector(java.util.Vector v, java.io.Writer s)
    throws java.io.IOException
  {
    int size = v.size();
  
    //Write the opening bracket
    s.write(OPEN);
  
    if (size != 0)
    {
      //Add the first element
      appendCheckingAppendable(v.elementAt(0), s);
      //And add in each other element preceded by the MIDDLE separator
      for(int i = 1; i < size; i++)
      {
          s.write(MIDDLE);
          appendCheckingAppendable(v.elementAt(i), s);
      }
    }
  
    //Write the closing bracket
    s.write(CLOSE);
  }
}

If you add this framework to an application, you can support the notion of converting objects to string representations to a particular depth. For example, a Vector containing another Vector to depth two looks like this:

[1, 2, [3, 4, 5]]

But to depth one, it looks like this:

[1, 2, Vector@4444]
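One way to support a depth limit is to pass the remaining depth down through the append methods. The following is a minimal sketch with hypothetical names; it handles only nested Vectors, and it uses System.identityHashCode( ) for the short form, which is an assumption rather than part of the framework described here.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Vector;

// Sketch: convert a Vector to text, expanding nested Vectors only down
// to a given depth and using a short "Vector@hash" form below that.
final class DepthAppender {
    static void appendVector(Vector<?> v, Writer s, int depth) throws IOException {
        s.write("[");
        for (int i = 0; i < v.size(); i++) {
            if (i > 0) {
                s.write(", ");
            }
            appendElement(v.elementAt(i), s, depth - 1);
        }
        s.write("]");
    }

    private static void appendElement(Object o, Writer s, int depth) throws IOException {
        if (o == null) {
            s.write("null");
        } else if (o instanceof Vector<?>) {
            if (depth > 0) {
                appendVector((Vector<?>) o, s, depth);
            } else {
                // Depth exhausted: write a short identity form instead
                s.write("Vector@");
                s.write(Integer.toHexString(System.identityHashCode(o)));
            }
        } else {
            s.write(o.toString());
        }
    }

    // Convenience wrapper for trying the method out with a StringWriter.
    static String toText(Vector<?> v, int depth) {
        StringWriter w = new StringWriter();
        try {
            appendVector(v, w, depth);
        } catch (IOException e) {
            throw new RuntimeException(e.toString());
        }
        return w.toString();
    }
}
```

With a Vector holding 1, 2, and a nested Vector of 3, 4, 5, a depth of two expands the nested Vector and a depth of one writes it in the short form.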

The default Object.toString( ) implementation in the JDK writes out strings for objects as:

return getClass(  ).getName(  ) + "@" + Integer.toHexString(hashCode(  ));

The JDK implementation is inefficient for two reasons. First, the method creates an unnecessary intermediate string because it uses the concatenation operator twice. Second, the Class.getName( ) method (which is a native method) also creates a new string every time it is called: the class name is not cached. It turns out that if you reimplement this to cache the class name and avoid the extra temporary strings, your conversion is faster and uses fewer temporary objects. The two are related, of course: using fewer temporary objects means less object-creation overhead.
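A minimal sketch of such a reimplementation follows. It assumes, as the text does, that Class.getName( ) is costly enough to be worth caching, which may not hold on every VM; the class and method names are hypothetical. The class name comes from a cache and the hash code is appended as hex digits directly, so no intermediate strings are built.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: append "ClassName@hexHash" to a StringBuffer using a cached
// class name and no temporary Strings.
final class IdentityStrings {
    private static final Map<Class<?>, String> NAMES = new HashMap<Class<?>, String>();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    static synchronized String nameOf(Class<?> c) {
        String name = NAMES.get(c);
        if (name == null) {
            name = c.getName();
            NAMES.put(c, name);
        }
        return name;
    }

    static void appendIdentity(StringBuffer s, Object o) {
        s.append(nameOf(o.getClass())).append('@');
        appendHex(s, o.hashCode());
    }

    // Appends the value as unsigned hex without leading zeros,
    // matching the output of Integer.toHexString().
    static void appendHex(StringBuffer s, int value) {
        int shift = 28;
        while (shift > 0 && ((value >>> shift) & 0xf) == 0) {
            shift -= 4;
        }
        for (; shift >= 0; shift -= 4) {
            s.append(HEX[(value >>> shift) & 0xf]);
        }
    }
}
```

Note that a cache keyed on Class objects keeps those classes reachable, which matters in applications that load and unload classes dynamically.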

You can create a generic framework that converts the basic data types while also supporting the efficient conversion of JDK classes (such as Vector, as well as Integer, Long, etc.). With this framework in place, I find that performance is generally improved because the application uses more efficient conversion algorithms and reduces the number of temporary objects. In almost every respect, this framework is better than the simpler framework, which supports only the toString( ) method.
