[ Team LiB ] Previous Section Next Section

3.4 Using the SafeStr Library

3.4.1 Problem

You want an alternative to using the standard C string-manipulation functions to help avoid buffer overflows (see Recipe 3.3), format-string problems (see Recipe 3.2), and the use of unchecked external input.

3.4.2 Solution

Use the SafeStr library, which is available from http://www.zork.org/safestr/.

3.4.3 Discussion

The SafeStr library provides an implementation of dynamically sizable strings in C. In addition, the library also performs reference counting and accounting of the allocated and actual sizes of each string. Any attempt to increase the actual size of a string beyond its allocated size causes the library to increase the allocated size of the string to a size at least as large. Because strings managed by SafeStr ("safe strings") are dynamically sized, safe strings are not a source of potential buffer overflows. (See Recipe 3.3.)

Safe strings use the type safestr_t, which can actually be cast to the normal C-style string type, char *, though we strongly recommend against doing so where it can be avoided. In fact, the only time you should ever cast a safe string to a normal C-style string is for read-only purposes. This is also the only reason why the safestr_t type was designed in a way that allows casting to normal C-style strings.

Casting a safe string to a normal C-style string and modifying it using C-style string-manipulation functions or other means defeats the protections and accounting afforded by the SafeStr library.

The SafeStr library provides a rich set of API functions to manipulate the strings it manages. The large number of functions prohibits us from enumerating them all here, but note that the library comes with complete documentation in the form of Unix man pages, HTML, and PDF. Table 3-1 lists the functions that have C equivalents, along with those equivalents.

Table 3-1. SafeStr API functions and equivalents for normal C strings

SafeStr function

C function

safestr_append( )

strcat( )

safestr_nappend( )

strncat( )

safestr_find( )

strstr( )

safestr_copy( )

strcpy( )

safestr_ncopy( )

strncpy( )

safestr_compare( )

strcmp( )

safestr_ncompare( )

strncmp( )

safestr_length( )

strlen( )

safestr_sprintf( )

sprintf( )

safestr_vsprintf( )

vsprintf( )

You can typically create safe strings in any of the following three ways:

SAFESTR_ALLOC( )

Allocates a resizable string with an initial allocation size in bytes as specified by its only argument. The string returned will be an empty string (actual size zero). Normally the size allocated for a string will be larger than the actual size of the string. The library rounds memory allocations up, so if you know that you will need a large string, it is worth allocating it with a large initial allocation size up front to avoid reallocations as the actual string length grows.

SAFESTR_CREATE( )

Creates a resizable string from the normal C-style string passed as its only argument. This is normally the appropriate way to convert a C-style string to a safe string.

SAFESTR_TEMP( )

Creates a temporary resizable string from the normal C-style string passed as its only argument. SAFESTR_CREATE( ) and SAFESTR_TEMP( ) behave similarly, except that a string created by SAFESTR_TEMP( ) will be automatically destroyed by the next SafeStr function that uses it. The only exception is safestr_reference( ), which increments the reference count on the string, allowing it to survive until safestr_release( ) or safestr_free( ) is called to decrement the string's reference count.

People are sometimes confused about when actually to use SAFESTR_TEMP( ), as well as how to use it properly. Use SAFESTR_TEMP( ) when you need to pass a constant string as an argument to a function that is expecting a safestr_t. A perfect example of such a case would be safestr_sprintf( ), which has the following signature:

int safestr_sprintf(safestr_t *output, safestr_t *fmt, ...);

The string that specifies the format must be a safe string, but because you should always use constant strings for the format specification (see Recipe 3.2), you should use SAFESTR_TEMP( ). The alternative is to use SAFESTR_CREATE( ) to create the string before calling safestr_sprintf( ), and free it immediately afterward with safestr_free( ).

int       i = 42;
safestr_t fmt, output;
   
output = SAFESTR_ALLOC(1);
   
/* Instead of doing this: */
fmt = SAFESTR_CREATE("The value of i is %d.\n");
safestr_sprintf(&output, fmt, i);
safestr_free(fmt);
   
/* You can do this: */
safestr_sprintf(&output, SAFESTR_TEMP("The value of i is %d.\n"), i);

When using temporary strings, remember that the temporary string will be destroyed automatically after a call to any SafeStr API function except safestr_reference( ), which will increment the string's reference count. If a temporary string's reference count is incremented, the string will then survive any number of API calls until its reference count is decremented to the extent that it will be destroyed. The API functions safestr_release( ) and safestr_free( ) may be used interchangeably to decrement a string's reference count.

For example, if you are writing a function that accepts a safestr_t as an argument (which may or may not be passed as a temporary string) and you will be performing multiple operations on the string, you should increment the string's reference count before operating on it, and decrement it again when you are finished. This will ensure that the string is not prematurely destroyed if a temporary string is passed in to the function.

void some_function(safestr_t *base, safestr_t extra) {
  safestr_reference(extra);
  if (safestr_length(*base) + safestr_length(extra) < 17)
    safestr_append(base, extra);
  safestr_release(extra);
}

In this example, if you omitted the calls to safestr_reference( ) and safestr_release( ), and if extra was a temporary string, the call to safestr_length( ) would cause the string to be destroyed. As a result, the safestr_append( ) call would then be operating on an invalid safestr_t if the combined length of base and extra were less than 17.

Finally, the SafeStr library also tracks the trustworthiness of strings. A string can be either trusted or untrusted. Operations that combine strings result in untrusted strings if any one of the strings involved in the combination is untrusted; otherwise, the result is trusted. There are few places in SafeStr's API where the trustworthiness of a string is tested, but the function safestr_istrusted( ) allows you to test strings yourself.

The strings that result from using SAFESTR_CREATE( ) or SAFESTR_TEMP( ) are untrusted. You can use SAFESTR_TEMP_TRUSTED( ) to create temporary strings that are trusted. The trustworthiness of an existing string can be altered using safestr_trust( ) to make it trusted or safestr_untrust( ) to make it untrusted.

The main reason to track the trustworthiness of a string is to monitor the flow of external inputs. Safe strings created from external data should initially be untrusted. If you later verify the contents of a string, ensuring that it contains nothing dangerous, you can then mark the string as trusted. Whenever you need to use a string to perform some potentially dangerous operation (for example, using a string in a command-line argument to an external program), check the trustworthiness of the string before you use it, and fail appropriately if the string is untrusted.

3.4.4 See Also

    [ Team LiB ] Previous Section Next Section