8.6 Managing the Heap

Each Unix process owns a specific memory region called the heap, which is used to satisfy the process's dynamic memory requests. The start_brk and brk fields of the memory descriptor delimit the starting and ending addresses, respectively, of that region.

The following C library functions can be used by the process to request and release dynamic memory:

malloc(size)

Requests size bytes of dynamic memory; if the allocation succeeds, it returns the linear address of the first memory location.

calloc(n,size)

Requests an array consisting of n elements of size size; if the allocation succeeds, it initializes the array components to 0 and returns the linear address of the first element.

free(addr)

Releases the memory region allocated by malloc( ) or calloc( ) that has an initial address of addr.

brk(addr)

Modifies the size of the heap directly; the addr parameter specifies the new value of current->mm->brk, and the return value is the new ending address of the memory region (the process must check whether it coincides with the requested addr value).

sbrk(incr)

Is similar to brk( ), except that the incr parameter specifies the increment or decrement of the heap size in bytes.
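
As an illustration of how a process typically uses the first three functions, here is a minimal User Mode sketch. The helper names copy_ints and zeroed_ints are hypothetical and exist only for this example; the sketch does not guard against overflow in the size computation.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns a heap-allocated copy of the n ints in src, or NULL if
 * malloc() fails. The caller releases the copy with free().
 */
int *copy_ints(const int *src, size_t n)
{
    int *dst = malloc(n * sizeof *dst);

    if (dst != NULL)
        memcpy(dst, src, n * sizeof *dst);
    return dst;
}

/*
 * Returns an array of n ints that calloc() has already set to 0,
 * or NULL on failure. The caller releases it with free().
 */
int *zeroed_ints(size_t n)
{
    return calloc(n, sizeof(int));
}
```

Both helpers return NULL when the request cannot be satisfied, so the caller must test the result before using it and must pass every non-NULL result to free( ) exactly once.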

The brk( ) function differs from the other functions listed because it is the only one implemented as a system call. All the other functions are implemented in the C library by using brk( ) and mmap( ).
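
The effect of moving the end of the heap can be observed from User Mode with sbrk( ). The following is a minimal sketch, assuming a Linux system with glibc; the helper names current_break and grow_heap are hypothetical. Programs that also call malloc( ) should not normally move the break themselves, because the C library manages it on their behalf.

```c
#define _DEFAULT_SOURCE   /* asks glibc's <unistd.h> to declare sbrk() */
#include <stdint.h>
#include <unistd.h>

/* Returns the current end of the heap (the "program break"). */
void *current_break(void)
{
    return sbrk(0);
}

/*
 * Moves the end of the heap by incr bytes (negative values shrink it).
 * Returns the distance the break actually moved, or -1 if the request
 * was refused. A zero increment should be queried with current_break().
 */
long grow_heap(intptr_t incr)
{
    void *before = sbrk(incr);      /* sbrk() returns the previous break */

    if (before == (void *)-1)
        return -1;
    return (long)((char *)sbrk(0) - (char *)before);
}
```

Calling grow_heap(4096) followed by grow_heap(-4096) should leave the break where it started, provided nothing else in the process moves it in between.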

When a process in User Mode invokes the brk( ) system call, the kernel executes the sys_brk(addr) function (see Chapter 9). This function first verifies whether the addr parameter falls inside the memory region that contains the process code; if so, it returns immediately:

mm = current->mm; 
down_write(&mm->mmap_sem);
if (addr < mm->end_code) {
out:
    up_write(&mm->mmap_sem);
    return mm->brk; 
} 

Since the brk( ) system call acts on a memory region, it allocates and deallocates whole pages. Therefore, the function aligns the value of addr to a multiple of PAGE_SIZE and compares the result with the value of the brk field of the memory descriptor:

newbrk = (addr + 0xfff) & 0xfffff000; 
oldbrk = (mm->brk + 0xfff) & 0xfffff000; 
if (oldbrk == newbrk) { 
    mm->brk = addr; 
    goto out; 
} 
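
The rounding performed above can be tried out in isolation. This is a small sketch that assumes 4 KB pages and 32-bit addresses, as the masks 0xfff and 0xfffff000 do; SKETCH_PAGE_SIZE, page_align_up, and same_last_page are hypothetical names, not kernel identifiers.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size: 4 KB */

/* Rounds addr up to the next multiple of the page size. */
uint32_t page_align_up(uint32_t addr)
{
    return (addr + (SKETCH_PAGE_SIZE - 1)) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Returns nonzero if the old and new heap ends round up to the same
 * page boundary, in which case no page has to be mapped or unmapped
 * and only the brk field needs updating.
 */
int same_last_page(uint32_t old_brk, uint32_t new_brk)
{
    return page_align_up(old_brk) == page_align_up(new_brk);
}
```

For instance, 0x0804a001 and 0x0804afff both round up to 0x0804b000, so moving the heap end from one to the other changes no page mappings, while 0x0804b001 rounds up to 0x0804c000 and would require one more page.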

If the process asked to shrink the heap, sys_brk( ) invokes the do_munmap( ) function to do the job and then returns:

if (addr <= mm->brk) { 
    if (!do_munmap(mm, newbrk, oldbrk-newbrk)) 
        mm->brk = addr; 
    goto out; 
} 

If the process asked to enlarge the heap, sys_brk( ) first checks whether the process is allowed to do so. If the enlarged data segment would exceed the process's RLIMIT_DATA soft limit, the function simply returns the original value of mm->brk without allocating more memory:

rlim = current->rlim[RLIMIT_DATA].rlim_cur; 
if (rlim < RLIM_INFINITY && addr - mm->start_data > rlim) 
    goto out; 
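
The same limit is visible from User Mode through the getrlimit( ) function. The sketch below, which assumes a POSIX system, performs a comparable test against the soft limit; fits_data_limit is a hypothetical helper name, and the size passed in plays the role of addr - mm->start_data.

```c
#include <sys/resource.h>

/*
 * Returns 1 if a data segment of `size` bytes fits within the soft
 * RLIMIT_DATA limit of the calling process, 0 if it does not, and
 * -1 if the limit cannot be read.
 */
int fits_data_limit(unsigned long long size)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)   /* no limit configured */
        return 1;
    return size <= (unsigned long long)rl.rlim_cur;
}
```

On many systems the limit is RLIM_INFINITY by default, in which case the check always succeeds, just as the rlim < RLIM_INFINITY test in sys_brk( ) skips the comparison.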

The function then checks whether the enlarged heap would overlap some other memory region belonging to the process and, if so, returns without doing anything:

if (find_vma_intersection(mm, oldbrk, newbrk+PAGE_SIZE)) 
    goto out; 

The last check before proceeding to the expansion consists of verifying whether the available free virtual memory is sufficient to support the enlarged heap (see the earlier Section 8.3.4):

if (!vm_enough_memory((newbrk-oldbrk) >> PAGE_SHIFT)) 
    goto out; 

If everything is OK, the do_brk( ) function is invoked; it maps the new region exactly at the requested address, as if the MAP_FIXED flag were set. If it returns the oldbrk value, the allocation was successful and sys_brk( ) returns the value addr; otherwise, it returns the old mm->brk value:

if (do_brk(oldbrk, newbrk-oldbrk) == oldbrk)
    mm->brk = addr;
goto out;
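
The order of the checks made by sys_brk( ) can be summarized with a toy User Mode model. This is only a sketch under simplifying assumptions, not kernel code: struct toy_mm and toy_brk are hypothetical, unmapping and mapping are assumed to always succeed, the free-memory check is omitted, and a single next_region field stands in for the search performed by find_vma_intersection( ).

```c
#include <stdint.h>

#define TOY_PAGE_SIZE 0x1000u
#define TOY_PAGE_MASK 0xfffff000u

struct toy_mm {
    uint32_t end_code;     /* end of the code region */
    uint32_t start_data;   /* start of the data region */
    uint32_t brk;          /* current end of the heap */
    uint32_t data_limit;   /* stand-in for the RLIMIT_DATA soft limit */
    uint32_t next_region;  /* start of the nearest region above the heap */
};

/* Returns the end of the heap after trying to move it to addr. */
uint32_t toy_brk(struct toy_mm *mm, uint32_t addr)
{
    uint32_t newbrk, oldbrk;

    if (addr < mm->end_code)                 /* inside the code: refuse */
        return mm->brk;

    newbrk = (addr + 0xfff) & TOY_PAGE_MASK;
    oldbrk = (mm->brk + 0xfff) & TOY_PAGE_MASK;

    /* Same last page, or a shrink (unmapping assumed to succeed). */
    if (oldbrk == newbrk || addr <= mm->brk) {
        mm->brk = addr;
        return mm->brk;
    }
    if (addr - mm->start_data > mm->data_limit)      /* over the limit */
        return mm->brk;
    if (mm->next_region < newbrk + TOY_PAGE_SIZE)    /* would overlap */
        return mm->brk;

    mm->brk = addr;                          /* expansion accepted */
    return mm->brk;
}
```

In every refused case the function returns the unchanged brk value, which is why a caller has to compare the return value with the address it requested.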

The do_brk( ) function is actually a simplified version of do_mmap( ) that handles only anonymous memory regions. Its invocation might be considered equivalent to:

do_mmap(NULL, oldbrk, newbrk-oldbrk, PROT_READ|PROT_WRITE|PROT_EXEC, 
        MAP_FIXED|MAP_PRIVATE, 0)

Of course, do_brk( ) is slightly faster than do_mmap( ) because it avoids several checks on the memory region object fields by assuming that the memory region doesn't map a file on disk.
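
A process can create a comparable anonymous region itself through the mmap( ) function. The following User Mode sketch, assuming Linux with glibc, is an analogy rather than a reproduction of what do_brk( ) does: it omits MAP_FIXED and PROT_EXEC and lets the kernel choose the address. The helper names map_anonymous and unmap_anonymous are hypothetical.

```c
#define _DEFAULT_SOURCE   /* asks glibc's <sys/mman.h> to define MAP_ANONYMOUS */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Maps len bytes of private, anonymous memory that is not backed by
 * any file. Returns the start of the region, or NULL on failure.
 */
void *map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Releases a region obtained from map_anonymous(); returns 0 on success. */
int unmap_anonymous(void *p, size_t len)
{
    return munmap(p, len);
}
```

Pages of such a region read as zero until the process writes to them, which is the same behavior a process sees in freshly obtained heap pages.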