Sunday 2 November 2008

malloc failures

I can't put a comment on Debarshi's post, so I'll answer here. Debarshi complains about this comment by the "inimitable" Jeff Johnson:

You have to look at the usage case, malloc returning NULL is a "can't happen" condition where an exit call is arguably justified.

Returning an error from library to application when malloc returns NULL assumes:

1) error return paths exist [...]
2) applications are prepared to do something meaningful with the error

Another problem is that only about 1 in 10 memory allocations in a typical C program is a malloc. The rest are stack-allocated variables, and those aren't usually checked at all. If any one of those 9 out of 10 stack allocations fails, your whole program fails hard.

The correct way to deal with the 1 in 10 memory allocations that you can check is to provide a custom abort function, which the main program can override in the very rare case that it can do anything useful other than exit:

#include <stdlib.h>

/* Called when an allocation fails.  Defaults to abort. */
void (*custom_abort) (void) = abort;

/* Let the main program install its own failure handler. */
void
lib_set_custom_abort (void (*new_abort) (void))
{
  custom_abort = new_abort;
}

void *
lib_malloc (size_t n)
{
  void *data = malloc (n);
  if (data == NULL) custom_abort ();
  return data;
}

Note that the main program can use longjmp (or exceptions in some cases) to "return" to a safe point in the program, such as a transaction checkpoint. If the main program uses pool allocators (about the only safe and sensible way to deal with C's programming model), then it has a chance of recovering.
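
To make the longjmp idea concrete, here is a rough sketch of how a main program might combine the hook with a pool. Everything in it is my own illustration rather than code from any real library: the names txn_pool, pool_malloc, pool_free_all, run_transaction and jump_to_checkpoint are hypothetical, fail_after is a test knob that simulates malloc returning NULL, and custom_abort is repeated so the sketch compiles on its own. It assumes C11 for max_align_t.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool: a linked list of blocks that are freed all at once.
   The union keeps the memory after the header suitably aligned. */
union pool_block { union pool_block *next; max_align_t align; };
struct pool { union pool_block *head; size_t count; };

/* File scope, not a local: locals changed between setjmp and longjmp
   have indeterminate values after the jump unless they are volatile. */
struct pool txn_pool;
static jmp_buf checkpoint;

/* Test knob (an assumption of this sketch): pretend malloc fails once
   this many blocks are live.  (size_t) -1 means "never". */
size_t fail_after = (size_t) -1;

static void (*custom_abort) (void) = abort;

/* Handler installed by the main program: jump back to the safe point. */
static void
jump_to_checkpoint (void)
{
  longjmp (checkpoint, 1);
}

static void *
pool_malloc (struct pool *p, size_t n)
{
  union pool_block *b;
  assert (p != NULL);
  b = p->count >= fail_after ? NULL : malloc (sizeof *b + n);
  if (b == NULL) {
    custom_abort ();
    abort ();                   /* the handler must not return */
  }
  b->next = p->head;
  p->head = b;
  p->count++;
  return b + 1;                 /* usable memory starts after the header */
}

static void
pool_free_all (struct pool *p)
{
  while (p->head != NULL) {
    union pool_block *next = p->head->next;
    free (p->head);
    p->head = next;
    p->count--;
  }
}

/* Allocate `want' blocks as one transaction.  Returns 0 on success, or
   -1 if an allocation failed and the whole pool was thrown away. */
int
run_transaction (size_t want)
{
  size_t i;
  txn_pool.head = NULL;
  txn_pool.count = 0;
  custom_abort = jump_to_checkpoint;
  if (setjmp (checkpoint) != 0) {
    pool_free_all (&txn_pool);  /* roll back: nothing leaks */
    return -1;
  }
  for (i = 0; i < want; i++)
    pool_malloc (&txn_pool, 64);
  pool_free_all (&txn_pool);
  return 0;
}
```

The point of the pool is that the recovery path never has to know which individual allocations succeeded before the failure. It frees the whole pool and carries on from the checkpoint.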

Really the answer is to use a sensible programming language though. Programming languages invented before C had safer, faster memory allocation, dealt with 10 out of 10 memory allocation errors, and provided a mechanism to recover correctly. Those languages are now 30 years more advanced. In 2008 we're having these silly arguments about how to deal with malloc failures. That's a failure of ourselves as programmers.
