Signal safety and why you should care

Calling a non-async-signal-safe function from a signal handler is undefined behaviour and can lead to all sorts of weird effects: deadlocks, race conditions and even memory corruption. You may even see crashes originating from libc, such as the following:

malloc(): unsorted double linked list corrupted
Fatal glibc error: malloc.c:2599 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)

What are signals?

Signals are a way to deliver asynchronous notifications on a POSIX system (such as BSD, Linux, macOS) to tell a process that a specific event happened. The key feature here is the asynchronous part. A signal can be delivered at almost any moment, interrupting the application’s normal flow. This might happen while the application is inside a critical section or a syscall. Your application can be in the middle of allocating or freeing memory, or performing any other standard C function call. Pretty scary.

By default most signals terminate the process, while a few are ignored. For most of them (SIGKILL and SIGSTOP cannot be caught) a custom signal handler can be installed to execute potentially any code when the signal is delivered. Potentially, because a handler can interrupt the program at almost any point, as described in the previous paragraph. Therefore the installed signal handler needs to be so-called async-signal-safe.

What is async‑signal safety?

A function is considered async-signal-safe if it can be safely called from within a signal handler without causing undefined behaviour. This is a pretty circular definition. In practice there are two ways to achieve it:

  • the function needs to be reentrant and only execute atomic operations
  • the function needs to block signals temporarily

A function is reentrant if it can be invoked again while a previous invocation is still running. Basically it must be able to recurse into itself at any point. Such a function cannot rely on global state or execute non-atomic operations.
An atomic operation is one that completes as a single step that cannot be interrupted. It either happens completely or not at all: an intermediate state cannot be observed. Most operations are not atomic. Even seemingly simple operations might not be, e.g. incrementing a variable is typically a read, an add and a write (unless the variable is of an atomic type).
Both properties drastically reduce the number of operations that can be safely executed from a handler.
The second option is to temporarily block signals while executing a function. This should not be overused as it may limit the program’s responsiveness.
The general advice is that signal handlers should perform only the minimum required work.
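
To make the reentrancy requirement more concrete, here is a minimal sketch of the same small helper written twice. The names utoa_static and utoa_r are hypothetical and exist only for this illustration. The first version keeps its result in a static buffer, so a nested call (for example from a signal handler) overwrites the result of the interrupted one. The second keeps all state in local variables and a caller-supplied buffer.

```c
#include <stddef.h>

/* Hypothetical helpers, for illustration only. */

/* NOT reentrant: every call shares the same static buffer. If a signal
 * handler calls this while the main program is in the middle of a call,
 * both invocations write to the same memory. */
const char *utoa_static(unsigned value) {
    static char buf[16];
    size_t i = sizeof buf - 1;
    buf[i] = '\0';
    do {
        buf[--i] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0 && i > 0);
    return &buf[i];
}

/* Reentrant: all state lives in local variables and in the buffer the
 * caller provides, so nested invocations cannot disturb each other.
 * Returns a pointer into buf, or NULL if buf is too small. */
char *utoa_r(unsigned value, char *buf, size_t len) {
    if (buf == NULL || len < 2)
        return NULL;
    size_t i = len - 1;
    buf[i] = '\0';
    do {
        if (i == 0)
            return NULL;  /* not enough room for all digits */
        buf[--i] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    return &buf[i];
}
```

With the static version, a pointer returned by an earlier call silently changes its contents as soon as anyone calls the function again. With the reentrant version, each caller owns its buffer, which is the property a handler needs.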

Functions that are not async-signal-safe

Below are some examples of functions that should not be called from a signal handler. Note that this is not an exhaustive list.

  • Memory management: malloc, free, realloc, etc.
  • Standard C I/O: printf, fprintf, fopen, fclose, etc.
  • Thread synchronisation: pthread_mutex_lock, pthread_cond_wait, etc.
  • Dynamic symbol resolution: dlopen, dlsym, etc.
  • High‑level POSIX calls: system, getaddrinfo, etc.

It is not always obvious whether a function is async-signal-safe. For example, you might assume that memset is always safe, but POSIX only added it to the list of async-signal-safe functions in POSIX.1-2008 TC2 (2016), so older standards and platforms do not guarantee it.

Functions that are guaranteed to be async-signal-safe

POSIX specifies a list of functions that are required to be async-signal-safe. Here are a few examples:

  • I/O: read(2), write(2), creat(2), close(2), unlink(2)
  • Process control: kill(2), wait(2), abort(3)
  • Reading and writing to atomic flags (volatile sig_atomic_t)

To see the full list, consult the POSIX specification or your operating system’s documentation. On Linux, it’s signal-safety(7).
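
As a rough illustration of staying within that list, the sketch below reports a signal with write(2) instead of printf(3). The function names on_sigusr1 and install_sigusr1_handler, and the choice of SIGUSR1, are assumptions made for this example only. Saving and restoring errno is a common precaution, because write may modify it while the interrupted code still depends on its old value.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler that reports the signal using only async-signal-safe calls. */
static void on_sigusr1(int sig) {
    (void)sig;
    int saved_errno = errno;  /* write() may change errno */
    static const char msg[] = "caught SIGUSR1\n";
    /* write(2) is on the POSIX list; printf(3) is not. */
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    errno = saved_errno;
}

/* Installs the handler for SIGUSR1; returns 0 on success, -1 on failure. */
int install_sigusr1_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* restart interrupted syscalls where possible */
    return sigaction(SIGUSR1, &sa, NULL);
}
```

After install_sigusr1_handler() succeeds, sending the process SIGUSR1 (for example with raise(SIGUSR1) or kill from a shell) prints the fixed message to standard error without touching stdio buffers or the heap.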

Common practice

Use a volatile sig_atomic_t flag

If any more complex work is required in response to a signal, the common practice is to set a volatile sig_atomic_t flag in the handler and return immediately. The bulk of the required work is then done elsewhere in the normal program loop after the flag has been checked. However, not every signal can be handled this way (e.g. SIGSEGV, where simply returning from the handler re-executes the faulting instruction).

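#include <signal.h>

void do_work(void);        // defined elsewhere
void handle_sigint(void);  // defined elsewhere

volatile sig_atomic_t got_sigint = 0;

void handler(int sig) {
    (void)sig;       // unused
    got_sigint = 1;  // Safe atomic store
}

int main(void) {
    signal(SIGINT, handler);
    while (...) {
        // normal program loop
        do_work();
        if (got_sigint) {
            // clear the flag first, so that a signal arriving
            // while we handle this one is not lost
            got_sigint = 0;
            handle_sigint();
        }
    }
}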

Block signals during critical sections

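Use sigprocmask (or pthread_sigmask in multithreaded programs) to block the relevant signals around non-async-signal-safe code, and unblock them immediately afterwards. A signal that arrives while blocked stays pending and is delivered once it is unblocked.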
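
A minimal sketch of this pattern, assuming a single-threaded program and SIGINT as the signal of interest. The function name update_with_sigint_blocked and the work done inside the critical section are placeholders for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical critical section: runs non-async-signal-safe code
 * (malloc, snprintf, free) with SIGINT blocked.
 * Returns 0 on success, -1 on failure. */
int update_with_sigint_blocked(void) {
    sigset_t block, previous;
    int ok = 0;

    sigemptyset(&block);
    sigaddset(&block, SIGINT);

    /* Block SIGINT and remember the mask that was active before. */
    if (sigprocmask(SIG_BLOCK, &block, &previous) != 0)
        return -1;

    /* A SIGINT arriving here stays pending instead of interrupting us. */
    char *buf = malloc(64);
    if (buf != NULL) {
        snprintf(buf, 64, "critical work");
        free(buf);
        ok = 1;
    }

    /* Restore the previous mask; a pending SIGINT is delivered now. */
    if (sigprocmask(SIG_SETMASK, &previous, NULL) != 0)
        return -1;

    return ok ? 0 : -1;
}
```

Restoring the saved mask with SIG_SETMASK, rather than unconditionally unblocking SIGINT, keeps the function correct even when the caller had already blocked the signal.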

Real world signals

Most code is unlikely to ever run inside a signal handler. In some contexts, however, it does, and it is not always clear which parts of the code might be invoked from one.

PostgreSQL query timeout handlers

In PostgreSQL you can register a query timeout handler. It will be invoked automatically when a query exceeds the specified time limit. What might be surprising is that PostgreSQL uses the SIGALRM signal to interrupt the query. This means that the timeout handler runs from a signal handler. You might want to log, from within the handler, that the query exceeded the specified time, but this could lead to all the weird behaviours mentioned above.
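
The sketch below is not PostgreSQL’s API. It is a generic POSIX illustration of the same situation, with hypothetical names (on_alarm, run_with_timeout) and a busy loop standing in for the long-running query. The SIGALRM handler only records that the timeout fired, and the logging is left to normal code that checks the flag.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t timed_out = 0;

/* Runs in signal context: only record the event, do not log here. */
static void on_alarm(int sig) {
    (void)sig;
    timed_out = 1;
}

/* Runs a busy loop (a stand-in for a long query) and gives up once a
 * timer of timeout_ms milliseconds fires. timeout_ms must be > 0.
 * Returns 1 on timeout, 0 if the work finished first, -1 on error. */
int run_with_timeout(long timeout_ms, unsigned long iterations) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    timed_out = 0;
    struct itimerval tv;
    memset(&tv, 0, sizeof tv);
    tv.it_value.tv_sec = timeout_ms / 1000;
    tv.it_value.tv_usec = (timeout_ms % 1000) * 1000;
    if (setitimer(ITIMER_REAL, &tv, NULL) != 0)
        return -1;

    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < iterations; i++) {
        sink += i;
        if (timed_out) {
            /* Normal context again: logging with printf-style
             * functions would be safe at this point. */
            return 1;
        }
    }

    /* Work finished in time: cancel the pending timer. */
    memset(&tv, 0, sizeof tv);
    setitimer(ITIMER_REAL, &tv, NULL);
    return 0;
}
```

The trade-off is latency: the timeout is only noticed at the next flag check, so the checks have to be frequent enough for the work being interrupted.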

Handling Ctrl+C for safe exit

A common use of signals is to implement safe program termination by handling the Ctrl+C key combination. This is done by handling the SIGINT signal. And again, you must be very careful about what you do inside the handler. Typically you would simply set a flag, return from the handler and do the actual handling elsewhere in the program. Doing anything more complex directly in the handler is very risky.

Conclusion

Be aware of the contexts in which the code might be called from a signal handler. It’s very easy to introduce undefined behaviour in these places if you’re not careful.

If you encounter weird memory corruption or other weird behaviours, signal handlers could be the first place to look for potential issues.
