A word on signals

If you have already played with signals, you have probably tried to catch them using the signal or sigaction syscalls. Every signal can be caught except SIGKILL and SIGSTOP.

The SIGKILL signal is used to cause immediate program termination. It cannot be handled or ignored, and is therefore always fatal. It is also not possible to block this signal. […] if SIGKILL fails to terminate a process, that by itself constitutes an operating system bug which you should report.

Termination Signals

The SIGSTOP signal stops the process. It cannot be handled, ignored, or blocked.

Job Control Signals
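To see this rule from user space, here is a minimal Python sketch (the helper name try_install is made up for this example). It tries to install a do-nothing handler for a few signals and records whether the kernel accepted it.

```python
import signal

def try_install(signum):
    """Return True if a no-op handler can be installed for signum."""
    try:
        previous = signal.signal(signum, lambda signo, frame: None)
    except (OSError, ValueError):
        # OSError (EINVAL) is what the kernel answers for SIGKILL and SIGSTOP.
        # ValueError is raised by Python when not called from the main thread.
        return False
    if previous is not None:
        signal.signal(signum, previous)  # put the old handler back
    return True

catchable = {
    sig.name: try_install(sig)
    for sig in (signal.SIGTERM, signal.SIGINT, signal.SIGKILL, signal.SIGSTOP)
}
print(catchable)
```

When run from the main thread on Linux, SIGTERM and SIGINT should come back as catchable while SIGKILL and SIGSTOP do not.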

But I don’t want my process to die, I hear you say. Well, if I can’t catch them, what do you want me to do? Kill the sender?

Actually I can’t kill the sender, but what if I could intercept and drop the message…

Meet eBPF

The Linux kernel has always been an ideal place to implement monitoring/observability, networking, and security. Unfortunately this was often impractical as it required changing kernel source code or loading kernel modules, and resulted in layers of abstractions stacked on top of each other. eBPF is a revolutionary technology that can run sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules.

By making the Linux kernel programmable, infrastructure software can leverage existing layers, making them more intelligent and feature-rich without continuing to add additional layers of complexity to the system or compromising execution efficiency and safety.

eBPF

eBPF is used to extend the Linux kernel by allowing user space programs to inject code at hook points. The code is JIT compiled and executed only if the verification engine raises no error.

Blocking some signals

My idea was simply to block the signal so that it never reaches our protected process. To do so, I tried to intercept the signal as early as possible in the kernel: when a process calls the kill syscall.

To quickly see if the idea was really possible, I used bpftrace, a high-level tracing language for eBPF.

Fortunately, all syscalls have a kprobe hook point for eBPF. The goal here is to filter out signals for our protected process and discard them. To discard them, I will use bpftrace's override() function, which aborts the probed function and returns the value passed as its argument. This requires a kernel compiled with CONFIG_BPF_KPROBE_OVERRIDE and only works on functions tagged with ALLOW_ERROR_INJECTION. Fortunately, Arch Linux’s kernel already ships with CONFIG_BPF_KPROBE_OVERRIDE enabled, and every syscall handler seems to carry the ALLOW_ERROR_INJECTION tag.
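If you are not on Arch, a quick way to check for CONFIG_BPF_KPROBE_OVERRIDE is to look at the kernel config dump. Here is a rough Python sketch. The helper names (option_enabled, read_kernel_config) are mine, and the two paths are only the usual locations: /proc/config.gz exists only when the kernel exposes its config there, and /boot/config-<release> depends on the distribution.

```python
import gzip
import os
import platform
import re

def option_enabled(config_text, option):
    """Return True if `option` is set to y or m in a kernel config dump."""
    pattern = rf"^{re.escape(option)}=(y|m)$"
    return re.search(pattern, config_text, flags=re.MULTILINE) is not None

def read_kernel_config():
    """Best-effort read of the running kernel's config; None if not found."""
    candidates = ["/proc/config.gz", f"/boot/config-{platform.release()}"]
    for path in candidates:
        if not os.path.exists(path):
            continue
        opener = gzip.open if path.endswith(".gz") else open
        try:
            with opener(path, "rt", errors="replace") as fh:
                return fh.read()
        except OSError:
            continue  # unreadable file, try the next candidate
    return None

config = read_kernel_config()
if config is None:
    print("kernel config not found in the usual places")
else:
    print("override supported:",
          option_enabled(config, "CONFIG_BPF_KPROBE_OVERRIDE"))
```

This only tells you about the config option. Whether a given function carries the ALLOW_ERROR_INJECTION tag is a separate question that this sketch does not answer.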

So here is a script to block all signals to your process:

#!/bin/sh

if [ "$#" -ne 1 ]; then
    echo "Usage: $0 pid"
    exit 1
fi

if [ "$(id -u)" -ne 0 ]; then
    echo "Please run as root"
    exit 1
fi

bpftrace -e "kprobe:__x64_sys_kill { if (arg1 == $1) { printf(\"Signal blocked for $1\n\"); override(0); } }" --unsafe
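The script pastes its argument straight into the bpftrace program, so a typo in the pid produces a broken program. As a sketch of how the same program text could be built with a bit of validation, here is a small Python helper (build_bpftrace_program is a hypothetical name, and the program string is copied from the script above, not improved):

```python
def build_bpftrace_program(pid):
    """Return the bpftrace one-liner from the script for a validated pid."""
    pid = int(pid)  # raises ValueError on anything that is not an integer
    if pid <= 0:
        raise ValueError("expected a positive process id")
    return (
        "kprobe:__x64_sys_kill { "
        f"if (arg1 == {pid}) {{ "
        f'printf("Signal blocked for {pid}\\n"); '
        "override(0); } }"
    )

program = build_bpftrace_program("315629")
print(program)
# The script would then run: bpftrace -e <program> --unsafe
```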

Let’s take an example:

$ # Without blocksig.sh
$ ping skallwar.fr > /dev/null &
[1] 371628
$ kill -9 371628
[1]+  Killed                  ping skallwar.fr > /dev/null

$ # With blocksig.sh
$ ping skallwar.fr > /dev/null &
[1] 315629
$ sudo ./blocksig.sh 315629 &
Attaching 1 probe...
$ kill -9 315629
Signal blocked for 315629

As you can see, the second time around, our ping did not get killed. We actually blocked a SIGKILL.

As a side note, the first time I launched blocksig.sh, I did not filter on the pid before calling override(). As a result, systemctl refused to either power off or reboot my machine.

This technique works fine, but we just moved the problem elsewhere. Now our process is protected but blocksig.sh is not. If someone kills blocksig.sh, our process is defenseless and we are back to square one. You might think that using $$, the shell's special variable for its own pid, would do the trick, but remember: this is the pid of the shell, not the pid of the bpftrace command.

We need to set up the “fence” from the inside…

BCC to the rescue

To fix our problem I used BCC. BCC is a toolkit for creating efficient kernel tracing and manipulation programs using eBPF and Python.

Here is what a basic hello world looks like:

#!/usr/bin/python

from bcc import BPF

BPF(text='int kprobe__sys_kill(void *ctx) { bpf_trace_printk("Hello, World!\\n"); return 0; }').trace_print()

So we write some C code as a string inside our Python script… Weird, but why not? You can also load it from a file like so:

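// hello_world.c
int syscall_kill(void *ctx) {
    bpf_trace_printk("Hello, World!\n");
    return 0;
}

#!/usr/bin/python
# sudo ./hello_world.py

from bcc import BPF

b = BPF(src_file="hello_world.c")
# No kprobe__ prefix on syscall_kill, so attach it to the kill syscall by hand.
b.attach_kprobe(event=b.get_syscall_fnname("kill"), fn_name="syscall_kill")
b.trace_print()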

To prevent our script from being killed, it needs to be able to block signals for multiple pids. I also want to block multiple signals. But how do we pass these arguments to our eBPF program? This is done using eBPF maps. Maps are data structures used to share data between userland and our eBPF program. There are many kinds of maps, ranging from arrays to hashmaps. To create a new map with BCC, you use the BPF_YOURTYPEHERE macro in your C stub like so:

BPF_HASH(pids, int, u8); // Syntax: BPF_HASH(name, key_type, value_type)

For the eBPF hook, the logic is quite simple: if the given pid is inside the pids hashmap and the signal is in the signal array, then we need to return early from the syscall.
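Before looking at the real thing, the rule can be modelled in plain Python. This is only a stand-in to reason about the logic, not BCC code: a dict plays the pids hashmap, a list plays the signal array (its size of 65 is taken from the C stub shown further on), and the names make_tables and should_block are made up.

```python
MAX_SIGNALS = 65  # same size as the sigs array in the C stub; index 0 is unused

def make_tables(protected_pids, blocked_signals):
    """Build stand-ins for the pids hash map and the sigs array."""
    pids = {pid: 1 for pid in protected_pids}
    sigs = [0] * MAX_SIGNALS
    for sig in blocked_signals:
        sigs[sig] = 1
    return pids, sigs

def should_block(pids, sigs, pid, sig):
    """Return True when kill(pid, sig) would be dropped by the hook."""
    if pid not in pids:           # hash lookup miss: pid is not protected
        return False
    if not 0 <= sig < len(sigs):  # out-of-range index: array lookup fails
        return False
    return pids[pid] != 0 and sigs[sig] != 0

pids, sigs = make_tables(protected_pids=[315629], blocked_signals=[9, 15])
print(should_block(pids, sigs, 315629, 9))   # True: protected pid, blocked signal
print(should_block(pids, sigs, 315629, 2))   # False: signal 2 is not in the list
print(should_block(pids, sigs, 1, 9))        # False: pid 1 is not protected
```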

Here is the C code corresponding to this algorithm:

#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

BPF_HASH(pids, int, u8);
BPF_ARRAY(sigs, u8, 65);

static u8 needs_block(u8 protected_pid, u8 protected_sig) {
    return protected_pid != 0 && protected_sig != 0;
}

int syscall__kill(struct pt_regs *ctx, int pid, int sig)
{

    u8 *protected_pid = pids.lookup(&pid);
    u8 *protected_sig = sigs.lookup(&sig);
    if (!protected_pid || !protected_sig)
        return 0;

    if (needs_block(*protected_pid, *protected_sig)) {
        bpf_trace_printk("Blocked signal %d for %d\n", sig, pid);
        bpf_override_return(ctx, 0);
    }

    return 0;
}

The Python part needs a bit more logic to work:

  • Parse the arguments to retrieve signals to block and the pids that need to be protected
  • Add the pid of the Python script
  • Put the pids to protect and the signals to block inside the corresponding maps

Here is the Python code (without the argument parsing because that’s boring):

# Args is the resulting object of parse_args() method of argparse
def initialize_bpf(args):
    b = BPF(src_file="blocksig.c")
    kill_fnname = b.get_syscall_fnname('kill')
    b.attach_kprobe(event=kill_fnname, fn_name='syscall__kill')
    pids_map = b.get_table('pids')
    sigs_map = b.get_table('sigs')

    args.pids.append(str(os.getpid()))
    for pid in args.pids:
        pids_map[c_int(int(pid))] = c_int(1)

    for sig in args.sig_array:
        sigs_map[sig] = c_int(1)
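A small aside on the ctypes values used above. The maps are declared with int keys and u8 values, so the ctypes types that mirror them exactly would be c_int and c_ubyte. The following standalone sketch (no BCC needed) just shows the sizes involved and one pitfall of a one-byte value:

```python
import ctypes

# Map declaration in the C stub: BPF_HASH(pids, int, u8)
key = ctypes.c_int(315629)     # matches the C `int` key
value = ctypes.c_ubyte(1)      # matches the C `u8` value

print(ctypes.sizeof(key), ctypes.sizeof(value))

# ctypes does no overflow checking, so a flag of 256 wraps around to 0
# and would read back as "not protected".
print(ctypes.c_ubyte(256).value)
```

Since the flag is only ever 0 or 1 here, this does not change the behaviour of the script. It is mostly a reminder to keep the Python side in sync with the types declared in the C stub.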

Time for a demo:

$ ping skallwar.fr > /dev/null &
[1] 315629
$ sudo ./blocksig.py 315629 &
$ kill -9 315629
$ # Nothing happened
$ kill -9 $(pidof python) # Pid of the blocksig
$ # Nothing happened

But a new problem arises. If the script is protected from signals and the terminal in which it runs is closed, we find ourselves unable to stop the script. To solve this, I implemented a ticket system: the script creates a file with a unique name in the tmpfs and polls to see whether this ticket has been deleted:

def wait_for_close():
    # Create a tempfile and wait for its deletion
    tf = tempfile.NamedTemporaryFile(delete = False)
    print(f"This script might not be killable anymore. To stop it run ``rm {tf.name}``")

    try:
        while os.path.isfile(tf.name):
            time.sleep(0.5)
            continue
    except KeyboardInterrupt:
        tf.close()
        os.remove(tf.name)
        print('')

So there are 2 use cases:

  • Keep it running in the foreground and use CTRL+C to raise Python's KeyboardInterrupt (a SIGINT sent through the kill syscall can still be blocked; the one generated by the terminal does not go through kill)
  • Run it in the background and use the unique ticket in order to stop it
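The ticket idea is easy to try on its own. Here is a standalone variant of the polling loop with an optional timeout, which makes it testable without BCC. The function name wait_for_ticket and its parameters are mine, not part of blocksig.

```python
import os
import tempfile
import threading
import time

def wait_for_ticket(path, poll_interval=0.5, timeout=None):
    """Block until `path` disappears; return False if `timeout` expires first."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while os.path.exists(path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True

# Demo: create a ticket, then remove it from another thread after 0.2s,
# the same way someone would run `rm <ticket>` from another terminal.
ticket = tempfile.NamedTemporaryFile(delete=False)
ticket.close()
threading.Timer(0.2, os.remove, args=[ticket.name]).start()
print(wait_for_ticket(ticket.name, poll_interval=0.05, timeout=5))
```

Polling is crude but has one nice property here: it needs no signal at all, which is exactly what we want for a process that blocks them.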

After all of this we should be good: we can protect ourselves and our targeted pids. Let’s see what it looks like in htop, just to make sure.

At this stage I was quite frustrated. Yes, you could make it work by logging in as root instead of using sudo, but that’s not convenient at all. Fortunately, I found a post on Stack Overflow about forcing sudo not to fork, which suggested putting exec before sudo. And for once, “it works on my machine”™ out of the box. Nice.
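Why does exec help? Because exec replaces the current process image without creating a new process, so the pid stays the same. The following Python sketch illustrates just that property on Linux, using os.execv in a throwaway child interpreter. It says nothing about what sudo itself does internally.

```python
import os
import subprocess
import sys

# Child script: print our pid, then replace ourselves (os.execv) with a new
# interpreter that prints its pid again. No fork happens in between.
CHILD = r"""
import os, sys
print(os.getpid(), flush=True)
os.execv(sys.executable, [sys.executable, "-c", "import os; print(os.getpid())"])
"""

proc = subprocess.run(
    [sys.executable, "-c", CHILD], capture_output=True, text=True, check=True
)
before_exec, after_exec = (int(line) for line in proc.stdout.split())

print("same pid across exec:", before_exec == after_exec)
print("child differs from parent:", before_exec != os.getpid())
```

The second line shows the opposite case: a process started the usual way (fork then exec) gets a new pid, which is why the pid you think you are protecting may not be the one that matters.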

So here is the final result:

#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

BPF_HASH(pids, int, u8);
BPF_ARRAY(sigs, u8, 65);

static u8 needs_block(u8 protected_pid, u8 protected_sig) {
    return protected_pid != 0 && protected_sig != 0;
}

int syscall__kill(struct pt_regs *ctx, int pid, int sig)
{
    u8 *protected_pid = pids.lookup(&pid);
    u8 *protected_sig = sigs.lookup(&sig);
    if (!protected_pid || !protected_sig)
        return 0;
    if (needs_block(*protected_pid, *protected_sig)) {
        bpf_trace_printk("Blocked signal %d for %d\n", sig, pid);
        bpf_override_return(ctx, 0);
    }
    return 0;
}

#!/usr/bin/python

from bcc import BPF
from bcc.utils import ArgString, printb
from ctypes import *
import argparse
import tempfile
import time
import os

def parse_args():
    parser = argparse.ArgumentParser(description='Blocksig is a tool that prevents some or all signals from being received by the given pids')
    parser.add_argument('-p', dest='pids', nargs='+', default=[], metavar='pid', help='List of pids to protect')
    parser.add_argument('-s', dest='sigs', nargs='+', default=[], metavar='signal_num', help='List of signals to block. If no signal is specified, all of them are blocked')
    parser.add_argument('--auto-protect', action=argparse.BooleanOptionalAction, default=True, help='Whether to protect blocksig itself or not')
    args = parser.parse_args()

    return args


def initialize_bpf(args):
    b = BPF(src_file = "blocksig.c")
    kill_fnname = b.get_syscall_fnname('kill')
    b.attach_kprobe(event=kill_fnname, fn_name='syscall__kill')
    pids_map = b.get_table('pids')
    sigs_map = b.get_table('sigs')

    if args.auto_protect:
        args.pids.append(str(os.getpid()))
    for pid in args.pids:
        pids_map[c_int(int(pid))] = c_int(1)

    # Signals are numbered 1..64 on Linux, so the default range must end at 65
    sig_array = [int(sig) for sig in args.sigs] if len(args.sigs) else range(1, 65)
    for sig in sig_array:
        sigs_map[sig] = c_int(1)

def wait_for_close():
    # Create a tempfile and wait for its deletion
    tf = tempfile.NamedTemporaryFile(delete = False)
    print(f"This script might not be killable anymore. To stop it run ``rm {tf.name}``")

    try:
        while os.path.isfile(tf.name):
            time.sleep(0.5)
            continue
    except KeyboardInterrupt:
        tf.close()
        os.remove(tf.name)
        print('')

args = parse_args()
initialize_bpf(args)
wait_for_close()
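The -s option above only takes signal numbers. If you would rather type names, a helper along these lines could be plugged into the argument parsing. This is a hypothetical addition (parse_signal is not part of blocksig) that relies only on Python's signal module:

```python
import signal

def parse_signal(text):
    """Turn '9', 'KILL' or 'SIGKILL' into a signal number."""
    text = text.strip()
    if text.isdigit():
        number = int(text)
    else:
        name = text.upper()
        if not name.startswith("SIG"):
            name = "SIG" + name
        try:
            number = int(signal.Signals[name])
        except KeyError:
            raise ValueError(f"unknown signal name: {text}") from None
    # signal.NSIG is one more than the highest signal number
    if not 1 <= number < signal.NSIG:
        raise ValueError(f"signal number out of range: {number}")
    return number

print([parse_signal(s) for s in ("9", "term", "SIGINT")])
```

It could be used as `type=parse_signal` on the -s argument, in which case the later int(sig) conversion becomes a no-op.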

You can find all the code (and maybe future updates 👀) on GitHub

Libbpf

Before using BCC, I tried to use libbpf, but it did not work well. With almost the same C code for the actual eBPF part, all the syscall arguments had strange values and thus nothing worked. You can see what I tried to do on the libbpf branch on GitHub

Talk

I gave a talk about this at a conference for the LSE (in French):