From bfafe5efa9754ebc991750da0bcca2a6694f3ed3 Mon Sep 17 00:00:00 2001 From: Andrei Vagin Date: Fri, 28 Jun 2024 02:10:12 +0000 Subject: seccomp: release task filters when the task exits Previously, seccomp filters were released in release_task(), which required the process to exit and its zombie to be collected. However, exited threads/processes can't trigger any seccomp events, making it more logical to release filters upon task exits. This adjustment simplifies scenarios where a parent is tracing its child process. The parent process can now handle all events from a seccomp listening descriptor and then call wait to collect a child zombie. seccomp_filter_release takes the siglock to avoid races with seccomp_sync_threads. There was an idea to bypass taking the lock by checking PF_EXITING, but it can be set without holding siglock if threads have SIGNAL_GROUP_EXIT. This means it can happen concurently with seccomp_filter_release. This change also fixes another minor problem. Suppose that a group leader installs the new filter without SECCOMP_FILTER_FLAG_TSYNC, exits, and becomes a zombie. Without this change, SECCOMP_FILTER_FLAG_TSYNC from any other thread can never succeed, seccomp_can_sync_threads() will check a zombie leader and is_ancestor() will fail. Reviewed-by: Oleg Nesterov Signed-off-by: Andrei Vagin Link: https://lore.kernel.org/r/20240628021014.231976-3-avagin@google.com Reviewed-by: Tycho Andersen Signed-off-by: Kees Cook --- kernel/exit.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'kernel/exit.c') diff --git a/kernel/exit.c b/kernel/exit.c index f95a2c1338a8..b945ab81eb92 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -277,7 +277,6 @@ repeat: } write_unlock_irq(&tasklist_lock); - seccomp_filter_release(p); proc_flush_pid(thread_pid); put_pid(thread_pid); release_thread(p); @@ -832,6 +831,8 @@ void __noreturn do_exit(long code) io_uring_files_cancel(); exit_signals(tsk); /* sets PF_EXITING */ + seccomp_filter_release(tsk); + acct_update_integrals(tsk); group_dead = atomic_dec_and_test(&tsk->signal->live); if (group_dead) { -- cgit v1.2.3