Date: Thu, 23 May 2024 01:45:39 +0000
In-Reply-To: <20240523014540.372255-1-avagin@google.com>
References: <20240523014540.372255-1-avagin@google.com>
Message-ID: <20240523014540.372255-3-avagin@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.45.1.288.g0e0cd299f1-goog
Mime-Version: 1.0
Subject: [PATCH 2/3] seccomp: release task filters when the task exits
From: Andrei Vagin
To: Kees Cook, Andy Lutomirski, Will Drewry, Oleg Nesterov, Christian Brauner
Cc: linux-kernel@vger.kernel.org, Tycho Andersen, Andrei Vagin, Jens Axboe
Content-Type:
 text/plain; charset="UTF-8"

Previously, seccomp filters were released in release_task(), which
required the process to exit and its zombie to be collected. However,
exited threads/processes can't trigger any seccomp events, so it makes
more sense to release filters when the task exits.

This adjustment simplifies scenarios where a parent is tracing its child
process. The parent process can now handle all events from a seccomp
listening descriptor and then call wait() to collect the child zombie.

seccomp_filter_release() takes the siglock to avoid races with
seccomp_sync_threads(). There was an idea to bypass taking the lock by
checking PF_EXITING, but PF_EXITING can be set without holding siglock
if threads have SIGNAL_GROUP_EXIT. This means it can be set
concurrently with seccomp_filter_release().

Signed-off-by: Andrei Vagin
---
 kernel/exit.c    |  3 ++-
 kernel/seccomp.c | 22 ++++++++++++++++------
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 41a12630cbbc..23439c021d8d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -278,7 +278,6 @@ void release_task(struct task_struct *p)
 	}
 
 	write_unlock_irq(&tasklist_lock);
-	seccomp_filter_release(p);
 	proc_flush_pid(thread_pid);
 	put_pid(thread_pid);
 	release_thread(p);
@@ -836,6 +835,8 @@ void __noreturn do_exit(long code)
 	io_uring_files_cancel();
 	exit_signals(tsk);  /* sets PF_EXITING */
 
+	seccomp_filter_release(tsk);
+
 	acct_update_integrals(tsk);
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 35435e8f1035..67305e776dd3 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -502,6 +502,9 @@ static inline pid_t seccomp_can_sync_threads(void)
 		/* Skip current, since it is initiating the sync. */
 		if (thread == caller)
 			continue;
+		/* Skip exited threads. */
+		if (thread->flags & PF_EXITING)
+			continue;
 
 		if (thread->seccomp.mode == SECCOMP_MODE_DISABLED ||
 		    (thread->seccomp.mode == SECCOMP_MODE_FILTER &&
@@ -563,18 +566,18 @@ static void __seccomp_filter_release(struct seccomp_filter *orig)
  * @tsk: task the filter should be released from.
  *
  * This function should only be called when the task is exiting as
- * it detaches it from its filter tree. As such, READ_ONCE() and
- * barriers are not needed here, as would normally be needed.
+ * it detaches it from its filter tree. PF_EXITING has to be set
+ * for the task.
  */
 void seccomp_filter_release(struct task_struct *tsk)
 {
-	struct seccomp_filter *orig = tsk->seccomp.filter;
-
-	/* We are effectively holding the siglock by not having any sighand. */
-	WARN_ON(tsk->sighand != NULL);
+	struct seccomp_filter *orig;
+	spin_lock_irq(&current->sighand->siglock);
+	orig = tsk->seccomp.filter;
 
 	/* Detach task from its filter tree. */
 	tsk->seccomp.filter = NULL;
+	spin_unlock_irq(&current->sighand->siglock);
 	__seccomp_filter_release(orig);
 }
 
@@ -602,6 +605,13 @@ static inline void seccomp_sync_threads(unsigned long flags)
 		if (thread == caller)
 			continue;
 
+		/*
+		 * Skip exited threads. seccomp_filter_release could have
+		 * been already called for this task.
+		 */
+		if (thread->flags & PF_EXITING)
+			continue;
+
 		/* Get a task reference for the new leaf node. */
 		get_seccomp_filter(caller);
 
-- 
2.45.1.288.g0e0cd299f1-goog
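
For readers who want to see the locking argument in isolation, here is a
rough userspace analogue of the pattern used above: the exiting side
detaches its filter pointer under the lock and drops the reference
outside of it, while the sync side skips any task already marked as
exiting. This is only a sketch and not kernel code. Every name in it
(struct filter, struct task, group_lock, filter_put, task_release_filter,
task_sync_filter) is hypothetical, a pthread mutex stands in for the
siglock, and the exiting flag is set under the lock for simplicity,
whereas the patch has to cope with PF_EXITING being set without siglock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct filter {
	int refs;		/* reference count, only touched under group_lock */
};

struct task {
	struct filter *filter;	/* plays the role of tsk->seccomp.filter */
	int exiting;		/* plays the role of PF_EXITING */
};

/* Plays the role of the siglock shared by a thread group. */
static pthread_mutex_t group_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop one reference and free the filter on the last put. Returns 1 if freed. */
static int filter_put(struct filter *f)
{
	int freed;

	if (!f)
		return 0;

	pthread_mutex_lock(&group_lock);
	freed = (--f->refs == 0);
	pthread_mutex_unlock(&group_lock);

	if (freed)
		free(f);
	return freed;
}

/*
 * Exit path: detach the filter while holding the lock, then release the
 * reference after unlocking. Returns 1 if the filter was freed.
 */
static int task_release_filter(struct task *t)
{
	struct filter *orig;

	pthread_mutex_lock(&group_lock);
	t->exiting = 1;		/* simplification: set under the lock here */
	orig = t->filter;
	t->filter = NULL;
	pthread_mutex_unlock(&group_lock);

	return filter_put(orig);
}

/*
 * Sync path: install the filter f on task t unless t is exiting.
 * Returns 1 if the filter was installed, 0 if the task was skipped.
 */
static int task_sync_filter(struct task *t, struct filter *f)
{
	struct filter *old = NULL;
	int installed = 0;

	pthread_mutex_lock(&group_lock);
	if (!t->exiting) {
		old = t->filter;
		f->refs++;	/* new reference held by t */
		t->filter = f;
		installed = 1;
	}
	pthread_mutex_unlock(&group_lock);

	filter_put(old);	/* drop the reference t held before, if any */
	return installed;
}
```

Because both paths take the same lock, a sync either runs before the
release (and the release then detaches what the sync installed) or after
it (and the sync sees the exiting flag and skips the task), so in this
sketch a filter reference cannot be attached to a task that has already
released its filters.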