Received: by 2002:a89:413:0:b0:1fd:dba5:e537 with SMTP id m19csp114512lqs; Thu, 13 Jun 2024 05:42:09 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCVWM74VkPCUztKvRQf+y/2cWlGkjoFTMVNSsObGZd5g86FIS5o5Eb3Q3MnNHI+hSbKiMAt+kgqAM4kpXKhYkjTXpEq1RZkm5sCgf+SUzA== X-Google-Smtp-Source: AGHT+IESsxf+L0yewcO82DCCWgJcWc7QmINo+KBWNL6yabDpmuG1b+CZ6Sk3XlgycYzA/d4HkQF2 X-Received: by 2002:a17:907:9485:b0:a6f:201a:299a with SMTP id a640c23a62f3a-a6f48013ba4mr356027566b.60.1718282529167; Thu, 13 Jun 2024 05:42:09 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1718282529; cv=pass; d=google.com; s=arc-20160816; b=Pbfpmd6j2f9fTQZwT2/aZ5gZkpLRTcm7zPG7mnbHX9brBWN8DIlGNsATFnNLluNozB MNf24zp7ifKkNHCLuBnWh/S3nZdW6vZ++ykLmFO9ZfdcqZY93ciaG4DpzRbzvPHNtXhK hY1FD3H6akMaBle0dnGWJQFRXoNL0WlXu+FY+3Nr8JY8KLH+Kp1A1sIciYlqdYYitudF c8JV8UtveKOj5JQomtWYUU8ibAYBJAKWkBwqTvebsD9K6B4Lc2xwFdi+Z4F/giQ/BdG1 LAJ2ppu0a/ByhsbzxrQ8ROMZTHrdxCK8VAh9baHiS5LEPKBW7MUcbSV9wYMa708WMGwe h0Wg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=subject:mime-version:list-unsubscribe:list-subscribe:list-id :precedence:user-agent:message-id:in-reply-to:date:references:cc:to :from; bh=83uZyNcywcnjUrLZiyLH0yG+Q42hfg8REFB9B5kvD7Q=; fh=JbEQCYPsB5CrftUNntxLekSNnF7ky92GuPRROxflgZs=; b=juV8v2H+HzUBcrog+0p1DC8a7HiJ5EoNN6LS+Mr1ya5h++vakPnzuItvF0POWm7a9L LxAERasXInX7ZLsI2O8prm4PqMubAtITgD9I3ZdL5ivGKDCXynGZ0OkCckCz7RBmmBJe EPmLSa/SRCFkbB89UzeuPPsJB9mW6GUC4mFyL2g+OLvuL0KKiq/P05aPwV6G1M0Wsd2+ ludVwu/KCcZe4KGsG8WflmR9GrZTaCCI8wQtDewNQ0elIutfurs+JCptKiQmuKvI7aYS ttSgMTYPmS0uo6KuxGb968sI26D7P5diguPIjrWvwnCI+C+XH3h+DuHmBZifmmO6niIz qSCQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=xmission.com dmarc=pass fromdomain=xmission.com); spf=pass (google.com: domain of linux-kernel+bounces-213223-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-213223-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=xmission.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [147.75.80.249]) by mx.google.com with ESMTPS id a640c23a62f3a-a6f56df7716si66774366b.533.2024.06.13.05.42.09 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 13 Jun 2024 05:42:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-213223-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=xmission.com dmarc=pass fromdomain=xmission.com); spf=pass (google.com: domain of linux-kernel+bounces-213223-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-213223-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=xmission.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id D7AC01F280D5 for ; Thu, 13 Jun 2024 12:42:08 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 94FD6143747; Thu, 13 Jun 2024 12:41:14 +0000 (UTC) Received: from out03.mta.xmission.com (out03.mta.xmission.com [166.70.13.233]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 79244142E86; Thu, 13 Jun 2024 12:41:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=166.70.13.233 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718282474; cv=none; b=p4NeXDRiMReNUy2LdhEqI4ad884axB0Y+8qSAfBR+MGaOOeDExWDvgxbdPaAeGOqCCPKeNuutNqy6FY8BB0Zg0kwWA9OQ97N2XqOMCdWr+CEr1urnBO08AtiGna6FpMcLR3nXjiXtqUHDq301UcccFS1yo0ZHiAOvZYg/954nvE= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718282474; c=relaxed/simple; bh=dD6ShW76KbkH42bBEwPS/NzvqunFMNOkDf05maEapns=; h=From:To:Cc:References:Date:In-Reply-To:Message-ID:MIME-Version: Content-Type:Subject; b=eKm54VnWexO94MaXEgrA4IikZs6reFx7HwJyIg1asDbbw2YJPxhEM075y6y46gGGk6gsr8tgUK+qg1KXQVLEbFBr7yoNZs4titM3yYJb1+2oK19Ycx2Rv3MEPorYvbhqSOXj3DZwi7dhmcGBeIVQS5ngtykbWh2Mo9/P4dussA0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=xmission.com; spf=pass smtp.mailfrom=xmission.com; arc=none smtp.client-ip=166.70.13.233 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=xmission.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=xmission.com Received: from in02.mta.xmission.com ([166.70.13.52]:35548) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1sHjlH-00DJUt-DE; Thu, 13 Jun 2024 06:41:03 -0600 Received: from ip68-227-168-167.om.om.cox.net ([68.227.168.167]:41526 helo=email.froward.int.ebiederm.org.xmission.com) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1sHjlE-003eVV-7O; Thu, 13 Jun 2024 06:41:03 -0600 From: "Eric W. Biederman" To: Oleg Nesterov Cc: Andrew Morton , Rachel Menge , linux-kernel@vger.kernel.org, rcu@vger.kernel.org, Wei Fu , apais@linux.microsoft.com, Sudhanva Huruli , Jens Axboe , Christian Brauner , Mike Christie , Joel Granados , Mateusz Guzik , "Paul E. McKenney" , Frederic Weisbecker , Neeraj Upadhyay , Joel Fernandes , Josh Triplett , Boqun Feng , Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang References: <1386cd49-36d0-4a5c-85e9-bc42056a5a38@linux.microsoft.com> <20240608120616.GB7947@redhat.com> Date: Thu, 13 Jun 2024 07:40:06 -0500 In-Reply-To: <20240608120616.GB7947@redhat.com> (Oleg Nesterov's message of "Sat, 8 Jun 2024 14:06:16 +0200") Message-ID: <87a5jpqamx.fsf@email.froward.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1sHjlE-003eVV-7O;;;mid=<87a5jpqamx.fsf@email.froward.int.ebiederm.org>;;;hst=in02.mta.xmission.com;;;ip=68.227.168.167;;;frm=ebiederm@xmission.com;;;spf=pass X-XM-AID: U2FsdGVkX18qLrVAX1hXXj2C18cuMCIcy/jiniNxzDk= X-SA-Exim-Connect-IP: 68.227.168.167 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Level: * X-Spam-Virus: No X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4998] * 0.7 XMSubLong Long Subject * 0.5 XMGappySubj_01 Very gappy subject * 0.0 T_TM2_M_HEADER_IN_MSG BODY: No description available. * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa02 1397; Body=1 Fuz1=1 Fuz2=1] * -0.0 T_SCC_BODY_TEXT_LINE No description available. X-Spam-DCC: XMission; sa02 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: *;Oleg Nesterov X-Spam-Relay-Country: X-Spam-Timing: total 358 ms - load_scoreonly_sql: 0.03 (0.0%), signal_user_changed: 3.8 (1.1%), b_tie_ro: 2.6 (0.7%), parse: 0.69 (0.2%), extract_message_metadata: 2.9 (0.8%), get_uri_detail_list: 1.43 (0.4%), tests_pri_-2000: 2.8 (0.8%), tests_pri_-1000: 2.7 (0.7%), tests_pri_-950: 1.04 (0.3%), tests_pri_-900: 0.80 (0.2%), tests_pri_-90: 71 (19.9%), check_bayes: 70 (19.6%), b_tokenize: 7 (1.9%), b_tok_get_all: 8 (2.3%), b_comp_prob: 1.88 (0.5%), b_tok_touch_all: 51 (14.2%), b_finish: 0.61 (0.2%), tests_pri_0: 259 (72.2%), check_dkim_signature: 0.39 (0.1%), check_dkim_adsp: 2.2 (0.6%), poll_dns_idle: 0.93 (0.3%), tests_pri_10: 1.73 (0.5%), tests_pri_500: 6 (1.6%), rewrite_mail: 0.00 (0.0%) Subject: Re: [PATCH] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Oleg Nesterov writes: > kernel_wait4() doesn't sleep and returns -EINTR if there is no > eligible child and signal_pending() is true. > > That is why zap_pid_ns_processes() clears TIF_SIGPENDING but this is not > enough, it should also clear TIF_NOTIFY_SIGNAL to make signal_pending() > return false and avoid a busy-wait loop. I took a look through the code. It used to be that TIF_NOTIFY_SIGNAL was all about waking up a task so that task_work_run can be used. io_uring still mostly uses it that way. There is also a use in kthread_stop that just uses it as a TIF_SIGPENDING without having a pending signal. At the point in do_exit where exit_notify and thus zap_pid_ns_processes is called I can't possibly see a use for TIF_NOTIFY_SIGNAL. exit_task_work, exit_signals, and io_uring_cancel have all been called. So TIF_NOTIFY_SIGNAL should be spurious at this point and safe to clear. Why it remains set is a mystery to me. If I had infinite time and energy the ideal is to rework the pid namespace exit logic so that waiting for everything to exit works like delay_group_leader in wait_task_consider. Simply blocking reaping of the pid namespace leader until everything in the pid namespace have been reaped. I think acct_exit_ns is the only piece of code that needs to be moved to allow that, and acct_exit_ns is purely bookkeeping so does not affect userspace visible semantics. This active waiting is weird and non-standard in the kernel and winds up causeing a problem every couple of years because of that. > > Fixes: 12db8b690010 ("entry: Add support for TIF_NOTIFY_SIGNAL") > Reported-by: Rachel Menge > Closes: https://lore.kernel.org/all/1386cd49-36d0-4a5c-85e9-bc42056a5a38@linux.microsoft.com/ > Signed-off-by: Oleg Nesterov > --- > kernel/pid_namespace.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c > index dc48fecfa1dc..25f3cf679b35 100644 > --- a/kernel/pid_namespace.c > +++ b/kernel/pid_namespace.c > @@ -218,6 +218,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns) > */ > do { > clear_thread_flag(TIF_SIGPENDING); > + clear_thread_flag(TIF_NOTIFY_SIGNAL); > rc = kernel_wait4(-1, NULL, __WALL, NULL); > } while (rc != -ECHILD);