Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp1295901ybf; Fri, 28 Feb 2020 19:01:09 -0800 (PST) X-Google-Smtp-Source: APXvYqzyziDXCGGO1jfi1VdZt7BLK0X4VPvFvbcuStb0+nQw0slw7tmBwOZAdLgCVmQLqkWKe1tZ X-Received: by 2002:aca:ec02:: with SMTP id k2mr5383870oih.105.1582945269163; Fri, 28 Feb 2020 19:01:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582945269; cv=none; d=google.com; s=arc-20160816; b=i2LVPbHsHhvbf6DpXUYGPrmBHEOYe2UNOqUcuwVRXh7YZT9yyGE/VqHXYNtP1o4y6y cUC6LVdrHt0OeTUuDtMRlS25VTZVbX8YwdlAyXrzkFqpvp5MkdsEJbbotXtqX7m9abT6 +DMJF57EfycWal1yjSeJ77Ag1EFl/r7Y0adMqiDlTT3mO9gQ8sfo6L7SrpbglH3I0MZ+ v3U3EA4OkjRMl7utwsbGO0d1AaC5/FbwYBIzmE2KjIBxqqxB4qytGWoZzqYXF22g1Ppx LX+PFTaj5dcrIUnXAlvayy5vSeZrBSy3/pSqZ833iplKMEK56TU82POfSvfZ3waY3sSM 8F3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=Ss4psCf9E9EGK9N3a5pMKBRiV1XZ3w9ZUtfhEJB5t6I=; b=Qutd+GnIg6M0rSxOfIWiEC2iPmDwPBrnyuaem8HtAsM9RgxNk5SCK9rjR8+k5AUNyW VNV68Bf4CS5+5ceaQJ42tCNW1VYlW9p7DjVYAhcoyZBuWRTH6sxXi17pCP96Ezq6nHvm w48OsESYa8XoKknyW7rJIkalansflAH3wUsUsLZeu3sxdkpcoc85UfYACCu7igz4A0Jd lH0DUui2n87bWcUIInd2k+cHKuEfhGhoHtFm11DgM5eh+QynF0lyoKF0IYekYA/tDhZq MqnJp2GHPQUkIoG7LL0LedsorlhOUehu/LdjKF9cFeng7mIQzd0Qw/vmUXW+wAmBm+Ui uKjQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l1si2519895otn.213.2020.02.28.19.00.54; Fri, 28 Feb 2020 19:01:09 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726682AbgB2C7y (ORCPT + 99 others); Fri, 28 Feb 2020 21:59:54 -0500 Received: from youngberry.canonical.com ([91.189.89.112]:43794 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726046AbgB2C7y (ORCPT ); Fri, 28 Feb 2020 21:59:54 -0500 Received: from ip5f5bf7ec.dynamic.kabel-deutschland.de ([95.91.247.236] helo=wittgenstein) by youngberry.canonical.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1j7sM1-0000Wh-Pt; Sat, 29 Feb 2020 02:59:49 +0000 Date: Sat, 29 Feb 2020 03:59:48 +0100 From: Christian Brauner To: "Eric W. Biederman" Cc: linux-kernel@vger.kernel.org, Al Viro , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Alexey Gladkov , Linus Torvalds , Jeff Dike , Richard Weinberger , Anton Ivanov Subject: Re: [PATCH 4/3] pid: Improve the comment about waiting in zap_pid_ns_processes Message-ID: <20200229025948.mfla2guxzvo7mdwg@wittgenstein> References: <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> <871rqpaswu.fsf_-_@x220.int.ebiederm.org> <871rqk2brn.fsf_-_@x220.int.ebiederm.org> <878skmsbyy.fsf_-_@x220.int.ebiederm.org> <878skmpcib.fsf_-_@x220.int.ebiederm.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <878skmpcib.fsf_-_@x220.int.ebiederm.org> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Feb 28, 2020 at 04:34:20PM -0600, Eric W. Biederman wrote: > > Oleg wrote a very informative comment, but with the removal of > proc_cleanup_work it is no longer accurate. > > Rewrite the comment so that it only talks about the details > that are still relevant, and hopefully is a little clearer. > > Signed-off-by: "Eric W. Biederman" > --- > kernel/pid_namespace.c | 31 +++++++++++++++++++------------ > 1 file changed, 19 insertions(+), 12 deletions(-) > > diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c > index 318fcc6ba301..01f8ba32cc0c 100644 > --- a/kernel/pid_namespace.c > +++ b/kernel/pid_namespace.c > @@ -224,20 +224,27 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns) > } while (rc != -ECHILD); > > /* > - * kernel_wait4() above can't reap the EXIT_DEAD children but we do not > - * really care, we could reparent them to the global init. We could > - * exit and reap ->child_reaper even if it is not the last thread in > - * this pid_ns, free_pid(pid_allocated == 0) calls proc_cleanup_work(), > - * pid_ns can not go away until proc_kill_sb() drops the reference. > + * kernel_wait4() misses EXIT_DEAD children, and EXIT_ZOMBIE > + * process whose parents processes are outside of the pid > + * namespace. Such processes are created with setns()+fork(). > * > - * But this ns can also have other tasks injected by setns()+fork(). > - * Again, ignoring the user visible semantics we do not really need > - * to wait until they are all reaped, but they can be reparented to > - * us and thus we need to ensure that pid->child_reaper stays valid > - * until they all go away. See free_pid()->wake_up_process(). > + * If those EXIT_ZOMBIE processes are not reaped by their > + * parents before their parents exit, they will be reparented > + * to pid_ns->child_reaper. Thus pidns->child_reaper needs to > + * stay valid until they all go away. > * > - * We rely on ignored SIGCHLD, an injected zombie must be autoreaped > - * if reparented. > + * The code relies on the the pid_ns->child_reaper ignoring s/the the/the/ Hm, can we maybe reformulate this to: "The code relies on having made pid_ns->child_reaper ignore SIGCHLD above causing EXIT_ZOMBIE processes to be autoreaped if reparented." Which imho makes it clearer that it was us ensuring that SIGCHLD is ignored. Someone not too familiar with the exit codepaths might be looking at zap_pid_ns_processes() not knowing that it is only called when namespace init is exiting. Otherwise Acked-by: Christian Brauner