Date: Wed, 24 Jul 2019 21:10:20 +0200
References: <20190724144651.28272-1-christian@brauner.io> <20190724144651.28272-5-christian@brauner.io>
Subject: Re: [PATCH 4/5] pidfd: add CLONE_WAIT_PID
To: Jann Horn
CC: kernel list, Oleg Nesterov, Arnd Bergmann, "Eric W.
 Biederman", Kees Cook, "Joel Fernandes (Google)", Thomas Gleixner, Tejun Heo,
 David Howells, Andy Lutomirski, Andrew Morton, Aleksa Sarai, Linus Torvalds,
 Al Viro, kernel-team, Ingo Molnar, Peter Zijlstra, Linux API
From: Christian Brauner
X-Mailing-List: linux-kernel@vger.kernel.org

On July 24, 2019 9:07:54 PM GMT+02:00, Jann Horn wrote:
>On Wed, Jul 24, 2019 at 8:27 PM Christian Brauner wrote:
>> On July 24, 2019 8:14:26 PM GMT+02:00, Jann Horn wrote:
>> >On Wed, Jul 24, 2019 at 4:48 PM Christian Brauner wrote:
>> >> If CLONE_WAIT_PID is set the newly created process will not be
>> >> considered by process wait requests that wait generically on children
>> >> such as:
>> >>
>> >> syscall(__NR_wait4, -1, wstatus, options, rusage)
>> >> syscall(__NR_waitpid, -1, wstatus, options)
>> >> syscall(__NR_waitid, P_ALL, -1, siginfo, options, rusage)
>> >> syscall(__NR_waitid, P_PGID, -1, siginfo, options, rusage)
>> >> syscall(__NR_waitpid, -pid, wstatus, options)
>> >> syscall(__NR_wait4, -pid, wstatus, options, rusage)
>> >>
>> >> A process created with CLONE_WAIT_PID can only be waited upon with a
>> >> focussed wait call. This ensures that processes can be reaped even if
>> >> all file descriptors referring to it are closed.
>> >[...]
>> >> diff --git a/kernel/fork.c b/kernel/fork.c
>> >> index baaff6570517..a067f3876e2e 100644
>> >> --- a/kernel/fork.c
>> >> +++ b/kernel/fork.c
>> >> @@ -1910,6 +1910,8 @@ static __latent_entropy struct task_struct *copy_process(
>> >>         delayacct_tsk_init(p);  /* Must remain after dup_task_struct() */
>> >>         p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER | PF_IDLE);
>> >>         p->flags |= PF_FORKNOEXEC;
>> >> +       if (clone_flags & CLONE_WAIT_PID)
>> >> +               p->flags |= PF_WAIT_PID;
>> >>         INIT_LIST_HEAD(&p->children);
>> >>         INIT_LIST_HEAD(&p->sibling);
>> >>         rcu_copy_process(p);
>> >
>> >This means that if a process with PF_WAIT_PID forks, the child
>> >inherits the flag, right? That seems unintended? You might have to add
>> >something like "if ((clone_flags & CLONE_THREAD) == 0) p->flags &=
>> >~PF_WAIT_PID;" before this. (I think threads do have to inherit the
>> >flag so that the case where a non-leader thread of the child goes
>> >through execve and steals the leader's identity is handled properly.)
>> >Or you could cram it somewhere into signal_struct instead of on the
>> >task - that might be a more logical place for it?
>>
>> Hm, CLONE_WAIT_PID is only useable with CLONE_PIDFD which in turn is
>> not useable with CLONE_THREAD.
>> But we should probably make that explicit for CLONE_WAIT_PID too.
>
>To clarify:
>
>This code looks buggy to me because p->flags is inherited from the
>parent, with the exception of flags that are explicitly stripped out.
>Since PF_WAIT_PID is not stripped out, this means that if task A
>creates a child B with clone(CLONE_WAIT_PID), and then task B uses
>fork() to create a child C, then B will not be able to use
>wait(&status) to wait for C since C inherited PF_WAIT_PID from B.
>
>The obvious way to fix that would be to always strip out PF_WAIT_PID;
>but that would also be wrong, because if task B creates a thread C,
>and then C calls execve(), the task_struct of B goes away and B's TGID
>is taken over by C. When C eventually exits, it should still obey the
>CLONE_WAIT_PID (since to A, it's all the same process). Therefore, if
>p->flags is used to track whether the task was created with
>CLONE_WAIT_PID, PF_WAIT_PID must be inherited if CLONE_THREAD is set.
>So:
>
>diff --git a/kernel/fork.c b/kernel/fork.c
>index d8ae0f1b4148..b32e1e9a6c9c 100644
>--- a/kernel/fork.c
>+++ b/kernel/fork.c
>@@ -1902,6 +1902,10 @@ static __latent_entropy struct task_struct *copy_process(
>        delayacct_tsk_init(p);  /* Must remain after dup_task_struct() */
>        p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER | PF_IDLE);
>        p->flags |= PF_FORKNOEXEC;
>+       if (!(clone_flags & CLONE_THREAD))
>+               p->flags &= ~PF_WAIT_PID;
>+       if (clone_flags & CLONE_WAIT_PID)
>+               p->flags |= PF_WAIT_PID;
>        INIT_LIST_HEAD(&p->children);
>        INIT_LIST_HEAD(&p->sibling);
>        rcu_copy_process(p);
>
>An alternative would be to not use p->flags at all, but instead make
>this a property of the signal_struct - since the property is shared by
>all threads, that might make more sense?

Yeah, thanks for clarifying. Now it's more obvious.
I need to take a look at the signal struct before I can say anything about this.
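
The inheritance rule discussed in this thread can be modelled in user space. The following is a minimal sketch, not kernel code: the MODEL_* constants and both helper functions are hypothetical names with made-up bit values, and the only thing taken from the thread is the order of the two conditionals. It contrasts the patch as posted (the flag is only ever added) with the proposed fix (the flag is stripped unless a thread is being created, then re-added on request).

```c
#include <assert.h>

/* Hypothetical bit values chosen for this model only. They do not match
 * the kernel's constants, and CLONE_WAIT_PID / PF_WAIT_PID exist only in
 * the patch under review. */
#define MODEL_CLONE_THREAD   0x1u
#define MODEL_CLONE_WAIT_PID 0x2u
#define MODEL_PF_WAIT_PID    0x1u

/* Patch as posted: the new task starts with the parent's flags and
 * PF_WAIT_PID is only ever set, never cleared. */
static unsigned int model_flags_as_posted(unsigned int parent_flags,
                                          unsigned int clone_flags)
{
    unsigned int flags = parent_flags;          /* inherited from parent */

    if (clone_flags & MODEL_CLONE_WAIT_PID)
        flags |= MODEL_PF_WAIT_PID;
    return flags;
}

/* Proposed fix: a new process drops the inherited flag, a new thread
 * keeps it, and an explicit CLONE_WAIT_PID request sets it. */
static unsigned int model_flags_fixed(unsigned int parent_flags,
                                      unsigned int clone_flags)
{
    unsigned int flags = parent_flags;          /* inherited from parent */

    if (!(clone_flags & MODEL_CLONE_THREAD))
        flags &= ~MODEL_PF_WAIT_PID;            /* plain fork: strip */
    if (clone_flags & MODEL_CLONE_WAIT_PID)
        flags |= MODEL_PF_WAIT_PID;             /* explicit request: set */
    return flags;
}
```

With these helpers, B created via model_flags_fixed(0, MODEL_CLONE_WAIT_PID) carries the flag, a plain fork from B (clone_flags of 0) yields a child without it, and a thread of B (MODEL_CLONE_THREAD) keeps it. The as-posted variant leaks the flag into B's forked child, which is the case where B could no longer reap C with wait(&status).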