Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp593011yba; Thu, 18 Apr 2019 06:31:25 -0700 (PDT) X-Google-Smtp-Source: APXvYqz3ktKAIS1Iwbp59VEHHMN7gp8oTNmAuKNdAja66s/1HM/GA8OcIQ39eu9eH2V2EwYWKMgP X-Received: by 2002:aa7:8083:: with SMTP id v3mr41814808pff.135.1555594285096; Thu, 18 Apr 2019 06:31:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1555594285; cv=none; d=google.com; s=arc-20160816; b=VZA/ZRkblIiaVrFpjPKOhmjGeNl26jCDxlssjx6Ij4SCTvo+njLo/EDR6YmBl+yTGn RSR60kiisglpFBIalCy/ptk/IkScSMIccVEUxTNyB1KqdoPHD1YI2LfOnPuIFdbS5UJ2 35esQpn3TM43lQbXWcFhTMeYvYWvgNWtWdxo28rz1kXNAE8DItmJ2JTc7czLs1ciug30 jhFIsoMEOtpkYF2dYztdZ+iNDATA5hs73PjbwDY2+QQs71HqEWn5TTR9z/9yVbTbzE0j sDxPa+7BWoI2PSD927QPPRWhLsMhaVMtjLj9Izpgm3wyEs9d8k7Oyo7nP6PUh4FKmI0s yI1g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=+8hg6qujl1+vbAOJgXR0XPQwmvHW2iRzPh5yyJeKJ+Q=; b=J6OiepW/13rQiEeKikWguW4CapjmbIlyZAXY+opiuHnXdMyQohgVuOLHi0B27D9+TI EjvKNd6KfZhz78W7w9/8MT29cIhTjzcR9e67XUsqBx2lbre5AXLyqeC3Js7HnAvTeXOC LM/cn+ry8+vbCX70Iv3elPefdLez+alsq+ArjwDkvyGMTAcvEgXk5E/WFrA5Lq+z6MNX ie8V/EZvVchozYoNiV6iYj/j08oyWBk3ZBFGr87XA37LhvTdxFjeJq6BW7uCTlCHhqh3 MI0iU5NmhIdzFX6d0c13v+beuX5ai7XT82dXaNIdOGlwygBYWQhhVe/iBwd6/jEvuAcd E/4g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@brauner.io header.s=google header.b=SbeZJGAv; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 194si2483938pfu.48.2019.04.18.06.31.09; Thu, 18 Apr 2019 06:31:25 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@brauner.io header.s=google header.b=SbeZJGAv; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388685AbfDRN22 (ORCPT + 99 others); Thu, 18 Apr 2019 09:28:28 -0400 Received: from mail-ed1-f65.google.com ([209.85.208.65]:36057 "EHLO mail-ed1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388495AbfDRN21 (ORCPT ); Thu, 18 Apr 2019 09:28:27 -0400 Received: by mail-ed1-f65.google.com with SMTP id u57so1820919edm.3 for ; Thu, 18 Apr 2019 06:28:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=brauner.io; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=+8hg6qujl1+vbAOJgXR0XPQwmvHW2iRzPh5yyJeKJ+Q=; b=SbeZJGAvr8VTOX6UdLjzWusPyK86MapUgvtDHJIQHUAJUehW0AiF76gM7DS24r8xpV nZTHKr1C/4raOjSLGsQVv+EB+ihEhM//KJw2RgcM0mOlY2/iYEOvcDuVFh0QE5MIQSS6 1zfWAu18Pjore6aDzEK2/HI8CkkHDVQv8Hqq1KEDRVWowD3T/sNerb4kH0Tu5NHkP7m3 4wTI6OzUOMdorIuuRzUlqvow2MYo8U/0r7PsWdr5YKf+Aw8nJf4F9OvZO2mom/1PE19p +haGhTrxhcvthfUGz3Q1O4jzRWNkmfjOU4W0NqFVS/TlQHceWS7buPqpTuXIHdEah7Su /qdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=+8hg6qujl1+vbAOJgXR0XPQwmvHW2iRzPh5yyJeKJ+Q=; b=ucAQFlqEJIF1219NhPPiJtGlpk27IwypJjmU4K3yks1vTFflHAxHZFxze21j12SMj/ LQyRtmhxUHCnPGx3OP0YmCrkg/UcC8H1+9BfjHoWPDWTI/7r1CleSkdj7dK1WN4Ci0aH Jv/6+/2IscrNWmov/xU5qt1LPjZVmq4hgXeiwh1FBz7L4xX+/s89pbbuqFK9HHoPiIOm HXdvvieDFoSGJBY8BAwIdU817bEksmzpAarvzU+ZpXHM54JdgwbxqjTdaJc9QRrxilfx Jh4g8VeUZwj5fS1ct5GCklrZeO5CDSNaJR00wX2XItzqtCfvxOk61xjdQSS6MmAIG/73 VRPw== X-Gm-Message-State: APjAAAVOQHWDDluIi7DboeiRBTSnNGyXDvA+pGZk1RAgCeZ9iw8IUBeH JZReyrReS7mr2GrbjN7TIqOxsw== X-Received: by 2002:a17:906:e202:: with SMTP id gf2mr50491722ejb.55.1555594105530; Thu, 18 Apr 2019 06:28:25 -0700 (PDT) Received: from brauner.io ([212.91.227.56]) by smtp.gmail.com with ESMTPSA id x14sm526350edm.1.2019.04.18.06.28.24 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Thu, 18 Apr 2019 06:28:24 -0700 (PDT) Date: Thu, 18 Apr 2019 15:28:23 +0200 From: Christian Brauner To: Oleg Nesterov Cc: torvalds@linux-foundation.org, viro@zeniv.linux.org.uk, jannh@google.com, dhowells@redhat.com, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org, serge@hallyn.com, luto@kernel.org, arnd@arndb.de, ebiederm@xmission.com, keescook@chromium.org, tglx@linutronix.de, mtk.manpages@gmail.com, akpm@linux-foundation.org, cyphar@cyphar.com, joel@joelfernandes.org, dancol@google.com Subject: Re: [PATCH v2 2/5] clone: add CLONE_PIDFD Message-ID: <20190418132822.untjt7erfvbbiz7a@brauner.io> References: <20190418101841.4476-1-christian@brauner.io> <20190418101841.4476-3-christian@brauner.io> <20190418131206.GB13701@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20190418131206.GB13701@redhat.com> User-Agent: NeoMutt/20180716 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Apr 18, 2019 at 03:12:07PM +0200, Oleg Nesterov wrote: > On 04/18, Christian Brauner wrote: > > > > @@ -1674,13 +1729,14 @@ static __latent_entropy struct task_struct *copy_process( > > unsigned long clone_flags, > > unsigned long stack_start, > > unsigned long stack_size, > > + int __user *parent_tidptr, > > int __user *child_tidptr, > > struct pid *pid, > > int trace, > > unsigned long tls, > > int node) > > { > > - int retval; > > + int pidfd = -1, retval; > > it seems that initialization is unneeded, but this is cosmetic. > > I see no technical problems, feel free to add my reviewed-by. Thank you! > > > But let me ask a couple of questions... Sure! > > > Why O_CLOEXEC? I am just curious, I do not really care. I think that having the file descriptor O_CLOEXEC by default is a good security measure in general. Most of the time you do not want to pass a file descriptor through exec() (apart from 0,1,2) and it is usually more of an issue when you accidently do it then when you accidently don't. So if users really care about passing a pidfd they should do so by removing the O_CLOEXEC flag explicitly. (New file descriptors should probably all default to that but that's just my opinion.) Another thing is that for a pidfds it makes even more sense to have them cloexec by default. You don't want to *unintentionally* leak an fd that can be used to operate on a process. > > > Should we allow CLONE_THREAD | CLONE_PIDFD ? I think so, yes. I have thought about this. Yes, due to CLONE_FILES | CLONE_VM you'd necessarily hand the pidfd to the child but threads are no security boundary in the first place. So if you have a non-cooperating thread you very much already have a problem. The situation is not very much different from passing the tid. Plus, CLONE_PIDFD can make use of the CLONE_UNDETACHED flag in the future in contrast to all the other flags. So one could potentially (not saying we have this planned!) add a flag CLONE_PIDFD_KILL_ON_CLOSE (This is just an improvised bad name rn.) which would cause the child to kill itself if it does close(pidfd) on its own pidfd and noone else is holding a reference at which point you'd hand a suicide switch to a new thread but nothing else. > > > Are you sure we will never need to extend this interface? If not, then perhaps it Well, we can already make use of CLONE_UNDETACHED with CLONE_PIDFD since we don't just ignore it. :) > make sense to add something like > > if (CLONE_PIDFD) { > unsigned long not_used_yet; > if (get_user(not_used_yet, parent_tidptr) || > not_used_yet != 0) > return -EINVAL; Oh, very interesting point. Yes, I think this is worth it. > } > > this way we can easily add more arguments in future or even turn CLONE_PIDFD into > CLONE_MORE_ARGS_IN_PARENT_TIDPTR. > > Not that I think this is really good idea, sys_clone2() makes more sense, but still. No, you're right about leaving this option open. Even if we don't extend not allowing garbage to be passed is always a good idea. Christian