Received: by 2002:ac0:bc90:0:0:0:0:0 with SMTP id a16csp4424678img; Tue, 26 Mar 2019 09:08:39 -0700 (PDT) X-Google-Smtp-Source: APXvYqznw5YbM3+Wp1vePjV2ijOpIjY34SrBmY/2VP4aHyckOMN0DXs98AnJ1BmW4kSokhn+23Aj X-Received: by 2002:a63:7843:: with SMTP id t64mr29883366pgc.178.1553616519831; Tue, 26 Mar 2019 09:08:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553616519; cv=none; d=google.com; s=arc-20160816; b=v4Wtz+GQeUDPrHhKW+TPTbSe4i4xcebf9THp7HnYKmKgWFe3Me/h8hkf6SXQf3cZRY R+Wd9U0Y4KrjKmCZ0MfqUViRfiGH58sm+cTOHrQAkEcf+jYeytH4ZiXyD9Q8tw8z4std UqIwtKozou0gAJlH2W8MOhLg9x99Ek/KU53KM/Du55sOIkgpUuLufFpS/ZMSyir0vUDS lS/tJEeEd2jZ0betarXeFHw4NVTDpYi3jdhNS5bdfvRxCsiS4D2hgtKy+6iQpHzgL05F A/jFDOYRVXe3NzGOtJUqmJzYpA9ZnhLvxbuABQbWMqJJXtG7mAdKvU13wJ3RpEZNzs3D VkPQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=QJ1htOgLKMIjpEqiqEqAcf/VdDwidl5sIsLdfwmDN0A=; b=spi3HAt9aFSW5hHw0Vyd4EvMGqnQHIqedElQOHV9yClGPjUz3Qz14Dio97Deee/iue p/nxu36EKcNSV8hYcSyVsU4wUp2KUL7TKmB0A8rGjkv8iEIMplIc+KNKL4TUKBUClz+i 5yCHMSlHY1dlEQ4SX9NHiMnn1W1StZTzU0lWn9eI2oi5PfDu2Zb4tOfsL3GghcDnbm2q w82zo20+ptECLHsadwn+1waglOUVEtSgggzCa3ztReCpnOD4dcb1I6cJRc1ZniUom2ii 4aw3IO3HXPnaWc1om/X8bQEZfu2vQxNU/LXDeDU6aYrlklc2mnNe+uOFKGseA9CiLs6D x2Zw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=xVmHjKTG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d9si18196268pln.403.2019.03.26.09.08.24; Tue, 26 Mar 2019 09:08:39 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=xVmHjKTG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732057AbfCZQHd (ORCPT + 99 others); Tue, 26 Mar 2019 12:07:33 -0400 Received: from mail-pl1-f194.google.com ([209.85.214.194]:45744 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730454AbfCZQHd (ORCPT ); Tue, 26 Mar 2019 12:07:33 -0400 Received: by mail-pl1-f194.google.com with SMTP id bf11so1850529plb.12 for ; Tue, 26 Mar 2019 09:07:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=QJ1htOgLKMIjpEqiqEqAcf/VdDwidl5sIsLdfwmDN0A=; b=xVmHjKTGlw+QUZFp19EcARGAeF8sCBEWC1M+hH27FRGy9plVZdgB1HFO1Wd1F14HFK yi1/GjjKBYGchV5smReb870JAIJWXrSbcbaELDbN8TEcLbIVkxFO4dw7RChD4JMzoicH 2esPVCN8ksUNi+FSO/i1f8olFTwSEUCFfVPks= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=QJ1htOgLKMIjpEqiqEqAcf/VdDwidl5sIsLdfwmDN0A=; b=ksav2FdyfbzhZ139f97zBXy6wphJIKgqo01UsJbqSr8AE+bxQ0m3OsptMF/xFNouzj NQYe66yBR9of/C7POz1UGVKqjmqzqbQqqbhLvsCCuQNtOVPMci5J3ZuYZJlxCB/GwAUd eQpc2OU7M5YKWHN1T1JSYxi4F/R2upsCVrCTRzhGyQREoIf1arMX6GnQ1rKxQoGIM/sS E/fL+IWToPgP2sqVNmtaVRTsnfJ+rSttidEm691ZLUoUoeeHNboqFxFLxCZKKtUvRHS3 4iQlO0CPnIlmO+YGjyVki6LGsf7EzjAXRTiZI3+DoaaW4A/NR1G2VjJlhvsSXAxhBFma EMLA== X-Gm-Message-State: APjAAAVoaeVlhfqJcXb5AM1IJrN5F5iC2Dclr0N38e/XzVoO7a+eOfhu VKwYjAaDPDvV8pnNPkIuiBkGgg== X-Received: by 2002:a17:902:e85:: with SMTP id 5mr32119928plx.13.1553616451877; Tue, 26 Mar 2019 09:07:31 -0700 (PDT) Received: from localhost ([2620:15c:6:12:9c46:e0da:efbf:69cc]) by smtp.gmail.com with ESMTPSA id e1sm24679452pfn.73.2019.03.26.09.07.30 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 26 Mar 2019 09:07:30 -0700 (PDT) Date: Tue, 26 Mar 2019 12:07:29 -0400 From: Joel Fernandes To: Jann Horn Cc: Christian Brauner , Andy Lutomirski , Konstantin Khlebnikov , David Howells , "Serge E. Hallyn" , "Eric W. Biederman" , Linux API , kernel list , Arnd Bergmann , Kees Cook , Alexey Dobriyan , Thomas Gleixner , Michael Kerrisk-manpages , bl0pbl33p@gmail.com, "Dmitry V. Levin" , Andrew Morton , Oleg Nesterov , Nagarathnam Muthusamy , cyphar@cyphar.com, Al Viro , Daniel Colascione Subject: Re: [PATCH 2/4] pid: add pidctl() Message-ID: <20190326160729.GA76870@google.com> References: <20190325162052.28987-1-christian@brauner.io> <20190325162052.28987-3-christian@brauner.io> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Mar 25, 2019 at 07:18:42PM +0100, Jann Horn wrote: > On Mon, Mar 25, 2019 at 5:21 PM Christian Brauner wrote: > > The pidctl() syscalls builds on, extends, and improves translate_pid() [4]. > > I quote Konstantins original patchset first that has already been acked and > > picked up by Eric before and whose functionality is preserved in this > > syscall: > [...] > > + > > +static struct pid_namespace *get_pid_ns_by_fd(int fd) > > +{ > > + struct pid_namespace *pidns = ERR_PTR(-EINVAL); > > + > > + if (fd >= 0) { > > +#ifdef CONFIG_PID_NS > > + struct ns_common *ns; > > + struct file *file = proc_ns_fget(fd); > > + if (IS_ERR(file)) > > + return ERR_CAST(file); > > + > > + ns = get_proc_ns(file_inode(file)); > > + if (ns->ops->type == CLONE_NEWPID) > > + pidns = get_pid_ns( > > + container_of(ns, struct pid_namespace, ns)); > > This increments the refcount of the pidns... > > > + > > + fput(file); > > +#endif > > + } else { > > + pidns = task_active_pid_ns(current); > > ... but this doesn't. That's pretty subtle; could you please put a > comment on top of this function that points this out? Or even better, > change the function to always take a reference, so that the caller > doesn't have to worry about figuring this out. > > > + } > > + > > + return pidns; > > +} > [...] > > +SYSCALL_DEFINE5(pidctl, unsigned int, cmd, pid_t, pid, int, source, int, target, > > + unsigned int, flags) > > +{ > > + struct pid_namespace *source_ns = NULL, *target_ns = NULL; > > + struct pid *struct_pid; > > + pid_t result; > > + > > + switch (cmd) { > > + case PIDCMD_QUERY_PIDNS: > > + if (pid != 0) > > + return -EINVAL; > > + pid = 1; > > + /* fall through */ > > + case PIDCMD_QUERY_PID: > > + if (flags != 0) > > + return -EINVAL; > > + break; > > + case PIDCMD_GET_PIDFD: > > + if (flags & ~PIDCTL_CLOEXEC) > > + return -EINVAL; > > + break; > > + default: > > + return -EOPNOTSUPP; > > + } > > + > > + source_ns = get_pid_ns_by_fd(source); > > + result = PTR_ERR(source_ns); > > I very much dislike using PTR_ERR() on pointers before checking > whether they contain an error value or not. I understand that the > result of this won't actually be used, but it still seems weird to > have what is essentially a cast of a potentially valid pointer to a > potentially smaller integer here. > > Could you maybe move the PTR_ERR() into the error branch? Like so: > > if (IS_ERR(source_ns)) { > result = PTR_ERR(source_ns); > goto err_source; > } FWIW, thought of mentioning that once the get_pid_ns_by_fd can be modified to always take a reference on the ns, a further simplifcation here could be: if (IS_ERR(source_ns)) { result = PTR_ERR(source_ns); source_ns = NULL; goto error; } if (IS_ERR(target_ns)) { result = PTR_ERR(target_ns); target_ns = NULL; goto error; } And the error patch can be simplified as well which also avoids the "if (target)" issues Jan mentioned in the error path: error: if (source_ns) put_pid_ns(source_ns); if (target_ns) put_pid_ns(target_ns); return result; > > + if (IS_ERR(source_ns)) > > + goto err_source; > > + > > + target_ns = get_pid_ns_by_fd(target); > > + result = PTR_ERR(target_ns); > > + if (IS_ERR(target_ns)) > > + goto err_target; > > + > > + if (cmd == PIDCMD_QUERY_PIDNS) { > > + result = pidns_related(source_ns, target_ns); > > + } else { > > + rcu_read_lock(); > > + struct_pid = find_pid_ns(pid, source_ns); > > find_pid_ns() doesn't take a reference on its return value, the return > value is only pinned into memory by the RCU read-side critical > section... > > > + result = struct_pid ? pid_nr_ns(struct_pid, target_ns) : -ESRCH; > > + rcu_read_unlock(); > > ... which ends here, making struct_pid a dangling pointer... > > > + > > + if (cmd == PIDCMD_GET_PIDFD) { > > + int cloexec = (flags & PIDCTL_CLOEXEC) ? O_CLOEXEC : 0; > > + if (result > 0) > > + result = pidfd_create_fd(struct_pid, cloexec); > > ... and then here you continue to use struct_pid. That seems bogus. Absolutely. > > + else if (result == 0) > > + result = -ENOENT; > > You don't need to have flags for this for new syscalls, you can just > make everything O_CLOEXEC; if someone wants to preserve an fd across > execve(), they can just clear that bit with fcntl(). (I thiiink it was > Andy Lutomirski who said that before regarding another syscall? But my > memory of that is pretty foggy, might've been someone else.) Agreed, I also was going to say the same, about the flags. thanks, - Joel > > > + } > > + } > > + > > + if (target) > > + put_pid_ns(target_ns); > > +err_target: > > + if (source) > > + put_pid_ns(source_ns); > > I think you probably intended to check for "if (target >= 0)" and "if > (source >= 0)" instead of these conditions, matching the condition in > get_pid_ns_by_fd()? The current code looks as if it will leave the > refcount one too high when used with target==0 or source==0, and leave > the refcount one too low when used with target==-1 or source==-1. > Anyway, as stated above, I think it would be simpler to > unconditionally take a reference instead. > > > +err_source: > > + return result; > > +}