References: <20190327162147.23198-1-christian@brauner.io>
 <20190327162147.23198-3-christian@brauner.io>
 <20190327213404.pv4wqtkjbufkx36u@brauner.io>
 <20190327222543.huugotqcew6jyytv@brauner.io>
 <20190328103813.eogszrqbitw3e7k7@brauner.io>
From: Jonathan Kowalski
Date: Sat, 30 Mar 2019 06:25:09 +0000
Subject: Re: [PATCH 2/4] pid: add pidfd_open()
To: Daniel Colascione
Cc: Christian Brauner, Jann Horn, Konstantin Khlebnikov, Andy Lutomirski,
 David Howells, "Serge E. Hallyn", "Eric W. Biederman", Linux API,
 linux-kernel, Arnd Bergmann, Kees Cook, Alexey Dobriyan, Thomas Gleixner,
 Michael Kerrisk-manpages, "Dmitry V. Levin", Andrew Morton, Oleg Nesterov,
 Nagarathnam Muthusamy, Aleksa Sarai, Al Viro, Joel Fernandes
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Mar 30, 2019 at 5:35 AM Daniel Colascione wrote:
>
> On Thu, Mar 28, 2019 at 3:38 AM Christian Brauner wrote:
> >
> > > All that said, thanks for the work on this once again. My intention is
> > > just that we don't end up with an API that could have been done better
> > > and be cleaner to use for potential users in the coming years.
> >
> > Thanks for your input on all of this. I still don't find multiplexers in
> > the style of seccomp()/fsconfig()/keyctl() to be a problem since they
> > deal with a specific task.
> > They are very much different from ioctl()s in
> > that regard. But since Joel, you, and Daniel found the pidctl() approach
> > not very nice I dropped it. The interface needs to be satisfactory for
> > all of us especially since Android and other system managers will be the
> > main consumers.
>
> Thanks.
>
> > So let's split this into pidfd_open(pid_t pid, unsigned int flags) which
> > allows to cleanly get pidfds independent procfs and do the translation
> > to procpidfds in an ioctl() as we've discussed in prior threads. This
>
> I sustain my objection to adding an ioctl. Compared to a system call,
> an ioctl has a more rigid interface, greater susceptibility to
> programmer error (due to the same ioctl control code potentially doing
> different things for different file types), longer path length, and
> more awkward filtering/monitoring/auditing/tracing. We've discussed
> this issue at length before, and I thought we all agreed to use system
> calls, not ioctl, for core kernel functionality. So why is an ioctl
> suddenly back on the table? The way I see it, an ioctl has no
> advantages except for 1) conserving system call numbers, which are not
> scarce, and 2) avoiding the system call number coordination problem
> (and the coordination problem isn't a factor for core kernel code). I
> don't understand everyone's reluctance to add new system calls. What
> am I missing? Why would we give up all the advantages that a system
> call gives us?
>

I agree in general, but in this particular case the choice between a
system call and an ioctl doesn't matter much: all it does is take the
pidfd, the command, and /proc's dir fd. If you start adding a system
call for every specific operation on file descriptors, it *will* become
a problem. Besides, the translation is only there because it is racy to
do in userspace; it is not some well-defined core kernel functionality.
It is therefore just a way to enter the kernel and do the openat in a
race-free and safe manner.
As is, the facility being provided through an ioctl on the pidfd is not
something I'd consider a problem.

I think the translation stuff should also probably be an extension of
ioctl_ns(2) (but I wouldn't be opposed if translate_pid is resurrected
as is).

For anything more involved than ioctl(pidfd, PIDFD_TO_PROCFD,
procrootfd), I'd agree that a system call would be a cleaner interface.
Otherwise, if you cannot generalise it, using ioctls as a command
interface is probably the better tradeoff here.

> I also don't understand Andy's argument on the other thread that an
> ioctl is okay if it's an "operation on an FD" --- *most* system calls
> are operations on FDs. We don't have an ioctl for sendmsg(2) and it's
> an "operation on an FD".
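
To illustrate the translation step discussed earlier in this message, here
is a rough sketch of what the userspace-only version might look like. The
helper name proc_dirfd_for_pid() is hypothetical and not part of the patch
series. The proposed PIDFD_TO_PROCFD ioctl is deliberately not called; the
sketch assumes it would do an equivalent lookup inside the kernel, starting
from a pidfd rather than a numeric PID.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not from the patch
 * series): translate a numeric PID into a /proc/<pid> directory fd,
 * relative to an already-open procfs root.
 *
 * This is the racy userspace variant: the PID may exit and be recycled
 * between the caller learning it and openat() resolving the name, so the
 * returned fd can refer to an unrelated process.
 *
 * Returns a directory fd on success, -1 with errno set on failure.
 */
static int proc_dirfd_for_pid(int procrootfd, pid_t pid)
{
	char name[32];

	snprintf(name, sizeof(name), "%ld", (long)pid);
	return openat(procrootfd, name, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}
```

A caller would open /proc once with O_DIRECTORY and pass that fd as
procrootfd. An in-kernel translation keyed on a pidfd avoids the window
described in the comment, because the pidfd already pins the identity of
the process.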