Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752087AbaJSXfb (ORCPT ); Sun, 19 Oct 2014 19:35:31 -0400
Received: from mail-la0-f45.google.com ([209.85.215.45]:44335 "EHLO
        mail-la0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751579AbaJSXf1 (ORCPT ); Sun, 19 Oct 2014 19:35:27 -0400
MIME-Version: 1.0
In-Reply-To: <20141019224251.GJ7996@ZenIV.linux.org.uk>
References: <1401975635-6162-1-git-send-email-drysdale@google.com>
        <20141019202034.GH7996@ZenIV.linux.org.uk>
        <20141019212921.GI7996@ZenIV.linux.org.uk>
        <20141019224251.GJ7996@ZenIV.linux.org.uk>
From: Andy Lutomirski
Date: Sun, 19 Oct 2014 16:35:05 -0700
Message-ID:
Subject: Re: [PATCHv4 RESEND 0/3] syscalls,x86: Add execveat() system call
To: Al Viro
Cc: David Drysdale, "Eric W. Biederman", Meredydd Luff,
        "linux-kernel@vger.kernel.org", Thomas Gleixner, Ingo Molnar,
        "H. Peter Anvin", Andrew Morton, Kees Cook, Arnd Bergmann,
        X86 ML, linux-arch, Linux API
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Oct 19, 2014 at 3:42 PM, Al Viro wrote:
> On Sun, Oct 19, 2014 at 03:16:03PM -0700, Andy Lutomirski wrote:
>
>> Oh, you mean that #!/usr/bin/make -f would turn into /usr/bin/make
>> /dev/fd/3? That could be interesting, although I can imagine it
>> breaking things, especially if /dev/fd/3 isn't set up like that, e.g.
>> early in boot.
>
> Sigh... What I mean is that fexecve(fd, ...) would have to put _something_
> into argv when it execs the interpreter of #! file. Simply because the
> interpreter (which can be anything whatsoever) has no fscking idea what
> to do with some descriptor it has before execve(). Hell, it doesn't have
> any idea *which* descriptor had it been.
>
> You need to put some pathname that would yield your script upon open(2).
> If you bothered to read those patches, you'd see that they do supply
> one, generating it with d_path(). Which isn't particularly reliable.
>
> I'm not sure there's any point putting any of that in the kernel - if
> you *do* have that pathname, you can just pass it.

Hmm. This issue certainly makes fexecve or execveat less attractive,
at least in cases where d_path won't work.

On the other hand, if you want to run a static binary on a cloexec fd
(or, for that matter, a dynamic binary if you trust the interpreter to
close the extra copy of the fd it gets) in a namespace or chroot where
the binary is invisible, then you need kernel help.

It's too bad that script interpreters don't have a mechanism to open
their scripts by fd.

>
>> Aside from the general scariness of allowing one process to actually
>> dup another process's fds, I feel like this is asking for trouble wrt
>> the various types of file locks.
>
> Who said anything about another process's fds? That, indeed, would be
> a recipe for serious trouble. It's a filesystem with one directory,
> not with one directory for each process...
>

This still has issues with locks if you pass an fd to a child process,
but I guess that you get what you ask for if you do that.

> FWIW, they (Plan 9) do have procfs and there they have /proc//fd.
> Which is a regular file, with contents consisting of \n-terminated
> lines (one per descriptor in 's descriptor table>) in the same
> format as in *ctl (they put descriptor number as the first field in
> those).
>
> Unlike our solution, they do not allow to get to any process' files via
> procfs. They do allow /dev/stdin-style access to your own files via
> dupfs. And yes, for /dev/stdin and friends dup-style semantics is better -
> you get consistent behaviours for pipes and redirects from file that way.
> See the example I've posted upthread.
> Besides, for things like sockets our semantics simply fails - they
> really depend on having only one struct file for given socket; it's
> dup or nothing there. The same goes for a lot of things like eventfd,
> etc.

Fair enough.

--Andy
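
For readers following the dup-versus-reopen distinction in the quoted text, here is a minimal sketch that is not part of the original thread. It assumes a Linux system with /proc mounted, and the function name compare_dup_and_reopen is made up for illustration. It compares dup(2), which shares one open file description, with open(2) on /proc/self/fd/N, which for a regular file on Linux produces a fresh description with its own offset. It also shows that the per-fd pathname does yield the script contents upon open(2).

```python
import os
import tempfile


def compare_dup_and_reopen():
    """Return (offset via dup, offset via /proc/self/fd reopen, bytes read via reopen)."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"#!/bin/sh\necho hello\n")
        tmp.flush()

        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            # Advance the offset of the original open file description to 5.
            os.read(fd, 5)

            # dup(2): the new descriptor refers to the same open file
            # description, so it sees the already-advanced offset.
            dup_fd = os.dup(fd)
            dup_offset = os.lseek(dup_fd, 0, os.SEEK_CUR)
            os.close(dup_fd)

            # open(2) on the per-fd path: assumed Linux behaviour for a
            # regular file is a new description starting at offset 0.
            reopened_fd = os.open(f"/proc/self/fd/{fd}", os.O_RDONLY)
            reopen_offset = os.lseek(reopened_fd, 0, os.SEEK_CUR)
            first_bytes = os.read(reopened_fd, 9)
            os.close(reopened_fd)
        finally:
            os.close(fd)
    return dup_offset, reopen_offset, first_bytes
```

Under those assumptions the dup'ed descriptor reports offset 5 while the reopened one starts again at 0. That is the inconsistency the quoted text refers to when it says dup-style semantics would behave more uniformly for /dev/stdin and friends.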