From: Miklos Szeredi
Date: Thu, 23 Apr 2020 08:04:25 +0200
Subject: Re: [PATCH v5 2/3] fs: openat2: Extend open_how to allow userspace-selected fds
To: Josh Triplett
Cc: Michael Kerrisk, io-uring@vger.kernel.org, linux-fsdevel@vger.kernel.org, lkml, Alexander Viro, Arnd Bergmann, Jens Axboe, Aleksa Sarai, linux-man, Linux API

On Thu, Apr 23, 2020 at 6:42 AM Josh Triplett wrote:
>
> On Thu, Apr 23, 2020 at 06:24:14AM +0200, Miklos Szeredi wrote:
> > On Thu, Apr 23, 2020 at 2:48 AM Josh Triplett wrote:
> > > On Wed, Apr 22, 2020 at 09:55:56AM +0200, Miklos Szeredi wrote:
> > > > On Wed, Apr 22, 2020 at 8:06 AM Michael Kerrisk (man-pages)
> > > > wrote:
> > > > >
> > > > > [CC += linux-api]
> > > > >
> > > > > On Wed, 22 Apr 2020 at 07:20, Josh Triplett wrote:
> > > > > >
> > > > > > Inspired by the X protocol's handling of XIDs, allow userspace to select
> > > > > > the file descriptor opened by openat2, so that it can use the resulting
> > > > > > file descriptor in subsequent system calls without waiting for the
> > > > > > response to openat2.
> > > > > >
> > > > > > In io_uring, this allows sequences like openat2/read/close without
> > > > > > waiting for the openat2 to complete. Multiple such sequences can
> > > > > > overlap, as long as each uses a distinct file descriptor.
> > > >
> > > > If this is primarily an io_uring feature, then why burden the normal
> > > > openat2 API with this?
> > >
> > > This feature was inspired by io_uring; it isn't exclusively of value
> > > with io_uring. (And io_uring doesn't normally change the semantics of
> > > syscalls.)
> >
> > What's the use case of O_SPECIFIC_FD beyond io_uring?
>
> Avoiding a call to dup2 and close, if you need something as a specific
> file descriptor, such as when setting up to exec something, or when
> debugging a program.
>
> I don't expect it to be as widely used as with io_uring, but I also
> don't want io_uring versions of syscalls to diverge from the underlying
> syscalls, and this would be a heavy divergence.

What are the plans for those syscalls that don't easily lend
themselves to this modification (such as accept(2))? Do we want to
introduce another variant of these? Is that really worth it? If not,
we are faced with the same divergence.

Compared to that, having a common flag for file ops to enable the use
of fixed and private file descriptors is a clean and well contained
interface.

> > > > This would also allow Implementing a private fd table for io_uring.
> > > > I.e. add a flag interpreted by file ops (IORING_PRIVATE_FD), including
> > > > openat2 and freely use the private fd space without having to worry
> > > > about interactions with other parts of the system.
> > >
> > > I definitely don't want to add a special kind of file descriptor that
> > > doesn't work in normal syscalls taking file descriptors. A file
> > > descriptor allocated via O_SPECIFIC_FD is an entirely normal file
> > > descriptor, and works anywhere a file descriptor normally works.
> >
> > What's the use case of allocating a file descriptor within io_uring
> > and using it outside of io_uring?
>
> Calling a syscall not provided via io_uring. Calling a library that
> doesn't use io_uring. Passing the file descriptor via UNIX socket to
> another program. Passing the file descriptor via exec to another
> program. Userspace is modular, and file descriptors are widely used.

I mean, you could open the file descriptor outside of io_uring in such
cases, no? The point of O_SPECIFIC_FD is to be able to perform short
sequences of open/dosomething/close without having to block and having
to issue separate syscalls. If you're going to issue separate
syscalls anyway, then I see no point in doing the open within
io_uring. Or?

Thanks,
Miklos
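
For reference, the "call to dup2 and close" that Josh mentions as the non-io_uring use case can be sketched roughly as below. This is only an illustration using the existing open(2), dup2(2) and close(2) calls, not code from the patch series; the helper name open_at_fd is hypothetical, and the sketch makes no assumption about how the proposed O_SPECIFIC_FD interface would look.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` and make it available at descriptor
 * number `target`, using today's three-step pattern
 * (open, then dup2, then close of the temporary descriptor).
 * Returns `target` on success, -1 on failure.
 */
static int open_at_fd(const char *path, int flags, int target)
{
    int fd = open(path, flags);
    if (fd < 0)
        return -1;

    /* The kernel happened to hand out the number we wanted. */
    if (fd == target)
        return fd;

    /* dup2 silently closes whatever was previously open at `target`. */
    if (dup2(fd, target) < 0) {
        close(fd);
        return -1;
    }

    /* Drop the temporary descriptor; only `target` remains. */
    close(fd);
    return target;
}
```

A caller preparing to exec a child might use open_at_fd("/dev/null", O_RDONLY, 0) to set up stdin. That is up to three syscalls where a userspace-selected fd would need one, which is the saving under discussion.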