Date: Fri, 13 Aug 2021 12:00:48 -0700
From: Josh Triplett
To: Pavel Begunkov
Cc: Jens Axboe, io-uring@vger.kernel.org, "David S. Miller",
    Jakub Kicinski, linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
    Stefan Metzmacher
Subject: Re: [PATCH v2 0/4] open/accept directly into io_uring fixed file table
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 13, 2021 at 05:43:09PM +0100, Pavel Begunkov wrote:
> Add an optional feature to open/accept directly into io_uring's fixed
> file table bypassing the normal file table.
> Same behaviour as the snippet below, but in one operation:
>
> sqe = prep_[open,accept](...);
> cqe = submit_and_wait(sqe);
> // error handling
> io_uring_register_files_update(uring_idx, (fd = cqe->res));
> // optionally
> close((fd = cqe->res));
>
> The idea is pretty old, and was brought up and implemented a year ago
> by Josh Triplett, though it hasn't seen the light for some reason.

Thank you for working to get this over the finish line!

> Tested on basic cases, will be sent out as liburing patches later.
>
> A copy paste from 2/2 describing user API and some notes:
>
> The behaviour is controlled by setting sqe->file_index, where 0 implies
> the old behaviour. If a non-zero value is specified, then it will behave
> as described and place the file into a fixed file slot
> sqe->file_index - 1. A file table should be already created, the slot
> should be valid and empty, otherwise the operation will fail.
>
> Note 1: we can't use IOSQE_FIXED_FILE to switch between modes, because
> accept takes a file, and it already uses the flag with a different
> meaning.
>
> Note 2: it's u16, where in theory the limit for fixed file tables might
> get increased in the future. If that ever happens, we'd better
> work around it later, e.g. by making ioprio represent the upper 16 bits.
> The layout for open is tight enough already.

Rather than using sqe->file_index - 1, which feels like an error-prone
interface, I think it makes sense to use a dedicated flag for this, like
IOSQE_OPEN_FIXED. That flag could work for any open-like operation,
including open, accept, and in the future many other operations such as
memfd_create. (Imagine using a single ring submission to open a memfd,
write a buffer into it, seal it, send it over a UNIX socket, and then
close it.)

The only downside is that you'll need to reject that flag in all
non-open operations.
One way to unify that code might be to add a flag in io_op_def for
open-like operations, and then check in common code for the case of
non-open-like operations passing IOSQE_OPEN_FIXED.

Also, rather than using a 16-bit index for the fixed file table and
potentially requiring expansion into a different field in the future,
what about overlapping it with the nofile field in the open and accept
requests? If they're not opening a normal file descriptor, they don't
need nofile. And in the original sqe, you can then overlap it with a
32-bit field like splice_fd_in.

EEXIST seems like the wrong error code to use if the index is already in
use; open can already return EEXIST if you pass O_EXCL. How about EBADF,
or better yet EBADSLT, which is unlikely to be returned for any other
reason?

- Josh Triplett
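
For illustration, here is a rough standalone sketch of the io_op_def
idea above: a per-opcode "open-like" bit plus a single common check that
rejects IOSQE_OPEN_FIXED everywhere else. This is not kernel code. Every
sketch_* name, the opcode list, and the flag's bit value are placeholders
made up for the example, and returning -EINVAL for the rejection is an
assumption as well.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bit: the name follows the suggestion above, the
 * value is an arbitrary placeholder and not a real IOSQE_* bit. */
#define SKETCH_IOSQE_OPEN_FIXED (1u << 7)

/* Toy opcode set; this is NOT the kernel's IORING_OP_* numbering. */
enum sketch_op {
	SKETCH_OP_NOP,
	SKETCH_OP_READ,
	SKETCH_OP_OPENAT,
	SKETCH_OP_ACCEPT,
	SKETCH_OP_LAST,
};

/* Stand-in for io_op_def, carrying one extra per-opcode bit. */
struct sketch_op_def {
	bool open_like;	/* op creates a file and may target a fixed slot */
};

static const struct sketch_op_def sketch_op_defs[SKETCH_OP_LAST] = {
	[SKETCH_OP_NOP]    = { .open_like = false },
	[SKETCH_OP_READ]   = { .open_like = false },
	[SKETCH_OP_OPENAT] = { .open_like = true },
	[SKETCH_OP_ACCEPT] = { .open_like = true },
};

/* Common-code check, run once per request before opcode-specific prep:
 * the flag is only accepted for opcodes marked open-like. */
static int sketch_check_open_fixed(unsigned int opcode, unsigned int sqe_flags)
{
	if (opcode >= SKETCH_OP_LAST)
		return -EINVAL;
	if ((sqe_flags & SKETCH_IOSQE_OPEN_FIXED) &&
	    !sketch_op_defs[opcode].open_like)
		return -EINVAL;	/* assumed error code for the rejection */
	return 0;
}
```

With a table like this, adding a future open-like operation (say,
memfd_create) would only mean setting one bit in its table entry; the
rejection path for everything else stays in one place.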