Subject: Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]
To: Jeff Layton, Greg Kroah-Hartman, Jann Horn
Cc: Miklos Szeredi, Karel Zak, David Howells, Ian Kent,
 Christian Brauner, James Bottomley, Steven Whitehouse, Miklos Szeredi,
 viro, Christian Brauner, "Darrick J. Wong", Linux API, linux-fsdevel,
 lkml
References: <1509948.1583226773@warthog.procyon.org.uk>
 <20200303113814.rsqhljkch6tgorpu@ws.net.home>
 <20200303130347.GA2302029@kroah.com>
 <20200303131434.GA2373427@kroah.com>
 <20200303134316.GA2509660@kroah.com>
 <20200303141030.GA2811@kroah.com>
 <20200303142407.GA47158@kroah.com>
 <030888a2-db3e-919d-d8ef-79dcc10779f9@kernel.dk>
From: Jens Axboe
Message-ID: <7a05adc8-1ca9-c900-7b24-305f1b3a9b86@kernel.dk>
Date: Tue, 3 Mar 2020 09:55:20 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/3/20 9:51 AM, Jeff Layton wrote:
> On Tue, 2020-03-03 at 08:44 -0700, Jens Axboe wrote:
>> On 3/3/20 7:24 AM, Greg Kroah-Hartman wrote:
>>> On Tue, Mar 03, 2020 at 03:13:26PM +0100, Jann Horn wrote:
>>>> On Tue, Mar 3, 2020 at 3:10 PM Greg Kroah-Hartman
>>>> wrote:
>>>>> On Tue, Mar 03, 2020 at 02:43:16PM +0100, Greg Kroah-Hartman wrote:
>>>>>> On Tue, Mar 03, 2020 at 02:34:42PM +0100, Miklos Szeredi wrote:
>>>>>>> On Tue, Mar 3, 2020 at 2:14 PM Greg Kroah-Hartman
>>>>>>> wrote:
>>>>>>>
>>>>>>>>> Unlimited beers for a 21-line kernel patch? Sign me up!
>>>>>>>>>
>>>>>>>>> Totally untested, barely compiled patch below.
>>>>>>>>
>>>>>>>> Ok, that didn't even build, let me try this for real now...
>>>>>>>
>>>>>>> Some comments on the interface:
>>>>>>
>>>>>> Ok, hey, let's do this proper :)
>>>>>
>>>>> Alright, how about this patch.
>>>>>
>>>>> Actually tested with some simple sysfs files.
>>>>>
>>>>> If people don't strongly object, I'll add "real" tests to it, hook it up
>>>>> to all arches, write a manpage, and all the fun fluff a new syscall
>>>>> deserves and submit it "for real".
>>>>
>>>> Just FYI, io_uring is moving towards the same kind of thing... IIRC
>>>> you can already use it to batch a bunch of open() calls, then batch a
>>>> bunch of read() calls on all the new fds and close them at the same
>>>> time. And I think they're planning to add support for doing
>>>> open()+read()+close() all in one go, too, except that it's a bit
>>>> complicated because passing forward the file descriptor in a generic
>>>> way is a bit complicated.
>>>
>>> It is complicated, I wouldn't recommend using io_uring for reading a
>>> bunch of procfs or sysfs files, that feels like a ton of overkill with
>>> too much setup/teardown to make it worthwhile.
>>>
>>> But maybe not, will have to watch and see how it goes.
>>
>> It really isn't, and I too think it makes more sense than having a
>> system call just for the explicit purpose of open/read/close. As Jann
>> said, you can't currently do a linked sequence of open/read/close,
>> because the fd passing between them isn't done. But that will come in
>> the future. If the use case is "a bunch of files", then you could
>> trivially do "open bunch", "read bunch", "close bunch" in three separate
>> steps.
>>
>> Curious what the use case is for this that warrants a special system
>> call?
>>
>
> Agreed. I'd really rather see something more general-purpose than the
> proposed readfile(). At least with NFS and SMB, you can compound
> together fairly arbitrary sorts of operations, and it'd be nice to be
> able to pattern calls into the kernel for those sorts of uses.
>
> So, NFSv4 has the concept of a current_stateid that is maintained by the
> server. So basically you can do all this (e.g.) in a single compound:
>
> open
> write
> close
>
> It'd be nice to be able to do something similar with io_uring. Make it
> so that when you do an open, you set the "current fd" inside the
> kernel's context, and then be able to issue io_uring requests that
> specify a magic "fd" value that uses it.
>
> That would be a really useful pattern.

For io_uring, you can link requests that you submit into a chain. Each
link in the chain is done in sequence. Which means that you could do:

    open -> read -> close

in a single sequence. The only thing that is missing right now is a way
to have the return of that open propagated to the 'fd' of the read and
close, and it's actually one of the topics to discuss at LSFMM next
month.

One approach would be to use BPF to handle this passing. Another
suggestion has been to have the read/close specify some magic 'fd' value
that just means "inherit fd from result of previous". The latter sounds
very close to the stateid you mention above, and the upside here is that
it wouldn't explode the necessary toolchain to need to include BPF.

In other words, this is really close to being reality and practically
feasible.

-- 
Jens Axboe
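
To make the "magic fd" idea concrete, here is a rough userspace model of
a linked open/read/close chain. It is only a sketch and does not use
io_uring at all: it runs ordinary blocking syscalls in order, keeps a
per-chain "current fd" that the open sets, and lets later requests refer
to it through a sentinel. The names run_chain and CHAIN_FD, the sentinel
value, and the tuple format of the requests are all made up for
illustration.

```python
import os
import tempfile

# Hypothetical sentinel meaning "use the fd opened earlier in this chain".
# The name and value are invented for this sketch; they are not io_uring API.
CHAIN_FD = -1000


def run_chain(requests):
    """Run linked requests in order and stop at the first failure.

    Each request is a tuple: ("open", path), ("read", fd, nbytes) or
    ("close", fd). Returns one result per request that was attempted.
    """
    current_fd = None  # the chain's "current fd" context
    results = []

    def resolve(fd):
        # Replace the sentinel with the fd produced by the chain's open.
        if fd != CHAIN_FD:
            return fd
        if current_fd is None:
            raise ValueError("CHAIN_FD used before any open in the chain")
        return current_fd

    for req in requests:
        op = req[0]
        try:
            if op == "open":
                current_fd = os.open(req[1], os.O_RDONLY)
                results.append(current_fd)
            elif op == "read":
                results.append(os.read(resolve(req[1]), req[2]))
            elif op == "close":
                os.close(resolve(req[1]))
                results.append(0)
            else:
                raise ValueError(f"unknown op {op!r}")
        except OSError as err:
            # In this model a failed link ends the chain; later links never run.
            results.append(err)
            break
    return results


if __name__ == "__main__":
    # Small demo on a temporary file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello from the chain\n")
        path = tmp.name
    chain = [("open", path), ("read", CHAIN_FD, 4096), ("close", CHAIN_FD)]
    print(run_chain(chain))
    os.unlink(path)
```

The caller never sees the fd between the open and the read, which is the
point of the sentinel. Note that in this simple model a read failure
skips the close and leaks the fd; a real design would have to decide how
to handle that case.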