Received: by 2002:a25:ca44:0:0:0:0:0 with SMTP id a65csp2007344ybg; Thu, 30 Jul 2020 08:09:44 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwC6ntcQkINH0GfG2PkTyBYPv7cXh4VIFYe9dMNlu2WbbOtdY6BahSAr1DmwgMhrgKiZozF X-Received: by 2002:a17:906:1104:: with SMTP id h4mr3024731eja.456.1596121783993; Thu, 30 Jul 2020 08:09:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1596121783; cv=none; d=google.com; s=arc-20160816; b=tG7hXWk9wNpq2A+j208IrnGGo+9MKjpW5MwjeRPQ6KCFt7MiNH6z+VfN5U2pFB5Id9 v2rJuvcBa81iSh7eii+RlDBmbnUtn+lddRWFeYA6FYqRZcwalJGguymO4kYjMmowYU3d FlS4gUk6cG/hlzknW5EuPBJ1X0UlK25AyrRUbO1eHawm22BHO7DSDNnKy0oo/mIEXpbe BURfUGEboREQd0jxCl7pmoyn92Lrg1lWMon0O/fwS2GAKgBEOq0urfmuBbZvf38IFihJ eXSXAfIH/gYYKG4Fq+qeqdYrKV3lW4L8h70Oy+jdcjx72Z8W1vhBTo2z3iLZXk8DhH7H ldEg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=QCbRIxXxLdFz3xqh2pbwde2tRvNLrhwGTq6PxNyeAkE=; b=zXEz7TZUDwsjYd591ql7mWwuVc5lEV+CXM1IgREs18o12DX/T5iVQjU1knFtsi6bIY aIVuo6n+S+UL7ejb/GhOsBGMJLvDoRb1SCcFf4uG7cqdCtqot8s89T/9XXrGYoks8fSY GDON/u4p/g5nsTqkYwmjZmvcKLBnLp8/ilGJKZxOvrkh+JYCwtnCo7ytR7C5vWaM1Te3 HQx+xk3tOuHWFJny6wIjBOjU9xG3YZpFCS4/gU2pPEulVvEivgvYDoLg2aSgW8N16wkP Yj01z25Kv90dsOqngPzZ27GetPhDpZVDctwixoKNaxez7IWGcMBpieTv1vja2AOzN7Nu 6iBA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id g5si3308869ejr.456.2020.07.30.08.09.21; Thu, 30 Jul 2020 08:09:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729696AbgG3PIv (ORCPT + 99 others); Thu, 30 Jul 2020 11:08:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726353AbgG3PIu (ORCPT ); Thu, 30 Jul 2020 11:08:50 -0400 Received: from ZenIV.linux.org.uk (zeniv.linux.org.uk [IPv6:2002:c35c:fd02::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A694C061574; Thu, 30 Jul 2020 08:08:50 -0700 (PDT) Received: from viro by ZenIV.linux.org.uk with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1k1AAU-005j5Y-G1; Thu, 30 Jul 2020 15:08:26 +0000 Date: Thu, 30 Jul 2020 16:08:26 +0100 From: Al Viro To: Christoph Hellwig Cc: Linus Torvalds , Stephen Rothwell , Luis Chamberlain , Matthew Wilcox , Kees Cook , Iurii Zaikin , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: Re: [PATCH 22/23] fs: default to generic_file_splice_read for files having ->read_iter Message-ID: <20200730150826.GA1236603@ZenIV.linux.org.uk> References: <20200707174801.4162712-1-hch@lst.de> <20200707174801.4162712-23-hch@lst.de> <20200730000544.GC1236929@ZenIV.linux.org.uk> <20200730070329.GB18653@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200730070329.GB18653@lst.de> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jul 30, 2020 at 09:03:29AM +0200, Christoph Hellwig wrote: > On Thu, Jul 30, 2020 at 01:05:44AM +0100, Al Viro wrote: > > On Tue, Jul 07, 2020 at 07:48:00PM +0200, Christoph Hellwig wrote: > > > If a file implements the ->read_iter method, the iter based splice read > > > works and is always preferred over the ->read based one. Use it by > > > default in do_splice_to and remove all the direct assignment of > > > generic_file_splice_read to file_operations. > > > > The worst problem here is the assumption that all ->read_iter() instances > > will take pipe-backed destination; that's _not_ automatically true. > > In particular, it's almost certainly false for tap_read_iter() (as > > well as tun_chr_read_iter() in IFF_VNET_HDR case). > > > > Other potentially interesting cases: cuse and hugetlbfs. > > > > But in any case, that blind assertion ("iter based splice read works") > > really needs to be backed by something. > > I think we need to fix that in the instances, as we really expect > ->splice_read to just work instead of the caller knowing what could > work and what might not. Er... generic_file_splice_read() is a library helper; the decision to use is up to the filesystem/driver/protocol in question, and so's making sure it's not used with ->read_iter() that isn't fit for it. Note that we *do* have instances where we have different ->splice_read() (sometimes using generic_file_splice_read(), sometimes not) even though ->read_iter() is there. Your patch ignores those (thankfully), but commit message is rather misleading - it strongly implies that generic_file_splice_read() is *always* the right thing when ->read_iter() is there, not just that in such cases it makes a better fallback than default_file_splice_read(). And even the latter assumption is not obvious - AFAICS, we do have counterexamples. I'm not saying that e.g. tun/tap don't need fixing for other reasons and it's quite possible that they will become suitable for generic_file_splice_read() after that's done. But I'm really unhappy about the implied change of generic_file_splice_read() role; if nothing else, commit message should be very clear that if you have ->read_iter() and generic_file_splice_read() won't do the right thing, you MUST provide ->splice_read() of your own. Probably worth Documentation/filesystem/porting entry as well. Alternatively, if you really want to change the role of that thing, we need to go through all instances that are *not* generic_file_splice_read() and see what's going on in those. Starting with the sockets. The list right now is: fs/fuse/dev.c:2263: .splice_read = fuse_dev_splice_read, fs/overlayfs/file.c:786: .splice_read = ovl_splice_read, net/socket.c:164: .splice_read = sock_splice_read, kernel/relay.c:1331: .splice_read = relay_file_splice_read, kernel/trace/trace.c:7081: .splice_read = tracing_splice_read_pipe, kernel/trace/trace.c:7149: .splice_read = tracing_buffers_splice_read, kernel/trace/trace.c:7712: .splice_read = tracing_buffers_splice_read, The first 3 have ->read_iter(); the rest (kernel/* stuff) doesn't. Socket case uses generic_file_splice_read() unless the protocol provides an override; SMC, TCP, TCPv6, AF_UNIX STREAM and KCM SEQPACKET do that. I hadn't looked into the socket side of things for 5 years or so, so I'd have to dig the notes out first. It wasn't pleasant...