References: <20230519074047.1739879-1-dhowells@redhat.com>
 <20230519074047.1739879-4-dhowells@redhat.com>
 <1845768.1684514823@warthog.procyon.org.uk>
In-Reply-To: <1845768.1684514823@warthog.procyon.org.uk>
From: Linus Torvalds
Date: Fri, 19 May 2023 10:37:59 -0700
Subject: Re: [PATCH v20 03/32] splice: Make direct_read_splice() limit to eof where appropriate
To: David Howells
Cc: Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara,
 Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe,
 Hillf Danton, Christian Brauner, linux-fsdevel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org

On Fri, May 19, 2023 at 9:48 AM David Howells wrote:
>
> This is just an optimisation to cut down the amount of bufferage allocated

So the thing is, it's actually very very wrong for some files.

Now, admittedly, those files have other issues too, and it's a design
mistake to begin with, but look at a number of files in /proc.

In particular, look at the regular files that have a size of '0'. It's
quite common indeed. Things like

    /proc/cpuinfo
    /proc/stat
    ...

you can find a ton of them with

    find /proc -type f -size 0

Is it horribly wrong and bad? Yes. I hate it. It means that some
really basic user space tools refuse to work on them, and the tools
are 100% right - this is a kernel misfeature. Trying to do things like

    less -S /proc/cpuinfo

may or may not work depending on your version of 'less', for example,
because it's entirely reasonable to do something like

    fd = open(..);
    if (!fstat(fd, &st))
         len = st.st_size;

and limit your reads to the size of the file - exactly like your patch
does.

Except it fails horribly on those broken /proc files.

I hate it, and I blame myself for the above horror, but it's pretty
much unfixable. We could make them look like named pipes or something,
but that's really ugly and probably would break other things anyway.
And we simply don't know the size ahead of time.

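The difference between the two user-space strategies can be made concrete
with a minimal sketch. The helper names count_trusting_size() and
count_until_eof() are invented for illustration and do not come from the
patch or from 'less'; the first caps its reads at st_size as in the
fragment above, the second ignores st_size and reads until read() returns 0.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the bytes obtained when reads are capped
 * at the st_size reported by fstat(). Returns -1 on open/fstat failure.
 */
static long long count_trusting_size(const char *path)
{
    struct stat st;
    char buf[4096];
    long long total = 0;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    /* Never ask for more than the reported size says is left. */
    while (total < (long long)st.st_size) {
        long long left = (long long)st.st_size - total;
        size_t want = left < (long long)sizeof(buf) ? (size_t)left : sizeof(buf);
        ssize_t n = read(fd, buf, want);

        if (n <= 0)
            break;
        total += n;
    }
    close(fd);
    return total;
}

/*
 * Hypothetical helper: count the bytes obtained by reading until
 * read() reports EOF, ignoring st_size. Returns -1 on failure.
 */
static long long count_until_eof(const char *path)
{
    char buf[4096];
    long long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}
```

On an ordinary on-disk file both helpers should return the same count. On
one of the zero-size /proc files listed above, the size-capped variant
would be expected to return 0 while the read-until-EOF variant returns the
real content length, which is the failure mode described here.
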
Now, *most* things work, because they just do the whole "read until
EOF". In fact, my current version of 'less' has no problem at all
doing the above thing, and gives the "expected" output.

Also, honestly, I really don't think that it's necessarily a good idea
to splice /proc files, but we actually do have splice wired up to
these because people asked for it:

    fe33850ff798 ("proc: wire up generic_file_splice_read for iter ops")
    4bd6a7353ee1 ("sysctl: Convert to iter interfaces")

so I suspect those things do exist.

> I could just drop it and leave it to userspace for now as the filesystem/block
> layer will stop anyway if it hits the EOF.  Christoph would prefer that I call
> direct_splice_read() from generic_file_splice_read() in all O_DIRECT cases, if
> that's fine with you.

I guess that's fine, and for O_DIRECT itself it might even make sense
to do the size test. That said, I doubt it matters: if you use
O_DIRECT on a small file, you only have yourself to blame for doing
something stupid.

And if it isn't a small file, then who cares about some small EOF-time
optimization? Nobody.

So I would suggest not doing that optimization at all, because as-is,
it's either pointless or actively broken.

That said, I would *not* hate some kind of special FMODE_SIZELIMIT
flag that allows filesystems to opt in to "limit reads to size". We
already have flags like that: FMODE_UNSIGNED_OFFSET and
'sb->s_maxbytes' are both basically variations on that same theme, and
having another flag to say "limit reads to i_size" wouldn't be wrong.

It's only wrong when it is done mindlessly with S_ISREG().

              Linus
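
The opt-in idea can be sketched as a small user-space model. This is not
kernel code and not a proposed implementation: struct toy_file,
TOY_FMODE_SIZELIMIT and toy_clamp_read() are invented for illustration,
and the flag value is arbitrary. The only point is that the read length is
clamped to i_size when the file has opted in, and left alone otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; the name echoes the email, the value is made up. */
#define TOY_FMODE_SIZELIMIT 0x1u

/* Toy stand-in for the pieces of an open file that matter here. */
struct toy_file {
    unsigned int mode;   /* opt-in flags set by the (toy) filesystem */
    long long i_size;    /* reported size; 0 for /proc-style files */
};

/*
 * Return how many bytes a read of 'len' bytes at offset 'pos' should
 * attempt. Only files that opted in are limited to i_size.
 */
static size_t toy_clamp_read(const struct toy_file *f, long long pos, size_t len)
{
    long long left;

    assert(f != NULL && pos >= 0);
    if (!(f->mode & TOY_FMODE_SIZELIMIT))
        return len;              /* no opt-in: read until the real EOF */
    if (pos >= f->i_size)
        return 0;                /* already at or past the reported size */
    left = f->i_size - pos;
    return (unsigned long long)left < len ? (size_t)left : len;
}
```

With this model, a /proc-style file (mode 0, i_size 0) keeps its full
request length, while a file that set the flag has its request trimmed to
what i_size says remains. That is the difference from keying the limit on
S_ISREG(), which would also catch the zero-size /proc files.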