Date: Tue, 19 Jul 2022 22:10:49 +0100
From: Matthew Wilcox
To: Anna Schumaker
Cc: Chuck Lever III, Linux NFS Mailing List, linux-fsdevel, Dave Chinner
Subject: Re: [PATCH v3 6/6] NFSD: Repeal and replace the READ_PLUS implementation
References: <20220715184433.838521-1-anna@kernel.org>
 <20220715184433.838521-7-anna@kernel.org>
 <20220718011552.GK3600936@dread.disaster.area>
 <5A400446-A6FD-436B-BDE2-DAD61239F98F@oracle.com>
X-Mailing-List: linux-nfs@vger.kernel.org

On Tue, Jul 19, 2022 at 04:24:18PM -0400, Anna Schumaker wrote:
> On Tue, Jul 19, 2022 at 1:21 PM Chuck Lever III wrote:
> > But I also thought the purpose of READ_PLUS was to help clients
> > preserve unallocated extents in files during copy operations.
> > An unallocated extent is not the same as an allocated extent
> > that has zeroes written into it. IIUC this new logic does not
> > distinguish between those two cases at all. (And please correct
> > me if this is really not the goal of READ_PLUS).
>
> I wasn't aware of this as a goal of READ_PLUS. As of right now, Linux
> doesn't really have a way to punch holes into pagecache data, so we
> end up needing to zero-fill on the client side during decoding.

I've proven myself unqualified to opine on how NFS should be doing things
in the past ... so let me see if I understand how NFS works for this today.

Userspace issues a read(), the VFS allocates some pages to cache the data
and calls ->readahead() to get the filesystem to fill those pages.
NFS uses READ_PLUS to get the data and the server says "this is a hole, no
data for you", at which point NFS has to call memset() because the page
cache does not have the ability to represent holes?

If so, that pretty much matches how block filesystems work. Except that
block filesystems know the "layout" of the file; whether they use iomap
or buffer_heads, they can know this without doing I/O (some filesystems
like ext2 delay knowing the layout of the file until an I/O happens,
but then they cache it).

So I think Linux is currently built on assuming the filesystem knows where
its holes are, rather than informing the page cache about its holes.

Could we do better? Probably! I'd be interested in seeing what happens if
we add support for "this page is in a hole" to the page cache. I'd also be
interested in seeing how things would change if we had filesystems provide
their extent information to the VFS and have the VFS handle holes all by
itself without troubling the filesystem. It'd require network filesystems
to invalidate the VFS's knowledge of extents if another client modifies
the file. I haven't thought about it deeply.
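
To make the read path described above concrete, here is a minimal userspace sketch in Python (not kernel code) of a client turning a READ_PLUS-style reply into the bytes a reader would see. The segment tuples, the poison byte and the name fill_from_segments are all invented for illustration; the real NFSv4.2 wire format and the Linux client's decode path look nothing like this. The only point is that a hole segment carries no payload, so the client has to write the zeroes itself.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def fill_from_segments(segments, length=PAGE_SIZE):
    """Build the buffer a reader would see from a list of reply segments.

    Each segment is a hypothetical tuple:
      ("data", offset, payload_bytes)  -- bytes the server actually sent
      ("hole", offset, hole_length)    -- server says "no data here"
    """
    # Poison byte stands in for a freshly allocated page with arbitrary
    # contents, so a missing zero-fill would be visible.
    buf = bytearray(b"\xff" * length)
    for kind, offset, value in segments:
        size = len(value) if kind == "data" else value
        if offset < 0 or offset + size > length:
            raise ValueError("segment falls outside the buffer")
        if kind == "data":
            buf[offset:offset + size] = value
        elif kind == "hole":
            # Stand-in for memset(): the cache has no "hole" state, so the
            # hole has to be materialised as real zero bytes.
            buf[offset:offset + size] = bytes(size)
        else:
            raise ValueError(f"unknown segment kind: {kind!r}")
    return bytes(buf)


reply = [("data", 0, b"hello"), ("hole", 5, PAGE_SIZE - 5)]
page = fill_from_segments(reply)
```

After decoding, nothing in `page` records which zeroes came from a hole and which (if any) were sent as data, which is the loss of information Chuck is pointing at.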
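
For comparison, this is roughly how userspace can ask a local filesystem where its holes are, using lseek(2) with SEEK_DATA and SEEK_HOLE. It is a sketch under assumptions: it needs a platform where Python exposes os.SEEK_DATA and os.SEEK_HOLE (Linux does), the helper names data_extents and make_sparse_file are made up here, and a filesystem that does not track holes may simply report the whole file as one data extent.

```python
import errno
import os
import tempfile


def data_extents(fd):
    """Return [(start, end), ...] for the data regions the filesystem reports."""
    size = os.fstat(fd).st_size
    extents = []
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:  # no data at or after offset
                break
            raise
        # There is always an implicit hole at EOF, so this finds an end.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        extents.append((start, end))
        offset = end
    return extents


def make_sparse_file(hole_bytes=1 << 20, payload=b"tail"):
    """Create a temp file with a leading hole followed by a little data."""
    fd, path = tempfile.mkstemp()
    os.lseek(fd, hole_bytes, os.SEEK_SET)  # seek past EOF without writing
    os.write(fd, payload)
    return fd, path


fd, path = make_sparse_file()
try:
    extents = data_extents(fd)
    head = os.pread(fd, 16, 0)  # reading inside the hole still yields zeroes
finally:
    os.close(fd)
    os.unlink(path)
```

The read at offset 0 returns zeroes whether or not the region is allocated; only the extent query can tell the two cases apart, and the exact extent boundaries reported depend on the filesystem.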