Date: Tue, 9 Mar 2021 08:55:35 +1100
From: Dave Chinner
To: David Howells
Cc: Amir Goldstein, linux-cachefs@redhat.com, Jeff Layton,
    David Wysochanski, "Matthew Wilcox (Oracle)", "J. Bruce Fields",
    Christoph Hellwig, Dave Chinner, Alexander Viro,
    linux-afs@lists.infradead.org, Linux NFS Mailing List, CIFS,
    ceph-devel, v9fs-developer@lists.sourceforge.net, linux-fsdevel,
    linux-kernel, Miklos Szeredi
Subject: Re: fscache: Redesigning the on-disk cache
Message-ID: <20210308215535.GA63242@dread.disaster.area>
References: <2653261.1614813611@warthog.procyon.org.uk>
    <517184.1615194835@warthog.procyon.org.uk>
In-Reply-To: <517184.1615194835@warthog.procyon.org.uk>
X-Mailing-List: linux-nfs@vger.kernel.org

On Mon, Mar 08, 2021 at 09:13:55AM +0000, David Howells wrote:
> Amir Goldstein wrote:
>
> > >  (0a) As (0) but using SEEK_DATA/SEEK_HOLE instead of bmap and opening the
> > >       file for every whole operation (which may combine reads and writes).
> >
> > I read that NFSv4 supports hole punching, so when using ->bmap() or SEEK_DATA
> > to keep track of present data, it's hard to distinguish between an
> > invalid cached range and a valid "cached hole".
>
> I wasn't exactly intending to permit caching over NFS.  That leads to fun
> making sure that the superblock you're caching isn't the one that has the
> cache in it.
>
> However, we will need to handle hole-punching being done on a cached netfs,
> even if that's just to completely invalidate the cache for that file.
>
> > With ->fiemap() you can at least make the distinction between a non existing
> > and an UNWRITTEN extent.
>
> I can't use that for XFS, Ext4 or btrfs, I suspect.
> Christoph and Dave's assertion is that the cache can't rely on the backing
> filesystem's metadata because these can arbitrarily insert or remove blocks
> of zeros to bridge or split extents.

Well, that's not the big problem. The issue that makes FIEMAP
unusable for determining if there is user data present in a file is
that on-disk extent maps aren't exactly coherent with in-memory user
data state.

That is, we can have a hole on disk with delalloc user data in
memory. There's user data in the file, just not on disk. Same goes
for unwritten extents - there can be dirty data in memory over an
unwritten extent, and it won't get converted to written until the
data is written back and the filesystem runs a conversion
transaction.

So, yeah, if you use FIEMAP to determine where data lies in a file
that is being actively modified, you're going to get corrupt data
sooner rather than later. SEEK_HOLE/DATA are coherent with in-memory
user data, so they don't have this problem.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
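
For illustration only, here is a minimal userspace sketch of walking a
file's data ranges with SEEK_DATA/SEEK_HOLE, the interface described
above as coherent with in-memory user data. It is not code from this
thread or from fscache; the helper name data_ranges() is hypothetical,
and it assumes a Linux system where Python exposes os.SEEK_DATA and
os.SEEK_HOLE.

```python
import errno
import os


def data_ranges(fd):
    """Return [(start, end), ...] byte ranges that lseek() reports as data.

    Hypothetical helper for illustration. Note that lseek() moves the
    file offset of `fd` as a side effect.
    """
    size = os.fstat(fd).st_size
    ranges = []
    offset = 0
    while offset < size:
        try:
            # First offset at or after `offset` that the kernel reports as data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # only a hole remains between `offset` and EOF
            raise
        # End of that data run; EOF acts as an implicit hole.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        ranges.append((start, end))
        offset = end
    return ranges
```

Calling data_ranges(f.fileno()) on a sparse file returns the ranges the
kernel reports as data. On a filesystem without hole support the whole
file comes back as one data range, and the result is only a snapshot if
another writer is modifying the file concurrently.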