Date: Wed, 15 Jan 2020 09:38:54 +0100
From: Christoph Hellwig
To: David Howells
Cc: linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk, hch@lst.de, tytso@mit.edu, adilger.kernel@dilger.ca, darrick.wong@oracle.com, clm@fb.com, josef@toxicpanda.com, dsterba@suse.com, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Problems with determining data presence by examining extents?
Message-ID: <20200115083854.GB23039@lst.de>
In-Reply-To: <4467.1579020509@warthog.procyon.org.uk>
X-Mailing-List: linux-ext4@vger.kernel.org

On Tue, Jan 14, 2020 at 04:48:29PM +0000, David Howells wrote:
> Again with regard to my rewrite of fscache and cachefiles:
>
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
>
> I've got rid of my use of bmap()!  Hooray!
>
> However, I'm informed that I can't trust the extent map of a backing file to
> tell me accurately whether content exists in a file because:
>
>  (a) Not-quite-contiguous extents may be joined by insertion of blocks of
>      zeros by the filesystem optimising itself.  This would give me a false
>      positive when trying to detect the presence of data.
>
>  (b) Blocks of zeros that I write into the file may get punched out by
>      filesystem optimisation since a read back would be expected to read zeros
>      there anyway, provided it's below the EOF.  This would give me a false
>      negative.

The whole idea of an out-of-band interface is going to be racy and suffer
from implementation loss.  I think what you want is something similar to
the NFSv4.2 READ_PLUS operation - give me the data if there is any and
otherwise tell me that there is a hole.  I think this could be a new
RWF_NOHOLE or similar flag, though how to return the hole size would be a
little awkward.  Maybe return a specific negative error code (ENODATA?)
and advance the iov anyway.
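
To make the "data if there is any, otherwise a hole" semantics concrete, here is a rough userspace sketch.  It is not the proposed interface: RWF_NOHOLE does not exist, and the read_plus() helper and its ("data" / "hole" / "eof") return convention are purely hypothetical.  The sketch emulates the behaviour with lseek(SEEK_DATA/SEEK_HOLE) followed by pread(), which is itself an out-of-band query and therefore subject to exactly the races and filesystem-dependent answers discussed above.

```python
import errno
import os


def read_plus(fd, offset, length):
    """Hypothetical READ_PLUS-like helper (illustration only).

    Returns ("data", bytes) if data starts at offset, ("hole", n) if the
    next n bytes are reported as a hole, or ("eof", 0) at or past EOF.
    """
    size = os.fstat(fd).st_size
    if offset >= size:
        return ("eof", 0)
    length = min(length, size - offset)

    try:
        data_start = os.lseek(fd, offset, os.SEEK_DATA)
    except OSError as exc:
        if exc.errno != errno.ENXIO:
            raise
        # ENXIO below EOF: no data between offset and EOF (trailing hole).
        return ("hole", length)

    if data_start > offset:
        # The range begins with a hole; report how much of it is hole.
        return ("hole", min(length, data_start - offset))

    # offset is inside data: read up to the next hole (or EOF).
    hole_start = os.lseek(fd, offset, os.SEEK_HOLE)
    return ("data", os.pread(fd, min(length, hole_start - offset), offset))
```

A caller would loop, advancing its offset by len(payload) for "data" and by the returned count for "hole".  Note that a filesystem without hole support reports the whole file as data, so a "hole" answer here is only a hint; an in-band flag on the read itself is what would close the window between the query and the read.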