From: David Howells
To: Christoph Hellwig
Cc: dhowells@redhat.com, Qu Wenruo, Andreas Dilger, linux-fsdevel, Al Viro,
    "Theodore Y. Ts'o", "Darrick J. Wong", Chris Mason, Josef Bacik,
    David Sterba, linux-ext4, linux-xfs, linux-btrfs,
    Linux Kernel Mailing List
Subject: Re: Problems with determining data presence by examining extents?
Date: Wed, 15 Jan 2020 14:35:22 +0000
Message-ID: <26093.1579098922@warthog.procyon.org.uk>
In-Reply-To: <20200115133101.GA28583@lst.de>

Christoph Hellwig wrote:

> If we can't get that easily it can be emulated using lseek SEEK_DATA /
> SEEK_HOLE assuming no other thread could be writing to the file, or the
> raciness doesn't matter.

Another thread could be writing to the file, and the raciness matters if I
want to cache the result of calling SEEK_HOLE - though it might be possible
just to mask it off.

One problem I have with SEEK_HOLE is that there's no upper bound on it.  Say
I have a 1GiB cachefile that's completely populated and I want to find out if
the first byte is present or not.
I call:

	end = vfs_llseek(file, 0, SEEK_HOLE);

It will have to scan the metadata of the entire 1GiB file and will then
presumably return the EOF position.  Now this might only be a mild irritation
as I can cache this information for later use, but it does potentially put a
performance hiccough in the case of someone only reading the first page or so
of the file (say, the file program).  On the other hand, most of the files in
the cache are likely to be complete - in which case, it's probably quite
cheap.

However, SEEK_HOLE doesn't help with the issue of the filesystem 'altering'
the content of the file by adding or removing blocks of zeros.

David
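
For illustration, the "is the first byte present?" query discussed above can be sketched from userspace with lseek(2).  This is only a sketch under assumptions: it uses the userspace call rather than the in-kernel vfs_llseek(), the helper names byte_is_present() and first_hole() are hypothetical and do not come from this thread, and it inherits the race already mentioned (another writer can change the answer at any time).  Whether asking SEEK_DATA about one offset is any cheaper than asking SEEK_HOLE from offset 0 depends on the filesystem.

```c
#define _GNU_SOURCE		/* exposes SEEK_DATA / SEEK_HOLE on glibc */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: is the byte at 'off' inside a data region of 'fd'?
 *
 * lseek(fd, off, SEEK_DATA) returns the first offset >= off that holds
 * data, so a result equal to 'off' means the byte is present.  The call
 * fails with ENXIO if there is no data at or after 'off', which includes
 * 'off' being at or beyond EOF.
 *
 * Returns 1 if present, 0 if in a hole or past EOF, -1 on any other error.
 * Note that lseek() moves the file offset as a side effect.
 */
static int byte_is_present(int fd, off_t off)
{
	off_t pos = lseek(fd, off, SEEK_DATA);

	if (pos == (off_t)-1)
		return errno == ENXIO ? 0 : -1;
	return pos == off;
}

/*
 * Hypothetical helper mirroring the call in the mail: find the first hole
 * at or after offset 0.  For a fully populated file this is the EOF
 * position, and the search has no upper bound.
 */
static off_t first_hole(int fd)
{
	return lseek(fd, 0, SEEK_HOLE);
}
```

With these, byte_is_present(fd, 0) answers the single-byte question directly, while first_hole(fd) gives a result that could be cached for the whole file.  Neither helps with a filesystem that adds or removes blocks of zeros on its own.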