Date: Fri, 1 Apr 2022 08:21:52 +1100
From: Dave Chinner
To: wang.yi59@zte.com.cn
Cc: djwong@kernel.org, linux-xfs@vger.kernel.org,
 linux-kernel@vger.kernel.org, xue.zhihong@zte.com.cn,
 wang.liang82@zte.com.cn, cheng.lin130@zte.com.cn
Subject: Re: [PATCH] xfs: getattr ignore blocks beyond eof
Message-ID: <20220331212152.GG1544202@dread.disaster.area>
References: <20220331053340.GE1544202@dread.disaster.area>
 <202203311632074775168@zte.com.cn>
In-Reply-To:
 <202203311632074775168@zte.com.cn>

On Thu, Mar 31, 2022 at 04:32:07PM +0800, wang.yi59@zte.com.cn wrote:
> > We do not, and have not ever tried to, hide allocation or block
> > usage artifacts from userspace because any application that depends
> > on specific block allocation patterns or accounting from the
> > filesystem is broken by design.
> >
> > Every filesystem accounts blocks differently, and more often than
> > not the block count exposed to userspace also includes metadata
> > blocks (extent maps, xattr blocks, etc) and it might multiple count
> > other blocks (e.g. shared extents). Hence you can't actually
> > use it for anything useful in userspace except reporting how many
> > blocks this file *might* use.
> >
> > If your application is dependent on block counts exactly matching
> > the file data space for whatever reason, then what speculative
> > preallocation does is the least of your problems.
> >
>
> Thanks for your explanation.
>
> Unfortunately, the app I'm using evaluates disk usage by querying
> the changes of the backend filesystem (XFS) file before and after
> the operation.

What application is this? What is it trying to use this information for?
I'm trying to understand why someone thought this was a good idea, and
without actually being able to look up the code and see what it is
using the information for, I can't really say much more than "this
seems broken by design".

> Without giving up the benefits of preallocation, the
> app's statistics will become obsolete and no chance to correct it
> at a small cost, because of the silence reclaim of posteof blocks.
> That is the app's problem.

Yes it is.

> Posteof blocks will be reclaimed sooner or later, it seems reasonable

No, that is not guaranteed. If you then extend the file again, those
post-eof blocks will no longer be post-eof blocks and will instead
contain user data. Also, fallocate() can allocate post-eof blocks, and
in that case they can be retained permanently because the user asked
for them to be placed beyond EOF.

So the assertion that post-eof blocks always get removed sooner or
later is not actually true.

> to ignore them directly during query. This is my humble opinion in
> this patch. At the query moment, it's not real, but it will become so
> eventually. It's a speculative result for query.

No, it's the _correct_ result for the current state of the file being
queried.

The statx() man page says:

	st_blocks
		This field indicates the number of blocks allocated
		to the file, in 512-byte units. (This may be smaller
		than st_size/512 when the file has holes.)

The POSIX specification just defines it as "Number of blocks allocated
for this object."

Neither says anything about how the filesystem should or shouldn't
account those blocks, that the count must be stable, that it must
reflect the amount of data written to the file, etc. All they say is
that it is the number of blocks allocated for that file.
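
To make the distinction concrete, here is a minimal userspace sketch
(Python, not taken from the patch or from any XFS code) that reads
st_size and st_blocks for a file and converts st_blocks from 512-byte
units to bytes. The helper name block_usage() and the sparse temporary
file are hypothetical, chosen only for illustration. The allocated
figure it prints depends on the filesystem and on when you ask, which
is exactly why it should not be compared against the amount of data
written.

```python
import os
import tempfile


def block_usage(path):
    """Return (st_size, allocated_bytes) for path.

    st_blocks is reported in 512-byte units, independent of the
    filesystem's own block size.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


# Hypothetical demo file: 1 MiB apparent size, only the last byte written.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(1024 * 1024 - 1)
    f.write(b"\0")
    demo_path = f.name

size, allocated = block_usage(demo_path)
os.unlink(demo_path)

# The two numbers answer different questions and need not agree:
# st_size is the file length, allocated is whatever the filesystem
# currently has allocated to the file.
print(f"st_size={size} allocated_bytes={allocated}")
```

On many filesystems the allocated figure here will be far smaller than
st_size because the file is sparse, but nothing guarantees a particular
value, and it may change between two calls without any write from the
application.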

As it is, hiding space usage like you propose is likely to cause more
problems than it solves, because du will not report all the disk space
used by a file and hence we'll end up with other users reporting that
the disk space reported by du does not match up with the space the
filesystem is using.

Which, of course, is also expected, because reflink/dedupe result in du
multiple counting shared blocks.

IOWs, userspace tracking and aggregation of filesystem space usage just
doesn't work, and so papering over behaviours that expose the fact it
doesn't and can't work is in no-one's best interests.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com