Date: Mon, 19 Apr 2021 18:25:04 +0200
From: Jan Kara
To: Matthew Wilcox
Cc: Jan Kara, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org, Ted Tso, Christoph Hellwig,
	Amir Goldstein, Dave Chinner
Subject: Re: [PATCH 0/7 RFC v3] fs: Hole punch vs page cache filling races
Message-ID: <20210419162504.GI8706@quack2.suse.cz>
References: <20210413105205.3093-1-jack@suse.cz>
	<20210419152008.GD2531743@casper.infradead.org>
In-Reply-To: <20210419152008.GD2531743@casper.infradead.org>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-ext4@vger.kernel.org

On Mon 19-04-21 16:20:08, Matthew Wilcox wrote:
> On Tue, Apr 13, 2021 at 01:28:44PM +0200, Jan Kara wrote:
> > Also when writing the documentation I came across one question: Do we
> > mandate i_mapping_sem for truncate + hole punch for all filesystems or
> > just for filesystems that support hole punching (or other complex
> > fallocate operations)? I wrote the documentation so that we require
> > every filesystem to use i_mapping_sem. This makes locking rules
> > simpler, we can also add asserts when all filesystems are converted.
> > The downside is that simple filesystems now pay the overhead of the
> > locking unnecessary for them. The overhead is small (uncontended
> > rwsem acquisition for truncate) so I don't think we care and the
> > simplicity is worth it but I wanted to spell this out.
>
> What do we actually get in return for supporting these complex fallocate
> operations? Someone added them for a reason, but does that reason
> actually benefit me? Other than running xfstests, how many times has
> holepunch been called on your laptop in the last week? I don't want to
> incur even one extra instruction per I/O operation to support something
> that happens twice a week; that's a bad tradeoff.

I agree hole punch is relatively rare compared to normal operations, but
when it is used, it is used rather frequently - e.g. by VMs to manage
their filesystem images. So if we regress hole punch either by not
freeing blocks or by slowing it down significantly, I'm pretty sure some
people will complain. That being said, I fully understand your
reluctance to add a lock to the read path, but note that it is taken
only when we need to fill data from storage, and it should be taken once
per readahead request, so I actually doubt the extra acquisition will be
visible in the profiles. But I can profile it to be sure.

> Can we implement holepunch as a NOP? Or return -ENOTTY? Those both
> seem like better solutions than adding an extra rwsem to every inode.

We already have that rwsem there today for most major filesystems. This
work just lifts it from the fs-private inode area into the VFS inode, so
in terms of memory usage we are not losing that much.
> Failing that, is there a bigger hammer we can use on the holepunch side
> (eg preventing all concurrent accesses while the holepunch is happening)
> to reduce the overhead on the read side?

I'm open to other solutions but frankly this was the best I could come
up with. Hole punch already uses a pretty big hammer approach - take all
the locks there are on the inode in exclusive mode, block DIO, unmap
everything, and then do its dirty deeds... I don't think we want hole
punch to block anything on an fs-wide basis (that's a DoS recipe), and
besides that I don't see how the hammer could be bigger ;).

								Honza
-- 
Jan Kara
SUSE Labs, CR