Date: Thu, 6 Apr 2023 08:26:46 +1000
From: Dave Chinner
To: Eric Biggers
Cc: "Darrick J. Wong", Andrey Albershteyn, dchinner@redhat.com,
	hch@infradead.org, linux-xfs@vger.kernel.org,
	fsverity@lists.linux.dev, rpeterso@redhat.com, agruenba@redhat.com,
	xiang@kernel.org, chao@kernel.org, damien.lemoal@opensource.wdc.com,
	jth@kernel.org, linux-erofs@lists.ozlabs.org,
	linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com
Subject: Re: [PATCH v2 21/23] xfs: handle merkle tree block size != fs blocksize != PAGE_SIZE
Message-ID: <20230405222646.GR3223426@dread.disaster.area>
References: <20230404145319.2057051-1-aalbersh@redhat.com>
	<20230404145319.2057051-22-aalbersh@redhat.com>
	<20230404163602.GC109974@frogsfrogsfrogs>
	<20230405160221.he76fb5b45dud6du@aalbersh.remote.csb>
	<20230405163847.GG303486@frogsfrogsfrogs>
X-Mailing-List: linux-ext4@vger.kernel.org

On Wed, Apr 05, 2023 at 06:16:00PM +0000, Eric Biggers wrote:
> On Wed, Apr 05, 2023 at 09:38:47AM -0700, Darrick J. Wong wrote:
> > > The merkle tree pages are dropped after verification. When page is
> > > dropped xfs_buf is marked as verified. If fs-verity wants to
> > > verify again it will get the same verified buffer. If buffer is
> > > evicted it won't have verified state.
> > >
> > > So, with enough memory pressure buffers will be dropped and need to
> > > be reverified.
> >
> > Please excuse me if this was discussed and rejected long ago, but
> > perhaps fsverity should try to hang on to the merkle tree pages that
> > this function returns for as long as possible until reclaim comes for
> > them?
> >
> > With the merkle tree page lifetimes extended, you then don't need to
> > attach the xfs_buf to page->private, nor does xfs have to extend the
> > buffer cache to stash XBF_VERITY_CHECKED.
>
> Well, all the other filesystems that support fsverity (ext4, f2fs, and btrfs)
> just cache the Merkle tree pages in the inode's page cache.  It's an approach
> that I know some people aren't a fan of, but it's efficient and it works.

Which puts pages beyond EOF in the page cache. Given that XFS also
allows persistent block allocation beyond EOF, having both data in the
page cache and blocks beyond EOF that contain unrelated information is
a Real Bad Idea.

Just because putting metadata in the file data address space works
for one filesystem, it doesn't mean it's a good idea or that it works
for every filesystem.

> We could certainly think about moving to a design where fs/verity/ asks the
> filesystem to just *read* a Merkle tree block, without adding it to a cache, and
> then fs/verity/ implements the caching itself.  That would require some large
> changes to each filesystem, though, unless we were to double-cache the Merkle
> tree blocks which would be inefficient.

No, that's unnecessary.

All we need is for fsverity to require filesystems to pass it byte
addressable data buffers that are externally reference counted. The
filesystem can take a page reference before mapping the page and
passing the kaddr to fsverity, then unmap and drop the reference
when the merkle tree walk is done as per Andrey's new drop callout.

fsverity doesn't need to care what the buffer is made from, how it
is cached, what its life cycle is, etc. The caching mechanism and
reference counting are entirely controlled by the filesystem callout
implementations, and fsverity only needs to deal with memory buffers
that are guaranteed to live for the entire walk of the merkle
tree....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
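
For illustration only, the buffer contract described above could be
sketched in userspace C roughly as follows. Every name here (struct
merkle_buf, struct merkle_ops, toy_read_block, toy_drop_block,
verity_walk) is hypothetical and is not the fs/verity or XFS API; the
"filesystem" is a toy in-memory array and the "verification" is just a
byte sum. The only point is the ownership rule: the filesystem pins a
buffer before handing out its address, and unpins it in the drop
callout once the walk is finished.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handle: the verifier only ever looks at kaddr and size. */
struct merkle_buf {
	void	*kaddr;		/* byte-addressable block contents */
	size_t	size;
	void	*fs_priv;	/* opaque, owned by the filesystem */
};

/* Hypothetical callouts supplied by the filesystem. */
struct merkle_ops {
	int  (*read_block)(void *fs, unsigned long index, struct merkle_buf *buf);
	void (*drop_block)(struct merkle_buf *buf);
};

/* Toy filesystem-side cache: the refcount lives outside the verifier. */
#define TOY_NBLOCKS	4
struct toy_block {
	int		refcount;
	unsigned char	data[64];
};
struct toy_fs {
	struct toy_block blocks[TOY_NBLOCKS];
};

/* Fill block i with the byte value i + 1 so the blocks are distinguishable. */
static void toy_fs_init(struct toy_fs *tfs)
{
	memset(tfs, 0, sizeof(*tfs));
	for (int i = 0; i < TOY_NBLOCKS; i++)
		memset(tfs->blocks[i].data, i + 1, sizeof(tfs->blocks[i].data));
}

/* Pin the block, then expose its address to the caller. */
static int toy_read_block(void *fs, unsigned long index, struct merkle_buf *buf)
{
	struct toy_fs *tfs = fs;
	struct toy_block *b;

	if (index >= TOY_NBLOCKS)
		return -1;
	b = &tfs->blocks[index];
	b->refcount++;
	buf->kaddr = b->data;
	buf->size = sizeof(b->data);
	buf->fs_priv = b;
	return 0;
}

/* Unpin the block; the address must not be used after this. */
static void toy_drop_block(struct merkle_buf *buf)
{
	struct toy_block *b = buf->fs_priv;

	assert(b->refcount > 0);
	b->refcount--;
	buf->kaddr = NULL;
	buf->fs_priv = NULL;
}

/*
 * Verifier side: hold every block on the path until the walk is done,
 * then drop them all. It never learns how the buffers are cached.
 */
#define WALK_MAX_DEPTH	8
static int verity_walk(const struct merkle_ops *ops, void *fs,
		       const unsigned long *path, size_t depth,
		       unsigned int *sum_out)
{
	struct merkle_buf bufs[WALK_MAX_DEPTH];
	size_t held = 0;
	unsigned int sum = 0;
	int ret = 0;

	if (depth > WALK_MAX_DEPTH)
		return -1;

	for (size_t i = 0; i < depth; i++) {
		ret = ops->read_block(fs, path[i], &bufs[held]);
		if (ret)
			break;
		held++;
		/* Stand-in for hashing: sum the bytes of the block. */
		const unsigned char *p = bufs[held - 1].kaddr;
		for (size_t j = 0; j < bufs[held - 1].size; j++)
			sum += p[j];
	}

	/* Drop in reverse order, on success and on error alike. */
	while (held > 0)
		ops->drop_block(&bufs[--held]);

	if (!ret)
		*sum_out = sum;
	return ret;
}
```

In this sketch a caller would set up a struct toy_fs with
toy_fs_init(), build a struct merkle_ops from the two toy callouts,
and call verity_walk() with a list of block indices. After the walk
returns, every refcount is back to zero whether or not the walk
succeeded, which is the lifetime guarantee being asked for.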