From: Greg Kroah-Hartman
To:
 linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, "Darrick J. Wong",
 Dave Chinner, Chandan Babu R, Amir Goldstein
Subject: [PATCH 5.10 09/12] xfs: remove all COW fork extents when remounting readonly
Date: Thu, 30 Jun 2022 15:47:14 +0200
Message-Id: <20220630133230.967036087@linuxfoundation.org>
X-Mailer: git-send-email 2.37.0
In-Reply-To: <20220630133230.676254336@linuxfoundation.org>
References: <20220630133230.676254336@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: "Darrick J. Wong"

commit 089558bc7ba785c03815a49c89e28ad9b8de51f9 upstream.

[backport xfs_icwalk -> xfs_eofblocks for 5.10.y]

As part of multiple customer escalations due to file data corruption
after copy on write operations, I wrote some fstests that use fsstress
to hammer on COW to shake things loose.  Regrettably, I caught some
filesystem shutdowns due to incorrect rmap operations with the following
loop:

mount				# (0)
fsstress &			# (1)
while true; do
	fsstress
	mount -o remount,ro	# (2)
	fsstress
	mount -o remount,rw	# (3)
done

When (2) happens, notice that (1) is still running.  xfs_remount_ro will
call xfs_blockgc_stop to walk the inode cache to free all the COW
extents, but the blockgc mechanism races with (1)'s reader threads to
take IOLOCKs and loses, which means that it doesn't clean them all out.
Call such a file (A).

When (3) happens, xfs_remount_rw calls xfs_reflink_recover_cow, which
walks the ondisk refcount btree and frees any COW extent that it finds.
This function does not check the inode cache, which means that the incore
COW forks of inode (A) are now inconsistent with the ondisk metadata.  If
one of those former COW extents is allocated and mapped into another
file (B) and someone triggers a COW to the stale reservation in (A), A's
dirty data will be written into (B) and once that's done, those blocks
will be transferred to (A)'s data fork without bumping the refcount.

The results are catastrophic -- file (B) and the refcount btree are now
corrupt.  Solve this race by forcing the xfs_blockgc_free_space to run
synchronously, which causes xfs_icwalk to return to inodes that were
skipped because the blockgc code couldn't take the IOLOCK.  This is safe
to do here because the VFS has already prohibited new writer threads.

Fixes: 10ddf64e420f ("xfs: remove leftover CoW reservations when remounting ro")
Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Babu R
Signed-off-by: Amir Goldstein
Acked-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman
---
 fs/xfs/xfs_super.c |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1695,7 +1695,10 @@ static int
 xfs_remount_ro(
 	struct xfs_mount	*mp)
 {
-	int error;
+	struct xfs_eofblocks	eofb = {
+		.eof_flags	= XFS_EOF_FLAGS_SYNC,
+	};
+	int			error;
 
 	/*
 	 * Cancel background eofb scanning so it cannot race with the final
@@ -1703,8 +1706,13 @@ xfs_remount_ro(
 	 */
 	xfs_stop_block_reaping(mp);
 
-	/* Get rid of any leftover CoW reservations... */
-	error = xfs_icache_free_cowblocks(mp, NULL);
+	/*
+	 * Clear out all remaining COW staging extents and speculative post-EOF
+	 * preallocations so that we don't leave inodes requiring inactivation
+	 * cleanups during reclaim on a read-only mount.  We must process every
+	 * cached inode, so this requires a synchronous cache scan.
+	 */
+	error = xfs_icache_free_cowblocks(mp, &eofb);
 	if (error) {
 		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
 		return error;
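
The race described in the commit message can be illustrated with a small
user-space toy model.  This is only a sketch under stated assumptions and
not the kernel's implementation: struct toy_inode, toy_trylock(),
toy_free_cowblocks() and toy_demo_leftover() are hypothetical names made
up for this illustration, and the "busy_passes" counter merely stands in
for a reader thread that holds the IOLOCK for a while.  A non-synchronous
scan makes one pass and skips any inode whose lock it cannot take; a
synchronous scan keeps returning to skipped inodes until none remain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy inode; NOT the kernel's struct xfs_inode. */
struct toy_inode {
	int	busy_passes;	/* scan passes during which a reader holds the lock */
	int	cow_blocks;	/* leftover COW reservation, in blocks */
};

/* Models a trylock that fails while a reader still holds the lock. */
static bool toy_trylock(struct toy_inode *ip)
{
	if (ip->busy_passes > 0) {
		ip->busy_passes--;	/* the reader gets closer to unlocking */
		return false;
	}
	return true;
}

/*
 * Walk every inode and drop its COW reservation.  Without sync, an inode
 * whose lock cannot be taken is skipped for good.  With sync, the walk is
 * repeated until no inode was skipped.  Returns the number of COW blocks
 * that are still reserved afterwards.
 */
static int toy_free_cowblocks(struct toy_inode *inodes, size_t n, bool sync)
{
	bool	skipped;
	int	left = 0;
	size_t	i;

	assert(inodes != NULL || n == 0);

	do {
		skipped = false;
		for (i = 0; i < n; i++) {
			if (inodes[i].cow_blocks == 0)
				continue;
			if (!toy_trylock(&inodes[i])) {
				skipped = true;
				continue;
			}
			inodes[i].cow_blocks = 0;
		}
	} while (sync && skipped);

	for (i = 0; i < n; i++)
		left += inodes[i].cow_blocks;
	return left;
}

/* Runs one scan over a fixed set of toy inodes, two of them "locked". */
static int toy_demo_leftover(bool sync)
{
	struct toy_inode inodes[] = {
		{ .busy_passes = 0, .cow_blocks = 8 },
		{ .busy_passes = 2, .cow_blocks = 4 },
		{ .busy_passes = 0, .cow_blocks = 0 },
		{ .busy_passes = 1, .cow_blocks = 16 },
	};

	return toy_free_cowblocks(inodes, sizeof(inodes) / sizeof(inodes[0]),
				  sync);
}
```

With this input, toy_demo_leftover(false) returns 20, because the two
busy inodes keep their reservations, while toy_demo_leftover(true)
returns 0.  The leftover reservations in the first case correspond to
file (A) in the description above.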