Received: by 2002:a05:6359:c8b:b0:c7:702f:21d4 with SMTP id go11csp806080rwb; Mon, 26 Sep 2022 06:10:51 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7kETj+en738BlklmqvB0/EbjKSzJo8TLUqoQ1owDCZmPDtXZi8bLBX9oBD5ySt+QJ7D993 X-Received: by 2002:a05:6402:274c:b0:44f:334e:1e11 with SMTP id z12-20020a056402274c00b0044f334e1e11mr22429067edd.304.1664197851034; Mon, 26 Sep 2022 06:10:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664197851; cv=none; d=google.com; s=arc-20160816; b=Dwpe1ButYvkeWppXJ+BCubu7ElUFve1aHsQajFI8eYp5O1rsZFX7pm53A0vBswrXzZ BUjemorn8PFxpn53Q281FfRg78K325Zz92NpHxT5WMe7FZBi4oFntFo/TOtr+ASzXl3+ XwFb8PpGR7GD1jJuxEo85+jVLOO/cgMNaP/tlcQNEFmN5BUixH9ipg8XieK4LWYYu8PF Er4MFFWjty29Lsfg9kyBfQt+ORn1gWeUFWJl19EMjgx26DaAC9iLTYufj3N7Djg0cctG vOEoj0vtbo6x4515bArq+cmoWJcxRWLJSamfmg7EiTwhJTFESytv8mw4AR67dyFc3SlG ZGtQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=KnSBL/eBN1RkygsV5wjSAcLhiqT7C/p9dWsboOOyxRE=; b=nxRBCVX/DwrhZji+1Bt+cNwaIrOFHEbgQ150IkUeJwCg6ogQfKEXWfYorrs0kuC15L Ndyq3Gsa4v4R+qlQwlP9fP3bGwxzNm31NCKgLTaHoCcTKjYtMBSqNg0Mey/nV1PhQrST cCqfdBOz0hISplZ2+kBNd3/6YUuR+km/K7x2TTQ1XtD9jaQLuqpDkzlC8g6sM4b+PsyE dxsEeZHg2riMzpXpREqA1AExVXiufizhBp5waxcOkWMETuT16xXWRgLmbb8hH7PXa1xK 9CGLX7R0Gc+G8pG7JU7RXS1lI/1wNkcMd2w0RyZn+jS/QZCY0j8bXsGVl8uUE6I3Pz/0 UMrg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=qm3so1UY; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id op12-20020a170906bcec00b00733069541b8si12762278ejb.102.2022.09.26.06.10.25; Mon, 26 Sep 2022 06:10:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=qm3so1UY; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234696AbiIZKuj (ORCPT + 99 others); Mon, 26 Sep 2022 06:50:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60332 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231396AbiIZKrF (ORCPT ); Mon, 26 Sep 2022 06:47:05 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D18C257543; Mon, 26 Sep 2022 03:26:07 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id EAE1AB80835; Mon, 26 Sep 2022 10:26:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3359AC433D7; Mon, 26 Sep 2022 10:26:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1664187964; bh=WgGXi2sDnorIfeZ/f8KhMyyqHF6msGVA4lDb8KE7coI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qm3so1UYUMA/xXQeXulDKmERslcCHCja4KLRZ7PROhwSeliWAHTfFBXVcD9z5dx9Z T24ylo54ad/26xeodhNK5Sx7UmeAzPmF+etmc5IBdkJ/5RR54LnEmSQeicskfGtEfF RHIsw19ps4jM2ONUyKHGRJ3GCDEvPeJh5LQrM9BA= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Alex Lyakas , "Darrick J. Wong" , Brian Foster , "Darrick J. Wong" , Chandan Babu R Subject: [PATCH 5.4 118/120] xfs: dont commit sunit/swidth updates to disk if that would cause repair failures Date: Mon, 26 Sep 2022 12:12:31 +0200 Message-Id: <20220926100755.237627793@linuxfoundation.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20220926100750.519221159@linuxfoundation.org> References: <20220926100750.519221159@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: "Darrick J. Wong" commit 13eaec4b2adf2657b8167b67e27c97cc7314d923 upstream. Alex Lyakas reported[1] that mounting an xfs filesystem with new sunit and swidth values could cause xfs_repair to fail loudly. The problem here is that repair calculates the where mkfs should have allocated the root inode, based on the superblock geometry. The allocation decisions depend on sunit, which means that we really can't go updating sunit if it would lead to a subsequent repair failure on an otherwise correct filesystem. Port from xfs_repair some code that computes the location of the root inode and teach mount to skip the ondisk update if it would cause problems for repair. Along the way we'll update the documentation, provide a function for computing the minimum AGFL size instead of open-coding it, and cut down some indenting in the mount code. Note that we allow the mount to proceed (and new allocations will reflect this new geometry) because we've never screened this kind of thing before. We'll have to wait for a new future incompat feature to enforce correct behavior, alas. Note that the geometry reporting always uses the superblock values, not the incore ones, so that is what xfs_info and xfs_growfs will report. [1] https://lore.kernel.org/linux-xfs/20191125130744.GA44777@bfoster/T/#m00f9594b511e076e2fcdd489d78bc30216d72a7d Reported-by: Alex Lyakas Signed-off-by: Darrick J. Wong Reviewed-by: Brian Foster Acked-by: Darrick J. Wong Signed-off-by: Chandan Babu R Signed-off-by: Greg Kroah-Hartman --- fs/xfs/libxfs/xfs_ialloc.c | 64 +++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_ialloc.h | 1 fs/xfs/xfs_mount.c | 45 ++++++++++++++++++++++++++++++- fs/xfs/xfs_trace.h | 21 ++++++++++++++ 4 files changed, 130 insertions(+), 1 deletion(-) --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -2854,3 +2854,67 @@ xfs_ialloc_setup_geometry( else igeo->ialloc_align = 0; } + +/* Compute the location of the root directory inode that is laid out by mkfs. */ +xfs_ino_t +xfs_ialloc_calc_rootino( + struct xfs_mount *mp, + int sunit) +{ + struct xfs_ino_geometry *igeo = M_IGEO(mp); + xfs_agblock_t first_bno; + + /* + * Pre-calculate the geometry of AG 0. We know what it looks like + * because libxfs knows how to create allocation groups now. + * + * first_bno is the first block in which mkfs could possibly have + * allocated the root directory inode, once we factor in the metadata + * that mkfs formats before it. Namely, the four AG headers... + */ + first_bno = howmany(4 * mp->m_sb.sb_sectsize, mp->m_sb.sb_blocksize); + + /* ...the two free space btree roots... */ + first_bno += 2; + + /* ...the inode btree root... */ + first_bno += 1; + + /* ...the initial AGFL... */ + first_bno += xfs_alloc_min_freelist(mp, NULL); + + /* ...the free inode btree root... */ + if (xfs_sb_version_hasfinobt(&mp->m_sb)) + first_bno++; + + /* ...the reverse mapping btree root... */ + if (xfs_sb_version_hasrmapbt(&mp->m_sb)) + first_bno++; + + /* ...the reference count btree... */ + if (xfs_sb_version_hasreflink(&mp->m_sb)) + first_bno++; + + /* + * ...and the log, if it is allocated in the first allocation group. + * + * This can happen with filesystems that only have a single + * allocation group, or very odd geometries created by old mkfs + * versions on very small filesystems. + */ + if (mp->m_sb.sb_logstart && + XFS_FSB_TO_AGNO(mp, mp->m_sb.sb_logstart) == 0) + first_bno += mp->m_sb.sb_logblocks; + + /* + * Now round first_bno up to whatever allocation alignment is given + * by the filesystem or was passed in. + */ + if (xfs_sb_version_hasdalign(&mp->m_sb) && igeo->ialloc_align > 0) + first_bno = roundup(first_bno, sunit); + else if (xfs_sb_version_hasalign(&mp->m_sb) && + mp->m_sb.sb_inoalignmt > 1) + first_bno = roundup(first_bno, mp->m_sb.sb_inoalignmt); + + return XFS_AGINO_TO_INO(mp, 0, XFS_AGB_TO_AGINO(mp, first_bno)); +} --- a/fs/xfs/libxfs/xfs_ialloc.h +++ b/fs/xfs/libxfs/xfs_ialloc.h @@ -152,5 +152,6 @@ int xfs_inobt_insert_rec(struct xfs_btre int xfs_ialloc_cluster_alignment(struct xfs_mount *mp); void xfs_ialloc_setup_geometry(struct xfs_mount *mp); +xfs_ino_t xfs_ialloc_calc_rootino(struct xfs_mount *mp, int sunit); #endif /* __XFS_IALLOC_H__ */ --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -31,7 +31,7 @@ #include "xfs_reflink.h" #include "xfs_extent_busy.h" #include "xfs_health.h" - +#include "xfs_trace.h" static DEFINE_MUTEX(xfs_uuid_table_mutex); static int xfs_uuid_table_size; @@ -365,6 +365,42 @@ release_buf: } /* + * If the sunit/swidth change would move the precomputed root inode value, we + * must reject the ondisk change because repair will stumble over that. + * However, we allow the mount to proceed because we never rejected this + * combination before. Returns true to update the sb, false otherwise. + */ +static inline int +xfs_check_new_dalign( + struct xfs_mount *mp, + int new_dalign, + bool *update_sb) +{ + struct xfs_sb *sbp = &mp->m_sb; + xfs_ino_t calc_ino; + + calc_ino = xfs_ialloc_calc_rootino(mp, new_dalign); + trace_xfs_check_new_dalign(mp, new_dalign, calc_ino); + + if (sbp->sb_rootino == calc_ino) { + *update_sb = true; + return 0; + } + + xfs_warn(mp, +"Cannot change stripe alignment; would require moving root inode."); + + /* + * XXX: Next time we add a new incompat feature, this should start + * returning -EINVAL to fail the mount. Until then, spit out a warning + * that we're ignoring the administrator's instructions. + */ + xfs_warn(mp, "Skipping superblock stripe alignment update."); + *update_sb = false; + return 0; +} + +/* * If we were provided with new sunit/swidth values as mount options, make sure * that they pass basic alignment and superblock feature checks, and convert * them into the same units (FSB) that everything else expects. This step @@ -424,10 +460,17 @@ xfs_update_alignment( struct xfs_sb *sbp = &mp->m_sb; if (mp->m_dalign) { + bool update_sb; + int error; + if (sbp->sb_unit == mp->m_dalign && sbp->sb_width == mp->m_swidth) return 0; + error = xfs_check_new_dalign(mp, mp->m_dalign, &update_sb); + if (error || !update_sb) + return error; + sbp->sb_unit = mp->m_dalign; sbp->sb_width = mp->m_swidth; mp->m_update_sb = true; --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3609,6 +3609,27 @@ DEFINE_KMEM_EVENT(kmem_alloc_large); DEFINE_KMEM_EVENT(kmem_realloc); DEFINE_KMEM_EVENT(kmem_zone_alloc); +TRACE_EVENT(xfs_check_new_dalign, + TP_PROTO(struct xfs_mount *mp, int new_dalign, xfs_ino_t calc_rootino), + TP_ARGS(mp, new_dalign, calc_rootino), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(int, new_dalign) + __field(xfs_ino_t, sb_rootino) + __field(xfs_ino_t, calc_rootino) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->new_dalign = new_dalign; + __entry->sb_rootino = mp->m_sb.sb_rootino; + __entry->calc_rootino = calc_rootino; + ), + TP_printk("dev %d:%d new_dalign %d sb_rootino %llu calc_rootino %llu", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->new_dalign, __entry->sb_rootino, + __entry->calc_rootino) +) + #endif /* _TRACE_XFS_H */ #undef TRACE_INCLUDE_PATH