Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp3999345pxk; Tue, 29 Sep 2020 11:25:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyUHFvf1WMclvnhRl1Fi6jeYuhjtZOUfV6n1lzNZG7zO+n1U4djqw1QbcwQNpL/Tn1mfQKJ X-Received: by 2002:a50:fb0e:: with SMTP id d14mr4959991edq.172.1601403958745; Tue, 29 Sep 2020 11:25:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1601403958; cv=none; d=google.com; s=arc-20160816; b=MZbrNAwg4s03A2fOeFkYh8Aeg1/YcuA0y/4Oph36J5cFk5x+yKwJ/MW+xPFudwOnjW GhuAcYBNTBYkDzR+5s3hMOvO1VpxWMakBbfUYSuh7+jlTNbtNvfJcc9VLMdXbTzvz6Aj uwdvSo2XaEurnawmyPfciwPjFb+95Fp61z4D4t9IBLtOCHY5aaz3egCzvt1/Qxqz/vx3 U+vkmXgv0IPuj3nMn9usRjz6IlPinBQvUdyxhMtHAY142osFimKjYwF6D30ohMBDBaT9 dg27471IYTTQU6gEEoEO1pXtDJHQrAikkEAOWFy8QXRBI89UBNARRMxdmZ3HU8nc/RWu 93hw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=MAD6JI3Lmf6H1nOZtpBwZ+MTfIuVgb0SOE29hyGfiBg=; b=m6K5fXTLF3TYmQM04Vxsat3nc3uAaagI4BHEtzRjlNdgxL7WHOjezo7D5KzyfkR1w2 CK2a0CinUykBVIjbbRCyvCAD2hG0/ijZSoWmeHhWcQU5Vp2HlazIYjOulwac+chpK5xf YBy4J2o7TrtK5XdU7lXEbNnUb9u7kH9q+CMI9QO+6+7869P6ijiqqa60jOrdDCUJ2FR5 ucE+YE3RXQhSaGkFVMU7WDzm88mBLFYOOqHp0aew0GZy/XhrV5jlW9LHBgCmfwWAcydx e+9I+MY4PBrE5HMt/RHbQOufnGO6/iCX7/WgghVMbW0mpAGCCK/s9b3NjJ44ZtAYMMLr raoA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=CxU+IL6z; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b14si3272487edw.199.2020.09.29.11.25.35; Tue, 29 Sep 2020 11:25:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=CxU+IL6z; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732176AbgI2MOH (ORCPT + 99 others); Tue, 29 Sep 2020 08:14:07 -0400 Received: from mail.kernel.org ([198.145.29.99]:50296 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730109AbgI2Lh3 (ORCPT ); Tue, 29 Sep 2020 07:37:29 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 976A523B31; Tue, 29 Sep 2020 11:33:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601379235; bh=joLZIgMFB0vzZK2lPsF6Sk37sr/HoUSetan2POT/xU4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CxU+IL6zNSBtV/MownXRKFPleHFkxfI+E+1HOZE9nkKxvT/6V+AwBhdt+S4pVQLoc 31XdUPwqWKHaqkF2iG5WeLpvKkTXAvhEczXp9mLaYWtyFX3LlEGaSztkrOBiVN2NhX rzE9F8ru5pJuq4gU6rDNCN26d+ybfZkmyQ9lXHCY= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Omar Sandoval , "Darrick J. Wong" , Brian Foster , Sasha Levin Subject: [PATCH 5.4 073/388] xfs: fix log reservation overflows when allocating large rt extents Date: Tue, 29 Sep 2020 12:56:44 +0200 Message-Id: <20200929110014.026788426@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929110010.467764689@linuxfoundation.org> References: <20200929110010.467764689@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Darrick J. Wong [ Upstream commit b1de6fc7520fe12949c070af0e8c0e4044cd3420 ] Omar Sandoval reported that a 4G fallocate on the realtime device causes filesystem shutdowns due to a log reservation overflow that happens when we log the rtbitmap updates. Factor rtbitmap/rtsummary updates into the the tr_write and tr_itruncate log reservation calculation. "The following reproducer results in a transaction log overrun warning for me: mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb mount -o rtdev=/dev/vdc /dev/vdb /mnt fallocate -l 4G /mnt/foo Reported-by: Omar Sandoval Tested-by: Omar Sandoval Signed-off-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Sasha Levin --- fs/xfs/libxfs/xfs_trans_resv.c | 96 +++++++++++++++++++++++++++------- 1 file changed, 77 insertions(+), 19 deletions(-) diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index d12bbd526e7c0..b3584cd2cc164 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -196,6 +196,24 @@ xfs_calc_inode_chunk_res( return res; } +/* + * Per-extent log reservation for the btree changes involved in freeing or + * allocating a realtime extent. We have to be able to log as many rtbitmap + * blocks as needed to mark inuse MAXEXTLEN blocks' worth of realtime extents, + * as well as the realtime summary block. + */ +unsigned int +xfs_rtalloc_log_count( + struct xfs_mount *mp, + unsigned int num_ops) +{ + unsigned int blksz = XFS_FSB_TO_B(mp, 1); + unsigned int rtbmp_bytes; + + rtbmp_bytes = (MAXEXTLEN / mp->m_sb.sb_rextsize) / NBBY; + return (howmany(rtbmp_bytes, blksz) + 1) * num_ops; +} + /* * Various log reservation values. * @@ -218,13 +236,21 @@ xfs_calc_inode_chunk_res( /* * In a write transaction we can allocate a maximum of 2 - * extents. This gives: + * extents. This gives (t1): * the inode getting the new extents: inode size * the inode's bmap btree: max depth * block size * the agfs of the ags from which the extents are allocated: 2 * sector * the superblock free block counter: sector size * the allocation btrees: 2 exts * 2 trees * (2 * max depth - 1) * block size - * And the bmap_finish transaction can free bmap blocks in a join: + * Or, if we're writing to a realtime file (t2): + * the inode getting the new extents: inode size + * the inode's bmap btree: max depth * block size + * the agfs of the ags from which the extents are allocated: 2 * sector + * the superblock free block counter: sector size + * the realtime bitmap: ((MAXEXTLEN / rtextsize) / NBBY) bytes + * the realtime summary: 1 block + * the allocation btrees: 2 trees * (2 * max depth - 1) * block size + * And the bmap_finish transaction can free bmap blocks in a join (t3): * the agfs of the ags containing the blocks: 2 * sector size * the agfls of the ags containing the blocks: 2 * sector size * the super block free block counter: sector size @@ -234,40 +260,72 @@ STATIC uint xfs_calc_write_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + - max((xfs_calc_inode_res(mp, 1) + + unsigned int t1, t2, t3; + unsigned int blksz = XFS_FSB_TO_B(mp, 1); + + t1 = xfs_calc_inode_res(mp, 1) + + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), blksz) + + xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), blksz); + + if (xfs_sb_version_hasrealtime(&mp->m_sb)) { + t2 = xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), - XFS_FSB_TO_B(mp, 1)) + + blksz) + xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), - XFS_FSB_TO_B(mp, 1))), - (xfs_calc_buf_res(5, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), - XFS_FSB_TO_B(mp, 1)))); + xfs_calc_buf_res(xfs_rtalloc_log_count(mp, 1), blksz) + + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), blksz); + } else { + t2 = 0; + } + + t3 = xfs_calc_buf_res(5, mp->m_sb.sb_sectsize) + + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), blksz); + + return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3); } /* - * In truncating a file we free up to two extents at once. We can modify: + * In truncating a file we free up to two extents at once. We can modify (t1): * the inode being truncated: inode size * the inode's bmap btree: (max depth + 1) * block size - * And the bmap_finish transaction can free the blocks and bmap blocks: + * And the bmap_finish transaction can free the blocks and bmap blocks (t2): * the agf for each of the ags: 4 * sector size * the agfl for each of the ags: 4 * sector size * the super block to reflect the freed blocks: sector size * worst case split in allocation btrees per extent assuming 4 extents: * 4 exts * 2 trees * (2 * max depth - 1) * block size + * Or, if it's a realtime file (t3): + * the agf for each of the ags: 2 * sector size + * the agfl for each of the ags: 2 * sector size + * the super block to reflect the freed blocks: sector size + * the realtime bitmap: 2 exts * ((MAXEXTLEN / rtextsize) / NBBY) bytes + * the realtime summary: 2 exts * 1 block + * worst case split in allocation btrees per extent assuming 2 extents: + * 2 exts * 2 trees * (2 * max depth - 1) * block size */ STATIC uint xfs_calc_itruncate_reservation( struct xfs_mount *mp) { - return XFS_DQUOT_LOGRES(mp) + - max((xfs_calc_inode_res(mp, 1) + - xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK) + 1, - XFS_FSB_TO_B(mp, 1))), - (xfs_calc_buf_res(9, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 4), - XFS_FSB_TO_B(mp, 1)))); + unsigned int t1, t2, t3; + unsigned int blksz = XFS_FSB_TO_B(mp, 1); + + t1 = xfs_calc_inode_res(mp, 1) + + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK) + 1, blksz); + + t2 = xfs_calc_buf_res(9, mp->m_sb.sb_sectsize) + + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 4), blksz); + + if (xfs_sb_version_hasrealtime(&mp->m_sb)) { + t3 = xfs_calc_buf_res(5, mp->m_sb.sb_sectsize) + + xfs_calc_buf_res(xfs_rtalloc_log_count(mp, 2), blksz) + + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), blksz); + } else { + t3 = 0; + } + + return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3); } /* -- 2.25.1