Received: by 2002:a5b:505:0:0:0:0:0 with SMTP id o5csp1345127ybp; Thu, 17 Oct 2019 11:22:07 -0700 (PDT) X-Google-Smtp-Source: APXvYqxdYf6SaHHZGi73tQKY0EPTzWKFNrcRQuRDbmC1knO7c9c8Gx27ECkqXkh6cSj//EMgagqY X-Received: by 2002:aa7:ce94:: with SMTP id y20mr5374964edv.120.1571336527490; Thu, 17 Oct 2019 11:22:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1571336527; cv=none; d=google.com; s=arc-20160816; b=wa21Vd0tacBNAmIePnnqYkRyJ7ibyScUTGneHp6oqlmACEIQO0upaq+4CM8rHCtIiy 5uYIuhKPe2nn0Rd6C5/wxDzoXy7nXiQh2pZy7nOh555h9NaN/7DtWfQthL+WAB6mvhFQ GQyYOTNREJzx8tZW67OrBBIlj0bIRKoB6mXk70sRQ/PkdX9N65gE5NBZSS/RG65a2UDf 7oxo+E2aZrRQsTuskgPd32WRqheWBtt0gMYYYNH6n7UKcabwR/SiP/qX852ZUNv10jJ0 JVbG13lWXiPl9mtdHOyEg4Qqj58QFjsjS8N/uVVIRBVoOhSS7seOWS4Q5zDH+YEH0BnG FpIw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=j0uAjB28wa/FRbknR1k20aI3OMqDbo+HYuw8OJwhhLI=; b=jRAo/KlSSf7pK/HWsbscwXr78F8fCrq6cp7ax95J05p3X1YiySQF5WlezCw2m+Mtyt WSMewmipDh6PXZiUv8mlK02g3/2Qyh+8xngONmD3PCWik5PCILHZY2/Pl728D1uXPW4K MvSPdVbL0WK1NhfOIQMVl4ECm270KPf96drzMzDChNV5PUJfseKEGAIkQGTwCwqFNs4+ UgQs+3ygNIoNz61nEy+2ya7WFpU6Yz3MtamTbEnTjC6gdCKhxQZJ7eUQhFqN3t61WeAe 463ypEnrGc49YwoCs8iPi6Mi6jZ4fvVr4QX6ZKhK8mba3b/6YnO+0/x7EFieS4192+o+ x1Eg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=zjuEVqc4; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z5si2355388edk.157.2019.10.17.11.21.44; Thu, 17 Oct 2019 11:22:07 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=zjuEVqc4; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2438803AbfJPWAc (ORCPT + 99 others); Wed, 16 Oct 2019 18:00:32 -0400 Received: from mail.kernel.org ([198.145.29.99]:53934 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2438469AbfJPV7T (ORCPT ); Wed, 16 Oct 2019 17:59:19 -0400 Received: from localhost (unknown [192.55.54.58]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 9608721928; Wed, 16 Oct 2019 21:59:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1571263158; bh=tB4YwU90xWK7HOs8Tb3pTYHJbeM4yslGfjNewpddlDE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=zjuEVqc4kgEDzm50kv260J9XBq224cCBTPFui1JOVCMnYsB8vv+HAWWnXJfCG/7t4 xFF4ABGUuWp65IoxQ7+NKKbjl0nfNrcjyIXldlun9oinyomklPPFNBsuGXrL4ws1SR oKmI/zyxTG2szHMCsZI3tTZm8m9omC6O6ANfwSzs= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Chris Mason , Filipe Manana , Josef Bacik , David Sterba Subject: [PATCH 5.3 087/112] btrfs: fix incorrect updating of log root tree Date: Wed, 16 Oct 2019 14:51:19 -0700 Message-Id: <20191016214905.132963181@linuxfoundation.org> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20191016214844.038848564@linuxfoundation.org> References: <20191016214844.038848564@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Josef Bacik commit 4203e968947071586a98b5314fd7ffdea3b4f971 upstream. We've historically had reports of being unable to mount file systems because the tree log root couldn't be read. Usually this is the "parent transid failure", but could be any of the related errors, including "fsid mismatch" or "bad tree block", depending on which block got allocated. The modification of the individual log root items are serialized on the per-log root root_mutex. This means that any modification to the per-subvol log root_item is completely protected. However we update the root item in the log root tree outside of the log root tree log_mutex. We do this in order to allow multiple subvolumes to be updated in each log transaction. This is problematic however because when we are writing the log root tree out we update the super block with the _current_ log root node information. Since these two operations happen independently of each other, you can end up updating the log root tree in between writing out the dirty blocks and setting the super block to point at the current root. This means we'll point at the new root node that hasn't been written out, instead of the one we should be pointing at. Thus whatever garbage or old block we end up pointing at complains when we mount the file system later and try to replay the log. Fix this by copying the log's root item into a local root item copy. Then once we're safely under the log_root_tree->log_mutex we update the root item in the log_root_tree. This way we do not modify the log_root_tree while we're committing it, fixing the problem. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Chris Mason Reviewed-by: Filipe Manana Signed-off-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/tree-log.c | 36 +++++++++++++++++++++++++++--------- 1 file changed, 27 insertions(+), 9 deletions(-) --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -2932,7 +2932,8 @@ out: * in the tree of log roots */ static int update_log_root(struct btrfs_trans_handle *trans, - struct btrfs_root *log) + struct btrfs_root *log, + struct btrfs_root_item *root_item) { struct btrfs_fs_info *fs_info = log->fs_info; int ret; @@ -2940,10 +2941,10 @@ static int update_log_root(struct btrfs_ if (log->log_transid == 1) { /* insert root item on the first sync */ ret = btrfs_insert_root(trans, fs_info->log_root_tree, - &log->root_key, &log->root_item); + &log->root_key, root_item); } else { ret = btrfs_update_root(trans, fs_info->log_root_tree, - &log->root_key, &log->root_item); + &log->root_key, root_item); } return ret; } @@ -3041,6 +3042,7 @@ int btrfs_sync_log(struct btrfs_trans_ha struct btrfs_fs_info *fs_info = root->fs_info; struct btrfs_root *log = root->log_root; struct btrfs_root *log_root_tree = fs_info->log_root_tree; + struct btrfs_root_item new_root_item; int log_transid = 0; struct btrfs_log_ctx root_log_ctx; struct blk_plug plug; @@ -3104,18 +3106,26 @@ int btrfs_sync_log(struct btrfs_trans_ha goto out; } + /* + * We _must_ update under the root->log_mutex in order to make sure we + * have a consistent view of the log root we are trying to commit at + * this moment. + * + * We _must_ copy this into a local copy, because we are not holding the + * log_root_tree->log_mutex yet. This is important because when we + * commit the log_root_tree we must have a consistent view of the + * log_root_tree when we update the super block to point at the + * log_root_tree bytenr. If we update the log_root_tree here we'll race + * with the commit and possibly point at the new block which we may not + * have written out. + */ btrfs_set_root_node(&log->root_item, log->node); + memcpy(&new_root_item, &log->root_item, sizeof(new_root_item)); root->log_transid++; log->log_transid = root->log_transid; root->log_start_pid = 0; /* - * Update or create log root item under the root's log_mutex to prevent - * races with concurrent log syncs that can lead to failure to update - * log root item because it was not created yet. - */ - ret = update_log_root(trans, log); - /* * IO has been started, blocks of the log tree have WRITTEN flag set * in their headers. new modifications of the log will be written to * new positions. so it's safe to allow log writers to go in. @@ -3135,6 +3145,14 @@ int btrfs_sync_log(struct btrfs_trans_ha mutex_unlock(&log_root_tree->log_mutex); mutex_lock(&log_root_tree->log_mutex); + + /* + * Now we are safe to update the log_root_tree because we're under the + * log_mutex, and we're a current writer so we're holding the commit + * open until we drop the log_mutex. + */ + ret = update_log_root(trans, log, &new_root_item); + if (atomic_dec_and_test(&log_root_tree->log_writers)) { /* atomic_dec_and_test implies a barrier */ cond_wake_up_nomb(&log_root_tree->log_writer_wait);