From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Liu Bo, Filipe Manana,
 Qu Wenruo, David Sterba, Nikolay Borisov
Subject: [PATCH 4.16 061/110] btrfs: fix reading stale metadata blocks after degraded raid1 mounts
Date: Mon, 21 May 2018 23:11:58 +0200
Message-Id: <20180521210511.440167357@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To:
 <20180521210503.823249477@linuxfoundation.org>
References: <20180521210503.823249477@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

4.16-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liu Bo

commit 02a3307aa9c20b4f6626255b028f07f6cfa16feb upstream.

If a btree block, aka extent buffer, is not available in the extent
buffer cache, it is read from disk instead, i.e.

btrfs_search_slot()
  read_block_for_search()  # hold parent and its lock, go to read child
    btrfs_release_path()
    read_tree_block()  # read child

Unfortunately, the parent lock is released before the child is read, so
commit 5bdd3536cbbe ("Btrfs: Fix block generation verification race")
used 0 as the parent transid when reading the child block.

That forces read_tree_block() to skip the check of whether the parent
transid differs from the generation id of the child it reads from disk.

A simple PoC is included in btrfs/124,

0. A two-disk raid1 btrfs,

1. Right after mkfs.btrfs, block A is allocated to be the device tree's
   root.

2. Mount this filesystem and put it in use. After a while, the device
   tree's root gets COWed, but block A hasn't been allocated/overwritten
   yet.

3. Umount it and reload the btrfs module to remove both disks from the
   global @fs_devices list.

4. mount -odegraded dev1 and write some data, so now block A is
   allocated to be a leaf in the checksum tree.  Note that only dev1
   has the latest metadata of this filesystem.

5. Umount it and mount it again normally (with both disks). Since raid1
   can pick a disk by the writer task's pid, if btrfs_search_slot()
   needs to read block A, dev2, which does NOT have the latest metadata,
   might be read for block A, and we then get a stale block A.

6. As the parent transid is not checked, block A is marked as uptodate
   and put into the extent buffer cache, so future searches won't bother
   to read the disk again. They will therefore make changes to this
   stale block, dirty it and flush it to disk.

To avoid the problem, the parent transid needs to be passed to
read_tree_block().

In order to get a valid parent transid, we need to hold the parent's
lock until the child has been read.

This patch needs to be slightly adapted for stable kernels, the
&first_key parameter added to read_tree_block() is from 4.16+
(581c1760415c4). The fix is to replace 0 by 'gen'.

Fixes: 5bdd3536cbbe ("Btrfs: Fix block generation verification race")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Liu Bo
Reviewed-by: Filipe Manana
Reviewed-by: Qu Wenruo
[ update changelog ]
Signed-off-by: David Sterba
Signed-off-by: Nikolay Borisov
Signed-off-by: Greg Kroah-Hartman

---
 fs/btrfs/ctree.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2491,10 +2491,8 @@ read_block_for_search(struct btrfs_root
 	if (p->reada != READA_NONE)
 		reada_for_search(fs_info, p, level, slot, key->objectid);
 
-	btrfs_release_path(p);
-
 	ret = -EAGAIN;
-	tmp = read_tree_block(fs_info, blocknr, 0);
+	tmp = read_tree_block(fs_info, blocknr, gen);
 	if (!IS_ERR(tmp)) {
 		/*
 		 * If the read above didn't mark this buffer up to date,
@@ -2508,6 +2506,8 @@ read_block_for_search(struct btrfs_root
 	} else {
 		ret = PTR_ERR(tmp);
 	}
+
+	btrfs_release_path(p);
 	return ret;
 }
 
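
For readers following the PoC, here is a minimal userspace sketch of why
passing 0 versus 'gen' matters. It is a toy model, not kernel code: every
name in it (toy_copy, toy_transid_ok, toy_read_block, toy_demo) is
hypothetical, the generation numbers are made up, and the retry on the
other mirror is a simplification assumed for illustration. It only
mirrors the rule described in the changelog, that a parent transid of 0
disables the generation check.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_copy {
	uint64_t generation;	/* generation stamped in this copy's header */
};

/*
 * Accept a copy if its generation matches what the parent expects.
 * As in the changelog, an expected transid of 0 disables the check.
 */
static int toy_transid_ok(const struct toy_copy *copy, uint64_t parent_transid)
{
	if (parent_transid == 0)
		return 1;
	return copy->generation == parent_transid;
}

/*
 * Try each mirror in turn, starting at 'first'. Return the index of the
 * first accepted copy, or -1 if every copy fails the check.
 */
static int toy_read_block(const struct toy_copy *mirrors, int nr_mirrors,
			  int first, uint64_t parent_transid)
{
	for (int i = 0; i < nr_mirrors; i++) {
		int m = (first + i) % nr_mirrors;

		if (toy_transid_ok(&mirrors[m], parent_transid))
			return m;
	}
	return -1;
}

/* Replays step 5 of the PoC with made-up generation numbers. */
static void toy_demo(void)
{
	/* index 0 = dev1 (rewritten while degraded), index 1 = dev2 (stale) */
	const struct toy_copy block_a[2] = { { 12 }, { 5 } };
	const uint64_t gen = 12;	/* generation recorded in the parent */

	/* Old behaviour: transid 0, so the stale dev2 copy is accepted. */
	assert(toy_read_block(block_a, 2, 1, 0) == 1);
	/* Patched behaviour: dev2 is rejected and dev1 is used instead. */
	assert(toy_read_block(block_a, 2, 1, gen) == 0);
}
```

Calling toy_demo() from a main() exercises both cases. In this model the
read starts at dev2 both times; only the expected generation decides
whether the stale copy is accepted.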