From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Filipe Manana, David Sterba
Subject: [PATCH 4.14 40/69] Btrfs: fix race updating log root item during fsync
Date: Fri, 7 Jun 2019 17:39:21 +0200
Message-Id: <20190607153853.310094093@linuxfoundation.org>
In-Reply-To: <20190607153848.271562617@linuxfoundation.org>
References:
 <20190607153848.271562617@linuxfoundation.org>

From: Filipe Manana

commit 06989c799f04810f6876900d4760c0edda369cf7 upstream.

When syncing the log, the final phase of a fsync operation, we need to
either create a log root's item or update the existing item in the log
tree of log roots, and that depends on the current value of the log
root's log_transid - if it's 1 we need to create the log root item,
otherwise it must exist already and we update it. Since there is no
synchronization between updating the log_transid and checking it for
deciding whether the log root's item needs to be created or updated, we
end up with a tiny race window that results in attempts to update the
item to fail because the item was not yet created:

              CPU 1                                    CPU 2

  btrfs_sync_log()

    lock root->log_mutex

    set log root's log_transid to 1

    unlock root->log_mutex

                                               btrfs_sync_log()

                                                 lock root->log_mutex

                                                 sets log root's
                                                 log_transid to 2

                                                 unlock root->log_mutex

    update_log_root()

      sees log root's log_transid
      with a value of 2

        calls btrfs_update_root(),
        which fails with -EUCLEAN
        and causes transaction abort

Until recently the race led to a BUG_ON at btrfs_update_root(), but after
the recent commit 7ac1e464c4d47 ("btrfs: Don't panic when we can't find a
root key") we just abort the current transaction.

A sample trace of the BUG_ON() on a SLE12 kernel:

  ------------[ cut here ]------------
  kernel BUG at ../fs/btrfs/root-tree.c:157!
  Oops: Exception in kernel mode, sig: 5 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  (...)
  Supported: Yes, External
  CPU: 78 PID: 76303 Comm: rtas_errd Tainted: G X 4.4.156-94.57-default #1
  task: c00000ffa906d010 ti: c00000ff42b08000 task.ti: c00000ff42b08000
  NIP: d000000036ae5cdc LR: d000000036ae5cd8 CTR: 0000000000000000
  REGS: c00000ff42b0b860 TRAP: 0700 Tainted: G X (4.4.156-94.57-default)
  MSR: 8000000002029033 CR: 22444484 XER: 20000000
  CFAR: d000000036aba66c SOFTE: 1
  GPR00: d000000036ae5cd8 c00000ff42b0bae0 d000000036bda220 0000000000000054
  GPR04: 0000000000000001 0000000000000000 c00007ffff8d37c8 0000000000000000
  GPR08: c000000000e19c00 0000000000000000 0000000000000000 3736343438312079
  GPR12: 3930373337303434 c000000007a3a800 00000000007fffff 0000000000000023
  GPR16: c00000ffa9d26028 c00000ffa9d261f8 0000000000000010 c00000ffa9d2ab28
  GPR20: c00000ff42b0bc48 0000000000000001 c00000ff9f0d9888 0000000000000001
  GPR24: c00000ffa9d26000 c00000ffa9d261e8 c00000ffa9d2a800 c00000ff9f0d9888
  GPR28: c00000ffa9d26028 c00000ffa9d2aa98 0000000000000001 c00000ffa98f5b20
  NIP [d000000036ae5cdc] btrfs_update_root+0x25c/0x4e0 [btrfs]
  LR [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs]
  Call Trace:
  [c00000ff42b0bae0] [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs] (unreliable)
  [c00000ff42b0bba0] [d000000036b53610] btrfs_sync_log+0x2d0/0xc60 [btrfs]
  [c00000ff42b0bce0] [d000000036b1785c] btrfs_sync_file+0x44c/0x4e0 [btrfs]
  [c00000ff42b0bd80] [c00000000032e300] vfs_fsync_range+0x70/0x120
  [c00000ff42b0bdd0] [c00000000032e44c] do_fsync+0x5c/0xb0
  [c00000ff42b0be10] [c00000000032e8dc] SyS_fdatasync+0x2c/0x40
  [c00000ff42b0be30] [c000000000009488] system_call+0x3c/0x100
  Instruction dump:
  7f43d378 4bffebb9 60000000 88d90008 3d220000 e8b90000 3b390009 e87a01f0
  e8898e08 e8f90000 4bfd48e5 60000000 <0fe00000> e95b0060 39200004 394a0ea0
  ---[ end trace 8f2dc8f919cabab8 ]---

So fix this by doing the check of log_transid and updating or creating the
log root's item while holding the root's log_mutex.

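For readers who want to step through the interleaving, here is a minimal
single-threaded sketch that replays the two orderings described above. It
is not the btrfs code: struct model_root, model_bump_transid(),
model_update_log_root(), replay_racy() and replay_fixed() are hypothetical
names made up for illustration, and the "item exists" flag stands in for
the log root item in the tree of log roots. It assumes, as the description
says, that a log_transid of 1 means "create" and anything else means
"update", and that updating a missing item fails with -EUCLEAN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; fallback for C libraries that lack it */
#endif

/* Hypothetical stand-in for a log root, not the real struct btrfs_root. */
struct model_root {
	int log_transid;	/* 1 means the log root item must be created */
	bool item_exists;	/* item present in the tree of log roots? */
};

/* Step done under root->log_mutex in both the old and the new ordering. */
static void model_bump_transid(struct model_root *log)
{
	log->log_transid++;
}

/* Create-or-update decision as described in the commit message. */
static int model_update_log_root(struct model_root *log)
{
	if (log->log_transid == 1) {
		log->item_exists = true;	/* create the item */
		return 0;
	}
	/* Update path: fails if the item was never created. */
	return log->item_exists ? 0 : -EUCLEAN;
}

/* Old ordering: CPU 1 checks log_transid after dropping the mutex. */
int replay_racy(void)
{
	struct model_root log = { 0, false };

	model_bump_transid(&log);		/* CPU 1, mutex held */
	model_bump_transid(&log);		/* CPU 2, mutex held */
	return model_update_log_root(&log);	/* CPU 1, mutex dropped */
}

/* New ordering: bump and create-or-update form one critical section. */
int replay_fixed(void)
{
	struct model_root log = { 0, false };
	int ret;

	model_bump_transid(&log);		/* CPU 1, mutex held */
	ret = model_update_log_root(&log);	/* CPU 1, mutex still held */
	if (ret)
		return ret;
	assert(log.item_exists);		/* created before CPU 2 runs */

	model_bump_transid(&log);		/* CPU 2, mutex held */
	return model_update_log_root(&log);	/* CPU 2, mutex still held */
}
```

Calling replay_racy() from a test program returns -EUCLEAN, matching the
failure in the diagram, while replay_fixed() returns 0 because the item is
created before the second sync can bump log_transid.
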
Fixes: 7237f1833601d ("Btrfs: fix tree logs parallel sync")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman
---
 fs/btrfs/tree-log.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2907,6 +2907,12 @@ int btrfs_sync_log(struct btrfs_trans_ha
 	log->log_transid = root->log_transid;
 	root->log_start_pid = 0;
 	/*
+	 * Update or create log root item under the root's log_mutex to prevent
+	 * races with concurrent log syncs that can lead to failure to update
+	 * log root item because it was not created yet.
+	 */
+	ret = update_log_root(trans, log);
+	/*
 	 * IO has been started, blocks of the log tree have WRITTEN flag set
 	 * in their headers. new modifications of the log will be written to
 	 * new positions. so it's safe to allow log writers to go in.
@@ -2925,8 +2931,6 @@ int btrfs_sync_log(struct btrfs_trans_ha
 
 	mutex_unlock(&log_root_tree->log_mutex);
 
-	ret = update_log_root(trans, log);
-
 	mutex_lock(&log_root_tree->log_mutex);
 	if (atomic_dec_and_test(&log_root_tree->log_writers)) {
 		/*