From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, "Darrick J. Wong", Christoph Hellwig, Jan Kara, Sasha Levin
Subject: [PATCH 4.19 32/91] vfs: remove lockdep bogosity in __sb_start_write
Date: Mon, 23 Nov 2020 13:21:52 +0100
Message-Id: <20201123121810.885615538@linuxfoundation.org>
In-Reply-To: <20201123121809.285416732@linuxfoundation.org>
References: <20201123121809.285416732@linuxfoundation.org>

From: Darrick J. Wong

[ Upstream commit 22843291efc986ce7722610073fcf85a39b4cb13 ]

__sb_start_write has some weird looking lockdep code that claims to
exist to handle nested freeze locking requests from xfs.  The code as
written seems broken -- if we think we hold a read lock on any of the
higher freeze levels (e.g. we hold SB_FREEZE_WRITE and are trying to
lock SB_FREEZE_PAGEFAULT), it converts a blocking lock attempt into a
trylock.

However, it's not correct to downgrade a blocking lock attempt to a
trylock unless the downgrading code or the callers are prepared to deal
with that situation.  Neither __sb_start_write nor its callers handle
this at all.  For example:

sb_start_pagefault ignores the return value completely, with the result
that if xfs_filemap_fault loses a race with a different thread trying to
fsfreeze, it will proceed without pagefault freeze protection (thereby
breaking locking rules) and then unlocks the pagefault freeze lock that
it doesn't own on its way out (thereby corrupting the lock state), which
leads to a system hang shortly afterwards.

Normally, this won't happen because our ownership of a read lock on a
higher freeze protection level blocks fsfreeze from grabbing a write
lock on that higher level.
*However*, if lockdep is offline, lock_is_held_type unconditionally
returns 1, which means that percpu_rwsem_is_held returns 1, which means
that __sb_start_write unconditionally converts blocking freeze lock
attempts into trylocks, even when we *don't* hold anything that would
block a fsfreeze.

Apparently this all held together until 5.10-rc1, when bugs in lockdep
caused lockdep to shut itself off early in an fstests run, and once
fstests gets to the "race writes with freezer" tests, kaboom.  This
might explain the long trail of vanishingly infrequent livelocks in
fstests after lockdep goes offline that I've never been able to
diagnose.

We could fix it by spinning on the trylock if wait==true, but AFAICT the
locking works fine if lockdep is not built at all (and I didn't see any
complaints running fstests overnight), so remove this snippet entirely.

NOTE: Commit f4b554af9931 in 2015 created the current weird logic (which
used to exist in a different form in commit 5accdf82ba25c from 2012) in
__sb_start_write.  XFS solved this whole problem in the late 2.6 era by
creating a variant of transactions (XFS_TRANS_NO_WRITECOUNT) that don't
grab intwrite freeze protection, thus making lockdep's solution
unnecessary.  The commit claims that Dave Chinner explained that the
trylock hack + comment could be removed, but nobody ever did.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Jan Kara
Signed-off-by: Sasha Levin
---
 fs/super.c | 33 ++++-----------------------------
 1 file changed, 4 insertions(+), 29 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index f3a8c008e1643..9fb4553c46e63 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1360,36 +1360,11 @@ EXPORT_SYMBOL(__sb_end_write);
  */
 int __sb_start_write(struct super_block *sb, int level, bool wait)
 {
-	bool force_trylock = false;
-	int ret = 1;
+	if (!wait)
+		return percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
 
-#ifdef CONFIG_LOCKDEP
-	/*
-	 * We want lockdep to tell us about possible deadlocks with freezing
-	 * but it's it bit tricky to properly instrument it. Getting a freeze
-	 * protection works as getting a read lock but there are subtle
-	 * problems. XFS for example gets freeze protection on internal level
-	 * twice in some cases, which is OK only because we already hold a
-	 * freeze protection also on higher level. Due to these cases we have
-	 * to use wait == F (trylock mode) which must not fail.
-	 */
-	if (wait) {
-		int i;
-
-		for (i = 0; i < level - 1; i++)
-			if (percpu_rwsem_is_held(sb->s_writers.rw_sem + i)) {
-				force_trylock = true;
-				break;
-			}
-	}
-#endif
-	if (wait && !force_trylock)
-		percpu_down_read(sb->s_writers.rw_sem + level-1);
-	else
-		ret = percpu_down_read_trylock(sb->s_writers.rw_sem + level-1);
-
-	WARN_ON(force_trylock && !ret);
-	return ret;
+	percpu_down_read(sb->s_writers.rw_sem + level-1);
+	return 1;
 }
 EXPORT_SYMBOL(__sb_start_write);
 
-- 
2.27.0
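
Illustration only, not part of the patch: a minimal single-threaded
userspace sketch of the failure mode the changelog describes, where a
caller ignores a failed trylock and then releases a lock it never took.
struct toy_rwsem and the toy_down_read_trylock, toy_up_read,
buggy_fault_path and careful_fault_path names are hypothetical stand-ins
invented for this sketch, not kernel APIs; the real code operates on
sb->s_writers.rw_sem with percpu_down_read_trylock() and
percpu_up_read().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one freeze level; NOT the kernel's percpu_rw_semaphore. */
struct toy_rwsem {
	int readers;	/* number of read-side holders */
	bool writer;	/* true while a "freezer" holds the write side */
};

/* Non-blocking read acquire: fails while a writer holds the lock. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

/* Read release: blindly drops one reader reference. */
static void toy_up_read(struct toy_rwsem *sem)
{
	sem->readers--;
}

/*
 * Caller pattern described in the changelog: the trylock result is
 * ignored, so on failure the work runs unprotected and the release
 * drops a reference that was never taken.
 * Returns the reader count left behind.
 */
static int buggy_fault_path(struct toy_rwsem *sem)
{
	(void)toy_down_read_trylock(sem);	/* result dropped */
	/* ... filesystem work would run here, possibly unprotected ... */
	toy_up_read(sem);
	return sem->readers;
}

/*
 * A caller that is prepared for trylock failure: it backs off and
 * releases only what it actually acquired.
 */
static int careful_fault_path(struct toy_rwsem *sem)
{
	if (!toy_down_read_trylock(sem))
		return sem->readers;	/* back off, nothing to release */
	/* ... protected work ... */
	toy_up_read(sem);
	assert(sem->readers >= 0);	/* count cannot go negative here */
	return sem->readers;
}
```

With a toy_rwsem initialised to { .readers = 0, .writer = true }, the
buggy path leaves the reader count negative while the careful path
leaves it at zero.  The patch itself takes the other route: when wait
is true it always blocks, so callers never see an unexpected failure.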