From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Ilya Dryomov, Robin Geuze
Subject: [PATCH 5.4 097/108] rbd: dont hold lock_rwsem while running_list is being drained
Date: Mon, 26 Jul 2021 17:39:38 +0200
Message-Id: <20210726153834.792089562@linuxfoundation.org>
In-Reply-To: <20210726153831.696295003@linuxfoundation.org>
References: <20210726153831.696295003@linuxfoundation.org>

From: Ilya Dryomov

commit ed9eb71085ecb7ded9a5118cec2ab70667cc7350 upstream.

Currently rbd_quiesce_lock() holds lock_rwsem for read while blocking
on releasing_wait completion. On the I/O completion side, each image
request also needs to take lock_rwsem for read. Because the
rw_semaphore implementation doesn't allow new readers after a writer
has indicated interest in the lock, this can result in a deadlock if
something that needs to take lock_rwsem for write gets involved. For
example:

1. watch error occurs
2. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and
   releases lock_rwsem
3. after reestablishing the watch, rbd_reregister_watch() takes
   lock_rwsem for write and calls rbd_reacquire_lock()
4. rbd_quiesce_lock() downgrades lock_rwsem to read and blocks on
   releasing_wait until running_list becomes empty
5. another watch error occurs
6. rbd_watch_errcb() blocks trying to take lock_rwsem for write
7. no in-flight image request can complete and delete itself from
   running_list because lock_rwsem won't be granted anymore

A similar scenario can occur with "lock has been acquired" and "lock
has been released" notification handlers, which also take lock_rwsem
for write to update owner_cid.

We don't actually get anything useful from sitting on lock_rwsem in
rbd_quiesce_lock() -- owner_cid updates certainly don't need to be
synchronized with. In fact the whole owner_cid tracking logic could
probably be removed from the kernel client because we don't support
proxied maintenance operations.

Cc: stable@vger.kernel.org # 5.3+
URL: https://tracker.ceph.com/issues/42757
Signed-off-by: Ilya Dryomov
Tested-by: Robin Geuze
Signed-off-by: Greg Kroah-Hartman
---
 drivers/block/rbd.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -4239,8 +4239,6 @@ again:
 
 static bool rbd_quiesce_lock(struct rbd_device *rbd_dev)
 {
-	bool need_wait;
-
 	dout("%s rbd_dev %p\n", __func__, rbd_dev);
 	lockdep_assert_held_write(&rbd_dev->lock_rwsem);
 
@@ -4252,11 +4250,11 @@ static bool rbd_quiesce_lock(struct rbd_
 	 */
 	rbd_dev->lock_state = RBD_LOCK_STATE_RELEASING;
 	rbd_assert(!completion_done(&rbd_dev->releasing_wait));
-	need_wait = !list_empty(&rbd_dev->running_list);
-	downgrade_write(&rbd_dev->lock_rwsem);
-	if (need_wait)
-		wait_for_completion(&rbd_dev->releasing_wait);
-	up_read(&rbd_dev->lock_rwsem);
+	if (list_empty(&rbd_dev->running_list))
+		return true;
+
+	up_write(&rbd_dev->lock_rwsem);
+	wait_for_completion(&rbd_dev->releasing_wait);
 
 	down_write(&rbd_dev->lock_rwsem);
 	if (rbd_dev->lock_state != RBD_LOCK_STATE_RELEASING)
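
For readers who want to step through the scenario from the changelog,
here is a rough single-threaded model of it in plain userspace C. It is
only a sketch and not kernel code: struct toy_rwsem, the toy_*() helpers
and io_completion_can_run() are hypothetical names made up for this
illustration, and the only rw_semaphore property modelled is the one the
changelog relies on, namely that a waiting writer keeps new readers out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a writer-preferring reader/writer lock (hypothetical). */
struct toy_rwsem {
	int readers;		/* current read holders */
	bool writer;		/* held for write */
	int waiting_writers;	/* writers queued behind the holders */
};

/* A new reader gets in only if no writer holds or waits for the lock. */
static bool toy_try_read(struct toy_rwsem *sem)
{
	if (sem->writer || sem->waiting_writers > 0)
		return false;
	sem->readers++;
	return true;
}

/* Take the lock for write if it is free, otherwise queue as a waiter. */
static bool toy_write_or_queue(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0) {
		sem->waiting_writers++;
		return false;
	}
	sem->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *sem)
{
	sem->writer = false;
}

/* Write holder becomes a read holder without releasing in between. */
static void toy_downgrade(struct toy_rwsem *sem)
{
	sem->writer = false;
	sem->readers = 1;
}

/*
 * Replays steps 3-7 of the changelog. Returns true if an in-flight
 * request could still take the lock for read (and so drain from
 * running_list), false if it would be stuck.
 */
bool io_completion_can_run(bool old_behaviour)
{
	struct toy_rwsem sem = { 0 };
	bool got_write;

	/* Step 3: the re-register path takes the lock for write. */
	got_write = toy_write_or_queue(&sem);
	assert(got_write);

	/* Step 4: the quiesce path starts waiting for running_list. */
	if (old_behaviour)
		toy_downgrade(&sem);	/* keeps holding it for read */
	else
		toy_up_write(&sem);	/* drops it before waiting */

	/* Steps 5-6: a second watch error wants the lock for write. */
	if (toy_write_or_queue(&sem))
		toy_up_write(&sem);	/* got it, did its work, released */

	/* Step 7: a request completion needs the lock for read. */
	return toy_try_read(&sem);
}
```

In this model io_completion_can_run(true), which mimics the old
downgrade_write() path, returns false: the queued writer shuts the
completion out while the quiescing side still holds the lock for read.
io_completion_can_run(false), which mimics the patched up_write() path,
returns true.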