From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Mark Nelson, Jeff Layton,
 Luis Henriques, Ilya Dryomov
Subject: [PATCH 5.13 150/151] ceph: take snap_empty_lock atomically with snaprealm refcount change
Date: Mon, 16 Aug 2021 15:03:00 +0200
Message-Id: <20210816125449.027025247@linuxfoundation.org>
In-Reply-To: <20210816125444.082226187@linuxfoundation.org>
References: <20210816125444.082226187@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

From: Jeff Layton

commit 8434ffe71c874b9c4e184b88d25de98c2bf5fe3f upstream.

There is a race in ceph_put_snap_realm. The change to the nref and the
spinlock acquisition are not done atomically, so you could decrement
nref, and before you take the spinlock, the nref is incremented again.
At that point, you end up putting it on the empty list when it
shouldn't be there. Eventually __cleanup_empty_realms runs and frees
it when it's still in-use.

Fix this by protecting the 1->0 transition with atomic_dec_and_lock,
and just drop the spinlock if we can get the rwsem.

Because these objects can also undergo a 0->1 refcount transition, we
must protect that change as well with the spinlock. Increment
locklessly unless the value is at 0, in which case we take the
spinlock, increment and then take it off the empty list if it did the
0->1 transition.

With these changes, I'm removing the dout() messages from these
functions, as well as in __put_snap_realm. They've always been racy,
and it's better to not print values that may be misleading.
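To make the fix concrete, here is a minimal userspace sketch of the
atomic_dec_and_lock() semantics the patch relies on, using C11 atomics
and a pthread mutex as stand-ins for the kernel's atomic_t and
spinlock_t. It is illustrative only, not the kernel implementation:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>

    /*
     * Decrement locklessly while the count stays above 1. For a
     * potential 1->0 transition, take the lock first and re-check
     * under it. Returns true, with the lock held, iff the count hit
     * zero, so no other thread can observe (and revive) a zero count
     * before we hold the lock.
     */
    static bool dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
    {
            int old = atomic_load(cnt);

            while (old > 1) {
                    /* fast path: CAS the decrement in place */
                    if (atomic_compare_exchange_weak(cnt, &old, old - 1))
                            return false;   /* lock not taken */
            }

            pthread_mutex_lock(lock);
            if (atomic_fetch_sub(cnt, 1) == 1)
                    return true;            /* 1->0 done under the lock */
            pthread_mutex_unlock(lock);
            return false;
    }

With this shape, the buggy interleaving described above (decrement to
0, concurrent increment back to 1, then putting a live realm on the
empty list) cannot happen: the decrement to 0 and the subsequent list
manipulation occur inside the same critical section.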
Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/46419
Reported-by: Mark Nelson
Signed-off-by: Jeff Layton
Reviewed-by: Luis Henriques
Signed-off-by: Ilya Dryomov
Signed-off-by: Greg Kroah-Hartman
---
 fs/ceph/snap.c |   34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -67,19 +67,19 @@ void ceph_get_snap_realm(struct ceph_mds
 {
 	lockdep_assert_held(&mdsc->snap_rwsem);
 
-	dout("get_realm %p %d -> %d\n", realm,
-	     atomic_read(&realm->nref), atomic_read(&realm->nref)+1);
 	/*
-	 * since we _only_ increment realm refs or empty the empty
-	 * list with snap_rwsem held, adjusting the empty list here is
-	 * safe.  we do need to protect against concurrent empty list
-	 * additions, however.
+	 * The 0->1 and 1->0 transitions must take the snap_empty_lock
+	 * atomically with the refcount change. Go ahead and bump the
+	 * nref here, unless it's 0, in which case we take the spinlock
+	 * and then do the increment and remove it from the list.
 	 */
-	if (atomic_inc_return(&realm->nref) == 1) {
-		spin_lock(&mdsc->snap_empty_lock);
+	if (atomic_inc_not_zero(&realm->nref))
+		return;
+
+	spin_lock(&mdsc->snap_empty_lock);
+	if (atomic_inc_return(&realm->nref) == 1)
 		list_del_init(&realm->empty_item);
-		spin_unlock(&mdsc->snap_empty_lock);
-	}
+	spin_unlock(&mdsc->snap_empty_lock);
 }
 
 static void __insert_snap_realm(struct rb_root *root,
@@ -208,28 +208,28 @@ static void __put_snap_realm(struct ceph
 {
 	lockdep_assert_held_write(&mdsc->snap_rwsem);
 
-	dout("__put_snap_realm %llx %p %d -> %d\n", realm->ino, realm,
-	     atomic_read(&realm->nref), atomic_read(&realm->nref)-1);
+	/*
+	 * We do not require the snap_empty_lock here, as any caller that
+	 * increments the value must hold the snap_rwsem.
+	 */
 	if (atomic_dec_and_test(&realm->nref))
 		__destroy_snap_realm(mdsc, realm);
 }
 
 /*
- * caller needn't hold any locks
+ * See comments in ceph_get_snap_realm. Caller needn't hold any locks.
  */
 void ceph_put_snap_realm(struct ceph_mds_client *mdsc,
 			 struct ceph_snap_realm *realm)
 {
-	dout("put_snap_realm %llx %p %d -> %d\n", realm->ino, realm,
-	     atomic_read(&realm->nref), atomic_read(&realm->nref)-1);
-
-	if (!atomic_dec_and_test(&realm->nref))
+	if (!atomic_dec_and_lock(&realm->nref, &mdsc->snap_empty_lock))
 		return;
 
 	if (down_write_trylock(&mdsc->snap_rwsem)) {
+		spin_unlock(&mdsc->snap_empty_lock);
 		__destroy_snap_realm(mdsc, realm);
 		up_write(&mdsc->snap_rwsem);
 	} else {
-		spin_lock(&mdsc->snap_empty_lock);
 		list_add(&realm->empty_item, &mdsc->snap_empty);
 		spin_unlock(&mdsc->snap_empty_lock);
 	}
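The get side in the first hunk is the mirror image:
atomic_inc_not_zero() bumps the count locklessly unless it is 0, and
the 0->1 transition falls back to the same lock. A matching sketch
with the same stand-in primitives and headers as the sketch above
(again illustrative, not the kernel code):

    /*
     * Increment locklessly while the count is nonzero. A potential
     * 0->1 transition must happen under the lock, where the object
     * can also be taken back off the "empty" (to-be-reaped) list.
     */
    static void get_ref(atomic_int *cnt, pthread_mutex_t *lock)
    {
            int old = atomic_load(cnt);

            while (old > 0) {
                    /* fast path: equivalent of atomic_inc_not_zero() */
                    if (atomic_compare_exchange_weak(cnt, &old, old + 1))
                            return;
            }

            pthread_mutex_lock(lock);
            if (atomic_fetch_add(cnt, 1) == 0) {
                    /* 0->1 under the lock: list_del_init() goes here */
            }
            pthread_mutex_unlock(lock);
    }

Because both the 0->1 and 1->0 transitions are serialized by the one
lock, a realm can never sit on the empty list while its refcount is
nonzero, which is exactly the invariant __cleanup_empty_realms
depends on.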