Date: Mon, 27 Sep 2021 09:10:46 -0700
From: "Paul E. McKenney"
To: Guillaume Morin
Cc: linux-kernel@vger.kernel.org
Subject: Re: call_rcu data race patch
Message-ID: <20210927161046.GU880162@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20210917213404.GA14271@bender.morinfr.org>
	<20210917220700.GV4156@paulmck-ThinkPad-P17-Gen-1>
	<20210918003933.GA25868@bender.morinfr.org>
	<20210918040035.GX4156@paulmck-ThinkPad-P17-Gen-1>
	<20210918070836.GA19555@bender.morinfr.org>
	<20210919163539.GD880162@paulmck-ThinkPad-P17-Gen-1>
	<20210920160540.GA31426@bender.morinfr.org>
	<20210922191406.GA31531@bender.morinfr.org>
	<20210922192448.GB880162@paulmck-ThinkPad-P17-Gen-1>
	<20210927153842.GA12620@bender.morinfr.org>
In-Reply-To: <20210927153842.GA12620@bender.morinfr.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 27, 2021 at 05:38:42PM +0200, Guillaume Morin wrote:
> On 22 Sep 12:24, Paul E. McKenney wrote:
> > On Wed, Sep 22, 2021 at 09:14:07PM +0200, Guillaume Morin wrote:
> > > I am a little afraid of jinxing it :) but so far so good. I have a
> > > patched kernel running on a few machines (including my most "reliable
> > > crasher") and they've been stable so far.
> > >
> > > It's definitely too early to declare victory though. I will keep you
> > > posted.
> >
> > Here is hoping! ;-)
>
> Things are still stable. So I am pretty optimistic. How are you planning
> to proceed?

Very good! Would you be willing to give me your Tested-by?

> The first patch is already in your rcu tree and my gut feeling is that
> it is the one that fixes the issue but you're the expert here... Though
> I think it should probably be fast tracked and marked for stable?
>
> Are you planning on committing the 2nd patch to your tree?

This is the second patch, correct? (Too many patches!)

If so, I will add your Tested-by and fill out the commit log.
It would be slated for the v5.17 merge window by default, that is, not
the upcoming merge window but the one after that. Please let me know
if you need it sooner.

							Thanx, Paul

------------------------------------------------------------------------

commit 1a792b59071b697defd4ccdc8b951cce49de9d2f
Author: Paul E. McKenney
Date:   Fri Sep 17 15:04:48 2021 -0700

    EXP rcu: Tighten rcu_advance_cbs_nowake() checks

    This is an experimental shot-in-the-dark debugging patch.

    Signed-off-by: Paul E. McKenney

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 6a1e9d3374db..6d692a591f66 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1590,10 +1590,14 @@ static void __maybe_unused rcu_advance_cbs_nowake(struct rcu_node *rnp,
 						  struct rcu_data *rdp)
 {
 	rcu_lockdep_assert_cblist_protected(rdp);
-	if (!rcu_seq_state(rcu_seq_current(&rnp->gp_seq)) ||
+	// Don't do anything unless the current grace period is guaranteed
+	// not to end. This means a grace period in progress and at least
+	// one holdout CPU.
+	if (!rcu_seq_state(rcu_seq_current(&rnp->gp_seq)) || !READ_ONCE(rnp->qsmask) ||
 	    !raw_spin_trylock_rcu_node(rnp))
 		return;
-	WARN_ON_ONCE(rcu_advance_cbs(rnp, rdp));
+	if (rcu_seq_state(rcu_seq_current(&rnp->gp_seq)) && READ_ONCE(rnp->qsmask))
+		WARN_ON_ONCE(rcu_advance_cbs(rnp, rdp));
 	raw_spin_unlock_rcu_node(rnp);
 }
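
For readers following along, the shape of the tightened check (lockless
pre-check, trylock, then recheck under the lock) can be sketched in
userspace C. This is only an illustration under stated assumptions, not
kernel code: struct node, gp_in_progress(), have_holdout(), and
advance_nowake() are hypothetical stand-ins, a pthread mutex takes the
place of the rcu_node lock, and the two-bit state mask is an assumption
about how rcu_seq_state() interprets gp_seq.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rcu_node; not the kernel's layout. */
struct node {
	pthread_mutex_t lock;
	atomic_ulong gp_seq;	/* assumed: low two bits nonzero means a GP is in progress */
	atomic_ulong qsmask;	/* nonzero means at least one CPU is still holding out */
	int advanced;		/* counts how often the guarded work ran */
};

#define SEQ_STATE_MASK 0x3UL	/* assumption, modeled on rcu_seq_state() */

static bool gp_in_progress(struct node *np)
{
	return (atomic_load(&np->gp_seq) & SEQ_STATE_MASK) != 0;
}

static bool have_holdout(struct node *np)
{
	return atomic_load(&np->qsmask) != 0;
}

/* Returns true if the guarded work ran. */
static bool advance_nowake(struct node *np)
{
	bool ok;

	/* Cheap lockless pre-check; give up rather than wait for the lock. */
	if (!gp_in_progress(np) || !have_holdout(np) ||
	    pthread_mutex_trylock(&np->lock) != 0)
		return false;

	/* The state may have changed before the lock was acquired: recheck. */
	ok = gp_in_progress(np) && have_holdout(np);
	if (ok)
		np->advanced++;	/* stands in for rcu_advance_cbs() */

	pthread_mutex_unlock(&np->lock);
	return ok;
}
```

The recheck matters because the first test runs without the lock, so
either condition could become false between that test and a successful
trylock. In the sketch, as in the patch, the work is skipped in that
case instead of being done against a grace period that may be ending.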