Date: Wed, 24 Feb 2021 17:07:29 -0800
From: "Paul E. McKenney"
To: Frederic Weisbecker
Cc: LKML, Thomas Gleixner, Boqun Feng, Lai Jiangshan, Neeraj Upadhyay,
    Josh Triplett, Stable, Joel Fernandes
Subject: Re: [PATCH 01/13] rcu/nocb: Fix potential missed nocb_timer rearm
Message-ID: <20210225010729.GN2743@paulmck-ThinkPad-P72>
Reply-To: paulmck@kernel.org
References: <20210223001011.127063-1-frederic@kernel.org>
    <20210223001011.127063-2-frederic@kernel.org>
    <20210224183709.GI2743@paulmck-ThinkPad-P72>
    <20210224220606.GA3179@lothringen>
    <20210225001425.GL2743@paulmck-ThinkPad-P72>
    <20210225004813.GB12431@lothringen>
In-Reply-To: <20210225004813.GB12431@lothringen>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 25, 2021 at 01:48:13AM +0100, Frederic Weisbecker wrote:
> On Wed, Feb 24, 2021 at 04:14:25PM -0800, Paul E. McKenney wrote:
> > On Wed, Feb 24, 2021 at 11:06:06PM +0100, Frederic Weisbecker wrote:
> > > I managed to recollect some pieces of my brain. So keep the above but
> > > let's change the point 10:
> > >
> > > 10. CPU 0 enqueues its second callback, this time with interrupts
> > >     enabled so it can wake directly ->nocb_gp_kthread.
> > >     It does so with calling __wake_nocb_gp() which also cancels the
> > >     pending timer that got queued in step 2. But that doesn't reset
> > >     CPU 0's ->nocb_defer_wakeup which is still set to RCU_NOCB_WAKE.
> > >     So CPU 0's ->nocb_defer_wakeup and CPU 0's ->nocb_timer are now
> > >     desynchronized.
> > >
> > > 11. ->nocb_gp_kthread associates the callback queued in 10 with a new
> > >     grace period, arrange for it to start and sleeps on it.
> > >
> > > 12. The grace period ends, ->nocb_gp_kthread awakens and wakes up
> > >     CPU 0's ->nocb_cb_kthread which invokes the callback queued in 10.
> > >
> > > 13. CPU 0 enqueues its third callback, this time with interrupts
> > >     disabled so it tries to queue a deferred wakeup. However
> > >     ->nocb_defer_wakeup has a stalled RCU_NOCB_WAKE value which prevents
> > >     the CPU 0's ->nocb_timer, that got cancelled in 10, from being armed.
> > >
> > > 14. CPU 0 has its pending callback and it may go unnoticed until
> > >     some other CPU ever wakes up ->nocb_gp_kthread or CPU 0 ever calls
> > >     an explicit deferred wake up caller like idle entry.
> > >
> > > I hope I'm not missing something this time...
> >
> > Thank you, that does sound plausible. I guess I can see how rcutorture
> > might have missed this one!
>
> I must admit it requires a lot of stars to be aligned :-)

It nevertheless constitutes a bug in rcutorture. Or maybe an additional
challenge for the formal verification people. ;-)

                                                        Thanx, Paul
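
For readers following the thread, the desynchronization described in steps 10 to 14 of the quoted message can be replayed with a small toy model. This is only a sketch and not kernel code: the class ToyNocbCpu, its methods, and the replay() helper are hypothetical names invented for illustration. Only ->nocb_defer_wakeup, ->nocb_timer, RCU_NOCB_WAKE and the step numbers come from the message. Three things are assumed here: the "no wakeup pending" value is called RCU_NOCB_WAKE_NOT, the steps between 2 and 10 (which this message does not show) leave the timer pending and the flag set, and one possible remedy is to reset the flag whenever the timer is cancelled.

```python
RCU_NOCB_WAKE_NOT = 0  # assumed name for "no deferred wakeup pending"
RCU_NOCB_WAKE = 1      # deferred wakeup requested (value named in the thread)


class ToyNocbCpu:
    """Toy model of one CPU's deferred-wakeup state (hypothetical, not kernel code)."""

    def __init__(self, reset_flag_on_cancel):
        self.nocb_defer_wakeup = RCU_NOCB_WAKE_NOT
        self.timer_armed = False   # stands in for ->nocb_timer being pending
        self.gp_kthread_wakeups = 0  # direct wakeups of ->nocb_gp_kthread
        # Assumed remedy: clear the flag whenever the timer is cancelled.
        self.reset_flag_on_cancel = reset_flag_on_cancel

    def enqueue_irqs_disabled(self):
        """Deferred wakeup: arm the timer only if no wakeup is marked pending."""
        if self.nocb_defer_wakeup == RCU_NOCB_WAKE_NOT:
            self.nocb_defer_wakeup = RCU_NOCB_WAKE
            self.timer_armed = True

    def enqueue_irqs_enabled(self):
        """Direct wakeup: wake the kthread and cancel any pending timer."""
        self.gp_kthread_wakeups += 1
        self.timer_armed = False
        if self.reset_flag_on_cancel:
            self.nocb_defer_wakeup = RCU_NOCB_WAKE_NOT


def replay(cpu):
    """Replay steps 2, 10 and 13; return True if the timer is armed afterwards."""
    cpu.enqueue_irqs_disabled()  # step 2: timer queued, flag set
    cpu.enqueue_irqs_enabled()   # step 10: direct wake cancels the timer
    cpu.enqueue_irqs_disabled()  # step 13: tries to queue a deferred wakeup
    return cpu.timer_armed


if __name__ == "__main__":
    print("flag left stale :", replay(ToyNocbCpu(reset_flag_on_cancel=False)))
    print("flag reset      :", replay(ToyNocbCpu(reset_flag_on_cancel=True)))
```

In this model, leaving the flag stale means replay() returns False: after step 13 the flag still says a wakeup is pending but no timer is armed, which is the situation step 14 describes. With the assumed reset in place, replay() returns True because step 13 re-arms the timer.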