Date: Thu, 19 Nov 2020 15:42:21 -0500
From: Joel Fernandes
To: "Paul E. McKenney"
Cc: LKML, Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, rcu, Steven Rostedt
Subject: Re: [PATCH v2] rcu/segcblist: Add debug checks for segment lengths
Message-ID: <20201119204221.GB812262@google.com>
In-Reply-To: <20201119201615.GA1437@paulmck-ThinkPad-P72>

On Thu, Nov 19, 2020 at 12:16:15PM -0800, Paul E. McKenney wrote:
> On Thu, Nov 19, 2020 at 02:44:35PM -0500, Joel Fernandes wrote:
> > On Thu, Nov 19, 2020 at 2:22 PM Paul E. McKenney wrote:
> > > > > > > On Wed, Nov 18, 2020 at 11:15:41AM -0500, Joel Fernandes (Google) wrote:
> > > > > > > > After rcu_do_batch(), add a check for whether the seglen counts went to
> > > > > > > > zero if the list was indeed empty.
> > > > > > > >
> > > > > > > > Signed-off-by: Joel Fernandes (Google)
> > > > > > >
> > > > > > > Queued for testing and further review, thank you!
> > > > > >
> > > > > > FYI, the second of the two checks triggered in all four one-hour runs of
> > > > > > TREE01, all four one-hour runs of TREE04, and one of the four one-hour
> > > > > > runs of TREE07. This one:
> > > > > >
> > > > > >     WARN_ON_ONCE(count != 0 && rcu_segcblist_n_segment_cbs(&rdp->cblist) == 0);
> > > > > >
> > > > > > That is, there are callbacks in the list, but the sum of the segment
> > > > > > counts is nevertheless zero. The ->nocb_lock is held.
> > > > > >
> > > > > > Thoughts?
> > > > >
> > > > > FWIW, TREE01 reproduces it very quickly compared to the other two
> > > > > scenarios, on all four runs, within five minutes.
> > > >
> > > > So far for TREE01, I traced it down to an rcu_barrier happening so it could
> > > > be related to some interaction with rcu_barrier() (Just a guess).
> > >
> > > Well, rcu_barrier() and srcu_barrier() are the only users of
> > > rcu_segcblist_entrain(), if that helps. Your modification to that
> > > function looks plausible to me, but the system's opinion always overrules
> > > mine. ;-)
> >
> > Right. Does anything in the bypass code stand out? That happens during
> > rcu_barrier() as well, and it messes with the lengths.
>
> In theory, rcu_barrier_func() flushes the bypass before doing the
> entrain, and does the rcu_segcblist_entrain() afterwards.
>
> Ah, and that is the issue. If ->cblist is empty and ->nocb_bypass
> is not, then ->cblist length will be nonzero, and none of the
> segments will be nonzero.
>
> So you need something like this for that second WARN, correct?
>
>     WARN_ON_ONCE(!rcu_segcblist_empty(&rdp->cblist) &&
>                  rcu_segcblist_n_segment_cbs(&rdp->cblist) == 0);
>
> This is off the cuff, so should be taken with a grain of salt. And
> there might well be other similar issues.

Ah, makes sense. Or maybe it should be made like the other warning?

    WARN_ON_ONCE(!IS_ENABLED(CONFIG_RCU_NOCB_CPU) && count != 0 &&
                 rcu_segcblist_n_segment_cbs(&rdp->cblist) == 0);

Though your warning is better. I will try these out and see if it goes away.

I am afraid though that there is an issue with !NOCB code since you had other
configs that were failing similarly. :-\

thanks, :-)

 - Joel
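
For anyone following along, the difference between the two forms of the
check can be sketched with a toy model. This is only an illustration under
stated assumptions, not kernel code: struct toy_cblist, its fields, and the
toy_* and warn_* helpers are all hypothetical names made up for this sketch.
It assumes, as described above, that the total length also counts callbacks
sitting in the bypass, while the per-segment counts cover only callbacks that
are actually on the segmented list.

```c
#include <stdbool.h>

#define TOY_NSEGS 4	/* hypothetical number of segments in this toy model */

/* Toy stand-in for a segmented callback list; NOT struct rcu_segcblist. */
struct toy_cblist {
	long len;		/* total count, assumed to include bypass callbacks */
	long n_on_list;		/* callbacks actually linked on the list itself */
	long seglen[TOY_NSEGS];	/* per-segment counts for callbacks on the list */
};

/* Sum of the per-segment counts. */
static long toy_n_segment_cbs(const struct toy_cblist *l)
{
	long sum = 0;

	for (int i = 0; i < TOY_NSEGS; i++)
		sum += l->seglen[i];
	return sum;
}

/* True if no callbacks are linked on the list itself. */
static bool toy_empty(const struct toy_cblist *l)
{
	return l->n_on_list == 0;
}

/* First form: keyed on the total count, so bypass-only callbacks trip it. */
static bool warn_on_count(const struct toy_cblist *l)
{
	return l->len != 0 && toy_n_segment_cbs(l) == 0;
}

/* Second form: keyed on list emptiness, so bypass-only callbacks do not. */
static bool warn_on_nonempty(const struct toy_cblist *l)
{
	return !toy_empty(l) && toy_n_segment_cbs(l) == 0;
}
```

In this model, a list with len = 3 and nothing on the list proper (all three
callbacks parked in the bypass) trips warn_on_count() but not
warn_on_nonempty(). A list with callbacks linked but all segment counts at
zero, which is the real accounting bug the check is meant to catch, still
trips warn_on_nonempty().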