Date: Wed, 12 Dec 2018 18:42:34 -0800
From: "Paul E. McKenney"
To: "Zhang, Jun"
Cc: "He, Bo", Steven Rostedt, "linux-kernel@vger.kernel.org",
	"josh@joshtriplett.org", "mathieu.desnoyers@efficios.com",
	"jiangshanlai@gmail.com", "Xiao, Jin", "Zhang, Yanmin",
	"Bai, Jie A", "Sun, Yi J"
Subject: Re: rcu_preempt caused oom
Reply-To: paulmck@linux.ibm.com
Message-Id: <20181213024234.GF4170@linux.ibm.com>
In-Reply-To: <88DC34334CA3444C85D647DBFA962C2735AD5F77@SHSMSX104.ccr.corp.intel.com>
References: <20181211003838.GD4170@linux.ibm.com>
	<20181211044631.GA19942@linux.ibm.com>
	<20181212022446.GV4170@linux.ibm.com>
	<20181212154224.GX4170@linux.ibm.com>
	<20181212210316.GA14777@linux.ibm.com>
	<20181213001214.GE4170@linux.ibm.com>
	<88DC34334CA3444C85D647DBFA962C2735AD5F77@SHSMSX104.ccr.corp.intel.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 13, 2018 at 02:11:35AM +0000, Zhang, Jun wrote:
> Hello, Paul
>
> I think the next patch is better.
> Because ULONG_CMP_GE could cause a double write, which risks writing back an old value.
> Please help review.
> I haven't tested it. If you agree, we will test it.

Just to make sure that I understand, you are worried about something like
the following, correct?

o	__note_gp_changes() compares rnp->gp_seq_needed and rdp->gp_seq_needed
	and finds them equal.

o	At just this time something like rcu_start_this_gp() assigns a new
	(larger) value to rdp->gp_seq_needed.

o	Then __note_gp_changes() overwrites rdp->gp_seq_needed with the
	old value.

This cannot happen because __note_gp_changes() runs with interrupts
disabled on the CPU corresponding to the rcu_data structure referenced
by the rdp pointer.  So there is no way for rcu_start_this_gp() to be
invoked on the same CPU during this "if" statement.

Of course, there could be bugs.  For example:

o	__note_gp_changes() might be called on a different CPU than that
	corresponding to rdp.  You can check this with something like:

		WARN_ON_ONCE(rdp->cpu != smp_processor_id());

o	The same things could happen with rcu_start_this_gp(), and the
	above WARN_ON_ONCE() would work there as well.

o	rcutree_prepare_cpu() is a special case, but is irrelevant unless
	you are doing CPU-hotplug operations.  (It can run on a CPU other
	than rdp->cpu, but only at times when rdp->cpu is offline.)

o	Interrupts might not really be disabled.

That said, your patch could reduce overhead slightly, given that the
two values will be equal much of the time.  So it might be worth testing
just for that reason.

So why not just test it anyway?  If it makes the bug go away, I will be
surprised, but it would not be the first surprise for me.  ;-)

							Thanx, Paul

> Thanks!
>
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 0b760c1..c00f34e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1849,7 +1849,7 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
>  		zero_cpu_stall_ticks(rdp);
>  	}
>  	rdp->gp_seq = rnp->gp_seq;  /* Remember new grace-period state. */
> -	if (ULONG_CMP_GE(rnp->gp_seq_needed, rdp->gp_seq_needed) || rdp->gpwrap)
> +	if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) || rdp->gpwrap)
>  		rdp->gp_seq_needed = rnp->gp_seq_needed;
>  	WRITE_ONCE(rdp->gpwrap, false);
>  	rcu_gpnum_ovf(rnp, rdp);
>
>
> -----Original Message-----
> From: Paul E. McKenney [mailto:paulmck@linux.ibm.com]
> Sent: Thursday, December 13, 2018 08:12
> To: He, Bo
> Cc: Steven Rostedt; linux-kernel@vger.kernel.org; josh@joshtriplett.org; mathieu.desnoyers@efficios.com; jiangshanlai@gmail.com; Zhang, Jun; Xiao, Jin; Zhang, Yanmin; Bai, Jie A; Sun, Yi J
> Subject: Re: rcu_preempt caused oom
>
> On Wed, Dec 12, 2018 at 11:13:22PM +0000, He, Bo wrote:
> > I don't see the rcutree.sysrq_rcu parameter in the v4.19 kernel. I also checked the latest kernel and the latest tag, v4.20-rc6, and don't see sysrq_rcu there either.
> > Please correct me if I have something wrong.
>
> That would be because I sent you the wrong patch, apologies!  :-/
>
> Please instead see the one below, which does add sysrq_rcu.
>
> 							Thanx, Paul
>
> > -----Original Message-----
> > From: Paul E. McKenney
> > Sent: Thursday, December 13, 2018 5:03 AM
> > To: He, Bo
> > Cc: Steven Rostedt; linux-kernel@vger.kernel.org; josh@joshtriplett.org;
> > mathieu.desnoyers@efficios.com; jiangshanlai@gmail.com; Zhang, Jun;
> > Xiao, Jin; Zhang, Yanmin; Bai, Jie A
> > Subject: Re: rcu_preempt caused oom
> >
> > On Wed, Dec 12, 2018 at 07:42:24AM -0800, Paul E. McKenney wrote:
> > > On Wed, Dec 12, 2018 at 01:21:33PM +0000, He, Bo wrote:
> > > > We reproduced it on two boards, but I still don't see the show_rcu_gp_kthreads() dump logs; it seems the patch can't catch the scenario.
> > > > I double-checked that CONFIG_PROVE_RCU=y is enabled in the config, as extracted from /proc/config.gz.
> > >
> > > Strange.
> > >
> > > Are the systems responsive to sysrq keys once failure occurs?  If
> > > so, I will provide you a sysrq-R or some such to dump out the RCU state.
> >
> > Or, as it turns out, sysrq-y if booting with rcutree.sysrq_rcu=1 using the patch below.  Only lightly tested.
>
> ------------------------------------------------------------------------
>
> commit 04b6245c8458e8725f4169e62912c1fadfdf8141
> Author: Paul E. McKenney
> Date:   Wed Dec 12 16:10:09 2018 -0800
>
>     rcu: Add sysrq rcu_node-dump capability
>
>     Backported from v4.21/v5.0
>
>     Life is hard if RCU manages to get stuck without triggering RCU CPU
>     stall warnings or triggering the rcu_check_gp_start_stall() checks
>     for failing to start a grace period.  This commit therefore adds a
>     boot-time-selectable sysrq key (commandeering "y") that allows manually
>     dumping Tree RCU state.  The new rcutree.sysrq_rcu kernel boot parameter
>     must be set for this sysrq to be available.
>
>     Signed-off-by: Paul E. McKenney
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 0b760c1369f7..e9392a9d6291 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -61,6 +61,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include "tree.h"
>  #include "rcu.h"
> @@ -128,6 +129,9 @@ int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
>  int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
>  /* panic() on RCU Stall sysctl. */
>  int sysctl_panic_on_rcu_stall __read_mostly;
> +/* Commandeer a sysrq key to dump RCU's tree. */
> +static bool sysrq_rcu;
> +module_param(sysrq_rcu, bool, 0444);
>
>  /*
>   * The rcu_scheduler_active variable is initialized to the value
> @@ -662,6 +666,27 @@ void show_rcu_gp_kthreads(void)
>  }
>  EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
>
> +/* Dump grace-period-request information due to commandeered sysrq. */
> +static void sysrq_show_rcu(int key)
> +{
> +	show_rcu_gp_kthreads();
> +}
> +
> +static struct sysrq_key_op sysrq_rcudump_op = {
> +	.handler = sysrq_show_rcu,
> +	.help_msg = "show-rcu(y)",
> +	.action_msg = "Show RCU tree",
> +	.enable_mask = SYSRQ_ENABLE_DUMP,
> +};
> +
> +static int __init rcu_sysrq_init(void)
> +{
> +	if (sysrq_rcu)
> +		return register_sysrq_key('y', &sysrq_rcudump_op);
> +	return 0;
> +}
> +early_initcall(rcu_sysrq_init);
> +
>  /*
>   * Send along grace-period-related data for rcutorture diagnostics.
>   */
>
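
As an aside on the comparison change in the quoted __note_gp_changes()
patch, the following is a minimal userspace sketch of which inputs make
each condition perform the store.  The ULONG_CMP_GE() and ULONG_CMP_LT()
definitions are reproduced on the assumption that they match
include/linux/rcupdate.h.  The names struct seq_pair, store_with_ge(),
store_with_lt(), and self_check() are hypothetical and exist only for
illustration.  The sketch is single-threaded, so it says nothing about
the race itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wraparound-safe comparisons, assumed to match include/linux/rcupdate.h. */
#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))

/* Hypothetical stand-in for the fields that the patch touches. */
struct seq_pair {
	unsigned long rnp_needed;	/* models rnp->gp_seq_needed */
	unsigned long rdp_needed;	/* models rdp->gp_seq_needed */
	bool gpwrap;			/* models rdp->gpwrap */
};

/* Original condition.  Returns true if the store was performed. */
static bool store_with_ge(struct seq_pair *p)
{
	if (ULONG_CMP_GE(p->rnp_needed, p->rdp_needed) || p->gpwrap) {
		p->rdp_needed = p->rnp_needed;
		return true;
	}
	return false;
}

/* Proposed condition.  Returns true if the store was performed. */
static bool store_with_lt(struct seq_pair *p)
{
	if (ULONG_CMP_LT(p->rdp_needed, p->rnp_needed) || p->gpwrap) {
		p->rdp_needed = p->rnp_needed;
		return true;
	}
	return false;
}

/* Exercise equal, behind, ahead, and wrapped-past-zero inputs. */
void self_check(void)
{
	struct seq_pair eq_ge = { 8, 8, false }, eq_lt = eq_ge;
	struct seq_pair behind_ge = { 12, 8, false }, behind_lt = behind_ge;
	struct seq_pair ahead_ge = { 8, 12, false }, ahead_lt = ahead_ge;
	struct seq_pair wrap_ge = { 4, ULONG_MAX - 3, false }, wrap_lt = wrap_ge;

	/* Equal values: only the original condition stores (redundantly). */
	assert(store_with_ge(&eq_ge));
	assert(!store_with_lt(&eq_lt));

	/* rdp behind rnp: both conditions store the newer value. */
	assert(store_with_ge(&behind_ge) && behind_ge.rdp_needed == 12);
	assert(store_with_lt(&behind_lt) && behind_lt.rdp_needed == 12);

	/* rdp ahead of rnp: neither condition overwrites it. */
	assert(!store_with_ge(&ahead_ge) && ahead_ge.rdp_needed == 12);
	assert(!store_with_lt(&ahead_lt) && ahead_lt.rdp_needed == 12);

	/* rnp has wrapped past zero: both still treat it as newer. */
	assert(store_with_ge(&wrap_ge) && wrap_ge.rdp_needed == 4);
	assert(store_with_lt(&wrap_lt) && wrap_lt.rdp_needed == 4);
}
```

Calling self_check() from a small main() runs all four cases.  With
equal values the original condition performs a redundant store and the
proposed one skips it, which is the slight overhead reduction mentioned
in the reply.  For the behind, ahead, and wrapped inputs shown here the
two conditions agree.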