Received: by 2002:ac0:bc90:0:0:0:0:0 with SMTP id a16csp5615739img; Wed, 27 Mar 2019 11:46:29 -0700 (PDT) X-Google-Smtp-Source: APXvYqy4Ku3eTbc5ZPtHkyn2qFRec6ejqC3TCGQ/YRsnC3HENXJ9mdw9YafCYksDSTXk4wXQRRqW X-Received: by 2002:a62:26c1:: with SMTP id m184mr6739373pfm.102.1553712389799; Wed, 27 Mar 2019 11:46:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553712389; cv=none; d=google.com; s=arc-20160816; b=XUbZT/MhD0yEA4PBLyaZVKYn8uyrrTcKOJyVZN7NtB3OCz6YlO9u8azV1rTW/lLjEl gOzXVZXrCWHyenxYOmGeOTyocB4cV4qhfMOb2YAmJQA03Kx8tmI/PEnhS7U2iFtDCexz KwZFi/eWVb3R5tckolLvVTHAqBTtNdAPleDLJNZoi0VEjtPaKmTZ1w+H/IMs8jwOQuFZ BWNTZtkD1sI3mDcG70sgdhRqKe68BZtL8V119nPJqg+FiHFb4kv4/wVqxh1TR4KiTHa/ mOoVu0vQaaO+s+tu8h16oJsb0yk1ZGzHG3ySL3P4WungEa7xZyRUeYSf6nVoLAc496Sw wlsQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:user-agent:in-reply-to :content-disposition:mime-version:references:reply-to:subject:cc:to :from:date; bh=9iJ+oZtcZwdH8PO4Q0UM6wtsSQXzQM35+qKH/FQOiGE=; b=oDT2qEVPxIv2burwVWiArz7j/EOkGg2FyBCJckfU0r7VZlhVF9GLlqB7/KylXxA02H 3tRiybRGCW1SZxP9DCPefcYDyuVMwyfrf3pjPQMiBdUQE03ZvB8biz1q3b5LdPc5RSOs FFM9r1ELIwCWoOgXJnbGwepS+0F8tZ2fx0J1Gritv8S7ZCTVJh894RnDAYw3j53fyuXB uYsCAo4XYDAtb2IkpJDLXTL0sNIz56DNHBMhWcbqivkMn5q8SGrAH8O++AR8TMgqYO3M FeenQMdRPWA0MOIfjhKriIXNwaAMG/M/vRY59zffVvM81wcFAkmH2O820Ik2NuDrDO7W KEsw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 60si21279845plf.122.2019.03.27.11.46.14; Wed, 27 Mar 2019 11:46:29 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391593AbfC0Sos (ORCPT + 99 others); Wed, 27 Mar 2019 14:44:48 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:46024 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2391651AbfC0Sop (ORCPT ); Wed, 27 Mar 2019 14:44:45 -0400 Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x2RIcgMI020141 for ; Wed, 27 Mar 2019 14:44:44 -0400 Received: from e13.ny.us.ibm.com (e13.ny.us.ibm.com [129.33.205.203]) by mx0b-001b2d01.pphosted.com with ESMTP id 2rge4ahgab-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Wed, 27 Mar 2019 14:44:44 -0400 Received: from localhost by e13.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 27 Mar 2019 18:44:43 -0000 Received: from b01cxnp22033.gho.pok.ibm.com (9.57.198.23) by e13.ny.us.ibm.com (146.89.104.200) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Wed, 27 Mar 2019 18:44:38 -0000 Received: from b01ledav003.gho.pok.ibm.com (b01ledav003.gho.pok.ibm.com [9.57.199.108]) by b01cxnp22033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id x2RIibVM24051888 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 27 Mar 2019 18:44:37 GMT Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 6BA94B2068; Wed, 27 Mar 2019 18:44:37 +0000 (GMT) Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 395BFB2064; Wed, 27 Mar 2019 18:44:37 +0000 (GMT) Received: from paulmck-ThinkPad-W541 (unknown [9.70.82.188]) by b01ledav003.gho.pok.ibm.com (Postfix) with ESMTP; Wed, 27 Mar 2019 18:44:37 +0000 (GMT) Received: by paulmck-ThinkPad-W541 (Postfix, from userid 1000) id 339BA16C31B1; Wed, 27 Mar 2019 11:44:37 -0700 (PDT) Date: Wed, 27 Mar 2019 11:44:37 -0700 From: "Paul E. McKenney" To: Joel Fernandes Cc: linux-kernel@vger.kernel.org, byungchul.park@lge.com, kernel-team@android.com, rcu@vger.kernel.org, Ingo Molnar , Josh Triplett , Lai Jiangshan , linux-kselftest@vger.kernel.org, Mathieu Desnoyers , Peter Zijlstra , Shuah Khan , Steven Rostedt , Will Deacon Subject: Re: [PATCH v2 2/4] rcutree: Add checks for dynticks counters in rcu_is_cpu_rrupt_from_idle Reply-To: paulmck@linux.ibm.com References: <20190326192411.198070-1-joel@joelfernandes.org> <20190326192411.198070-2-joel@joelfernandes.org> <20190327024751.GV4102@linux.ibm.com> <20190327153401.GA152912@google.com> <20190327155351.GZ4102@linux.ibm.com> <20190327174545.GA209305@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190327174545.GA209305@google.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 19032718-0064-0000-0000-000003C0E224 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00010825; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000282; SDB=6.01180521; UDB=6.00617807; IPR=6.00961238; MB=3.00026183; MTD=3.00000008; XFM=3.00000015; UTC=2019-03-27 18:44:42 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 19032718-0065-0000-0000-00003CDAFE6F Message-Id: <20190327184437.GE4102@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-03-27_12:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1903270131 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Mar 27, 2019 at 01:45:45PM -0400, Joel Fernandes wrote: > On Wed, Mar 27, 2019 at 08:53:51AM -0700, Paul E. McKenney wrote: > > On Wed, Mar 27, 2019 at 11:34:01AM -0400, Joel Fernandes wrote: > > > On Tue, Mar 26, 2019 at 07:47:51PM -0700, Paul E. McKenney wrote: > > > > On Tue, Mar 26, 2019 at 03:24:09PM -0400, Joel Fernandes (Google) wrote: > > > > > In the future we would like to combine the dynticks and dynticks_nesting > > > > > counters thus leading to simplifying the code. At the moment we cannot > > > > > do that due to concerns about usermode upcalls appearing to RCU as half > > > > > of an interrupt. Byungchul tried to do it in [1] but the > > > > > "half-interrupt" concern was raised. It is half because, what RCU > > > > > expects is rcu_irq_enter() and rcu_irq_exit() pairs when the usermode > > > > > exception happens. However, only rcu_irq_enter() is observed. This > > > > > concern may not be valid anymore, but at least it used to be the case. > > > > > > > > > > Out of abundance of caution, Paul added warnings [2] in the RCU code > > > > > which if not fired by 2021 may allow us to assume that such > > > > > half-interrupt scenario cannot happen any more, which can lead to > > > > > simplification of this code. > > > > > > > > > > Summary of the changes are the following: > > > > > > > > > > (1) In preparation for this combination of counters in the future, we > > > > > first need to first be sure that rcu_rrupt_from_idle cannot be called > > > > > from anywhere but a hard-interrupt because previously, the comments > > > > > suggested otherwise so let us be sure. We discussed this here [3]. We > > > > > use the services of lockdep to accomplish this. > > > > > > > > > > (2) Further rcu_rrupt_from_idle() is not explicit about how it is using > > > > > the counters which can lead to weird future bugs. This patch therefore > > > > > makes it more explicit about the specific counter values being tested > > > > > > > > > > (3) Lastly, we check for counter underflows just to be sure these are > > > > > not happening, because the previous code in rcu_rrupt_from_idle() was > > > > > allowing the case where the counters can underflow, and the function > > > > > would still return true. Now we are checking for specific values so let > > > > > us be confident by additional checking, that such underflows don't > > > > > happen. Any case, if they do, we should fix them and the screaming > > > > > warning is appropriate. All these checks checks are NOOPs if PROVE_RCU > > > > > and PROVE_LOCKING are disabled. > > > > > > > > > > [1] https://lore.kernel.org/patchwork/patch/952349/ > > > > > [2] Commit e11ec65cc8d6 ("rcu: Add warning to detect half-interrupts") > > > > > [3] https://lore.kernel.org/lkml/20190312150514.GB249405@google.com/ > > > > > > > > > > Cc: byungchul.park@lge.com > > > > > Cc: kernel-team@android.com > > > > > Cc: rcu@vger.kernel.org > > > > > Signed-off-by: Joel Fernandes (Google) > > > > > > > > Color me stupid: > > > > > > > > [ 48.845724] ------------[ cut here ]------------ > > > > [ 48.846619] Not in hardirq as expected > > > > [ 48.847322] WARNING: CPU: 5 PID: 34 at /home/git/linux-2.6-tip/kernel/rcu/tree.c:388 rcu_is_cpu_rrupt_from_idle+0xea/0x110 > > > > [ 48.849302] Modules linked in: > > > > [ 48.849869] CPU: 5 PID: 34 Comm: cpuhp/5 Not tainted 5.1.0-rc1+ #1 > > > > [ 48.850985] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > > > > [ 48.852436] RIP: 0010:rcu_is_cpu_rrupt_from_idle+0xea/0x110 > > > > [ 48.853455] Code: 85 c0 0f 85 59 ff ff ff 80 3d 33 55 68 01 00 0f 85 4c ff ff ff 48 c7 c7 48 d8 cc 8e 31 c0 c6 05 1d 55 68 01 01 e8 66 54 f8 ff <0f> 0b e9 30 ff ff ff 65 48 8b 05 df 58 54 72 48 85 c0 0f 94 c0 0f > > > > [ 48.856783] RSP: 0000:ffffbc46802dfdc0 EFLAGS: 00010082 > > > > [ 48.857735] RAX: 000000000000001a RBX: 0000000000022b80 RCX: 0000000000000000 > > > > [ 48.859028] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8dac906c > > > > [ 48.860313] RBP: ffffbc46802dfe20 R08: 0000000000000001 R09: 0000000000000001 > > > > [ 48.861607] R10: 000000007d53d16d R11: ffffbc46802dfb48 R12: ffff9e7d7eb62b80 > > > > [ 48.862898] R13: 0000000000000005 R14: ffffffff8dae2ac0 R15: 00000000000000c9 > > > > [ 48.864191] FS: 0000000000000000(0000) GS:ffff9e7d7eb40000(0000) knlGS:0000000000000000 > > > > [ 48.865663] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > [ 48.866702] CR2: 0000000000000000 CR3: 0000000021022000 CR4: 00000000000006e0 > > > > [ 48.867993] Call Trace: > > > > [ 48.868450] rcu_exp_handler+0x35/0x90 > > > > [ 48.869147] generic_exec_single+0xab/0x100 > > > > [ 48.869918] ? rcu_barrier+0x240/0x240 > > > > [ 48.870607] smp_call_function_single+0x8e/0xd0 > > > > [ 48.871441] rcutree_online_cpu+0x80/0x90 > > > > [ 48.872181] cpuhp_invoke_callback+0xb5/0x890 > > > > [ 48.872979] cpuhp_thread_fun+0x172/0x210 > > > > [ 48.873722] ? cpuhp_thread_fun+0x2a/0x210 > > > > [ 48.874474] smpboot_thread_fn+0x10d/0x160 > > > > [ 48.875224] kthread+0xf3/0x130 > > > > [ 48.875804] ? sort_range+0x20/0x20 > > > > [ 48.876446] ? kthread_cancel_delayed_work_sync+0x10/0x10 > > > > [ 48.877445] ret_from_fork+0x3a/0x50 > > > > [ 48.878124] irq event stamp: 734 > > > > [ 48.878717] hardirqs last enabled at (733): [] _raw_spin_unlock_irqrestore+0x2d/0x40 > > > > [ 48.880402] hardirqs last disabled at (734): [] generic_exec_single+0x9a/0x100 > > > > [ 48.881986] softirqs last enabled at (0): [] copy_process.part.56+0x61f/0x2110 > > > > [ 48.883540] softirqs last disabled at (0): [<0000000000000000>] (null) > > > > [ 48.884840] ---[ end trace 00b4c1d2f816f4ed ]--- > > > > > > > > If a CPU invokes generic_exec_single() on itself, the "IPI handler" will > > > > be invoked directly, triggering your new lockdep check. Which is a bit > > > > wasteful. My thought is to add code to sync_rcu_exp_select_node_cpus() > > > > to check the CPU with preemption disabled, avoiding the call to > > > > smp_call_function_single() in that case. > > > > > > > > I have queued all four of your patches, and am trying the fix to > > > > the caller of smp_call_function_single() shown below. Thoughts? > > > > > > Oh interesting. Your fix makes sense. I will go through these paths more as > > > well since I'm not super familiar with this area of the RCU code. But I had > > > one small nit below. > > > > Very good, applying that change. I have a similar issue in the CPU-hotplug > > code that I will also be fixing. > > > > Are there other places where I should be using get_cpu()? > > I will check other usages. I wonder if this path is problematic: > > rcu_do_batch AIUI can be called from process-context if boost is enabled. > In that case rcu_do_batch()-> invoke_rcu_core()-> smp_processor_id() might be > problematic. I will double confirm this situation is possible and send a > get/put_cpu patch as well if that's the case. Other paths seem to be > disabling interrupts or softirqs so they are fine. But I will go through it > in more detail later today (sorry for slow responses, currently catching a plane). The theory is that the case where it is invoked from process context, it is invoked from an rcuc kthread, which is bound to a single CPU. Wouldn't hurt to check, though! > CONFIG_DEBUG_PREEMPT should be able to catch these kinds of issues since > smp_processor_id() checks this internally. And it seems rcutorture configs do > enable these, so it may not be an issue after all, or that DEBUG_PREEMPT > checking needs some investigation to see why it doesn't warn if at all :-) Or maybe I don't have CONFIG_DEBUG_PREEMPT enabled on the scenario that needs it. ;-) And please see below for an additional patch to make the world safe for rcu_is_cpu_rrupt_from_idle(). ;-) Thanx, Paul ------------------------------------------------------------------------ commit a8d8c1e6e09a9a9521e3248a92f5fbb9eb2cf988 Author: Paul E. McKenney Date: Wed Mar 27 10:03:12 2019 -0700 rcu: Avoid self-IPI in sync_sched_exp_online_cleanup() The sync_sched_exp_online_cleanup() is invoked at online time to handle the case where the start of an expedited grace period ran concurrently with a CPU being taken offline and then immediately being placed online. It checks to see if RCU needs an expedited quiescent state from the incoming CPU, sending it an IPI if so. However, it is quite possible that sync_sched_exp_online_cleanup() is running on that CPU, in which case it is considerably less overhead to simply request the quiescent state locally instead of simulating a self-IPI. This commit therefore places the last few lines of rcu_exp_handler() into a new rcu_exp_need_qs() function, which is invoked both by rcu_exp_handler() and by sync_sched_exp_online_cleanup() in the self-IPI case. This also reduces the rcu_exp_handler() function's state space by removing the direct call that this smp_call_function_single() uses to emulate the requested self-IPI. This in turn will allow tighter error checking in rcu_is_cpu_rrupt_from_idle(). Signed-off-by: Paul E. McKenney diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 5390618787b6..de1b4acf6979 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -699,6 +699,16 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp) #else /* #ifdef CONFIG_PREEMPT_RCU */ +/* Request an expedited quiescent state. */ +static void rcu_exp_need_qs(void) +{ + __this_cpu_write(rcu_data.cpu_no_qs.b.exp, true); + /* Store .exp before .rcu_urgent_qs. */ + smp_store_release(this_cpu_ptr(&rcu_data.rcu_urgent_qs), true); + set_tsk_need_resched(current); + set_preempt_need_resched(); +} + /* Invoked on each online non-idle CPU for expedited quiescent state. */ static void rcu_exp_handler(void *unused) { @@ -714,25 +724,38 @@ static void rcu_exp_handler(void *unused) rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); return; } - __this_cpu_write(rcu_data.cpu_no_qs.b.exp, true); - /* Store .exp before .rcu_urgent_qs. */ - smp_store_release(this_cpu_ptr(&rcu_data.rcu_urgent_qs), true); - set_tsk_need_resched(current); - set_preempt_need_resched(); + rcu_exp_need_qs(); } /* Send IPI for expedited cleanup if needed at end of CPU-hotplug operation. */ static void sync_sched_exp_online_cleanup(int cpu) { + unsigned long flags; + int my_cpu; struct rcu_data *rdp; int ret; struct rcu_node *rnp; rdp = per_cpu_ptr(&rcu_data, cpu); rnp = rdp->mynode; - if (!(READ_ONCE(rnp->expmask) & rdp->grpmask)) + my_cpu = get_cpu(); + /* Quiescent state either not needed or already requested, leave. */ + if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) || + __this_cpu_read(rcu_data.cpu_no_qs.b.exp)) { + put_cpu(); return; + } + /* Quiescent state needed on current CPU, so set it up locally. */ + if (my_cpu == cpu) { + local_irq_save(flags); + rcu_exp_need_qs(); + local_irq_restore(flags); + put_cpu(); + return; + } + /* Quiescent state needed on some other CPU, send IPI. */ ret = smp_call_function_single(cpu, rcu_exp_handler, NULL, 0); + put_cpu(); WARN_ON_ONCE(ret); }