From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, "Paul E.
McKenney", Frederic Weisbecker, "Peter Zijlstra (Intel)", Ingo Molnar
Subject: [PATCH 5.10 574/663] rcu/nocb: Trigger self-IPI on late deferred wake up before user resume
Date: Mon, 1 Mar 2021 17:13:42 +0100
Message-Id: <20210301161210.263566366@linuxfoundation.org>
X-Mailer: git-send-email 2.30.1
In-Reply-To: <20210301161141.760350206@linuxfoundation.org>
References: <20210301161141.760350206@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Frederic Weisbecker

commit f8bb5cae9616224a39cbb399de382d36ac41df10 upstream.

Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP
kthread (rcuog) to be serviced.

Unfortunately the call to rcu_user_enter() is already past the last
rescheduling opportunity before we resume to userspace or to guest mode.
We may escape there with the woken task ignored.

The ultimate resort to fix every callsites is to trigger a self-IPI
(nohz_full depends on arch to implement arch_irq_work_raise()) that will
trigger a reschedule on IRQ tail or guest exit.

Eventually every site that want a saner treatment will need to carefully
place a call to rcu_nocb_flush_deferred_wakeup() before the last explicit
need_resched() check upon resume.

Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf)
Reported-by: Paul E. McKenney
Signed-off-by: Frederic Weisbecker
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210131230548.32970-4-frederic@kernel.org
Signed-off-by: Greg Kroah-Hartman
---
 kernel/rcu/tree.c        |   21 ++++++++++++++++++++-
 kernel/rcu/tree.h        |    2 +-
 kernel/rcu/tree_plugin.h |   25 ++++++++++++++++---------
 3 files changed, 37 insertions(+), 11 deletions(-)

--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -669,6 +669,18 @@ void rcu_idle_enter(void)
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
 
 #ifdef CONFIG_NO_HZ_FULL
+
+/*
+ * An empty function that will trigger a reschedule on
+ * IRQ tail once IRQs get re-enabled on userspace resume.
+ */
+static void late_wakeup_func(struct irq_work *work)
+{
+}
+
+static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
+	IRQ_WORK_INIT(late_wakeup_func);
+
 /**
  * rcu_user_enter - inform RCU that we are resuming userspace.
  *
@@ -686,12 +698,19 @@ noinstr void rcu_user_enter(void)
 
 	lockdep_assert_irqs_disabled();
 
+	/*
+	 * We may be past the last rescheduling opportunity in the entry code.
+	 * Trigger a self IPI that will fire and reschedule once we resume to
+	 * user/guest mode.
+	 */
 	instrumentation_begin();
-	do_nocb_deferred_wakeup(rdp);
+	if (do_nocb_deferred_wakeup(rdp) && need_resched())
+		irq_work_queue(this_cpu_ptr(&late_wakeup_work));
 	instrumentation_end();
 
 	rcu_eqs_enter(true);
 }
+
 #endif /* CONFIG_NO_HZ_FULL */
 
 /**
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -431,7 +431,7 @@ static bool rcu_nocb_try_bypass(struct r
 static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty,
 				 unsigned long flags);
 static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp);
-static void do_nocb_deferred_wakeup(struct rcu_data *rdp);
+static bool do_nocb_deferred_wakeup(struct rcu_data *rdp);
 static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp);
 static void rcu_spawn_cpu_nocb_kthread(int cpu);
 static void __init rcu_spawn_nocb_kthreads(void);
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1631,8 +1631,8 @@ bool rcu_is_nocb_cpu(int cpu)
  * Kick the GP kthread for this NOCB group. Caller holds ->nocb_lock
  * and this function releases it.
  */
-static void wake_nocb_gp(struct rcu_data *rdp, bool force,
-			   unsigned long flags)
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
+			 unsigned long flags)
 	__releases(rdp->nocb_lock)
 {
 	bool needwake = false;
@@ -1643,7 +1643,7 @@ static void wake_nocb_gp(struct rcu_data
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 				    TPS("AlreadyAwake"));
 		rcu_nocb_unlock_irqrestore(rdp, flags);
-		return;
+		return false;
 	}
 	del_timer(&rdp->nocb_timer);
 	rcu_nocb_unlock_irqrestore(rdp, flags);
@@ -1656,6 +1656,8 @@ static void wake_nocb_gp(struct rcu_data
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 	if (needwake)
 		wake_up_process(rdp_gp->nocb_gp_kthread);
+
+	return needwake;
 }
 
 /*
@@ -2152,20 +2154,23 @@ static int rcu_nocb_need_deferred_wakeup
 }
 
 /* Do a deferred wakeup of rcu_nocb_kthread(). */
-static void do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
+static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
 {
 	unsigned long flags;
 	int ndw;
+	int ret;
 
 	rcu_nocb_lock_irqsave(rdp, flags);
 	if (!rcu_nocb_need_deferred_wakeup(rdp)) {
 		rcu_nocb_unlock_irqrestore(rdp, flags);
-		return;
+		return false;
 	}
 	ndw = READ_ONCE(rdp->nocb_defer_wakeup);
 	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
-	wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
+	ret = wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
 	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));
+
+	return ret;
 }
 
 /* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. */
@@ -2181,10 +2186,11 @@ static void do_nocb_deferred_wakeup_time
  * This means we do an inexact common-case check. Note that if
  * we miss, ->nocb_timer will eventually clean things up.
  */
-static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
+static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
 {
 	if (rcu_nocb_need_deferred_wakeup(rdp))
-		do_nocb_deferred_wakeup_common(rdp);
+		return do_nocb_deferred_wakeup_common(rdp);
+	return false;
 }
 
 void rcu_nocb_flush_deferred_wakeup(void)
@@ -2523,8 +2529,9 @@ static int rcu_nocb_need_deferred_wakeup
 	return false;
 }
 
-static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
+static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
 {
+	return false;
 }
 
 static void rcu_spawn_cpu_nocb_kthread(int cpu)
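
For readers who want to reason about the control flow without a kernel tree, here is a minimal user-space sketch of what the patch does in rcu_user_enter(). It is not kernel code. struct model_cpu and every model_* function are hypothetical stand-ins invented for this illustration, and the wakeup_preempts field is an assumption that replaces the scheduler's real decision about whether the woken kthread sets need_resched on the current CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state; each field models one kernel concept. */
struct model_cpu {
	bool deferred_wakeup_pending;	/* models rdp->nocb_defer_wakeup != RCU_NOCB_WAKE_NOT */
	bool waiter_asleep;		/* models the rcuog kthread being asleep */
	bool wakeup_preempts;		/* assumption: the wakeup asks this CPU to reschedule */
	bool need_resched;		/* models need_resched() */
	bool late_work_queued;		/* models late_wakeup_work being queued */
	int reschedules;		/* number of simulated reschedules */
};

/*
 * Models do_nocb_deferred_wakeup() after the patch: it reports whether
 * a sleeping waiter was actually woken, instead of returning void.
 */
static bool model_do_deferred_wakeup(struct model_cpu *cpu)
{
	if (!cpu->deferred_wakeup_pending)
		return false;
	cpu->deferred_wakeup_pending = false;

	if (!cpu->waiter_asleep)
		return false;	/* the "AlreadyAwake" case */

	cpu->waiter_asleep = false;
	if (cpu->wakeup_preempts)
		cpu->need_resched = true;
	return true;
}

/*
 * Models rcu_user_enter(): the last explicit need_resched() check has
 * already run, so a late wakeup can only be honoured by queueing work
 * that fires once interrupts are enabled again.
 */
static void model_user_enter(struct model_cpu *cpu)
{
	if (model_do_deferred_wakeup(cpu) && cpu->need_resched)
		cpu->late_work_queued = true;
}

/*
 * Models the self-IPI firing: the handler body is empty, and the
 * reschedule happens on the IRQ tail because need_resched is set.
 */
static void model_irq_tail(struct model_cpu *cpu)
{
	if (!cpu->late_work_queued)
		return;
	cpu->late_work_queued = false;

	if (cpu->need_resched) {
		cpu->need_resched = false;
		cpu->reschedules++;
	}
}
```

Calling model_user_enter() and then model_irq_tail() on a model_cpu whose deferred_wakeup_pending, waiter_asleep and wakeup_preempts are all true yields one simulated reschedule. If the waiter is already awake, nothing is queued, which mirrors the new "return false" on the AlreadyAwake path in wake_nocb_gp().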