Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 992F3C433F5 for ; Fri, 26 Nov 2021 13:34:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1353350AbhKZNhl (ORCPT ); Fri, 26 Nov 2021 08:37:41 -0500 Received: from foss.arm.com ([217.140.110.172]:34014 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1352433AbhKZNfk (ORCPT ); Fri, 26 Nov 2021 08:35:40 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3345311D4; Fri, 26 Nov 2021 05:32:27 -0800 (PST) Received: from e113632-lin (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 306713F66F; Fri, 26 Nov 2021 05:32:26 -0800 (PST) From: Valentin Schneider To: Vincent Guittot Cc: Vincent Donnefort , peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org, mgorman@techsingularity.net, dietmar.eggemann@arm.com Subject: Re: [PATCH] sched/fair: Fix detection of per-CPU kthreads waking a task In-Reply-To: References: <20211124154239.3191366-1-vincent.donnefort@arm.com> <8735nkcwov.mognet@arm.com> <87zgpsb6de.mognet@arm.com> Date: Fri, 26 Nov 2021 13:32:03 +0000 Message-ID: <87sfvjavqk.mognet@arm.com> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 26/11/21 09:23, Vincent Guittot wrote: > On Thu, 25 Nov 2021 at 16:30, Valentin Schneider > wrote: >> On 25/11/21 14:23, Vincent Guittot wrote: >> > If we want to filter wakeup >> > generated by interrupt context while a per cpu kthread is running, it >> > would be better to fix all cases and test the running context like >> > this >> > >> >> I think that could make sense - though can the idle task issue wakeups in >> process context? If so that won't be sufficient. A quick audit tells me: >> >> o rcu_nocb_flush_deferred_wakeup() happens before calling into cpuidle >> o I didn't see any wakeup issued from the cpu_pm_notifier call chain >> o I'm not entirely sure about flush_smp_call_function_from_idle(). I found >> this thing in RCU: >> >> smp_call_function_single(cpu, rcu_exp_handler) >> >> rcu_exp_handler() >> rcu_report_exp_rdp() >> rcu_report_exp_cpu_mult() >> __rcu_report_exp_rnp() >> swake_up_one() >> >> IIUC if set_nr_if_polling() then the smp_call won't send an IPI and should be >> handled in that flush_foo_from_idle() call. > > Aren't all these planned to wakeup on local cpu ? so i don't see any > real problem there > Hm so other than boot time oddities I think that does end up with threads of an !UNBOUND (so pcpu) workqueue... >> >> I'd be tempted to stick your VincentD's conditions together, just to be >> safe... > > More than safe I would prefer that we fix the correct root cause > instead of hiding it > I did play around a bit to see if this could be true when evaluating that is_per_cpu_kthread() condition: is_idle_task(current) && in_task() && p->nr_cpus_allowed > 1 but no luck so far. An in_task() check would appear sufficient, but how's this? --- diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 884f29d07963..f45806b7f47a 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6390,14 +6390,18 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target) return prev; /* - * Allow a per-cpu kthread to stack with the wakee if the - * kworker thread and the tasks previous CPUs are the same. - * The assumption is that the wakee queued work for the - * per-cpu kthread that is now complete and the wakeup is - * essentially a sync wakeup. An obvious example of this + * Allow a per-cpu kthread to stack with the wakee if the kworker thread + * and the tasks previous CPUs are the same. The assumption is that the + * wakee queued work for the per-cpu kthread that is now complete and + * the wakeup is essentially a sync wakeup. An obvious example of this * pattern is IO completions. + * + * Ensure the wakeup is issued by the kthread itself, and don't match + * against the idle task because that could override the + * available_idle_cpu(target) check done higher up. */ - if (is_per_cpu_kthread(current) && + if (is_per_cpu_kthread(current) && !is_idle_task(current) && + in_task() && prev == smp_processor_id() && this_rq()->nr_running <= 1) { return prev;