Message-ID: <1d0eb8f4-e474-86a9-751a-7c2e1788df85@linux.alibaba.com>
Date: Tue, 31 May 2022 15:20:40 +0800
Subject: Re: [PATCH v2] sched: Queue task on wakelist in the same llc if the wakee cpu is idle
From: Tianchen Ding
To: Valentin Schneider
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel@vger.kernel.org
References: <20220527090544.527411-1-dtcccc@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/5/31 00:24, Valentin Schneider wrote:
> On 27/05/22 17:05, Tianchen Ding wrote:
>> The main idea of wakelist is to avoid cache bouncing. However,
>> commit 518cd6234178 ("sched: Only queue remote wakeups when
>> crossing cache boundaries") disabled queuing tasks on wakelist when
>> the cpus share llc. This is because, at that time, the scheduler must
>> send IPIs to do ttwu_queue_wakelist. Nowadays, ttwu_queue_wakelist also
>> supports TIF_POLLING, so this is not a problem now when the wakee cpu is
>> in idle polling.
>
> [...]
>
>> Our patch has improvement on schbench, hackbench
>> and Pipe-based Context Switching of unixbench
>> when there exists idle cpus,
>> and no obvious regression on other tests of unixbench.
>> This can help improve rt in scenes where wakeup happens frequently.
>>
>> Signed-off-by: Tianchen Ding
>
> This feels a bit like a generalization of
>
>   2ebb17717550 ("sched/core: Offload wakee task activation if it the wakee is descheduling")
>
> Given rq->curr is updated before prev->on_cpu is cleared, the waker
> executing ttwu_queue_cond() can observe:
>
>   p->on_rq=0
>   p->on_cpu=1
>   rq->curr=swapper/x (aka idle task)
>
> So your addition of available_idle_cpu() in ttwu_queue_cond() (sort of)
> matches that when invoked via:
>
>         if (smp_load_acquire(&p->on_cpu) &&
>             ttwu_queue_wakelist(p, task_cpu(p), wake_flags | WF_ON_CPU))
>                 goto unlock;
>
> but it also affects
>
>         ttwu_queue(p, cpu, wake_flags);
>
> at the tail end of try_to_wake_up().

Yes. That tail-end path is the one we mainly want to affect. The
WF_ON_CPU case above is not our focus.

>
> With all that in mind, I'm curious whether your patch is functionally close
> to the below.
>
> ---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 66c4e5922fe1..ffd43264722a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3836,7 +3836,7 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
>          * the soon-to-be-idle CPU as the current CPU is likely busy.
>          * nr_running is checked to avoid unnecessary task stacking.
>          */
> -       if ((wake_flags & WF_ON_CPU) && cpu_rq(cpu)->nr_running <= 1)
> +       if (cpu_rq(cpu)->nr_running <= 1)
>                 return true;
>
>         return false;

It's a little different. Your version may send extra IPIs when
nr_running == 1 and the task currently running on the wakee cpu is not
the task being woken (i.e., rq->curr == another_task && rq->curr != p).
That another_task would then be interrupted by an IPI it should not
receive. So IMO the guarantee provided by WF_ON_CPU is necessary.
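
To make the difference concrete, here is a rough userspace sketch of the three conditions being compared. It is only a model, not kernel code: struct rq_model, MODEL_WF_ON_CPU and the cond_*() helpers are hypothetical names made up for illustration, available_idle_cpu() is reduced to a single boolean, and only the final check of ttwu_queue_cond() is modelled (the earlier checks in that function are left out).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's WF_ON_CPU flag; the value is arbitrary. */
#define MODEL_WF_ON_CPU 0x1

/* Hypothetical, heavily simplified snapshot of the wakee CPU's runqueue. */
struct rq_model {
	int  nr_running;   /* runnable tasks on the wakee CPU */
	bool curr_is_idle; /* rq->curr is the idle task (swapper/x) */
};

/* Existing tail check: queue only if the caller passed WF_ON_CPU. */
static bool cond_with_flag(const struct rq_model *rq, int wake_flags)
{
	return (wake_flags & MODEL_WF_ON_CPU) && rq->nr_running <= 1;
}

/* The alternative from the diff above: WF_ON_CPU is no longer required. */
static bool cond_without_flag(const struct rq_model *rq, int wake_flags)
{
	(void)wake_flags; /* deliberately ignored in this variant */
	return rq->nr_running <= 1;
}

/*
 * One possible reading of the idea in our patch (an assumption, not the
 * patch itself): an idle wakee CPU always qualifies, otherwise fall back
 * to the existing WF_ON_CPU check.
 */
static bool cond_idle_or_flag(const struct rq_model *rq, int wake_flags)
{
	if (rq->curr_is_idle)
		return true;
	return cond_with_flag(rq, wake_flags);
}
```

Under this model, take a wakee cpu with nr_running == 1 that is running another_task, and a wakeup without WF_ON_CPU: cond_without_flag() returns true, so the wakeup is queued and another_task gets the IPI, while cond_with_flag() and cond_idle_or_flag() both return false. For an idle wakee cpu (nr_running == 0, rq->curr is the idle task), cond_idle_or_flag() and cond_without_flag() both return true.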