Message-ID: <76b9b12d-b5a3-8990-f7ab-1a49f55aac19@linux.alibaba.com>
Date: Wed, 1 Jun 2022 20:02:13 +0800
From: Tianchen Ding
To: Valentin Schneider
Cc: Ingo Molnar, Mel Gorman, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Daniel Bristot de Oliveira,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] sched: Queue task on wakelist in the same llc if the wakee cpu is idle
References: <20220527090544.527411-1-dtcccc@linux.alibaba.com>
    <1d0eb8f4-e474-86a9-751a-7c2e1788df85@linux.alibaba.com>
    <20220531135532.GA3332@suse.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/6/1 18:58, Valentin Schneider wrote:
> On 01/06/22 13:54, Tianchen Ding wrote:
>> On 2022/5/31 23:56, Valentin Schneider wrote:
>>
>>> Thanks!
>>>
>>> So I'm thinking we could first make that into
>>>
>>>    if ((wake_flags & WF_ON_CPU) && !cpu_rq(cpu)->nr_running)
>>>
>>> Then building on this, we can generalize using the wakelist to any remote
>>> idle CPU (which on paper isn't as much as a clear win as just WF_ON_CPU,
>>> depending on how deeply idle the CPU is...)
>>>
>>> We need the cpu != this_cpu check, as that's currently served by the
>>> WF_ON_CPU check (AFAIU we can only observe p->on_cpu in there for remote
>>> tasks).
>>>
>>> ---
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 66c4e5922fe1..60038743f2f1 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -3830,13 +3830,20 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
>>>          if (!cpus_share_cache(smp_processor_id(), cpu))
>>>                  return true;
>>>
>>> +        if (cpu == smp_processor_id())
>>> +                return false;
>>> +
>>>          /*
>>>           * If the task is descheduling and the only running task on the
>>>           * CPU then use the wakelist to offload the task activation to
>>>           * the soon-to-be-idle CPU as the current CPU is likely busy.
>>>           * nr_running is checked to avoid unnecessary task stacking.
>>> +         *
>>> +         * Note that we can only get here with (wakee) p->on_rq=0,
>>> +         * p->on_cpu can be whatever, we've done the dequeue, so
>>> +         * the wakee has been accounted out of ->nr_running
>>>           */
>>> -        if ((wake_flags & WF_ON_CPU) && cpu_rq(cpu)->nr_running <= 1)
>>> +        if (!cpu_rq(cpu)->nr_running)
>>>                  return true;
>>>
>>>          return false;
>>
>> Hi Valentin. I've done a simple unixbench test (Pipe-based Context
>> Switching) on my x86 machine with full threads (104).
>>
>>           old        patch1             patch1+patch2
>> score     7825.4     7500(more)-8000    9061.6
>>
>> patch1: use !cpu_rq(cpu)->nr_running instead of cpu_rq(cpu)->nr_running <= 1
>> patch2: ignore WF_ON_CPU check
>>
>> The score of patch1 is not stable. I've tested for many times and the
>> score is floating between about 7500-8000 (more at 7500).
>>
>> patch1 means more strict limit on using wakelist. But it may cause
>> performance regression.
>>
>> It seems that, using wakelist properly can help improve wakeup
>> performance, but using it too much may cause more IPIs. It's a trade-off
>> about how strict the ttwu_queue_cond() is.
>>
>> Anyhow, I think patch2 should be a pure improvement. What's your idea?
>
> Thanks for separately testing these two.
>
> I take it the results for patch1 are noticeably more swingy than the
> baseline?
> (FWIW boxplots are usually a nice way to summarize those sort of
> results).
>

Hmm... I'm not familiar with this... I want to say that I'm not sure
about the performance impact of patch1. From the view of logic, though,
patch1 should be correct.

> WF_ON_CPU && nr_running == 1 means the wakee is scheduling out *and* there
> is another task queued, I'm guessing that's relatively common in your
> unixbench scenario...
>
> Either way, I think we want to keep the two changes separate for the sake
> of testing and bisecting.

Yes. I'll split the patch into 2 parts. One for the logic fix and another
for the performance improvement.
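
For reference, the three variants compared in the table can be modelled in
plain C. This is only a user-space sketch, not kernel code. struct wake_ctx,
the cond_*() names and the WF_ON_CPU value are hypothetical stand-ins for
cpu_rq(cpu), smp_processor_id(), cpus_share_cache() and the real flag, and
only the checks visible in the quoted hunk are modelled; anything
ttwu_queue_cond() does before that hunk is left out.

```c
#include <stdbool.h>

/* Illustrative value only; the real flag lives in the kernel's sched headers. */
#define WF_ON_CPU 0x1

/* Hypothetical snapshot of the state the quoted hunk looks at. */
struct wake_ctx {
        int this_cpu;            /* stands in for smp_processor_id() */
        int wakee_cpu;           /* the "cpu" argument */
        bool share_cache;        /* stands in for cpus_share_cache() */
        unsigned int nr_running; /* stands in for cpu_rq(cpu)->nr_running */
        int wake_flags;
};

/* old: wakee is descheduling and at most one task is accounted on the CPU. */
static inline bool cond_old(const struct wake_ctx *c)
{
        if (!c->share_cache)
                return true;
        return (c->wake_flags & WF_ON_CPU) && c->nr_running <= 1;
}

/* patch1: same as old, but the target CPU must have nothing running. */
static inline bool cond_patch1(const struct wake_ctx *c)
{
        if (!c->share_cache)
                return true;
        return (c->wake_flags & WF_ON_CPU) && c->nr_running == 0;
}

/*
 * patch1+patch2: drop the WF_ON_CPU test, so any remote idle CPU in the
 * same LLC qualifies. The explicit this_cpu check replaces what the
 * WF_ON_CPU test used to guarantee implicitly.
 */
static inline bool cond_patch1_patch2(const struct wake_ctx *c)
{
        if (!c->share_cache)
                return true;
        if (c->wakee_cpu == c->this_cpu)
                return false;
        return c->nr_running == 0;
}
```

With WF_ON_CPU set and nr_running == 1, only cond_old() returns true, which
is the case patch1 stops sending to the wakelist. With a remote idle CPU and
WF_ON_CPU clear, only cond_patch1_patch2() returns true, which is the case
patch2 adds.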