From: Benjamin Segall
To: Christian Borntraeger
Cc: Mel Gorman, peterz@infradead.org, bristot@redhat.com, dietmar.eggemann@arm.com, joshdon@google.com, juri.lelli@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, linux@rasmusvillemoes.dk, mgorman@suse.de, mingo@kernel.org, rostedt@goodmis.org, valentin.schneider@arm.com, vincent.guittot@linaro.org
Subject: Re: [PATCH 1/1] sched/fair: improve yield_to vs fairness
References: <20210707123402.13999-1-borntraeger@de.ibm.com> <20210707123402.13999-2-borntraeger@de.ibm.com> <20210723093523.GX3809@techsingularity.net> <20210723162137.GY3809@techsingularity.net> <1acd7520-bd4b-d43d-302a-8dcacf6defa5@de.ibm.com>
Date: Tue, 27 Jul 2021 11:57:13 -0700
In-Reply-To: <1acd7520-bd4b-d43d-302a-8dcacf6defa5@de.ibm.com> (Christian Borntraeger's message of "Mon, 26 Jul 2021 20:41:15 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Mailing-List: linux-kernel@vger.kernel.org

Christian Borntraeger writes:

> On 23.07.21 18:21, Mel Gorman wrote:
>> On Fri, Jul 23, 2021 at 02:36:21PM +0200, Christian Borntraeger wrote:
>>>> sched: Do not select highest priority task to run if it should be skipped
>>>>
>>>> index 44c452072a1b..ddc0212d520f 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -4522,7 +4522,8 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>>>>  		se = second;
>>>>  	}
>>>>
>>>> -	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1) {
>>>> +	if (cfs_rq->next &&
>>>> +	    (cfs_rq->skip == left || wakeup_preempt_entity(cfs_rq->next, left) < 1)) {
>>>>  		/*
>>>>  		 * Someone really wants this to run. If it's not unfair, run it.
>>>>  		 */
>>>>
>>>
>>> I do see a reduction in ignored yields, but from a performance aspect for my
>>> testcases this patch does not provide a benefit, while the simple
>>>
>>>	curr->vruntime += sysctl_sched_min_granularity;
>>>
>>> does.
>>
>> I'm still not a fan because vruntime gets distorted. From the docs:
>>
>>   Small detail: on "ideal" hardware, at any time all tasks would have the same
>>   p->se.vruntime value --- i.e., tasks would execute simultaneously and no task
>>   would ever get "out of balance" from the "ideal" share of CPU time
>>
>> If yield_to impacts this "ideal share" then it could have other
>> consequences.
>>
>> I think your patch may be performing better in your test case because every
>> "wrong" task selected that is not the yield_to target gets penalised and
>> so the yield_to target gets pushed up the list.
>>
>>> I still think that your approach is probably the cleaner one, any chance to
>>> improve this somehow?
>>
>> Potentially. The patch was a bit off because while it noticed that skip
>> was not being obeyed, the fix was clumsy and isolated. The current flow is
>>
>>   1. Pick se == left as the candidate
>>   2. Try to pick a different se if the "ideal" candidate is a skip candidate
>>   3. Ignore the se update if next or last are set
>>
>> Step 3 looks off because it ignores skip if next or last buddies are set
>> and I don't think that was intended. Can you try this?
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 44c452072a1b..d56f7772a607 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4522,12 +4522,12 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>>  		se = second;
>>  	}
>>
>> -	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1) {
>> +	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, se) < 1) {
>>  		/*
>>  		 * Someone really wants this to run. If it's not unfair, run it.
>>  		 */
>>  		se = cfs_rq->next;
>> -	} else if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, left) < 1) {
>> +	} else if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, se) < 1) {
>>  		/*
>>  		 * Prefer last buddy, try to return the CPU to a preempted task.
>>  		 */
>
> This one alone does not seem to make a difference, neither in ignored yields
> nor in performance.
>
> Your first patch does really help in terms of ignored yields when
> all threads are pinned to one host CPU. After that we have no ignored yields,
> it seems. But it does not affect the performance of my testcase.
>
> I did some more experiments and I removed the wakeup_preempt_entity checks in
> pick_next_entity - assuming that this will result in the source always being
> stopped and the target always being picked. But still, no performance
> difference.
>
> As soon as I play with vruntime I do see a difference (but only without the
> cpu cgroup controller). I will try to better understand the scheduler logic
> and do some more testing. If you have anything that I should test, let me
> know.
>
> Christian

If both yielder and target are in the same cpu cgroup, or the cpu cgroup is
disabled (i.e., if cfs_rq_of(p->se) matches), you could try

	if (p->se.vruntime > rq->curr->se.vruntime)
		swap(p->se.vruntime, rq->curr->se.vruntime);

as well as setting the existing buddy flags, as an entirely fair vruntime
boost to the target.

For when they aren't direct siblings, you /could/ use find_matching_se(), but
it's much less clear that's desirable, since it would yield vruntime for the
entire hierarchy to the target's hierarchy.