From: Peter Puhov
Date: Wed, 17 Jun 2020 10:52:58 -0400
Subject: Re: [PATCH] sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal
To: Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Robert Foley, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Peter Puhov
References: <20200616164801.18644-1-peter.puhov@linaro.org>

On Wed, 17 Jun 2020 at 06:50, Valentin Schneider wrote:
>
>
> On 16/06/20 17:48, peter.puhov@linaro.org wrote:
> > From: Peter Puhov
> >
> > We tested this patch with the following benchmarks:
> >   perf bench -f simple sched pipe -l 4000000
> >   perf bench -f simple sched messaging -l 30000
> >   perf bench -f simple mem memset -s 3GB -l 15 -f default
> >   perf bench -f simple futex wake -s -t 640 -w 1
> >   sysbench cpu --threads=8 --cpu-max-prime=10000 run
> >   sysbench memory --memory-access-mode=rnd --threads=8 run
> >   sysbench threads --threads=8 run
> >   sysbench mutex --mutex-num=1 --threads=8 run
> >   hackbench --loops 20000
> >   hackbench --pipe --threads --loops 20000
> >   hackbench --pipe --threads --loops 20000 --datasize 4096
> >
> > and found some performance improvements in:
> >   sysbench threads
> >   sysbench mutex
> >   perf bench futex wake
> > and no regressions in the others.
> >
>
> One nitpick for the results of those: condensing them into a table would
> make them more reader-friendly. Perhaps something like:
>
> | Benchmark        | Metric   | Lower is better? | BASELINE | SERIES | DELTA |
> |------------------+----------+------------------+----------+--------+-------|
> | Sysbench threads | # events | No               |    45526 |  56567 |  +24% |
> | Sysbench mutex   | ...      |                  |          |        |       |
>
> If you want to include more stats for each benchmark, you could have one
> table per benchmark (e.g. see [1]) - it'd still be a more readable form
> (or so I believe).
>
> [1]: https://lore.kernel.org/lkml/20200206191957.12325-1-valentin.schneider@arm.com/
>

Good point. I will reformat the test results.
> > ---
> >  kernel/sched/fair.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 02f323b85b6d..abcbdf80ee75 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8662,8 +8662,14 @@ static bool update_pick_idlest(struct sched_group *idlest,
> >
> >       case group_has_spare:
> >               /* Select group with most idle CPUs */
> > -             if (idlest_sgs->idle_cpus >= sgs->idle_cpus)
> > +             if (idlest_sgs->idle_cpus > sgs->idle_cpus)
> >                       return false;
> > +
> > +             /* Select group with lowest group_util */
> > +             if (idlest_sgs->idle_cpus == sgs->idle_cpus &&
> > +                 idlest_sgs->group_util <= sgs->group_util)
> > +                     return false;
> > +
> >               break;
> >       }
>
> update_sd_pick_busiest() uses the group's nr_running instead. You mention
> in the changelog that using nr_running is a possible alternative; did you
> try benchmarking that and seeing how it compares to using group_util?
>
> I think it would be nice to keep pick_busiest() and pick_idlest() aligned
> wherever possible/sensible.
>

I agree with you.

> Also, there can be cases where one group has a few "big" tasks and another
> has a handful more "small" tasks. Say something like:
>
>   sgs_a->group_util = U
>   sgs_a->sum_nr_running = N
>
>   sgs_b->group_util = U*4/3
>   sgs_b->sum_nr_running = N*2/3
>
> (sgs_b has more util per task, i.e. bigger tasks on average)
>
> Given that we're in the 'group_has_spare' case, I would think picking the
> group with fewer running tasks would make sense. Though I guess you can
> find pathological cases where the util-per-task difference is huge and we
> should look at util first...

I will re-run the tests with the tie-break based on sum_nr_running and post
the results.

Thank you for the suggestions.
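
To make the alternative concrete, below is a minimal, compilable sketch of
how the tie-break would read with sum_nr_running in place of group_util,
mirroring the key that update_sd_pick_busiest() uses. This is a standalone
illustration, not the actual kernel code: struct sg_lb_stats is reduced to
the two fields under discussion, and the helper name pick_idlest_has_spare()
is invented for the example.

    #include <stdbool.h>
    #include <stdio.h>

    /*
     * Reduced stand-in for the kernel's struct sg_lb_stats: only the
     * two fields the group_has_spare tie-break looks at.
     */
    struct sg_lb_stats {
            unsigned int idle_cpus;
            unsigned int sum_nr_running;
    };

    /*
     * Returns true when @sgs should replace @idlest_sgs as the idlest
     * candidate.  Primary key: more idle CPUs.  Tie-break: fewer
     * running tasks, i.e. the inverse of update_sd_pick_busiest().
     */
    static bool pick_idlest_has_spare(const struct sg_lb_stats *idlest_sgs,
                                      const struct sg_lb_stats *sgs)
    {
            /* Select group with most idle CPUs */
            if (idlest_sgs->idle_cpus > sgs->idle_cpus)
                    return false;

            /* On equal idle_cpus, select the group running fewer tasks */
            if (idlest_sgs->idle_cpus == sgs->idle_cpus &&
                idlest_sgs->sum_nr_running <= sgs->sum_nr_running)
                    return false;

            return true;
    }

    int main(void)
    {
            /* Same idle_cpus; b runs fewer (on average bigger) tasks */
            struct sg_lb_stats a = { .idle_cpus = 2, .sum_nr_running = 6 };
            struct sg_lb_stats b = { .idle_cpus = 2, .sum_nr_running = 4 };

            /* With a as the current candidate, b wins on this key */
            printf("replace a with b? %s\n",
                   pick_idlest_has_spare(&a, &b) ? "yes" : "no");
            return 0;
    }

Note how this key would prefer sgs_b from the quoted example (fewer but
bigger tasks) even though its group_util is higher, which is exactly the
trade-off between the two keys discussed above.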