From: Dietmar Eggemann
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot
Cc: Qais Yousef, Kajetan Puchalski, Morten Rasmussen, Vincent Donnefort, Quentin Perret, Abhijeet Dharmapurikar, linux-kernel@vger.kernel.org
Subject: [PATCH v3 0/2] sched: Consider CPU contention in frequency, EAS max util & load-balance busiest CPU selection
Date: Mon, 15 May 2023 13:57:33 +0200
Message-Id: <20230515115735.296329-1-dietmar.eggemann@arm.com>

This is the implementation of the idea to factor in CPU runnable_avg
into the CPU utilization getter functions (so called 'runnable
boosting') as a way to consider CPU contention for:

  (a) CPU frequency

  (b) EAS' max util and

  (c) 'migrate_util' type load-balance busiest CPU selection.

Tests:

for (a) and (b):

Testcase is Jankbench (all subtests, 10 iterations) on Pixel6 (Android
12) with mainline v5.18 kernel and forward ported task scheduler
patches.

Uclamp has been deactivated so that the Android Dynamic Performance
Framework (ADPF) 'CPU performance hints' feature (Userspace task
boosting via uclamp_min) does not interfere.

Max_frame_duration:
+-----------------+------------+
|     kernel      | value [ms] |
+-----------------+------------+
|      base       | 163.061513 |
|    runnable     | 161.991705 |
+-----------------+------------+

Mean_frame_duration:
+-----------------+------------+----------+
|     kernel      | value [ms] | diff [%] |
+-----------------+------------+----------+
|      base       |    18.0    |    0.0   |
|    runnable     |    12.7    |  -29.43  |
+-----------------+------------+----------+

Jank percentage (Jank deadline 16ms):
+-----------------+------------+----------+
|     kernel      | value [%]  | diff [%] |
+-----------------+------------+----------+
|      base       |     3.6    |    0.0   |
|    runnable     |     1.0    |  -68.86  |
+-----------------+------------+----------+

Power usage [mW] (total - all CPUs):
+-----------------+------------+----------+
|     kernel      | value [mW] | diff [%] |
+-----------------+------------+----------+
|      base       |    129.5   |    0.0   |
|    runnable     |    134.3   |   3.71*  |
+-----------------+------------+----------+

* Power usage went up from 129.3 (-0.15%) in v1 to 134.3 (3.71%)
  whereas all the other benchmark numbers
  stayed roughly the same. This is probably because of using
  'runnable boosting' for EAS max util now as well and tasks more
  often end up running on non-little CPUs because of that.

for (c):

Testcase is 'perf bench sched messaging' on Arm64 Ampere Altra with 160
CPUs (sched domains = {MC, DIE, NUMA}) which shows some small
improvement:

  perf stat --null --repeat 10 -- perf bench sched messaging -t -g 1 -l 2000

  0.4869 +- 0.0173 seconds time elapsed (+- 3.55%) ->
  0.4377 +- 0.0147 seconds time elapsed (+- 3.36%)

Chen Yu tested v1** with schbench, hackbench, netperf and tbench on an
Intel Sapphire Rapids with 2x56C/112T = 224 CPUs which showed no
obvious difference and some small improvements on tbench:

  https://lkml.kernel.org/r/ZFSr4Adtx1ZI8hoc@chenyu5-mobl1

** The implementation for (c) hasn't changed in v2.

v1 -> v2:

(1) Refactor CPU utilization getter functions, let cpu_util_cfs() call
    cpu_util_next() (now cpu_util()).

(2) Consider CPU contention in EAS (find_energy_efficient_cpu() ->
    eenv_pd_max_util()) next to schedutil (sugov_get_util()) as well so
    that EAS' and schedutil's views on CPU frequency selection are in
    sync.

(3) Move 'util_avg = max(util_avg, runnable_avg)' from
    cpu_boosted_util_cfs() to cpu_util_next() (now cpu_util()) so that
    EAS can use it too.

(4) Rework patch header.

(5) Add test results (JankbenchX on Pixel6 to test changes in schedutil
    and EAS) and 'perf bench sched messaging' on Arm64 Ampere Altra for
    CFS load-balance (find_busiest_queue()).

v2 -> v3:

(1) Move function header from cpu_util_cfs() to cpu_util() and add a
    paragraph about 'runnable boosting'.

(2) Create cpu_util_cfs_boost() and call it for sites which want to use
    'runnable boosting'.

(3) Use regular 'if (boost)' in cpu_util().
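
For illustration only, the 'util_avg = max(util_avg, runnable_avg)'
step and the 'if (boost)' switch mentioned in the changelog can be
sketched in standalone userspace C roughly as below. This is not the
code from the patches: struct cfs_signals, cpu_util_sketch() and the
helper names are hypothetical, and the final clamp to the CPU capacity
is an assumption that is not stated in this cover letter.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the per-CPU CFS signals. This is NOT the
 * kernel's struct cfs_rq, only the two values the sketch needs.
 */
struct cfs_signals {
	unsigned long util_avg;     /* CPU utilization signal */
	unsigned long runnable_avg; /* CPU runnable signal, higher under contention */
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Sketch of a utilization getter with optional 'runnable boosting':
 * when @boost is set, the returned utilization is raised to
 * runnable_avg if that is larger than util_avg.
 *
 * Assumption: the result is capped at @capacity (e.g. 1024).
 */
static unsigned long cpu_util_sketch(const struct cfs_signals *s,
				     unsigned long capacity, bool boost)
{
	unsigned long util = s->util_avg;

	if (boost)
		util = max_ul(util, s->runnable_avg);

	return min_ul(util, capacity);
}
```

With util_avg = 300 and runnable_avg = 700 on a CPU of capacity 1024,
the sketch returns 300 without boost and 700 with boost, which is the
kind of difference a boosting call site would see on a contended CPU.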

Dietmar Eggemann (2):
  sched/fair: Refactor CPU utilization functions
  sched/fair, cpufreq: Introduce 'runnable boosting'

 kernel/sched/cpufreq_schedutil.c |  3 +-
 kernel/sched/fair.c              | 87 ++++++++++++++++++++++++++------
 kernel/sched/sched.h             | 48 +-----------------
 3 files changed, 76 insertions(+), 62 deletions(-)

-- 
2.25.1