From: Dietmar Eggemann
To: Ingo Molnar, Peter Zijlstra, Vincent Guittot
Cc: Qais Yousef, Kajetan Puchalski, Morten Rasmussen, Vincent Donnefort,
    Quentin Perret, Abhijeet Dharmapurikar, linux-kernel@vger.kernel.org
Subject: [PATCH v2 0/2] sched: Consider CPU contention in frequency, EAS max util & load-balance busiest CPU selection
Date: Fri, 12 May 2023 12:10:27 +0200
Message-Id: <20230512101029.342823-1-dietmar.eggemann@arm.com>

This is the implementation of the idea to factor in CPU runnable_avg
into the CPU utilization getter functions (so called 'runnable
boosting') as a way to consider CPU contention for:

  (a) CPU frequency
  (b) EAS' max util and
  (c) 'migrate_util' type load-balance busiest CPU selection.

Tests:

for (a) and (b):

Testcase is Jankbench (all subtests, 10 iterations) on Pixel6 (Android
12) with mainline v5.18 kernel and forward ported task scheduler
patches.

Uclamp has been deactivated so that the Android Dynamic Performance
Framework (ADPF) 'CPU performance hints' feature (Userspace task
boosting via uclamp_min) does not interfere.

Max_frame_duration:
+-----------------+------------+
|     kernel      | value [ms] |
+-----------------+------------+
|      base       | 163.061513 |
|    runnable     | 161.991705 |
+-----------------+------------+

Mean_frame_duration:
+-----------------+------------+----------+
|     kernel      | value [ms] | diff [%] |
+-----------------+------------+----------+
|      base       |    18.0    |    0.0   |
|    runnable     |    12.7    |  -29.43  |
+-----------------+------------+----------+

Jank percentage (Jank deadline 16ms):
+-----------------+------------+----------+
|     kernel      | value [%]  | diff [%] |
+-----------------+------------+----------+
|      base       |     3.6    |    0.0   |
|    runnable     |     1.0    |  -68.86  |
+-----------------+------------+----------+

Power usage [mW] (total - all CPUs):
+-----------------+------------+----------+
|     kernel      | value [mW] | diff [%] |
+-----------------+------------+----------+
|      base       |    129.5   |    0.0   |
|    runnable     |    134.3   |   3.71*  |
+-----------------+------------+----------+

* Power usage went up from 129.3 (-0.15%) in v1 to 134.3 (3.71%)
  whereas all the other benchmark numbers stayed roughly the same. This
  is probably because of using 'runnable boosting' for EAS max util now
  as well and tasks more often end up running on non-little CPUs
  because of that.

for (c):

Testcase is 'perf bench sched messaging' on Arm64 Ampere Altra with 160
CPUs (sched domains = {MC, DIE, NUMA}) which shows some small
improvement:

  perf stat --null --repeat 10 -- perf bench sched messaging -t -g 1 -l 2000

  0.4869 +- 0.0173 seconds time elapsed (+- 3.55%) ->
  0.4377 +- 0.0147 seconds time elapsed (+- 3.36%)

Chen Yu tested v1** with schbench, hackbench, netperf and tbench on an
Intel Sapphire Rapids with 2x56C/112T = 224 CPUs which showed no
obvious difference and some small improvements on tbench:

  https://lkml.kernel.org/r/ZFSr4Adtx1ZI8hoc@chenyu5-mobl1

** The implementation for (c) hasn't changed in v2.

v1 -> v2:

(1) Refactor CPU utilization getter functions, let cpu_util_cfs() call
    cpu_util_next() (now cpu_util()).

(2) Consider CPU contention in EAS (find_energy_efficient_cpu() ->
    eenv_pd_max_util()) next to schedutil (sugov_get_util()) as well so
    that EAS' and schedutil's views on CPU frequency selection are in
    sync.

(3) Move 'util_avg = max(util_avg, runnable_avg)' from
    cpu_boosted_util_cfs() to cpu_util_next() (now cpu_util()) so that
    EAS can use it too.

(4) Rework patch header.

(5) Add test results (JankbenchX on Pixel6 to test changes in schedutil
    and EAS) and 'perf bench sched messaging' on Arm64 Ampere Altra for
    CFS load-balance (find_busiest_queue()).

Dietmar Eggemann (2):
  sched/fair: Refactor CPU utilization functions
  sched/fair, cpufreq: Introduce 'runnable boosting'

 kernel/sched/core.c              |  2 +-
 kernel/sched/cpufreq_schedutil.c |  3 +-
 kernel/sched/fair.c              | 72 +++++++++++++++++++++++++-------
 kernel/sched/sched.h             | 49 +---------------------
 4 files changed, 63 insertions(+), 63 deletions(-)

-- 
2.25.1
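
For readers who want the core of 'runnable boosting' in isolation, the
expression quoted in changelog item (3), 'util_avg = max(util_avg,
runnable_avg)', can be sketched as a small userspace function. This is
only an illustration, not the code from the patches: struct cfs_signals,
cpu_util_sketch() and the 'boost' flag are hypothetical names, and the
final clamp to CPU capacity is an assumption that the cover letter does
not state.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two per-CPU CFS signals named in the
 * cover letter. This is not the kernel's data structure.
 */
struct cfs_signals {
	unsigned long util_avg;     /* time spent running */
	unsigned long runnable_avg; /* time spent running or waiting to run */
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Sketch of a utilization getter with optional runnable boosting.
 * With boost set, contention (runnable_avg above util_avg) raises the
 * reported utilization. Clamping to 'capacity' is an assumption made
 * for this example.
 */
unsigned long cpu_util_sketch(const struct cfs_signals *s,
			      unsigned long capacity, bool boost)
{
	unsigned long util = s->util_avg;

	if (boost)
		util = max_ul(util, s->runnable_avg);

	return min_ul(util, capacity);
}
```

With util_avg = 300 and runnable_avg = 700 on a CPU of capacity 1024,
the sketch returns 300 without boosting and 700 with it, so a caller
that passes boost would see the contended CPU as busier.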