From: Dietmar Eggemann
To: linux-kernel@vger.kernel.org, Peter Zijlstra, Quentin Perret,
    Thara Gopinath
Cc: linux-pm@vger.kernel.org, Morten Rasmussen, Chris Redpath,
    Patrick Bellasi, Valentin Schneider, "Rafael J. Wysocki",
    Greg Kroah-Hartman, Vincent Guittot, Viresh Kumar, Todd Kjos,
    Joel Fernandes
Subject: [RFC PATCH 0/6] Energy Aware Scheduling
Date: Tue, 20 Mar 2018 09:43:06 +0000
Message-Id: <20180320094312.24081-1-dietmar.eggemann@arm.com>

1. Overview

The Energy Aware Scheduler (EAS) based on Morten Rasmussen's posting on
LKML [1] is currently part of the AOSP Common Kernel and runs on today's
smartphones with Arm's big.LITTLE CPUs.

Based on the experience gained over the last two and a half years in
product development, we propose an energy model based task placement for
CPUs with asymmetric core capacities (e.g. Arm big.LITTLE or DynamIQ),
to align with the EAS adopted by the AOSP Common Kernel. We have
developed a simplified energy model, based on the physical active
power/performance curve of each core type, using existing SoC
power/performance data already known to the kernel. The energy model is
used to select the most energy-efficient CPU to place each task, taking
utilization into account.

1.1 Energy Model

A CPU with asymmetric core capacities features cores with significantly
different energy and performance characteristics. As the configurations
can vary greatly from one SoC to another, designing an energy-efficient
scheduling heuristic that performs well on a broad spectrum of platforms
appears to be particularly hard.

This proposal attempts to solve this issue by providing the scheduler
with an energy model of the platform, which enables energy impact
estimation of scheduling decisions in a generic way. The energy model is
kept very simple: it represents only the active power of CPUs at all
available P-states and relies on existing data in the kernel (only used
by the thermal subsystem so far).

This proposal does not include the power consumption of C-states and
cluster-level resources which were originally introduced in [1], for two
reasons. Firstly, their impact on task placement decisions appears to be
negligible on modern asymmetric platforms. Secondly, they require
additional infrastructure and data (e.g. new DT entries).

The scheduler is also informed of the span of frequency domains, hence
enabling an accurate accounting of the energy costs of frequency changes.
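
To make the idea concrete, here is a minimal sketch (in Python, not taken
from the patches) of how a per-frequency-domain energy model could be used
to compare task placements. It assumes that all CPUs of a frequency domain
run at the lowest P-state that fits the busiest CPU, and that a CPU's
energy is its active power scaled by its busy fraction (util / capacity).
The names CapacityState, FreqDomain, domain_energy() and
find_energy_efficient_cpu() are hypothetical, and so are all the capacity
and power numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityState:
    capacity: int  # compute capacity at this P-state (made-up scale, max 1024)
    power: int     # active power at this P-state (arbitrary units)


@dataclass(frozen=True)
class FreqDomain:
    cpus: tuple    # ids of the CPUs sharing one clock
    states: tuple  # CapacityState entries, sorted by ascending capacity


def domain_energy(fd, util):
    """Estimate the active energy of one frequency domain (assumed formula)."""
    # All CPUs of the domain share a P-state, picked for the busiest CPU.
    max_util = max(util[cpu] for cpu in fd.cpus)
    state = next((s for s in fd.states if s.capacity >= max_util), fd.states[-1])
    # Each CPU is busy for util/capacity of the time at that P-state.
    return sum(state.power * util[cpu] / state.capacity for cpu in fd.cpus)


def system_energy(domains, util):
    """Sum the estimated energy over all frequency domains."""
    return sum(domain_energy(fd, util) for fd in domains)


def find_energy_efficient_cpu(domains, util, task_util):
    """Return (cpu, energy) for the placement with the lowest estimate."""
    best_cpu, best_energy = None, float("inf")
    for fd in domains:
        for cpu in fd.cpus:
            if util[cpu] + task_util > fd.states[-1].capacity:
                continue  # the task would not fit on this CPU
            trial = dict(util)
            trial[cpu] += task_util
            energy = system_energy(domains, trial)
            if energy < best_energy:
                best_cpu, best_energy = cpu, energy
    return best_cpu, best_energy


# Hypothetical 2 little + 2 big platform; every number here is made up.
LITTLE = FreqDomain(cpus=(0, 1),
                    states=(CapacityState(200, 50), CapacityState(400, 150)))
BIG = FreqDomain(cpus=(2, 3),
                 states=(CapacityState(512, 400), CapacityState(1024, 1600)))
DOMAINS = (LITTLE, BIG)

if __name__ == "__main__":
    util = {0: 150, 1: 20, 2: 0, 3: 0}
    print(find_energy_efficient_cpu(DOMAINS, util, 80))  # prints (1, 62.5)
```

In this made-up example, placing the task on CPU0 would push the little
frequency domain to its higher P-state and raise the cost of both little
CPUs, so the estimate favours CPU1. This is the kind of effect that can
only be accounted for when the span of each frequency domain is known.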
Knowing the frequency domain spans appears to be especially important
for future Arm CPU topologies (DynamIQ), where the span of scheduling
domains can differ from the span of frequency domains.

1.2 Overutilization/Tipping Point

The primary job of the task scheduler is to deliver the highest possible
throughput with minimal latency. As utilization increases, the scheduler
has fewer opportunities to save energy. There must be spare CPU time
available to place tasks based on utilization in an energy-aware
fashion, i.e. to pack tasks on energy-efficient CPUs without
unnecessarily constraining the task throughput. This spare CPU time
decreases towards zero as the utilization of the system rises.

To cope with this situation, we introduce the concept of overutilization
in order to enable/disable EAS depending on system utilization.

The point at which a system switches from being not overutilized to
being overutilized, or vice versa, is called the tipping point. A per
sched domain tipping point indicator implementation is introduced here.

1.3 Wakeup path

On a system which has an energy model, the energy-aware wakeup path
takes precedence over the affine and capacity-based wakeup paths as long
as the lowest sched domain of the task's previous CPU is not
overutilized. The energy-aware algorithm tries to find a new target CPU
among the CPUs of the highest non-overutilized domain which includes the
previous and the current CPU, such that placing the task there adds the
least to the estimated energy consumption. The energy model is only
enabled on CPUs with asymmetric core capacities (SD_ASYM_CPUCAPACITY).
These systems typically have 8 cores or fewer.

2. Tests

Two fundamentally different tests were executed. Firstly, the energy
test case shows the impact this patch-set has on energy consumption,
using a synthetic set of tasks. Secondly, the performance test case
provides the conventional hackbench metric numbers.

The tests run on two arm64 big.LITTLE platforms: Hikey960 (4xA73 +
4xA53) and Juno r0 (2xA57 + 4xA53).
Base kernel is tip/sched/core (4.16-rc4), with some Hikey960 and Juno
specific patches, the SD_ASYM_CPUCAPACITY flag set at DIE sched domain
level for arm64 and schedutil as cpufreq governor [2].

2.1 Energy test case

10 iterations of between 10 and 50 periodic rt-app tasks (16ms period,
5% duty-cycle) for 30 seconds with energy measurement. Unit is Joules.
The goal is to save energy, so lower is better.

2.1.1 Hikey960

Energy is measured with an ACME Cape on an instrumented board. Numbers
include consumption of big and little CPUs, LPDDR memory, GPU and most
of the other small components on the board. They do not include
consumption of the radio chip (turned off anyway) and external
connectors.

+----------+-----------------+------------------------+
|          | Without patches | With patches           |
+----------+---------+-------+-----------------+------+
| Tasks nb |    Mean |  RSD* |            Mean | RSD* |
+----------+---------+-------+-----------------+------+
|       10 |   41.50 |  1.1% |  37.43 (-9.81%) | 2.0% |
|       20 |   55.51 |  0.7% |  50.74 (-8.59%) | 1.5% |
|       30 |   75.39 |  0.4% |  70.36 (-6.67%) | 7.3% |
|       40 |   95.82 |  0.3% |  89.90 (-6.18%) | 1.5% |
|       50 |  121.53 |  0.9% | 112.61 (-7.34%) | 0.9% |
+----------+---------+-------+-----------------+------+

2.1.2 Juno r0

Energy is measured with the onboard energy meter. Numbers include
consumption of big and little CPUs.

+----------+-----------------+------------------------+
|          | Without patches | With patches           |
+----------+--------+--------+-----------------+------+
| Tasks nb |   Mean |   RSD* |            Mean | RSD* |
+----------+--------+--------+-----------------+------+
|       10 |  11.52 |   1.1% |  7.67 (-33.42%) | 2.8% |
|       20 |  19.25 |   0.9% | 13.39 (-30.44%) | 1.8% |
|       30 |  28.73 |   1.3% | 21.85 (-31.49%) | 0.6% |
|       40 |  37.58 |   0.9% | 31.40 (-16.44%) | 0.4% |
|       50 |  47.24 |   0.6% | 45.37 ( -3.96%) | 0.6% |
+----------+--------+--------+-----------------+------+

2.2 Performance test case

30 iterations of perf bench sched messaging --pipe --thread --group G
--loop L with G=[1 2 4 8] and L=50000 (Hikey960)/16000 (Juno r0). The
metric is the total run time in seconds, so lower is better.

2.2.1 Hikey960

The impact of thermal capping was mitigated thanks to a heatsink, a fan,
and a 10 sec delay between two successive executions.

+----------------+-----------------+------------------------+
|                | Without patches | With patches           |
+--------+-------+---------+-------+----------------+-------+
| Groups | Tasks |    Mean |  RSD* |           Mean |  RSD* |
+--------+-------+---------+-------+----------------+-------+
|      1 |    40 |    8.01 | 1.70% |  8.16 (+1.90%) | 1.79% |
|      2 |    80 |   15.59 | 0.76% | 15.79 (+1.33%) | 0.92% |
|      4 |   160 |   32.23 | 0.70% | 32.46 (+0.72%) | 0.55% |
|      8 |   320 |   66.93 | 0.46% | 67.40 (+0.69%) | 0.37% |
+--------+-------+---------+-------+----------------+-------+

2.2.2 Juno r0

+----------------+-----------------+------------------------+
|                | Without patches | With patches           |
+--------+-------+---------+-------+----------------+-------+
| Groups | Tasks |    Mean |  RSD* |           Mean |  RSD* |
+--------+-------+---------+-------+----------------+-------+
|      1 |    40 |    8.37 | 0.12% |  8.33 ( 0%)    | 0.08% |
|      2 |    80 |   14.63 | 0.12% | 14.49 (-1%)    | 0.14% |
|      4 |   160 |   27.17 | 0.14% | 26.80 (-1%)    | 0.14% |
|      8 |   320 |   52.50 | 0.25% | 51.54 (-2%)    | 0.23% |
+--------+-------+---------+-------+----------------+-------+

*RSD: Relative Standard Deviation (std dev / mean)

3. Dependencies

This series depends on additional infrastructure being merged in the OPP
core. As this infrastructure can also be useful for other clients, the
related patches have been posted separately [3].

[1] https://lkml.org/lkml/2015/7/7/754
[2] http://www.linux-arm.org/git?p=linux-de.git;a=shortlog;h=refs/heads/upstream/eas_v1_base
[3] https://marc.info/?l=linux-pm&m=151635516419249&w=2

Dietmar Eggemann (1):
  sched/fair: Create util_fits_capacity()

Quentin Perret (4):
  sched: Introduce energy models of CPUs
  sched/fair: Introduce an energy estimation helper function
  sched/fair: Select an energy-efficient CPU on task wake-up
  drivers: base: arch_topology.c: Enable EAS for arm/arm64 platforms

Thara Gopinath (1):
  sched: Add over-utilization/tipping point indicator

 drivers/base/arch_topology.c   |   2 +
 include/linux/sched/energy.h   |  31 ++++++
 include/linux/sched/topology.h |   1 +
 kernel/sched/Makefile          |   2 +-
 kernel/sched/energy.c          | 190 ++++++++++++++++++++++++++++++++++
 kernel/sched/fair.c            | 226 +++++++++++++++++++++++++++++++++++++++--
 kernel/sched/sched.h           |   1 +
 kernel/sched/topology.c        |  12 +--
 8 files changed, 449 insertions(+), 16 deletions(-)
 create mode 100644 include/linux/sched/energy.h
 create mode 100644 kernel/sched/energy.c

-- 
2.11.0