Subject: Re: [PATCH V2 0/3] Introduce Thermal Pressure
From: Ionela Voinescu
Date: Fri, 26 Apr 2019 15:46:08 +0100
To: Thara Gopinath, mingo@redhat.com, peterz@infradead.org, rui.zhang@intel.com
Cc: linux-kernel@vger.kernel.org, amit.kachhap@gmail.com, viresh.kumar@linaro.org, javi.merino@kernel.org, edubezval@gmail.com, daniel.lezcano@linaro.org, vincent.guittot@linaro.org, nicolas.dechesne@linaro.org, bjorn.andersson@linaro.org, dietmar.eggemann@arm.com
References: <1555443521-579-1-git-send-email-thara.gopinath@linaro.org> <5CC2F07D.1080603@linaro.org>
In-Reply-To: <5CC2F07D.1080603@linaro.org>

Hi Thara,

>>> Regarding testing, basic build, boot and sanity testing have been
>>> performed on hikey960 mainline kernel with debian file system.
>>> Further, aobench (An occlusion renderer for benchmarking real-world
>>> floating point performance), dhrystone and hackbench tests have been
>>> run with the thermal pressure algorithm. During testing, due to
>>> constraints of step wise governor in dealing with big little systems,
>>> cpu cooling was disabled on little core, the idea being that
>>> big core will heat up and cpu cooling device will throttle the
>>> frequency of the big cores thereby limiting the maximum available
>>> capacity and the scheduler will spread out tasks to little cores as well.
>>> Finally, this patch series has been boot tested on db410C running v5.1-rc4
>>> kernel.
>>>
>>
>> Did you try using IPA as well? It is better equipped to deal with
>> big-LITTLE systems and it's more probable IPA will be used for these
>> systems, where your solution will have the biggest impact as well.
>> The difference will be that you'll have both the big cluster and the
>> LITTLE cluster capped in different proportions depending on their
>> utilization and their efficiency.
>
> No. I did not use IPA simply because it was not enabled in mainline. I
> agree it is better equipped to deal with big-little systems. The idea
> to remove cpu cooling on little cluster was to mimic this in some (not
> the cleanest) manner. But I agree that IPA testing is possibly
> the next step. Any help in this regard is appreciated.
>

I see CONFIG_THERMAL_GOV_POWER_ALLOCATOR=y in the defconfig for arm64:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/configs/defconfig?h=v5.1-rc6#n413

You can enable the use of it or make it default in the defconfig.
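
For illustration only, selecting the governor at run time could look roughly like the sketch below. The helper name select_thermal_governor and the thermal_zone0 index are assumptions made for the example; available_policies and policy are the standard thermal sysfs attributes, and power_allocator is the name IPA registers under. The build-time alternative would be CONFIG_THERMAL_DEFAULT_GOV_POWER_ALLOCATOR=y.

```python
from pathlib import Path


def select_thermal_governor(zone_dir, governor="power_allocator"):
    """Switch one thermal zone to `governor` if the running kernel offers it.

    zone_dir is a thermal zone directory such as
    /sys/class/thermal/thermal_zone0 (the zone index is an assumption).
    Returns True if the governor name was written, False if it is unavailable.
    """
    zone = Path(zone_dir)
    # available_policies lists the governors built into the running kernel.
    available = (zone / "available_policies").read_text().split()
    if governor not in available:
        return False
    # Writing a governor name to `policy` switches the zone to that governor.
    (zone / "policy").write_text(governor)
    return True
```

Run as root, a call such as select_thermal_governor("/sys/class/thermal/thermal_zone0") would be expected to return False on a kernel built without the power allocator governor.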
Also, Hikey960 has the needed setup in DT:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/hisilicon/hi3660.dtsi?h=v5.1-rc6#n1093

This should work fine.

>>
>>> During the course of development various methods of capturing
>>> and reflecting thermal pressure were implemented.
>>>
>>> The first method to be evaluated was to convert the
>>> capped max frequency into capacity and have the scheduler use the
>>> instantaneous value when updating cpu_capacity.
>>> This method is referenced as "Instantaneous Thermal Pressure" in the
>>> test results below.
>>>
>>> The next two methods employ different methods of averaging the
>>> thermal pressure before applying it when updating cpu_capacity.
>>> The first of these methods re-used the PELT algorithm already present
>>> in the kernel that does the averaging of rt and dl load and utilization.
>>> This method is referenced as "Thermal Pressure Averaging using PELT fmwk"
>>> in the test results below.
>>>
>>> The final method employs an averaging algorithm that collects and
>>> decays thermal pressure based on the decay period. In this method,
>>> the decay period is configurable. This method is referenced as
>>> "Thermal Pressure Averaging non-PELT Algo. Decay : XXX ms" in the
>>> test results below.
>>>
>>> The test results below show 3-5% improvement in performance when
>>> using the third solution compared to the default system today where
>>> scheduler is unaware of cpu capacity limitations due to thermal events.
>>>
>>
>> Did you happen to record the amount of capping imposed on the big cores
>> when these results were obtained? Did you find scenarios where the
>> capacity of the bigs resulted in being lower than the capacity of the
>> LITTLEs (capacity inversion)?
>> This is one case where we'll see a big impact in considering thermal
>> pressure.
>
> I think I saw capacity inversion in some scenarios. I did not
> particularly capture them.
>

It would be good to observe this and possibly correlate the amount of
capping with resulting behavior and performance numbers. This would give
more confidence in the testing coverage.

You can create a specific testcase for capacity inversion by only capping
the big CPUs, as you've done for these tests, and by running
sysbench/dhrystone for example with at least nr_big_cpus tasks.

This assumes that the bigs fully utilized would generate enough heat and
would be capped enough to achieve a capacity lower than the littles, which
I don't doubt can be obtained on Hikey960.

>>
>> Also, given that these are more or less sustained workloads, I'm
>> wondering if there is any effect on workloads running on an uncapped
>> system following capping. I would imagine such a test being composed of a
>> single threaded period (no capping) followed by a multi-threaded period
>> (with capping), continued in a loop. It might be interesting to have
>> something like this as well, as part of your test coverage.
>
> I do not understand this. There is either capping for a workload or no
> capping. There is no sysctl entry to turn on or off capping.
>

I was thinking of this as a second-hand effect. If you have only one big
CPU, even fully utilized, with the others quiet, you might not see any
capping. But when you have a multi-threaded workload, with all or at least
the bigs at a high OPP, the platform will definitely overheat and there
will be capping.

Thanks,
Ionela.

> Regards
> Thara
>>
>>
>> Thanks,
>> Ionela.
>>
>
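
As a rough aid for the capacity inversion testcase discussed above, the check can be sketched as follows. This is only a sketch under stated assumptions: it follows the "convert the capped max frequency into capacity" idea from the cover letter by scaling the original capacity linearly with the capped frequency, the function names are made up for the example, and the numbers in the usage note are illustrative, not Hikey960 measurements.

```python
def capped_capacity(max_capacity, capped_freq_khz, max_freq_khz):
    """Scale a CPU's original capacity by the capped/max frequency ratio.

    Assumes capacity scales linearly with frequency; integer arithmetic
    is used, so the result is rounded down.
    """
    return max_capacity * capped_freq_khz // max_freq_khz


def is_capacity_inverted(big_capacity, big_capped_khz, big_max_khz,
                         little_capacity):
    """True when a capped big CPU ends up below an uncapped little CPU."""
    capped = capped_capacity(big_capacity, big_capped_khz, big_max_khz)
    return capped < little_capacity
```

With a hypothetical big CPU of capacity 1024 and a 2000000 kHz maximum, next to a little CPU of capacity 400, capping the big at 1000000 kHz leaves it at 512 (no inversion), while capping it at 700000 kHz leaves it at 358, which is below the little. Logging the capped frequency alongside the benchmark score would let the two be correlated.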