Date: Tue, 23 Mar 2021 01:56:27 +0000
From: Andre Przywara
To: Samuel Holland
Cc: Maxime Ripard, Chen-Yu Tsai, Jernej Skrabec, devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-sunxi@lists.linux.dev, linux-sunxi@googlegroups.com
Subject: Re: [RFC PATCH] arm64: dts: allwinner: a64/h5: Add CPU idle states
Message-ID: <20210323015627.08f9afd6@slackpad.fritz.box>
In-Reply-To: <20210322062514.40747-1-samuel@sholland.org>
References: <20210322062514.40747-1-samuel@sholland.org>
Organization: Arm Ltd.
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 22 Mar 2021 01:25:14 -0500
Samuel Holland wrote:

Hi,

> Powering off idle CPUs saves about 33 mW compared to using WFI only.
> Additional power savings are possible by idling the L2 and downclocking
> the cluster when all CPUs are idle.
>
> Entry and exit latency were measured using a logic analyzer, with GPIO
> pins toggled in Linux after the calls to trace_cpu_idle() in
> cpuidle_enter_state(), and in the power management firmware after CPU
> power-off completes and immediately after detecting an interrupt.
>
> 800 us and 1500 us are worst-case values, largely driven by the fact
> that the power management firmware is single threaded. It can only
> handle commands to power off CPUs one at a time, and it cannot process
> any commands while powering on a CPU in response to an interrupt.
>
> The cluster suspend process reliably takes 36 us; I rounded this up to
> 50 us. If all CPUs enter the cluster idle state at the same time, exit
> latency is actually reduced, because there is no contention in that
> case. However, if only some CPUs enter the cluster idle state, behavior
> is the same as for CPU idle.
>
> Polling delay for the power management firmware to detect a pending
> interrupt is insignificant; it is less than 20 us.
>
> min-residency was chosen as the point where enabling the idle state
> consumed no more average power than disabling the idle state at a
> variety of interrupt rates.
>
> Signed-off-by: Samuel Holland
> ---
>
> I'm sending this patch as an RFC because it raises questions about how
> we handle firmware versioning. How far back does (or should) our support
> for old TF-A and Crust versions go?
>
> cpuidle has a problem that without working firmware support, CPUs will
> enter idle states and be unable to wake up. As a result, the system will
> hang at some point during boot, usually before getting to userspace.
>
> For over a year[0], TF-A has exposed the PSCI CPU_SUSPEND function when
> a SCPI implementation is present[1]. Implementing CPU_SUSPEND is
> required for implementing SYSTEM_SUSPEND[2], even if CPU_SUSPEND is not
> itself used for anything.
>
> However, there was no code to actually wake up a CPU once it called the
> CPU_SUSPEND function, because I could not find the register providing
> the necessary information. The fact that CPU_SUSPEND was broken affected
> nobody, because nothing ever called it -- there were no idle states in
> the DTS. In hindsight, what I should have done was always return failure
> from sunxi_validate_power_state(), but that ship has long sailed.
>
> I finally found the elusive register and implemented the wakeup code
> earlier this month[3]. So now, CPU_SUSPEND actually works, if all of
> your firmware is up to date, and cpuidle works if you add the states in
> your device tree.
>
> Unfortunately, there is currently nothing verifying that compatibility.
> So you can get into four possible scenarios:
>   1) No idle states in DTS, any firmware => Linux works, with baseline
>      power consumption.
>   2) Idle states added to DTS, no Crust/SCPI => Linux works, but every
>      attempt to enter an idle state is rejected because CPU_SUSPEND is
>      not hooked up. So power consumption increases by a sizable amount.
>   3) Idle states added to DTS, "old" Crust/SCPI (before [3]) => Linux
>      fails to boot, because CPUs never return from idle states.
>   4) Idle states added to DTS, "new" Crust/SCPI (after [3]) => Linux
>      works, with improved power consumption compared to the baseline.
>
> Obviously, we want to prevent scenario 3 if possible.
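
As a rough illustration only, the four scenarios quoted above can be
written down as a small lookup in Python. The `Firmware` enum and the
`scenario()` function are hypothetical names invented for this sketch;
nothing like them exists in TF-A, Crust, U-Boot or Linux.

```python
from enum import Enum
from typing import Tuple


class Firmware(Enum):
    """Hypothetical labels for the firmware situations named in the mail."""
    NONE = "no Crust/SCPI"
    OLD = "Crust/SCPI before [3]"
    NEW = "Crust/SCPI after [3]"


def scenario(idle_states_in_dts: bool, firmware: Firmware) -> Tuple[int, str]:
    """Return (scenario number, outcome) as described in the quoted list."""
    if not idle_states_in_dts:
        # Scenario 1 applies to any firmware.
        return 1, "Linux works, baseline power consumption"
    if firmware is Firmware.NONE:
        return 2, "Linux works, idle entry rejected, power consumption increases"
    if firmware is Firmware.OLD:
        return 3, "Linux fails to boot, CPUs never return from idle"
    return 4, "Linux works, improved power consumption"


if __name__ == "__main__":
    for has_states in (False, True):
        for fw in Firmware:
            print(has_states, fw.value, "=>", scenario(has_states, fw))
```

Only the combination of idle states in the DTS with the old firmware
lands in scenario 3, which is the case any version check would need to
catch.
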
So I think the core of the problem is that the DT describes some
firmware feature, but we have the DT bundled with the kernel, not the
firmware.
So is there any way we can detect an older Crust version in U-Boot,
then remove any potential idle states from the DT? Granted, this
requires recent U-Boot as well, but at least we could try to mitigate
the worst case a bit?

A better solution could be to only *add* the idle states if the rest of
the firmware is deemed worthy. So the mainline DTs would not carry the
properties in the first place, and only U-Boot adds them, on detecting
a capable firmware?
Admittedly this changes the "flow" of the DT, where the kernel is the
authority, but it might help to solve this problem?

Or any other way, which involves U-Boot patching the DTB? (This would
apply to the DTB passed to the kernel, regardless of where and when
it's loaded from.)

Any opinions?

Cheers,
Andre

> Enter the current patch: I chose the arm,psci-suspend-param values
> specifically so they would be _rejected_ by the current TF-A code. This
> makes scenario 3 behave like scenario 2. I then have some follow-up TF-A
> patches (not yet submitted) to switch to the new parameter encoding[4].
>
> This brings me back to my original question. Once the TF-A patches in
> [4] are merged, scenario 3 (with an updated TF-A but an old Crust) would
> fail to boot again. Do we care?
>
> Should I implement some kind of runtime version checking, so TF-A can
> disable CPU_SUSPEND if it would be broken? Or instead, should we wait
> some amount of time to merge this patch (or the patches at [4]) and
> assume people have upgraded?
>
> Where would people expect this sort of possibly-breaking change to be
> documented?
>
> Separately, since I assume most A64/H5 users (outside of LibreELEC and
> the PinePhone) are not using Crust, scenario 2 would be very common. If
> merging this patch increases their idle power draw by 500 mW, is that an
> acceptable cost for decreasing other users' idle power draw by 50 mW?
>
> Sorry for the wall of text,
> Samuel
>
> [0]: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/plat/allwinner/common/sunxi_pm.c?id=e382c88e2a26995099bb931d49e754dcaebc5593
> [1]: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/plat/allwinner/common/sunxi_scpi_pm.c?id=2e0e51f42586826a1f6f6c1e532f90e6df642cf5#n190
> [2]: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/lib/psci/psci_setup.c?id=2e0e51f42586826a1f6f6c1e532f90e6df642cf5#n251
> [3]: https://github.com/crust-firmware/crust/commits/85944467c804
> [4]: https://github.com/crust-firmware/arm-trusted-firmware/commits/d6ebf5dab2da
>
> ---
>
>  arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 26 +++++++++++++++++++
>  arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi  | 26 +++++++++++++++++++
>  2 files changed, 52 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> index 57786fc120c3..2b1b5b36098c 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> @@ -54,6 +54,7 @@ cpu0: cpu@0 {
>  			clocks = <&ccu CLK_CPUX>;
>  			clock-names = "cpu";
>  			#cooling-cells = <2>;
> +			cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>  		};
>
>  		cpu1: cpu@1 {
> @@ -65,6 +66,7 @@ cpu1: cpu@1 {
>  			clocks = <&ccu CLK_CPUX>;
>  			clock-names = "cpu";
>  			#cooling-cells = <2>;
> +			cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>  		};
>
>  		cpu2: cpu@2 {
> @@ -76,6 +78,7 @@ cpu2: cpu@2 {
>  			clocks = <&ccu CLK_CPUX>;
>  			clock-names = "cpu";
>  			#cooling-cells = <2>;
> +			cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>  		};
>
>  		cpu3: cpu@3 {
> @@ -87,6 +90,29 @@ cpu3: cpu@3 {
>  			clocks = <&ccu CLK_CPUX>;
>  			clock-names = "cpu";
>  			#cooling-cells = <2>;
> +			cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
> +		};
> +
> +		idle-states {
> +			entry-method = "psci";
> +
> +			cpu_sleep: cpu-sleep {
> +				compatible = "arm,idle-state";
> +				local-timer-stop;
> +				entry-latency-us = <800>;
> +				exit-latency-us = <1500>;
> +				min-residency-us = <25000>;
> +				arm,psci-suspend-param = <0x00010003>;
> +			};
> +
> +			cluster_sleep: cluster-sleep {
> +				compatible = "arm,idle-state";
> +				local-timer-stop;
> +				entry-latency-us = <850>;
> +				exit-latency-us = <1500>;
> +				min-residency-us = <50000>;
> +				arm,psci-suspend-param = <0x01010013>;
> +			};
>  		};
>
>  		L2: l2-cache {
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi
> index 578a63dedf46..1c416f648c58 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi
> @@ -18,6 +18,7 @@ cpu0: cpu@0 {
>  			clocks = <&ccu CLK_CPUX>;
>  			clock-latency-ns = <244144>; /* 8 32k periods */
>  			#cooling-cells = <2>;
> +			cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>  		};
>
>  		cpu1: cpu@1 {
> @@ -28,6 +29,7 @@ cpu1: cpu@1 {
>  			clocks = <&ccu CLK_CPUX>;
>  			clock-latency-ns = <244144>; /* 8 32k periods */
>  			#cooling-cells = <2>;
> +			cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>  		};
>
>  		cpu2: cpu@2 {
> @@ -38,6 +40,7 @@ cpu2: cpu@2 {
>  			clocks = <&ccu CLK_CPUX>;
>  			clock-latency-ns = <244144>; /* 8 32k periods */
>  			#cooling-cells = <2>;
> +			cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>  		};
>
>  		cpu3: cpu@3 {
> @@ -48,6 +51,29 @@ cpu3: cpu@3 {
>  			clocks = <&ccu CLK_CPUX>;
>  			clock-latency-ns = <244144>; /* 8 32k periods */
>  			#cooling-cells = <2>;
> +			cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
> +		};
> +
> +		idle-states {
> +			entry-method = "psci";
> +
> +			cpu_sleep: cpu-sleep {
> +				compatible = "arm,idle-state";
> +				local-timer-stop;
> +				entry-latency-us = <800>;
> +				exit-latency-us = <1500>;
> +				min-residency-us = <25000>;
> +				arm,psci-suspend-param = <0x00010003>;
> +			};
> +
> +			cluster_sleep: cluster-sleep {
> +				compatible = "arm,idle-state";
> +				local-timer-stop;
> +				entry-latency-us = <850>;
> +				exit-latency-us = <1500>;
> +				min-residency-us = <50000>;
> +				arm,psci-suspend-param = <0x01010013>;
> +			};
>  		};
>  	};
>
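
For reference, the two arm,psci-suspend-param values in the quoted
patch can be split into fields. The sketch below assumes they follow
the original (non-extended) PSCI power_state layout, with StateID in
bits [15:0], StateType in bit 16 and PowerLevel in bits [25:24]. The
mail does not say which layout is intended, and the platform-specific
meaning of the StateID is not stated either, so treat this as an
assumption. `decode_power_state` is a hypothetical helper name.

```python
def decode_power_state(value: int) -> dict:
    """Split a 32-bit PSCI power_state value into its fields.

    Assumes the original (non-extended) format:
      bits [15:0]  StateID
      bit  16      StateType (0 = standby, 1 = powerdown)
      bits [25:24] PowerLevel
    """
    return {
        "state_id": value & 0xFFFF,
        "state_type": "powerdown" if (value >> 16) & 0x1 else "standby",
        "power_level": (value >> 24) & 0x3,
    }


if __name__ == "__main__":
    # Values taken from the cpu-sleep and cluster-sleep nodes in the patch.
    for name, param in (("cpu-sleep", 0x00010003),
                        ("cluster-sleep", 0x01010013)):
        print(name, hex(param), decode_power_state(param))
```

Under that assumption both states are powerdown states; cpu-sleep
targets power level 0 and cluster-sleep targets power level 1.
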