Amlogic G12B and SM1 devices experience CPU stalls and random board
wedges when the system idles and CPU cores clock down to lower opp
points. Recent vendor kernels include a change to remove 100-250MHz
(with no explanation) [0] but other downstream sources also remove
the 500/667MHz points (also with no explanation). Unless 100-667MHz
opps are removed or the CPU governor forced to performance, stalls
are observed, so let's remove them and improve stability/uptime.
Numerous people have experienced this issue. I have tested with
only the low opp points removed and with numerous voltage tweaks,
but it makes no difference. With the opp points present an Odroid N2
or Khadas VIM3 reliably drops off my network after being left idling
overnight, with UART showing a CPU stall splat. With the opp points
removed I see weeks of uninterrupted uptime. It's beyond my skills
to research what the cause of the stalls might be, but if anyone
ever figures it out we can always restore things. NB: This issue
is not too widely reported in forums, but that's largely because
most of the Amlogic supporting distros have been including this
change picked from my kernel patchset for some time.
[0] https://github.com/khadas/linux/commit/20e237a4fe9f0302370e24950cb1416e038eee03
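For anyone who cannot update their DT, the same floor can in principle be applied at runtime through the standard cpufreq sysfs files. The following is only a rough sketch, not part of this series: the helper name clamp_min_freq() and the 1GHz floor constant are illustrative assumptions, it needs root, and the setting is lost on reboot.

```python
from pathlib import Path

# Assumption: 1 GHz (in kHz, as cpufreq sysfs reports it) is the lowest
# opp that this series keeps, so use it as the runtime floor.
MIN_SAFE_KHZ = 1_000_000


def clamp_min_freq(cpufreq_root="/sys/devices/system/cpu/cpufreq",
                   floor_khz=MIN_SAFE_KHZ):
    """Raise scaling_min_freq for every cpufreq policy.

    For each policy directory, pick the lowest advertised frequency that
    is at or above floor_khz and write it to scaling_min_freq.
    Returns a dict mapping policy name to the value written (kHz).
    """
    applied = {}
    for policy in sorted(Path(cpufreq_root).glob("policy*")):
        avail_file = policy / "scaling_available_frequencies"
        if not avail_file.exists():
            # Driver does not expose a frequency table; skip it.
            continue
        freqs = [int(f) for f in avail_file.read_text().split()]
        candidates = [f for f in freqs if f >= floor_khz]
        if not candidates:
            # Nothing at or above the floor; leave the policy untouched.
            continue
        target = min(candidates)
        (policy / "scaling_min_freq").write_text(str(target))
        applied[policy.name] = target
    return applied
```

Calling clamp_min_freq() as root on a booted board would be the intended use. Whether this avoids the stalls as reliably as removing the opps from the DT has not been tested here.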
Changes since v1:
- Split into two patches to allow for separate Fixes tags
- Minor edits to commit messages for brevity and typos
Christian Hewitt (2):
arm64: dts: meson: remove CPU opps below 1GHz for G12B boards
arm64: dts: meson: remove CPU opps below 1GHz for SM1
.../boot/dts/amlogic/meson-g12b-a311d.dtsi | 40 -------------------
.../boot/dts/amlogic/meson-g12b-s922x.dtsi | 40 -------------------
arch/arm64/boot/dts/amlogic/meson-sm1.dtsi | 20 ----------
3 files changed, 100 deletions(-)
--
2.17.1
Amlogic G12B devices experience CPU stalls and random board wedges when
the system idles and CPU cores clock down to lower opp points. Recent
vendor kernels include a change to remove 100-250MHz and other distro
sources also remove the 500/667MHz points. Unless all 100-667MHz opps
are removed or the CPU governor is forced to performance, stalls are
still observed, so let's remove them to improve stability and uptime.
Fixes: b96d4e92709b ("arm64: dts: meson-g12b: support a311d and s922x cpu operating points")
Signed-off-by: Christian Hewitt <[email protected]>
---
.../boot/dts/amlogic/meson-g12b-a311d.dtsi | 40 -------------------
.../boot/dts/amlogic/meson-g12b-s922x.dtsi | 40 -------------------
2 files changed, 80 deletions(-)
diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b-a311d.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12b-a311d.dtsi
index d61f43052a34..8e9ad1e51d66 100644
--- a/arch/arm64/boot/dts/amlogic/meson-g12b-a311d.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-g12b-a311d.dtsi
@@ -11,26 +11,6 @@
compatible = "operating-points-v2";
opp-shared;
- opp-100000000 {
- opp-hz = /bits/ 64 <100000000>;
- opp-microvolt = <731000>;
- };
-
- opp-250000000 {
- opp-hz = /bits/ 64 <250000000>;
- opp-microvolt = <731000>;
- };
-
- opp-500000000 {
- opp-hz = /bits/ 64 <500000000>;
- opp-microvolt = <731000>;
- };
-
- opp-667000000 {
- opp-hz = /bits/ 64 <667000000>;
- opp-microvolt = <731000>;
- };
-
opp-1000000000 {
opp-hz = /bits/ 64 <1000000000>;
opp-microvolt = <761000>;
@@ -71,26 +51,6 @@
compatible = "operating-points-v2";
opp-shared;
- opp-100000000 {
- opp-hz = /bits/ 64 <100000000>;
- opp-microvolt = <731000>;
- };
-
- opp-250000000 {
- opp-hz = /bits/ 64 <250000000>;
- opp-microvolt = <731000>;
- };
-
- opp-500000000 {
- opp-hz = /bits/ 64 <500000000>;
- opp-microvolt = <731000>;
- };
-
- opp-667000000 {
- opp-hz = /bits/ 64 <667000000>;
- opp-microvolt = <731000>;
- };
-
opp-1000000000 {
opp-hz = /bits/ 64 <1000000000>;
opp-microvolt = <731000>;
diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b-s922x.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12b-s922x.dtsi
index 1e5d0ee5d541..44c23c984034 100644
--- a/arch/arm64/boot/dts/amlogic/meson-g12b-s922x.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-g12b-s922x.dtsi
@@ -11,26 +11,6 @@
compatible = "operating-points-v2";
opp-shared;
- opp-100000000 {
- opp-hz = /bits/ 64 <100000000>;
- opp-microvolt = <731000>;
- };
-
- opp-250000000 {
- opp-hz = /bits/ 64 <250000000>;
- opp-microvolt = <731000>;
- };
-
- opp-500000000 {
- opp-hz = /bits/ 64 <500000000>;
- opp-microvolt = <731000>;
- };
-
- opp-667000000 {
- opp-hz = /bits/ 64 <667000000>;
- opp-microvolt = <731000>;
- };
-
opp-1000000000 {
opp-hz = /bits/ 64 <1000000000>;
opp-microvolt = <731000>;
@@ -76,26 +56,6 @@
compatible = "operating-points-v2";
opp-shared;
- opp-100000000 {
- opp-hz = /bits/ 64 <100000000>;
- opp-microvolt = <751000>;
- };
-
- opp-250000000 {
- opp-hz = /bits/ 64 <250000000>;
- opp-microvolt = <751000>;
- };
-
- opp-500000000 {
- opp-hz = /bits/ 64 <500000000>;
- opp-microvolt = <751000>;
- };
-
- opp-667000000 {
- opp-hz = /bits/ 64 <667000000>;
- opp-microvolt = <751000>;
- };
-
opp-1000000000 {
opp-hz = /bits/ 64 <1000000000>;
opp-microvolt = <771000>;
--
2.17.1
Amlogic SM1 devices experience CPU stalls and random board wedges when
the system idles and CPU cores clock down to lower opp points. Recent
vendor kernels include a change to remove 100-250MHz and other distro
sources also remove the 500/667MHz points. Unless all 100-667MHz opps
are removed or the CPU governor is forced to performance, stalls are
still observed, so let's remove them to improve stability and uptime.
Fixes: 3d9e76483049 ("arm64: dts: meson-sm1-sei610: enable DVFS")
Signed-off-by: Christian Hewitt <[email protected]>
---
arch/arm64/boot/dts/amlogic/meson-sm1.dtsi | 20 --------------------
1 file changed, 20 deletions(-)
diff --git a/arch/arm64/boot/dts/amlogic/meson-sm1.dtsi b/arch/arm64/boot/dts/amlogic/meson-sm1.dtsi
index 3c07a89bfd27..80737731af3f 100644
--- a/arch/arm64/boot/dts/amlogic/meson-sm1.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-sm1.dtsi
@@ -95,26 +95,6 @@
compatible = "operating-points-v2";
opp-shared;
- opp-100000000 {
- opp-hz = /bits/ 64 <100000000>;
- opp-microvolt = <730000>;
- };
-
- opp-250000000 {
- opp-hz = /bits/ 64 <250000000>;
- opp-microvolt = <730000>;
- };
-
- opp-500000000 {
- opp-hz = /bits/ 64 <500000000>;
- opp-microvolt = <730000>;
- };
-
- opp-667000000 {
- opp-hz = /bits/ 64 <666666666>;
- opp-microvolt = <750000>;
- };
-
opp-1000000000 {
opp-hz = /bits/ 64 <1000000000>;
opp-microvolt = <770000>;
--
2.17.1
On 2022-02-10 10:06, Christian Hewitt wrote:
> Amlogic SM1 devices experience CPU stalls and random board wedges when
> the system idles and CPU cores clock down to lower opp points. Recent
> vendor kernels include a change to remove 100-250MHz and other distro
> sources also remove the 500/667MHz points. Unless all 100-667MHz opps
> are removed or the CPU governor is forced to performance, stalls are
> still observed, so let's remove them to improve stability and uptime.
>
> Fixes: 3d9e76483049 ("arm64: dts: meson-sm1-sei610: enable DVFS")
> Signed-off-by: Christian Hewitt <[email protected]>
> ---
> arch/arm64/boot/dts/amlogic/meson-sm1.dtsi | 20 --------------------
> 1 file changed, 20 deletions(-)
>
> [...]
That's not nearly enough. If that's an actual issue, the driver
should be updated not to use these OPPs, and you can't assume
that people will just update their DT (mine comes from u-boot,
and it is unlikely I will update it anytime soon).
Thanks,
M.
--
Jazz is not dead. It just smells funny...
Hi Marc,
On 10/02/2022 11:36, Marc Zyngier wrote:
> On 2022-02-10 10:06, Christian Hewitt wrote:
>> Amlogic SM1 devices experience CPU stalls and random board wedges when
>> the system idles and CPU cores clock down to lower opp points. Recent
>> vendor kernels include a change to remove 100-250MHz and other distro
>> sources also remove the 500/667MHz points. Unless all 100-667MHz opps
>> are removed or the CPU governor is forced to performance, stalls are
>> still observed, so let's remove them to improve stability and uptime.
>>
>> Fixes: 3d9e76483049 ("arm64: dts: meson-sm1-sei610: enable DVFS")
>> Signed-off-by: Christian Hewitt <[email protected]>
>> ---
>> arch/arm64/boot/dts/amlogic/meson-sm1.dtsi | 20 --------------------
>> 1 file changed, 20 deletions(-)
>>
>> [...]
>
> That's not nearly enough. If that's an actual issue, the driver
> should be updated not to use these OPPs, and you can't assume
> that people will just update their DT (mine comes from u-boot,
> and it is unlikely I will update it anytime soon).
The driver is the generic cpufreq driver on top of a generic clock
driver. We do not filter out the possible OPP frequencies because the
set of possible frequencies is large and depends on the die revision.
So far I don't see why we should filter out these frequencies.
Neil
>
> Thanks,
>
> M.
On 10/02/2022 11:06, Christian Hewitt wrote:
> Amlogic G12B devices experience CPU stalls and random board wedges when
> the system idles and CPU cores clock down to lower opp points. Recent
> vendor kernels include a change to remove 100-250MHz and other distro
> sources also remove the 500/667MHz points. Unless all 100-667MHz opps
> are removed or the CPU governor is forced to performance, stalls are
> still observed, so let's remove them to improve stability and uptime.
>
> Fixes: b96d4e92709b ("arm64: dts: meson-g12b: support a311d and s922x cpu operating points")
> Signed-off-by: Christian Hewitt <[email protected]>
> ---
> .../boot/dts/amlogic/meson-g12b-a311d.dtsi | 40 -------------------
> .../boot/dts/amlogic/meson-g12b-s922x.dtsi | 40 -------------------
> 2 files changed, 80 deletions(-)
>
> [...]
Reviewed-by: Neil Armstrong <[email protected]>
On 10/02/2022 11:06, Christian Hewitt wrote:
> Amlogic SM1 devices experience CPU stalls and random board wedges when
> the system idles and CPU cores clock down to lower opp points. Recent
> vendor kernels include a change to remove 100-250MHz and other distro
> sources also remove the 500/667MHz points. Unless all 100-667MHz opps
> are removed or the CPU governor is forced to performance, stalls are
> still observed, so let's remove them to improve stability and uptime.
>
> Fixes: 3d9e76483049 ("arm64: dts: meson-sm1-sei610: enable DVFS")
> Signed-off-by: Christian Hewitt <[email protected]>
> ---
> arch/arm64/boot/dts/amlogic/meson-sm1.dtsi | 20 --------------------
> 1 file changed, 20 deletions(-)
>
> [...]
Reviewed-by: Neil Armstrong <[email protected]>
Hi,
On Thu, 10 Feb 2022 10:06:36 +0000, Christian Hewitt wrote:
> Amlogic G12B and SM1 devices experience CPU stalls and random board
> wedges when the system idles and CPU cores clock down to lower opp
> points. Recent vendor kernels include a change to remove 100-250MHz
> (with no explanation) [0] but other downstream sources also remove
> the 500/667MHz points (also with no explanation). Unless 100-667MHz
> opps are removed or the CPU governor forced to performance, stalls
> are observed, so let's remove them and improve stability/uptime.
>
> [...]
Thanks, Applied to https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux.git (v5.18/fixes)
[1/2] arm64: dts: meson: remove CPU opps below 1GHz for G12B boards
https://git.kernel.org/amlogic/c/6c4d636bc00dc17c63ffb2a73a0da850240e26e3
[2/2] arm64: dts: meson: remove CPU opps below 1GHz for SM1 boards
https://git.kernel.org/amlogic/c/fd86d85401c2049f652293877c0f7e6e5afc3bbc
--
Neil