2022-10-24 04:41:36

by Hector Martin

[permalink] [raw]
Subject: [PATCH v3 0/5] Apple SoC cpufreq driver

Hi folks,

Third time's the charm? Here's v3 of the cpufreq driver for Apple SoCs.
This version takes a page from both v1 and v2, keeping the dedicated
cpufreq style (instead of pretending to be a clock controller) but using
dedicated DT nodes for each cluster, which accurately represents the
hardware. In particular, this makes supporting t6002 (M1 Ultra) a lot
more reasonable on the DT side.

This version also switches to the standard performance-domains binding,
so we don't need any more vendor-specific properties. In order to
support this, I had to make the performance-domains parsing code more
generic. This required a minor change to the only consumer
(mediatek-cpufreq-hw).

The Linux driver probes based on platform compatible, and then attempts
to locate the cluster nodes by following the performance-domains links
from CPU nodes (this will then fail for any incompatible nodes, e.g. if
a future SoC needs a new compatible and can't fall back). This approach
was suggested by robh as the right way to handle the impedance mismatch
between the hardware, which has separate controllers per cluster, and
the Linux model where there can only be one CPUFreq driver instance.

Functionality-wise, there are no significant changes from v2. The only
notable difference is support for t8112 (M2). This works largely the
same as the other SoCs, but they ran out of bits in the current PState
register, so that needs a SoC-specific quirk. Since that register is
not used by macOS (it was discovered experimentally) and is not critical
for functionality (it just allows accurately reporting the current
frequency to userspace, given boost clock limitations), I've decided to
only use it when a SoC-specific compatible is present. The default
fallback code will simply report the requested frequency as actual.
I expect this will work for future SoCs.

As usual, MAINTAINERS and DT changes are split. I expect patches #2~#4
to go through the cpufreq tree, and we'll take care of #1 and #5 via
the asahi-soc tree.

Hector Martin (5):
MAINTAINERS: Add entries for Apple SoC cpufreq driver
dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC
cpufreq
cpufreq: Generalize of_perf_domain_get_sharing_cpumask phandle format
cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states
arm64: dts: apple: Add CPU topology & cpufreq nodes for t8103

.../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++
MAINTAINERS | 2 +
arch/arm64/boot/dts/apple/t8103.dtsi | 206 +++++++++-
drivers/cpufreq/Kconfig.arm | 9 +
drivers/cpufreq/Makefile | 1 +
drivers/cpufreq/apple-soc-cpufreq.c | 352 ++++++++++++++++++
drivers/cpufreq/cpufreq-dt-platdev.c | 2 +
drivers/cpufreq/mediatek-cpufreq-hw.c | 14 +-
include/linux/cpufreq.h | 28 +-
9 files changed, 706 insertions(+), 27 deletions(-)
create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
create mode 100644 drivers/cpufreq/apple-soc-cpufreq.c

--
2.35.1


2022-10-24 04:52:09

by Hector Martin

[permalink] [raw]
Subject: [PATCH v3 1/5] MAINTAINERS: Add entries for Apple SoC cpufreq driver

This MAINTAINERS update is split, as usual, to facilitate merges via the
SoC tree and avoid conflicts.

Signed-off-by: Hector Martin <[email protected]>
---
MAINTAINERS | 2 ++
1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index cf0f18502372..6783f5263b5e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1897,6 +1897,7 @@ T: git https://github.com/AsahiLinux/linux.git
F: Documentation/devicetree/bindings/arm/apple.yaml
F: Documentation/devicetree/bindings/arm/apple/*
F: Documentation/devicetree/bindings/clock/apple,nco.yaml
+F: Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
F: Documentation/devicetree/bindings/dma/apple,admac.yaml
F: Documentation/devicetree/bindings/i2c/apple,i2c.yaml
F: Documentation/devicetree/bindings/interrupt-controller/apple,*
@@ -1911,6 +1912,7 @@ F: Documentation/devicetree/bindings/power/apple*
F: Documentation/devicetree/bindings/watchdog/apple,wdt.yaml
F: arch/arm64/boot/dts/apple/
F: drivers/clk/clk-apple-nco.c
+F: drivers/cpufreq/apple-soc-cpufreq.c
F: drivers/dma/apple-admac.c
F: drivers/i2c/busses/i2c-pasemi-core.c
F: drivers/i2c/busses/i2c-pasemi-platform.c
--
2.35.1

2022-10-24 04:56:14

by Hector Martin

[permalink] [raw]
Subject: [PATCH v3 5/5] arm64: dts: apple: Add CPU topology & cpufreq nodes for t8103

Add the missing CPU topology/capacity information and the cpufreq nodes,
so we can have CPU frequency scaling and the scheduler has the
information it needs to make the correct decisions.

Boost states are commented out, as they are not yet available (that
requires CPU deep sleep support, to be eventually done via PSCI).
The driver supports them fine; the hardware will just refuse to ever
go into them at this time, so don't expose them to users until that's
done.

Signed-off-by: Hector Martin <[email protected]>
---
arch/arm64/boot/dts/apple/t8103.dtsi | 206 +++++++++++++++++++++++++--
1 file changed, 196 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/boot/dts/apple/t8103.dtsi b/arch/arm64/boot/dts/apple/t8103.dtsi
index 51a63b29d404..055e395ee88d 100644
--- a/arch/arm64/boot/dts/apple/t8103.dtsi
+++ b/arch/arm64/boot/dts/apple/t8103.dtsi
@@ -22,71 +22,245 @@ cpus {
#address-cells = <2>;
#size-cells = <0>;

- cpu0: cpu@0 {
+ cpu-map {
+ cluster0 {
+ core0 {
+ cpu = <&cpu_e0>;
+ };
+ core1 {
+ cpu = <&cpu_e1>;
+ };
+ core2 {
+ cpu = <&cpu_e2>;
+ };
+ core3 {
+ cpu = <&cpu_e3>;
+ };
+ };
+
+ cluster1 {
+ core0 {
+ cpu = <&cpu_p0>;
+ };
+ core1 {
+ cpu = <&cpu_p1>;
+ };
+ core2 {
+ cpu = <&cpu_p2>;
+ };
+ core3 {
+ cpu = <&cpu_p3>;
+ };
+ };
+ };
+
+ cpu_e0: cpu@0 {
compatible = "apple,icestorm";
device_type = "cpu";
reg = <0x0 0x0>;
enable-method = "spin-table";
cpu-release-addr = <0 0>; /* To be filled by loader */
+ operating-points-v2 = <&ecluster_opp>;
+ capacity-dmips-mhz = <714>;
+ performance-domains = <&cpufreq_e>;
};

- cpu1: cpu@1 {
+ cpu_e1: cpu@1 {
compatible = "apple,icestorm";
device_type = "cpu";
reg = <0x0 0x1>;
enable-method = "spin-table";
cpu-release-addr = <0 0>; /* To be filled by loader */
+ operating-points-v2 = <&ecluster_opp>;
+ capacity-dmips-mhz = <714>;
+ performance-domains = <&cpufreq_e>;
};

- cpu2: cpu@2 {
+ cpu_e2: cpu@2 {
compatible = "apple,icestorm";
device_type = "cpu";
reg = <0x0 0x2>;
enable-method = "spin-table";
cpu-release-addr = <0 0>; /* To be filled by loader */
+ operating-points-v2 = <&ecluster_opp>;
+ capacity-dmips-mhz = <714>;
+ performance-domains = <&cpufreq_e>;
};

- cpu3: cpu@3 {
+ cpu_e3: cpu@3 {
compatible = "apple,icestorm";
device_type = "cpu";
reg = <0x0 0x3>;
enable-method = "spin-table";
cpu-release-addr = <0 0>; /* To be filled by loader */
+ operating-points-v2 = <&ecluster_opp>;
+ capacity-dmips-mhz = <714>;
+ performance-domains = <&cpufreq_e>;
};

- cpu4: cpu@10100 {
+ cpu_p0: cpu@10100 {
compatible = "apple,firestorm";
device_type = "cpu";
reg = <0x0 0x10100>;
enable-method = "spin-table";
cpu-release-addr = <0 0>; /* To be filled by loader */
+ operating-points-v2 = <&pcluster_opp>;
+ capacity-dmips-mhz = <1024>;
+ performance-domains = <&cpufreq_p>;
};

- cpu5: cpu@10101 {
+ cpu_p1: cpu@10101 {
compatible = "apple,firestorm";
device_type = "cpu";
reg = <0x0 0x10101>;
enable-method = "spin-table";
cpu-release-addr = <0 0>; /* To be filled by loader */
+ operating-points-v2 = <&pcluster_opp>;
+ capacity-dmips-mhz = <1024>;
+ performance-domains = <&cpufreq_p>;
};

- cpu6: cpu@10102 {
+ cpu_p2: cpu@10102 {
compatible = "apple,firestorm";
device_type = "cpu";
reg = <0x0 0x10102>;
enable-method = "spin-table";
cpu-release-addr = <0 0>; /* To be filled by loader */
+ operating-points-v2 = <&pcluster_opp>;
+ capacity-dmips-mhz = <1024>;
+ performance-domains = <&cpufreq_p>;
};

- cpu7: cpu@10103 {
+ cpu_p3: cpu@10103 {
compatible = "apple,firestorm";
device_type = "cpu";
reg = <0x0 0x10103>;
enable-method = "spin-table";
cpu-release-addr = <0 0>; /* To be filled by loader */
+ operating-points-v2 = <&pcluster_opp>;
+ capacity-dmips-mhz = <1024>;
+ performance-domains = <&cpufreq_p>;
};
};

+ ecluster_opp: opp-table-0 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ opp01 {
+ opp-hz = /bits/ 64 <600000000>;
+ opp-level = <1>;
+ clock-latency-ns = <7500>;
+ };
+ opp02 {
+ opp-hz = /bits/ 64 <972000000>;
+ opp-level = <2>;
+ clock-latency-ns = <22000>;
+ };
+ opp03 {
+ opp-hz = /bits/ 64 <1332000000>;
+ opp-level = <3>;
+ clock-latency-ns = <27000>;
+ };
+ opp04 {
+ opp-hz = /bits/ 64 <1704000000>;
+ opp-level = <4>;
+ clock-latency-ns = <33000>;
+ };
+ opp05 {
+ opp-hz = /bits/ 64 <2064000000>;
+ opp-level = <5>;
+ clock-latency-ns = <50000>;
+ };
+ };
+
+ pcluster_opp: opp-table-1 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ opp01 {
+ opp-hz = /bits/ 64 <600000000>;
+ opp-level = <1>;
+ clock-latency-ns = <8000>;
+ };
+ opp02 {
+ opp-hz = /bits/ 64 <828000000>;
+ opp-level = <2>;
+ clock-latency-ns = <19000>;
+ };
+ opp03 {
+ opp-hz = /bits/ 64 <1056000000>;
+ opp-level = <3>;
+ clock-latency-ns = <21000>;
+ };
+ opp04 {
+ opp-hz = /bits/ 64 <1284000000>;
+ opp-level = <4>;
+ clock-latency-ns = <23000>;
+ };
+ opp05 {
+ opp-hz = /bits/ 64 <1500000000>;
+ opp-level = <5>;
+ clock-latency-ns = <24000>;
+ };
+ opp06 {
+ opp-hz = /bits/ 64 <1728000000>;
+ opp-level = <6>;
+ clock-latency-ns = <29000>;
+ };
+ opp07 {
+ opp-hz = /bits/ 64 <1956000000>;
+ opp-level = <7>;
+ clock-latency-ns = <31000>;
+ };
+ opp08 {
+ opp-hz = /bits/ 64 <2184000000>;
+ opp-level = <8>;
+ clock-latency-ns = <34000>;
+ };
+ opp09 {
+ opp-hz = /bits/ 64 <2388000000>;
+ opp-level = <9>;
+ clock-latency-ns = <36000>;
+ };
+ opp10 {
+ opp-hz = /bits/ 64 <2592000000>;
+ opp-level = <10>;
+ clock-latency-ns = <51000>;
+ };
+ opp11 {
+ opp-hz = /bits/ 64 <2772000000>;
+ opp-level = <11>;
+ clock-latency-ns = <54000>;
+ };
+ opp12 {
+ opp-hz = /bits/ 64 <2988000000>;
+ opp-level = <12>;
+ clock-latency-ns = <55000>;
+ };
+#if 0
+ /* Not available until CPU deep sleep is implemented */
+ opp13 {
+ opp-hz = /bits/ 64 <3096000000>;
+ opp-level = <13>;
+ clock-latency-ns = <55000>;
+ turbo-mode;
+ };
+ opp14 {
+ opp-hz = /bits/ 64 <3144000000>;
+ opp-level = <14>;
+ clock-latency-ns = <56000>;
+ turbo-mode;
+ };
+ opp15 {
+ opp-hz = /bits/ 64 <3204000000>;
+ opp-level = <15>;
+ clock-latency-ns = <56000>;
+ turbo-mode;
+ };
+#endif
+ };
+
timer {
compatible = "arm,armv8-timer";
interrupt-parent = <&aic>;
@@ -124,6 +298,18 @@ soc {
ranges;
nonposted-mmio;

+ cpufreq_e: cpufreq@210e20000 {
+ compatible = "apple,t8103-cluster-cpufreq", "apple,cluster-cpufreq";
+ reg = <0x2 0x10e20000 0 0x1000>;
+ #performance-domain-cells = <0>;
+ };
+
+ cpufreq_p: cpufreq@211e20000 {
+ compatible = "apple,t8103-cluster-cpufreq", "apple,cluster-cpufreq";
+ reg = <0x2 0x11e20000 0 0x1000>;
+ #performance-domain-cells = <0>;
+ };
+
i2c0: i2c@235010000 {
compatible = "apple,t8103-i2c", "apple,i2c";
reg = <0x2 0x35010000 0x0 0x4000>;
@@ -229,12 +415,12 @@ aic: interrupt-controller@23b100000 {
affinities {
e-core-pmu-affinity {
apple,fiq-index = <AIC_CPU_PMU_E>;
- cpus = <&cpu0 &cpu1 &cpu2 &cpu3>;
+ cpus = <&cpu_e0 &cpu_e1 &cpu_e2 &cpu_e3>;
};

p-core-pmu-affinity {
apple,fiq-index = <AIC_CPU_PMU_P>;
- cpus = <&cpu4 &cpu5 &cpu6 &cpu7>;
+ cpus = <&cpu_p0 &cpu_p1 &cpu_p2 &cpu_p3>;
};
};
};
--
2.35.1

2022-10-24 05:02:53

by Hector Martin

[permalink] [raw]
Subject: [PATCH v3 3/5] cpufreq: Generalize of_perf_domain_get_sharing_cpumask phandle format

of_perf_domain_get_sharing_cpumask currently assumes a 1-argument
phandle format, and directly returns the argument. Generalize this to
return the full of_phandle_args, so it can be used by drivers which use
other phandle styles (e.g. separate nodes). This also requires changing
the CPU sharing match to compare the full args structure.

Also, make sure to of_node_put(args.np) (the original code was leaking a
reference).

Signed-off-by: Hector Martin <[email protected]>
---
drivers/cpufreq/mediatek-cpufreq-hw.c | 14 +++++++++-----
include/linux/cpufreq.h | 28 +++++++++++++++------------
2 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/cpufreq/mediatek-cpufreq-hw.c b/drivers/cpufreq/mediatek-cpufreq-hw.c
index f0e0a35c7f21..f80339779084 100644
--- a/drivers/cpufreq/mediatek-cpufreq-hw.c
+++ b/drivers/cpufreq/mediatek-cpufreq-hw.c
@@ -160,6 +160,7 @@ static int mtk_cpu_resources_init(struct platform_device *pdev,
struct mtk_cpufreq_data *data;
struct device *dev = &pdev->dev;
struct resource *res;
+ struct of_phandle_args args;
void __iomem *base;
int ret, i;
int index;
@@ -168,11 +169,14 @@ static int mtk_cpu_resources_init(struct platform_device *pdev,
if (!data)
return -ENOMEM;

- index = of_perf_domain_get_sharing_cpumask(policy->cpu, "performance-domains",
- "#performance-domain-cells",
- policy->cpus);
- if (index < 0)
- return index;
+ ret = of_perf_domain_get_sharing_cpumask(policy->cpu, "performance-domains",
+ "#performance-domain-cells",
+ policy->cpus, &args);
+ if (ret < 0)
+ return ret;
+
+ index = args.args[0];
+ of_node_put(args.np);

res = platform_get_resource(pdev, IORESOURCE_MEM, index);
if (!res) {
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index d5595d57f4e5..6a94a6eaad27 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -1110,10 +1110,10 @@ cpufreq_table_set_inefficient(struct cpufreq_policy *policy,
}

static inline int parse_perf_domain(int cpu, const char *list_name,
- const char *cell_name)
+ const char *cell_name,
+ struct of_phandle_args *args)
{
struct device_node *cpu_np;
- struct of_phandle_args args;
int ret;

cpu_np = of_cpu_device_node_get(cpu);
@@ -1121,41 +1121,44 @@ static inline int parse_perf_domain(int cpu, const char *list_name,
return -ENODEV;

ret = of_parse_phandle_with_args(cpu_np, list_name, cell_name, 0,
- &args);
+ args);
if (ret < 0)
return ret;

of_node_put(cpu_np);

- return args.args[0];
+ return 0;
}

static inline int of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_name,
- const char *cell_name, struct cpumask *cpumask)
+ const char *cell_name, struct cpumask *cpumask,
+ struct of_phandle_args *pargs)
{
- int target_idx;
int cpu, ret;
+ struct of_phandle_args args;

- ret = parse_perf_domain(pcpu, list_name, cell_name);
+ ret = parse_perf_domain(pcpu, list_name, cell_name, pargs);
if (ret < 0)
return ret;

- target_idx = ret;
cpumask_set_cpu(pcpu, cpumask);

for_each_possible_cpu(cpu) {
if (cpu == pcpu)
continue;

- ret = parse_perf_domain(cpu, list_name, cell_name);
+ ret = parse_perf_domain(cpu, list_name, cell_name, &args);
if (ret < 0)
continue;

- if (target_idx == ret)
+ if (pargs->np == args.np && pargs->args_count == args.args_count &&
+ !memcmp(pargs->args, args.args, sizeof(args.args[0]) * args.args_count))
cpumask_set_cpu(cpu, cpumask);
+
+ of_node_put(args.np);
}

- return target_idx;
+ return 0;
}
#else
static inline int cpufreq_boost_trigger_state(int state)
@@ -1185,7 +1188,8 @@ cpufreq_table_set_inefficient(struct cpufreq_policy *policy,
}

static inline int of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_name,
- const char *cell_name, struct cpumask *cpumask)
+ const char *cell_name, struct cpumask *cpumask,
+ struct of_phandle_args *pargs)
{
return -EOPNOTSUPP;
}
--
2.35.1

2022-10-24 05:06:17

by Hector Martin

[permalink] [raw]
Subject: [PATCH v3 2/5] dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC cpufreq

This binding represents the cpufreq/DVFS hardware present in Apple SoCs.
The hardware has an independent controller per CPU cluster, and we
represent them as unique nodes in order to accurately describe the
hardware. The driver is responsible for binding them as a single cpufreq
device (in the Linux cpufreq model).

Signed-off-by: Hector Martin <[email protected]>
---
.../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++++++++++++++
1 file changed, 119 insertions(+)
create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml

diff --git a/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
new file mode 100644
index 000000000000..b11452f91468
--- /dev/null
+++ b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
@@ -0,0 +1,119 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/cpufreq/apple,cluster-cpufreq.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Apple SoC cluster cpufreq device
+
+maintainers:
+ - Hector Martin <[email protected]>
+
+description: |
+ Apple SoCs (e.g. M1) have a per-cpu-cluster DVFS controller that is part of
+ the cluster management register block. This binding uses the standard
+ operating-points-v2 table to define the CPU performance states, with the
+ opp-level property specifying the hardware p-state index for that level.
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - const: apple,t8103-cluster-cpufreq
+ - const: apple,cluster-cpufreq
+ - items:
+ - const: apple,t6000-cluster-cpufreq
+ - const: apple,t8103-cluster-cpufreq
+ - const: apple,cluster-cpufreq
+ - items:
+ - const: apple,t8112-cluster-cpufreq
+ - const: apple,cluster-cpufreq
+
+ reg:
+ maxItems: 1
+ description: The register region for this CPU cluster DVFS controller
+
+ '#performance-domain-cells':
+ const: 0
+
+required:
+ - compatible
+ - reg
+ - '#performance-domain-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ // This example shows a single CPU per domain and 2 domains,
+ // with two p-states per domain.
+ // Shipping hardware has 2-4 CPUs per domain and 2-6 domains.
+ cpus {
+ #address-cells = <2>;
+ #size-cells = <0>;
+
+ cpu@0 {
+ compatible = "apple,icestorm";
+ device_type = "cpu";
+ reg = <0x0 0x0>;
+ operating-points-v2 = <&ecluster_opp>;
+ performance-domains = <&cpufreq_e>;
+ };
+
+ cpu@10100 {
+ compatible = "apple,firestorm";
+ device_type = "cpu";
+ reg = <0x0 0x10100>;
+ operating-points-v2 = <&pcluster_opp>;
+ performance-domains = <&cpufreq_p>;
+ };
+ };
+
+ ecluster_opp: opp-table-0 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ opp01 {
+ opp-hz = /bits/ 64 <600000000>;
+ opp-level = <1>;
+ clock-latency-ns = <7500>;
+ };
+ opp02 {
+ opp-hz = /bits/ 64 <972000000>;
+ opp-level = <2>;
+ clock-latency-ns = <22000>;
+ };
+ };
+
+ pcluster_opp: opp-table-1 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ opp01 {
+ opp-hz = /bits/ 64 <600000000>;
+ opp-level = <1>;
+ clock-latency-ns = <8000>;
+ };
+ opp02 {
+ opp-hz = /bits/ 64 <828000000>;
+ opp-level = <2>;
+ clock-latency-ns = <19000>;
+ };
+ };
+
+ soc {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ cpufreq_e: cpufreq@210e20000 {
+ compatible = "apple,t8103-cluster-cpufreq", "apple,cluster-cpufreq";
+ reg = <0x2 0x10e20000 0 0x1000>;
+ #performance-domain-cells = <0>;
+ };
+
+ cpufreq_p: cpufreq@211e20000 {
+ compatible = "apple,t8103-cluster-cpufreq", "apple,cluster-cpufreq";
+ reg = <0x2 0x11e20000 0 0x1000>;
+ #performance-domain-cells = <0>;
+ };
+ };
--
2.35.1

2022-10-24 05:22:03

by Hector Martin

[permalink] [raw]
Subject: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

This driver implements CPU frequency scaling for Apple Silicon SoCs,
including M1 (t8103), M1 Max/Pro/Ultra (t600x), and M2 (t8112).

Each CPU cluster has its own register set, and frequency management is
fully automated by the hardware; the driver only has to write one
register. There is boost frequency support, but the hardware will only
allow their use if only a subset of cores in a cluster are in
non-deep-idle. Since we don't support deep idle yet, these frequencies
are not achievable, but the driver supports them. They will remain
disabled in the device tree until deep idle is implemented, to avoid
confusing users.

This driver does not yet implement the memory controller performance
state tuning that usually accompanies higher CPU p-states. This will be
done in a future patch.

Signed-off-by: Hector Martin <[email protected]>
---
drivers/cpufreq/Kconfig.arm | 9 +
drivers/cpufreq/Makefile | 1 +
drivers/cpufreq/apple-soc-cpufreq.c | 352 +++++++++++++++++++++++++++
drivers/cpufreq/cpufreq-dt-platdev.c | 2 +
4 files changed, 364 insertions(+)
create mode 100644 drivers/cpufreq/apple-soc-cpufreq.c

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index 82e5de1f6f8c..29969f84008a 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -41,6 +41,15 @@ config ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM
To compile this driver as a module, choose M here: the
module will be called sun50i-cpufreq-nvmem.

+config ARM_APPLE_SOC_CPUFREQ
+ tristate "Apple Silicon SoC CPUFreq support"
+ depends on ARCH_APPLE || (COMPILE_TEST && 64BIT)
+ select PM_OPP
+ default ARCH_APPLE
+ help
+ This adds the CPUFreq driver for Apple Silicon machines
+ (e.g. Apple M1).
+
config ARM_ARMADA_37XX_CPUFREQ
tristate "Armada 37xx CPUFreq support"
depends on ARCH_MVEBU && CPUFREQ_DT
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 49b98c62c5af..32a7029e25ed 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -52,6 +52,7 @@ obj-$(CONFIG_X86_AMD_FREQ_SENSITIVITY) += amd_freq_sensitivity.o

##################################################################################
# ARM SoC drivers
+obj-$(CONFIG_ARM_APPLE_SOC_CPUFREQ) += apple-soc-cpufreq.o
obj-$(CONFIG_ARM_ARMADA_37XX_CPUFREQ) += armada-37xx-cpufreq.o
obj-$(CONFIG_ARM_ARMADA_8K_CPUFREQ) += armada-8k-cpufreq.o
obj-$(CONFIG_ARM_BRCMSTB_AVS_CPUFREQ) += brcmstb-avs-cpufreq.o
diff --git a/drivers/cpufreq/apple-soc-cpufreq.c b/drivers/cpufreq/apple-soc-cpufreq.c
new file mode 100644
index 000000000000..12c4b490edb8
--- /dev/null
+++ b/drivers/cpufreq/apple-soc-cpufreq.c
@@ -0,0 +1,352 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Apple SoC CPU cluster performance state driver
+ *
+ * Copyright The Asahi Linux Contributors
+ *
+ * Based on scpi-cpufreq.c
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/cpufreq.h>
+#include <linux/cpumask.h>
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/pm_opp.h>
+#include <linux/slab.h>
+
+#define APPLE_DVFS_CMD 0x20
+#define APPLE_DVFS_CMD_BUSY BIT(31)
+#define APPLE_DVFS_CMD_SET BIT(25)
+#define APPLE_DVFS_CMD_PS2 GENMASK(16, 12)
+#define APPLE_DVFS_CMD_PS1 GENMASK(4, 0)
+
+/* Same timebase as CPU counter (24MHz) */
+#define APPLE_DVFS_LAST_CHG_TIME 0x38
+
+/*
+ * Apple ran out of bits and had to shift this in T8112...
+ */
+#define APPLE_DVFS_STATUS 0x50
+#define APPLE_DVFS_STATUS_CUR_PS_T8103 GENMASK(7, 4)
+#define APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8103 4
+#define APPLE_DVFS_STATUS_TGT_PS_T8103 GENMASK(3, 0)
+#define APPLE_DVFS_STATUS_CUR_PS_T8112 GENMASK(9, 5)
+#define APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8112 5
+#define APPLE_DVFS_STATUS_TGT_PS_T8112 GENMASK(4, 0)
+
+/*
+ * Div is +1, base clock is 12MHz on existing SoCs.
+ * For documentation purposes. We use the OPP table to
+ * get the frequency.
+ */
+#define APPLE_DVFS_PLL_STATUS 0xc0
+#define APPLE_DVFS_PLL_FACTOR 0xc8
+#define APPLE_DVFS_PLL_FACTOR_MULT GENMASK(31, 16)
+#define APPLE_DVFS_PLL_FACTOR_DIV GENMASK(15, 0)
+
+#define APPLE_DVFS_TRANSITION_TIMEOUT 100
+
+struct apple_soc_cpufreq_info {
+ u64 max_pstate;
+ u64 cur_pstate_mask;
+ u64 cur_pstate_shift;
+};
+
+struct apple_cpu_priv {
+ struct device *cpu_dev;
+ void __iomem *reg_base;
+ const struct apple_soc_cpufreq_info *info;
+};
+
+static struct cpufreq_driver apple_soc_cpufreq_driver;
+
+const struct apple_soc_cpufreq_info soc_t8103_info = {
+ .max_pstate = 15,
+ .cur_pstate_mask = APPLE_DVFS_STATUS_CUR_PS_T8103,
+ .cur_pstate_shift = APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8103,
+};
+
+const struct apple_soc_cpufreq_info soc_t8112_info = {
+ .max_pstate = 31,
+ .cur_pstate_mask = APPLE_DVFS_STATUS_CUR_PS_T8112,
+ .cur_pstate_shift = APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8112,
+};
+
+const struct apple_soc_cpufreq_info soc_default_info = {
+ .max_pstate = 15,
+ .cur_pstate_mask = 0, /* fallback */
+};
+
+static const struct of_device_id apple_soc_cpufreq_of_match[] = {
+ {
+ .compatible = "apple,t8103-cluster-cpufreq",
+ .data = &soc_t8103_info,
+ },
+ {
+ .compatible = "apple,t8112-cluster-cpufreq",
+ .data = &soc_t8112_info,
+ },
+ {
+ .compatible = "apple,cluster-cpufreq",
+ .data = &soc_default_info,
+ },
+ {}
+};
+
+static unsigned int apple_soc_cpufreq_get_rate(unsigned int cpu)
+{
+ struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu);
+ struct apple_cpu_priv *priv = policy->driver_data;
+ unsigned int pstate;
+ unsigned int i;
+
+ if (priv->info->cur_pstate_mask) {
+ u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_STATUS);
+
+ pstate = (reg & priv->info->cur_pstate_mask) >> priv->info->cur_pstate_shift;
+ } else {
+ /*
+ * For the fallback case we might not know the layout of DVFS_STATUS,
+ * so just use the command register value (which ignores boost limitations).
+ */
+ u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_CMD);
+
+ pstate = FIELD_GET(APPLE_DVFS_CMD_PS1, reg);
+ }
+
+ for (i = 0; policy->freq_table[i].frequency != CPUFREQ_TABLE_END; i++)
+ if (policy->freq_table[i].driver_data == pstate)
+ return policy->freq_table[i].frequency;
+
+ dev_err(priv->cpu_dev, "could not find frequency for pstate %d\n",
+ pstate);
+ return 0;
+}
+
+static int apple_soc_cpufreq_set_target(struct cpufreq_policy *policy,
+ unsigned int index)
+{
+ struct apple_cpu_priv *priv = policy->driver_data;
+ unsigned int pstate = policy->freq_table[index].driver_data;
+ u64 reg;
+
+ /* Fallback for newer SoCs */
+ if (index > priv->info->max_pstate)
+ index = priv->info->max_pstate;
+
+ if (readq_poll_timeout_atomic(priv->reg_base + APPLE_DVFS_CMD, reg,
+ !(reg & APPLE_DVFS_CMD_BUSY), 2,
+ APPLE_DVFS_TRANSITION_TIMEOUT)) {
+ return -EIO;
+ }
+
+ reg &= ~(APPLE_DVFS_CMD_PS1 | APPLE_DVFS_CMD_PS2);
+ reg |= FIELD_PREP(APPLE_DVFS_CMD_PS1, pstate);
+ reg |= FIELD_PREP(APPLE_DVFS_CMD_PS2, pstate);
+ reg |= APPLE_DVFS_CMD_SET;
+
+ writeq_relaxed(reg, priv->reg_base + APPLE_DVFS_CMD);
+
+ return 0;
+}
+
+static unsigned int apple_soc_cpufreq_fast_switch(struct cpufreq_policy *policy,
+ unsigned int target_freq)
+{
+ if (apple_soc_cpufreq_set_target(policy, policy->cached_resolved_idx) < 0)
+ return 0;
+
+ return policy->freq_table[policy->cached_resolved_idx].frequency;
+}
+
+static int apple_soc_cpufreq_find_cluster(struct cpufreq_policy *policy,
+ void __iomem **reg_base,
+ const struct apple_soc_cpufreq_info **info)
+{
+ struct of_phandle_args args;
+ const struct of_device_id *match;
+ int ret = 0;
+
+ ret = of_perf_domain_get_sharing_cpumask(policy->cpu, "performance-domains",
+ "#performance-domain-cells",
+ policy->cpus, &args);
+ if (ret < 0)
+ return ret;
+
+ match = of_match_node(apple_soc_cpufreq_of_match, args.np);
+ of_node_put(args.np);
+ if (!match)
+ return -ENODEV;
+
+ *info = match->data;
+
+ *reg_base = of_iomap(args.np, 0);
+ if (IS_ERR(*reg_base))
+ return PTR_ERR(*reg_base);
+
+ return 0;
+}
+
+static struct freq_attr *apple_soc_cpufreq_hw_attr[] = {
+ &cpufreq_freq_attr_scaling_available_freqs,
+ NULL,
+ NULL,
+};
+
+static int apple_soc_cpufreq_init(struct cpufreq_policy *policy)
+{
+ int ret, i;
+ unsigned int transition_latency;
+ void __iomem *reg_base;
+ struct device *cpu_dev;
+ struct apple_cpu_priv *priv;
+ const struct apple_soc_cpufreq_info *info;
+ struct cpufreq_frequency_table *freq_table;
+
+ cpu_dev = get_cpu_device(policy->cpu);
+ if (!cpu_dev) {
+ pr_err("failed to get cpu%d device\n", policy->cpu);
+ return -ENODEV;
+ }
+
+ ret = dev_pm_opp_of_add_table(cpu_dev);
+ if (ret < 0) {
+ dev_err(cpu_dev, "%s: failed to add OPP table: %d\n", __func__, ret);
+ return ret;
+ }
+
+ ret = apple_soc_cpufreq_find_cluster(policy, &reg_base, &info);
+ if (ret) {
+ dev_err(cpu_dev, "%s: failed to get cluster info: %d\n", __func__, ret);
+ return ret;
+ }
+
+ ret = dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);
+ if (ret) {
+ dev_err(cpu_dev, "%s: failed to mark OPPs as shared: %d\n", __func__, ret);
+ goto out_iounmap;
+ }
+
+ ret = dev_pm_opp_get_opp_count(cpu_dev);
+ if (ret <= 0) {
+ dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
+ ret = -EPROBE_DEFER;
+ goto out_free_opp;
+ }
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv) {
+ ret = -ENOMEM;
+ goto out_free_opp;
+ }
+
+ ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
+ if (ret) {
+ dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret);
+ goto out_free_priv;
+ }
+
+ /* Get OPP levels (p-state indexes) and stash them in driver_data */
+ for (i = 0; freq_table[i].frequency != CPUFREQ_TABLE_END; i++) {
+ unsigned long rate = freq_table[i].frequency * 1000;
+ struct dev_pm_opp *opp = dev_pm_opp_find_freq_floor(cpu_dev, &rate);
+
+ if (IS_ERR(opp)) {
+ ret = PTR_ERR(opp);
+ goto out_free_cpufreq_table;
+ }
+ freq_table[i].driver_data = dev_pm_opp_get_level(opp);
+ dev_pm_opp_put(opp);
+ }
+
+ priv->cpu_dev = cpu_dev;
+ priv->reg_base = reg_base;
+ priv->info = info;
+ policy->driver_data = priv;
+ policy->freq_table = freq_table;
+
+ transition_latency = dev_pm_opp_get_max_transition_latency(cpu_dev);
+ if (!transition_latency)
+ transition_latency = CPUFREQ_ETERNAL;
+
+ policy->cpuinfo.transition_latency = transition_latency;
+ policy->dvfs_possible_from_any_cpu = true;
+ policy->fast_switch_possible = true;
+
+ if (policy_has_boost_freq(policy)) {
+ ret = cpufreq_enable_boost_support();
+ if (ret) {
+ dev_warn(cpu_dev, "failed to enable boost: %d\n", ret);
+ } else {
+ apple_soc_cpufreq_hw_attr[1] = &cpufreq_freq_attr_scaling_boost_freqs;
+ apple_soc_cpufreq_driver.boost_enabled = true;
+ }
+ }
+
+ return 0;
+
+out_free_cpufreq_table:
+ dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
+out_free_priv:
+ kfree(priv);
+out_free_opp:
+ dev_pm_opp_remove_all_dynamic(cpu_dev);
+out_iounmap:
+ iounmap(reg_base);
+ return ret;
+}
+
+static int apple_soc_cpufreq_exit(struct cpufreq_policy *policy)
+{
+ struct apple_cpu_priv *priv = policy->driver_data;
+
+ dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
+ dev_pm_opp_remove_all_dynamic(priv->cpu_dev);
+ iounmap(priv->reg_base);
+ kfree(priv);
+
+ return 0;
+}
+
+static struct cpufreq_driver apple_soc_cpufreq_driver = {
+ .name = "apple-cpufreq",
+ .flags = CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
+ CPUFREQ_NEED_INITIAL_FREQ_CHECK | CPUFREQ_IS_COOLING_DEV,
+ .verify = cpufreq_generic_frequency_table_verify,
+ .attr = cpufreq_generic_attr,
+ .get = apple_soc_cpufreq_get_rate,
+ .init = apple_soc_cpufreq_init,
+ .exit = apple_soc_cpufreq_exit,
+ .target_index = apple_soc_cpufreq_set_target,
+ .fast_switch = apple_soc_cpufreq_fast_switch,
+ .register_em = cpufreq_register_em_with_opp,
+ .attr = apple_soc_cpufreq_hw_attr,
+};
+
+static int __init apple_soc_cpufreq_module_init(void)
+{
+ if (!of_machine_is_compatible("apple,arm-platform"))
+ return -ENODEV;
+
+ return cpufreq_register_driver(&apple_soc_cpufreq_driver);
+}
+module_init(apple_soc_cpufreq_module_init);
+
+static void __exit apple_soc_cpufreq_module_exit(void)
+{
+ cpufreq_unregister_driver(&apple_soc_cpufreq_driver);
+}
+module_exit(apple_soc_cpufreq_module_exit);
+
+MODULE_DEVICE_TABLE(of, apple_soc_cpufreq_of_match);
+MODULE_AUTHOR("Hector Martin <[email protected]>");
+MODULE_DESCRIPTION("Apple SoC CPU cluster DVFS driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c
index 6ac3800db450..a108b9796770 100644
--- a/drivers/cpufreq/cpufreq-dt-platdev.c
+++ b/drivers/cpufreq/cpufreq-dt-platdev.c
@@ -103,6 +103,8 @@ static const struct of_device_id allowlist[] __initconst = {
static const struct of_device_id blocklist[] __initconst = {
{ .compatible = "allwinner,sun50i-h6", },

+ { .compatible = "apple,arm-platform", },
+
{ .compatible = "arm,vexpress", },

{ .compatible = "calxeda,highbank", },
--
2.35.1

2022-10-24 09:00:03

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 0/5] Apple SoC cpufreq driver

On Mon, 24 Oct 2022 05:39:20 +0100,
Hector Martin <[email protected]> wrote:
>
> Hi folks,
>
> Third time's the charm? Here's v3 of the cpufreq driver for Apple SoCs.
> This version takes a page from both v1 and v2, keeping the dedicated
> cpufreq style (instead of pretending to be a clock controller) but using
> dedicated DT nodes for each cluster, which accurately represents the
> hardware. In particular, this makes supporting t6002 (M1 Ultra) a lot
> more reasonable on the DT side.
>
> This version also switches to the standard performance-domains binding,
> so we don't need any more vendor-specific properties. In order to
> support this, I had to make the performance-domains parsing code more
> generic. This required a minor change to the only consumer
> (mediatek-cpufreq-hw).
>
> The Linux driver probes based on platform compatible, and then attempts
> to locate the cluster nodes by following the performance-domains links
> from CPU nodes (this will then fail for any incompatible nodes, e.g. if
> a future SoC needs a new compatible and can't fall back). This approach
> was suggested by robh as the right way to handle the impedance mismatch
> between the hardware, which has separate controllers per cluster, and
> the Linux model where there can only be one CPUFreq driver instance.
>
> Functionality-wise, there are no significant changes from v2. The only
> notable difference is support for t8112 (M2). This works largely the
> same as the other SoCs, but they ran out of bits in the current PState
> register, so that needs a SoC-specific quirk. Since that register is
> not used by macOS (it was discovered experimentally) and is not critical
> for functionality (it just allows accurately reporting the current
> frequency to userspace, given boost clock limitations), I've decided to
> only use it when a SoC-specific compatible is present. The default
> fallback code will simply report the requested frequency as actual.
> I expect this will work for future SoCs.
>
> As usual, MAINTAINERS and DT changes are split. I expect patches #2~#4
> to go through the cpufreq tree, and we'll take care of #1 and #5 via
> the asahi-soc tree.
>
> Hector Martin (5):
> MAINTAINERS: Add entries for Apple SoC cpufreq driver
> dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC
> cpufreq
> cpufreq: Generalize of_perf_domain_get_sharing_cpumask phandle format
> cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states
> arm64: dts: apple: Add CPU topology & cpufreq nodes for t8103
>
> .../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++
> MAINTAINERS | 2 +
> arch/arm64/boot/dts/apple/t8103.dtsi | 206 +++++++++-
> drivers/cpufreq/Kconfig.arm | 9 +
> drivers/cpufreq/Makefile | 1 +
> drivers/cpufreq/apple-soc-cpufreq.c | 352 ++++++++++++++++++
> drivers/cpufreq/cpufreq-dt-platdev.c | 2 +
> drivers/cpufreq/mediatek-cpufreq-hw.c | 14 +-
> include/linux/cpufreq.h | 28 +-
> 9 files changed, 706 insertions(+), 27 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
> create mode 100644 drivers/cpufreq/apple-soc-cpufreq.c

FWIW, and for the whole series:

Acked-by: Marc Zyngier <[email protected]>

M.

--
Without deviation from the norm, progress is not possible.

2022-10-24 09:02:17

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On Mon, 24 Oct 2022 05:39:24 +0100,
Hector Martin <[email protected]> wrote:
>
> This driver implements CPU frequency scaling for Apple Silicon SoCs,
> including M1 (t8103), M1 Max/Pro/Ultra (t600x), and M2 (t8112).
>
> Each CPU cluster has its own register set, and frequency management is
> fully automated by the hardware; the driver only has to write one
> register. There is boost frequency support, but the hardware will only
> allow their use if only a subset of cores in a cluster are in
> non-deep-idle. Since we don't support deep idle yet, these frequencies
> are not achievable, but the driver supports them. They will remain
> disabled in the device tree until deep idle is implemented, to avoid
> confusing users.
>
> This driver does not yet implement the memory controller performance
> state tuning that usually accompanies higher CPU p-states. This will be
> done in a future patch.
>
> Signed-off-by: Hector Martin <[email protected]>
> ---
> drivers/cpufreq/Kconfig.arm | 9 +
> drivers/cpufreq/Makefile | 1 +
> drivers/cpufreq/apple-soc-cpufreq.c | 352 +++++++++++++++++++++++++++
> drivers/cpufreq/cpufreq-dt-platdev.c | 2 +
> 4 files changed, 364 insertions(+)
> create mode 100644 drivers/cpufreq/apple-soc-cpufreq.c
>

[...]

> +static struct freq_attr *apple_soc_cpufreq_hw_attr[] = {
> + &cpufreq_freq_attr_scaling_available_freqs,
> + NULL,
> + NULL,

nit: extra NULL?

M.

--
Without deviation from the norm, progress is not possible.

2022-10-25 16:08:36

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC cpufreq

On 24/10/2022 00:39, Hector Martin wrote:
> This binding represents the cpufreq/DVFS hardware present in Apple SoCs.
> The hardware has an independent controller per CPU cluster, and we
> represent them as unique nodes in order to accurately describe the
> hardware. The driver is responsible for binding them as a single cpufreq
> device (in the Linux cpufreq model).
>
> Signed-off-by: Hector Martin <[email protected]>
> ---
> .../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++++++++++++++
> 1 file changed, 119 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>
> diff --git a/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
> new file mode 100644
> index 000000000000..b11452f91468
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
> @@ -0,0 +1,119 @@
> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/cpufreq/apple,cluster-cpufreq.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Apple SoC cluster cpufreq device

Few nits, in general looks fine to me.

> +
> +maintainers:
> + - Hector Martin <[email protected]>
> +
> +description: |
> + Apple SoCs (e.g. M1) have a per-cpu-cluster DVFS controller that is part of
> + the cluster management register block. This binding uses the standard
> + operating-points-v2 table to define the CPU performance states, with the
> + opp-level property specifying the hardware p-state index for that level.
> +
> +properties:
> + compatible:
> + oneOf:
> + - items:
> + - const: apple,t8103-cluster-cpufreq
> + - const: apple,cluster-cpufreq
> + - items:
> + - const: apple,t6000-cluster-cpufreq
> + - const: apple,t8103-cluster-cpufreq
> + - const: apple,cluster-cpufreq
> + - items:
> + - const: apple,t8112-cluster-cpufreq

With the first one (t8103) - it's an enum.

> + - const: apple,cluster-cpufreq
> +
> + reg:
> + maxItems: 1
> + description: The register region for this CPU cluster DVFS controller

Drop description, quite obvious.

> +
> + '#performance-domain-cells':
> + const: 0
> +
> +required:
> + - compatible
> + - reg
> + - '#performance-domain-cells'
> +
> +additionalProperties: false
> +
> +examples:
> + - |
> + // This example shows a single CPU per domain and 2 domains,
> + // with two p-states per domain.
> + // Shipping hardware has 2-4 CPUs per domain and 2-6 domains.
> + cpus {
> + #address-cells = <2>;
> + #size-cells = <0>;
> +
> + cpu@0 {
> + compatible = "apple,icestorm";
> + device_type = "cpu";
> + reg = <0x0 0x0>;
> + operating-points-v2 = <&ecluster_opp>;
> + performance-domains = <&cpufreq_e>;
> + };
> +
> + cpu@10100 {
> + compatible = "apple,firestorm";
> + device_type = "cpu";
> + reg = <0x0 0x10100>;
> + operating-points-v2 = <&pcluster_opp>;
> + performance-domains = <&cpufreq_p>;
> + };
> + };
> +
> + ecluster_opp: opp-table-0 {
> + compatible = "operating-points-v2";
> + opp-shared;
> +
> + opp01 {
> + opp-hz = /bits/ 64 <600000000>;
> + opp-level = <1>;
> + clock-latency-ns = <7500>;
> + };
> + opp02 {
> + opp-hz = /bits/ 64 <972000000>;
> + opp-level = <2>;
> + clock-latency-ns = <22000>;
> + };
> + };
> +
> + pcluster_opp: opp-table-1 {
> + compatible = "operating-points-v2";
> + opp-shared;
> +
> + opp01 {
> + opp-hz = /bits/ 64 <600000000>;
> + opp-level = <1>;
> + clock-latency-ns = <8000>;
> + };
> + opp02 {
> + opp-hz = /bits/ 64 <828000000>;
> + opp-level = <2>;
> + clock-latency-ns = <19000>;
> + };
> + };
> +
> + soc {
> + #address-cells = <2>;
> + #size-cells = <2>;
> +
> + cpufreq_e: cpufreq@210e20000 {

Node name: performance-controller

(cpufreq is rather Linux naming)

> + compatible = "apple,t8103-cluster-cpufreq", "apple,cluster-cpufreq";
> + reg = <0x2 0x10e20000 0 0x1000>;
> + #performance-domain-cells = <0>;
> + };
> +
> + cpufreq_p: cpufreq@211e20000 {

The same.

> + compatible = "apple,t8103-cluster-cpufreq", "apple,cluster-cpufreq";
> + reg = <0x2 0x11e20000 0 0x1000>;
> + #performance-domain-cells = <0>;
> + };
> + };

Best regards,
Krzysztof


2022-10-25 16:26:28

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 5/5] arm64: dts: apple: Add CPU topology & cpufreq nodes for t8103

On 24/10/2022 00:39, Hector Martin wrote:
> Add the missing CPU topology/capacity information and the cpufreq nodes,
> so we can have CPU frequency scaling and the scheduler has the
> information it needs to make the correct decisions.
>

Thank you for your patch. There is something to discuss/improve.

> +
> timer {
> compatible = "arm,armv8-timer";
> interrupt-parent = <&aic>;
> @@ -124,6 +298,18 @@ soc {
> ranges;
> nonposted-mmio;
>
> + cpufreq_e: cpufreq@210e20000 {

Node name: performance-controller

> + compatible = "apple,t8103-cluster-cpufreq", "apple,cluster-cpufreq";
> + reg = <0x2 0x10e20000 0 0x1000>;
> + #performance-domain-cells = <0>;
> + };
> +
> + cpufreq_p: cpufreq@211e20000 {

Ditto

> + compatible = "apple,t8103-cluster-cpufreq", "apple,cluster-cpufreq";
> + reg = <0x2 0x11e20000 0 0x1000>;
> + #performance-domain-cells = <0>;
> + };
> +
Krzysztof


2022-10-25 17:33:14

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC cpufreq

On 26/10/2022 01.01, Krzysztof Kozlowski wrote:
> On 24/10/2022 00:39, Hector Martin wrote:
>> This binding represents the cpufreq/DVFS hardware present in Apple SoCs.
>> The hardware has an independent controller per CPU cluster, and we
>> represent them as unique nodes in order to accurately describe the
>> hardware. The driver is responsible for binding them as a single cpufreq
>> device (in the Linux cpufreq model).
>>
>> Signed-off-by: Hector Martin <[email protected]>
>> ---
>> .../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++++++++++++++
>> 1 file changed, 119 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>> new file mode 100644
>> index 000000000000..b11452f91468
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>> @@ -0,0 +1,119 @@
>> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/cpufreq/apple,cluster-cpufreq.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Apple SoC cluster cpufreq device
>
> Few nits, in general looks fine to me.
>
>> +
>> +maintainers:
>> + - Hector Martin <[email protected]>
>> +
>> +description: |
>> + Apple SoCs (e.g. M1) have a per-cpu-cluster DVFS controller that is part of
>> + the cluster management register block. This binding uses the standard
>> + operating-points-v2 table to define the CPU performance states, with the
>> + opp-level property specifying the hardware p-state index for that level.
>> +
>> +properties:
>> + compatible:
>> + oneOf:
>> + - items:
>> + - const: apple,t8103-cluster-cpufreq
>> + - const: apple,cluster-cpufreq
>> + - items:
>> + - const: apple,t6000-cluster-cpufreq
>> + - const: apple,t8103-cluster-cpufreq
>> + - const: apple,cluster-cpufreq
>> + - items:
>> + - const: apple,t8112-cluster-cpufreq
>
> With the first one (t8103) - it's an enum.

This is deliberate. t6000 is compatible with t8103, but t8112 is not
(though all are compatible with what the generic apple,cluster-cpufreq
compatible implies).

Ack on the rest, I'll spin a v4 in a few days if there are no other
comments. Thanks!

- Hector

2022-10-25 19:15:32

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC cpufreq

On 25/10/2022 13:22, Hector Martin wrote:
> On 26/10/2022 01.01, Krzysztof Kozlowski wrote:
>> On 24/10/2022 00:39, Hector Martin wrote:
>>> This binding represents the cpufreq/DVFS hardware present in Apple SoCs.
>>> The hardware has an independent controller per CPU cluster, and we
>>> represent them as unique nodes in order to accurately describe the
>>> hardware. The driver is responsible for binding them as a single cpufreq
>>> device (in the Linux cpufreq model).
>>>
>>> Signed-off-by: Hector Martin <[email protected]>
>>> ---
>>> .../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++++++++++++++
>>> 1 file changed, 119 insertions(+)
>>> create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>>
>>> diff --git a/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>> new file mode 100644
>>> index 000000000000..b11452f91468
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>> @@ -0,0 +1,119 @@
>>> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
>>> +%YAML 1.2
>>> +---
>>> +$id: http://devicetree.org/schemas/cpufreq/apple,cluster-cpufreq.yaml#
>>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>>> +
>>> +title: Apple SoC cluster cpufreq device
>>
>> Few nits, in general looks fine to me.
>>
>>> +
>>> +maintainers:
>>> + - Hector Martin <[email protected]>
>>> +
>>> +description: |
>>> + Apple SoCs (e.g. M1) have a per-cpu-cluster DVFS controller that is part of
>>> + the cluster management register block. This binding uses the standard
>>> + operating-points-v2 table to define the CPU performance states, with the
>>> + opp-level property specifying the hardware p-state index for that level.
>>> +
>>> +properties:
>>> + compatible:
>>> + oneOf:
>>> + - items:
>>> + - const: apple,t8103-cluster-cpufreq
>>> + - const: apple,cluster-cpufreq
>>> + - items:
>>> + - const: apple,t6000-cluster-cpufreq
>>> + - const: apple,t8103-cluster-cpufreq
>>> + - const: apple,cluster-cpufreq
>>> + - items:
>>> + - const: apple,t8112-cluster-cpufreq
>>
>> With the first one (t8103) - it's an enum.
>
> This is deliberate. t6000 is compatible with t8103, but t8112 is not
> (though all are compatible with what the generic apple,cluster-cpufreq
> compatible implies).

I was not talking about t6000. I was talking about two entries - first
and last - which should be just an enum. There is no compatibility, so
what is here deliberate? To not make enum things which are an enum?

Best regards,
Krzysztof


2022-10-26 00:10:41

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC cpufreq

On Wed, Oct 26, 2022 at 02:22:40AM +0900, Hector Martin wrote:
> On 26/10/2022 01.01, Krzysztof Kozlowski wrote:
> > On 24/10/2022 00:39, Hector Martin wrote:
> >> This binding represents the cpufreq/DVFS hardware present in Apple SoCs.
> >> The hardware has an independent controller per CPU cluster, and we
> >> represent them as unique nodes in order to accurately describe the
> >> hardware. The driver is responsible for binding them as a single cpufreq
> >> device (in the Linux cpufreq model).
> >>
> >> Signed-off-by: Hector Martin <[email protected]>
> >> ---
> >> .../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++++++++++++++
> >> 1 file changed, 119 insertions(+)
> >> create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
> >>
> >> diff --git a/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
> >> new file mode 100644
> >> index 000000000000..b11452f91468
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
> >> @@ -0,0 +1,119 @@
> >> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> >> +%YAML 1.2
> >> +---
> >> +$id: http://devicetree.org/schemas/cpufreq/apple,cluster-cpufreq.yaml#
> >> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> >> +
> >> +title: Apple SoC cluster cpufreq device
> >
> > Few nits, in general looks fine to me.
> >
> >> +
> >> +maintainers:
> >> + - Hector Martin <[email protected]>
> >> +
> >> +description: |
> >> + Apple SoCs (e.g. M1) have a per-cpu-cluster DVFS controller that is part of
> >> + the cluster management register block. This binding uses the standard
> >> + operating-points-v2 table to define the CPU performance states, with the
> >> + opp-level property specifying the hardware p-state index for that level.
> >> +
> >> +properties:
> >> + compatible:
> >> + oneOf:
> >> + - items:
> >> + - const: apple,t8103-cluster-cpufreq
> >> + - const: apple,cluster-cpufreq
> >> + - items:
> >> + - const: apple,t6000-cluster-cpufreq
> >> + - const: apple,t8103-cluster-cpufreq
> >> + - const: apple,cluster-cpufreq
> >> + - items:
> >> + - const: apple,t8112-cluster-cpufreq
> >
> > With the first one (t8103) - it's an enum.
>
> This is deliberate. t6000 is compatible with t8103, but t8112 is not
> (though all are compatible with what the generic apple,cluster-cpufreq
> compatible implies).

What does compatible mean here? IOW, what can a client do with
'apple,cluster-cpufreq' alone? It's one thing for self-contained blocks
to remain unchanged from chip to chip, but things like this tend to
change frequently. It looks like for 4 chips we have 3 different
versions.

Rob

2022-10-26 04:30:50

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC cpufreq

On 26/10/2022 03.56, Krzysztof Kozlowski wrote:
> On 25/10/2022 13:22, Hector Martin wrote:
>> On 26/10/2022 01.01, Krzysztof Kozlowski wrote:
>>> On 24/10/2022 00:39, Hector Martin wrote:
>>>> This binding represents the cpufreq/DVFS hardware present in Apple SoCs.
>>>> The hardware has an independent controller per CPU cluster, and we
>>>> represent them as unique nodes in order to accurately describe the
>>>> hardware. The driver is responsible for binding them as a single cpufreq
>>>> device (in the Linux cpufreq model).
>>>>
>>>> Signed-off-by: Hector Martin <[email protected]>
>>>> ---
>>>> .../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++++++++++++++
>>>> 1 file changed, 119 insertions(+)
>>>> create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>>> new file mode 100644
>>>> index 000000000000..b11452f91468
>>>> --- /dev/null
>>>> +++ b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>>> @@ -0,0 +1,119 @@
>>>> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
>>>> +%YAML 1.2
>>>> +---
>>>> +$id: http://devicetree.org/schemas/cpufreq/apple,cluster-cpufreq.yaml#
>>>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>>>> +
>>>> +title: Apple SoC cluster cpufreq device
>>>
>>> Few nits, in general looks fine to me.
>>>
>>>> +
>>>> +maintainers:
>>>> + - Hector Martin <[email protected]>
>>>> +
>>>> +description: |
>>>> + Apple SoCs (e.g. M1) have a per-cpu-cluster DVFS controller that is part of
>>>> + the cluster management register block. This binding uses the standard
>>>> + operating-points-v2 table to define the CPU performance states, with the
>>>> + opp-level property specifying the hardware p-state index for that level.
>>>> +
>>>> +properties:
>>>> + compatible:
>>>> + oneOf:
>>>> + - items:
>>>> + - const: apple,t8103-cluster-cpufreq
>>>> + - const: apple,cluster-cpufreq
>>>> + - items:
>>>> + - const: apple,t6000-cluster-cpufreq
>>>> + - const: apple,t8103-cluster-cpufreq
>>>> + - const: apple,cluster-cpufreq
>>>> + - items:
>>>> + - const: apple,t8112-cluster-cpufreq
>>>
>>> With the first one (t8103) - it's an enum.
>>
>> This is deliberate. t6000 is compatible with t8103, but t8112 is not
>> (though all are compatible with what the generic apple,cluster-cpufreq
>> compatible implies).
>
> I was not talking about t6000. I was talking about two entries - first
> and last - which should be just an enum. There is no compatibility, so
> what is here deliberate? To not make enum things which are an enum?

Sorry, I didn't understand what you meant. You mean that the two entries
should be merged, with an enum for the first item listing both SoCs, right?

- Hector

2022-10-26 05:10:57

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC cpufreq

On 26/10/2022 08.12, Rob Herring wrote:
> On Wed, Oct 26, 2022 at 02:22:40AM +0900, Hector Martin wrote:
>> On 26/10/2022 01.01, Krzysztof Kozlowski wrote:
>>> On 24/10/2022 00:39, Hector Martin wrote:
>>>> This binding represents the cpufreq/DVFS hardware present in Apple SoCs.
>>>> The hardware has an independent controller per CPU cluster, and we
>>>> represent them as unique nodes in order to accurately describe the
>>>> hardware. The driver is responsible for binding them as a single cpufreq
>>>> device (in the Linux cpufreq model).
>>>>
>>>> Signed-off-by: Hector Martin <[email protected]>
>>>> ---
>>>> .../cpufreq/apple,cluster-cpufreq.yaml | 119 ++++++++++++++++++
>>>> 1 file changed, 119 insertions(+)
>>>> create mode 100644 Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>>> new file mode 100644
>>>> index 000000000000..b11452f91468
>>>> --- /dev/null
>>>> +++ b/Documentation/devicetree/bindings/cpufreq/apple,cluster-cpufreq.yaml
>>>> @@ -0,0 +1,119 @@
>>>> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
>>>> +%YAML 1.2
>>>> +---
>>>> +$id: http://devicetree.org/schemas/cpufreq/apple,cluster-cpufreq.yaml#
>>>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>>>> +
>>>> +title: Apple SoC cluster cpufreq device
>>>
>>> Few nits, in general looks fine to me.
>>>
>>>> +
>>>> +maintainers:
>>>> + - Hector Martin <[email protected]>
>>>> +
>>>> +description: |
>>>> + Apple SoCs (e.g. M1) have a per-cpu-cluster DVFS controller that is part of
>>>> + the cluster management register block. This binding uses the standard
>>>> + operating-points-v2 table to define the CPU performance states, with the
>>>> + opp-level property specifying the hardware p-state index for that level.
>>>> +
>>>> +properties:
>>>> + compatible:
>>>> + oneOf:
>>>> + - items:
>>>> + - const: apple,t8103-cluster-cpufreq
>>>> + - const: apple,cluster-cpufreq
>>>> + - items:
>>>> + - const: apple,t6000-cluster-cpufreq
>>>> + - const: apple,t8103-cluster-cpufreq
>>>> + - const: apple,cluster-cpufreq
>>>> + - items:
>>>> + - const: apple,t8112-cluster-cpufreq
>>>
>>> With the first one (t8103) - it's an enum.
>>
>> This is deliberate. t6000 is compatible with t8103, but t8112 is not
>> (though all are compatible with what the generic apple,cluster-cpufreq
>> compatible implies).
>
> What does compatible mean here? IOW, what can a client do with
> 'apple,cluster-cpufreq' alone? It's one thing for self-contained blocks
> to remain unchanged from chip to chip, but things like this tend to
> change frequently. It looks like for 4 chips we have 3 different
> versions.

This is described in the cover letter. The actual cpufreq control is
identical for all shipping SoCs right now (that's 5 SoCs, since t6000 is
actually also t6001 and t6002) and will work with just that generic
compatible (and almost certainly quite a few SoC generations going back
too). It's just that I found a useful register that gives you feedback
on the *actual* pstate, and that register field shifted one bit on t8112
because they ran out of bits. If the driver finds a t8103 or t8112
compatible, it will use that register to accurately report the current
frequency (subject to boost frequency restrictions). If it doesn't, it
will just report the requested frequency as actual. t6000 is compatible
with t8103 in this regard, hence the tiering. I expect lots of future
SoCs to be compatible with t8112, since although they exceeded 16
pstates there, I doubt they'll push beyond 32 and have to move it
another bit any time soon.

Right *now*, since boost frequencies are unachievable and disabled due
to reasons unrelated to this driver, all compatibles are, in fact,
completely equivalent in functionality for end users, and nothing would
change if we just had `apple,cluster-cpufreq` in the DT. This will
change once we get cpuidle support, which unlocks boost frequencies as a
side effect, but that will require no changes to this driver/series
(other than uncommenting the extra OPPs in the DT).

- Hector

2022-10-26 14:40:26

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v3 2/5] dt-bindings: cpufreq: apple,soc-cpufreq: Add binding for Apple SoC cpufreq

On 26/10/2022 00:18, Hector Martin wrote:
>>>> With the first one (t8103) - it's an enum.
>>>
>>> This is deliberate. t6000 is compatible with t8103, but t8112 is not
>>> (though all are compatible with what the generic apple,cluster-cpufreq
>>> compatible implies).
>>
>> I was not talking about t6000. I was talking about two entries - first
>> and last - which should be just an enum. There is no compatibility, so
>> what is here deliberate? To not make enum things which are an enum?
>
> Sorry, I didn't understand what you meant. You mean that the two entries
> should be merged, with an enum for the first item listing both SoCs, right?

Yes

Best regards,
Krzysztof


2022-11-01 16:00:25

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On Mon, 24 Oct 2022 at 06:40, Hector Martin <[email protected]> wrote:
>
> This driver implements CPU frequency scaling for Apple Silicon SoCs,
> including M1 (t8103), M1 Max/Pro/Ultra (t600x), and M2 (t8112).
>
> Each CPU cluster has its own register set, and frequency management is
> fully automated by the hardware; the driver only has to write one
> register. There is boost frequency support, but the hardware will only
> allow their use if only a subset of cores in a cluster are in
> non-deep-idle. Since we don't support deep idle yet, these frequencies
> are not achievable, but the driver supports them. They will remain
> disabled in the device tree until deep idle is implemented, to avoid
> confusing users.

Out of curiosity, may I ask if this implies the need of a
synchronization mechanism on the Linux side? Or is the boost frequency
dynamically managed solely by HW/FW?

>
> This driver does not yet implement the memory controller performance
> state tuning that usually accompanies higher CPU p-states. This will be
> done in a future patch.
>
> Signed-off-by: Hector Martin <[email protected]>

Other than the question above, I intend to help to review the series
in more detail soon.

Kind regards
Uffe

> ---
> drivers/cpufreq/Kconfig.arm | 9 +
> drivers/cpufreq/Makefile | 1 +
> drivers/cpufreq/apple-soc-cpufreq.c | 352 +++++++++++++++++++++++++++
> drivers/cpufreq/cpufreq-dt-platdev.c | 2 +
> 4 files changed, 364 insertions(+)
> create mode 100644 drivers/cpufreq/apple-soc-cpufreq.c
>
> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> index 82e5de1f6f8c..29969f84008a 100644
> --- a/drivers/cpufreq/Kconfig.arm
> +++ b/drivers/cpufreq/Kconfig.arm
> @@ -41,6 +41,15 @@ config ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM
> To compile this driver as a module, choose M here: the
> module will be called sun50i-cpufreq-nvmem.
>
> +config ARM_APPLE_SOC_CPUFREQ
> + tristate "Apple Silicon SoC CPUFreq support"
> + depends on ARCH_APPLE || (COMPILE_TEST && 64BIT)
> + select PM_OPP
> + default ARCH_APPLE
> + help
> + This adds the CPUFreq driver for Apple Silicon machines
> + (e.g. Apple M1).
> +
> config ARM_ARMADA_37XX_CPUFREQ
> tristate "Armada 37xx CPUFreq support"
> depends on ARCH_MVEBU && CPUFREQ_DT
> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
> index 49b98c62c5af..32a7029e25ed 100644
> --- a/drivers/cpufreq/Makefile
> +++ b/drivers/cpufreq/Makefile
> @@ -52,6 +52,7 @@ obj-$(CONFIG_X86_AMD_FREQ_SENSITIVITY) += amd_freq_sensitivity.o
>
> ##################################################################################
> # ARM SoC drivers
> +obj-$(CONFIG_ARM_APPLE_SOC_CPUFREQ) += apple-soc-cpufreq.o
> obj-$(CONFIG_ARM_ARMADA_37XX_CPUFREQ) += armada-37xx-cpufreq.o
> obj-$(CONFIG_ARM_ARMADA_8K_CPUFREQ) += armada-8k-cpufreq.o
> obj-$(CONFIG_ARM_BRCMSTB_AVS_CPUFREQ) += brcmstb-avs-cpufreq.o
> diff --git a/drivers/cpufreq/apple-soc-cpufreq.c b/drivers/cpufreq/apple-soc-cpufreq.c
> new file mode 100644
> index 000000000000..12c4b490edb8
> --- /dev/null
> +++ b/drivers/cpufreq/apple-soc-cpufreq.c
> @@ -0,0 +1,352 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Apple SoC CPU cluster performance state driver
> + *
> + * Copyright The Asahi Linux Contributors
> + *
> + * Based on scpi-cpufreq.c
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/bitops.h>
> +#include <linux/cpu.h>
> +#include <linux/cpufreq.h>
> +#include <linux/cpumask.h>
> +#include <linux/delay.h>
> +#include <linux/err.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/pm_opp.h>
> +#include <linux/slab.h>
> +
> +#define APPLE_DVFS_CMD 0x20
> +#define APPLE_DVFS_CMD_BUSY BIT(31)
> +#define APPLE_DVFS_CMD_SET BIT(25)
> +#define APPLE_DVFS_CMD_PS2 GENMASK(16, 12)
> +#define APPLE_DVFS_CMD_PS1 GENMASK(4, 0)
> +
> +/* Same timebase as CPU counter (24MHz) */
> +#define APPLE_DVFS_LAST_CHG_TIME 0x38
> +
> +/*
> + * Apple ran out of bits and had to shift this in T8112...
> + */
> +#define APPLE_DVFS_STATUS 0x50
> +#define APPLE_DVFS_STATUS_CUR_PS_T8103 GENMASK(7, 4)
> +#define APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8103 4
> +#define APPLE_DVFS_STATUS_TGT_PS_T8103 GENMASK(3, 0)
> +#define APPLE_DVFS_STATUS_CUR_PS_T8112 GENMASK(9, 5)
> +#define APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8112 5
> +#define APPLE_DVFS_STATUS_TGT_PS_T8112 GENMASK(4, 0)
> +
> +/*
> + * Div is +1, base clock is 12MHz on existing SoCs.
> + * For documentation purposes. We use the OPP table to
> + * get the frequency.
> + */
> +#define APPLE_DVFS_PLL_STATUS 0xc0
> +#define APPLE_DVFS_PLL_FACTOR 0xc8
> +#define APPLE_DVFS_PLL_FACTOR_MULT GENMASK(31, 16)
> +#define APPLE_DVFS_PLL_FACTOR_DIV GENMASK(15, 0)
> +
> +#define APPLE_DVFS_TRANSITION_TIMEOUT 100
> +
> +struct apple_soc_cpufreq_info {
> + u64 max_pstate;
> + u64 cur_pstate_mask;
> + u64 cur_pstate_shift;
> +};
> +
> +struct apple_cpu_priv {
> + struct device *cpu_dev;
> + void __iomem *reg_base;
> + const struct apple_soc_cpufreq_info *info;
> +};
> +
> +static struct cpufreq_driver apple_soc_cpufreq_driver;
> +
> +const struct apple_soc_cpufreq_info soc_t8103_info = {
> + .max_pstate = 15,
> + .cur_pstate_mask = APPLE_DVFS_STATUS_CUR_PS_T8103,
> + .cur_pstate_shift = APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8103,
> +};
> +
> +const struct apple_soc_cpufreq_info soc_t8112_info = {
> + .max_pstate = 31,
> + .cur_pstate_mask = APPLE_DVFS_STATUS_CUR_PS_T8112,
> + .cur_pstate_shift = APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8112,
> +};
> +
> +const struct apple_soc_cpufreq_info soc_default_info = {
> + .max_pstate = 15,
> + .cur_pstate_mask = 0, /* fallback */
> +};
> +
> +static const struct of_device_id apple_soc_cpufreq_of_match[] = {
> + {
> + .compatible = "apple,t8103-cluster-cpufreq",
> + .data = &soc_t8103_info,
> + },
> + {
> + .compatible = "apple,t8112-cluster-cpufreq",
> + .data = &soc_t8112_info,
> + },
> + {
> + .compatible = "apple,cluster-cpufreq",
> + .data = &soc_default_info,
> + },
> + {}
> +};
> +
> +static unsigned int apple_soc_cpufreq_get_rate(unsigned int cpu)
> +{
> + struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu);
> + struct apple_cpu_priv *priv = policy->driver_data;
> + unsigned int pstate;
> + unsigned int i;
> +
> + if (priv->info->cur_pstate_mask) {
> + u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_STATUS);
> +
> + pstate = (reg & priv->info->cur_pstate_mask) >> priv->info->cur_pstate_shift;
> + } else {
> + /*
> + * For the fallback case we might not know the layout of DVFS_STATUS,
> + * so just use the command register value (which ignores boost limitations).
> + */
> + u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_CMD);
> +
> + pstate = FIELD_GET(APPLE_DVFS_CMD_PS1, reg);
> + }
> +
> + for (i = 0; policy->freq_table[i].frequency != CPUFREQ_TABLE_END; i++)
> + if (policy->freq_table[i].driver_data == pstate)
> + return policy->freq_table[i].frequency;
> +
> + dev_err(priv->cpu_dev, "could not find frequency for pstate %d\n",
> + pstate);
> + return 0;
> +}
> +
> +static int apple_soc_cpufreq_set_target(struct cpufreq_policy *policy,
> + unsigned int index)
> +{
> + struct apple_cpu_priv *priv = policy->driver_data;
> + unsigned int pstate = policy->freq_table[index].driver_data;
> + u64 reg;
> +
> + /* Fallback for newer SoCs */
> + if (index > priv->info->max_pstate)
> + index = priv->info->max_pstate;
> +
> + if (readq_poll_timeout_atomic(priv->reg_base + APPLE_DVFS_CMD, reg,
> + !(reg & APPLE_DVFS_CMD_BUSY), 2,
> + APPLE_DVFS_TRANSITION_TIMEOUT)) {
> + return -EIO;
> + }
> +
> + reg &= ~(APPLE_DVFS_CMD_PS1 | APPLE_DVFS_CMD_PS2);
> + reg |= FIELD_PREP(APPLE_DVFS_CMD_PS1, pstate);
> + reg |= FIELD_PREP(APPLE_DVFS_CMD_PS2, pstate);
> + reg |= APPLE_DVFS_CMD_SET;
> +
> + writeq_relaxed(reg, priv->reg_base + APPLE_DVFS_CMD);
> +
> + return 0;
> +}
> +
> +static unsigned int apple_soc_cpufreq_fast_switch(struct cpufreq_policy *policy,
> + unsigned int target_freq)
> +{
> + if (apple_soc_cpufreq_set_target(policy, policy->cached_resolved_idx) < 0)
> + return 0;
> +
> + return policy->freq_table[policy->cached_resolved_idx].frequency;
> +}
> +
> +static int apple_soc_cpufreq_find_cluster(struct cpufreq_policy *policy,
> + void __iomem **reg_base,
> + const struct apple_soc_cpufreq_info **info)
> +{
> + struct of_phandle_args args;
> + const struct of_device_id *match;
> + int ret = 0;
> +
> + ret = of_perf_domain_get_sharing_cpumask(policy->cpu, "performance-domains",
> + "#performance-domain-cells",
> + policy->cpus, &args);
> + if (ret < 0)
> + return ret;
> +
> + match = of_match_node(apple_soc_cpufreq_of_match, args.np);
> + of_node_put(args.np);
> + if (!match)
> + return -ENODEV;
> +
> + *info = match->data;
> +
> + *reg_base = of_iomap(args.np, 0);
> + if (IS_ERR(*reg_base))
> + return PTR_ERR(*reg_base);
> +
> + return 0;
> +}
> +
> +static struct freq_attr *apple_soc_cpufreq_hw_attr[] = {
> + &cpufreq_freq_attr_scaling_available_freqs,
> + NULL,
> + NULL,
> +};
> +
> +static int apple_soc_cpufreq_init(struct cpufreq_policy *policy)
> +{
> + int ret, i;
> + unsigned int transition_latency;
> + void __iomem *reg_base;
> + struct device *cpu_dev;
> + struct apple_cpu_priv *priv;
> + const struct apple_soc_cpufreq_info *info;
> + struct cpufreq_frequency_table *freq_table;
> +
> + cpu_dev = get_cpu_device(policy->cpu);
> + if (!cpu_dev) {
> + pr_err("failed to get cpu%d device\n", policy->cpu);
> + return -ENODEV;
> + }
> +
> + ret = dev_pm_opp_of_add_table(cpu_dev);
> + if (ret < 0) {
> + dev_err(cpu_dev, "%s: failed to add OPP table: %d\n", __func__, ret);
> + return ret;
> + }
> +
> + ret = apple_soc_cpufreq_find_cluster(policy, &reg_base, &info);
> + if (ret) {
> + dev_err(cpu_dev, "%s: failed to get cluster info: %d\n", __func__, ret);
> + return ret;
> + }
> +
> + ret = dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);
> + if (ret) {
> + dev_err(cpu_dev, "%s: failed to mark OPPs as shared: %d\n", __func__, ret);
> + goto out_iounmap;
> + }
> +
> + ret = dev_pm_opp_get_opp_count(cpu_dev);
> + if (ret <= 0) {
> + dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
> + ret = -EPROBE_DEFER;
> + goto out_free_opp;
> + }
> +
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> + if (!priv) {
> + ret = -ENOMEM;
> + goto out_free_opp;
> + }
> +
> + ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
> + if (ret) {
> + dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret);
> + goto out_free_priv;
> + }
> +
> + /* Get OPP levels (p-state indexes) and stash them in driver_data */
> + for (i = 0; freq_table[i].frequency != CPUFREQ_TABLE_END; i++) {
> + unsigned long rate = freq_table[i].frequency * 1000;
> + struct dev_pm_opp *opp = dev_pm_opp_find_freq_floor(cpu_dev, &rate);
> +
> + if (IS_ERR(opp)) {
> + ret = PTR_ERR(opp);
> + goto out_free_cpufreq_table;
> + }
> + freq_table[i].driver_data = dev_pm_opp_get_level(opp);
> + dev_pm_opp_put(opp);
> + }
> +
> + priv->cpu_dev = cpu_dev;
> + priv->reg_base = reg_base;
> + priv->info = info;
> + policy->driver_data = priv;
> + policy->freq_table = freq_table;
> +
> + transition_latency = dev_pm_opp_get_max_transition_latency(cpu_dev);
> + if (!transition_latency)
> + transition_latency = CPUFREQ_ETERNAL;
> +
> + policy->cpuinfo.transition_latency = transition_latency;
> + policy->dvfs_possible_from_any_cpu = true;
> + policy->fast_switch_possible = true;
> +
> + if (policy_has_boost_freq(policy)) {
> + ret = cpufreq_enable_boost_support();
> + if (ret) {
> + dev_warn(cpu_dev, "failed to enable boost: %d\n", ret);
> + } else {
> + apple_soc_cpufreq_hw_attr[1] = &cpufreq_freq_attr_scaling_boost_freqs;
> + apple_soc_cpufreq_driver.boost_enabled = true;
> + }
> + }
> +
> + return 0;
> +
> +out_free_cpufreq_table:
> + dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
> +out_free_priv:
> + kfree(priv);
> +out_free_opp:
> + dev_pm_opp_remove_all_dynamic(cpu_dev);
> +out_iounmap:
> + iounmap(reg_base);
> + return ret;
> +}
> +
> +static int apple_soc_cpufreq_exit(struct cpufreq_policy *policy)
> +{
> + struct apple_cpu_priv *priv = policy->driver_data;
> +
> + dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
> + dev_pm_opp_remove_all_dynamic(priv->cpu_dev);
> + iounmap(priv->reg_base);
> + kfree(priv);
> +
> + return 0;
> +}
> +
> +static struct cpufreq_driver apple_soc_cpufreq_driver = {
> + .name = "apple-cpufreq",
> + .flags = CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
> + CPUFREQ_NEED_INITIAL_FREQ_CHECK | CPUFREQ_IS_COOLING_DEV,
> + .verify = cpufreq_generic_frequency_table_verify,
> + .attr = cpufreq_generic_attr,
> + .get = apple_soc_cpufreq_get_rate,
> + .init = apple_soc_cpufreq_init,
> + .exit = apple_soc_cpufreq_exit,
> + .target_index = apple_soc_cpufreq_set_target,
> + .fast_switch = apple_soc_cpufreq_fast_switch,
> + .register_em = cpufreq_register_em_with_opp,
> + .attr = apple_soc_cpufreq_hw_attr,
> +};
> +
> +static int __init apple_soc_cpufreq_module_init(void)
> +{
> + if (!of_machine_is_compatible("apple,arm-platform"))
> + return -ENODEV;
> +
> + return cpufreq_register_driver(&apple_soc_cpufreq_driver);
> +}
> +module_init(apple_soc_cpufreq_module_init);
> +
> +static void __exit apple_soc_cpufreq_module_exit(void)
> +{
> + cpufreq_unregister_driver(&apple_soc_cpufreq_driver);
> +}
> +module_exit(apple_soc_cpufreq_module_exit);
> +
> +MODULE_DEVICE_TABLE(of, apple_soc_cpufreq_of_match);
> +MODULE_AUTHOR("Hector Martin <[email protected]>");
> +MODULE_DESCRIPTION("Apple SoC CPU cluster DVFS driver");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c
> index 6ac3800db450..a108b9796770 100644
> --- a/drivers/cpufreq/cpufreq-dt-platdev.c
> +++ b/drivers/cpufreq/cpufreq-dt-platdev.c
> @@ -103,6 +103,8 @@ static const struct of_device_id allowlist[] __initconst = {
> static const struct of_device_id blocklist[] __initconst = {
> { .compatible = "allwinner,sun50i-h6", },
>
> + { .compatible = "apple,arm-platform", },
> +
> { .compatible = "arm,vexpress", },
>
> { .compatible = "calxeda,highbank", },
> --
> 2.35.1
>

2022-11-01 18:23:01

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On 02/11/2022 00.16, Ulf Hansson wrote:
> On Mon, 24 Oct 2022 at 06:40, Hector Martin <[email protected]> wrote:
>>
>> This driver implements CPU frequency scaling for Apple Silicon SoCs,
>> including M1 (t8103), M1 Max/Pro/Ultra (t600x), and M2 (t8112).
>>
>> Each CPU cluster has its own register set, and frequency management is
>> fully automated by the hardware; the driver only has to write one
>> register. There is boost frequency support, but the hardware will only
>> allow their use if only a subset of cores in a cluster are in
>> non-deep-idle. Since we don't support deep idle yet, these frequencies
>> are not achievable, but the driver supports them. They will remain
>> disabled in the device tree until deep idle is implemented, to avoid
>> confusing users.
>
> Out of curiosity, may I ask if this implies the need of a
> synchronization mechanism on the Linux side? Or is the boost frequency
> dynamically managed solely by HW/FW?

It's managed by hardware - Linux gets to request whatever frequency it
wants, and the hardware will limit it to what is achievable given the
current idle states within the cluster (and it will change automatically
with them). So if Linux asks for 3.2 GHz but there are no deep idle
cores in the cluster, you get 3.0. If there's one deep idle core, you
get 3.1 (I think). Three, 3.2. So this driver doesn't have to do
anything (and will report the correct current-frequency as long as the
per-SoC compatible is matched; without that this feature is disabled and
it just reports the requested frequency). We could enable the boost
states today just fine, it's just that they would never actually be
reached by the hardware.

- Hector

2022-11-02 06:09:52

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH v3 3/5] cpufreq: Generalize of_perf_domain_get_sharing_cpumask phandle format

On 24-10-22, 13:39, Hector Martin wrote:
> of_perf_domain_get_sharing_cpumask currently assumes a 1-argument
> phandle format, and directly returns the argument. Generalize this to
> return the full of_phandle_args, so it can be used by drivers which use
> other phandle styles (e.g. separate nodes). This also requires changing
> the CPU sharing match to compare the full args structure.
>
> Also, make sure to of_node_put(args.np) (the original code was leaking a
> reference).
>
> Signed-off-by: Hector Martin <[email protected]>
> ---
> drivers/cpufreq/mediatek-cpufreq-hw.c | 14 +++++++++-----
> include/linux/cpufreq.h | 28 +++++++++++++++------------
> 2 files changed, 25 insertions(+), 17 deletions(-)

Applied. Thanks.

--
viresh

2022-11-02 06:58:32

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On 24-10-22, 13:39, Hector Martin wrote:
> diff --git a/drivers/cpufreq/apple-soc-cpufreq.c b/drivers/cpufreq/apple-soc-cpufreq.c
> +struct apple_soc_cpufreq_info {
> + u64 max_pstate;
> + u64 cur_pstate_mask;
> + u64 cur_pstate_shift;
> +};
> +
> +struct apple_cpu_priv {
> + struct device *cpu_dev;
> + void __iomem *reg_base;
> + const struct apple_soc_cpufreq_info *info;
> +};
> +
> +static struct cpufreq_driver apple_soc_cpufreq_driver;
> +
> +const struct apple_soc_cpufreq_info soc_t8103_info = {

static ? For other instances too.

> + .max_pstate = 15,
> + .cur_pstate_mask = APPLE_DVFS_STATUS_CUR_PS_T8103,
> + .cur_pstate_shift = APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8103,
> +};
> +
> +const struct apple_soc_cpufreq_info soc_t8112_info = {
> + .max_pstate = 31,
> + .cur_pstate_mask = APPLE_DVFS_STATUS_CUR_PS_T8112,
> + .cur_pstate_shift = APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8112,
> +};
> +
> +const struct apple_soc_cpufreq_info soc_default_info = {
> + .max_pstate = 15,
> + .cur_pstate_mask = 0, /* fallback */
> +};
> +
> +static const struct of_device_id apple_soc_cpufreq_of_match[] = {
> + {
> + .compatible = "apple,t8103-cluster-cpufreq",
> + .data = &soc_t8103_info,
> + },
> + {

Isn't the preferred way for this is "}, {" instead ?

I couldn't find this in Coding Guidelines, but somehow remember that
to be the preferred format.

> + .compatible = "apple,t8112-cluster-cpufreq",
> + .data = &soc_t8112_info,
> + },
> + {
> + .compatible = "apple,cluster-cpufreq",
> + .data = &soc_default_info,
> + },
> + {}
> +};
> +
> +static unsigned int apple_soc_cpufreq_get_rate(unsigned int cpu)
> +{
> + struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu);
> + struct apple_cpu_priv *priv = policy->driver_data;
> + unsigned int pstate;
> + unsigned int i;

Merge these two ?

> +
> + if (priv->info->cur_pstate_mask) {
> + u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_STATUS);
> +
> + pstate = (reg & priv->info->cur_pstate_mask) >> priv->info->cur_pstate_shift;
> + } else {
> + /*
> + * For the fallback case we might not know the layout of DVFS_STATUS,
> + * so just use the command register value (which ignores boost limitations).
> + */
> + u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_CMD);
> +
> + pstate = FIELD_GET(APPLE_DVFS_CMD_PS1, reg);
> + }
> +
> + for (i = 0; policy->freq_table[i].frequency != CPUFREQ_TABLE_END; i++)

You may want to use, cpufreq_for_each_valid_entry(), or some other
generic iterator here.

> + if (policy->freq_table[i].driver_data == pstate)
> + return policy->freq_table[i].frequency;
> +
> + dev_err(priv->cpu_dev, "could not find frequency for pstate %d\n",
> + pstate);
> + return 0;
> +}
> +
> +static int apple_soc_cpufreq_set_target(struct cpufreq_policy *policy,
> + unsigned int index)
> +{
> + struct apple_cpu_priv *priv = policy->driver_data;
> + unsigned int pstate = policy->freq_table[index].driver_data;
> + u64 reg;
> +
> + /* Fallback for newer SoCs */
> + if (index > priv->info->max_pstate)
> + index = priv->info->max_pstate;
> +
> + if (readq_poll_timeout_atomic(priv->reg_base + APPLE_DVFS_CMD, reg,
> + !(reg & APPLE_DVFS_CMD_BUSY), 2,
> + APPLE_DVFS_TRANSITION_TIMEOUT)) {
> + return -EIO;
> + }
> +
> + reg &= ~(APPLE_DVFS_CMD_PS1 | APPLE_DVFS_CMD_PS2);
> + reg |= FIELD_PREP(APPLE_DVFS_CMD_PS1, pstate);
> + reg |= FIELD_PREP(APPLE_DVFS_CMD_PS2, pstate);
> + reg |= APPLE_DVFS_CMD_SET;
> +
> + writeq_relaxed(reg, priv->reg_base + APPLE_DVFS_CMD);
> +
> + return 0;
> +}
> +
> +static unsigned int apple_soc_cpufreq_fast_switch(struct cpufreq_policy *policy,
> + unsigned int target_freq)
> +{
> + if (apple_soc_cpufreq_set_target(policy, policy->cached_resolved_idx) < 0)
> + return 0;
> +
> + return policy->freq_table[policy->cached_resolved_idx].frequency;
> +}
> +
> +static int apple_soc_cpufreq_find_cluster(struct cpufreq_policy *policy,
> + void __iomem **reg_base,
> + const struct apple_soc_cpufreq_info **info)
> +{
> + struct of_phandle_args args;
> + const struct of_device_id *match;
> + int ret = 0;
> +
> + ret = of_perf_domain_get_sharing_cpumask(policy->cpu, "performance-domains",
> + "#performance-domain-cells",
> + policy->cpus, &args);
> + if (ret < 0)
> + return ret;
> +
> + match = of_match_node(apple_soc_cpufreq_of_match, args.np);
> + of_node_put(args.np);
> + if (!match)
> + return -ENODEV;
> +
> + *info = match->data;
> +
> + *reg_base = of_iomap(args.np, 0);
> + if (IS_ERR(*reg_base))
> + return PTR_ERR(*reg_base);
> +
> + return 0;
> +}
> +
> +static struct freq_attr *apple_soc_cpufreq_hw_attr[] = {
> + &cpufreq_freq_attr_scaling_available_freqs,
> + NULL,
> + NULL,
> +};
> +
> +static int apple_soc_cpufreq_init(struct cpufreq_policy *policy)
> +{
> + int ret, i;
> + unsigned int transition_latency;
> + void __iomem *reg_base;
> + struct device *cpu_dev;
> + struct apple_cpu_priv *priv;
> + const struct apple_soc_cpufreq_info *info;
> + struct cpufreq_frequency_table *freq_table;
> +
> + cpu_dev = get_cpu_device(policy->cpu);
> + if (!cpu_dev) {
> + pr_err("failed to get cpu%d device\n", policy->cpu);
> + return -ENODEV;
> + }
> +
> + ret = dev_pm_opp_of_add_table(cpu_dev);
> + if (ret < 0) {
> + dev_err(cpu_dev, "%s: failed to add OPP table: %d\n", __func__, ret);
> + return ret;
> + }
> +
> + ret = apple_soc_cpufreq_find_cluster(policy, &reg_base, &info);
> + if (ret) {
> + dev_err(cpu_dev, "%s: failed to get cluster info: %d\n", __func__, ret);
> + return ret;
> + }
> +
> + ret = dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);

Why do you need this ? The OPP core should be able to find this
information by itself in your case AFAIU. The OPP core will refer
"operating-points-v2 = <&pcluster_opp>" and find that the cores are
related.

> + if (ret) {
> + dev_err(cpu_dev, "%s: failed to mark OPPs as shared: %d\n", __func__, ret);
> + goto out_iounmap;
> + }
> +
> + ret = dev_pm_opp_get_opp_count(cpu_dev);
> + if (ret <= 0) {
> + dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");

Why would this happen in your case ?

> + ret = -EPROBE_DEFER;
> + goto out_free_opp;
> + }
> +
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> + if (!priv) {
> + ret = -ENOMEM;
> + goto out_free_opp;
> + }
> +
> + ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
> + if (ret) {
> + dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret);
> + goto out_free_priv;
> + }
> +
> + /* Get OPP levels (p-state indexes) and stash them in driver_data */
> + for (i = 0; freq_table[i].frequency != CPUFREQ_TABLE_END; i++) {
> + unsigned long rate = freq_table[i].frequency * 1000;
> + struct dev_pm_opp *opp = dev_pm_opp_find_freq_floor(cpu_dev, &rate);

Shouldn't you use dev_pm_opp_find_freq_exact() here ?

--
viresh

2022-11-09 12:17:55

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On 24/10/2022 17.27, Marc Zyngier wrote:
> On Mon, 24 Oct 2022 05:39:24 +0100,
> Hector Martin <[email protected]> wrote:
>>
>> This driver implements CPU frequency scaling for Apple Silicon SoCs,
>> including M1 (t8103), M1 Max/Pro/Ultra (t600x), and M2 (t8112).
>>
>> Each CPU cluster has its own register set, and frequency management is
>> fully automated by the hardware; the driver only has to write one
>> register. There is boost frequency support, but the hardware will only
>> allow their use if only a subset of cores in a cluster are in
>> non-deep-idle. Since we don't support deep idle yet, these frequencies
>> are not achievable, but the driver supports them. They will remain
>> disabled in the device tree until deep idle is implemented, to avoid
>> confusing users.
>>
>> This driver does not yet implement the memory controller performance
>> state tuning that usually accompanies higher CPU p-states. This will be
>> done in a future patch.
>>
>> Signed-off-by: Hector Martin <[email protected]>
>> ---
>> drivers/cpufreq/Kconfig.arm | 9 +
>> drivers/cpufreq/Makefile | 1 +
>> drivers/cpufreq/apple-soc-cpufreq.c | 352 +++++++++++++++++++++++++++
>> drivers/cpufreq/cpufreq-dt-platdev.c | 2 +
>> 4 files changed, 364 insertions(+)
>> create mode 100644 drivers/cpufreq/apple-soc-cpufreq.c
>>
>
> [...]
>
>> +static struct freq_attr *apple_soc_cpufreq_hw_attr[] = {
>> + &cpufreq_freq_attr_scaling_available_freqs,
>> + NULL,
>> + NULL,
>
> nit: extra NULL?

That slot gets filled in later if boost is enabled, hence the need for
an extra terminating NULL in that case.

- Hector

2022-11-09 13:31:58

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On 02/11/2022 15.18, Viresh Kumar wrote:
> On 24-10-22, 13:39, Hector Martin wrote:
>> +const struct apple_soc_cpufreq_info soc_t8103_info = {
>
> static ? For other instances too.

Ack, fixed.

>> +static const struct of_device_id apple_soc_cpufreq_of_match[] = {
>> + {
>> + .compatible = "apple,t8103-cluster-cpufreq",
>> + .data = &soc_t8103_info,
>> + },
>> + {
>
> Isn't the preferred way for this is "}, {" instead ?
>
> I couldn't find this in Coding Guidelines, but somehow remember that
> to be the preferred format.

I did an informal search and the two-line form seems to be more common,
though both are in widespread use. I can change it if you want, though
it seems kind of a wash.

>> +static unsigned int apple_soc_cpufreq_get_rate(unsigned int cpu)
>> +{
>> + struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu);
>> + struct apple_cpu_priv *priv = policy->driver_data;
>> + unsigned int pstate;
>> + unsigned int i;
>
> Merge these two ?

Done.

>
>> +
>> + if (priv->info->cur_pstate_mask) {
>> + u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_STATUS);
>> +
>> + pstate = (reg & priv->info->cur_pstate_mask) >> priv->info->cur_pstate_shift;
>> + } else {
>> + /*
>> + * For the fallback case we might not know the layout of DVFS_STATUS,
>> + * so just use the command register value (which ignores boost limitations).
>> + */
>> + u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_CMD);
>> +
>> + pstate = FIELD_GET(APPLE_DVFS_CMD_PS1, reg);
>> + }
>> +
>> + for (i = 0; policy->freq_table[i].frequency != CPUFREQ_TABLE_END; i++)
>
> You may want to use, cpufreq_for_each_valid_entry(), or some other
> generic iterator here.

Done.

>> + ret = dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);
>
> Why do you need this ? The OPP core should be able to find this
> information by itself in your case AFAIU. The OPP core will refer
> "operating-points-v2 = <&pcluster_opp>" and find that the cores are
> related.

We have multiple clusters sharing an OPP table (e.g. the M1 Ultra has 2
e-cluster and 4 p-clusters, and duplicating OPP tables seems very
silly), so this is necessary to tell it about the subset of cores
sharing a table that are actually one domain.

>
>> + if (ret) {
>> + dev_err(cpu_dev, "%s: failed to mark OPPs as shared: %d\n", __func__, ret);
>> + goto out_iounmap;
>> + }
>> +
>> + ret = dev_pm_opp_get_opp_count(cpu_dev);
>> + if (ret <= 0) {
>> + dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
>
> Why would this happen in your case ?

Good question. This came from scpi-cpufreq.c. It sounds like it doesn't
make any sense here; the error path should just error out, not defer.
I'll change it to that.

>> + ret = -EPROBE_DEFER;
>> + goto out_free_opp;
>> + }
>> +
>> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>> + if (!priv) {
>> + ret = -ENOMEM;
>> + goto out_free_opp;
>> + }
>> +
>> + ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
>> + if (ret) {
>> + dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret);
>> + goto out_free_priv;
>> + }
>> +
>> + /* Get OPP levels (p-state indexes) and stash them in driver_data */
>> + for (i = 0; freq_table[i].frequency != CPUFREQ_TABLE_END; i++) {
>> + unsigned long rate = freq_table[i].frequency * 1000;
>> + struct dev_pm_opp *opp = dev_pm_opp_find_freq_floor(cpu_dev, &rate);
>
> Shouldn't you use dev_pm_opp_find_freq_exact() here ?

Actually, it would seem the correct thing to do is
dev_pm_opp_find_freq_ceil, or otherwise use _floor and add 999.
dev_pm_opp_init_cpufreq_table() truncates down to kHz, so the real
frequency will always be between `rate` and `rate + 999` here. This
makes it work with frequencies that aren't a multiple of 1 kHz (we don't
have any of those but it seems broken not to support it).

- Hector

2022-11-09 14:53:12

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On Wed, 09 Nov 2022 12:13:33 +0000,
Hector Martin <[email protected]> wrote:
>
> On 24/10/2022 17.27, Marc Zyngier wrote:
> > On Mon, 24 Oct 2022 05:39:24 +0100,
> > Hector Martin <[email protected]> wrote:
> >>
> >> This driver implements CPU frequency scaling for Apple Silicon SoCs,
> >> including M1 (t8103), M1 Max/Pro/Ultra (t600x), and M2 (t8112).
> >>
> >> Each CPU cluster has its own register set, and frequency management is
> >> fully automated by the hardware; the driver only has to write one
> >> register. There is boost frequency support, but the hardware will only
> >> allow their use if only a subset of cores in a cluster are in
> >> non-deep-idle. Since we don't support deep idle yet, these frequencies
> >> are not achievable, but the driver supports them. They will remain
> >> disabled in the device tree until deep idle is implemented, to avoid
> >> confusing users.
> >>
> >> This driver does not yet implement the memory controller performance
> >> state tuning that usually accompanies higher CPU p-states. This will be
> >> done in a future patch.
> >>
> >> Signed-off-by: Hector Martin <[email protected]>
> >> ---
> >> drivers/cpufreq/Kconfig.arm | 9 +
> >> drivers/cpufreq/Makefile | 1 +
> >> drivers/cpufreq/apple-soc-cpufreq.c | 352 +++++++++++++++++++++++++++
> >> drivers/cpufreq/cpufreq-dt-platdev.c | 2 +
> >> 4 files changed, 364 insertions(+)
> >> create mode 100644 drivers/cpufreq/apple-soc-cpufreq.c
> >>
> >
> > [...]
> >
> >> +static struct freq_attr *apple_soc_cpufreq_hw_attr[] = {
> >> + &cpufreq_freq_attr_scaling_available_freqs,
> >> + NULL,
> >> + NULL,
> >
> > nit: extra NULL?
>
> That slot gets filled in later if boost is enabled, hence the need for
> an extra terminating NULL in that case.

Right. Consider placing a comment next to the first NULL so that
someone else doesn't consider it useless and accidentally removes
it...

M.

--
Without deviation from the norm, progress is not possible.

2022-11-09 16:04:43

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On 09/11/2022 23.20, Marc Zyngier wrote:
> On Wed, 09 Nov 2022 12:13:33 +0000,
> Hector Martin <[email protected]> wrote:
>>
>> On 24/10/2022 17.27, Marc Zyngier wrote:
>>> On Mon, 24 Oct 2022 05:39:24 +0100,
>>> Hector Martin <[email protected]> wrote:
>>>>
>>>> This driver implements CPU frequency scaling for Apple Silicon SoCs,
>>>> including M1 (t8103), M1 Max/Pro/Ultra (t600x), and M2 (t8112).
>>>>
>>>> Each CPU cluster has its own register set, and frequency management is
>>>> fully automated by the hardware; the driver only has to write one
>>>> register. There is boost frequency support, but the hardware will only
>>>> allow their use if only a subset of cores in a cluster are in
>>>> non-deep-idle. Since we don't support deep idle yet, these frequencies
>>>> are not achievable, but the driver supports them. They will remain
>>>> disabled in the device tree until deep idle is implemented, to avoid
>>>> confusing users.
>>>>
>>>> This driver does not yet implement the memory controller performance
>>>> state tuning that usually accompanies higher CPU p-states. This will be
>>>> done in a future patch.
>>>>
>>>> Signed-off-by: Hector Martin <[email protected]>
>>>> ---
>>>> drivers/cpufreq/Kconfig.arm | 9 +
>>>> drivers/cpufreq/Makefile | 1 +
>>>> drivers/cpufreq/apple-soc-cpufreq.c | 352 +++++++++++++++++++++++++++
>>>> drivers/cpufreq/cpufreq-dt-platdev.c | 2 +
>>>> 4 files changed, 364 insertions(+)
>>>> create mode 100644 drivers/cpufreq/apple-soc-cpufreq.c
>>>>
>>>
>>> [...]
>>>
>>>> +static struct freq_attr *apple_soc_cpufreq_hw_attr[] = {
>>>> + &cpufreq_freq_attr_scaling_available_freqs,
>>>> + NULL,
>>>> + NULL,
>>>
>>> nit: extra NULL?
>>
>> That slot gets filled in later if boost is enabled, hence the need for
>> an extra terminating NULL in that case.
>
> Right. Consider placing a comment next to the first NULL so that
> someone else doesn't consider it useless and accidentally removes
> it...
>

Good point, done :)

- Hector

2022-11-14 07:18:55

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On 09-11-22, 21:36, Hector Martin wrote:
> On 02/11/2022 15.18, Viresh Kumar wrote:
> >> + ret = dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);
> >
> > Why do you need this ? The OPP core should be able to find this
> > information by itself in your case AFAIU. The OPP core will refer
> > "operating-points-v2 = <&pcluster_opp>" and find that the cores are
> > related.
>
> We have multiple clusters sharing an OPP table (e.g. the M1 Ultra has 2
> e-cluster and 4 p-clusters, and duplicating OPP tables seems very
> silly), so this is necessary to tell it about the subset of cores
> sharing a table that are actually one domain.

The cluster sharing information is already part of the OPP tables, "opp-shared"
property. Platforms like scpi needed this because they didn't have the OPP table
in DT and so no way to find out the relation of the CPUs.

See how drivers/cpufreq/mediatek-cpufreq.c has done this.
dev_pm_opp_of_get_sharing_cpus() followed by dev_pm_opp_of_cpumask_add_table().

--
viresh

2022-11-14 07:20:30

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On 14-11-22, 15:57, Hector Martin wrote:
> I don't think you understood me. We have multiple identical clusters.
> All those clusters share an OPP table but are *not* the same cpufreq
> domain. So we can have 8 CPUs which are two 4-CPU cluster using one OPP
> table.

I looked at the patch 5/5. It shows two clusters, e and p, with four CPUs in
each cluster. And as per the OPP table all the CPUs in cluster e are part of
same frequency domain, i.e. switch rate together. Same with all CPUs of cluster
p.

And I also see both the clusters have separate OPP tables.

> There is no way to express this relationship with OPP tables without
> duplicating the tables themselves.

Can you show how the DT looks in this case ? I am still not clear on what the
scenario is here :(

--
viresh

2022-11-14 07:52:42

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On 14/11/2022 15.51, Viresh Kumar wrote:
> On 09-11-22, 21:36, Hector Martin wrote:
>> On 02/11/2022 15.18, Viresh Kumar wrote:
>>>> + ret = dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);
>>>
>>> Why do you need this ? The OPP core should be able to find this
>>> information by itself in your case AFAIU. The OPP core will refer
>>> "operating-points-v2 = <&pcluster_opp>" and find that the cores are
>>> related.
>>
>> We have multiple clusters sharing an OPP table (e.g. the M1 Ultra has 2
>> e-cluster and 4 p-clusters, and duplicating OPP tables seems very
>> silly), so this is necessary to tell it about the subset of cores
>> sharing a table that are actually one domain.
>
> The cluster sharing information is already part of the OPP tables, "opp-shared"
> property. Platforms like scpi needed this because they didn't have the OPP table
> in DT and so no way to find out the relation of the CPUs.
>
> See how drivers/cpufreq/mediatek-cpufreq.c has done this.
> dev_pm_opp_of_get_sharing_cpus() followed by dev_pm_opp_of_cpumask_add_table().

I don't think you understood me. We have multiple identical clusters.
All those clusters share an OPP table but are *not* the same cpufreq
domain. So we can have 8 CPUs which are two 4-CPU cluster using one OPP
table.

There is no way to express this relationship with OPP tables without
duplicating the tables themselves.

- Hector

2022-11-14 11:09:59

by Hector Martin

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

For the benefit of everyone else: I screwed up a reply from mobile and
ended up with a brief exchange offline, but here's the useful part of
the context. Viresh also pointed out dropping opp-shared if we don't use
that mechanism, so I'll do that for v4.

On 14/11/2022 16.03, Viresh Kumar wrote:
> On 14-11-22, 15:57, Hector Martin wrote:
>> There is no way to express this relationship with OPP tables without
>> duplicating the tables themselves.
>
> Can you show how the DT looks in this case ? I am still not clear on what the
> scenario is here :(

Here's the most complicated one we have so far: 20 cores and 6 clusters
sharing 2 OPP tables (mind the include and define fun - this is the best
way we could come up with to express two SoC dies glued together into
one MCM).

https://github.com/AsahiLinux/linux/blob/asahi-wip/arch/arm64/boot/dts/apple/t6002.dtsi

- Hector

2022-11-14 20:35:42

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] cpufreq: apple-soc: Add new driver to control Apple SoC CPU P-states

On Tue, 1 Nov 2022 at 19:17, Hector Martin <[email protected]> wrote:
>
> On 02/11/2022 00.16, Ulf Hansson wrote:
> > On Mon, 24 Oct 2022 at 06:40, Hector Martin <[email protected]> wrote:
> >>
> >> This driver implements CPU frequency scaling for Apple Silicon SoCs,
> >> including M1 (t8103), M1 Max/Pro/Ultra (t600x), and M2 (t8112).
> >>
> >> Each CPU cluster has its own register set, and frequency management is
> >> fully automated by the hardware; the driver only has to write one
> >> register. There is boost frequency support, but the hardware will only
> >> allow their use if only a subset of cores in a cluster are in
> >> non-deep-idle. Since we don't support deep idle yet, these frequencies
> >> are not achievable, but the driver supports them. They will remain
> >> disabled in the device tree until deep idle is implemented, to avoid
> >> confusing users.
> >
> > Out of curiosity, may I ask if this implies the need of a
> > synchronization mechanism on the Linux side? Or is the boost frequency
> > dynamically managed solely by HW/FW?
>
> It's managed by hardware - Linux gets to request whatever frequency it
> wants, and the hardware will limit it to what is achievable given the
> current idle states within the cluster (and it will change automatically
> with them). So if Linux asks for 3.2 GHz but there are no deep idle
> cores in the cluster, you get 3.0. If there's one deep idle core, you
> get 3.1 (I think). Three, 3.2. So this driver doesn't have to do
> anything (and will report the correct current-frequency as long as the
> per-SoC compatible is matched; without that this feature is disabled and
> it just reports the requested frequency). We could enable the boost
> states today just fine, it's just that they would never actually be
> reached by the hardware.

Thanks for sharing these details. It's always nice to know a bit more
about how the HW works!

From the reviewing point of view, I don't have more to add at this point!

Kind regards
Uffe