This is v2 of my sun9i SMP support with MCPM series which was started
over two years ago [1]. We've tried to implement PSCI for both the A80
and A83T. Results were not promising. The issue is that these two chips
have a broken security extensions implementation. If a specific bit is
not burned in its e-fuse, most if not all security protections don't
work [2]. Even worse, non-secure accesses to the GIC become secure. This
requires a crazy workaround in the GIC driver which probably doesn't work
in all cases [3].
Nicolas mentioned that the MCPM framework is likely overkill in our
case [4]. However the framework does provide cluster/core state tracking
and proper sequencing of cache related operations. We could rework
the code to use standard smp_ops, but I would like to actually get
a working version in first.
Much of the sunxi-specific MCPM code is derived from Allwinner code and
documentation, with some references to the other MCPM implementations,
as well as the ARM Cortex Technical Reference Manuals for the power
sequencing info.
One major difference compared to other platforms is that we currently do not
have a standalone PMU or other embedded firmware to do the actual power
sequencing. All power/reset control is done by the kernel. Nicolas
mentioned that a new optional callback should be added in cases where the
kernel has to do the actual power down [5]. For now, however, I'm using a
dedicated single-threaded workqueue. CPU and cluster power-off work is
queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
is somewhat heavy, as I have a total of 10 static work structs. It might
also be a bit racy, as nothing prevents the system from bringing a core
back before the asynchronous work shuts it down. This would likely
happen under a heavily loaded system with a scheduler that brings cores
in and out of the system frequently. In simple use-cases it performs OK.
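In code, the approach boils down to the snippet below, condensed from
patch 5 later in this series (the GIC CPU interface teardown and the
WFI polling done by the work functions are omitted for brevity). The
queue itself is created with alloc_ordered_workqueue(), which runs the
items one at a time, in queueing order:

struct sunxi_mcpm_work {
	struct work_struct work;
	unsigned int cpu;
	unsigned int cluster;
};

static struct workqueue_struct *sunxi_mcpm_wq;

/* 8 core + 2 cluster work structs -> the 10 statics mentioned above */
static struct sunxi_mcpm_work sunxi_mcpm_cpu_down_work[2][4];
static struct sunxi_mcpm_work sunxi_mcpm_cluster_down_work[2];

static void sunxi_cpu_powerdown_prepare(unsigned int cpu, unsigned int cluster)
{
	/*
	 * Only queue the work here; the work function later polls for
	 * the core to enter WFI and then actually gates its power.
	 */
	queue_work(sunxi_mcpm_wq,
		   &sunxi_mcpm_cpu_down_work[cluster][cpu].work);
}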
Changes since v1:
- Leading zeroes for device node addresses removed
- Added device tree binding for SMP SRAM
- Simplified Kconfig options
- Switched to SPDX license identifier
- Map CPU to device tree node and check compatible to see if it's
Cortex-A15 or Cortex-A7
- Fixed incorrect CPUCFG cluster status macro that prevented cluster
0 L2 cache WFI detection
- Fixed reversed bit for turning off cluster
- Put cluster in reset before turning off power (or it hangs)
- Added dedicated workqueue for turning off power to cpus and clusters
- Request CPUCFG and SRAM MMIO ranges
- Some comments fixed or added
- Some debug messages added
[1] http://www.spinics.net/lists/arm-kernel/msg418350.html
[2] https://lists.denx.de/pipermail/u-boot/2017-June/294637.html
[3] https://github.com/wens/linux/commit/c48654c1f737116e7a7660183c8c74fa91970528
[4] http://www.spinics.net/lists/arm-kernel/msg434160.html
[5] http://www.spinics.net/lists/arm-kernel/msg434408.html
Chen-Yu Tsai (8):
ARM: sun9i: Support SMP on A80 with Multi-Cluster Power Management
(MCPM)
ARM: dts: sun9i: Add CCI-400 device nodes for A80
ARM: dts: sun9i: Add CPUCFG device node for A80 dtsi
ARM: dts: sun9i: Add PRCM device node for the A80 dtsi
ARM: sun9i: mcpm: Support CPU/cluster power down and hotplugging for
cpu1~7
dt-bindings: ARM: sunxi: Document A80 SoC secure SRAM usage by SMP
hotplug
ARM: sun9i: mcpm: Support cpu0 hotplug
ARM: dts: sun9i: Add secure SRAM node used for MCPM SMP hotplug
.../devicetree/bindings/arm/sunxi/smp-sram.txt | 44 ++
arch/arm/boot/dts/sun9i-a80.dtsi | 75 +++
arch/arm/mach-sunxi/Kconfig | 2 +
arch/arm/mach-sunxi/Makefile | 1 +
arch/arm/mach-sunxi/mcpm.c | 633 +++++++++++++++++++++
5 files changed, 755 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
create mode 100644 arch/arm/mach-sunxi/mcpm.c
--
2.15.1
The BROM has a branch that checks if the primary core is hotplugging.
If the magic flag is set, execution jumps to the address set in the
software entry register. (Secondary cores always branch to that
address.)
This patch sets the flags that make the BROM jump execution on the
primary core (cpu0) to the SMP software entry code when the core is
powered back up. After it is re-integrated into the system, the flag
is cleared.
Signed-off-by: Chen-Yu Tsai <[email protected]>
---
arch/arm/mach-sunxi/mcpm.c | 54 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 51 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
index ddc26b5fec48..c8d74c783644 100644
--- a/arch/arm/mach-sunxi/mcpm.c
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -56,8 +56,12 @@
#define PRCM_PWR_SWITCH_REG(c, cpu) (0x140 + 0x10 * (c) + 0x4 * (cpu))
#define PRCM_CPU_SOFT_ENTRY_REG 0x164
+#define CPU0_SUPPORT_HOTPLUG_MAGIC0 0xFA50392F
+#define CPU0_SUPPORT_HOTPLUG_MAGIC1 0x790DCA3A
+
static void __iomem *cpucfg_base;
static void __iomem *prcm_base;
+static void __iomem *sram_b_smp_base;
static bool sunxi_core_is_cortex_a15(unsigned int core, unsigned int cluster)
{
@@ -116,6 +120,17 @@ static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
return 0;
}
+static void sunxi_cpu0_hotplug_support_set(bool enable)
+{
+ if (enable) {
+ writel(CPU0_SUPPORT_HOTPLUG_MAGIC0, sram_b_smp_base);
+ writel(CPU0_SUPPORT_HOTPLUG_MAGIC1, sram_b_smp_base + 0x4);
+ } else {
+ writel(0x0, sram_b_smp_base);
+ writel(0x0, sram_b_smp_base + 0x4);
+ }
+}
+
static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
{
u32 reg;
@@ -124,6 +139,10 @@ static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
return -EINVAL;
+ /* Set hotplug support magic flags for cpu0 */
+ if (cluster == 0 && cpu == 0)
+ sunxi_cpu0_hotplug_support_set(true);
+
/* assert processor power-on reset */
reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
@@ -406,6 +425,13 @@ static void sunxi_cluster_powerdown(struct work_struct *_work)
sunxi_do_cluster_powerdown(cluster);
}
+static void sunxi_cpu_is_up(unsigned int cpu, unsigned int cluster)
+{
+ /* Clear hotplug support magic flags for cpu0 */
+ if (cluster == 0 && cpu == 0)
+ sunxi_cpu0_hotplug_support_set(false);
+}
+
static int sunxi_wait_for_powerdown(unsigned int cpu, unsigned int cluster)
{
int ret;
@@ -425,6 +451,7 @@ static const struct mcpm_platform_ops sunxi_power_ops = {
.cluster_powerdown_prepare = sunxi_cluster_powerdown_prepare,
.cpu_cache_disable = sunxi_cpu_cache_disable,
.cluster_cache_disable = sunxi_cluster_cache_disable,
+ .cpu_is_up = sunxi_cpu_is_up,
.wait_for_powerdown = sunxi_wait_for_powerdown,
};
@@ -484,7 +511,7 @@ static void sunxi_mcpm_setup_entry_point(void)
static int __init sunxi_mcpm_init(void)
{
- struct device_node *cpucfg_node, *node;
+ struct device_node *cpucfg_node, *sram_node, *node;
struct resource res;
int ret;
int i, j;
@@ -525,12 +552,26 @@ static int __init sunxi_mcpm_init(void)
goto err_put_cpucfg_node;
}
+ sram_node = of_find_compatible_node(NULL, NULL,
+ "allwinner,sun9i-a80-smp-sram");
+ if (!sram_node) {
+ ret = -ENODEV;
+ goto err_unmap_release_cpucfg;
+ }
+
+ sram_b_smp_base = of_io_request_and_map(sram_node, 0, "sunxi-mcpm");
+ if (IS_ERR(sram_b_smp_base)) {
+ ret = PTR_ERR(sram_b_smp_base);
+ pr_err("%s: failed to map secure SRAM\n", __func__);
+ goto err_put_sram_node;
+ }
+
/* Initialize our strictly ordered workqueue */
sunxi_mcpm_wq = alloc_ordered_workqueue("%s", 0, "sunxi-mcpm");
if (!sunxi_mcpm_wq) {
ret = -ENOMEM;
pr_err("%s: failed to create our workqueue\n", __func__);
- goto err_unmap_release_cpucfg;
+ goto err_unmap_release_secure_sram;
}
/* Initialize power down work */
@@ -556,8 +597,9 @@ static int __init sunxi_mcpm_init(void)
if (ret)
goto err_destroy_workqueue;
- /* We don't need the CPUCFG device node anymore */
+ /* We don't need the CPUCFG and SRAM device nodes anymore */
of_node_put(cpucfg_node);
+ of_node_put(sram_node);
/* Set the hardware entry point address */
sunxi_mcpm_setup_entry_point();
@@ -571,6 +613,12 @@ static int __init sunxi_mcpm_init(void)
err_destroy_workqueue:
destroy_workqueue(sunxi_mcpm_wq);
+err_unmap_release_secure_sram:
+ iounmap(sram_b_smp_base);
+ of_address_to_resource(sram_node, 0, &res);
+ release_mem_region(res.start, resource_size(&res));
+err_put_sram_node:
+ of_node_put(sram_node);
err_unmap_release_cpucfg:
iounmap(cpucfg_base);
of_address_to_resource(cpucfg_node, 0, &res);
--
2.15.1
The A80 stores some magic flags in a portion of the secure SRAM. The
BROM jumps directly to the software entry point set by the SMP code
if the flags are set. This is required for CPU0 hotplugging.
Signed-off-by: Chen-Yu Tsai <[email protected]>
---
arch/arm/boot/dts/sun9i-a80.dtsi | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index bf4d40e8359f..b1c86b76ac3c 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -250,6 +250,25 @@
*/
ranges = <0 0 0 0x20000000>;
+ sram_b: sram@20000 {
+ /* 256 KiB secure SRAM at 0x20000 */
+ compatible = "mmio-sram";
+ reg = <0x00020000 0x40000>;
+
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0x00020000 0x40000>;
+
+ smp-sram@1000 {
+ /*
+ * This is checked by BROM to determine if
+ * cpu0 should jump to SMP entry vector
+ */
+ compatible = "allwinner,sun9i-a80-smp-sram";
+ reg = <0x1000 0x8>;
+ };
+ };
+
ehci0: usb@a00000 {
compatible = "allwinner,sun9i-a80-ehci", "generic-ehci";
reg = <0x00a00000 0x100>;
--
2.15.1
CPUCFG is a collection of registers mapped to the SoC's control signals
for each individual processor core and its associated peripherals, such
as resets for the processor cores and the L1/L2 caches.
These registers are used for SMP bringup and CPU hotplugging.
Signed-off-by: Chen-Yu Tsai <[email protected]>
---
arch/arm/boot/dts/sun9i-a80.dtsi | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 85fb800af8ab..85ecb4d64cfd 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -363,6 +363,11 @@
#reset-cells = <1>;
};
+ cpucfg@1700000 {
+ compatible = "allwinner,sun9i-a80-cpucfg";
+ reg = <0x01700000 0x100>;
+ };
+
mmc0: mmc@1c0f000 {
compatible = "allwinner,sun9i-a80-mmc";
reg = <0x01c0f000 0x1000>;
--
2.15.1
On the Allwinner A80 SoC the BROM supports hotplugging the primary core
(cpu0) by checking two 32-bit values at a specific location within the
secure SRAM block. This region needs to be reserved and accessible to
the SMP code.
Document its usage.
Signed-off-by: Chen-Yu Tsai <[email protected]>
---
.../devicetree/bindings/arm/sunxi/smp-sram.txt | 44 ++++++++++++++++++++++
1 file changed, 44 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
diff --git a/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt b/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
new file mode 100644
index 000000000000..082e6a9382d3
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
@@ -0,0 +1,44 @@
+Allwinner SRAM for SMP bringup:
+------------------------------------------------
+
+Allwinner's A80 SoC uses part of the secure SRAM for hotplugging of the
+primary core (cpu0). Once the core gets powered up, it checks if a magic
+value is set at a specific location. If it is, the BROM will jump
+to the software entry address instead of executing a standard boot.
+
+Therefore a reserved section sub-node has to be added to the mmio-sram
+declaration.
+
+Note that this is separate from the Allwinner SRAM controller found in
+../../sram/sunxi-sram.txt. This SRAM is secure only and not mappable to
+any device.
+
+Also there are no "secure-only" properties. The implementation should
+check if this SRAM is usable first.
+
+Required sub-node properties:
+- compatible : depending on the SoC this should be one of:
+ "allwinner,sun9i-a80-smp-sram"
+
+The rest of the properties should follow the generic mmio-sram description
+found in ../../misc/sram.txt.
+
+Example:
+
+ sram_b: sram@20000 {
+ /* 256 KiB secure SRAM at 0x20000 */
+ compatible = "mmio-sram";
+ reg = <0x00020000 0x40000>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0x00020000 0x40000>;
+
+ smp-sram@1000 {
+ /*
+ * This is checked by BROM to determine if
+ * cpu0 should jump to SMP entry vector
+ */
+ compatible = "allwinner,sun9i-a80-smp-sram";
+ reg = <0x1000 0x8>;
+ };
+ };
--
2.15.1
The A80 includes an ARM CCI-400 interconnect to support multi-cluster
CPU caches.
Also add the maximum clock frequency for the CPUs, as listed in the
A80 Optimus Board FEX file.
Signed-off-by: Chen-Yu Tsai <[email protected]>
---
arch/arm/boot/dts/sun9i-a80.dtsi | 46 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 90eac0b2a193..85fb800af8ab 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -63,48 +63,64 @@
cpu0: cpu@0 {
compatible = "arm,cortex-a7";
device_type = "cpu";
+ cci-control-port = <&cci_control0>;
+ clock-frequency = <12000000>;
reg = <0x0>;
};
cpu1: cpu@1 {
compatible = "arm,cortex-a7";
device_type = "cpu";
+ cci-control-port = <&cci_control0>;
+ clock-frequency = <12000000>;
reg = <0x1>;
};
cpu2: cpu@2 {
compatible = "arm,cortex-a7";
device_type = "cpu";
+ cci-control-port = <&cci_control0>;
+ clock-frequency = <12000000>;
reg = <0x2>;
};
cpu3: cpu@3 {
compatible = "arm,cortex-a7";
device_type = "cpu";
+ cci-control-port = <&cci_control0>;
+ clock-frequency = <12000000>;
reg = <0x3>;
};
cpu4: cpu@100 {
compatible = "arm,cortex-a15";
device_type = "cpu";
+ cci-control-port = <&cci_control1>;
+ clock-frequency = <18000000>;
reg = <0x100>;
};
cpu5: cpu@101 {
compatible = "arm,cortex-a15";
device_type = "cpu";
+ cci-control-port = <&cci_control1>;
+ clock-frequency = <18000000>;
reg = <0x101>;
};
cpu6: cpu@102 {
compatible = "arm,cortex-a15";
device_type = "cpu";
+ cci-control-port = <&cci_control1>;
+ clock-frequency = <18000000>;
reg = <0x102>;
};
cpu7: cpu@103 {
compatible = "arm,cortex-a15";
device_type = "cpu";
+ cci-control-port = <&cci_control1>;
+ clock-frequency = <18000000>;
reg = <0x103>;
};
};
@@ -431,6 +447,36 @@
interrupts = <GIC_PPI 9 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
};
+ cci: cci@1c90000 {
+ compatible = "arm,cci-400";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ reg = <0x01c90000 0x1000>;
+ ranges = <0x0 0x01c90000 0x10000>;
+
+ cci_control0: slave-if@4000 {
+ compatible = "arm,cci-400-ctrl-if";
+ interface-type = "ace";
+ reg = <0x4000 0x1000>;
+ };
+
+ cci_control1: slave-if@5000 {
+ compatible = "arm,cci-400-ctrl-if";
+ interface-type = "ace";
+ reg = <0x5000 0x1000>;
+ };
+
+ pmu@9000 {
+ compatible = "arm,cci-400-pmu,r1";
+ reg = <0x9000 0x5000>;
+ interrupts = <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 134 IRQ_TYPE_LEVEL_HIGH>;
+ };
+ };
+
de_clocks: clock@3000000 {
compatible = "allwinner,sun9i-a80-de-clks";
reg = <0x03000000 0x30>;
--
2.15.1
This patch adds common code used to power down all cores and clusters.
The code is quite long. The common MCPM library does not provide a
callback for doing the actual power down sequence. Instead it assumes
some other part (maybe a power management coprocessor) will handle it.
Since our platform does not have one, we resort to a single-threaded
ordered workqueue, which guarantees that work items execute one at a
time in the order they were queued, so the cluster power-down code does
not run before the core power-down code. Getting that order wrong would
not be catastrophic, but it would leave the cluster powered on and
wasting power.
This might be a bit racy, as nothing prevents the system from bringing a
core or cluster back before the asynchronous work shuts it down. This would
likely happen under a heavily loaded system with a scheduler that brings
cores in and out of the system frequently. It would either result in a
stall on a single core, or worse, hang the system if a cluster is abruptly
turned off. In simple use-cases it performs OK.
The primary core (cpu0) requires setting flags to have the BROM bounce
execution to the SMP software entry code. This is done in a subsequent
patch to keep the changes cleanly separated.
Signed-off-by: Chen-Yu Tsai <[email protected]>
---
arch/arm/mach-sunxi/mcpm.c | 170 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 165 insertions(+), 5 deletions(-)
diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
index 30719998f3f0..ddc26b5fec48 100644
--- a/arch/arm/mach-sunxi/mcpm.c
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -12,9 +12,12 @@
#include <linux/arm-cci.h>
#include <linux/delay.h>
#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/irqchip/arm-gic.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
+#include <linux/workqueue.h>
#include <asm/cputype.h>
#include <asm/cp15.h>
@@ -30,6 +33,9 @@
#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15 BIT(0)
#define CPUCFG_CX_CTRL_REG1(c) (0x10 * (c) + 0x4)
#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
+#define CPUCFG_CX_STATUS(c) (0x30 + 0x4 * (c))
+#define CPUCFG_CX_STATUS_STANDBYWFI(n) BIT(16 + (n))
+#define CPUCFG_CX_STATUS_STANDBYWFIL2 BIT(0)
#define CPUCFG_CX_RST_CTRL(c) (0x80 + 0x4 * (c))
#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST BIT(24)
#define CPUCFG_CX_RST_CTRL_ETM_RST(n) BIT(20 + (n))
@@ -237,6 +243,30 @@ static int sunxi_cluster_powerup(unsigned int cluster)
return 0;
}
+struct sunxi_mcpm_work {
+ struct work_struct work;
+ unsigned int cpu;
+ unsigned int cluster;
+};
+
+static struct workqueue_struct *sunxi_mcpm_wq;
+static struct sunxi_mcpm_work sunxi_mcpm_cpu_down_work[2][4];
+static struct sunxi_mcpm_work sunxi_mcpm_cluster_down_work[2];
+
+static void sunxi_cpu_powerdown_prepare(unsigned int cpu, unsigned int cluster)
+{
+ gic_cpu_if_down(0);
+
+ queue_work(sunxi_mcpm_wq,
+ &sunxi_mcpm_cpu_down_work[cluster][cpu].work);
+}
+
+static void sunxi_cluster_powerdown_prepare(unsigned int cluster)
+{
+ queue_work(sunxi_mcpm_wq,
+ &sunxi_mcpm_cluster_down_work[cluster].work);
+}
+
static void sunxi_cpu_cache_disable(void)
{
/* Disable and flush the local CPU cache. */
@@ -286,11 +316,116 @@ static void sunxi_cluster_cache_disable(void)
writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
}
+static int sunxi_do_cpu_powerdown(unsigned int cpu, unsigned int cluster)
+{
+ u32 reg;
+
+ pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+ if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
+ return -EINVAL;
+
+ /* gate processor power */
+ reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+ reg |= PRCM_PWROFF_GATING_REG_CORE(cpu);
+ writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+ udelay(20);
+
+ /* close power switch */
+ sunxi_cpu_power_switch_set(cpu, cluster, false);
+
+ return 0;
+}
+
+static int sunxi_do_cluster_powerdown(unsigned int cluster)
+{
+ u32 reg;
+
+ pr_debug("%s: cluster %u\n", __func__, cluster);
+ if (cluster >= SUNXI_NR_CLUSTERS)
+ return -EINVAL;
+
+ /* assert cluster resets */
+ pr_debug("%s: assert cluster reset\n", __func__);
+ reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+ reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+ reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
+ reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
+ writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+ /* gate cluster power */
+ pr_debug("%s: gate cluster power\n", __func__);
+ reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+ reg |= PRCM_PWROFF_GATING_REG_CLUSTER;
+ writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+ udelay(20);
+
+ return 0;
+}
+
+static struct sunxi_mcpm_work *to_sunxi_mcpm_work(struct work_struct *work)
+{
+ return container_of(work, struct sunxi_mcpm_work, work);
+}
+
+/* async. work functions to bring down cpus and clusters */
+static void sunxi_cpu_powerdown(struct work_struct *_work)
+{
+ struct sunxi_mcpm_work *work = to_sunxi_mcpm_work(_work);
+ unsigned int cluster = work->cluster, cpu = work->cpu;
+ int ret;
+ u32 reg;
+
+ /* wait for CPU core to enter WFI */
+ ret = readl_poll_timeout(cpucfg_base + CPUCFG_CX_STATUS(cluster), reg,
+ reg & CPUCFG_CX_STATUS_STANDBYWFI(cpu),
+ 1000, 100000);
+
+ if (ret)
+ return;
+
+ /* power down CPU core */
+ sunxi_do_cpu_powerdown(cpu, cluster);
+}
+
+static void sunxi_cluster_powerdown(struct work_struct *_work)
+{
+ struct sunxi_mcpm_work *work = to_sunxi_mcpm_work(_work);
+ unsigned int cluster = work->cluster;
+ int ret;
+ u32 reg;
+
+ pr_debug("%s: cluster %u\n", __func__, cluster);
+
+ /* wait for cluster L2 WFI */
+ ret = readl_poll_timeout(cpucfg_base + CPUCFG_CX_STATUS(cluster), reg,
+ reg & CPUCFG_CX_STATUS_STANDBYWFIL2,
+ 1000, 100000);
+ if (ret)
+ return;
+
+ sunxi_do_cluster_powerdown(cluster);
+}
+
+static int sunxi_wait_for_powerdown(unsigned int cpu, unsigned int cluster)
+{
+ int ret;
+ u32 reg;
+
+ ret = readl_poll_timeout(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu),
+ reg, reg == 0xff, 1000, 100000);
+ pr_debug("%s: cpu %u cluster %u powerdown: %s\n", __func__,
+ cpu, cluster, ret ? "timed out" : "done");
+ return ret;
+}
+
static const struct mcpm_platform_ops sunxi_power_ops = {
- .cpu_powerup = sunxi_cpu_powerup,
- .cluster_powerup = sunxi_cluster_powerup,
- .cpu_cache_disable = sunxi_cpu_cache_disable,
- .cluster_cache_disable = sunxi_cluster_cache_disable,
+ .cpu_powerup = sunxi_cpu_powerup,
+ .cpu_powerdown_prepare = sunxi_cpu_powerdown_prepare,
+ .cluster_powerup = sunxi_cluster_powerup,
+ .cluster_powerdown_prepare = sunxi_cluster_powerdown_prepare,
+ .cpu_cache_disable = sunxi_cpu_cache_disable,
+ .cluster_cache_disable = sunxi_cluster_cache_disable,
+ .wait_for_powerdown = sunxi_wait_for_powerdown,
};
/*
@@ -352,6 +487,7 @@ static int __init sunxi_mcpm_init(void)
struct device_node *cpucfg_node, *node;
struct resource res;
int ret;
+ int i, j;
if (!of_machine_is_compatible("allwinner,sun9i-a80"))
return -ENODEV;
@@ -389,6 +525,28 @@ static int __init sunxi_mcpm_init(void)
goto err_put_cpucfg_node;
}
+ /* Initialize our strictly ordered workqueue */
+ sunxi_mcpm_wq = alloc_ordered_workqueue("%s", 0, "sunxi-mcpm");
+ if (!sunxi_mcpm_wq) {
+ ret = -ENOMEM;
+ pr_err("%s: failed to create our workqueue\n", __func__);
+ goto err_unmap_release_cpucfg;
+ }
+
+ /* Initialize power down work */
+ for (i = 0; i < SUNXI_NR_CLUSTERS; i++) {
+ for (j = 0; j < SUNXI_CPUS_PER_CLUSTER; j++) {
+ sunxi_mcpm_cpu_down_work[i][j].cluster = i;
+ sunxi_mcpm_cpu_down_work[i][j].cpu = j;
+ INIT_WORK(&sunxi_mcpm_cpu_down_work[i][j].work,
+ sunxi_cpu_powerdown);
+ }
+
+ sunxi_mcpm_cluster_down_work[i].cluster = i;
+ INIT_WORK(&sunxi_mcpm_cluster_down_work[i].work,
+ sunxi_cluster_powerdown);
+ }
+
ret = mcpm_platform_register(&sunxi_power_ops);
if (!ret)
ret = mcpm_sync_init(sunxi_power_up_setup);
@@ -396,7 +554,7 @@ static int __init sunxi_mcpm_init(void)
/* do not disable AXI master as no one will re-enable it */
ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
if (ret)
- goto err_unmap_release_cpucfg;
+ goto err_destroy_workqueue;
/* We don't need the CPUCFG device node anymore */
of_node_put(cpucfg_node);
@@ -411,6 +569,8 @@ static int __init sunxi_mcpm_init(void)
return ret;
+err_destroy_workqueue:
+ destroy_workqueue(sunxi_mcpm_wq);
err_unmap_release_cpucfg:
iounmap(cpucfg_base);
of_address_to_resource(cpucfg_node, 0, &res);
--
2.15.1
The A80 is a big.LITTLE SoC with 1 cluster of 4 Cortex-A7s and
1 cluster of 4 Cortex-A15s.
This patch adds support to bring up the second cluster and thus all
cores using the common MCPM code. Core/cluster power down has not
been implemented, so CPU hotplugging and the big.LITTLE switcher are
not supported.
Signed-off-by: Chen-Yu Tsai <[email protected]>
---
arch/arm/mach-sunxi/Kconfig | 2 +
arch/arm/mach-sunxi/Makefile | 1 +
arch/arm/mach-sunxi/mcpm.c | 425 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 428 insertions(+)
create mode 100644 arch/arm/mach-sunxi/mcpm.c
diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
index 58153cdf025b..53c4e7420cfb 100644
--- a/arch/arm/mach-sunxi/Kconfig
+++ b/arch/arm/mach-sunxi/Kconfig
@@ -47,5 +47,7 @@ config MACH_SUN9I
bool "Allwinner (sun9i) SoCs support"
default ARCH_SUNXI
select ARM_GIC
+ select MCPM if SMP
+ select ARM_CCI400_PORT_CTRL if SMP
endif
diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
index 27b168f121a1..cd25d9d81257 100644
--- a/arch/arm/mach-sunxi/Makefile
+++ b/arch/arm/mach-sunxi/Makefile
@@ -1,2 +1,3 @@
obj-$(CONFIG_ARCH_SUNXI) += sunxi.o
+obj-$(CONFIG_MCPM) += mcpm.o
obj-$(CONFIG_SMP) += platsmp.o
diff --git a/arch/arm/mach-sunxi/mcpm.c b/arch/arm/mach-sunxi/mcpm.c
new file mode 100644
index 000000000000..30719998f3f0
--- /dev/null
+++ b/arch/arm/mach-sunxi/mcpm.c
@@ -0,0 +1,425 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2015 Chen-Yu Tsai
+ *
+ * Chen-Yu Tsai <[email protected]>
+ *
+ * arch/arm/mach-sunxi/mcpm.c
+ *
+ * Based on arch/arm/mach-exynos/mcpm-exynos.c and Allwinner code
+ */
+
+#include <linux/arm-cci.h>
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+#include <asm/mcpm.h>
+
+#define SUNXI_CPUS_PER_CLUSTER 4
+#define SUNXI_NR_CLUSTERS 2
+
+#define CPUCFG_CX_CTRL_REG0(c) (0x10 * (c))
+#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(n) BIT(n)
+#define CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL 0xf
+#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7 BIT(4)
+#define CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15 BIT(0)
+#define CPUCFG_CX_CTRL_REG1(c) (0x10 * (c) + 0x4)
+#define CPUCFG_CX_CTRL_REG1_ACINACTM BIT(0)
+#define CPUCFG_CX_RST_CTRL(c) (0x80 + 0x4 * (c))
+#define CPUCFG_CX_RST_CTRL_DBG_SOC_RST BIT(24)
+#define CPUCFG_CX_RST_CTRL_ETM_RST(n) BIT(20 + (n))
+#define CPUCFG_CX_RST_CTRL_ETM_RST_ALL (0xf << 20)
+#define CPUCFG_CX_RST_CTRL_DBG_RST(n) BIT(16 + (n))
+#define CPUCFG_CX_RST_CTRL_DBG_RST_ALL (0xf << 16)
+#define CPUCFG_CX_RST_CTRL_H_RST BIT(12)
+#define CPUCFG_CX_RST_CTRL_L2_RST BIT(8)
+#define CPUCFG_CX_RST_CTRL_CX_RST(n) BIT(4 + (n))
+#define CPUCFG_CX_RST_CTRL_CORE_RST(n) BIT(n)
+
+#define PRCM_CPU_PO_RST_CTRL(c) (0x4 + 0x4 * (c))
+#define PRCM_CPU_PO_RST_CTRL_CORE(n) BIT(n)
+#define PRCM_CPU_PO_RST_CTRL_CORE_ALL 0xf
+#define PRCM_PWROFF_GATING_REG(c) (0x100 + 0x4 * (c))
+#define PRCM_PWROFF_GATING_REG_CLUSTER BIT(4)
+#define PRCM_PWROFF_GATING_REG_CORE(n) BIT(n)
+#define PRCM_PWR_SWITCH_REG(c, cpu) (0x140 + 0x10 * (c) + 0x4 * (cpu))
+#define PRCM_CPU_SOFT_ENTRY_REG 0x164
+
+static void __iomem *cpucfg_base;
+static void __iomem *prcm_base;
+
+static bool sunxi_core_is_cortex_a15(unsigned int core, unsigned int cluster)
+{
+ struct device_node *node;
+ int cpu = cluster * SUNXI_CPUS_PER_CLUSTER + core;
+
+ node = of_cpu_device_node_get(cpu);
+
+ /* In case of_cpu_device_node_get fails */
+ if (!node)
+ node = of_get_cpu_node(cpu, NULL);
+
+ if (!node) {
+ /*
+ * There's no point in returning an error, since we
+ * would be mid way in a core or cluster power sequence.
+ */
+ pr_err("%s: Couldn't get CPU cluster %u core %u device node\n",
+ __func__, cluster, core);
+
+ return false;
+ }
+
+ return of_device_is_compatible(node, "arm,cortex-a15");
+}
+
+static int sunxi_cpu_power_switch_set(unsigned int cpu, unsigned int cluster,
+ bool enable)
+{
+ u32 reg;
+
+ /* control sequence from Allwinner A80 user manual v1.2 PRCM section */
+ reg = readl(prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+ if (enable) {
+ if (reg == 0x00) {
+ pr_debug("power clamp for cluster %u cpu %u already open\n",
+ cluster, cpu);
+ return 0;
+ }
+
+ writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+ udelay(10);
+ writel(0xfe, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+ udelay(10);
+ writel(0xf8, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+ udelay(10);
+ writel(0xf0, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+ udelay(10);
+ writel(0x00, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+ udelay(10);
+ } else {
+ writel(0xff, prcm_base + PRCM_PWR_SWITCH_REG(cluster, cpu));
+ udelay(10);
+ }
+
+ return 0;
+}
+
+static int sunxi_cpu_powerup(unsigned int cpu, unsigned int cluster)
+{
+ u32 reg;
+
+ pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+ if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
+ return -EINVAL;
+
+ /* assert processor power-on reset */
+ reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+ reg &= ~PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+ writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+ /* Cortex-A7: hold L1 reset disable signal low */
+ if (!sunxi_core_is_cortex_a15(cpu, cluster)) {
+ reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+ reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE(cpu);
+ writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+ }
+
+ /* assert processor related resets */
+ reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+ reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+
+ /*
+ * Allwinner code also asserts resets for NEON on A15. According
+ * to ARM manuals, asserting power-on reset is sufficient.
+ */
+ if (!sunxi_core_is_cortex_a15(cpu, cluster))
+ reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+
+ writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+ /* open power switch */
+ sunxi_cpu_power_switch_set(cpu, cluster, true);
+
+ /* clear processor power gate */
+ reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+ reg &= ~PRCM_PWROFF_GATING_REG_CORE(cpu);
+ writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+ udelay(20);
+
+ /* de-assert processor power-on reset */
+ reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+ reg |= PRCM_CPU_PO_RST_CTRL_CORE(cpu);
+ writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+ /* de-assert all processor resets */
+ reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+ reg |= CPUCFG_CX_RST_CTRL_DBG_RST(cpu);
+ reg |= CPUCFG_CX_RST_CTRL_CORE_RST(cpu);
+ if (!sunxi_core_is_cortex_a15(cpu, cluster))
+ reg |= CPUCFG_CX_RST_CTRL_ETM_RST(cpu);
+ else
+ reg |= CPUCFG_CX_RST_CTRL_CX_RST(cpu); /* NEON */
+ writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+ return 0;
+}
+
+static int sunxi_cluster_powerup(unsigned int cluster)
+{
+ u32 reg;
+
+ pr_debug("%s: cluster %u\n", __func__, cluster);
+ if (cluster >= SUNXI_NR_CLUSTERS)
+ return -EINVAL;
+
+ /* assert ACINACTM */
+ reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+ reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+ writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+ /* assert cluster processor power-on resets */
+ reg = readl(prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+ reg &= ~PRCM_CPU_PO_RST_CTRL_CORE_ALL;
+ writel(reg, prcm_base + PRCM_CPU_PO_RST_CTRL(cluster));
+
+ /* assert cluster resets */
+ reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+ reg &= ~CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+ reg &= ~CPUCFG_CX_RST_CTRL_DBG_RST_ALL;
+ reg &= ~CPUCFG_CX_RST_CTRL_H_RST;
+ reg &= ~CPUCFG_CX_RST_CTRL_L2_RST;
+
+ /*
+ * Allwinner code also asserts resets for NEON on A15. According
+ * to ARM manuals, asserting power-on reset is sufficient.
+ */
+ if (!sunxi_core_is_cortex_a15(0, cluster))
+ reg &= ~CPUCFG_CX_RST_CTRL_ETM_RST_ALL;
+
+ writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+ /* hold L1/L2 reset disable signals low */
+ reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+ if (sunxi_core_is_cortex_a15(0, cluster)) {
+ /* Cortex-A15: hold L2RSTDISABLE low */
+ reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A15;
+ } else {
+ /* Cortex-A7: hold L1RSTDISABLE and L2RSTDISABLE low */
+ reg &= ~CPUCFG_CX_CTRL_REG0_L1_RST_DISABLE_ALL;
+ reg &= ~CPUCFG_CX_CTRL_REG0_L2_RST_DISABLE_A7;
+ }
+ writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG0(cluster));
+
+ /* clear cluster power gate */
+ reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+ reg &= ~PRCM_PWROFF_GATING_REG_CLUSTER;
+ writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
+ udelay(20);
+
+ /* de-assert cluster resets */
+ reg = readl(cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+ reg |= CPUCFG_CX_RST_CTRL_DBG_SOC_RST;
+ reg |= CPUCFG_CX_RST_CTRL_H_RST;
+ reg |= CPUCFG_CX_RST_CTRL_L2_RST;
+ writel(reg, cpucfg_base + CPUCFG_CX_RST_CTRL(cluster));
+
+ /* de-assert ACINACTM */
+ reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+ reg &= ~CPUCFG_CX_CTRL_REG1_ACINACTM;
+ writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+
+ return 0;
+}
+
+static void sunxi_cpu_cache_disable(void)
+{
+ /* Disable and flush the local CPU cache. */
+ v7_exit_coherency_flush(louis);
+}
+
+/*
+ * This bit is shared between the initial mcpm_sync_init call to enable
+ * CCI-400 and proper cluster cache disable before power down.
+ */
+static void sunxi_cluster_cache_disable_without_axi(void)
+{
+ if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A15) {
+ /*
+ * On the Cortex-A15 we need to disable
+ * L2 prefetching before flushing the cache.
+ */
+ asm volatile(
+ "mcr p15, 1, %0, c15, c0, 3\n"
+ "isb\n"
+ "dsb"
+ : : "r" (0x400));
+ }
+
+ /* Flush all cache levels for this cluster. */
+ v7_exit_coherency_flush(all);
+
+ /*
+ * Disable cluster-level coherency by masking
+ * incoming snoops and DVM messages:
+ */
+ cci_disable_port_by_cpu(read_cpuid_mpidr());
+}
+
+static void sunxi_cluster_cache_disable(void)
+{
+ unsigned int cluster = MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 1);
+ u32 reg;
+
+ pr_debug("%s: cluster %u\n", __func__, cluster);
+
+ sunxi_cluster_cache_disable_without_axi();
+
+ /* last man standing, assert ACINACTM */
+ reg = readl(cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+ reg |= CPUCFG_CX_CTRL_REG1_ACINACTM;
+ writel(reg, cpucfg_base + CPUCFG_CX_CTRL_REG1(cluster));
+}
+
+static const struct mcpm_platform_ops sunxi_power_ops = {
+ .cpu_powerup = sunxi_cpu_powerup,
+ .cluster_powerup = sunxi_cluster_powerup,
+ .cpu_cache_disable = sunxi_cpu_cache_disable,
+ .cluster_cache_disable = sunxi_cluster_cache_disable,
+};
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ *
+ * Also enable regional clock gating and L2 data latency settings for
+ * Cortex-A15. These settings are from the vendor kernel.
+ */
+static void __naked sunxi_power_up_setup(unsigned int affinity_level)
+{
+ asm volatile (
+ "mrc p15, 0, r1, c0, c0, 0\n"
+ "movw r2, #" __stringify(ARM_CPU_PART_MASK & 0xffff) "\n"
+ "movt r2, #" __stringify(ARM_CPU_PART_MASK >> 16) "\n"
+ "and r1, r1, r2\n"
+ "movw r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 & 0xffff) "\n"
+ "movt r2, #" __stringify(ARM_CPU_PART_CORTEX_A15 >> 16) "\n"
+ "cmp r1, r2\n"
+ "bne not_a15\n"
+
+ /* The following is Cortex-A15 specific */
+
+ /* ACTLR2: Enable CPU regional clock gates */
+ "mrc p15, 1, r1, c15, c0, 4\n"
+ "orr r1, r1, #(0x1<<31)\n"
+ "mcr p15, 1, r1, c15, c0, 4\n"
+
+ /* L2ACTLR */
+ "mrc p15, 1, r1, c15, c0, 0\n"
+ /* Enable L2, GIC, and Timer regional clock gates */
+ "orr r1, r1, #(0x1<<26)\n"
+ /* Disable clean/evict from being pushed to external */
+ "orr r1, r1, #(0x1<<3)\n"
+ "mcr p15, 1, r1, c15, c0, 0\n"
+
+ /* L2CTRL: L2 data RAM latency */
+ "mrc p15, 1, r1, c9, c0, 2\n"
+ "bic r1, r1, #(0x7<<0)\n"
+ "orr r1, r1, #(0x3<<0)\n"
+ "mcr p15, 1, r1, c9, c0, 2\n"
+
+ /* End of Cortex-A15 specific setup */
+ "not_a15:\n"
+
+ "cmp r0, #1\n"
+ "bxne lr\n"
+ "b cci_enable_port_for_self"
+ );
+}
+
+static void sunxi_mcpm_setup_entry_point(void)
+{
+ __raw_writel(virt_to_phys(mcpm_entry_point),
+ prcm_base + PRCM_CPU_SOFT_ENTRY_REG);
+}
+
+static int __init sunxi_mcpm_init(void)
+{
+ struct device_node *cpucfg_node, *node;
+ struct resource res;
+ int ret;
+
+ if (!of_machine_is_compatible("allwinner,sun9i-a80"))
+ return -ENODEV;
+
+ if (!cci_probed())
+ return -ENODEV;
+
+ node = of_find_compatible_node(NULL, NULL, "allwinner,sun9i-a80-prcm");
+ if (!node)
+ return -ENODEV;
+
+ /*
+ * Unfortunately we can not request the I/O region for the PRCM.
+ * It is shared with the PRCM clock.
+ */
+ prcm_base = of_iomap(node, 0);
+ of_node_put(node);
+ if (!prcm_base) {
+ pr_err("%s: failed to map PRCM registers\n", __func__);
+ return -ENOMEM;
+ }
+
+ cpucfg_node = of_find_compatible_node(NULL, NULL,
+ "allwinner,sun9i-a80-cpucfg");
+ if (!cpucfg_node) {
+ ret = -ENODEV;
+ goto err_unmap_prcm;
+ }
+
+ cpucfg_base = of_io_request_and_map(cpucfg_node, 0, "sunxi-mcpm");
+ if (IS_ERR(cpucfg_base)) {
+ ret = PTR_ERR(cpucfg_base);
+ pr_err("%s: failed to map CPUCFG registers: %d\n",
+ __func__, ret);
+ goto err_put_cpucfg_node;
+ }
+
+ ret = mcpm_platform_register(&sunxi_power_ops);
+ if (!ret)
+ ret = mcpm_sync_init(sunxi_power_up_setup);
+ if (!ret)
+ /* do not disable AXI master as no one will re-enable it */
+ ret = mcpm_loopback(sunxi_cluster_cache_disable_without_axi);
+ if (ret)
+ goto err_unmap_release_cpucfg;
+
+ /* We don't need the CPUCFG device node anymore */
+ of_node_put(cpucfg_node);
+
+ /* Set the hardware entry point address */
+ sunxi_mcpm_setup_entry_point();
+
+ /* Actually enable MCPM */
+ mcpm_smp_set_ops();
+
+ pr_info("sunxi MCPM support installed\n");
+
+ return ret;
+
+err_unmap_release_cpucfg:
+ iounmap(cpucfg_base);
+ of_address_to_resource(cpucfg_node, 0, &res);
+ release_mem_region(res.start, resource_size(&res));
+err_put_cpucfg_node:
+ of_node_put(cpucfg_node);
+err_unmap_prcm:
+ iounmap(prcm_base);
+ return ret;
+}
+
+early_initcall(sunxi_mcpm_init);
--
2.15.1
The PRCM is a collection of clock controls, reset controls, and various
power switches/gates. Some of these can be independently listed and
supported, while a number of CPU related ones are used in tandem with
CPUCFG for SMP bringup and CPU hotplugging.
Signed-off-by: Chen-Yu Tsai <[email protected]>
---
arch/arm/boot/dts/sun9i-a80.dtsi | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/arm/boot/dts/sun9i-a80.dtsi b/arch/arm/boot/dts/sun9i-a80.dtsi
index 85ecb4d64cfd..bf4d40e8359f 100644
--- a/arch/arm/boot/dts/sun9i-a80.dtsi
+++ b/arch/arm/boot/dts/sun9i-a80.dtsi
@@ -709,6 +709,11 @@
interrupts = <GIC_SPI 36 IRQ_TYPE_LEVEL_HIGH>;
};
+ prcm@8001400 {
+ compatible = "allwinner,sun9i-a80-prcm";
+ reg = <0x08001400 0x200>;
+ };
+
apbs_rst: reset@80014b0 {
reg = <0x080014b0 0x4>;
compatible = "allwinner,sun6i-a31-clock-reset";
--
2.15.1
On Thu, Jan 04, 2018 at 10:37:46PM +0800, Chen-Yu Tsai wrote:
> This is v2 of my sun9i SMP support with MCPM series which was started
> over two years ago [1]. We've tried to implement PSCI for both the A80
> and A83T. Results were not promising. The issue is that these two chips
> have a broken security extensions implementation. If a specific bit is
> not burned in its e-fuse, most if not all security protections don't
> work [2]. Even worse, non-secure accesses to the GIC become secure. This
> requires a crazy workaround in the GIC driver which probably doesn't work
> in all cases [3].
>
> Nicolas mentioned that the MCPM framework is likely overkill in our
> case [4]. However the framework does provide cluster/core state tracking
> and proper sequencing of cache related operations. We could rework
> the code to use standard smp_ops, but I would like to actually get
> a working version in first.
>
> Much of the sunxi-specific MCPM code is derived from Allwinner code and
> documentation, with some references to the other MCPM implementations,
> as well as the ARM Cortex Technical Reference Manuals for the power
> sequencing info.
>
> One major difference compared to other platforms is that we currently do not
> have a standalone PMU or other embedded firmware to do the actual power
> sequencing. All power/reset control is done by the kernel. Nicolas
> mentioned that a new optional callback should be added in cases where the
> kernel has to do the actual power down [5]. For now, however, I'm using a
> dedicated single-threaded workqueue. CPU and cluster power-off work is
> queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
> is somewhat heavy, as I have a total of 10 static work structs. It might
> also be a bit racy, as nothing prevents the system from bringing a core
> back before the asynchronous work shuts it down. This would likely
> happen under a heavily loaded system with a scheduler that brings cores
> in and out of the system frequently. In simple use-cases it performs OK.
It all looks sane to me.
Acked-by: Maxime Ripard <[email protected]>
Maxime
--
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
On Thu, Jan 04, 2018 at 03:58:38PM +0100, Maxime Ripard wrote:
> On Thu, Jan 04, 2018 at 10:37:46PM +0800, Chen-Yu Tsai wrote:
> > This is v2 of my sun9i SMP support with MCPM series which was started
> > over two years ago [1]. We've tried to implement PSCI for both the A80
> > and A83T. Results were not promising. The issue is that these two chips
> > have a broken security extensions implementation. If a specific bit is
> > not burned in its e-fuse, most if not all security protections don't
> > work [2]. Even worse, non-secure accesses to the GIC become secure. This
> > requires a crazy workaround in the GIC driver which probably doesn't work
> > in all cases [3].
> >
> > Nicolas mentioned that the MCPM framework is likely overkill in our
> > case [4]. However the framework does provide cluster/core state tracking
> > and proper sequencing of cache related operations. We could rework
> > the code to use standard smp_ops, but I would like to actually get
> > a working version in first.
> >
> > Much of the sunxi-specific MCPM code is derived from Allwinner code and
> > documentation, with some references to the other MCPM implementations,
> > as well as the ARM Cortex Technical Reference Manuals for the power
> > sequencing info.
> >
> > One major difference compared to other platforms is that we currently do not
> > have a standalone PMU or other embedded firmware to do the actual power
> > sequencing. All power/reset control is done by the kernel. Nicolas
> > mentioned that a new optional callback should be added in cases where the
> > kernel has to do the actual power down [5]. For now, however, I'm using a
> > dedicated single-threaded workqueue. CPU and cluster power-off work is
> > queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
> > is somewhat heavy, as I have a total of 10 static work structs. It might
> > also be a bit racy, as nothing prevents the system from bringing a core
> > back before the asynchronous work shuts it down. This would likely
> > happen under a heavily loaded system with a scheduler that brings cores
> > in and out of the system frequently. In simple use-cases it performs OK.
>
> It all looks sane to me.
> Acked-by: Maxime Ripard <[email protected]>
It does not to me, sorry. You do not need MCPM (and workqueues) to
do SMP bring-up.
Nico explained why, just do it:
commit 905cdf9dda5d ("ARM: hisi/hip04: remove the MCPM overhead")
Lorenzo
On Thu, 4 Jan 2018, Chen-Yu Tsai wrote:
> Nicolas mentioned that the MCPM framework is likely overkill in our
> case [4]. However the framework does provide cluster/core state tracking
> and proper sequencing of cache related operations. We could rework
> the code to use standard smp_ops, but I would like to actually get
> a working version in first.
>
> [...] For now, however, I'm using a
> dedicated single-threaded workqueue. CPU and cluster power-off work is
> queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
> is somewhat heavy, as I have a total of 10 static work structs. It might
> also be a bit racy, as nothing prevents the system from bringing a core
> back before the asynchronous work shuts it down. This would likely
> happen under a heavily loaded system with a scheduler that brings cores
> in and out of the system frequently. In simple use-cases it performs OK.
If you know up front that your code is racy, then this doesn't fully
qualify as a "working version". Furthermore, you're trading custom
cluster/core state tracking for workqueue handling, which doesn't look
like a winning tradeoff to me. Especially given that you don't have
asynchronous CPU wakeups in hardware from an IRQ to deal with, the
state tracking becomes very simple.
If you hook into struct smp_operations directly, you'll have direct
access to both .cpu_die and .cpu_kill methods which are executed on the
target CPU and on a different CPU respectively, which is exactly what
you need. Those calls are already serialized with .smp_boot_secondary so
you don't have to worry about races. The only thing you need to protect
against races is your cluster usage count. Your code will end up being
simpler than what you have now. See arch/arm/mach-hisi/platmcpm.c for
example.
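For illustration only, a rough and untested sketch of that shape,
modeled on the hisi code: the function names below are hypothetical,
it assumes it can reuse the helpers already posted in this series
(sunxi_cluster_powerup() and friends), and cluster refcounting plus
error handling are left out:

#include <linux/smp.h>

#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/smp_plat.h>

static int sunxi_smp_boot_secondary(unsigned int l_cpu,
				    struct task_struct *idle)
{
	unsigned int mpidr = cpu_logical_map(l_cpu);
	unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
	unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
	int ret;

	/* Power up the cluster first (refcount it in real code). */
	ret = sunxi_cluster_powerup(cluster);
	if (ret)
		return ret;

	return sunxi_cpu_powerup(cpu, cluster);
}

#ifdef CONFIG_HOTPLUG_CPU
static void sunxi_cpu_die(unsigned int l_cpu)
{
	/* Runs on the dying CPU itself. */
	v7_exit_coherency_flush(louis);
	for (;;)
		wfi();
}

static int sunxi_cpu_kill(unsigned int l_cpu)
{
	unsigned int mpidr = cpu_logical_map(l_cpu);
	unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
	unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);

	/*
	 * Runs on another CPU, already serialized against
	 * smp_boot_secondary(): wait for the core to reach WFI (as the
	 * workqueue version does), then gate its power synchronously.
	 */
	sunxi_do_cpu_powerdown(cpu, cluster);
	return 1;
}
#endif

static const struct smp_operations sunxi_smp_ops __initconst = {
	.smp_boot_secondary	= sunxi_smp_boot_secondary,
#ifdef CONFIG_HOTPLUG_CPU
	.cpu_die		= sunxi_cpu_die,
	.cpu_kill		= sunxi_cpu_kill,
#endif
};

The ops would then be registered with smp_set_ops() or
CPU_METHOD_OF_DECLARE() in place of mcpm_smp_set_ops().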
Nicolas
On Thu, Jan 04, 2018 at 10:37:52PM +0800, Chen-Yu Tsai wrote:
> On the Allwinner A80 SoC the BROM supports hotplugging the primary core
> (cpu0) by checking two 32-bit values at a specific location within the
> secure SRAM block. This region needs to be reserved and accessible to
> the SMP code.
>
> Document its usage.
>
> Signed-off-by: Chen-Yu Tsai <[email protected]>
> ---
> .../devicetree/bindings/arm/sunxi/smp-sram.txt | 44 ++++++++++++++++++++++
> 1 file changed, 44 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/arm/sunxi/smp-sram.txt
Reviewed-by: Rob Herring <[email protected]>