2014-12-23 10:48:55

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 0/8] Enable L2 cache support on Exynos4210/4x12 SoCs

This is an updated patchset, which intends to add support for L2 cache
on Exynos4 SoCs on boards running under secure firmware, which requires
certain initialization steps to be done with help of firmware, as
selected registers are writable only from secure mode.

First patch updates Omap2+ platforms by moving l2cache initialization to
common code. This will resolve too early call to l2cache init, what might
cause kmalloc failure in code added in next patches.

Next four patches extend existing support for secure write in L2C driver
to account for design of secure firmware running on Exynos. Namely:
1) direct read access to certain registers is needed on Exynos, because
secure firmware calls set several registers at once,
2) not all boards are running secure firmware, so .write_sec callback
needs to be installed in Exynos firmware ops initialization code,
3) write access to {DATA,TAG}_LATENCY_CTRL registers fron non-secure world
is not allowed and so must use l2c_write_sec as well,
4) on certain boards, default value of prefetch register is incorrect
and must be overridden at L2C initialization.
For boards running with firmware that provides access to individual
L2C registers this series should introduce no functional changes. However
since the driver is widely used on other platforms I'd like to kindly ask
any interested people for testing.

Further three patches add implementation of .write_sec and .configure
callbacks for Exynos secure firmware and necessary DT nodes to enable
L2 cache.

Changes in this version tested on Exynos4412-based TRATS2 and OdroidU3+
boards (both with secure firmware). There should be no functional change
for Exynos boards running without secure firmware. I do not have access
to affected non-Exynos boards, so I could not test on them.

Omap related changes were only compile time tested.

Depends on:
- v3.19-rc1

Changelog:

Changes since v9:
(https://lkml.org/lkml/2014/11/17/217)
- Rebased onto vanilla v3.19-rc1
- Added patch for Omap2+ (move l2cache initialization to common code), what
fixes too early initialization (kmalloc failure)

Changes since v8:
(http://lkml.org/lkml/2014/11/13/263)
- Rebased onto vanilla v3.18-rc3 and added required includes, which were
previously added by other patches
- Added Acked-by tags for Exynos part

Changes since v7:
(https://lkml.org/lkml/2014/10/29/158)
- rebased onto arm-soc/for-next kernel tree (depends on patches merged to
v3.18-rc3 and arm-soc/samsung/pm2 branch)
- removed 'ARM: l2c: unify L2C-310 OF initialization error messages' patch
(no longer needed)

Changes since v6:
(https://lkml.org/lkml/2014/10/27/233)
- changed PL310 to L2C-310 prefix in error messages
- added patch shortening the error message about incorrect associativity

Changes since v5:
(https://lkml.org/lkml/2014/9/24/364)
- rebased onto v3.18-rc2
- added error message about missing properties values

Changes since v4:
(https://lkml.org/lkml/2014/8/26/461)
- rewrote the code accessing l2x0_saved_regs from assembly code
- added comment and reworked unconditional call to SMC_CMD_L2X0INVALL


Patch summary:

Marek Szyprowski (1):
ARM: OMAP2+: use common l2cache initialization code

Tomasz Figa (7):
ARM: l2c: Refactor the driver to use commit-like interface
ARM: l2c: Add interface to ask hypervisor to configure L2C
ARM: l2c: Get outer cache .write_sec callback from mach_desc only if
not NULL
ARM: l2c: Add support for overriding prefetch settings
ARM: EXYNOS: Add .write_sec outer cache callback for L2C-310
ARM: EXYNOS: Add support for non-secure L2X0 resume
ARM: dts: exynos4: Add nodes for L2 cache controller

Documentation/devicetree/bindings/arm/l2cc.txt | 10 +
arch/arm/boot/dts/exynos4210.dtsi | 9 +
arch/arm/boot/dts/exynos4x12.dtsi | 14 ++
arch/arm/include/asm/outercache.h | 3 +
arch/arm/kernel/irq.c | 3 +-
arch/arm/mach-exynos/firmware.c | 50 +++++
arch/arm/mach-exynos/sleep.S | 46 +++++
arch/arm/mach-omap2/board-generic.c | 6 +
arch/arm/mach-omap2/common.h | 7 +
arch/arm/mach-omap2/omap4-common.c | 16 +-
arch/arm/mm/cache-l2x0.c | 270 ++++++++++++++++---------
11 files changed, 323 insertions(+), 111 deletions(-)

--
1.9.2


2014-12-23 10:48:58

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 6/8] ARM: EXYNOS: Add .write_sec outer cache callback for L2C-310

From: Tomasz Figa <[email protected]>

Exynos4 SoCs equipped with an L2C-310 cache controller and running under
secure firmware require certain registers of aforementioned IP to be
accessed only from secure mode. This means that SMC calls are required
for certain register writes. To handle this, an implementation of
.write_sec and .configure callbacks is provided by this patch.

Signed-off-by: Tomasz Figa <[email protected]>
[added comment and reworked unconditional call to SMC_CMD_L2X0INVALL]
Signed-off-by: Marek Szyprowski <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Acked-by: Kukjin Kim <[email protected]>
---
arch/arm/mach-exynos/firmware.c | 50 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)

diff --git a/arch/arm/mach-exynos/firmware.c b/arch/arm/mach-exynos/firmware.c
index 766f57d2f029..dc5ae53aa317 100644
--- a/arch/arm/mach-exynos/firmware.c
+++ b/arch/arm/mach-exynos/firmware.c
@@ -17,6 +17,7 @@
#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/firmware.h>
+#include <asm/hardware/cache-l2x0.h>
#include <asm/suspend.h>

#include <mach/map.h>
@@ -136,6 +137,43 @@ static const struct firmware_ops exynos_firmware_ops = {
.resume = IS_ENABLED(CONFIG_EXYNOS_CPU_SUSPEND) ? exynos_resume : NULL,
};

+static void exynos_l2_write_sec(unsigned long val, unsigned reg)
+{
+ static int l2cache_enabled;
+
+ switch (reg) {
+ case L2X0_CTRL:
+ if (val & L2X0_CTRL_EN) {
+ /*
+ * Before the cache can be enabled, due to firmware
+ * design, SMC_CMD_L2X0INVALL must be called.
+ */
+ if (!l2cache_enabled) {
+ exynos_smc(SMC_CMD_L2X0INVALL, 0, 0, 0);
+ l2cache_enabled = 1;
+ }
+ } else {
+ l2cache_enabled = 0;
+ }
+ exynos_smc(SMC_CMD_L2X0CTRL, val, 0, 0);
+ break;
+
+ case L2X0_DEBUG_CTRL:
+ exynos_smc(SMC_CMD_L2X0DEBUG, val, 0, 0);
+ break;
+
+ default:
+ WARN_ONCE(1, "%s: ignoring write to reg 0x%x\n", __func__, reg);
+ }
+}
+
+static void exynos_l2_configure(const struct l2x0_regs *regs)
+{
+ exynos_smc(SMC_CMD_L2X0SETUP1, regs->tag_latency, regs->data_latency,
+ regs->prefetch_ctrl);
+ exynos_smc(SMC_CMD_L2X0SETUP2, regs->pwr_ctrl, regs->aux_ctrl, 0);
+}
+
void __init exynos_firmware_init(void)
{
struct device_node *nd;
@@ -155,4 +193,16 @@ void __init exynos_firmware_init(void)
pr_info("Running under secure firmware.\n");

register_firmware_ops(&exynos_firmware_ops);
+
+ /*
+ * Exynos 4 SoCs (based on Cortex A9 and equipped with L2C-310),
+ * running under secure firmware, require certain registers of L2
+ * cache controller to be written in secure mode. Here .write_sec
+ * callback is provided to perform necessary SMC calls.
+ */
+ if (IS_ENABLED(CONFIG_CACHE_L2X0)
+ && read_cpuid_part() == ARM_CPU_PART_CORTEX_A9) {
+ outer_cache.write_sec = exynos_l2_write_sec;
+ outer_cache.configure = exynos_l2_configure;
+ }
}
--
1.9.2

2014-12-23 10:49:28

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 7/8] ARM: EXYNOS: Add support for non-secure L2X0 resume

From: Tomasz Figa <[email protected]>

On Exynos SoCs it is necessary to resume operation of L2C early in
assembly code, because otherwise certain systems will crash. This patch
adds necessary code to non-secure resume handler.

Signed-off-by: Tomasz Figa <[email protected]>
[rewrote the code accessing l2x0_saved_regs]
Sigend-off-by: Marek Szyprowski <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Acked-by: Kukjin Kim <[email protected]>
---
arch/arm/mach-exynos/sleep.S | 46 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)

diff --git a/arch/arm/mach-exynos/sleep.S b/arch/arm/mach-exynos/sleep.S
index e3c373082bbe..31d25834b9c4 100644
--- a/arch/arm/mach-exynos/sleep.S
+++ b/arch/arm/mach-exynos/sleep.S
@@ -16,6 +16,8 @@
*/

#include <linux/linkage.h>
+#include <asm/asm-offsets.h>
+#include <asm/hardware/cache-l2x0.h>
#include "smc.h"

#define CPU_MASK 0xff0ffff0
@@ -74,6 +76,45 @@ ENTRY(exynos_cpu_resume_ns)
mov r0, #SMC_CMD_C15RESUME
dsb
smc #0
+#ifdef CONFIG_CACHE_L2X0
+ adr r0, 1f
+ ldr r2, [r0]
+ add r0, r2, r0
+
+ /* Check that the address has been initialised. */
+ ldr r1, [r0, #L2X0_R_PHY_BASE]
+ teq r1, #0
+ beq skip_l2x0
+
+ /* Check if controller has been enabled. */
+ ldr r2, [r1, #L2X0_CTRL]
+ tst r2, #0x1
+ bne skip_l2x0
+
+ ldr r1, [r0, #L2X0_R_TAG_LATENCY]
+ ldr r2, [r0, #L2X0_R_DATA_LATENCY]
+ ldr r3, [r0, #L2X0_R_PREFETCH_CTRL]
+ mov r0, #SMC_CMD_L2X0SETUP1
+ smc #0
+
+ /* Reload saved regs pointer because smc corrupts registers. */
+ adr r0, 1f
+ ldr r2, [r0]
+ add r0, r2, r0
+
+ ldr r1, [r0, #L2X0_R_PWR_CTRL]
+ ldr r2, [r0, #L2X0_R_AUX_CTRL]
+ mov r0, #SMC_CMD_L2X0SETUP2
+ smc #0
+
+ mov r0, #SMC_CMD_L2X0INVALL
+ smc #0
+
+ mov r1, #1
+ mov r0, #SMC_CMD_L2X0CTRL
+ smc #0
+skip_l2x0:
+#endif /* CONFIG_CACHE_L2X0 */
skip_cp15:
b cpu_resume
ENDPROC(exynos_cpu_resume_ns)
@@ -83,3 +124,8 @@ cp15_save_diag:
.globl cp15_save_power
cp15_save_power:
.long 0 @ cp15 power control
+
+#ifdef CONFIG_CACHE_L2X0
+ .align
+1: .long l2x0_saved_regs - .
+#endif /* CONFIG_CACHE_L2X0 */
--
1.9.2

2014-12-23 10:49:49

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 8/8] ARM: dts: exynos4: Add nodes for L2 cache controller

From: Tomasz Figa <[email protected]>

This patch adds device tree nodes for L2 cache controller present on
Exynos4 SoCs.

Signed-off-by: Tomasz Figa <[email protected]>
Signed-off-by: Marek Szyprowski <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Acked-by: Kukjin Kim <[email protected]>
---
arch/arm/boot/dts/exynos4210.dtsi | 9 +++++++++
arch/arm/boot/dts/exynos4x12.dtsi | 14 ++++++++++++++
2 files changed, 23 insertions(+)

diff --git a/arch/arm/boot/dts/exynos4210.dtsi b/arch/arm/boot/dts/exynos4210.dtsi
index bcc9e63c8070..8e45ea44317e 100644
--- a/arch/arm/boot/dts/exynos4210.dtsi
+++ b/arch/arm/boot/dts/exynos4210.dtsi
@@ -81,6 +81,15 @@
reg = <0x10023CA0 0x20>;
};

+ l2c: l2-cache-controller@10502000 {
+ compatible = "arm,pl310-cache";
+ reg = <0x10502000 0x1000>;
+ cache-unified;
+ cache-level = <2>;
+ arm,tag-latency = <2 2 1>;
+ arm,data-latency = <2 2 1>;
+ };
+
gic: interrupt-controller@10490000 {
cpu-offset = <0x8000>;
};
diff --git a/arch/arm/boot/dts/exynos4x12.dtsi b/arch/arm/boot/dts/exynos4x12.dtsi
index 93b70402e943..8bc97c415c9a 100644
--- a/arch/arm/boot/dts/exynos4x12.dtsi
+++ b/arch/arm/boot/dts/exynos4x12.dtsi
@@ -54,6 +54,20 @@
reg = <0x10023CA0 0x20>;
};

+ l2c: l2-cache-controller@10502000 {
+ compatible = "arm,pl310-cache";
+ reg = <0x10502000 0x1000>;
+ cache-unified;
+ cache-level = <2>;
+ arm,tag-latency = <2 2 1>;
+ arm,data-latency = <3 2 1>;
+ arm,double-linefill = <1>;
+ arm,double-linefill-incr = <0>;
+ arm,double-linefill-wrap = <1>;
+ arm,prefetch-drop = <1>;
+ arm,prefetch-offset = <7>;
+ };
+
clock: clock-controller@10030000 {
compatible = "samsung,exynos4412-clock";
reg = <0x10030000 0x20000>;
--
1.9.2

2014-12-23 10:49:46

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 5/8] ARM: l2c: Add support for overriding prefetch settings

From: Tomasz Figa <[email protected]>

Firmware on certain boards (e.g. ODROID-U3) can leave incorrect L2C prefetch
settings configured in registers leading to crashes if L2C is enabled
without overriding them. This patch introduces bindings to enable
prefetch settings to be specified from DT and necessary support in the
driver.

Signed-off-by: Tomasz Figa <[email protected]>
[mszyprow: rebased onto v3.18-rc1, added error message when prefetch related
dt property has been provided without any value]
Signed-off-by: Marek Szyprowski <[email protected]>
---
Documentation/devicetree/bindings/arm/l2cc.txt | 10 +++++
arch/arm/mm/cache-l2x0.c | 54 ++++++++++++++++++++++++++
2 files changed, 64 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/l2cc.txt b/Documentation/devicetree/bindings/arm/l2cc.txt
index 292ef7ca3058..0dbabe9a6b0a 100644
--- a/Documentation/devicetree/bindings/arm/l2cc.txt
+++ b/Documentation/devicetree/bindings/arm/l2cc.txt
@@ -57,6 +57,16 @@ Optional properties:
- cache-id-part: cache id part number to be used if it is not present
on hardware
- wt-override: If present then L2 is forced to Write through mode
+- arm,double-linefill : Override double linefill enable setting. Enable if
+ non-zero, disable if zero.
+- arm,double-linefill-incr : Override double linefill on INCR read. Enable
+ if non-zero, disable if zero.
+- arm,double-linefill-wrap : Override double linefill on WRAP read. Enable
+ if non-zero, disable if zero.
+- arm,prefetch-drop : Override prefetch drop enable setting. Enable if non-zero,
+ disable if zero.
+- arm,prefetch-offset : Override prefetch offset value. Valid values are
+ 0-7, 15, 23, and 31.

Example:

diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index d214be207517..6f9d5a02d053 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -1169,6 +1169,8 @@ static void __init l2c310_of_parse(const struct device_node *np,
u32 tag[3] = { 0, 0, 0 };
u32 filter[2] = { 0, 0 };
u32 assoc;
+ u32 prefetch;
+ u32 val;
int ret;

of_property_read_u32_array(np, "arm,tag-latency", tag, ARRAY_SIZE(tag));
@@ -1214,6 +1216,58 @@ static void __init l2c310_of_parse(const struct device_node *np,
assoc);
break;
}
+
+ prefetch = l2x0_saved_regs.prefetch_ctrl;
+
+ ret = of_property_read_u32(np, "arm,double-linefill", &val);
+ if (ret == 0) {
+ if (val)
+ prefetch |= L310_PREFETCH_CTRL_DBL_LINEFILL;
+ else
+ prefetch &= ~L310_PREFETCH_CTRL_DBL_LINEFILL;
+ } else if (ret != -EINVAL) {
+ pr_err("L2C-310 OF arm,double-linefill property value is missing\n");
+ }
+
+ ret = of_property_read_u32(np, "arm,double-linefill-incr", &val);
+ if (ret == 0) {
+ if (val)
+ prefetch |= L310_PREFETCH_CTRL_DBL_LINEFILL_INCR;
+ else
+ prefetch &= ~L310_PREFETCH_CTRL_DBL_LINEFILL_INCR;
+ } else if (ret != -EINVAL) {
+ pr_err("L2C-310 OF arm,double-linefill-incr property value is missing\n");
+ }
+
+ ret = of_property_read_u32(np, "arm,double-linefill-wrap", &val);
+ if (ret == 0) {
+ if (!val)
+ prefetch |= L310_PREFETCH_CTRL_DBL_LINEFILL_WRAP;
+ else
+ prefetch &= ~L310_PREFETCH_CTRL_DBL_LINEFILL_WRAP;
+ } else if (ret != -EINVAL) {
+ pr_err("L2C-310 OF arm,double-linefill-wrap property value is missing\n");
+ }
+
+ ret = of_property_read_u32(np, "arm,prefetch-drop", &val);
+ if (ret == 0) {
+ if (val)
+ prefetch |= L310_PREFETCH_CTRL_PREFETCH_DROP;
+ else
+ prefetch &= ~L310_PREFETCH_CTRL_PREFETCH_DROP;
+ } else if (ret != -EINVAL) {
+ pr_err("L2C-310 OF arm,prefetch-drop property value is missing\n");
+ }
+
+ ret = of_property_read_u32(np, "arm,prefetch-offset", &val);
+ if (ret == 0) {
+ prefetch &= ~L310_PREFETCH_CTRL_OFFSET_MASK;
+ prefetch |= val & L310_PREFETCH_CTRL_OFFSET_MASK;
+ } else if (ret != -EINVAL) {
+ pr_err("L2C-310 OF arm,prefetch-offset property value is missing\n");
+ }
+
+ l2x0_saved_regs.prefetch_ctrl = prefetch;
}

static const struct l2c_init_data of_l2c310_data __initconst = {
--
1.9.2

2014-12-23 10:48:52

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 3/8] ARM: l2c: Add interface to ask hypervisor to configure L2C

From: Tomasz Figa <[email protected]>

Because certain secure hypervisor do not allow writes to individual L2C
registers, but rather expect set of parameters to be passed as argument
to secure monitor calls, there is a need to provide an interface for the
L2C driver to ask the firmware to configure the hardware according to
specified parameters. This patch adds such.

Signed-off-by: Tomasz Figa <[email protected]>
Signed-off-by: Marek Szyprowski <[email protected]>
---
arch/arm/include/asm/outercache.h | 3 +++
arch/arm/mm/cache-l2x0.c | 6 ++++++
2 files changed, 9 insertions(+)

diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index 891a56b35bcf..563b92fc2f41 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -23,6 +23,8 @@

#include <linux/types.h>

+struct l2x0_regs;
+
struct outer_cache_fns {
void (*inv_range)(unsigned long, unsigned long);
void (*clean_range)(unsigned long, unsigned long);
@@ -36,6 +38,7 @@ struct outer_cache_fns {

/* This is an ARM L2C thing */
void (*write_sec)(unsigned long, unsigned);
+ void (*configure)(const struct l2x0_regs *);
};

extern struct outer_cache_fns outer_cache;
diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index e5948c5adaa7..d214be207517 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -110,6 +110,11 @@ static inline void l2c_unlock(void __iomem *base, unsigned num)

static void l2c_configure(void __iomem *base)
{
+ if (outer_cache.configure) {
+ outer_cache.configure(&l2x0_saved_regs);
+ return;
+ }
+
if (l2x0_data->configure)
l2x0_data->configure(base);

@@ -910,6 +915,7 @@ static int __init __l2c_init(const struct l2c_init_data *data,

fns = data->outer_cache;
fns.write_sec = outer_cache.write_sec;
+ fns.configure = outer_cache.configure;
if (data->fixup)
data->fixup(l2x0_base, cache_id, &fns);

--
1.9.2

2014-12-23 10:50:44

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 4/8] ARM: l2c: Get outer cache .write_sec callback from mach_desc only if not NULL

From: Tomasz Figa <[email protected]>

Certain platforms (i.e. Exynos) might need to set .write_sec callback
from firmware initialization which is happenning in .init_early callback
of machine descriptor. However current code will overwrite the pointer
with whatever is present in machine descriptor, even though it can be
already set earlier. This patch fixes this by making the assignment
conditional, depending on whether current .write_sec callback is NULL.

Signed-off-by: Tomasz Figa <[email protected]>
Signed-off-by: Marek Szyprowski <[email protected]>
---
arch/arm/kernel/irq.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c
index ad857bada96c..350f188c92d2 100644
--- a/arch/arm/kernel/irq.c
+++ b/arch/arm/kernel/irq.c
@@ -109,7 +109,8 @@ void __init init_IRQ(void)

if (IS_ENABLED(CONFIG_OF) && IS_ENABLED(CONFIG_CACHE_L2X0) &&
(machine_desc->l2c_aux_mask || machine_desc->l2c_aux_val)) {
- outer_cache.write_sec = machine_desc->l2c_write_sec;
+ if (!outer_cache.write_sec)
+ outer_cache.write_sec = machine_desc->l2c_write_sec;
ret = l2x0_of_init(machine_desc->l2c_aux_val,
machine_desc->l2c_aux_mask);
if (ret)
--
1.9.2

2014-12-23 10:50:43

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 1/8] ARM: OMAP2+: use common l2cache initialization code

This patch implements generic DT L2C initialisation (the one from
init_IRQ in arch/arm/kernel/irq.c) for Omap4 and AM43 platforms and
kills the SoC specific stuff in arch/arm/mach-omap2/omap4-common.c.

Signed-off-by: Marek Szyprowski <[email protected]>
---
arch/arm/mach-omap2/board-generic.c | 6 ++++++
arch/arm/mach-omap2/common.h | 7 +++++++
arch/arm/mach-omap2/omap4-common.c | 16 +---------------
3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/arm/mach-omap2/board-generic.c b/arch/arm/mach-omap2/board-generic.c
index 608079a1aba6..c5c480b76da5 100644
--- a/arch/arm/mach-omap2/board-generic.c
+++ b/arch/arm/mach-omap2/board-generic.c
@@ -171,6 +171,9 @@ static const char *const omap4_boards_compat[] __initconst = {
};

DT_MACHINE_START(OMAP4_DT, "Generic OMAP4 (Flattened Device Tree)")
+ .l2c_aux_val = OMAP_L2C_AUX_CTRL,
+ .l2c_aux_mask = 0xcf9fffff,
+ .l2c_write_sec = omap4_l2c310_write_sec,
.reserve = omap_reserve,
.smp = smp_ops(omap4_smp_ops),
.map_io = omap4_map_io,
@@ -214,6 +217,9 @@ static const char *const am43_boards_compat[] __initconst = {
};

DT_MACHINE_START(AM43_DT, "Generic AM43 (Flattened Device Tree)")
+ .l2c_aux_val = OMAP_L2C_AUX_CTRL,
+ .l2c_aux_mask = 0xcf9fffff,
+ .l2c_write_sec = omap4_l2c310_write_sec,
.map_io = am33xx_map_io,
.init_early = am43xx_init_early,
.init_late = am43xx_init_late,
diff --git a/arch/arm/mach-omap2/common.h b/arch/arm/mach-omap2/common.h
index 377eea849e7b..19c9144d8b38 100644
--- a/arch/arm/mach-omap2/common.h
+++ b/arch/arm/mach-omap2/common.h
@@ -35,6 +35,7 @@
#include <linux/irqchip/irq-omap-intc.h>

#include <asm/proc-fns.h>
+#include <asm/hardware/cache-l2x0.h>

#include "i2c.h"
#include "serial.h"
@@ -94,11 +95,17 @@ extern void omap3_gptimer_timer_init(void);
extern void omap4_local_timer_init(void);
#ifdef CONFIG_CACHE_L2X0
int omap_l2_cache_init(void);
+#define OMAP_L2C_AUX_CTRL (L2C_AUX_CTRL_SHARED_OVERRIDE | \
+ L310_AUX_CTRL_DATA_PREFETCH | \
+ L310_AUX_CTRL_INSTR_PREFETCH)
+void omap4_l2c310_write_sec(unsigned long val, unsigned reg);
#else
static inline int omap_l2_cache_init(void)
{
return 0;
}
+#define OMAP_L2C_AUX_CTRL 0
+#define omap4_l2c310_write_sec NULL
#endif
extern void omap5_realtime_timer_init(void);

diff --git a/arch/arm/mach-omap2/omap4-common.c b/arch/arm/mach-omap2/omap4-common.c
index b7cb44abe49b..fe99ceff2e2d 100644
--- a/arch/arm/mach-omap2/omap4-common.c
+++ b/arch/arm/mach-omap2/omap4-common.c
@@ -166,7 +166,7 @@ void __iomem *omap4_get_l2cache_base(void)
return l2cache_base;
}

-static void omap4_l2c310_write_sec(unsigned long val, unsigned reg)
+void omap4_l2c310_write_sec(unsigned long val, unsigned reg)
{
unsigned smc_op;

@@ -201,24 +201,10 @@ static void omap4_l2c310_write_sec(unsigned long val, unsigned reg)

int __init omap_l2_cache_init(void)
{
- u32 aux_ctrl;
-
/* Static mapping, never released */
l2cache_base = ioremap(OMAP44XX_L2CACHE_BASE, SZ_4K);
if (WARN_ON(!l2cache_base))
return -ENOMEM;
-
- /* 16-way associativity, parity disabled, way size - 64KB (es2.0 +) */
- aux_ctrl = L2C_AUX_CTRL_SHARED_OVERRIDE |
- L310_AUX_CTRL_DATA_PREFETCH |
- L310_AUX_CTRL_INSTR_PREFETCH;
-
- outer_cache.write_sec = omap4_l2c310_write_sec;
- if (of_have_populated_dt())
- l2x0_of_init(aux_ctrl, 0xcf9fffff);
- else
- l2x0_init(l2cache_base, aux_ctrl, 0xcf9fffff);
-
return 0;
}
#endif
--
1.9.2

2014-12-23 10:51:59

by Marek Szyprowski

[permalink] [raw]
Subject: [PATCH v10 2/8] ARM: l2c: Refactor the driver to use commit-like interface

From: Tomasz Figa <[email protected]>

Certain implementations of secure hypervisors (namely the one found on
Samsung Exynos-based boards) do not provide access to individual L2C
registers. This makes the .write_sec()-based interface insufficient and
provoking ugly hacks.

This patch is first step to make the driver not rely on availability of
writes to individual registers. This is achieved by refactoring the
driver to use a commit-like operation scheme: all register values are
prepared first and stored in an instance of l2x0_regs struct and then a
single callback is responsible to flush those values to the hardware.

Signed-off-by: Tomasz Figa <[email protected]>
Signed-off-by: Marek Szyprowski <[email protected]>
---
arch/arm/mm/cache-l2x0.c | 210 ++++++++++++++++++++++++++---------------------
1 file changed, 115 insertions(+), 95 deletions(-)

diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index 5e65ca8dea62..e5948c5adaa7 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -41,12 +41,14 @@ struct l2c_init_data {
void (*enable)(void __iomem *, u32, unsigned);
void (*fixup)(void __iomem *, u32, struct outer_cache_fns *);
void (*save)(void __iomem *);
+ void (*configure)(void __iomem *);
struct outer_cache_fns outer_cache;
};

#define CACHE_LINE_SIZE 32

static void __iomem *l2x0_base;
+static const struct l2c_init_data *l2x0_data;
static DEFINE_RAW_SPINLOCK(l2x0_lock);
static u32 l2x0_way_mask; /* Bitmask of active ways */
static u32 l2x0_size;
@@ -106,6 +108,14 @@ static inline void l2c_unlock(void __iomem *base, unsigned num)
}
}

+static void l2c_configure(void __iomem *base)
+{
+ if (l2x0_data->configure)
+ l2x0_data->configure(base);
+
+ l2c_write_sec(l2x0_saved_regs.aux_ctrl, base, L2X0_AUX_CTRL);
+}
+
/*
* Enable the L2 cache controller. This function must only be
* called when the cache controller is known to be disabled.
@@ -114,7 +124,12 @@ static void l2c_enable(void __iomem *base, u32 aux, unsigned num_lock)
{
unsigned long flags;

- l2c_write_sec(aux, base, L2X0_AUX_CTRL);
+ /* Do not touch the controller if already enabled. */
+ if (readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)
+ return;
+
+ l2x0_saved_regs.aux_ctrl = aux;
+ l2c_configure(base);

l2c_unlock(base, num_lock);

@@ -208,6 +223,11 @@ static void l2c_save(void __iomem *base)
l2x0_saved_regs.aux_ctrl = readl_relaxed(l2x0_base + L2X0_AUX_CTRL);
}

+static void l2c_resume(void)
+{
+ l2c_enable(l2x0_base, l2x0_saved_regs.aux_ctrl, l2x0_data->num_lock);
+}
+
/*
* L2C-210 specific code.
*
@@ -288,14 +308,6 @@ static void l2c210_sync(void)
__l2c210_cache_sync(l2x0_base);
}

-static void l2c210_resume(void)
-{
- void __iomem *base = l2x0_base;
-
- if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN))
- l2c_enable(base, l2x0_saved_regs.aux_ctrl, 1);
-}
-
static const struct l2c_init_data l2c210_data __initconst = {
.type = "L2C-210",
.way_size_0 = SZ_8K,
@@ -309,7 +321,7 @@ static const struct l2c_init_data l2c210_data __initconst = {
.flush_all = l2c210_flush_all,
.disable = l2c_disable,
.sync = l2c210_sync,
- .resume = l2c210_resume,
+ .resume = l2c_resume,
},
};

@@ -466,7 +478,7 @@ static const struct l2c_init_data l2c220_data = {
.flush_all = l2c220_flush_all,
.disable = l2c_disable,
.sync = l2c220_sync,
- .resume = l2c210_resume,
+ .resume = l2c_resume,
},
};

@@ -615,39 +627,29 @@ static void __init l2c310_save(void __iomem *base)
L310_POWER_CTRL);
}

-static void l2c310_resume(void)
+static void l2c310_configure(void __iomem *base)
{
- void __iomem *base = l2x0_base;
+ unsigned revision;

- if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
- unsigned revision;
-
- /* restore pl310 setup */
- writel_relaxed(l2x0_saved_regs.tag_latency,
- base + L310_TAG_LATENCY_CTRL);
- writel_relaxed(l2x0_saved_regs.data_latency,
- base + L310_DATA_LATENCY_CTRL);
- writel_relaxed(l2x0_saved_regs.filter_end,
- base + L310_ADDR_FILTER_END);
- writel_relaxed(l2x0_saved_regs.filter_start,
- base + L310_ADDR_FILTER_START);
-
- revision = readl_relaxed(base + L2X0_CACHE_ID) &
- L2X0_CACHE_ID_RTL_MASK;
-
- if (revision >= L310_CACHE_ID_RTL_R2P0)
- l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
- L310_PREFETCH_CTRL);
- if (revision >= L310_CACHE_ID_RTL_R3P0)
- l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
- L310_POWER_CTRL);
-
- l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
-
- /* Re-enable full-line-of-zeros for Cortex-A9 */
- if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
- set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
- }
+ /* restore pl310 setup */
+ writel_relaxed(l2x0_saved_regs.tag_latency,
+ base + L310_TAG_LATENCY_CTRL);
+ writel_relaxed(l2x0_saved_regs.data_latency,
+ base + L310_DATA_LATENCY_CTRL);
+ writel_relaxed(l2x0_saved_regs.filter_end,
+ base + L310_ADDR_FILTER_END);
+ writel_relaxed(l2x0_saved_regs.filter_start,
+ base + L310_ADDR_FILTER_START);
+
+ revision = readl_relaxed(base + L2X0_CACHE_ID) &
+ L2X0_CACHE_ID_RTL_MASK;
+
+ if (revision >= L310_CACHE_ID_RTL_R2P0)
+ l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
+ L310_PREFETCH_CTRL);
+ if (revision >= L310_CACHE_ID_RTL_R3P0)
+ l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
+ L310_POWER_CTRL);
}

static int l2c310_cpu_enable_flz(struct notifier_block *nb, unsigned long act, void *data)
@@ -699,6 +701,23 @@ static void __init l2c310_enable(void __iomem *base, u32 aux, unsigned num_lock)
aux &= ~(L310_AUX_CTRL_FULL_LINE_ZERO | L310_AUX_CTRL_EARLY_BRESP);
}

+ /* r3p0 or later has power control register */
+ if (rev >= L310_CACHE_ID_RTL_R3P0)
+ l2x0_saved_regs.pwr_ctrl = L310_DYNAMIC_CLK_GATING_EN |
+ L310_STNDBY_MODE_EN;
+
+ /*
+ * Always enable non-secure access to the lockdown registers -
+ * we write to them as part of the L2C enable sequence so they
+ * need to be accessible.
+ */
+ aux |= L310_AUX_CTRL_NS_LOCKDOWN;
+
+ l2c_enable(base, aux, num_lock);
+
+ /* Read back resulting AUX_CTRL value as it could have been altered. */
+ aux = readl_relaxed(base + L2X0_AUX_CTRL);
+
if (aux & (L310_AUX_CTRL_DATA_PREFETCH | L310_AUX_CTRL_INSTR_PREFETCH)) {
u32 prefetch = readl_relaxed(base + L310_PREFETCH_CTRL);

@@ -712,23 +731,12 @@ static void __init l2c310_enable(void __iomem *base, u32 aux, unsigned num_lock)
if (rev >= L310_CACHE_ID_RTL_R3P0) {
u32 power_ctrl;

- l2c_write_sec(L310_DYNAMIC_CLK_GATING_EN | L310_STNDBY_MODE_EN,
- base, L310_POWER_CTRL);
power_ctrl = readl_relaxed(base + L310_POWER_CTRL);
pr_info("L2C-310 dynamic clock gating %sabled, standby mode %sabled\n",
power_ctrl & L310_DYNAMIC_CLK_GATING_EN ? "en" : "dis",
power_ctrl & L310_STNDBY_MODE_EN ? "en" : "dis");
}

- /*
- * Always enable non-secure access to the lockdown registers -
- * we write to them as part of the L2C enable sequence so they
- * need to be accessible.
- */
- aux |= L310_AUX_CTRL_NS_LOCKDOWN;
-
- l2c_enable(base, aux, num_lock);
-
if (aux & L310_AUX_CTRL_FULL_LINE_ZERO) {
set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
cpu_notifier(l2c310_cpu_enable_flz, 0);
@@ -760,11 +768,11 @@ static void __init l2c310_fixup(void __iomem *base, u32 cache_id,

if (revision >= L310_CACHE_ID_RTL_R3P0 &&
revision < L310_CACHE_ID_RTL_R3P2) {
- u32 val = readl_relaxed(base + L310_PREFETCH_CTRL);
+ u32 val = l2x0_saved_regs.prefetch_ctrl;
/* I don't think bit23 is required here... but iMX6 does so */
if (val & (BIT(30) | BIT(23))) {
val &= ~(BIT(30) | BIT(23));
- l2c_write_sec(val, base, L310_PREFETCH_CTRL);
+ l2x0_saved_regs.prefetch_ctrl = val;
errata[n++] = "752271";
}
}
@@ -800,6 +808,15 @@ static void l2c310_disable(void)
l2c_disable();
}

+static void l2c310_resume(void)
+{
+ l2c_resume();
+
+ /* Re-enable full-line-of-zeros for Cortex-A9 */
+ if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
+ set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
+}
+
static const struct l2c_init_data l2c310_init_fns __initconst = {
.type = "L2C-310",
.way_size_0 = SZ_8K,
@@ -807,6 +824,7 @@ static const struct l2c_init_data l2c310_init_fns __initconst = {
.enable = l2c310_enable,
.fixup = l2c310_fixup,
.save = l2c310_save,
+ .configure = l2c310_configure,
.outer_cache = {
.inv_range = l2c210_inv_range,
.clean_range = l2c210_clean_range,
@@ -818,7 +836,7 @@ static const struct l2c_init_data l2c310_init_fns __initconst = {
},
};

-static void __init __l2c_init(const struct l2c_init_data *data,
+static int __init __l2c_init(const struct l2c_init_data *data,
u32 aux_val, u32 aux_mask, u32 cache_id)
{
struct outer_cache_fns fns;
@@ -826,6 +844,14 @@ static void __init __l2c_init(const struct l2c_init_data *data,
u32 aux, old_aux;

/*
+ * Save the pointer globally so that callbacks which do not receive
+ * context from callers can access the structure.
+ */
+ l2x0_data = kmemdup(data, sizeof(*data), GFP_KERNEL);
+ if (!l2x0_data)
+ return -ENOMEM;
+
+ /*
* Sanity check the aux values. aux_mask is the bits we preserve
* from reading the hardware register, and aux_val is the bits we
* set.
@@ -910,6 +936,8 @@ static void __init __l2c_init(const struct l2c_init_data *data,
data->type, ways, l2x0_size >> 10);
pr_info("%s: CACHE_ID 0x%08x, AUX_CTRL 0x%08x\n",
data->type, cache_id, aux);
+
+ return 0;
}

void __init l2x0_init(void __iomem *base, u32 aux_val, u32 aux_mask)
@@ -936,6 +964,10 @@ void __init l2x0_init(void __iomem *base, u32 aux_val, u32 aux_mask)
break;
}

+ /* Read back current (default) hardware configuration */
+ if (data->save)
+ data->save(l2x0_base);
+
__l2c_init(data, aux_val, aux_mask, cache_id);
}

@@ -1102,7 +1134,7 @@ static const struct l2c_init_data of_l2c210_data __initconst = {
.flush_all = l2c210_flush_all,
.disable = l2c_disable,
.sync = l2c210_sync,
- .resume = l2c210_resume,
+ .resume = l2c_resume,
},
};

@@ -1120,7 +1152,7 @@ static const struct l2c_init_data of_l2c220_data __initconst = {
.flush_all = l2c220_flush_all,
.disable = l2c_disable,
.sync = l2c220_sync,
- .resume = l2c210_resume,
+ .resume = l2c_resume,
},
};

@@ -1135,28 +1167,26 @@ static void __init l2c310_of_parse(const struct device_node *np,

of_property_read_u32_array(np, "arm,tag-latency", tag, ARRAY_SIZE(tag));
if (tag[0] && tag[1] && tag[2])
- writel_relaxed(
+ l2x0_saved_regs.tag_latency =
L310_LATENCY_CTRL_RD(tag[0] - 1) |
L310_LATENCY_CTRL_WR(tag[1] - 1) |
- L310_LATENCY_CTRL_SETUP(tag[2] - 1),
- l2x0_base + L310_TAG_LATENCY_CTRL);
+ L310_LATENCY_CTRL_SETUP(tag[2] - 1);

of_property_read_u32_array(np, "arm,data-latency",
data, ARRAY_SIZE(data));
if (data[0] && data[1] && data[2])
- writel_relaxed(
+ l2x0_saved_regs.data_latency =
L310_LATENCY_CTRL_RD(data[0] - 1) |
L310_LATENCY_CTRL_WR(data[1] - 1) |
- L310_LATENCY_CTRL_SETUP(data[2] - 1),
- l2x0_base + L310_DATA_LATENCY_CTRL);
+ L310_LATENCY_CTRL_SETUP(data[2] - 1);

of_property_read_u32_array(np, "arm,filter-ranges",
filter, ARRAY_SIZE(filter));
if (filter[1]) {
- writel_relaxed(ALIGN(filter[0] + filter[1], SZ_1M),
- l2x0_base + L310_ADDR_FILTER_END);
- writel_relaxed((filter[0] & ~(SZ_1M - 1)) | L310_ADDR_FILTER_EN,
- l2x0_base + L310_ADDR_FILTER_START);
+ l2x0_saved_regs.filter_end =
+ ALIGN(filter[0] + filter[1], SZ_1M);
+ l2x0_saved_regs.filter_start = (filter[0] & ~(SZ_1M - 1))
+ | L310_ADDR_FILTER_EN;
}

ret = l2x0_cache_size_of_parse(np, aux_val, aux_mask, &assoc, SZ_512K);
@@ -1188,6 +1218,7 @@ static const struct l2c_init_data of_l2c310_data __initconst = {
.enable = l2c310_enable,
.fixup = l2c310_fixup,
.save = l2c310_save,
+ .configure = l2c310_configure,
.outer_cache = {
.inv_range = l2c210_inv_range,
.clean_range = l2c210_clean_range,
@@ -1216,6 +1247,7 @@ static const struct l2c_init_data of_l2c310_coherent_data __initconst = {
.enable = l2c310_enable,
.fixup = l2c310_fixup,
.save = l2c310_save,
+ .configure = l2c310_configure,
.outer_cache = {
.inv_range = l2c210_inv_range,
.clean_range = l2c210_clean_range,
@@ -1330,16 +1362,6 @@ static void aurora_save(void __iomem *base)
l2x0_saved_regs.aux_ctrl = readl_relaxed(base + L2X0_AUX_CTRL);
}

-static void aurora_resume(void)
-{
- void __iomem *base = l2x0_base;
-
- if (!(readl(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
- writel_relaxed(l2x0_saved_regs.aux_ctrl, base + L2X0_AUX_CTRL);
- writel_relaxed(l2x0_saved_regs.ctrl, base + L2X0_CTRL);
- }
-}
-
/*
* For Aurora cache in no outer mode, enable via the CP15 coprocessor
* broadcasting of cache commands to L2.
@@ -1401,7 +1423,7 @@ static const struct l2c_init_data of_aurora_with_outer_data __initconst = {
.flush_all = l2x0_flush_all,
.disable = l2x0_disable,
.sync = l2x0_cache_sync,
- .resume = aurora_resume,
+ .resume = l2c_resume,
},
};

@@ -1414,7 +1436,7 @@ static const struct l2c_init_data of_aurora_no_outer_data __initconst = {
.fixup = aurora_fixup,
.save = aurora_save,
.outer_cache = {
- .resume = aurora_resume,
+ .resume = l2c_resume,
},
};

@@ -1562,6 +1584,7 @@ static const struct l2c_init_data of_bcm_l2x0_data __initconst = {
.of_parse = l2c310_of_parse,
.enable = l2c310_enable,
.save = l2c310_save,
+ .configure = l2c310_configure,
.outer_cache = {
.inv_range = bcm_inv_range,
.clean_range = bcm_clean_range,
@@ -1583,18 +1606,12 @@ static void __init tauros3_save(void __iomem *base)
readl_relaxed(base + L310_PREFETCH_CTRL);
}

-static void tauros3_resume(void)
+static void tauros3_configure(void __iomem *base)
{
- void __iomem *base = l2x0_base;
-
- if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
- writel_relaxed(l2x0_saved_regs.aux2_ctrl,
- base + TAUROS3_AUX2_CTRL);
- writel_relaxed(l2x0_saved_regs.prefetch_ctrl,
- base + L310_PREFETCH_CTRL);
-
- l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
- }
+ writel_relaxed(l2x0_saved_regs.aux2_ctrl,
+ base + TAUROS3_AUX2_CTRL);
+ writel_relaxed(l2x0_saved_regs.prefetch_ctrl,
+ base + L310_PREFETCH_CTRL);
}

static const struct l2c_init_data of_tauros3_data __initconst = {
@@ -1603,9 +1620,10 @@ static const struct l2c_init_data of_tauros3_data __initconst = {
.num_lock = 8,
.enable = l2c_enable,
.save = tauros3_save,
+ .configure = tauros3_configure,
/* Tauros3 broadcasts L1 cache operations to L2 */
.outer_cache = {
- .resume = tauros3_resume,
+ .resume = l2c_resume,
},
};

@@ -1661,6 +1679,10 @@ int __init l2x0_of_init(u32 aux_val, u32 aux_mask)
if (!of_property_read_bool(np, "cache-unified"))
pr_err("L2C: device tree omits to specify unified cache\n");

+ /* Read back current (default) hardware configuration */
+ if (data->save)
+ data->save(l2x0_base);
+
/* L2 configuration can only be changed if the cache is disabled */
if (!(readl_relaxed(l2x0_base + L2X0_CTRL) & L2X0_CTRL_EN))
if (data->of_parse)
@@ -1671,8 +1693,6 @@ int __init l2x0_of_init(u32 aux_val, u32 aux_mask)
else
cache_id = readl_relaxed(l2x0_base + L2X0_CACHE_ID);

- __l2c_init(data, aux_val, aux_mask, cache_id);
-
- return 0;
+ return __l2c_init(data, aux_val, aux_mask, cache_id);
}
#endif
--
1.9.2

2014-12-23 17:08:57

by Tony Lindgren

[permalink] [raw]
Subject: Re: [PATCH v10 2/8] ARM: l2c: Refactor the driver to use commit-like interface

* Marek Szyprowski <[email protected]> [141223 02:51]:
> From: Tomasz Figa <[email protected]>
>
> Certain implementations of secure hypervisors (namely the one found on
> Samsung Exynos-based boards) do not provide access to individual L2C
> registers. This makes the .write_sec()-based interface insufficient and
> provoking ugly hacks.
>
> This patch is first step to make the driver not rely on availability of
> writes to individual registers. This is achieved by refactoring the
> driver to use a commit-like operation scheme: all register values are
> prepared first and stored in an instance of l2x0_regs struct and then a
> single callback is responsible to flush those values to the hardware.

The first patch of the series applied things boot with no problem.
But after applying this one I get the following on am437x:

Unhandled fault: imprecise external abort (0xc06) at 0xb6f33884

Probably the same issue Nishanth mentioned.

Regards,

Tony

2014-12-23 17:14:16

by Nishanth Menon

[permalink] [raw]
Subject: Re: [PATCH v10 2/8] ARM: l2c: Refactor the driver to use commit-like interface

On 12/23/2014 11:06 AM, Tony Lindgren wrote:
> * Marek Szyprowski <[email protected]> [141223 02:51]:
>> From: Tomasz Figa <[email protected]>
>>
>> Certain implementations of secure hypervisors (namely the one found on
>> Samsung Exynos-based boards) do not provide access to individual L2C
>> registers. This makes the .write_sec()-based interface insufficient and
>> provoking ugly hacks.
>>
>> This patch is first step to make the driver not rely on availability of
>> writes to individual registers. This is achieved by refactoring the
>> driver to use a commit-like operation scheme: all register values are
>> prepared first and stored in an instance of l2x0_regs struct and then a
>> single callback is responsible to flush those values to the hardware.
>
> The first patch of the series applied things boot with no problem.
> But after applying this one I get the following on am437x:
>
> Unhandled fault: imprecise external abort (0xc06) at 0xb6f33884
>
> Probably the same issue Nishanth mentioned.
>

yep - just finished the bisect... came to the same conclusion..

c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f is the first bad commit
commit c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f
Author: Tomasz Figa <[email protected]>
Date: Tue Dec 23 11:48:30 2014 +0100

ARM: l2c: Refactor the driver to use commit-like interface

Certain implementations of secure hypervisors (namely the one found on
Samsung Exynos-based boards) do not provide access to individual L2C
registers. This makes the .write_sec()-based interface
insufficient and
provoking ugly hacks.

This patch is first step to make the driver not rely on
availability of
writes to individual registers. This is achieved by refactoring the
driver to use a commit-like operation scheme: all register values are
prepared first and stored in an instance of l2x0_regs struct and
then a
single callback is responsible to flush those values to the hardware.

Signed-off-by: Tomasz Figa <[email protected]>
Signed-off-by: Marek Szyprowski <[email protected]>

:040000 040000 74c6c74a0dc0612d124cd759951adf2a1e4124ee
8082aabb474f8659231de744d87cd8dbd6dd79bb M arch


$ git bisect log
git bisect start
# good: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
git bisect good 97bf6af1f928216fd6c5a66e8a57bfa95a659672
# bad: [9afe195db6558621bd8bac379ed65ef121930684] ARM: dts: exynos4:
Add nodes for L2 cache controller
git bisect bad 9afe195db6558621bd8bac379ed65ef121930684
# bad: [0a89ef4dd870bbf692e30fef6c8182d7b8b42e17] ARM: l2c: Get outer
cache .write_sec callback from mach_desc only if not NULL
git bisect bad 0a89ef4dd870bbf692e30fef6c8182d7b8b42e17
# bad: [c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f] ARM: l2c: Refactor
the driver to use commit-like interface
git bisect bad c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f
# good: [080ab387c653b8655dc1ee790658b618399db2aa] ARM: OMAP2+: use
common l2cache initialization code
git bisect good 080ab387c653b8655dc1ee790658b618399db2aa


--
Regards,
Nishanth Menon

2014-12-28 11:34:28

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCH v10 2/8] ARM: l2c: Refactor the driver to use commit-like interface

Nishanth, Tony,

On 24.12.2014 02:13, Nishanth Menon wrote:
> On 12/23/2014 11:06 AM, Tony Lindgren wrote:
>> * Marek Szyprowski <[email protected]> [141223 02:51]:
>>> From: Tomasz Figa <[email protected]>
>>>
>>> Certain implementations of secure hypervisors (namely the one found on
>>> Samsung Exynos-based boards) do not provide access to individual L2C
>>> registers. This makes the .write_sec()-based interface insufficient and
>>> provoking ugly hacks.
>>>
>>> This patch is first step to make the driver not rely on availability of
>>> writes to individual registers. This is achieved by refactoring the
>>> driver to use a commit-like operation scheme: all register values are
>>> prepared first and stored in an instance of l2x0_regs struct and then a
>>> single callback is responsible to flush those values to the hardware.
>>
>> The first patch of the series applied things boot with no problem.
>> But after applying this one I get the following on am437x:
>>
>> Unhandled fault: imprecise external abort (0xc06) at 0xb6f33884
>>
>> Probably the same issue Nishanth mentioned.
>>
>
> yep - just finished the bisect... came to the same conclusion..
>
> c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f is the first bad commit
> commit c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f
> Author: Tomasz Figa <[email protected]>
> Date: Tue Dec 23 11:48:30 2014 +0100
>
> ARM: l2c: Refactor the driver to use commit-like interface
>
> Certain implementations of secure hypervisors (namely the one found on
> Samsung Exynos-based boards) do not provide access to individual L2C
> registers. This makes the .write_sec()-based interface
> insufficient and
> provoking ugly hacks.
>
> This patch is first step to make the driver not rely on
> availability of
> writes to individual registers. This is achieved by refactoring the
> driver to use a commit-like operation scheme: all register values are
> prepared first and stored in an instance of l2x0_regs struct and
> then a
> single callback is responsible to flush those values to the hardware.
>
> Signed-off-by: Tomasz Figa <[email protected]>
> Signed-off-by: Marek Szyprowski <[email protected]>
>
> :040000 040000 74c6c74a0dc0612d124cd759951adf2a1e4124ee
> 8082aabb474f8659231de744d87cd8dbd6dd79bb M arch
>
>
> $ git bisect log
> git bisect start
> # good: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
> git bisect good 97bf6af1f928216fd6c5a66e8a57bfa95a659672
> # bad: [9afe195db6558621bd8bac379ed65ef121930684] ARM: dts: exynos4:
> Add nodes for L2 cache controller
> git bisect bad 9afe195db6558621bd8bac379ed65ef121930684
> # bad: [0a89ef4dd870bbf692e30fef6c8182d7b8b42e17] ARM: l2c: Get outer
> cache .write_sec callback from mach_desc only if not NULL
> git bisect bad 0a89ef4dd870bbf692e30fef6c8182d7b8b42e17
> # bad: [c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f] ARM: l2c: Refactor
> the driver to use commit-like interface
> git bisect bad c8c3a07fa6a8e9b27a1658e0d305b6f7e0fa068f
> # good: [080ab387c653b8655dc1ee790658b618399db2aa] ARM: OMAP2+: use
> common l2cache initialization code
> git bisect good 080ab387c653b8655dc1ee790658b618399db2aa
>
>

May I ask you (or anyone else working on OMAP) to try to figure out what
the issue is? It is stopping L2 cache support for Exynos4 being merged
and Exynos people don't have access to any of affected boards to do
anything about it. After all, this is generic code, so I believe
community should cooperate with pushing it forward. (Of course I
understand it is a holiday season at the moment, so I don't expect any
solution right at this moment :))

Apparently patch 1/8 solved problems with some of the boards. Could you
check how those boards differ and look for potential causes?

Best regards,
Tomasz

2014-12-29 14:31:03

by Nishanth Menon

[permalink] [raw]
Subject: Re: [PATCH v10 2/8] ARM: l2c: Refactor the driver to use commit-like interface

On 20:34-20141228, Tomasz Figa wrote:
> May I ask you (or anyone else working on OMAP) to try to figure out
> what the issue is? It is stopping L2 cache support for Exynos4 being

http://slexy.org/view/s2BnzxVglj
Took a register dump and compared -> Got the same dump
http://slexy.org/view/s21YRHpzeW .

Basically, this implies: possible sequence change broke AM437x or
some secure function call (which i have'nt dumped).

Is it possible to break up this patch into easier patches?

--
Regards,
Nishanth Menon

2014-12-29 18:24:10

by Nishanth Menon

[permalink] [raw]
Subject: Re: [PATCH v10 2/8] ARM: l2c: Refactor the driver to use commit-like interface

On 12/23/2014 04:48 AM, Marek Szyprowski wrote:

> -static void l2c310_resume(void)
> +static void l2c310_configure(void __iomem *base)
> {
> - void __iomem *base = l2x0_base;
> + unsigned revision;
>
> - if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
> - unsigned revision;
> -
> - /* restore pl310 setup */
> - writel_relaxed(l2x0_saved_regs.tag_latency,
> - base + L310_TAG_LATENCY_CTRL);
> - writel_relaxed(l2x0_saved_regs.data_latency,
> - base + L310_DATA_LATENCY_CTRL);
> - writel_relaxed(l2x0_saved_regs.filter_end,
> - base + L310_ADDR_FILTER_END);
> - writel_relaxed(l2x0_saved_regs.filter_start,
> - base + L310_ADDR_FILTER_START);
> -
> - revision = readl_relaxed(base + L2X0_CACHE_ID) &
> - L2X0_CACHE_ID_RTL_MASK;
> -
> - if (revision >= L310_CACHE_ID_RTL_R2P0)
> - l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
> - L310_PREFETCH_CTRL);
> - if (revision >= L310_CACHE_ID_RTL_R3P0)
> - l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
> - L310_POWER_CTRL);
> -
> - l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
> -
> - /* Re-enable full-line-of-zeros for Cortex-A9 */
> - if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
> - set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
> - }
> + /* restore pl310 setup */
> + writel_relaxed(l2x0_saved_regs.tag_latency,
> + base + L310_TAG_LATENCY_CTRL);
> + writel_relaxed(l2x0_saved_regs.data_latency,
> + base + L310_DATA_LATENCY_CTRL);
> + writel_relaxed(l2x0_saved_regs.filter_end,
> + base + L310_ADDR_FILTER_END);
> + writel_relaxed(l2x0_saved_regs.filter_start,
> + base + L310_ADDR_FILTER_START);
> +

^^ The above change broke AM437xx. Looks like the change causes the
following behavior difference on AM437x. For some reason, touching any
of the above 4 registers(even with the values read from the same
registers) causes AM437x to go beserk. Comment the 4 writes and we
reach shell. looks like l2c310_resume is not invoked prior to this
series. :(.. now that we reuse that logic to actually do programming,
we start to see the problem.

one option might be to write only those registers that differ from
saved_registers (example: unmodified values dont need reprogramming).

Looks like the following also need addressing:
data->save is called twice (once more after l2cof_init)
l2c310_init_fns also needs l2c310_configure
will be nice to use l2x0_data only after we kmemdup data in __l2c_init

if you'd like to split this up in pieces, [1] might be nice - will go
good to change the pl310, aurora etc in each chunks to enable better
review.

[1]
https://github.com/nmenon/linux-2.6-playground/commits/temp/l2c-patch2-splitup

--
Regards,
Nishanth Menon

2014-12-30 09:05:56

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCH v10 2/8] ARM: l2c: Refactor the driver to use commit-like interface

Thanks a lot for investigating this, even before I could look into
splitting this.

2014-12-30 3:23 GMT+09:00 Nishanth Menon <[email protected]>:
> On 12/23/2014 04:48 AM, Marek Szyprowski wrote:
>
>> -static void l2c310_resume(void)
>> +static void l2c310_configure(void __iomem *base)
>> {
>> - void __iomem *base = l2x0_base;
>> + unsigned revision;
>>
>> - if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
>> - unsigned revision;
>> -
>> - /* restore pl310 setup */
>> - writel_relaxed(l2x0_saved_regs.tag_latency,
>> - base + L310_TAG_LATENCY_CTRL);
>> - writel_relaxed(l2x0_saved_regs.data_latency,
>> - base + L310_DATA_LATENCY_CTRL);
>> - writel_relaxed(l2x0_saved_regs.filter_end,
>> - base + L310_ADDR_FILTER_END);
>> - writel_relaxed(l2x0_saved_regs.filter_start,
>> - base + L310_ADDR_FILTER_START);
>> -
>> - revision = readl_relaxed(base + L2X0_CACHE_ID) &
>> - L2X0_CACHE_ID_RTL_MASK;
>> -
>> - if (revision >= L310_CACHE_ID_RTL_R2P0)
>> - l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
>> - L310_PREFETCH_CTRL);
>> - if (revision >= L310_CACHE_ID_RTL_R3P0)
>> - l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
>> - L310_POWER_CTRL);
>> -
>> - l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
>> -
>> - /* Re-enable full-line-of-zeros for Cortex-A9 */
>> - if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
>> - set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
>> - }
>> + /* restore pl310 setup */
>> + writel_relaxed(l2x0_saved_regs.tag_latency,
>> + base + L310_TAG_LATENCY_CTRL);
>> + writel_relaxed(l2x0_saved_regs.data_latency,
>> + base + L310_DATA_LATENCY_CTRL);
>> + writel_relaxed(l2x0_saved_regs.filter_end,
>> + base + L310_ADDR_FILTER_END);
>> + writel_relaxed(l2x0_saved_regs.filter_start,
>> + base + L310_ADDR_FILTER_START);
>> +
>
> ^^ The above change broke AM437xx. Looks like the change causes the
> following behavior difference on AM437x. For some reason, touching any
> of the above 4 registers(even with the values read from the same
> registers) causes AM437x to go beserk. Comment the 4 writes and we
> reach shell. looks like l2c310_resume is not invoked prior to this
> series. :(.. now that we reuse that logic to actually do programming,
> we start to see the problem.

Hmm, but the thing is that .configure() should not be called if the
controller is already configured, i.e. L2X0_CTRL_EN in L2X0_CTRL is
set. Maybe I missed some check somewhere. Let me reread my code I
wrote quite a long time ago and make sure.

>
> one option might be to write only those registers that differ from
> saved_registers (example: unmodified values dont need reprogramming).
>
> Looks like the following also need addressing:
> data->save is called twice (once more after l2cof_init)
> l2c310_init_fns also needs l2c310_configure
> will be nice to use l2x0_data only after we kmemdup data in __l2c_init

I'll check this.

>
> if you'd like to split this up in pieces, [1] might be nice - will go
> good to change the pl310, aurora etc in each chunks to enable better
> review.

Thanks a lot, the split up version will be definitely useful. Just to
make sure, the parts look quite bisectable, but have you verified that
applying the changes one by one leave the L2 cache working on OMAP?

>
> [1]
> https://github.com/nmenon/linux-2.6-playground/commits/temp/l2c-patch2-splitup

Best regards,
Tomasz

2014-12-30 14:52:07

by Nishanth Menon

[permalink] [raw]
Subject: Re: [PATCH v10 2/8] ARM: l2c: Refactor the driver to use commit-like interface

On 12/30/2014 03:05 AM, Tomasz Figa wrote:
> Thanks a lot for investigating this, even before I could look into
> splitting this.
>
> 2014-12-30 3:23 GMT+09:00 Nishanth Menon <[email protected]>:
>> On 12/23/2014 04:48 AM, Marek Szyprowski wrote:
>>
>>> -static void l2c310_resume(void)
>>> +static void l2c310_configure(void __iomem *base)
>>> {
>>> - void __iomem *base = l2x0_base;
>>> + unsigned revision;
>>>
>>> - if (!(readl_relaxed(base + L2X0_CTRL) & L2X0_CTRL_EN)) {
>>> - unsigned revision;
>>> -
>>> - /* restore pl310 setup */
>>> - writel_relaxed(l2x0_saved_regs.tag_latency,
>>> - base + L310_TAG_LATENCY_CTRL);
>>> - writel_relaxed(l2x0_saved_regs.data_latency,
>>> - base + L310_DATA_LATENCY_CTRL);
>>> - writel_relaxed(l2x0_saved_regs.filter_end,
>>> - base + L310_ADDR_FILTER_END);
>>> - writel_relaxed(l2x0_saved_regs.filter_start,
>>> - base + L310_ADDR_FILTER_START);
>>> -
>>> - revision = readl_relaxed(base + L2X0_CACHE_ID) &
>>> - L2X0_CACHE_ID_RTL_MASK;
>>> -
>>> - if (revision >= L310_CACHE_ID_RTL_R2P0)
>>> - l2c_write_sec(l2x0_saved_regs.prefetch_ctrl, base,
>>> - L310_PREFETCH_CTRL);
>>> - if (revision >= L310_CACHE_ID_RTL_R3P0)
>>> - l2c_write_sec(l2x0_saved_regs.pwr_ctrl, base,
>>> - L310_POWER_CTRL);
>>> -
>>> - l2c_enable(base, l2x0_saved_regs.aux_ctrl, 8);
>>> -
>>> - /* Re-enable full-line-of-zeros for Cortex-A9 */
>>> - if (l2x0_saved_regs.aux_ctrl & L310_AUX_CTRL_FULL_LINE_ZERO)
>>> - set_auxcr(get_auxcr() | BIT(3) | BIT(2) | BIT(1));
>>> - }
>>> + /* restore pl310 setup */
>>> + writel_relaxed(l2x0_saved_regs.tag_latency,
>>> + base + L310_TAG_LATENCY_CTRL);
>>> + writel_relaxed(l2x0_saved_regs.data_latency,
>>> + base + L310_DATA_LATENCY_CTRL);
>>> + writel_relaxed(l2x0_saved_regs.filter_end,
>>> + base + L310_ADDR_FILTER_END);
>>> + writel_relaxed(l2x0_saved_regs.filter_start,
>>> + base + L310_ADDR_FILTER_START);
>>> +
>>
>> ^^ The above change broke AM437xx. Looks like the change causes the
>> following behavior difference on AM437x. For some reason, touching any
>> of the above 4 registers(even with the values read from the same
>> registers) causes AM437x to go beserk. Comment the 4 writes and we
>> reach shell. looks like l2c310_resume is not invoked prior to this
>> series. :(.. now that we reuse that logic to actually do programming,
>> we start to see the problem.
>
> Hmm, but the thing is that .configure() should not be called if the
> controller is already configured, i.e. L2X0_CTRL_EN in L2X0_CTRL is
> set. Maybe I missed some check somewhere. Let me reread my code I
> wrote quite a long time ago and make sure.

you have'nt missed a check here. it does indeed get called
l2c310_enable->l2c_enable -> (if cache
disabled)->l2c_configure->l2c310_configure

It looks like a quirk of AM437x which remained hidden till this patch
exposed it. The original pl310 resume would have been invoked if
outer_resume was invoked (if my reading of code is correct), which in
the case of AM437x was never invoked during boot. By reusing the
restore code for resume (which in my opinion is a good change), we
seemed to have exposed quirky behavior on am437x. I started a thread
yesterday with hardware folks trying to understand the integration
aspects and explanation for this quirky behavior - unfortunately, with
holidays, I doubt I might get a quick answer fast. but the workaround
seems obvious - do not write to the the mentioned 4 registers on
am437x. This might make sense it arm,tag-latency , arm,data-latency,
arm,filter-ranges properties are not defined in OF.

>> one option might be to write only those registers that differ from
>> saved_registers (example: unmodified values dont need reprogramming).
>>
>> Looks like the following also need addressing:
>> data->save is called twice (once more after l2cof_init)
>> l2c310_init_fns also needs l2c310_configure
>> will be nice to use l2x0_data only after we kmemdup data in __l2c_init
>
> I'll check this.
Thanks.

>
>>
>> if you'd like to split this up in pieces, [1] might be nice - will go
>> good to change the pl310, aurora etc in each chunks to enable better
>> review.
>
> Thanks a lot, the split up version will be definitely useful. Just to
> make sure, the parts look quite bisectable, but have you verified that
> applying the changes one by one leave the L2 cache working on OMAP?

The split is not complete or properly done as I ignored aurora and was
trying to do it focussed on pl310 impacted code in a hurry, but yes,
the split is indeed bisectable for OMAP platforms. The intent was to
indicate the direction we should probably take and introduce the
refactor in stages. At the very least, it tends to be easier to debug
down to.

>> [1]
>> https://github.com/nmenon/linux-2.6-playground/commits/temp/l2c-patch2-splitup
>
> Best regards,
> Tomasz
>


--
Regards,
Nishanth Menon