2013-10-29 00:31:36

by Stephen Boyd

[permalink] [raw]
Subject: [PATCH 0/6] Krait L1/L2 EDAC driver

This patchset adds support for the Krait L1/L2 cache error detection
hardware. The first patch fixes a generic framework bug. The next
two patches lay the groundwork for this driver to be added by
exporting percpu irq functions as well as adding the Krait l2 indirection
register code. The next two patches add the driver and the binding and
the final patch hooks it all up by adding the device tree node.

I'm not sure which tree this is supposed to go through. Ideally we could
send the first 3 plus the 5th one through an edac tree. The final dts changes
could go through arm-soc via davidb's tree and the Documentation patch could
go through the devicetree tree.

Stephen Boyd (6):
edac: Don't try to cancel workqueue when it's never setup
genirq: export percpu irq functions for module usage
ARM: Add Krait L2 accessor functions
edac: Document Krait L1/L2 EDAC driver binding
edac: Add support for Krait CPU cache error detection
ARM: dts: msm: Add Krait edac node

.../bindings/arm/qcom,krait-cache-erp.txt | 16 +
arch/arm/boot/dts/qcom-msm8974.dtsi | 5 +
arch/arm/common/Kconfig | 3 +
arch/arm/common/Makefile | 1 +
arch/arm/common/krait-l2-accessors.c | 52 ++++
arch/arm/include/asm/krait-l2-accessors.h | 20 ++
drivers/edac/Kconfig | 8 +
drivers/edac/Makefile | 2 +
drivers/edac/edac_device.c | 3 +
drivers/edac/krait_edac.c | 335 +++++++++++++++++++++
kernel/irq/manage.c | 2 +
11 files changed, 447 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
create mode 100644 arch/arm/common/krait-l2-accessors.c
create mode 100644 arch/arm/include/asm/krait-l2-accessors.h
create mode 100644 drivers/edac/krait_edac.c

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


2013-10-29 00:31:39

by Stephen Boyd

[permalink] [raw]
Subject: [PATCH 2/6] genirq: export percpu irq functions for module usage

In the near future we're going to use these percpu irq functions
in the Krait CPU EDAC driver. Export them so that the EDAC driver
can be compiled as a module.

Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
---
kernel/irq/manage.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 514bcfd..c6009d1 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1623,6 +1623,7 @@ void free_percpu_irq(unsigned int irq, void __percpu *dev_id)
kfree(__free_percpu_irq(irq, dev_id));
chip_bus_sync_unlock(desc);
}
+EXPORT_SYMBOL_GPL(free_percpu_irq);

/**
* setup_percpu_irq - setup a per-cpu interrupt
@@ -1693,3 +1694,4 @@ int request_percpu_irq(unsigned int irq, irq_handler_t handler,

return retval;
}
+EXPORT_SYMBOL_GPL(request_percpu_irq);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 00:31:38

by Stephen Boyd

[permalink] [raw]
Subject: [PATCH 1/6] edac: Don't try to cancel workqueue when it's never setup

We only setup a workqueue for edac devices that use the polling
method. We still try to cancel the workqueue if an edac_device
uses the irq method though. This causes a warning from debug
objects when we remove an edac device:

WARNING: CPU: 0 PID: 56 at lib/debugobjects.c:260 debug_print_object+0x98/0xc0()
ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x28
Modules linked in: krait_edac(-)
CPU: 0 PID: 56 Comm: rmmod Not tainted 3.12.0-rc2-00035-g15292a0 #69
[<c02179b4>] (unwind_backtrace+0x0/0x144) from [<c0213e58>] (show_stack+0x20/0x24)
[<c0213e58>] (show_stack+0x20/0x24) from [<c06f7fc4>] (dump_stack+0x74/0xb4)
[<c06f7fc4>] (dump_stack+0x74/0xb4) from [<c0223c34>] (warn_slowpath_common+0x78/0x9c)
[<c0223c34>] (warn_slowpath_common+0x78/0x9c) from [<c0223d14>] (warn_slowpath_fmt+0x40/0x48)
[<c0223d14>] (warn_slowpath_fmt+0x40/0x48) from [<c04cf078>] (debug_print_object+0x98/0xc0)
[<c04cf078>] (debug_print_object+0x98/0xc0) from [<c04cfb54>] (debug_object_assert_init+0xdc/0xec)
[<c04cfb54>] (debug_object_assert_init+0xdc/0xec) from [<c02310c4>] (del_timer+0x24/0x7c)
[<c02310c4>] (del_timer+0x24/0x7c) from [<c023fef8>] (try_to_grab_pending+0xc0/0x1b0)
[<c023fef8>] (try_to_grab_pending+0xc0/0x1b0) from [<c0240014>] (cancel_delayed_work+0x2c/0xa0)
[<c0240014>] (cancel_delayed_work+0x2c/0xa0) from [<c059fa14>] (edac_device_workq_teardown+0x1c/0x38)
[<c059fa14>] (edac_device_workq_teardown+0x1c/0x38) from [<c059fae8>] (edac_device_del_device+0xb8/0xe4)
[<c059fae8>] (edac_device_del_device+0xb8/0xe4) from [<bf000050>] (krait_edac_remove+0x50/0x70 [krait_edac])
[<bf000050>] (krait_edac_remove+0x50/0x70 [krait_edac]) from [<c0511e34>] (platform_drv_remove+0x24/0x28)
[<c0511e34>] (platform_drv_remove+0x24/0x28) from [<c050fffc>] (__device_release_driver+0x68/0xc0)
[<c050fffc>] (__device_release_driver+0x68/0xc0) from [<c0510adc>] (driver_detach+0xc4/0xc8)
[<c0510adc>] (driver_detach+0xc4/0xc8) from [<c050fd98>] (bus_remove_driver+0xac/0x114)
[<c050fd98>] (bus_remove_driver+0xac/0x114) from [<c05111d8>] (driver_unregister+0x38/0x58)
[<c05111d8>] (driver_unregister+0x38/0x58) from [<c0511fe8>] (platform_driver_unregister+0x1c/0x20)
[<c0511fe8>] (platform_driver_unregister+0x1c/0x20) from [<bf0005c4>] (krait_edac_driver_exit+0x14/0x1c [krait_edac])
[<bf0005c4>] (krait_edac_driver_exit+0x14/0x1c [krait_edac]) from [<c028b248>] (SyS_delete_module+0x178/0x2b4)
[<c028b248>] (SyS_delete_module+0x178/0x2b4) from [<c020f760>] (ret_fast_syscall+0x0/0x48)

Fix it by skipping the workqueue teardown for such devices.

Signed-off-by: Stephen Boyd <[email protected]>
---
drivers/edac/edac_device.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 211021d..203561b 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -437,6 +437,9 @@ void edac_device_workq_teardown(struct edac_device_ctl_info *edac_dev)
{
int status;

+ if (!edac_dev->edac_check)
+ return;
+
status = cancel_delayed_work(&edac_dev->work);
if (status == 0) {
/* workq instance might be running, wait for it */
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 00:32:23

by Stephen Boyd

[permalink] [raw]
Subject: [PATCH 6/6] ARM: dts: msm: Add Krait edac node

Cc: David Brown <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
---

This is based on the dts updates I sent two weeks ago to add mmio
architected timesr to qcom-msm8974.dtsi.

arch/arm/boot/dts/qcom-msm8974.dtsi | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
index 152879d..7b15eb7 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -32,6 +32,11 @@
clock-frequency = <19200000>;
};

+ edac {
+ compatible = "qcom,krait-cache-erp";
+ interrupts = <1 9 0xf04>, <0 2 0>;
+ };
+
timer@f9020000 {
#address-cells = <1>;
#size-cells = <1>;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 00:32:20

by Stephen Boyd

[permalink] [raw]
Subject: [PATCH 5/6] edac: Add support for Krait CPU cache error detection

Add support for the Krait CPU cache error detection. This is a
simplified version of the code originally written by Stepan
Moskovchenko[1] ported to the EDAC device framework.

[1] https://www.codeaurora.org/cgit/quic/la/kernel/msm/tree/arch/arm/mach-msm/cache_erp.c?h=msm-3.4

Cc: Stepan Moskovchenko <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
---
drivers/edac/Kconfig | 8 ++
drivers/edac/Makefile | 2 +
drivers/edac/krait_edac.c | 336 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 346 insertions(+)
create mode 100644 drivers/edac/krait_edac.c

diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 878f090..68f612d 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -368,4 +368,12 @@ config EDAC_OCTEON_PCI
Support for error detection and correction on the
Cavium Octeon family of SOCs.

+config EDAC_KRAIT_CACHE
+ tristate "Krait L1/L2 Cache"
+ depends on EDAC_MM_EDAC && ARCH_MSM
+ select KRAIT_L2_ACCESSORS
+ help
+ Support for error detection and correction on the
+ Krait L1/L2 cache controller.
+
endif # EDAC
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 4154ed6..b6ea505 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -64,3 +64,5 @@ obj-$(CONFIG_EDAC_OCTEON_PC) += octeon_edac-pc.o
obj-$(CONFIG_EDAC_OCTEON_L2C) += octeon_edac-l2c.o
obj-$(CONFIG_EDAC_OCTEON_LMC) += octeon_edac-lmc.o
obj-$(CONFIG_EDAC_OCTEON_PCI) += octeon_edac-pci.o
+
+obj-$(CONFIG_EDAC_KRAIT_CACHE) += krait_edac.o
diff --git a/drivers/edac/krait_edac.c b/drivers/edac/krait_edac.c
new file mode 100644
index 0000000..89380a1
--- /dev/null
+++ b/drivers/edac/krait_edac.c
@@ -0,0 +1,336 @@
+/* Copyright (c) 2012-2013, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/platform_device.h>
+#include <linux/module.h>
+#include <linux/err.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/of.h>
+
+#include <asm/krait-l2-accessors.h>
+
+#include "edac_core.h"
+
+#define CESR_DCTPE BIT(0)
+#define CESR_DCDPE BIT(1)
+#define CESR_ICTPE BIT(2)
+#define CESR_ICDPE BIT(3)
+#define CESR_DCTE (BIT(4) | BIT(5))
+#define CESR_ICTE (BIT(6) | BIT(7))
+#define CESR_TLBMH BIT(16)
+#define CESR_I_MASK 0x000000cc
+/* Print a message for everything but TLB MH events */
+#define CESR_PRINT_MASK 0x000000ff
+
+#define L2ESR 0x204
+#define L2ESR_MPDCD BIT(0)
+#define L2ESR_MPSLV BIT(1)
+#define L2ESR_TSESB BIT(2)
+#define L2ESR_TSEDB BIT(3)
+#define L2ESR_DSESB BIT(4)
+#define L2ESR_DSEDB BIT(5)
+#define L2ESR_MSE BIT(6)
+#define L2ESR_MPLDREXNOK BIT(8)
+#define L2ESR_CPU_MASK 0xf
+#define L2ESR_CPU_SHIFT 16
+#define L2ESR_SP BIT(20)
+
+#define L2ESYNR0 0x208
+#define L2ESYNR1 0x209
+#define L2EAR0 0x20c
+#define L2EAR1 0x20d
+
+struct krait_edac {
+ int l1_irq;
+ struct edac_device_ctl_info * __percpu *edev;
+ struct notifier_block notifier;
+};
+
+struct krait_edac_error {
+ const char * const msg;
+ void (*func)(struct edac_device_ctl_info *edac_dev,
+ int inst_nr, int block_nr, const char *msg);
+};
+
+static unsigned int read_cesr(void)
+{
+ unsigned int cesr;
+
+ asm volatile ("mrc p15, 7, %0, c15, c0, 1" : "=r" (cesr));
+ return cesr;
+}
+
+static void write_cesr(unsigned int cesr)
+{
+ asm volatile ("mcr p15, 7, %0, c15, c0, 1" : : "r" (cesr));
+}
+
+static unsigned int read_cesynr(void)
+{
+ unsigned int cesynr;
+
+ asm volatile ("mrc p15, 7, %0, c15, c0, 3" : "=r" (cesynr));
+ return cesynr;
+}
+
+static irqreturn_t krait_l1_irq(int irq, void *dev_id)
+{
+ struct edac_device_ctl_info *edac = dev_id;
+ unsigned int cesr = read_cesr();
+ unsigned int i_cesynr, d_cesynr;
+ unsigned int cpu = smp_processor_id();
+ int print_regs = cesr & CESR_PRINT_MASK;
+ int i;
+ static const struct krait_edac_error errors[] = {
+ { "D-cache tag parity error", edac_device_handle_ue },
+ { "D-cache data parity error", edac_device_handle_ue },
+ { "I-cache tag parity error", edac_device_handle_ce },
+ { "I-cache data parity error", edac_device_handle_ce },
+ { "D-cache tag timing error", edac_device_handle_ue },
+ { "D-cache data timing error", edac_device_handle_ue },
+ { "I-cache tag timing error", edac_device_handle_ce },
+ { "I-cache data timing error", edac_device_handle_ce }
+ };
+
+ if (print_regs) {
+ pr_alert("L1 / TLB Error detected on CPU %d!\n", cpu);
+ pr_alert("CESR = 0x%08x\n", cesr);
+ pr_alert("MIDR = 0x%08x\n", read_cpuid_id());
+ }
+
+ for (i = 0; i < ARRAY_SIZE(errors); i++)
+ if (BIT(i) & cesr)
+ errors[i].func(edac, cpu, 0, errors[i].msg);
+
+ if (cesr & CESR_TLBMH) {
+ asm ("mcr p15, 0, r0, c8, c7, 0");
+ edac_device_handle_ce(edac, cpu, 0, "TLB Multi-Hit error");
+ }
+
+ if (cesr & (CESR_ICTPE | CESR_ICDPE | CESR_ICTE)) {
+ i_cesynr = read_cesynr();
+ pr_alert("I-side CESYNR = 0x%08x\n", i_cesynr);
+ write_cesr(CESR_I_MASK);
+
+ /*
+ * Clear the I-side bits from the captured CESR value so that we
+ * don't accidentally clear any new I-side errors when we do
+ * the CESR write-clear operation.
+ */
+ cesr &= ~CESR_I_MASK;
+ }
+
+ if (cesr & (CESR_DCTPE | CESR_DCDPE | CESR_DCTE)) {
+ d_cesynr = read_cesynr();
+ pr_alert("D-side CESYNR = 0x%08x\n", d_cesynr);
+ }
+
+ /* Clear the interrupt bits we processed */
+ write_cesr(cesr);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t krait_l2_irq(int irq, void *dev_id)
+{
+ struct edac_device_ctl_info *edac = dev_id;
+ unsigned int l2esr;
+ unsigned int l2esynr0;
+ unsigned int l2esynr1;
+ unsigned int l2ear0;
+ unsigned int l2ear1;
+ unsigned long cpu;
+ int i;
+ static const struct krait_edac_error errors[] = {
+ { "master port decode error", edac_device_handle_ce },
+ { "master port slave error", edac_device_handle_ce },
+ { "tag soft error, single-bit", edac_device_handle_ce },
+ { "tag soft error, double-bit", edac_device_handle_ue },
+ { "data soft error, single-bit", edac_device_handle_ce },
+ { "data soft error, double-bit", edac_device_handle_ue },
+ { "modified soft error", edac_device_handle_ce },
+ { "slave port exclusive monitor not available",
+ edac_device_handle_ue},
+ { "master port LDREX received Normal OK response",
+ edac_device_handle_ce },
+ };
+
+ l2esr = get_l2_indirect_reg(L2ESR);
+ pr_alert("Error detected!\n");
+ pr_alert("L2ESR = 0x%08x\n", l2esr);
+
+ if (l2esr & (L2ESR_TSESB | L2ESR_TSEDB | L2ESR_MSE | L2ESR_SP)) {
+ l2esynr0 = get_l2_indirect_reg(L2ESYNR0);
+ l2esynr1 = get_l2_indirect_reg(L2ESYNR1);
+ l2ear0 = get_l2_indirect_reg(L2EAR0);
+ l2ear1 = get_l2_indirect_reg(L2EAR1);
+
+ pr_alert("L2ESYNR0 = 0x%08x\n", l2esynr0);
+ pr_alert("L2ESYNR1 = 0x%08x\n", l2esynr1);
+ pr_alert("L2EAR0 = 0x%08x\n", l2ear0);
+ pr_alert("L2EAR1 = 0x%08x\n", l2ear1);
+ }
+
+ cpu = (l2esr >> L2ESR_CPU_SHIFT) & L2ESR_CPU_MASK;
+ cpu = __ffs(cpu);
+ if (cpu)
+ cpu--;
+ for (i = 0; i < ARRAY_SIZE(errors); i++)
+ if (BIT(i) & l2esr)
+ errors[i].func(edac, cpu, 1, errors[i].msg);
+
+ set_l2_indirect_reg(L2ESR, l2esr);
+
+ return IRQ_HANDLED;
+}
+
+static void enable_l1_irq(void *info)
+{
+ const struct krait_edac *k = info;
+
+ enable_percpu_irq(k->l1_irq, IRQ_TYPE_LEVEL_HIGH);
+}
+
+static void disable_l1_irq(void *info)
+{
+ const struct krait_edac *k = info;
+
+ disable_percpu_irq(k->l1_irq);
+}
+
+static int
+krait_edac_notify(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+ struct krait_edac *p = container_of(nfb, struct krait_edac, notifier);
+
+ switch (action & ~CPU_TASKS_FROZEN) {
+ case CPU_STARTING:
+ enable_l1_irq(p);
+ break;
+
+ case CPU_DYING:
+ disable_l1_irq(p);
+ break;
+ }
+ return NOTIFY_OK;
+}
+
+static int krait_edac_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct edac_device_ctl_info *edev;
+ struct krait_edac *p;
+ int l1_irq, l2_irq;
+ int ret;
+ int cpu;
+
+ l1_irq = platform_get_irq(pdev, 0);
+ if (l1_irq < 0)
+ return l1_irq;
+
+ l2_irq = platform_get_irq(pdev, 1);
+ if (l2_irq < 0)
+ return l2_irq;
+
+ p = devm_kzalloc(dev, sizeof(*p), GFP_KERNEL);
+ if (!p)
+ return -ENOMEM;
+ platform_set_drvdata(pdev, p);
+ p->l1_irq = l1_irq;
+
+ p->edev = alloc_percpu(struct edac_device_ctl_info *);
+ if (!p->edev)
+ return -ENOMEM;
+
+ edev = edac_device_alloc_ctl_info(0, "cpu", num_possible_cpus(),
+ "L", 2, 1, NULL, 0,
+ edac_device_alloc_index());
+ if (!edev) {
+ ret = -ENOMEM;
+ goto err_alloc;
+ }
+
+ edev->dev = dev;
+ edev->mod_name = dev_name(dev);
+ edev->dev_name = dev_name(dev);
+ edev->ctl_name = "cache";
+
+ for_each_possible_cpu(cpu)
+ *per_cpu_ptr(p->edev, cpu) = edev;
+
+ ret = edac_device_add_device(edev);
+ if (ret)
+ goto err_add;
+
+ ret = request_percpu_irq(l1_irq, krait_l1_irq, "[edac] L1 err",
+ p->edev);
+ if (ret)
+ goto err_l1_irq;
+
+ ret = devm_request_irq(dev, l2_irq, krait_l2_irq, 0, "[edac] L2 err",
+ edev);
+ if (ret)
+ goto err_l2_irq;
+
+ p->notifier.notifier_call = krait_edac_notify;
+ register_hotcpu_notifier(&p->notifier);
+ on_each_cpu(enable_l1_irq, p, true);
+
+ return 0;
+err_l2_irq:
+ free_percpu_irq(p->l1_irq, p->edev);
+err_l1_irq:
+ edac_device_del_device(dev);
+err_add:
+ edac_device_free_ctl_info(edev);
+err_alloc:
+ free_percpu(p->edev);
+ return ret;
+}
+
+static int krait_edac_remove(struct platform_device *pdev)
+{
+ struct krait_edac *p = platform_get_drvdata(pdev);
+
+ unregister_hotcpu_notifier(&p->notifier);
+ on_each_cpu(disable_l1_irq, p, true);
+ free_percpu_irq(p->l1_irq, p->edev);
+ edac_device_del_device(&pdev->dev);
+ edac_device_free_ctl_info(*__this_cpu_ptr(p->edev));
+ free_percpu(p->edev);
+
+ return 0;
+}
+
+static const struct of_device_id krait_edac_match_table[] = {
+ { .compatible = "qcom,krait-cache-erp" },
+ { }
+};
+MODULE_DEVICE_TABLE(of, krait_edac_match_table);
+
+static struct platform_driver krait_edac_driver = {
+ .probe = krait_edac_probe,
+ .remove = krait_edac_remove,
+ .driver = {
+ .name = "krait_edac",
+ .owner = THIS_MODULE,
+ .of_match_table = of_match_ptr(krait_edac_match_table),
+ },
+};
+module_platform_driver(krait_edac_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Krait CPU cache error reporting driver");
+MODULE_ALIAS("platform:krait_edac");
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 00:32:55

by Stephen Boyd

[permalink] [raw]
Subject: [PATCH 3/6] ARM: Add Krait L2 accessor functions

Qualcomm's Krait CPUs have a handful of L2 cache controller
registers that live behind a cp15 based indirection register.
First you program the indirection register (l2cpselr) to point
the L2 'window' register (l2cpdr) at what you want to read/write.
Then you read/write the 'window' register to do what you want.
The l2cpselr register is not banked per-cpu so we must lock
around accesses to it to prevent other CPUs from re-pointing
l2cpdr underneath us.

Cc: Russell King <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
---
arch/arm/common/Kconfig | 3 ++
arch/arm/common/Makefile | 1 +
arch/arm/common/krait-l2-accessors.c | 52 +++++++++++++++++++++++++++++++
arch/arm/include/asm/krait-l2-accessors.h | 20 ++++++++++++
4 files changed, 76 insertions(+)
create mode 100644 arch/arm/common/krait-l2-accessors.c
create mode 100644 arch/arm/include/asm/krait-l2-accessors.h

diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
index c3a4e9c..9da52dc 100644
--- a/arch/arm/common/Kconfig
+++ b/arch/arm/common/Kconfig
@@ -9,6 +9,9 @@ config DMABOUNCE
bool
select ZONE_DMA

+config KRAIT_L2_ACCESSORS
+ bool
+
config SHARP_LOCOMO
bool

diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
index 8c60f47..606131a 100644
--- a/arch/arm/common/Makefile
+++ b/arch/arm/common/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_ICST) += icst.o
obj-$(CONFIG_SA1111) += sa1111.o
obj-$(CONFIG_PCI_HOST_VIA82C505) += via82c505.o
obj-$(CONFIG_DMABOUNCE) += dmabounce.o
+obj-$(CONFIG_KRAIT_L2_ACCESSORS) += krait-l2-accessors.o
obj-$(CONFIG_SHARP_LOCOMO) += locomo.o
obj-$(CONFIG_SHARP_PARAM) += sharpsl_param.o
obj-$(CONFIG_SHARP_SCOOP) += scoop.o
diff --git a/arch/arm/common/krait-l2-accessors.c b/arch/arm/common/krait-l2-accessors.c
new file mode 100644
index 0000000..c579794
--- /dev/null
+++ b/arch/arm/common/krait-l2-accessors.c
@@ -0,0 +1,52 @@
+/*
+ * Copyright (c) 2011-2013, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/spinlock.h>
+#include <linux/export.h>
+
+#include <asm/barrier.h>
+#include <asm/krait-l2-accessors.h>
+
+static DEFINE_RAW_SPINLOCK(krait_l2_lock);
+
+void set_l2_indirect_reg(u32 addr, u32 val)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&krait_l2_lock, flags);
+
+ asm volatile ("mcr p15, 3, %0, c15, c0, 6" : : "r" (addr));
+ isb();
+ asm volatile ("mcr p15, 3, %0, c15, c0, 7" : : "r" (val));
+ isb();
+
+ raw_spin_unlock_irqrestore(&krait_l2_lock, flags);
+}
+EXPORT_SYMBOL(set_l2_indirect_reg);
+
+u32 get_l2_indirect_reg(u32 addr)
+{
+ u32 val;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&krait_l2_lock, flags);
+
+ asm volatile ("mcr p15, 3, %0, c15, c0, 6" : : "r" (addr));
+ isb();
+ asm volatile ("mrc p15, 3, %0, c15, c0, 7" : "=r" (val));
+
+ raw_spin_unlock_irqrestore(&krait_l2_lock, flags);
+
+ return val;
+}
+EXPORT_SYMBOL(get_l2_indirect_reg);
diff --git a/arch/arm/include/asm/krait-l2-accessors.h b/arch/arm/include/asm/krait-l2-accessors.h
new file mode 100644
index 0000000..d5305c4
--- /dev/null
+++ b/arch/arm/include/asm/krait-l2-accessors.h
@@ -0,0 +1,20 @@
+/*
+ * Copyright (c) 2011-2013, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __ASMARM_KRAIT_L2_ACCESSORS_H
+#define __ASMARM_KRAIT_L2_ACCESSORS_H
+
+extern void set_l2_indirect_reg(u32 addr, u32 val);
+extern u32 get_l2_indirect_reg(u32 addr);
+
+#endif
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 00:32:54

by Stephen Boyd

[permalink] [raw]
Subject: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding

The Krait L1/L2 error reporting device is made up of two
interrupts, one per-CPU interrupt for the L1 caches and one
interrupt for the L2 cache.

Cc: <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
---
.../devicetree/bindings/arm/qcom,krait-cache-erp.txt | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt

diff --git a/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
new file mode 100644
index 0000000..01fe8a8
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
@@ -0,0 +1,16 @@
+* Qualcomm Krait L1 / L2 cache error reporting
+
+Required properties:
+- compatible: Should be "qcom,krait-cache-erp"
+- interrupts: Should contain the L1/CPU error interrupt number and
+ then the L2 cache error interrupt number
+
+Optional properties:
+- interrupt-names: Should contain the interrupt names "l1_irq" and
+ "l2_irq"
+
+Example:
+ edac {
+ compatible = "qcom,krait-cache-erp";
+ interrupts = <1 9 0xf04>, <0 2 0x4>;
+ };
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 01:20:21

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH 3/6] ARM: Add Krait L2 accessor functions

On Tue, Oct 29, 2013 at 12:31:27AM +0000, Stephen Boyd wrote:
> Qualcomm's Krait CPUs have a handful of L2 cache controller
> registers that live behind a cp15 based indirection register.
> First you program the indirection register (l2cpselr) to point
> the L2 'window' register (l2cpdr) at what you want to read/write.
> Then you read/write the 'window' register to do what you want.
> The l2cpselr register is not banked per-cpu so we must lock
> around accesses to it to prevent other CPUs from re-pointing
> l2cpdr underneath us.
>
> Cc: Russell King <[email protected]>
> Signed-off-by: Stephen Boyd <[email protected]>
> ---
> arch/arm/common/Kconfig | 3 ++
> arch/arm/common/Makefile | 1 +
> arch/arm/common/krait-l2-accessors.c | 52 +++++++++++++++++++++++++++++++
> arch/arm/include/asm/krait-l2-accessors.h | 20 ++++++++++++
> 4 files changed, 76 insertions(+)
> create mode 100644 arch/arm/common/krait-l2-accessors.c
> create mode 100644 arch/arm/include/asm/krait-l2-accessors.h

[...]

> +void set_l2_indirect_reg(u32 addr, u32 val)
> +{
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&krait_l2_lock, flags);
> +
> + asm volatile ("mcr p15, 3, %0, c15, c0, 6" : : "r" (addr));
> + isb();
> + asm volatile ("mcr p15, 3, %0, c15, c0, 7" : : "r" (val));
> + isb();
> +
> + raw_spin_unlock_irqrestore(&krait_l2_lock, flags);
> +}
> +EXPORT_SYMBOL(set_l2_indirect_reg);

It might be worth commmenting inline as to what register each of these is
accessing. Inevitably the commit message will become harder to find and
associate with the code over time.

Similarly for get_l2_indirect_reg.

Thanks,
Mark.

2013-10-29 01:22:00

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH 3/6] ARM: Add Krait L2 accessor functions

On 10/28/13 18:19, Mark Rutland wrote:
> On Tue, Oct 29, 2013 at 12:31:27AM +0000, Stephen Boyd wrote:
>> Qualcomm's Krait CPUs have a handful of L2 cache controller
>> registers that live behind a cp15 based indirection register.
>> First you program the indirection register (l2cpselr) to point
>> the L2 'window' register (l2cpdr) at what you want to read/write.
>> Then you read/write the 'window' register to do what you want.
>> The l2cpselr register is not banked per-cpu so we must lock
>> around accesses to it to prevent other CPUs from re-pointing
>> l2cpdr underneath us.
>>
>> Cc: Russell King <[email protected]>
>> Signed-off-by: Stephen Boyd <[email protected]>
>> ---
>> arch/arm/common/Kconfig | 3 ++
>> arch/arm/common/Makefile | 1 +
>> arch/arm/common/krait-l2-accessors.c | 52 +++++++++++++++++++++++++++++++
>> arch/arm/include/asm/krait-l2-accessors.h | 20 ++++++++++++
>> 4 files changed, 76 insertions(+)
>> create mode 100644 arch/arm/common/krait-l2-accessors.c
>> create mode 100644 arch/arm/include/asm/krait-l2-accessors.h
> [...]
>
>> +void set_l2_indirect_reg(u32 addr, u32 val)
>> +{
>> + unsigned long flags;
>> +
>> + raw_spin_lock_irqsave(&krait_l2_lock, flags);
>> +
>> + asm volatile ("mcr p15, 3, %0, c15, c0, 6" : : "r" (addr));
>> + isb();
>> + asm volatile ("mcr p15, 3, %0, c15, c0, 7" : : "r" (val));
>> + isb();
>> +
>> + raw_spin_unlock_irqrestore(&krait_l2_lock, flags);
>> +}
>> +EXPORT_SYMBOL(set_l2_indirect_reg);
> It might be worth commmenting inline as to what register each of these is
> accessing. Inevitably the commit message will become harder to find and
> associate with the code over time.
>
> Similarly for get_l2_indirect_reg.

Do you mean with the "@" syntax in the assembly? The 80-character limit
is out to get me.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 01:28:31

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH 5/6] edac: Add support for Krait CPU cache error detection

On Tue, Oct 29, 2013 at 12:31:29AM +0000, Stephen Boyd wrote:
> Add support for the Krait CPU cache error detection. This is a
> simplified version of the code originally written by Stepan
> Moskovchenko[1] ported to the EDAC device framework.
>
> [1] https://www.codeaurora.org/cgit/quic/la/kernel/msm/tree/arch/arm/mach-msm/cache_erp.c?h=msm-3.4
>
> Cc: Stepan Moskovchenko <[email protected]>
> Signed-off-by: Stephen Boyd <[email protected]>
> ---
> drivers/edac/Kconfig | 8 ++
> drivers/edac/Makefile | 2 +
> drivers/edac/krait_edac.c | 336 ++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 346 insertions(+)
> create mode 100644 drivers/edac/krait_edac.c

[...]

> +static int krait_edac_probe(struct platform_device *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct edac_device_ctl_info *edev;
> + struct krait_edac *p;
> + int l1_irq, l2_irq;
> + int ret;
> + int cpu;
> +
> + l1_irq = platform_get_irq(pdev, 0);
> + if (l1_irq < 0)
> + return l1_irq;
> +
> + l2_irq = platform_get_irq(pdev, 1);
> + if (l2_irq < 0)
> + return l2_irq;

As you have optional interrupt-names, I would prefer that you attempted to
request interrupts by name if interrupt-names is present. Otherwise
interrupt-names brings no value.

Thanks,
Mark.

2013-10-29 01:33:07

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH 3/6] ARM: Add Krait L2 accessor functions

On Tue, Oct 29, 2013 at 01:21:57AM +0000, Stephen Boyd wrote:
> On 10/28/13 18:19, Mark Rutland wrote:
> > On Tue, Oct 29, 2013 at 12:31:27AM +0000, Stephen Boyd wrote:
> >> Qualcomm's Krait CPUs have a handful of L2 cache controller
> >> registers that live behind a cp15 based indirection register.
> >> First you program the indirection register (l2cpselr) to point
> >> the L2 'window' register (l2cpdr) at what you want to read/write.
> >> Then you read/write the 'window' register to do what you want.
> >> The l2cpselr register is not banked per-cpu so we must lock
> >> around accesses to it to prevent other CPUs from re-pointing
> >> l2cpdr underneath us.
> >>
> >> Cc: Russell King <[email protected]>
> >> Signed-off-by: Stephen Boyd <[email protected]>
> >> ---
> >> arch/arm/common/Kconfig | 3 ++
> >> arch/arm/common/Makefile | 1 +
> >> arch/arm/common/krait-l2-accessors.c | 52 +++++++++++++++++++++++++++++++
> >> arch/arm/include/asm/krait-l2-accessors.h | 20 ++++++++++++
> >> 4 files changed, 76 insertions(+)
> >> create mode 100644 arch/arm/common/krait-l2-accessors.c
> >> create mode 100644 arch/arm/include/asm/krait-l2-accessors.h
> > [...]
> >
> >> +void set_l2_indirect_reg(u32 addr, u32 val)
> >> +{
> >> + unsigned long flags;
> >> +
> >> + raw_spin_lock_irqsave(&krait_l2_lock, flags);
> >> +
> >> + asm volatile ("mcr p15, 3, %0, c15, c0, 6" : : "r" (addr));
> >> + isb();
> >> + asm volatile ("mcr p15, 3, %0, c15, c0, 7" : : "r" (val));
> >> + isb();
> >> +
> >> + raw_spin_unlock_irqrestore(&krait_l2_lock, flags);
> >> +}
> >> +EXPORT_SYMBOL(set_l2_indirect_reg);
> > It might be worth commmenting inline as to what register each of these is
> > accessing. Inevitably the commit message will become harder to find and
> > associate with the code over time.
> >
> > Similarly for get_l2_indirect_reg.
>
> Do you mean with the "@" syntax in the assembly? The 80-character limit
> is out to get me.

I probably didn't mean inline :)

How about a block comment above the first asm block like:

/*
* Select the L2 window by poking l2cpselr, then write to the window via
* l2cpdr.
*/

Thanks,
Mark.

2013-10-29 01:35:25

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding

On Tue, Oct 29, 2013 at 12:31:28AM +0000, Stephen Boyd wrote:
> The Krait L1/L2 error reporting device is made up of two
> interrupts, one per-CPU interrupt for the L1 caches and one
> interrupt for the L2 cache.
>
> Cc: <[email protected]>
> Signed-off-by: Stephen Boyd <[email protected]>
> ---
> .../devicetree/bindings/arm/qcom,krait-cache-erp.txt | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>
> diff --git a/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> new file mode 100644
> index 0000000..01fe8a8
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> @@ -0,0 +1,16 @@
> +* Qualcomm Krait L1 / L2 cache error reporting
> +
> +Required properties:
> +- compatible: Should be "qcom,krait-cache-erp"
> +- interrupts: Should contain the L1/CPU error interrupt number and
> + then the L2 cache error interrupt number
> +
> +Optional properties:
> +- interrupt-names: Should contain the interrupt names "l1_irq" and
> + "l2_irq"

As with my comment on the parsing code, I'd prefer that if interrupt-names was
present it defined the order of interrupts. Otherwise it's redundant and of no
value.

Otherwise, the binding looks fine to me:

Acked-by: Mark Rutland <[email protected]>

> +
> +Example:
> + edac {
> + compatible = "qcom,krait-cache-erp";
> + interrupts = <1 9 0xf04>, <0 2 0x4>;
> + };
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2013-10-29 05:05:52

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH 3/6] ARM: Add Krait L2 accessor functions

On 10/28, Mark Rutland wrote:
> On Tue, Oct 29, 2013 at 01:21:57AM +0000, Stephen Boyd wrote:
> > On 10/28/13 18:19, Mark Rutland wrote:
> > > It might be worth commmenting inline as to what register each of these is
> > > accessing. Inevitably the commit message will become harder to find and
> > > associate with the code over time.
> > >
> > > Similarly for get_l2_indirect_reg.
> >
> > Do you mean with the "@" syntax in the assembly? The 80-character limit
> > is out to get me.
>
> I probably didn't mean inline :)
>
> How about a block comment above the first asm block like:
>
> /*
> * Select the L2 window by poking l2cpselr, then write to the window via
> * l2cpdr.
> */
>

Ok sure.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 05:06:49

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding

On 10/28, Mark Rutland wrote:
> On Tue, Oct 29, 2013 at 12:31:28AM +0000, Stephen Boyd wrote:
> > +
> > +Optional properties:
> > +- interrupt-names: Should contain the interrupt names "l1_irq" and
> > + "l2_irq"
>
> As with my comment on the parsing code, I'd prefer that if interrupt-names was
> present it defined the order of interrupts. Otherwise it's redundant and of no
> value.
>
> Otherwise, the binding looks fine to me:
>
> Acked-by: Mark Rutland <[email protected]>

How about I just drop the interrupt-names property? It isn't
adding much and is a holdover from the vendor kernel.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 08:21:14

by Kumar Gala

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding


On Oct 28, 2013, at 7:31 PM, Stephen Boyd wrote:

> The Krait L1/L2 error reporting device is made up of two
> interrupts, one per-CPU interrupt for the L1 caches and one
> interrupt for the L2 cache.
>
> Cc: <[email protected]>
> Signed-off-by: Stephen Boyd <[email protected]>
> ---
> .../devicetree/bindings/arm/qcom,krait-cache-erp.txt | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>
> diff --git a/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> new file mode 100644
> index 0000000..01fe8a8
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> @@ -0,0 +1,16 @@
> +* Qualcomm Krait L1 / L2 cache error reporting
> +
> +Required properties:
> +- compatible: Should be "qcom,krait-cache-erp"
> +- interrupts: Should contain the L1/CPU error interrupt number and
> + then the L2 cache error interrupt number
> +
> +Optional properties:
> +- interrupt-names: Should contain the interrupt names "l1_irq" and
> + "l2_irq"
> +
> +Example:
> + edac {
> + compatible = "qcom,krait-cache-erp";
> + interrupts = <1 9 0xf04>, <0 2 0x4>;
> + };

Why wouldn't we have these as part of cache nodes in the dts? (which begs the question why we don't have cache nodes?)

- k

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

2013-10-29 12:54:12

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 2/6] genirq: export percpu irq functions for module usage

On Mon, 28 Oct 2013, Stephen Boyd wrote:

> In the near future we're going to use these percpu irq functions
> in the Krait CPU EDAC driver. Export them so that the EDAC driver
> can be compiled as a module.
>
> Cc: Thomas Gleixner <[email protected]>
> Signed-off-by: Stephen Boyd <[email protected]>

Acked-by: Thomas Gleixner <[email protected]>

Please take it through the tree which will take the EDAC patches.

Thanks,

tglx

> ---
> kernel/irq/manage.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index 514bcfd..c6009d1 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1623,6 +1623,7 @@ void free_percpu_irq(unsigned int irq, void __percpu *dev_id)
> kfree(__free_percpu_irq(irq, dev_id));
> chip_bus_sync_unlock(desc);
> }
> +EXPORT_SYMBOL_GPL(free_percpu_irq);
>
> /**
> * setup_percpu_irq - setup a per-cpu interrupt
> @@ -1693,3 +1694,4 @@ int request_percpu_irq(unsigned int irq, irq_handler_t handler,
>
> return retval;
> }
> +EXPORT_SYMBOL_GPL(request_percpu_irq);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>
>

2013-10-29 18:01:04

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding

On 10/29/13 01:21, Kumar Gala wrote:
> On Oct 28, 2013, at 7:31 PM, Stephen Boyd wrote:
>
>> The Krait L1/L2 error reporting device is made up of two
>> interrupts, one per-CPU interrupt for the L1 caches and one
>> interrupt for the L2 cache.
>>
>> Cc: <[email protected]>
>> Signed-off-by: Stephen Boyd <[email protected]>
>> ---
>> .../devicetree/bindings/arm/qcom,krait-cache-erp.txt | 16 ++++++++++++++++
>> 1 file changed, 16 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>>
>> diff --git a/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>> new file mode 100644
>> index 0000000..01fe8a8
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>> @@ -0,0 +1,16 @@
>> +* Qualcomm Krait L1 / L2 cache error reporting
>> +
>> +Required properties:
>> +- compatible: Should be "qcom,krait-cache-erp"
>> +- interrupts: Should contain the L1/CPU error interrupt number and
>> + then the L2 cache error interrupt number
>> +
>> +Optional properties:
>> +- interrupt-names: Should contain the interrupt names "l1_irq" and
>> + "l2_irq"
>> +
>> +Example:
>> + edac {
>> + compatible = "qcom,krait-cache-erp";
>> + interrupts = <1 9 0xf04>, <0 2 0x4>;
>> + };
> Why wouldn't we have these as part of cache nodes in the dts? (which begs the question why we don't have cache nodes?)
>

I can certainly add cache nodes and cpu nodes and then put the
interrupts in those nodes. I was thinking along those same lines when I
ported this driver but figured it would be good to get something out
there. The only question I have is how am I supposed to hook that up
into the linux device model? Will the edac driver bind to the device
created for the cpus node and the cache node? I guess it will have to be
a driver that binds to two devices.

One could argue that we should put the cp15 based architected timers in
the cpus node also but so far nobody has done that and I think there was
some reasoning behind that, Mark?

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

2013-10-29 20:22:59

by Olof Johansson

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding

On Tue, Oct 29, 2013 at 11:00 AM, Stephen Boyd <[email protected]> wrote:
> On 10/29/13 01:21, Kumar Gala wrote:
>> On Oct 28, 2013, at 7:31 PM, Stephen Boyd wrote:
>>
>>> The Krait L1/L2 error reporting device is made up of two
>>> interrupts, one per-CPU interrupt for the L1 caches and one
>>> interrupt for the L2 cache.
>>>
>>> Cc: <[email protected]>
>>> Signed-off-by: Stephen Boyd <[email protected]>
>>> ---
>>> .../devicetree/bindings/arm/qcom,krait-cache-erp.txt | 16 ++++++++++++++++
>>> 1 file changed, 16 insertions(+)
>>> create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>>> new file mode 100644
>>> index 0000000..01fe8a8
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>>> @@ -0,0 +1,16 @@
>>> +* Qualcomm Krait L1 / L2 cache error reporting
>>> +
>>> +Required properties:
>>> +- compatible: Should be "qcom,krait-cache-erp"
>>> +- interrupts: Should contain the L1/CPU error interrupt number and
>>> + then the L2 cache error interrupt number
>>> +
>>> +Optional properties:
>>> +- interrupt-names: Should contain the interrupt names "l1_irq" and
>>> + "l2_irq"
>>> +
>>> +Example:
>>> + edac {
>>> + compatible = "qcom,krait-cache-erp";
>>> + interrupts = <1 9 0xf04>, <0 2 0x4>;
>>> + };
>> Why wouldn't we have these as part of cache nodes in the dts? (which begs the question why we don't have cache nodes?)

In particular, naming the node edac seems like a suboptimal choice,
since that's a very linux-specific name for the error reporting
framework.

> I can certainly add cache nodes and cpu nodes and then put the
> interrupts in those nodes. I was thinking along those same lines when I
> ported this driver but figured it would be good to get something out
> there. The only question I have is how am I supposed to hook that up
> into the linux device model? Will the edac driver bind to the device
> created for the cpus node and the cache node? I guess it will have to be
> a driver that binds to two devices.

Or it could bind to one device and look up the info for the other from
the devicetree without relying on the driver core to instantiate
devices.


-Olof

2013-10-29 20:48:36

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 1/6] edac: Don't try to cancel workqueue when it's never setup

On Mon, Oct 28, 2013 at 05:31:25PM -0700, Stephen Boyd wrote:
> We only setup a workqueue for edac devices that use the polling
> method. We still try to cancel the workqueue if an edac_device
> uses the irq method though. This causes a warning from debug
> objects when we remove an edac device:
>
> WARNING: CPU: 0 PID: 56 at lib/debugobjects.c:260 debug_print_object+0x98/0xc0()
> ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x28
> Modules linked in: krait_edac(-)
> CPU: 0 PID: 56 Comm: rmmod Not tainted 3.12.0-rc2-00035-g15292a0 #69
> [<c02179b4>] (unwind_backtrace+0x0/0x144) from [<c0213e58>] (show_stack+0x20/0x24)
> [<c0213e58>] (show_stack+0x20/0x24) from [<c06f7fc4>] (dump_stack+0x74/0xb4)
> [<c06f7fc4>] (dump_stack+0x74/0xb4) from [<c0223c34>] (warn_slowpath_common+0x78/0x9c)
> [<c0223c34>] (warn_slowpath_common+0x78/0x9c) from [<c0223d14>] (warn_slowpath_fmt+0x40/0x48)
> [<c0223d14>] (warn_slowpath_fmt+0x40/0x48) from [<c04cf078>] (debug_print_object+0x98/0xc0)
> [<c04cf078>] (debug_print_object+0x98/0xc0) from [<c04cfb54>] (debug_object_assert_init+0xdc/0xec)
> [<c04cfb54>] (debug_object_assert_init+0xdc/0xec) from [<c02310c4>] (del_timer+0x24/0x7c)
> [<c02310c4>] (del_timer+0x24/0x7c) from [<c023fef8>] (try_to_grab_pending+0xc0/0x1b0)
> [<c023fef8>] (try_to_grab_pending+0xc0/0x1b0) from [<c0240014>] (cancel_delayed_work+0x2c/0xa0)
> [<c0240014>] (cancel_delayed_work+0x2c/0xa0) from [<c059fa14>] (edac_device_workq_teardown+0x1c/0x38)
> [<c059fa14>] (edac_device_workq_teardown+0x1c/0x38) from [<c059fae8>] (edac_device_del_device+0xb8/0xe4)
> [<c059fae8>] (edac_device_del_device+0xb8/0xe4) from [<bf000050>] (krait_edac_remove+0x50/0x70 [krait_edac])
> [<bf000050>] (krait_edac_remove+0x50/0x70 [krait_edac]) from [<c0511e34>] (platform_drv_remove+0x24/0x28)
> [<c0511e34>] (platform_drv_remove+0x24/0x28) from [<c050fffc>] (__device_release_driver+0x68/0xc0)
> [<c050fffc>] (__device_release_driver+0x68/0xc0) from [<c0510adc>] (driver_detach+0xc4/0xc8)
> [<c0510adc>] (driver_detach+0xc4/0xc8) from [<c050fd98>] (bus_remove_driver+0xac/0x114)
> [<c050fd98>] (bus_remove_driver+0xac/0x114) from [<c05111d8>] (driver_unregister+0x38/0x58)
> [<c05111d8>] (driver_unregister+0x38/0x58) from [<c0511fe8>] (platform_driver_unregister+0x1c/0x20)
> [<c0511fe8>] (platform_driver_unregister+0x1c/0x20) from [<bf0005c4>] (krait_edac_driver_exit+0x14/0x1c [krait_edac])
> [<bf0005c4>] (krait_edac_driver_exit+0x14/0x1c [krait_edac]) from [<c028b248>] (SyS_delete_module+0x178/0x2b4)
> [<c028b248>] (SyS_delete_module+0x178/0x2b4) from [<c020f760>] (ret_fast_syscall+0x0/0x48)
>
> Fix it by skipping the workqueue teardown for such devices.
>
> Signed-off-by: Stephen Boyd <[email protected]>

Acked-by: Borislav Petkov <[email protected]>

Btw, you might want to remove the hex numbers in the trace above and
leave only the call stack to make it more readable.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-10-29 20:51:35

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH 1/6] edac: Don't try to cancel workqueue when it's never setup

On 10/29/13 13:48, Borislav Petkov wrote:
> On Mon, Oct 28, 2013 at 05:31:25PM -0700, Stephen Boyd wrote:
>> We only setup a workqueue for edac devices that use the polling
>> method. We still try to cancel the workqueue if an edac_device
>> uses the irq method though. This causes a warning from debug
>> objects when we remove an edac device:
>>
>> WARNING: CPU: 0 PID: 56 at lib/debugobjects.c:260 debug_print_object+0x98/0xc0()
>> ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x28
>> Modules linked in: krait_edac(-)
>> CPU: 0 PID: 56 Comm: rmmod Not tainted 3.12.0-rc2-00035-g15292a0 #69
>> [<c02179b4>] (unwind_backtrace+0x0/0x144) from [<c0213e58>] (show_stack+0x20/0x24)
>> [<c0213e58>] (show_stack+0x20/0x24) from [<c06f7fc4>] (dump_stack+0x74/0xb4)
>> [<c06f7fc4>] (dump_stack+0x74/0xb4) from [<c0223c34>] (warn_slowpath_common+0x78/0x9c)
>> [<c0223c34>] (warn_slowpath_common+0x78/0x9c) from [<c0223d14>] (warn_slowpath_fmt+0x40/0x48)
>> [<c0223d14>] (warn_slowpath_fmt+0x40/0x48) from [<c04cf078>] (debug_print_object+0x98/0xc0)
>> [<c04cf078>] (debug_print_object+0x98/0xc0) from [<c04cfb54>] (debug_object_assert_init+0xdc/0xec)
>> [<c04cfb54>] (debug_object_assert_init+0xdc/0xec) from [<c02310c4>] (del_timer+0x24/0x7c)
>> [<c02310c4>] (del_timer+0x24/0x7c) from [<c023fef8>] (try_to_grab_pending+0xc0/0x1b0)
>> [<c023fef8>] (try_to_grab_pending+0xc0/0x1b0) from [<c0240014>] (cancel_delayed_work+0x2c/0xa0)
>> [<c0240014>] (cancel_delayed_work+0x2c/0xa0) from [<c059fa14>] (edac_device_workq_teardown+0x1c/0x38)
>> [<c059fa14>] (edac_device_workq_teardown+0x1c/0x38) from [<c059fae8>] (edac_device_del_device+0xb8/0xe4)
>> [<c059fae8>] (edac_device_del_device+0xb8/0xe4) from [<bf000050>] (krait_edac_remove+0x50/0x70 [krait_edac])
>> [<bf000050>] (krait_edac_remove+0x50/0x70 [krait_edac]) from [<c0511e34>] (platform_drv_remove+0x24/0x28)
>> [<c0511e34>] (platform_drv_remove+0x24/0x28) from [<c050fffc>] (__device_release_driver+0x68/0xc0)
>> [<c050fffc>] (__device_release_driver+0x68/0xc0) from [<c0510adc>] (driver_detach+0xc4/0xc8)
>> [<c0510adc>] (driver_detach+0xc4/0xc8) from [<c050fd98>] (bus_remove_driver+0xac/0x114)
>> [<c050fd98>] (bus_remove_driver+0xac/0x114) from [<c05111d8>] (driver_unregister+0x38/0x58)
>> [<c05111d8>] (driver_unregister+0x38/0x58) from [<c0511fe8>] (platform_driver_unregister+0x1c/0x20)
>> [<c0511fe8>] (platform_driver_unregister+0x1c/0x20) from [<bf0005c4>] (krait_edac_driver_exit+0x14/0x1c [krait_edac])
>> [<bf0005c4>] (krait_edac_driver_exit+0x14/0x1c [krait_edac]) from [<c028b248>] (SyS_delete_module+0x178/0x2b4)
>> [<c028b248>] (SyS_delete_module+0x178/0x2b4) from [<c020f760>] (ret_fast_syscall+0x0/0x48)
>>
>> Fix it by skipping the workqueue teardown for such devices.
>>
>> Signed-off-by: Stephen Boyd <[email protected]>
> Acked-by: Borislav Petkov <[email protected]>
>
> Btw, you might want to remove the hex numbers in the trace above and
> leave only the call stack to make it more readable.
>

Thanks. I'll do that in the next round.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

2013-10-30 00:07:30

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding

On 10/29, Stephen Boyd wrote:
> On 10/29/13 01:21, Kumar Gala wrote:
> > On Oct 28, 2013, at 7:31 PM, Stephen Boyd wrote:
> >
> >> The Krait L1/L2 error reporting device is made up of two
> >> interrupts, one per-CPU interrupt for the L1 caches and one
> >> interrupt for the L2 cache.
> >>
> >> Cc: <[email protected]>
> >> Signed-off-by: Stephen Boyd <[email protected]>
> >> ---
> >> .../devicetree/bindings/arm/qcom,krait-cache-erp.txt | 16 ++++++++++++++++
> >> 1 file changed, 16 insertions(+)
> >> create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> >>
> >> diff --git a/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> >> new file mode 100644
> >> index 0000000..01fe8a8
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> >> @@ -0,0 +1,16 @@
> >> +* Qualcomm Krait L1 / L2 cache error reporting
> >> +
> >> +Required properties:
> >> +- compatible: Should be "qcom,krait-cache-erp"
> >> +- interrupts: Should contain the L1/CPU error interrupt number and
> >> + then the L2 cache error interrupt number
> >> +
> >> +Optional properties:
> >> +- interrupt-names: Should contain the interrupt names "l1_irq" and
> >> + "l2_irq"
> >> +
> >> +Example:
> >> + edac {
> >> + compatible = "qcom,krait-cache-erp";
> >> + interrupts = <1 9 0xf04>, <0 2 0x4>;
> >> + };
> > Why wouldn't we have these as part of cache nodes in the dts? (which begs the question why we don't have cache nodes?)
> >
>
> I can certainly add cache nodes and cpu nodes and then put the
> interrupts in those nodes. I was thinking along those same lines when I
> ported this driver but figured it would be good to get something out
> there. The only question I have is how am I supposed to hook that up
> into the linux device model? Will the edac driver bind to the device
> created for the cpus node and the cache node? I guess it will have to be
> a driver that binds to two devices.
>
> One could argue that we should put the cp15 based architected timers in
> the cpus node also but so far nobody has done that and I think there was
> some reasoning behind that, Mark?
>

Ok I've come up with this:

cpus {
#address-cells = <1>;
#size-cells = <0>;
compatible = "qcom,krait";
interrupts = <1 9 0xf04>;

cpu@0 {
reg = <0>;
next-level-cache = <&L2>;
};

cpu@1 {
reg = <1>;
next-level-cache = <&L2>;
};

cpu@2 {
reg = <2>;
next-level-cache = <&L2>;
};

cpu@3 {
reg = <3>;
next-level-cache = <&L2>;
};

L2: l2-cache {
compatible = "qcom,krait-l2", "cache";
cache-level = <2>;
interrupts = <0 2 0x4>;
};
};

But now I don't know where to document this. Should I make a
krait-edac.txt file and document it there? The problem is now
we're documenting more than the error interrupts.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

2013-10-30 00:35:09

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding

On Tue, Oct 29, 2013 at 05:06:45AM +0000, Stephen Boyd wrote:
> On 10/28, Mark Rutland wrote:
> > On Tue, Oct 29, 2013 at 12:31:28AM +0000, Stephen Boyd wrote:
> > > +
> > > +Optional properties:
> > > +- interrupt-names: Should contain the interrupt names "l1_irq" and
> > > + "l2_irq"
> >
> > As with my comment on the parsing code, I'd prefer that if interrupt-names was
> > present it defined the order of interrupts. Otherwise it's redundant and of no
> > value.
> >
> > Otherwise, the binding looks fine to me:
> >
> > Acked-by: Mark Rutland <[email protected]>
>
> How about I just drop the interrupt-names property? It isn't
> adding much and is a holdover from the vendor kernel.

That's also fine given that this is a very specific binding.

Mark.

2013-10-30 00:38:38

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding

On Tue, Oct 29, 2013 at 06:00:59PM +0000, Stephen Boyd wrote:
> On 10/29/13 01:21, Kumar Gala wrote:
> > On Oct 28, 2013, at 7:31 PM, Stephen Boyd wrote:
> >
> >> The Krait L1/L2 error reporting device is made up of two
> >> interrupts, one per-CPU interrupt for the L1 caches and one
> >> interrupt for the L2 cache.
> >>
> >> Cc: <[email protected]>
> >> Signed-off-by: Stephen Boyd <[email protected]>
> >> ---
> >> .../devicetree/bindings/arm/qcom,krait-cache-erp.txt | 16 ++++++++++++++++
> >> 1 file changed, 16 insertions(+)
> >> create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> >>
> >> diff --git a/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> >> new file mode 100644
> >> index 0000000..01fe8a8
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
> >> @@ -0,0 +1,16 @@
> >> +* Qualcomm Krait L1 / L2 cache error reporting
> >> +
> >> +Required properties:
> >> +- compatible: Should be "qcom,krait-cache-erp"
> >> +- interrupts: Should contain the L1/CPU error interrupt number and
> >> + then the L2 cache error interrupt number
> >> +
> >> +Optional properties:
> >> +- interrupt-names: Should contain the interrupt names "l1_irq" and
> >> + "l2_irq"
> >> +
> >> +Example:
> >> + edac {
> >> + compatible = "qcom,krait-cache-erp";
> >> + interrupts = <1 9 0xf04>, <0 2 0x4>;
> >> + };
> > Why wouldn't we have these as part of cache nodes in the dts? (which begs the question why we don't have cache nodes?)
> >
>
> I can certainly add cache nodes and cpu nodes and then put the
> interrupts in those nodes. I was thinking along those same lines when I
> ported this driver but figured it would be good to get something out
> there. The only question I have is how am I supposed to hook that up
> into the linux device model? Will the edac driver bind to the device
> created for the cpus node and the cache node? I guess it will have to be
> a driver that binds to two devices.
>
> One could argue that we should put the cp15 based architected timers in
> the cpus node also but so far nobody has done that and I think there was
> some reasoning behind that, Mark?

The architected timer binding was created at a time I wasn't involved in kernel
development, and I'm not aware of any particular reasoning. I've heard that
there was a decision to not duplicate banked resources, which would explain not
having the timer under /cpus/cpu@N, but doesn't imply that having it under
/cpus is bad.

Do we have precedent for putting any devices other than CPUs in /cpus?

Thanks,
Mark.

2013-10-30 07:19:06

by Kumar Gala

[permalink] [raw]
Subject: Re: [PATCH 4/6] edac: Document Krait L1/L2 EDAC driver binding


On Oct 29, 2013, at 7:38 PM, Mark Rutland wrote:

> On Tue, Oct 29, 2013 at 06:00:59PM +0000, Stephen Boyd wrote:
>> On 10/29/13 01:21, Kumar Gala wrote:
>>> On Oct 28, 2013, at 7:31 PM, Stephen Boyd wrote:
>>>
>>>> The Krait L1/L2 error reporting device is made up of two
>>>> interrupts, one per-CPU interrupt for the L1 caches and one
>>>> interrupt for the L2 cache.
>>>>
>>>> Cc: <[email protected]>
>>>> Signed-off-by: Stephen Boyd <[email protected]>
>>>> ---
>>>> .../devicetree/bindings/arm/qcom,krait-cache-erp.txt | 16 ++++++++++++++++
>>>> 1 file changed, 16 insertions(+)
>>>> create mode 100644 Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>>>> new file mode 100644
>>>> index 0000000..01fe8a8
>>>> --- /dev/null
>>>> +++ b/Documentation/devicetree/bindings/arm/qcom,krait-cache-erp.txt
>>>> @@ -0,0 +1,16 @@
>>>> +* Qualcomm Krait L1 / L2 cache error reporting
>>>> +
>>>> +Required properties:
>>>> +- compatible: Should be "qcom,krait-cache-erp"
>>>> +- interrupts: Should contain the L1/CPU error interrupt number and
>>>> + then the L2 cache error interrupt number
>>>> +
>>>> +Optional properties:
>>>> +- interrupt-names: Should contain the interrupt names "l1_irq" and
>>>> + "l2_irq"
>>>> +
>>>> +Example:
>>>> + edac {
>>>> + compatible = "qcom,krait-cache-erp";
>>>> + interrupts = <1 9 0xf04>, <0 2 0x4>;
>>>> + };
>>> Why wouldn't we have these as part of cache nodes in the dts? (which begs the question why we don't have cache nodes?)
>>>
>>
>> I can certainly add cache nodes and cpu nodes and then put the
>> interrupts in those nodes. I was thinking along those same lines when I
>> ported this driver but figured it would be good to get something out
>> there. The only question I have is how am I supposed to hook that up
>> into the linux device model? Will the edac driver bind to the device
>> created for the cpus node and the cache node? I guess it will have to be
>> a driver that binds to two devices.
>>
>> One could argue that we should put the cp15 based architected timers in
>> the cpus node also but so far nobody has done that and I think there was
>> some reasoning behind that, Mark?
>
> The architected timer binding was created at a time I wasn't involved in kernel
> development, and I'm not aware of any particular reasoning. I've heard that
> there was a decision to not duplicate banked resources, which would explain not
> having the timer under /cpus/cpu@N, but doesn't imply that having it under
> /cpus is bad.
>
> Do we have precedent for putting any devices other than CPUs in /cpus?

On PPC we had core cache's (depending on topology) that would be there, and thus why I raised the suggestion.

- k

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation