2020-04-01 07:47:49

by Sherry Sun

[permalink] [raw]
Subject: [PATCH v2 0/4] Add edac driver for i.MX8MP based on synopsys edac driver

This patchset add edac driver support for i.MX8MP, since i.MX8MP use the same
synopsys memory controller with ZynqMP, so the driver is based on synopsys_edac
driver.

Considering that i.MX8MP dts file is still in kernel/git/shawnguo/linux.git and
isn't in mainline now, I will add EDAC node in i.MX8MP dts after this file is
pushed to mainline.

And the edac driver for i.MX8MP has been tested in kernel/git/shawnguo/linux.git
where i.MX8MP is supported, it turns out that this driver works well to detect
corrected and uncorrected errors for LPDDR4 in i.MX8MP.

Changes since v1:
=================
- Reorganized the patchset, no content changes.

Sherry Sun (4):
dt-bindings: memory-controllers: Add i.MX8MP DDRC binding doc
EDAC: Add synopsys edac driver support for i.MX8MP
EDAC: synopsys: Add edac driver support for i.MX8MP
EDAC: synopsys: Add useful debug and output information for 64bit
systems

.../bindings/memory-controllers/synopsys.txt | 8 +-
drivers/edac/Kconfig | 2 +-
drivers/edac/synopsys_edac.c | 259 ++++++++++++------
3 files changed, 185 insertions(+), 84 deletions(-)

--
2.17.1


2020-04-01 07:47:54

by Sherry Sun

[permalink] [raw]
Subject: [PATCH v2 1/4] dt-bindings: memory-controllers: Add i.MX8MP DDRC binding doc

Add documentation for i.MX8MP DDRC binding based on synopsys_edac doc,
which use the same memory-controller IP.

Signed-off-by: Sherry Sun <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
---
.../devicetree/bindings/memory-controllers/synopsys.txt | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/memory-controllers/synopsys.txt b/Documentation/devicetree/bindings/memory-controllers/synopsys.txt
index 9d32762c47e1..4fd14ba61474 100644
--- a/Documentation/devicetree/bindings/memory-controllers/synopsys.txt
+++ b/Documentation/devicetree/bindings/memory-controllers/synopsys.txt
@@ -6,16 +6,20 @@ bus width configurations.
The Zynq DDR ECC controller has an optional ECC support in half-bus width
(16-bit) configuration.

-These both ECC controllers correct single bit ECC errors and detect double bit
+The i.MX8MP DDR ECC controller has an ECC support in 64-bit bus width
+configurations.
+
+All the ECC controllers correct single bit ECC errors and detect double bit
ECC errors.

Required properties:
- compatible: One of:
- 'xlnx,zynq-ddrc-a05' : Zynq DDR ECC controller
- 'xlnx,zynqmp-ddrc-2.40a' : ZynqMP DDR ECC controller
+ - 'fsl,imx8mp-ddrc' : i.MX8MP DDR ECC controller
- reg: Should contain DDR controller registers location and length.

-Required properties for "xlnx,zynqmp-ddrc-2.40a":
+Required properties for "xlnx,zynqmp-ddrc-2.40a" and "fsl,imx8mp-ddrc":
- interrupts: Property with a value describing the interrupt number.

Example:
--
2.17.1

2020-04-01 07:48:08

by Sherry Sun

[permalink] [raw]
Subject: [PATCH v2 2/4] EDAC: Add synopsys edac driver support for i.MX8MP

The i.MX8MP has a memory controller supported by this driver. So here
add edac driver for i.MX8MP based on synopsys edac driver.

Signed-off-by: Sherry Sun <[email protected]>
---
drivers/edac/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index fe2eb892a1bd..58a2d67d5513 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -475,7 +475,7 @@ config EDAC_ARMADA_XP

config EDAC_SYNOPSYS
tristate "Synopsys DDR Memory Controller"
- depends on ARCH_ZYNQ || ARCH_ZYNQMP
+ depends on ARCH_ZYNQ || ARCH_ZYNQMP || ARCH_MXC
help
Support for error detection and correction on the Synopsys DDR
memory controller.
--
2.17.1

2020-04-01 07:49:19

by Sherry Sun

[permalink] [raw]
Subject: [PATCH v2 3/4] EDAC: synopsys: Add edac driver support for i.MX8MP

Since i.MX8MP use synopsys ddr controller IP, so add edac support
for i.MX8MP based on synopsys edac driver. i.MX8MP use LPDDR4 and
support interrupts for corrected and uncorrected errors. The main
difference between ZynqMP and i.MX8MP ddr controller is the interrupt
registers. So add another interrupt handler function, enable/disable
interrupt function to distinguish with ZynqMP.

Signed-off-by: Sherry Sun <[email protected]>
---
drivers/edac/synopsys_edac.c | 77 +++++++++++++++++++++++++++++++++++-
1 file changed, 76 insertions(+), 1 deletion(-)

diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 12211dc040e8..bf4202a24683 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -101,6 +101,7 @@
/* DDR ECC Quirks */
#define DDR_ECC_INTR_SUPPORT BIT(0)
#define DDR_ECC_DATA_POISON_SUPPORT BIT(1)
+#define DDR_ECC_IMX8MP BIT(2)

/* ZynqMP Enhanced DDR memory controller registers that are relevant to ECC */
/* ECC Configuration Registers */
@@ -266,6 +267,11 @@

#define RANK_B0_BASE 6

+/* ECCCTL UE/CE Interrupt enable/disable for IMX8MP*/
+#define DDR_CE_INTR_EN_MASK 0x100
+#define DDR_UE_INTR_EN_MASK 0x200
+#define ECC_INTR_MASK 0x10100
+
/**
* struct ecc_error_info - ECC error log information.
* @row: Row number.
@@ -516,6 +522,54 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
memset(p, 0, sizeof(*p));
}

+static void enable_intr_imx8mp(struct synps_edac_priv *priv)
+{
+ int regval;
+
+ regval = readl(priv->baseaddr + ECC_CLR_OFST);
+ regval |= (DDR_CE_INTR_EN_MASK | DDR_UE_INTR_EN_MASK);
+ writel(regval, priv->baseaddr + ECC_CLR_OFST);
+}
+
+static void disable_intr_imx8mp(struct synps_edac_priv *priv)
+{
+ int regval;
+
+ regval = readl(priv->baseaddr + ECC_CLR_OFST);
+ regval &= ~(DDR_CE_INTR_EN_MASK | DDR_UE_INTR_EN_MASK);
+ writel(regval, priv->baseaddr + ECC_CLR_OFST);
+}
+
+/* Interrupt Handler for ECC interrupts on imx8mp platform. */
+static irqreturn_t intr_handler_imx8mp(int irq, void *dev_id)
+{
+ const struct synps_platform_data *p_data;
+ struct mem_ctl_info *mci = dev_id;
+ struct synps_edac_priv *priv;
+ int status, regval;
+
+ priv = mci->pvt_info;
+ p_data = priv->p_data;
+
+ regval = readl(priv->baseaddr + ECC_STAT_OFST);
+ if (!(regval & ECC_INTR_MASK))
+ return IRQ_NONE;
+
+ status = p_data->get_error_info(priv);
+ if (status)
+ return IRQ_NONE;
+
+ priv->ce_cnt += priv->stat.ce_cnt;
+ priv->ue_cnt += priv->stat.ue_cnt;
+ handle_error(mci, &priv->stat);
+
+ edac_dbg(3, "Total error count CE %d UE %d\n",
+ priv->ce_cnt, priv->ue_cnt);
+ enable_intr_imx8mp(priv);
+
+ return IRQ_HANDLED;
+}
+
/**
* intr_handler - Interrupt Handler for ECC interrupts.
* @irq: IRQ number.
@@ -533,6 +587,9 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
priv = mci->pvt_info;
p_data = priv->p_data;

+ if (p_data->quirks & DDR_ECC_IMX8MP)
+ return intr_handler_imx8mp(irq, dev_id);
+
regval = readl(priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
regval &= (DDR_QOSCE_MASK | DDR_QOSUE_MASK);
if (!(regval & ECC_CE_UE_INTR_MASK))
@@ -809,7 +866,7 @@ static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
platform_set_drvdata(pdev, mci);

/* Initialize controller capabilities and configuration */
- mci->mtype_cap = MEM_FLAG_DDR3 | MEM_FLAG_DDR2;
+ mci->mtype_cap = MEM_FLAG_LRDDR4 | MEM_FLAG_DDR3 | MEM_FLAG_DDR2;
mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
mci->scrub_cap = SCRUB_HW_SRC;
mci->scrub_mode = SCRUB_NONE;
@@ -834,6 +891,9 @@ static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
static void enable_intr(struct synps_edac_priv *priv)
{
/* Enable UE/CE Interrupts */
+ if (priv->p_data->quirks & DDR_ECC_IMX8MP)
+ return enable_intr_imx8mp(priv);
+
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
}
@@ -841,6 +901,9 @@ static void enable_intr(struct synps_edac_priv *priv)
static void disable_intr(struct synps_edac_priv *priv)
{
/* Disable UE/CE Interrupts */
+ if (priv->p_data->quirks & DDR_ECC_IMX8MP)
+ return disable_intr_imx8mp(priv);
+
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_DB_OFST);
}
@@ -890,6 +953,14 @@ static const struct synps_platform_data zynqmp_edac_def = {
),
};

+static const struct synps_platform_data imx8mp_edac_def = {
+ .get_error_info = zynqmp_get_error_info,
+ .get_mtype = zynqmp_get_mtype,
+ .get_dtype = zynqmp_get_dtype,
+ .get_ecc_state = zynqmp_get_ecc_state,
+ .quirks = (DDR_ECC_INTR_SUPPORT | DDR_ECC_IMX8MP),
+};
+
static const struct of_device_id synps_edac_match[] = {
{
.compatible = "xlnx,zynq-ddrc-a05",
@@ -899,6 +970,10 @@ static const struct of_device_id synps_edac_match[] = {
.compatible = "xlnx,zynqmp-ddrc-2.40a",
.data = (void *)&zynqmp_edac_def
},
+ {
+ .compatible = "fsl,imx8mp-ddrc",
+ .data = (void *)&imx8mp_edac_def
+ },
{
/* end of table */
}
--
2.17.1

2020-04-01 07:49:33

by Sherry Sun

[permalink] [raw]
Subject: [PATCH v2 4/4] EDAC: synopsys: Add useful debug and output information for 64bit systems

Now the synopsys_edac driver only support to output the 32-bit error
data, but for 64 bit systems, such as i.MX8MP, 64 bit error data is
needed. At the same time, when CE/UE happens, syndrome data is also
useful to showed to user. So here add data_high and syndrome data for
64-bit systems.

And in order to distinguish 64-bit systems and other systems, here
adjust the position of the zynqmp_get_dtype(), so we can called
this function to distinguish it. To ensure that functions of the same
function are in the same position, here adjust the position of the
zynq_get_dtype() too.

Signed-off-by: Sherry Sun <[email protected]>
---
drivers/edac/synopsys_edac.c | 182 ++++++++++++++++++++---------------
1 file changed, 102 insertions(+), 80 deletions(-)

diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index bf4202a24683..e8c3631ddff4 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -278,18 +278,22 @@
* @col: Column number.
* @bank: Bank number.
* @bitpos: Bit position.
- * @data: Data causing the error.
+ * @data_low: Low bit data causing the error.
+ * @data_high: High bit data causing the error(used for 64 bit systems).
* @bankgrpnr: Bank group number.
* @blknr: Block number.
+ * @syndrome: Syndrome of the error.
*/
struct ecc_error_info {
u32 row;
u32 col;
u32 bank;
u32 bitpos;
- u32 data;
+ u32 data_low;
+ u32 data_high;
u32 bankgrpnr;
u32 blknr;
+ u32 syndrome;
};

/**
@@ -354,6 +358,70 @@ struct synps_platform_data {
int quirks;
};

+/**
+ * zynq_get_dtype - Return the controller memory width.
+ * @base: DDR memory controller base address.
+ *
+ * Get the EDAC device type width appropriate for the current controller
+ * configuration.
+ *
+ * Return: a device type width enumeration.
+ */
+static enum dev_type zynq_get_dtype(const void __iomem *base)
+{
+ enum dev_type dt;
+ u32 width;
+
+ width = readl(base + CTRL_OFST);
+ width = (width & CTRL_BW_MASK) >> CTRL_BW_SHIFT;
+
+ switch (width) {
+ case DDRCTL_WDTH_16:
+ dt = DEV_X2;
+ break;
+ case DDRCTL_WDTH_32:
+ dt = DEV_X4;
+ break;
+ default:
+ dt = DEV_UNKNOWN;
+ }
+
+ return dt;
+}
+
+/**
+ * zynqmp_get_dtype - Return the controller memory width.
+ * @base: DDR memory controller base address.
+ *
+ * Get the EDAC device type width appropriate for the current controller
+ * configuration.
+ *
+ * Return: a device type width enumeration.
+ */
+static enum dev_type zynqmp_get_dtype(const void __iomem *base)
+{
+ enum dev_type dt;
+ u32 width;
+
+ width = readl(base + CTRL_OFST);
+ width = (width & ECC_CTRL_BUSWIDTH_MASK) >> ECC_CTRL_BUSWIDTH_SHIFT;
+ switch (width) {
+ case DDRCTL_EWDTH_16:
+ dt = DEV_X2;
+ break;
+ case DDRCTL_EWDTH_32:
+ dt = DEV_X4;
+ break;
+ case DDRCTL_EWDTH_64:
+ dt = DEV_X8;
+ break;
+ default:
+ dt = DEV_UNKNOWN;
+ }
+
+ return dt;
+}
+
/**
* zynq_get_error_info - Get the current ECC error info.
* @priv: DDR memory controller private instance data.
@@ -385,9 +453,9 @@ static int zynq_get_error_info(struct synps_edac_priv *priv)
p->ceinfo.row = (regval & ADDR_ROW_MASK) >> ADDR_ROW_SHIFT;
p->ceinfo.col = regval & ADDR_COL_MASK;
p->ceinfo.bank = (regval & ADDR_BANK_MASK) >> ADDR_BANK_SHIFT;
- p->ceinfo.data = readl(base + CE_DATA_31_0_OFST);
+ p->ceinfo.data_low = readl(base + CE_DATA_31_0_OFST);
edac_dbg(3, "CE bit position: %d data: %d\n", p->ceinfo.bitpos,
- p->ceinfo.data);
+ p->ceinfo.data_low);
clearval = ECC_CTRL_CLR_CE_ERR;

ue_err:
@@ -399,7 +467,7 @@ static int zynq_get_error_info(struct synps_edac_priv *priv)
p->ueinfo.row = (regval & ADDR_ROW_MASK) >> ADDR_ROW_SHIFT;
p->ueinfo.col = regval & ADDR_COL_MASK;
p->ueinfo.bank = (regval & ADDR_BANK_MASK) >> ADDR_BANK_SHIFT;
- p->ueinfo.data = readl(base + UE_DATA_31_0_OFST);
+ p->ueinfo.data_low = readl(base + UE_DATA_31_0_OFST);
clearval |= ECC_CTRL_CLR_UE_ERR;

out:
@@ -443,10 +511,14 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
p->ceinfo.bankgrpnr = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
ECC_CEADDR1_BNKGRP_SHIFT;
p->ceinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
- p->ceinfo.data = readl(base + ECC_CSYND0_OFST);
- edac_dbg(2, "ECCCSYN0: 0x%08X ECCCSYN1: 0x%08X ECCCSYN2: 0x%08X\n",
- readl(base + ECC_CSYND0_OFST), readl(base + ECC_CSYND1_OFST),
- readl(base + ECC_CSYND2_OFST));
+ p->ceinfo.data_low = readl(base + ECC_CSYND0_OFST);
+ if (zynqmp_get_dtype(base) == DEV_X8) {
+ p->ceinfo.data_high = readl(base + ECC_CSYND1_OFST);
+ p->ceinfo.syndrome = readl(base + ECC_CSYND2_OFST);
+ edac_dbg(2, "CE data_low: 0x%08X data_high: 0x%08X syndrome: 0x%08X\n",
+ p->ceinfo.data_low, p->ceinfo.data_high,
+ p->ceinfo.syndrome);
+ }
ue_err:
if (!p->ue_cnt)
goto out;
@@ -459,7 +531,14 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
p->ueinfo.bank = (regval & ECC_CEADDR1_BNKNR_MASK) >>
ECC_CEADDR1_BNKNR_SHIFT;
p->ueinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
- p->ueinfo.data = readl(base + ECC_UESYND0_OFST);
+ p->ueinfo.data_low = readl(base + ECC_UESYND0_OFST);
+ if (zynqmp_get_dtype(base) == DEV_X8) {
+ p->ueinfo.data_high = readl(base + ECC_UESYND1_OFST);
+ p->ueinfo.syndrome = readl(base + ECC_UESYND2_OFST);
+ edac_dbg(2, "UE data_low: 0x%08X data_high: 0x%08X syndrome: 0x%08X\n",
+ p->ueinfo.data_low, p->ueinfo.data_high,
+ p->ueinfo.syndrome);
+ }
out:
clearval = ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT;
clearval |= ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT;
@@ -480,20 +559,27 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
{
struct synps_edac_priv *priv = mci->pvt_info;
struct ecc_error_info *pinf;
+ int n;

if (p->ce_cnt) {
pinf = &p->ceinfo;
if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
- snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type:%s Row %d Bank %d BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
- "CE", pinf->row, pinf->bank,
- pinf->bankgrpnr, pinf->blknr,
- pinf->bitpos, pinf->data);
+ n = snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
+ "DDR ECC error type:%s Row %d Bank %d BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
+ "CE", pinf->row, pinf->bank,
+ pinf->bankgrpnr, pinf->blknr,
+ pinf->bitpos, pinf->data_low);
+
+ if (zynqmp_get_dtype(priv->baseaddr) == DEV_X8)
+ snprintf(priv->message + n,
+ SYNPS_EDAC_MSG_SIZE - n,
+ " Data_high: 0x%08x Syndrome: 0x%08x",
+ pinf->data_high, pinf->syndrome);
} else {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
"DDR ECC error type:%s Row %d Bank %d Col %d Bit Position: %d Data: 0x%08x",
"CE", pinf->row, pinf->bank, pinf->col,
- pinf->bitpos, pinf->data);
+ pinf->bitpos, pinf->data_low);
}

edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
@@ -636,70 +722,6 @@ static void check_errors(struct mem_ctl_info *mci)
priv->ce_cnt, priv->ue_cnt);
}

-/**
- * zynq_get_dtype - Return the controller memory width.
- * @base: DDR memory controller base address.
- *
- * Get the EDAC device type width appropriate for the current controller
- * configuration.
- *
- * Return: a device type width enumeration.
- */
-static enum dev_type zynq_get_dtype(const void __iomem *base)
-{
- enum dev_type dt;
- u32 width;
-
- width = readl(base + CTRL_OFST);
- width = (width & CTRL_BW_MASK) >> CTRL_BW_SHIFT;
-
- switch (width) {
- case DDRCTL_WDTH_16:
- dt = DEV_X2;
- break;
- case DDRCTL_WDTH_32:
- dt = DEV_X4;
- break;
- default:
- dt = DEV_UNKNOWN;
- }
-
- return dt;
-}
-
-/**
- * zynqmp_get_dtype - Return the controller memory width.
- * @base: DDR memory controller base address.
- *
- * Get the EDAC device type width appropriate for the current controller
- * configuration.
- *
- * Return: a device type width enumeration.
- */
-static enum dev_type zynqmp_get_dtype(const void __iomem *base)
-{
- enum dev_type dt;
- u32 width;
-
- width = readl(base + CTRL_OFST);
- width = (width & ECC_CTRL_BUSWIDTH_MASK) >> ECC_CTRL_BUSWIDTH_SHIFT;
- switch (width) {
- case DDRCTL_EWDTH_16:
- dt = DEV_X2;
- break;
- case DDRCTL_EWDTH_32:
- dt = DEV_X4;
- break;
- case DDRCTL_EWDTH_64:
- dt = DEV_X8;
- break;
- default:
- dt = DEV_UNKNOWN;
- }
-
- return dt;
-}
-
/**
* zynq_get_ecc_state - Return the controller ECC enable/disable status.
* @base: DDR memory controller base address.
--
2.17.1

2020-04-01 10:58:59

by Robert Richter

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] EDAC: synopsys: Add useful debug and output information for 64bit systems

On 01.04.20 15:39:09, Sherry Sun wrote:
> Now the synopsys_edac driver only support to output the 32-bit error
> data, but for 64 bit systems, such as i.MX8MP, 64 bit error data is
> needed. At the same time, when CE/UE happens, syndrome data is also
> useful to showed to user. So here add data_high and syndrome data for
> 64-bit systems.
>
> And in order to distinguish 64-bit systems and other systems, here
> adjust the position of the zynqmp_get_dtype(), so we can called
> this function to distinguish it. To ensure that functions of the same
> function are in the same position, here adjust the position of the
> zynq_get_dtype() too.
>
> Signed-off-by: Sherry Sun <[email protected]>
> ---
> drivers/edac/synopsys_edac.c | 182 ++++++++++++++++++++---------------
> 1 file changed, 102 insertions(+), 80 deletions(-)
>
> diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
> index bf4202a24683..e8c3631ddff4 100644
> --- a/drivers/edac/synopsys_edac.c
> +++ b/drivers/edac/synopsys_edac.c
> @@ -278,18 +278,22 @@
> * @col: Column number.
> * @bank: Bank number.
> * @bitpos: Bit position.
> - * @data: Data causing the error.
> + * @data_low: Low bit data causing the error.
> + * @data_high: High bit data causing the error(used for 64 bit systems).
> * @bankgrpnr: Bank group number.
> * @blknr: Block number.
> + * @syndrome: Syndrome of the error.
> */
> struct ecc_error_info {
> u32 row;
> u32 col;
> u32 bank;
> u32 bitpos;
> - u32 data;
> + u32 data_low;
> + u32 data_high;

Where are 16, 32 and 64 bit widths. You could handle them all the same
in a u64.

If I am not wrong, the width is fix for the whole mci. So you could
create various .get_error_info() functions depending on the data
width without run time width checks.

> u32 bankgrpnr;
> u32 blknr;
> + u32 syndrome;
> };
>
> /**

> @@ -399,7 +467,7 @@ static int zynq_get_error_info(struct synps_edac_priv *priv)
> p->ueinfo.row = (regval & ADDR_ROW_MASK) >> ADDR_ROW_SHIFT;
> p->ueinfo.col = regval & ADDR_COL_MASK;
> p->ueinfo.bank = (regval & ADDR_BANK_MASK) >> ADDR_BANK_SHIFT;
> - p->ueinfo.data = readl(base + UE_DATA_31_0_OFST);
> + p->ueinfo.data_low = readl(base + UE_DATA_31_0_OFST);
> clearval |= ECC_CTRL_CLR_UE_ERR;
>
> out:
> @@ -443,10 +511,14 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
> p->ceinfo.bankgrpnr = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
> ECC_CEADDR1_BNKGRP_SHIFT;
> p->ceinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
> - p->ceinfo.data = readl(base + ECC_CSYND0_OFST);
> - edac_dbg(2, "ECCCSYN0: 0x%08X ECCCSYN1: 0x%08X ECCCSYN2: 0x%08X\n",
> - readl(base + ECC_CSYND0_OFST), readl(base + ECC_CSYND1_OFST),
> - readl(base + ECC_CSYND2_OFST));
> + p->ceinfo.data_low = readl(base + ECC_CSYND0_OFST);
> + if (zynqmp_get_dtype(base) == DEV_X8) {
> + p->ceinfo.data_high = readl(base + ECC_CSYND1_OFST);
> + p->ceinfo.syndrome = readl(base + ECC_CSYND2_OFST);
> + edac_dbg(2, "CE data_low: 0x%08X data_high: 0x%08X syndrome: 0x%08X\n",
> + p->ceinfo.data_low, p->ceinfo.data_high,
> + p->ceinfo.syndrome);

You are loosing edac_dbg() here for the != DEV_X8 cases.

> + }
> ue_err:
> if (!p->ue_cnt)
> goto out;
> @@ -459,7 +531,14 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
> p->ueinfo.bank = (regval & ECC_CEADDR1_BNKNR_MASK) >>
> ECC_CEADDR1_BNKNR_SHIFT;
> p->ueinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
> - p->ueinfo.data = readl(base + ECC_UESYND0_OFST);
> + p->ueinfo.data_low = readl(base + ECC_UESYND0_OFST);
> + if (zynqmp_get_dtype(base) == DEV_X8) {
> + p->ueinfo.data_high = readl(base + ECC_UESYND1_OFST);
> + p->ueinfo.syndrome = readl(base + ECC_UESYND2_OFST);
> + edac_dbg(2, "UE data_low: 0x%08X data_high: 0x%08X syndrome: 0x%08X\n",
> + p->ueinfo.data_low, p->ueinfo.data_high,
> + p->ueinfo.syndrome);

Similar here, no edac_dbg() for != DEV_X8.

> + }
> out:
> clearval = ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT;
> clearval |= ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT;
> @@ -480,20 +559,27 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
> {
> struct synps_edac_priv *priv = mci->pvt_info;
> struct ecc_error_info *pinf;
> + int n;
>
> if (p->ce_cnt) {
> pinf = &p->ceinfo;
> if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
> - snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> - "DDR ECC error type:%s Row %d Bank %d BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
> - "CE", pinf->row, pinf->bank,
> - pinf->bankgrpnr, pinf->blknr,
> - pinf->bitpos, pinf->data);
> + n = snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> + "DDR ECC error type:%s Row %d Bank %d BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
> + "CE", pinf->row, pinf->bank,
> + pinf->bankgrpnr, pinf->blknr,
> + pinf->bitpos, pinf->data_low);
> +
> + if (zynqmp_get_dtype(priv->baseaddr) == DEV_X8)

This is zynqmp specific, right? but you call it in the generic
function handle_error().

-Robert

> + snprintf(priv->message + n,
> + SYNPS_EDAC_MSG_SIZE - n,
> + " Data_high: 0x%08x Syndrome: 0x%08x",
> + pinf->data_high, pinf->syndrome);
> } else {
> snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> "DDR ECC error type:%s Row %d Bank %d Col %d Bit Position: %d Data: 0x%08x",
> "CE", pinf->row, pinf->bank, pinf->col,
> - pinf->bitpos, pinf->data);
> + pinf->bitpos, pinf->data_low);
> }
>
> edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,

2020-04-01 14:33:47

by Sherry Sun

[permalink] [raw]
Subject: RE: [PATCH v2 4/4] EDAC: synopsys: Add useful debug and output information for 64bit systems

Hi Robert,

> -----Original Message-----
> From: Robert Richter <[email protected]>
> Sent: 2020??4??1?? 18:57
> To: Sherry Sun <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; Frank Li <[email protected]>
> Subject: Re: [PATCH v2 4/4] EDAC: synopsys: Add useful debug and output
> information for 64bit systems
>
> On 01.04.20 15:39:09, Sherry Sun wrote:
> > Now the synopsys_edac driver only support to output the 32-bit error
> > data, but for 64 bit systems, such as i.MX8MP, 64 bit error data is
> > needed. At the same time, when CE/UE happens, syndrome data is also
> > useful to showed to user. So here add data_high and syndrome data for
> > 64-bit systems.
> >
> > And in order to distinguish 64-bit systems and other systems, here
> > adjust the position of the zynqmp_get_dtype(), so we can called this
> > function to distinguish it. To ensure that functions of the same
> > function are in the same position, here adjust the position of the
> > zynq_get_dtype() too.
> >
> > Signed-off-by: Sherry Sun <[email protected]>
> > ---
> > drivers/edac/synopsys_edac.c | 182
> > ++++++++++++++++++++---------------
> > 1 file changed, 102 insertions(+), 80 deletions(-)
> >
> > diff --git a/drivers/edac/synopsys_edac.c
> > b/drivers/edac/synopsys_edac.c index bf4202a24683..e8c3631ddff4
> 100644
> > --- a/drivers/edac/synopsys_edac.c
> > +++ b/drivers/edac/synopsys_edac.c
> > @@ -278,18 +278,22 @@
> > * @col: Column number.
> > * @bank: Bank number.
> > * @bitpos: Bit position.
> > - * @data: Data causing the error.
> > + * @data_low: Low bit data causing the error.
> > + * @data_high: High bit data causing the error(used for 64 bit
> systems).
> > * @bankgrpnr: Bank group number.
> > * @blknr: Block number.
> > + * @syndrome: Syndrome of the error.
> > */
> > struct ecc_error_info {
> > u32 row;
> > u32 col;
> > u32 bank;
> > u32 bitpos;
> > - u32 data;
> > + u32 data_low;
> > + u32 data_high;
>
> Where are 16, 32 and 64 bit widths. You could handle them all the same in a
> u64.
>
> If I am not wrong, the width is fix for the whole mci. So you could create
> various .get_error_info() functions depending on the data width without run
> time width checks.
>

Yes, here we can use u64. Will correct it in v3.
For the zynq, it's 16-bit bus width, for imx8mp, it's 64-bit bus width.
But for zynqmp, it's optional between 32-bit and 64-bit bus width,
So we had better run time check the bus width, what do you think?

> > u32 bankgrpnr;
> > u32 blknr;
> > + u32 syndrome;
> > };
> >
> > /**
>
> > @@ -399,7 +467,7 @@ static int zynq_get_error_info(struct
> synps_edac_priv *priv)
> > p->ueinfo.row = (regval & ADDR_ROW_MASK) >> ADDR_ROW_SHIFT;
> > p->ueinfo.col = regval & ADDR_COL_MASK;
> > p->ueinfo.bank = (regval & ADDR_BANK_MASK) >>
> ADDR_BANK_SHIFT;
> > - p->ueinfo.data = readl(base + UE_DATA_31_0_OFST);
> > + p->ueinfo.data_low = readl(base + UE_DATA_31_0_OFST);
> > clearval |= ECC_CTRL_CLR_UE_ERR;
> >
> > out:
> > @@ -443,10 +511,14 @@ static int zynqmp_get_error_info(struct
> synps_edac_priv *priv)
> > p->ceinfo.bankgrpnr = (regval &
> ECC_CEADDR1_BNKGRP_MASK) >>
> > ECC_CEADDR1_BNKGRP_SHIFT;
> > p->ceinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
> > - p->ceinfo.data = readl(base + ECC_CSYND0_OFST);
> > - edac_dbg(2, "ECCCSYN0: 0x%08X ECCCSYN1: 0x%08X ECCCSYN2:
> 0x%08X\n",
> > - readl(base + ECC_CSYND0_OFST), readl(base +
> ECC_CSYND1_OFST),
> > - readl(base + ECC_CSYND2_OFST));
> > + p->ceinfo.data_low = readl(base + ECC_CSYND0_OFST);
> > + if (zynqmp_get_dtype(base) == DEV_X8) {
> > + p->ceinfo.data_high = readl(base + ECC_CSYND1_OFST);
> > + p->ceinfo.syndrome = readl(base + ECC_CSYND2_OFST);
> > + edac_dbg(2, "CE data_low: 0x%08X data_high: 0x%08X
> syndrome: 0x%08X\n",
> > + p->ceinfo.data_low, p->ceinfo.data_high,
> > + p->ceinfo.syndrome);
>
> You are loosing edac_dbg() here for the != DEV_X8 cases.
>
> > + }
> > ue_err:
> > if (!p->ue_cnt)
> > goto out;
> > @@ -459,7 +531,14 @@ static int zynqmp_get_error_info(struct
> synps_edac_priv *priv)
> > p->ueinfo.bank = (regval & ECC_CEADDR1_BNKNR_MASK) >>
> > ECC_CEADDR1_BNKNR_SHIFT;
> > p->ueinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
> > - p->ueinfo.data = readl(base + ECC_UESYND0_OFST);
> > + p->ueinfo.data_low = readl(base + ECC_UESYND0_OFST);
> > + if (zynqmp_get_dtype(base) == DEV_X8) {
> > + p->ueinfo.data_high = readl(base + ECC_UESYND1_OFST);
> > + p->ueinfo.syndrome = readl(base + ECC_UESYND2_OFST);
> > + edac_dbg(2, "UE data_low: 0x%08X data_high: 0x%08X
> syndrome: 0x%08X\n",
> > + p->ueinfo.data_low, p->ueinfo.data_high,
> > + p->ueinfo.syndrome);
>
> Similar here, no edac_dbg() for != DEV_X8.

Thanks, will correct it in v3.

>
> > + }
> > out:
> > clearval = ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT;
> > clearval |= ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT;
> @@ -480,20
> > +559,27 @@ static void handle_error(struct mem_ctl_info *mci, struct
> > synps_ecc_status *p) {
> > struct synps_edac_priv *priv = mci->pvt_info;
> > struct ecc_error_info *pinf;
> > + int n;
> >
> > if (p->ce_cnt) {
> > pinf = &p->ceinfo;
> > if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
> > - snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > - "DDR ECC error type:%s Row %d Bank %d
> BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
> > - "CE", pinf->row, pinf->bank,
> > - pinf->bankgrpnr, pinf->blknr,
> > - pinf->bitpos, pinf->data);
> > + n = snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > + "DDR ECC error type:%s Row %d Bank %d
> BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
> > + "CE", pinf->row, pinf->bank,
> > + pinf->bankgrpnr, pinf->blknr,
> > + pinf->bitpos, pinf->data_low);
> > +
> > + if (zynqmp_get_dtype(priv->baseaddr) == DEV_X8)
>
> This is zynqmp specific, right? but you call it in the generic function
> handle_error().

Yes, although handle_error() is a generic function,
but this operation is checked by priv->p_data->quirks,
so here is zynqmp and imx8mp specific.

Best regards
Sherry Sun

>
> -Robert
>
> > + snprintf(priv->message + n,
> > + SYNPS_EDAC_MSG_SIZE - n,
> > + " Data_high: 0x%08x Syndrome:
> 0x%08x",
> > + pinf->data_high, pinf->syndrome);
> > } else {
> > snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > "DDR ECC error type:%s Row %d Bank %d
> Col %d Bit Position: %d Data: 0x%08x",
> > "CE", pinf->row, pinf->bank, pinf->col,
> > - pinf->bitpos, pinf->data);
> > + pinf->bitpos, pinf->data_low);
> > }
> >
> > edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,

2020-04-02 07:21:35

by Robert Richter

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] EDAC: synopsys: Add useful debug and output information for 64bit systems

On 01.04.20 14:32:58, Sherry Sun wrote:
> > From: Robert Richter <[email protected]>
> > On 01.04.20 15:39:09, Sherry Sun wrote:

> > > --- a/drivers/edac/synopsys_edac.c
> > > +++ b/drivers/edac/synopsys_edac.c
> > > @@ -278,18 +278,22 @@
> > > * @col: Column number.
> > > * @bank: Bank number.
> > > * @bitpos: Bit position.
> > > - * @data: Data causing the error.
> > > + * @data_low: Low bit data causing the error.
> > > + * @data_high: High bit data causing the error(used for 64 bit
> > systems).
> > > * @bankgrpnr: Bank group number.
> > > * @blknr: Block number.
> > > + * @syndrome: Syndrome of the error.
> > > */
> > > struct ecc_error_info {
> > > u32 row;
> > > u32 col;
> > > u32 bank;
> > > u32 bitpos;
> > > - u32 data;
> > > + u32 data_low;
> > > + u32 data_high;
> >
> > Where are 16, 32 and 64 bit widths. You could handle them all the same in a
> > u64.
> >
> > If I am not wrong, the width is fix for the whole mci. So you could create
> > various .get_error_info() functions depending on the data width without run
> > time width checks.
> >
>
> Yes, here we can use u64. Will correct it in v3.
> For the zynq, it's 16-bit bus width, for imx8mp, it's 64-bit bus width.
> But for zynqmp, it's optional between 32-bit and 64-bit bus width,
> So we had better run time check the bus width, what do you think?

I am wondering a bit here. *get_dtype() is used only in init_csrows()
which sets up the mci. So it will be a fix value all the time after
init. With that you easily can setup functions depending on the width,
alternatively the width could be stored in struct synps_platform_data
or struct synps_edac_priv.

>
> > > u32 bankgrpnr;
> > > u32 blknr;
> > > + u32 syndrome;
> > > };

> > @@ -480,20
> > > +559,27 @@ static void handle_error(struct mem_ctl_info *mci, struct
> > > synps_ecc_status *p) {
> > > struct synps_edac_priv *priv = mci->pvt_info;
> > > struct ecc_error_info *pinf;
> > > + int n;
> > >
> > > if (p->ce_cnt) {
> > > pinf = &p->ceinfo;
> > > if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
> > > - snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > > - "DDR ECC error type:%s Row %d Bank %d
> > BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
> > > - "CE", pinf->row, pinf->bank,
> > > - pinf->bankgrpnr, pinf->blknr,
> > > - pinf->bitpos, pinf->data);
> > > + n = snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > > + "DDR ECC error type:%s Row %d Bank %d
> > BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
> > > + "CE", pinf->row, pinf->bank,
> > > + pinf->bankgrpnr, pinf->blknr,
> > > + pinf->bitpos, pinf->data_low);
> > > +
> > > + if (zynqmp_get_dtype(priv->baseaddr) == DEV_X8)
> >
> > This is zynqmp specific, right? but you call it in the generic function
> > handle_error().
>
> Yes, although handle_error() is a generic function,
> but this operation is checked by priv->p_data->quirks,
> so here is zynqmp and imx8mp specific.

Naah, this is a bit hacky, just make 2 variants of handlers and set
them up during init. Storing the width in some private data would be
an alternative.

Thanks,

-Robert

>
> Best regards
> Sherry Sun
>
> >
> > -Robert
> >
> > > + snprintf(priv->message + n,
> > > + SYNPS_EDAC_MSG_SIZE - n,
> > > + " Data_high: 0x%08x Syndrome:
> > 0x%08x",
> > > + pinf->data_high, pinf->syndrome);
> > > } else {
> > > snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > > "DDR ECC error type:%s Row %d Bank %d
> > Col %d Bit Position: %d Data: 0x%08x",
> > > "CE", pinf->row, pinf->bank, pinf->col,
> > > - pinf->bitpos, pinf->data);
> > > + pinf->bitpos, pinf->data_low);
> > > }
> > >
> > > edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,

2020-04-02 08:46:06

by Sherry Sun

[permalink] [raw]
Subject: RE: [PATCH v2 4/4] EDAC: synopsys: Add useful debug and output information for 64bit systems

Hi Robert,

> -----Original Message-----
> From: Robert Richter <[email protected]>
> Sent: 2020??4??2?? 15:20
> To: Sherry Sun <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; Frank Li <[email protected]>
> Subject: Re: [PATCH v2 4/4] EDAC: synopsys: Add useful debug and output
> information for 64bit systems
>
> On 01.04.20 14:32:58, Sherry Sun wrote:
> > > From: Robert Richter <[email protected]> On 01.04.20 15:39:09,
> > > Sherry Sun wrote:
>
> > > > --- a/drivers/edac/synopsys_edac.c
> > > > +++ b/drivers/edac/synopsys_edac.c
> > > > @@ -278,18 +278,22 @@
> > > > * @col: Column number.
> > > > * @bank: Bank number.
> > > > * @bitpos: Bit position.
> > > > - * @data: Data causing the error.
> > > > + * @data_low: Low bit data causing the error.
> > > > + * @data_high: High bit data causing the error(used for 64 bit
> > > systems).
> > > > * @bankgrpnr: Bank group number.
> > > > * @blknr: Block number.
> > > > + * @syndrome: Syndrome of the error.
> > > > */
> > > > struct ecc_error_info {
> > > > u32 row;
> > > > u32 col;
> > > > u32 bank;
> > > > u32 bitpos;
> > > > - u32 data;
> > > > + u32 data_low;
> > > > + u32 data_high;
> > >
> > > Where are 16, 32 and 64 bit widths. You could handle them all the
> > > same in a u64.
> > >
> > > If I am not wrong, the width is fix for the whole mci. So you could
> > > create various .get_error_info() functions depending on the data
> > > width without run time width checks.
> > >
> >
> > Yes, here we can use u64. Will correct it in v3.
> > For the zynq, it's 16-bit bus width, for imx8mp, it's 64-bit bus width.
> > But for zynqmp, it's optional between 32-bit and 64-bit bus width, So
> > we had better run time check the bus width, what do you think?
>
> I am wondering a bit here. *get_dtype() is used only in init_csrows() which
> sets up the mci. So it will be a fix value all the time after init. With that you
> easily can setup functions depending on the width, alternatively the width
> could be stored in struct synps_platform_data or struct synps_edac_priv.
>

Yes, I think your suggestion is reasonable, I will have a try later.

> >
> > > > u32 bankgrpnr;
> > > > u32 blknr;
> > > > + u32 syndrome;
> > > > };
>
> > > @@ -480,20
> > > > +559,27 @@ static void handle_error(struct mem_ctl_info *mci,
> > > > +struct
> > > > synps_ecc_status *p) {
> > > > struct synps_edac_priv *priv = mci->pvt_info;
> > > > struct ecc_error_info *pinf;
> > > > + int n;
> > > >
> > > > if (p->ce_cnt) {
> > > > pinf = &p->ceinfo;
> > > > if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
> > > > - snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > > > - "DDR ECC error type:%s Row %d Bank %d
> > > BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
> > > > - "CE", pinf->row, pinf->bank,
> > > > - pinf->bankgrpnr, pinf->blknr,
> > > > - pinf->bitpos, pinf->data);
> > > > + n = snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > > > + "DDR ECC error type:%s Row %d Bank %d
> > > BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
> > > > + "CE", pinf->row, pinf->bank,
> > > > + pinf->bankgrpnr, pinf->blknr,
> > > > + pinf->bitpos, pinf->data_low);
> > > > +
> > > > + if (zynqmp_get_dtype(priv->baseaddr) == DEV_X8)
> > >
> > > This is zynqmp specific, right? but you call it in the generic
> > > function handle_error().
> >
> > Yes, although handle_error() is a generic function, but this operation
> > is checked by priv->p_data->quirks, so here is zynqmp and imx8mp
> > specific.
>
> Naah, this is a bit hacky, just make 2 variants of handlers and set them up
> during init. Storing the width in some private data would be an alternative.
>

Okay, I will try to store the bus width in private data.Thanks.

Best regards
Sherry Sun

> Thanks,
>
> -Robert
>
> >
> > Best regards
> > Sherry Sun
> >
> > >
> > > -Robert
> > >
> > > > + snprintf(priv->message + n,
> > > > + SYNPS_EDAC_MSG_SIZE - n,
> > > > + " Data_high: 0x%08x Syndrome:
> > > 0x%08x",
> > > > + pinf->data_high, pinf->syndrome);
> > > > } else {
> > > > snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
> > > > "DDR ECC error type:%s Row %d Bank %d
> > > Col %d Bit Position: %d Data: 0x%08x",
> > > > "CE", pinf->row, pinf->bank, pinf->col,
> > > > - pinf->bitpos, pinf->data);
> > > > + pinf->bitpos, pinf->data_low);
> > > > }
> > > >
> > > > edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,