This patchset is the first one in the series created in the framework of
my Synopsys DW uMCTL2 DDRC-related work:
[1: In-progress v5] EDAC/mc/synopsys: Various fixes and cleanups
Link: ---you are looking at it---
[2: In-progress v4] EDAC/synopsys: Add generic DDRC info and address mapping
Link: https://lore.kernel.org/linux-edac/[email protected]
[3: In-progress v4] EDAC/synopsys: Add generic resources and Scrub support
Link: https://lore.kernel.org/linux-edac/[email protected]
Note that the patchsets above must be merged in the listed order to
prevent conflicts. Nothing prevents them from being reviewed in parallel
though. Any tests are very welcome.
Thanks in advance.
The main goal of the entire set of changes provided in the mentioned
patchsets is to specialise the synopsys_edac.c driver as much as possible
for the Synopsys DW uMCTL2 DDR controllers of various versions and
synthesized parameters, and to add useful error-detection features.
Regarding this series content: it's an initial patchset which, as usual,
provides various fixes, cleanups and modifications required to make the
further feature development more comfortable. Its main goal though is to
detach the Xilinx Zynq A05 DDRC related code into a dedicated driver,
since first it has nothing to do with the Synopsys DW uMCTL2 DDR
controller and second it would be a major obstacle on the way of
extending the Synopsys-part functionality.
The series starts with the fix patches, which in short concern the
following aspects: touching the ZynqMP-specific CSRs on the Xilinx ZynqMP
platform only, serializing the access to the ECCCLR/ECCCTL register,
adding correct memory device type detection, setting a correct value to
the mem_ctl_info.scrub_cap field, dropping an erroneous ADDRMAP[4] parsing
and restoring the correct order of the ECC errors info detection
procedure.
Afterwards the patchset provides several cleanup patches required for a
more coherent code split (Xilinx Zynq A05 and Synopsys DW uMCTL2 DDRCs),
so that the provided modifications are useful in both drivers. First the
open-coded platform resource IO-space remapping is replaced with the
devm_platform_ioremap_resource() method call for the sake of code
simplification. Secondly the following redundant entities are dropped:
internal CE/UE error counters, the local to_mci() macro definition, some
redundant ecc_error_info structure fields and redundant info from the
error message, the duplicated dimm->nr_pages debug printout and spaces
from the MEM_TYPE flags declarations. (The latter two updates concern the
MCI core part.)
Thirdly, before detaching the Zynq A05-related code, a unique MC index
allocation infrastructure is added to the MCI core. It's required since
after splitting the driver up both supported types of memory devices could
be correctly probed on the same platform. Note that even though it's
currently unsupported by the synopsys_edac.c driver, it's claimed to be
possible by the original driver author (it was the reason for having two
unrelated devices supported in a single driver). Finally the Xilinx Zynq
A05 part of the driver is moved out to a dedicated driver. After that the
platform-specific setup API is removed from the Synopsys DW uMCTL2 DDRC
driver since it's no longer required.
Finally, as the cherry on the cake, a set of local coding style cleanups
is provided: unify the DW uMCTL2 DDRC driver entities naming and replace
the open-coded "shift/mask" pattern with the kernel helpers like
BIT/GENMASK/FIELD_x in there. It should significantly improve the code
readability.
Changelog v2:
- Move Synopsys DW uMCTL2 DDRC bindings file renaming to a separate patch.
(@Krzysztof)
- Introduce a new compatible string "snps,dw-umctl2-ddrc" matching the new
DT-schema name.
- Fix the SYNPS_ZYNQMP_IRQ_REGS macro prefix, which was missed in
several places. (@tbot)
- Drop the no longer used "priv" pointer from the mc_init() function.
(@tbot)
- Include "linux/bitfield.h" header file to get the FIELD_GET macro
definition. (@tbot)
- Drop the already merged in patches:
[PATCH 12/20] EDAC/mc: Replace spaces with tabs in memtype flags definition
[PATCH 13/20] EDAC/mc: Drop duplicated dimm->nr_pages debug printout
Changelog v3:
- Drop the no longer used "priv" pointer from the mc_init() function.
(@tbot)
- Drop the merged in patches:
[PATCH v2 14/19] dt-bindings: memory: snps: Detach Zynq DDRC controller support
[PATCH v2 15/19] dt-bindings: memory: snps: Use more descriptive device name
(@Krzysztof)
Changelog v4:
- Remove Rob, Krzysztof and DT-mailing list from Cc since the respective
patches have already been merged in.
- Add a new patch
[PATCH v4 6/20] EDAC/synopsys: Fix misleading IRQ self-cleared quirk flag
detached from the very first patch of the series.
- Add a new patch
[PATCH v4 15/20] EDAC/mc: Re-use generic unique MC index allocation procedure
- Add a new patch
[PATCH v4 18/20] EDAC/synopsys: Unify CSRs macro declarations
collecting the changes from various patches of the series.
- Drop redundant empty lines left by mistake.
- Drop private counters access from the check_errors() method too.
- Rebase onto the kernel v6.6-rcX.
Link: https://lore.kernel.org/linux-edac/[email protected]
Changelog v5:
- Fix function names in the zynq_edac.c kdoc.
- Rebase onto the kernel 6.8-rc3.
Signed-off-by: Serge Semin <[email protected]>
Cc: Punnaiah Choudary Kalluri <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Serge Semin (20):
EDAC/synopsys: Fix ECC status data and IRQ disable race condition
EDAC/synopsys: Fix generic device type detection procedure
EDAC/synopsys: Fix mci->scrub_cap field setting
EDAC/synopsys: Drop erroneous ADDRMAP4.addrmap_col_b10 parse
EDAC/synopsys: Fix reading errors count before ECC status
EDAC/synopsys: Fix misleading IRQ self-cleared quirk flag
EDAC/synopsys: Use platform device devm ioremap method
EDAC/synopsys: Drop internal CE and UE counters
EDAC/synopsys: Drop local to_mci() macro definition
EDAC/synopsys: Drop struct ecc_error_info.blknr field
EDAC/synopsys: Shorten out struct ecc_error_info.bankgrpnr field name
EDAC/synopsys: Drop redundant info from the error messages
EDAC/mc: Init DIMM labels in MC registration method
EDAC/mc: Add generic unique MC index allocation procedure
EDAC/mc: Re-use generic unique MC index allocation procedure
EDAC/synopsys: Detach Zynq A05 DDRC support to separate driver
EDAC/synopsys: Drop unused platform-specific setup API
EDAC/synopsys: Unify CSRs macro declarations
EDAC/synopsys: Unify struct/macro/function prefixes
EDAC/synopsys: Convert to using BIT/GENMASK/FIELD_x macros
MAINTAINERS | 1 +
drivers/edac/Kconfig | 9 +-
drivers/edac/Makefile | 1 +
drivers/edac/dmc520_edac.c | 4 +-
drivers/edac/edac_mc.c | 135 ++++-
drivers/edac/edac_mc.h | 4 +
drivers/edac/pasemi_edac.c | 5 +-
drivers/edac/ppc4xx_edac.c | 5 +-
drivers/edac/synopsys_edac.c | 967 ++++++++++++-----------------------
drivers/edac/zynq_edac.c | 501 ++++++++++++++++++
10 files changed, 963 insertions(+), 669 deletions(-)
create mode 100644 drivers/edac/zynq_edac.c
--
2.43.0
The race condition around the ECCCLR register access happens between the
IRQ disable method, called in the device remove() procedure, and the ECC
IRQ handler:
1. Enable IRQ:
a. ECCCLR = EN_CE | EN_UE
2. Disable IRQ:
a. ECCCLR = 0
3. IRQ handler:
a. ECCCLR = CLR_CE | CLR_CE_CNT | CLR_UE | CLR_UE_CNT
b. ECCCLR = 0
c. ECCCLR = EN_CE | EN_UE
So if the IRQ disabling procedure is called concurrently with the IRQ
handler method, the IRQ might actually be left enabled due to the
statement 3c.
The root cause of the problem is that the ECCCLR register (which has been
called ECCCTL since v3.10a) intermixes the ECC status data clear flags and
the IRQ enable/disable flags. Thus the IRQ disabling (clear EN flags) and
handling (write 1 to clear ECC status data) procedures must be serialised
around the ECCCTL register modification to prevent the race.
So fix the problem described above by adding a spin-lock around the
ECCCLR modifications and preventing the IRQ handler from modifying the
IRQ enable flags (there is no point in disabling the IRQ and then
re-enabling it again within a single IRQ handler call, see the statements
3a/3b and 3c above).
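For illustration only, the serialisation idea can be modelled in plain
userspace C. This is a rough sketch and not the driver code: the demo_*
names, the bit layout and the C11 atomic_flag lock are all assumptions
made up for the example (the driver itself uses spin_lock_irqsave() on
priv->reglock). The register model simply treats the clear flags as
write-only bits which read back as zero.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for the example only. */
#define DEMO_CLR_FLAGS	0x0000000fu	/* write-1-to-clear status flags */
#define DEMO_EN_FLAGS	0x00000300u	/* CE/UE IRQ enable flags */

struct demo_ddrc {
	atomic_flag reglock;	/* stands in for the driver spin-lock */
	uint32_t ecc_ctl;	/* stands in for the ECCCLR/ECCCTL CSR */
};

static void demo_init(struct demo_ddrc *d)
{
	atomic_flag_clear(&d->reglock);
	d->ecc_ctl = 0;
}

static void demo_lock(struct demo_ddrc *d)
{
	while (atomic_flag_test_and_set_explicit(&d->reglock,
						 memory_order_acquire))
		;
}

static void demo_unlock(struct demo_ddrc *d)
{
	atomic_flag_clear_explicit(&d->reglock, memory_order_release);
}

/* Register write model: the clear flags act once and read back as zero. */
static void demo_writel(struct demo_ddrc *d, uint32_t val)
{
	d->ecc_ctl = val & ~DEMO_CLR_FLAGS;
}

static void demo_enable_irq(struct demo_ddrc *d)
{
	demo_lock(d);
	demo_writel(d, DEMO_EN_FLAGS);
	demo_unlock(d);
}

static void demo_disable_irq(struct demo_ddrc *d)
{
	demo_lock(d);
	demo_writel(d, 0);
	demo_unlock(d);
}

/* Handler tail: clear the status data, keep the enable flags as found. */
static void demo_clear_status(struct demo_ddrc *d)
{
	demo_lock(d);
	demo_writel(d, d->ecc_ctl | DEMO_CLR_FLAGS);
	demo_unlock(d);
}
```

With the read-modify-write done under the lock, a handler racing with
demo_disable_irq() can only preserve whatever enable state it finds and
never turns the IRQs back on by itself.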
Fixes: f7824ded4149 ("EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR")
Signed-off-by: Serge Semin <[email protected]>
---
Cc: Sherry Sun <[email protected]>
Changelog v4:
- This is a new patch detached from
[PATCH v3 01/17] EDAC/synopsys: Fix native uMCTL2 IRQs handling procedure
- Rename lock to reglock (Borislav)
---
drivers/edac/synopsys_edac.c | 50 ++++++++++++++++++++++++++----------
1 file changed, 37 insertions(+), 13 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 709babce43ba..0168b05e3ca1 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -9,6 +9,7 @@
#include <linux/edac.h>
#include <linux/module.h>
#include <linux/platform_device.h>
+#include <linux/spinlock.h>
#include <linux/interrupt.h>
#include <linux/of.h>
@@ -299,6 +300,7 @@ struct synps_ecc_status {
/**
* struct synps_edac_priv - DDR memory controller private instance data.
* @baseaddr: Base address of the DDR controller.
+ * @reglock: Concurrent CSRs access lock.
* @message: Buffer for framing the event specific info.
* @stat: ECC status information.
* @p_data: Platform data.
@@ -313,6 +315,7 @@ struct synps_ecc_status {
*/
struct synps_edac_priv {
void __iomem *baseaddr;
+ spinlock_t reglock;
char message[SYNPS_EDAC_MSG_SIZE];
struct synps_ecc_status stat;
const struct synps_platform_data *p_data;
@@ -408,7 +411,8 @@ static int zynq_get_error_info(struct synps_edac_priv *priv)
static int zynqmp_get_error_info(struct synps_edac_priv *priv)
{
struct synps_ecc_status *p;
- u32 regval, clearval = 0;
+ u32 regval, clearval;
+ unsigned long flags;
void __iomem *base;
base = priv->baseaddr;
@@ -452,10 +456,14 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
p->ueinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
p->ueinfo.data = readl(base + ECC_UESYND0_OFST);
out:
- clearval = ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT;
- clearval |= ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT;
+ spin_lock_irqsave(&priv->reglock, flags);
+
+ clearval = readl(base + ECC_CLR_OFST) |
+ ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT |
+ ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT;
writel(clearval, base + ECC_CLR_OFST);
- writel(0x0, base + ECC_CLR_OFST);
+
+ spin_unlock_irqrestore(&priv->reglock, flags);
return 0;
}
@@ -515,24 +523,41 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
static void enable_intr(struct synps_edac_priv *priv)
{
+ unsigned long flags;
+
/* Enable UE/CE Interrupts */
- if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
- writel(DDR_UE_MASK | DDR_CE_MASK,
- priv->baseaddr + ECC_CLR_OFST);
- else
+ if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)) {
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
+ return;
+ }
+
+ spin_lock_irqsave(&priv->reglock, flags);
+
+ writel(DDR_UE_MASK | DDR_CE_MASK,
+ priv->baseaddr + ECC_CLR_OFST);
+
+ spin_unlock_irqrestore(&priv->reglock, flags);
}
static void disable_intr(struct synps_edac_priv *priv)
{
+ unsigned long flags;
+
/* Disable UE/CE Interrupts */
- if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
- writel(0x0, priv->baseaddr + ECC_CLR_OFST);
- else
+ if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)) {
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_DB_OFST);
+
+ return;
+ }
+
+ spin_lock_irqsave(&priv->reglock, flags);
+
+ writel(0, priv->baseaddr + ECC_CLR_OFST);
+
+ spin_unlock_irqrestore(&priv->reglock, flags);
}
/**
@@ -576,8 +601,6 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
/* v3.0 of the controller does not have this register */
if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR))
writel(regval, priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
- else
- enable_intr(priv);
return IRQ_HANDLED;
}
@@ -1359,6 +1382,7 @@ static int mc_probe(struct platform_device *pdev)
priv = mci->pvt_info;
priv->baseaddr = baseaddr;
priv->p_data = p_data;
+ spin_lock_init(&priv->reglock);
mc_init(mci, pdev);
--
2.43.0
First of all the enum dev_type constants describe the DRAM chips used on
the memory stick, not the entire DQ-bus width (see the enumeration kdoc
for details). So what is returned from the zynqmp_get_dtype() function and
then assigned to the dimm_info->dtype field is definitely incorrect.
Secondly the DRAM chips type has nothing to do with the data bus width
specified in the MSTR.data_bus_width CSR field. That CSR field just
determines the part of the whole DQ-bus currently used to access the data
of all the DRAM chips. So it doesn't indicate the individual chips type.
Thirdly the DRAM chips type can be determined only in case of the DDR4
protocol by means of the MSTR.device_config field state (it is supposed
to be set by the system firmware). Finally the DW uMCTL2 DDRC ECC
capability doesn't depend on the memory chips type. Moreover it doesn't
depend on the data bus width utilized at runtime either. The IP-core
reference manual says in [1,2] that the ECC support can't be enabled
during the IP-core synthesis for the DRAM data bus widths other than 16,
32 or 64. At the same time the bus width mode (MSTR.data_bus_width)
doesn't change the ECC feature availability. Thus it was wrong to
determine the ECC state with respect to the DQ-bus width mode.
Fix all of the mistakes described above in the zynqmp_get_dtype() and
zynqmp_get_ecc_state() methods: specify the actual DRAM chips data width
only for the DDR4 protocol and return DEV_UNKNOWN in all other cases;
determine the ECC availability by the ECCCFG0.ecc_mode field state only
(that field can't be modified anyway if the IP-core was synthesized with
no ECC support).
[1] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2)
Databook, Version 3.91a, October 2020, p. 421.
[2] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2)
Databook, Version 3.91a, October 2020, p. 633.
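As a standalone illustration of the detection logic described above, here
is a small userspace sketch. The demo_* names are hypothetical, and the
field positions and encodings (DDR4 flag in bit 4, device_config in bits
31:30 with 0..3 meaning x4..x32, ecc_mode in bits 2:0 with 0x4 meaning
SEC/DED) are assumptions of the example which should be checked against
the databook of the IP-core actually in use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed MSTR and ECCCFG0 field layout, see the note above. */
#define DEMO_MSTR_DDR4			(1u << 4)
#define DEMO_MSTR_DEV_CFG_SHIFT		30
#define DEMO_MSTR_DEV_CFG_MASK		(3u << DEMO_MSTR_DEV_CFG_SHIFT)
#define DEMO_ECCCFG0_MODE_MASK		0x7u
#define DEMO_ECCCFG0_MODE_SECDED	0x4u

enum demo_dev_type {
	DEMO_DEV_UNKNOWN,
	DEMO_DEV_X4,
	DEMO_DEV_X8,
	DEMO_DEV_X16,
	DEMO_DEV_X32,
};

/* The chip width is reported for DDR4 only, UNKNOWN otherwise. */
static enum demo_dev_type demo_get_dtype(uint32_t mstr)
{
	if (!(mstr & DEMO_MSTR_DDR4))
		return DEMO_DEV_UNKNOWN;

	switch ((mstr & DEMO_MSTR_DEV_CFG_MASK) >> DEMO_MSTR_DEV_CFG_SHIFT) {
	case 0:
		return DEMO_DEV_X4;
	case 1:
		return DEMO_DEV_X8;
	case 2:
		return DEMO_DEV_X16;
	case 3:
		return DEMO_DEV_X32;
	}

	return DEMO_DEV_UNKNOWN;
}

/* The ECC state depends on the ecc_mode field only, not on the bus width. */
static bool demo_get_ecc_state(uint32_t ecccfg0)
{
	return (ecccfg0 & DEMO_ECCCFG0_MODE_MASK) == DEMO_ECCCFG0_MODE_SECDED;
}
```

The two helpers take raw register values on purpose, so the decoding can
be checked without any hardware at hand.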
Fixes: b500b4a029d5 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
Signed-off-by: Serge Semin <[email protected]>
---
Changelog v2:
- Include "linux/bitfield.h" header file to get the FIELD_GET macro
definition. (@tbot)
---
drivers/edac/synopsys_edac.c | 49 +++++++++++++++---------------------
1 file changed, 20 insertions(+), 29 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 0168b05e3ca1..455d2fcfd8c1 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -674,26 +674,25 @@ static enum dev_type zynq_get_dtype(const void __iomem *base)
*/
static enum dev_type zynqmp_get_dtype(const void __iomem *base)
{
- enum dev_type dt;
- u32 width;
-
- width = readl(base + CTRL_OFST);
- width = (width & ECC_CTRL_BUSWIDTH_MASK) >> ECC_CTRL_BUSWIDTH_SHIFT;
- switch (width) {
- case DDRCTL_EWDTH_16:
- dt = DEV_X2;
- break;
- case DDRCTL_EWDTH_32:
- dt = DEV_X4;
- break;
- case DDRCTL_EWDTH_64:
- dt = DEV_X8;
- break;
- default:
- dt = DEV_UNKNOWN;
+ u32 regval;
+
+ regval = readl(base + CTRL_OFST);
+ if (!(regval & MEM_TYPE_DDR4))
+ return DEV_UNKNOWN;
+
+ regval = (regval & DDRC_MSTR_CFG_MASK) >> DDRC_MSTR_CFG_SHIFT;
+ switch (regval) {
+ case DDRC_MSTR_CFG_X4_MASK:
+ return DEV_X4;
+ case DDRC_MSTR_CFG_X8_MASK:
+ return DEV_X8;
+ case DDRC_MSTR_CFG_X16_MASK:
+ return DEV_X16;
+ case DDRC_MSTR_CFG_X32_MASK:
+ return DEV_X32;
}
- return dt;
+ return DEV_UNKNOWN;
}
/**
@@ -730,19 +729,11 @@ static bool zynq_get_ecc_state(void __iomem *base)
*/
static bool zynqmp_get_ecc_state(void __iomem *base)
{
- enum dev_type dt;
- u32 ecctype;
+ u32 regval;
- dt = zynqmp_get_dtype(base);
- if (dt == DEV_UNKNOWN)
- return false;
+ regval = readl(base + ECC_CFG0_OFST) & SCRUB_MODE_MASK;
- ecctype = readl(base + ECC_CFG0_OFST) & SCRUB_MODE_MASK;
- if ((ecctype == SCRUB_MODE_SECDED) &&
- ((dt == DEV_X2) || (dt == DEV_X4) || (dt == DEV_X8)))
- return true;
-
- return false;
+ return (regval == SCRUB_MODE_SECDED);
}
/**
--
2.43.0
Currently the ADDRMAP4.addrmap_col_b10 field is parsed in case of the
LPDDR3 memory and Quarter DQ bus width mode. It's wrong since that field
is marked as unused for that mode in all the available DW uMCTL2 DDRC
releases (up to IP-core v3.91a). Most likely the field parsing was added
by mistake as a result of the copy-paste from the Half DQ bus width mode
part of the same function. Even though the field is supposed to be always
set to the UNUSED value (0x1F), drop parsing it anyway to simplify the
setup_column_address_map() method a tiny bit.
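For reference, the per-field decoding pattern used in that function can be
sketched as a tiny standalone helper. The demo_col_shift() name and its
signature are hypothetical and exist only in this example; the 5-bit field
width is assumed from the UNUSED value (0x1F) mentioned above.

```c
#include <stdint.h>

/* All ones in the field mark the column bit as unused. */
#define DEMO_COL_FIELD_MASK	0x1fu

/*
 * Extract one ADDRMAPx column field and turn it into a bit shift.
 * An unused field yields 0, otherwise the field value is added to the
 * base of the respective column bit.
 */
static uint32_t demo_col_shift(uint32_t addrmap, unsigned int field_shift,
			       uint32_t base)
{
	uint32_t field = (addrmap >> field_shift) & DEMO_COL_FIELD_MASK;

	return field == DEMO_COL_FIELD_MASK ? 0 : field + base;
}
```

With such a helper a field that always holds the UNUSED value decodes to
zero, which is why dropping its parsing doesn't change the resulting
mapping.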
Fixes: 1a81361f75d8 ("EDAC, synopsys: Add Error Injection support for ZynqMP DDR controller")
Signed-off-by: Serge Semin <[email protected]>
---
drivers/edac/synopsys_edac.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 7c57c43b4d31..bd6e52db68bc 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -1236,10 +1236,6 @@ static void setup_column_address_map(struct synps_edac_priv *priv, u32 *addrmap)
COL_MAX_VAL_MASK) == COL_MAX_VAL_MASK) ? 0 :
(((addrmap[3] >> 24) & COL_MAX_VAL_MASK) +
COL_B9_BASE);
- priv->col_shift[13] = ((addrmap[4] &
- COL_MAX_VAL_MASK) == COL_MAX_VAL_MASK) ? 0 :
- ((addrmap[4] & COL_MAX_VAL_MASK) +
- COL_B10_BASE);
} else {
priv->col_shift[11] = (((addrmap[3] >> 16) &
COL_MAX_VAL_MASK) == COL_MAX_VAL_MASK) ? 0 :
--
2.43.0
The mem_ctl_info.scrub_cap field is supposed to be set with the ECC
scrub-related flags. Instead the driver erroneously initializes it with
the SCRUB_HW_SRC flag ID. It's definitely wrong, though it hasn't caused
any problem so far since the structure field isn't used by the EDAC core.
Fix it anyway by using the SCRUB_FLAG_HW_SRC macro to initialize the
field.
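The difference between a flag ID and a flag mask can be shown with a short
standalone sketch. The enumeration below is a shortened, hypothetical
stand-in for the real scrub types (its members and values are assumptions
of the example, not the EDAC core definitions); only the ID-versus-bit
relation matters here.

```c
#include <stdint.h>

/* Hypothetical, shortened list of scrub type IDs. */
enum demo_scrub_type {
	DEMO_SCRUB_UNKNOWN = 0,
	DEMO_SCRUB_NONE,
	DEMO_SCRUB_SW_PROG,
	DEMO_SCRUB_SW_SRC,
	DEMO_SCRUB_HW_SRC,
};

/* A capability flag is the bit at the position of the respective ID. */
#define DEMO_SCRUB_FLAG(id)	(1u << (id))

/* Check whether a capabilities mask advertises the given scrub type. */
static int demo_has_cap(uint32_t scrub_cap, enum demo_scrub_type id)
{
	return (scrub_cap & DEMO_SCRUB_FLAG(id)) != 0;
}
```

Storing the raw ID in the mask (4 in this sketch) advertises an unrelated
capability, the bit of DEMO_SCRUB_SW_PROG here, and not the intended one.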
Fixes: ae9b56e3996d ("EDAC, synps: Add EDAC support for zynq ddr ecc controller")
Signed-off-by: Serge Semin <[email protected]>
---
drivers/edac/synopsys_edac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 455d2fcfd8c1..7c57c43b4d31 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -855,7 +855,7 @@ static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
/* Initialize controller capabilities and configuration */
mci->mtype_cap = MEM_FLAG_DDR3 | MEM_FLAG_DDR2;
mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
- mci->scrub_cap = SCRUB_HW_SRC;
+ mci->scrub_cap = SCRUB_FLAG_HW_SRC;
mci->scrub_mode = SCRUB_NONE;
mci->edac_cap = EDAC_FLAG_SECDED;
--
2.43.0
The DDR_ECC_INTR_SELF_CLEAR quirk flag was initially added in the commit
f7824ded4149 ("EDAC/synopsys: Add support for version 3 of the Synopsys
EDAC DDR") in order to distinguish the ZynqMP DDRC (based on DW uMCTL2
DDRC v2.40a) from the Synopsys DDR controller v3.80a announced in that
commit. The selected name is misleading for the following reasons:
1. None of the Synopsys DW uMCTL2 DDR IP-cores has the UE/CE IRQs
auto or self cleared. The IRQ signals (ecc_corrected_err_intr and
ecc_uncorrected_err_intr) are cleared together with the rest of the ECC
error data by means of writing 1's to the respective ECCCLR bits. It
worked like that in the DW uMCTL2 DDRC v2.x IP-core and it's still true
for the modern DW uMCTL2 DDRC v3.x.
2. The IRQ-related registers accessed unless the denoted quirk is
specified are actually Xilinx Zynq-specific. None of the Synopsys DW
uMCTL2 DDRC IP-cores has any registers at the offsets
0x20200/0x20208/0x2020C. The most modern DW uMCTL2 DDRC v3.91a IP-core
available has its CSR space ending at the 0x43dc offset. The older
IP-cores have an even smaller register space.
3. What was actually introduced in the DW uMCTL2 DDRC v3.10 by Synopsys is
the IRQ enable flags which the older DW uMCTL2 DDRC IP-cores didn't have.
They were added to the ECCCLR register (the CSR was also renamed to ECCCTL
in the v3.10 IP-core HW databook). So since then there has been no point
in having a vendor-specific IRQ masking solution like the one described in
2., and the IRQ signal can now be shared even for the native DW uMCTL2 DDR
controllers.
So let's harmonize the quirked IRQs code based on the statements above:
rename the DDR_ECC_INTR_SELF_CLEAR quirk flag to SYNPS_ZYNQMP_IRQ_REGS
thus indicating the ZynqMP-specific IRQ CSRs; add the new quirk flag to
the ZynqMP platform data; drop the misleading comments about the
auto-cleared ue/ce flags; add a comment about the new IRQ enable flags
added in v3.10 IP-core.
Signed-off-by: Serge Semin <[email protected]>
---
Changelog v4:
- This is a new patch detached from
[PATCH v3 01/17] EDAC/synopsys: Fix native uMCTL2 IRQs handling procedure
---
drivers/edac/synopsys_edac.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index fbaf3d9ad517..9f79f14e57b2 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -88,7 +88,7 @@
/* DDR ECC Quirks */
#define DDR_ECC_INTR_SUPPORT BIT(0)
#define DDR_ECC_DATA_POISON_SUPPORT BIT(1)
-#define DDR_ECC_INTR_SELF_CLEAR BIT(2)
+#define SYNPS_ZYNQMP_IRQ_REGS BIT(2)
/* ZynqMP Enhanced DDR memory controller registers that are relevant to ECC */
/* ECC Configuration Registers */
@@ -526,7 +526,7 @@ static void enable_intr(struct synps_edac_priv *priv)
unsigned long flags;
/* Enable UE/CE Interrupts */
- if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)) {
+ if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
@@ -535,6 +535,10 @@ static void enable_intr(struct synps_edac_priv *priv)
spin_lock_irqsave(&priv->reglock, flags);
+ /*
+ * IRQs Enable/Disable flags have been available since v3.10a.
+ * This is noop for the older controllers.
+ */
writel(DDR_UE_MASK | DDR_CE_MASK,
priv->baseaddr + ECC_CLR_OFST);
@@ -546,7 +550,7 @@ static void disable_intr(struct synps_edac_priv *priv)
unsigned long flags;
/* Disable UE/CE Interrupts */
- if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)) {
+ if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
priv->baseaddr + DDR_QOS_IRQ_DB_OFST);
@@ -577,11 +581,7 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
priv = mci->pvt_info;
p_data = priv->p_data;
- /*
- * v3.0 of the controller has the ce/ue bits cleared automatically,
- * so this condition does not apply.
- */
- if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)) {
+ if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
regval = readl(priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
regval &= (DDR_QOSCE_MASK | DDR_QOSUE_MASK);
if (!(regval & ECC_CE_UE_INTR_MASK))
@@ -598,8 +598,8 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
edac_dbg(3, "Total error count CE %d UE %d\n",
priv->ce_cnt, priv->ue_cnt);
- /* v3.0 of the controller does not have this register */
- if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR))
+
+ if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS)
writel(regval, priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
return IRQ_HANDLED;
@@ -913,7 +913,7 @@ static const struct synps_platform_data zynqmp_edac_def = {
.get_mtype = zynqmp_get_mtype,
.get_dtype = zynqmp_get_dtype,
.get_ecc_state = zynqmp_get_ecc_state,
- .quirks = (DDR_ECC_INTR_SUPPORT
+ .quirks = (DDR_ECC_INTR_SUPPORT | SYNPS_ZYNQMP_IRQ_REGS
#ifdef CONFIG_EDAC_DEBUG
| DDR_ECC_DATA_POISON_SUPPORT
#endif
@@ -925,7 +925,7 @@ static const struct synps_platform_data synopsys_edac_def = {
.get_mtype = zynqmp_get_mtype,
.get_dtype = zynqmp_get_dtype,
.get_ecc_state = zynqmp_get_ecc_state,
- .quirks = (DDR_ECC_INTR_SUPPORT | DDR_ECC_INTR_SELF_CLEAR
+ .quirks = (DDR_ECC_INTR_SUPPORT
#ifdef CONFIG_EDAC_DEBUG
| DDR_ECC_DATA_POISON_SUPPORT
#endif
--
2.43.0
First of all these counters aren't exposed by the driver in any way.
Secondly the EDAC core already tracks the total number of correctable
and uncorrectable errors (see mem_ctl_info.{ce_mc,ue_mc} fields usage).
Drop the useless internal counters then for good.
Signed-off-by: Serge Semin <[email protected]>
---
Changelog v4:
- Drop redundant empty line.
- Drop private counters access from the check_errors() method too.
---
drivers/edac/synopsys_edac.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 6976ef84e952..5099246db90e 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -304,8 +304,6 @@ struct synps_ecc_status {
* @message: Buffer for framing the event specific info.
* @stat: ECC status information.
* @p_data: Platform data.
- * @ce_cnt: Correctable Error count.
- * @ue_cnt: Uncorrectable Error count.
* @poison_addr: Data poison address.
* @row_shift: Bit shifts for row bit.
* @col_shift: Bit shifts for column bit.
@@ -319,8 +317,6 @@ struct synps_edac_priv {
char message[SYNPS_EDAC_MSG_SIZE];
struct synps_ecc_status stat;
const struct synps_platform_data *p_data;
- u32 ce_cnt;
- u32 ue_cnt;
#ifdef CONFIG_EDAC_DEBUG
ulong poison_addr;
u32 row_shift[18];
@@ -592,13 +588,8 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
if (status)
return IRQ_NONE;
- priv->ce_cnt += priv->stat.ce_cnt;
- priv->ue_cnt += priv->stat.ue_cnt;
handle_error(mci, &priv->stat);
- edac_dbg(3, "Total error count CE %d UE %d\n",
- priv->ce_cnt, priv->ue_cnt);
-
if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS)
writel(regval, priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
@@ -624,12 +615,7 @@ static void check_errors(struct mem_ctl_info *mci)
if (status)
return;
- priv->ce_cnt += priv->stat.ce_cnt;
- priv->ue_cnt += priv->stat.ue_cnt;
handle_error(mci, &priv->stat);
-
- edac_dbg(3, "Total error count CE %d UE %d\n",
- priv->ce_cnt, priv->ue_cnt);
}
/**
--
2.43.0
The to_mci() macro was added in commit 1a81361f75d8 ("EDAC, synopsys: Add
Error Injection support for ZynqMP DDR controller") together with the
error injection debug feature. It turns out that exactly the same
macro-function has already been defined in the edac_mc.h (former
edac_core.h) header file. It's unclear why a local version of the macro
was needed, but there is no point in it now. Drop the local
macro-function definition for good.
Signed-off-by: Serge Semin <[email protected]>
---
drivers/edac/synopsys_edac.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 5099246db90e..21b0d791cb8b 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -940,7 +940,6 @@ static const struct of_device_id synps_edac_match[] = {
MODULE_DEVICE_TABLE(of, synps_edac_match);
#ifdef CONFIG_EDAC_DEBUG
-#define to_mci(k) container_of(k, struct mem_ctl_info, dev)
/**
* ddr_poison_setup - Update poison registers.
--
2.43.0
Even though the ECC(C|U)ADDR1 CSR description indeed says it's a "Block
number" in the DW uMCTL2 DDRC IP-core databooks, the corresponding
register field is named ECC(C|U)ADDR1.ecc_(un)corr_col (which means ECC
(un)corrected column) and in the rest of the document it's referred to as
the SDRAM address column. Thus use the already available
ecc_error_info.col field to store the column number and drop the
questionable ecc_error_info.blknr field for good.
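To make the resulting field layout easier to follow, here is a standalone
sketch of the ECC(C|U)ADDR1 decoding. The mask and shift values are the
ones used by the driver macros touched in this patch, while the demo_*
names and the structure are hypothetical and exist only in this example.

```c
#include <stdint.h>

/* Same values as the ECC_CEADDR1_* driver macros. */
#define DEMO_ADDR1_BNKGRP_MASK	0x3000000u
#define DEMO_ADDR1_BNKGRP_SHIFT	24
#define DEMO_ADDR1_BNKNR_MASK	0x70000u
#define DEMO_ADDR1_BNKNR_SHIFT	16
#define DEMO_ADDR1_COL_MASK	0xFFFu

struct demo_err_addr {
	uint32_t col;
	uint32_t bank;
	uint32_t bankgrp;
};

/* Split a raw ECC(C|U)ADDR1 value into column, bank and bank group. */
static struct demo_err_addr demo_decode_addr1(uint32_t regval)
{
	struct demo_err_addr addr;

	addr.bankgrp = (regval & DEMO_ADDR1_BNKGRP_MASK) >>
		       DEMO_ADDR1_BNKGRP_SHIFT;
	addr.bank = (regval & DEMO_ADDR1_BNKNR_MASK) >>
		    DEMO_ADDR1_BNKNR_SHIFT;
	addr.col = regval & DEMO_ADDR1_COL_MASK;

	return addr;
}
```

For instance, a raw value of 0x02050abc would decode to the bank group 2,
bank 5 and column 0xabc.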
Signed-off-by: Serge Semin <[email protected]>
---
drivers/edac/synopsys_edac.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 21b0d791cb8b..6ca119459bd3 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -173,7 +173,7 @@
#define ECC_CEADDR0_RNK_MASK BIT(24)
#define ECC_CEADDR1_BNKGRP_MASK 0x3000000
#define ECC_CEADDR1_BNKNR_MASK 0x70000
-#define ECC_CEADDR1_BLKNR_MASK 0xFFF
+#define ECC_CEADDR1_COL_MASK 0xFFF
#define ECC_CEADDR1_BNKGRP_SHIFT 24
#define ECC_CEADDR1_BNKNR_SHIFT 16
@@ -271,7 +271,6 @@
* @bitpos: Bit position.
* @data: Data causing the error.
* @bankgrpnr: Bank group number.
- * @blknr: Block number.
*/
struct ecc_error_info {
u32 row;
@@ -280,7 +279,6 @@ struct ecc_error_info {
u32 bitpos;
u32 data;
u32 bankgrpnr;
- u32 blknr;
};
/**
@@ -433,7 +431,7 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
ECC_CEADDR1_BNKNR_SHIFT;
p->ceinfo.bankgrpnr = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
ECC_CEADDR1_BNKGRP_SHIFT;
- p->ceinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
+ p->ceinfo.col = (regval & ECC_CEADDR1_COL_MASK);
p->ceinfo.data = readl(base + ECC_CSYND0_OFST);
edac_dbg(2, "ECCCSYN0: 0x%08X ECCCSYN1: 0x%08X ECCCSYN2: 0x%08X\n",
readl(base + ECC_CSYND0_OFST), readl(base + ECC_CSYND1_OFST),
@@ -449,7 +447,7 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
ECC_CEADDR1_BNKGRP_SHIFT;
p->ueinfo.bank = (regval & ECC_CEADDR1_BNKNR_MASK) >>
ECC_CEADDR1_BNKNR_SHIFT;
- p->ueinfo.blknr = (regval & ECC_CEADDR1_BLKNR_MASK);
+ p->ueinfo.col = (regval & ECC_CEADDR1_COL_MASK);
p->ueinfo.data = readl(base + ECC_UESYND0_OFST);
out:
spin_lock_irqsave(&priv->reglock, flags);
@@ -480,10 +478,9 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
pinf = &p->ceinfo;
if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type:%s Row %d Bank %d BankGroup Number %d Block Number %d Bit Position: %d Data: 0x%08x",
- "CE", pinf->row, pinf->bank,
- pinf->bankgrpnr, pinf->blknr,
- pinf->bitpos, pinf->data);
+ "DDR ECC error type:%s Row %d Col %d Bank %d BankGroup Number %d Bit Position: %d Data: 0x%08x",
+ "CE", pinf->row, pinf->col, pinf->bank,
+ pinf->bankgrpnr, pinf->bitpos, pinf->data);
} else {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
"DDR ECC error type:%s Row %d Bank %d Col %d Bit Position: %d Data: 0x%08x",
@@ -500,9 +497,9 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
pinf = &p->ueinfo;
if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type :%s Row %d Bank %d BankGroup Number %d Block Number %d",
- "UE", pinf->row, pinf->bank,
- pinf->bankgrpnr, pinf->blknr);
+ "DDR ECC error type :%s Row %d Col %d Bank %d BankGroup Number %d",
+ "UE", pinf->row, pinf->col, pinf->bank,
+ pinf->bankgrpnr);
} else {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
"DDR ECC error type :%s Row %d Bank %d Col %d ",
--
2.43.0
None of the ecc_error_info structure fields has the "nr" suffix even
though each of them does represent some number (row number, column number,
bank number). Drop the suffix from the bankgrpnr field name for the sake
of unification then. Similarly drop the word "Number" from the CE/UE error
messages too since it doesn't give any helpful info there.
Signed-off-by: Serge Semin <[email protected]>
---
drivers/edac/synopsys_edac.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 6ca119459bd3..b0ff831287f5 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -268,17 +268,17 @@
* @row: Row number.
* @col: Column number.
* @bank: Bank number.
+ * @bankgrp: Bank group number.
* @bitpos: Bit position.
* @data: Data causing the error.
- * @bankgrpnr: Bank group number.
*/
struct ecc_error_info {
u32 row;
u32 col;
u32 bank;
+ u32 bankgrp;
u32 bitpos;
u32 data;
- u32 bankgrpnr;
};
/**
@@ -429,7 +429,7 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
regval = readl(base + ECC_CEADDR1_OFST);
p->ceinfo.bank = (regval & ECC_CEADDR1_BNKNR_MASK) >>
ECC_CEADDR1_BNKNR_SHIFT;
- p->ceinfo.bankgrpnr = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
+ p->ceinfo.bankgrp = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
ECC_CEADDR1_BNKGRP_SHIFT;
p->ceinfo.col = (regval & ECC_CEADDR1_COL_MASK);
p->ceinfo.data = readl(base + ECC_CSYND0_OFST);
@@ -443,7 +443,7 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
regval = readl(base + ECC_UEADDR0_OFST);
p->ueinfo.row = (regval & ECC_CEADDR0_RW_MASK);
regval = readl(base + ECC_UEADDR1_OFST);
- p->ueinfo.bankgrpnr = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
+ p->ueinfo.bankgrp = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
ECC_CEADDR1_BNKGRP_SHIFT;
p->ueinfo.bank = (regval & ECC_CEADDR1_BNKNR_MASK) >>
ECC_CEADDR1_BNKNR_SHIFT;
@@ -478,9 +478,9 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
pinf = &p->ceinfo;
if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type:%s Row %d Col %d Bank %d BankGroup Number %d Bit Position: %d Data: 0x%08x",
+ "DDR ECC error type:%s Row %d Col %d Bank %d Bank Group %d Bit Position: %d Data: 0x%08x",
"CE", pinf->row, pinf->col, pinf->bank,
- pinf->bankgrpnr, pinf->bitpos, pinf->data);
+ pinf->bankgrp, pinf->bitpos, pinf->data);
} else {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
"DDR ECC error type:%s Row %d Bank %d Col %d Bit Position: %d Data: 0x%08x",
@@ -497,9 +497,9 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
pinf = &p->ueinfo;
if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type :%s Row %d Col %d Bank %d BankGroup Number %d",
+ "DDR ECC error type :%s Row %d Col %d Bank %d Bank Group %d",
"UE", pinf->row, pinf->col, pinf->bank,
- pinf->bankgrpnr);
+ pinf->bankgrp);
} else {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
"DDR ECC error type :%s Row %d Bank %d Col %d ",
--
2.43.0
Move the DIMM labels initialization to the memory controller registration
method as a preparation for adding the generic procedure which allocates a
unique MC index. It's required because the DIMM labels contain the MC
index as the "mc#%u" part of the string, which in case of an
auto-generated index isn't yet available at the moment of the
MCI/csrow/dimms descriptor allocation.
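For illustration only, the label layout described above can be modelled in
user space as follows. This is a minimal sketch and not the kernel code:
build_dimm_label(), clamp_written() and the layer_name[] table are
hypothetical stand-ins for the scnprintf()-based loop and for
edac_layer_name[], and only two layers are assumed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's edac_layer_name[] table */
static const char *const layer_name[] = { "csrow", "channel" };
#define N_LAYER_NAMES (sizeof(layer_name) / sizeof(layer_name[0]))

/* Emulate scnprintf(): report what was really stored, not what was wanted */
static size_t clamp_written(int ret, size_t avail)
{
	if (ret < 0)
		return 0;
	if ((size_t)ret >= avail)
		return avail - 1;
	return (size_t)ret;
}

/*
 * Build a label like "mc#3csrow#1channel#0" in buf. The MC index goes
 * first, which is why it must be known before the labels are generated.
 * Returns the label length, not counting the trailing NUL.
 */
static size_t build_dimm_label(char *buf, size_t len, unsigned int mc_idx,
			       const unsigned int *location, size_t n_layers)
{
	size_t n, layer;

	if (len == 0)
		return 0;

	n = clamp_written(snprintf(buf, len, "mc#%u", mc_idx), len);
	for (layer = 0; layer < n_layers && layer < N_LAYER_NAMES; layer++) {
		n += clamp_written(snprintf(buf + n, len - n, "%s#%u",
					    layer_name[layer], location[layer]),
				   len - n);
	}

	return n;
}
```

Calling build_dimm_label() with mc_idx = 3 and location = {1, 0} yields
"mc#3csrow#1channel#0". A buffer which is too small gets a truncated but
still NUL-terminated label.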
Signed-off-by: Serge Semin <[email protected]>
---
drivers/edac/edac_mc.c | 48 +++++++++++++++++++++++++++---------------
1 file changed, 31 insertions(+), 17 deletions(-)
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index d6eed727b0cd..c0b36349999f 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -257,7 +257,6 @@ static int edac_mc_alloc_dimms(struct mem_ctl_info *mci)
unsigned int pos[EDAC_MAX_LAYERS];
unsigned int row, chn, idx;
int layer;
- void *p;
/*
* Allocate and fill the dimm structs
@@ -272,7 +271,6 @@ static int edac_mc_alloc_dimms(struct mem_ctl_info *mci)
for (idx = 0; idx < mci->tot_dimms; idx++) {
struct dimm_info *dimm;
struct rank_info *chan;
- int n, len;
chan = mci->csrows[row]->channels[chn];
@@ -283,22 +281,9 @@ static int edac_mc_alloc_dimms(struct mem_ctl_info *mci)
dimm->mci = mci;
dimm->idx = idx;
- /*
- * Copy DIMM location and initialize it.
- */
- len = sizeof(dimm->label);
- p = dimm->label;
- n = scnprintf(p, len, "mc#%u", mci->mc_idx);
- p += n;
- len -= n;
- for (layer = 0; layer < mci->n_layers; layer++) {
- n = scnprintf(p, len, "%s#%u",
- edac_layer_name[mci->layers[layer].type],
- pos[layer]);
- p += n;
- len -= n;
+ /* Copy DIMM location */
+ for (layer = 0; layer < mci->n_layers; layer++)
dimm->location[layer] = pos[layer];
- }
/* Link it to the csrows old API data */
chan->dimm = dimm;
@@ -511,6 +496,33 @@ void edac_mc_reset_delay_period(unsigned long value)
+/**
+ * edac_mc_init_labels() - Initialize DIMM labels
+ *
+ * @mci: pointer to the mci structure which DIMM labels need to be initialized
+ *
+ * .. note::
+ * locking model: must be called with the mem_ctls_mutex lock held
+ */
+static void edac_mc_init_labels(struct mem_ctl_info *mci)
+{
+ int n, len, layer;
+ unsigned int idx;
+ char *p;
+
+ for (idx = 0; idx < mci->tot_dimms; idx++) {
+ len = sizeof(mci->dimms[idx]->label);
+ p = mci->dimms[idx]->label;
+
+ n = scnprintf(p, len, "mc#%u", mci->mc_idx);
+ for (layer = 0; layer < mci->n_layers; layer++) {
+ n += scnprintf(p + n, len - n, "%s#%u",
+ edac_layer_name[mci->layers[layer].type],
+ mci->dimms[idx]->location[layer]);
+ }
+ }
+}
+
/* Return 0 on success, 1 on failure.
* Before calling this function, caller must
* assign a unique value to mci->mc_idx.
@@ -638,6 +650,8 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
goto fail0;
}
+ edac_mc_init_labels(mci);
+
if (add_mc_to_global_list(mci))
goto fail0;
--
2.43.0
Currently the EDAC subsystem relies on the low-level device drivers to
select a unique index for each memory controller available in the system.
Here are the already implemented approaches:
1. Fixed zero id. The vast majority of the drivers expect to have a single
memory controller in the system.
2. Calculate it in a platform-specific way (pre-defined devices order,
PCIe-bus address, NUMA node ID + PCIe-function number, etc).
3. Use platform_device->id.
4. Use a custom ACPI/OF property value.
5. Use a locally maintained static MC counter.
Create a generic MC index allocation method which can be utilized in
case 5 (it doesn't imply any strict memory controller order) and which
spares new MC EDAC drivers from re-implementing approaches 3 and 4.
Moreover it will be useful when a platform is equipped with memory
controllers of different types [1] which are probed by different
drivers [2].
[1] Link: https://lore.kernel.org/all/9dc2a947-d2ab-4f00-8ed3-d2499cb6fdfd@BN1BFFO11FD002.protection.gbl/
[2] Link: https://lore.kernel.org/linux-edac/BY5PR12MB4258CB67B70D71F107EC1E9DDB3E9@BY5PR12MB4258.namprd12.prod.outlook.com
The suggested implementation is based on the IDR kernel API and implies
the following semantics:
1. If a particular MC index is specified it will be registered in the
IDR pool unless the specified ID has already been reserved.
2. If the special MC index EDAC_AUTO_MC_NUM is specified the EDAC core
will check whether an "mcID" alias is defined for the device in the device
tree and use the ID from there if it's found.
3. Otherwise the next free index above the highest "mcID" alias (if any)
will be allocated and assigned to the registered memory controller.
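The semantics above can be sketched as a small user-space model. It is an
illustration under assumptions and not the kernel implementation: a fixed
boolean array replaces the IDR pool, MAX_MC is an arbitrary pool size, and
the alias_id/highest_alias arguments stand in for the values the kernel
would obtain from the device tree aliases. All the names below are
hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

#define AUTO_MC_NUM	UINT_MAX	/* mirrors EDAC_AUTO_MC_NUM */
#define MAX_MC		64		/* hypothetical pool size of this model */

static bool mc_used[MAX_MC];

/* Reserve the lowest free index in [min, max); max == 0 means no limit */
static int pool_alloc(int min, int max)
{
	int end = (max > 0 && max < MAX_MC) ? max : MAX_MC;
	int i;

	for (i = min; i < end; i++) {
		if (!mc_used[i]) {
			mc_used[i] = true;
			return i;
		}
	}

	return -1;	/* the requested range is exhausted */
}

/*
 * requested:     explicit MC index or AUTO_MC_NUM
 * alias_id:      "mcID" alias of this device, negative if there is none
 * highest_alias: highest "mcID" alias in the system, negative if none
 *
 * Returns the assigned index or -1 if the request can't be satisfied.
 */
static int mc_alloc_id(unsigned int requested, int alias_id, int highest_alias)
{
	int min, max;

	if (requested == AUTO_MC_NUM) {
		if (alias_id >= 0) {
			/* Case 2: the alias pins the index */
			min = alias_id;
			max = alias_id + 1;
		} else {
			/* Case 3: next free index above all the aliases */
			min = highest_alias >= 0 ? highest_alias + 1 : 0;
			max = 0;
		}
	} else {
		/* Case 1: exactly the requested index or nothing */
		if (requested >= MAX_MC)
			return -1;
		min = (int)requested;
		max = min + 1;
	}

	return pool_alloc(min, max);
}

/* Return an index to the pool */
static void mc_free_id(int idx)
{
	if (idx >= 0 && idx < MAX_MC)
		mc_used[idx] = false;
}
```

Requesting the same explicit index twice fails the second time, while an
automatic request without an alias never lands on an index which a device
tree alias may claim later.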
Signed-off-by: Serge Semin <[email protected]>
---
Note the approach implemented here has been partly ported from the SPI
core driver, which uses IDR to track/allocate the SPI bus numbers.
Link: https://elixir.bootlin.com/linux/latest/source/drivers/spi/spi.c#L2957
---
drivers/edac/edac_mc.c | 89 +++++++++++++++++++++++++++++++++++++++---
drivers/edac/edac_mc.h | 4 ++
2 files changed, 87 insertions(+), 6 deletions(-)
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index c0b36349999f..2144e0615679 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -29,6 +29,9 @@
#include <linux/edac.h>
#include <linux/bitops.h>
#include <linux/uaccess.h>
+#include <linux/idr.h>
+#include <linux/of.h>
+
#include <asm/page.h>
#include "edac_mc.h"
#include "edac_module.h"
@@ -46,6 +49,7 @@ EXPORT_SYMBOL_GPL(edac_op_state);
/* lock to memory controller's control array */
static DEFINE_MUTEX(mem_ctls_mutex);
static LIST_HEAD(mc_devices);
+static DEFINE_IDR(mc_idr);
/*
* Used to lock EDAC MC to just one module, avoiding two drivers e. g.
@@ -494,7 +498,64 @@ void edac_mc_reset_delay_period(unsigned long value)
mutex_unlock(&mem_ctls_mutex);
}
+/**
+ * edac_mc_alloc_id() - Allocate unique Memory Controller identifier
+ *
+ * @mci: pointer to the mci structure to allocate ID for
+ *
+ * Use edac_mc_free_id() to coherently free the MC identifier.
+ *
+ * .. note::
+ * locking model: must be called with the mem_ctls_mutex lock held
+ *
+ * Returns:
+ * 0 on Success, or an error code on failure
+ */
+static int edac_mc_alloc_id(struct mem_ctl_info *mci)
+{
+ struct device_node *np = dev_of_node(mci->pdev);
+ int ret, min, max;
+
+ if (mci->mc_idx == EDAC_AUTO_MC_NUM) {
+ ret = of_alias_get_id(np, "mc");
+ if (ret >= 0) {
+ min = ret;
+ max = ret + 1;
+ } else {
+ min = of_alias_get_highest_id("mc");
+ if (min >= 0)
+ min++;
+ else
+ min = 0;
+
+ max = 0;
+ }
+ } else {
+ min = mci->mc_idx;
+ max = mci->mc_idx + 1;
+ }
+
+ ret = idr_alloc(&mc_idr, mci, min, max, GFP_KERNEL);
+ if (ret < 0)
+ return ret == -ENOSPC ? -EBUSY : ret;
+
+ mci->mc_idx = ret;
+
+ return 0;
+}
+/**
+ * edac_mc_free_id() - Free Memory Controller identifier
+ *
+ * @mci: pointer to the mci structure to free ID from
+ *
+ * .. note::
+ * locking model: must be called with the mem_ctls_mutex lock held
+ */
+static void edac_mc_free_id(struct mem_ctl_info *mci)
+{
+ idr_remove(&mc_idr, mci->mc_idx);
+}
/**
* edac_mc_init_labels() - Initialize DIMM labels
@@ -613,7 +674,8 @@ EXPORT_SYMBOL_GPL(edac_get_owner);
int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
const struct attribute_group **groups)
{
- int ret = -EINVAL;
+ int ret;
+
edac_dbg(0, "\n");
#ifdef CONFIG_EDAC_DEBUG
@@ -650,20 +712,30 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
goto fail0;
}
+ ret = edac_mc_alloc_id(mci);
+ if (ret) {
+ edac_printk(KERN_ERR, EDAC_MC, "failed to allocate MC idx %u\n",
+ mci->mc_idx);
+ goto fail0;
+ }
+
edac_mc_init_labels(mci);
- if (add_mc_to_global_list(mci))
- goto fail0;
+ if (add_mc_to_global_list(mci)) {
+ ret = -EINVAL;
+ goto fail1;
+ }
/* set load time so that error rate can be tracked */
mci->start_time = jiffies;
mci->bus = edac_get_sysfs_subsys();
- if (edac_create_sysfs_mci_device(mci, groups)) {
+ ret = edac_create_sysfs_mci_device(mci, groups);
+ if (ret) {
edac_mc_printk(mci, KERN_WARNING,
"failed to create sysfs device\n");
- goto fail1;
+ goto fail2;
}
if (mci->edac_check) {
@@ -687,9 +759,12 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
mutex_unlock(&mem_ctls_mutex);
return 0;
-fail1:
+fail2:
del_mc_from_global_list(mci);
+fail1:
+ edac_mc_free_id(mci);
+
fail0:
mutex_unlock(&mem_ctls_mutex);
return ret;
@@ -717,6 +792,8 @@ struct mem_ctl_info *edac_mc_del_mc(struct device *dev)
if (del_mc_from_global_list(mci))
edac_mc_owner = NULL;
+ edac_mc_free_id(mci);
+
mutex_unlock(&mem_ctls_mutex);
if (mci->edac_check)
diff --git a/drivers/edac/edac_mc.h b/drivers/edac/edac_mc.h
index 881b00eadf7a..4b6676235b1b 100644
--- a/drivers/edac/edac_mc.h
+++ b/drivers/edac/edac_mc.h
@@ -23,6 +23,7 @@
#define _EDAC_MC_H_
#include <linux/kernel.h>
+#include <linux/limits.h>
#include <linux/types.h>
#include <linux/module.h>
#include <linux/spinlock.h>
@@ -37,6 +38,9 @@
#include <linux/workqueue.h>
#include <linux/edac.h>
+/* Generate MC identifier automatically */
+#define EDAC_AUTO_MC_NUM UINT_MAX
+
#if PAGE_SHIFT < 20
#define PAGES_TO_MiB(pages) ((pages) >> (20 - PAGE_SHIFT))
#define MiB_TO_PAGES(mb) ((mb) << (20 - PAGE_SHIFT))
--
2.43.0
The EDAC drivers locally maintaining a statically defined
memory-controllers counter don't care much about the MC index assigned as
long as it's unique from the EDAC core perspective. Convert these drivers
to use the generic MC index allocation procedure recently added to the
EDAC core.
Signed-off-by: Serge Semin <[email protected]>
---
Changelog v4:
- Initial patch introduction.
---
drivers/edac/dmc520_edac.c | 4 +---
drivers/edac/pasemi_edac.c | 5 +----
drivers/edac/ppc4xx_edac.c | 5 +----
3 files changed, 3 insertions(+), 11 deletions(-)
diff --git a/drivers/edac/dmc520_edac.c b/drivers/edac/dmc520_edac.c
index 4e30b989a1a4..93734a97a67b 100644
--- a/drivers/edac/dmc520_edac.c
+++ b/drivers/edac/dmc520_edac.c
@@ -173,8 +173,6 @@ struct dmc520_edac {
int masks[NUMBER_OF_IRQS];
};
-static int dmc520_mc_idx;
-
static u32 dmc520_read_reg(struct dmc520_edac *pvt, u32 offset)
{
return readl(pvt->reg_base + offset);
@@ -517,7 +515,7 @@ static int dmc520_edac_probe(struct platform_device *pdev)
layers[0].size = dmc520_get_rank_count(reg_base);
layers[0].is_virt_csrow = true;
- mci = edac_mc_alloc(dmc520_mc_idx++, ARRAY_SIZE(layers), layers, sizeof(*pvt));
+ mci = edac_mc_alloc(EDAC_AUTO_MC_NUM, ARRAY_SIZE(layers), layers, sizeof(*pvt));
if (!mci) {
edac_printk(KERN_ERR, EDAC_MOD_NAME,
"Failed to allocate memory for mc instance\n");
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 1a1c3296ccc8..afebfbda1ea0 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -57,8 +57,6 @@
#define PASEMI_EDAC_ERROR_GRAIN 64
static int last_page_in_mmc;
-static int system_mmc_id;
-
static u32 pasemi_edac_get_error_info(struct mem_ctl_info *mci)
{
@@ -203,8 +201,7 @@ static int pasemi_edac_probe(struct pci_dev *pdev,
layers[1].type = EDAC_MC_LAYER_CHANNEL;
layers[1].size = PASEMI_EDAC_NR_CHANS;
layers[1].is_virt_csrow = false;
- mci = edac_mc_alloc(system_mmc_id++, ARRAY_SIZE(layers), layers,
- 0);
+ mci = edac_mc_alloc(EDAC_AUTO_MC_NUM, ARRAY_SIZE(layers), layers, 0);
if (mci == NULL)
return -ENOMEM;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index 1eea3341a916..06d267d40a6a 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -1214,7 +1214,6 @@ static int ppc4xx_edac_probe(struct platform_device *op)
const struct device_node *np = op->dev.of_node;
struct mem_ctl_info *mci = NULL;
struct edac_mc_layer layers[2];
- static int ppc4xx_edac_instance;
/*
* At this point, we only support the controller realized on
@@ -1265,7 +1264,7 @@ static int ppc4xx_edac_probe(struct platform_device *op)
layers[1].type = EDAC_MC_LAYER_CHANNEL;
layers[1].size = ppc4xx_edac_nr_chans;
layers[1].is_virt_csrow = false;
- mci = edac_mc_alloc(ppc4xx_edac_instance, ARRAY_SIZE(layers), layers,
+ mci = edac_mc_alloc(EDAC_AUTO_MC_NUM, ARRAY_SIZE(layers), layers,
sizeof(struct ppc4xx_edac_pdata));
if (mci == NULL) {
ppc4xx_edac_printk(KERN_ERR, "%pOF: "
@@ -1303,8 +1302,6 @@ static int ppc4xx_edac_probe(struct platform_device *op)
goto fail1;
}
- ppc4xx_edac_instance++;
-
return 0;
fail1:
--
2.43.0
The synopsys_edac.c driver currently supports three memory controllers:
1. Synopsys DW uMCTL2 DDRC v3.80 (with the ZynqMP-specific params).
2. Xilinx ZynqMP DDRC based on the Synopsys DW uMCTL2 DDRC v2.40.
3. Xilinx Zynq A05 DDR controller.
While the first two devices are based on the Synopsys DW uMCTL2 IP-cores,
the latter device has absolutely nothing in common with the Synopsys DW
uMCTL2 DDR controllers: its CSRs map is completely different and, unlike
the Synopsys memory controllers, it doesn't support IRQs. Supporting such
different devices in a single driver required an additional level of
abstraction, which will be a serious obstacle for the comprehensive set of
the Synopsys DDR-controller features about to be added (DW uMCTL2 IP-core
parameters auto-detection, common SDRAM<->phys address translation,
multi-ranked memory, ECC scrubber, individual IRQs, DFI alert_n IRQ, and
so on). Moreover the original rationale for keeping these devices support
in a single driver hasn't materialized in all these years: the
synopsys_edac.c driver still supports only a single memory controller
detected on the system.
So in order to make the Synopsys driver ready for the upcoming features,
move the Xilinx Zynq A05 DDRC support into a separate driver: detach the
Zynq-specific callbacks, init/probe/remove methods and move them into the
new zynq_edac.c driver. The resulting driver will look mostly similar to
the code submitted in the initial commit ae9b56e3996d ("EDAC, synps: Add
EDAC support for zynq ddr ecc controller") except for a few fixes added
afterwards.
Note several Zynq-specific macros have been used in the DW uMCTL2
DDRC-specific functions. By mere coincidence these macros have values
suitable for the DW uMCTL2 code, but their names are either partly or
fully unrelated to the Synopsys memory controllers. Since these macros
are now moved to another driver, introduce new Synopsys-specific ones and
fix the places where the macros have been improperly utilized.
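As an illustration of the renamed macros, here is a hedged user-space
sketch of decoding the bus width field from a raw DDR_MSTR_OFST register
value. The mask, shift and DDRCTL_EWDTH_* values are copied from this
patch; the ddr_mstr_bus_width_bits() helper is hypothetical, and the
mapping to a width in bits merely follows the macro names, so it's an
assumption rather than a statement about the hardware.

```c
#include <stdint.h>

/* Values as defined in this patch */
#define DDR_MSTR_BUSWIDTH_MASK		0x3000
#define DDR_MSTR_BUSWIDTH_SHIFT		12
#define DDRCTL_EWDTH_16			2
#define DDRCTL_EWDTH_32			1
#define DDRCTL_EWDTH_64			0

/*
 * Hypothetical helper: extract the bus width field from a raw MSTR value
 * and translate it to the width suggested by the macro names. Returns -1
 * for the encoding which has no matching macro.
 */
static int ddr_mstr_bus_width_bits(uint32_t mstr)
{
	uint32_t width;

	width = (mstr & DDR_MSTR_BUSWIDTH_MASK) >> DDR_MSTR_BUSWIDTH_SHIFT;

	switch (width) {
	case DDRCTL_EWDTH_64:
		return 64;
	case DDRCTL_EWDTH_32:
		return 32;
	case DDRCTL_EWDTH_16:
		return 16;
	default:
		return -1;
	}
}
```

The bits outside of the mask don't affect the result, so the raw register
value can be passed as is.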
Signed-off-by: Serge Semin <[email protected]>
---
Changelog v3:
- Drop the no longer used "priv" pointer from the zynq_mc_init() function.
(@tbot)
Changelog v5:
- Rename handle_error and check_errors to zynq_handle_error and
zynq_check_errors in the kdoc comments (@tbot)
---
MAINTAINERS | 1 +
drivers/edac/Kconfig | 9 +-
drivers/edac/Makefile | 1 +
drivers/edac/synopsys_edac.c | 239 ++---------------
drivers/edac/zynq_edac.c | 501 +++++++++++++++++++++++++++++++++++
5 files changed, 531 insertions(+), 220 deletions(-)
create mode 100644 drivers/edac/zynq_edac.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 960512bec428..baca1e55cf49 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3020,6 +3020,7 @@ F: arch/arm/mach-zynq/
F: drivers/clocksource/timer-cadence-ttc.c
F: drivers/cpuidle/cpuidle-zynq.c
F: drivers/edac/synopsys_edac.c
+F: drivers/edac/zynq_edac.c
F: drivers/i2c/busses/i2c-cadence.c
F: drivers/i2c/busses/i2c-xiic.c
F: drivers/mmc/host/sdhci-of-arasan.c
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 5a7f3fabee22..6e411376d2bb 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -487,7 +487,7 @@ config EDAC_ARMADA_XP
config EDAC_SYNOPSYS
tristate "Synopsys DDR Memory Controller"
- depends on ARCH_ZYNQ || ARCH_ZYNQMP || ARCH_INTEL_SOCFPGA || ARCH_MXC
+ depends on ARCH_ZYNQMP || ARCH_INTEL_SOCFPGA || ARCH_MXC
help
Support for error detection and correction on the Synopsys DDR
memory controller.
@@ -542,6 +542,13 @@ config EDAC_DMC520
Support for error detection and correction on the
SoCs with ARM DMC-520 DRAM controller.
+config EDAC_ZYNQ
+ tristate "Xilinx Zynq A05 DDR Memory Controller"
+ depends on ARCH_ZYNQ || COMPILE_TEST
+ help
+ Support for error detection and correction on the Xilinx Zynq A05
+ DDR memory controller.
+
config EDAC_ZYNQMP
tristate "Xilinx ZynqMP OCM Controller"
depends on ARCH_ZYNQMP || COMPILE_TEST
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 9c09893695b7..2829036f8468 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -85,5 +85,6 @@ obj-$(CONFIG_EDAC_ASPEED) += aspeed_edac.o
obj-$(CONFIG_EDAC_BLUEFIELD) += bluefield_edac.o
obj-$(CONFIG_EDAC_DMC520) += dmc520_edac.o
obj-$(CONFIG_EDAC_NPCM) += npcm_edac.o
+obj-$(CONFIG_EDAC_ZYNQ) += zynq_edac.o
obj-$(CONFIG_EDAC_ZYNQMP) += zynqmp_edac.o
obj-$(CONFIG_EDAC_VERSAL) += versal_edac.o
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index dfe1abe7c86c..6ea1eaaa7d6f 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -29,68 +29,16 @@
#define SYNPS_EDAC_MOD_STRING "synps_edac"
#define SYNPS_EDAC_MOD_VER "1"
-/* Synopsys DDR memory controller registers that are relevant to ECC */
-#define CTRL_OFST 0x0
-#define T_ZQ_OFST 0xA4
-
-/* ECC control register */
-#define ECC_CTRL_OFST 0xC4
-/* ECC log register */
-#define CE_LOG_OFST 0xC8
-/* ECC address register */
-#define CE_ADDR_OFST 0xCC
-/* ECC data[31:0] register */
-#define CE_DATA_31_0_OFST 0xD0
-
-/* Uncorrectable error info registers */
-#define UE_LOG_OFST 0xDC
-#define UE_ADDR_OFST 0xE0
-#define UE_DATA_31_0_OFST 0xE4
-
-#define STAT_OFST 0xF0
-#define SCRUB_OFST 0xF4
-
-/* Control register bit field definitions */
-#define CTRL_BW_MASK 0xC
-#define CTRL_BW_SHIFT 2
-
-#define DDRCTL_WDTH_16 1
-#define DDRCTL_WDTH_32 0
-
-/* ZQ register bit field definitions */
-#define T_ZQ_DDRMODE_MASK 0x2
-
-/* ECC control register bit field definitions */
-#define ECC_CTRL_CLR_CE_ERR 0x2
-#define ECC_CTRL_CLR_UE_ERR 0x1
-
-/* ECC correctable/uncorrectable error log register definitions */
-#define LOG_VALID 0x1
-#define CE_LOG_BITPOS_MASK 0xFE
-#define CE_LOG_BITPOS_SHIFT 1
-
-/* ECC correctable/uncorrectable error address register definitions */
-#define ADDR_COL_MASK 0xFFF
-#define ADDR_ROW_MASK 0xFFFF000
-#define ADDR_ROW_SHIFT 12
-#define ADDR_BANK_MASK 0x70000000
-#define ADDR_BANK_SHIFT 28
-
-/* ECC statistic register definitions */
-#define STAT_UECNT_MASK 0xFF
-#define STAT_CECNT_MASK 0xFF00
-#define STAT_CECNT_SHIFT 8
-
-/* ECC scrub register definitions */
-#define SCRUB_MODE_MASK 0x7
-#define SCRUB_MODE_SECDED 0x4
-
/* DDR ECC Quirks */
#define DDR_ECC_INTR_SUPPORT BIT(0)
#define DDR_ECC_DATA_POISON_SUPPORT BIT(1)
#define SYNPS_ZYNQMP_IRQ_REGS BIT(2)
-/* ZynqMP Enhanced DDR memory controller registers that are relevant to ECC */
+/* Synopsys DDR memory controller registers that are relevant to ECC */
+
+/* DDRC Master 0 Register */
+#define DDR_MSTR_OFST 0x0
+
/* ECC Configuration Registers */
#define ECC_CFG0_OFST 0x70
#define ECC_CFG1_OFST 0x74
@@ -134,16 +82,22 @@
#define ECC_ADDRMAP0_OFFSET 0x200
/* Control register bitfield definitions */
-#define ECC_CTRL_BUSWIDTH_MASK 0x3000
-#define ECC_CTRL_BUSWIDTH_SHIFT 12
+#define ECC_CTRL_CLR_CE_ERR BIT(0)
+#define ECC_CTRL_CLR_UE_ERR BIT(1)
#define ECC_CTRL_CLR_CE_ERRCNT BIT(2)
#define ECC_CTRL_CLR_UE_ERRCNT BIT(3)
/* DDR Control Register width definitions */
+#define DDR_MSTR_BUSWIDTH_MASK 0x3000
+#define DDR_MSTR_BUSWIDTH_SHIFT 12
#define DDRCTL_EWDTH_16 2
#define DDRCTL_EWDTH_32 1
#define DDRCTL_EWDTH_64 0
+/* ECC CFG0 register definitions */
+#define ECC_CFG0_MODE_MASK 0x7
+#define ECC_CFG0_MODE_SECDED 0x4
+
/* ECC status register definitions */
#define ECC_STAT_UECNT_MASK 0xF0000
#define ECC_STAT_UECNT_SHIFT 16
@@ -341,61 +295,6 @@ struct synps_platform_data {
int quirks;
};
-/**
- * zynq_get_error_info - Get the current ECC error info.
- * @priv: DDR memory controller private instance data.
- *
- * Return: one if there is no error, otherwise zero.
- */
-static int zynq_get_error_info(struct synps_edac_priv *priv)
-{
- struct synps_ecc_status *p;
- u32 regval, clearval = 0;
- void __iomem *base;
-
- base = priv->baseaddr;
- p = &priv->stat;
-
- regval = readl(base + STAT_OFST);
- if (!regval)
- return 1;
-
- p->ce_cnt = (regval & STAT_CECNT_MASK) >> STAT_CECNT_SHIFT;
- p->ue_cnt = regval & STAT_UECNT_MASK;
-
- regval = readl(base + CE_LOG_OFST);
- if (!(p->ce_cnt && (regval & LOG_VALID)))
- goto ue_err;
-
- p->ceinfo.bitpos = (regval & CE_LOG_BITPOS_MASK) >> CE_LOG_BITPOS_SHIFT;
- regval = readl(base + CE_ADDR_OFST);
- p->ceinfo.row = (regval & ADDR_ROW_MASK) >> ADDR_ROW_SHIFT;
- p->ceinfo.col = regval & ADDR_COL_MASK;
- p->ceinfo.bank = (regval & ADDR_BANK_MASK) >> ADDR_BANK_SHIFT;
- p->ceinfo.data = readl(base + CE_DATA_31_0_OFST);
- edac_dbg(3, "CE bit position: %d data: %d\n", p->ceinfo.bitpos,
- p->ceinfo.data);
- clearval = ECC_CTRL_CLR_CE_ERR;
-
-ue_err:
- regval = readl(base + UE_LOG_OFST);
- if (!(p->ue_cnt && (regval & LOG_VALID)))
- goto out;
-
- regval = readl(base + UE_ADDR_OFST);
- p->ueinfo.row = (regval & ADDR_ROW_MASK) >> ADDR_ROW_SHIFT;
- p->ueinfo.col = regval & ADDR_COL_MASK;
- p->ueinfo.bank = (regval & ADDR_BANK_MASK) >> ADDR_BANK_SHIFT;
- p->ueinfo.data = readl(base + UE_DATA_31_0_OFST);
- clearval |= ECC_CTRL_CLR_UE_ERR;
-
-out:
- writel(clearval, base + ECC_CTRL_OFST);
- writel(0x0, base + ECC_CTRL_OFST);
-
- return 0;
-}
-
/**
* zynqmp_get_error_info - Get the current ECC error info.
* @priv: DDR memory controller private instance data.
@@ -614,37 +513,6 @@ static void check_errors(struct mem_ctl_info *mci)
handle_error(mci, &priv->stat);
}
-/**
- * zynq_get_dtype - Return the controller memory width.
- * @base: DDR memory controller base address.
- *
- * Get the EDAC device type width appropriate for the current controller
- * configuration.
- *
- * Return: a device type width enumeration.
- */
-static enum dev_type zynq_get_dtype(const void __iomem *base)
-{
- enum dev_type dt;
- u32 width;
-
- width = readl(base + CTRL_OFST);
- width = (width & CTRL_BW_MASK) >> CTRL_BW_SHIFT;
-
- switch (width) {
- case DDRCTL_WDTH_16:
- dt = DEV_X2;
- break;
- case DDRCTL_WDTH_32:
- dt = DEV_X4;
- break;
- default:
- dt = DEV_UNKNOWN;
- }
-
- return dt;
-}
-
/**
* zynqmp_get_dtype - Return the controller memory width.
* @base: DDR memory controller base address.
@@ -658,7 +526,7 @@ static enum dev_type zynqmp_get_dtype(const void __iomem *base)
{
u32 regval;
- regval = readl(base + CTRL_OFST);
+ regval = readl(base + DDR_MSTR_OFST);
if (!(regval & MEM_TYPE_DDR4))
return DEV_UNKNOWN;
@@ -677,30 +545,6 @@ static enum dev_type zynqmp_get_dtype(const void __iomem *base)
return DEV_UNKNOWN;
}
-/**
- * zynq_get_ecc_state - Return the controller ECC enable/disable status.
- * @base: DDR memory controller base address.
- *
- * Get the ECC enable/disable status of the controller.
- *
- * Return: true if enabled, otherwise false.
- */
-static bool zynq_get_ecc_state(void __iomem *base)
-{
- enum dev_type dt;
- u32 ecctype;
-
- dt = zynq_get_dtype(base);
- if (dt == DEV_UNKNOWN)
- return false;
-
- ecctype = readl(base + SCRUB_OFST) & SCRUB_MODE_MASK;
- if ((ecctype == SCRUB_MODE_SECDED) && (dt == DEV_X2))
- return true;
-
- return false;
-}
-
/**
* zynqmp_get_ecc_state - Return the controller ECC enable/disable status.
* @base: DDR memory controller base address.
@@ -713,9 +557,9 @@ static bool zynqmp_get_ecc_state(void __iomem *base)
{
u32 regval;
- regval = readl(base + ECC_CFG0_OFST) & SCRUB_MODE_MASK;
+ regval = readl(base + ECC_CFG0_OFST) & ECC_CFG0_MODE_MASK;
- return (regval == SCRUB_MODE_SECDED);
+ return (regval == ECC_CFG0_MODE_SECDED);
}
/**
@@ -732,30 +576,6 @@ static u32 get_memsize(void)
return inf.totalram * inf.mem_unit;
}
-/**
- * zynq_get_mtype - Return the controller memory type.
- * @base: Synopsys ECC status structure.
- *
- * Get the EDAC memory type appropriate for the current controller
- * configuration.
- *
- * Return: a memory type enumeration.
- */
-static enum mem_type zynq_get_mtype(const void __iomem *base)
-{
- enum mem_type mt;
- u32 memtype;
-
- memtype = readl(base + T_ZQ_OFST);
-
- if (memtype & T_ZQ_DDRMODE_MASK)
- mt = MEM_DDR3;
- else
- mt = MEM_DDR2;
-
- return mt;
-}
-
/**
* zynqmp_get_mtype - Returns controller memory type.
* @base: Synopsys ECC status structure.
@@ -770,7 +590,7 @@ static enum mem_type zynqmp_get_mtype(const void __iomem *base)
enum mem_type mt;
u32 memtype;
- memtype = readl(base + CTRL_OFST);
+ memtype = readl(base + DDR_MSTR_OFST);
if ((memtype & MEM_TYPE_DDR3) || (memtype & MEM_TYPE_LPDDR3))
mt = MEM_DDR3;
@@ -882,14 +702,6 @@ static int setup_irq(struct mem_ctl_info *mci,
return 0;
}
-static const struct synps_platform_data zynq_edac_def = {
- .get_error_info = zynq_get_error_info,
- .get_mtype = zynq_get_mtype,
- .get_dtype = zynq_get_dtype,
- .get_ecc_state = zynq_get_ecc_state,
- .quirks = 0,
-};
-
static const struct synps_platform_data zynqmp_edac_def = {
.get_error_info = zynqmp_get_error_info,
.get_mtype = zynqmp_get_mtype,
@@ -916,10 +728,6 @@ static const struct synps_platform_data synopsys_edac_def = {
static const struct of_device_id synps_edac_match[] = {
- {
- .compatible = "xlnx,zynq-ddrc-a05",
- .data = (void *)&zynq_edac_def
- },
{
.compatible = "xlnx,zynqmp-ddrc-2.40a",
.data = (void *)&zynqmp_edac_def
@@ -1141,8 +949,8 @@ static void setup_column_address_map(struct synps_edac_priv *priv, u32 *addrmap)
u32 width, memtype;
int index;
- memtype = readl(priv->baseaddr + CTRL_OFST);
- width = (memtype & ECC_CTRL_BUSWIDTH_MASK) >> ECC_CTRL_BUSWIDTH_SHIFT;
+ memtype = readl(priv->baseaddr + DDR_MSTR_OFST);
+ width = (memtype & DDR_MSTR_BUSWIDTH_MASK) >> DDR_MSTR_BUSWIDTH_SHIFT;
priv->col_shift[0] = 0;
priv->col_shift[1] = 1;
@@ -1337,7 +1145,7 @@ static int mc_probe(struct platform_device *pdev)
layers[1].size = SYNPS_EDAC_NR_CHANS;
layers[1].is_virt_csrow = false;
- mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+ mci = edac_mc_alloc(EDAC_AUTO_MC_NUM, ARRAY_SIZE(layers), layers,
sizeof(struct synps_edac_priv));
if (!mci) {
edac_printk(KERN_ERR, EDAC_MC,
@@ -1379,13 +1187,6 @@ static int mc_probe(struct platform_device *pdev)
setup_address_map(priv);
#endif
- /*
- * Start capturing the correctable and uncorrectable errors. A write of
- * 0 starts the counters.
- */
- if (!(priv->p_data->quirks & DDR_ECC_INTR_SUPPORT))
- writel(0x0, baseaddr + ECC_CTRL_OFST);
-
return rc;
free_edac_mc:
diff --git a/drivers/edac/zynq_edac.c b/drivers/edac/zynq_edac.c
new file mode 100644
index 000000000000..be26934a9c20
--- /dev/null
+++ b/drivers/edac/zynq_edac.c
@@ -0,0 +1,501 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Zynq DDR ECC Driver
+ * This driver is based on ppc4xx_edac.c drivers
+ *
+ * Copyright (C) 2012 - 2014 Xilinx, Inc.
+ */
+
+#include <linux/edac.h>
+#include <linux/interrupt.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+
+#include "edac_module.h"
+
+/* Number of cs_rows needed per memory controller */
+#define ZYNQ_EDAC_NR_CSROWS 1
+
+/* Number of channels per memory controller */
+#define ZYNQ_EDAC_NR_CHANS 1
+
+/* Granularity of reported error in bytes */
+#define ZYNQ_EDAC_ERR_GRAIN 1
+
+#define ZYNQ_EDAC_MSG_SIZE 256
+
+#define ZYNQ_EDAC_MOD_STRING "zynq_edac"
+#define ZYNQ_EDAC_MOD_VER "1"
+
+/* Zynq DDR memory controller ECC registers */
+#define ZYNQ_CTRL_OFST 0x0
+#define ZYNQ_T_ZQ_OFST 0xA4
+
+/* ECC control register */
+#define ZYNQ_ECC_CTRL_OFST 0xC4
+/* ECC log register */
+#define ZYNQ_CE_LOG_OFST 0xC8
+/* ECC address register */
+#define ZYNQ_CE_ADDR_OFST 0xCC
+/* ECC data[31:0] register */
+#define ZYNQ_CE_DATA_31_0_OFST 0xD0
+
+/* Uncorrectable error info registers */
+#define ZYNQ_UE_LOG_OFST 0xDC
+#define ZYNQ_UE_ADDR_OFST 0xE0
+#define ZYNQ_UE_DATA_31_0_OFST 0xE4
+
+#define ZYNQ_STAT_OFST 0xF0
+#define ZYNQ_SCRUB_OFST 0xF4
+
+/* Control register bit field definitions */
+#define ZYNQ_CTRL_BW_MASK 0xC
+#define ZYNQ_CTRL_BW_SHIFT 2
+
+#define ZYNQ_DDRCTL_WDTH_16 1
+#define ZYNQ_DDRCTL_WDTH_32 0
+
+/* ZQ register bit field definitions */
+#define ZYNQ_T_ZQ_DDRMODE_MASK 0x2
+
+/* ECC control register bit field definitions */
+#define ZYNQ_ECC_CTRL_CLR_CE_ERR 0x2
+#define ZYNQ_ECC_CTRL_CLR_UE_ERR 0x1
+
+/* ECC correctable/uncorrectable error log register definitions */
+#define ZYNQ_LOG_VALID 0x1
+#define ZYNQ_CE_LOG_BITPOS_MASK 0xFE
+#define ZYNQ_CE_LOG_BITPOS_SHIFT 1
+
+/* ECC correctable/uncorrectable error address register definitions */
+#define ZYNQ_ADDR_COL_MASK 0xFFF
+#define ZYNQ_ADDR_ROW_MASK 0xFFFF000
+#define ZYNQ_ADDR_ROW_SHIFT 12
+#define ZYNQ_ADDR_BANK_MASK 0x70000000
+#define ZYNQ_ADDR_BANK_SHIFT 28
+
+/* ECC statistic register definitions */
+#define ZYNQ_STAT_UECNT_MASK 0xFF
+#define ZYNQ_STAT_CECNT_MASK 0xFF00
+#define ZYNQ_STAT_CECNT_SHIFT 8
+
+/* ECC scrub register definitions */
+#define ZYNQ_SCRUB_MODE_MASK 0x7
+#define ZYNQ_SCRUB_MODE_SECDED 0x4
+
+/**
+ * struct zynq_ecc_error_info - ECC error log information.
+ * @row: Row number.
+ * @col: Column number.
+ * @bank: Bank number.
+ * @bitpos: Bit position.
+ * @data: Data causing the error.
+ */
+struct zynq_ecc_error_info {
+ u32 row;
+ u32 col;
+ u32 bank;
+ u32 bitpos;
+ u32 data;
+};
+
+/**
+ * struct zynq_ecc_status - ECC status information to report.
+ * @ce_cnt: Correctable error count.
+ * @ue_cnt: Uncorrectable error count.
+ * @ceinfo: Correctable error log information.
+ * @ueinfo: Uncorrectable error log information.
+ */
+struct zynq_ecc_status {
+ u32 ce_cnt;
+ u32 ue_cnt;
+ struct zynq_ecc_error_info ceinfo;
+ struct zynq_ecc_error_info ueinfo;
+};
+
+/**
+ * struct zynq_edac_priv - DDR memory controller private instance data.
+ * @baseaddr: Base address of the DDR controller.
+ * @message: Buffer for framing the event specific info.
+ * @stat: ECC status information.
+ */
+struct zynq_edac_priv {
+ void __iomem *baseaddr;
+ char message[ZYNQ_EDAC_MSG_SIZE];
+ struct zynq_ecc_status stat;
+};
+
+/**
+ * zynq_get_error_info - Get the current ECC error info.
+ * @priv: DDR memory controller private instance data.
+ *
+ * Return: one if there is no error, otherwise zero.
+ */
+static int zynq_get_error_info(struct zynq_edac_priv *priv)
+{
+ struct zynq_ecc_status *p;
+ u32 regval, clearval = 0;
+ void __iomem *base;
+
+ base = priv->baseaddr;
+ p = &priv->stat;
+
+ regval = readl(base + ZYNQ_STAT_OFST);
+ if (!regval)
+ return 1;
+
+ p->ce_cnt = (regval & ZYNQ_STAT_CECNT_MASK) >> ZYNQ_STAT_CECNT_SHIFT;
+ p->ue_cnt = regval & ZYNQ_STAT_UECNT_MASK;
+
+ regval = readl(base + ZYNQ_CE_LOG_OFST);
+ if (!(p->ce_cnt && (regval & ZYNQ_LOG_VALID)))
+ goto ue_err;
+
+ p->ceinfo.bitpos = (regval & ZYNQ_CE_LOG_BITPOS_MASK) >> ZYNQ_CE_LOG_BITPOS_SHIFT;
+ regval = readl(base + ZYNQ_CE_ADDR_OFST);
+ p->ceinfo.row = (regval & ZYNQ_ADDR_ROW_MASK) >> ZYNQ_ADDR_ROW_SHIFT;
+ p->ceinfo.col = regval & ZYNQ_ADDR_COL_MASK;
+ p->ceinfo.bank = (regval & ZYNQ_ADDR_BANK_MASK) >> ZYNQ_ADDR_BANK_SHIFT;
+ p->ceinfo.data = readl(base + ZYNQ_CE_DATA_31_0_OFST);
+ edac_dbg(3, "CE bit position: %d data: %d\n", p->ceinfo.bitpos,
+ p->ceinfo.data);
+ clearval = ZYNQ_ECC_CTRL_CLR_CE_ERR;
+
+ue_err:
+ regval = readl(base + ZYNQ_UE_LOG_OFST);
+ if (!(p->ue_cnt && (regval & ZYNQ_LOG_VALID)))
+ goto out;
+
+ regval = readl(base + ZYNQ_UE_ADDR_OFST);
+ p->ueinfo.row = (regval & ZYNQ_ADDR_ROW_MASK) >> ZYNQ_ADDR_ROW_SHIFT;
+ p->ueinfo.col = regval & ZYNQ_ADDR_COL_MASK;
+ p->ueinfo.bank = (regval & ZYNQ_ADDR_BANK_MASK) >> ZYNQ_ADDR_BANK_SHIFT;
+ p->ueinfo.data = readl(base + ZYNQ_UE_DATA_31_0_OFST);
+ clearval |= ZYNQ_ECC_CTRL_CLR_UE_ERR;
+
+out:
+ writel(clearval, base + ZYNQ_ECC_CTRL_OFST);
+ writel(0x0, base + ZYNQ_ECC_CTRL_OFST);
+
+ return 0;
+}
+
+/**
+ * zynq_handle_error - Handle Correctable and Uncorrectable errors.
+ * @mci: EDAC memory controller instance.
+ * @p: Zynq ECC status structure.
+ *
+ * Handles ECC correctable and uncorrectable errors.
+ */
+static void zynq_handle_error(struct mem_ctl_info *mci, struct zynq_ecc_status *p)
+{
+ struct zynq_edac_priv *priv = mci->pvt_info;
+ struct zynq_ecc_error_info *pinf;
+
+ if (p->ce_cnt) {
+ pinf = &p->ceinfo;
+
+ snprintf(priv->message, ZYNQ_EDAC_MSG_SIZE,
+ "Row %d Bank %d Col %d Bit %d Data 0x%08x",
+ pinf->row, pinf->bank, pinf->col,
+ pinf->bitpos, pinf->data);
+
+ edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+ p->ce_cnt, 0, 0, 0, 0, 0, -1,
+ priv->message, "");
+ }
+
+ if (p->ue_cnt) {
+ pinf = &p->ueinfo;
+
+ snprintf(priv->message, ZYNQ_EDAC_MSG_SIZE,
+ "Row %d Bank %d Col %d",
+ pinf->row, pinf->bank, pinf->col);
+
+ edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+ p->ue_cnt, 0, 0, 0, 0, 0, -1,
+ priv->message, "");
+ }
+
+ memset(p, 0, sizeof(*p));
+}
+
+/**
+ * zynq_check_errors - Check controller for ECC errors.
+ * @mci: EDAC memory controller instance.
+ *
+ * Check and post ECC errors. Called by the polling thread.
+ */
+static void zynq_check_errors(struct mem_ctl_info *mci)
+{
+ struct zynq_edac_priv *priv = mci->pvt_info;
+ int status;
+
+ status = zynq_get_error_info(priv);
+ if (status)
+ return;
+
+ zynq_handle_error(mci, &priv->stat);
+}
+
+/**
+ * zynq_get_dtype - Return the controller memory width.
+ * @base: DDR memory controller base address.
+ *
+ * Get the EDAC device type width appropriate for the current controller
+ * configuration.
+ *
+ * Return: a device type width enumeration.
+ */
+static enum dev_type zynq_get_dtype(const void __iomem *base)
+{
+ enum dev_type dt;
+ u32 width;
+
+ width = readl(base + ZYNQ_CTRL_OFST);
+ width = (width & ZYNQ_CTRL_BW_MASK) >> ZYNQ_CTRL_BW_SHIFT;
+
+ switch (width) {
+ case ZYNQ_DDRCTL_WDTH_16:
+ dt = DEV_X2;
+ break;
+ case ZYNQ_DDRCTL_WDTH_32:
+ dt = DEV_X4;
+ break;
+ default:
+ dt = DEV_UNKNOWN;
+ }
+
+ return dt;
+}
+
+/**
+ * zynq_get_ecc_state - Return the controller ECC enable/disable status.
+ * @base: DDR memory controller base address.
+ *
+ * Get the ECC enable/disable status of the controller.
+ *
+ * Return: true if enabled, otherwise false.
+ */
+static bool zynq_get_ecc_state(void __iomem *base)
+{
+ enum dev_type dt;
+ u32 ecctype;
+
+ dt = zynq_get_dtype(base);
+ if (dt == DEV_UNKNOWN)
+ return false;
+
+ ecctype = readl(base + ZYNQ_SCRUB_OFST) & ZYNQ_SCRUB_MODE_MASK;
+ if ((ecctype == ZYNQ_SCRUB_MODE_SECDED) && (dt == DEV_X2))
+ return true;
+
+ return false;
+}
+
+/**
+ * zynq_get_memsize - Read the size of the attached memory device.
+ *
+ * Return: the memory size in bytes.
+ */
+static u32 zynq_get_memsize(void)
+{
+ struct sysinfo inf;
+
+ si_meminfo(&inf);
+
+ return inf.totalram * inf.mem_unit;
+}
+
+/**
+ * zynq_get_mtype - Return the controller memory type.
+ * @base: DDR memory controller base address.
+ *
+ * Get the EDAC memory type appropriate for the current controller
+ * configuration.
+ *
+ * Return: a memory type enumeration.
+ */
+static enum mem_type zynq_get_mtype(const void __iomem *base)
+{
+ enum mem_type mt;
+ u32 memtype;
+
+ memtype = readl(base + ZYNQ_T_ZQ_OFST);
+
+ if (memtype & ZYNQ_T_ZQ_DDRMODE_MASK)
+ mt = MEM_DDR3;
+ else
+ mt = MEM_DDR2;
+
+ return mt;
+}
+
+/**
+ * zynq_init_csrows - Initialize the csrow data.
+ * @mci: EDAC memory controller instance.
+ *
+ * Initialize the chip select rows associated with the EDAC memory
+ * controller instance.
+ */
+static void zynq_init_csrows(struct mem_ctl_info *mci)
+{
+ struct zynq_edac_priv *priv = mci->pvt_info;
+ struct csrow_info *csi;
+ struct dimm_info *dimm;
+ u32 size, row;
+ int j;
+
+ for (row = 0; row < mci->nr_csrows; row++) {
+ csi = mci->csrows[row];
+ size = zynq_get_memsize();
+
+ for (j = 0; j < csi->nr_channels; j++) {
+ dimm = csi->channels[j]->dimm;
+ dimm->edac_mode = EDAC_SECDED;
+ dimm->mtype = zynq_get_mtype(priv->baseaddr);
+ dimm->nr_pages = (size >> PAGE_SHIFT) / csi->nr_channels;
+ dimm->grain = ZYNQ_EDAC_ERR_GRAIN;
+ dimm->dtype = zynq_get_dtype(priv->baseaddr);
+ }
+ }
+}
+
+/**
+ * zynq_mc_init - Initialize one driver instance.
+ * @mci: EDAC memory controller instance.
+ * @pdev: platform device.
+ *
+ * Perform initialization of the EDAC memory controller instance and
+ * related driver-private data associated with the memory controller the
+ * instance is bound to.
+ */
+static void zynq_mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
+{
+ mci->pdev = &pdev->dev;
+ platform_set_drvdata(pdev, mci);
+
+ /* Initialize controller capabilities and configuration */
+ mci->mtype_cap = MEM_FLAG_DDR3 | MEM_FLAG_DDR2;
+ mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
+ mci->scrub_cap = SCRUB_FLAG_HW_SRC;
+ mci->scrub_mode = SCRUB_NONE;
+
+ mci->edac_cap = EDAC_FLAG_SECDED;
+ mci->ctl_name = "zynq_ddr_controller";
+ mci->dev_name = ZYNQ_EDAC_MOD_STRING;
+ mci->mod_name = ZYNQ_EDAC_MOD_VER;
+
+ edac_op_state = EDAC_OPSTATE_POLL;
+ mci->edac_check = zynq_check_errors;
+
+ mci->ctl_page_to_phys = NULL;
+
+ zynq_init_csrows(mci);
+}
+
+/**
+ * zynq_mc_probe - Check controller and bind driver.
+ * @pdev: platform device.
+ *
+ * Probe a specific controller instance for binding with the driver.
+ *
+ * Return: 0 if the controller instance was successfully bound to the
+ * driver; otherwise, < 0 on error.
+ */
+static int zynq_mc_probe(struct platform_device *pdev)
+{
+ struct edac_mc_layer layers[2];
+ struct zynq_edac_priv *priv;
+ struct mem_ctl_info *mci;
+ void __iomem *baseaddr;
+ int rc;
+
+ baseaddr = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(baseaddr))
+ return PTR_ERR(baseaddr);
+
+ if (!zynq_get_ecc_state(baseaddr)) {
+ edac_printk(KERN_INFO, EDAC_MC, "ECC not enabled\n");
+ return -ENXIO;
+ }
+
+ layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+ layers[0].size = ZYNQ_EDAC_NR_CSROWS;
+ layers[0].is_virt_csrow = true;
+ layers[1].type = EDAC_MC_LAYER_CHANNEL;
+ layers[1].size = ZYNQ_EDAC_NR_CHANS;
+ layers[1].is_virt_csrow = false;
+
+ mci = edac_mc_alloc(EDAC_AUTO_MC_NUM, ARRAY_SIZE(layers), layers,
+ sizeof(struct zynq_edac_priv));
+ if (!mci) {
+ edac_printk(KERN_ERR, EDAC_MC,
+ "Failed memory allocation for mc instance\n");
+ return -ENOMEM;
+ }
+
+ priv = mci->pvt_info;
+ priv->baseaddr = baseaddr;
+
+ zynq_mc_init(mci, pdev);
+
+ rc = edac_mc_add_mc(mci);
+ if (rc) {
+ edac_printk(KERN_ERR, EDAC_MC,
+ "Failed to register with EDAC core\n");
+ goto free_edac_mc;
+ }
+
+ /*
+ * Start capturing the correctable and uncorrectable errors. A write of
+ * 0 starts the counters.
+ */
+ writel(0x0, baseaddr + ZYNQ_ECC_CTRL_OFST);
+
+ return 0;
+
+free_edac_mc:
+ edac_mc_free(mci);
+
+ return rc;
+}
+
+/**
+ * zynq_mc_remove - Unbind driver from controller.
+ * @pdev: Platform device.
+ *
+ * Return: Unconditionally 0
+ */
+static int zynq_mc_remove(struct platform_device *pdev)
+{
+ struct mem_ctl_info *mci = platform_get_drvdata(pdev);
+
+ edac_mc_del_mc(&pdev->dev);
+ edac_mc_free(mci);
+
+ return 0;
+}
+
+static const struct of_device_id zynq_edac_match[] = {
+ { .compatible = "xlnx,zynq-ddrc-a05" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, zynq_edac_match);
+
+static struct platform_driver zynq_edac_mc_driver = {
+ .driver = {
+ .name = "zynq-edac",
+ .of_match_table = zynq_edac_match,
+ },
+ .probe = zynq_mc_probe,
+ .remove = zynq_mc_remove,
+};
+module_platform_driver(zynq_edac_mc_driver);
+
+MODULE_AUTHOR("Xilinx Inc");
+MODULE_DESCRIPTION("Zynq DDR ECC driver");
+MODULE_LICENSE("GPL v2");
--
2.43.0
The driver now works with the Synopsys DW uMCTL2 DDRC IP-core-based
devices only (the Xilinx Zynq A05 DDRC support has been moved to a separate
driver). All the currently available IP-core revisions have almost the
same ECC CSRs map and the same main part of the DDR-config CSRs map. Thus
there is no point in keeping the no longer needed internal abstraction
layer, namely the callbacks responsible for getting the ECC errors info,
the memory and device types and the ECC state. All of that data can be
retrieved in the same way from all the Synopsys DW uMCTL2 DDR controller
versions. Similarly there is no longer any need for the
DDR_ECC_INTR_SUPPORT and DDR_ECC_DATA_POISON_SUPPORT quirk flags since DW
uMCTL2 always supports IRQs and data poisoning (as long as ECC is
supported). Drop that infrastructure for good then.
While at it move the module device table to the bottom of the file, right
above the platform driver descriptor definition.
Signed-off-by: Serge Semin <[email protected]>
---
Changelog v2:
- Drop the no longer used "priv" pointer from the mc_init() function.
(@tbot)
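
Note for reviewers: with the callbacks gone, the only per-device data left
is the quirks word. Below is a minimal userspace sketch of how such a
lookup behaves. It is an illustration only, not the driver code:
struct match_entry, lookup_platform_data() and uses_zynqmp_irq_regs() are
hypothetical stand-ins for struct of_device_id and the OF match-data
helpers, and BIT() is redefined locally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the kernel BIT() macro. */
#define BIT(n)			(UINT32_C(1) << (n))
#define SYNPS_ZYNQMP_IRQ_REGS	BIT(0)

struct synps_platform_data {
	uint32_t quirks;
};

/* Hypothetical stand-in for struct of_device_id. */
struct match_entry {
	const char *compatible;
	const struct synps_platform_data *data;
};

static const struct synps_platform_data zynqmp_edac_def = {
	.quirks = SYNPS_ZYNQMP_IRQ_REGS,
};

static const struct synps_platform_data synopsys_edac_def = {
	.quirks = 0,
};

static const struct match_entry match_table[] = {
	{ "xlnx,zynqmp-ddrc-2.40a", &zynqmp_edac_def },
	{ "snps,ddrc-3.80a", &synopsys_edac_def },
	{ NULL, NULL },
};

/* Walk the table until the sentinel; return NULL if nothing matches. */
static const struct synps_platform_data *
lookup_platform_data(const char *compatible)
{
	const struct match_entry *e;

	for (e = match_table; e->compatible; e++) {
		if (!strcmp(e->compatible, compatible))
			return e->data;
	}

	return NULL;
}

/* The single remaining quirk selects the ZynqMP QOS IRQ registers. */
static int uses_zynqmp_irq_regs(const char *compatible)
{
	const struct synps_platform_data *p_data;

	p_data = lookup_platform_data(compatible);

	return p_data && (p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS);
}
```

An unknown compatible string yields NULL, which corresponds to the
"return -ENODEV" path in mc_probe().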
---
drivers/edac/synopsys_edac.c | 192 ++++++++++-------------------------
1 file changed, 51 insertions(+), 141 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 6ea1eaaa7d6f..1cd02859e2b9 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -30,9 +30,7 @@
#define SYNPS_EDAC_MOD_VER "1"
/* DDR ECC Quirks */
-#define DDR_ECC_INTR_SUPPORT BIT(0)
-#define DDR_ECC_DATA_POISON_SUPPORT BIT(1)
-#define SYNPS_ZYNQMP_IRQ_REGS BIT(2)
+#define SYNPS_ZYNQMP_IRQ_REGS BIT(0)
/* Synopsys DDR memory controller registers that are relevant to ECC */
@@ -280,28 +278,20 @@ struct synps_edac_priv {
};
/**
- * struct synps_platform_data - synps platform data structure.
- * @get_error_info: Get EDAC error info.
- * @get_mtype: Get mtype.
- * @get_dtype: Get dtype.
- * @get_ecc_state: Get ECC state.
- * @quirks: To differentiate IPs.
+ * struct synps_platform_data - Synopsys uMCTL2 DDRC platform data.
+ * @quirks: IP-core specific quirks.
*/
struct synps_platform_data {
- int (*get_error_info)(struct synps_edac_priv *priv);
- enum mem_type (*get_mtype)(const void __iomem *base);
- enum dev_type (*get_dtype)(const void __iomem *base);
- bool (*get_ecc_state)(void __iomem *base);
- int quirks;
+ u32 quirks;
};
/**
- * zynqmp_get_error_info - Get the current ECC error info.
+ * synps_get_error_info - Get the current ECC error info.
* @priv: DDR memory controller private instance data.
*
* Return: one if there is no error otherwise returns zero.
*/
-static int zynqmp_get_error_info(struct synps_edac_priv *priv)
+static int synps_get_error_info(struct synps_edac_priv *priv)
{
struct synps_ecc_status *p;
u32 regval, clearval;
@@ -375,17 +365,11 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
if (p->ce_cnt) {
pinf = &p->ceinfo;
- if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
- snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "Row %d Col %d Bank %d Bank Group %d Bit %d Data 0x%08x",
- pinf->row, pinf->col, pinf->bank, pinf->bankgrp,
- pinf->bitpos, pinf->data);
- } else {
- snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "Row %d Bank %d Col %d Bit: %d Data: 0x%08x",
- pinf->row, pinf->bank, pinf->col,
- pinf->bitpos, pinf->data);
- }
+
+ snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
+ "Row %d Col %d Bank %d Bank Group %d Bit %d Data 0x%08x",
+ pinf->row, pinf->col, pinf->bank, pinf->bankgrp,
+ pinf->bitpos, pinf->data);
edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
p->ce_cnt, 0, 0, 0, 0, 0, -1,
@@ -394,15 +378,10 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
if (p->ue_cnt) {
pinf = &p->ueinfo;
- if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
- snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "Row %d Col %d Bank %d Bank Group %d",
- pinf->row, pinf->col, pinf->bank, pinf->bankgrp);
- } else {
- snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "Row %d Bank %d Col %d",
- pinf->row, pinf->bank, pinf->col);
- }
+
+ snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
+ "Row %d Col %d Bank %d Bank Group %d",
+ pinf->row, pinf->col, pinf->bank, pinf->bankgrp);
edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
p->ue_cnt, 0, 0, 0, 0, 0, -1,
@@ -464,13 +443,11 @@ static void disable_intr(struct synps_edac_priv *priv)
*/
static irqreturn_t intr_handler(int irq, void *dev_id)
{
- const struct synps_platform_data *p_data;
struct mem_ctl_info *mci = dev_id;
struct synps_edac_priv *priv;
int status, regval;
priv = mci->pvt_info;
- p_data = priv->p_data;
if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
regval = readl(priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
@@ -479,7 +456,7 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
return IRQ_NONE;
}
- status = p_data->get_error_info(priv);
+ status = synps_get_error_info(priv);
if (status)
return IRQ_NONE;
@@ -492,29 +469,7 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
}
/**
- * check_errors - Check controller for ECC errors.
- * @mci: EDAC memory controller instance.
- *
- * Check and post ECC errors. Called by the polling thread.
- */
-static void check_errors(struct mem_ctl_info *mci)
-{
- const struct synps_platform_data *p_data;
- struct synps_edac_priv *priv;
- int status;
-
- priv = mci->pvt_info;
- p_data = priv->p_data;
-
- status = p_data->get_error_info(priv);
- if (status)
- return;
-
- handle_error(mci, &priv->stat);
-}
-
-/**
- * zynqmp_get_dtype - Return the controller memory width.
+ * synps_get_dtype - Return the controller memory width.
* @base: DDR memory controller base address.
*
* Get the EDAC device type width appropriate for the current controller
@@ -522,7 +477,7 @@ static void check_errors(struct mem_ctl_info *mci)
*
* Return: a device type width enumeration.
*/
-static enum dev_type zynqmp_get_dtype(const void __iomem *base)
+static enum dev_type synps_get_dtype(const void __iomem *base)
{
u32 regval;
@@ -546,14 +501,14 @@ static enum dev_type zynqmp_get_dtype(const void __iomem *base)
}
/**
- * zynqmp_get_ecc_state - Return the controller ECC enable/disable status.
+ * synps_get_ecc_state - Return the controller ECC enable/disable status.
* @base: DDR memory controller base address.
*
* Get the ECC enable/disable status for the controller.
*
* Return: a ECC status boolean i.e true/false - enabled/disabled.
*/
-static bool zynqmp_get_ecc_state(void __iomem *base)
+static bool synps_get_ecc_state(void __iomem *base)
{
u32 regval;
@@ -577,7 +532,7 @@ static u32 get_memsize(void)
}
/**
- * zynqmp_get_mtype - Returns controller memory type.
+ * synps_get_mtype - Returns controller memory type.
* @base: Synopsys ECC status structure.
*
* Get the EDAC memory type appropriate for the current controller
@@ -585,7 +540,7 @@ static u32 get_memsize(void)
*
* Return: a memory type enumeration.
*/
-static enum mem_type zynqmp_get_mtype(const void __iomem *base)
+static enum mem_type synps_get_mtype(const void __iomem *base)
{
enum mem_type mt;
u32 memtype;
@@ -614,14 +569,11 @@ static enum mem_type zynqmp_get_mtype(const void __iomem *base)
static void init_csrows(struct mem_ctl_info *mci)
{
struct synps_edac_priv *priv = mci->pvt_info;
- const struct synps_platform_data *p_data;
struct csrow_info *csi;
struct dimm_info *dimm;
u32 size, row;
int j;
- p_data = priv->p_data;
-
for (row = 0; row < mci->nr_csrows; row++) {
csi = mci->csrows[row];
size = get_memsize();
@@ -629,10 +581,10 @@ static void init_csrows(struct mem_ctl_info *mci)
for (j = 0; j < csi->nr_channels; j++) {
dimm = csi->channels[j]->dimm;
dimm->edac_mode = EDAC_SECDED;
- dimm->mtype = p_data->get_mtype(priv->baseaddr);
+ dimm->mtype = synps_get_mtype(priv->baseaddr);
dimm->nr_pages = (size >> PAGE_SHIFT) / csi->nr_channels;
dimm->grain = SYNPS_EDAC_ERR_GRAIN;
- dimm->dtype = p_data->get_dtype(priv->baseaddr);
+ dimm->dtype = synps_get_dtype(priv->baseaddr);
}
}
}
@@ -648,10 +600,7 @@ static void init_csrows(struct mem_ctl_info *mci)
*/
static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
{
- struct synps_edac_priv *priv;
-
mci->pdev = &pdev->dev;
- priv = mci->pvt_info;
platform_set_drvdata(pdev, mci);
/* Initialize controller capabilities and configuration */
@@ -665,12 +614,7 @@ static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
mci->dev_name = SYNPS_EDAC_MOD_STRING;
mci->mod_name = SYNPS_EDAC_MOD_VER;
- if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
- edac_op_state = EDAC_OPSTATE_INT;
- } else {
- edac_op_state = EDAC_OPSTATE_POLL;
- mci->edac_check = check_errors;
- }
+ edac_op_state = EDAC_OPSTATE_INT;
mci->ctl_page_to_phys = NULL;
@@ -702,47 +646,6 @@ static int setup_irq(struct mem_ctl_info *mci,
return 0;
}
-static const struct synps_platform_data zynqmp_edac_def = {
- .get_error_info = zynqmp_get_error_info,
- .get_mtype = zynqmp_get_mtype,
- .get_dtype = zynqmp_get_dtype,
- .get_ecc_state = zynqmp_get_ecc_state,
- .quirks = (DDR_ECC_INTR_SUPPORT | SYNPS_ZYNQMP_IRQ_REGS
-#ifdef CONFIG_EDAC_DEBUG
- | DDR_ECC_DATA_POISON_SUPPORT
-#endif
- ),
-};
-
-static const struct synps_platform_data synopsys_edac_def = {
- .get_error_info = zynqmp_get_error_info,
- .get_mtype = zynqmp_get_mtype,
- .get_dtype = zynqmp_get_dtype,
- .get_ecc_state = zynqmp_get_ecc_state,
- .quirks = (DDR_ECC_INTR_SUPPORT
-#ifdef CONFIG_EDAC_DEBUG
- | DDR_ECC_DATA_POISON_SUPPORT
-#endif
- ),
-};
-
-
-static const struct of_device_id synps_edac_match[] = {
- {
- .compatible = "xlnx,zynqmp-ddrc-2.40a",
- .data = (void *)&zynqmp_edac_def
- },
- {
- .compatible = "snps,ddrc-3.80a",
- .data = (void *)&synopsys_edac_def
- },
- {
- /* end of table */
- }
-};
-
-MODULE_DEVICE_TABLE(of, synps_edac_match);
-
#ifdef CONFIG_EDAC_DEBUG
/**
@@ -1133,7 +1036,7 @@ static int mc_probe(struct platform_device *pdev)
if (!p_data)
return -ENODEV;
- if (!p_data->get_ecc_state(baseaddr)) {
+ if (!synps_get_ecc_state(baseaddr)) {
edac_printk(KERN_INFO, EDAC_MC, "ECC not enabled\n");
return -ENXIO;
}
@@ -1160,11 +1063,9 @@ static int mc_probe(struct platform_device *pdev)
mc_init(mci, pdev);
- if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
- rc = setup_irq(mci, pdev);
- if (rc)
- goto free_edac_mc;
- }
+ rc = setup_irq(mci, pdev);
+ if (rc)
+ goto free_edac_mc;
rc = edac_mc_add_mc(mci);
if (rc) {
@@ -1174,17 +1075,13 @@ static int mc_probe(struct platform_device *pdev)
}
#ifdef CONFIG_EDAC_DEBUG
- if (priv->p_data->quirks & DDR_ECC_DATA_POISON_SUPPORT) {
- rc = edac_create_sysfs_attributes(mci);
- if (rc) {
- edac_printk(KERN_ERR, EDAC_MC,
- "Failed to create sysfs entries\n");
- goto free_edac_mc;
- }
+ rc = edac_create_sysfs_attributes(mci);
+ if (rc) {
+ edac_printk(KERN_ERR, EDAC_MC, "Failed to create sysfs entries\n");
+ goto free_edac_mc;
}
- if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT)
- setup_address_map(priv);
+ setup_address_map(priv);
#endif
return rc;
@@ -1206,18 +1103,31 @@ static void mc_remove(struct platform_device *pdev)
struct mem_ctl_info *mci = platform_get_drvdata(pdev);
struct synps_edac_priv *priv = mci->pvt_info;
- if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT)
- disable_intr(priv);
+ disable_intr(priv);
#ifdef CONFIG_EDAC_DEBUG
- if (priv->p_data->quirks & DDR_ECC_DATA_POISON_SUPPORT)
- edac_remove_sysfs_attributes(mci);
+ edac_remove_sysfs_attributes(mci);
#endif
edac_mc_del_mc(&pdev->dev);
edac_mc_free(mci);
}
+static const struct synps_platform_data zynqmp_edac_def = {
+ .quirks = SYNPS_ZYNQMP_IRQ_REGS,
+};
+
+static const struct synps_platform_data synopsys_edac_def = {
+ .quirks = 0,
+};
+
+static const struct of_device_id synps_edac_match[] = {
+ { .compatible = "xlnx,zynqmp-ddrc-2.40a", .data = &zynqmp_edac_def },
+ { .compatible = "snps,ddrc-3.80a", .data = &synopsys_edac_def },
+ { }
+};
+MODULE_DEVICE_TABLE(of, synps_edac_match);
+
static struct platform_driver synps_edac_mc_driver = {
.driver = {
.name = "synopsys-edac",
--
2.43.0
Originally the device CSR macros were supposed to be defined according to
the following local convention: first the CSR offsets are listed, then
their fields and flags are described in the same order; the CSR offset
macros have the _OFST suffix; the ECC-related macros have the ECC_ prefix
and the generic DDR-related macros have the DDR_ prefix. Over the years the
driver has lived in the kernel the CSR macros have partly drifted away
from that convention. Fix all the related inconsistencies: move
several CSR offset macros to be defined before the CSR fields and flags
macros; replace the OFFSET suffix with OFST; replace the DDRC_ prefix with
DDR_ and ECC_ with DDR_ where appropriate; group the DDR_MSTR and ECC_CTRL
(ECC_CLR) sibling field macros together and make sure their prefixes
match the CSR offset macros. In addition drop the _MASK suffix
from the macros which aren't used as masks and add the ZYNQMP_ prefix to
the ZynqMP-specific macros to distinguish them from the generic Synopsys
memory controller macros.
Signed-off-by: Serge Semin <[email protected]>
---
Changelog v4:
- This is a new patch collected from the rest of the series to simplify
the review process.
---
drivers/edac/synopsys_edac.c | 134 +++++++++++++++++------------------
1 file changed, 66 insertions(+), 68 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 1cd02859e2b9..1de9f3f86d5a 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -77,20 +77,34 @@
#define ECC_POISON0_OFST 0xB8
#define ECC_POISON1_OFST 0xBC
-#define ECC_ADDRMAP0_OFFSET 0x200
-
-/* Control register bitfield definitions */
-#define ECC_CTRL_CLR_CE_ERR BIT(0)
-#define ECC_CTRL_CLR_UE_ERR BIT(1)
-#define ECC_CTRL_CLR_CE_ERRCNT BIT(2)
-#define ECC_CTRL_CLR_UE_ERRCNT BIT(3)
-
-/* DDR Control Register width definitions */
+/* DDR Address Map Registers */
+#define DDR_ADDRMAP0_OFST 0x200
+
+/* DDR Software Control Register */
+#define DDR_SWCTL 0x320
+
+/* ZynqMP DDR QOS Registers */
+#define ZYNQMP_DDR_QOS_IRQ_STAT_OFST 0x20200
+#define ZYNQMP_DDR_QOS_IRQ_EN_OFST 0x20208
+#define ZYNQMP_DDR_QOS_IRQ_DB_OFST 0x2020C
+
+/* DDR Master register definitions */
+#define DDR_MSTR_DEV_CFG_MASK 0xC0000000
+#define DDR_MSTR_DEV_CFG_SHIFT 30
+#define DDR_MSTR_DEV_X4 0
+#define DDR_MSTR_DEV_X8 1
+#define DDR_MSTR_DEV_X16 2
+#define DDR_MSTR_DEV_X32 3
#define DDR_MSTR_BUSWIDTH_MASK 0x3000
#define DDR_MSTR_BUSWIDTH_SHIFT 12
-#define DDRCTL_EWDTH_16 2
-#define DDRCTL_EWDTH_32 1
-#define DDRCTL_EWDTH_64 0
+#define DDR_MSTR_BUSWIDTH_16 2
+#define DDR_MSTR_BUSWIDTH_32 1
+#define DDR_MSTR_BUSWIDTH_64 0
+#define DDR_MSTR_MEM_LPDDR4 0x20
+#define DDR_MSTR_MEM_DDR4 0x10
+#define DDR_MSTR_MEM_LPDDR3 0x8
+#define DDR_MSTR_MEM_DDR2 0x4
+#define DDR_MSTR_MEM_DDR3 0x1
/* ECC CFG0 register definitions */
#define ECC_CFG0_MODE_MASK 0x7
@@ -103,23 +117,19 @@
#define ECC_STAT_CECNT_SHIFT 8
#define ECC_STAT_BITNUM_MASK 0x7F
+/* ECC control/clear register definitions */
+#define ECC_CTRL_CLR_CE_ERR BIT(0)
+#define ECC_CTRL_CLR_UE_ERR BIT(1)
+#define ECC_CTRL_CLR_CE_ERRCNT BIT(2)
+#define ECC_CTRL_CLR_UE_ERRCNT BIT(3)
+#define ECC_CTRL_EN_CE_IRQ BIT(8)
+#define ECC_CTRL_EN_UE_IRQ BIT(9)
+
/* ECC error count register definitions */
#define ECC_ERRCNT_UECNT_MASK 0xFFFF0000
#define ECC_ERRCNT_UECNT_SHIFT 16
#define ECC_ERRCNT_CECNT_MASK 0xFFFF
-/* DDR QOS Interrupt register definitions */
-#define DDR_QOS_IRQ_STAT_OFST 0x20200
-#define DDR_QOSUE_MASK 0x4
-#define DDR_QOSCE_MASK 0x2
-#define ECC_CE_UE_INTR_MASK 0x6
-#define DDR_QOS_IRQ_EN_OFST 0x20208
-#define DDR_QOS_IRQ_DB_OFST 0x2020C
-
-/* DDR QOS Interrupt register definitions */
-#define DDR_UE_MASK BIT(9)
-#define DDR_CE_MASK BIT(8)
-
/* ECC Corrected Error Register Mask and Shifts*/
#define ECC_CEADDR0_RW_MASK 0x3FFFF
#define ECC_CEADDR0_RNK_MASK BIT(24)
@@ -141,28 +151,11 @@
#define ECC_POISON1_ROW_SHIFT 0
#define ECC_POISON1_ROW_MASK 0x3FFFF
-/* DDR Memory type defines */
-#define MEM_TYPE_DDR3 0x1
-#define MEM_TYPE_LPDDR3 0x8
-#define MEM_TYPE_DDR2 0x4
-#define MEM_TYPE_DDR4 0x10
-#define MEM_TYPE_LPDDR4 0x20
-
-/* DDRC Software control register */
-#define DDRC_SWCTL 0x320
-
/* DDRC ECC CE & UE poison mask */
#define ECC_CEPOISON_MASK 0x3
#define ECC_UEPOISON_MASK 0x1
-/* DDRC Device config masks */
-#define DDRC_MSTR_CFG_MASK 0xC0000000
-#define DDRC_MSTR_CFG_SHIFT 30
-#define DDRC_MSTR_CFG_X4_MASK 0x0
-#define DDRC_MSTR_CFG_X8_MASK 0x1
-#define DDRC_MSTR_CFG_X16_MASK 0x2
-#define DDRC_MSTR_CFG_X32_MASK 0x3
-
+/* DDRC Device config shifts/masks */
#define DDR_MAX_ROW_SHIFT 18
#define DDR_MAX_COL_SHIFT 14
#define DDR_MAX_BANK_SHIFT 3
@@ -215,6 +208,11 @@
#define RANK_B0_BASE 6
+/* ZynqMP DDR QOS Interrupt register definitions */
+#define ZYNQMP_DDR_QOS_UE_MASK 0x4
+#define ZYNQMP_DDR_QOS_CE_MASK 0x2
+#define ZYNQMP_DDR_QOS_IRQ_MASK 0x6
+
/**
* struct ecc_error_info - ECC error log information.
* @row: Row number.
@@ -397,8 +395,8 @@ static void enable_intr(struct synps_edac_priv *priv)
/* Enable UE/CE Interrupts */
if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
- writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
- priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
+ writel(ZYNQMP_DDR_QOS_UE_MASK | ZYNQMP_DDR_QOS_CE_MASK,
+ priv->baseaddr + ZYNQMP_DDR_QOS_IRQ_EN_OFST);
return;
}
@@ -409,7 +407,7 @@ static void enable_intr(struct synps_edac_priv *priv)
* IRQs Enable/Disable flags have been available since v3.10a.
* This is noop for the older controllers.
*/
- writel(DDR_UE_MASK | DDR_CE_MASK,
+ writel(ECC_CTRL_EN_CE_IRQ | ECC_CTRL_EN_UE_IRQ,
priv->baseaddr + ECC_CLR_OFST);
spin_unlock_irqrestore(&priv->reglock, flags);
@@ -421,8 +419,8 @@ static void disable_intr(struct synps_edac_priv *priv)
/* Disable UE/CE Interrupts */
if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
- writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
- priv->baseaddr + DDR_QOS_IRQ_DB_OFST);
+ writel(ZYNQMP_DDR_QOS_UE_MASK | ZYNQMP_DDR_QOS_CE_MASK,
+ priv->baseaddr + ZYNQMP_DDR_QOS_IRQ_DB_OFST);
return;
}
@@ -450,9 +448,9 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
priv = mci->pvt_info;
if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
- regval = readl(priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
- regval &= (DDR_QOSCE_MASK | DDR_QOSUE_MASK);
- if (!(regval & ECC_CE_UE_INTR_MASK))
+ regval = readl(priv->baseaddr + ZYNQMP_DDR_QOS_IRQ_STAT_OFST);
+ regval &= (ZYNQMP_DDR_QOS_CE_MASK | ZYNQMP_DDR_QOS_UE_MASK);
+ if (!(regval & ZYNQMP_DDR_QOS_IRQ_MASK))
return IRQ_NONE;
}
@@ -463,7 +461,7 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
handle_error(mci, &priv->stat);
if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS)
- writel(regval, priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
+ writel(regval, priv->baseaddr + ZYNQMP_DDR_QOS_IRQ_STAT_OFST);
return IRQ_HANDLED;
}
@@ -482,18 +480,18 @@ static enum dev_type synps_get_dtype(const void __iomem *base)
u32 regval;
regval = readl(base + DDR_MSTR_OFST);
- if (!(regval & MEM_TYPE_DDR4))
+ if (!(regval & DDR_MSTR_MEM_DDR4))
return DEV_UNKNOWN;
- regval = (regval & DDRC_MSTR_CFG_MASK) >> DDRC_MSTR_CFG_SHIFT;
+ regval = (regval & DDR_MSTR_DEV_CFG_MASK) >> DDR_MSTR_DEV_CFG_SHIFT;
switch (regval) {
- case DDRC_MSTR_CFG_X4_MASK:
+ case DDR_MSTR_DEV_X4:
return DEV_X4;
- case DDRC_MSTR_CFG_X8_MASK:
+ case DDR_MSTR_DEV_X8:
return DEV_X8;
- case DDRC_MSTR_CFG_X16_MASK:
+ case DDR_MSTR_DEV_X16:
return DEV_X16;
- case DDRC_MSTR_CFG_X32_MASK:
+ case DDR_MSTR_DEV_X32:
return DEV_X32;
}
@@ -547,11 +545,11 @@ static enum mem_type synps_get_mtype(const void __iomem *base)
memtype = readl(base + DDR_MSTR_OFST);
- if ((memtype & MEM_TYPE_DDR3) || (memtype & MEM_TYPE_LPDDR3))
+ if ((memtype & DDR_MSTR_MEM_DDR3) || (memtype & DDR_MSTR_MEM_LPDDR3))
mt = MEM_DDR3;
- else if (memtype & MEM_TYPE_DDR2)
+ else if (memtype & DDR_MSTR_MEM_DDR2)
mt = MEM_RDDR2;
- else if ((memtype & MEM_TYPE_LPDDR4) || (memtype & MEM_TYPE_DDR4))
+ else if ((memtype & DDR_MSTR_MEM_LPDDR4) || (memtype & DDR_MSTR_MEM_DDR4))
mt = MEM_DDR4;
else
mt = MEM_EMPTY;
@@ -756,12 +754,12 @@ static ssize_t inject_data_poison_store(struct device *dev,
struct mem_ctl_info *mci = to_mci(dev);
struct synps_edac_priv *priv = mci->pvt_info;
- writel(0, priv->baseaddr + DDRC_SWCTL);
+ writel(0, priv->baseaddr + DDR_SWCTL);
if (strncmp(data, "CE", 2) == 0)
writel(ECC_CEPOISON_MASK, priv->baseaddr + ECC_CFG1_OFST);
else
writel(ECC_UEPOISON_MASK, priv->baseaddr + ECC_CFG1_OFST);
- writel(1, priv->baseaddr + DDRC_SWCTL);
+ writel(1, priv->baseaddr + DDR_SWCTL);
return count;
}
@@ -878,8 +876,8 @@ static void setup_column_address_map(struct synps_edac_priv *priv, u32 *addrmap)
priv->col_shift[9] = (((addrmap[3] >> 24) & COL_MAX_VAL_MASK) ==
COL_MAX_VAL_MASK) ? 0 : (((addrmap[3] >> 24) &
COL_MAX_VAL_MASK) + COL_B9_BASE);
- if (width == DDRCTL_EWDTH_64) {
- if (memtype & MEM_TYPE_LPDDR3) {
+ if (width == DDR_MSTR_BUSWIDTH_64) {
+ if (memtype & DDR_MSTR_MEM_LPDDR3) {
priv->col_shift[10] = ((addrmap[4] &
COL_MAX_VAL_MASK) == COL_MAX_VAL_MASK) ? 0 :
((addrmap[4] & COL_MAX_VAL_MASK) +
@@ -898,8 +896,8 @@ static void setup_column_address_map(struct synps_edac_priv *priv, u32 *addrmap)
(((addrmap[4] >> 8) & COL_MAX_VAL_MASK) +
COL_B11_BASE);
}
- } else if (width == DDRCTL_EWDTH_32) {
- if (memtype & MEM_TYPE_LPDDR3) {
+ } else if (width == DDR_MSTR_BUSWIDTH_32) {
+ if (memtype & DDR_MSTR_MEM_LPDDR3) {
priv->col_shift[10] = (((addrmap[3] >> 24) &
COL_MAX_VAL_MASK) == COL_MAX_VAL_MASK) ? 0 :
(((addrmap[3] >> 24) & COL_MAX_VAL_MASK) +
@@ -919,7 +917,7 @@ static void setup_column_address_map(struct synps_edac_priv *priv, u32 *addrmap)
COL_B10_BASE);
}
} else {
- if (memtype & MEM_TYPE_LPDDR3) {
+ if (memtype & DDR_MSTR_MEM_LPDDR3) {
priv->col_shift[10] = (((addrmap[3] >> 16) &
COL_MAX_VAL_MASK) == COL_MAX_VAL_MASK) ? 0 :
(((addrmap[3] >> 16) & COL_MAX_VAL_MASK) +
@@ -994,7 +992,7 @@ static void setup_address_map(struct synps_edac_priv *priv)
for (index = 0; index < 12; index++) {
u32 addrmap_offset;
- addrmap_offset = ECC_ADDRMAP0_OFFSET + (index * 4);
+ addrmap_offset = DDR_ADDRMAP0_OFST + (index * 4);
addrmap[index] = readl(priv->baseaddr + addrmap_offset);
}
--
2.43.0
Instead of using the handy FIELD_GET()/FIELD_PREP() helpers the driver
was written with open-coded {mask,shift} statements. That makes the
code bulky, error-prone and much harder to read. Since there are
many places in the driver implementing the CSR fields get/set pattern, use
the FIELD_GET()/FIELD_PREP() macros introduced in the kernel specifically
for that case. In addition use the BIT() and GENMASK() macros to generate
the CSR flags/masks. While at it unify the names of the row, column, rank,
bank and bank group macros so that their suffixes match the field names of
the snps_ecc_error_info structure.
Signed-off-by: Serge Semin <[email protected]>
---
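Note for reviewers: the conversion is meant to be behavior-preserving. The
userspace sketch below compares the open-coded {mask,shift} form with the
mask-only form for the DDR_MSTR bus-width field. GENMASK32(),
FIELD_GET32() and FIELD_PREP32() are simplified local stand-ins written
for this note (they are not the kernel implementations from
<linux/bits.h> and <linux/bitfield.h>), and the register values used with
them are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified 32-bit stand-ins for GENMASK()/FIELD_GET()/FIELD_PREP(). */
#define GENMASK32(h, l) \
	((uint32_t)((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h)))))
/* Lowest set bit of the mask; dividing/multiplying by it shifts the field. */
#define MASK_LSB(mask)		((mask) & (~(mask) + 1))
#define FIELD_GET32(mask, reg)	(((uint32_t)(reg) & (mask)) / MASK_LSB(mask))
#define FIELD_PREP32(mask, val)	(((uint32_t)(val) * MASK_LSB(mask)) & (mask))

/* Old style: a mask plus a separate shift macro. */
#define OLD_BUSWIDTH_MASK	0x3000
#define OLD_BUSWIDTH_SHIFT	12

/* New style: the mask alone describes the field position and width. */
#define NEW_BUSWIDTH_MASK	GENMASK32(13, 12)

/* Open-coded extraction as used before the conversion. */
static uint32_t buswidth_open_coded(uint32_t mstr)
{
	return (mstr & OLD_BUSWIDTH_MASK) >> OLD_BUSWIDTH_SHIFT;
}

/* The same extraction expressed with the mask-only helper. */
static uint32_t buswidth_field_get(uint32_t mstr)
{
	return FIELD_GET32(NEW_BUSWIDTH_MASK, mstr);
}

/* Read-modify-write of the field using the PREP-style helper. */
static uint32_t buswidth_field_set(uint32_t mstr, uint32_t val)
{
	return (mstr & ~NEW_BUSWIDTH_MASK) |
	       FIELD_PREP32(NEW_BUSWIDTH_MASK, val);
}
```

Both getters return the same value for any input, while the new form needs
no separate _SHIFT macro that could get out of sync with the mask.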
drivers/edac/synopsys_edac.c | 137 +++++++++++++++++------------------
1 file changed, 67 insertions(+), 70 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index c46cee035c0d..f181cd7b4447 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -6,6 +6,8 @@
* Copyright (C) 2012 - 2014 Xilinx, Inc.
*/
+#include <linux/bitfield.h>
+#include <linux/bits.h>
#include <linux/edac.h>
#include <linux/module.h>
#include <linux/platform_device.h>
@@ -89,33 +91,29 @@
#define ZYNQMP_DDR_QOS_IRQ_DB_OFST 0x2020C
/* DDR Master register definitions */
-#define DDR_MSTR_DEV_CFG_MASK 0xC0000000
-#define DDR_MSTR_DEV_CFG_SHIFT 30
+#define DDR_MSTR_DEV_CFG_MASK GENMASK(31, 30)
#define DDR_MSTR_DEV_X4 0
#define DDR_MSTR_DEV_X8 1
#define DDR_MSTR_DEV_X16 2
#define DDR_MSTR_DEV_X32 3
-#define DDR_MSTR_BUSWIDTH_MASK 0x3000
-#define DDR_MSTR_BUSWIDTH_SHIFT 12
+#define DDR_MSTR_BUSWIDTH_MASK GENMASK(13, 12)
#define DDR_MSTR_BUSWIDTH_16 2
#define DDR_MSTR_BUSWIDTH_32 1
#define DDR_MSTR_BUSWIDTH_64 0
-#define DDR_MSTR_MEM_LPDDR4 0x20
-#define DDR_MSTR_MEM_DDR4 0x10
-#define DDR_MSTR_MEM_LPDDR3 0x8
-#define DDR_MSTR_MEM_DDR2 0x4
-#define DDR_MSTR_MEM_DDR3 0x1
+#define DDR_MSTR_MEM_LPDDR4 BIT(5)
+#define DDR_MSTR_MEM_DDR4 BIT(4)
+#define DDR_MSTR_MEM_LPDDR3 BIT(3)
+#define DDR_MSTR_MEM_DDR2 BIT(2)
+#define DDR_MSTR_MEM_DDR3 BIT(0)
/* ECC CFG0 register definitions */
-#define ECC_CFG0_MODE_MASK 0x7
+#define ECC_CFG0_MODE_MASK GENMASK(2, 0)
#define ECC_CFG0_MODE_SECDED 0x4
/* ECC status register definitions */
-#define ECC_STAT_UECNT_MASK 0xF0000
-#define ECC_STAT_UECNT_SHIFT 16
-#define ECC_STAT_CECNT_MASK 0xF00
-#define ECC_STAT_CECNT_SHIFT 8
-#define ECC_STAT_BITNUM_MASK 0x7F
+#define ECC_STAT_UE_MASK GENMASK(23, 16)
+#define ECC_STAT_CE_MASK GENMASK(15, 8)
+#define ECC_STAT_BITNUM_MASK GENMASK(6, 0)
/* ECC control/clear register definitions */
#define ECC_CTRL_CLR_CE_ERR BIT(0)
@@ -126,34 +124,26 @@
#define ECC_CTRL_EN_UE_IRQ BIT(9)
/* ECC error count register definitions */
-#define ECC_ERRCNT_UECNT_MASK 0xFFFF0000
-#define ECC_ERRCNT_UECNT_SHIFT 16
-#define ECC_ERRCNT_CECNT_MASK 0xFFFF
-
-/* ECC Corrected Error Register Mask and Shifts*/
-#define ECC_CEADDR0_RW_MASK 0x3FFFF
-#define ECC_CEADDR0_RNK_MASK BIT(24)
-#define ECC_CEADDR1_BNKGRP_MASK 0x3000000
-#define ECC_CEADDR1_BNKNR_MASK 0x70000
-#define ECC_CEADDR1_COL_MASK 0xFFF
-#define ECC_CEADDR1_BNKGRP_SHIFT 24
-#define ECC_CEADDR1_BNKNR_SHIFT 16
-
-/* ECC Poison register shifts */
-#define ECC_POISON0_RANK_SHIFT 24
-#define ECC_POISON0_RANK_MASK BIT(24)
-#define ECC_POISON0_COLUMN_SHIFT 0
-#define ECC_POISON0_COLUMN_MASK 0xFFF
-#define ECC_POISON1_BG_SHIFT 28
-#define ECC_POISON1_BG_MASK 0x30000000
-#define ECC_POISON1_BANKNR_SHIFT 24
-#define ECC_POISON1_BANKNR_MASK 0x7000000
-#define ECC_POISON1_ROW_SHIFT 0
-#define ECC_POISON1_ROW_MASK 0x3FFFF
+#define ECC_ERRCNT_UECNT_MASK GENMASK(31, 16)
+#define ECC_ERRCNT_CECNT_MASK GENMASK(15, 0)
+
+/* ECC Corrected Error register definitions */
+#define ECC_CEADDR0_RANK_MASK GENMASK(27, 24)
+#define ECC_CEADDR0_ROW_MASK GENMASK(17, 0)
+#define ECC_CEADDR1_BANKGRP_MASK GENMASK(25, 24)
+#define ECC_CEADDR1_BANK_MASK GENMASK(23, 16)
+#define ECC_CEADDR1_COL_MASK GENMASK(11, 0)
+
+/* ECC Poison register definitions */
+#define ECC_POISON0_RANK_MASK GENMASK(27, 24)
+#define ECC_POISON0_COL_MASK GENMASK(11, 0)
+#define ECC_POISON1_BANKGRP_MASK GENMASK(29, 28)
+#define ECC_POISON1_BANK_MASK GENMASK(26, 24)
+#define ECC_POISON1_ROW_MASK GENMASK(17, 0)
/* DDRC ECC CE & UE poison mask */
-#define ECC_CEPOISON_MASK 0x3
-#define ECC_UEPOISON_MASK 0x1
+#define ECC_CEPOISON_MASK GENMASK(1, 0)
+#define ECC_UEPOISON_MASK BIT(0)
/* DDRC Device config shifts/masks */
#define DDR_MAX_ROW_SHIFT 18
@@ -209,9 +199,9 @@
#define RANK_B0_BASE 6
/* ZynqMP DDR QOS Interrupt register definitions */
-#define ZYNQMP_DDR_QOS_UE_MASK 0x4
-#define ZYNQMP_DDR_QOS_CE_MASK 0x2
-#define ZYNQMP_DDR_QOS_IRQ_MASK 0x6
+#define ZYNQMP_DDR_QOS_UE_MASK BIT(2)
+#define ZYNQMP_DDR_QOS_CE_MASK BIT(1)
+#define ZYNQMP_DDR_QOS_IRQ_MASK (ZYNQMP_DDR_QOS_UE_MASK | ZYNQMP_DDR_QOS_CE_MASK)
/**
* struct snps_ecc_error_info - ECC error log information.
@@ -303,38 +293,40 @@ static int snps_get_error_info(struct snps_edac_priv *priv)
if (!regval)
return 1;
- p->ceinfo.bitpos = (regval & ECC_STAT_BITNUM_MASK);
+ p->ceinfo.bitpos = FIELD_GET(ECC_STAT_BITNUM_MASK, regval);
regval = readl(base + ECC_ERRCNT_OFST);
- p->ce_cnt = regval & ECC_ERRCNT_CECNT_MASK;
- p->ue_cnt = (regval & ECC_ERRCNT_UECNT_MASK) >> ECC_ERRCNT_UECNT_SHIFT;
+ p->ce_cnt = FIELD_GET(ECC_ERRCNT_CECNT_MASK, regval);
+ p->ue_cnt = FIELD_GET(ECC_ERRCNT_UECNT_MASK, regval);
if (!p->ce_cnt)
goto ue_err;
regval = readl(base + ECC_CEADDR0_OFST);
- p->ceinfo.row = (regval & ECC_CEADDR0_RW_MASK);
+ p->ceinfo.row = FIELD_GET(ECC_CEADDR0_ROW_MASK, regval);
+
regval = readl(base + ECC_CEADDR1_OFST);
- p->ceinfo.bank = (regval & ECC_CEADDR1_BNKNR_MASK) >>
- ECC_CEADDR1_BNKNR_SHIFT;
- p->ceinfo.bankgrp = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
- ECC_CEADDR1_BNKGRP_SHIFT;
- p->ceinfo.col = (regval & ECC_CEADDR1_COL_MASK);
+ p->ceinfo.bank = FIELD_GET(ECC_CEADDR1_BANK_MASK, regval);
+ p->ceinfo.bankgrp = FIELD_GET(ECC_CEADDR1_BANKGRP_MASK, regval);
+ p->ceinfo.col = FIELD_GET(ECC_CEADDR1_COL_MASK, regval);
+
p->ceinfo.data = readl(base + ECC_CSYND0_OFST);
+
edac_dbg(2, "ECCCSYN0: 0x%08X ECCCSYN1: 0x%08X ECCCSYN2: 0x%08X\n",
readl(base + ECC_CSYND0_OFST), readl(base + ECC_CSYND1_OFST),
readl(base + ECC_CSYND2_OFST));
+
ue_err:
if (!p->ue_cnt)
goto out;
regval = readl(base + ECC_UEADDR0_OFST);
- p->ueinfo.row = (regval & ECC_CEADDR0_RW_MASK);
+ p->ueinfo.row = FIELD_GET(ECC_CEADDR0_ROW_MASK, regval);
+
regval = readl(base + ECC_UEADDR1_OFST);
- p->ueinfo.bankgrp = (regval & ECC_CEADDR1_BNKGRP_MASK) >>
- ECC_CEADDR1_BNKGRP_SHIFT;
- p->ueinfo.bank = (regval & ECC_CEADDR1_BNKNR_MASK) >>
- ECC_CEADDR1_BNKNR_SHIFT;
- p->ueinfo.col = (regval & ECC_CEADDR1_COL_MASK);
+ p->ueinfo.bankgrp = FIELD_GET(ECC_CEADDR1_BANKGRP_MASK, regval);
+ p->ueinfo.bank = FIELD_GET(ECC_CEADDR1_BANK_MASK, regval);
+ p->ueinfo.col = FIELD_GET(ECC_CEADDR1_COL_MASK, regval);
+
p->ueinfo.data = readl(base + ECC_UESYND0_OFST);
out:
spin_lock_irqsave(&priv->reglock, flags);
@@ -483,7 +475,7 @@ static enum dev_type snps_get_dtype(const void __iomem *base)
if (!(regval & DDR_MSTR_MEM_DDR4))
return DEV_UNKNOWN;
- regval = (regval & DDR_MSTR_DEV_CFG_MASK) >> DDR_MSTR_DEV_CFG_SHIFT;
+ regval = FIELD_GET(DDR_MSTR_DEV_CFG_MASK, regval);
switch (regval) {
case DDR_MSTR_DEV_X4:
return DEV_X4;
@@ -510,7 +502,8 @@ static bool snps_get_ecc_state(void __iomem *base)
{
u32 regval;
- regval = readl(base + ECC_CFG0_OFST) & ECC_CFG0_MODE_MASK;
+ regval = readl(base + ECC_CFG0_OFST);
+ regval = FIELD_GET(ECC_CFG0_MODE_MASK, regval);
return (regval == ECC_CFG0_MODE_SECDED);
}
@@ -697,13 +690,13 @@ static void snps_data_poison_setup(struct snps_edac_priv *priv)
if (priv->rank_shift[0])
rank = (hif_addr >> priv->rank_shift[0]) & BIT(0);
- regval = (rank << ECC_POISON0_RANK_SHIFT) & ECC_POISON0_RANK_MASK;
- regval |= (col << ECC_POISON0_COLUMN_SHIFT) & ECC_POISON0_COLUMN_MASK;
+ regval = FIELD_PREP(ECC_POISON0_RANK_MASK, rank) |
+ FIELD_PREP(ECC_POISON0_COL_MASK, col);
writel(regval, priv->baseaddr + ECC_POISON0_OFST);
- regval = (bankgrp << ECC_POISON1_BG_SHIFT) & ECC_POISON1_BG_MASK;
- regval |= (bank << ECC_POISON1_BANKNR_SHIFT) & ECC_POISON1_BANKNR_MASK;
- regval |= (row << ECC_POISON1_ROW_SHIFT) & ECC_POISON1_ROW_MASK;
+ regval = FIELD_PREP(ECC_POISON1_BANKGRP_MASK, bankgrp) |
+ FIELD_PREP(ECC_POISON1_BANK_MASK, bank) |
+ FIELD_PREP(ECC_POISON1_ROW_MASK, row);
writel(regval, priv->baseaddr + ECC_POISON1_OFST);
}
@@ -742,10 +735,14 @@ static ssize_t inject_data_poison_show(struct device *dev,
{
struct mem_ctl_info *mci = to_mci(dev);
struct snps_edac_priv *priv = mci->pvt_info;
+ const char *errstr;
+ u32 regval;
+
+ regval = readl(priv->baseaddr + ECC_CFG1_OFST);
+ errstr = FIELD_GET(ECC_CEPOISON_MASK, regval) == ECC_CEPOISON_MASK ?
+ "Correctable Error" : "UnCorrectable Error";
- return sprintf(data, "Data Poisoning: %s\n\r",
- (((readl(priv->baseaddr + ECC_CFG1_OFST)) & 0x3) == 0x3)
- ? ("Correctable Error") : ("UnCorrectable Error"));
+ return sprintf(data, "Data Poisoning: %s\n\r", errstr);
}
static ssize_t inject_data_poison_store(struct device *dev,
@@ -852,7 +849,7 @@ static void snps_setup_column_address_map(struct snps_edac_priv *priv, u32 *addr
int index;
memtype = readl(priv->baseaddr + DDR_MSTR_OFST);
- width = (memtype & DDR_MSTR_BUSWIDTH_MASK) >> DDR_MSTR_BUSWIDTH_SHIFT;
+ width = FIELD_GET(DDR_MSTR_BUSWIDTH_MASK, memtype);
priv->col_shift[0] = 0;
priv->col_shift[1] = 1;
--
2.43.0
Currently the naming scheme of the driver entities is rather random. There
are structures and methods with the "synps" prefix, structures and methods
with no driver-specific prefix, methods with the "edac" prefix, structure
instances with the "zynqmp" and "synopsys" prefixes, and macros with the
"SYNPS", "ECC" and "DDR" prefixes. Moreover, some time ago some of the
function names were shortened by completely removing the vendor-specific
prefixes, which left the driver with no strict naming convention (see
commit bb894bc46ed0 ("EDAC, synopsys: Shorten static function names")).
All of that makes the code much harder to read for no good reason (other
than shorter names), since there is no longer an easy way to distinguish
the local, EDAC core and global namespaces right from the code context.
Similarly, the kernel code indexing services (like elixir) find different
functions with the same name, which makes kernel hacking harder.
Fix all of that by unifying the names of the driver-local entities, such
as functions, structures and non-CSR-related macros, especially since the
same approach is used in most of the EDAC LLDDs. Use the "snps" prefix as
the shortest version of the controller vendor name. While at it, add a
more detailed controller name (DW uMCTL2 DDRC) to the driver comments and
string literals where appropriate.
Signed-off-by: Serge Semin <[email protected]>
---
Note that the "dw" prefix would be an even shorter alternative, but we
decided to stick with "snps" since "synopsys" is already used in the
module name.
Changelog v2:
- Fix the SYNPS_ZYNQMP_IRQ_REGS macro usages that were missed earlier.
(@tbot)
---
drivers/edac/synopsys_edac.c | 241 ++++++++++++++++++-----------------
1 file changed, 121 insertions(+), 120 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 1de9f3f86d5a..c46cee035c0d 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
- * Synopsys DDR ECC Driver
+ * Synopsys DW uMCTL2 DDR ECC Driver
* This driver is based on ppc4xx_edac.c drivers
*
* Copyright (C) 2012 - 2014 Xilinx, Inc.
@@ -16,23 +16,23 @@
#include "edac_module.h"
/* Number of cs_rows needed per memory controller */
-#define SYNPS_EDAC_NR_CSROWS 1
+#define SNPS_EDAC_NR_CSROWS 1
/* Number of channels per memory controller */
-#define SYNPS_EDAC_NR_CHANS 1
+#define SNPS_EDAC_NR_CHANS 1
/* Granularity of reported error in bytes */
-#define SYNPS_EDAC_ERR_GRAIN 1
+#define SNPS_EDAC_ERR_GRAIN 1
-#define SYNPS_EDAC_MSG_SIZE 256
+#define SNPS_EDAC_MSG_SIZE 256
-#define SYNPS_EDAC_MOD_STRING "synps_edac"
-#define SYNPS_EDAC_MOD_VER "1"
+#define SNPS_EDAC_MOD_STRING "snps_edac"
+#define SNPS_EDAC_MOD_VER "1"
/* DDR ECC Quirks */
-#define SYNPS_ZYNQMP_IRQ_REGS BIT(0)
+#define SNPS_ZYNQMP_IRQ_REGS BIT(0)
-/* Synopsys DDR memory controller registers that are relevant to ECC */
+/* Synopsys uMCTL2 DDR controller registers that are relevant to ECC */
/* DDRC Master 0 Register */
#define DDR_MSTR_OFST 0x0
@@ -214,7 +214,7 @@
#define ZYNQMP_DDR_QOS_IRQ_MASK 0x6
/**
- * struct ecc_error_info - ECC error log information.
+ * struct snps_ecc_error_info - ECC error log information.
* @row: Row number.
* @col: Column number.
* @bank: Bank number.
@@ -222,7 +222,7 @@
* @bitpos: Bit position.
* @data: Data causing the error.
*/
-struct ecc_error_info {
+struct snps_ecc_error_info {
u32 row;
u32 col;
u32 bank;
@@ -232,21 +232,21 @@ struct ecc_error_info {
};
/**
- * struct synps_ecc_status - ECC status information to report.
+ * struct snps_ecc_status - ECC status information to report.
* @ce_cnt: Correctable error count.
* @ue_cnt: Uncorrectable error count.
* @ceinfo: Correctable error log information.
* @ueinfo: Uncorrectable error log information.
*/
-struct synps_ecc_status {
+struct snps_ecc_status {
u32 ce_cnt;
u32 ue_cnt;
- struct ecc_error_info ceinfo;
- struct ecc_error_info ueinfo;
+ struct snps_ecc_error_info ceinfo;
+ struct snps_ecc_error_info ueinfo;
};
/**
- * struct synps_edac_priv - DDR memory controller private instance data.
+ * struct snps_edac_priv - DDR memory controller private data.
* @baseaddr: Base address of the DDR controller.
* @reglock: Concurrent CSRs access lock.
* @message: Buffer for framing the event specific info.
@@ -259,12 +259,12 @@ struct synps_ecc_status {
* @bankgrp_shift: Bit shifts for bank group bit.
* @rank_shift: Bit shifts for rank bit.
*/
-struct synps_edac_priv {
+struct snps_edac_priv {
void __iomem *baseaddr;
spinlock_t reglock;
- char message[SYNPS_EDAC_MSG_SIZE];
- struct synps_ecc_status stat;
- const struct synps_platform_data *p_data;
+ char message[SNPS_EDAC_MSG_SIZE];
+ struct snps_ecc_status stat;
+ const struct snps_platform_data *p_data;
#ifdef CONFIG_EDAC_DEBUG
ulong poison_addr;
u32 row_shift[18];
@@ -276,22 +276,22 @@ struct synps_edac_priv {
};
/**
- * struct synps_platform_data - Synopsys uMCTL2 DDRC platform data.
+ * struct snps_platform_data - Synopsys uMCTL2 DDRC platform data.
* @quirks: IP-core specific quirks.
*/
-struct synps_platform_data {
+struct snps_platform_data {
u32 quirks;
};
/**
- * synps_get_error_info - Get the current ECC error info.
+ * snps_get_error_info - Get the current ECC error info.
* @priv: DDR memory controller private instance data.
*
* Return: one if there is no error otherwise returns zero.
*/
-static int synps_get_error_info(struct synps_edac_priv *priv)
+static int snps_get_error_info(struct snps_edac_priv *priv)
{
- struct synps_ecc_status *p;
+ struct snps_ecc_status *p;
u32 regval, clearval;
unsigned long flags;
void __iomem *base;
@@ -350,21 +350,21 @@ static int synps_get_error_info(struct synps_edac_priv *priv)
}
/**
- * handle_error - Handle Correctable and Uncorrectable errors.
+ * snps_handle_error - Handle Correctable and Uncorrectable errors.
* @mci: EDAC memory controller instance.
* @p: Synopsys ECC status structure.
*
* Handles ECC correctable and uncorrectable errors.
*/
-static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
+static void snps_handle_error(struct mem_ctl_info *mci, struct snps_ecc_status *p)
{
- struct synps_edac_priv *priv = mci->pvt_info;
- struct ecc_error_info *pinf;
+ struct snps_edac_priv *priv = mci->pvt_info;
+ struct snps_ecc_error_info *pinf;
if (p->ce_cnt) {
pinf = &p->ceinfo;
- snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
+ snprintf(priv->message, SNPS_EDAC_MSG_SIZE,
"Row %d Col %d Bank %d Bank Group %d Bit %d Data 0x%08x",
pinf->row, pinf->col, pinf->bank, pinf->bankgrp,
pinf->bitpos, pinf->data);
@@ -377,7 +377,7 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
if (p->ue_cnt) {
pinf = &p->ueinfo;
- snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
+ snprintf(priv->message, SNPS_EDAC_MSG_SIZE,
"Row %d Col %d Bank %d Bank Group %d",
pinf->row, pinf->col, pinf->bank, pinf->bankgrp);
@@ -389,12 +389,12 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
memset(p, 0, sizeof(*p));
}
-static void enable_intr(struct synps_edac_priv *priv)
+static void snps_enable_irq(struct snps_edac_priv *priv)
{
unsigned long flags;
/* Enable UE/CE Interrupts */
- if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
+ if (priv->p_data->quirks & SNPS_ZYNQMP_IRQ_REGS) {
writel(ZYNQMP_DDR_QOS_UE_MASK | ZYNQMP_DDR_QOS_CE_MASK,
priv->baseaddr + ZYNQMP_DDR_QOS_IRQ_EN_OFST);
@@ -413,12 +413,12 @@ static void enable_intr(struct synps_edac_priv *priv)
spin_unlock_irqrestore(&priv->reglock, flags);
}
-static void disable_intr(struct synps_edac_priv *priv)
+static void snps_disable_irq(struct snps_edac_priv *priv)
{
unsigned long flags;
/* Disable UE/CE Interrupts */
- if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
+ if (priv->p_data->quirks & SNPS_ZYNQMP_IRQ_REGS) {
writel(ZYNQMP_DDR_QOS_UE_MASK | ZYNQMP_DDR_QOS_CE_MASK,
priv->baseaddr + ZYNQMP_DDR_QOS_IRQ_DB_OFST);
@@ -433,41 +433,41 @@ static void disable_intr(struct synps_edac_priv *priv)
}
/**
- * intr_handler - Interrupt Handler for ECC interrupts.
+ * snps_irq_handler - Interrupt Handler for ECC interrupts.
* @irq: IRQ number.
* @dev_id: Device ID.
*
* Return: IRQ_NONE, if interrupt not set or IRQ_HANDLED otherwise.
*/
-static irqreturn_t intr_handler(int irq, void *dev_id)
+static irqreturn_t snps_irq_handler(int irq, void *dev_id)
{
struct mem_ctl_info *mci = dev_id;
- struct synps_edac_priv *priv;
+ struct snps_edac_priv *priv;
int status, regval;
priv = mci->pvt_info;
- if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS) {
+ if (priv->p_data->quirks & SNPS_ZYNQMP_IRQ_REGS) {
regval = readl(priv->baseaddr + ZYNQMP_DDR_QOS_IRQ_STAT_OFST);
regval &= (ZYNQMP_DDR_QOS_CE_MASK | ZYNQMP_DDR_QOS_UE_MASK);
if (!(regval & ZYNQMP_DDR_QOS_IRQ_MASK))
return IRQ_NONE;
}
- status = synps_get_error_info(priv);
+ status = snps_get_error_info(priv);
if (status)
return IRQ_NONE;
- handle_error(mci, &priv->stat);
+ snps_handle_error(mci, &priv->stat);
- if (priv->p_data->quirks & SYNPS_ZYNQMP_IRQ_REGS)
+ if (priv->p_data->quirks & SNPS_ZYNQMP_IRQ_REGS)
writel(regval, priv->baseaddr + ZYNQMP_DDR_QOS_IRQ_STAT_OFST);
return IRQ_HANDLED;
}
/**
- * synps_get_dtype - Return the controller memory width.
+ * snps_get_dtype - Return the controller memory width.
* @base: DDR memory controller base address.
*
* Get the EDAC device type width appropriate for the current controller
@@ -475,7 +475,7 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
*
* Return: a device type width enumeration.
*/
-static enum dev_type synps_get_dtype(const void __iomem *base)
+static enum dev_type snps_get_dtype(const void __iomem *base)
{
u32 regval;
@@ -499,14 +499,14 @@ static enum dev_type synps_get_dtype(const void __iomem *base)
}
/**
- * synps_get_ecc_state - Return the controller ECC enable/disable status.
+ * snps_get_ecc_state - Return the controller ECC enable/disable status.
* @base: DDR memory controller base address.
*
* Get the ECC enable/disable status for the controller.
*
* Return: a ECC status boolean i.e true/false - enabled/disabled.
*/
-static bool synps_get_ecc_state(void __iomem *base)
+static bool snps_get_ecc_state(void __iomem *base)
{
u32 regval;
@@ -516,11 +516,11 @@ static bool synps_get_ecc_state(void __iomem *base)
}
/**
- * get_memsize - Read the size of the attached memory device.
+ * snps_get_memsize - Read the size of the attached memory device.
*
* Return: the memory size in bytes.
*/
-static u32 get_memsize(void)
+static u32 snps_get_memsize(void)
{
struct sysinfo inf;
@@ -530,7 +530,7 @@ static u32 get_memsize(void)
}
/**
- * synps_get_mtype - Returns controller memory type.
+ * snps_get_mtype - Returns controller memory type.
* @base: Synopsys ECC status structure.
*
* Get the EDAC memory type appropriate for the current controller
@@ -538,7 +538,7 @@ static u32 get_memsize(void)
*
* Return: a memory type enumeration.
*/
-static enum mem_type synps_get_mtype(const void __iomem *base)
+static enum mem_type snps_get_mtype(const void __iomem *base)
{
enum mem_type mt;
u32 memtype;
@@ -558,15 +558,15 @@ static enum mem_type synps_get_mtype(const void __iomem *base)
}
/**
- * init_csrows - Initialize the csrow data.
+ * snps_init_csrows - Initialize the csrow data.
* @mci: EDAC memory controller instance.
*
* Initialize the chip select rows associated with the EDAC memory
* controller instance.
*/
-static void init_csrows(struct mem_ctl_info *mci)
+static void snps_init_csrows(struct mem_ctl_info *mci)
{
- struct synps_edac_priv *priv = mci->pvt_info;
+ struct snps_edac_priv *priv = mci->pvt_info;
struct csrow_info *csi;
struct dimm_info *dimm;
u32 size, row;
@@ -574,21 +574,21 @@ static void init_csrows(struct mem_ctl_info *mci)
for (row = 0; row < mci->nr_csrows; row++) {
csi = mci->csrows[row];
- size = get_memsize();
+ size = snps_get_memsize();
for (j = 0; j < csi->nr_channels; j++) {
dimm = csi->channels[j]->dimm;
dimm->edac_mode = EDAC_SECDED;
- dimm->mtype = synps_get_mtype(priv->baseaddr);
+ dimm->mtype = snps_get_mtype(priv->baseaddr);
dimm->nr_pages = (size >> PAGE_SHIFT) / csi->nr_channels;
- dimm->grain = SYNPS_EDAC_ERR_GRAIN;
- dimm->dtype = synps_get_dtype(priv->baseaddr);
+ dimm->grain = SNPS_EDAC_ERR_GRAIN;
+ dimm->dtype = snps_get_dtype(priv->baseaddr);
}
}
}
/**
- * mc_init - Initialize one driver instance.
+ * snps_mc_init - Initialize one driver instance.
* @mci: EDAC memory controller instance.
* @pdev: platform device.
*
@@ -596,7 +596,7 @@ static void init_csrows(struct mem_ctl_info *mci)
* related driver-private data associated with the memory controller the
* instance is bound to.
*/
-static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
+static void snps_mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
{
mci->pdev = &pdev->dev;
platform_set_drvdata(pdev, mci);
@@ -608,21 +608,22 @@ static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
mci->scrub_mode = SCRUB_NONE;
mci->edac_cap = EDAC_FLAG_SECDED;
- mci->ctl_name = "synps_ddr_controller";
- mci->dev_name = SYNPS_EDAC_MOD_STRING;
- mci->mod_name = SYNPS_EDAC_MOD_VER;
+ mci->ctl_name = "snps_umctl2_ddrc";
+ mci->dev_name = SNPS_EDAC_MOD_STRING;
+ mci->mod_name = SNPS_EDAC_MOD_VER;
edac_op_state = EDAC_OPSTATE_INT;
mci->ctl_page_to_phys = NULL;
- init_csrows(mci);
+ snps_init_csrows(mci);
}
-static int setup_irq(struct mem_ctl_info *mci,
- struct platform_device *pdev)
+
+
+static int snps_setup_irq(struct mem_ctl_info *mci, struct platform_device *pdev)
{
- struct synps_edac_priv *priv = mci->pvt_info;
+ struct snps_edac_priv *priv = mci->pvt_info;
int ret, irq;
irq = platform_get_irq(pdev, 0);
@@ -632,14 +633,14 @@ static int setup_irq(struct mem_ctl_info *mci,
return irq;
}
- ret = devm_request_irq(&pdev->dev, irq, intr_handler,
+ ret = devm_request_irq(&pdev->dev, irq, snps_irq_handler,
0, dev_name(&pdev->dev), mci);
if (ret < 0) {
edac_printk(KERN_ERR, EDAC_MC, "Failed to request IRQ\n");
return ret;
}
- enable_intr(priv);
+ snps_enable_irq(priv);
return 0;
}
@@ -647,13 +648,13 @@ static int setup_irq(struct mem_ctl_info *mci,
#ifdef CONFIG_EDAC_DEBUG
/**
- * ddr_poison_setup - Update poison registers.
+ * snps_data_poison_setup - Update poison registers.
* @priv: DDR memory controller private instance data.
*
* Update poison registers as per DDR mapping.
* Return: none.
*/
-static void ddr_poison_setup(struct synps_edac_priv *priv)
+static void snps_data_poison_setup(struct snps_edac_priv *priv)
{
int col = 0, row = 0, bank = 0, bankgrp = 0, rank = 0, regval;
int index;
@@ -711,7 +712,7 @@ static ssize_t inject_data_error_show(struct device *dev,
char *data)
{
struct mem_ctl_info *mci = to_mci(dev);
- struct synps_edac_priv *priv = mci->pvt_info;
+ struct snps_edac_priv *priv = mci->pvt_info;
return sprintf(data, "Poison0 Addr: 0x%08x\n\rPoison1 Addr: 0x%08x\n\r"
"Error injection Address: 0x%lx\n\r",
@@ -725,12 +726,12 @@ static ssize_t inject_data_error_store(struct device *dev,
const char *data, size_t count)
{
struct mem_ctl_info *mci = to_mci(dev);
- struct synps_edac_priv *priv = mci->pvt_info;
+ struct snps_edac_priv *priv = mci->pvt_info;
if (kstrtoul(data, 0, &priv->poison_addr))
return -EINVAL;
- ddr_poison_setup(priv);
+ snps_data_poison_setup(priv);
return count;
}
@@ -740,7 +741,7 @@ static ssize_t inject_data_poison_show(struct device *dev,
char *data)
{
struct mem_ctl_info *mci = to_mci(dev);
- struct synps_edac_priv *priv = mci->pvt_info;
+ struct snps_edac_priv *priv = mci->pvt_info;
return sprintf(data, "Data Poisoning: %s\n\r",
(((readl(priv->baseaddr + ECC_CFG1_OFST)) & 0x3) == 0x3)
@@ -752,7 +753,7 @@ static ssize_t inject_data_poison_store(struct device *dev,
const char *data, size_t count)
{
struct mem_ctl_info *mci = to_mci(dev);
- struct synps_edac_priv *priv = mci->pvt_info;
+ struct snps_edac_priv *priv = mci->pvt_info;
writel(0, priv->baseaddr + DDR_SWCTL);
if (strncmp(data, "CE", 2) == 0)
@@ -767,7 +768,7 @@ static ssize_t inject_data_poison_store(struct device *dev,
static DEVICE_ATTR_RW(inject_data_error);
static DEVICE_ATTR_RW(inject_data_poison);
-static int edac_create_sysfs_attributes(struct mem_ctl_info *mci)
+static int snps_create_sysfs_attributes(struct mem_ctl_info *mci)
{
int rc;
@@ -780,13 +781,13 @@ static int edac_create_sysfs_attributes(struct mem_ctl_info *mci)
return 0;
}
-static void edac_remove_sysfs_attributes(struct mem_ctl_info *mci)
+static void snps_remove_sysfs_attributes(struct mem_ctl_info *mci)
{
device_remove_file(&mci->dev, &dev_attr_inject_data_error);
device_remove_file(&mci->dev, &dev_attr_inject_data_poison);
}
-static void setup_row_address_map(struct synps_edac_priv *priv, u32 *addrmap)
+static void snps_setup_row_address_map(struct snps_edac_priv *priv, u32 *addrmap)
{
u32 addrmap_row_b2_10;
int index;
@@ -845,7 +846,7 @@ static void setup_row_address_map(struct synps_edac_priv *priv, u32 *addrmap)
ROW_MAX_VAL_MASK) + ROW_B17_BASE);
}
-static void setup_column_address_map(struct synps_edac_priv *priv, u32 *addrmap)
+static void snps_setup_column_address_map(struct snps_edac_priv *priv, u32 *addrmap)
{
u32 width, memtype;
int index;
@@ -947,7 +948,7 @@ static void setup_column_address_map(struct synps_edac_priv *priv, u32 *addrmap)
}
-static void setup_bank_address_map(struct synps_edac_priv *priv, u32 *addrmap)
+static void snps_setup_bank_address_map(struct snps_edac_priv *priv, u32 *addrmap)
{
priv->bank_shift[0] = (addrmap[1] & BANK_MAX_VAL_MASK) + BANK_B0_BASE;
priv->bank_shift[1] = ((addrmap[1] >> 8) &
@@ -959,7 +960,7 @@ static void setup_bank_address_map(struct synps_edac_priv *priv, u32 *addrmap)
}
-static void setup_bg_address_map(struct synps_edac_priv *priv, u32 *addrmap)
+static void snps_setup_bg_address_map(struct snps_edac_priv *priv, u32 *addrmap)
{
priv->bankgrp_shift[0] = (addrmap[8] &
BANKGRP_MAX_VAL_MASK) + BANKGRP_B0_BASE;
@@ -969,7 +970,7 @@ static void setup_bg_address_map(struct synps_edac_priv *priv, u32 *addrmap)
}
-static void setup_rank_address_map(struct synps_edac_priv *priv, u32 *addrmap)
+static void snps_setup_rank_address_map(struct snps_edac_priv *priv, u32 *addrmap)
{
priv->rank_shift[0] = ((addrmap[0] & RANK_MAX_VAL_MASK) ==
RANK_MAX_VAL_MASK) ? 0 : ((addrmap[0] &
@@ -977,14 +978,14 @@ static void setup_rank_address_map(struct synps_edac_priv *priv, u32 *addrmap)
}
/**
- * setup_address_map - Set Address Map by querying ADDRMAP registers.
+ * snps_setup_address_map - Set Address Map by querying ADDRMAP registers.
* @priv: DDR memory controller private instance data.
*
* Set Address Map by querying ADDRMAP registers.
*
* Return: none.
*/
-static void setup_address_map(struct synps_edac_priv *priv)
+static void snps_setup_address_map(struct snps_edac_priv *priv)
{
u32 addrmap[12];
int index;
@@ -996,20 +997,20 @@ static void setup_address_map(struct synps_edac_priv *priv)
addrmap[index] = readl(priv->baseaddr + addrmap_offset);
}
- setup_row_address_map(priv, addrmap);
+ snps_setup_row_address_map(priv, addrmap);
- setup_column_address_map(priv, addrmap);
+ snps_setup_column_address_map(priv, addrmap);
- setup_bank_address_map(priv, addrmap);
+ snps_setup_bank_address_map(priv, addrmap);
- setup_bg_address_map(priv, addrmap);
+ snps_setup_bg_address_map(priv, addrmap);
- setup_rank_address_map(priv, addrmap);
+ snps_setup_rank_address_map(priv, addrmap);
}
#endif /* CONFIG_EDAC_DEBUG */
/**
- * mc_probe - Check controller and bind driver.
+ * snps_mc_probe - Check controller and bind driver.
* @pdev: platform device.
*
* Probe a specific controller instance for binding with the driver.
@@ -1017,11 +1018,11 @@ static void setup_address_map(struct synps_edac_priv *priv)
* Return: 0 if the controller instance was successfully bound to the
* driver; otherwise, < 0 on error.
*/
-static int mc_probe(struct platform_device *pdev)
+static int snps_mc_probe(struct platform_device *pdev)
{
- const struct synps_platform_data *p_data;
+ const struct snps_platform_data *p_data;
struct edac_mc_layer layers[2];
- struct synps_edac_priv *priv;
+ struct snps_edac_priv *priv;
struct mem_ctl_info *mci;
void __iomem *baseaddr;
int rc;
@@ -1034,20 +1035,20 @@ static int mc_probe(struct platform_device *pdev)
if (!p_data)
return -ENODEV;
- if (!synps_get_ecc_state(baseaddr)) {
+ if (!snps_get_ecc_state(baseaddr)) {
edac_printk(KERN_INFO, EDAC_MC, "ECC not enabled\n");
return -ENXIO;
}
layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
- layers[0].size = SYNPS_EDAC_NR_CSROWS;
+ layers[0].size = SNPS_EDAC_NR_CSROWS;
layers[0].is_virt_csrow = true;
layers[1].type = EDAC_MC_LAYER_CHANNEL;
- layers[1].size = SYNPS_EDAC_NR_CHANS;
+ layers[1].size = SNPS_EDAC_NR_CHANS;
layers[1].is_virt_csrow = false;
mci = edac_mc_alloc(EDAC_AUTO_MC_NUM, ARRAY_SIZE(layers), layers,
- sizeof(struct synps_edac_priv));
+ sizeof(struct snps_edac_priv));
if (!mci) {
edac_printk(KERN_ERR, EDAC_MC,
"Failed memory allocation for mc instance\n");
@@ -1059,9 +1060,9 @@ static int mc_probe(struct platform_device *pdev)
priv->p_data = p_data;
spin_lock_init(&priv->reglock);
- mc_init(mci, pdev);
+ snps_mc_init(mci, pdev);
- rc = setup_irq(mci, pdev);
+ rc = snps_setup_irq(mci, pdev);
if (rc)
goto free_edac_mc;
@@ -1073,13 +1074,13 @@ static int mc_probe(struct platform_device *pdev)
}
#ifdef CONFIG_EDAC_DEBUG
- rc = edac_create_sysfs_attributes(mci);
+ rc = snps_create_sysfs_attributes(mci);
if (rc) {
edac_printk(KERN_ERR, EDAC_MC, "Failed to create sysfs entries\n");
goto free_edac_mc;
}
- setup_address_map(priv);
+ snps_setup_address_map(priv);
#endif
return rc;
@@ -1091,52 +1092,52 @@ static int mc_probe(struct platform_device *pdev)
}
/**
- * mc_remove - Unbind driver from controller.
+ * snps_mc_remove - Unbind driver from device.
* @pdev: Platform device.
*
* Return: Unconditionally 0
*/
-static void mc_remove(struct platform_device *pdev)
+
+static void snps_mc_remove(struct platform_device *pdev)
{
struct mem_ctl_info *mci = platform_get_drvdata(pdev);
- struct synps_edac_priv *priv = mci->pvt_info;
+ struct snps_edac_priv *priv = mci->pvt_info;
- disable_intr(priv);
+ snps_disable_irq(priv);
#ifdef CONFIG_EDAC_DEBUG
- edac_remove_sysfs_attributes(mci);
+ snps_remove_sysfs_attributes(mci);
#endif
edac_mc_del_mc(&pdev->dev);
edac_mc_free(mci);
}
-static const struct synps_platform_data zynqmp_edac_def = {
- .quirks = SYNPS_ZYNQMP_IRQ_REGS,
+static const struct snps_platform_data zynqmp_edac_def = {
+ .quirks = SNPS_ZYNQMP_IRQ_REGS,
};
-static const struct synps_platform_data synopsys_edac_def = {
+static const struct snps_platform_data snps_edac_def = {
.quirks = 0,
};
-static const struct of_device_id synps_edac_match[] = {
+static const struct of_device_id snps_edac_match[] = {
{ .compatible = "xlnx,zynqmp-ddrc-2.40a", .data = &zynqmp_edac_def },
- { .compatible = "snps,ddrc-3.80a", .data = &synopsys_edac_def },
+ { .compatible = "snps,ddrc-3.80a", .data = &snps_edac_def },
{ }
};
-MODULE_DEVICE_TABLE(of, synps_edac_match);
+MODULE_DEVICE_TABLE(of, snps_edac_match);
-static struct platform_driver synps_edac_mc_driver = {
+static struct platform_driver snps_edac_mc_driver = {
.driver = {
- .name = "synopsys-edac",
- .of_match_table = synps_edac_match,
+ .name = "snps-edac",
+ .of_match_table = snps_edac_match,
},
- .probe = mc_probe,
- .remove_new = mc_remove,
+ .probe = snps_mc_probe,
+ .remove_new = snps_mc_remove,
};
-
-module_platform_driver(synps_edac_mc_driver);
+module_platform_driver(snps_edac_mc_driver);
MODULE_AUTHOR("Xilinx Inc");
-MODULE_DESCRIPTION("Synopsys DDR ECC driver");
+MODULE_DESCRIPTION("Synopsys uMCTL2 DDR ECC driver");
MODULE_LICENSE("GPL v2");
--
2.43.0
Aside from fixing the error count CSR usage, commit e2932d1f6f05
("EDAC/synopsys: Read the error count from the correct register") has all
of a sudden also changed the order of the error status check procedure.
So now the error handler method first reads the number of CEs and UEs and
only then makes sure that any of these errors have actually happened. That
doesn't make sense. Fix it by restoring the correct order: first check the
ECC status, then read the number of errors.
Fixes: e2932d1f6f05 ("EDAC/synopsys: Read the error count from the correct register")
Signed-off-by: Serge Semin <[email protected]>
Reviewed-by: Shubhrajyoti Datta <[email protected]>
---
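For reviewers who want to see the restored order in isolation, here is a
minimal user-space sketch. It is not driver code: the register array, the
FAKE_* indices and masks and the errcnt_reads counter are hypothetical
stand-ins introduced only for illustration. It shows that the error
counters are read only after the status register has confirmed that an
error is pending.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices and masks, for illustration only. */
enum { FAKE_ECC_STAT, FAKE_ECC_ERRCNT, FAKE_NREGS };
#define FAKE_CECNT_MASK		0x0000ffffu
#define FAKE_UECNT_MASK		0xffff0000u
#define FAKE_UECNT_SHIFT	16

struct fake_stat {
	uint32_t ce_cnt;
	uint32_t ue_cnt;
	unsigned int errcnt_reads;	/* how many times ERRCNT was read */
};

/*
 * Returns 1 when no error is pending and 0 otherwise. The counters are
 * read only after the status register has reported an error.
 */
static int fake_get_error_info(const uint32_t regs[FAKE_NREGS],
			       struct fake_stat *p)
{
	uint32_t regval;

	assert(p);

	/* Step 1: check the ECC status first. */
	regval = regs[FAKE_ECC_STAT];
	if (!regval)
		return 1;

	/* Step 2: only now read the number of errors. */
	regval = regs[FAKE_ECC_ERRCNT];
	p->errcnt_reads++;
	p->ce_cnt = regval & FAKE_CECNT_MASK;
	p->ue_cnt = (regval & FAKE_UECNT_MASK) >> FAKE_UECNT_SHIFT;

	return 0;
}
```

With a zero status word the function returns early and the counter
register is never touched, which is the behaviour the patch restores.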
drivers/edac/synopsys_edac.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index bd6e52db68bc..fbaf3d9ad517 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -418,18 +418,18 @@ static int zynqmp_get_error_info(struct synps_edac_priv *priv)
base = priv->baseaddr;
p = &priv->stat;
- regval = readl(base + ECC_ERRCNT_OFST);
- p->ce_cnt = regval & ECC_ERRCNT_CECNT_MASK;
- p->ue_cnt = (regval & ECC_ERRCNT_UECNT_MASK) >> ECC_ERRCNT_UECNT_SHIFT;
- if (!p->ce_cnt)
- goto ue_err;
-
regval = readl(base + ECC_STAT_OFST);
if (!regval)
return 1;
p->ceinfo.bitpos = (regval & ECC_STAT_BITNUM_MASK);
+ regval = readl(base + ECC_ERRCNT_OFST);
+ p->ce_cnt = regval & ECC_ERRCNT_CECNT_MASK;
+ p->ue_cnt = (regval & ECC_ERRCNT_UECNT_MASK) >> ECC_ERRCNT_UECNT_SHIFT;
+ if (!p->ce_cnt)
+ goto ue_err;
+
regval = readl(base + ECC_CEADDR0_OFST);
p->ceinfo.row = (regval & ECC_CEADDR0_RW_MASK);
regval = readl(base + ECC_CEADDR1_OFST);
--
2.43.0
The DW DDRC CSRs resource descriptor is used only by the
devm_ioremap_resource() call in the driver probe method. Thus convert the
platform_get_resource() and devm_ioremap_resource() pair into a single
devm_platform_ioremap_resource() call.
Signed-off-by: Serge Semin <[email protected]>
---
drivers/edac/synopsys_edac.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 9f79f14e57b2..6976ef84e952 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -1334,11 +1334,9 @@ static int mc_probe(struct platform_device *pdev)
struct synps_edac_priv *priv;
struct mem_ctl_info *mci;
void __iomem *baseaddr;
- struct resource *res;
int rc;
- res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- baseaddr = devm_ioremap_resource(&pdev->dev, res);
+ baseaddr = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(baseaddr))
return PTR_ERR(baseaddr);
--
2.43.0
Currently the custom error messages are needlessly long, so the logged
text gets split across several lines in the console. Some of the
information is duplicated or redundant and can be safely removed: drop
the message prefix "DDR ECC error type:%s", since the text printed to the
log by the edac_mc_printk() method will contain the error type and the id
of the memory controller that detected the error anyway; and shorten the
phrase "Bit Position" to just "Bit", which does no harm to readability.
Signed-off-by: Serge Semin <[email protected]>
---
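To give a feel for the size difference, the sketch below formats the old
and the new CE message (the variant with the bank group) in user space.
The demo_* helpers and DEMO_MSG_SIZE are hypothetical names used only for
this illustration; the format strings are the ones from the diff.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_MSG_SIZE	256	/* hypothetical buffer size */

/* Pre-patch CE message; returns the snprintf() result. */
static int demo_old_ce_msg(char *buf, int row, int col, int bank,
			   int bankgrp, int bitpos, unsigned int data)
{
	assert(buf);
	return snprintf(buf, DEMO_MSG_SIZE,
			"DDR ECC error type:%s Row %d Col %d Bank %d Bank Group %d Bit Position: %d Data: 0x%08x",
			"CE", row, col, bank, bankgrp, bitpos, data);
}

/* Post-patch CE message; returns the snprintf() result. */
static int demo_new_ce_msg(char *buf, int row, int col, int bank,
			   int bankgrp, int bitpos, unsigned int data)
{
	assert(buf);
	return snprintf(buf, DEMO_MSG_SIZE,
			"Row %d Col %d Bank %d Bank Group %d Bit %d Data 0x%08x",
			row, col, bank, bankgrp, bitpos, data);
}
```

For the same arguments the second helper always produces the shorter
string, because the dropped prefix and the shortened labels do not depend
on the values being printed.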
drivers/edac/synopsys_edac.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index b0ff831287f5..dfe1abe7c86c 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -478,13 +478,13 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
pinf = &p->ceinfo;
if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type:%s Row %d Col %d Bank %d Bank Group %d Bit Position: %d Data: 0x%08x",
- "CE", pinf->row, pinf->col, pinf->bank,
- pinf->bankgrp, pinf->bitpos, pinf->data);
+ "Row %d Col %d Bank %d Bank Group %d Bit %d Data 0x%08x",
+ pinf->row, pinf->col, pinf->bank, pinf->bankgrp,
+ pinf->bitpos, pinf->data);
} else {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type:%s Row %d Bank %d Col %d Bit Position: %d Data: 0x%08x",
- "CE", pinf->row, pinf->bank, pinf->col,
+ "Row %d Bank %d Col %d Bit: %d Data: 0x%08x",
+ pinf->row, pinf->bank, pinf->col,
pinf->bitpos, pinf->data);
}
@@ -497,13 +497,12 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
pinf = &p->ueinfo;
if (priv->p_data->quirks & DDR_ECC_INTR_SUPPORT) {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type :%s Row %d Col %d Bank %d Bank Group %d",
- "UE", pinf->row, pinf->col, pinf->bank,
- pinf->bankgrp);
+ "Row %d Col %d Bank %d Bank Group %d",
+ pinf->row, pinf->col, pinf->bank, pinf->bankgrp);
} else {
snprintf(priv->message, SYNPS_EDAC_MSG_SIZE,
- "DDR ECC error type :%s Row %d Bank %d Col %d ",
- "UE", pinf->row, pinf->bank, pinf->col);
+ "Row %d Bank %d Col %d",
+ pinf->row, pinf->bank, pinf->col);
}
edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
--
2.43.0
On Thu, Feb 22, 2024 at 11:52 PM Serge Semin <[email protected]> wrote:
>
> This patchset is a first one in the series created in the framework of
> my Synopsys DW uMCTL2 DDRC-related work:
>
> [1: In-progress v5] EDAC/mc/synopsys: Various fixes and cleanups
> Link: ---you are looking at it---
> [2: In-progress v4] EDAC/synopsys: Add generic DDRC info and address mapping
> Link: https://lore.kernel.org/linux-edac/[email protected]
> [3: In-progress v4] EDAC/synopsys: Add generic resources and Scrub support
> Link: https://lore.kernel.org/linux-edac/[email protected]
>
> Note the patchsets above must be merged in the same order as they are
> placed in the list in order to prevent conflicts. Nothing prevents them
> from being reviewed synchronously though. Any tests are very welcome.
> Thanks in advance.
>
> The main goal of the entire set of the changes provided in the mentioned
> patchsets is to as much as possible specialise the synopsys_edac.c driver
> to be working with the Synopsys DW uMCTL2 DDR controllers of various
> versions and synthesized parameters, and add useful error-detection
> features.
>
> Regarding this series content. It's an initial patchset which
> traditionally provides various fixes, cleanups and modifications required
> for the more comfortable further features development. The main goal of it
> though is to detach the Xilinx Zynq A05 DDRC related code into the
> dedicated driver since first it has nothing to do with the Synopsys DW
> uMCTL2 DDR controller and second it will be a great deal obstacle on the
> way of extending the Synopsys-part functionality.
>
> The series starts with the fixes patches, which in short concern the next
> aspects: touching the ZynqMP-specific CSRs on the Xilinx ZynqMP platform
> only, serializing an access to the ECCCLR/ECCCTL register, adding correct memory
> devices type detection, setting a correct value to the
> mem_ctl_info.scrub_cap field, dropping an erroneous ADDRMAP[4] parsing and
> getting back a correct order of the ECC errors info detection procedure.
>
> Afterwards the patchset provides several cleanup patches required for the
> more coherent code splitting up (Xilinx Zynq A05 and Synopsys DW uMCTL2
> DDRCs) so the provided modifications would be useful in both drivers.
> First the platform resource open-coded IO-space remapping is replaced with
> the devm_platform_ioremap_resource() method call for the sake of the code
> simplification. Secondly the next redundant entities are dropped: internal
> CE/UE errors counters, local to_mci() macros definition, some redundant
> ecc_error_info structure fields and redundant info from the error message,
> duplicated dimm->nr_pages debug printout and spaces from the MEM_TYPE
> flags declarations. (The later two updates concern the MCI core part.)
> Thirdly before detaching the Zynq A05-related code an unique MC index
> allocation infrastructure is added to the MCI core. It's required since
> after splitting the driver up both supported types of memory devices could
> be correctly probed on the same platform. Note even though it's currently
> unsupported by the synopsys_edac.c driver it's claimed to be possible by
> the original driver author (it was a reason of having two unrelated
> devices supported in a single driver). Finally the Xilinx Zynq A05 part of
> the driver is moved out to a dedicated driver. After that the
> platform-specific setups API is removed from the Synopsys DW uMCTL2 DDRC
> driver since it's no longer required.
>
> Finally as the cherry on the cake a set of the local coding style
> cleanups are provided: unify the DW uMCTL2 DDRC driver entities naming and
> replace the open-coded "shift/mask" pattern with the kernel helpers like
> BIT/GENMASK/FIELD_x in there. It shall significantly improve the code
> readability.
>
For the zynqmp
Reviewed-by: Shubhrajyoti Datta <[email protected]>
Thanks,
> Changelog v2:
> - Move Synopsys DW uMCTL2 DDRC bindings file renaming to a separate patch.
> (@Krzysztof)
> - Introduce a new compatible string "snps,dw-umctl2-ddrc" matching the new
> DT-schema name.
> - Forgot to fix some of the prefix of the SYNPS_ZYNQMP_IRQ_REGS macro
> in several places. (@tbot)
> - Drop the no longer used "priv" pointer from the mc_init() function.
> (@tbot)
> - Include "linux/bitfield.h" header file to get the FIELD_GET macro
> definition. (@tbot)
> - Drop the already merged in patches:
> [PATCH 12/20] EDAC/mc: Replace spaces with tabs in memtype flags definition
> [PATCH 13/20] EDAC/mc: Drop duplicated dimm->nr_pages debug printout
>
> Changelog v3:
> - Drop the no longer used "priv" pointer from the mc_init() function.
> (@tbot)
> - Drop the merged in patches:
> [PATCH v2 14/19] dt-bindings: memory: snps: Detach Zynq DDRC controller support
> [PATCH v2 15/19] dt-bindings: memory: snps: Use more descriptive device name
> (@Krzysztof)
>
> Changelog v4:
> - Remove Rob, Krzysztof and DT-mailing list from Cc since the respective
> patches have already been merged in.
> - Add a new patch
> [PATCH v4 6/20] EDAC/synopsys: Fix misleading IRQ self-cleared quirk flag
> detached from the very first patch of the series.
> - Add a new patch
> [PATCH v4 15/20] EDAC/mc: Re-use generic unique MC index allocation procedure
> - Add a new patch
> [PATCH v4 18/20] EDAC/synopsys: Unify CSRs macro declarations
> collecting the changes from various patches of the series.
> - Drop redundant empty lines left by mistake.
> - Drop private counters access from the check_errors() method too.
> - Rebase onto the kernel v6.6-rcX.
>
> Link: https://lore.kernel.org/linux-edac/[email protected]
> Changelog v5:
> - Fix function names in the zynq_edac.c kdoc.
> - Rebase onto the kernel 6.8-rc3.
>
> Signed-off-by: Serge Semin <[email protected]>
> Cc: Punnaiah Choudary Kalluri <[email protected]>
> Cc: Dinh Nguyen <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
>
> Serge Semin (20):
> EDAC/synopsys: Fix ECC status data and IRQ disable race condition
> EDAC/synopsys: Fix generic device type detection procedure
> EDAC/synopsys: Fix mci->scrub_cap field setting
> EDAC/synopsys: Drop erroneous ADDRMAP4.addrmap_col_b10 parse
> EDAC/synopsys: Fix reading errors count before ECC status
> EDAC/synopsys: Fix misleading IRQ self-cleared quirk flag
> EDAC/synopsys: Use platform device devm ioremap method
> EDAC/synopsys: Drop internal CE and UE counters
> EDAC/synopsys: Drop local to_mci() macro definition
> EDAC/synopsys: Drop struct ecc_error_info.blknr field
> EDAC/synopsys: Shorten out struct ecc_error_info.bankgrpnr field name
> EDAC/synopsys: Drop redundant info from the error messages
> EDAC/mc: Init DIMM labels in MC registration method
> EDAC/mc: Add generic unique MC index allocation procedure
> EDAC/mc: Re-use generic unique MC index allocation procedure
> EDAC/synopsys: Detach Zynq A05 DDRC support to separate driver
> EDAC/synopsys: Drop unused platform-specific setup API
> EDAC/synopsys: Unify CSRs macro declarations
> EDAC/synopsys: Unify struct/macro/function prefixes
> EDAC/synopsys: Convert to using BIT/GENMASK/FIELD_x macros
>
> MAINTAINERS | 1 +
> drivers/edac/Kconfig | 9 +-
> drivers/edac/Makefile | 1 +
> drivers/edac/dmc520_edac.c | 4 +-
> drivers/edac/edac_mc.c | 135 ++++-
> drivers/edac/edac_mc.h | 4 +
> drivers/edac/pasemi_edac.c | 5 +-
> drivers/edac/ppc4xx_edac.c | 5 +-
> drivers/edac/synopsys_edac.c | 967 ++++++++++++-----------------------
> drivers/edac/zynq_edac.c | 501 ++++++++++++++++++
> 10 files changed, 963 insertions(+), 669 deletions(-)
> create mode 100644 drivers/edac/zynq_edac.c
>
> --
> 2.43.0
>
>
On Thu, Feb 22, 2024 at 09:12:46PM +0300, Serge Semin wrote:
> The race condition around the ECCCLR register access happens in the IRQ
> disable method called in the device remove() procedure and in the ECC IRQ
> handler:
> 1. Enable IRQ:
> a. ECCCLR = EN_CE | EN_UE
> 2. Disable IRQ:
> a. ECCCLR = 0
> 3. IRQ handler:
> a. ECCCLR = CLR_CE | CLR_CE_CNT | CLR_UE | CLR_UE_CNT
> b. ECCCLR = 0
> c. ECCCLR = EN_CE | EN_UE
> So if the IRQ disabling procedure is called concurrently with the IRQ
> handler method the IRQ might be actually left enabled due to the
> statement 3c.
>
> The root cause of the problem is that ECCCLR register (which since v3.10a
> has been called as ECCCTL) has intermixed ECC status data clear flags and
> the IRQ enable/disable flags. Thus the IRQ disabling (clear EN flags) and
> handling (write 1 to clear ECC status data) procedures must be serialised
> around the ECCCTL register modification to prevent the race.
>
> So fix the problem described above by adding the spin-lock around the
> ECCCLR modifications and preventing the IRQ-handler from modifying the
> IRQs enable flags (there is no point in disabling the IRQ and then
> re-enabling it again within a single IRQ handler call, see the statements
> 3a/3b and 3c above).
So I'm looking at the code and am looking at this and wondering how we
even ended up in this mess?!
An interrupt handler should not *enable* the interrupt again - that's
just crazy. And I should've seen that in
4bcffe941758 ("EDAC/synopsys: Re-enable the error interrupts on v3 hw")
and stopped it right there. But well, it is what it is...
So I'd like to see the following flow:
* on init, the interrupt is enabled with enable_intr() *after*
registering the interrupt handler.
* on exit, the interrupt is disabled with disable_intr() and then no
interrupts are coming in anymore.
And then I don't think you'll need the spinlock and it'll be sane
design.
Right?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Apr 15, 2024 at 08:36:16PM +0200, Borislav Petkov wrote:
> On Thu, Feb 22, 2024 at 09:12:46PM +0300, Serge Semin wrote:
> > The race condition around the ECCCLR register access happens in the IRQ
> > disable method called in the device remove() procedure and in the ECC IRQ
> > handler:
> > 1. Enable IRQ:
> > a. ECCCLR = EN_CE | EN_UE
> > 2. Disable IRQ:
> > a. ECCCLR = 0
> > 3. IRQ handler:
> > a. ECCCLR = CLR_CE | CLR_CE_CNT | CLR_UE | CLR_UE_CNT
> > b. ECCCLR = 0
> > c. ECCCLR = EN_CE | EN_UE
> > So if the IRQ disabling procedure is called concurrently with the IRQ
> > handler method the IRQ might be actually left enabled due to the
> > statement 3c.
> >
> > The root cause of the problem is that ECCCLR register (which since v3.10a
> > has been called as ECCCTL) has intermixed ECC status data clear flags and
> > the IRQ enable/disable flags. Thus the IRQ disabling (clear EN flags) and
> > handling (write 1 to clear ECC status data) procedures must be serialised
> > around the ECCCTL register modification to prevent the race.
> >
> > So fix the problem described above by adding the spin-lock around the
> > ECCCLR modifications and preventing the IRQ-handler from modifying the
> > IRQs enable flags (there is no point in disabling the IRQ and then
> > re-enabling it again within a single IRQ handler call, see the statements
> > 3a/3b and 3c above).
>
> So I'm looking at the code and am looking at this and wondering how we
> even ended up in this mess?!
>
> An interrupt handler should not *enable* the interrupt again - that's
> just crazy. And I should've seen that in
>
> 4bcffe941758 ("EDAC/synopsys: Re-enable the error interrupts on v3 hw")
>
> and stopped it right there. But well, it is what it is...
It looks indeed crazy because the method is called enable_intr() and
is called in the IRQ handler. Right, re-enabling the IRQ in the handler
doesn't look good. But under the hood it was just a way to fix the
problem described in the commit you cited. enable_intr() just gets
back the IRQ Enable flags cleared a bit before in the
zynqmp_get_error_info() method.
The root cause of the problem is that the IRQ status/clear flags:
ECCCLR.ecc_corrected_err_clr (R/W1C)
ECCCLR.ecc_uncorrected_err_clr (R/W1C)
ECCCLR.ecc_corr_err_cnt_clr (R/W1C)
ECCCLR.ecc_uncorr_err_cnt_clr (R/W1C)
etc
and the IRQ enable/disable flags (since v3.10a):
ECCCLR.ecc_corrected_err_intr_en (R/W)
ECCCLR.ecc_uncorrected_err_intr_en (R/W)
reside in a single register - ECCCLR (Synopsys has renamed it to
ECCCTL since v3.10a due to adding the IRQ En/Dis flags).
Thus any concurrent access to that CSR like "Clear IRQ
status/counters" and "IRQ disable/enable" need to be protected from
the race condition.
>
> So I'd like to see the following flow:
>
> * on init, the interrupt is enabled with enable_intr() *after*
> registering the interrupt handler.
>
> * on exit, the interrupt is disabled with disable_intr() and then no
> interrupts are coming in anymore.
>
> And then I don't think you'll need the spinlock and it'll be sane
> design.
>
> Right?
This is what is implemented at the moment and it's racy. The IRQ-handler
clears the IRQ status/counters by writing 1s to the respective flags
in the ECCCLR register. This inevitably causes a write to the IRQ
enable/disable bits too. Thus if the driver/device gets removed during
the IRQ handling (disable_intr() is called), the IRQ might be left
enabled even though disable_intr() has been executed. In turn that may
cause fatal problems if the IRQ handler is executed once again before
the IRQ line is freed and disabled on the GIC side.
If we wish to avoid using the atomic spin-lock we'll need to change
the order:
0. Enable ECC IRQ in ECCCLR/ECCCTL CSR
1. Request IRQ line and register the IRQ handler
..
2. Free IRQ line
3. Disable ECC IRQ in ECCCLR/ECCCTL CSR
But if that path is decided to be taken the next aspects will need to
be taken into account:
1. If the IRQ line is shared, then the ECC IRQ might be delivered
somewhere between steps 0 and 1, which won't make the IRQ subsystem
happy. (This isn't relevant for the current driver because it
requests a non-shared IRQ line for ECC, but who knows how the
situation will change in the future.)
2. A bit later (in the patchset #3) I am adding the ECC Scrubber
support, which will need the spin-lock anyway to protect another two
CSRs access. These CSRs are touched in run-time to enable/disable the
scrubbing and set/get the scrub rate.
So since we'll need to have a spin-lock anyway and from the
scalability point of view, it sounds reasonable to keep what is
suggested in my patch. What do you think?
-Serge(y)
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Apr 16, 2024 at 01:06:11PM +0300, Serge Semin wrote:
> It looks indeed crazy because the method is called enable_intr() and
> is called in the IRQ handler. Right, re-enabling the IRQ in the handler
> doesn't look good. But under the hood it was just a way to fix the
> problem described in the commit you cited. enable_intr() just gets
> back the IRQ Enable flags cleared a bit before in the
> zynqmp_get_error_info() method.
>
> The root cause of the problem is that the IRQ status/clear flags:
> ECCCLR.ecc_corrected_err_clr (R/W1C)
> ECCCLR.ecc_uncorrected_err_clr (R/W1C)
> ECCCLR.ecc_corr_err_cnt_clr (R/W1C)
> ECCCLR.ecc_uncorr_err_cnt_clr (R/W1C)
> etc
>
> and the IRQ enable/disable flags (since v3.10a):
> ECCCLR.ecc_corrected_err_intr_en (R/W)
> ECCCLR.ecc_uncorrected_err_intr_en (R/W)
>
> reside in a single register - ECCCLR (Synopsys has renamed it to
> ECCCTL since v3.10a due to adding the IRQ En/Dis flags).
>
> Thus any concurrent access to that CSR like "Clear IRQ
> status/counters" and "IRQ disable/enable" need to be protected from
> the race condition.
Ok, let's pick this apart one-by-one. I'll return to the rest you're
explaining as needed.
So, can writes to the status/counter bits while writing the *same* bit
to the IRQ enable/disable bit prevent any race conditions?
Meaning, you only change the status and counter bits and you preserve
the same value in the IRQ disable/enable bit?
IOW, I'm thinking of shadowing that ECCCTL in software so that we update
it from the shadowed value.
Because, AFAIU, the spinlock won't help if you grab it, clear the
status/counter bits and disable the interrupt in the process. You want
to only clear the status/counter bits and leave the interrupt enabled.
Right?
IOW, in one single write you do:
ECCCLR.ecc_corrected_err_clr=1
ECCCLR.ecc_uncorrected_err_clr=1
ECCCLR.ecc_corr_err_cnt_clr=1
ECCCLR.ecc_uncorr_err_cnt_clr=1
ECCCLR.ecc_corrected_err_intr_en=1
ECCCLR.ecc_uncorrected_err_intr_en=1
?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Sun, Apr 21, 2024 at 12:07:30PM +0200, Borislav Petkov wrote:
> On Tue, Apr 16, 2024 at 01:06:11PM +0300, Serge Semin wrote:
> > It looks indeed crazy because the method is called enable_intr() and
> > is called in the IRQ handler. Right, re-enabling the IRQ in the handler
> > doesn't look good. But under the hood it was just a way to fix the
> > problem described in the commit you cited. enable_intr() just gets
> > back the IRQ Enable flags cleared a bit before in the
> > zynqmp_get_error_info() method.
> >
> > The root cause of the problem is that the IRQ status/clear flags:
> > ECCCLR.ecc_corrected_err_clr (R/W1C)
> > ECCCLR.ecc_uncorrected_err_clr (R/W1C)
> > ECCCLR.ecc_corr_err_cnt_clr (R/W1C)
> > ECCCLR.ecc_uncorr_err_cnt_clr (R/W1C)
> > etc
> >
> > and the IRQ enable/disable flags (since v3.10a):
> > ECCCLR.ecc_corrected_err_intr_en (R/W)
> > ECCCLR.ecc_uncorrected_err_intr_en (R/W)
> >
> > reside in a single register - ECCCLR (Synopsys has renamed it to
> > ECCCTL since v3.10a due to adding the IRQ En/Dis flags).
> >
> > Thus any concurrent access to that CSR like "Clear IRQ
> > status/counters" and "IRQ disable/enable" need to be protected from
> > the race condition.
>
> Ok, let's pick this apart one-by-one. I'll return to the rest you're
> explaining as needed.
>
> So, can writes to the status/counter bits while writing the *same* bit
> to the IRQ enable/disable bit prevent any race conditions?
No, because the clear and enable/disable bits belong to the same CSR.
While you are writing the clear+same/enable bits, the concurrent IO
may have changed the same/enable bits. Like this:
IRQ-handler | IRQ-disabler
|
tmp = clear_sts_bits | enable_irq_bits;|
| ECCCLR = 0; // disable IRQ
ECCCLR = tmp; |
----------------------------------------+--------------------------------------
As a result even though the IRQ-disabler cleared the IRQ-enable bits,
the IRQ-handler got them back to being set. The same will happen if we
get to write the *same* bits in the handler:
IRQ-handler | IRQ-disabler
|
tmp = ECCCLR | clear_sts_bits; |
| ECCCLR = 0; // disable IRQs
ECCCLR = tmp; |
----------------------------------------+--------------------------------------
The last example is almost the same as what happens at the moment and
what I am fixing in this patch. The difference is that there is a
greater number of ECCCLR CSR changes performed in the IRQ-handler
context, which makes the critical section even wider than it could be:
IRQ-handler | IRQ-disabler
|
zynqmp_get_error_info: |
ECCCLR = clear_sts_bits; |
ECCCLR = 0; // actually redundant |
.. | ECCCLR = 0; // disable IRQs
enable_intr: |
ECCCLR = enable_irq_bits; |
----------------------------------------+--------------------------------------
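The interleavings above can be replayed deterministically with a small
user-space model. This is only a sketch under stated assumptions: the
DEMO_* bit layout and the demo_* helpers are hypothetical, and the model
assumes that the write-1-to-clear flags read back as zero while the
enable flags are plain R/W bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define DEMO_CLR_BITS	0x0000000fu	/* W1C status/counter clear flags */
#define DEMO_EN_BITS	0x00000300u	/* plain R/W IRQ enable flags */

struct demo_ddrc {
	uint32_t ecc_status;	/* non-zero while an error is latched */
	uint32_t eccclr_en;	/* enable bits stored by the "hardware" */
};

/* Modelled ECCCLR write: W1C bits clear the status, EN bits are stored. */
static void demo_write_eccclr(struct demo_ddrc *d, uint32_t val)
{
	assert(d);
	if (val & DEMO_CLR_BITS)
		d->ecc_status = 0;
	d->eccclr_en = val & DEMO_EN_BITS;
}

/* Modelled ECCCLR read: the clear flags are assumed to read as zero. */
static uint32_t demo_read_eccclr(const struct demo_ddrc *d)
{
	return d->eccclr_en;
}

/*
 * Replays the unserialized interleaving: the handler samples the
 * register, the disabler writes zero, then the handler flushes its
 * stale value. Returns the enable bits left in the register.
 */
static uint32_t demo_racy_interleaving(struct demo_ddrc *d)
{
	uint32_t tmp = demo_read_eccclr(d) | DEMO_CLR_BITS;	/* handler */

	demo_write_eccclr(d, 0);				/* disabler */
	demo_write_eccclr(d, tmp);				/* handler */

	return demo_read_eccclr(d);
}
```

Starting with both enable bits set, the function returns them still set
even though the disabler has run in between, which is exactly the
problem.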
>
> Meaning, you only change the status and counter bits and you preserve
> the same value in the IRQ disable/enable bit?
AFAICS this won't help to solve the race condition because writing the
preserved value of the enable/disable bits is the cause of the race
condition. The critical section is in concurrent flushing of different
values to the ECCCLR.*en bits. The only ways to solve that are:
1. prevent the concurrent access
2. serialize the critical section
>
> IOW, I'm thinking of shadowing that ECCCTL in software so that we update
> it from the shadowed value.
I don't see how shadowing will help to prevent what is happening
unless you know some shadow-register pattern I am not aware of. AFAIR
a shadow register is normally utilized in the following cases:
1. a read op returns an incorrect value or the CSR can't be read at all
2. the IO bus is too slow, so the shadow speeds up the RMW pattern
In any case the shadowed value and the process of the data flushing
would need to be protected with a lock anyway in order to sync the
shadow register content and the actual value written to the CSR.
>
> Because, AFAIU, the spinlock won't help if you grab it, clear the
> status/counter bits and disable the interrupt in the process. You want
> to only clear the status/counter bits and leave the interrupt enabled.
>
> Right?
Right, but the spinlock will help. What I need to do is deal with two
concurrent operations:
IRQ-handler: clear the status/counter bits and leave the IRQ enable
bits as is.
IRQ-disabler: clear the IRQ enable bits
These actions need to be serialized in order to prevent the race
condition.
>
> IOW, in one single write you do:
>
> ECCCLR.ecc_corrected_err_clr=1
> ECCCLR.ecc_uncorrected_err_clr=1
> ECCCLR.ecc_corr_err_cnt_clr=1
> ECCCLR.ecc_uncorr_err_cnt_clr=1
> ECCCLR.ecc_corrected_err_intr_en=1
> ECCCLR.ecc_uncorrected_err_intr_en=1
>
> ?
This won't help because the concurrent IRQ-disabler could have
already cleared the IRQ enable bits while the IRQ-handler is being
executed and is about to write to the ECCCLR register. Like this:
IRQ-handler | IRQ-disabler
|
tmp = clear_sts_bits | enable_irq_bits;|
| ECCCLR = 0; // disable IRQ
ECCCLR = tmp; |
----------------------------------------+--------------------------------------
Even if we add a spin-lock serializing the ECCCLR writes, it
won't solve the problem since the IRQ-disabler critical section could
be executed a bit before the IRQ-handler critical section, so the latter
one will just re-enable the IRQs disabled by the former one.
Here is what is suggested in my patch to fix the problem:
IRQ-handler | IRQ-disabler
|
zynqmp_get_error_info: |
| lock_irqsave
| ECCCLR = 0; // disable IRQs
| unlock_irqrestore
lock_irqsave; |
tmp = ECCCLR | clear_sts_bits; |
ECCCLR = tmp; |
unlock_irqrestore; |
----------------------------------------+--------------------------------------
See, the IRQ-status/counters clearing and the IRQ disabling processes
are serialized, so the former one won't override the values written by
the latter one.
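The same scheme can be sketched in user space. This is not the driver
code: a C11 atomic_flag stands in for the spinlock, the SK_* bit layout
and the sk_* helpers are hypothetical, and the register is modelled as a
plain variable that latches only the enable bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define SK_CLR_BITS	0x0000000fu	/* W1C status/counter clear flags */
#define SK_EN_BITS	0x00000300u	/* plain R/W IRQ enable flags */

struct sk_ddrc {
	atomic_flag lock;	/* stands in for the driver spinlock */
	uint32_t eccclr;	/* modelled ECCCLR/ECCCTL enable bits */
	uint32_t ecc_status;	/* non-zero while an error is latched */
};

static void sk_lock(struct sk_ddrc *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void sk_unlock(struct sk_ddrc *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* IRQ-handler side: clear the status, keep the current enable bits. */
static void sk_handler_clear(struct sk_ddrc *d)
{
	uint32_t tmp;

	assert(d);
	sk_lock(d);
	tmp = d->eccclr | SK_CLR_BITS;	/* RMW done under the lock */
	d->ecc_status = 0;		/* effect of the W1C flags */
	d->eccclr = tmp & SK_EN_BITS;	/* only the EN bits are latched */
	sk_unlock(d);
}

/* IRQ-disabler side: drop the enable bits under the same lock. */
static void sk_disable_irq(struct sk_ddrc *d)
{
	assert(d);
	sk_lock(d);
	d->eccclr = 0;
	sk_unlock(d);
}
```

Since both paths are now indivisible, only two orders are possible, and
in both of them the enable bits end up cleared once sk_disable_irq() has
returned.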
Here is the way it would have looked in case of the shadow-register
implementation:
IRQ-handler | IRQ-disabler
|
zynqmp_get_error_info: |
| lock_irqsave
| shadow_en_bits = 0;
| ECCCLR = shadow_en_bits; // disable IRQs
| unlock_irqrestore
lock_irqsave; |
tmp = clear_sts_bits | shadow_en_bits; |
ECCCLR = tmp; |
unlock_irqrestore; |
----------------------------------------+--------------------------------------
The shadow-register pattern just saves one ECCCLR read op. The
shadowed data sync would have needed the serialization anyway. Seeing
that the DW DDR uMCTL2 controller CSRs are always memory mapped, I don't
think the shadow-register approach would be worth implementing unless
you meant something different.
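For comparison, a user-space sketch of the shadow-register variant is
given below. Again the names are hypothetical (SH_* bits, sh_* helpers),
a C11 atomic_flag stands in for the spinlock and the register is a plain
variable that latches only the enable bits. The handler composes the
value from the software copy instead of reading the register, but the
lock is still there to keep the shadow and the register in sync.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define SH_CLR_BITS	0x0000000fu	/* W1C status/counter clear flags */
#define SH_EN_BITS	0x00000300u	/* plain R/W IRQ enable flags */

struct sh_ddrc {
	atomic_flag lock;	/* stands in for the driver spinlock */
	uint32_t shadow_en;	/* software copy of the enable bits */
	uint32_t eccclr;	/* modelled register content */
};

static void sh_lock(struct sh_ddrc *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void sh_unlock(struct sh_ddrc *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* IRQ-handler side: no register read, the EN bits come from the shadow. */
static void sh_handler_clear(struct sh_ddrc *d)
{
	assert(d);
	sh_lock(d);
	d->eccclr = (SH_CLR_BITS | d->shadow_en) & SH_EN_BITS;
	sh_unlock(d);
}

/* IRQ-disabler side: the shadow and the register change together. */
static void sh_disable_irq(struct sh_ddrc *d)
{
	assert(d);
	sh_lock(d);
	d->shadow_en = 0;
	d->eccclr = d->shadow_en;
	sh_unlock(d);
}
```

Dropping the lock from either helper would bring back the stale-value
problem, this time with the shadow copy playing the role of the stale
temporary.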
-Serge(y)
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Apr 25, 2024 at 03:52:38PM +0300, Serge Semin wrote:
> Even if we get to add the spin-lock serializing the ECCCLR writes it
> won't solve the problem since the IRQ-disabler critical section could
> be executed a bit before the IRQ-handler critical section so the later
> one will just re-enable the IRQs disabled by the former one.
>
> Here is what is suggested in my patch to fix the problem:
>
> IRQ-handler | IRQ-disabler
> |
> zynqmp_get_error_info: |
> | lock_irqsave
> | ECCCLR = 0; // disable IRQs
> | unlock_irqrestore
> lock_irqsave; |
> tmp = ECCCLR | clear_sts_bits; |
> ECCCLR = tmp; |
> unlock_irqrestore; |
<--- I'm presuming here the IRQ-disabler will reenable interrupts at
some point?
Otherwise we have the same problem as before when interrupts remain off
after the IRQ handler has run.
Other than that, yes, I see it, we will need the locking.
Thanks for elaborating.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, May 06, 2024 at 12:20:29PM +0200, Borislav Petkov wrote:
> On Thu, Apr 25, 2024 at 03:52:38PM +0300, Serge Semin wrote:
> > Even if we get to add the spin-lock serializing the ECCCLR writes it
> > won't solve the problem since the IRQ-disabler critical section could
> > be executed a bit before the IRQ-handler critical section so the later
> > one will just re-enable the IRQs disabled by the former one.
> >
> > Here is what is suggested in my patch to fix the problem:
> >
> > IRQ-handler | IRQ-disabler
> > |
> > zynqmp_get_error_info: |
> > | lock_irqsave
> > | ECCCLR = 0; // disable IRQs
> > | unlock_irqrestore
> > lock_irqsave; |
> > tmp = ECCCLR | clear_sts_bits; |
> > ECCCLR = tmp; |
> > unlock_irqrestore; |
>
> <--- I'm presuming here the IRQ-disabler will reenable interrupts at
> some point?
>
> Otherwise we have the same problem as before when interrupts remain off
> after the IRQ handler has run.
In the sketch above the IRQ-disabler is the method which disables the
IRQ concurrently with the handler. After my patch is applied the
IRQ-handler will no longer touch the IRQ enable/disable bits, but will
preserve them as is:
- clearval = ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT;
- clearval |= ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT;
+ spin_lock_irqsave(&priv->reglock, flags);
+
+ clearval = readl(base + ECC_CLR_OFST) |
+ ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT |
+ ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT;
writel(clearval, base + ECC_CLR_OFST);
- writel(0x0, base + ECC_CLR_OFST);
+
+ spin_unlock_irqrestore(&priv->reglock, flags);
Thus there will be no need to re-enable the IRQs later in the handler:
@@ -576,8 +601,6 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
/* v3.0 of the controller does not have this register */
if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR))
writel(regval, priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
- else
- enable_intr(priv);
So the only IRQ-disabler left in the driver - disable_intr() - will be
called from the device/driver remove() function. The ECCCLR CSR access
will be guarded with the spin-lock in the IRQ-disabler and in the
IRQ-handler. So it will be safe to have them executed concurrently.
>
> Other than that, yes, I see it, we will need the locking.
>
> Thanks for elaborating.
Always welcome. Glad we've settled this.)
-Serge(y)
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Mon, May 06, 2024 at 02:27:50PM +0300, Serge Semin wrote:
> Always welcome. Glad we've settled this.)
Yap, it looks good so far.
Lemme queue it into urgent and send it Linuswards soon-ish.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Feb 22, 2024 at 09:12:47PM +0300, Serge Semin wrote:
> First of all the enum dev_type constants describe the memory DRAM chips
> used at the stick, not the entire DQ-bus width (see the enumeration kdoc
Which kdoc?
The kernel doc above enum dev_type in include/linux/edac.h?
In any case, you need to be precise pls.
> for details). So what is returned from the zynqmp_get_dtype() function and
> then specified to the dimm_info->dtype field is definitely incorrect.
Whoops, you lost me here. Why is it incorrect?
You want
"zynqmp_get_dtype - Return the controller memory width."
to return the memory width supported by the controller?
dimm->dtype = p_data->get_dtype(priv->baseaddr);
enum dev_type dtype; /* memory device type */
Yeah, no, that function returns the DIMM device type.
/me looks at the code.
Aha, so you mean the device width should be determined from that
DDRC_MSTR_CFG* thing.
> Secondly the DRAM chips type has nothing to do with the data bus width
> specified in the MSTR.data_bus_width CSR field. That CSR field just
> determines the part of the whole DQ-bus currently used to access the data
> from the all DRAM memory chips. So it doesn't indicate the individual
> chips type. Thirdly the DRAM chips type can be determined only in case of
> the DDR4 protocol by means of the MSTR.device_config field state (it is
Hold on, this driver runs on all kinds of hardware I presume. Are you
thinking about older ones which don't do DDR4?
Or does that thing do DDR4 only?
> supposed to be set by the system firmware). Finally the DW uMCTL2 DDRC ECC
> capability doesn't depend on the memory chips type. Moreover it doesn't
> depend on the utilized data bus width in runtime either. The IP-core
> reference manual says in [1,2] that the ECC support can't be enabled
> during the IP-core synthesizes for the DRAM data bus widths other than 16,
This sentence is missing something.
> 32 or 64. At the same time the bus width mode (MSTR.data_bus_width)
> doesn't change the ECC feature availability. Thus it was wrong to
> determine the ECC state with respect to the DQ-bus width mode.
You need to split your paragraphs with newlines to help readability.
Right now it is a blob of hard to parse text. For example, when you have
to write "Secondly, " that's your split right there. "Thirdly," is your
next newline. And so on.
> Fix all of the mistakes described above in the zynqmp_get_dtype() and
> zynqmp_get_ecc_state() methods: specify actual DRAM chips data width only
> for the DDR4 protocol and return that it's UNKNOWN in the rest of the
> cases;
What are the rest of the cases and why is it ok to return UNKNOWN all of
a sudden? IOW, how was the old code even tested?!
> determine ECC availability by the ECCCFG0.ecc_mode field state
> only (that field can't be modified anyway if the IP-core was synthesized
> with no ECC support).
>
> [1] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2)
> Databook, Version 3.91a, October 2020, p. 421.
> [2] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2)
> Databook, Version 3.91a, October 2020, p. 633.
Can those be freely accessed?
If not, you should say so.
> Fixes: b500b4a029d5 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
So this commit is in 4.20.
Does that mean that this fix needs to get backported to all stable
kernels?
Have you tested this on all hw this driver supports and made sure no
regressions are introduced?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Jun 04, 2024 at 08:38:15PM +0200, Borislav Petkov wrote:
> On Thu, Feb 22, 2024 at 09:12:47PM +0300, Serge Semin wrote:
> > First of all the enum dev_type constants describe the memory DRAM chips
> > used at the stick, not the entire DQ-bus width (see the enumeration kdoc
>
> Which kdoc?
>
> The kernel doc above enum dev_type in include/linux/edac.h?
Right.
>
> In any case, you need to be precise pls.
>
> > for details). So what is returned from the zynqmp_get_dtype() function and
> > then specified to the dimm_info->dtype field is definitely incorrect.
>
> Whoops, you lost me here. Why is it incorrect?
As I said because dev_type is the memory DRAM chips type (individual
DRAM chip data bus width), and not the entire DQ-bus width or its
currently active part. Even from that perspective the function name
and the subsequent return value utilization is incorrect.
Imagine the Xilinx ZynqMP has a 64-bit DQ-bus. Having
(MSTR.data_bus_width == DDRCTL_EWDTH_16) means that only a quarter of
the bus will be utilized to get data from the DRAMs. So the data buses
of all the connected DRAM chip(s) are _somehow_ distributed along the
16 bits of the DRAM controller DQ-bus. It can be a single DRAM chip
with a 16-bit DQ-bus, or two x8 DRAM chips, or four x4 DRAM chips, etc.
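The arithmetic behind that example can be sketched as below. The helper
name chips_on_active_bus() is hypothetical and exists only to show that
one and the same active bus width matches several chip types, so the
chip type can't be derived from it.

```c
/*
 * Number of DRAM chips of the given data width needed to fill the active
 * part of the DQ-bus. Returns 0 if the chip width is zero or does not
 * evenly divide the active width.
 */
unsigned int chips_on_active_bus(unsigned int active_bits,
				 unsigned int chip_bits)
{
	if (chip_bits == 0 || active_bits % chip_bits)
		return 0;

	return active_bits / chip_bits;
}
```

For a 16-bit active part this gives one x16 chip, two x8 chips or four
x4 chips, all of which are valid configurations.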
>
> You want
>
> "zynqmp_get_dtype - Return the controller memory width."
>
> to return the memory width supported by the controller?
>
> dimm->dtype = p_data->get_dtype(priv->baseaddr);
>
> enum dev_type dtype; /* memory device type */
>
> Yeah, no, that function returns the DIMM device type.
>
> /me looks at the code.
>
> Aha, so you mean the device width should be determined from that
> DDRC_MSTR_CFG* thing.
That's what is said in "Secondly" and "Thirdly".
>
> > Secondly the DRAM chips type has nothing to do with the data bus width
> > specified in the MSTR.data_bus_width CSR field. That CSR field just
> > determines the part of the whole DQ-bus currently used to access the data
> > from the all DRAM memory chips. So it doesn't indicate the individual
> > chips type. Thirdly the DRAM chips type can be determined only in case of
> > the DDR4 protocol by means of the MSTR.device_config field state (it is
>
> Hold on, this driver runs on all kinds of hardware I presume. Are you
> thinking about older ones which don't do DDR4?
>
> Or does that thing do DDR4 only?
First of all, not that much of the kinds. Just Xilinx ZynqMP DDRC
(based on the DW uMCTL 2.40a IP-core) and some version of DW uMCTL
3.80a owned by Dinh Nguyen which, by a lucky coincidence, turned
out to be mostly compatible with the Xilinx ZynqMP DDR controller.
Secondly I've checked that part on all the DW uMCTL2 databooks I've
got (I've got lots: versions 1.x, 2.x and 3.x). DW uMCTL2 v1.x doesn't
support DDR4. DW uMCTL2 v2.x and v3.x IP-cores do support DDR4 but
work as I described: the only way to determine the DRAM chips type is
to use the MSTR.device_config field content and for DDR4 only.
MSTR.data_bus_width field has nothing to do with that.
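A minimal sketch of that rule, assuming a hypothetical MSTR layout (DDR4
flag in bit 4, device_config in bits 31:30 with 0..3 encoding x4..x32).
The model_* names and the bit positions are illustration-only assumptions
and aren't taken from the databook or the driver.

```c
#include <stdint.h>

enum model_dev_type {
	MODEL_DEV_UNKNOWN,
	MODEL_DEV_X4,
	MODEL_DEV_X8,
	MODEL_DEV_X16,
	MODEL_DEV_X32,
};

/* Assumed field positions, for illustration only. */
#define MODEL_MSTR_DDR4		(1u << 4)
#define MODEL_MSTR_DEVCFG_SHIFT	30
#define MODEL_MSTR_DEVCFG_MASK	(0x3u << MODEL_MSTR_DEVCFG_SHIFT)

/* The chip type is known for DDR4 only, data_bus_width is never read. */
enum model_dev_type model_get_dtype(uint32_t mstr)
{
	if (!(mstr & MODEL_MSTR_DDR4))
		return MODEL_DEV_UNKNOWN;

	switch ((mstr & MODEL_MSTR_DEVCFG_MASK) >> MODEL_MSTR_DEVCFG_SHIFT) {
	case 0:
		return MODEL_DEV_X4;
	case 1:
		return MODEL_DEV_X8;
	case 2:
		return MODEL_DEV_X16;
	default:
		return MODEL_DEV_X32;
	}
}
```

Any non-DDR4 protocol ends up as unknown no matter what device_config
holds.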
>
> > supposed to be set by the system firmware).
> > Finally the DW uMCTL2 DDRC ECC
> > capability doesn't depend on the memory chips type. Moreover it doesn't
> > depend on the utilized data bus width in runtime either. The IP-core
> > reference manual says in [1,2] that the ECC support can't be enabled
> > during the IP-core synthesizes for the DRAM data bus widths other than 16,
>
> This sentence is missing something.
>
> > 32 or 64. At the same time the bus width mode (MSTR.data_bus_width)
> > doesn't change the ECC feature availability. Thus it was wrong to
> > determine the ECC state with respect to the DQ-bus width mode.
Sorry, but this part doesn't miss anything. It merely says that
neither the DRAM chips type nor the MSTR.data_bus_width value can be
used to determine the ECC support. According to the databooks the
ECC support can be available on the IP-cores whose _full_ DQ-bus width
is 16, 32 or 64. MSTR.data_bus_width, selecting the active part of the
full DQ-bus, doesn't change the ECC feature availability in any way.
From that perspective the use of zynqmp_get_dtype() in
zynqmp_get_ecc_state() has also been incorrect.
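As a rough sketch, the ECC state check then boils down to looking at
ECCCFG0.ecc_mode alone. The field mask and the SEC/DED encoding below are
assumptions made for the sake of the example, and model_ecc_enabled() is
a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ECCCFG0.ecc_mode field layout, for illustration only. */
#define MODEL_ECCCFG0_MODE_MASK	0x7u
#define MODEL_ECC_MODE_SECDED	0x4u

/* Neither the chip type nor the bus width mode takes part in the check. */
bool model_ecc_enabled(uint32_t ecccfg0)
{
	return (ecccfg0 & MODEL_ECCCFG0_MODE_MASK) == MODEL_ECC_MODE_SECDED;
}
```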
>
> You need to split your paragraphs with newlines to help readability.
> Right now it is a blob of hard to parse text. For example, when you have
> to write "Secondly, " that's your split right there. "Thirdly," is your
> next newline. And so on.
Ok.
>
> > Fix all of the mistakes described above in the zynqmp_get_dtype() and
> > zynqmp_get_ecc_state() methods: specify actual DRAM chips data width only
> > for the DDR4 protocol and return that it's UNKNOWN in the rest of the
> > cases;
>
> What are the rest of the cases and why is it ok to return UNKNOWN all of
> a sudden? IOW, how was the old code even tested?!
First of all, the MSTR.data_bus_width field can have only one of the
following three values: 0x1, 0x2 and 0x3. All of them are handled in
zynqmp_get_dtype(). So the current (incorrect) implementation will
never return DEV_UNKNOWN.
Secondly, dimm->dtype isn't utilized for something significant in the
EDAC subsystem, but is just exposed to the user-space via the dev_type
sysfs node.
So based on that my bet is that, since the incorrect code didn't affect
the main driver functionality and since dimm->dtype is just
exposed to user-space, the bug has been living just fine unnoticed up
until I started digging into the original DW uMCTL2 HW-manuals,
started studying the driver code, and decided to convert the driver to
support the generic version of the DW uMCTL2 controller (not only the
Xilinx version of it). That's what this series and the next two
are about: converting the driver to support truly generic DW
uMCTL controllers.
>
> > determine ECC availability by the ECCCFG0.ecc_mode field state
> > only (that field can't be modified anyway if the IP-core was synthesized
> > with no ECC support).
> >
> > [1] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2)
> > Databook, Version 3.91a, October 2020, p. 421.
> > [2] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2)
> > Databook, Version 3.91a, October 2020, p. 633.
>
> Can those be freely accessed?
>
> If not, you should say so.
No, they can't be.
>
> > Fixes: b500b4a029d5 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
>
> So this commit is in 4.20.
>
> Does that mean that this fix needs to get backported to all stable
> kernels?
It's up to the stable maintainers to decide.
>
> Have you tested this on all hw this driver supports and made sure no
> regressions are introduced?
I've tested it on the devices with DW uMCTL 2.51a + DDR3 memory and DW
uMCTL 3.10a + DDR4 memory. I am sure this will work for Xilinx ZynqMP
too, especially seeing we've already got the Shubhrajyoti Datta Rb
tag:
https://lore.kernel.org/linux-edac/CAKfKVtErVuCM+pa1e7Lwt0DUU-t-U0eNRnZSw39pfsZ8gv8QZQ@mail.gmail.com/
-Serge(y)
On Wed, Jun 05, 2024 at 01:11:27AM +0300, Serge Semin wrote:
> As I said because dev_type is the memory DRAM chips type (individual
> DRAM chip data bus width), and not the entire DQ-bus width or its
> currently active part. Even from that perspective the function name
> and the subsequent return value utilization is incorrect.
Well, maybe the author misunderstood it but the result of this goes to
sysfs:
dimm->dtype = p_data->get_dtype(priv->baseaddr);
which is in Documentation/ABI/testing/sysfs-devices-edac:
What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_dev_type
Date: April 2012
Contact: Mauro Carvalho Chehab <[email protected]>
[email protected]
Description: This attribute file will display what type of DRAM device is
being utilized on this DIMM (x1, x2, x4, x8, ...).
So you'd need to fix the comment above zynqmp_get_dtype() or I can do so
too while applying.
> First of all, not that much of the kinds.
What does that mean: "not that much of the kinds"?
> Just Xilinx ZynqMP DDRC (based on the DW uMCTL 2.40a IP-core) and some
> version of DW uMCTL 3.80a being possessed by Dinh Nguyen and, by
> a lucky coincident, turned to be mainly compatibly with the Xilinx
> ZynqMP DDR controller.
Then Dinh better holler here what the story is.
> > > 32 or 64. At the same time the bus width mode (MSTR.data_bus_width)
> > > doesn't change the ECC feature availability. Thus it was wrong to
> > > determine the ECC state with respect to the DQ-bus width mode.
>
> Sorry, but this part doesn't miss anything.
Grammatically:
"The IP-core reference manual says in [1,2] that the ECC support can't
be enabled during the IP-core synthesizes for the DRAM data bus widths
other than 16,..."
"synthesizes" looks wrong.
It either needs to be
"... be enabled *while* the IP-core synthesizes for the DRAM..." which
still doesn't make too much sense.
Or
"... be enabled during the IP-core *synthesis* for the DRAM..."
I don't know what you mean with that "synthesizes" thing.
> First of all, MSTR.data_bus_width field can have only one of the next
> three values: 0x1, 0x2 and 0x3. All of them are handled in
> zynqmp_get_dtype(). So in the current (incorrect) implementation it
> will never return DEV_UNKNOWN.
>
> Secondly, dimm->dtype isn't utilized for something significant in the
> EDAC subsystem, but is just exposed to the user-space via the dev_type
> sysfs node.
See above.
> So based on that my bet is that since the incorrect code didn't affect
> the main driver functionality and since the dimm->dtype is just
> exposed to user-space, the bug has been living just fine unnoticed up
> until I started digging into the original DW uMCTL2 HW-manuals,
> started studying the driver code, and decided to convert the driver to
> supporting generic version of the DW uMCTL2 controller (not only the
> Xilinx version of it). That's what this series and the next two ones
> are about - about converting the driver to supporting truly generic DW
> uMCTL controllers.
I absolutely don't have a problem with that - good idea!
However, we don't break machines and don't introduce regressions.
> > Can those be freely accessed?
> >
> > If not, you should say so.
>
> No, they can't be.
Then you don't need to mention them.
>
> >
> > > Fixes: b500b4a029d5 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
> >
> > So this commit is in 4.20.
> >
> > Does that mean that this fix needs to get backported to all stable
> > kernels?
>
> It's up to the stable maintainers to decide.
Haha, you're funny. How can the stable maintainers know whether each
patch that has Fixes: tags is stable material?
Nope, that's up to the maintainer to decide.
> I've tested it on the devices with DW uMCTL 2.51a + DDR3 memory and DW
> uMCTL 3.10a + DDR4 memory. I am sure this will work for Xilinx ZynqMP
> too, especially seeing we've already got the Shubhrajyoti Datta Rb
> tag:
Yes, I've asked him to review that driver because this is not something
I have or use and so on...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Jun 10, 2024 at 10:00:37AM +0200, Borislav Petkov wrote:
> On Wed, Jun 05, 2024 at 01:11:27AM +0300, Serge Semin wrote:
> > As I said because dev_type is the memory DRAM chips type (individual
> > DRAM chip data bus width), and not the entire DQ-bus width or its
> > currently active part. Even from that perspective the function name
> > and the subsequent return value utilization is incorrect.
>
> Well, maybe the author misunderstood it but the result of this goes to
> sysfs:
>
> dimm->dtype = p_data->get_dtype(priv->baseaddr);
>
> which is in Documentation/ABI/testing/sysfs-devices-edac:
>
> What: /sys/devices/system/edac/mc/mc*/(dimm|rank)*/dimm_dev_type
> Date: April 2012
> Contact: Mauro Carvalho Chehab <[email protected]>
> [email protected]
> Description: This attribute file will display what type of DRAM device is
> being utilized on this DIMM (x1, x2, x4, x8, ...).
>
> So you'd need to fix the comment above zynqmp_get_dtype() or I can do so
> too while applying.
Right. I missed the comment indeed. Thanks for spotting that. If it
isn't too much of a burden, please fix it when merging the patch in.
If I need to release v7 with this patch included, then I'll fix the
comment myself.
>
> > First of all, not that much of the kinds.
>
> What does that mean: "not that much of the kinds"?
The answer followed that phrase. There are just two devices: the ZynqMP
DDRC and a DW uMCTL v3.80a-based DDRC available on Dinh Nguyen's
device. Seeing that not many changes were made to the driver to
support that controller, the controller must have been compatible with
the Xilinx ZynqMP DDRC in the vast majority of the DW uMCTL2 DDRC
parameters/features.
>
> > Just Xilinx ZynqMP DDRC (based on the DW uMCTL 2.40a IP-core) and some
> > version of DW uMCTL 3.80a being possessed by Dinh Nguyen and, by
> > a lucky coincident, turned to be mainly compatibly with the Xilinx
> > ZynqMP DDR controller.
>
> Then Dinh better holler here what the story is.
>
> > > > 32 or 64. At the same time the bus width mode (MSTR.data_bus_width)
> > > > doesn't change the ECC feature availability. Thus it was wrong to
> > > > determine the ECC state with respect to the DQ-bus width mode.
> >
> > Sorry, but this part doesn't miss anything.
>
> Gramatically:
>
> "The IP-core reference manual says in [1,2] that the ECC support can't
> be enabled during the IP-core synthesizes for the DRAM data bus widths
> other than 16,..."
>
> "synthesizes" looks wrong.
>
> It either needs to be
>
> "... be enabled *while* the IP-core synthesizes for the DRAM..." which
> still doesn't make too much sense.
>
> Or
>
> "... be enabled during the IP-core *synthesis* for the DRAM..."
>
> I don't know what you mean with that "synthesizes" thing.
But you know what it would mean had "synthesis" been used, right?
If not, I'll explain. If yes, then you're right. My mistake. I confused two
letters. I'll fix it in v7 should the patch need to be included there.
>
> > First of all, MSTR.data_bus_width field can have only one of the next
> > three values: 0x1, 0x2 and 0x3. All of them are handled in
> > zynqmp_get_dtype(). So in the current (incorrect) implementation it
> > will never return DEV_UNKNOWN.
> >
> > Secondly, dimm->dtype isn't utilized for something significant in the
> > EDAC subsystem, but is just exposed to the user-space via the dev_type
> > sysfs node.
>
> See above.
>
> > So based on that my bet is that since the incorrect code didn't affect
> > the main driver functionality and since the dimm->dtype is just
> > exposed to user-space, the bug has been living just fine unnoticed up
> > until I started digging into the original DW uMCTL2 HW-manuals,
> > started studying the driver code, and decided to convert the driver to
> > supporting generic version of the DW uMCTL2 controller (not only the
> > Xilinx version of it). That's what this series and the next two ones
> > are about - about converting the driver to supporting truly generic DW
> > uMCTL controllers.
>
> I absolutely don't have a problem with that - good idea!
>
> However, we don't break machines and don't introduce regressions.
Who would have argued.)
>
> > > Can those be freely accessed?
> > >
> > > If not, you should say so.
> >
> > No, they can't be.
>
> Then you don't need to mention them.
Well, I see it otherwise. If you possess the databook, then by using the
references you can find the info there straight away with no need to
struggle through the _1.5K_-page file. If you don't have one, then
you can just skip that part of the log.
So I'd rather leave the refs in the log.
>
> >
> > >
> > > > Fixes: b500b4a029d5 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
> > >
> > > So this commit is in 4.20.
> > >
> > > Does that mean that this fix needs to get backported to all stable
> > > kernels?
> >
> > It's up to the stable maintainers to decide.
>
> Haha, you're funny. How can the stable maintainers know whether each
> patch that has Fixes: tags is stable material?
>
> Nope, that's up to the maintainer to decide.
... and the review committee, and the linux-kernel list members may
participate in the discussion too. But that's not the point here,
right?
Anyway, if you wish to know my opinion, I really don't have a
strong one regarding this patch. On the one hand the patch does fix an
"oh, that's not good" issue. That's why I added the Fixes tag. On
the other hand the problem has been here unnoticed for years and
nobody cared. The only things the incorrect method implementation
affected were the wrong value returned to user-space via the
sysfs node, and the first part of the ECC-availability test procedure,
which turned out to be redundant anyway since the zynqmp_get_dtype()
method never returns DEV_UNKNOWN. So my conclusion is the same. It's up
to the maintainers to decide.
>
> > I've tested it on the devices with DW uMCTL 2.51a + DDR3 memory and DW
> > uMCTL 3.10a + DDR4 memory. I am sure this will work for Xilinx ZynqMP
> > too, especially seeing we've already got the Shubhrajyoti Datta Rb
> > tag:
>
> Yes, I've asked him to review that driver because this is not something
> I have or use and so on...
As you can see, I do, and for two major IP-core versions (plus plenty
of the DW uMCTL2 IP-core databooks). So should you need some help with
testing the bits coming for the Synopsys DW uMCTL2 EDAC driver, just
send me a ping. I'll test them out.
-Serge(y)