LinuxLists.cc - [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

2012-10-31 13:58:26

Subject: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

There's a know bug that happens when apei/ghes is loaded together
with an EDAC module: the same error is reported several times,
as ghes calls mcelog, with, in tune, calls edac.

On some cases, the error reported is even mangled by ghes, as it
reports both Nehalem and Sandy Bridge errors with the same format
to MCE log. However, the error decoding is different for each.

Fixing it requires some work, making both EDAC and GHES know
each other.

My idea is to make apei/ghes.c to register itself at edac, using
edac_mc_add_mc() call (or something similar to it), and report
errors using a light version of edac_mc_handle_error().

Assuming that ghes got registered at the EDAC core (if CONFIG_EDAC),
then all we need to do at EDAC core is to be sure that there will
be there just one module providing error reports for the system
memory.

The patch below is just the first step for it. On my better
undestanding at the EDAC subsystem, I don't think that there are
any systems where more than one different EDAC module could be
used, as hardware have just one memory controller type. However,
if this is ever used on any weird place, we may need to re-think
about this proposed solution. That's why I'm submitting this
RFC patch in such early stage.

-

APEI GHES and i7core_edac/sb_edac currently can be loaded at
the same time, but those are Highlander modules:
"There can be only one".

So, we need to add an module owner's lock at the EDAC core,
in order to avoid having two different modules handling memory
errors at the same time.

A change is also needed at apei/ghes.c, in order to register
it as an EDAC module, in order to properly enable such lock.

Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---
drivers/edac/amd64_edac.c | 2 +-
drivers/edac/amd76x_edac.c | 2 +-
drivers/edac/cell_edac.c | 2 +-
drivers/edac/cpc925_edac.c | 2 +-
drivers/edac/e752x_edac.c | 2 +-
drivers/edac/e7xxx_edac.c | 2 +-
drivers/edac/edac_core.h | 2 +-
drivers/edac/edac_mc.c | 30 +++++++++++++++++++++++++-----
drivers/edac/highbank_mc_edac.c | 2 +-
drivers/edac/i3000_edac.c | 2 +-
drivers/edac/i3200_edac.c | 2 +-
drivers/edac/i5000_edac.c | 2 +-
drivers/edac/i5100_edac.c | 2 +-
drivers/edac/i5400_edac.c | 2 +-
drivers/edac/i7300_edac.c | 2 +-
drivers/edac/i7core_edac.c | 2 +-
drivers/edac/i82443bxgx_edac.c | 2 +-
drivers/edac/i82860_edac.c | 2 +-
drivers/edac/i82875p_edac.c | 2 +-
drivers/edac/i82975x_edac.c | 2 +-
drivers/edac/mpc85xx_edac.c | 2 +-
drivers/edac/mv64x60_edac.c | 2 +-
drivers/edac/pasemi_edac.c | 2 +-
drivers/edac/ppc4xx_edac.c | 2 +-
drivers/edac/r82600_edac.c | 2 +-
drivers/edac/sb_edac.c | 2 +-
drivers/edac/tile_edac.c | 2 +-
drivers/edac/x38_edac.c | 2 +-
28 files changed, 52 insertions(+), 32 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 5a297a2..6aeb7e1 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2601,7 +2601,7 @@ static int amd64_init_one_instance(struct pci_dev *F2)
mci->edac_cap = EDAC_FLAG_NONE;

ret = -ENODEV;
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(1, "failed edac_mc_add_mc()\n");
goto err_add_mc;
}
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 29eeb68..b4bf72e 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -275,7 +275,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
/* Here we assume that we will never see multiple instances of this
* type of memory controller. The ID is therefore hardcoded to 0.
*/
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto fail;
}
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index a1bbd8e..738bd19 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -223,7 +223,7 @@ static int __devinit cell_edac_probe(struct platform_device *pdev)
cell_edac_init_csrows(mci);

/* Register with EDAC core */
- rc = edac_mc_add_mc(mci);
+ rc = edac_mc_add_mc(mci, THIS_MODULE);
if (rc) {
dev_err(&pdev->dev, "failed to register with EDAC core\n");
edac_mc_free(mci);
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index c2ef134..6afcc52 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -1016,7 +1016,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
/* Setup memory controller registers */
cpc925_mc_init(mci);

- if (edac_mc_add_mc(mci) > 0) {
+ if (edac_mc_add_mc(mci, THIS_MODULE) > 0) {
cpc925_mc_printk(mci, KERN_ERR, "Failed edac_mc_add_mc()\n");
goto err3;
}
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index a5ed6b7..e63cd71 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1358,7 +1358,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
/* Here we assume that we will never see multiple instances of this
* type of memory controller. The ID is therefore hardcoded to 0.
*/
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto fail;
}
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 9ff57f3..6bdff5f 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -498,7 +498,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
/* Here we assume that we will never see multiple instances of this
* type of memory controller. The ID is therefore hardcoded to 0.
*/
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto fail1;
}
diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index 23bb99f..db30694 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -446,7 +446,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned mc_num,
unsigned n_layers,
struct edac_mc_layer *layers,
unsigned sz_pvt);
-extern int edac_mc_add_mc(struct mem_ctl_info *mci);
+extern int edac_mc_add_mc(struct mem_ctl_info *mci, struct module *owner);
extern void edac_mc_free(struct mem_ctl_info *mci);
extern struct mem_ctl_info *edac_mc_find(int idx);
extern struct mem_ctl_info *find_mci_by_dev(struct device *dev);
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 8bb27f2..d4d891a 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -42,6 +42,12 @@
static DEFINE_MUTEX(mem_ctls_mutex);
static LIST_HEAD(mc_devices);

+/*
+ * Used to lock EDAC MC to just one module, avoiding two drivers e. g.
+ * apei/ghes and i7core_edac to be used at the same time.
+ */
+static struct module *edac_mc_owner;
+
unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
unsigned len)
{
@@ -674,9 +680,9 @@ fail1:
return 1;
}

-static void del_mc_from_global_list(struct mem_ctl_info *mci)
+static int del_mc_from_global_list(struct mem_ctl_info *mci)
{
- atomic_dec(&edac_handlers);
+ int handlers = atomic_dec_return(&edac_handlers);
list_del_rcu(&mci->link);

/* these are for safe removal of devices from global list while
@@ -684,6 +690,8 @@ static void del_mc_from_global_list(struct mem_ctl_info *mci)
*/
synchronize_rcu();
INIT_LIST_HEAD(&mci->link);
+
+ return handlers;
}

/**
@@ -725,8 +733,9 @@ EXPORT_SYMBOL(edac_mc_find);
*/

/* FIXME - should a warning be printed if no error detection? correction? */
-int edac_mc_add_mc(struct mem_ctl_info *mci)
+int edac_mc_add_mc(struct mem_ctl_info *mci, struct module *owner)
{
+ int ret = -EINVAL;
edac_dbg(0, "\n");

#ifdef CONFIG_EDAC_DEBUG
@@ -757,6 +766,11 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
#endif
mutex_lock(&mem_ctls_mutex);

+ if (edac_mc_owner && edac_mc_owner != owner) {
+ ret = -EPERM;
+ goto fail0;
+ }
+
if (add_mc_to_global_list(mci))
goto fail0;

@@ -783,6 +797,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
edac_mc_printk(mci, KERN_INFO, "Giving out device to '%s' '%s':"
" DEV %s\n", mci->mod_name, mci->ctl_name, edac_dev_name(mci));

+ edac_mc_owner = owner;
+
mutex_unlock(&mem_ctls_mutex);
return 0;

@@ -791,7 +807,9 @@ fail1:

fail0:
mutex_unlock(&mem_ctls_mutex);
- return 1;
+ edac_mc_owner = owner;
+
+ return ret;
}
EXPORT_SYMBOL_GPL(edac_mc_add_mc);

@@ -817,7 +835,9 @@ struct mem_ctl_info *edac_mc_del_mc(struct device *dev)
return NULL;
}

- del_mc_from_global_list(mci);
+ if (!del_mc_from_global_list(mci)) {
+ edac_mc_owner = NULL;
+ }
mutex_unlock(&mem_ctls_mutex);

/* flush workq processes */
diff --git a/drivers/edac/highbank_mc_edac.c b/drivers/edac/highbank_mc_edac.c
index c769f47..ac92269 100644
--- a/drivers/edac/highbank_mc_edac.c
+++ b/drivers/edac/highbank_mc_edac.c
@@ -219,7 +219,7 @@ static int __devinit highbank_mc_probe(struct platform_device *pdev)
dimm->mtype = MEM_DDR3;
dimm->edac_mode = EDAC_SECDED;

- res = edac_mc_add_mc(mci);
+ res = edac_mc_add_mc(mci, THIS_MODULE);
if (res < 0)
goto err;

diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index d3d19cc..48005a7 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -427,7 +427,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
I3000_ERRSTS_BITS);

rc = -ENODEV;
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "MC: failed edac_mc_add_mc()\n");
goto fail;
}
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index b6653a6..de44881 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -402,7 +402,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
i3200_clear_error_info(mci);

rc = -ENODEV;
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "MC: failed edac_mc_add_mc()\n");
goto fail;
}
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 6a49dd0..d93d205 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1449,7 +1449,7 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
}

/* add this new MC control structure to EDAC's list of MCs */
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
/* FIXME: perhaps some code should go here that disables error
* reporting if we just enabled it
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index c4b5e5f..3df2042 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -989,7 +989,7 @@ static int __devinit i5100_init_one(struct pci_dev *pdev,
break;
}

- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
ret = -ENODEV;
goto bail_scrub;
}
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 2772469..f77aa37 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1333,7 +1333,7 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
}

/* add this new MC control structure to EDAC's list of MCs */
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
/* FIXME: perhaps some code should go here that disables error
* reporting if we just enabled it
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index a09d066..f34c004 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -1084,7 +1084,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
}

/* add this new MC control structure to EDAC's list of MCs */
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
/* FIXME: perhaps some code should go here that disables error
* reporting if we just enabled it
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 3672101..bccd429 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -2258,7 +2258,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
enable_sdram_scrub_setting(mci);

/* add this new MC control structure to EDAC's list of MCs */
- if (unlikely(edac_mc_add_mc(mci))) {
+ if (unlikely(edac_mc_add_mc(mci, THIS_MODULE))) {
edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
/* FIXME: perhaps some code should go here that disables error
* reporting if we just enabled it
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 90f303d..761418a 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -326,7 +326,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
mci->edac_check = i82443bxgx_edacmc_check;
mci->ctl_page_to_phys = NULL;

- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto fail;
}
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 1faa749..18772e8 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -227,7 +227,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
/* Here we assume that we will never see multiple instances of this
* type of memory controller. The ID is therefore hardcoded to 0.
*/
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto fail;
}
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 3e416b1..f8a4b7f 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -446,7 +446,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
/* Here we assume that we will never see multiple instances of this
* type of memory controller. The ID is therefore hardcoded to 0.
*/
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto fail1;
}
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 082e91e..5993d5b 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -663,7 +663,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
i82975x_get_error_info(mci, &discard); /* clear counters */

/* finalize this instance of memory controller with edac core */
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto fail2;
}
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index a1e791e..97f7d1b 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -1063,7 +1063,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
/* clear all error bits */
out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT, ~0);

- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto err;
}
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 2b315c2..80ba5a6 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -789,7 +789,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
ctl = (ctl & 0xff00ffff) | 0x10000;
out_le32(pdata->mc_vbase + MV64X60_SDRAM_ERR_ECC_CNTL, ctl);

- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto err;
}
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 2d35b78..d38c05a 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -255,7 +255,7 @@ static int __devinit pasemi_edac_probe(struct pci_dev *pdev,
*/
pasemi_edac_get_error_info(mci);

- if (edac_mc_add_mc(mci))
+ if (edac_mc_add_mc(mci, THIS_MODULE))
goto fail;

/* get this far and it's successful */
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index bf09576..1188071 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -1315,7 +1315,7 @@ static int __devinit ppc4xx_edac_probe(struct platform_device *op)
* and, if necessary, register interrupts.
*/

- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
ppc4xx_edac_mc_printk(KERN_ERR, mci,
"Failed to add instance!\n");
status = -ENODEV;
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index f854deb..3515a01 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -327,7 +327,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
/* Here we assume that we will never see multiple instances of this
* type of memory controller. The ID is therefore hardcoded to 0.
*/
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "failed edac_mc_add_mc()\n");
goto fail;
}
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 9ad2424..c61d281 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1669,7 +1669,7 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
mci->pdev = &sbridge_dev->pdev[0]->dev;

/* add this new MC control structure to EDAC's list of MCs */
- if (unlikely(edac_mc_add_mc(mci))) {
+ if (unlikely(edac_mc_add_mc(mci, THIS_MODULE))) {
edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
rc = -EINVAL;
goto fail0;
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index 1e904b7..3f25408 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -176,7 +176,7 @@ static int __devinit tile_edac_mc_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, mci);

/* Register with EDAC core */
- rc = edac_mc_add_mc(mci);
+ rc = edac_mc_add_mc(mci, THIS_MODULE);
if (rc) {
dev_err(&pdev->dev, "failed to register with EDAC core\n");
edac_mc_free(mci);
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 08a9926..dd12aab 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -401,7 +401,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
x38_clear_error_info(mci);

rc = -ENODEV;
- if (edac_mc_add_mc(mci)) {
+ if (edac_mc_add_mc(mci, THIS_MODULE)) {
edac_dbg(3, "MC: failed edac_mc_add_mc()\n");
goto fail;
}
--
1.7.11.7

2012-11-01 11:05:19

by Borislav Petkov

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

+ Tony.

On Wed, Oct 31, 2012 at 11:58:15AM -0200, Mauro Carvalho Chehab wrote:
> There's a know bug that happens when apei/ghes is loaded together
> with an EDAC module: the same error is reported several times,
> as ghes calls mcelog, with, in tune, calls edac.

This is exactly why I think APEI is crap. So it is a completely useless
additional layer between the MCA code and the rest.

The #MC handler runs, logs the error, and then a split happens which
runs in parallel:

* we do mce_log which carries the error to EDAC
* we enter APEI, do some mumbo jumbo and then do mce_log AGAIN! Wtf?

So, in order to sort this out properly, let's take a step back first:
what do we actually want to do?

* the error coming from APEI still needs to get decoded by EDAC? If yes,
then WTF we need APEI for anyway?

* the error coming from APEI is already decoded, so no need for EDAC? I
highly doubt that.

* add a filter to the MCE code so that certain types of errors are not
reported by it but by APEI so that the double reporting doesn't happen?

Right about now, I'm open for hints as to why we need that APEI crap at
all. And I don't want to hear that "clear interface so that OS coders
don't need to know the hardware" bullshit argument from the sick world
of windoze.

Thanks.

--
Regards/Gruss,
Boris.

2012-11-01 11:47:39

by Mauro Carvalho Chehab

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

Em Thu, 1 Nov 2012 12:05:12 +0100
Borislav Petkov <[email protected]> escreveu:

> + Tony.
>
> On Wed, Oct 31, 2012 at 11:58:15AM -0200, Mauro Carvalho Chehab wrote:
> > There's a know bug that happens when apei/ghes is loaded together
> > with an EDAC module: the same error is reported several times,
> > as ghes calls mcelog, with, in tune, calls edac.
>
> This is exactly why I think APEI is crap. So it is a completely useless
> additional layer between the MCA code and the rest.

I agree with you on that: getting data directly from the MC is, IMHO,
more reliable, see below for the reasons why we need this.

> The #MC handler runs, logs the error, and then a split happens which
> runs in parallel:
>
> * we do mce_log which carries the error to EDAC
> * we enter APEI, do some mumbo jumbo and then do mce_log AGAIN! Wtf?
>
> So, in order to sort this out properly, let's take a step back first:
> what do we actually want to do?

I can give you more details in person next week, but, basically, there are
a few issues we're trying to solve:

1) when both APEI/GHES and sb_edac are loaded, error reports are
inconsistent: race issues; bad APEI/MCE interface, etc. So, there's
curently a bug that needs to be fixed;

2) some vendors refuse to support EDAC[1];

3) there are some really complex environments with memory hot-plugging,
mirrored memory, spare memories, etc where only the BIOS may provide
a reliable information about the DIMM location, as the configuration
may change dynamically at runtime.

[1] they claim that the firmware provided errors are more reliable
than reading directly from hardware, as they have some special
heuristics logic on their BIOS that detects the difference between a
simple interference and a damaged memory.

> * the error coming from APEI still needs to get decoded by EDAC? If yes,
> then WTF we need APEI for anyway?

That's a good question. I understood on some discussions we had, that APEI
would be able to provide the DIMM label. However, I didn't find any field
with such information there at APEI mem_err struct.

So, either there are something missing (maybe DIMM labels are part of
APEI 5.0), or we'll still need EDAC decoding logic to get the DIMM.

> * the error coming from APEI is already decoded, so no need for EDAC? I
> highly doubt that.

The interface I wrote is a "minimum EDAC" interface: it currently bypasses
almost all EDAC error logic; it only uses the EDAC way to report errors: via
trace and/or via printk. E. g. it is almost a direct call to the RAS tracing
facility.

I did this because I assumed that there's a way to get the DIMM labels
directly at apei/ghes.c.

> * add a filter to the MCE code so that certain types of errors are not
> reported by it but by APEI so that the double reporting doesn't happen?

Take a look at arch/x86/kernel/cpu/mcheck/mce-apei.c:

void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
{
struct mce m;

/* Only corrected MC is reported */
if (!corrected || !(mem_err->validation_bits &
CPER_MEM_VALID_PHYSICAL_ADDRESS))
return;

mce_setup(&m);
m.bank = 1;
/* Fake a memory read corrected error with unknown channel */
m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
m.addr = mem_err->physical_addr;
mce_log(&m);
mce_notify_irq();
}

Bank information there is fake; status is fake. Only addr is really filled
there; it works only for corrected errors.

Also if you try to decode this, the logic will likely fail, as not all
fields used by either i7core_edac/sb_edac parsers or by userspace decoders
are filled there.

For it to work, apei_mce_report_mem_error() would require a complex logic,
that would identify what kind of CPU is in the system, emulating every single
detail of the error reports there, with would be complex, and will be reversed
in userspace anyway.

So, IMO, the APEI-MCE integration interface should be simply removed, in favor
of reporting errors using the EDAC/RAS interface.

> Right about now, I'm open for hints as to why we need that APEI crap at
> all. And I don't want to hear that "clear interface so that OS coders
> don't need to know the hardware" bullshit argument from the sick world
> of windoze.

--
Regards,
Mauro

2012-11-01 17:25:29

by Luck, Tony

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

On Thu, Nov 1, 2012 at 4:47 AM, Mauro Carvalho Chehab
<[email protected]> wrote:
> Take a look at arch/x86/kernel/cpu/mcheck/mce-apei.c:
>
> void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
> {
> struct mce m;
>
> /* Only corrected MC is reported */
> if (!corrected || !(mem_err->validation_bits &
> CPER_MEM_VALID_PHYSICAL_ADDRESS))
> return;
>
> mce_setup(&m);
> m.bank = 1;
> /* Fake a memory read corrected error with unknown channel */
> m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
> m.addr = mem_err->physical_addr;
> mce_log(&m);
> mce_notify_irq();
> }
>
> Bank information there is fake; status is fake. Only addr is really filled
> there; it works only for corrected errors.

This went in like this to help out the Westmere-EX processors that
didn't fill out MCi_ADDR for corrected errors. APEI could get the
address from some platform CSRs ... reporting via /dev/mcelog
so that predictive analysis in mcelog(8) would work on these machines.

I don't think we can rip it out yet ... not until those machines are
shuffled off to recycle heaven.

But perhaps we should get smarter about which machines we enable
APEI on? If we get everything we need from the machine check banks,
then the detour via the BIOS to report the same thing again isn't helpful.

-Tony

2012-11-01 17:47:53

by Mauro Carvalho Chehab

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

Em Thu, 1 Nov 2012 10:25:23 -0700
Tony Luck <[email protected]> escreveu:

> On Thu, Nov 1, 2012 at 4:47 AM, Mauro Carvalho Chehab
> <[email protected]> wrote:
> > Take a look at arch/x86/kernel/cpu/mcheck/mce-apei.c:
> >
> > void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
> > {
> > struct mce m;
> >
> > /* Only corrected MC is reported */
> > if (!corrected || !(mem_err->validation_bits &
> > CPER_MEM_VALID_PHYSICAL_ADDRESS))
> > return;
> >
> > mce_setup(&m);
> > m.bank = 1;
> > /* Fake a memory read corrected error with unknown channel */
> > m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
> > m.addr = mem_err->physical_addr;
> > mce_log(&m);
> > mce_notify_irq();
> > }
> >
> > Bank information there is fake; status is fake. Only addr is really filled
> > there; it works only for corrected errors.
>
> This went in like this to help out the Westmere-EX processors that
> didn't fill out MCi_ADDR for corrected errors. APEI could get the
> address from some platform CSRs ... reporting via /dev/mcelog
> so that predictive analysis in mcelog(8) would work on these machines.

Ok, but it is broken on other platforms like Sandy Bridge.

> I don't think we can rip it out yet ... not until those machines are
> shuffled off to recycle heaven.

Perhaps then we could add a logic at apei-mce to only forward errors to
MCE on the platforms where the MCE log is known to be right.

> But perhaps we should get smarter about which machines we enable
> APEI on?

That makes sense. IMO, APEI should be on by default only if no other driver
exists, like in the case of Nehalem-EX. For platforms supported by i7core_edac,
sb_edac and amd64_edac, we could add a parameter to explicitly force it to
be on, otherwise, APEI will be disabled.

> If we get everything we need from the machine check banks,
> then the detour via the BIOS to report the same thing again isn't helpful.

Agreed.

Regards,
Mauro

2012-11-01 19:55:18

by Borislav Petkov

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

On Thu, Nov 01, 2012 at 09:47:21AM -0200, Mauro Carvalho Chehab wrote:
> 1) when both APEI/GHES and sb_edac are loaded, error reports are
> inconsistent: race issues; bad APEI/MCE interface, etc. So, there's
> curently a bug that needs to be fixed;

That's correct. And we probably could add some filter logic to mce_log
path so that it doesn't call straight into EDAC over the notifier but go
over APEI.

But as you say, if APEI fakes error information, then it is worth shit.

> 2) some vendors refuse to support EDAC[1];
>
> 3) there are some really complex environments with memory hot-plugging,
> mirrored memory, spare memories, etc where only the BIOS may provide
> a reliable information about the DIMM location, as the configuration
> may change dynamically at runtime.

That is correct, unfortunately. That information is not available to
software in all cases. Maybe APEI could be used for that DIMM location
mapping through simple tables instead of letting it fumble the error
handling path.

> [1] they claim that the firmware provided errors are more reliable
> than reading directly from hardware, as they have some special
> heuristics logic on their BIOS that detects the difference between a
> simple interference and a damaged memory.

Let me guess: they do thresholding. And we actually have that already :)

> > * the error coming from APEI still needs to get decoded by EDAC? If yes,
> > then WTF we need APEI for anyway?
>
> That's a good question. I understood on some discussions we had, that APEI
> would be able to provide the DIMM label. However, I didn't find any field
> with such information there at APEI mem_err struct.
>
> So, either there are something missing (maybe DIMM labels are part of
> APEI 5.0), or we'll still need EDAC decoding logic to get the DIMM.

Right, so we should rely on BIOS telling us what the DIMM mapping is,
considering the fact that it does all the DRAM training during boot and
has intimate knowledge of the DIMM layout, in general.

APEI, being a WHEA port to the ACPI spec and thus Linux, is kinda
useless as I see it.

[ … ]

> Bank information there is fake; status is fake. Only addr is really filled
> there; it works only for corrected errors.
>
> Also if you try to decode this, the logic will likely fail, as not all
> fields used by either i7core_edac/sb_edac parsers or by userspace decoders
> are filled there.
>
> For it to work, apei_mce_report_mem_error() would require a complex logic,
> that would identify what kind of CPU is in the system, emulating every single
> detail of the error reports there, with would be complex, and will be reversed
> in userspace anyway.
>
> So, IMO, the APEI-MCE integration interface should be simply removed, in favor
> of reporting errors using the EDAC/RAS interface.

Yes, my other concern I had is that once APEI is cast in the firmware,
it cannot be changed (or at least very hard: I've tried to get OEMs to
update their BIOS for different reasons but it has almost always ended
in nirvana). So, I want firmware error handling and OS error handling to
be interchangeable so that the one doesn't depend on the other and vice
versa.

I.e., I want to be able to turn off APEI and handle errors only in
software, without any disadvantage to the users, if APEI is buggy and/or
doesn't give all the required info.

And so, if we let APEI handle certain errors, it either should handle
them properly or not at all.

Otherwise, we don't need this useless overhead.

--
Regards/Gruss,
Boris.

2012-11-01 21:09:15

by Luck, Tony

[permalink] [raw]

Subject: RE: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

> That is correct, unfortunately. That information is not available to
> software in all cases. Maybe APEI could be used for that DIMM location
> mapping through simple tables instead of letting it fumble the error
> handling path.

Not much hope for "simple"[1] tables. There is also a timings issue on
system with rank sparing, memory mirroring etc. ... you need to decode
to the DIMM at the time the error happened. If you wait until later, then
the system may have switched over to the spare rank or mirror ... and then
your decode will point at the new target, rather than the old.

-Tony

[1] Consider a 4 cpu-socket machine with 4 channels per socket and three
DIMMs per channel - so there are 48 sockets on the motherboard. Then
some lab monkey takes a box of random 1, 2, 4, 8 GB DIMMs and fills
most of the sockets. BIOS will somehow make sense out of this and interleave
where it finds matching speeds across pairs/quads of channels (though size
need not match ... if you have a 2G and 4G DIMM you may get interleaving for
the part. then non-interleaved for the "extra" 2G).
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?

2012-11-01 22:02:54

by Borislav Petkov

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

On Thu, Nov 01, 2012 at 09:09:07PM +0000, Luck, Tony wrote:
> > That is correct, unfortunately. That information is not available to
> > software in all cases. Maybe APEI could be used for that DIMM location
> > mapping through simple tables instead of letting it fumble the error
> > handling path.
>
> Not much hope for "simple"[1] tables. There is also a timings issue on
> system with rank sparing, memory mirroring etc. ... you need to decode
> to the DIMM at the time the error happened. If you wait until later, then
> the system may have switched over to the spare rank or mirror ... and then
> your decode will point at the new target, rather than the old.

Yeah, normally we're decoding the error right after being logged so...

> [1] Consider a 4 cpu-socket machine with 4 channels per socket and three
> DIMMs per channel - so there are 48 sockets on the motherboard. Then

You mean 48 DIMM slots, right?

> some lab monkey takes a box of random 1, 2, 4, 8 GB DIMMs and fills
> most of the sockets. BIOS will somehow make sense out of this and
> interleave where it finds matching speeds across pairs/quads of
> channels (though size need not match ... if you have a 2G and 4G DIMM
> you may get interleaving for the part. then non-interleaved for the
> "extra" 2G).

Right, but at least in the csrow case, we still can compute back the
csrow even with the interleaving, after we know how it is done exactly
(on which address bits, etc). I think this should be doable on Intel
controllers too but I don't know.

--
Regards/Gruss,
Boris.

2012-11-01 23:47:58

by Luck, Tony

[permalink] [raw]

Subject: RE: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

> Right, but at least in the csrow case, we still can compute back the
> csrow even with the interleaving, after we know how it is done exactly
> (on which address bits, etc). I think this should be doable on Intel
> controllers too but I don't know.

No. Architecturally all Intel provides is the physical address in MCi_ADDR.
To do anything with that you are into per-system space, and the
registers that define the mappings are not necessarily available
to OS code ... sometimes they are, and sometimes they are even
documented in places where Mauro can use them to write an
EDAC driver ... but there are no guarantees.

-Tony
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?

2012-11-01 23:55:03

by Borislav Petkov

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

On Thu, Nov 01, 2012 at 11:47:52PM +0000, Luck, Tony wrote:
> > Right, but at least in the csrow case, we still can compute back the
> > csrow even with the interleaving, after we know how it is done exactly
> > (on which address bits, etc). I think this should be doable on Intel
> > controllers too but I don't know.
>
> No. Architecturally all Intel provides is the physical address in MCi_ADDR.
> To do anything with that you are into per-system space, and the
> registers that define the mappings are not necessarily available
> to OS code ... sometimes they are, and sometimes they are even
> documented in places where Mauro can use them to write an
> EDAC driver ... but there are no guarantees.

One more reason that we need some sort of tables telling us which
rank/csrow maps to which DIMM and thus silkscreen label so that we can
be able to say the following from software:

"You just had a single corrected ECC error in the DIMM with label
P0_DIMM_A"

or whatever unique naming each platform vendor comes up with.

The day hw people give me this, I'm going to throw a big party and
invite all LKML.

--
Regards/Gruss,
Boris.

2012-11-02 02:12:50

by Mauro Carvalho Chehab

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

Em Thu, 1 Nov 2012 21:09:07 +0000
"Luck, Tony" <[email protected]> escreveu:
> Em Thu, 1 Nov 2012 20:55:09 +0100
> Borislav Petkov <[email protected]> escreveu:

> > On Thu, Nov 01, 2012 at 09:47:21AM -0200, Mauro Carvalho Chehab wrote:
> > > 1) when both APEI/GHES and sb_edac are loaded, error reports are
> > > inconsistent: race issues; bad APEI/MCE interface, etc. So, there's
> > > curently a bug that needs to be fixed;
> >
> > That's correct. And we probably could add some filter logic to mce_log
> > path so that it doesn't call straight into EDAC over the notifier but go
> > over APEI.

This is one alternative, but I don't think we should invest at
APEI->MCE interface.

Also, after applying those 4 patches I proposed:
http://git.infradead.org/users/mchehab/edac.git/shortlog/refs/heads/experimental

There will be two alternatives for error report:

MCE----->EDAC driver ----> EDAC core error report

(usual EDAC way to report errors, on MCE-based platforms)

OR:

APEI---->MCE
|
\--->EDAC core error report

"Firmware first" APEI way.

This is warranted by this proposed patch, as it warrants that, once the first
driver registers at EDAC core, the second one will fail, with a -EPERM return code.

> But as you say, if APEI fakes error information, then it is worth shit.

Yes. There are things that require fixes at the two APEI->EDAC glue patches,
but this path is more consistent than APEI--->MCE--->EDAC driver

Basically, at:
http://git.infradead.org/users/mchehab/edac.git/commitdiff/e5c85a309f58b260c8ec0a2c6eff9a6d66b4c7e3

It doesn't fill some data needed by EDAC sysfs nodes, on this
proof of concept patches. We need to investigate further if
APEI can provide such info in a consistent way.

If not... well, we can just disable any EDAC sysfs nodes or maybe
provide some way to fill it via some userspace interface, letting
some application getting it via dmidecode and/or from some files
with will provide the mapping tables.

>From the error report point of view:
http://git.infradead.org/users/mchehab/edac.git/commitdiff/654292c78376dfe2b65cd9027787ab54705eeaa6

Nothing is faked on this patch. The only missing information
there is that the label info is not filled.

So, except for Nehalem-EX reports, even with those simple PoC
patches, the error is provided consistently.

> > > 2) some vendors refuse to support EDAC[1];
> > >
> > > 3) there are some really complex environments with memory hot-plugging,
> > > mirrored memory, spare memories, etc where only the BIOS may provide
> > > a reliable information about the DIMM location, as the configuration
> > > may change dynamically at runtime.

> > That is correct, unfortunately. That information is not available to
> > software in all cases. Maybe APEI could be used for that DIMM location
> > mapping through simple tables instead of letting it fumble the error
> > handling path.
>
> Not much hope for "simple"[1] tables. There is also a timings issue on
> system with rank sparing, memory mirroring etc. ... you need to decode
> to the DIMM at the time the error happened. If you wait until later, then
> the system may have switched over to the spare rank or mirror ... and then
> your decode will point at the new target, rather than the old.

If APEI won't provide any way to uniquely identify the DIMM where the
error occurred, decoding will require a driver as complex as sb_edac, and
indeed, the APEI interface is useless.

However, the APEI interface provides a bunch of values:

+ sprintf(location,"node:%d card:%d module:%d bank:%d device:%d row: %d column:%d bit_pos:%d",
+ mem_err->node, mem_err->card, mem_err->module,
+ mem_err->bank, mem_err->device, mem_err->row, mem_err->column,
+ mem_err->bit_pos);

I didn't read the APEI specs, so, I'm not sure what is filled there, but,
based on the fields names, it seems that the DIMM location is there
(but not the DIMM label).

If so, a simple table is likely enough to convert a node/card/module/bank/device
information into a DIMM label.

There is one problem, though: in order to use the EDAC->dimm label conversion
logic, the APEI driver should be telling to the EDAC core how many nodes,
cards, modules, banks, devices are there at the memory architecture, in order
to allow the EDAC core to create the needed sysfs nodes for the DIMM labels.

I don't think APEI has such kind of information.

> > > [1] they claim that the firmware provided errors are more reliable
> > > than reading directly from hardware, as they have some special
> > > heuristics logic on their BIOS that detects the difference between a
> > > simple interference and a damaged memory.
> >
> > Let me guess: they do thresholding. And we actually have that already :)
> >
> > > > * the error coming from APEI still needs to get decoded by EDAC? If yes,
> > > > then WTF we need APEI for anyway?
> > >
> > > That's a good question. I understood on some discussions we had, that APEI
> > > would be able to provide the DIMM label. However, I didn't find any field
> > > with such information there at APEI mem_err struct.
> > >
> > > So, either there are something missing (maybe DIMM labels are part of
> > > APEI 5.0), or we'll still need EDAC decoding logic to get the DIMM.
> >
> > Right, so we should rely on BIOS telling us what the DIMM mapping is,
> > considering the fact that it does all the DRAM training during boot and
> > has intimate knowledge of the DIMM layout, in general.

If the DIMM layout can't be retrieved by the Kernel by any other means,
relying on BIOS is the only option left.

> >
> > APEI, being a WHEA port to the ACPI spec and thus Linux, is kinda
> > useless as I see it.
> >
> > [ … ]
> >
> > > Bank information there is fake; status is fake. Only addr is really filled
> > > there; it works only for corrected errors.
> > >
> > > Also if you try to decode this, the logic will likely fail, as not all
> > > fields used by either i7core_edac/sb_edac parsers or by userspace decoders
> > > are filled there.
> > >
> > > For it to work, apei_mce_report_mem_error() would require a complex logic,
> > > that would identify what kind of CPU is in the system, emulating every single
> > > detail of the error reports there, with would be complex, and will be reversed
> > > in userspace anyway.
> > >
> > > So, IMO, the APEI-MCE integration interface should be simply removed, in favor
> > > of reporting errors using the EDAC/RAS interface.
> >
> > Yes, my other concern I had is that once APEI is cast in the firmware,
> > it cannot be changed (or at least very hard: I've tried to get OEMs to
> > update their BIOS for different reasons but it has almost always ended
> > in nirvana). So, I want firmware error handling and OS error handling to
> > be interchangeable so that the one doesn't depend on the other and vice
> > versa.
> >
> > I.e., I want to be able to turn off APEI and handle errors only in
> > software, without any disadvantage to the users, if APEI is buggy and/or
> > doesn't give all the required info.
> >
> > And so, if we let APEI handle certain errors, it either should handle
> > them properly or not at all.
> >
> > Otherwise, we don't need this useless overhead.

I agree: from Linux PoV, users should be free to select either the BIOS based
APEI interface, if they rely enough on their hardware/BIOS vendor, or to
use the hardware way of retrieving the errors (EDAC/MCE).

Regards,
Mauro

2012-11-02 02:25:25

by Mauro Carvalho Chehab

[permalink] [raw]

Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

Em Fri, 2 Nov 2012 00:54:57 +0100
Borislav Petkov <[email protected]> escreveu:

> On Thu, Nov 01, 2012 at 11:47:52PM +0000, Luck, Tony wrote:
> > > Right, but at least in the csrow case, we still can compute back the
> > > csrow even with the interleaving, after we know how it is done exactly
> > > (on which address bits, etc). I think this should be doable on Intel
> > > controllers too but I don't know.
> >
> > No. Architecturally all Intel provides is the physical address in MCi_ADDR.
> > To do anything with that you are into per-system space, and the
> > registers that define the mappings are not necessarily available
> > to OS code ... sometimes they are, and sometimes they are even
> > documented in places where Mauro can use them to write an
> > EDAC driver ... but there are no guarantees.
>
> One more reason that we need some sort of tables telling us which
> rank/csrow maps to which DIMM and thus silkscreen label so that we can
> be able to say the following from software:

If you take a look at sb_edac driver, you'll see that most of the driver's
logic is the part that do address->DIMM location mapping. It is a very
complex logic.

>
> "You just had a single corrected ECC error in the DIMM with label
> P0_DIMM_A"

Converting from a DIMM location into a DIMM label is trivial, through:
just a simple table lookup.

As APEI/GHES seem to provide the DIMM location, a simple table lookup
should also work there, as I said before.

> or whatever unique naming each platform vendor comes up with.
>
> The day hw people give me this, I'm going to throw a big party and
> invite all LKML.
>

--
Regards,
Mauro