2014-04-04 21:13:48

by Jason Baron

[permalink] [raw]
Subject: [PATCH 0/3] Add new ie31200_edac driver

Hi,

Add support for memory errors for the Intel E3-1200 processors.

While testing the driver, I found that doing a readq() on the
memory mapped memory controller hub registers caused a hard lockup
on my x86_64 system. It turns out that a read across a DW boundary
is a no no here.

"
Software must not access B0/D0/F0 32-bit memory-mapped registers with
requests that cross a DW boundary.
"

(http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html p. 16)

Thus, I've added a generic lo_hi_[read|write]_q API, to deal with
this issue.

I think longer term the right thing is maybe to simply add these
definitions to include/asm-generic/io.h, as I didn't see any
in tree users of 'io-64-nonatomic-hi-lo.h', and simply remove
io-64-nonatomic-hi-lo.h and io-64-nonatomic-lo-hi.h. But I didn't want to
tie that cleanup to this edac driver submission.

Thanks,

-Jason


Jason Baron (3):
readq/writeq: Add explicit lo_hi_[read|write]_q and hi_lo_[read|write]_q
x38_edac: make use of lo_hi_readq()
ie31200_edac: Add driver

drivers/edac/Kconfig | 7 +
drivers/edac/Makefile | 1 +
drivers/edac/ie31200_edac.c | 540 ++++++++++++++++++++++++++++
drivers/edac/x38_edac.c | 15 +-
include/asm-generic/io-64-nonatomic-hi-lo.h | 14 +-
include/asm-generic/io-64-nonatomic-lo-hi.h | 14 +-
6 files changed, 572 insertions(+), 19 deletions(-)
create mode 100644 drivers/edac/ie31200_edac.c

--
1.8.2.rc2


2014-04-04 21:14:10

by Jason Baron

[permalink] [raw]
Subject: [PATCH 3/3] ie31200_edac: Add driver

Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
is based on the following E3-1200 specs:

http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v3-vol-2-datasheet.html

I've tested this on bad memory hardware, and observed correlating bad reads
and uncorrected memory errors as reported by the driver.

Signed-off-by: Jason Baron <[email protected]>
---
drivers/edac/Kconfig | 7 +
drivers/edac/Makefile | 1 +
drivers/edac/ie31200_edac.c | 540 ++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 548 insertions(+)
create mode 100644 drivers/edac/ie31200_edac.c

diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 878f090..27f44a1 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -186,6 +186,13 @@ config EDAC_I3200
Support for error detection and correction on the Intel
3200 and 3210 server chipsets.

+config EDAC_IE31200
+ tristate "Intel e312xx"
+ depends on EDAC_MM_EDAC && PCI && X86
+ help
+ Support for error detection and correction on the Intel
+ E3-1200 processor.
+
config EDAC_X38
tristate "Intel X38"
depends on EDAC_MM_EDAC && PCI && X86
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 4154ed6..c479a24 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -37,6 +37,7 @@ obj-$(CONFIG_EDAC_I82875P) += i82875p_edac.o
obj-$(CONFIG_EDAC_I82975X) += i82975x_edac.o
obj-$(CONFIG_EDAC_I3000) += i3000_edac.o
obj-$(CONFIG_EDAC_I3200) += i3200_edac.o
+obj-$(CONFIG_EDAC_IE31200) += ie31200_edac.o
obj-$(CONFIG_EDAC_X38) += x38_edac.o
obj-$(CONFIG_EDAC_I82860) += i82860_edac.o
obj-$(CONFIG_EDAC_R82600) += r82600_edac.o
diff --git a/drivers/edac/ie31200_edac.c b/drivers/edac/ie31200_edac.c
new file mode 100644
index 0000000..ae03d21
--- /dev/null
+++ b/drivers/edac/ie31200_edac.c
@@ -0,0 +1,540 @@
+/*
+ * Intel E3-1200
+ * Copyright (C) 2014 Jason Baron <[email protected]>
+ *
+ * Support for the E3-1200 processor family. Heavily based on previous
+ * Intel EDAC drivers.
+ *
+ * Since the DRAM controller is on the cpu chip, we can use its PCI device
+ * id to identify these processors.
+ *
+ * PCI DRAM controller device ids (Taken from The PCI ID Repository - http://pci-ids.ucw.cz/)
+ *
+ * 0108: Xeon E3-1200 Processor Family DRAM Controller
+ * 010c: Xeon E3-1200/2nd Generation Core Processor Family DRAM Controller
+ * 0150: Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
+ * 0158: Xeon E3-1200 v2/Ivy Bridge DRAM Controller
+ * 015c: Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
+ * 0c04: Xeon E3-1200 v3/4th Gen Core Processor DRAM Controller
+ * 0c08: Xeon E3-1200 v3 Processor DRAM Controller
+ *
+ * Based on Intel specification:
+ * http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e3-1200v3-vol-2-datasheet.pdf
+ * http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
+ *
+ * According to the above datasheet (p.16):
+ * "
+ * 6. Software must not access B0/D0/F0 32-bit memory-mapped registers with
+ * requests that cross a DW boundary.
+ * "
+ *
+ * Thus, we make use of the explicit: lo_hi_readq(), which breaks the readq into
+ * 2 readl() calls. This restriction may be lifted in subsequent chip releases,
+ * but lo_hi_readq() ensures that we are safe across all e3-1200 processors.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include <linux/pci_ids.h>
+#include <linux/edac.h>
+
+#include <asm-generic/io-64-nonatomic-lo-hi.h>
+#include "edac_core.h"
+
+#define IE31200_REVISION "1.0"
+#define EDAC_MOD_STR "ie31200_edac"
+
+#define PCI_DEVICE_ID_INTEL_IE31200_HB_1 0x0108
+#define PCI_DEVICE_ID_INTEL_IE31200_HB_2 0x010c
+#define PCI_DEVICE_ID_INTEL_IE31200_HB_3 0x0150
+#define PCI_DEVICE_ID_INTEL_IE31200_HB_4 0x0158
+#define PCI_DEVICE_ID_INTEL_IE31200_HB_5 0x015c
+#define PCI_DEVICE_ID_INTEL_IE31200_HB_6 0x0c04
+#define PCI_DEVICE_ID_INTEL_IE31200_HB_7 0x0c08
+
+#define IE31200_DIMMS 4
+#define IE31200_RANKS 8
+#define IE31200_RANKS_PER_CHANNEL 4
+#define IE31200_DIMMS_PER_CHANNEL 2
+#define IE31200_CHANNELS 2
+
+/* Intel IE31200 register addresses - device 0 function 0 - DRAM Controller */
+#define IE31200_MCHBAR_LOW 0x48
+#define IE31200_MCHBAR_HIGH 0x4c
+#define IE31200_MCHBAR_MASK 0xffffff8000ULL /* bits 38:15 */
+#define IE31200_MMR_WINDOW_SIZE 32768
+
+/* Error Status Register (16b)
+ *
+ * 15 reserved
+ * 14 Isochronous TBWRR Run Behind FIFO Full
+ * (ITCV)
+ * 13 Isochronous TBWRR Run Behind FIFO Put
+ * (ITSTV)
+ * 12 reserved
+ * 11 MCH Thermal Sensor Event
+ * for SMI/SCI/SERR (GTSE)
+ * 10 reserved
+ * 9 LOCK to non-DRAM Memory Flag (LCKF)
+ * 8 reserved
+ * 7 DRAM Throttle Flag (DTF)
+ * 6:2 reserved
+ * 1 Multi-bit DRAM ECC Error Flag (DMERR)
+ * 0 Single-bit DRAM ECC Error Flag (DSERR)
+ */
+#define IE31200_ERRSTS 0xc8
+#define IE31200_ERRSTS_UE 0x0002
+#define IE31200_ERRSTS_CE 0x0001
+#define IE31200_ERRSTS_BITS (IE31200_ERRSTS_UE | IE31200_ERRSTS_CE)
+
+/*
+ * Channel 0 ECC Error Log (64b)
+ *
+ * 63:48 Error Column Address (ERRCOL)
+ * 47:32 Error Row Address (ERRROW)
+ * 31:29 Error Bank Address (ERRBANK)
+ * 28:27 Error Rank Address (ERRRANK)
+ * 26:24 reserved
+ * 23:16 Error Syndrome (ERRSYND)
+ * 15: 2 reserved
+ * 1 Multiple Bit Error Status (MERRSTS)
+ * 0 Correctable Error Status (CERRSTS)
+ */
+#define IE31200_C0ECCERRLOG 0x40c8
+#define IE31200_C1ECCERRLOG 0x44c8
+#define IE31200_ECCERRLOG_CE 0x1
+#define IE31200_ECCERRLOG_UE 0x2
+#define IE31200_ECCERRLOG_RANK_BITS 0x18000000
+#define IE31200_ECCERRLOG_RANK_SHIFT 27
+#define IE31200_ECCERRLOG_SYNDROME_BITS 0xff0000
+#define IE31200_ECCERRLOG_SYNDROME_SHIFT 16
+#define IE31200_CAPID0 0xe4
+
+#define IE31200_MAD_DIMM_0_OFFSET 0x5004
+
+#define IE31200_PAGES(n) \
+ (n << (28 - PAGE_SHIFT))
+
+struct ie31200_priv {
+ void __iomem *window;
+};
+
+static int nr_channels;
+
+static int how_many_channels(struct pci_dev *pdev)
+{
+ int n_channels;
+ unsigned char capid0_2b; /* 2nd byte of CAPID0 */
+
+ pci_read_config_byte(pdev, IE31200_CAPID0 + 1, &capid0_2b);
+
+ /* check PDCD: Dual Channel Disable */
+ if (capid0_2b & 0x10) {
+ edac_dbg(0, "In single channel mode\n");
+ n_channels = 1;
+ } else {
+ edac_dbg(0, "In dual channel mode\n");
+ n_channels = 2;
+ }
+
+ /* check DDPCD - check if both channels are filled */
+ if (capid0_2b & 0x40)
+ edac_dbg(0, "2 DIMMS per channel disabled\n");
+ else
+ edac_dbg(0, "2 DIMMS per channel enabled\n");
+
+ return n_channels;
+}
+
+static bool ecc_capable(struct pci_dev *pdev)
+{
+ unsigned char capid0_4b; /* 4th byte of CAPID0 */
+
+ pci_read_config_byte(pdev, IE31200_CAPID0 + 3, &capid0_4b);
+ if (capid0_4b & 0x2)
+ return false;
+ return true;
+}
+
+static unsigned long eccerrlog_syndrome(u64 log)
+{
+ return (log & IE31200_ECCERRLOG_SYNDROME_BITS) >>
+ IE31200_ECCERRLOG_SYNDROME_SHIFT;
+}
+
+static int eccerrlog_row(int channel, u64 log)
+{
+ u64 rank = ((log & IE31200_ECCERRLOG_RANK_BITS) >>
+ IE31200_ECCERRLOG_RANK_SHIFT);
+ return rank;
+}
+
+enum ie31200_chips {
+ IE31200 = 0,
+};
+
+struct ie31200_dev_info {
+ const char *ctl_name;
+};
+
+struct ie31200_error_info {
+ u16 errsts;
+ u16 errsts2;
+ u64 eccerrlog[IE31200_CHANNELS];
+};
+
+static const struct ie31200_dev_info ie31200_devs[] = {
+ [IE31200] = {
+ .ctl_name = "IE31200"
+ },
+};
+
+static void ie31200_clear_error_info(struct mem_ctl_info *mci)
+{
+ struct pci_dev *pdev;
+
+ pdev = to_pci_dev(mci->pdev);
+
+ /*
+ * Clear any error bits.
+ * (Yes, we really clear bits by writing 1 to them.)
+ */
+ pci_write_bits16(pdev, IE31200_ERRSTS, IE31200_ERRSTS_BITS,
+ IE31200_ERRSTS_BITS);
+}
+
+static void ie31200_get_and_clear_error_info(struct mem_ctl_info *mci,
+ struct ie31200_error_info *info)
+{
+ struct pci_dev *pdev;
+ struct ie31200_priv *priv = mci->pvt_info;
+ void __iomem *window = priv->window;
+
+ pdev = to_pci_dev(mci->pdev);
+
+ /*
+ * This is a mess because there is no atomic way to read all the
+ * registers at once and the registers can transition from CE being
+ * overwritten by UE.
+ */
+ pci_read_config_word(pdev, IE31200_ERRSTS, &info->errsts);
+ if (!(info->errsts & IE31200_ERRSTS_BITS))
+ return;
+
+ info->eccerrlog[0] = lo_hi_readq(window + IE31200_C0ECCERRLOG);
+ if (nr_channels == 2)
+ info->eccerrlog[1] = lo_hi_readq(window + IE31200_C1ECCERRLOG);
+
+ pci_read_config_word(pdev, IE31200_ERRSTS, &info->errsts2);
+
+ /*
+ * If the error is the same for both reads then the first set
+ * of reads is valid. If there is a change then there is a CE
+ * with no info and the second set of reads is valid and
+ * should be UE info.
+ */
+ if ((info->errsts ^ info->errsts2) & IE31200_ERRSTS_BITS) {
+ info->eccerrlog[0] = lo_hi_readq(window + IE31200_C0ECCERRLOG);
+ if (nr_channels == 2)
+ info->eccerrlog[1] =
+ lo_hi_readq(window + IE31200_C1ECCERRLOG);
+ }
+
+ ie31200_clear_error_info(mci);
+}
+
+static void ie31200_process_error_info(struct mem_ctl_info *mci,
+ struct ie31200_error_info *info)
+{
+ int channel;
+ u64 log;
+
+ if (!(info->errsts & IE31200_ERRSTS_BITS))
+ return;
+
+ if ((info->errsts ^ info->errsts2) & IE31200_ERRSTS_BITS) {
+ edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1, 0, 0, 0,
+ -1, -1, -1, "UE overwrote CE", "");
+ info->errsts = info->errsts2;
+ }
+
+ for (channel = 0; channel < nr_channels; channel++) {
+ log = info->eccerrlog[channel];
+ if (log & IE31200_ECCERRLOG_UE) {
+ edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1,
+ 0, 0, 0,
+ eccerrlog_row(channel, log),
+ channel, -1,
+ "i3000 UE", "");
+ } else if (log & IE31200_ECCERRLOG_CE) {
+ edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 1,
+ 0, 0, eccerrlog_syndrome(log),
+ eccerrlog_row(channel, log),
+ channel, -1,
+ "i3000 CE", "");
+ }
+ }
+}
+
+static void ie31200_check(struct mem_ctl_info *mci)
+{
+ struct ie31200_error_info info;
+
+ edac_dbg(1, "MC%d\n", mci->mc_idx);
+ ie31200_get_and_clear_error_info(mci, &info);
+ ie31200_process_error_info(mci, &info);
+}
+
+static void __iomem *ie31200_map_mchbar(struct pci_dev *pdev)
+{
+ union {
+ u64 mchbar;
+ struct {
+ u32 mchbar_low;
+ u32 mchbar_high;
+ };
+ } u;
+ void __iomem *window;
+
+ pci_read_config_dword(pdev, IE31200_MCHBAR_LOW, &u.mchbar_low);
+ pci_read_config_dword(pdev, IE31200_MCHBAR_HIGH, &u.mchbar_high);
+ u.mchbar &= IE31200_MCHBAR_MASK;
+
+ if (u.mchbar != (resource_size_t)u.mchbar) {
+ pr_err("ie31200: mmio space beyond accessible range (0x%llx)\n",
+ (unsigned long long)u.mchbar);
+ return NULL;
+ }
+
+ window = ioremap_nocache(u.mchbar, IE31200_MMR_WINDOW_SIZE);
+ if (!window)
+ pr_err("ie31200: cannot map mmio space at 0x%llx\n",
+ (unsigned long long)u.mchbar);
+
+ return window;
+}
+
+struct dimm_data {
+ u8 size; /* in 256MB multiples */
+ u8 dual_rank : 1,
+ x16_width : 1; /* 0 means x8 width */
+};
+
+static int ie31200_probe1(struct pci_dev *pdev, int dev_idx)
+{
+ int rc;
+ int i, j;
+ struct mem_ctl_info *mci = NULL;
+ struct edac_mc_layer layers[2];
+ struct dimm_data dimm_info[IE31200_CHANNELS][IE31200_DIMMS_PER_CHANNEL];
+ void __iomem *window;
+ struct ie31200_priv *priv;
+ u32 addr_decode;
+
+ edac_dbg(0, "MC:\n");
+
+ if (!ecc_capable(pdev)) {
+ edac_dbg(1, "Not edac capable\n");
+ return -ENODEV;
+ }
+
+ window = ie31200_map_mchbar(pdev);
+ if (!window)
+ return -ENODEV;
+
+ /* populate DIMM info */
+ for (i = 0; i < IE31200_CHANNELS; i++) {
+ addr_decode = readl(window + IE31200_MAD_DIMM_0_OFFSET +
+ (i * 4));
+ edac_dbg(0, "addr_decode: 0x%x\n", addr_decode);
+ for (j = 0; j < IE31200_DIMMS_PER_CHANNEL; j++) {
+ dimm_info[i][j].size = (addr_decode >> (j * 8)) & 0xff;
+ dimm_info[i][j].dual_rank =
+ (addr_decode & (0x20000 << j)) ? 1 : 0;
+ dimm_info[i][j].x16_width =
+ (addr_decode & (0x80000 << j)) ? 1 : 0;
+ edac_dbg(0, "size: 0x%x, rank: %d, width: %d\n",
+ dimm_info[i][j].size,
+ dimm_info[i][j].dual_rank,
+ dimm_info[i][j].x16_width);
+ }
+ }
+
+ nr_channels = how_many_channels(pdev);
+
+ layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+ layers[0].size = IE31200_DIMMS;
+ layers[0].is_virt_csrow = true;
+ layers[1].type = EDAC_MC_LAYER_CHANNEL;
+ layers[1].size = nr_channels;
+ layers[1].is_virt_csrow = false;
+ mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+ sizeof(struct ie31200_priv));
+
+ rc = -ENOMEM;
+ if (!mci)
+ goto fail_unmap;
+
+ edac_dbg(3, "MC: init mci\n");
+
+ mci->pdev = &pdev->dev;
+ mci->mtype_cap = MEM_FLAG_DDR3;
+
+ mci->edac_ctl_cap = EDAC_FLAG_SECDED;
+ mci->edac_cap = EDAC_FLAG_SECDED;
+
+ mci->mod_name = EDAC_MOD_STR;
+ mci->mod_ver = IE31200_REVISION;
+ mci->ctl_name = ie31200_devs[dev_idx].ctl_name;
+ mci->dev_name = pci_name(pdev);
+ mci->edac_check = ie31200_check;
+ mci->ctl_page_to_phys = NULL;
+ priv = mci->pvt_info;
+ priv->window = window;
+
+ /*
+ * The dram rank boundary (DRB) reg values are boundary addresses
+ * for each DRAM rank with a granularity of 64MB. DRB regs are
+ * cumulative; the last one will contain the total memory
+ * contained in all ranks.
+ */
+ for (i = 0; i < IE31200_DIMMS_PER_CHANNEL; i++) {
+ for (j = 0; j < IE31200_CHANNELS; j++) {
+ struct dimm_info *dimm;
+ unsigned long nr_pages;
+
+ nr_pages = IE31200_PAGES(dimm_info[j][i].size);
+ if (nr_pages == 0)
+ continue;
+
+ if (dimm_info[j][i].dual_rank) {
+ nr_pages = nr_pages / 2;
+ dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
+ mci->n_layers, (i * 2) + 1,
+ j, 0);
+ dimm->nr_pages = nr_pages;
+ edac_dbg(0, "set nr pages: 0x%lx\n", nr_pages);
+ dimm->grain = 8; /* just a guess */
+ dimm->mtype = MEM_DDR3;
+ dimm->dtype = DEV_UNKNOWN;
+ dimm->edac_mode = EDAC_UNKNOWN;
+ }
+ dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
+ mci->n_layers, i * 2, j, 0);
+ dimm->nr_pages = nr_pages;
+ edac_dbg(0, "set nr pages: 0x%lx\n", nr_pages);
+ dimm->grain = 8; /* same guess */
+ dimm->mtype = MEM_DDR3;
+ dimm->dtype = DEV_UNKNOWN;
+ dimm->edac_mode = EDAC_UNKNOWN;
+ }
+ }
+
+ ie31200_clear_error_info(mci);
+
+ rc = -ENODEV;
+ if (edac_mc_add_mc(mci)) {
+ edac_dbg(3, "MC: failed edac_mc_add_mc()\n");
+ goto fail_free;
+ }
+
+ /* get this far and it's successful */
+ edac_dbg(3, "MC: success\n");
+ return 0;
+
+fail_free:
+ if (mci)
+ edac_mc_free(mci);
+fail_unmap:
+ iounmap(window);
+
+ return rc;
+}
+
+static int ie31200_init_one(struct pci_dev *pdev,
+ const struct pci_device_id *ent)
+{
+ int rc;
+
+ edac_dbg(0, "MC:\n");
+
+ if (pci_enable_device(pdev) < 0)
+ return -EIO;
+
+ rc = ie31200_probe1(pdev, ent->driver_data);
+
+ return rc;
+}
+
+static void ie31200_remove_one(struct pci_dev *pdev)
+{
+ struct mem_ctl_info *mci;
+ struct ie31200_priv *priv;
+
+ edac_dbg(0, "\n");
+ mci = edac_mc_del_mc(&pdev->dev);
+ if (!mci)
+ return;
+ priv = mci->pvt_info;
+ iounmap(priv->window);
+ edac_mc_free(mci);
+}
+
+static const struct pci_device_id ie31200_pci_tbl[] = {
+ {
+ PCI_VEND_DEV(INTEL, IE31200_HB_1), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ IE31200},
+ {
+ PCI_VEND_DEV(INTEL, IE31200_HB_2), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ IE31200},
+ {
+ PCI_VEND_DEV(INTEL, IE31200_HB_3), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ IE31200},
+ {
+ PCI_VEND_DEV(INTEL, IE31200_HB_4), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ IE31200},
+ {
+ PCI_VEND_DEV(INTEL, IE31200_HB_5), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ IE31200},
+ {
+ PCI_VEND_DEV(INTEL, IE31200_HB_6), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ IE31200},
+ {
+ PCI_VEND_DEV(INTEL, IE31200_HB_7), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ IE31200},
+ {
+ 0,
+ } /* 0 terminated list. */
+};
+MODULE_DEVICE_TABLE(pci, ie31200_pci_tbl);
+
+static struct pci_driver ie31200_driver = {
+ .name = EDAC_MOD_STR,
+ .probe = ie31200_init_one,
+ .remove = ie31200_remove_one,
+ .id_table = ie31200_pci_tbl,
+};
+
+static int __init ie31200_init(void)
+{
+ edac_dbg(3, "MC:\n");
+ /* Ensure that the OPSTATE is set correctly for POLL or NMI */
+ opstate_init();
+
+ return pci_register_driver(&ie31200_driver);
+}
+
+static void __exit ie31200_exit(void)
+{
+ edac_dbg(3, "MC:\n");
+ pci_unregister_driver(&ie31200_driver);
+}
+
+module_init(ie31200_init);
+module_exit(ie31200_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Jason Baron <[email protected]>");
+MODULE_DESCRIPTION("MC support for Intel Processor E31200 memory hub controllers");
--
1.8.2.rc2

2014-04-04 21:14:04

by Jason Baron

[permalink] [raw]
Subject: [PATCH 2/3] x38_edac: make use of lo_hi_readq()

Convert to the generic API.

Signed-off-by: Jason Baron <[email protected]>
---
drivers/edac/x38_edac.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 4891b45..e644b52 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -14,6 +14,8 @@
#include <linux/pci.h>
#include <linux/pci_ids.h>
#include <linux/edac.h>
+
+#include <asm-generic/io-64-nonatomic-lo-hi.h>
#include "edac_core.h"

#define X38_REVISION "1.1"
@@ -161,11 +163,6 @@ static void x38_clear_error_info(struct mem_ctl_info *mci)
X38_ERRSTS_BITS);
}

-static u64 x38_readq(const void __iomem *addr)
-{
- return readl(addr) | (((u64)readl(addr + 4)) << 32);
-}
-
static void x38_get_and_clear_error_info(struct mem_ctl_info *mci,
struct x38_error_info *info)
{
@@ -183,9 +180,9 @@ static void x38_get_and_clear_error_info(struct mem_ctl_info *mci,
if (!(info->errsts & X38_ERRSTS_BITS))
return;

- info->eccerrlog[0] = x38_readq(window + X38_C0ECCERRLOG);
+ info->eccerrlog[0] = lo_hi_readq(window + X38_C0ECCERRLOG);
if (x38_channel_num == 2)
- info->eccerrlog[1] = x38_readq(window + X38_C1ECCERRLOG);
+ info->eccerrlog[1] = lo_hi_readq(window + X38_C1ECCERRLOG);

pci_read_config_word(pdev, X38_ERRSTS, &info->errsts2);

@@ -196,10 +193,10 @@ static void x38_get_and_clear_error_info(struct mem_ctl_info *mci,
* should be UE info.
*/
if ((info->errsts ^ info->errsts2) & X38_ERRSTS_BITS) {
- info->eccerrlog[0] = x38_readq(window + X38_C0ECCERRLOG);
+ info->eccerrlog[0] = lo_hi_readq(window + X38_C0ECCERRLOG);
if (x38_channel_num == 2)
info->eccerrlog[1] =
- x38_readq(window + X38_C1ECCERRLOG);
+ lo_hi_readq(window + X38_C1ECCERRLOG);
}

x38_clear_error_info(mci);
--
1.8.2.rc2

2014-04-04 21:13:57

by Jason Baron

[permalink] [raw]
Subject: [PATCH 1/3] readq/writeq: Add explicit lo_hi_[read|write]_q and hi_lo_[read|write]_q

Even on x86-64, I've found the need to break up a readq() into 2 readl()
calls. According to the Intel datasheet for the E3-1200 processor:

"
Software must not access B0/D0/F0 32-bit memory-mapped registers with
requests that cross a DW boundary.
"

(http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html p. 16)

I can confirm this is true via several hard machine lockups.

Thus, add explicit hi_lo_[readq|write]_q and lo_hi_[read|write]_q so that these
uses are spelled out.

Signed-off-by: Jason Baron <[email protected]>
---
include/asm-generic/io-64-nonatomic-hi-lo.h | 14 +++++++++-----
include/asm-generic/io-64-nonatomic-lo-hi.h | 14 +++++++++-----
2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/include/asm-generic/io-64-nonatomic-hi-lo.h b/include/asm-generic/io-64-nonatomic-hi-lo.h
index a6806a9..2e29d13 100644
--- a/include/asm-generic/io-64-nonatomic-hi-lo.h
+++ b/include/asm-generic/io-64-nonatomic-hi-lo.h
@@ -4,8 +4,7 @@
#include <linux/io.h>
#include <asm-generic/int-ll64.h>

-#ifndef readq
-static inline __u64 readq(const volatile void __iomem *addr)
+static inline __u64 hi_lo_readq(const volatile void __iomem *addr)
{
const volatile u32 __iomem *p = addr;
u32 low, high;
@@ -15,14 +14,19 @@ static inline __u64 readq(const volatile void __iomem *addr)

return low + ((u64)high << 32);
}
-#endif

-#ifndef writeq
-static inline void writeq(__u64 val, volatile void __iomem *addr)
+static inline void hi_lo_writeq(__u64 val, volatile void __iomem *addr)
{
writel(val >> 32, addr + 4);
writel(val, addr);
}
+
+#ifndef readq
+#define readq hi_lo_readq
+#endif
+
+#ifndef writeq
+#define writeq hi_lo_writeq
#endif

#endif /* _ASM_IO_64_NONATOMIC_HI_LO_H_ */
diff --git a/include/asm-generic/io-64-nonatomic-lo-hi.h b/include/asm-generic/io-64-nonatomic-lo-hi.h
index ca546b1..0efacff 100644
--- a/include/asm-generic/io-64-nonatomic-lo-hi.h
+++ b/include/asm-generic/io-64-nonatomic-lo-hi.h
@@ -4,8 +4,7 @@
#include <linux/io.h>
#include <asm-generic/int-ll64.h>

-#ifndef readq
-static inline __u64 readq(const volatile void __iomem *addr)
+static inline __u64 lo_hi_readq(const volatile void __iomem *addr)
{
const volatile u32 __iomem *p = addr;
u32 low, high;
@@ -15,14 +14,19 @@ static inline __u64 readq(const volatile void __iomem *addr)

return low + ((u64)high << 32);
}
-#endif

-#ifndef writeq
-static inline void writeq(__u64 val, volatile void __iomem *addr)
+static inline void lo_hi_writeq(__u64 val, volatile void __iomem *addr)
{
writel(val, addr);
writel(val >> 32, addr + 4);
}
+
+#ifndef readq
+#define readq lo_hi_readq
+#endif
+
+#ifndef writeq
+#define writeq lo_hi_writeq
#endif

#endif /* _ASM_IO_64_NONATOMIC_LO_HI_H_ */
--
1.8.2.rc2

2014-04-07 20:00:36

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 0/3] Add new ie31200_edac driver

On Fri, Apr 04, 2014 at 09:13:45PM +0000, Jason Baron wrote:
> Hi,
>
> Add support for memory errors for the Intel E3-1200 processors.
>
> While testing the driver, I found that doing a readq() on the
> memory mapped memory controller hub registers caused a hard lockup
> on my x86_64 system. It turns out that a read across a DW boundary
> is a no no here.
>
> "
> Software must not access B0/D0/F0 32-bit memory-mapped registers with
> requests that cross a DW boundary.
> "
>
> (http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html p. 16)
>
> Thus, I've added a generic lo_hi_[read|write]_q API, to deal with
> this issue.
>
> I think longer term the right thing is maybe to simply add these
> definitions to include/asm-generic/io.h, as I didn't see any
> in tree users of 'io-64-nonatomic-hi-lo.h', and simply remove
> io-64-nonatomic-hi-lo.h and io-64-nonatomic-lo-hi.h. But I didn't want to
> tie that cleanup to this edac driver submission.

Makes sense to me. There are a couple of drivers which include
io-64-nonatomic-lo-hi.h though.

As part of the cleanup, it might be even clearer to do the lo_hi_* calls
directly instead of the readq/writeq defines as a means of documenting
that this particular code cannot stomach doubleword-crossing accesses.

x86 guys, opinions? Especially patch 1/3, should I pick it up or you
want?

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-08 09:09:35

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Fri, Apr 04, 2014 at 09:14:04PM +0000, Jason Baron wrote:
> Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
> is based on the following E3-1200 specs:
>
> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v3-vol-2-datasheet.html
>
> I've tested this on bad memory hardware, and observed correlating bad reads
> and uncorrected memory errors as reported by the driver.
>
> Signed-off-by: Jason Baron <[email protected]>
> ---
> drivers/edac/Kconfig | 7 +
> drivers/edac/Makefile | 1 +
> drivers/edac/ie31200_edac.c | 540 ++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 548 insertions(+)
> create mode 100644 drivers/edac/ie31200_edac.c
>
> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
> index 878f090..27f44a1 100644
> --- a/drivers/edac/Kconfig
> +++ b/drivers/edac/Kconfig
> @@ -186,6 +186,13 @@ config EDAC_I3200
> Support for error detection and correction on the Intel
> 3200 and 3210 server chipsets.
>
> +config EDAC_IE31200
> + tristate "Intel e312xx"
> + depends on EDAC_MM_EDAC && PCI && X86
> + help
> + Support for error detection and correction on the Intel
> + E3-1200 processor.
> +
> config EDAC_X38
> tristate "Intel X38"
> depends on EDAC_MM_EDAC && PCI && X86
> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index 4154ed6..c479a24 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -37,6 +37,7 @@ obj-$(CONFIG_EDAC_I82875P) += i82875p_edac.o
> obj-$(CONFIG_EDAC_I82975X) += i82975x_edac.o
> obj-$(CONFIG_EDAC_I3000) += i3000_edac.o
> obj-$(CONFIG_EDAC_I3200) += i3200_edac.o
> +obj-$(CONFIG_EDAC_IE31200) += ie31200_edac.o
> obj-$(CONFIG_EDAC_X38) += x38_edac.o
> obj-$(CONFIG_EDAC_I82860) += i82860_edac.o
> obj-$(CONFIG_EDAC_R82600) += r82600_edac.o
> diff --git a/drivers/edac/ie31200_edac.c b/drivers/edac/ie31200_edac.c
> new file mode 100644
> index 0000000..ae03d21
> --- /dev/null
> +++ b/drivers/edac/ie31200_edac.c
> @@ -0,0 +1,540 @@
> +/*
> + * Intel E3-1200
> + * Copyright (C) 2014 Jason Baron <[email protected]>
> + *
> + * Support for the E3-1200 processor family. Heavily based on previous
> + * Intel EDAC drivers.
> + *
> + * Since the DRAM controller is on the cpu chip, we can use its PCI device
> + * id to identify these processors.
> + *
> + * PCI DRAM controller device ids (Taken from The PCI ID Repository - http://pci-ids.ucw.cz/)
> + *
> + * 0108: Xeon E3-1200 Processor Family DRAM Controller
> + * 010c: Xeon E3-1200/2nd Generation Core Processor Family DRAM Controller
> + * 0150: Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
> + * 0158: Xeon E3-1200 v2/Ivy Bridge DRAM Controller
> + * 015c: Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
> + * 0c04: Xeon E3-1200 v3/4th Gen Core Processor DRAM Controller
> + * 0c08: Xeon E3-1200 v3 Processor DRAM Controller
> + *
> + * Based on Intel specification:
> + * http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e3-1200v3-vol-2-datasheet.pdf
> + * http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
> + *
> + * According to the above datasheet (p.16):
> + * "
> + * 6. Software must not access B0/D0/F0 32-bit memory-mapped registers with
> + * requests that cross a DW boundary.
> + * "
> + *
> + * Thus, we make use of the explicit: lo_hi_readq(), which breaks the readq into
> + * 2 readl() calls. This restriction may be lifted in subsequent chip releases,
> + * but lo_hi_readq() ensures that we are safe across all e3-1200 processors.
> + *
> + */
> +
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/pci.h>
> +#include <linux/pci_ids.h>
> +#include <linux/edac.h>
> +
> +#include <asm-generic/io-64-nonatomic-lo-hi.h>
> +#include "edac_core.h"
> +
> +#define IE31200_REVISION "1.0"
> +#define EDAC_MOD_STR "ie31200_edac"
> +
> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_1 0x0108
> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_2 0x010c
> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_3 0x0150
> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_4 0x0158
> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_5 0x015c
> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_6 0x0c04
> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_7 0x0c08
> +
> +#define IE31200_DIMMS 4
> +#define IE31200_RANKS 8
> +#define IE31200_RANKS_PER_CHANNEL 4
> +#define IE31200_DIMMS_PER_CHANNEL 2
> +#define IE31200_CHANNELS 2
> +
> +/* Intel IE31200 register addresses - device 0 function 0 - DRAM Controller */
> +#define IE31200_MCHBAR_LOW 0x48
> +#define IE31200_MCHBAR_HIGH 0x4c
> +#define IE31200_MCHBAR_MASK 0xffffff8000ULL /* bits 38:15 */

GENMASK_ULL()

> +#define IE31200_MMR_WINDOW_SIZE 32768

(1 << 15) ?

> +
> +/* Error Status Register (16b)
> + *
> + * 15 reserved
> + * 14 Isochronous TBWRR Run Behind FIFO Full
> + * (ITCV)
> + * 13 Isochronous TBWRR Run Behind FIFO Put
> + * (ITSTV)
> + * 12 reserved
> + * 11 MCH Thermal Sensor Event
> + * for SMI/SCI/SERR (GTSE)
> + * 10 reserved
> + * 9 LOCK to non-DRAM Memory Flag (LCKF)
> + * 8 reserved
> + * 7 DRAM Throttle Flag (DTF)
> + * 6:2 reserved
> + * 1 Multi-bit DRAM ECC Error Flag (DMERR)
> + * 0 Single-bit DRAM ECC Error Flag (DSERR)
> + */
> +#define IE31200_ERRSTS 0xc8
> +#define IE31200_ERRSTS_UE 0x0002

Let's use the BIT() macro for those:

BIT(1)

> +#define IE31200_ERRSTS_CE 0x0001

BIT(0)

> +#define IE31200_ERRSTS_BITS (IE31200_ERRSTS_UE | IE31200_ERRSTS_CE)
> +
> +/*
> + * Channel 0 ECC Error Log (64b)
> + *
> + * 63:48 Error Column Address (ERRCOL)
> + * 47:32 Error Row Address (ERRROW)
> + * 31:29 Error Bank Address (ERRBANK)
> + * 28:27 Error Rank Address (ERRRANK)
> + * 26:24 reserved
> + * 23:16 Error Syndrome (ERRSYND)
> + * 15: 2 reserved
> + * 1 Multiple Bit Error Status (MERRSTS)
> + * 0 Correctable Error Status (CERRSTS)
> + */
> +#define IE31200_C0ECCERRLOG 0x40c8
> +#define IE31200_C1ECCERRLOG 0x44c8
> +#define IE31200_ECCERRLOG_CE 0x1
> +#define IE31200_ECCERRLOG_UE 0x2

Ditto.

> +#define IE31200_ECCERRLOG_RANK_BITS 0x18000000

GENMASK

> +#define IE31200_ECCERRLOG_RANK_SHIFT 27
> +#define IE31200_ECCERRLOG_SYNDROME_BITS 0xff0000

Ditto.

> +#define IE31200_ECCERRLOG_SYNDROME_SHIFT 16
> +#define IE31200_CAPID0 0xe4
> +
> +#define IE31200_MAD_DIMM_0_OFFSET 0x5004
> +
> +#define IE31200_PAGES(n) \
> + (n << (28 - PAGE_SHIFT))
> +
> +struct ie31200_priv {
> + void __iomem *window;
> +};
> +
> +static int nr_channels;
> +
> +static int how_many_channels(struct pci_dev *pdev)
> +{
> + int n_channels;
> + unsigned char capid0_2b; /* 2nd byte of CAPID0 */
> +
> + pci_read_config_byte(pdev, IE31200_CAPID0 + 1, &capid0_2b);
> +
> + /* check PDCD: Dual Channel Disable */
> + if (capid0_2b & 0x10) {

Now that you have nice defines for all those bits, why not do that for
those naked numbers too? :-)

Maybe even local to this function:

#define PDCD BIT(4)


> + edac_dbg(0, "In single channel mode\n");
> + n_channels = 1;
> + } else {
> + edac_dbg(0, "In dual channel mode\n");
> + n_channels = 2;
> + }
> +
> + /* check DDPCD - check if both channels are filled */

Ditto:

#define DDPCD BIT(6)

> + if (capid0_2b & 0x40)
> + edac_dbg(0, "2 DIMMS per channel disabled\n");
> + else
> + edac_dbg(0, "2 DIMMS per channel enabled\n");
> +
> + return n_channels;
> +}
> +
> +static bool ecc_capable(struct pci_dev *pdev)
> +{
> + unsigned char capid0_4b; /* 4th byte of CAPID0 */
> +
> + pci_read_config_byte(pdev, IE31200_CAPID0 + 3, &capid0_4b);
> + if (capid0_4b & 0x2)

BIT(1) at least, if no define.

> + return false;
> + return true;
> +}
> +
> +static unsigned long eccerrlog_syndrome(u64 log)
> +{
> + return (log & IE31200_ECCERRLOG_SYNDROME_BITS) >>
> + IE31200_ECCERRLOG_SYNDROME_SHIFT;
> +}

Looks like this wants to be a macro.

> +
> +static int eccerrlog_row(int channel, u64 log)
> +{
> + u64 rank = ((log & IE31200_ECCERRLOG_RANK_BITS) >>
> + IE31200_ECCERRLOG_RANK_SHIFT);
> + return rank;
> +}

Ditto.

> +
> +enum ie31200_chips {
> + IE31200 = 0,
> +};
> +
> +struct ie31200_dev_info {
> + const char *ctl_name;
> +};
> +
> +struct ie31200_error_info {
> + u16 errsts;
> + u16 errsts2;
> + u64 eccerrlog[IE31200_CHANNELS];
> +};
> +
> +static const struct ie31200_dev_info ie31200_devs[] = {
> + [IE31200] = {
> + .ctl_name = "IE31200"
> + },
> +};

Maybe pull those up before the first function definition so that you can
have all struct definitions together.

> +
> +static void ie31200_clear_error_info(struct mem_ctl_info *mci)
> +{
> + struct pci_dev *pdev;
> +
> + pdev = to_pci_dev(mci->pdev);

Just drop all the local gaming by simply doing:

pci_write_bits16(to_pci_dev(mci->pdev), IE31200_ERRSTS,
IE31200_ERRSTS_BITS,
IE31200_ERRSTS_BITS);

> +
> + /*
> + * Clear any error bits.
> + * (Yes, we really clear bits by writing 1 to them.)

Yeah, that's that write-to-clear mechanism.

> + */
> + pci_write_bits16(pdev, IE31200_ERRSTS, IE31200_ERRSTS_BITS,
> + IE31200_ERRSTS_BITS);

Arg alignment.

> +}
> +
> +static void ie31200_get_and_clear_error_info(struct mem_ctl_info *mci,
> + struct ie31200_error_info *info)

Ditto.

> +{
> + struct pci_dev *pdev;
> + struct ie31200_priv *priv = mci->pvt_info;
> + void __iomem *window = priv->window;
> +
> + pdev = to_pci_dev(mci->pdev);
> +
> + /*
> + * This is a mess because there is no atomic way to read all the
> + * registers at once and the registers can transition from CE being
> + * overwritten by UE.
> + */

For the UE you'd get an MCE, right?

Btw, this driver is polling, AFAICT. Doesn't e3-12xx support the CMCI
interrupt which you can feed into this driver directly and thus not need
the polling at all?

> + pci_read_config_word(pdev, IE31200_ERRSTS, &info->errsts);
> + if (!(info->errsts & IE31200_ERRSTS_BITS))
> + return;
> +
> + info->eccerrlog[0] = lo_hi_readq(window + IE31200_C0ECCERRLOG);
> + if (nr_channels == 2)
> + info->eccerrlog[1] = lo_hi_readq(window + IE31200_C1ECCERRLOG);
> +
> + pci_read_config_word(pdev, IE31200_ERRSTS, &info->errsts2);
> +
> + /*
> + * If the error is the same for both reads then the first set
> + * of reads is valid. If there is a change then there is a CE
> + * with no info and the second set of reads is valid and
> + * should be UE info.
> + */
> + if ((info->errsts ^ info->errsts2) & IE31200_ERRSTS_BITS) {
> + info->eccerrlog[0] = lo_hi_readq(window + IE31200_C0ECCERRLOG);
> + if (nr_channels == 2)
> + info->eccerrlog[1] =
> + lo_hi_readq(window + IE31200_C1ECCERRLOG);
> + }
> +
> + ie31200_clear_error_info(mci);
> +}
> +
> +static void ie31200_process_error_info(struct mem_ctl_info *mci,
> + struct ie31200_error_info *info)

arg alignment.

> +{
> + int channel;
> + u64 log;
> +
> + if (!(info->errsts & IE31200_ERRSTS_BITS))
> + return;
> +
> + if ((info->errsts ^ info->errsts2) & IE31200_ERRSTS_BITS) {
> + edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1, 0, 0, 0,
> + -1, -1, -1, "UE overwrote CE", "");
> + info->errsts = info->errsts2;
> + }
> +
> + for (channel = 0; channel < nr_channels; channel++) {
> + log = info->eccerrlog[channel];
> + if (log & IE31200_ECCERRLOG_UE) {
> + edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1,
> + 0, 0, 0,
> + eccerrlog_row(channel, log),
> + channel, -1,
> + "i3000 UE", "");
> + } else if (log & IE31200_ECCERRLOG_CE) {
> + edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 1,
> + 0, 0, eccerrlog_syndrome(log),
> + eccerrlog_row(channel, log),
> + channel, -1,
> + "i3000 CE", "");
> + }
> + }
> +}
> +
> +static void ie31200_check(struct mem_ctl_info *mci)
> +{
> + struct ie31200_error_info info;
> +
> + edac_dbg(1, "MC%d\n", mci->mc_idx);
> + ie31200_get_and_clear_error_info(mci, &info);
> + ie31200_process_error_info(mci, &info);
> +}
> +
> +static void __iomem *ie31200_map_mchbar(struct pci_dev *pdev)
> +{
> + union {
> + u64 mchbar;
> + struct {
> + u32 mchbar_low;
> + u32 mchbar_high;
> + };
> + } u;
> + void __iomem *window;
> +
> + pci_read_config_dword(pdev, IE31200_MCHBAR_LOW, &u.mchbar_low);
> + pci_read_config_dword(pdev, IE31200_MCHBAR_HIGH, &u.mchbar_high);
> + u.mchbar &= IE31200_MCHBAR_MASK;
> +
> + if (u.mchbar != (resource_size_t)u.mchbar) {
> + pr_err("ie31200: mmio space beyond accessible range (0x%llx)\n",
> + (unsigned long long)u.mchbar);

We have edac_*_printk for those, which takes care of the prefix, etc.

> + return NULL;
> + }
> +
> + window = ioremap_nocache(u.mchbar, IE31200_MMR_WINDOW_SIZE);
> + if (!window)
> + pr_err("ie31200: cannot map mmio space at 0x%llx\n",
> + (unsigned long long)u.mchbar);

Ditto.

> +
> + return window;
> +}
> +
> +struct dimm_data {
> + u8 size; /* in 256MB multiples */
> + u8 dual_rank : 1,
> + x16_width : 1; /* 0 means x8 width */
> +};

Also move up to the rest of the struct definitions.

> +static int ie31200_probe1(struct pci_dev *pdev, int dev_idx)
> +{
> + int rc;
> + int i, j;
> + struct mem_ctl_info *mci = NULL;
> + struct edac_mc_layer layers[2];
> + struct dimm_data dimm_info[IE31200_CHANNELS][IE31200_DIMMS_PER_CHANNEL];
> + void __iomem *window;
> + struct ie31200_priv *priv;
> + u32 addr_decode;
> +
> + edac_dbg(0, "MC:\n");
> +
> + if (!ecc_capable(pdev)) {
> + edac_dbg(1, "Not edac capable\n");

dbg? Why not info? Also, "No ECC support" or somesuch, "edac capable" is funny :-)

> + return -ENODEV;
> + }
> +
> + window = ie31200_map_mchbar(pdev);
> + if (!window)
> + return -ENODEV;
> +
> + /* populate DIMM info */
> + for (i = 0; i < IE31200_CHANNELS; i++) {
> + addr_decode = readl(window + IE31200_MAD_DIMM_0_OFFSET +
> + (i * 4));
> + edac_dbg(0, "addr_decode: 0x%x\n", addr_decode);
> + for (j = 0; j < IE31200_DIMMS_PER_CHANNEL; j++) {
> + dimm_info[i][j].size = (addr_decode >> (j * 8)) & 0xff;
> + dimm_info[i][j].dual_rank =
> + (addr_decode & (0x20000 << j)) ? 1 : 0;
> + dimm_info[i][j].x16_width =
> + (addr_decode & (0x80000 << j)) ? 1 : 0;

#define's for those naked bits?

> + edac_dbg(0, "size: 0x%x, rank: %d, width: %d\n",
> + dimm_info[i][j].size,
> + dimm_info[i][j].dual_rank,
> + dimm_info[i][j].x16_width);
> + }
> + }
> +
> + nr_channels = how_many_channels(pdev);
> +
> + layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
> + layers[0].size = IE31200_DIMMS;
> + layers[0].is_virt_csrow = true;
> + layers[1].type = EDAC_MC_LAYER_CHANNEL;
> + layers[1].size = nr_channels;
> + layers[1].is_virt_csrow = false;
> + mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
> + sizeof(struct ie31200_priv));
> +
> + rc = -ENOMEM;
> + if (!mci)
> + goto fail_unmap;
> +
> + edac_dbg(3, "MC: init mci\n");
> +
> + mci->pdev = &pdev->dev;
> + mci->mtype_cap = MEM_FLAG_DDR3;
> +
> + mci->edac_ctl_cap = EDAC_FLAG_SECDED;
> + mci->edac_cap = EDAC_FLAG_SECDED;
> +
> + mci->mod_name = EDAC_MOD_STR;
> + mci->mod_ver = IE31200_REVISION;
> + mci->ctl_name = ie31200_devs[dev_idx].ctl_name;
> + mci->dev_name = pci_name(pdev);
> + mci->edac_check = ie31200_check;
> + mci->ctl_page_to_phys = NULL;
> + priv = mci->pvt_info;
> + priv->window = window;
> +
> + /*
> + * The dram rank boundary (DRB) reg values are boundary addresses
> + * for each DRAM rank with a granularity of 64MB. DRB regs are
> + * cumulative; the last one will contain the total memory
> + * contained in all ranks.
> + */
> + for (i = 0; i < IE31200_DIMMS_PER_CHANNEL; i++) {
> + for (j = 0; j < IE31200_CHANNELS; j++) {
> + struct dimm_info *dimm;
> + unsigned long nr_pages;
> +
> + nr_pages = IE31200_PAGES(dimm_info[j][i].size);
> + if (nr_pages == 0)
> + continue;
> +
> + if (dimm_info[j][i].dual_rank) {
> + nr_pages = nr_pages / 2;
> + dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
> + mci->n_layers, (i * 2) + 1,
> + j, 0);
> + dimm->nr_pages = nr_pages;
> + edac_dbg(0, "set nr pages: 0x%lx\n", nr_pages);
> + dimm->grain = 8; /* just a guess */
> + dimm->mtype = MEM_DDR3;
> + dimm->dtype = DEV_UNKNOWN;
> + dimm->edac_mode = EDAC_UNKNOWN;
> + }
> + dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
> + mci->n_layers, i * 2, j, 0);
> + dimm->nr_pages = nr_pages;
> + edac_dbg(0, "set nr pages: 0x%lx\n", nr_pages);
> + dimm->grain = 8; /* same guess */
> + dimm->mtype = MEM_DDR3;
> + dimm->dtype = DEV_UNKNOWN;
> + dimm->edac_mode = EDAC_UNKNOWN;
> + }
> + }
> +
> + ie31200_clear_error_info(mci);
> +
> + rc = -ENODEV;
> + if (edac_mc_add_mc(mci)) {
> + edac_dbg(3, "MC: failed edac_mc_add_mc()\n");
> + goto fail_free;
> + }
> +
> + /* get this far and it's successful */
> + edac_dbg(3, "MC: success\n");
> + return 0;
> +
> +fail_free:
> + if (mci)
> + edac_mc_free(mci);
> +fail_unmap:
> + iounmap(window);
> +
> + return rc;
> +}
> +
> +static int ie31200_init_one(struct pci_dev *pdev,
> + const struct pci_device_id *ent)
> +{
> + int rc;
> +
> + edac_dbg(0, "MC:\n");
> +
> + if (pci_enable_device(pdev) < 0)
> + return -EIO;
> +
> + rc = ie31200_probe1(pdev, ent->driver_data);

return ie31200_probe1(..)

and kill rc.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-08 22:16:47

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver


Hi,

On 04/08/2014 05:09 AM, Borislav Petkov wrote:
> On Fri, Apr 04, 2014 at 09:14:04PM +0000, Jason Baron wrote:
>> Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
>> is based on the following E3-1200 specs:
>>
>> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
>> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v3-vol-2-datasheet.html
>>
>> I've tested this on bad memory hardware, and observed correlating bad reads
>> and uncorrected memory errors as reported by the driver.
>>
>> Signed-off-by: Jason Baron <[email protected]>
>> ---
>> drivers/edac/Kconfig | 7 +
>> drivers/edac/Makefile | 1 +
>> drivers/edac/ie31200_edac.c | 540 ++++++++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 548 insertions(+)
>> create mode 100644 drivers/edac/ie31200_edac.c
>>
>> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
>> index 878f090..27f44a1 100644
>> --- a/drivers/edac/Kconfig
>> +++ b/drivers/edac/Kconfig
>> @@ -186,6 +186,13 @@ config EDAC_I3200
>> Support for error detection and correction on the Intel
>> 3200 and 3210 server chipsets.
>>
>> +config EDAC_IE31200
>> + tristate "Intel e312xx"
>> + depends on EDAC_MM_EDAC && PCI && X86
>> + help
>> + Support for error detection and correction on the Intel
>> + E3-1200 processor.
>> +
>> config EDAC_X38
>> tristate "Intel X38"
>> depends on EDAC_MM_EDAC && PCI && X86
>> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
>> index 4154ed6..c479a24 100644
>> --- a/drivers/edac/Makefile
>> +++ b/drivers/edac/Makefile
>> @@ -37,6 +37,7 @@ obj-$(CONFIG_EDAC_I82875P) += i82875p_edac.o
>> obj-$(CONFIG_EDAC_I82975X) += i82975x_edac.o
>> obj-$(CONFIG_EDAC_I3000) += i3000_edac.o
>> obj-$(CONFIG_EDAC_I3200) += i3200_edac.o
>> +obj-$(CONFIG_EDAC_IE31200) += ie31200_edac.o
>> obj-$(CONFIG_EDAC_X38) += x38_edac.o
>> obj-$(CONFIG_EDAC_I82860) += i82860_edac.o
>> obj-$(CONFIG_EDAC_R82600) += r82600_edac.o
>> diff --git a/drivers/edac/ie31200_edac.c b/drivers/edac/ie31200_edac.c
>> new file mode 100644
>> index 0000000..ae03d21
>> --- /dev/null
>> +++ b/drivers/edac/ie31200_edac.c
>> @@ -0,0 +1,540 @@
>> +/*
>> + * Intel E3-1200
>> + * Copyright (C) 2014 Jason Baron <[email protected]>
>> + *
>> + * Support for the E3-1200 processor family. Heavily based on previous
>> + * Intel EDAC drivers.
>> + *
>> + * Since the DRAM controller is on the cpu chip, we can use its PCI device
>> + * id to identify these processors.
>> + *
>> + * PCI DRAM controller device ids (Taken from The PCI ID Repository - http://pci-ids.ucw.cz/)
>> + *
>> + * 0108: Xeon E3-1200 Processor Family DRAM Controller
>> + * 010c: Xeon E3-1200/2nd Generation Core Processor Family DRAM Controller
>> + * 0150: Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
>> + * 0158: Xeon E3-1200 v2/Ivy Bridge DRAM Controller
>> + * 015c: Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller
>> + * 0c04: Xeon E3-1200 v3/4th Gen Core Processor DRAM Controller
>> + * 0c08: Xeon E3-1200 v3 Processor DRAM Controller
>> + *
>> + * Based on Intel specification:
>> + * http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e3-1200v3-vol-2-datasheet.pdf
>> + * http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
>> + *
>> + * According to the above datasheet (p.16):
>> + * "
>> + * 6. Software must not access B0/D0/F0 32-bit memory-mapped registers with
>> + * requests that cross a DW boundary.
>> + * "
>> + *
>> + * Thus, we make use of the explicit: lo_hi_readq(), which breaks the readq into
>> + * 2 readl() calls. This restriction may be lifted in subsequent chip releases,
>> + * but lo_hi_readq() ensures that we are safe across all e3-1200 processors.
>> + *
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/init.h>
>> +#include <linux/pci.h>
>> +#include <linux/pci_ids.h>
>> +#include <linux/edac.h>
>> +
>> +#include <asm-generic/io-64-nonatomic-lo-hi.h>
>> +#include "edac_core.h"
>> +
>> +#define IE31200_REVISION "1.0"
>> +#define EDAC_MOD_STR "ie31200_edac"
>> +
>> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_1 0x0108
>> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_2 0x010c
>> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_3 0x0150
>> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_4 0x0158
>> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_5 0x015c
>> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_6 0x0c04
>> +#define PCI_DEVICE_ID_INTEL_IE31200_HB_7 0x0c08
>> +
>> +#define IE31200_DIMMS 4
>> +#define IE31200_RANKS 8
>> +#define IE31200_RANKS_PER_CHANNEL 4
>> +#define IE31200_DIMMS_PER_CHANNEL 2
>> +#define IE31200_CHANNELS 2
>> +
>> +/* Intel IE31200 register addresses - device 0 function 0 - DRAM Controller */
>> +#define IE31200_MCHBAR_LOW 0x48
>> +#define IE31200_MCHBAR_HIGH 0x4c
>> +#define IE31200_MCHBAR_MASK 0xffffff8000ULL /* bits 38:15 */
>
> GENMASK_ULL()
>

ok

>> +#define IE31200_MMR_WINDOW_SIZE 32768
>
> (1 << 15) ?
>

ok

>> +
>> +/* Error Status Register (16b)
>> + *
>> + * 15 reserved
>> + * 14 Isochronous TBWRR Run Behind FIFO Full
>> + * (ITCV)
>> + * 13 Isochronous TBWRR Run Behind FIFO Put
>> + * (ITSTV)
>> + * 12 reserved
>> + * 11 MCH Thermal Sensor Event
>> + * for SMI/SCI/SERR (GTSE)
>> + * 10 reserved
>> + * 9 LOCK to non-DRAM Memory Flag (LCKF)
>> + * 8 reserved
>> + * 7 DRAM Throttle Flag (DTF)
>> + * 6:2 reserved
>> + * 1 Multi-bit DRAM ECC Error Flag (DMERR)
>> + * 0 Single-bit DRAM ECC Error Flag (DSERR)
>> + */
>> +#define IE31200_ERRSTS 0xc8
>> +#define IE31200_ERRSTS_UE 0x0002
>
> Let's use the BIT() macro for those:
>
> BIT(1)
>
>> +#define IE31200_ERRSTS_CE 0x0001
>
> BIT(0)
>

make sense.

>> +#define IE31200_ERRSTS_BITS (IE31200_ERRSTS_UE | IE31200_ERRSTS_CE)
>> +
>> +/*
>> + * Channel 0 ECC Error Log (64b)
>> + *
>> + * 63:48 Error Column Address (ERRCOL)
>> + * 47:32 Error Row Address (ERRROW)
>> + * 31:29 Error Bank Address (ERRBANK)
>> + * 28:27 Error Rank Address (ERRRANK)
>> + * 26:24 reserved
>> + * 23:16 Error Syndrome (ERRSYND)
>> + * 15: 2 reserved
>> + * 1 Multiple Bit Error Status (MERRSTS)
>> + * 0 Correctable Error Status (CERRSTS)
>> + */
>> +#define IE31200_C0ECCERRLOG 0x40c8
>> +#define IE31200_C1ECCERRLOG 0x44c8
>> +#define IE31200_ECCERRLOG_CE 0x1
>> +#define IE31200_ECCERRLOG_UE 0x2
>
> Ditto.
>
>> +#define IE31200_ECCERRLOG_RANK_BITS 0x18000000
>
> GENMASK
>
>> +#define IE31200_ECCERRLOG_RANK_SHIFT 27
>> +#define IE31200_ECCERRLOG_SYNDROME_BITS 0xff0000
>
> Ditto.
>

ok.

>> +#define IE31200_ECCERRLOG_SYNDROME_SHIFT 16
>> +#define IE31200_CAPID0 0xe4
>> +
>> +#define IE31200_MAD_DIMM_0_OFFSET 0x5004
>> +
>> +#define IE31200_PAGES(n) \
>> + (n << (28 - PAGE_SHIFT))
>> +
>> +struct ie31200_priv {
>> + void __iomem *window;
>> +};
>> +
>> +static int nr_channels;
>> +
>> +static int how_many_channels(struct pci_dev *pdev)
>> +{
>> + int n_channels;
>> + unsigned char capid0_2b; /* 2nd byte of CAPID0 */
>> +
>> + pci_read_config_byte(pdev, IE31200_CAPID0 + 1, &capid0_2b);
>> +
>> + /* check PDCD: Dual Channel Disable */
>> + if (capid0_2b & 0x10) {
>
> Now that you have nice defines for all those bits, why not do that for
> those naked numbers too? :-)
>
> Maybe even local to this function:
>
> #define PDCD BIT(4)
>
>

ok.

>> + edac_dbg(0, "In single channel mode\n");
>> + n_channels = 1;
>> + } else {
>> + edac_dbg(0, "In dual channel mode\n");
>> + n_channels = 2;
>> + }
>> +
>> + /* check DDPCD - check if both channels are filled */
>
> Ditto:
>
> #define DDPCD BIT(6)
>

ok.

>> + if (capid0_2b & 0x40)
>> + edac_dbg(0, "2 DIMMS per channel disabled\n");
>> + else
>> + edac_dbg(0, "2 DIMMS per channel enabled\n");
>> +
>> + return n_channels;
>> +}
>> +
>> +static bool ecc_capable(struct pci_dev *pdev)
>> +{
>> + unsigned char capid0_4b; /* 4th byte of CAPID0 */
>> +
>> + pci_read_config_byte(pdev, IE31200_CAPID0 + 3, &capid0_4b);
>> + if (capid0_4b & 0x2)
>
> BIT(1) at least, if no define.
>
>> + return false;
>> + return true;
>> +}
>> +
>> +static unsigned long eccerrlog_syndrome(u64 log)
>> +{
>> + return (log & IE31200_ECCERRLOG_SYNDROME_BITS) >>
>> + IE31200_ECCERRLOG_SYNDROME_SHIFT;
>> +}
>
> Looks like this wants to be a macro.
>
>> +
>> +static int eccerrlog_row(int channel, u64 log)
>> +{
>> + u64 rank = ((log & IE31200_ECCERRLOG_RANK_BITS) >>
>> + IE31200_ECCERRLOG_RANK_SHIFT);
>> + return rank;
>> +}
>
> Ditto.
>

ok.

>> +
>> +enum ie31200_chips {
>> + IE31200 = 0,
>> +};
>> +
>> +struct ie31200_dev_info {
>> + const char *ctl_name;
>> +};
>> +
>> +struct ie31200_error_info {
>> + u16 errsts;
>> + u16 errsts2;
>> + u64 eccerrlog[IE31200_CHANNELS];
>> +};
>> +
>> +static const struct ie31200_dev_info ie31200_devs[] = {
>> + [IE31200] = {
>> + .ctl_name = "IE31200"
>> + },
>> +};
>
> Maybe pull those up before the first function definition so that you can
> have all struct definitions together.
>
>> +
>> +static void ie31200_clear_error_info(struct mem_ctl_info *mci)
>> +{
>> + struct pci_dev *pdev;
>> +
>> + pdev = to_pci_dev(mci->pdev);
>
> Just drop all the local gaming by simply doing:
>
> pci_write_bits16(to_pci_dev(mci->pdev), IE31200_ERRSTS,
> IE31200_ERRSTS_BITS,
> IE31200_ERRSTS_BITS);
>
>> +
>> + /*
>> + * Clear any error bits.
>> + * (Yes, we really clear bits by writing 1 to them.)
>
> Yeah, that's that write-to-clear mechanism.
>
>> + */
>> + pci_write_bits16(pdev, IE31200_ERRSTS, IE31200_ERRSTS_BITS,
>> + IE31200_ERRSTS_BITS);
>
> Arg alignment.
>
>> +}
>> +
>> +static void ie31200_get_and_clear_error_info(struct mem_ctl_info *mci,
>> + struct ie31200_error_info *info)
>
> Ditto.
>
>> +{
>> + struct pci_dev *pdev;
>> + struct ie31200_priv *priv = mci->pvt_info;
>> + void __iomem *window = priv->window;
>> +
>> + pdev = to_pci_dev(mci->pdev);
>> +
>> + /*
>> + * This is a mess because there is no atomic way to read all the
>> + * registers at once and the registers can transition from CE being
>> + * overwritten by UE.
>> + */
>
> For the UE you'd get an MCE, right?
>

One would think. But I have a system which is:

Intel(R) Xeon(R) CPU E31270 @ 3.40GHz

and has:

00:00.0 Host bridge: Intel Corporation Device 0108 (rev 09)

And I can regularly generate uncorrected errors, which I see both from the
simple memory scanner I have (which simply compares what it reads to what
it wrote), and as reported by this driver in the 'ue_count'. BUT I don't
see any evidence of MCEs in the logs and the MCE interrupt count in
/proc/interrupts is 0.

> Btw, this driver is polling, AFAICT. Doesn't e3-12xx support the CMCI
> interrupt which you can feed into this driver directly and thus not need
> the polling at all?

On the system with the ce and ue events that I'm testing on, I don't see
'MCE' nudge above 0, in /proc/interrupts. So I think that implies that
we are not getting any CMCI there?

So if possible maybe we can confirm with Intel whether we expect an MCE
for memory errors...

>
>> + pci_read_config_word(pdev, IE31200_ERRSTS, &info->errsts);
>> + if (!(info->errsts & IE31200_ERRSTS_BITS))
>> + return;
>> +
>> + info->eccerrlog[0] = lo_hi_readq(window + IE31200_C0ECCERRLOG);
>> + if (nr_channels == 2)
>> + info->eccerrlog[1] = lo_hi_readq(window + IE31200_C1ECCERRLOG);
>> +
>> + pci_read_config_word(pdev, IE31200_ERRSTS, &info->errsts2);
>> +
>> + /*
>> + * If the error is the same for both reads then the first set
>> + * of reads is valid. If there is a change then there is a CE
>> + * with no info and the second set of reads is valid and
>> + * should be UE info.
>> + */
>> + if ((info->errsts ^ info->errsts2) & IE31200_ERRSTS_BITS) {
>> + info->eccerrlog[0] = lo_hi_readq(window + IE31200_C0ECCERRLOG);
>> + if (nr_channels == 2)
>> + info->eccerrlog[1] =
>> + lo_hi_readq(window + IE31200_C1ECCERRLOG);
>> + }
>> +
>> + ie31200_clear_error_info(mci);
>> +}
>> +
>> +static void ie31200_process_error_info(struct mem_ctl_info *mci,
>> + struct ie31200_error_info *info)
>
> arg alignment.
>

ok.

>> +{
>> + int channel;
>> + u64 log;
>> +
>> + if (!(info->errsts & IE31200_ERRSTS_BITS))
>> + return;
>> +
>> + if ((info->errsts ^ info->errsts2) & IE31200_ERRSTS_BITS) {
>> + edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1, 0, 0, 0,
>> + -1, -1, -1, "UE overwrote CE", "");
>> + info->errsts = info->errsts2;
>> + }
>> +
>> + for (channel = 0; channel < nr_channels; channel++) {
>> + log = info->eccerrlog[channel];
>> + if (log & IE31200_ECCERRLOG_UE) {
>> + edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1,
>> + 0, 0, 0,
>> + eccerrlog_row(channel, log),
>> + channel, -1,
>> + "i3000 UE", "");
>> + } else if (log & IE31200_ECCERRLOG_CE) {
>> + edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 1,
>> + 0, 0, eccerrlog_syndrome(log),
>> + eccerrlog_row(channel, log),
>> + channel, -1,
>> + "i3000 CE", "");
>> + }
>> + }
>> +}
>> +
>> +static void ie31200_check(struct mem_ctl_info *mci)
>> +{
>> + struct ie31200_error_info info;
>> +
>> + edac_dbg(1, "MC%d\n", mci->mc_idx);
>> + ie31200_get_and_clear_error_info(mci, &info);
>> + ie31200_process_error_info(mci, &info);
>> +}
>> +
>> +static void __iomem *ie31200_map_mchbar(struct pci_dev *pdev)
>> +{
>> + union {
>> + u64 mchbar;
>> + struct {
>> + u32 mchbar_low;
>> + u32 mchbar_high;
>> + };
>> + } u;
>> + void __iomem *window;
>> +
>> + pci_read_config_dword(pdev, IE31200_MCHBAR_LOW, &u.mchbar_low);
>> + pci_read_config_dword(pdev, IE31200_MCHBAR_HIGH, &u.mchbar_high);
>> + u.mchbar &= IE31200_MCHBAR_MASK;
>> +
>> + if (u.mchbar != (resource_size_t)u.mchbar) {
>> + pr_err("ie31200: mmio space beyond accessible range (0x%llx)\n",
>> + (unsigned long long)u.mchbar);
>
> We have edac_*_printk for those, which takes care of the prefix, etc.
>
>> + return NULL;
>> + }
>> +
>> + window = ioremap_nocache(u.mchbar, IE31200_MMR_WINDOW_SIZE);
>> + if (!window)
>> + pr_err("ie31200: cannot map mmio space at 0x%llx\n",
>> + (unsigned long long)u.mchbar);
>
> Ditto.
>
>> +
>> + return window;
>> +}
>> +
>> +struct dimm_data {
>> + u8 size; /* in 256MB multiples */
>> + u8 dual_rank : 1,
>> + x16_width : 1; /* 0 means x8 width */
>> +};
>
> Also move up to the rest of the struct definitions.
>

ok.

>> +static int ie31200_probe1(struct pci_dev *pdev, int dev_idx)
>> +{
>> + int rc;
>> + int i, j;
>> + struct mem_ctl_info *mci = NULL;
>> + struct edac_mc_layer layers[2];
>> + struct dimm_data dimm_info[IE31200_CHANNELS][IE31200_DIMMS_PER_CHANNEL];
>> + void __iomem *window;
>> + struct ie31200_priv *priv;
>> + u32 addr_decode;
>> +
>> + edac_dbg(0, "MC:\n");
>> +
>> + if (!ecc_capable(pdev)) {
>> + edac_dbg(1, "Not edac capable\n");
>
> dbg? Why not info? Also, "No ECC support" or somesuch, "edac capable" is funny :-)
>
>> + return -ENODEV;
>> + }
>> +
>> + window = ie31200_map_mchbar(pdev);
>> + if (!window)
>> + return -ENODEV;
>> +
>> + /* populate DIMM info */
>> + for (i = 0; i < IE31200_CHANNELS; i++) {
>> + addr_decode = readl(window + IE31200_MAD_DIMM_0_OFFSET +
>> + (i * 4));
>> + edac_dbg(0, "addr_decode: 0x%x\n", addr_decode);
>> + for (j = 0; j < IE31200_DIMMS_PER_CHANNEL; j++) {
>> + dimm_info[i][j].size = (addr_decode >> (j * 8)) & 0xff;
>> + dimm_info[i][j].dual_rank =
>> + (addr_decode & (0x20000 << j)) ? 1 : 0;
>> + dimm_info[i][j].x16_width =
>> + (addr_decode & (0x80000 << j)) ? 1 : 0;
>
> #define's for those naked bits?
>
>> + edac_dbg(0, "size: 0x%x, rank: %d, width: %d\n",
>> + dimm_info[i][j].size,
>> + dimm_info[i][j].dual_rank,
>> + dimm_info[i][j].x16_width);
>> + }
>> + }
>> +
>> + nr_channels = how_many_channels(pdev);
>> +
>> + layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
>> + layers[0].size = IE31200_DIMMS;
>> + layers[0].is_virt_csrow = true;
>> + layers[1].type = EDAC_MC_LAYER_CHANNEL;
>> + layers[1].size = nr_channels;
>> + layers[1].is_virt_csrow = false;
>> + mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
>> + sizeof(struct ie31200_priv));
>> +
>> + rc = -ENOMEM;
>> + if (!mci)
>> + goto fail_unmap;
>> +
>> + edac_dbg(3, "MC: init mci\n");
>> +
>> + mci->pdev = &pdev->dev;
>> + mci->mtype_cap = MEM_FLAG_DDR3;
>> +
>> + mci->edac_ctl_cap = EDAC_FLAG_SECDED;
>> + mci->edac_cap = EDAC_FLAG_SECDED;
>> +
>> + mci->mod_name = EDAC_MOD_STR;
>> + mci->mod_ver = IE31200_REVISION;
>> + mci->ctl_name = ie31200_devs[dev_idx].ctl_name;
>> + mci->dev_name = pci_name(pdev);
>> + mci->edac_check = ie31200_check;
>> + mci->ctl_page_to_phys = NULL;
>> + priv = mci->pvt_info;
>> + priv->window = window;
>> +
>> + /*
>> + * The dram rank boundary (DRB) reg values are boundary addresses
>> + * for each DRAM rank with a granularity of 64MB. DRB regs are
>> + * cumulative; the last one will contain the total memory
>> + * contained in all ranks.
>> + */
>> + for (i = 0; i < IE31200_DIMMS_PER_CHANNEL; i++) {
>> + for (j = 0; j < IE31200_CHANNELS; j++) {
>> + struct dimm_info *dimm;
>> + unsigned long nr_pages;
>> +
>> + nr_pages = IE31200_PAGES(dimm_info[j][i].size);
>> + if (nr_pages == 0)
>> + continue;
>> +
>> + if (dimm_info[j][i].dual_rank) {
>> + nr_pages = nr_pages / 2;
>> + dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
>> + mci->n_layers, (i * 2) + 1,
>> + j, 0);
>> + dimm->nr_pages = nr_pages;
>> + edac_dbg(0, "set nr pages: 0x%lx\n", nr_pages);
>> + dimm->grain = 8; /* just a guess */
>> + dimm->mtype = MEM_DDR3;
>> + dimm->dtype = DEV_UNKNOWN;
>> + dimm->edac_mode = EDAC_UNKNOWN;
>> + }
>> + dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
>> + mci->n_layers, i * 2, j, 0);
>> + dimm->nr_pages = nr_pages;
>> + edac_dbg(0, "set nr pages: 0x%lx\n", nr_pages);
>> + dimm->grain = 8; /* same guess */
>> + dimm->mtype = MEM_DDR3;
>> + dimm->dtype = DEV_UNKNOWN;
>> + dimm->edac_mode = EDAC_UNKNOWN;
>> + }
>> + }
>> +
>> + ie31200_clear_error_info(mci);
>> +
>> + rc = -ENODEV;
>> + if (edac_mc_add_mc(mci)) {
>> + edac_dbg(3, "MC: failed edac_mc_add_mc()\n");
>> + goto fail_free;
>> + }
>> +
>> + /* get this far and it's successful */
>> + edac_dbg(3, "MC: success\n");
>> + return 0;
>> +
>> +fail_free:
>> + if (mci)
>> + edac_mc_free(mci);
>> +fail_unmap:
>> + iounmap(window);
>> +
>> + return rc;
>> +}
>> +
>> +static int ie31200_init_one(struct pci_dev *pdev,
>> + const struct pci_device_id *ent)
>> +{
>> + int rc;
>> +
>> + edac_dbg(0, "MC:\n");
>> +
>> + if (pci_enable_device(pdev) < 0)
>> + return -EIO;
>> +
>> + rc = ie31200_probe1(pdev, ent->driver_data);
>
> return ie31200_probe1(..)
>
> and kill rc.
>

I also noticed that some EDAC drivers do a 'pci_dev_get()' in
their 'init_one' function, so I'm not clear if that's needed as well
(I'm hoping the MCH can't be removed at run-time :)).

Thanks for the review.

-Jason






2014-04-08 22:35:54

by Tony Luck

[permalink] [raw]
Subject: RE: [PATCH 3/3] ie31200_edac: Add driver

>> Btw, this driver is polling, AFAICT. Doesn't e3-12xx support the CMCI
>> interrupt which you can feed into this driver directly and thus not need
>> the polling at all?
>
> On the system with the ce and ue events that I'm testing on, I don't see
> 'MCE' nudge above 0, in /proc/interrupts. So I think that implies that
> we are not getting any CMCI there?

CMCI will bump up the "THR" (Threshold) entries in /proc/interrupts.

> So if possible maybe we can confirm with Intel whether we expect an MCE
> for memory errors...

MCG_CAP bit 10 tells you whether a given processor implements CMCI.
If that is set - then MCi_CTL2 bit 30 indicates whether a given bank
supports it (Linux tries to set this bit, if it sticks, then it knows that CMCI
is supported - Linux also assigns ownership of the bank to the first cpu
to successfully set it (since a bank may be shared by multiple threads/cores
on a package).

Consumed uncorrectable errors should generate a machine check. Which
on the E3-12xx series will be a fatal machine check: MCi_STATUS.PCC=1

-Tony

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2014-04-09 03:03:14

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On 04/08/2014 06:34 PM, Luck, Tony wrote:
>>> Btw, this driver is polling, AFAICT. Doesn't e3-12xx support the CMCI
>>> interrupt which you can feed into this driver directly and thus not need
>>> the polling at all?
>> On the system with the ce and ue events that I'm testing on, I don't see
>> 'MCE' nudge above 0, in /proc/interrupts. So I think that implies that
>> we are not getting any CMCI there?
> CMCI will bump up the "THR" (Threshold) entries in /proc/interrupts.

Ok, so on the system with ue and ce events (as reported by driver
and confirmed with a memory scanner), "THR" is 0 across
all cpus, and I see no machine checks in the logs...

>> So if possible maybe we can confirm with Intel whether we expect an MCE
>> for memory errors...
> MCG_CAP bit 10 tells you whether a given processor implements CMCI.
> If that is set - then MCi_CTL2 bit 30 indicates whether a given bank
> supports it (Linux tries to set this bit, if it sticks, then it knows that CMCI
> is supported - Linux also assigns ownership of the bank to the first cpu
> to successfully set it (since a bank may be shared by multiple threads/cores
> on a package).
>
> Consumed uncorrectable errors should generate a machine check. Which
> on the E3-12xx series will be a fatal machine check: MCi_STATUS.PCC=1
>
> -Tony
>

Hmmm...as I said, I'm not getting any machine checks with ue errors. I've
got a fairly old kernel on the system atm, I will try loading a newer kernel,
to see if that makes any difference...

Thanks,

-Jason

2014-04-09 11:11:39

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Tue, Apr 08, 2014 at 11:03:08PM -0400, Jason Baron wrote:
> Hmmm...as I said, I'm not getting any machine checks with ue errors.
> I've got a fairly old kernel on the system atm, I will try loading a
> newer kernel, to see if that makes any difference...

Well, regardless of the kernel, if the machine encounters an UE, it
would explode. Can you confirm those are *actually* UEs and they're
getting consumed?

Also, check MCG_CAP, i.e.:

# rdmsr 0x179

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-09 11:13:25

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Tue, Apr 08, 2014 at 06:16:43PM -0400, Jason Baron wrote:
> I also noticed that some EDAC drivers do a 'pci_dev_get()' in their
> 'init_one' function, so I'm not clear if that's needed as well (I'm
> hoping the MCH can't be removed at run-time :)).

That'll be a fun stunt if it were possible. :-)

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-09 11:36:00

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Fri, Apr 04, 2014 at 09:14:04PM +0000, Jason Baron wrote:
> Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
> is based on the following E3-1200 specs:
>
> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v3-vol-2-datasheet.html
>
> I've tested this on bad memory hardware, and observed correlating bad reads
> and uncorrected memory errors as reported by the driver.
>
> Signed-off-by: Jason Baron <[email protected]>
> ---
> drivers/edac/Kconfig | 7 +
> drivers/edac/Makefile | 1 +
> drivers/edac/ie31200_edac.c | 540 ++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 548 insertions(+)
> create mode 100644 drivers/edac/ie31200_edac.c
>
> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
> index 878f090..27f44a1 100644
> --- a/drivers/edac/Kconfig
> +++ b/drivers/edac/Kconfig
> @@ -186,6 +186,13 @@ config EDAC_I3200
> Support for error detection and correction on the Intel
> 3200 and 3210 server chipsets.
>
> +config EDAC_IE31200
> + tristate "Intel e312xx"
> + depends on EDAC_MM_EDAC && PCI && X86
> + help
> + Support for error detection and correction on the Intel
> + E3-1200 processor.
> +
> config EDAC_X38
> tristate "Intel X38"
> depends on EDAC_MM_EDAC && PCI && X86
> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index 4154ed6..c479a24 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -37,6 +37,7 @@ obj-$(CONFIG_EDAC_I82875P) += i82875p_edac.o
> obj-$(CONFIG_EDAC_I82975X) += i82975x_edac.o
> obj-$(CONFIG_EDAC_I3000) += i3000_edac.o
> obj-$(CONFIG_EDAC_I3200) += i3200_edac.o
> +obj-$(CONFIG_EDAC_IE31200) += ie31200_edac.o
> obj-$(CONFIG_EDAC_X38) += x38_edac.o
> obj-$(CONFIG_EDAC_I82860) += i82860_edac.o
> obj-$(CONFIG_EDAC_R82600) += r82600_edac.o
> diff --git a/drivers/edac/ie31200_edac.c b/drivers/edac/ie31200_edac.c
> new file mode 100644
> index 0000000..ae03d21
> --- /dev/null
> +++ b/drivers/edac/ie31200_edac.c
> @@ -0,0 +1,540 @@
> +/*
> + * Intel E3-1200
> + * Copyright (C) 2014 Jason Baron <[email protected]>
> + *
> + * Support for the E3-1200 processor family. Heavily based on previous
> + * Intel EDAC drivers.

Btw, remind me again why this isn't part of the sb_edac? AFAICT, the
e3-12xx thing is a Sandybridge, right?

Why not put it into sb_edac - it is small enough and if you're lucky,
you might even share functionality?

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-09 13:35:04

by Aristeu Rozanski

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Wed, Apr 09, 2014 at 01:35:52PM +0200, Borislav Petkov wrote:
> Btw, remind me again why this isn't part of the sb_edac? AFAICT, the
> e3-12xx thing is a Sandybridge, right?
>
> Why not put it into sb_edac - it is small enough and if you're lucky,
> you might even share functionality?

By quickly looking at the driver (sorry Jason, no proper review yet :( )
it's a very different beast. Tony, any insights on why?

--
Aristeu

2014-04-09 17:30:31

by Tony Luck

[permalink] [raw]
Subject: RE: [PATCH 3/3] ie31200_edac: Add driver

>> Why not put it into sb_edac - it is small enough and if you're lucky,
>> you might even share functionality?
>
> By quickly looking at the driver (sorry Jason, no proper review yet :( )
> it's a very different beast. Tony, any insights on why?

The E3-12xx processors connect out to a different (desktop) chipset from
the E5 (server parts). Perhaps that means the memory controller are different
too???

-Tony

2014-04-09 17:36:43

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Wed, Apr 09, 2014 at 05:17:53PM +0000, Luck, Tony wrote:
> The E3-12xx processors connect out to a different (desktop) chipset
> from the E5 (server parts). Perhaps that means the memory controller
> are different too???

You gotta love how Intel has a different memory controller for server
and desktop parts. :-)

Btw, is that the official memory controller name you'd like the edac
driver to be called - ie31200? I'm asking because it should probably
have a name which denotes the memory controller and not the processor
series... (who knows, we might find that memory controller built in
somewhere else :-))

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-09 18:50:06

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On 04/09/2014 07:35 AM, Borislav Petkov wrote:
> On Fri, Apr 04, 2014 at 09:14:04PM +0000, Jason Baron wrote:
>> Add 'ie31200_edac' driver for the E3-1200 series of Intel processors. Driver
>> is based on the following E3-1200 specs:
>>
>> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-2-datasheet.html
>> http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200v3-vol-2-datasheet.html
>>
>> I've tested this on bad memory hardware, and observed correlating bad reads
>> and uncorrected memory errors as reported by the driver.
>>
>> Signed-off-by: Jason Baron <[email protected]>
>> ---
>> drivers/edac/Kconfig | 7 +
>> drivers/edac/Makefile | 1 +
>> drivers/edac/ie31200_edac.c | 540 ++++++++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 548 insertions(+)
>> create mode 100644 drivers/edac/ie31200_edac.c
>>
>> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
>> index 878f090..27f44a1 100644
>> --- a/drivers/edac/Kconfig
>> +++ b/drivers/edac/Kconfig
>> @@ -186,6 +186,13 @@ config EDAC_I3200
>> Support for error detection and correction on the Intel
>> 3200 and 3210 server chipsets.
>>
>> +config EDAC_IE31200
>> + tristate "Intel e312xx"
>> + depends on EDAC_MM_EDAC && PCI && X86
>> + help
>> + Support for error detection and correction on the Intel
>> + E3-1200 processor.
>> +
>> config EDAC_X38
>> tristate "Intel X38"
>> depends on EDAC_MM_EDAC && PCI && X86
>> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
>> index 4154ed6..c479a24 100644
>> --- a/drivers/edac/Makefile
>> +++ b/drivers/edac/Makefile
>> @@ -37,6 +37,7 @@ obj-$(CONFIG_EDAC_I82875P) += i82875p_edac.o
>> obj-$(CONFIG_EDAC_I82975X) += i82975x_edac.o
>> obj-$(CONFIG_EDAC_I3000) += i3000_edac.o
>> obj-$(CONFIG_EDAC_I3200) += i3200_edac.o
>> +obj-$(CONFIG_EDAC_IE31200) += ie31200_edac.o
>> obj-$(CONFIG_EDAC_X38) += x38_edac.o
>> obj-$(CONFIG_EDAC_I82860) += i82860_edac.o
>> obj-$(CONFIG_EDAC_R82600) += r82600_edac.o
>> diff --git a/drivers/edac/ie31200_edac.c b/drivers/edac/ie31200_edac.c
>> new file mode 100644
>> index 0000000..ae03d21
>> --- /dev/null
>> +++ b/drivers/edac/ie31200_edac.c
>> @@ -0,0 +1,540 @@
>> +/*
>> + * Intel E3-1200
>> + * Copyright (C) 2014 Jason Baron <[email protected]>
>> + *
>> + * Support for the E3-1200 processor family. Heavily based on previous
>> + * Intel EDAC drivers.
>
> Btw, remind me again why this isn't part of the sb_edac? AFAICT, the
> e3-12xx thing is a Sandybridge, right?
>

So, I *think* that the E3-12xx processors cross microarchitectures.
So while E3-1270 is sandy bridge, the E3-1270 v3 is Haswell, and
the E3-1270 v2 is Ivy Bridge.

> Why not put it into sb_edac - it is small enough and if you're lucky,
> you might even share functionality?
>

As far as I understand the sb_edac driver is decoding MCEs via a registration
function with the core MCE code.

However, we have a number of E3-12xx boxes, and we have not seen a MCE
generated on ue or ce errors.

Thanks,

-Jason



2014-04-09 18:57:23

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On 04/09/2014 01:36 PM, Borislav Petkov wrote:
> On Wed, Apr 09, 2014 at 05:17:53PM +0000, Luck, Tony wrote:
>> The E3-12xx processors connect out to a different (desktop) chipset
>> from the E5 (server parts). Perhaps that means the memory controller
>> are different too???
>
> You gotta love how Intel has a different memory controller for server
> and desktop parts. :-)
>

Right, so maybe the fact that its a desktop chipset means that it behaves
differently and doesn't raise MCEs on memory errors. We have a bunch
of these processors and we haven't yet seen an MCE raised on a memory
error.

> Btw, is that the official memory controller name you'd like the edac
> driver to be called - ie31200? I'm asking because it should probably
> have a name which denotes the memory controller and not the processor
> series... (who knows, we might find that memory controller built in
> somewhere else :-))
>
> Thanks.
>

The reason I went with ie31200 is that afaiu the memory controller
hub is integrated into the cpu. Another alternative might be ie312xx?

Thanks,

-Jason

2014-04-09 19:15:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Wed, Apr 09, 2014 at 02:57:19PM -0400, Jason Baron wrote:
> Right, so maybe the fact that its a desktop chipset means that it
> behaves differently and doesn't raise MCEs on memory errors. We have a
> bunch of these processors and we haven't yet seen an MCE raised on a
> memory error.

This can't be - an uncorrectable error will have to generate a machine
check exception if consumed - there's no other option.

Can you do

rdmsr 0x179
rdmsr 0x17b

and paste the results here?

> The reason I went with ie31200 is that afaiu the memory controller hub
> is integrated into the cpu. Another alternative might be ie312xx?

Yeah, if Tony doesn't come up with something more official about that
MCH.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-09 19:53:57

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On 04/09/2014 03:14 PM, Borislav Petkov wrote:
> On Wed, Apr 09, 2014 at 02:57:19PM -0400, Jason Baron wrote:
>> Right, so maybe the fact that its a desktop chipset means that it
>> behaves differently and doesn't raise MCEs on memory errors. We have a
>> bunch of these processors and we haven't yet seen an MCE raised on a
>> memory error.
>
> This can't be - an uncorrectable error will have to generate a machine
> check exception if consumed - there's no other option.
>
> Can you do
>
> rdmsr 0x179
> rdmsr 0x17b
>
> and paste the results here?
>

Unfortunately, the box reporting the ue errors just went into transit (so
that I can better examine this issue), so I will probably not be able to
run this experiment on that specific box until next week.

However, I was able to run it on a very similar box (same pci id for the
mch), and I get:

# ./rdmsr 0x179
c09

# ./rdmsr 0x17b
rdmsr: CPU 0 cannot read MSR 0x0000017b

Thanks,

-Jason

2014-04-09 20:16:24

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Wed, Apr 09, 2014 at 03:53:49PM -0400, Jason Baron wrote:
> Unfortunately, the box reporting the ue errors just went into transit (so
> that I can better examine this issue), so I will probably not be able to
> run this experiment on that specific box until next week.
>
> However, I was able to run it on a very similar box (same pci id for the
> mch), and I get:
>
> # ./rdmsr 0x179
> c09

1100_0000_1001
11 7 3

Hohumm, that's interesting: 9 MCE banks, there's *no* MCG_CTL register
present (bit 8) (that's why the read below fails) so you cannot turn on
or off the error reporting banks.

bit 10 says you have CMCI so you don't need the polling, you probably
don't even need the EDAC driver at all ... unless you want to report
which DRAM channel had the error (this is probably the only additional
information your drivers gives and which the info in mcelog won't give).

Which gives the next question: what exactly is this EDAC driver going
to be used for if it reports (row, channel, syndrome) and how are you
going to use that info? I'm looking at ie31200_process_error_info().

> # ./rdmsr 0x17b
> rdmsr: CPU 0 cannot read MSR 0x0000017b

In any case, that's some strange machine. If it is a desktop MCH, I can
understand it not raising MCEs but that would be pretty nasty, still.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-09 21:33:16

by Tony Luck

[permalink] [raw]
Subject: RE: [PATCH 3/3] ie31200_edac: Add driver

> Unfortunately, the box reporting the ue errors just went into transit (so
> that I can better examine this issue), so I will probably not be able to
> run this experiment on that specific box until next week.

Do you have any other logs from this machine. Is there something
logged in one (or more) of the machine check banks when your EDAC
driver says that there are uncorrected errors?

When the box is back online again - I'd be interested to know if mcelog(8)
daemon reports any errors. Grab the latest from mcelog.org, compile
and run as "mcelog --daemon". Logs show up in /var/log/mcelog

> # ./rdmsr 0x179
> c09

So this processor does support CMCI - next question is whether each
bank support it (and got enabled by Linux) [can run on any system ... don't
need to wait for the one to finish transit)]

# for I in `seq 0 8`
do
./rdmsr 0x28$i
done

will print the MCi_CTL2 registers from each bank. Bit 30 (0x40000000)
shows CMCI enabled.

On the name of the driver - can you throw in an underscore: ie3_12xx.c ?

Do you have systems from Sandy Bridge, Ivy Bridge and Haswell generations
(no suffix for Sandy Bridge, then v2 and v3) ... and does this driver work across all of
them? If it is just for Haswell ... then "ie3_12xx_v3.c" might be a better name.

-Tony
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2014-04-09 22:15:34

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On 04/09/2014 05:33 PM, Luck, Tony wrote:
>> Unfortunately, the box reporting the ue errors just went into transit (so
>> that I can better examine this issue), so I will probably not be able to
>> run this experiment on that specific box until next week.
>
> Do you have any other logs from this machine. Is there something
> logged in one (or more) of the machine check banks when your EDAC
> driver says that there are uncorrected errors?
>
> When the box is back online again - I'd be interested to know if mcelog(8)
> daemon reports any errors. Grab the latest from mcelog.org, compile
> and run as "mcelog --daemon". Logs show up in /var/log/mcelog
>


So when the driver sees uncorrected errors, I'm also seeing them in my
memory scanning program - so they correspond nicely. I didn't see anything
logged in /var/log/mcelog, but I will update to the latest when possible.


>> # ./rdmsr 0x179
>> c09
>
> So this processor does support CMCI - next question is whether each
> bank support it (and got enabled by Linux) [can run on any system ... don't
> need to wait for the one to finish transit)]
>
> # for I in `seq 0 8`
> do
> ./rdmsr 0x28$i
> done
>
> will print the MCi_CTL2 registers from each bank. Bit 30 (0x40000000)
> shows CMCI enabled.
>

>From the similar system:

# for i in `seq 0 8`; do ./rdmsr 0x28$i; done
40000001
40000001
0
40000001
0
40000001
40000001
40000001
40000001

So looks like cmci is eenabled on the banks. But like I said on the box with the ue errors I
didn't see 'THR' or 'MCE' increment in /proc/interrupts.


> On the name of the driver - can you throw in an underscore: ie3_12xx.c ?
>
> Do you have systems from Sandy Bridge, Ivy Bridge and Haswell generations
> (no suffix for Sandy Bridge, then v2 and v3) ... and does this driver work across all of
> them? If it is just for Haswell ... then "ie3_12xx_v3.c" might be a better name.
>

I do have all the above systems. The one with the memory errors in a Sandy Bridge. On
the other systems I can only say that the driver loads and reports the rank information
correctly. IE I haven't hit the error paths on those.

Thanks,

-Jason

2014-04-09 22:45:23

by Tony Luck

[permalink] [raw]
Subject: RE: [PATCH 3/3] ie31200_edac: Add driver

> So when the driver sees uncorrected errors, I'm also seeing them in my
> memory scanning program - so they correspond nicely. I didn't see anything
> logged in /var/log/mcelog, but I will update to the latest when possible.

I wonder if there are some BIOS options to enable reporting via CMCI/MCE?
On the E5 systems the reference BIOS uses phrases like "poison forwarding"
in the option names.

The above behavior sounds less than useful.

Scenario: Your mission critical app is running (controlling a giant laser cutter).
Oops there is a memory error, and the bad data arrives at the application causing
it to swing the laser beam through 180 degrees, destroying half of your lab.
A few seconds/minutes later - your EDAC driver prints a message saying that
the uncorrected error count just got incremented.

-Tony
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2014-04-10 01:52:23

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On 04/09/2014 06:44 PM, Luck, Tony wrote:
>> So when the driver sees uncorrected errors, I'm also seeing them in my
>> memory scanning program - so they correspond nicely. I didn't see anything
>> logged in /var/log/mcelog, but I will update to the latest when possible.
> I wonder if there are some BIOS options to enable reporting via CMCI/MCE?
> On the E5 systems the reference BIOS uses phrases like "poison forwarding"
> in the option names.
>
> The above behavior sounds less than useful.
>
> Scenario: Your mission critical app is running (controlling a giant laser cutter).
> Oops there is a memory error, and the bad data arrives at the application causing
> it to swing the laser beam through 180 degrees, destroying half of your lab.
> A few seconds/minutes later - your EDAC driver prints a message saying that
> the uncorrected error count just got incremented.
>
> -Tony

Agreed, and I don't like the polling either. But so far on this h/w I have not
been able to find a better option.

I also seem to recall, that ce errors tend to proceed ue errors. So, I think
alerting on ce errors can help avoid getting into ue errors.

So IMO there is currently value in this driver, and I know others have
requested support for this h/w in the past (on the edac mailing lists).

I'll see what else I can find once I get the problematic h/w in hand.

Thanks,

-Jason

2014-04-10 09:30:33

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On Wed, Apr 09, 2014 at 10:44:21PM +0000, Luck, Tony wrote:
> Scenario: Your mission critical app is running (controlling a giant
> laser cutter). Oops there is a memory error, and the bad data arrives
> at the application causing it to swing the laser beam through 180
> degrees, destroying half of your lab. A few seconds/minutes later -
> your EDAC driver prints a message saying that the uncorrected error
> count just got incremented.

LOL, that sounds like an episode from some cartoon movie with a nutty
professor with thick glasses in it. On the equipment boilerplate it says
ACME^WEDAC.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-06-11 21:54:30

by Jason Baron

[permalink] [raw]
Subject: Re: [PATCH 3/3] ie31200_edac: Add driver

On 04/09/2014 05:33 PM, Luck, Tony wrote:
>> Unfortunately, the box reporting the ue errors just went into transit (so
>> that I can better examine this issue), so I will probably not be able to
>> run this experiment on that specific box until next week.
>
> Do you have any other logs from this machine. Is there something
> logged in one (or more) of the machine check banks when your EDAC
> driver says that there are uncorrected errors?
>
> When the box is back online again - I'd be interested to know if mcelog(8)
> daemon reports any errors. Grab the latest from mcelog.org, compile
> and run as "mcelog --daemon". Logs show up in /var/log/mcelog
>

I finally got the bad dimm in hand - running the latest version
of mcelog, I don't see anything reported in the log. (I verified
that mcelog was reporting events via mce-inject.)


>> # ./rdmsr 0x179
>> c09
>
> So this processor does support CMCI - next question is whether each
> bank support it (and got enabled by Linux) [can run on any system ... don't
> need to wait for the one to finish transit)]
>
> # for I in `seq 0 8`
> do
> ./rdmsr 0x28$i
> done
>
> will print the MCi_CTL2 registers from each bank. Bit 30 (0x40000000)
> shows CMCI enabled.
>
> On the name of the driver - can you throw in an underscore: ie3_12xx.c ?
>
> Do you have systems from Sandy Bridge, Ivy Bridge and Haswell generations
> (no suffix for Sandy Bridge, then v2 and v3) ... and does this driver work across all of
> them? If it is just for Haswell ... then "ie3_12xx_v3.c" might be a better name.
>

So I've moved the dimm around to all three systems Sandy, Ivy Bridge and Haswell.
And this driver is populating '/sys/devices/system/edac/mc/mc0/ue_count' correctly
when a userspace memory scanning program also report errors. So the driver appears
to be doing the right thing.

I've tested:

CPU E3-1270 v3 @ 3.50GHz : 8086:0c08 (haswell)
CPU E3-1270 V2 @ 3.50GHz : 8086:0158 (ivy bridge)
CPU E31270 @ 3.40GHz : 8086:0108 (sandy bridge)

I still don't see any evidence of CMCI doing anything as the "THR" entry in
/proc/interrupts doesn't budge off of 0.

Let me know if you'd like me to look at anything else...

Thanks,

-Jason