2008-12-23 21:58:06

by Michał Mirosław

[permalink] [raw]
Subject: [PATCH] pci-quirks: Unhide 'Overflow' device on i828{6,7}5P/PE chipsets

As I found out from EDAC driver sources for i82875P some BIOSes for
i82875P/PE hide 'overflow' device 6. The same thing happens for
i82865P/PE chipsets.

After testing this patch for couple of days on my laptop (i82856P)
it looks like something is resetting device 0 (MCH) config register
0xF4 to zero and effectively disabling the device again. The delay
looks random to me. I can easily update the register using
'hexedit /sys/bus/pci/devices/0000\:00\:00.0/config' and see
correct values in lspci output afterwards.

Signed-off-by: Micha? Miros?aw <[email protected]>

diff -urN linux-2.6.27.7-brfix1-nvpid/drivers/pci/quirks.c pci-quirks/drivers/pci/quirks.c
--- linux-2.6.27.7-brfix1-nvpid/drivers/pci/quirks.c 2008-10-10 00:13:53.000000000 +0200
+++ pci-quirks/drivers/pci/quirks.c 2008-12-06 22:18:46.000000000 +0100
@@ -2007,3 +2008,25 @@
quirk_msi_intx_disable_bug);

#endif /* CONFIG_PCI_MSI */
+
+/* Originally in EDAC sources for i82875P:
+ * Intel tells BIOS developers to hide device 6 which
+ * configures the overflow device access containing
+ * the DRBs - this is where we expose device 6.
+ * http://www.x86-secret.com/articles/tweak/pat/patsecrets-2.htm
+ */
+static void __devinit quirk_unhide_mch_memory_controller_dev6(struct pci_dev *dev)
+{
+ u8 reg;
+
+ if (pci_read_config_byte(dev, 0xF4, &reg) == 0 && !(reg & 0x02)) {
+ dev_info(&dev->dev, "Enabling MCH Memory Controller 'Overflow' Device");
+ pci_write_config_byte(dev, 0xF4, reg | 0x02);
+ }
+}
+
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82865_HB,
+ quirk_unhide_mch_memory_controller_dev6);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82875_HB,
+ quirk_unhide_mch_memory_controller_dev6);
+


2008-12-23 23:35:38

by Robert Hancock

[permalink] [raw]
Subject: Re: [PATCH] pci-quirks: Unhide 'Overflow' device on i828{6,7}5P/PE chipsets

Micha? Miros?aw wrote:
> As I found out from EDAC driver sources for i82875P some BIOSes for
> i82875P/PE hide 'overflow' device 6. The same thing happens for
> i82865P/PE chipsets.
>
> After testing this patch for couple of days on my laptop (i82856P)
> it looks like something is resetting device 0 (MCH) config register
> 0xF4 to zero and effectively disabling the device again. The delay
> looks random to me. I can easily update the register using
> 'hexedit /sys/bus/pci/devices/0000\:00\:00.0/config' and see
> correct values in lspci output afterwards.

Could be the BIOS is using the device in SMI code or something (and
that's what disables it), therefore it may be unsafe to unhide this device.

2008-12-23 23:55:06

by Michał Mirosław

[permalink] [raw]
Subject: Re: [PATCH] pci-quirks: Unhide 'Overflow' device on i828{6,7}5P/PE chipsets

On Tue, Dec 23, 2008 at 05:35:14PM -0600, Robert Hancock wrote:
> Micha? Miros?aw wrote:
> >As I found out from EDAC driver sources for i82875P some BIOSes for
> >i82875P/PE hide 'overflow' device 6. The same thing happens for
> >i82865P/PE chipsets.
> >
> >After testing this patch for couple of days on my laptop (i82856P)
> >it looks like something is resetting device 0 (MCH) config register
> >0xF4 to zero and effectively disabling the device again. The delay
> >looks random to me. I can easily update the register using
> >'hexedit /sys/bus/pci/devices/0000\:00\:00.0/config' and see
> >correct values in lspci output afterwards.
> Could be the BIOS is using the device in SMI code or something (and
> that's what disables it), therefore it may be unsafe to unhide this device.

I'm going to dig into this deeper in my spare time. Googling around
for 'System Management Mode' gave interesting pointers...

Best Regards,
Micha? Miros?aw

2009-01-01 19:03:14

by Michał Mirosław

[permalink] [raw]
Subject: [PATCH 2.6.28 1/3] PCI-quirks: Unhide MCH5/6 memory controller configuration device

Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
The same happens for i82865P/PE. Add a quirk to enable this device.
This allows i82875 EDAC driver to bind to chipset's dev #6 and not
dev #0 as the latter is used by AGP driver.

This is rebased against 2.6.28 vanilla kernel, including trimming
of long lines.

After testing this patch for couple of days on my laptop (i82856P)
it looks like something is resetting device 0 (MCH) config register
0xF4 to zero and effectively disabling the device again. The delay
looks random to me. I can easily update the register using
'hexedit /sys/bus/pci/devices/0000\:00\:00.0/config' and see
correct values in lspci output afterwards. This is probably
BIOS's fault. This changes nothing as far as i82875P EDAC driver
is concerned as it has the same assumption that BIOS is well behaved.

In case some really broken BIOS is found, this can be wrapped around
some new Kconfig #ifdef.

Signed-off-by: Micha? Miros?aw <[email protected]>

diff -urN a/drivers/pci/quirks.c b/drivers/pci/quirks.c
--- a/drivers/pci/quirks.c 2008-10-10 00:13:53.000000000 +0200
+++ b/drivers/pci/quirks.c 2008-12-06 22:18:46.000000000 +0100
@@ -1946,6 +1947,27 @@

#endif /* CONFIG_PCI_MSI */

+/* Originally in EDAC sources for i82875P:
+ * Intel tells BIOS developers to hide device 6 which
+ * configures the overflow device access containing
+ * the DRBs - this is where we expose device 6.
+ * http://www.x86-secret.com/articles/tweak/pat/patsecrets-2.htm
+ */
+static void __devinit quirk_unhide_mch_dev6(struct pci_dev *dev)
+{
+ u8 reg;
+
+ if (pci_read_config_byte(dev, 0xF4, &reg) == 0 && !(reg & 0x02)) {
+ dev_info(&dev->dev, "Enabling MCH 'Overflow' Device\n");
+ pci_write_config_byte(dev, 0xF4, reg | 0x02);
+ }
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82865_HB,
+ quirk_unhide_mch_dev6);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82875_HB,
+ quirk_unhide_mch_dev6);
+
static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f, struct pci_fixup *end)
{
while (f < end) {

2009-01-01 19:05:31

by Michał Mirosław

[permalink] [raw]
Subject: [PATCH 2.6.28 2/3] EDAC: Use 'overflow' device for binding i82875 EDAC driver

Bind i82875 EDAC driver to the 'overflow' device.
Depends on 'Unhide MCH5/6 memory controller configuration device' patch
to work in some cases.

This is rebased against 2.6.28 vanilla kernel.

Signed-off-by: Micha? Miros?aw <[email protected]>

--- orig/drivers/edac/i82875p_edac.c 2008-12-07 01:15:51.000000000 +0100
+++ tmp/drivers/edac/i82875p_edac.c 2008-12-07 13:02:32.000000000 +0100
@@ -30,10 +30,6 @@
#define i82875p_mc_printk(mci, level, fmt, arg...) \
edac_mc_chipset_printk(mci, level, "i82875p", fmt, ##arg)

-#ifndef PCI_DEVICE_ID_INTEL_82875_0
-#define PCI_DEVICE_ID_INTEL_82875_0 0x2578
-#endif /* PCI_DEVICE_ID_INTEL_82875_0 */
-
#ifndef PCI_DEVICE_ID_INTEL_82875_6
#define PCI_DEVICE_ID_INTEL_82875_6 0x257e
#endif /* PCI_DEVICE_ID_INTEL_82875_6 */
@@ -157,7 +153,7 @@
};

struct i82875p_pvt {
- struct pci_dev *ovrfl_pdev;
+ struct pci_dev *mci_pdev;
void __iomem *ovrfl_window;
};

@@ -178,18 +174,13 @@
.ctl_name = "i82875p"},
};

-static struct pci_dev *mci_pdev; /* init dev: in case that AGP code has
- * already registered driver
- */
-
static struct edac_pci_ctl_info *i82875p_pci;

static void i82875p_get_error_info(struct mem_ctl_info *mci,
struct i82875p_error_info *info)
{
- struct pci_dev *pdev;
-
- pdev = to_pci_dev(mci->dev);
+ struct i82875p_pvt *pvt = mci->pvt_info;
+ struct pci_dev *pdev = pvt->mci_pdev;

/*
* This is a mess because there is no atomic way to read all the
@@ -262,78 +253,6 @@
i82875p_process_error_info(mci, &info, 1);
}

-/* Return 0 on success or 1 on failure. */
-static int i82875p_setup_overfl_dev(struct pci_dev *pdev,
- struct pci_dev **ovrfl_pdev,
- void __iomem **ovrfl_window)
-{
- struct pci_dev *dev;
- void __iomem *window;
- int err;
-
- *ovrfl_pdev = NULL;
- *ovrfl_window = NULL;
- dev = pci_get_device(PCI_VEND_DEV(INTEL, 82875_6), NULL);
-
- if (dev == NULL) {
- /* Intel tells BIOS developers to hide device 6 which
- * configures the overflow device access containing
- * the DRBs - this is where we expose device 6.
- * http://www.x86-secret.com/articles/tweak/pat/patsecrets-2.htm
- */
- pci_write_bits8(pdev, 0xf4, 0x2, 0x2);
- dev = pci_scan_single_device(pdev->bus, PCI_DEVFN(6, 0));
-
- if (dev == NULL)
- return 1;
-
- err = pci_bus_add_device(dev);
- if (err) {
- i82875p_printk(KERN_ERR,
- "%s(): pci_bus_add_device() Failed\n",
- __func__);
- }
- pci_bus_assign_resources(dev->bus);
- }
-
- *ovrfl_pdev = dev;
-
- if (pci_enable_device(dev)) {
- i82875p_printk(KERN_ERR, "%s(): Failed to enable overflow "
- "device\n", __func__);
- return 1;
- }
-
- if (pci_request_regions(dev, pci_name(dev))) {
-#ifdef CORRECT_BIOS
- goto fail0;
-#endif
- }
-
- /* cache is irrelevant for PCI bus reads/writes */
- window = ioremap_nocache(pci_resource_start(dev, 0),
- pci_resource_len(dev, 0));
-
- if (window == NULL) {
- i82875p_printk(KERN_ERR, "%s(): Failed to ioremap bar6\n",
- __func__);
- goto fail1;
- }
-
- *ovrfl_window = window;
- return 0;
-
-fail1:
- pci_release_regions(dev);
-
-#ifdef CORRECT_BIOS
-fail0:
- pci_disable_device(dev);
-#endif
- /* NOTE: the ovrfl proc entry and pci_dev are intentionally left */
- return 1;
-}
-
/* Return 1 if dual channel mode is active. Else return 0. */
static inline int dual_channel_active(u32 drc)
{
@@ -341,7 +260,6 @@
}

static void i82875p_init_csrows(struct mem_ctl_info *mci,
- struct pci_dev *pdev,
void __iomem * ovrfl_window, u32 drc)
{
struct csrow_info *csrow;
@@ -381,12 +299,12 @@
}
}

-static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
+static int i82875p_probe_edac(struct pci_dev *mci_pdev,
+ struct pci_dev *ovrfl_pdev, int dev_idx)
{
int rc = -ENODEV;
struct mem_ctl_info *mci;
struct i82875p_pvt *pvt;
- struct pci_dev *ovrfl_pdev;
void __iomem *ovrfl_window;
u32 drc;
u32 nr_chans;
@@ -394,10 +312,12 @@

debugf0("%s()\n", __func__);

- ovrfl_pdev = pci_get_device(PCI_VEND_DEV(INTEL, 82875_6), NULL);
+ /* cache is irrelevant for PCI bus reads/writes */
+ ovrfl_window = ioremap_nocache(pci_resource_start(ovrfl_pdev, 0),
+ pci_resource_len(ovrfl_pdev, 0));
+ if (!ovrfl_window)
+ return -EBUSY;

- if (i82875p_setup_overfl_dev(pdev, &ovrfl_pdev, &ovrfl_window))
- return -ENODEV;
drc = readl(ovrfl_window + I82875P_DRC);
nr_chans = dual_channel_active(drc) + 1;
mci = edac_mc_alloc(sizeof(*pvt), I82875P_NR_CSROWS(nr_chans),
@@ -412,21 +332,23 @@
kobject_get(&mci->edac_mci_kobj);

debugf3("%s(): init mci\n", __func__);
- mci->dev = &pdev->dev;
+ mci->dev = &ovrfl_pdev->dev;
mci->mtype_cap = MEM_FLAG_DDR;
mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
mci->edac_cap = EDAC_FLAG_UNKNOWN;
mci->mod_name = EDAC_MOD_STR;
mci->mod_ver = I82875P_REVISION;
mci->ctl_name = i82875p_devs[dev_idx].ctl_name;
- mci->dev_name = pci_name(pdev);
+ mci->dev_name = pci_name(mci_pdev);
mci->edac_check = i82875p_check;
mci->ctl_page_to_phys = NULL;
+
debugf3("%s(): init pvt\n", __func__);
- pvt = (struct i82875p_pvt *)mci->pvt_info;
- pvt->ovrfl_pdev = ovrfl_pdev;
+ pvt = mci->pvt_info;
+ pvt->mci_pdev = mci_pdev;
pvt->ovrfl_window = ovrfl_window;
- i82875p_init_csrows(mci, pdev, ovrfl_window, drc);
+
+ i82875p_init_csrows(mci, ovrfl_window, drc);
i82875p_get_error_info(mci, &discard); /* clear counters */

/* Here we assume that we will never see multiple instances of this
@@ -438,7 +360,7 @@
}

/* allocating generic PCI control info */
- i82875p_pci = edac_pci_create_generic_ctl(&pdev->dev, EDAC_MOD_STR);
+ i82875p_pci = edac_pci_create_generic_ctl(&mci_pdev->dev, EDAC_MOD_STR);
if (!i82875p_pci) {
printk(KERN_WARNING
"%s(): Unable to create PCI control\n",
@@ -458,10 +380,6 @@

fail0:
iounmap(ovrfl_window);
- pci_release_regions(ovrfl_pdev);
-
- pci_disable_device(ovrfl_pdev);
- /* NOTE: the ovrfl proc entry and pci_dev are intentionally left */
return rc;
}

@@ -469,26 +387,52 @@
static int __devinit i82875p_init_one(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
+ int mci_devfn = PCI_DEVFN(PCI_SLOT(pdev->devfn), 0);
+ struct pci_dev *mci_pdev;
int rc;

debugf0("%s()\n", __func__);
i82875p_printk(KERN_INFO, "i82875p init one\n");

- if (pci_enable_device(pdev) < 0)
- return -EIO;
+ mci_pdev = pci_get_slot(pdev->bus, mci_devfn);
+ if (!mci_pdev)
+ return -ENODEV;
+
+ rc = pci_enable_device(mci_pdev);
+ if (rc < 0)
+ goto err_out_put1;
+
+ /* XXX looks like something is clearing this bit after early
+ * quirk phase */
+ pci_write_bits8(pdev, 0xf4, 0x2, 0x2);

- rc = i82875p_probe1(pdev, ent->driver_data);
+ rc = pci_enable_device(pdev);
+ if (rc < 0)
+ goto err_out_put1;

- if (mci_pdev == NULL)
- mci_pdev = pci_dev_get(pdev);
+ rc = pci_request_regions(pdev, pci_name(pdev));
+ if (rc < 0)
+ goto err_out_noreg;

+ rc = i82875p_probe_edac(mci_pdev, pdev, ent->driver_data);
+
+ if (!rc)
+ return 0;
+
+ pci_release_regions(pdev);
+err_out_noreg:
+ pci_disable_device(pdev);
+err_out_put1:
+ pci_dev_put(mci_pdev);
return rc;
}

static void __devexit i82875p_remove_one(struct pci_dev *pdev)
{
+ int mci_devfn = PCI_DEVFN(PCI_SLOT(pdev->devfn), 0);
+ struct pci_dev *mci_pdev;
struct mem_ctl_info *mci;
- struct i82875p_pvt *pvt = NULL;
+ struct i82875p_pvt *pvt;

debugf0("%s()\n", __func__);

@@ -498,29 +442,19 @@
if ((mci = edac_mc_del_mc(&pdev->dev)) == NULL)
return;

- pvt = (struct i82875p_pvt *)mci->pvt_info;
+ pvt = mci->pvt_info;
+ iounmap(pvt->ovrfl_window);

- if (pvt->ovrfl_window)
- iounmap(pvt->ovrfl_window);
-
- if (pvt->ovrfl_pdev) {
-#ifdef CORRECT_BIOS
- pci_release_regions(pvt->ovrfl_pdev);
-#endif /*CORRECT_BIOS */
- pci_disable_device(pvt->ovrfl_pdev);
- pci_dev_put(pvt->ovrfl_pdev);
- }
+ pci_release_regions(pdev);
+ pci_disable_device(pdev);
+ pci_dev_put(pvt->mci_pdev);

edac_mc_free(mci);
}

-static const struct pci_device_id i82875p_pci_tbl[] __devinitdata = {
- {
- PCI_VEND_DEV(INTEL, 82875_0), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- I82875P},
- {
- 0,
- } /* 0 terminated list. */
+static DEFINE_PCI_DEVICE_TABLE(i82875p_pci_tbl) = {
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_82875_6), I82875P },
+ { 0, } /* 0 terminated list. */
};

MODULE_DEVICE_TABLE(pci, i82875p_pci_tbl);
@@ -534,58 +468,19 @@

static int __init i82875p_init(void)
{
- int pci_rc;
-
debugf3("%s()\n", __func__);

- /* Ensure that the OPSTATE is set correctly for POLL or NMI */
- opstate_init();
-
- pci_rc = pci_register_driver(&i82875p_driver);
+ /* Ensure that the OPSTATE is set correctly for POLL or NMI */
+ opstate_init();

- if (pci_rc < 0)
- goto fail0;
-
- if (mci_pdev == NULL) {
- mci_pdev = pci_get_device(PCI_VENDOR_ID_INTEL,
- PCI_DEVICE_ID_INTEL_82875_0, NULL);
-
- if (!mci_pdev) {
- debugf0("875p pci_get_device fail\n");
- pci_rc = -ENODEV;
- goto fail1;
- }
-
- pci_rc = i82875p_init_one(mci_pdev, i82875p_pci_tbl);
-
- if (pci_rc < 0) {
- debugf0("875p init fail\n");
- pci_rc = -ENODEV;
- goto fail1;
- }
- }
-
- return 0;
-
-fail1:
- pci_unregister_driver(&i82875p_driver);
-
-fail0:
- if (mci_pdev != NULL)
- pci_dev_put(mci_pdev);
-
- return pci_rc;
+ return pci_register_driver(&i82875p_driver);
}

static void __exit i82875p_exit(void)
{
debugf3("%s()\n", __func__);

- i82875p_remove_one(mci_pdev);
- pci_dev_put(mci_pdev);
-
pci_unregister_driver(&i82875p_driver);
-
}

module_init(i82875p_init);

2009-01-01 19:06:40

by Michał Mirosław

[permalink] [raw]
Subject: [PATCH 2.6.28 3/3] EDAC: Add support for i82865P/PE chipsets

i82865P have the same registers as i82875P but without support for ECC
memory modules, so this change only allows to see some information from
/sys/devices/system/edac, but no memory errors will be detected.
Checked in chipset's datasheet and tested on HP nx9500 laptop.

This is a actually the same code as previous version. I resend it for
completeness, as it depends on previous patches in this series.

Signed-off-by: Micha? Miros?aw <[email protected]>

--- tmp/drivers/edac/i82875p_edac.c 2008-12-07 13:02:32.000000000 +0100
+++ rechot/drivers/edac/i82875p_edac.c 2008-12-07 13:01:37.000000000 +0100
@@ -30,6 +30,10 @@
#define i82875p_mc_printk(mci, level, fmt, arg...) \
edac_mc_chipset_printk(mci, level, "i82875p", fmt, ##arg)

+#ifndef PCI_DEVICE_ID_INTEL_82865_6
+#define PCI_DEVICE_ID_INTEL_82865_6 0x2576
+#endif /* PCI_DEVICE_ID_INTEL_82865_6 */
+
#ifndef PCI_DEVICE_ID_INTEL_82875_6
#define PCI_DEVICE_ID_INTEL_82875_6 0x257e
#endif /* PCI_DEVICE_ID_INTEL_82875_6 */
@@ -150,6 +154,7 @@

enum i82875p_chips {
I82875P = 0,
+ I82865P = 1, /* this one lacks support for ECC memory */
};

struct i82875p_pvt {
@@ -159,6 +164,9 @@

struct i82875p_dev_info {
const char *ctl_name;
+ unsigned long edac_ctl_cap;
+ void (*check)(struct mem_ctl_info *mci);
+ unsigned long drc_mask; /* valid DRC register bits according to datasheet */
};

struct i82875p_error_info {
@@ -169,11 +177,6 @@
u16 errsts2;
};

-static const struct i82875p_dev_info i82875p_devs[] = {
- [I82875P] = {
- .ctl_name = "i82875p"},
-};
-
static struct edac_pci_ctl_info *i82875p_pci;

static void i82875p_get_error_info(struct mem_ctl_info *mci,
@@ -253,6 +256,26 @@
i82875p_process_error_info(mci, &info, 1);
}

+static void i82865p_check(struct mem_ctl_info *mci)
+{
+}
+
+static const struct i82875p_dev_info i82875p_devs[] = {
+ [I82875P] = {
+ .ctl_name = "i82875p",
+ .edac_ctl_cap = EDAC_FLAG_NONE|EDAC_FLAG_SECDED,
+ .check = i82875p_check,
+ .drc_mask = 0x20660773,
+ },
+ [I82865P] = {
+ .ctl_name = "i82865p",
+ .edac_ctl_cap = EDAC_FLAG_NONE,
+ .check = i82865p_check,
+ .drc_mask = 0x20600773,
+ },
+};
+
+
/* Return 1 if dual channel mode is active. Else return 0. */
static inline int dual_channel_active(u32 drc)
{
@@ -318,7 +341,7 @@
if (!ovrfl_window)
return -EBUSY;

- drc = readl(ovrfl_window + I82875P_DRC);
+ drc = readl(ovrfl_window + I82875P_DRC) & i82875p_devs[dev_idx].drc_mask;
nr_chans = dual_channel_active(drc) + 1;
mci = edac_mc_alloc(sizeof(*pvt), I82875P_NR_CSROWS(nr_chans),
nr_chans, 0);
@@ -334,13 +357,13 @@
debugf3("%s(): init mci\n", __func__);
mci->dev = &ovrfl_pdev->dev;
mci->mtype_cap = MEM_FLAG_DDR;
- mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
+ mci->edac_ctl_cap = i82875p_devs[dev_idx].edac_ctl_cap;
mci->edac_cap = EDAC_FLAG_UNKNOWN;
mci->mod_name = EDAC_MOD_STR;
mci->mod_ver = I82875P_REVISION;
mci->ctl_name = i82875p_devs[dev_idx].ctl_name;
mci->dev_name = pci_name(mci_pdev);
- mci->edac_check = i82875p_check;
+ mci->edac_check = i82875p_devs[dev_idx].check;
mci->ctl_page_to_phys = NULL;

debugf3("%s(): init pvt\n", __func__);
@@ -429,8 +452,6 @@

static void __devexit i82875p_remove_one(struct pci_dev *pdev)
{
- int mci_devfn = PCI_DEVFN(PCI_SLOT(pdev->devfn), 0);
- struct pci_dev *mci_pdev;
struct mem_ctl_info *mci;
struct i82875p_pvt *pvt;

@@ -454,6 +475,7 @@

static DEFINE_PCI_DEVICE_TABLE(i82875p_pci_tbl) = {
{ PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_82875_6), I82875P },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_82865_6), I82865P },
{ 0, } /* 0 terminated list. */
};

2009-01-05 19:20:48

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 2.6.28 1/3] PCI-quirks: Unhide MCH5/6 memory controller configuration device

On Thursday, January 1, 2009 11:02 am Micha? Miros?aw wrote:
> Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
> The same happens for i82865P/PE. Add a quirk to enable this device.
> This allows i82875 EDAC driver to bind to chipset's dev #6 and not
> dev #0 as the latter is used by AGP driver.
>
> This is rebased against 2.6.28 vanilla kernel, including trimming
> of long lines.
>
> After testing this patch for couple of days on my laptop (i82856P)
> it looks like something is resetting device 0 (MCH) config register
> 0xF4 to zero and effectively disabling the device again. The delay
> looks random to me. I can easily update the register using
> 'hexedit /sys/bus/pci/devices/0000\:00\:00.0/config' and see
> correct values in lspci output afterwards. This is probably
> BIOS's fault. This changes nothing as far as i82875P EDAC driver
> is concerned as it has the same assumption that BIOS is well behaved.
>
> In case some really broken BIOS is found, this can be wrapped around
> some new Kconfig #ifdef.
>
> Signed-off-by: Micha? Miros?aw <[email protected]>

This one doesn't apply for me, can you resend a new one against my linux-next
branch?

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center

2009-01-05 20:28:42

by Michał Mirosław

[permalink] [raw]
Subject: Re: [PATCH 2.6.28 1/3] PCI-quirks: Unhide MCH5/6 memory controller configuration device

On Mon, Jan 05, 2009 at 11:20:35AM -0800, Jesse Barnes wrote:
> On Thursday, January 1, 2009 11:02 am Micha? Miros?aw wrote:
> > Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
> > The same happens for i82865P/PE. Add a quirk to enable this device.
> > This allows i82875 EDAC driver to bind to chipset's dev #6 and not
> > dev #0 as the latter is used by AGP driver.
> >
> > This is rebased against 2.6.28 vanilla kernel, including trimming
> > of long lines.
[cut]
> This one doesn't apply for me, can you resend a new one against my linux-next
> branch?

Sure. Following.
-- Micha? Miros?aw

2009-01-05 20:30:22

by Michał Mirosław

[permalink] [raw]
Subject: [PATCH 2.6.28 1/3] PCI-quirks: Unhide MCH5/6 memory controller configuration device

Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
The same happens for i82865P/PE. Add a quirk to enable this device.
This allows i82875 EDAC driver to bind to chipset's dev #6 and not
dev #0 as the latter is used by AGP driver.

After testing this patch for couple of days on my laptop (i82856P)
it looks like something is resetting device 0 (MCH) config register
0xF4 to zero and effectively disabling the device again. The delay
looks random to me. I can easily update the register using
'hexedit /sys/bus/pci/devices/0000\:00\:00.0/config' and see
correct values in lspci output afterwards. This is probably
BIOS's fault. This changes nothing as far as i82875P EDAC driver
is concerned as it has the same assumption that BIOS is well behaved.

In case some really broken BIOS is found, this can be wrapped around
some new Kconfig #ifdef.

Signed-off-by: Micha? Miros?aw <[email protected]>

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index a47db02..0ca02c7 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1787,6 +1787,28 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM,
PCI_DEVICE_ID_NX2_5709S,
quirk_brcm_570x_limit_vpd);

+/* Originally in EDAC sources for i82875P:
+ * Intel tells BIOS developers to hide device 6 which
+ * configures the overflow device access containing
+ * the DRBs - this is where we expose device 6.
+ * http://www.x86-secret.com/articles/tweak/pat/patsecrets-2.htm
+ */
+static void __devinit quirk_unhide_mch_dev6(struct pci_dev *dev)
+{
+ u8 reg;
+
+ if (pci_read_config_byte(dev, 0xF4, &reg) == 0 && !(reg & 0x02)) {
+ dev_info(&dev->dev, "Enabling MCH 'Overflow' Device\n");
+ pci_write_config_byte(dev, 0xF4, reg | 0x02);
+ }
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82865_HB,
+ quirk_unhide_mch_dev6);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82875_HB,
+ quirk_unhide_mch_dev6);
+
+
#ifdef CONFIG_PCI_MSI
/* Some chipsets do not support MSI. We cannot easily rely on setting
* PCI_BUS_FLAGS_NO_MSI in its bus flags because there are actually

2009-01-05 21:03:22

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 2.6.28 1/3] PCI-quirks: Unhide MCH5/6 memory controller configuration device

On Monday, January 5, 2009 12:30 pm Micha? Miros?aw wrote:
> Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
> The same happens for i82865P/PE. Add a quirk to enable this device.
> This allows i82875 EDAC driver to bind to chipset's dev #6 and not
> dev #0 as the latter is used by AGP driver.
>
> After testing this patch for couple of days on my laptop (i82856P)
> it looks like something is resetting device 0 (MCH) config register
> 0xF4 to zero and effectively disabling the device again. The delay
> looks random to me. I can easily update the register using
> 'hexedit /sys/bus/pci/devices/0000\:00\:00.0/config' and see
> correct values in lspci output afterwards. This is probably
> BIOS's fault. This changes nothing as far as i82875P EDAC driver
> is concerned as it has the same assumption that BIOS is well behaved.
>
> In case some really broken BIOS is found, this can be wrapped around
> some new Kconfig #ifdef.
>
> Signed-off-by: Micha? Miros?aw <[email protected]>

Applied, thanks.

--
Jesse Barnes, Intel Open Source Technology Center

2009-01-06 19:02:20

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 2.6.28 1/3] PCI-quirks: Unhide MCH5/6 memory controller configuration device

On Monday 05 January 2009 01:30:06 pm Micha? Miros?aw wrote:
> Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
> The same happens for i82865P/PE. Add a quirk to enable this device.
> This allows i82875 EDAC driver to bind to chipset's dev #6 and not
> dev #0 as the latter is used by AGP driver.
>
> After testing this patch for couple of days on my laptop (i82856P)
> it looks like something is resetting device 0 (MCH) config register
> 0xF4 to zero and effectively disabling the device again. The delay
> looks random to me.

The BIOS left the device hidden. When you enable it with the quirk,
the fact that it mysteriously gets disabled later seems like a pretty
clear indication that something else we don't know about is using the
device. Since there's no synchronization between the "something else"
and the i82875p_edac.c driver, it seems like you're introducing the
possibility for problems.

I don't know anything about the EDAC driver. Is the value it provides
really worth the possible problems with this approach? Maybe it is,
but I don't want to be the one to debug a random interaction that
causes a problem.

Bjorn

> I can easily update the register using
> 'hexedit /sys/bus/pci/devices/0000\:00\:00.0/config' and see
> correct values in lspci output afterwards. This is probably
> BIOS's fault. This changes nothing as far as i82875P EDAC driver
> is concerned as it has the same assumption that BIOS is well behaved.
>
> In case some really broken BIOS is found, this can be wrapped around
> some new Kconfig #ifdef.
>
> Signed-off-by: Micha? Miros?aw <[email protected]>
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index a47db02..0ca02c7 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -1787,6 +1787,28 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM,
> PCI_DEVICE_ID_NX2_5709S,
> quirk_brcm_570x_limit_vpd);
>
> +/* Originally in EDAC sources for i82875P:
> + * Intel tells BIOS developers to hide device 6 which
> + * configures the overflow device access containing
> + * the DRBs - this is where we expose device 6.
> + * http://www.x86-secret.com/articles/tweak/pat/patsecrets-2.htm
> + */
> +static void __devinit quirk_unhide_mch_dev6(struct pci_dev *dev)
> +{
> + u8 reg;
> +
> + if (pci_read_config_byte(dev, 0xF4, &reg) == 0 && !(reg & 0x02)) {
> + dev_info(&dev->dev, "Enabling MCH 'Overflow' Device\n");
> + pci_write_config_byte(dev, 0xF4, reg | 0x02);
> + }
> +}
> +
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82865_HB,
> + quirk_unhide_mch_dev6);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82875_HB,
> + quirk_unhide_mch_dev6);
> +
> +
> #ifdef CONFIG_PCI_MSI
> /* Some chipsets do not support MSI. We cannot easily rely on setting
> * PCI_BUS_FLAGS_NO_MSI in its bus flags because there are actually
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2009-01-06 20:21:32

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 2.6.28 1/3] PCI-quirks: Unhide MCH5/6 memory controller configuration device

On Tuesday, January 6, 2009 11:02 am Bjorn Helgaas wrote:
> On Monday 05 January 2009 01:30:06 pm Micha? Miros?aw wrote:
> > Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
> > The same happens for i82865P/PE. Add a quirk to enable this device.
> > This allows i82875 EDAC driver to bind to chipset's dev #6 and not
> > dev #0 as the latter is used by AGP driver.
> >
> > After testing this patch for couple of days on my laptop (i82856P)
> > it looks like something is resetting device 0 (MCH) config register
> > 0xF4 to zero and effectively disabling the device again. The delay
> > looks random to me.
>
> The BIOS left the device hidden. When you enable it with the quirk,
> the fact that it mysteriously gets disabled later seems like a pretty
> clear indication that something else we don't know about is using the
> device. Since there's no synchronization between the "something else"
> and the i82875p_edac.c driver, it seems like you're introducing the
> possibility for problems.
>
> I don't know anything about the EDAC driver. Is the value it provides
> really worth the possible problems with this approach? Maybe it is,
> but I don't want to be the one to debug a random interaction that
> causes a problem.

Yeah, there's some uncertainty there for sure so I've dropped the patch. If
it really provides a compelling feature we can always add it again with a
config option or runtime option to enable it.

--
Jesse Barnes, Intel Open Source Technology Center