2008-02-15 09:25:28

by Yinghai Lu

Subject: [PATCH 1/5] x86: validate against acpi motherboard resources

From: Robert Hancock <[email protected]>

This patch adds validation of the MMCONFIG table against the ACPI reserved
motherboard resources. If the MMCONFIG table is found to be reserved in
ACPI, we don't bother checking the E820 table. The PCI Express firmware
spec apparently tells BIOS developers that reservation in ACPI is required
and E820 reservation is optional, so checking against ACPI first makes
sense. Many BIOSes don't reserve the MMCONFIG region in E820 even though
it is perfectly functional; the existing check needlessly disables MMCONFIG
in these cases.

In order to do this, MMCONFIG setup has been split into two phases. If PCI
configuration type 1 is not available then MMCONFIG is enabled early as
before. Otherwise, it is enabled later after the ACPI interpreter is
enabled, since we need to be able to execute control methods in order to
check the ACPI reserved resources. Presently this is just triggered off
the end of ACPI interpreter initialization.

There are a few other behavioral changes here:

- Validate all MMCONFIG configurations provided, not just the first one.

- Validate that the entire required length of each configuration, as given
by the provided ending bus number, is reserved, not just the minimum
required allocation.

- Validate that the area is reserved even if we read it from the chipset
directly and not from the MCFG table. This catches the case where the
BIOS didn't set the location properly in the chipset and has mapped it
over other things it shouldn't have.

This also cleans up the MMCONFIG initialization functions so that they
simply do nothing if MMCONFIG is not compiled in.

Based on an original patch by Rajesh Shah from Intel.

[[email protected]: many fixes and cleanups]
Signed-off-by: Robert Hancock <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
Tested-by: Andi Kleen <[email protected]>
Cc: Rajesh Shah <[email protected]>
Cc: Jesse Barnes <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Greg KH <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

arch/x86/pci/init.c | 4
arch/x86/pci/mmconfig-shared.c | 151 +++++++++++++++++++++++++++----
arch/x86/pci/pci.h | 1
drivers/acpi/bus.c | 2
include/linux/pci.h | 8 +
5 files changed, 144 insertions(+), 22 deletions(-)

Index: linux-2.6/arch/x86/pci/init.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/init.c
+++ linux-2.6/arch/x86/pci/init.c
@@ -11,9 +11,7 @@ static __init int pci_access_init(void)
#ifdef CONFIG_PCI_DIRECT
type = pci_direct_probe();
#endif
-#ifdef CONFIG_PCI_MMCONFIG
- pci_mmcfg_init(type);
-#endif
+ pci_mmcfg_early_init(type);
if (raw_pci_ops)
return 0;
#ifdef CONFIG_PCI_BIOS
Index: linux-2.6/arch/x86/pci/mmconfig-shared.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/mmconfig-shared.c
+++ linux-2.6/arch/x86/pci/mmconfig-shared.c
@@ -173,9 +173,78 @@ static void __init pci_mmcfg_insert_reso
pci_mmcfg_resources_inserted = 1;
}

-static void __init pci_mmcfg_reject_broken(int type)
+static acpi_status __init check_mcfg_resource(struct acpi_resource *res,
+ void *data)
+{
+ struct resource *mcfg_res = data;
+ struct acpi_resource_address64 address;
+ acpi_status status;
+
+ if (res->type == ACPI_RESOURCE_TYPE_FIXED_MEMORY32) {
+ struct acpi_resource_fixed_memory32 *fixmem32 =
+ &res->data.fixed_memory32;
+ if (!fixmem32)
+ return AE_OK;
+ if ((mcfg_res->start >= fixmem32->address) &&
+ (mcfg_res->end < (fixmem32->address +
+ fixmem32->address_length))) {
+ mcfg_res->flags = 1;
+ return AE_CTRL_TERMINATE;
+ }
+ }
+ if ((res->type != ACPI_RESOURCE_TYPE_ADDRESS32) &&
+ (res->type != ACPI_RESOURCE_TYPE_ADDRESS64))
+ return AE_OK;
+
+ status = acpi_resource_to_address64(res, &address);
+ if (ACPI_FAILURE(status) ||
+ (address.address_length <= 0) ||
+ (address.resource_type != ACPI_MEMORY_RANGE))
+ return AE_OK;
+
+ if ((mcfg_res->start >= address.minimum) &&
+ (mcfg_res->end < (address.minimum + address.address_length))) {
+ mcfg_res->flags = 1;
+ return AE_CTRL_TERMINATE;
+ }
+ return AE_OK;
+}
+
+static acpi_status __init find_mboard_resource(acpi_handle handle, u32 lvl,
+ void *context, void **rv)
+{
+ struct resource *mcfg_res = context;
+
+ acpi_walk_resources(handle, METHOD_NAME__CRS,
+ check_mcfg_resource, context);
+
+ if (mcfg_res->flags)
+ return AE_CTRL_TERMINATE;
+
+ return AE_OK;
+}
+
+static int __init is_acpi_reserved(unsigned long start, unsigned long end)
+{
+ struct resource mcfg_res;
+
+ mcfg_res.start = start;
+ mcfg_res.end = end;
+ mcfg_res.flags = 0;
+
+ acpi_get_devices("PNP0C01", find_mboard_resource, &mcfg_res, NULL);
+
+ if (!mcfg_res.flags)
+ acpi_get_devices("PNP0C02", find_mboard_resource, &mcfg_res,
+ NULL);
+
+ return mcfg_res.flags;
+}
+
+static void __init pci_mmcfg_reject_broken(void)
{
typeof(pci_mmcfg_config[0]) *cfg;
+ int i;

if ((pci_mmcfg_config_num == 0) ||
(pci_mmcfg_config == NULL) ||
@@ -196,17 +265,37 @@ static void __init pci_mmcfg_reject_brok
goto reject;
}

- /*
- * Only do this check when type 1 works. If it doesn't work
- * assume we run on a Mac and always use MCFG
- */
- if (type == 1 && !e820_all_mapped(cfg->address,
- cfg->address + MMCONFIG_APER_MIN,
- E820_RESERVED)) {
- printk(KERN_ERR "PCI: BIOS Bug: MCFG area at %Lx is not"
- " E820-reserved\n", cfg->address);
- goto reject;
+ for (i = 0; i < pci_mmcfg_config_num; i++) {
+ u32 size = (cfg->end_bus_number + 1) << 20;
+ cfg = &pci_mmcfg_config[i];
+ printk(KERN_NOTICE "PCI: MCFG configuration %d: base %lu "
+ "segment %hu buses %u - %u\n",
+ i, (unsigned long)cfg->address, cfg->pci_segment,
+ (unsigned int)cfg->start_bus_number,
+ (unsigned int)cfg->end_bus_number);
+ if (is_acpi_reserved(cfg->address, cfg->address + size - 1)) {
+ printk(KERN_NOTICE "PCI: MCFG area at %Lx reserved "
+ "in ACPI motherboard resources\n",
+ cfg->address);
+ } else {
+ printk(KERN_ERR "PCI: BIOS Bug: MCFG area at %Lx is not"
+ " reserved in ACPI motherboard resources\n",
+ cfg->address);
+ /* Don't try to do this check unless configuration
+ type 1 is available. */
+ if ((pci_probe & PCI_PROBE_CONF1) &&
+ e820_all_mapped(cfg->address,
+ cfg->address + size - 1,
+ E820_RESERVED))
+ printk(KERN_NOTICE
+ "PCI: MCFG area at %Lx reserved in "
+ "E820\n",
+ cfg->address);
+ else
+ goto reject;
+ }
}
+
return;

reject:
@@ -216,20 +305,46 @@ reject:
pci_mmcfg_config_num = 0;
}

-void __init pci_mmcfg_init(int type)
+void __init pci_mmcfg_early_init(int type)
+{
+ if ((pci_probe & PCI_PROBE_MMCONF) == 0)
+ return;
+
+ /* If type 1 access is available, no need to enable MMCONFIG yet, we can
+ defer until later when the ACPI interpreter is available to better
+ validate things. */
+ if (type == 1)
+ return;
+
+ acpi_table_parse(ACPI_SIG_MCFG, acpi_parse_mcfg);
+
+ if ((pci_mmcfg_config_num == 0) ||
+ (pci_mmcfg_config == NULL) ||
+ (pci_mmcfg_config[0].address == 0))
+ return;
+
+ if (pci_mmcfg_arch_init())
+ pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
+}
+
+void __init pci_mmcfg_late_init(void)
{
int known_bridge = 0;

+ /* MMCONFIG disabled */
if ((pci_probe & PCI_PROBE_MMCONF) == 0)
return;

- if (type == 1 && pci_mmcfg_check_hostbridge())
- known_bridge = 1;
+ /* MMCONFIG already enabled */
+ if (!(pci_probe & PCI_PROBE_MASK & ~PCI_PROBE_MMCONF))
+ return;

- if (!known_bridge) {
+ if ((pci_probe & PCI_PROBE_CONF1) && pci_mmcfg_check_hostbridge())
+ known_bridge = 1;
+ else
acpi_table_parse(ACPI_SIG_MCFG, acpi_parse_mcfg);
- pci_mmcfg_reject_broken(type);
- }
+
+ pci_mmcfg_reject_broken();

if ((pci_mmcfg_config_num == 0) ||
(pci_mmcfg_config == NULL) ||
Index: linux-2.6/arch/x86/pci/pci.h
===================================================================
--- linux-2.6.orig/arch/x86/pci/pci.h
+++ linux-2.6/arch/x86/pci/pci.h
@@ -100,7 +100,6 @@ extern struct pci_raw_ops pci_direct_con
extern int pci_direct_probe(void);
extern void pci_direct_init(int type);
extern void pci_pcbios_init(void);
-extern void pci_mmcfg_init(int type);
extern void pcibios_sort(void);

/* pci-mmconfig.c */
Index: linux-2.6/drivers/acpi/bus.c
===================================================================
--- linux-2.6.orig/drivers/acpi/bus.c
+++ linux-2.6/drivers/acpi/bus.c
@@ -35,6 +35,7 @@
#ifdef CONFIG_X86
#include <asm/mpspec.h>
#endif
+#include <linux/pci.h>
#include <acpi/acpi_bus.h>
#include <acpi/acpi_drivers.h>

@@ -783,6 +784,7 @@ static int __init acpi_init(void)
result = acpi_bus_init();

if (!result) {
+ pci_mmcfg_late_init();
if (!(pm_flags & PM_APM))
pm_flags |= PM_ACPI;
else {
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -1050,5 +1050,13 @@ extern unsigned long pci_cardbus_mem_siz

extern int pcibios_add_platform_entries(struct pci_dev *dev);

+#ifdef CONFIG_PCI_MMCONFIG
+extern void __init pci_mmcfg_early_init(int type);
+extern void __init pci_mmcfg_late_init(void);
+#else
+static inline void pci_mmcfg_early_init(int type) { }
+static inline void pci_mmcfg_late_init(void) { }
+#endif
+
#endif /* __KERNEL__ */
#endif /* LINUX_PCI_H */
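
The containment test used by check_mcfg_resource() and the per-entry size calculation can be sketched in standalone userspace C. This is only an illustration under stated assumptions: `struct mcfg_entry`, `mcfg_size` and `range_covers` are hypothetical names, not part of the patch. Note that the loop as posted appears to initialise `size` from `cfg` before `cfg` is pointed at entry `i`; the sketch takes the entry as an argument so the size always belongs to the entry being checked.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one MCFG entry; not the kernel structure. */
struct mcfg_entry {
	uint64_t address;
	uint8_t start_bus_number;
	uint8_t end_bus_number;
};

/* Each bus decodes 1 MiB of config space, so the region is (end_bus + 1) << 20 bytes. */
static uint64_t mcfg_size(const struct mcfg_entry *cfg)
{
	return ((uint64_t)cfg->end_bus_number + 1) << 20;
}

/* True if the inclusive range [start, end] lies entirely within [base, base + len). */
static int range_covers(uint64_t base, uint64_t len, uint64_t start, uint64_t end)
{
	assert(start <= end);
	return start >= base && end < base + len;
}
```

A caller would pass `cfg->address` and `cfg->address + mcfg_size(cfg) - 1` as `start` and `end`, mirroring the arguments given to is_acpi_reserved() in the patch.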


2008-02-15 09:26:02

by Yinghai Lu

Subject: [PATCH 2/5] x86: clear pci_mmcfg_virt when mmcfg get rejected

From: Yinghai Lu <[email protected]>

On x86_64, we need to free pci_mmcfg_virt and iounmap the pointers it holds
when the MMCONFIG region is not reserved in E820 or ACPI _CRS and gets rejected.

Signed-off-by: Yinghai Lu <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

arch/x86/pci/mmconfig-shared.c | 1 +
arch/x86/pci/mmconfig_32.c | 4 ++++
arch/x86/pci/mmconfig_64.c | 22 +++++++++++++++++++++-
arch/x86/pci/pci.h | 1 +
4 files changed, 27 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/pci/mmconfig-shared.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/mmconfig-shared.c
+++ linux-2.6/arch/x86/pci/mmconfig-shared.c
@@ -300,6 +300,7 @@ static void __init pci_mmcfg_reject_brok

reject:
printk(KERN_ERR "PCI: Not using MMCONFIG.\n");
+ pci_mmcfg_arch_free();
kfree(pci_mmcfg_config);
pci_mmcfg_config = NULL;
pci_mmcfg_config_num = 0;
Index: linux-2.6/arch/x86/pci/mmconfig_32.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/mmconfig_32.c
+++ linux-2.6/arch/x86/pci/mmconfig_32.c
@@ -136,3 +136,7 @@ int __init pci_mmcfg_arch_init(void)
raw_pci_ext_ops = &pci_mmcfg;
return 1;
}
+
+void __init pci_mmcfg_arch_free(void)
+{
+}
Index: linux-2.6/arch/x86/pci/mmconfig_64.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/mmconfig_64.c
+++ linux-2.6/arch/x86/pci/mmconfig_64.c
@@ -127,7 +127,7 @@ static void __iomem * __init mcfg_iorema
int __init pci_mmcfg_arch_init(void)
{
int i;
- pci_mmcfg_virt = kmalloc(sizeof(*pci_mmcfg_virt) *
+ pci_mmcfg_virt = kzalloc(sizeof(*pci_mmcfg_virt) *
pci_mmcfg_config_num, GFP_KERNEL);
if (pci_mmcfg_virt == NULL) {
printk(KERN_ERR "PCI: Can not allocate memory for mmconfig structures\n");
@@ -141,9 +141,29 @@ int __init pci_mmcfg_arch_init(void)
printk(KERN_ERR "PCI: Cannot map mmconfig aperture for "
"segment %d\n",
pci_mmcfg_config[i].pci_segment);
+ pci_mmcfg_arch_free();
return 0;
}
}
raw_pci_ext_ops = &pci_mmcfg;
return 1;
}
+
+void __init pci_mmcfg_arch_free(void)
+{
+ int i;
+
+ if (pci_mmcfg_virt == NULL)
+ return;
+
+ for (i = 0; i < pci_mmcfg_config_num; ++i) {
+ if (pci_mmcfg_virt[i].virt) {
+ iounmap(pci_mmcfg_virt[i].virt);
+ pci_mmcfg_virt[i].virt = NULL;
+ pci_mmcfg_virt[i].cfg = NULL;
+ }
+ }
+
+ kfree(pci_mmcfg_virt);
+ pci_mmcfg_virt = NULL;
+}
Index: linux-2.6/arch/x86/pci/pci.h
===================================================================
--- linux-2.6.orig/arch/x86/pci/pci.h
+++ linux-2.6/arch/x86/pci/pci.h
@@ -105,6 +105,7 @@ extern void pcibios_sort(void);
/* pci-mmconfig.c */

extern int __init pci_mmcfg_arch_init(void);
+extern void __init pci_mmcfg_arch_free(void);

/*
* AMD Fam10h CPUs are buggy, and cannot access MMIO config space
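
The cleanup pattern in this patch (zeroed allocation, then a free routine that only releases slots that were actually mapped) can be sketched in userspace C. This is a rough model under stated assumptions: `struct mapping`, `maps_alloc` and `maps_free` are hypothetical names, `calloc` stands in for kzalloc(), and `free` stands in for iounmap().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-segment mapping record. */
struct mapping {
	void *virt;	/* mapped region, or NULL if this slot was never mapped */
};

static struct mapping *maps;
static int map_count;

/* Allocate zeroed records so that unmapped slots are reliably NULL. */
static int maps_alloc(int n)
{
	assert(n > 0);
	maps = calloc((size_t)n, sizeof(*maps));
	if (maps == NULL)
		return 0;
	map_count = n;
	return 1;
}

/* Release every slot that was mapped, then the array itself. Safe to call twice. */
static void maps_free(void)
{
	int i;

	if (maps == NULL)
		return;

	for (i = 0; i < map_count; i++) {
		if (maps[i].virt) {
			free(maps[i].virt);	/* stands in for iounmap() */
			maps[i].virt = NULL;
		}
	}

	free(maps);
	maps = NULL;
	map_count = 0;
}
```

The zeroed allocation is what makes the partial-failure path safe: if mapping fails halfway through, the remaining slots are NULL and are simply skipped.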

2008-02-15 09:26:33

by Yinghai Lu

Subject: [PATCH 3/5] x86: mmconf enable mcfg early

From: Yinghai Lu <[email protected]>

The patch
x86: validate against ACPI motherboard resources

changed the MMCONFIG init sequence so that MMCONFIG is initialized late, in
acpi_init.

This patch changes it back to the old sequence:
1. check the host bridge early
2. check MCFG against E820 early
3. if both fail, check MCFG against ACPI _CRS in acpi_init

This makes MMCONFIG work again when acpi=off is set, provided the host bridge
supports it.

Signed-off-by: Yinghai Lu <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

arch/x86/pci/mmconfig-shared.c | 106 +++++++++++++++++--------------
1 file changed, 61 insertions(+), 45 deletions(-)

Index: linux-2.6/arch/x86/pci/mmconfig-shared.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/mmconfig-shared.c
+++ linux-2.6/arch/x86/pci/mmconfig-shared.c
@@ -241,7 +241,7 @@ static int __init is_acpi_reserved(unsig
return mcfg_res.flags;
}

-static void __init pci_mmcfg_reject_broken(void)
+static void __init pci_mmcfg_reject_broken(int type, int early)
{
typeof(pci_mmcfg_config[0]) *cfg;
int i;
@@ -266,34 +266,43 @@ static void __init pci_mmcfg_reject_brok
}

for (i = 0; i < pci_mmcfg_config_num; i++) {
+ int valid = 0;
u32 size = (cfg->end_bus_number + 1) << 20;
cfg = &pci_mmcfg_config[i];
- printk(KERN_NOTICE "PCI: MCFG configuration %d: base %lu "
+ printk(KERN_NOTICE "PCI: MCFG configuration %d: base %lx "
"segment %hu buses %u - %u\n",
i, (unsigned long)cfg->address, cfg->pci_segment,
(unsigned int)cfg->start_bus_number,
(unsigned int)cfg->end_bus_number);
- if (is_acpi_reserved(cfg->address, cfg->address + size - 1)) {
+
+ if (!early &&
+ is_acpi_reserved(cfg->address, cfg->address + size - 1)) {
printk(KERN_NOTICE "PCI: MCFG area at %Lx reserved "
"in ACPI motherboard resources\n",
cfg->address);
- } else {
+ valid = 1;
+ }
+
+ if (valid)
+ continue;
+
+ if (!early)
printk(KERN_ERR "PCI: BIOS Bug: MCFG area at %Lx is not"
" reserved in ACPI motherboard resources\n",
cfg->address);
- /* Don't try to do this check unless configuration
- type 1 is available. */
- if ((pci_probe & PCI_PROBE_CONF1) &&
- e820_all_mapped(cfg->address,
- cfg->address + size - 1,
- E820_RESERVED))
- printk(KERN_NOTICE
- "PCI: MCFG area at %Lx reserved in "
- "E820\n",
- cfg->address);
- else
- goto reject;
+ /* Don't try to do this check unless configuration
+ type 1 is available. */
+ if (type == 1 && e820_all_mapped(cfg->address,
+ cfg->address + size - 1,
+ E820_RESERVED)) {
+ printk(KERN_NOTICE
+ "PCI: MCFG area at %Lx reserved in E820\n",
+ cfg->address);
+ valid = 1;
}
+
+ if (!valid)
+ goto reject;
}

return;
@@ -306,46 +315,36 @@ reject:
pci_mmcfg_config_num = 0;
}

-void __init pci_mmcfg_early_init(int type)
-{
- if ((pci_probe & PCI_PROBE_MMCONF) == 0)
- return;
-
- /* If type 1 access is available, no need to enable MMCONFIG yet, we can
- defer until later when the ACPI interpreter is available to better
- validate things. */
- if (type == 1)
- return;
-
- acpi_table_parse(ACPI_SIG_MCFG, acpi_parse_mcfg);
-
- if ((pci_mmcfg_config_num == 0) ||
- (pci_mmcfg_config == NULL) ||
- (pci_mmcfg_config[0].address == 0))
- return;
+static int __initdata known_bridge;

- if (pci_mmcfg_arch_init())
- pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
-}
-
-void __init pci_mmcfg_late_init(void)
+void __init __pci_mmcfg_init(int type, int early)
{
- int known_bridge = 0;
-
/* MMCONFIG disabled */
if ((pci_probe & PCI_PROBE_MMCONF) == 0)
return;

/* MMCONFIG already enabled */
- if (!(pci_probe & PCI_PROBE_MASK & ~PCI_PROBE_MMCONF))
+ if (!early && !(pci_probe & PCI_PROBE_MASK & ~PCI_PROBE_MMCONF))
return;

- if ((pci_probe & PCI_PROBE_CONF1) && pci_mmcfg_check_hostbridge())
- known_bridge = 1;
- else
- acpi_table_parse(ACPI_SIG_MCFG, acpi_parse_mcfg);
+ /* for late to exit */
+ if (known_bridge)
+ return;

- pci_mmcfg_reject_broken();
+ if (early && type == 1) {
+ if (pci_mmcfg_check_hostbridge())
+ known_bridge = 1;
+#if 0
+ /* check e820 in late? */
+ else
+ return;
+#endif
+ }
+
+ if (!known_bridge) {
+ acpi_table_parse(ACPI_SIG_MCFG, acpi_parse_mcfg);
+ pci_mmcfg_reject_broken(type, early);
+ }

if ((pci_mmcfg_config_num == 0) ||
(pci_mmcfg_config == NULL) ||
@@ -365,6 +364,21 @@ void __init pci_mmcfg_late_init(void)
}
}

+void __init pci_mmcfg_early_init(int type)
+{
+ __pci_mmcfg_init(type, 1);
+}
+
+void __init pci_mmcfg_late_init(void)
+{
+ int type = 0;
+
+ if (pci_probe & PCI_PROBE_CONF1)
+ type = 1;
+
+ __pci_mmcfg_init(type, 0);
+}
+
static int __init pci_mmcfg_late_insert_resources(void)
{
/*
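
The per-region accept/reject decision that pci_mmcfg_reject_broken() makes after this patch can be modelled as a small pure function. This is a sketch only: `region_valid` is a hypothetical name, and the two `*_reserved` flags stand in for the results of is_acpi_reserved() and e820_all_mapped().

```c
#include <assert.h>

/*
 * Hypothetical model of the per-region decision after this patch.
 * Returns 1 if the region is accepted, 0 if it is rejected.
 */
static int region_valid(int type, int early, int acpi_reserved, int e820_reserved)
{
	assert(early == 0 || early == 1);

	/* ACPI _CRS is only consulted in the late pass. */
	if (!early && acpi_reserved)
		return 1;

	/* The E820 fallback is only tried when type 1 access is available. */
	if (type == 1 && e820_reserved)
		return 1;

	return 0;
}
```

In the early pass the ACPI flag is ignored, so a region that is reserved only in ACPI is rejected early and can only be accepted by the late pass.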

2008-02-15 09:27:06

by Yinghai Lu

Subject: [PATCH 4/5] x86_64: check msr to get mmconfig for amd family 10h opteron v3

From: Yinghai Lu <[email protected]>

So even when booting the kernel with acpi=off, or when the MCFG table is not
present, we can still use MMCONFIG.

Signed-off-by: Yinghai Lu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Greg KH <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

arch/x86/pci/mmconfig-shared.c | 67 ++++++++++++++++++++++++++++---
1 file changed, 61 insertions(+), 6 deletions(-)

Index: linux-2.6/arch/x86/pci/mmconfig-shared.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/mmconfig-shared.c
+++ linux-2.6/arch/x86/pci/mmconfig-shared.c
@@ -100,33 +100,88 @@ static const char __init *pci_mmcfg_inte
return "Intel Corporation 945G/GZ/P/PL Express Memory Controller Hub";
}

+static const char __init *pci_mmcfg_amd_fam10h(void)
+{
+ u32 low, high, address;
+ u64 base, msr;
+ int i;
+ unsigned segnbits = 0, busnbits;
+
+ address = MSR_FAM10H_MMIO_CONF_BASE;
+ if (rdmsr_safe(address, &low, &high))
+ return NULL;
+
+ msr = high;
+ msr <<= 32;
+ msr |= low;
+
+ /* mmconfig is not enable */
+ if (!(msr & FAM10H_MMIO_CONF_ENABLE))
+ return NULL;
+
+ base = msr & (FAM10H_MMIO_CONF_BASE_MASK<<FAM10H_MMIO_CONF_BASE_SHIFT);
+
+ busnbits = (msr >> FAM10H_MMIO_CONF_BUSRANGE_SHIFT) &
+ FAM10H_MMIO_CONF_BUSRANGE_MASK;
+ if (busnbits > 8) {
+ segnbits = busnbits - 8;
+ busnbits = 8;
+ }
+
+ pci_mmcfg_config_num = (1 << segnbits);
+ pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]) *
+ pci_mmcfg_config_num, GFP_KERNEL);
+ if (!pci_mmcfg_config)
+ return NULL;
+
+ for (i = 0; i < (1 << segnbits); i++) {
+ pci_mmcfg_config[i].address = base + (1<<28) * i;
+ pci_mmcfg_config[i].pci_segment = i;
+ pci_mmcfg_config[i].start_bus_number = 0;
+ pci_mmcfg_config[i].end_bus_number = (1 << busnbits) - 1;
+ }
+
+ return "AMD Family 10h NB";
+}
+
struct pci_mmcfg_hostbridge_probe {
+ u32 bus;
+ u32 devfn;
u32 vendor;
u32 device;
const char *(*probe)(void);
};

static struct pci_mmcfg_hostbridge_probe pci_mmcfg_probes[] __initdata = {
- { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7520_MCH, pci_mmcfg_e7520 },
- { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82945G_HB, pci_mmcfg_intel_945 },
+ { 0, PCI_DEVFN(0, 0), PCI_VENDOR_ID_INTEL,
+ PCI_DEVICE_ID_INTEL_E7520_MCH, pci_mmcfg_e7520 },
+ { 0, PCI_DEVFN(0, 0), PCI_VENDOR_ID_INTEL,
+ PCI_DEVICE_ID_INTEL_82945G_HB, pci_mmcfg_intel_945 },
+ { 0, PCI_DEVFN(0x18, 0), PCI_VENDOR_ID_AMD,
+ 0x1200, pci_mmcfg_amd_fam10h },
+ { 0xff, PCI_DEVFN(0, 0), PCI_VENDOR_ID_AMD,
+ 0x1200, pci_mmcfg_amd_fam10h },
};

static int __init pci_mmcfg_check_hostbridge(void)
{
u32 l;
+ u32 bus, devfn;
u16 vendor, device;
int i;
const char *name;

- pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
- vendor = l & 0xffff;
- device = (l >> 16) & 0xffff;
-
pci_mmcfg_config_num = 0;
pci_mmcfg_config = NULL;
name = NULL;

for (i = 0; !name && i < ARRAY_SIZE(pci_mmcfg_probes); i++) {
+ bus = pci_mmcfg_probes[i].bus;
+ devfn = pci_mmcfg_probes[i].devfn;
+ pci_direct_conf1.read(0, bus, devfn, 0, 4, &l);
+ vendor = l & 0xffff;
+ device = (l >> 16) & 0xffff;
+
if (pci_mmcfg_probes[i].vendor == vendor &&
pci_mmcfg_probes[i].device == device)
name = pci_mmcfg_probes[i].probe();
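
How pci_mmcfg_amd_fam10h() turns the bus-range width into segments and per-segment base addresses can be sketched in standalone C. This is an illustration under stated assumptions: `struct seg_entry`, `split_busrange` and `make_entry` are hypothetical names, and the upper bound of 15 on the bus-range width is an assumption about the field size rather than something stated in the patch. The sketch does the address arithmetic in 64 bits; in the patch, `(1<<28) * i` looks as if it is evaluated in `int` before being added to `base`, which would overflow once `i` reaches 8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one generated MMCONFIG entry. */
struct seg_entry {
	uint64_t address;
	unsigned segment;
	unsigned end_bus;
};

/*
 * Split a bus-range width (in bits) into bus bits and segment bits:
 * anything beyond 8 bus bits is treated as segment bits.
 * Stores the bus bits in *bus_bits and returns the number of segments.
 */
static unsigned split_busrange(unsigned busnbits, unsigned *bus_bits)
{
	unsigned segnbits = 0;

	assert(busnbits <= 15);	/* assumption: the field is a small bit count */
	if (busnbits > 8) {
		segnbits = busnbits - 8;
		busnbits = 8;
	}
	*bus_bits = busnbits;
	return 1u << segnbits;
}

/* Build entry i: each segment gets its own 256 MiB (1 << 28) window. */
static struct seg_entry make_entry(uint64_t base, unsigned i, unsigned bus_bits)
{
	struct seg_entry e;

	e.address = base + ((uint64_t)1 << 28) * i;	/* 64-bit arithmetic */
	e.segment = i;
	e.end_bus = (1u << bus_bits) - 1;
	return e;
}
```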

2008-02-15 09:27:46

by Yinghai Lu

Subject: [PATCH 5/5] x86_64: set cfg_size for AMD Family 10h in case MMCONFIG is used. v4


Reuse pci_cfg_space_size() but skip the check for the PCI Express and PCI-X capability IDs.

Signed-off-by: Yinghai Lu <[email protected]>

===================================================================
Index: linux-2.6/arch/x86/pci/fixup.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/fixup.c
+++ linux-2.6/arch/x86/pci/fixup.c
@@ -493,3 +493,20 @@ static void __devinit pci_siemens_interr
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SIEMENS, 0x0015,
pci_siemens_interrupt_controller);
+
+/*
+ * Regular PCI devices have 256 bytes, but AMD Family 10h Opteron ext config
+ * have 4096 bytes. Even if the device is capable, that doesn't mean we can
+ * access it. Maybe we don't have a way to generate extended config space
+ * accesses. So check it
+ */
+static void fam10h_pci_cfg_space_size(struct pci_dev *dev)
+{
+ dev->cfg_size = pci_cfg_space_size_ext(dev, 0);
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x1200, fam10h_pci_cfg_space_size);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x1201, fam10h_pci_cfg_space_size);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x1202, fam10h_pci_cfg_space_size);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x1203, fam10h_pci_cfg_space_size);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x1204, fam10h_pci_cfg_space_size);
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -809,11 +809,14 @@ static void set_pcie_port_type(struct pc
* reading the dword at 0x100 which must either be 0 or a valid extended
* capability header.
*/
-int pci_cfg_space_size(struct pci_dev *dev)
+int pci_cfg_space_size_ext(struct pci_dev *dev, unsigned check_exp_pcix)
{
int pos;
u32 status;

+ if (!check_exp_pcix)
+ goto skip;
+
pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
if (!pos) {
pos = pci_find_capability(dev, PCI_CAP_ID_PCIX);
@@ -825,6 +828,7 @@ int pci_cfg_space_size(struct pci_dev *d
goto fail;
}

+ skip:
if (pci_read_config_dword(dev, 256, &status) != PCIBIOS_SUCCESSFUL)
goto fail;
if (status == 0xffffffff)
@@ -836,6 +840,11 @@ int pci_cfg_space_size(struct pci_dev *d
return PCI_CFG_SPACE_SIZE;
}

+int pci_cfg_space_size(struct pci_dev *dev)
+{
+ return pci_cfg_space_size_ext(dev, 1);
+}
+
static void pci_release_bus_bridge_dev(struct device *dev)
{
kfree(dev);
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -654,6 +654,7 @@ int pci_scan_bridge(struct pci_bus *bus,

void pci_walk_bus(struct pci_bus *top, void (*cb)(struct pci_dev *, void *),
void *userdata);
+int pci_cfg_space_size_ext(struct pci_dev *dev, unsigned check_exp_pcix);
int pci_cfg_space_size(struct pci_dev *dev);
unsigned char pci_bus_max_busnr(struct pci_bus *bus);
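
The effect of the new `check_exp_pcix` argument can be sketched with a simplified userspace model. This is not the kernel function: `cfg_space_size`, `read_dword_fn`, `read_ok` and `read_ones` are hypothetical names, the capability lookup is reduced to a single flag, and the additional PCI-X status check in the real code is left out. The 256 and 4096 byte sizes are taken from the comment added in fixup.c.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_SPACE_SIZE		256	/* conventional config space */
#define CFG_SPACE_EXP_SIZE	4096	/* extended config space */

/* Hypothetical reader: returns 0 on success and stores the dword at offset. */
typedef int (*read_dword_fn)(int offset, uint32_t *value);

static int cfg_space_size(read_dword_fn read_dword, int has_exp_or_pcix_cap,
			  int check_caps)
{
	uint32_t status;

	assert(read_dword != 0);

	/* Without the capability, only the fixup path (check_caps == 0) probes further. */
	if (check_caps && !has_exp_or_pcix_cap)
		return CFG_SPACE_SIZE;

	/* Probe the first dword of extended space; a failed read or all-ones means none. */
	if (read_dword(256, &status) != 0)
		return CFG_SPACE_SIZE;
	if (status == 0xffffffffu)
		return CFG_SPACE_SIZE;

	return CFG_SPACE_EXP_SIZE;
}

/* Test readers: one that returns a readable dword, one that returns all ones. */
static int read_ok(int offset, uint32_t *value)
{
	(void)offset;
	*value = 0;
	return 0;
}

static int read_ones(int offset, uint32_t *value)
{
	(void)offset;
	*value = 0xffffffffu;
	return 0;
}
```

With `check_caps` set to 0, as in the Family 10h fixup, a device without either capability can still be reported as having 4096 bytes if the read at offset 256 succeeds.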

2008-02-15 11:12:11

by Andi Kleen

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

Yinghai Lu <[email protected]> writes:
>
> [[email protected]: many fixes and cleanups]
> Signed-off-by: Robert Hancock <[email protected]>
> Signed-off-by: Andi Kleen <[email protected]>
> Tested-by: Andi Kleen <[email protected]>

iirc it really was
Tested-and-didnt-pass-test-by: Andi Kleen
unfortunately. I have not rechecked recently, but on the one Intel
box the original patch and the other mcfg heuristics removed didn't work.

-Andi

2008-02-15 18:26:18

by Yinghai Lu

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

On Friday 15 February 2008 03:11:57 am Andi Kleen wrote:
> Yinghai Lu <[email protected]> writes:
> >
> > [[email protected]: many fixes and cleanups]
> > Signed-off-by: Robert Hancock <[email protected]>
> > Signed-off-by: Andi Kleen <[email protected]>
> > Tested-by: Andi Kleen <[email protected]>
>
> iirc it really was
> Tested-and-didnt-pass-test-by: Andi Kleen
> unfortunately. I have not rechecked recently, but on the one Intel
> box the original patch and the other mcfg heuristics removed didn't work.

I cannot find the patch that disables MMIO for BAR probing...

YH

2008-02-15 22:11:47

by Yinghai Lu

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

On Fri, Feb 15, 2008 at 3:11 AM, Andi Kleen <[email protected]> wrote:
> Yinghai Lu <[email protected]> writes:
> >
> > [[email protected]: many fixes and cleanups]
> > Signed-off-by: Robert Hancock <[email protected]>
> > Signed-off-by: Andi Kleen <[email protected]>
> > Tested-by: Andi Kleen <[email protected]>
>
> iirc it really was
> Tested-and-didnt-pass-test-by: Andi Kleen
> unfortunately. I have not rechecked recently, but on the one Intel
> box the original patch and the other mcfg heuristics removed didn't work.

It seems some Intel systems with an old BIOS need

PATCH: Fix boot-time hang on G31/G33 PC

but Greg decided not to use it, so users need to update the BIOS or use pci=nommconf.

But this one
x86: validate against acpi motherboard resources

should be different: it just reveals the BIOS bug.

Ingo,
Can you remove the line regarding Andi, and put them into x86 mm?

Andi could use pci=nommconf on the system with the buggy BIOS, or update the BIOS.

YH

2008-02-15 22:17:03

by Yinghai Lu

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

On Fri, Feb 15, 2008 at 2:11 PM, Yinghai Lu <[email protected]> wrote:
> On Fri, Feb 15, 2008 at 3:11 AM, Andi Kleen <[email protected]> wrote:
> > Yinghai Lu <[email protected]> writes:
> > >
> > > [[email protected]: many fixes and cleanups]
> > > Signed-off-by: Robert Hancock <[email protected]>
> > > Signed-off-by: Andi Kleen <[email protected]>
> > > Tested-by: Andi Kleen <[email protected]>
> >
> > iirc it really was
> > Tested-and-didnt-pass-test-by: Andi Kleen
> > unfortunately. I have not rechecked recently, but on the one Intel
> > box the original patch and the other mcfg heuristics removed didn't work.

hope this could refresh your memory:

http://lkml.org/lkml/2008/1/8/197

YH

2008-02-16 05:52:33

by Robert Hancock

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

Andi Kleen wrote:
> Yinghai Lu <[email protected]> writes:
>> [[email protected]: many fixes and cleanups]
>> Signed-off-by: Robert Hancock <[email protected]>
>> Signed-off-by: Andi Kleen <[email protected]>
>> Tested-by: Andi Kleen <[email protected]>
>
> iirc it really was
> Tested-and-didnt-pass-test-by: Andi Kleen
> unfortunately. I have not rechecked recently, but on the one Intel
> box the original patch and the other mcfg heuristics removed didn't work.

With just this patch you will have this problem. You need either the
patch to disable decode during BAR sizing, or the patch to use MMCONFIG
for extended config space only, if you don't have them already.

2008-02-16 06:09:24

by Yinghai Lu

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

On Fri, Feb 15, 2008 at 9:52 PM, Robert Hancock <[email protected]> wrote:
> Andi Kleen wrote:
>
> > Yinghai Lu <[email protected]> writes:
> >> [[email protected]: many fixes and cleanups]
> >> Signed-off-by: Robert Hancock <[email protected]>
> >> Signed-off-by: Andi Kleen <[email protected]>
> >> Tested-by: Andi Kleen <[email protected]>
> >
> > iirc it really was
> > Tested-and-didnt-pass-test-by: Andi Kleen
> > unfortunately. I have not rechecked recently, but on the one Intel
> > box the original patch and the other mcfg heuristics removed didn't work.
>
> With just this patch you will have this problem. You need either the
> patch to disable decode during BAR sizing, or the patch to use MMCONFIG
> for extended config space only, if you don't have them already.

Linus already applied the patches to "use MMCONFIG for extended config
space only" to mainline; the last two went in near 2.6.25-rc1.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a0ca9909609470ad779b9b9cc68ce96e975afff7
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b6ce068a1285a24185b01be8a49021827516b3e1

Andi,
can you double check this patch with your test system?

YH

2008-02-17 14:06:51

by Thomas Gleixner

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

On Fri, 15 Feb 2008, Yinghai Lu wrote:
> From: Robert Hancock <[email protected]>
>
> This path adds validation of the MMCONFIG table against the ACPI reserved
> motherboard resources. If the MMCONFIG table is found to be reserved in
> ACPI, we don't bother checking the E820 table. The PCI Express firmware
> spec apparently tells BIOS developers that reservation in ACPI is required
> and E820 reservation is optional, so checking against ACPI first makes
> sense. Many BIOSes don't reserve the MMCONFIG region in E820 even though
> it is perfectly functional, the existing check needlessly disables MMCONFIG
> in these cases.
>
> In order to do this, MMCONFIG setup has been split into two phases. If PCI
> configuration type 1 is not available then MMCONFIG is enabled early as
> before. Otherwise, it is enabled later after the ACPI interpreter is
> enabled, since we need to be able to execute control methods in order to
> check the ACPI reserved resources. Presently this is just triggered off
> the end of ACPI interpreter initialization.
>
> There are a few other behavioral changes here:
>
> - Validate all MMCONFIG configurations provided, not just the first one.
>
> - Validate the entire required length of each configuration according to
> the provided ending bus number is reserved, not just the minimum required
> allocation.
>
> - Validate that the area is reserved even if we read it from the chipset
> directly and not from the MCFG table. This catches the case where the
> BIOS didn't set the location properly in the chipset and has mapped it
> over other things it shouldn't have.
>
> This also cleans up the MMCONFIG initialization functions so that they
> simply do nothing if MMCONFIG is not compiled in.
>
> Based on an original patch by Rajesh Shah from Intel.

Applied. Thanks,

tglx

2008-02-17 16:42:50

by Ingo Molnar

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources


* Yinghai Lu <[email protected]> wrote:

> Linus already applied the patches to "use MMCONFIG for extended config
> space only" to mainline; the last two went in near 2.6.25-rc1.
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a0ca9909609470ad779b9b9cc68ce96e975afff7
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b6ce068a1285a24185b01be8a49021827516b3e1

FYI, we've applied your patches to x86.git, they certainly look
sensible.

Ingo

2008-02-18 10:55:20

by Andi Kleen

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

On Fri, Feb 15, 2008 at 11:52:14PM -0600, Robert Hancock wrote:
> Andi Kleen wrote:
> >Yinghai Lu <[email protected]> writes:
> >>[[email protected]: many fixes and cleanups]
> >>Signed-off-by: Robert Hancock <[email protected]>
> >>Signed-off-by: Andi Kleen <[email protected]>
> >>Tested-by: Andi Kleen <[email protected]>
> >
> >iirc it really was
> >Tested-and-didnt-pass-test-by: Andi Kleen
> >unfortunately. I have not rechecked recently, but on the one Intel
> >box the original patch and the other mcfg heuristics removed didn't work.
>
> With just this patch you will have this problem. You need either the
> patch to disable decode during BAR sizing,

Isn't that one already merged?

I remember the BAR decoding patch did help with at least one
of the original failures (there were multiple ones iirc)

If someone points me to all the patches needed or a tree that
has them all applied I can give it a quick spin on the boxes I have here.
One of the systems where it originally failed I don't have anymore
though.

> or the patch to use MMCONFIG
> for extended config space only, if you don't have them already.

That would mean it would boot, but anything that uses extended config
space would fail. While not as catastrophic as before I'm not sure
it's that great either. There would still be breakage,
just of a much more subtle kind.

-Andi

2008-02-18 20:26:22

by Robert Hancock

Subject: Re: [PATCH 1/5] x86: validate against acpi motherboard resources

Andi Kleen wrote:
>> With just this patch you will have this problem. You need either the
>> patch to disable decode during BAR sizing,
>
> Isn't that one already merged?
>
> I remember the BAR decoding patch did help with at least one
> of the original failures (there were multiple ones iirc)

I believe that one's been dropped, as it's not needed if we don't use
MMCONFIG for non-extended accesses (such as those done during BAR sizing).
(Though there may still be a case where it's needed, see below.)
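
To make the "extended config space only" split concrete, here is a minimal sketch of the idea. The names are illustrative, not the kernel's actual functions: registers below 256 go through type 1 (port 0xCF8/0xCFC) accesses, and only offsets 256 and up go through MMCONFIG.

```c
#include <stdint.h>

enum cfg_mech { CFG_CONF1, CFG_MMCONFIG };

/* Only extended config space (offset >= 256) needs MMCONFIG. */
static enum cfg_mech pick_mech(unsigned int reg)
{
    return reg < 256 ? CFG_CONF1 : CFG_MMCONFIG;
}

/* Value written to port 0xCF8 for a type 1 access (dword aligned). */
static uint32_t conf1_address(unsigned int bus, unsigned int devfn,
                              unsigned int reg)
{
    return 0x80000000u | (bus << 16) | (devfn << 8) | (reg & 0xfc);
}

/* Byte offset from the MMCONFIG base: 1 MB per bus, 4 KB per function. */
static uint64_t mmcfg_offset(unsigned int bus, unsigned int devfn,
                             unsigned int reg)
{
    return ((uint64_t)bus << 20) | ((uint64_t)devfn << 12) | reg;
}
```

BAR registers live at offsets 0x10 to 0x24, so under this split the sizing accesses never touch the MMCONFIG window.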

>
> If someone points me to all the patches needed or a tree that
> has them all applied I can give it a quick spin on the boxes I have here.
> One of the systems where it originally failed I don't have anymore
> though.
>
>> or the patch to use MMCONFIG
>> for extended config space only, if you don't have them already.
>
> That would mean it would boot, but anything that uses extended config
> space would fail. While not as catastrophic as before I'm not sure
> it's that great either. There would still be breakage,
> just of a much more subtle kind.

The only issue on those boards is that since certain device BARs will
overlap the MMCONFIG area during BAR sizing, if you use MMCONFIG to do
the accesses used during BAR sizing itself, it'll fail. If you use conf1
to do the BAR sizing then that problem doesn't happen.

However, I suppose there could be an issue if you hotplugged a device
(causing BAR sizing) once you'd booted, while extended config space was
in use on another device. The BAR sizing wouldn't fail, but the guy
using extended config space would since he's actually reading
from/writing into the BAR of the device being sized instead of the
MMCONFIG area. That wouldn't be good. The disable-decode-during-sizing
patch would avoid that problem.
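
A rough sketch of why the overlap happens, assuming a plain 32-bit memory BAR with decode left enabled during sizing. The helper names are made up for illustration. Sizing writes all ones to the BAR, so until the original value is restored the device claims the top `size` bytes below 4 GB.

```c
#include <stdint.h>

/*
 * Size of a 32-bit memory BAR from the value read back after writing
 * all ones. The low four bits are type/prefetch flags, not address bits.
 */
static uint32_t bar_size_from_readback(uint32_t readback)
{
    uint32_t mask = readback & ~0xfu;

    return mask ? ~mask + 1 : 0;
}

/*
 * While all ones are in the BAR, the device decodes
 * [4 GB - bar_size, 4 GB). Returns 1 if that temporary window
 * overlaps the MMCONFIG region [mmcfg_base, mmcfg_base + mmcfg_size).
 */
static int sizing_window_overlaps(uint32_t bar_size, uint64_t mmcfg_base,
                                  uint64_t mmcfg_size)
{
    uint64_t top = 1ULL << 32;
    uint64_t win_start = top - bar_size;
    uint64_t mmcfg_end = mmcfg_base + mmcfg_size;

    return bar_size != 0 && mmcfg_base < top && win_start < mmcfg_end;
}
```

For example, with a 256 MB MMCONFIG region at 0xE0000000 (addresses chosen only for illustration), a 256 MB BAR stays clear of it during sizing, but a 512 MB BAR temporarily lands on top of it.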

2008-02-19 03:59:59

by Yinghai Lu

Subject: Re: [PATCH 4/5] x86_64: check msr to get mmconfig for amd family 10h opteron v3

On Feb 15, 2008 1:31 AM, Yinghai Lu <[email protected]> wrote:
> From: Yinghai Lu <[email protected]>
>
> So even when booting the kernel with acpi=off, or when MCFG is not there,
> we can still use MMCONFIG.
>
> Signed-off-by: Yinghai Lu <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Greg KH <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> ---
>
> arch/x86/pci/mmconfig-shared.c | 67 ++++++++++++++++++++++++++++---
> 1 file changed, 61 insertions(+), 6 deletions(-)
>
> Index: linux-2.6/arch/x86/pci/mmconfig-shared.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/mmconfig-shared.c
> +++ linux-2.6/arch/x86/pci/mmconfig-shared.c

Ingo/Thomas,

It seems you missed this one of the five.

This one should be safe; it only reads the MSR.
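
As a rough illustration of what "only reads the MSR" amounts to, decoding the value looks something like this. The bit layout below is my understanding of the family 10h MMIO configuration base MSR (C001_0058h) and should be treated as an assumption to check against the BKDG. The struct and function names are hypothetical, and the privileged rdmsr itself is not shown.

```c
#include <stdint.h>

/* Assumed bit layout; verify against the family 10h BKDG. */
#define FAM10H_MMIO_CONF_ENABLE         (1ULL << 0)
#define FAM10H_MMIO_CONF_BUSRANGE_SHIFT 2
#define FAM10H_MMIO_CONF_BUSRANGE_MASK  0xfULL
#define FAM10H_MMIO_CONF_BASE_SHIFT     20
#define FAM10H_MMIO_CONF_BASE_MASK      0xfffffffULL

/* Hypothetical result type for this sketch. */
struct mmcfg_from_msr {
    uint64_t base;        /* physical base of the MMCONFIG region */
    unsigned int end_bus; /* last bus number covered */
};

/* Returns 1 and fills *out if MMCONFIG is enabled, 0 otherwise. */
static int decode_mmio_conf_msr(uint64_t msr, struct mmcfg_from_msr *out)
{
    unsigned int busnbits;

    if (!(msr & FAM10H_MMIO_CONF_ENABLE))
        return 0;

    /* The bus range field encodes log2 of the number of buses. */
    busnbits = (msr >> FAM10H_MMIO_CONF_BUSRANGE_SHIFT) &
               FAM10H_MMIO_CONF_BUSRANGE_MASK;
    if (busnbits > 8)
        busnbits = 8; /* cap at 256 buses for this sketch */

    out->base = msr &
                (FAM10H_MMIO_CONF_BASE_MASK << FAM10H_MMIO_CONF_BASE_SHIFT);
    out->end_bus = (1u << busnbits) - 1;
    return 1;
}
```

Nothing is written back, which is why this variant carries less risk than a patch that programs the MSR.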

Andi had concerns about the other one, which touches the MSR. I will keep
that one in my local tree.

YH

2008-02-19 10:14:16

by Ingo Molnar

Subject: Re: [PATCH 4/5] x86_64: check msr to get mmconfig for amd family 10h opteron v3


* Yinghai Lu <[email protected]> wrote:

> > Index: linux-2.6/arch/x86/pci/mmconfig-shared.c
> > ===================================================================
> > --- linux-2.6.orig/arch/x86/pci/mmconfig-shared.c
> > +++ linux-2.6/arch/x86/pci/mmconfig-shared.c
>
> Ingo/Thomas,
>
> It seems you missed this one of the five.
>
> This one should be safe; it only reads the MSR.

we picked up all of your patches - have you checked the x86.git#testing
branch, or only the x86.git#mm branch? We started x86.git#testing
recently, it includes more bleeding-edge stuff and wider, cross-tree
features as well.

> Andi had concerns about the other one, which touches the MSR. I will keep
> that one in my local tree.

We've got that one queued up in x86.git#testing as well. Someone on lkml
having concerns is not a basis to not deal with a patch. What matters
is: are those concerns well-founded? If you do not agree with the
concerns then just indicate it to us when re-submitting but do not drop
patches. Your patches have a good track record so please do not let
yourself get scared away from contributing. Please do not keep patches
in your local tree; we'd like Linux to work fine out of the box, on as broad
a range of hardware as possible.

Ingo

2008-02-19 10:39:45

by Yinghai Lu

Subject: Re: [PATCH 4/5] x86_64: check msr to get mmconfig for amd family 10h opteron v3

On Feb 19, 2008 2:13 AM, Ingo Molnar <[email protected]> wrote:
>
> * Yinghai Lu <[email protected]> wrote:
>
> > > Index: linux-2.6/arch/x86/pci/mmconfig-shared.c
> > > ===================================================================
> > > --- linux-2.6.orig/arch/x86/pci/mmconfig-shared.c
> > > +++ linux-2.6/arch/x86/pci/mmconfig-shared.c
> >
> > Ingo/Thomas,
> >
> > It seems you missed this one of the five.
> >
> > This one should be safe; it only reads the MSR.
>
> we picked up all of your patches - have you checked the x86.git#testing
> branch, or only the x86.git#mm branch? We started x86.git#testing
> recently, it includes more bleeding-edge stuff and wider, cross-tree
> features as well.
>
> > Andi had concerns about the other one, which touches the MSR. I will keep
> > that one in my local tree.
>
> We've got that one queued up in x86.git#testing as well. Someone on lkml
> having concerns is not a basis to not deal with a patch. What matters
> is: are those concerns well-founded? If you do not agree with the
> concerns then just indicate it to us when re-submitting but do not drop
> patches. Your patches have a good track record so please do not let
> yourself get scared away from contributing. Please do not keep patches
> in your local tree; we'd like Linux to work fine out of the box, on as broad
> a range of hardware as possible.

Good, I will send out the leftover patches based on x86.git#testing.

YH