Subject: Re: [PATCH] x86: Add PCI extended config space access for AMD Barcelona

Arjan,

On 28.05.08 12:02:53, Arjan van de Ven wrote:
> Comment 1:
> Can we make the 256/4096 thing conditional on actually having the
> feature somehow? (while not making the code TOO ugly)

In the first version I had 2 functions also. The patch have had lots
of duplicate code or inline functions. Since the conditional check is
already in raw_pci_* I decided to not implement an additional check
and use only one function.

int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
int reg, int len, u32 *val)
{
if (reg < 256 && raw_pci_ops)
return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
if (raw_pci_ext_ops)
return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
return -EINVAL;
}

That leaves as a difference to the basic access is the shift left of
bits 8:11 in the PCI_CONF1_ADDRESS macro. Functional the new macro is
the same and the overhead for this is small. So I see keeping all code
in one function as the best solution.

> Comment 2:
> The cpu_has_XXX is a bit dubious; while it's dependent on your cpu
> model right now, I'm a bit hesitant to consider a PCI feature something
> that belongs in the cpu_has_XXX namespace. (Yes I know PCI is moving
> into the cpu package, but on a logical level it seems just the wrong
> place).
> Do we need a platform_has_XXX namespace for things like this?

An alternative implementation would be here to use a check something
like pci_probe & PCI_HAS_EXT_CFG. If needed, I will send an updated
patch.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: [email protected]


2008-06-03 02:35:48

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH] x86: Add PCI extended config space access for AMD Barcelona

On Mon, 2 Jun 2008 16:19:29 +0200
Robert Richter <[email protected]> wrote:

> Arjan,
>
> On 28.05.08 12:02:53, Arjan van de Ven wrote:
> > Comment 1:
> > Can we make the 256/4096 thing conditional on actually having the
> > feature somehow? (while not making the code TOO ugly)
>
> In the first version I had 2 functions also. The patch have had lots
> of duplicate code or inline functions. Since the conditional check is
> already in raw_pci_* I decided to not implement an additional check
> and use only one function.

ok fair enough.

>
> > Comment 2:
> > The cpu_has_XXX is a bit dubious; while it's dependent on your cpu
> > model right now, I'm a bit hesitant to consider a PCI feature
> > something that belongs in the cpu_has_XXX namespace. (Yes I know
> > PCI is moving into the cpu package, but on a logical level it seems
> > just the wrong place).
> > Do we need a platform_has_XXX namespace for things like this?
>
> An alternative implementation would be here to use a check something
> like pci_probe & PCI_HAS_EXT_CFG. If needed, I will send an updated
> patch.

I kind of prefer this, since logically this is a PCI not a CPU property.
Would you mind doing this ?
(not that your current patch is wrong, it's just nicer to keep CPU
stuff with the CPU and PCI stuff with PCI :)
(This wouldn't need to stop inclusion of your current patch, it can
just be an incremental cleanup)

--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

Subject: Re: [PATCH] x86: Add PCI extended config space access for AMD Barcelona

On 02.06.08 19:35:27, Arjan van de Ven wrote:
> > An alternative implementation would be here to use a check something
> > like pci_probe & PCI_HAS_EXT_CFG. If needed, I will send an updated
> > patch.
>
> I kind of prefer this, since logically this is a PCI not a CPU property.
> Would you mind doing this ?
> (not that your current patch is wrong, it's just nicer to keep CPU
> stuff with the CPU and PCI stuff with PCI :)
> (This wouldn't need to stop inclusion of your current patch, it can
> just be an incremental cleanup)

Ok, I will send a follow on patch of this.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: [email protected]

Subject: [PATCH 2/2] x86: Move PCI IO ECS code to x86/pci

"Form follows function". Code is now where it belongs to.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/kernel/cpu/amd.c | 3 ---
arch/x86/kernel/cpu/amd_64.c | 4 ----
arch/x86/kernel/cpu/cpu.h | 2 --
arch/x86/kernel/setup.c | 13 -------------
arch/x86/pci/Makefile_32 | 1 +
arch/x86/pci/amd_bus.c | 32 ++++++++++++++++++++++++++++++++
arch/x86/pci/direct.c | 16 +++++++++-------
arch/x86/pci/pci.h | 1 +
include/asm-x86/cpufeature.h | 2 --
9 files changed, 43 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index acc891a..81a07ca 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -266,9 +266,6 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)

if (cpu_has_xmm2)
set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
-
- if (c->x86 == 0x10)
- amd_enable_pci_ext_cfg(c);
}

static unsigned int __cpuinit amd_size_cache(struct cpuinfo_x86 *c, unsigned int size)
diff --git a/arch/x86/kernel/cpu/amd_64.c b/arch/x86/kernel/cpu/amd_64.c
index f8d2058..250bfe6 100644
--- a/arch/x86/kernel/cpu/amd_64.c
+++ b/arch/x86/kernel/cpu/amd_64.c
@@ -6,7 +6,6 @@
#include <asm/cacheflush.h>

#include <mach_apic.h>
-#include "cpu.h"

extern int __cpuinit get_model_name(struct cpuinfo_x86 *c);
extern void __cpuinit display_cacheinfo(struct cpuinfo_x86 *c);
@@ -187,9 +186,6 @@ void __cpuinit init_amd(struct cpuinfo_x86 *c)
if (c->x86 == 0x10)
fam10h_check_enable_mmcfg();

- if (c->x86 == 0x10)
- amd_enable_pci_ext_cfg(c);
-
if (c == &boot_cpu_data && c->x86 >= 0xf && c->x86 <= 0x11) {
unsigned long long tseg;

diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index f5d5bb1..40ad189 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -39,5 +39,3 @@ extern int get_model_name(struct cpuinfo_x86 *c);
extern void display_cacheinfo(struct cpuinfo_x86 *c);

#endif /* CONFIG_X86_32 */
-
-extern void __cpuinit amd_enable_pci_ext_cfg(struct cpuinfo_x86 *c);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 20e14db..6f80b85 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -137,16 +137,3 @@ void __init setup_per_cpu_areas(void)
}

#endif
-#define ENABLE_CF8_EXT_CFG (1ULL << 46)
-
-void __cpuinit amd_enable_pci_ext_cfg(struct cpuinfo_x86 *c)
-{
- u64 reg;
- rdmsrl(MSR_AMD64_NB_CFG, reg);
- if (!(reg & ENABLE_CF8_EXT_CFG)) {
- reg |= ENABLE_CF8_EXT_CFG;
- wrmsrl(MSR_AMD64_NB_CFG, reg);
- }
- set_cpu_cap(c, X86_FEATURE_PCI_EXT_CFG);
-}
-
diff --git a/arch/x86/pci/Makefile_32 b/arch/x86/pci/Makefile_32
index 89ec35d..f647e7e 100644
--- a/arch/x86/pci/Makefile_32
+++ b/arch/x86/pci/Makefile_32
@@ -22,3 +22,4 @@ pci-$(CONFIG_X86_NUMAQ) := numa.o irq.o
pci-$(CONFIG_NUMA) += mp_bus_to_node.o

obj-y += $(pci-y) common.o early.o
+obj-y += amd_bus.o
diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c
index 5c2799c..15f505d 100644
--- a/arch/x86/pci/amd_bus.c
+++ b/arch/x86/pci/amd_bus.c
@@ -1,5 +1,9 @@
#include <linux/init.h>
#include <linux/pci.h>
+#include "pci.h"
+
+#ifdef CONFIG_X86_64
+
#include <asm/pci-direct.h>
#include <asm/mpspec.h>
#include <linux/cpumask.h>
@@ -526,3 +530,31 @@ static int __init early_fill_mp_bus_info(void)
}

postcore_initcall(early_fill_mp_bus_info);
+
+#endif
+
+/* common 32/64 bit code */
+
+#define ENABLE_CF8_EXT_CFG (1ULL << 46)
+
+static void enable_pci_io_ecs_per_cpu(void *unused)
+{
+ u64 reg;
+ rdmsrl(MSR_AMD64_NB_CFG, reg);
+ if (!(reg & ENABLE_CF8_EXT_CFG)) {
+ reg |= ENABLE_CF8_EXT_CFG;
+ wrmsrl(MSR_AMD64_NB_CFG, reg);
+ }
+}
+
+static int __init enable_pci_io_ecs(void)
+{
+ /* assume all cpus from fam10h have IO ECS */
+ if (boot_cpu_data.x86 < 0x10)
+ return 0;
+ on_each_cpu(enable_pci_io_ecs_per_cpu, NULL, 1, 1);
+ pci_probe |= PCI_HAS_IO_ECS;
+ return 0;
+}
+
+postcore_initcall(enable_pci_io_ecs);
diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
index 27d61b6..9915293 100644
--- a/arch/x86/pci/direct.c
+++ b/arch/x86/pci/direct.c
@@ -265,14 +265,16 @@ void __init pci_direct_init(int type)
type);
if (type == 1) {
raw_pci_ops = &pci_direct_conf1;
- if (!raw_pci_ext_ops && cpu_has_pci_ext_cfg) {
- printk(KERN_INFO "PCI: Using configuration type 1 "
- "for extended access\n");
- raw_pci_ext_ops = &pci_direct_conf1;
- }
- } else {
- raw_pci_ops = &pci_direct_conf2;
+ if (raw_pci_ext_ops)
+ return;
+ if (!(pci_probe & PCI_HAS_IO_ECS))
+ return;
+ printk(KERN_INFO "PCI: Using configuration type 1 "
+ "for extended access\n");
+ raw_pci_ext_ops = &pci_direct_conf1;
+ return;
}
+ raw_pci_ops = &pci_direct_conf2;
}

int __init pci_direct_probe(void)
diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
index f3972b1..fd53db5 100644
--- a/arch/x86/pci/pci.h
+++ b/arch/x86/pci/pci.h
@@ -27,6 +27,7 @@
#define PCI_CAN_SKIP_ISA_ALIGN 0x8000
#define PCI_USE__CRS 0x10000
#define PCI_CHECK_ENABLE_AMD_MMCONF 0x20000
+#define PCI_HAS_IO_ECS 0x40000

extern unsigned int pci_probe;
extern unsigned long pirq_table_addr;
diff --git a/include/asm-x86/cpufeature.h b/include/asm-x86/cpufeature.h
index 40fcbba..0d609c8 100644
--- a/include/asm-x86/cpufeature.h
+++ b/include/asm-x86/cpufeature.h
@@ -79,7 +79,6 @@
#define X86_FEATURE_REP_GOOD (3*32+16) /* rep microcode works well on this CPU */
#define X86_FEATURE_MFENCE_RDTSC (3*32+17) /* Mfence synchronizes RDTSC */
#define X86_FEATURE_LFENCE_RDTSC (3*32+18) /* Lfence synchronizes RDTSC */
-#define X86_FEATURE_PCI_EXT_CFG (3*32+19) /* PCI extended cfg access */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 (4*32+ 0) /* Streaming SIMD Extensions-3 */
@@ -188,7 +187,6 @@ extern const char * const x86_power_flags[32];
#define cpu_has_gbpages boot_cpu_has(X86_FEATURE_GBPAGES)
#define cpu_has_arch_perfmon boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
#define cpu_has_pat boot_cpu_has(X86_FEATURE_PAT)
-#define cpu_has_pci_ext_cfg boot_cpu_has(X86_FEATURE_PCI_EXT_CFG)

#if defined(CONFIG_X86_INVLPG) || defined(CONFIG_X86_64)
# define cpu_has_invlpg 1
--
1.5.5.3

Subject: [PATCH 1/2] x86/pci: Renaming k8-bus_64.c to amd_bus.c

The name fits better since this is code not only for K8.

Signed-off-by: Robert Richter <[email protected]>
---
arch/x86/pci/Makefile_64 | 2 +-
arch/x86/pci/amd_bus.c | 528 ++++++++++++++++++++++++++++++++++++++++++++++
arch/x86/pci/k8-bus_64.c | 528 ----------------------------------------------
3 files changed, 529 insertions(+), 529 deletions(-)
create mode 100644 arch/x86/pci/amd_bus.c
delete mode 100644 arch/x86/pci/k8-bus_64.c

diff --git a/arch/x86/pci/Makefile_64 b/arch/x86/pci/Makefile_64
index 8fbd198..fd47068 100644
--- a/arch/x86/pci/Makefile_64
+++ b/arch/x86/pci/Makefile_64
@@ -13,5 +13,5 @@ obj-y += legacy.o irq.o common.o early.o
# mmconfig has a 64bit special
obj-$(CONFIG_PCI_MMCONFIG) += mmconfig_64.o direct.o mmconfig-shared.o

-obj-y += k8-bus_64.o
+obj-y += amd_bus.o

diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c
new file mode 100644
index 0000000..5c2799c
--- /dev/null
+++ b/arch/x86/pci/amd_bus.c
@@ -0,0 +1,528 @@
+#include <linux/init.h>
+#include <linux/pci.h>
+#include <asm/pci-direct.h>
+#include <asm/mpspec.h>
+#include <linux/cpumask.h>
+#include <linux/topology.h>
+
+/*
+ * This discovers the pcibus <-> node mapping on AMD K8.
+ * also get peer root bus resource for io,mmio
+ */
+
+
+/*
+ * sub bus (transparent) will use entres from 3 to store extra from root,
+ * so need to make sure have enought slot there, increase PCI_BUS_NUM_RESOURCES?
+ */
+#define RES_NUM 16
+struct pci_root_info {
+ char name[12];
+ unsigned int res_num;
+ struct resource res[RES_NUM];
+ int bus_min;
+ int bus_max;
+ int node;
+ int link;
+};
+
+/* 4 at this time, it may become to 32 */
+#define PCI_ROOT_NR 4
+static int pci_root_num;
+static struct pci_root_info pci_root_info[PCI_ROOT_NR];
+
+#ifdef CONFIG_NUMA
+
+#define BUS_NR 256
+
+static int mp_bus_to_node[BUS_NR];
+
+void set_mp_bus_to_node(int busnum, int node)
+{
+ if (busnum >= 0 && busnum < BUS_NR)
+ mp_bus_to_node[busnum] = node;
+}
+
+int get_mp_bus_to_node(int busnum)
+{
+ int node = -1;
+
+ if (busnum < 0 || busnum > (BUS_NR - 1))
+ return node;
+
+ node = mp_bus_to_node[busnum];
+
+ /*
+ * let numa_node_id to decide it later in dma_alloc_pages
+ * if there is no ram on that node
+ */
+ if (node != -1 && !node_online(node))
+ node = -1;
+
+ return node;
+}
+#endif
+
+void set_pci_bus_resources_arch_default(struct pci_bus *b)
+{
+ int i;
+ int j;
+ struct pci_root_info *info;
+
+ /* if only one root bus, don't need to anything */
+ if (pci_root_num < 2)
+ return;
+
+ for (i = 0; i < pci_root_num; i++) {
+ if (pci_root_info[i].bus_min == b->number)
+ break;
+ }
+
+ if (i == pci_root_num)
+ return;
+
+ info = &pci_root_info[i];
+ for (j = 0; j < info->res_num; j++) {
+ struct resource *res;
+ struct resource *root;
+
+ res = &info->res[j];
+ b->resource[j] = res;
+ if (res->flags & IORESOURCE_IO)
+ root = &ioport_resource;
+ else
+ root = &iomem_resource;
+ insert_resource(root, res);
+ }
+}
+
+#define RANGE_NUM 16
+
+struct res_range {
+ size_t start;
+ size_t end;
+};
+
+static void __init update_range(struct res_range *range, size_t start,
+ size_t end)
+{
+ int i;
+ int j;
+
+ for (j = 0; j < RANGE_NUM; j++) {
+ if (!range[j].end)
+ continue;
+
+ if (start <= range[j].start && end >= range[j].end) {
+ range[j].start = 0;
+ range[j].end = 0;
+ continue;
+ }
+
+ if (start <= range[j].start && end < range[j].end && range[j].start < end + 1) {
+ range[j].start = end + 1;
+ continue;
+ }
+
+
+ if (start > range[j].start && end >= range[j].end && range[j].end > start - 1) {
+ range[j].end = start - 1;
+ continue;
+ }
+
+ if (start > range[j].start && end < range[j].end) {
+ /* find the new spare */
+ for (i = 0; i < RANGE_NUM; i++) {
+ if (range[i].end == 0)
+ break;
+ }
+ if (i < RANGE_NUM) {
+ range[i].end = range[j].end;
+ range[i].start = end + 1;
+ } else {
+ printk(KERN_ERR "run of slot in ranges\n");
+ }
+ range[j].end = start - 1;
+ continue;
+ }
+ }
+}
+
+static void __init update_res(struct pci_root_info *info, size_t start,
+ size_t end, unsigned long flags, int merge)
+{
+ int i;
+ struct resource *res;
+
+ if (!merge)
+ goto addit;
+
+ /* try to merge it with old one */
+ for (i = 0; i < info->res_num; i++) {
+ size_t final_start, final_end;
+ size_t common_start, common_end;
+
+ res = &info->res[i];
+ if (res->flags != flags)
+ continue;
+
+ common_start = max((size_t)res->start, start);
+ common_end = min((size_t)res->end, end);
+ if (common_start > common_end + 1)
+ continue;
+
+ final_start = min((size_t)res->start, start);
+ final_end = max((size_t)res->end, end);
+
+ res->start = final_start;
+ res->end = final_end;
+ return;
+ }
+
+addit:
+
+ /* need to add that */
+ if (info->res_num >= RES_NUM)
+ return;
+
+ res = &info->res[info->res_num];
+ res->name = info->name;
+ res->flags = flags;
+ res->start = start;
+ res->end = end;
+ res->child = NULL;
+ info->res_num++;
+}
+
+struct pci_hostbridge_probe {
+ u32 bus;
+ u32 slot;
+ u32 vendor;
+ u32 device;
+};
+
+static struct pci_hostbridge_probe pci_probes[] __initdata = {
+ { 0, 0x18, PCI_VENDOR_ID_AMD, 0x1100 },
+ { 0, 0x18, PCI_VENDOR_ID_AMD, 0x1200 },
+ { 0xff, 0, PCI_VENDOR_ID_AMD, 0x1200 },
+ { 0, 0x18, PCI_VENDOR_ID_AMD, 0x1300 },
+};
+
+static u64 __initdata fam10h_mmconf_start;
+static u64 __initdata fam10h_mmconf_end;
+static void __init get_pci_mmcfg_amd_fam10h_range(void)
+{
+ u32 address;
+ u64 base, msr;
+ unsigned segn_busn_bits;
+
+ /* assume all cpus from fam10h have mmconf */
+ if (boot_cpu_data.x86 < 0x10)
+ return;
+
+ address = MSR_FAM10H_MMIO_CONF_BASE;
+ rdmsrl(address, msr);
+
+ /* mmconfig is not enable */
+ if (!(msr & FAM10H_MMIO_CONF_ENABLE))
+ return;
+
+ base = msr & (FAM10H_MMIO_CONF_BASE_MASK<<FAM10H_MMIO_CONF_BASE_SHIFT);
+
+ segn_busn_bits = (msr >> FAM10H_MMIO_CONF_BUSRANGE_SHIFT) &
+ FAM10H_MMIO_CONF_BUSRANGE_MASK;
+
+ fam10h_mmconf_start = base;
+ fam10h_mmconf_end = base + (1ULL<<(segn_busn_bits + 20)) - 1;
+}
+
+/**
+ * early_fill_mp_bus_to_node()
+ * called before pcibios_scan_root and pci_scan_bus
+ * fills the mp_bus_to_cpumask array based according to the LDT Bus Number
+ * Registers found in the K8 northbridge
+ */
+static int __init early_fill_mp_bus_info(void)
+{
+ int i;
+ int j;
+ unsigned bus;
+ unsigned slot;
+ int found;
+ int node;
+ int link;
+ int def_node;
+ int def_link;
+ struct pci_root_info *info;
+ u32 reg;
+ struct resource *res;
+ size_t start;
+ size_t end;
+ struct res_range range[RANGE_NUM];
+ u64 val;
+ u32 address;
+
+#ifdef CONFIG_NUMA
+ for (i = 0; i < BUS_NR; i++)
+ mp_bus_to_node[i] = -1;
+#endif
+
+ if (!early_pci_allowed())
+ return -1;
+
+ found = 0;
+ for (i = 0; i < ARRAY_SIZE(pci_probes); i++) {
+ u32 id;
+ u16 device;
+ u16 vendor;
+
+ bus = pci_probes[i].bus;
+ slot = pci_probes[i].slot;
+ id = read_pci_config(bus, slot, 0, PCI_VENDOR_ID);
+
+ vendor = id & 0xffff;
+ device = (id>>16) & 0xffff;
+ if (pci_probes[i].vendor == vendor &&
+ pci_probes[i].device == device) {
+ found = 1;
+ break;
+ }
+ }
+
+ if (!found)
+ return 0;
+
+ pci_root_num = 0;
+ for (i = 0; i < 4; i++) {
+ int min_bus;
+ int max_bus;
+ reg = read_pci_config(bus, slot, 1, 0xe0 + (i << 2));
+
+ /* Check if that register is enabled for bus range */
+ if ((reg & 7) != 3)
+ continue;
+
+ min_bus = (reg >> 16) & 0xff;
+ max_bus = (reg >> 24) & 0xff;
+ node = (reg >> 4) & 0x07;
+#ifdef CONFIG_NUMA
+ for (j = min_bus; j <= max_bus; j++)
+ mp_bus_to_node[j] = (unsigned char) node;
+#endif
+ link = (reg >> 8) & 0x03;
+
+ info = &pci_root_info[pci_root_num];
+ info->bus_min = min_bus;
+ info->bus_max = max_bus;
+ info->node = node;
+ info->link = link;
+ sprintf(info->name, "PCI Bus #%02x", min_bus);
+ pci_root_num++;
+ }
+
+ /* get the default node and link for left over res */
+ reg = read_pci_config(bus, slot, 0, 0x60);
+ def_node = (reg >> 8) & 0x07;
+ reg = read_pci_config(bus, slot, 0, 0x64);
+ def_link = (reg >> 8) & 0x03;
+
+ memset(range, 0, sizeof(range));
+ range[0].end = 0xffff;
+ /* io port resource */
+ for (i = 0; i < 4; i++) {
+ reg = read_pci_config(bus, slot, 1, 0xc0 + (i << 3));
+ if (!(reg & 3))
+ continue;
+
+ start = reg & 0xfff000;
+ reg = read_pci_config(bus, slot, 1, 0xc4 + (i << 3));
+ node = reg & 0x07;
+ link = (reg >> 4) & 0x03;
+ end = (reg & 0xfff000) | 0xfff;
+
+ /* find the position */
+ for (j = 0; j < pci_root_num; j++) {
+ info = &pci_root_info[j];
+ if (info->node == node && info->link == link)
+ break;
+ }
+ if (j == pci_root_num)
+ continue; /* not found */
+
+ info = &pci_root_info[j];
+ printk(KERN_DEBUG "node %d link %d: io port [%llx, %llx]\n",
+ node, link, (u64)start, (u64)end);
+
+ /* kernel only handle 16 bit only */
+ if (end > 0xffff)
+ end = 0xffff;
+ update_res(info, start, end, IORESOURCE_IO, 1);
+ update_range(range, start, end);
+ }
+ /* add left over io port range to def node/link, [0, 0xffff] */
+ /* find the position */
+ for (j = 0; j < pci_root_num; j++) {
+ info = &pci_root_info[j];
+ if (info->node == def_node && info->link == def_link)
+ break;
+ }
+ if (j < pci_root_num) {
+ info = &pci_root_info[j];
+ for (i = 0; i < RANGE_NUM; i++) {
+ if (!range[i].end)
+ continue;
+
+ update_res(info, range[i].start, range[i].end,
+ IORESOURCE_IO, 1);
+ }
+ }
+
+ memset(range, 0, sizeof(range));
+ /* 0xfd00000000-0xffffffffff for HT */
+ range[0].end = (0xfdULL<<32) - 1;
+
+ /* need to take out [0, TOM) for RAM*/
+ address = MSR_K8_TOP_MEM1;
+ rdmsrl(address, val);
+ end = (val & 0xffffff8000000ULL);
+ printk(KERN_INFO "TOM: %016lx aka %ldM\n", end, end>>20);
+ if (end < (1ULL<<32))
+ update_range(range, 0, end - 1);
+
+ /* get mmconfig */
+ get_pci_mmcfg_amd_fam10h_range();
+ /* need to take out mmconf range */
+ if (fam10h_mmconf_end) {
+ printk(KERN_DEBUG "Fam 10h mmconf [%llx, %llx]\n", fam10h_mmconf_start, fam10h_mmconf_end);
+ update_range(range, fam10h_mmconf_start, fam10h_mmconf_end);
+ }
+
+ /* mmio resource */
+ for (i = 0; i < 8; i++) {
+ reg = read_pci_config(bus, slot, 1, 0x80 + (i << 3));
+ if (!(reg & 3))
+ continue;
+
+ start = reg & 0xffffff00; /* 39:16 on 31:8*/
+ start <<= 8;
+ reg = read_pci_config(bus, slot, 1, 0x84 + (i << 3));
+ node = reg & 0x07;
+ link = (reg >> 4) & 0x03;
+ end = (reg & 0xffffff00);
+ end <<= 8;
+ end |= 0xffff;
+
+ /* find the position */
+ for (j = 0; j < pci_root_num; j++) {
+ info = &pci_root_info[j];
+ if (info->node == node && info->link == link)
+ break;
+ }
+ if (j == pci_root_num)
+ continue; /* not found */
+
+ info = &pci_root_info[j];
+
+ printk(KERN_DEBUG "node %d link %d: mmio [%llx, %llx]",
+ node, link, (u64)start, (u64)end);
+ /*
+ * some sick allocation would have range overlap with fam10h
+ * mmconf range, so need to update start and end.
+ */
+ if (fam10h_mmconf_end) {
+ int changed = 0;
+ u64 endx = 0;
+ if (start >= fam10h_mmconf_start &&
+ start <= fam10h_mmconf_end) {
+ start = fam10h_mmconf_end + 1;
+ changed = 1;
+ }
+
+ if (end >= fam10h_mmconf_start &&
+ end <= fam10h_mmconf_end) {
+ end = fam10h_mmconf_start - 1;
+ changed = 1;
+ }
+
+ if (start < fam10h_mmconf_start &&
+ end > fam10h_mmconf_end) {
+ /* we got a hole */
+ endx = fam10h_mmconf_start - 1;
+ update_res(info, start, endx, IORESOURCE_MEM, 0);
+ update_range(range, start, endx);
+ printk(KERN_CONT " ==> [%llx, %llx]", (u64)start, endx);
+ start = fam10h_mmconf_end + 1;
+ changed = 1;
+ }
+ if (changed) {
+ if (start <= end) {
+ printk(KERN_CONT " %s [%llx, %llx]", endx?"and":"==>", (u64)start, (u64)end);
+ } else {
+ printk(KERN_CONT "%s\n", endx?"":" ==> none");
+ continue;
+ }
+ }
+ }
+
+ update_res(info, start, end, IORESOURCE_MEM, 1);
+ update_range(range, start, end);
+ printk(KERN_CONT "\n");
+ }
+
+ /* need to take out [4G, TOM2) for RAM*/
+ /* SYS_CFG */
+ address = MSR_K8_SYSCFG;
+ rdmsrl(address, val);
+ /* TOP_MEM2 is enabled? */
+ if (val & (1<<21)) {
+ /* TOP_MEM2 */
+ address = MSR_K8_TOP_MEM2;
+ rdmsrl(address, val);
+ end = (val & 0xffffff8000000ULL);
+ printk(KERN_INFO "TOM2: %016lx aka %ldM\n", end, end>>20);
+ update_range(range, 1ULL<<32, end - 1);
+ }
+
+ /*
+ * add left over mmio range to def node/link ?
+ * that is tricky, just record range in from start_min to 4G
+ */
+ for (j = 0; j < pci_root_num; j++) {
+ info = &pci_root_info[j];
+ if (info->node == def_node && info->link == def_link)
+ break;
+ }
+ if (j < pci_root_num) {
+ info = &pci_root_info[j];
+
+ for (i = 0; i < RANGE_NUM; i++) {
+ if (!range[i].end)
+ continue;
+
+ update_res(info, range[i].start, range[i].end,
+ IORESOURCE_MEM, 1);
+ }
+ }
+
+ for (i = 0; i < pci_root_num; i++) {
+ int res_num;
+ int busnum;
+
+ info = &pci_root_info[i];
+ res_num = info->res_num;
+ busnum = info->bus_min;
+ printk(KERN_DEBUG "bus: [%02x,%02x] on node %x link %x\n",
+ info->bus_min, info->bus_max, info->node, info->link);
+ for (j = 0; j < res_num; j++) {
+ res = &info->res[j];
+ printk(KERN_DEBUG "bus: %02x index %x %s: [%llx, %llx]\n",
+ busnum, j,
+ (res->flags & IORESOURCE_IO)?"io port":"mmio",
+ res->start, res->end);
+ }
+ }
+
+ return 0;
+}
+
+postcore_initcall(early_fill_mp_bus_info);
diff --git a/arch/x86/pci/k8-bus_64.c b/arch/x86/pci/k8-bus_64.c
deleted file mode 100644
index 5c2799c..0000000
--- a/arch/x86/pci/k8-bus_64.c
+++ /dev/null
@@ -1,528 +0,0 @@
-#include <linux/init.h>
-#include <linux/pci.h>
-#include <asm/pci-direct.h>
-#include <asm/mpspec.h>
-#include <linux/cpumask.h>
-#include <linux/topology.h>
-
-/*
- * This discovers the pcibus <-> node mapping on AMD K8.
- * also get peer root bus resource for io,mmio
- */
-
-
-/*
- * sub bus (transparent) will use entres from 3 to store extra from root,
- * so need to make sure have enought slot there, increase PCI_BUS_NUM_RESOURCES?
- */
-#define RES_NUM 16
-struct pci_root_info {
- char name[12];
- unsigned int res_num;
- struct resource res[RES_NUM];
- int bus_min;
- int bus_max;
- int node;
- int link;
-};
-
-/* 4 at this time, it may become to 32 */
-#define PCI_ROOT_NR 4
-static int pci_root_num;
-static struct pci_root_info pci_root_info[PCI_ROOT_NR];
-
-#ifdef CONFIG_NUMA
-
-#define BUS_NR 256
-
-static int mp_bus_to_node[BUS_NR];
-
-void set_mp_bus_to_node(int busnum, int node)
-{
- if (busnum >= 0 && busnum < BUS_NR)
- mp_bus_to_node[busnum] = node;
-}
-
-int get_mp_bus_to_node(int busnum)
-{
- int node = -1;
-
- if (busnum < 0 || busnum > (BUS_NR - 1))
- return node;
-
- node = mp_bus_to_node[busnum];
-
- /*
- * let numa_node_id to decide it later in dma_alloc_pages
- * if there is no ram on that node
- */
- if (node != -1 && !node_online(node))
- node = -1;
-
- return node;
-}
-#endif
-
-void set_pci_bus_resources_arch_default(struct pci_bus *b)
-{
- int i;
- int j;
- struct pci_root_info *info;
-
- /* if only one root bus, don't need to anything */
- if (pci_root_num < 2)
- return;
-
- for (i = 0; i < pci_root_num; i++) {
- if (pci_root_info[i].bus_min == b->number)
- break;
- }
-
- if (i == pci_root_num)
- return;
-
- info = &pci_root_info[i];
- for (j = 0; j < info->res_num; j++) {
- struct resource *res;
- struct resource *root;
-
- res = &info->res[j];
- b->resource[j] = res;
- if (res->flags & IORESOURCE_IO)
- root = &ioport_resource;
- else
- root = &iomem_resource;
- insert_resource(root, res);
- }
-}
-
-#define RANGE_NUM 16
-
-struct res_range {
- size_t start;
- size_t end;
-};
-
-static void __init update_range(struct res_range *range, size_t start,
- size_t end)
-{
- int i;
- int j;
-
- for (j = 0; j < RANGE_NUM; j++) {
- if (!range[j].end)
- continue;
-
- if (start <= range[j].start && end >= range[j].end) {
- range[j].start = 0;
- range[j].end = 0;
- continue;
- }
-
- if (start <= range[j].start && end < range[j].end && range[j].start < end + 1) {
- range[j].start = end + 1;
- continue;
- }
-
-
- if (start > range[j].start && end >= range[j].end && range[j].end > start - 1) {
- range[j].end = start - 1;
- continue;
- }
-
- if (start > range[j].start && end < range[j].end) {
- /* find the new spare */
- for (i = 0; i < RANGE_NUM; i++) {
- if (range[i].end == 0)
- break;
- }
- if (i < RANGE_NUM) {
- range[i].end = range[j].end;
- range[i].start = end + 1;
- } else {
- printk(KERN_ERR "run of slot in ranges\n");
- }
- range[j].end = start - 1;
- continue;
- }
- }
-}
-
-static void __init update_res(struct pci_root_info *info, size_t start,
- size_t end, unsigned long flags, int merge)
-{
- int i;
- struct resource *res;
-
- if (!merge)
- goto addit;
-
- /* try to merge it with old one */
- for (i = 0; i < info->res_num; i++) {
- size_t final_start, final_end;
- size_t common_start, common_end;
-
- res = &info->res[i];
- if (res->flags != flags)
- continue;
-
- common_start = max((size_t)res->start, start);
- common_end = min((size_t)res->end, end);
- if (common_start > common_end + 1)
- continue;
-
- final_start = min((size_t)res->start, start);
- final_end = max((size_t)res->end, end);
-
- res->start = final_start;
- res->end = final_end;
- return;
- }
-
-addit:
-
- /* need to add that */
- if (info->res_num >= RES_NUM)
- return;
-
- res = &info->res[info->res_num];
- res->name = info->name;
- res->flags = flags;
- res->start = start;
- res->end = end;
- res->child = NULL;
- info->res_num++;
-}
-
-struct pci_hostbridge_probe {
- u32 bus;
- u32 slot;
- u32 vendor;
- u32 device;
-};
-
-static struct pci_hostbridge_probe pci_probes[] __initdata = {
- { 0, 0x18, PCI_VENDOR_ID_AMD, 0x1100 },
- { 0, 0x18, PCI_VENDOR_ID_AMD, 0x1200 },
- { 0xff, 0, PCI_VENDOR_ID_AMD, 0x1200 },
- { 0, 0x18, PCI_VENDOR_ID_AMD, 0x1300 },
-};
-
-static u64 __initdata fam10h_mmconf_start;
-static u64 __initdata fam10h_mmconf_end;
-static void __init get_pci_mmcfg_amd_fam10h_range(void)
-{
- u32 address;
- u64 base, msr;
- unsigned segn_busn_bits;
-
- /* assume all cpus from fam10h have mmconf */
- if (boot_cpu_data.x86 < 0x10)
- return;
-
- address = MSR_FAM10H_MMIO_CONF_BASE;
- rdmsrl(address, msr);
-
- /* mmconfig is not enable */
- if (!(msr & FAM10H_MMIO_CONF_ENABLE))
- return;
-
- base = msr & (FAM10H_MMIO_CONF_BASE_MASK<<FAM10H_MMIO_CONF_BASE_SHIFT);
-
- segn_busn_bits = (msr >> FAM10H_MMIO_CONF_BUSRANGE_SHIFT) &
- FAM10H_MMIO_CONF_BUSRANGE_MASK;
-
- fam10h_mmconf_start = base;
- fam10h_mmconf_end = base + (1ULL<<(segn_busn_bits + 20)) - 1;
-}
-
-/**
- * early_fill_mp_bus_to_node()
- * called before pcibios_scan_root and pci_scan_bus
- * fills the mp_bus_to_cpumask array based according to the LDT Bus Number
- * Registers found in the K8 northbridge
- */
-static int __init early_fill_mp_bus_info(void)
-{
- int i;
- int j;
- unsigned bus;
- unsigned slot;
- int found;
- int node;
- int link;
- int def_node;
- int def_link;
- struct pci_root_info *info;
- u32 reg;
- struct resource *res;
- size_t start;
- size_t end;
- struct res_range range[RANGE_NUM];
- u64 val;
- u32 address;
-
-#ifdef CONFIG_NUMA
- for (i = 0; i < BUS_NR; i++)
- mp_bus_to_node[i] = -1;
-#endif
-
- if (!early_pci_allowed())
- return -1;
-
- found = 0;
- for (i = 0; i < ARRAY_SIZE(pci_probes); i++) {
- u32 id;
- u16 device;
- u16 vendor;
-
- bus = pci_probes[i].bus;
- slot = pci_probes[i].slot;
- id = read_pci_config(bus, slot, 0, PCI_VENDOR_ID);
-
- vendor = id & 0xffff;
- device = (id>>16) & 0xffff;
- if (pci_probes[i].vendor == vendor &&
- pci_probes[i].device == device) {
- found = 1;
- break;
- }
- }
-
- if (!found)
- return 0;
-
- pci_root_num = 0;
- for (i = 0; i < 4; i++) {
- int min_bus;
- int max_bus;
- reg = read_pci_config(bus, slot, 1, 0xe0 + (i << 2));
-
- /* Check if that register is enabled for bus range */
- if ((reg & 7) != 3)
- continue;
-
- min_bus = (reg >> 16) & 0xff;
- max_bus = (reg >> 24) & 0xff;
- node = (reg >> 4) & 0x07;
-#ifdef CONFIG_NUMA
- for (j = min_bus; j <= max_bus; j++)
- mp_bus_to_node[j] = (unsigned char) node;
-#endif
- link = (reg >> 8) & 0x03;
-
- info = &pci_root_info[pci_root_num];
- info->bus_min = min_bus;
- info->bus_max = max_bus;
- info->node = node;
- info->link = link;
- sprintf(info->name, "PCI Bus #%02x", min_bus);
- pci_root_num++;
- }
-
- /* get the default node and link for left over res */
- reg = read_pci_config(bus, slot, 0, 0x60);
- def_node = (reg >> 8) & 0x07;
- reg = read_pci_config(bus, slot, 0, 0x64);
- def_link = (reg >> 8) & 0x03;
-
- memset(range, 0, sizeof(range));
- range[0].end = 0xffff;
- /* io port resource */
- for (i = 0; i < 4; i++) {
- reg = read_pci_config(bus, slot, 1, 0xc0 + (i << 3));
- if (!(reg & 3))
- continue;
-
- start = reg & 0xfff000;
- reg = read_pci_config(bus, slot, 1, 0xc4 + (i << 3));
- node = reg & 0x07;
- link = (reg >> 4) & 0x03;
- end = (reg & 0xfff000) | 0xfff;
-
- /* find the position */
- for (j = 0; j < pci_root_num; j++) {
- info = &pci_root_info[j];
- if (info->node == node && info->link == link)
- break;
- }
- if (j == pci_root_num)
- continue; /* not found */
-
- info = &pci_root_info[j];
- printk(KERN_DEBUG "node %d link %d: io port [%llx, %llx]\n",
- node, link, (u64)start, (u64)end);
-
- /* kernel only handle 16 bit only */
- if (end > 0xffff)
- end = 0xffff;
- update_res(info, start, end, IORESOURCE_IO, 1);
- update_range(range, start, end);
- }
- /* add left over io port range to def node/link, [0, 0xffff] */
- /* find the position */
- for (j = 0; j < pci_root_num; j++) {
- info = &pci_root_info[j];
- if (info->node == def_node && info->link == def_link)
- break;
- }
- if (j < pci_root_num) {
- info = &pci_root_info[j];
- for (i = 0; i < RANGE_NUM; i++) {
- if (!range[i].end)
- continue;
-
- update_res(info, range[i].start, range[i].end,
- IORESOURCE_IO, 1);
- }
- }
-
- memset(range, 0, sizeof(range));
- /* 0xfd00000000-0xffffffffff for HT */
- range[0].end = (0xfdULL<<32) - 1;
-
- /* need to take out [0, TOM) for RAM*/
- address = MSR_K8_TOP_MEM1;
- rdmsrl(address, val);
- end = (val & 0xffffff8000000ULL);
- printk(KERN_INFO "TOM: %016lx aka %ldM\n", end, end>>20);
- if (end < (1ULL<<32))
- update_range(range, 0, end - 1);
-
- /* get mmconfig */
- get_pci_mmcfg_amd_fam10h_range();
- /* need to take out mmconf range */
- if (fam10h_mmconf_end) {
- printk(KERN_DEBUG "Fam 10h mmconf [%llx, %llx]\n", fam10h_mmconf_start, fam10h_mmconf_end);
- update_range(range, fam10h_mmconf_start, fam10h_mmconf_end);
- }
-
- /* mmio resource */
- for (i = 0; i < 8; i++) {
- reg = read_pci_config(bus, slot, 1, 0x80 + (i << 3));
- if (!(reg & 3))
- continue;
-
- start = reg & 0xffffff00; /* 39:16 on 31:8*/
- start <<= 8;
- reg = read_pci_config(bus, slot, 1, 0x84 + (i << 3));
- node = reg & 0x07;
- link = (reg >> 4) & 0x03;
- end = (reg & 0xffffff00);
- end <<= 8;
- end |= 0xffff;
-
- /* find the position */
- for (j = 0; j < pci_root_num; j++) {
- info = &pci_root_info[j];
- if (info->node == node && info->link == link)
- break;
- }
- if (j == pci_root_num)
- continue; /* not found */
-
- info = &pci_root_info[j];
-
- printk(KERN_DEBUG "node %d link %d: mmio [%llx, %llx]",
- node, link, (u64)start, (u64)end);
- /*
- * some sick allocation would have range overlap with fam10h
- * mmconf range, so need to update start and end.
- */
- if (fam10h_mmconf_end) {
- int changed = 0;
- u64 endx = 0;
- if (start >= fam10h_mmconf_start &&
- start <= fam10h_mmconf_end) {
- start = fam10h_mmconf_end + 1;
- changed = 1;
- }
-
- if (end >= fam10h_mmconf_start &&
- end <= fam10h_mmconf_end) {
- end = fam10h_mmconf_start - 1;
- changed = 1;
- }
-
- if (start < fam10h_mmconf_start &&
- end > fam10h_mmconf_end) {
- /* we got a hole */
- endx = fam10h_mmconf_start - 1;
- update_res(info, start, endx, IORESOURCE_MEM, 0);
- update_range(range, start, endx);
- printk(KERN_CONT " ==> [%llx, %llx]", (u64)start, endx);
- start = fam10h_mmconf_end + 1;
- changed = 1;
- }
- if (changed) {
- if (start <= end) {
- printk(KERN_CONT " %s [%llx, %llx]", endx?"and":"==>", (u64)start, (u64)end);
- } else {
- printk(KERN_CONT "%s\n", endx?"":" ==> none");
- continue;
- }
- }
- }
-
- update_res(info, start, end, IORESOURCE_MEM, 1);
- update_range(range, start, end);
- printk(KERN_CONT "\n");
- }
-
- /* need to take out [4G, TOM2) for RAM*/
- /* SYS_CFG */
- address = MSR_K8_SYSCFG;
- rdmsrl(address, val);
- /* TOP_MEM2 is enabled? */
- if (val & (1<<21)) {
- /* TOP_MEM2 */
- address = MSR_K8_TOP_MEM2;
- rdmsrl(address, val);
- end = (val & 0xffffff8000000ULL);
- printk(KERN_INFO "TOM2: %016lx aka %ldM\n", end, end>>20);
- update_range(range, 1ULL<<32, end - 1);
- }
-
- /*
- * add left over mmio range to def node/link ?
- * that is tricky, just record range in from start_min to 4G
- */
- for (j = 0; j < pci_root_num; j++) {
- info = &pci_root_info[j];
- if (info->node == def_node && info->link == def_link)
- break;
- }
- if (j < pci_root_num) {
- info = &pci_root_info[j];
-
- for (i = 0; i < RANGE_NUM; i++) {
- if (!range[i].end)
- continue;
-
- update_res(info, range[i].start, range[i].end,
- IORESOURCE_MEM, 1);
- }
- }
-
- for (i = 0; i < pci_root_num; i++) {
- int res_num;
- int busnum;
-
- info = &pci_root_info[i];
- res_num = info->res_num;
- busnum = info->bus_min;
- printk(KERN_DEBUG "bus: [%02x,%02x] on node %x link %x\n",
- info->bus_min, info->bus_max, info->node, info->link);
- for (j = 0; j < res_num; j++) {
- res = &info->res[j];
- printk(KERN_DEBUG "bus: %02x index %x %s: [%llx, %llx]\n",
- busnum, j,
- (res->flags & IORESOURCE_IO)?"io port":"mmio",
- res->start, res->end);
- }
- }
-
- return 0;
-}
-
-postcore_initcall(early_fill_mp_bus_info);
--
1.5.5.3

2008-06-12 19:50:25

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86: Move PCI IO ECS code to x86/pci

On Thu, Jun 12, 2008 at 11:19 AM, Robert Richter <[email protected]> wrote:
> "Form follows function". Code is now where it belongs to.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> arch/x86/kernel/cpu/amd.c | 3 ---
> arch/x86/kernel/cpu/amd_64.c | 4 ----
> arch/x86/kernel/cpu/cpu.h | 2 --
> arch/x86/kernel/setup.c | 13 -------------
> arch/x86/pci/Makefile_32 | 1 +
> arch/x86/pci/amd_bus.c | 32 ++++++++++++++++++++++++++++++++
> arch/x86/pci/direct.c | 16 +++++++++-------
> arch/x86/pci/pci.h | 1 +
> include/asm-x86/cpufeature.h | 2 --
> 9 files changed, 43 insertions(+), 31 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index acc891a..81a07ca 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -266,9 +266,6 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
>
> if (cpu_has_xmm2)
> set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
> -
> - if (c->x86 == 0x10)
> - amd_enable_pci_ext_cfg(c);
> }
>
> static unsigned int __cpuinit amd_size_cache(struct cpuinfo_x86 *c, unsigned int size)
> diff --git a/arch/x86/kernel/cpu/amd_64.c b/arch/x86/kernel/cpu/amd_64.c
> index f8d2058..250bfe6 100644
> --- a/arch/x86/kernel/cpu/amd_64.c
> +++ b/arch/x86/kernel/cpu/amd_64.c
> @@ -6,7 +6,6 @@
> #include <asm/cacheflush.h>
>
> #include <mach_apic.h>
> -#include "cpu.h"
>
> extern int __cpuinit get_model_name(struct cpuinfo_x86 *c);
> extern void __cpuinit display_cacheinfo(struct cpuinfo_x86 *c);
> @@ -187,9 +186,6 @@ void __cpuinit init_amd(struct cpuinfo_x86 *c)
> if (c->x86 == 0x10)
> fam10h_check_enable_mmcfg();
>
> - if (c->x86 == 0x10)
> - amd_enable_pci_ext_cfg(c);
> -
> if (c == &boot_cpu_data && c->x86 >= 0xf && c->x86 <= 0x11) {
> unsigned long long tseg;
>
> diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
> index f5d5bb1..40ad189 100644
> --- a/arch/x86/kernel/cpu/cpu.h
> +++ b/arch/x86/kernel/cpu/cpu.h
> @@ -39,5 +39,3 @@ extern int get_model_name(struct cpuinfo_x86 *c);
> extern void display_cacheinfo(struct cpuinfo_x86 *c);
>
> #endif /* CONFIG_X86_32 */
> -
> -extern void __cpuinit amd_enable_pci_ext_cfg(struct cpuinfo_x86 *c);
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 20e14db..6f80b85 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -137,16 +137,3 @@ void __init setup_per_cpu_areas(void)
> }
>
> #endif
> -#define ENABLE_CF8_EXT_CFG (1ULL << 46)
> -
> -void __cpuinit amd_enable_pci_ext_cfg(struct cpuinfo_x86 *c)
> -{
> - u64 reg;
> - rdmsrl(MSR_AMD64_NB_CFG, reg);
> - if (!(reg & ENABLE_CF8_EXT_CFG)) {
> - reg |= ENABLE_CF8_EXT_CFG;
> - wrmsrl(MSR_AMD64_NB_CFG, reg);
> - }
> - set_cpu_cap(c, X86_FEATURE_PCI_EXT_CFG);
> -}
> -
> diff --git a/arch/x86/pci/Makefile_32 b/arch/x86/pci/Makefile_32
> index 89ec35d..f647e7e 100644
> --- a/arch/x86/pci/Makefile_32
> +++ b/arch/x86/pci/Makefile_32
> @@ -22,3 +22,4 @@ pci-$(CONFIG_X86_NUMAQ) := numa.o irq.o
> pci-$(CONFIG_NUMA) += mp_bus_to_node.o
>
> obj-y += $(pci-y) common.o early.o
> +obj-y += amd_bus.o
> diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c
> index 5c2799c..15f505d 100644
> --- a/arch/x86/pci/amd_bus.c
> +++ b/arch/x86/pci/amd_bus.c
> @@ -1,5 +1,9 @@
> #include <linux/init.h>
> #include <linux/pci.h>
> +#include "pci.h"
> +
> +#ifdef CONFIG_X86_64
> +
> #include <asm/pci-direct.h>
> #include <asm/mpspec.h>
> #include <linux/cpumask.h>
> @@ -526,3 +530,31 @@ static int __init early_fill_mp_bus_info(void)
> }
>
> postcore_initcall(early_fill_mp_bus_info);
> +
> +#endif
> +
> +/* common 32/64 bit code */
> +
> +#define ENABLE_CF8_EXT_CFG (1ULL << 46)
> +
> +static void enable_pci_io_ecs_per_cpu(void *unused)
> +{
> + u64 reg;
> + rdmsrl(MSR_AMD64_NB_CFG, reg);
> + if (!(reg & ENABLE_CF8_EXT_CFG)) {
> + reg |= ENABLE_CF8_EXT_CFG;
> + wrmsrl(MSR_AMD64_NB_CFG, reg);
> + }
> +}
> +
> +static int __init enable_pci_io_ecs(void)
> +{
> + /* assume all cpus from fam10h have IO ECS */
> + if (boot_cpu_data.x86 < 0x10)
> + return 0;
> + on_each_cpu(enable_pci_io_ecs_per_cpu, NULL, 1, 1);

how about some cpu hotplug? i mean if you limit that when booting and
later use "echo 1 > /sys/.../online" etc to get it online.

YH

2008-06-12 19:51:46

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/pci: Renaming k8-bus_64.c to amd_bus.c

On Thu, Jun 12, 2008 at 11:19 AM, Robert Richter <[email protected]> wrote:
> The name fits better since this is code not only for K8.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> arch/x86/pci/Makefile_64 | 2 +-
> arch/x86/pci/amd_bus.c | 528 ++++++++++++++++++++++++++++++++++++++++++++++
> arch/x86/pci/k8-bus_64.c | 528 ----------------------------------------------
> 3 files changed, 529 insertions(+), 529 deletions(-)
> create mode 100644 arch/x86/pci/amd_bus.c
> delete mode 100644 arch/x86/pci/k8-bus_64.c
>
> diff --git a/arch/x86/pci/Makefile_64 b/arch/x86/pci/Makefile_64
> index 8fbd198..fd47068 100644
> --- a/arch/x86/pci/Makefile_64
> +++ b/arch/x86/pci/Makefile_64
> @@ -13,5 +13,5 @@ obj-y += legacy.o irq.o common.o early.o
> # mmconfig has a 64bit special
> obj-$(CONFIG_PCI_MMCONFIG) += mmconfig_64.o direct.o mmconfig-shared.o
>
> -obj-y += k8-bus_64.o
> +obj-y += amd_bus.o

can we keep the _64?

It is used by 64bit only at this time.

YH

Subject: Re: [PATCH 1/2] x86/pci: Renaming k8-bus_64.c to amd_bus.c

On 12.06.08 12:51:36, Yinghai Lu wrote:
> > -obj-y += k8-bus_64.o
> > +obj-y += amd_bus.o
>
> can we keep the _64?
>
> It is used by 64bit only at this time.

Patch 2/2 will introduce code that works also for 32 bit.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: [email protected]

Subject: Re: [PATCH 2/2] x86: Move PCI IO ECS code to x86/pci

On 12.06.08 12:50:10, Yinghai Lu wrote:
> > +static int __init enable_pci_io_ecs(void)
> > +{
> > + /* assume all cpus from fam10h have IO ECS */
> > + if (boot_cpu_data.x86 < 0x10)
> > + return 0;
> > + on_each_cpu(enable_pci_io_ecs_per_cpu, NULL, 1, 1);
>
> how about some cpu hotplug? i mean if you limit that when booting and
> later use "echo 1 > /sys/.../online" etc to get it online.

Yinghai, will look at this.

Thanks,

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: [email protected]

2008-06-13 17:03:00

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86: Move PCI IO ECS code to x86/pci

On Thu, Jun 12, 2008 at 3:19 PM, Robert Richter <[email protected]> wrote:
> "Form follows function". Code is now where it belongs to.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> arch/x86/kernel/cpu/amd.c | 3 ---
> arch/x86/kernel/cpu/amd_64.c | 4 ----
> arch/x86/kernel/cpu/cpu.h | 2 --
> arch/x86/kernel/setup.c | 13 -------------
> arch/x86/pci/Makefile_32 | 1 +
> arch/x86/pci/amd_bus.c | 32 ++++++++++++++++++++++++++++++++
> arch/x86/pci/direct.c | 16 +++++++++-------
> arch/x86/pci/pci.h | 1 +
> include/asm-x86/cpufeature.h | 2 --
> 9 files changed, 43 insertions(+), 31 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index acc891a..81a07ca 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -266,9 +266,6 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
>
> if (cpu_has_xmm2)
> set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
> -
> - if (c->x86 == 0x10)
> - amd_enable_pci_ext_cfg(c);
> }
>
> static unsigned int __cpuinit amd_size_cache(struct cpuinfo_x86 *c, unsigned int size)
> diff --git a/arch/x86/kernel/cpu/amd_64.c b/arch/x86/kernel/cpu/amd_64.c
> index f8d2058..250bfe6 100644
> --- a/arch/x86/kernel/cpu/amd_64.c
> +++ b/arch/x86/kernel/cpu/amd_64.c
> @@ -6,7 +6,6 @@
> #include <asm/cacheflush.h>
>
> #include <mach_apic.h>
> -#include "cpu.h"
>
> extern int __cpuinit get_model_name(struct cpuinfo_x86 *c);
> extern void __cpuinit display_cacheinfo(struct cpuinfo_x86 *c);
> @@ -187,9 +186,6 @@ void __cpuinit init_amd(struct cpuinfo_x86 *c)
> if (c->x86 == 0x10)
> fam10h_check_enable_mmcfg();
>
> - if (c->x86 == 0x10)
> - amd_enable_pci_ext_cfg(c);
> -
> if (c == &boot_cpu_data && c->x86 >= 0xf && c->x86 <= 0x11) {
> unsigned long long tseg;
>
> diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
> index f5d5bb1..40ad189 100644
> --- a/arch/x86/kernel/cpu/cpu.h
> +++ b/arch/x86/kernel/cpu/cpu.h
> @@ -39,5 +39,3 @@ extern int get_model_name(struct cpuinfo_x86 *c);
> extern void display_cacheinfo(struct cpuinfo_x86 *c);
>
> #endif /* CONFIG_X86_32 */
> -
> -extern void __cpuinit amd_enable_pci_ext_cfg(struct cpuinfo_x86 *c);
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 20e14db..6f80b85 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -137,16 +137,3 @@ void __init setup_per_cpu_areas(void)
> }
>
> #endif
> -#define ENABLE_CF8_EXT_CFG (1ULL << 46)
> -
> -void __cpuinit amd_enable_pci_ext_cfg(struct cpuinfo_x86 *c)
> -{
> - u64 reg;
> - rdmsrl(MSR_AMD64_NB_CFG, reg);
> - if (!(reg & ENABLE_CF8_EXT_CFG)) {
> - reg |= ENABLE_CF8_EXT_CFG;
> - wrmsrl(MSR_AMD64_NB_CFG, reg);
> - }
> - set_cpu_cap(c, X86_FEATURE_PCI_EXT_CFG);
> -}
> -
> diff --git a/arch/x86/pci/Makefile_32 b/arch/x86/pci/Makefile_32
> index 89ec35d..f647e7e 100644
> --- a/arch/x86/pci/Makefile_32
> +++ b/arch/x86/pci/Makefile_32
> @@ -22,3 +22,4 @@ pci-$(CONFIG_X86_NUMAQ) := numa.o irq.o
> pci-$(CONFIG_NUMA) += mp_bus_to_node.o
>
> obj-y += $(pci-y) common.o early.o
> +obj-y += amd_bus.o
> diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c
> index 5c2799c..15f505d 100644
> --- a/arch/x86/pci/amd_bus.c
> +++ b/arch/x86/pci/amd_bus.c
> @@ -1,5 +1,9 @@
> #include <linux/init.h>
> #include <linux/pci.h>
> +#include "pci.h"
> +
> +#ifdef CONFIG_X86_64

Please, don't do that. We are in an ongoing effort to cleanup the
remaining ifdefs in x86 code, and adding more of them would just make
it harder.
If you really need it, move the common part to a separate file (avoid
the _32 and _64 naming), and have it compiled conditionally on your
architecture.

> #include <asm/pci-direct.h>
> #include <asm/mpspec.h>
> #include <linux/cpumask.h>
> @@ -526,3 +530,31 @@ static int __init early_fill_mp_bus_info(void)
> }
>
> postcore_initcall(early_fill_mp_bus_info);
> +
> +#endif
> +
> +/* common 32/64 bit code */
> +
> +#define ENABLE_CF8_EXT_CFG (1ULL << 46)
> +
> +static void enable_pci_io_ecs_per_cpu(void *unused)
> +{
> + u64 reg;
> + rdmsrl(MSR_AMD64_NB_CFG, reg);
> + if (!(reg & ENABLE_CF8_EXT_CFG)) {
> + reg |= ENABLE_CF8_EXT_CFG;
> + wrmsrl(MSR_AMD64_NB_CFG, reg);
> + }
> +}
> +
> +static int __init enable_pci_io_ecs(void)
> +{
> + /* assume all cpus from fam10h have IO ECS */
> + if (boot_cpu_data.x86 < 0x10)
> + return 0;
> + on_each_cpu(enable_pci_io_ecs_per_cpu, NULL, 1, 1);
> + pci_probe |= PCI_HAS_IO_ECS;
> + return 0;
> +}
> +
> +postcore_initcall(enable_pci_io_ecs);
> diff --git a/arch/x86/pci/direct.c b/arch/x86/pci/direct.c
> index 27d61b6..9915293 100644
> --- a/arch/x86/pci/direct.c
> +++ b/arch/x86/pci/direct.c
> @@ -265,14 +265,16 @@ void __init pci_direct_init(int type)
> type);
> if (type == 1) {
> raw_pci_ops = &pci_direct_conf1;
> - if (!raw_pci_ext_ops && cpu_has_pci_ext_cfg) {
> - printk(KERN_INFO "PCI: Using configuration type 1 "
> - "for extended access\n");
> - raw_pci_ext_ops = &pci_direct_conf1;
> - }
> - } else {
> - raw_pci_ops = &pci_direct_conf2;
> + if (raw_pci_ext_ops)
> + return;
> + if (!(pci_probe & PCI_HAS_IO_ECS))
> + return;
> + printk(KERN_INFO "PCI: Using configuration type 1 "
> + "for extended access\n");
> + raw_pci_ext_ops = &pci_direct_conf1;
> + return;
> }
> + raw_pci_ops = &pci_direct_conf2;
> }
>
> int __init pci_direct_probe(void)
> diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h
> index f3972b1..fd53db5 100644
> --- a/arch/x86/pci/pci.h
> +++ b/arch/x86/pci/pci.h
> @@ -27,6 +27,7 @@
> #define PCI_CAN_SKIP_ISA_ALIGN 0x8000
> #define PCI_USE__CRS 0x10000
> #define PCI_CHECK_ENABLE_AMD_MMCONF 0x20000
> +#define PCI_HAS_IO_ECS 0x40000
>
> extern unsigned int pci_probe;
> extern unsigned long pirq_table_addr;
> diff --git a/include/asm-x86/cpufeature.h b/include/asm-x86/cpufeature.h
> index 40fcbba..0d609c8 100644
> --- a/include/asm-x86/cpufeature.h
> +++ b/include/asm-x86/cpufeature.h
> @@ -79,7 +79,6 @@
> #define X86_FEATURE_REP_GOOD (3*32+16) /* rep microcode works well on this CPU */
> #define X86_FEATURE_MFENCE_RDTSC (3*32+17) /* Mfence synchronizes RDTSC */
> #define X86_FEATURE_LFENCE_RDTSC (3*32+18) /* Lfence synchronizes RDTSC */
> -#define X86_FEATURE_PCI_EXT_CFG (3*32+19) /* PCI extended cfg access */
>
> /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
> #define X86_FEATURE_XMM3 (4*32+ 0) /* Streaming SIMD Extensions-3 */
> @@ -188,7 +187,6 @@ extern const char * const x86_power_flags[32];
> #define cpu_has_gbpages boot_cpu_has(X86_FEATURE_GBPAGES)
> #define cpu_has_arch_perfmon boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
> #define cpu_has_pat boot_cpu_has(X86_FEATURE_PAT)
> -#define cpu_has_pci_ext_cfg boot_cpu_has(X86_FEATURE_PCI_EXT_CFG)
>
> #if defined(CONFIG_X86_INVLPG) || defined(CONFIG_X86_64)
> # define cpu_has_invlpg 1
> --
> 1.5.5.3
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>



--
Glauber Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

Subject: Re: [PATCH 2/2] x86: Move PCI IO ECS code to x86/pci

On 13.06.08 14:02:46, Glauber Costa wrote:
> > diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c
> > index 5c2799c..15f505d 100644
> > --- a/arch/x86/pci/amd_bus.c
> > +++ b/arch/x86/pci/amd_bus.c
> > @@ -1,5 +1,9 @@
> > #include <linux/init.h>
> > #include <linux/pci.h>
> > +#include "pci.h"
> > +
> > +#ifdef CONFIG_X86_64
>
> Please, don't do that. We are in an ongoing effort to cleanup the
> remaining ifdefs in x86 code, and adding more of them would just make
> it harder.
> If you really need it, move the common part to a separate file (avoid
> the _32 and _64 naming), and have it compiled conditionally on your
> architecture.

Ok, so what about shared code? Keep all this in separate files:
amd_bus.c, (amd_bus_32.c), amd_bus_64.c, (amd_bus.h)?

Is the strategy to avoid #ifdefs and instead use the flags in
Makefiles? My intention was to coalesce the files. Maybe I was wrong
here.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: [email protected]

2008-06-13 18:27:07

by Glauber Costa

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86: Move PCI IO ECS code to x86/pci

On Fri, Jun 13, 2008 at 3:16 PM, Robert Richter <[email protected]> wrote:
> On 13.06.08 14:02:46, Glauber Costa wrote:
>> > diff --git a/arch/x86/pci/amd_bus.c b/arch/x86/pci/amd_bus.c
>> > index 5c2799c..15f505d 100644
>> > --- a/arch/x86/pci/amd_bus.c
>> > +++ b/arch/x86/pci/amd_bus.c
>> > @@ -1,5 +1,9 @@
>> > #include <linux/init.h>
>> > #include <linux/pci.h>
>> > +#include "pci.h"
>> > +
>> > +#ifdef CONFIG_X86_64
>>
>> Please, don't do that. We are in an ongoing effort to cleanup the
>> remaining ifdefs in x86 code, and adding more of them would just make
>> it harder.
>> If you really need it, move the common part to a separate file (avoid
>> the _32 and _64 naming), and have it compiled conditionally on your
>> architecture.
>
> Ok, so what about shared code? Keep all this in separate files:
> amd_bus.c, (amd_bus_32.c), amd_bus_64.c, (amd_bus.h)?
>
> Is the strategy to avoid #ifdefs and instead use the flags in
> Makefiles? My intention was to coalesce the files. Maybe I was wrong
> here.

The strategy is to coalesce the files if they do the same thing. If
you can find an implementation that is shared
between both, it is surely the most preferable option among them all.
Even if it takes more time.

For some of them, this is obviously not intrinsically possible. That's
the case for example, of the .S files that touch deep details of the
architectures. (ok, ok, _some_ of that code might well be shared in
the future). Another example, is the page headers. Part of it
were kept in page_32.h and page_64.h.

If not, ifdefs in the .h file are probably okay, if they can be used
to avoid them in .c code.

But of course, those things are not carved in stone. If you really
think that ifdef should go there, feel free to do it, with a good
justification.

--
Glauber Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

2008-06-18 07:47:19

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86/pci: Renaming k8-bus_64.c to amd_bus.c


* Robert Richter <[email protected]> wrote:

> The name fits better since this is code not only for K8.
>
> Signed-off-by: Robert Richter <[email protected]>

applied to tip/x86/cpu, thanks Robert.

Ingo

2008-06-18 07:47:58

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86: Move PCI IO ECS code to x86/pci


* Robert Richter <[email protected]> wrote:

> "Form follows function". Code is now where it belongs to.

applied to tip/x86/cpu - thanks.

Ingo

2008-06-18 07:52:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86: Move PCI IO ECS code to x86/pci


* Robert Richter <[email protected]> wrote:

> Ok, so what about shared code? Keep all this in separate files:
> amd_bus.c, (amd_bus_32.c), amd_bus_64.c, (amd_bus.h)?
>
> Is the strategy to avoid #ifdefs and instead use the flags in
> Makefiles? My intention was to coalesce the files. Maybe I was wrong
> here.

What you did is fine - first step is to move the variants as close to
each other as possible, in an as obvious and mechanic step as possible.

If anything breaks due to your patches then they were probably too
large, but lets be optimistic and try your current splitup first :)

Then we can start eliminating any leftover #ifdefs, step by step. Please
send very, very small patches for that (put them into a git tree on
kernel.org if there's more than say a dozen of them), we'll test them,
and if the changes break anything it's all bisectable to an individual
change.

Ingo

Subject: Re: [PATCH 1/2] x86/pci: Renaming k8-bus_64.c to amd_bus.c

On 12.06.08 20:19:22, Robert Richter wrote:
> The name fits better since this is code not only for K8.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> arch/x86/pci/Makefile_64 | 2 +-
> arch/x86/pci/amd_bus.c | 528 ++++++++++++++++++++++++++++++++++++++++++++++
> arch/x86/pci/k8-bus_64.c | 528 ----------------------------------------------
> 3 files changed, 529 insertions(+), 529 deletions(-)
> create mode 100644 arch/x86/pci/amd_bus.c
> delete mode 100644 arch/x86/pci/k8-bus_64.c

I accidentally resent the patches, please igore.

Thanks,

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: [email protected]