2002-11-12 12:33:27

by William Lee Irwin III

[permalink] [raw]
Subject: [0/4] NUMA-Q: remove PCI bus number mangling

This fixes a longstanding bug with respect to bridge handling as well as
a Linux PCI faux pas, namely an attempt to support PCI domains with bus
number mangling.

The end result is that bridges off of quad 0 now work, and the code now
follows Linux PCI conventions.

[1/4] NUMA-Q: use sysdata as quad numbers in pci_scan_bus()"
[2/4] NUMA-Q: fetch quad numbers from struct pci_bus"
[3/4] NUMA-Q: use quad numbers passed to low-level config cycles"
[4/4] NUMA-Q: remove last traces of bus number mangling"


Bill


2002-11-12 20:48:26

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:37:46AM -0800, William Lee Irwin III wrote:
> This fixes a longstanding bug with respect to bridge handling as well as
> a Linux PCI faux pas, namely an attempt to support PCI domains with bus
> number mangling.
> The end result is that bridges off of quad 0 now work, and the code now
> follows Linux PCI conventions.
> [1/4] NUMA-Q: use sysdata as quad numbers in pci_scan_bus()"
> [2/4] NUMA-Q: fetch quad numbers from struct pci_bus"
> [3/4] NUMA-Q: use quad numbers passed to low-level config cycles"
> [4/4] NUMA-Q: remove last traces of bus number mangling"

Follow on #1:

[5/4] NUMA-Q: use "quad" instead of node in arch/i386/pci/numa.c


This renames all nonessential uses of the word "quad" in
arch/i386/pci/numa.c with "node".

numa.c | 24 ++++++++++++------------
1 files changed, 12 insertions(+), 12 deletions(-)


diff -urpN pci-2.5.47-4/arch/i386/pci/numa.c pci-2.5.47-5/arch/i386/pci/numa.c
--- pci-2.5.47-4/arch/i386/pci/numa.c 2002-11-12 03:32:13.000000000 -0800
+++ pci-2.5.47-5/arch/i386/pci/numa.c 2002-11-12 12:06:22.000000000 -0800
@@ -6,9 +6,9 @@
#include <linux/init.h>
#include "pci.h"

-#define BUS2QUAD(global) (mp_bus_id_to_node[global])
+#define BUS2NODE(global) (mp_bus_id_to_node[global])
#define BUS2LOCAL(global) (mp_bus_id_to_local[global])
-#define QUADLOCAL2BUS(quad,local) (quad_local_to_mp_bus_id[quad][local])
+#define NODELOCAL2BUS(node,local) (quad_local_to_mp_bus_id[node][local])

#define __PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg) \
(0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & ~3))
@@ -16,7 +16,7 @@
#define PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg) \
__PCI_CONF1_MQ_ADDRESS(BUS2LOCAL(bus), dev, fn, reg)

-static int bus2quad(struct pci_bus *bus)
+static int bus2node(struct pci_bus *bus)
{
return (int)bus->sysdata;
}
@@ -81,13 +81,13 @@ static int __pci_conf1_mq_write (int seg

static int pci_conf1_mq_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
{
- return __pci_conf1_mq_read(bus2quad(bus), bus->number, PCI_SLOT(devfn),
+ return __pci_conf1_mq_read(bus2node(bus), bus->number, PCI_SLOT(devfn),
PCI_FUNC(devfn), where, size, value);
}

static int pci_conf1_mq_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
{
- return __pci_conf1_mq_write(bus2quad(bus), bus->number, PCI_SLOT(devfn),
+ return __pci_conf1_mq_write(bus2node(bus), bus->number, PCI_SLOT(devfn),
PCI_FUNC(devfn), where, size, value);
}

@@ -104,7 +104,7 @@ static void __devinit pci_fixup_i450nx(s
*/
int pxb, reg;
u8 busno, suba, subb;
- int quad = bus2quad(d->bus);
+ int node = bus2node(d->bus);

printk("PCI: Searching for i450NX host bridges on %s\n", d->slot_name);
reg = 0xd0;
@@ -114,9 +114,9 @@ static void __devinit pci_fixup_i450nx(s
pci_read_config_byte(d, reg++, &subb);
DBG("i450NX PXB %d: %02x/%02x/%02x\n", pxb, busno, suba, subb);
if (busno)
- pci_scan_bus(busno, pci_root_ops, (void *)quad); /* Bus A */
+ pci_scan_bus(busno, pci_root_ops, (void *)node); /* Bus A */
if (suba < subb)
- pci_scan_bus(suba+1, pci_root_ops, (void *)quad); /* Bus B */
+ pci_scan_bus(suba+1, pci_root_ops, (void *)node); /* Bus B */
}
pcibios_last_bus = -1;
}
@@ -127,7 +127,7 @@ struct pci_fixup pcibios_fixups[] = {

static int __init pci_numa_init(void)
{
- int quad;
+ int node;

pci_root_ops = &pci_direct_conf1_mq;

@@ -136,9 +136,9 @@ static int __init pci_numa_init(void)

pci_root_bus = pcibios_scan_root(0);
if (clustered_apic_mode && (numnodes > 1)) {
- for (quad = 1; quad < numnodes; ++quad) {
- printk("Scanning PCI bus %d for quad %d\n", 0, quad);
- pci_scan_bus(0, pci_root_ops, (void *)quad);
+ for (node = 1; node < numnodes; ++node) {
+ printk("Scanning PCI bus %d for node %d\n", 0, node);
+ pci_scan_bus(0, pci_root_ops, (void *)node);
}
}
return 0;

2002-11-12 20:49:38

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:37:46AM -0800, William Lee Irwin III wrote:
> This fixes a longstanding bug with respect to bridge handling as well as
> a Linux PCI faux pas, namely an attempt to support PCI domains with bus
> number mangling.
> The end result is that bridges off of quad 0 now work, and the code now
> follows Linux PCI conventions.
> [1/4] NUMA-Q: use sysdata as quad numbers in pci_scan_bus()"
> [2/4] NUMA-Q: fetch quad numbers from struct pci_bus"
> [3/4] NUMA-Q: use quad numbers passed to low-level config cycles"
> [4/4] NUMA-Q: remove last traces of bus number mangling"

Follow-on #2:

[6/4] NUMA-Q: remove unused bus number conversion functions


This removes unused detritus left over from the conversion to the
standard PCI arch-private data mechanisms.

numa.c | 15 +++------------
1 files changed, 3 insertions(+), 12 deletions(-)


diff -urpN pci-2.5.47-5/arch/i386/pci/numa.c pci-2.5.47-6/arch/i386/pci/numa.c
--- pci-2.5.47-5/arch/i386/pci/numa.c 2002-11-12 12:06:22.000000000 -0800
+++ pci-2.5.47-6/arch/i386/pci/numa.c 2002-11-12 12:10:25.000000000 -0800
@@ -6,15 +6,8 @@
#include <linux/init.h>
#include "pci.h"

-#define BUS2NODE(global) (mp_bus_id_to_node[global])
-#define BUS2LOCAL(global) (mp_bus_id_to_local[global])
-#define NODELOCAL2BUS(node,local) (quad_local_to_mp_bus_id[node][local])
-
-#define __PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg) \
- (0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & ~3))
-
#define PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg) \
- __PCI_CONF1_MQ_ADDRESS(BUS2LOCAL(bus), dev, fn, reg)
+ (0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & ~3))

static int bus2node(struct pci_bus *bus)
{
@@ -30,7 +23,7 @@ static int __pci_conf1_mq_read (int seg,

spin_lock_irqsave(&pci_config_lock, flags);

- outl_quad(__PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg), 0xCF8, seg);
+ outl_quad(PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg), 0xCF8, seg);

switch (len) {
case 1:
@@ -58,7 +51,7 @@ static int __pci_conf1_mq_write (int seg

spin_lock_irqsave(&pci_config_lock, flags);

- outl_quad(__PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg), 0xCF8, seg);
+ outl_quad(PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg), 0xCF8, seg);

switch (len) {
case 1:
@@ -77,8 +70,6 @@ static int __pci_conf1_mq_write (int seg
return 0;
}

-#undef PCI_CONF1_MQ_ADDRESS
-
static int pci_conf1_mq_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
{
return __pci_conf1_mq_read(bus2node(bus), bus->number, PCI_SLOT(devfn),

2002-11-12 20:50:54

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 12:52:41PM -0800, William Lee Irwin III wrote:
> [5/4] NUMA-Q: use "quad" instead of node in arch/i386/pci/numa.c

Er, vice-versa, use "node" instead of "quad"

2002-11-12 21:27:46

by Matthew Dobson

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

diff -Nur linux-2.5.47-vanilla/arch/i386/pci/numa.c linux-2.5.47-pcibus_to_node/arch/i386/pci/numa.c
--- linux-2.5.47-vanilla/arch/i386/pci/numa.c Sun Nov 10 19:28:06 2002
+++ linux-2.5.47-pcibus_to_node/arch/i386/pci/numa.c Tue Nov 12 13:25:15 2002
@@ -5,13 +5,10 @@
#include <linux/pci.h>
#include <linux/init.h>
#include "pci.h"
-
-#define BUS2QUAD(global) (mp_bus_id_to_node[global])
-#define BUS2LOCAL(global) (mp_bus_id_to_local[global])
-#define QUADLOCAL2BUS(quad,local) (quad_local_to_mp_bus_id[quad][local])
+#include <asm/topology.h>

#define PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg) \
- (0x80000000 | (BUS2LOCAL(bus) << 16) | (dev << 11) | (fn << 8) | (reg & ~3))
+ (0x80000000 | mp_bus_id_to_local[bus] << 16) | (dev << 11) | (fn << 8) | (reg & ~3))

static int __pci_conf1_mq_read (int seg, int bus, int dev, int fn, int reg, int len, u32 *value)
{
@@ -22,17 +19,17 @@

spin_lock_irqsave(&pci_config_lock, flags);

- outl_quad(PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg), 0xCF8, BUS2QUAD(bus));
+ outl_quad(PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg), 0xCF8, __pcibus_to_node(bus));

switch (len) {
case 1:
- *value = inb_quad(0xCFC + (reg & 3), BUS2QUAD(bus));
+ *value = inb_quad(0xCFC + (reg & 3), __pcibus_to_node(bus));
break;
case 2:
- *value = inw_quad(0xCFC + (reg & 2), BUS2QUAD(bus));
+ *value = inw_quad(0xCFC + (reg & 2), __pcibus_to_node(bus));
break;
case 4:
- *value = inl_quad(0xCFC, BUS2QUAD(bus));
+ *value = inl_quad(0xCFC, __pcibus_to_node(bus));
break;
}

@@ -50,17 +47,17 @@

spin_lock_irqsave(&pci_config_lock, flags);

- outl_quad(PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg), 0xCF8, BUS2QUAD(bus));
+ outl_quad(PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg), 0xCF8, __pcibus_to_node(bus));

switch (len) {
case 1:
- outb_quad((u8)value, 0xCFC + (reg & 3), BUS2QUAD(bus));
+ outb_quad((u8)value, 0xCFC + (reg & 3), __pcibus_to_node(bus));
break;
case 2:
- outw_quad((u16)value, 0xCFC + (reg & 2), BUS2QUAD(bus));
+ outw_quad((u16)value, 0xCFC + (reg & 2), __pcibus_to_node(bus));
break;
case 4:
- outl_quad((u32)value, 0xCFC, BUS2QUAD(bus));
+ outl_quad((u32)value, 0xCFC, __pcibus_to_node(bus));
break;
}

@@ -96,7 +93,7 @@
*/
int pxb, reg;
u8 busno, suba, subb;
- int quad = BUS2QUAD(d->bus->number);
+ int quad = __pcibus_to_node(d->bus->number);

printk("PCI: Searching for i450NX host bridges on %s\n", d->slot_name);
reg = 0xd0;
@@ -106,9 +103,9 @@
pci_read_config_byte(d, reg++, &subb);
DBG("i450NX PXB %d: %02x/%02x/%02x\n", pxb, busno, suba, subb);
if (busno)
- pci_scan_bus(QUADLOCAL2BUS(quad,busno), pci_root_ops, NULL); /* Bus A */
+ pci_scan_bus(quad_local_to_mp_bus_id[quad][busno], pci_root_ops, NULL); /* Bus A */
if (suba < subb)
- pci_scan_bus(QUADLOCAL2BUS(quad,suba+1), pci_root_ops, NULL); /* Bus B */
+ pci_scan_bus(quad_local_to_mp_bus_id[quad][suba+1], pci_root_ops, NULL); /* Bus B */
}
pcibios_last_bus = -1;
}
@@ -130,8 +127,8 @@
if (clustered_apic_mode && (numnodes > 1)) {
for (quad = 1; quad < numnodes; ++quad) {
printk("Scanning PCI bus %d for quad %d\n",
- QUADLOCAL2BUS(quad,0), quad);
- pci_scan_bus(QUADLOCAL2BUS(quad,0),
+ quad_local_to_mp_bus_id[quad][0], quad);
+ pci_scan_bus(quad_local_to_mp_bus_id[quad][0],
pci_root_ops, NULL);
}
}
diff -Nur linux-2.5.47-vanilla/include/asm-generic/topology.h linux-2.5.47-pcibus_to_node/include/asm-generic/topology.h
--- linux-2.5.47-vanilla/include/asm-generic/topology.h Sun Nov 10 19:28:04 2002
+++ linux-2.5.47-pcibus_to_node/include/asm-generic/topology.h Tue Nov 12 13:25:15 2002
@@ -47,5 +47,8 @@
#ifndef __node_to_memblk
#define __node_to_memblk(node) (0)
#endif
+#ifndef __pcibus_to_node
+#define __pcibus_to_node(bus) (0)
+#endif

#endif /* _ASM_GENERIC_TOPOLOGY_H */
diff -Nur linux-2.5.47-vanilla/include/asm-i386/topology.h linux-2.5.47-pcibus_to_node/include/asm-i386/topology.h
--- linux-2.5.47-vanilla/include/asm-i386/topology.h Sun Nov 10 19:28:05 2002
+++ linux-2.5.47-pcibus_to_node/include/asm-i386/topology.h Tue Nov 12 13:25:15 2002
@@ -83,6 +83,9 @@
/* Returns the number of the first MemBlk on Node 'node' */
#define __node_to_memblk(node) (node)

+/* Returns the number of the node containing PCI bus 'bus' */
+#define __pcibus_to_node(bus) (mp_bus_id_to_node[bus])
+
#else /* !CONFIG_X86_NUMAQ */
/*
* Other i386 platforms should define their own version of the


Attachments:
pcibus_to_node-2.5.47.patch (4.33 kB)

2002-11-12 21:31:03

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 01:29:28PM -0800, Matthew Dobson wrote:
> I know I just sent this to you, but I've been meaning to repost this
> to lkml anyway. Here's something I wanted to add to the generic topology
> infrastructure for a while. pcibus_to_node()
> Have a look at this patch, and see if it might be useful to you, ok?
> Cheers!
> -Matt

I'll remove the bus number mangling from it so it uses ->sysdata
instead, make it an additional stage of the patch series and convert
arch/i386/pci/numa.c to use it instead.

Bus number mangling has been vetoed numerous times; the agreed-upon
method of dealing with this is stuffing arch-private information in
->sysdata and dispatching on that within PCI config access routines.


Bill

2002-11-12 21:34:52

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 01:29:28PM -0800, Matthew Dobson wrote:
>> I know I just sent this to you, but I've been meaning to repost this
>> to lkml anyway. Here's something I wanted to add to the generic topology
>> infrastructure for a while. pcibus_to_node()
>> Have a look at this patch, and see if it might be useful to you, ok?
>> Cheers!
>> -Matt

On Tue, Nov 12, 2002 at 01:35:04PM -0800, William Lee Irwin III wrote:
> I'll remove the bus number mangling from it so it uses ->sysdata
> instead, make it an additional stage of the patch series and convert
> arch/i386/pci/numa.c to use it instead.
> Bus number mangling has been vetoed numerous times; the agreed-upon
> method of dealing with this is stuffing arch-private information in
> ->sysdata and dispatching on that within PCI config access routines.

Also, every PCI bridge in my box has a bus number of 3 so the lookup
table will produce wrong answers every time.


Bill

2002-11-12 21:43:50

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 01:35:04PM -0800, William Lee Irwin III wrote:
> I'll remove the bus number mangling from it so it uses ->sysdata
> instead, make it an additional stage of the patch series and convert
> arch/i386/pci/numa.c to use it instead.
> Bus number mangling has been vetoed numerous times; the agreed-upon
> method of dealing with this is stuffing arch-private information in
> ->sysdata and dispatching on that within PCI config access routines.

[7/4] NUMA-Q: introduce __pcibus_to_node()

This introduces a generic __pcibus_to_node() method in asm/topology.h
and provides a NUMA-Q -specific implementation.

asm-generic/topology.h | 3 +++
asm-i386/topology.h | 4 ++++
2 files changed, 7 insertions(+)


diff -urpN pci-2.5.47-6/include/asm-generic/topology.h pci-2.5.47-7/include/asm-generic/topology.h
--- pci-2.5.47-6/include/asm-generic/topology.h 2002-11-10 19:28:04.000000000 -0800
+++ pci-2.5.47-7/include/asm-generic/topology.h 2002-11-12 13:03:40.000000000 -0800
@@ -47,5 +47,8 @@
#ifndef __node_to_memblk
#define __node_to_memblk(node) (0)
#endif
+#ifndef __pcibus_to_node
+#define __pcibus_to_node(bus) (0)
+#endif

#endif /* _ASM_GENERIC_TOPOLOGY_H */
diff -urpN pci-2.5.47-6/include/asm-i386/topology.h pci-2.5.47-7/include/asm-i386/topology.h
--- pci-2.5.47-6/include/asm-i386/topology.h 2002-11-10 19:28:05.000000000 -0800
+++ pci-2.5.47-7/include/asm-i386/topology.h 2002-11-12 13:04:43.000000000 -0800
@@ -83,6 +83,10 @@ static inline unsigned long __node_to_cp
/* Returns the number of the first MemBlk on Node 'node' */
#define __node_to_memblk(node) (node)

+/* Returns the number of the node containing PCI bus 'bus' */
+#define __pcibus_to_node(bus) ((int)((bus)->sysdata))
+
+
#else /* !CONFIG_X86_NUMAQ */
/*
* Other i386 platforms should define their own version of the

2002-11-12 21:50:45

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

At some point in the past, I wrote:
>> Also, every PCI bridge in my box has a bus number of 3 so the lookup
>> table will produce wrong answers every time.

On Tue, Nov 12, 2002 at 02:46:29PM -0800, Martin J. Bligh wrote:
> Isn't that the local bus number though? The topology functions take
> global bus numbers, which should be unique ...
> M.

That bus number mangling scheme is an instance of an approach vetoed
over a year ago by Dave Miller and others, and does not work for bridges
because arch code does not get the opportunity to mangle the bus number
during bridge discovery.

The inheritance of ->sysdata is the proper method, used by Alpha, PPC,
ARM, and others, and has been generally acknowledged as the acceptable
solution to the PCI domain/segment problem.


Bill

2002-11-12 21:45:30

by Martin J. Bligh

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

> Also, every PCI bridge in my box has a bus number of 3 so the lookup
> table will produce wrong answers every time.

Isn't that the local bus number though? The topology functions take
global bus numbers, which should be unique ...

M.

2002-11-12 21:44:36

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 01:35:04PM -0800, William Lee Irwin III wrote:
> I'll remove the bus number mangling from it so it uses ->sysdata
> instead, make it an additional stage of the patch series and convert
> arch/i386/pci/numa.c to use it instead.
> Bus number mangling has been vetoed numerous times; the agreed-upon
> method of dealing with this is stuffing arch-private information in
> ->sysdata and dispatching on that within PCI config access routines.

[8/4] NUMA-Q: use __pcibus_to_node() in arch/i386/pci/numa.c


This uses the __pcibus_to_node() macro within arch/i386/pci/numa.c

numa.c | 11 +++--------
1 files changed, 3 insertions(+), 8 deletions(-)


diff -urpN pci-2.5.47-7/arch/i386/pci/numa.c pci-2.5.47-8/arch/i386/pci/numa.c
--- pci-2.5.47-7/arch/i386/pci/numa.c 2002-11-12 12:10:25.000000000 -0800
+++ pci-2.5.47-8/arch/i386/pci/numa.c 2002-11-12 13:07:27.000000000 -0800
@@ -9,11 +9,6 @@
#define PCI_CONF1_MQ_ADDRESS(bus, dev, fn, reg) \
(0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & ~3))

-static int bus2node(struct pci_bus *bus)
-{
- return (int)bus->sysdata;
-}
-
static int __pci_conf1_mq_read (int seg, int bus, int dev, int fn, int reg, int len, u32 *value)
{
unsigned long flags;
@@ -72,13 +67,13 @@ static int __pci_conf1_mq_write (int seg

static int pci_conf1_mq_read(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
{
- return __pci_conf1_mq_read(bus2node(bus), bus->number, PCI_SLOT(devfn),
+ return __pci_conf1_mq_read(__pcibus_to_node(bus), bus->number, PCI_SLOT(devfn),
PCI_FUNC(devfn), where, size, value);
}

static int pci_conf1_mq_write(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
{
- return __pci_conf1_mq_write(bus2node(bus), bus->number, PCI_SLOT(devfn),
+ return __pci_conf1_mq_write(__pcibus_to_node(bus), bus->number, PCI_SLOT(devfn),
PCI_FUNC(devfn), where, size, value);
}

@@ -95,7 +90,7 @@ static void __devinit pci_fixup_i450nx(s
*/
int pxb, reg;
u8 busno, suba, subb;
- int node = bus2node(d->bus);
+ int node = __pcibus_to_node(d->bus);

printk("PCI: Searching for i450NX host bridges on %s\n", d->slot_name);
reg = 0xd0;

2002-11-12 22:53:35

by Martin J. Bligh

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

>>> Also, every PCI bridge in my box has a bus number of 3 so the lookup
>>> table will produce wrong answers every time.
>
> On Tue, Nov 12, 2002 at 02:46:29PM -0800, Martin J. Bligh wrote:
>> Isn't that the local bus number though? The topology functions take
>> global bus numbers, which should be unique ...
>> M.
>
> That bus number mangling scheme is an instance of an approach vetoed
> over a year ago by Dave Miller and others, and does not work for bridges
> because arch code does not get the opportunity to mangle the bus number
> during bridge discovery.

Right, I'm not against the sysdata thing, seems like a much better way
to do it in general (what I did was a quick hack). Was just confused
by the global bus number assertion, but if we use the sysdata stuff,
it's all a non-issue ;-)

M.

2002-11-12 22:55:53

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

At some point in the past, I wrote:
>> That bus number mangling scheme is an instance of an approach vetoed
>> over a year ago by Dave Miller and others, and does not work for bridges
>> because arch code does not get the opportunity to mangle the bus number
>> during bridge discovery.

On Tue, Nov 12, 2002 at 03:53:49PM -0800, Martin J. Bligh wrote:
> Right, I'm not against the sysdata thing, seems like a much better way
> to do it in general (what I did was a quick hack). Was just confused
> by the global bus number assertion, but if we use the sysdata stuff,
> it's all a non-issue ;-)

Non-issue for merging...

The pain isn't over yet. =(

Core PCI code is assuming unique bus numbers in several places.


Fixing now,
Bill

2002-11-12 23:15:35

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 03:53:49PM -0800, Martin J. Bligh wrote:
>> Right, I'm not against the sysdata thing, seems like a much better way
>> to do it in general (what I did was a quick hack). Was just confused
>> by the global bus number assertion, but if we use the sysdata stuff,
>> it's all a non-issue ;-)

On Tue, Nov 12, 2002 at 02:59:37PM -0800, William Lee Irwin III wrote:
> Non-issue for merging...
> The pain isn't over yet. =(
> Core PCI code is assuming unique bus numbers in several places.
> Fixing now,
> Bill

... and resource/region stuff is not being dealt with properly either.

Found that after dealing with pci_bus_exists() in pci_alloc_primary_bus().


Fixing that too,
Bill

2002-11-12 23:54:10

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 03:53:49PM -0800, Martin J. Bligh wrote:
>> Right, I'm not against the sysdata thing, seems like a much better way
>> to do it in general (what I did was a quick hack). Was just confused
>> by the global bus number assertion, but if we use the sysdata stuff,
>> it's all a non-issue ;-)
>
On Tue, Nov 12, 2002 at 02:59:37PM -0800, William Lee Irwin III wrote:
> Non-issue for merging...
> The pain isn't over yet. =(
> Core PCI code is assuming unique bus numbers in several places.

Okay, an attempt to remedy this world-breaking braindamage with the
fewest lines of code:

This alters PCI bus number "clash" detection to compare ->sysdata in
addition to the numbers. The bus number is not a unique identifier.

drivers/pci/probe.c | 14 ++++++++++++--
include/linux/pci.h | 2 +-
2 files changed, 13 insertions(+), 3 deletions(-)


diff -urpN pci-2.5.47-8/drivers/pci/probe.c pci-2.5.47-8-dbg/drivers/pci/probe.c
--- pci-2.5.47-8/drivers/pci/probe.c 2002-11-12 13:37:56.000000000 -0800
+++ pci-2.5.47-8-dbg/drivers/pci/probe.c 2002-11-12 14:27:54.000000000 -0800
@@ -548,11 +548,21 @@ int __devinit pci_bus_exists(const struc
return 0;
}

-struct pci_bus * __devinit pci_alloc_primary_bus(int bus)
+static int __devinit pci_root_bus_exists(int number, void *sysdata)
+{
+ const struct pci_bus *bus;
+
+ list_for_each_entry(bus, &pci_root_buses, node)
+ if (bus->number == number && bus->sysdata == sysdata)
+ return 1;
+ return 0;
+}
+
+struct pci_bus * __devinit pci_alloc_primary_bus(int bus, void *sysdata)
{
struct pci_bus *b;

- if (pci_bus_exists(&pci_root_buses, bus)) {
+ if (pci_root_bus_exists(bus, sysdata)) {
/* If we already got to this bus through a different bridge, ignore it */
DBG("PCI: Bus %02x already known\n", bus);
return NULL;
diff -urpN pci-2.5.47-8/include/linux/pci.h pci-2.5.47-8-dbg/include/linux/pci.h
--- pci-2.5.47-8/include/linux/pci.h 2002-11-10 19:28:13.000000000 -0800
+++ pci-2.5.47-8-dbg/include/linux/pci.h 2002-11-12 14:29:13.000000000 -0800
@@ -537,7 +537,7 @@ int pcibios_write_config_dword (unsigned

int pci_bus_exists(const struct list_head *list, int nr);
struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
-struct pci_bus *pci_alloc_primary_bus(int bus);
+struct pci_bus *pci_alloc_primary_bus(int bus, void *sysdata);
struct pci_dev *pci_scan_slot(struct pci_dev *temp);
int pci_proc_attach_device(struct pci_dev *dev);
int pci_proc_detach_device(struct pci_dev *dev);

2002-11-13 00:03:00

by Greg KH

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 03:58:24PM -0800, William Lee Irwin III wrote:
> On Tue, Nov 12, 2002 at 03:53:49PM -0800, Martin J. Bligh wrote:
> >> Right, I'm not against the sysdata thing, seems like a much better way
> >> to do it in general (what I did was a quick hack). Was just confused
> >> by the global bus number assertion, but if we use the sysdata stuff,
> >> it's all a non-issue ;-)
> >
> On Tue, Nov 12, 2002 at 02:59:37PM -0800, William Lee Irwin III wrote:
> > Non-issue for merging...
> > The pain isn't over yet. =(
> > Core PCI code is assuming unique bus numbers in several places.
>
> Okay, an attempt to remedy this world-breaking braindamage with the
> fewest lines of code:
>
> This alters PCI bus number "clash" detection to compare ->sysdata in
> addition to the numbers. The bus number is not a unique identifier.

Um, why not?

And what would /sys/bus/pci/devices look like if you allow the same
identifiers for different devices? :)

thanks,

greg k-h

2002-11-13 00:08:33

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 03:58:24PM -0800, William Lee Irwin III wrote:
>> Okay, an attempt to remedy this world-breaking braindamage with the
>> fewest lines of code:
>> This alters PCI bus number "clash" detection to compare ->sysdata in
>> addition to the numbers. The bus number is not a unique identifier.

On Tue, Nov 12, 2002 at 04:04:35PM -0800, Greg KH wrote:
> Um, why not?
> And what would /sys/bus/pci/devices look like if you allow the same
> identifiers for different devices? :)
> thanks,
> greg k-h

(1) incorrect semantics:
multiple domains/segments exist, and the bus number is not
qualified with such
(2) insufficient cardinality:
even in the presence of remapping schemes the bus space is limited
to less than the number of buses requiring unique identifiers


Bill

2002-11-13 00:19:00

by Greg KH

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:12:46PM -0800, William Lee Irwin III wrote:
> On Tue, Nov 12, 2002 at 03:58:24PM -0800, William Lee Irwin III wrote:
> >> Okay, an attempt to remedy this world-breaking braindamage with the
> >> fewest lines of code:
> >> This alters PCI bus number "clash" detection to compare ->sysdata in
> >> addition to the numbers. The bus number is not a unique identifier.
>
> On Tue, Nov 12, 2002 at 04:04:35PM -0800, Greg KH wrote:
> > Um, why not?
> > And what would /sys/bus/pci/devices look like if you allow the same
> > identifiers for different devices? :)
> > thanks,
> > greg k-h
>
> (1) incorrect semantics:
> multiple domains/segments exist, and the bus number is not
> qualified with such
> (2) insufficient cardinality:
> even in the presence of remapping schemes the bus space is limited
> to less than the number of buses requiring unique identifiers

Ok, then also please fix up drivers/pci/probe.c::pci_setup_device() to
set a unique slot_name up for the pci_dev, if you have multiple
domains/segments.

thanks,

greg k-h

2002-11-13 00:27:50

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:20:32PM -0800, Greg KH wrote:
> Ok, then also please fix up drivers/pci/probe.c::pci_setup_device() to
> set a unique slot_name up for the pci_dev, if you have multiple
> domains/segments.
> thanks,
> greg k-h

Okay, tihs just needs the introduction of something that can produce
a number out of ->sysdata (or whatever):

sprintf(dev->slot_name, "%02x:%02x.%d", dev->bus->number, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));


This wants to be something like:
sprintf(dev->slot_name, "%02x:%02x:%02x.%d", dev->bus->segment, dev->bus->number, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));


On the way in 5 minutes or thereabouts. I'm restarting from just before
the NUMA changes.

Bill

2002-11-13 00:57:20

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:20:32PM -0800, Greg KH wrote:
>> Ok, then also please fix up drivers/pci/probe.c::pci_setup_device() to
>> set a unique slot_name up for the pci_dev, if you have multiple
>> domains/segments.
>> thanks,
>> greg k-h

On Tue, Nov 12, 2002 at 04:28:55PM -0800, William Lee Irwin III wrote:
> Okay, tihs just needs the introduction of something that can produce
> a number out of ->sysdata (or whatever):
> sprintf(dev->slot_name, "%02x:%02x.%d", dev->bus->number, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));
>
> This wants to be something like:
> sprintf(dev->slot_name, "%02x:%02x:%02x.%d", dev->bus->segment, dev->bus->number, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));
> On the way in 5 minutes or thereabouts. I'm restarting from just before
> the NUMA changes.

Some low-level port I/O stuff has to be adjusted for correct resource
accounting. This is going to bump the ETA back to about 2 hours.

Bill

2002-11-13 10:05:58

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:20:32PM -0800, Greg KH wrote:
> Ok, then also please fix up drivers/pci/probe.c::pci_setup_device() to
> set a unique slot_name up for the pci_dev, if you have multiple
> domains/segments.
> thanks,
> greg k-h

Reporting that stuff is trivial, but resolving the deep arch issues
with the remaining failures I'm getting (not directly bus-related,
actually I/O resource allocation going wrong) have me badly stumped.

Push back that ETA to weeks. I'll break off generically mergeable
bits and send them your way as I go though. The patch queue is
something like 20 long, but most of the content is "resolve one
problem after the other". Getting this into a state of "the system
works at every step of the way" is tricky, esp. since the end results
of today's excursion do not yet include a fully-working system.


Bill

2002-11-13 11:16:13

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:20:32PM -0800, Greg KH wrote:
>> Ok, then also please fix up drivers/pci/probe.c::pci_setup_device() to
>> set a unique slot_name up for the pci_dev, if you have multiple
>> domains/segments.
>> thanks,
>> greg k-h

On Wed, Nov 13, 2002 at 02:10:05AM -0800, William Lee Irwin III wrote:
> Reporting that stuff is trivial, but resolving the deep arch issues
> with the remaining failures I'm getting (not directly bus-related,
> actually I/O resource allocation going wrong) have me badly stumped.
> Push back that ETA to weeks. I'll break off generically mergeable
> bits and send them your way as I go though. The patch queue is
> something like 20 long, but most of the content is "resolve one
> problem after the other". Getting this into a state of "the system
> works at every step of the way" is tricky, esp. since the end results
> of today's excursion do not yet include a fully-working system.

Actually benh has straightened me out on this count as of a few minutes
ago. Something will probably show up here by Friday.


Bill

2002-11-13 12:02:53

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:28:55PM -0800, William Lee Irwin III wrote:
> On Tue, Nov 12, 2002 at 04:20:32PM -0800, Greg KH wrote:
> > Ok, then also please fix up drivers/pci/probe.c::pci_setup_device() to
> > set a unique slot_name up for the pci_dev, if you have multiple
> > domains/segments.
> > thanks,
> > greg k-h
>
> Okay, tihs just needs the introduction of something that can produce
> a number out of ->sysdata (or whatever):

Please use existing pci_controller_num(struct pci_dev *pdev).
It does exactly that you want.

Ivan.

2002-11-13 20:55:08

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Wed, Nov 13, 2002 at 03:07:16PM +0300, Ivan Kokshaysky wrote:
> Please use existing pci_controller_num(struct pci_dev *pdev).
> It does exactly that you want.

There are some other issues blocking a working implementation but
using that instead of a new function will be trivial.


Bill

2003-01-01 09:20:36

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [0/4] NUMA-Q: remove PCI bus number mangling

On Tue, Nov 12, 2002 at 04:20:32PM -0800, Greg KH wrote:
>>> Ok, then also please fix up drivers/pci/probe.c::pci_setup_device() to
>>> set a unique slot_name up for the pci_dev, if you have multiple
>>> domains/segments.

On Wed, Nov 13, 2002 at 02:10:05AM -0800, William Lee Irwin III wrote:
>> Reporting that stuff is trivial, but resolving the deep arch issues
>> with the remaining failures I'm getting (not directly bus-related,
>> actually I/O resource allocation going wrong) have me badly stumped.
>> Push back that ETA to weeks. I'll break off generically mergeable
>> bits and send them your way as I go though. The patch queue is
>> something like 20 long, but most of the content is "resolve one
>> problem after the other". Getting this into a state of "the system
>> works at every step of the way" is tricky, esp. since the end results
>> of today's excursion do not yet include a fully-working system.

On Wed, Nov 13, 2002 at 03:20:28AM -0800, William Lee Irwin III wrote:
> Actually benh has straightened me out on this count as of a few minutes
> ago. Something will probably show up here by Friday.

I've had a lot of trouble with this. Don't count on this anytime soon.

Bill