2005-05-16 21:04:03

by Martin Bligh

[permalink] [raw]
Subject: 2.6.12-rc4-mm2 build failure

ppc64 box

drivers/ide/ide-probe.c: In function `ide_init_queue':
drivers/ide/ide-probe.c:982: warning: implicit declaration of function `pcibus_to_node'
drivers/ide/ide-disk.c: In function `ide_disk_probe':
drivers/ide/ide-disk.c:1225: warning: implicit declaration of function `pcibus_to_node'
drivers/built-in.o(.text+0xaee4c): In function `.init_irq':
: undefined reference to `.pcibus_to_node'
drivers/built-in.o(.text+0xaf01c): In function `.init_irq':
: undefined reference to `.pcibus_to_node'
drivers/built-in.o(.text+0xb7808): In function `.ide_disk_probe':
: undefined reference to `.pcibus_to_node'
make: *** [.tmp_vmlinux1] Error 1
05/16/05-07:36:03 Build the kernel. Failed rc = 2
05/16/05-07:36:03 build: kernel build Failed rc = 1



2005-05-16 23:07:45

by Christoph Lameter

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure

On Mon, 16 May 2005, Martin J. Bligh wrote:

> ppc64 box
>
> drivers/ide/ide-probe.c: In function `ide_init_queue':
> drivers/ide/ide-probe.c:982: warning: implicit declaration of function `pcibus_to_node'
> drivers/ide/ide-disk.c: In function `ide_disk_probe':
> drivers/ide/ide-disk.c:1225: warning: implicit declaration of function `pcibus_to_node'
> drivers/built-in.o(.text+0xaee4c): In function `.init_irq':
> : undefined reference to `.pcibus_to_node'
> drivers/built-in.o(.text+0xaf01c): In function `.init_irq':
> : undefined reference to `.pcibus_to_node'
> drivers/built-in.o(.text+0xb7808): In function `.ide_disk_probe':
> : undefined reference to `.pcibus_to_node'
> make: *** [.tmp_vmlinux1] Error 1
> 05/16/05-07:36:03 Build the kernel. Failed rc = 2
> 05/16/05-07:36:03 build: kernel build Failed rc = 1

There was a prior discussion with the ppc64 folks about the way that
asm-generic/topology.h was included only for CONFIG_NUMA. I thought that
was fixed?

asm-generic/topology.h must also be included if CONFIG_NUMA is not set
inorder to provide the fall back pcibus_to_node function.

patch follows. Cannot test since I do not have a ppc64.

Index: linux-2.6.12-rc4/include/asm-ppc64/topology.h
===================================================================
--- linux-2.6.12-rc4.orig/include/asm-ppc64/topology.h 2005-03-01 23:38:32.000000000 -0800
+++ linux-2.6.12-rc4/include/asm-ppc64/topology.h 2005-05-16 16:06:24.000000000 -0700
@@ -59,10 +59,8 @@
.nr_balance_failed = 0, \
}

-#else /* !CONFIG_NUMA */
+#endif /* CONFIG_NUMA */

#include <asm-generic/topology.h>

-#endif /* CONFIG_NUMA */
-
#endif /* _ASM_PPC64_TOPOLOGY_H */

2005-05-16 23:15:29

by Martin Bligh

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure


> There was a prior discussion with the ppc64 folks about the way that
> asm-generic/topology.h was included only for CONFIG_NUMA. I thought that
> was fixed?
>
> asm-generic/topology.h must also be included if CONFIG_NUMA is not set
> inorder to provide the fall back pcibus_to_node function.
>
> patch follows. Cannot test since I do not have a ppc64.

OK, I queued up a test. Ping me if you don't get an answer by tommorow ...

M.


2005-05-17 00:39:19

by Christoph Lameter

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure

On Tue, 17 May 2005, Alexey Dobriyan wrote:

> On Tuesday 17 May 2005 03:21, [email protected] wrote:
> > From: Christoph Lameter <[email protected]>
>
> > asm-generic/topology.h must also be included if CONFIG_NUMA is not set
> > in order to provide the fall back pcibus_to_node function.
>
> It should be "included if CONFIG_NUMA is set".

Where will we define the fallback functions for new functions to
be added to topology.h?

> P. S.: alpha has the same "#ifdef #else #endif" pattern in asm/topology.h

ia64 and i386 do not.


2005-05-17 17:10:39

by Martin Bligh

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure



> There was a prior discussion with the ppc64 folks about the way that
> asm-generic/topology.h was included only for CONFIG_NUMA. I thought that
> was fixed?
>
> asm-generic/topology.h must also be included if CONFIG_NUMA is not set
> inorder to provide the fall back pcibus_to_node function.

Nope. same failure.

M.

2005-05-17 17:10:26

by Martin Bligh

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure



--On Tuesday, May 17, 2005 10:08:19 -0700 "Martin J. Bligh" <[email protected]> wrote:

>
>
>> There was a prior discussion with the ppc64 folks about the way that
>> asm-generic/topology.h was included only for CONFIG_NUMA. I thought that
>> was fixed?
>>
>> asm-generic/topology.h must also be included if CONFIG_NUMA is not set
>> inorder to provide the fall back pcibus_to_node function.
>
> Nope. same failure.

Oops. I lie. Now it gets the other failure (the panic on boot we were
discussing before). So yes, that patch worked.

M.

2005-05-17 17:25:16

by Christoph Lameter

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure

On Tue, 17 May 2005, Martin J. Bligh wrote:

> > Nope. same failure.
>
> Oops. I lie. Now it gets the other failure (the panic on boot we were
> discussing before). So yes, that patch worked.

You removed the slab patches right? So this confirms that the panic on
boot has nothing to do with the numa slab allocator?

2005-05-17 17:38:43

by Martin Bligh

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure



--On Tuesday, May 17, 2005 10:20:49 -0700 Christoph Lameter <[email protected]> wrote:

> On Tue, 17 May 2005, Martin J. Bligh wrote:
>
>> > Nope. same failure.
>>
>> Oops. I lie. Now it gets the other failure (the panic on boot we were
>> discussing before). So yes, that patch worked.
>
> You removed the slab patches right? So this confirms that the panic on
> boot has nothing to do with the numa slab allocator?

Nope, not yet. I just applied your other patch.

M.

2005-05-18 00:06:30

by Matthew Dobson

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure

Christoph Lameter wrote:
> On Mon, 16 May 2005, Martin J. Bligh wrote:
>
>
>>ppc64 box
>>
>>drivers/ide/ide-probe.c: In function `ide_init_queue':
>>drivers/ide/ide-probe.c:982: warning: implicit declaration of function `pcibus_to_node'
>>drivers/ide/ide-disk.c: In function `ide_disk_probe':
>>drivers/ide/ide-disk.c:1225: warning: implicit declaration of function `pcibus_to_node'
>>drivers/built-in.o(.text+0xaee4c): In function `.init_irq':
>>: undefined reference to `.pcibus_to_node'
>>drivers/built-in.o(.text+0xaf01c): In function `.init_irq':
>>: undefined reference to `.pcibus_to_node'
>>drivers/built-in.o(.text+0xb7808): In function `.ide_disk_probe':
>>: undefined reference to `.pcibus_to_node'
>>make: *** [.tmp_vmlinux1] Error 1
>>05/16/05-07:36:03 Build the kernel. Failed rc = 2
>>05/16/05-07:36:03 build: kernel build Failed rc = 1
>
>
> There was a prior discussion with the ppc64 folks about the way that
> asm-generic/topology.h was included only for CONFIG_NUMA. I thought that
> was fixed?
>
> asm-generic/topology.h must also be included if CONFIG_NUMA is not set
> inorder to provide the fall back pcibus_to_node function.
>
> patch follows. Cannot test since I do not have a ppc64.
>
> Index: linux-2.6.12-rc4/include/asm-ppc64/topology.h
> ===================================================================
> --- linux-2.6.12-rc4.orig/include/asm-ppc64/topology.h 2005-03-01 23:38:32.000000000 -0800
> +++ linux-2.6.12-rc4/include/asm-ppc64/topology.h 2005-05-16 16:06:24.000000000 -0700
> @@ -59,10 +59,8 @@
> .nr_balance_failed = 0, \
> }
>
> -#else /* !CONFIG_NUMA */
> +#endif /* CONFIG_NUMA */
>
> #include <asm-generic/topology.h>
>
> -#endif /* CONFIG_NUMA */
> -
> #endif /* _ASM_PPC64_TOPOLOGY_H */

Not a big fan of this patch. It's not wrong, per-se, but it just doesn't
sit right with me. asm-generic/topology.h should be a fallback file for
the arches that just want some sort of sane UP/SMP defaults. The better
(IMHO) solution is this patch. Builds fine on PPC64.

-Matt


Attachments:
ppc64_topology-fix.patch (414.00 B)

2005-05-18 00:08:50

by Christoph Lameter

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure

On Tue, 17 May 2005, Matthew Dobson wrote:

> Not a big fan of this patch. It's not wrong, per-se, but it just doesn't
> sit right with me. asm-generic/topology.h should be a fallback file for
> the arches that just want some sort of sane UP/SMP defaults. The better
> (IMHO) solution is this patch. Builds fine on PPC64.

And what happens if yet another function needs to be added for all arches?
Then ppc64 will break again and you will fix it again?

2005-05-18 17:24:57

by Matthew Dobson

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure

Christoph Lameter wrote:
> On Tue, 17 May 2005, Matthew Dobson wrote:
>
>
>>Not a big fan of this patch. It's not wrong, per-se, but it just doesn't
>>sit right with me. asm-generic/topology.h should be a fallback file for
>>the arches that just want some sort of sane UP/SMP defaults. The better
>>(IMHO) solution is this patch. Builds fine on PPC64.
>
>
> And what happens if yet another function needs to be added for all arches?
> Then ppc64 will break again and you will fix it again?

I think you explained it well yourself. If another function needs to be
added for ALL ARCHES, then ALL ARCHES will need to add the function. In
most cases there is no single function that is both CORRECT and GENERIC
across all arches. The way that i386, ia64, ppc64, etc. will map PCI Buses
to Nodes (for instance) will NOT be the same. Anyone who adds a new
topology function has the responsibility of
1) making sure it works for all arches which support topology, or
2) getting the arch maintainers involved and helping them make sure it
works for all arches.

New topology functions don't really get added all that often. We've got
the basics (CPU, Mem, I/O Buses, Nodes) mapped in various ways, so there
shouldn't be tons of new functions added. If someone wants to add a new
function, it's their responsibility to make sure that it doesn't break
anyone's arch.

Regardless, I highly doubt Linus or Andrew will pick up a patch for a new
topology function if it breaks some arch.

-Matt

2005-05-18 17:40:41

by Christoph Lameter

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure

On Wed, 18 May 2005, Matthew Dobson wrote:

> I think you explained it well yourself. If another function needs to be
> added for ALL ARCHES, then ALL ARCHES will need to add the function. In
> most cases there is no single function that is both CORRECT and GENERIC
> across all arches. The way that i386, ia64, ppc64, etc. will map PCI Buses
> to Nodes (for instance) will NOT be the same. Anyone who adds a new
> topology function has the responsibility of
> 1) making sure it works for all arches which support topology, or
> 2) getting the arch maintainers involved and helping them make sure it
> works for all arches.

The topology function in generic may just do nothing
if not defined by the arch like pcibus_to_node. If the arch does not
define it return indeterminate and make all node specific allocations fall
back to unspecific allocation. This works for all arches and preserves
the existing behavior.

> New topology functions don't really get added all that often. We've got
> the basics (CPU, Mem, I/O Buses, Nodes) mapped in various ways, so there
> shouldn't be tons of new functions added. If someone wants to add a new
> function, it's their responsibility to make sure that it doesn't break
> anyone's arch.

Its best to have the kernel setup in such a way that functions can be
added without having to cause breakage.

It is imaginable that someone will add more hardware something to node
functions in the near future given that pcibus_to_node is just for pci.

2005-05-19 00:56:18

by Matthew Dobson

[permalink] [raw]
Subject: Re: 2.6.12-rc4-mm2 build failure

Christoph Lameter wrote:
> On Wed, 18 May 2005, Matthew Dobson wrote:
>
>
>>I think you explained it well yourself. If another function needs to be
>>added for ALL ARCHES, then ALL ARCHES will need to add the function. In
>>most cases there is no single function that is both CORRECT and GENERIC
>>across all arches. The way that i386, ia64, ppc64, etc. will map PCI Buses
>>to Nodes (for instance) will NOT be the same. Anyone who adds a new
>>topology function has the responsibility of
>>1) making sure it works for all arches which support topology, or
>>2) getting the arch maintainers involved and helping them make sure it
>>works for all arches.
>
>
> The topology function in generic may just do nothing
> if not defined by the arch like pcibus_to_node. If the arch does not
> define it return indeterminate and make all node specific allocations fall
> back to unspecific allocation. This works for all arches and preserves
> the existing behavior.

Which is the way it was designed. Arches which just want a safe/sane
default (yet likely inefficient) topology behavior that makes the system
look like a UP/SMP box should include asm-generic/topology.h from their own
asm/topology.h. Arches that want to provide a useful and accurate topology
behavior should define their own implementation of the functions in
asm/topology.h.

The compilation breakage from defining a new topology function for some
arches and using it in generic code without defining it for all arches that
implement topology was intentional. It's a sign that whoever made that new
function should at least solicit some feedback from other arches to make
sure that they are doing something sane, and so that we just don't add new
topology functions on a whim. It was hoped that this would encourage some
discussion amongst arches to come up with interfaces that would be usable
across multiple (all?) arches.


>>New topology functions don't really get added all that often. We've got
>>the basics (CPU, Mem, I/O Buses, Nodes) mapped in various ways, so there
>>shouldn't be tons of new functions added. If someone wants to add a new
>>function, it's their responsibility to make sure that it doesn't break
>>anyone's arch.
>
>
> Its best to have the kernel setup in such a way that functions can be
> added without having to cause breakage.
>
> It is imaginable that someone will add more hardware something to node
> functions in the near future given that pcibus_to_node is just for pci.

I agree. The way (historically) to not cause breakage is to implement your
new topology function for arches that care, or just stick a sane default in
their topology.h. Every arch does it the way I'm suggesting, except IA64 &
x86_64.

x86_64 surprisingly does define pcibus_to_cpumask, but NOT
pcibus_to_node(). That means when they unconditionally #include
<asm-generic/topology.h> at the bottom of asm-x86_64/topology.h they get a
confusing mish-mash of pcibus functions. What I mean is, on x86_64
pcibus_to_cpumask will do the right thing but pcibus_to_node will not.
This is broken. The correct thing to do would be to define pcibus_to_node
in asm-x86_64/topology.h and only include asm-generic/topology.h if
DISCONTIG/NUMA isn't defined.

IA64 seems to define all the topo functions except pcibus related ones.
They also include the generic topology functions, but at least their
pcibus_to_FOO functions are consistent. Ideally, IA64 would also implement
a version of pcibus_to_FOO that does the right thing, but for now I guess
it isn't a priority for them.

Either way, I suppose it's not a terribly interesting argument. Both of
our patches solve the build problem. Whichever one Andrew wants is the fix
we'll go with. I'd obviously prefer mine, since it remains consistent with
the way I designed the topology system in the first place, but since 2
other arches do it the way your patch does, I won't throw a fit if your
patch gets in.

-Matt

2005-05-19 03:36:57

by Christoph Lameter

[permalink] [raw]
Subject: [PATCH] fix pcibus_to_node for x86_64

On Wed, 18 May 2005, Matthew Dobson wrote:

> x86_64 surprisingly does define pcibus_to_cpumask, but NOT
> pcibus_to_node(). That means when they unconditionally #include
> <asm-generic/topology.h> at the bottom of asm-x86_64/topology.h they get a
> confusing mish-mash of pcibus functions. What I mean is, on x86_64
> pcibus_to_cpumask will do the right thing but pcibus_to_node will not.
> This is broken. The correct thing to do would be to define pcibus_to_node
> in asm-x86_64/topology.h and only include asm-generic/topology.h if
> DISCONTIG/NUMA isn't defined.

Correct. There is a bit missing there. x86_64 has the pci_bus_to_node
array but not the macro pcibus_to_node. On cursory review it looks as if
it would already be right. Just the _ and the [] ....

Here is a fix for x86_64 that adds pcibus_to_node to
asm-x86_64/topology.h. Patch against 2.6.12-rc4-mm2:

Index: linux-2.6.12-rc4/include/asm-x86_64/topology.h
===================================================================
--- linux-2.6.12-rc4.orig/include/asm-x86_64/topology.h 2005-05-17 02:19:53.000000000 +0000
+++ linux-2.6.12-rc4/include/asm-x86_64/topology.h 2005-05-19 01:40:06.000000000 +0000
@@ -26,7 +26,8 @@
#define parent_node(node) (node)
#define node_to_first_cpu(node) (__ffs(node_to_cpumask[node]))
#define node_to_cpumask(node) (node_to_cpumask[node])
-#define pcibus_to_cpumask(bus) node_to_cpumask(pci_bus_to_node[(bus)->number]);
+#define pcibus_to_node(bus) pci_bus_to_node[(bus)->number]
+#define pcibus_to_cpumask(bus) node_to_cpumask(pcibus_to_node(bus));

#ifdef CONFIG_NUMA
/* sched_domains SD_NODE_INIT for x86_64 machines */