2005-09-06 03:56:35

by Magnus Damm

[permalink] [raw]
Subject: [PATCH] i386: single node SPARSEMEM fix

This patch for 2.6.13-git5 fixes single node sparsemem support. In the case
when multiple nodes are used, setup_memory() in arch/i386/mm/discontig.c calls
get_memcfg_numa() which calls memory_present(). The single node case with
setup_memory() in arch/i386/kernel/setup.c does not call memory_present()
without this patch, which breaks single node support.

Signed-off-by: Magnus Damm <[email protected]>
----

--- from-0006/arch/i386/Kconfig
+++ to-0007/arch/i386/Kconfig 2005-09-06 12:01:45.000000000 +0900
@@ -758,7 +758,6 @@ config NUMA
depends on SMP && HIGHMEM64G && (X86_NUMAQ || X86_GENERICARCH || (X86_SUMMIT && ACPI))
default n if X86_PC
default y if (X86_NUMAQ || X86_SUMMIT)
- select SPARSEMEM_STATIC

# Need comments to help the hapless user trying to turn on NUMA support
comment "NUMA (NUMA-Q) requires SMP, 64GB highmem support"
@@ -797,7 +796,8 @@ config ARCH_DISCONTIGMEM_DEFAULT

config ARCH_SPARSEMEM_ENABLE
def_bool y
- depends on NUMA
+ depends on NUMA || (X86_PC && EXPERIMENTAL)
+ select SPARSEMEM_STATIC

config ARCH_SELECT_MEMORY_MODEL
def_bool y
--- from-0006/arch/i386/kernel/setup.c
+++ to-0007/arch/i386/kernel/setup.c 2005-09-06 11:34:07.000000000 +0900
@@ -1127,6 +1127,9 @@ static unsigned long __init setup_memory
printk(KERN_NOTICE "%ldMB LOWMEM available.\n",
pages_to_mb(max_low_pfn));

+#ifdef CONFIG_SPARSEMEM
+ memory_present(0, 0, max_pfn);
+#endif
setup_bootmem_allocator();

return max_low_pfn;


2005-09-07 17:29:14

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

On Tue, 2005-09-06 at 12:56 +0900, Magnus Damm wrote:
> This patch for 2.6.13-git5 fixes single node sparsemem support. In the case
> when multiple nodes are used, setup_memory() in arch/i386/mm/discontig.c calls
> get_memcfg_numa() which calls memory_present(). The single node case with
> setup_memory() in arch/i386/kernel/setup.c does not call memory_present()
> without this patch, which breaks single node support.

First of all, this is really a feature addition, not a bug fix. :)

The reason we haven't included this so far is that we don't really have
any machines that need sparsemem on i386 that aren't NUMA. So, we
disabled it for now, and probably need to decide first why we need it
before a patch like that goes in.

I actually have exactly the same patch that you sent out in my tree, but
it's just for testing. Magnus, perhaps we can get some of my testing
patches in good enough shape to put them in -mm so that the non-NUMA
folks can do more sparsemem testing.

-- Dave

2005-09-07 18:22:46

by Martin Bligh

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

--On Wednesday, September 07, 2005 10:28:36 -0700 Dave Hansen <[email protected]> wrote:

> On Tue, 2005-09-06 at 12:56 +0900, Magnus Damm wrote:
>> This patch for 2.6.13-git5 fixes single node sparsemem support. In the case
>> when multiple nodes are used, setup_memory() in arch/i386/mm/discontig.c calls
>> get_memcfg_numa() which calls memory_present(). The single node case with
>> setup_memory() in arch/i386/kernel/setup.c does not call memory_present()
>> without this patch, which breaks single node support.
>
> First of all, this is really a feature addition, not a bug fix. :)
>
> The reason we haven't included this so far is that we don't really have
> any machines that need sparsemem on i386 that aren't NUMA. So, we
> disabled it for now, and probably need to decide first why we need it
> before a patch like that goes in.

CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
machines. This is essential in order for the distros to support it - same
will go for sparsemem.

M.

2005-09-07 18:28:52

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

On Wed, 2005-09-07 at 11:22 -0700, Martin J. Bligh wrote:
> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
> machines. This is essential in order for the distros to support it - same
> will go for sparsemem.

That's a different issue. The current code works if you boot a NUMA=y
SPARSEMEM=y machine with a single node. The current Kconfig options
also enforce that SPARSEMEM depends on NUMA on i386.

Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n. That
requires some Kconfig changes, as well as an extra memory present call.
I'm questioning why we need to do that when we could never do
DISCONTIG=y while NUMA=n on i386.

-- Dave

2005-09-07 18:34:55

by Martin Bligh

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix



--On Wednesday, September 07, 2005 11:27:54 -0700 Dave Hansen <[email protected]> wrote:

> On Wed, 2005-09-07 at 11:22 -0700, Martin J. Bligh wrote:
>> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
>> machines. This is essential in order for the distros to support it - same
>> will go for sparsemem.
>
> That's a different issue. The current code works if you boot a NUMA=y
> SPARSEMEM=y machine with a single node. The current Kconfig options
> also enforce that SPARSEMEM depends on NUMA on i386.
>
> Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n. That
> requires some Kconfig changes, as well as an extra memory present call.
> I'm questioning why we need to do that when we could never do
> DISCONTIG=y while NUMA=n on i386.

Ah, OK - makes more sense. However, some machines do have large holes
in e820 map setups - is not really critical, more of an efficiency
thing.

M.

2005-09-07 23:50:00

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

"Martin J. Bligh" <[email protected]> wrote:
>
>
>
> --On Wednesday, September 07, 2005 11:27:54 -0700 Dave Hansen <[email protected]> wrote:
>
> > On Wed, 2005-09-07 at 11:22 -0700, Martin J. Bligh wrote:
> >> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
> >> machines. This is essential in order for the distros to support it - same
> >> will go for sparsemem.
> >
> > That's a different issue. The current code works if you boot a NUMA=y
> > SPARSEMEM=y machine with a single node. The current Kconfig options
> > also enforce that SPARSEMEM depends on NUMA on i386.
> >
> > Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n. That
> > requires some Kconfig changes, as well as an extra memory present call.
> > I'm questioning why we need to do that when we could never do
> > DISCONTIG=y while NUMA=n on i386.
>
> Ah, OK - makes more sense. However, some machines do have large holes
> in e820 map setups - is not really critical, more of an efficiency
> thing.

Confused. Does all this mean that we want the patch, or not?

2005-09-08 01:40:44

by Magnus Damm

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

On 9/8/05, Dave Hansen <[email protected]> wrote:
> On Tue, 2005-09-06 at 12:56 +0900, Magnus Damm wrote:
> > This patch for 2.6.13-git5 fixes single node sparsemem support. In the case
> > when multiple nodes are used, setup_memory() in arch/i386/mm/discontig.c calls
> > get_memcfg_numa() which calls memory_present(). The single node case with
> > setup_memory() in arch/i386/kernel/setup.c does not call memory_present()
> > without this patch, which breaks single node support.
>
> First of all, this is really a feature addition, not a bug fix. :)

>From the POV that you can use sparsemem on a PC, yes. But from the POV
that setup_memory() in arch/i386/kernel/setup.c not includes a call to
memory_present(), I think it is a fix. =)

While at it, why do we have two copies of setup_memory()? Couldn't
NUMA and non-NUMA share the same code? OTOH, NUMA and discontigmem
seems very integrated/mixed up and there seems to be much activity in
this field so maybe it is nice to keep the NUMA part separated anyway.

> The reason we haven't included this so far is that we don't really have
> any machines that need sparsemem on i386 that aren't NUMA. So, we
> disabled it for now, and probably need to decide first why we need it
> before a patch like that goes in.

Well, I do not have any hardware here that requires sparsemem either,
but I wanted to add NUMA emulation code to be able to run some
multiple-memory-nodes tests on a virtual PC in QEMU. And this little
patch shows my first step which involved getting sparsememto run on a
PC.

> I actually have exactly the same patch that you sent out in my tree, but
> it's just for testing. Magnus, perhaps we can get some of my testing
> patches in good enough shape to put them in -mm so that the non-NUMA
> folks can do more sparsemem testing.

Well, my NUMA emulation project has been postponed a bit now, but
sooner or later I or someone else will need sparsemem on non-NUMA. So
getting your testing patches in to -mm seems like a good idea!

Thanks!

/ magnus

2005-09-08 01:45:39

by Magnus Damm

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

On 9/8/05, Martin J. Bligh <[email protected]> wrote:
> --On Wednesday, September 07, 2005 10:28:36 -0700 Dave Hansen <[email protected]> wrote:
>
> > On Tue, 2005-09-06 at 12:56 +0900, Magnus Damm wrote:
> >> This patch for 2.6.13-git5 fixes single node sparsemem support. In the case
> >> when multiple nodes are used, setup_memory() in arch/i386/mm/discontig.c calls
> >> get_memcfg_numa() which calls memory_present(). The single node case with
> >> setup_memory() in arch/i386/kernel/setup.c does not call memory_present()
> >> without this patch, which breaks single node support.
> >
> > First of all, this is really a feature addition, not a bug fix. :)
> >
> > The reason we haven't included this so far is that we don't really have
> > any machines that need sparsemem on i386 that aren't NUMA. So, we
> > disabled it for now, and probably need to decide first why we need it
> > before a patch like that goes in.
>
> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
> machines. This is essential in order for the distros to support it - same
> will go for sparsemem.

Yes, by reading the code this becomes very clear. But what is the
current status? Is CONFIG_X86_GENERICARCH working right out of the box
on 2.6.13?

Thanks!

/ magnus

2005-09-08 01:51:14

by Magnus Damm

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

On 9/8/05, Dave Hansen <[email protected]> wrote:
> On Wed, 2005-09-07 at 11:22 -0700, Martin J. Bligh wrote:
> > CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
> > machines. This is essential in order for the distros to support it - same
> > will go for sparsemem.
>
> That's a different issue. The current code works if you boot a NUMA=y
> SPARSEMEM=y machine with a single node. The current Kconfig options
> also enforce that SPARSEMEM depends on NUMA on i386.
>
> Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n. That
> requires some Kconfig changes, as well as an extra memory present call.
> I'm questioning why we need to do that when we could never do
> DISCONTIG=y while NUMA=n on i386.

Actually, I do not really care about the Kconfig stuff. I just added
that to show you guys why and when the change in
arch/i386/kernel/setup.c was needed. So my main interest is to include
the fix to the single-node version of setup_memory(). This to sync up
the single-node case with the multiple-node version of setup_memory(),
and to make it easier for me and other people to start using sparsemem
om single-node (or non-NUMA if you prefer that) configurations.

/ magnus

2005-09-08 01:54:10

by Magnus Damm

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

On 9/8/05, Andrew Morton <[email protected]> wrote:
> "Martin J. Bligh" <[email protected]> wrote:
> >
> >
> >
> > --On Wednesday, September 07, 2005 11:27:54 -0700 Dave Hansen <[email protected]> wrote:
> >
> > > On Wed, 2005-09-07 at 11:22 -0700, Martin J. Bligh wrote:
> > >> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
> > >> machines. This is essential in order for the distros to support it - same
> > >> will go for sparsemem.
> > >
> > > That's a different issue. The current code works if you boot a NUMA=y
> > > SPARSEMEM=y machine with a single node. The current Kconfig options
> > > also enforce that SPARSEMEM depends on NUMA on i386.
> > >
> > > Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n. That
> > > requires some Kconfig changes, as well as an extra memory present call.
> > > I'm questioning why we need to do that when we could never do
> > > DISCONTIG=y while NUMA=n on i386.
> >
> > Ah, OK - makes more sense. However, some machines do have large holes
> > in e820 map setups - is not really critical, more of an efficiency
> > thing.
>
> Confused. Does all this mean that we want the patch, or not?

What about if I remove the Kconfig stuff and just keep the "fix" for
the non-NUMA version of setup_memory()?

/ magnus

2005-09-08 06:11:24

by Martin Bligh

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

>> >> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
>> >> machines. This is essential in order for the distros to support it - same
>> >> will go for sparsemem.
>> >
>> > That's a different issue. The current code works if you boot a NUMA=y
>> > SPARSEMEM=y machine with a single node. The current Kconfig options
>> > also enforce that SPARSEMEM depends on NUMA on i386.
>> >
>> > Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n. That
>> > requires some Kconfig changes, as well as an extra memory present call.
>> > I'm questioning why we need to do that when we could never do
>> > DISCONTIG=y while NUMA=n on i386.
>>
>> Ah, OK - makes more sense. However, some machines do have large holes
>> in e820 map setups - is not really critical, more of an efficiency
>> thing.
>
> Confused. Does all this mean that we want the patch, or not?

>From that POV, nothing urgent, and would require more work to make use
of it anyway. Not sure if Magnus had another more immediate use for it?

M.

2005-09-08 06:37:28

by Magnus Damm

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

On Wed, 2005-09-07 at 23:11 -0700, Martin J. Bligh wrote:
> >> >> CONFIG_NUMA was meant to (and did at one point) support both NUMA and flat
> >> >> machines. This is essential in order for the distros to support it - same
> >> >> will go for sparsemem.
> >> >
> >> > That's a different issue. The current code works if you boot a NUMA=y
> >> > SPARSEMEM=y machine with a single node. The current Kconfig options
> >> > also enforce that SPARSEMEM depends on NUMA on i386.
> >> >
> >> > Magnus would like to enable SPARSEMEM=y while CONFIG_NUMA=n. That
> >> > requires some Kconfig changes, as well as an extra memory present call.
> >> > I'm questioning why we need to do that when we could never do
> >> > DISCONTIG=y while NUMA=n on i386.
> >>
> >> Ah, OK - makes more sense. However, some machines do have large holes
> >> in e820 map setups - is not really critical, more of an efficiency
> >> thing.
> >
> > Confused. Does all this mean that we want the patch, or not?
>
> >From that POV, nothing urgent, and would require more work to make use
> of it anyway. Not sure if Magnus had another more immediate use for it?

Just wanted to make sure that both versions of setup_memory() behaved in
a similar way and they both called memory_present(). But nothing urgent,
and no immediate use.

/ magnus

2005-09-08 17:45:08

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH] i386: single node SPARSEMEM fix

On Wed, 2005-09-07 at 16:49 -0700, Andrew Morton wrote:
> "Martin J. Bligh" <[email protected]> wrote:
> > Ah, OK - makes more sense. However, some machines do have large holes
> > in e820 map setups - is not really critical, more of an efficiency
> > thing.
>
> Confused. Does all this mean that we want the patch, or not?

I say we wait on it.

Martin brings up a scenario in which SPARSEMEM is useful without NUMA,
but it Magnus's patch doesn't actually deal with systems like that.
Let's do it right, and base the memory_present() calls off of real data
from the e820 or efi data.

-- Dave