2005-11-11 04:17:49

by Andi Kleen

Subject: Re: [PATCH 01/05] NUMA: Generic code

On Thursday 10 November 2005 10:08, Magnus Damm wrote:
> Generic CONFIG_NUMA_EMU code.
>
> This patch adds generic NUMA emulation code to the kernel. The code
> provides the architectures with functions that calculate the size of
> emulated nodes, together with configuration stuff such as Kconfig and
> kernel command line code.

IMHO making it generic and bloated like this is total overkill
for this simple debugging hack. I think it is better to keep
it simple and hide it in an architecture-specific dark corner, not expose it
like this.

I think the patch shouldn't be applied.

-Andi


2005-11-15 08:34:18

by Magnus Damm

Subject: Re: [PATCH 01/05] NUMA: Generic code

On 11/11/05, Andi Kleen wrote:
> On Thursday 10 November 2005 10:08, Magnus Damm wrote:
> > Generic CONFIG_NUMA_EMU code.
> >
> > This patch adds generic NUMA emulation code to the kernel. The code
> > provides the architectures with functions that calculate the size of
> > emulated nodes, together with configuration stuff such as Kconfig and
> > kernel command line code.
>
> IMHO making it generic and bloated like this is total overkill
> for this simple debugging hack. I think it is better to keep
> it simple and hide it in an architecture-specific dark corner, not expose it
> like this.

My plan with breaking out the NUMA emulation code was to merge my i386
stuff with the x86_64 code, but as you say - it might be overkill.

What do you think about the fact that real NUMA nodes now can be
divided into several smaller nodes?

/ magnus

2005-11-15 14:48:30

by Andi Kleen

Subject: Re: [PATCH 01/05] NUMA: Generic code

On Tuesday 15 November 2005 09:34, Magnus Damm wrote:

>
> My plan with breaking out the NUMA emulation code was to merge my i386
> stuff with the x86_64 code, but as you say - it might be overkill.
>
> What do you think about the fact that real NUMA nodes now can be
> divided into several smaller nodes?

Is it really needed? I never needed it. Normally NUMA emulation
is just for basic NUMA testing, and for that just an independent
split is good enough.

-Andi

2005-11-16 05:22:33

by Magnus Damm

Subject: Re: [PATCH 01/05] NUMA: Generic code

On 11/15/05, Andi Kleen wrote:
> On Tuesday 15 November 2005 09:34, Magnus Damm wrote:
>
> >
> > My plan with breaking out the NUMA emulation code was to merge my i386
> > stuff with the x86_64 code, but as you say - it might be overkill.
> >
> > What do you think about the fact that real NUMA nodes now can be
> > divided into several smaller nodes?
>
> Is it really needed? I never needed it. Normally NUMA emulation
> is just for basic NUMA testing, and for that just an independent
> split is good enough.

For testing, your NUMA emulation code is perfect IMO. But for memory
resource control your NUMA emulation code may be too simple.

With my patch, CONFIG_NUMA_EMU provides a way to partition a machine
into several smaller nodes, regardless of whether the machine is using
NUMA or not.

This NUMA emulation code together with CPUSETS could be seen as a
simple alternative to the memory resource control provided by CKRM.

/ magnus
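
To illustrate the kind of partitioning described above, here is a minimal
sketch of splitting one physical memory range (for example, the range of a
single real node) into several equally sized emulated nodes. It is not the
code from the patch: the names emu_node and emu_split_range, the alignment
parameter, and the choice to let the last node absorb the remainder are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one emulated node covering [start, end). */
struct emu_node {
    uint64_t start;
    uint64_t end;
};

/*
 * Split the range [start, end) into `count` emulated nodes.
 * Each node size is rounded down to a multiple of `align`; the last
 * node absorbs whatever is left over, so the nodes always cover the
 * whole range without gaps.
 *
 * Returns the number of nodes written to `nodes`, or 0 if the input is
 * invalid or the range is too small to give every node at least
 * `align` bytes.
 */
static size_t emu_split_range(uint64_t start, uint64_t end, size_t count,
                              uint64_t align, struct emu_node *nodes)
{
    if (nodes == NULL || count == 0 || align == 0 || end <= start)
        return 0;

    uint64_t size = (end - start) / count;
    size -= size % align;              /* round down to the alignment */
    if (size == 0)
        return 0;

    for (size_t i = 0; i < count; i++) {
        nodes[i].start = start + (uint64_t)i * size;
        nodes[i].end = (i == count - 1) ? end : nodes[i].start + size;
    }

    /* Invariant: the last emulated node ends where the range ends. */
    assert(nodes[count - 1].end == end);
    return count;
}
```

Calling this once per real node, rather than once for all of memory, would
give emulated nodes that never straddle a real node boundary, which is the
case asked about earlier in the thread.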

2005-11-16 07:48:57

by Andi Kleen

Subject: Re: [PATCH 01/05] NUMA: Generic code

Magnus Damm writes:
>
> For testing, your NUMA emulation code is perfect IMO. But for memory
> resource control your NUMA emulation code may be too simple.
>
> With my patch, CONFIG_NUMA_EMU provides a way to partition a machine
> into several smaller nodes, regardless of whether the machine is using
> NUMA or not.
>
> This NUMA emulation code together with CPUSETS could be seen as a
> simple alternative to the memory resource control provided by CKRM.

I believe Werner tried to use it at some point for that and it just
didn't work very well. So it doesn't seem to be very useful for
that use case.

-Andi

2005-11-16 07:58:01

by Magnus Damm

Subject: Re: [PATCH 01/05] NUMA: Generic code

On 16 Nov 2005 08:48:39 +0100, Andi Kleen wrote:
> Magnus Damm writes:
> >
> > For testing, your NUMA emulation code is perfect IMO. But for memory
> > resource control your NUMA emulation code may be too simple.
> >
> > With my patch, CONFIG_NUMA_EMU provides a way to partition a machine
> > into several smaller nodes, regardless of whether the machine is using
> > NUMA or not.
> >
> > This NUMA emulation code together with CPUSETS could be seen as a
> > simple alternative to the memory resource control provided by CKRM.
>
> I believe Werner tried to use it at some point for that and it just
> didn't work very well. So it doesn't seem to be very useful for
> that use case.

Sorry, but which one did not work very well? CKRM memory controller or
NUMA emulation + CPUSETS?

Thanks,

/ magnus

2005-11-16 08:36:55

by Andi Kleen

Subject: Re: [PATCH 01/05] NUMA: Generic code

On Wednesday 16 November 2005 08:57, Magnus Damm wrote:

>
> Sorry, but which one did not work very well? CKRM memory controller or
> NUMA emulation + CPUSETS?

Using simulated nodes for controlling memory.

-Andi

2005-11-16 11:32:31

by Werner Almesberger

Subject: Re: [PATCH 01/05] NUMA: Generic code

Magnus Damm wrote:
> Sorry, but which one did not work very well? CKRM memory controller or
> NUMA emulation + CPUSETS?

We tried to partition our memory using the NUMA emulation, such that
timing-critical processes would allocate from one node, while all
the rest of the system would allocate from the other node.

The idea was that the timing-critical processes, with a fairly
"calm" allocation behaviour (read file data into the page cache,
then evict it again), would never or almost never trigger memory
reclaim this way, and thus have better worst-case latency.

Unfortunately, our benchmarks didn't show any improvements in
latency. In fact, the results were slightly worse, perhaps because
of processes on the "regular" node holding shared resources while
in memory reclaim.

I'm not entirely sure why this didn't work better. At least in
theory, it should have.

We did this in the ABISS project, about one year ago in response
to quite nasty reclaim latency suddenly appearing in an earlier
2.6 kernel. We asked various MM developers, but none of them
was aware of any change that would make reclaims all of a sudden
very intrusive, and they attributed it to the "butterfly effect".
After a while (i.e., in later kernels), the butterflies must have
chosen a different victim, and the latency got better on its own.

So, in the end, we didn't need that NUMA hack to control reclaims.
But if they should rear their ugly heads again, it may be worth
having a second look.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/