2003-01-28 03:26:51

by Jason Papadopoulos

Subject: [PATCH] page coloring for 2.5.59 kernel, version 1


This is yet another holding action, a port of my page coloring patch
to the 2.5 kernel. This is a minimal port (x86 only) intended to get
some testing done; once again the algorithm used is the same as in
previous patches. There are several cleanups and removed 2.4-isms that
make the code somewhat more compact, though.

I'll be experimenting with other coloring schemes later this week.

http://www.boo.net/~jasonp/page_color-2.5.59-20030127.patch

Feedback of any sort welcome.

jasonp


2003-01-28 03:49:36

by William Lee Irwin III

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

On Mon, Jan 27, 2003 at 10:47:26PM -0500, Jason Papadopoulos wrote:
> This is yet another holding action, a port of my page coloring patch
> to the 2.5 kernel. This is a minimal port (x86 only) intended to get
> some testing done; once again the algorithm used is the same as in
> previous patches. There are several cleanups and removed 2.4-isms that
> make the code somewhat more compact, though.
> I'll be experimenting with other coloring schemes later this week.
> http://www.boo.net/~jasonp/page_color-2.5.59-20030127.patch
> Feedback of any sort welcome.

set_num_colors() needs to go downstairs under arch/. Some of the
current->pid checks look a bit odd esp. for GFP_ATOMIC and/or
in_interrupt() cases. I'm not sure why this is a config option; it
should be mandatory. I also wonder about the interaction of this with
the per-cpu lists. This may really want to be something like a matrix
with (cpu, color) indices to find the right list; trouble is, there's a
high potential for many pages to be trapped there. mapnr's (page -
zone->zone_mem_map etc.) are being used for pfn's; this may raise
issues if zones' required alignments aren't num_colors*PAGE_SIZE or
larger. proc_misc.c can be used instead of page_color_init(). ->free_list
can be removed. get_rand() needs locking, per-zone state. Useful stuff.
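
To illustrate the pfn and (cpu, color) points above, here is a minimal userspace sketch, not code from the patch. NUM_COLORS, NR_CPUS_SKETCH, struct page_stub and all function names are assumptions made up for the example. It shows that a color taken from a zone-relative index agrees with the color of the physical pfn only when the zone starts on a multiple of the color count, and how a free list could be looked up by (cpu, color).

```c
#include <stddef.h>

#define NUM_COLORS     128  /* hypothetical; the real value is cache dependent */
#define NR_CPUS_SKETCH 4    /* hypothetical CPU count for the sketch */

/* Stand-in for struct page; only a list link is needed here. */
struct page_stub {
    struct page_stub *next;
};

/* One free-list head per (cpu, color) pair. */
static struct page_stub *free_lists[NR_CPUS_SKETCH][NUM_COLORS];

/* Color derived from the physical page frame number. */
static inline unsigned long pfn_color(unsigned long pfn)
{
    return pfn % NUM_COLORS;
}

/* Color derived from a zone-relative index (page - zone_mem_map).
 * Matches pfn_color() only if the zone's first pfn is a multiple
 * of NUM_COLORS. */
static inline unsigned long mapnr_color(unsigned long mapnr)
{
    return mapnr % NUM_COLORS;
}

/* Find the list head for a given cpu and physical pfn. */
static inline struct page_stub **color_list(unsigned int cpu, unsigned long pfn)
{
    return &free_lists[cpu % NR_CPUS_SKETCH][pfn_color(pfn)];
}
```

With a zone starting at pfn 4096 the two color functions agree; with a zone starting at pfn 4100 they differ for the same page, which is the alignment concern. The sketch also makes the trapping risk visible: every (cpu, color) slot can hold pages that no other cpu or color request will take.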

--wli

2003-01-28 04:03:35

by William Lee Irwin III

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

On Mon, Jan 27, 2003 at 07:57:36PM -0800, William Lee Irwin III wrote:
> set_num_colors() needs to go downstairs under arch/. Some of the
> current->pid checks look a bit odd esp. for GFP_ATOMIC and/or
> in_interrupt() cases. I'm not sure why this is a config option; it
> should be mandatory. I also wonder about the interaction of this with
> the per-cpu lists. This may really want to be something like a matrix
> with (cpu, color) indices to find the right list; trouble is, there's a
> high potential for many pages to be trapped there. mapnr's (page -
> zone->zone_mem_map etc.) are being used for pfn's; this may raise
> issues if zones' required alignments aren't num_colors*PAGE_SIZE or
> larger. proc_misc.c can be used instead of page_color_init(). ->free_list
> can be removed. get_rand() needs locking, per-zone state. Useful stuff.

Hmm, actually the mapnr's as physical pfn's are broken with
MAP_NR_DENSE(), though existing boxen probably luck out. The RNG uses
an integer multiply which may be slow on various cpus, and I wouldn't
mind either a stronger or better documented RNG algorithm. ->color_init
is basically a bitflag, and ->target_color has a very limited range.
sizeof(task_t) needs to be small, could you fold that stuff into
->flags or ->thread_info?
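
One possible shape for that folding, as a hedged sketch only: a single 32-bit word holds the "initialized" bit and the target color. The bit layout, the mask width and every name below are assumptions for illustration; the patch's actual fields and the layout of ->flags are not shown in this thread.

```c
#include <stdint.h>

#define COLOR_INIT_BIT 0x1u     /* hypothetical: bit 0 = color state initialized */
#define COLOR_SHIFT    1        /* hypothetical: target color starts at bit 1 */
#define COLOR_MASK     0xFFFFu  /* hypothetical: room for up to 65536 colors */

/* Pack the init flag and the target color into one word. */
static inline uint32_t color_pack(int initialized, uint32_t target_color)
{
    uint32_t w = (target_color & COLOR_MASK) << COLOR_SHIFT;

    if (initialized)
        w |= COLOR_INIT_BIT;
    return w;
}

/* Has the per-task color state been set up? */
static inline int color_is_init(uint32_t w)
{
    return (w & COLOR_INIT_BIT) != 0;
}

/* Extract the target color. */
static inline uint32_t color_target(uint32_t w)
{
    return (w >> COLOR_SHIFT) & COLOR_MASK;
}
```

A word that is all zeroes then reads as "not initialized, color 0", so a freshly zeroed task structure needs no extra setup.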

-- wli

2003-01-28 06:49:42

by Martin J. Bligh

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

> This is yet another holding action, a port of my page coloring patch
> to the 2.5 kernel. This is a minimal port (x86 only) intended to get
> some testing done; once again the algorithm used is the same as in
> previous patches. There are several cleanups and removed 2.4-isms that
> make the code somewhat more compact, though.
>
> I'll be experimenting with other coloring schemes later this week.
>
> http://www.boo.net/~jasonp/page_color-2.5.59-20030127.patch
>
> Feedback of any sort welcome.

I took a 16-way NUMA-Q (700MHz P3 Xeon's w/2MB L2 cache) and ran some
cpu-intensive benchmarks (kernel compile on warm cache with -j32 and
-j 256, SDET 1 - 128 users, and numaschedbench with 1 to 64 processes,
which is a memory thrasher to test node affinity of memory operations),
and compared to virgin 2.5.59 - no measurable difference on any test.
Sorry,

M.

2003-01-28 07:05:13

by William Lee Irwin III

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

At some point in the past, Jason P. wrote:
>> This is yet another holding action, a port of my page coloring patch
>> to the 2.5 kernel. This is a minimal port (x86 only) intended to get
>> some testing done; once again the algorithm used is the same as in
>> previous patches. There are several cleanups and removed 2.4-isms that
>> make the code somewhat more compact, though.
>> I'll be experimenting with other coloring schemes later this week.
>> http://www.boo.net/~jasonp/page_color-2.5.59-20030127.patch
>> Feedback of any sort welcome.

On Mon, Jan 27, 2003 at 10:58:53PM -0800, Martin J. Bligh wrote:
> I took a 16-way NUMA-Q (700MHz P3 Xeon's w/2MB L2 cache) and ran some
> cpu-intensive benchmarks (kernel compile on warm cache with -j32 and
> -j 256, SDET 1 - 128 users, and numaschedbench with 1 to 64 processes,
> which is a memory thrasher to test node affinity of memory operations),
> and compared to virgin 2.5.59 - no measurable difference on any test.

I think this one really needs to be done with the userspace cache
thrashing microbenchmarks. I also have rather serious reservations
about the interaction of the qlists with the per-cpu lists.


-- wli

2003-01-28 15:57:40

by Martin J. Bligh

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

> I think this one really needs to be done with the userspace cache
> thrashing microbenchmarks.

If a benefit cannot be shown on some sort of semi-realistic workload,
it's probably not worth it, IMHO.

M.

2003-01-28 16:28:06

by Andi Kleen

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

"Martin J. Bligh" writes:

> > I think this one really needs to be done with the userspace cache
> > thrashing microbenchmarks.
>
> If a benefit cannot be shown on some sort of semi-realistic workload,
> it's probably not worth it, IMHO.

The main advantage of cache coloring normally is that benchmarks
should get stable results. Without it a benchmark result can vary based on
random memory allocation patterns.

Just having stable benchmarks may be worth it.

I suspect the benefit will vary a lot based on the CPU. Your caches may
have good enough associativity. On other CPUs it may make much more difference.
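
The link between associativity and coloring can be made concrete with the usual relation for a physically indexed cache: the number of page colors is the size of one cache way divided by the page size. The helper below is a minimal sketch under that assumption; the function name and the example inputs are illustrative and are not taken from the patch.

```c
/* Number of page colors for a physically indexed cache:
 * (cache size / associativity) / page size, with a floor of 1. */
static inline unsigned long num_page_colors(unsigned long cache_bytes,
                                            unsigned long ways,
                                            unsigned long page_bytes)
{
    unsigned long way_bytes;
    unsigned long colors;

    if (ways == 0 || page_bytes == 0)
        return 1;                       /* defensive default for bad input */

    way_bytes = cache_bytes / ways;     /* bytes covered by a single way */
    colors = way_bytes / page_bytes;    /* pages that fit in one way */
    return colors ? colors : 1;
}
```

For example, a 2MB 4-way cache with 4KB pages gives 128 colors, while a hypothetical 2MB direct-mapped cache with 8KB pages gives 256. Higher associativity lowers the color count and lets the hardware absorb more conflicts on its own, which is one reason the benefit can differ between CPUs.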

-Andi

2003-01-28 16:33:47

by Falk Hueffner

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

"Martin J. Bligh" writes:

> > I think this one really needs to be done with the userspace cache
> > thrashing microbenchmarks.
>
> If a benefit cannot be shown on some sort of semi-realistic workload,
> it's probably not worth it, IMHO.

I tested an earlier version on Alpha. While it didn't yield noticeable
performance benefits, it increased the reproducibility of my benchmark
a lot, which is also pretty useful.

--
Falk

2003-01-28 16:40:24

by Martin J. Bligh

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

> The main advantage of cache coloring normally is that benchmarks
> should get stable results. Without it a benchmark result can vary based on
> random memory allocation patterns.
>
> Just having stable benchmarks may be worth it.

OK, I'll try to hack the scripts to measure standard deviation between runs
as well.
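
A rough sketch of what such a measurement could compute, assuming the per-run times have already been collected into an array: the mean and the sample variance across runs. The standard deviation is the square root of the variance; the square root is left out here only to keep the example free of a libm dependency. The function names are made up for the example and have nothing to do with the actual benchmark scripts.

```c
#include <stddef.h>

/* Arithmetic mean of n run times; 0.0 for an empty set. */
static inline double run_mean(const double *x, size_t n)
{
    double sum = 0.0;
    size_t i;

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Sample variance (divides by n - 1); 0.0 when fewer than two runs.
 * The run-to-run standard deviation is the square root of this. */
static inline double run_variance(const double *x, size_t n)
{
    double mean, acc = 0.0;
    size_t i;

    if (n < 2)
        return 0.0;
    mean = run_mean(x, n);
    for (i = 0; i < n; i++)
        acc += (x[i] - mean) * (x[i] - mean);
    return acc / (double)(n - 1);
}
```

Comparing this figure for patched and unpatched kernels over the same number of runs would show whether coloring narrows the spread even when the means are identical.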

> I suspect the benefit will vary a lot based on the CPU. Your caches may
> have good enough associativity. On other CPUs it may make much more difference.

IIRC, P3's are 4 way associative ... people had been saying that this would
make more of a difference on machines with larger caches, which is why I ran
it ... 2MB is fairly big for ia32.

M.

2003-01-28 16:59:41

by Bill Davidsen

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

On 28 Jan 2003, Andi Kleen wrote:

> The main advantage of cache coloring normally is that benchmarks
> should get stable results. Without it a benchmark result can vary based on
> random memory allocation patterns.
>
> Just having stable benchmarks may be worth it.

I have noted in ctxbench that the SMP results have a vast performance
range while the uni (and nosmp) don't. Not clear if this would improve
that, but I sure would like to try.

--
bill davidsen
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2003-01-28 17:15:13

by Valdis Klētnieks

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

On Tue, 28 Jan 2003 12:06:11 EST, Bill Davidsen said:

> I have noted in ctxbench that the SMP results have a vast performance
> range while the uni (and nosmp) don't. Not clear if this would improve
> that, but I sure would like to try.

Another thing to check is whether in the SMP case, there's a race condition
with optimal/pessimal grabbing of locks, etc.



2003-01-28 17:40:40

by Jason Papadopoulos

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1


> If a benefit cannot be shown on some sort of semi-realistic workload,
> it's probably not worth it, IMHO.

With the present state of the patch my own limited tests don't uncover any
speedups at all on my x86 test machine. For the Alpha with 2MB cache (and the
2.4 patch) there are measurable speedups; number-crunching benchmarks show it
the most.

jasonp

---------------------------------------------
This message was sent using Endymion MailMan.
http://www.endymion.com/products/mailman/


2003-01-28 17:52:34

by Jason Papadopoulos

Subject: Re: [PATCH] page coloring for 2.5.59 kernel, version 1

>
> set_num_colors() needs to go downstairs under arch/. Some of the
> current->pid checks look a bit odd esp. for GFP_ATOMIC and/or
> in_interrupt() cases. I'm not sure why this is a config option; it
> should be mandatory. I also wonder about the interaction of this with
> the per-cpu lists. This may really want to be something like a matrix
> with (cpu, color) indices to find the right list; trouble is, there's a
> high potential for many pages to be trapped there. mapnr's (page -
> zone->zone_mem_map etc.) are being used for pfn's; this may raise
> issues if zones' required alignments aren't num_colors*PAGE_SIZE or
> larger. proc_misc.c can be used instead of page_color_init(). ->free_list
> can be removed. get_rand() needs locking, per-zone state. Useful stuff.

The current->pid tests date back to the 2.2 kernel patch, to get around
a bug where reusing an old task_struct didn't reinitialize the counter.
I'd much rather initialize the counter properly when a process starts, but
am not smart enough to track down all the places in the kernel where it
happens (kernel/fork.c only seems to account for half the pids on my system,
whereas in 2.4 virtually every process went through fork.c)

I originally had a much better RNG in place of the present one, but
at least one person didn't like explicit long-long calculations. Rather
than locking, what about the (admittedly much slower) nondeterministic RNG
interface? Also, the new __rmqueue is probably sufficiently slower than
the original (especially when accounting for non-power-of-two cache sizes)
that the latency for random numbers may not matter much.
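
For reference, a middle ground between the two options could look like the sketch below: a 32-bit linear congruential generator with no long-long arithmetic, whose state lives in a per-zone structure so that whatever lock already protects the zone's free lists also covers the generator. This is not the generator in the patch. The struct and function names are hypothetical, and the multiplier and increment are the commonly published Numerical Recipes LCG constants.

```c
#include <stdint.h>

/* Hypothetical per-zone RNG state; assumed to be protected by the
 * same lock the caller already holds for the zone's free lists. */
struct zone_rng {
    uint32_t state;
};

/* Advance the LCG using 32-bit arithmetic only (wraps modulo 2^32). */
static inline uint32_t zone_rng_next(struct zone_rng *r)
{
    r->state = r->state * 1664525u + 1013904223u;
    return r->state;
}

/* Pick a color in [0, num_colors). The high bits are used because the
 * low bits of an LCG have short periods. The modulo also handles color
 * counts that are not a power of two, at the cost of a slight bias. */
static inline uint32_t zone_rng_color(struct zone_rng *r, uint32_t num_colors)
{
    if (num_colors == 0)
        return 0;
    return (zone_rng_next(r) >> 16) % num_colors;
}
```

This still performs one integer multiply per call, so it does not address the concern about slow multipliers; it only shows how per-zone state removes the need for a separate lock.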

Not sure how to handle pfn's properly in light of your observation, though.
What do you suggest? Likewise, I'll have to look at this per-cpu thing, older
patches didn't need to care about it.

Thanks to everyone for their feedback; I'll keep at it.

jasonp
