2006-09-28 15:18:48

by keios

Subject: [PATCH] low performance of lib/sort.c , kernel 2.6.18

The current heapsort implementation is non-standard because the index
of the child node is computed incorrectly. The sort function still
produces the right result, but the cost is O(n * (log(n) + 1)), about
10% to 20% worse than the standard algorithm.

Signed-off-by: keios <[email protected]>
-----
diff -Nraup a/lib/sort.c b/lib/sort.c
--- a/lib/sort.c 2006-09-20 11:42:06.000000000 +0800
+++ b/lib/sort.c 2006-09-27 21:26:38.000000000 +0800
@@ -49,15 +49,15 @@ void sort(void *base, size_t num, size_t
 	  void (*swap)(void *, void *, int size))
 {
 	/* pre-scale counters for performance */
-	int i = (num/2) * size, n = num * size, c, r;
+	int i = (num/2 - 1) * size, n = num * size, c, r;

 	if (!swap)
 		swap = (size == 4 ? u32_swap : generic_swap);

 	/* heapify */
 	for ( ; i >= 0; i -= size) {
-		for (r = i; r * 2 < n; r = c) {
-			c = r * 2;
+		for (r = i; r * 2 + size < n; r = c) {
+			c = r * 2 + size;
 			if (c < n - size && cmp(base + c, base + c + size) < 0)
 				c += size;
 			if (cmp(base + r, base + c) >= 0)
@@ -69,8 +69,8 @@ void sort(void *base, size_t num, size_t
 	/* sort */
 	for (i = n - size; i >= 0; i -= size) {
 		swap(base, base + i, size);
-		for (r = 0; r * 2 < i; r = c) {
-			c = r * 2;
+		for (r = 0; r * 2 + size < i; r = c) {
+			c = r * 2 + size;
 			if (c < i - size && cmp(base + c, base + c + size) < 0)
 				c += size;
 			if (cmp(base + r, base + c) >= 0)


2006-09-28 22:35:24

by Matt Mackall

Subject: Re: [PATCH] low performance of lib/sort.c , kernel 2.6.18

On Thu, Sep 28, 2006 at 11:18:45PM +0800, keios wrote:
> The current heapsort implementation is non-standard because the index
> of the child node is computed incorrectly. The sort function still
> produces the right result, but the cost is O(n * (log(n) + 1)), about
> 10% to 20% worse than the standard algorithm.
>
> Signed-off-by: keios <[email protected]>

Was a bit mystified by this as your patch matches what I've got
in my userspace test harness from 2003.

Here's what I submitted, which is almost the same as yours:

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc4/2.6.11-rc4-mm1/broken-out/lib-sort-heapsort-implementation-of-sort.patch

Then Zou Nan hai sent Andrew a fix for an off-by-one bug here (merged
with my patch):

http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm1/broken-out/lib-sort-heapsort-implementation-of-sort.patch

..which introduced the performance regression.

And then I subsequently tweaked my local copy for use in another
project, coming up with your version.

So this passes my test harness just fine (for both even and odd array
sizes).
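
For readers without such a harness, a check of this kind might look
roughly like the sketch below. It is not the harness mentioned above,
which is not shown in this thread. sort_patched() is a userspace copy of
the patched loops; the break/swap lines fall outside the quoted hunks and
are filled in here as an assumption. byte_swap(), cmp_int() and
check_lengths() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Byte-wise swap, standing in for the kernel's generic_swap (assumption). */
static void byte_swap(void *a, void *b, int size)
{
	char *x = a, *y = b, t;

	while (size-- > 0) {
		t = *x;
		*x++ = *y;
		*y++ = t;
	}
}

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Userspace copy of sort() with the patched child indices. */
static void sort_patched(void *base_, size_t num, size_t size,
			 int (*cmp)(const void *, const void *))
{
	char *base = base_;	/* char * keeps the pointer arithmetic standard C */
	int sz = (int)size;
	int n = (int)num * sz;
	int i = ((int)num / 2 - 1) * sz;
	int c, r;

	/* heapify */
	for ( ; i >= 0; i -= sz) {
		for (r = i; r * 2 + sz < n; r = c) {
			c = r * 2 + sz;
			if (c < n - sz && cmp(base + c, base + c + sz) < 0)
				c += sz;
			if (cmp(base + r, base + c) >= 0)
				break;
			byte_swap(base + r, base + c, sz);
		}
	}

	/* sort */
	for (i = n - sz; i >= 0; i -= sz) {
		byte_swap(base, base + i, sz);
		for (r = 0; r * 2 + sz < i; r = c) {
			c = r * 2 + sz;
			if (c < i - sz && cmp(base + c, base + c + sz) < 0)
				c += sz;
			if (cmp(base + r, base + c) >= 0)
				break;
			byte_swap(base + r, base + c, sz);
		}
	}
}

/*
 * Sort every length 0..max_len (so both even and odd sizes) and compare
 * the result with libc qsort(). Returns the number of failing lengths.
 */
static int check_lengths(int max_len, unsigned seed)
{
	int *a = malloc((max_len + 1) * sizeof(*a));
	int *b = malloc((max_len + 1) * sizeof(*b));
	int len, k, failures = 0;

	assert(a != NULL && b != NULL);
	srand(seed);
	for (len = 0; len <= max_len; len++) {
		for (k = 0; k < len; k++)
			a[k] = b[k] = rand() % 1000;
		sort_patched(a, len, sizeof(*a), cmp_int);
		qsort(b, len, sizeof(*b), cmp_int);
		if (memcmp(a, b, len * sizeof(*a)) != 0)
			failures++;
	}
	free(a);
	free(b);
	return failures;
}
```

Calling check_lengths(64, 1) from a small main() and expecting 0 covers
the empty array, the single-element array, and a run of even and odd
lengths. Comparing against qsort() checks the contents as well as the
ordering.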

Acked-by: Matt Mackall <[email protected]>

--
Mathematics is the supreme nostalgia of our time.

2006-09-29 00:23:57

by Zou, Nanhai

Subject: Re: [PATCH] low performance of lib/sort.c , kernel 2.6.18

On Fri, 2006-09-29 at 06:33, Matt Mackall wrote:
> On Thu, Sep 28, 2006 at 11:18:45PM +0800, keios wrote:
> > The current heapsort implementation is non-standard because the index
> > of the child node is computed incorrectly. The sort function still
> > produces the right result, but the cost is O(n * (log(n) + 1)), about
> > 10% to 20% worse than the standard algorithm.
> >
> > Signed-off-by: keios <[email protected]>
>
> Was a bit mystified by this as your patch matches what I've got
> in my userspace test harness from 2003.
>
> Here's what I submitted, which is almost the same as yours:
>
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc4/2.6.11-rc4-mm1/broken-out/lib-sort-heapsort-implementation-of-sort.patch
>
> Then Zou Nan hai sent Andrew a fix for an off-by-one bug here (merged
> with my patch):
>
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm1/broken-out/lib-sort-heapsort-implementation-of-sort.patch
>
> ..which introduced the performance regression.
>
> And then I subsequently tweaked my local copy for use in another
> project, coming up with your version.
>
> So this passes my test harness just fine (for both even and odd array
> sizes).
>
> Acked-by: Matt Mackall <[email protected]>


I think this patch is correct.

Thanks
Zou Nan hai

2006-09-29 01:56:24

by keios

Subject: Re: [PATCH] low performance of lib/sort.c , kernel 2.6.18

Yes, it is almost the same as the first version, with one small
difference: the children of node [r] are [r * 2 + 1] and [r * 2 + 2]
(ignoring the element size), whereas yours are [r * 2] and
[r * 2 + 1].

The tree you build is:

        [0]
         |
        [1]
       /   \
    [2]     [3]
    / \     / \
  [4] [5] [6] [7]

This is not the same as the standard tree:

        [0]
       /   \
    [1]     [2]
    / \     / \
  [3] [4] [5] [6]

The standard algorithm is described here:
http://en.wikipedia.org/wiki/Heapsort

So, in every sift-down operation (in sort.c, this is the inner loop of
the /* sort */ section, after heapify), [0] is compared with [0] and
[1], and is always swapped with [1]. That is where the performance is
lost.
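
To get a rough feel for the cost of the extra level, the two index
schemes can be run side by side while counting comparisons. The sketch
below is only an illustration: it works on plain int arrays and element
indices instead of pre-scaled byte offsets, and heapsort_off(),
count_cmps() and the ncmp counter are hypothetical names, not anything
from lib/sort.c. The break/swap lines fall outside the quoted hunks and
are assumed.

```c
#include <assert.h>
#include <stdlib.h>

static unsigned long ncmp;	/* number of comparisons so far */

static int cmp_int(const int *a, const int *b)
{
	ncmp++;
	return (*a > *b) - (*a < *b);
}

static void swap_int(int *a, int *b)
{
	int t = *a;

	*a = *b;
	*b = t;
}

/*
 * off == 0: children of r are 2r and 2r + 1     (pre-patch layout)
 * off == 1: children of r are 2r + 1 and 2r + 2 (patched layout)
 */
static void heapsort_off(int *a, int n, int off)
{
	int i, r, c;

	/* heapify */
	for (i = n / 2 - off; i >= 0; i--) {
		for (r = i; r * 2 + off < n; r = c) {
			c = r * 2 + off;
			if (c < n - 1 && cmp_int(&a[c], &a[c + 1]) < 0)
				c++;
			if (cmp_int(&a[r], &a[c]) >= 0)
				break;
			swap_int(&a[r], &a[c]);
		}
	}

	/* sort */
	for (i = n - 1; i >= 0; i--) {
		swap_int(&a[0], &a[i]);
		for (r = 0; r * 2 + off < i; r = c) {
			c = r * 2 + off;
			if (c < i - 1 && cmp_int(&a[c], &a[c + 1]) < 0)
				c++;
			if (cmp_int(&a[r], &a[c]) >= 0)
				break;
			swap_int(&a[r], &a[c]);
		}
	}
}

/* Sort n pseudo-random ints with the given layout; return comparisons used. */
static unsigned long count_cmps(int n, int off, unsigned seed)
{
	int *a = malloc((n ? n : 1) * sizeof(*a));
	int k;

	assert(a != NULL);
	srand(seed);
	for (k = 0; k < n; k++)
		a[k] = rand();
	ncmp = 0;
	heapsort_off(a, n, off);
	for (k = 1; k < n; k++)
		assert(a[k - 1] <= a[k]);	/* both layouts must still sort */
	free(a);
	return ncmp;
}
```

Printing count_cmps(n, 0, seed) next to count_cmps(n, 1, seed) for the
same seed shows the gap for a given n. The exact percentage depends on n
and on the input, so this is not meant to reproduce the 10% to 20%
figure above.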

Acked-by: keios <[email protected]>
