LinuxLists.cc - Bug in kernel code?

2002-08-28 00:21:07

by Stephen Biggs

[permalink] [raw]

Subject: Bug in kernel code?

I posted this to the newsgroup news:linux-kernel but got no response...
I'll try here.

While going thru the kernel code for 2.4.17 (uClinux patch) I noticed a
problem with functions in 3 files:
dcache_init in fs/dcache.c, inode_init in fs/inode.c and in mnt_init in
fs/namespace.c

At least on my Redhat 7.3 2.4.18-10 source, in the function

static void __init dcache_init(unsigned long mempages)
{
struct list_head *d;
unsigned long order;
unsigned int nr_hash;
int i;

/*
* A constructor could be added for stable state like the lists,
* but it is probably not worth it because of the cache nature
* of the dcache.
* If fragmentation is too bad then the SLAB_HWCACHE_ALIGN
* flag could be removed here, to hint to the allocator that
* it should not try to get multiple page regions.
*/
dentry_cache = kmem_cache_create("dentry_cache",
sizeof(struct dentry),
0,
SLAB_HWCACHE_ALIGN,
NULL, NULL);
if (!dentry_cache)
panic("Cannot create dentry cache");

#if PAGE_SHIFT < 13
mempages >>= (13 - PAGE_SHIFT);
#endif
mempages *= sizeof(struct list_head);
for (order = 0; ((1UL << order) << PAGE_SHIFT) < mempages; order++)
;

do {
u32 tmp;

nr_hash = (1UL << order) * PAGE_SIZE /
sizeof(struct list_head);
d_hash_mask = (nr_hash - 1);

tmp = nr_hash;
d_hash_shift = 0;
while ((tmp >>= 1UL) != 0UL)
d_hash_shift++;

dentry_hashtable = (struct list_head *)
__get_free_pages(GFP_ATOMIC, order);
} while (dentry_hashtable == NULL && --order >= 0);

........... Notice the --order >=0 in the do while test... since order
is declared as an unsigned long, this test is a pointless comparison and
never fails, causing an infinite loop if the hashtable is empty.

My fix to this is to change the test to have
while (dentry_hashtable == NULL && order-- != 0);

Could someone check me on this?

2002-08-28 00:28:56

by Stephen Biggs

[permalink] [raw]

Subject: Re: Bug in kernel code?

On 27 Aug 2002 at 17:23, David S. Miller wrote:

> From: "Stephen C. Biggs" <[email protected]>
> Date: Tue, 27 Aug 2002 17:24:19 -0700
>
> ........... Notice the --order >=0 in the do while test... since order is declared as an unsigned
> long, this test is a pointless comparison and never fails, causing an infinite loop if the
> hashtable is empty.
>
> My fix to this is to change the test to have
> while (dentry_hashtable == NULL && order-- != 0);
>
> Could someone check me on this?
>
> Your analysis is right but the fix is wrong, we do want to
> try order == 0 on very small memory systems, and you should
> not decrement the order at the beginning of the loop. We
> want to use the "order" calculated.
>
> Just make 'order' signed long to fix this bug.
>

NO! That won't work either. This is a "do while" loop so the first test is always done and if
order is 0, the check will be done AFTER the decrement, so this works. Changing it to a signed
long loses you a bit.

2002-08-28 00:30:47

by David Miller

[permalink] [raw]

Subject: Re: Bug in kernel code?

From: "Stephen C. Biggs" <[email protected]>
Date: Tue, 27 Aug 2002 17:32:58 -0700

This is a "do while" loop so the first test is always done

No, a "do while" loop always executes the first iteration.
What version of the C language do you think we are using?

2002-08-28 00:36:47

by Stephen Biggs

[permalink] [raw]

Subject: Re: Bug in kernel code?

On 27 Aug 2002 at 17:29, David S. Miller wrote:

> From: "Stephen C. Biggs" <[email protected]>
> Date: Tue, 27 Aug 2002 17:32:58 -0700
>
> This is a "do while" loop so the first test is always done
>
> No, a "do while" loop always executes the first iteration.
> What version of the C language do you think we are using?
>

You're misunderstanding me, I meant that the first test is done AFTER the first iteration is
executed, so my fix is correct since, even if order is 0 because at least one iteration of the loop
is done, and the post decrement makes sure that the test succeeds if order was 0 going into the
loop.

Changing order to a signed long loses half of the cache entries due to the number being divided by
2.

2002-08-28 00:44:04

by David Miller

[permalink] [raw]

Subject: Re: Bug in kernel code?

From: "Stephen C. Biggs" <[email protected]>
Date: Tue, 27 Aug 2002 17:40:41 -0700

You're misunderstanding me, I meant that the first test is done
AFTER the first iteration is executed, so my fix is correct
since, even if order is 0 because at least one iteration of the
loop is done, and the post decrement makes sure that the test
succeeds if order was 0 going into the loop.

Order is always >= 0 when we enter the loop.

If we actually get the table allocated then the decrement of 'order'
is not executed if we allocate the table successfully.

I don't understand what the problem is with my fix.

2002-08-28 00:50:44

by Stephen Biggs

[permalink] [raw]

Subject: Re: Bug in kernel code?

2002-08-28 01:05:45

by Stephen Biggs

[permalink] [raw]

Subject: Re: Bug in kernel code?

On 27 Aug 2002 at 17:42, David S. Miller wrote:

> From: "Stephen C. Biggs" <[email protected]>
> Date: Tue, 27 Aug 2002 17:40:41 -0700
>
> You're misunderstanding me, I meant that the first test is done
> AFTER the first iteration is executed, so my fix is correct
> since, even if order is 0 because at least one iteration of the
> loop is done, and the post decrement makes sure that the test
> succeeds if order was 0 going into the loop.
>
> Order is always >= 0 when we enter the loop.
>
> If we actually get the table allocated then the decrement of 'order'
> is not executed if we allocate the table successfully.
>
> I don't understand what the problem is with my fix.
>
[Trying again - my mail client is messing up/sending empty posts]

As I look at it, it can only be 0 <= order < 32, so your fix works, as mine does. Yours is simpler
and doesn't involve changing code, just declarations.

There need to be some sanity checks in this code: what if mempages is passed as some insanely huge
number, e.g.

2002-08-28 01:11:12

by David Miller

[permalink] [raw]

Subject: Re: Bug in kernel code?

From: "Stephen C. Biggs" <[email protected]>
Date: Tue, 27 Aug 2002 18:09:46 -0700

There need to be some sanity checks in this code: what if mempages is passed as some insanely huge
number, e.g.

And then your mail ends.... let us know when you've fixed
your email client, this isn't rocket science :-)

2002-08-28 01:20:11

by Stephen Biggs

[permalink] [raw]

Subject: Re: Bug in kernel code?

On 27 Aug 2002 at 18:09, David S. Miller wrote:

> From: "Stephen C. Biggs" <[email protected]>
> Date: Tue, 27 Aug 2002 18:09:46 -0700
>
> There need to be some sanity checks in this code: what if mempages is passed as some insanely huge
> number, e.g.
>
> And then your mail ends.... let us know when you've fixed
> your email client, this isn't rocket science :-)
>

What are you talking about "And then your mail ends..." That's all I wanted to say...

Oh, you want me to post a complete patch.... I get it.

2002-08-28 01:24:32

by David Miller

[permalink] [raw]

Subject: Re: Bug in kernel code?

From: "Stephen C. Biggs" <[email protected]>
Date: Tue, 27 Aug 2002 18:24:13 -0700

On 27 Aug 2002 at 18:09, David S. Miller wrote:

> And then your mail ends.... let us know when you've fixed
> your email client, this isn't rocket science :-)
>

What are you talking about "And then your mail ends..." That's all I wanted to say...

No, I thought you were going to give an example of a huge number.
:-)

2002-08-28 02:38:41

by Stephen Biggs

[permalink] [raw]

Subject: Re: Bug in kernel code?

On 27 Aug 2002 at 18:23, David S. Miller wrote:

> From: "Stephen C. Biggs" <[email protected]>
> Date: Tue, 27 Aug 2002 18:24:13 -0700
>
> On 27 Aug 2002 at 18:09, David S. Miller wrote:
>
> > And then your mail ends.... let us know when you've fixed
> > your email client, this isn't rocket science :-)
> >
>
> What are you talking about "And then your mail ends..." That's all I wanted to say...
>
> No, I thought you were going to give an example of a huge number.
> :-)
>

How about (unsigned long)(~0)?

This code never finishes:

#include <stdio.h>

#define PAGE_SHIFT 12

struct list_head {
struct list_head *next, *prev;
};

int main(void)
{
unsigned long order;
unsigned long mempages = ~0;

#if PAGE_SHIFT < 13
mempages >>= (13 - PAGE_SHIFT);
#endif
mempages *= sizeof(struct list_head);

for (order = 0; ((1UL << order) << PAGE_SHIFT) < mempages; order++)
;
printf("%d\n",order);
return 0;
}

2002-08-28 02:45:27

by Stephen Biggs

[permalink] [raw]

Subject: Re[2]: Bug in kernel code?

On 27 Aug 2002 at 18:23, David S. Miller wrote:

> From: "Stephen C. Biggs" <[email protected]>
> Date: Tue, 27 Aug 2002 18:24:13 -0700
>
> On 27 Aug 2002 at 18:09, David S. Miller wrote:
>
> > And then your mail ends.... let us know when you've fixed
> > your email client, this isn't rocket science :-)
> >
>
> What are you talking about "And then your mail ends..." That's all I wanted to say...
>
> No, I thought you were going to give an example of a huge number.
> :-)
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

Argh. I meant to post THIS code:

#include <stdio.h>

#define PAGE_SHIFT 12

struct list_head {
struct list_head *next, *prev;
};

int main(void)
{
unsigned long order;
unsigned long mempages = ~0;

#if PAGE_SHIFT < 13
mempages >>= (13 - PAGE_SHIFT);
#endif
mempages *= sizeof(struct list_head);

printf("mempages: %lu\n", mempages);

for (order = 0; ((1UL << order) << PAGE_SHIFT) < mempages; order++)
;
printf("%lu\n",order);
return 0;
}

2002-08-28 03:41:07

by David Miller

[permalink] [raw]

Subject: Re: Bug in kernel code?

From: "Stephen C. Biggs" <[email protected]>
Date: Tue, 27 Aug 2002 19:42:36 -0700

How about (unsigned long)(~0)?

Realistically possible with any known configuration?

2002-08-28 03:53:49

by Luca Barbieri

[permalink] [raw]

Subject: Re: Bug in kernel code?

On Wed, 2002-08-28 at 05:39, David S. Miller wrote:
> From: "Stephen C. Biggs" <[email protected]>
> Date: Tue, 27 Aug 2002 19:42:36 -0700
>
> How about (unsigned long)(~0)?
>
> Realistically possible with any known configuration?
How about replacing the loop with ffs/__ffs that would be more elegant,
shorter and avoid any problem or doubt?

Attachments:

signature.asc (189.00 B)
This is a digitally signed message part

2002-08-28 03:59:52

by David Miller

[permalink] [raw]

Subject: Re: Bug in kernel code?

From: Luca Barbieri <[email protected]>
Date: 28 Aug 2002 05:57:50 +0200

On Wed, 2002-08-28 at 05:39, David S. Miller wrote:
> Realistically possible with any known configuration?

How about replacing the loop with ffs/__ffs that would be more
elegant, shorter and avoid any problem or doubt?

This is getting rediculious. You can pursue this further
if you want, but I think we've wasted enough time with
this non-bug.

Who is getting bootup failures due to this problem?
Answer: nobody

2002-08-28 04:06:28

by Luca Barbieri

[permalink] [raw]

Subject: Re: Bug in kernel code?

On Wed, 2002-08-28 at 05:58, David S. Miller wrote:
> From: Luca Barbieri <[email protected]>
> Date: 28 Aug 2002 05:57:50 +0200
>
> On Wed, 2002-08-28 at 05:39, David S. Miller wrote:
> > Realistically possible with any known configuration?
>
> How about replacing the loop with ffs/__ffs that would be more
> elegant, shorter and avoid any problem or doubt?
>
> This is getting rediculious. You can pursue this further
> if you want, but I think we've wasted enough time with
> this non-bug.
>
> Who is getting bootup failures due to this problem?
> Answer: nobody
I'm not saying that it's serious bug, just that using __ffs is more
appropriate than reimplementing it incorrectly and inefficiently.

Attachments:

signature.asc (189.00 B)
This is a digitally signed message part

2002-08-28 04:09:11

by David Miller

[permalink] [raw]

Subject: Re: Bug in kernel code?

From: Luca Barbieri <[email protected]>
Date: 28 Aug 2002 06:10:39 +0200

I'm not saying that it's serious bug, just that using __ffs is more
appropriate than reimplementing it incorrectly and inefficiently.

ffs won't find the smallest power of 2 >= some_arbitrary_value.
That is what this code is doing.

I've personally wasted enough typing on this thread, so enough....

2002-08-28 04:25:04

by Stephen Biggs

[permalink] [raw]

Subject: Re: Bug in kernel code?

On 27 Aug 2002 at 20:39, David S. Miller wrote:

> How about (unsigned long)(~0)?
>
> Realistically possible with any known configuration?
>

You tell me. You're saying a billion pages (((unsigned long)(~0)) >> 2) also crashes) is never
going to be realistically possible? Sounds like Bill Gates when he said (and I don't know the word-
for-word quote) "Who's ever going to need more than 640K??" What if we get into 64 bit addressing?
What if there is some sort of bug that passes all 1s on the stack for just this one instance?
Never could "realistically" happen? Yeah, right; I've seen weirder things than that.

It's a question of mandatory paranoid sanity checking in an OS wherever possible. Linux is trying
to be known as robust. Are you saying that a supposedly robust kernel should have a chance to
crash in an infinite loop during initialization because there isn't code doing input validation
when there isn't an optimization or speed issue?

2002-08-28 04:26:22

by Luca Barbieri

[permalink] [raw]

Subject: Re: Bug in kernel code?

On Wed, 2002-08-28 at 06:07, David S. Miller wrote:
> From: Luca Barbieri <[email protected]>
> Date: 28 Aug 2002 06:10:39 +0200
>
> I'm not saying that it's serious bug, just that using __ffs is more
> appropriate than reimplementing it incorrectly and inefficiently.
>
> ffs won't find the smallest power of 2 >= some_arbitrary_value.
> That is what this code is doing.
Yes you are right, fls should be used there, not ffs.

Attachments:

signature.asc (189.00 B)
This is a digitally signed message part

2002-08-28 04:28:30

by David Miller

[permalink] [raw]

Subject: Re: Bug in kernel code?

From: "Stephen Biggs" <[email protected]>
Date: Tue, 27 Aug 2002 21:29:06 -0700

You tell me. You're saying a billion pages (((unsigned long)(~0)) >> 2) also crashes) is never
going to be realistically possible?

On a 32-bit system? No. x86 cpus are architectually limited
to 64GB of memory, shift that right by PAGE_SIZE (13) and we're
still within bounds.

Larger memory will be then be found on 64-bit systems, and hey
then the limit becomes significantly higher.

So that is exactly what I am saying, it is not realistically
possible.

2002-08-28 21:51:06

by H. Peter Anvin

[permalink] [raw]

Subject: Re: Bug in kernel code?

Followup to: <[email protected]>
By author: "David S. Miller" <[email protected]>
In newsgroup: linux.dev.kernel
>
> From: "Stephen Biggs" <[email protected]>
> Date: Tue, 27 Aug 2002 21:29:06 -0700
>
> You tell me. You're saying a billion pages (((unsigned long)(~0)) >> 2) also crashes) is never
> going to be realistically possible?
>
> On a 32-bit system? No. x86 cpus are architectually limited
> to 64GB of memory, shift that right by PAGE_SIZE (13) and we're
> still within bounds. ^^^^^^^^^^^^^^
>

PAGE_SHIFT, 12.

The maximum possible page count of any known 32-bit system (64 GB @ 4K
for x86) is 2^24 pages.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>