2008-07-24 03:25:42

by David Miller

Subject: Re: [Bug 11046] New: Kernel bug in mm/bootmem.c on Sparc machines

From: Andrew Morton <[email protected]>
Date: Sun, 6 Jul 2008 13:20:49 -0700

> On Sun, 6 Jul 2008 13:02:28 -0700 (PDT) [email protected] wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=11046
...
> > Here is the BUG:
> >
> > [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 4.11.5 2003/11/12 10:40'
> > [ 0.000000] PROMLIB: Root node compatible:
> > [ 0.000000] Linux version 2.6.25.10 (root@sparc1) (gcc version 4.1.2
> > 20061115 (prerelease) (Debian 4.1.1-21)) #5 SMP Sun Jul 6 21:05:42 CEST 2008
> > [ 0.000000] console [earlyprom0] enabled
> > [ 0.000000] ARCH: SUN4U
> > [ 0.000000] Ethernet address: 00:03:ba:7a:f3:d6
> > [ 0.000000] Kernel: Using 2 locked TLB entries for main kernel image.
> > [ 0.000000] Remapping the kernel... done.
> > [ 0.000000] kernel BUG at mm/bootmem.c:125!

This can only happen if you attach a zero-sized initrd to the kernel.

I see platforms like x86 sometimes have explicit checks for a zero
size to guard reserve_bootmem() and similar calls, but if that's what
callers are all going to do doesn't it make better sense for
reserve_bootmem_core() to just return instead of BUG on a zero size
argument?
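
To make the two options concrete, here is a minimal userspace sketch, not kernel code. The names `toy_map`, `toy_reserve_strict()`, `toy_reserve_lenient()`, `caller_guarded()` and `toy_count_reserved()` are hypothetical and exist only for illustration; the sketch merely models the trade-off between guarding every call site and letting the core function accept a zero size.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGES 64

/* Hypothetical stand-in for the bootmem bitmap: one byte per page. */
struct toy_map {
	unsigned char reserved[TOY_PAGES];
};

/* Strict variant: a zero size is treated as a caller bug,
 * modelling the BUG_ON(!size) behaviour. */
static void toy_reserve_strict(struct toy_map *map, size_t start, size_t size)
{
	size_t i;

	assert(size != 0);
	assert(start <= TOY_PAGES && size <= TOY_PAGES - start);
	for (i = start; i < start + size; i++)
		map->reserved[i] = 1;
}

/* Option 1: every caller that may see an empty range guards the call. */
static void caller_guarded(struct toy_map *map, size_t initrd_start,
			   size_t initrd_size)
{
	if (initrd_size)
		toy_reserve_strict(map, initrd_start, initrd_size);
}

/* Option 2: the core function treats an empty range as a no-op. */
static void toy_reserve_lenient(struct toy_map *map, size_t start, size_t size)
{
	size_t i;

	if (!size)
		return;
	assert(start <= TOY_PAGES && size <= TOY_PAGES - start);
	for (i = start; i < start + size; i++)
		map->reserved[i] = 1;
}

/* Helper for inspecting the result: number of reserved pages. */
static size_t toy_count_reserved(const struct toy_map *map)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_PAGES; i++)
		n += map->reserved[i];
	return n;
}
```

With option 2, a call such as `toy_reserve_lenient(&map, 10, 0)` leaves the map untouched, so callers like `caller_guarded()` no longer need their own check.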


2008-07-24 03:38:55

by Andrew Morton

Subject: Re: [Bug 11046] New: Kernel bug in mm/bootmem.c on Sparc machines

On Wed, 23 Jul 2008 20:25:33 -0700 (PDT) David Miller <[email protected]> wrote:

> From: Andrew Morton <[email protected]>
> Date: Sun, 6 Jul 2008 13:20:49 -0700
>
> > On Sun, 6 Jul 2008 13:02:28 -0700 (PDT) [email protected] wrote:
> >
> > > http://bugzilla.kernel.org/show_bug.cgi?id=11046
> ...
> > > Here is the BUG:
> > >
> > > [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 4.11.5 2003/11/12 10:40'
> > > [ 0.000000] PROMLIB: Root node compatible:
> > > [ 0.000000] Linux version 2.6.25.10 (root@sparc1) (gcc version 4.1.2
> > > 20061115 (prerelease) (Debian 4.1.1-21)) #5 SMP Sun Jul 6 21:05:42 CEST 2008
> > > [ 0.000000] console [earlyprom0] enabled
> > > [ 0.000000] ARCH: SUN4U
> > > [ 0.000000] Ethernet address: 00:03:ba:7a:f3:d6
> > > [ 0.000000] Kernel: Using 2 locked TLB entries for main kernel image.
> > > [ 0.000000] Remapping the kernel... done.
> > > [ 0.000000] kernel BUG at mm/bootmem.c:125!
>
> This can only happen if you attach a zero-sized initrd to the kernel.
>
> I see platforms like x86 sometimes have explicit checks for a zero
> size to guard reserve_bootmem() and similar calls, but if that's what
> callers are all going to do doesn't it make better sense for
> reserve_bootmem_core() to just return instead of BUG on a zero size
> argument?

Sounds logical.

Johannes just rewrote the bootmem code, but from a quick read it
appears that this behaviour has been retained.

So if we're going to change it in 2.6.26, we'll need a separate patch.

2008-07-24 03:42:57

by David Miller

Subject: Re: [Bug 11046] New: Kernel bug in mm/bootmem.c on Sparc machines

From: Andrew Morton <[email protected]>
Date: Wed, 23 Jul 2008 20:38:36 -0700

> So if we're going to change it in 2.6.26, we'll need a separate patch.

Here is the 2.6.26 version:

bootmem: Allow zero length reserve and free.

It's either this or all the call sites explicitly check
when such a case is possible and sometimes expected.

Signed-off-by: David S. Miller <[email protected]>

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 8d9f60e..e540f7a 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -153,7 +153,8 @@ static void __init reserve_bootmem_core(bootmem_data_t *bdata,
 	unsigned long sidx, eidx;
 	unsigned long i;

-	BUG_ON(!size);
+	if (!size)
+		return;

 	/* out of range */
 	if (addr + size < bdata->node_boot_start ||
@@ -187,7 +188,8 @@ static void __init free_bootmem_core(bootmem_data_t *bdata, unsigned long addr,
 	unsigned long sidx, eidx;
 	unsigned long i;

-	BUG_ON(!size);
+	if (!size)
+		return;

 	/* out range */
 	if (addr + size < bdata->node_boot_start ||

2008-07-24 12:10:18

by Johannes Weiner

Subject: Re: [Bug 11046] New: Kernel bug in mm/bootmem.c on Sparc machines

Hi,

Andrew Morton <[email protected]> writes:

> On Wed, 23 Jul 2008 20:25:33 -0700 (PDT) David Miller <[email protected]> wrote:
>
>> From: Andrew Morton <[email protected]>
>> Date: Sun, 6 Jul 2008 13:20:49 -0700
>>
>> > On Sun, 6 Jul 2008 13:02:28 -0700 (PDT) [email protected] wrote:
>> >
>> > > http://bugzilla.kernel.org/show_bug.cgi?id=11046
>> ...
>> > > Here is the BUG:
>> > >
>> > > [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 4.11.5 2003/11/12 10:40'
>> > > [ 0.000000] PROMLIB: Root node compatible:
>> > > [ 0.000000] Linux version 2.6.25.10 (root@sparc1) (gcc version 4.1.2
>> > > 20061115 (prerelease) (Debian 4.1.1-21)) #5 SMP Sun Jul 6 21:05:42 CEST 2008
>> > > [ 0.000000] console [earlyprom0] enabled
>> > > [ 0.000000] ARCH: SUN4U
>> > > [ 0.000000] Ethernet address: 00:03:ba:7a:f3:d6
>> > > [ 0.000000] Kernel: Using 2 locked TLB entries for main kernel image.
>> > > [ 0.000000] Remapping the kernel... done.
>> > > [ 0.000000] kernel BUG at mm/bootmem.c:125!
>>
>> This can only happen if you attach a zero-sized initrd to the kernel.
>>
>> I see platforms like x86 sometimes have explicit checks for a zero
>> size to guard reserve_bootmem() and similar calls, but if that's what
>> callers are all going to do doesn't it make better sense for
>> reserve_bootmem_core() to just return instead of BUG on a zero size
>> argument?
>
> Sounds logical.
>
> Johannes just rewrote the bootmem code, but from a quick read it
> appears that this behaviour has been retained.

In the new version, zero sized ranges are okay for reservation and
freeing. It still bugs on allocation, though.

> So if we're going to change it in 2.6.26, we'll need a separate patch.

Hannes

2008-07-24 18:38:17

by Andrew Morton

Subject: Re: [Bug 11046] New: Kernel bug in mm/bootmem.c on Sparc machines

On Thu, 24 Jul 2008 14:09:38 +0200 Johannes Weiner <[email protected]> wrote:

> Hi,
>
> Andrew Morton <[email protected]> writes:
>
> > On Wed, 23 Jul 2008 20:25:33 -0700 (PDT) David Miller <[email protected]> wrote:
> >
> >> From: Andrew Morton <[email protected]>
> >> Date: Sun, 6 Jul 2008 13:20:49 -0700
> >>
> >> > On Sun, 6 Jul 2008 13:02:28 -0700 (PDT) [email protected] wrote:
> >> >
> >> > > http://bugzilla.kernel.org/show_bug.cgi?id=11046
> >> ...
> >> > > Here is the BUG:
> >> > >
> >> > > [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 4.11.5 2003/11/12 10:40'
> >> > > [ 0.000000] PROMLIB: Root node compatible:
> >> > > [ 0.000000] Linux version 2.6.25.10 (root@sparc1) (gcc version 4.1.2
> >> > > 20061115 (prerelease) (Debian 4.1.1-21)) #5 SMP Sun Jul 6 21:05:42 CEST 2008
> >> > > [ 0.000000] console [earlyprom0] enabled
> >> > > [ 0.000000] ARCH: SUN4U
> >> > > [ 0.000000] Ethernet address: 00:03:ba:7a:f3:d6
> >> > > [ 0.000000] Kernel: Using 2 locked TLB entries for main kernel image.
> >> > > [ 0.000000] Remapping the kernel... done.
> >> > > [ 0.000000] kernel BUG at mm/bootmem.c:125!
> >>
> >> This can only happen if you attach a zero-sized initrd to the kernel.
> >>
> >> I see platforms like x86 sometimes have explicit checks for a zero
> >> size to guard reserve_bootmem() and similar calls, but if that's what
> >> callers are all going to do doesn't it make better sense for
> >> reserve_bootmem_core() to just return instead of BUG on a zero size
> >> argument?
> >
> > Sounds logical.
> >
> > Johannes just rewrote the bootmem code, but from a quick read it
> > appears that this behaviour has been retained.
>
> In the new version, zero sized ranges are okay for reservation and
> freeing. It still bugs on allocation, though.
>

Interesting. So from Dave's patch (which changes only
reserve_bootmem_core() and free_bootmem_core()), it sounds like we
have already fixed 2.6.27?

In which case David's 2.6.26 patch is a "minimal backport".

2008-07-24 21:32:41

by Johannes Weiner

Subject: Re: [Bug 11046] New: Kernel bug in mm/bootmem.c on Sparc machines

Hi,

David Miller <[email protected]> writes:

> From: Andrew Morton <[email protected]>
> Date: Wed, 23 Jul 2008 20:38:36 -0700
>
>> So if we're going to change it in 2.6.26, we'll need a separate patch.
>
> Here is the 2.6.26 version:
>
> bootmem: Allow zero length reserve and free.
>
> It's either this or all the call sites explicitly check
> when such a case is possible and sometimes expected.
>
> Signed-off-by: David S. Miller <[email protected]>
>
> diff --git a/mm/bootmem.c b/mm/bootmem.c
> index 8d9f60e..e540f7a 100644
> --- a/mm/bootmem.c
> +++ b/mm/bootmem.c
> @@ -153,7 +153,8 @@ static void __init reserve_bootmem_core(bootmem_data_t *bdata,
>  	unsigned long sidx, eidx;
>  	unsigned long i;
>
> -	BUG_ON(!size);
> +	if (!size)
> +		return;
>
>  	/* out of range */
>  	if (addr + size < bdata->node_boot_start ||
> @@ -187,7 +188,8 @@ static void __init free_bootmem_core(bootmem_data_t *bdata, unsigned long addr,
>  	unsigned long sidx, eidx;
>  	unsigned long i;
>
> -	BUG_ON(!size);
> +	if (!size)
> +		return;
>
>  	/* out range */
>  	if (addr + size < bdata->node_boot_start ||

Sorry, Dave, I missed that before: there is still the BUG_ON() in
can_reserve_bootmem_core(), which should just return 0 instead.

Other than that, yes, Andrew, this introduces the same behaviour as the
bootmem rewrite.

Hannes

2008-07-24 21:59:18

by David Miller

Subject: Re: [Bug 11046] New: Kernel bug in mm/bootmem.c on Sparc machines

From: Johannes Weiner <[email protected]>
Date: Thu, 24 Jul 2008 23:32:06 +0200

> Sorry, Dave, I missed that before: there is still the BUG_ON() in
> can_reserve_bootmem_core(), which should just return 0 instead.
>
> Other than that, yes, Andrew, this introduces the same behaviour as the
> bootmem rewrite.

Thanks, here is an updated version of the patch:

bootmem: Allow zero length reserve and free.

It's either this or all the call sites explicitly check
when such a case is possible and sometimes expected.

Signed-off-by: David S. Miller <[email protected]>

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 8d9f60e..5e3fab8 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -117,7 +117,8 @@ static int __init can_reserve_bootmem_core(bootmem_data_t *bdata,
 	unsigned long sidx, eidx;
 	unsigned long i;

-	BUG_ON(!size);
+	if (!size)
+		return 0;

 	/* out of range, don't hold other */
 	if (addr + size < bdata->node_boot_start ||
@@ -153,7 +154,8 @@ static void __init reserve_bootmem_core(bootmem_data_t *bdata,
 	unsigned long sidx, eidx;
 	unsigned long i;

-	BUG_ON(!size);
+	if (!size)
+		return;

 	/* out of range */
 	if (addr + size < bdata->node_boot_start ||
@@ -187,7 +189,8 @@ static void __init free_bootmem_core(bootmem_data_t *bdata, unsigned long addr,
 	unsigned long sidx, eidx;
 	unsigned long i;

-	BUG_ON(!size);
+	if (!size)
+		return;

 	/* out range */
 	if (addr + size < bdata->node_boot_start ||
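
For readers following along, the intended effect of the updated patch can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `model_map`, `model_can_reserve()`, `model_reserve()`, `model_free()` and `model_reserve_exclusive()` are hypothetical names, the check-then-reserve flow is a simplification, and the real functions take a `bootmem_data_t` and physical addresses rather than page indices.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGES 32

/* Hypothetical model of one node's bootmem bitmap: one byte per page. */
static unsigned char model_map[MODEL_PAGES];

static int model_in_range(size_t start, size_t size)
{
	return start <= MODEL_PAGES && size <= MODEL_PAGES - start;
}

/* Check phase: returns 0 if the range is free, -EBUSY on a conflict.
 * An empty range can never conflict, so it returns 0 instead of
 * aborting (the assumed effect of replacing BUG_ON with "return 0"). */
static int model_can_reserve(size_t start, size_t size)
{
	size_t i;

	if (!size)
		return 0;
	assert(model_in_range(start, size));
	for (i = start; i < start + size; i++)
		if (model_map[i])
			return -EBUSY;
	return 0;
}

/* Reserve phase: an empty range is a no-op. */
static void model_reserve(size_t start, size_t size)
{
	size_t i;

	if (!size)
		return;
	assert(model_in_range(start, size));
	for (i = start; i < start + size; i++)
		model_map[i] = 1;
}

/* Free: an empty range is a no-op as well. */
static void model_free(size_t start, size_t size)
{
	size_t i;

	if (!size)
		return;
	assert(model_in_range(start, size));
	for (i = start; i < start + size; i++)
		model_map[i] = 0;
}

/* Simplified check-then-reserve wrapper. */
static int model_reserve_exclusive(size_t start, size_t size)
{
	int ret = model_can_reserve(start, size);

	if (ret < 0)
		return ret;
	model_reserve(start, size);
	return 0;
}
```

In this model, `model_reserve_exclusive(4, 0)` returns 0 and changes nothing, even when page 4 is already reserved, while a non-empty overlapping request still fails with `-EBUSY`.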