2003-08-15 21:15:48

by Ed L Cashin

Subject: [PATCH] do_wp_page: BUG on invalid pfn

Rusty Russell <[email protected]> writes:

> In message <[email protected]> you write:
>> This patch just does what the comment says should be done.
>
> Hi Ed!
>
> Not trivial I'm afraid. Send to Linus and lkml.


This patch just does what the comment says should be done. I thought
it was a trivial patch, but Rusty Russell has informed me otherwise.
(Thanks, RR).


--- linux-2.6.0-test2/mm/memory.c.orig Sun Jul 27 13:01:24 2003
+++ linux-2.6.0-test2/mm/memory.c Wed Aug 6 18:30:55 2003
@@ -990,15 +990,10 @@
 	int ret;

 	if (unlikely(!pfn_valid(pfn))) {
-		/*
-		 * This should really halt the system so it can be debugged or
-		 * at least the kernel stops what it's doing before it corrupts
-		 * data, but for the moment just pretend this is OOM.
-		 */
-		pte_unmap(page_table);
 		printk(KERN_ERR "do_wp_page: bogus page at address %08lx\n",
 				address);
-		goto oom;
+		dump_stack();
+		BUG();
 	}
 	old_page = pfn_to_page(pfn);

@@ -1054,7 +1049,6 @@

 no_mem:
 	page_cache_release(old_page);
-oom:
 	ret = VM_FAULT_OOM;
 out:
 	spin_unlock(&mm->page_table_lock);

--
--Ed L Cashin | PGP public key:
[email protected] | http://noserose.net/e/pgp/


2003-08-15 21:22:56

by Mike Fedyk

Subject: Re: [PATCH] do_wp_page: BUG on invalid pfn

On Fri, Aug 15, 2003 at 05:15:45PM -0400, Ed L Cashin wrote:
> Rusty Russell <[email protected]> writes:
>
> > In message <[email protected]> you write:
> >> This patch just does what the comment says should be done.
> >
> > Hi Ed!
> >
> > Not trivial I'm afraid. Send to Linus and lkml.
>
>
> This patch just does what the comment says should be done. I thought
> it was a trivial patch, but Rusty Russell has informed me otherwise.
> (Thanks, RR).
>
>
> --- linux-2.6.0-test2/mm/memory.c.orig Sun Jul 27 13:01:24 2003
> +++ linux-2.6.0-test2/mm/memory.c Wed Aug 6 18:30:55 2003
> @@ -990,15 +990,10 @@
> int ret;
>
> if (unlikely(!pfn_valid(pfn))) {
> - /*
> - * This should really halt the system so it can be debugged or
> - * at least the kernel stops what it's doing before it corrupts
> - * data, but for the moment just pretend this is OOM.
> - */
> - pte_unmap(page_table);
> printk(KERN_ERR "do_wp_page: bogus page at address %08lx\n",
> address);
> - goto oom;
> + dump_stack();
> + BUG();

You're not unmapping the pte I guess to not interfere with the dump_stack,
but what about the printk? Will that affect the dump_stack also?

2003-08-15 21:39:20

by Russell King

Subject: Re: [PATCH] do_wp_page: BUG on invalid pfn

On Fri, Aug 15, 2003 at 05:15:45PM -0400, Ed L Cashin wrote:
> + dump_stack();
> + BUG();

Is there much point to both dump_stack and BUG() - BUG is supposed to
provide a calltrace, which dump_stack also does. Do we really need two
copies?

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2003-08-15 21:50:10

by Ed L Cashin

Subject: Re: [PATCH] do_wp_page: BUG on invalid pfn

Russell King <[email protected]> writes:

> On Fri, Aug 15, 2003 at 05:15:45PM -0400, Ed L Cashin wrote:
>> + dump_stack();
>> + BUG();
>
> Is there much point to both dump_stack and BUG() - BUG is supposed to
> provide a calltrace, which dump_stack also does. Do we really need two
> copies?

On i386 WARN_ON calls dump_stack, but BUG just prints some minimal
helpful info on the console, like this:

------------[ cut here ]------------
kernel BUG at kernel/any.c:36!
invalid operand: 0000 [#1]


--
--Ed L Cashin | PGP public key:
[email protected] | http://noserose.net/e/pgp/

2003-08-15 21:52:20

by Ed L Cashin

Subject: Re: [PATCH] do_wp_page: BUG on invalid pfn

Mike Fedyk <[email protected]> writes:

> On Fri, Aug 15, 2003 at 05:15:45PM -0400, Ed L Cashin wrote:
>> Rusty Russell <[email protected]> writes:
>>
>> > In message <[email protected]> you write:
>> >> This patch just does what the comment says should be done.
>> >
>> > Hi Ed!
>> >
>> > Not trivial I'm afraid. Send to Linus and lkml.
>>
>>
>> This patch just does what the comment says should be done. I thought
>> it was a trivial patch, but Rusty Russell has informed me otherwise.
>> (Thanks, RR).
>>
>>
>> --- linux-2.6.0-test2/mm/memory.c.orig Sun Jul 27 13:01:24 2003
>> +++ linux-2.6.0-test2/mm/memory.c Wed Aug 6 18:30:55 2003
>> @@ -990,15 +990,10 @@
>> int ret;
>>
>> if (unlikely(!pfn_valid(pfn))) {
>> - /*
>> - * This should really halt the system so it can be debugged or
>> - * at least the kernel stops what it's doing before it corrupts
>> - * data, but for the moment just pretend this is OOM.
>> - */
>> - pte_unmap(page_table);
>> printk(KERN_ERR "do_wp_page: bogus page at address %08lx\n",
>> address);
>> - goto oom;
>> + dump_stack();
>> + BUG();
>
> You're not unmapping the pte I guess to not interfere with the dump_stack,

This patch changes the logic from "pretend it's out of memory" to
"announce something's very wrong and bail out right away." Unmapping
the pte seems like a precursor to carrying on business as usual, but
there must be some subtleties here that I am unaware of, or Rusty
Russell wouldn't have called this patch non-trivial.

> but what about the printk? Will that affect the dump_stack also?

It seems like you'd return from the printk before dumping the stack,
so I wouldn't think so.

--
--Ed L Cashin | PGP public key:
[email protected] | http://noserose.net/e/pgp/
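
The change in logic described in this message can be illustrated outside
the kernel. The following is only a user-space sketch under stated
assumptions: every model_* name, the MODEL_* constants, and the pfn limit
are hypothetical stand-ins, and abort() merely stands in for the
dump_stack()/BUG() pair. It is not the code in mm/memory.c.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum model_fault { MODEL_FAULT_MINOR = 1, MODEL_FAULT_OOM = -1 };
#define MODEL_MAX_PFN 1024UL

/* Assumed validity rule: any pfn below an arbitrary limit is "valid". */
static int model_pfn_valid(unsigned long pfn)
{
	return pfn < MODEL_MAX_PFN;
}

/* Same message format as the printk in the patch, sent to stderr. */
static void model_report(unsigned long address)
{
	fprintf(stderr, "do_wp_page: bogus page at address %08lx\n", address);
}

/* Before the patch: log the problem, then pretend it is out-of-memory. */
static int model_wp_fault_old(unsigned long pfn, unsigned long address)
{
	if (!model_pfn_valid(pfn)) {
		model_report(address);
		return MODEL_FAULT_OOM;	/* the caller keeps running */
	}
	return MODEL_FAULT_MINOR;
}

/* After the patch: log the problem, then stop right away. */
static int model_wp_fault_new(unsigned long pfn, unsigned long address)
{
	if (!model_pfn_valid(pfn)) {
		model_report(address);
		abort();	/* stands in for dump_stack(); BUG(); */
	}
	return MODEL_FAULT_MINOR;
}
```

With a valid pfn both variants return MODEL_FAULT_MINOR. With an invalid
one, the old variant returns MODEL_FAULT_OOM to its caller, while the new
variant never returns.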

2003-08-15 22:11:47

by Russell King

Subject: Re: [PATCH] do_wp_page: BUG on invalid pfn

On Fri, Aug 15, 2003 at 05:50:05PM -0400, Ed L Cashin wrote:
> On i386 WARN_ON calls dump_stack, but BUG just prints some minimal
> helpful info on the console, like this:
>
> ------------[ cut here ]------------
> kernel BUG at kernel/any.c:36!
> invalid operand: 0000 [#1]

BUG causes an exception, which calls die(), which in turn calls
handle_BUG(), and this indeed does print the first two lines of the
above. die() goes on to print the 3rd line, but it also goes on
to call show_registers() which should print the registers and
calltrace as well.

Maybe you've found a bug in show_registers()?

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
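
The call order described in this message (die() calls handle_BUG(),
prints the third line, then calls show_registers()) can be modelled in
user space. This is only a sketch under stated assumptions: the model_*
functions and struct model_log are hypothetical, the register and call
trace lines are placeholders, and the real functions in
arch/i386/kernel/traps.c take different arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory console so the output order can be inspected. */
struct model_log {
	char text[512];
	size_t used;
};

/* Append one line; a line that does not fit is not counted as written. */
static void model_log_line(struct model_log *log, const char *line)
{
	size_t room;
	int n;

	assert(log != NULL && log->used < sizeof(log->text));
	room = sizeof(log->text) - log->used;
	n = snprintf(log->text + log->used, room, "%s\n", line);
	if (n > 0 && (size_t)n < room)
		log->used += (size_t)n;
}

/* Models handle_BUG(): prints the first two lines of the report. */
static void model_handle_BUG(struct model_log *log, const char *file, int line)
{
	char msg[128];

	model_log_line(log, "------------[ cut here ]------------");
	snprintf(msg, sizeof(msg), "kernel BUG at %s:%d!", file, line);
	model_log_line(log, msg);
}

/* Models show_registers(): placeholder text, not real register output. */
static void model_show_registers(struct model_log *log)
{
	model_log_line(log, "<registers>");
	model_log_line(log, "Call Trace: <frames>");
}

/* Models die(): BUG banner, then the trap line, then registers and trace. */
static void model_die(struct model_log *log, const char *file, int line)
{
	model_handle_BUG(log, file, line);
	model_log_line(log, "invalid operand: 0000 [#1]");
	model_show_registers(log);
}
```

Calling model_die() on a zero-initialised struct model_log with
"kernel/any.c" and 36 fills the log with the three lines quoted above,
followed by the placeholder register and call trace lines.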

2003-08-15 22:06:51

by Mike Fedyk

Subject: Re: [PATCH] do_wp_page: BUG on invalid pfn

On Fri, Aug 15, 2003 at 05:52:09PM -0400, Ed L Cashin wrote:
> Mike Fedyk <[email protected]> writes:
>
> > On Fri, Aug 15, 2003 at 05:15:45PM -0400, Ed L Cashin wrote:
> >> Rusty Russell <[email protected]> writes:
> >>
> >> > In message <[email protected]> you write:
> >> >> This patch just does what the comment says should be done.
> >> >
> >> > Hi Ed!
> >> >
> >> > Not trivial I'm afraid. Send to Linus and lkml.
> >>
> >>
> >> This patch just does what the comment says should be done. I thought
> >> it was a trivial patch, but Rusty Russell has informed me otherwise.
> >> (Thanks, RR).
> >>
> >>
> >> --- linux-2.6.0-test2/mm/memory.c.orig Sun Jul 27 13:01:24 2003
> >> +++ linux-2.6.0-test2/mm/memory.c Wed Aug 6 18:30:55 2003
> >> @@ -990,15 +990,10 @@
> >> int ret;
> >>
> >> if (unlikely(!pfn_valid(pfn))) {
> >> - /*
> >> - * This should really halt the system so it can be debugged or
> >> - * at least the kernel stops what it's doing before it corrupts
> >> - * data, but for the moment just pretend this is OOM.
> >> - */
> >> - pte_unmap(page_table);
> >> printk(KERN_ERR "do_wp_page: bogus page at address %08lx\n",
> >> address);
> >> - goto oom;
> >> + dump_stack();
> >> + BUG();
> >
> > You're not unmapping the pte I guess to not interfere with the dump_stack,
>
> This patch changes the logic from "pretend it's out of memory" to
> "announce something's very wrong and bail out right away." Unmapping
> the pte seems like a precursor to carrying on business as usual, but
> there must be some subtleties here that I am unaware of, or Rusty
> Russell wouldn't have called this patch non-trivial.

So does show_stack() halt the kernel? If not, then you probably want the
pte_unmap since you'll have a working/semi-working system after the BUG()
call.

And if show_stack() does halt the kernel, what's the point of BUG() then?

2003-08-15 22:18:33

by Ed L Cashin

Subject: Re: [PATCH] do_wp_page: BUG on invalid pfn

Mike Fedyk <[email protected]> writes:

> On Fri, Aug 15, 2003 at 05:52:09PM -0400, Ed L Cashin wrote:
...
>> >> + dump_stack();
>> >> + BUG();

...
> So does show_stack() halt the kernel?

I'm not calling show_stack but dump_stack. For i386 it looks like
handle_BUG in arch/i386/kernel/traps.c is the function that gets run
when BUG raises an exception.

BUG() does halt the system, though.

--
--Ed L Cashin | PGP public key:
[email protected] | http://noserose.net/e/pgp/