2009-12-03 03:03:16

by Jin Dongming

Subject: Question about kmsg_dump for OOPS

Hello, Simon

I am Jin Dongming.

I have a question about kmsg_dump and would appreciate your help.
The question is as follows:
Why not call kmsg_dump() for OOPS from oops_end(), before the
crash_kexec() branch?

The reason for the question is as follows:
Currently kmsg_dump() for OOPS is called in oops_exit(). When an OOPS happens,
the kernel calls oops_end(). If crash_kexec() is executed first in
oops_end(), oops_exit() is never called, and kmsg_dump() for PANIC is not
executed either. So I think that kmsg_dump() for OOPS loses its real
meaning.

The function tree for OOPS is as follows:
oops_end()
|
|-- if (crash_kexec is valid)
|   |
|   |-- crash_kexec() ==> reboot (and the following functions will
|                         not be executed)
|
|-- oops_exit()
|   |
|   |-- kmsg_dump(OOPS)
|
|-- if (panic is valid)
|   |
|   |-- kmsg_dump(PANIC)
|

The function tree for PANIC is as follows:
panic()
|
|-- kmsg_dump(PANIC)
|
|-- crash_kexec()
|
|-- notifier()


When the kernel panics, kmsg_dump() for PANIC is executed before crash_kexec().
So I think kmsg_dump() for OOPS should also be called before crash_kexec() is
executed. What do you think?
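
The proposal can be illustrated with a small user-space model. This is only a sketch: the functions below are stand-ins that record call order, not the kernel's oops_end(), crash_kexec() or kmsg_dump(). The names model_oops_end, struct call_log and the dump_before_kexec flag are invented for illustration, and the model assumes that crash_kexec() does not return when a crash kernel is loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical call log for a user-space model; not a kernel interface. */
enum call { CALL_KMSG_DUMP_OOPS, CALL_CRASH_KEXEC, CALL_KMSG_DUMP_PANIC };

#define LOG_MAX 8

struct call_log {
    enum call calls[LOG_MAX];
    size_t count;
};

static void log_call(struct call_log *log, enum call c)
{
    assert(log->count < LOG_MAX);
    log->calls[log->count++] = c;
}

static bool log_contains(const struct call_log *log, enum call c)
{
    for (size_t i = 0; i < log->count; i++)
        if (log->calls[i] == c)
            return true;
    return false;
}

/*
 * Model of the oops path. With dump_before_kexec == false it follows the
 * ordering described in this mail (crash_kexec() first, OOPS dump later in
 * oops_exit()); with true it follows the proposed ordering. Returns true
 * if the model jumped to the crash kernel, after which nothing else runs.
 */
static bool model_oops_end(struct call_log *log, bool kexec_loaded,
                           bool panic_after, bool dump_before_kexec)
{
    if (dump_before_kexec)
        log_call(log, CALL_KMSG_DUMP_OOPS);

    if (kexec_loaded) {
        log_call(log, CALL_CRASH_KEXEC);
        return true;    /* assumed not to return */
    }

    if (!dump_before_kexec)
        log_call(log, CALL_KMSG_DUMP_OOPS);     /* via oops_exit() */

    if (panic_after)
        log_call(log, CALL_KMSG_DUMP_PANIC);    /* via panic() */

    return false;
}
```

With a crash kernel loaded, the current ordering logs only the crash_kexec() call, while the proposed ordering logs the OOPS dump first and then crash_kexec().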

Best regards,
Jin Dongming


2009-12-03 08:26:27

by Simon Kagstrom

Subject: Re: Question about kmsg_dump for OOPS

Hi Jin!

On Thu, 03 Dec 2009 12:04:46 +0900
Jin Dongming <[email protected]> wrote:

> I have a question about kmsg_dump and would appreciate your help.
> The question is as follows:
> Why not call kmsg_dump() for OOPS from oops_end(), before the
> crash_kexec() branch?
>
> The reason for the question is as follows:
> Currently kmsg_dump() for OOPS is called in oops_exit(). When an OOPS happens,
> the kernel calls oops_end(). If crash_kexec() is executed first in
> oops_end(), oops_exit() is never called, and kmsg_dump() for PANIC is not
> executed either. So I think that kmsg_dump() for OOPS loses its real
> meaning.

Moving it would be OK for my part; I understand your reasoning.
How this is handled seems to vary a bit between architectures, though.
ARM has (arch/arm/kernel/die.c):

NORET_TYPE void die(const char *str, struct pt_regs *regs, int err)
{
	[...]
	if (panic_on_oops)
		panic("Fatal exception");

	oops_exit();
	[...]

while x86 does (arch/x86/kernel/dumpstack.c):

void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
{
	if (regs && kexec_should_crash(current))
		crash_kexec(regs);
	[...]
	oops_exit();
	[...]
	if (in_interrupt())
		panic("Fatal exception in interrupt");
	if (panic_on_oops)
		panic("Fatal exception");
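
To make the difference between the two excerpts concrete, here is a hedged user-space model that returns which dumps would run on each path. It is a sketch, not kernel code: DUMP_OOPS, DUMP_PANIC, arm_die_dumps and x86_oops_end_dumps are invented names, the behaviour of panic() is taken from the function tree in the first mail, the elided [...] parts are ignored, and the kexec_taken parameter is a simplification standing in for "the crash_kexec() branch is taken and does not return".

```c
#include <stdbool.h>

/* Hypothetical bit flags for this model; not kernel constants. */
#define DUMP_OOPS  (1u << 0)
#define DUMP_PANIC (1u << 1)

/* ARM excerpt: panic() is tried first; oops_exit() runs only otherwise. */
static unsigned int arm_die_dumps(bool panic_on_oops)
{
    if (panic_on_oops)
        return DUMP_PANIC;  /* panic() dumps and does not return */
    return DUMP_OOPS;       /* oops_exit() */
}

/* x86 excerpt: crash_kexec() first, then oops_exit(), then panic(). */
static unsigned int x86_oops_end_dumps(bool kexec_taken, bool in_irq,
                                       bool panic_on_oops)
{
    unsigned int dumps = 0;

    if (kexec_taken)
        return dumps;       /* neither dump is reached */

    dumps |= DUMP_OOPS;     /* oops_exit() */
    if (in_irq || panic_on_oops)
        dumps |= DUMP_PANIC;    /* panic() */
    return dumps;
}
```

In this model the x86 path produces no dump at all once the kexec branch is taken, which is the case the question is about.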


There was some additional discussion on this a while ago in these two
threads:

http://lkml.org/lkml/2009/11/11/404

http://lkml.org/lkml/2009/10/23/131

where there additionally was a request to move

atomic_notifier_call_chain(&panic_notifier_list, 0, buf);

before kmsg_dump() and crash_kexec(). I can't immediately see any
problem with this approach, but I'm no expert on kexec. The discussion
didn't really conclude on this matter though.
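
As a rough illustration of what that reordering would change, the sketch below walks the panic steps in a given order and counts how many are reached. It is a user-space model under stated assumptions: enum panic_step, current_order, requested_order and steps_reached are invented names, the current order is the one from the function tree earlier in this thread, and crash_kexec() is assumed not to return when a crash kernel is loaded. It says nothing about whether running notifiers in a crashed kernel is safe, which is the open question.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step labels for this model; not kernel identifiers. */
enum panic_step { STEP_NOTIFIERS, STEP_KMSG_DUMP, STEP_CRASH_KEXEC };

/* Ordering from the function tree earlier in this thread. */
static const enum panic_step current_order[] = {
    STEP_KMSG_DUMP, STEP_CRASH_KEXEC, STEP_NOTIFIERS,
};

/* Ordering requested in the two linked threads. */
static const enum panic_step requested_order[] = {
    STEP_NOTIFIERS, STEP_KMSG_DUMP, STEP_CRASH_KEXEC,
};

/*
 * Returns how many steps are reached. The walk stops right after
 * STEP_CRASH_KEXEC when a crash kernel is loaded, because the model
 * assumes control never comes back from it.
 */
static size_t steps_reached(const enum panic_step *order, size_t n,
                            bool kexec_loaded)
{
    size_t reached = 0;

    for (size_t i = 0; i < n; i++) {
        reached++;
        if (order[i] == STEP_CRASH_KEXEC && kexec_loaded)
            break;
    }
    return reached;
}
```

With a crash kernel loaded, the current order reaches two of the three steps (the notifiers are skipped), while the requested order reaches all three.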

// Simon

2009-12-04 00:53:22

by Jin Dongming

Subject: Re: Question about kmsg_dump for OOPS

Hi Simon

Sorry for the late reply.

Thank you for your answer and the information.
I will read the discussion and consider whether there is a better way
to resolve my problem.

Best Regards,
Jin Dongming
