2009-07-09 08:07:30

by Ken'ichi Ohmichi

[permalink] [raw]
Subject: [PATCH] kdump: Enable kdump if 2nd-kernel is loaded.


Hi,

This patch enables a kdump if 2nd-kernel is loaded.
(The patch is based on linux-2.6.31-rc2.)

Now, a kdump is enabled if a kernel parameter "oops=panic" is specified and
2nd-kernel is loaded. I think that a kdump should be enabled regardless of
"oops=panic" if 2nd-kernel is loaded, because a system administrator loads
2nd-kernel for enabling a kdump.

* Reference
The discussion about this patch
http://lists.infradead.org/pipermail/kexec/2009-July/003417.html


Thanks
Ken'ichi Ohmichi

Signed-off-by: Ken'ichi Ohmichi <[email protected]>
Acked-by: Simon Horman <[email protected]>
---
--- a/kernel/kexec.c 2009-07-08 12:30:26.000000000 +0900
+++ b/kernel/kexec.c 2009-07-08 12:38:08.000000000 +0900
@@ -57,6 +57,8 @@ struct resource crashk_res = {

int kexec_should_crash(struct task_struct *p)
{
+ if (kexec_crash_image)
+ return 1;
if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops)
return 1;
return 0;

_______________________________________________
kexec mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/kexec


2009-07-10 06:32:49

by Hidetoshi Seto

[permalink] [raw]
Subject: Re: [PATCH] kdump: Enable kdump if 2nd-kernel is loaded.

Hi Ohmichi-san,

Ken'ichi Ohmichi wrote:
> Hi,
>
> This patch enables a kdump if 2nd-kernel is loaded.
> (The patch is based on linux-2.6.31-rc2.)
>
> Now, a kdump is enabled if a kernel parameter "oops=panic" is specified and
> 2nd-kernel is loaded. I think that a kdump should be enabled regardless of
> "oops=panic" if 2nd-kernel is loaded, because a system administrator loads
> 2nd-kernel for enabling a kdump.

I think this description is slightly wrong because kdump will be invoked
from panic, regardless of the panic_on_oops.

Maybe:
A kdump on oops is enabled if a kernel parameter "oops=panic" ...
~~~~~~~

>
> * Reference
> The discussion about this patch
> http://lists.infradead.org/pipermail/kexec/2009-July/003417.html

I'd like to quote your comment:

>> I tried to test a kdump on linux-2.6.31-rc1 *without* a kernel parameter
>> "oops=panic" by `echo c > /proc/sysrq-trigger`, but a kdump did not work
>> because a kdump, which is occurred by `echo c > /proc/sysrq-trigger`, has
>> been changed to a NULL pointer error instead of calling crash_kexec()
>> since linux-2.6.31-rc1.

So the real problem is that kdump is not triggered by the NULL pointer oops
if !panic_on_oops, isn't it?

It seems that you should report this trouble of sysrq-c as a regression.

>
>
> Thanks
> Ken'ichi Ohmichi
>
> Signed-off-by: Ken'ichi Ohmichi <[email protected]>
> Acked-by: Simon Horman <[email protected]>
> ---
> --- a/kernel/kexec.c 2009-07-08 12:30:26.000000000 +0900
> +++ b/kernel/kexec.c 2009-07-08 12:38:08.000000000 +0900
> @@ -57,6 +57,8 @@ struct resource crashk_res = {
>
> int kexec_should_crash(struct task_struct *p)
> {
> + if (kexec_crash_image)
> + return 1;
> if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops)
> return 1;
> return 0;

I think kexec cannot crash if there is no image, right?

Then:

if (kexec_crash_image)
return 1;
return 0;

or

return (kexec_crash_image) ? 1 : 0;

or

since crash_kexec() is nop if !kexec_crash_image,
replace all:
if (kexec_should_crash(p))
crash_kexec(reg);
at everywhere in kernel to a simple line:
crash_kexec(reg);
and remove kexec_should_crash() completely

would be better fix.


Thanks,
H.Seto

2009-07-10 07:09:01

by Ken'ichi Ohmichi

[permalink] [raw]
Subject: Re: [PATCH] kdump: Enable kdump if 2nd-kernel is loaded.


Hi Seto-san,

Thank you for your comment.

Hidetoshi Seto wrote:
>> This patch enables a kdump if 2nd-kernel is loaded.
>> (The patch is based on linux-2.6.31-rc2.)
>>
>> Now, a kdump is enabled if a kernel parameter "oops=panic" is specified and
>> 2nd-kernel is loaded. I think that a kdump should be enabled regardless of
>> "oops=panic" if 2nd-kernel is loaded, because a system administrator loads
>> 2nd-kernel for enabling a kdump.
>
> I think this description is slightly wrong because kdump will be invoked
> from panic, regardless of the panic_on_oops.
>
> Maybe:
> A kdump on oops is enabled if a kernel parameter "oops=panic" ...
> ~~~~~~~

Right, I should add "on oops" to the above description.


>> * Reference
>> The discussion about this patch
>> http://lists.infradead.org/pipermail/kexec/2009-July/003417.html
>
> I'd like to quote your comment:
>
>>> I tried to test a kdump on linux-2.6.31-rc1 *without* a kernel parameter
>>> "oops=panic" by `echo c > /proc/sysrq-trigger`, but a kdump did not work
>>> because a kdump, which is occurred by `echo c > /proc/sysrq-trigger`, has
>>> been changed to a NULL pointer error instead of calling crash_kexec()
>>> since linux-2.6.31-rc1.
>
> So the real problem is that kdump is not triggered by the NULL pointer oops
> if !panic_on_oops, isn't it?
>
> It seems that you should report this trouble of sysrq-c as a regression.

I don't think this problem is a regression of sysrq-c.
This change of sysrq-c looks reasonable to me. Because sysrq-c is used
for the test of kdump, and its code path has been changed to the same
path as a kdump on oops (a real problem, not test).
So we can test a kdump by a real oops, that is good to me.
As a result, I could find this problem a kdump is not enabled without
"oops=panic".


>> Signed-off-by: Ken'ichi Ohmichi <[email protected]>
>> Acked-by: Simon Horman <[email protected]>
>> ---
>> --- a/kernel/kexec.c 2009-07-08 12:30:26.000000000 +0900
>> +++ b/kernel/kexec.c 2009-07-08 12:38:08.000000000 +0900
>> @@ -57,6 +57,8 @@ struct resource crashk_res = {
>>
>> int kexec_should_crash(struct task_struct *p)
>> {
>> + if (kexec_crash_image)
>> + return 1;
>> if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops)
>> return 1;
>> return 0;
>
> I think kexec cannot crash if there is no image, right?
>
> Then:
>
> if (kexec_crash_image)
> return 1;
> return 0;
>
> or
>
> return (kexec_crash_image) ? 1 : 0;
>
> or
>
> since crash_kexec() is nop if !kexec_crash_image,
> replace all:
> if (kexec_should_crash(p))
> crash_kexec(reg);
> at everywhere in kernel to a simple line:
> crash_kexec(reg);
> and remove kexec_should_crash() completely
>
> would be better fix.

Good point, I like the first option because it is safe that a caller of
crash_kexec() checks it. That is not a strong opinion ;-)


Thanks
Ken'ichi Ohmichi

2009-07-10 07:52:59

by Hidetoshi Seto

[permalink] [raw]
Subject: Re: [PATCH] kdump: Enable kdump if 2nd-kernel is loaded.

Hi Ohmichi-san,

Ken'ichi Ohmichi wrote:
> Hidetoshi Seto wrote:
>> I'd like to quote your comment:
>>
>>>> I tried to test a kdump on linux-2.6.31-rc1 *without* a kernel parameter
>>>> "oops=panic" by `echo c > /proc/sysrq-trigger`, but a kdump did not work
>>>> because a kdump, which is occurred by `echo c > /proc/sysrq-trigger`, has
>>>> been changed to a NULL pointer error instead of calling crash_kexec()
>>>> since linux-2.6.31-rc1.
>> So the real problem is that kdump is not triggered by the NULL pointer oops
>> if !panic_on_oops, isn't it?
>>
>> It seems that you should report this trouble of sysrq-c as a regression.
>
> I don't think this problem is a regression of sysrq-c.
> This change of sysrq-c looks reasonable to me. Because sysrq-c is used
> for the test of kdump, and its code path has been changed to the same
> path as a kdump on oops (a real problem, not test).
> So we can test a kdump by a real oops, that is good to me.
> As a result, I could find this problem a kdump is not enabled without
> "oops=panic".

Still I believe this is a regression of sysrq-c.
If required function of sysrq-c were "cause a real oops", then it is not
regression. But as described in Documentation/sysrq.txt, real usage of
sysrq-c is "perform a kexec reboot," i.e. kdump.
So I think sysrq-c should use panic instead of oops if it want to kick
the kdump unconditionally.

However, yes, I agree it is a problem on the kdump's policy that kdump on
oops is not enabled without the option. I'm not sure how much people want
to enable kdump only on panic while not on oops, but I suppose it will be
negligible. Or it would be better to introduce kdump_on_oops option or so.


Thanks,
H.Seto

2009-07-13 07:38:06

by Ken'ichi Ohmichi

[permalink] [raw]
Subject: [PATCH-v2] kdump: Enable kdump if 2nd-kernel is loaded.


Hi,

This patch is a new version by Seto-san's comment.


Changelog since v1:
* Remove the check code other than kexec_crash_image from kexec_should_crash()
because a kexec cannot crash if there is no image.


This patch enables a kdump if 2nd-kernel is loaded.
(The patch is based on linux-2.6.31-rc2.)

Now, a kdump on oops is enabled if a kernel parameter "oops=panic"
is specified and 2nd-kernel is loaded. I think that a kdump should
be enabled regardless of "oops=panic" if 2nd-kernel is loaded,
because a system administrator loads 2nd-kernel for enabling a kdump.


* Reference
The discussion about this patch
http://lists.infradead.org/pipermail/kexec/2009-July/003417.html
http://lists.infradead.org/pipermail/kexec/2009-July/003433.html


Thanks
Ken'ichi Ohmichi

Signed-off-by: Ken'ichi Ohmichi <[email protected]>
Acked-by: Simon Horman <[email protected]>
---
--- a/kernel/kexec.c 2009-07-08 12:30:26.000000000 +0900
+++ b/kernel/kexec.c 2009-07-13 13:49:03.000000000 +0900
@@ -57,7 +57,7 @@ struct resource crashk_res = {

int kexec_should_crash(struct task_struct *p)
{
- if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops)
+ if (kexec_crash_image)
return 1;
return 0;
}

2009-07-13 07:49:28

by Hidetoshi Seto

[permalink] [raw]
Subject: Re: [PATCH-v2] kdump: Enable kdump if 2nd-kernel is loaded.

Ken'ichi Ohmichi wrote:
> Signed-off-by: Ken'ichi Ohmichi <[email protected]>
> Acked-by: Simon Horman <[email protected]>
> ---
> --- a/kernel/kexec.c 2009-07-08 12:30:26.000000000 +0900
> +++ b/kernel/kexec.c 2009-07-13 13:49:03.000000000 +0900
> @@ -57,7 +57,7 @@ struct resource crashk_res = {
>
> int kexec_should_crash(struct task_struct *p)
> {
> - if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops)
> + if (kexec_crash_image)
> return 1;
> return 0;
> }

Reviewed-by: Hidetoshi Seto <[email protected]>

Thanks,
H.Seto

2009-07-13 17:06:33

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH-v2] kdump: Enable kdump if 2nd-kernel is loaded.

"Ken'ichi Ohmichi" <[email protected]> writes:

> Hi,
>
> This patch is a new version by Seto-san's comment.
>
>
> Changelog since v1:
> * Remove the check code other than kexec_crash_image from kexec_should_crash()
> because a kexec cannot crash if there is no image.
>
>
> This patch enables a kdump if 2nd-kernel is loaded.
> (The patch is based on linux-2.6.31-rc2.)
>
> Now, a kdump on oops is enabled if a kernel parameter "oops=panic"
> is specified and 2nd-kernel is loaded. I think that a kdump should
> be enabled regardless of "oops=panic" if 2nd-kernel is loaded,
> because a system administrator loads 2nd-kernel for enabling a kdump.

The Documentation for sysrq-c certainly needs to be updated.

If I am doing development on a system I like oops's. All of the
information and nothing goes down. I can get at /proc/kcore etc.

In a setting where I can't be Johnny on the spot and look at
things a core dump is probably the best I can get. In that scenario
panic_on_oops sounds good.

As I read the current check it reads:
If we are going to panic and not oops || panic_on_oops
kexec_should_crash = true;

Which seems a very reasonable implementation of policy.

Eric

2009-07-14 02:05:13

by Hidetoshi Seto

[permalink] [raw]
Subject: Re: [PATCH-v2] kdump: Enable kdump if 2nd-kernel is loaded.

Eric W. Biederman wrote:
> The Documentation for sysrq-c certainly needs to be updated.

No doubt about it.

> If I am doing development on a system I like oops's. All of the
> information and nothing goes down. I can get at /proc/kcore etc.
>
> In a setting where I can't be Johnny on the spot and look at
> things a core dump is probably the best I can get. In that scenario
> panic_on_oops sounds good.
>
> As I read the current check it reads:
> If we are going to panic and not oops || panic_on_oops
> kexec_should_crash = true;
>
> Which seems a very reasonable implementation of policy.

If I understand it,

a) panic
b) oops
c) SysRq (from keyboard interrupt)
d) echo c > /proc/sysrq-trigger

a b c d
common, w/o CONFIG_KEXEC:
panic oops panic oops :
panic panic panic panic : w/ panic_on_oops

A) 2.6.30 + CONFIG_KEXEC:
panic oops - - : w/o kdump-image
panic panic - - : w/o kdump-image w/ panic_on_oops
kdump oops kdump kdump : w/ kdump-image
kdump kdump kdump kdump : w/ kdump-image w/ panic_on_oops

B) 2.6.31-rc1 + CONFIG_KEXEC:
panic oops panic oops : w/o kdump-image
panic panic panic panic : w/o kdump-image w/ panic_on_oops
kdump oops kdump oops : w/ kdump-image
kdump kdump kdump kdump : w/ kdump-image w/ panic_on_oops

C) After this patch + CONFIG_KEXEC:
panic oops panic oops : w/o kdump-image
panic panic panic panic : w/o kdump-image w/ panic_on_oops
kdump kdump kdump kdump : w/ kdump-image
kdump kdump kdump kdump : w/ kdump-image w/ panic_on_oops

Ohmichi-san pointed "oops on d) with kdump-image" in B) and suggested
it should be "kdump."
And Eric pointed "kdump on b) without panic_on_oops" in C) and requested
it should be "oops."
(Plus, IMHO, "oops on d) without panic_on_oops" should be "panic.")

... Right?

Then as I mentioned, use panic in sysrq would be one of solutions.
Are there any better fix?


Thanks,
H.Seto