2007-11-24 04:28:50

by youquan_song

[permalink] [raw]
Subject: ipmi_watchdog can not reset the kernel panic machine

Build kernel-2.6.24-rc3. pmi_watchdog can not reset the kernel panic
machine. The watchdog can never to record panic information to IPMI SEL.

1. I disable auto reset when kernel panic by echo "0" >
/proc/sys/kernel/panic

2. modprobe ipmi_watchdog timeout=120 action=reset

3. Load a driver, the driver will call panic() when ioctl to call into
the driver.

4. By ioctl call into the driver, panic the system.

in wdog_panic_handler, I printk "ipmi_watchdog_state=WDOG_TIMEOUT_NONE"
so, the watchdog can never to record panic information to IPMI SEL.


static int wdog_panic_handler(struct notifier_block *this,
unsigned long event,
void *unused)
{
static int panic_event_handled = 0;

/* On a panic, if we have a panic timeout, make sure to extend
the watchdog timer to a reasonable value to complete the
panic, if the watchdog timer is running. Plus the
pretimeout is meaningless at panic time. */
if (watchdog_user && !panic_event_handled &&
ipmi_watchdog_state != WDOG_TIMEOUT_NONE) {
/* Make sure we do this only once. */
panic_event_handled = 1;

timeout = 255;
pretimeout = 0;
panic_halt_ipmi_set_timeout();
}

return NOTIFY_OK;
}


2007-11-25 06:40:10

by Andrew Morton

[permalink] [raw]
Subject: Re: ipmi_watchdog can not reset the kernel panic machine


(cc's added)

On Fri, 23 Nov 2007 20:28:41 -0800 (PST) [email protected] wrote:

> Build kernel-2.6.24-rc3. pmi_watchdog can not reset the kernel panic
> machine. The watchdog can never to record panic information to IPMI SEL.
>
> 1. I disable auto reset when kernel panic by echo "0" >
> /proc/sys/kernel/panic
>
> 2. modprobe ipmi_watchdog timeout=120 action=reset
>
> 3. Load a driver, the driver will call panic() when ioctl to call into
> the driver.
>
> 4. By ioctl call into the driver, panic the system.
>
> in wdog_panic_handler, I printk "ipmi_watchdog_state=WDOG_TIMEOUT_NONE"
> so, the watchdog can never to record panic information to IPMI SEL.
>
>
> static int wdog_panic_handler(struct notifier_block *this,
> unsigned long event,
> void *unused)
> {
> static int panic_event_handled = 0;
>
> /* On a panic, if we have a panic timeout, make sure to extend
> the watchdog timer to a reasonable value to complete the
> panic, if the watchdog timer is running. Plus the
> pretimeout is meaningless at panic time. */
> if (watchdog_user && !panic_event_handled &&
> ipmi_watchdog_state != WDOG_TIMEOUT_NONE) {
> /* Make sure we do this only once. */
> panic_event_handled = 1;
>
> timeout = 255;
> pretimeout = 0;
> panic_halt_ipmi_set_timeout();
> }
>
> return NOTIFY_OK;
> }

2007-11-26 00:26:43

by Corey Minyard

[permalink] [raw]
Subject: Re: ipmi_watchdog can not reset the kernel panic machine

The watchdog is "off" by default, meaning that you have to have
something actually start resetting the watchdog before it will start
running. That's why you are seeing this behavior.

There is a start_now option that will start the watchdog when it is
loaded, but then it will reset the system unless something resets the
watchdog periodically, and you have a limited time to start this operation.

On a panic, the IPMI driver attempts to preserve the state of the
watchdog and (if running) increase the timeout time to allow a kdump or
something like that to occur. That's the purpose of the code you
reference. It is not to start a reset operation on any panic. It used
to start a reset on every panic, but that cause problems for many users.

-corey

Andrew Morton wrote:
> (cc's added)
>
> On Fri, 23 Nov 2007 20:28:41 -0800 (PST) [email protected] wrote:
>
>
>> Build kernel-2.6.24-rc3. pmi_watchdog can not reset the kernel panic
>> machine. The watchdog can never to record panic information to IPMI SEL.
>>
>> 1. I disable auto reset when kernel panic by echo "0" >
>> /proc/sys/kernel/panic
>>
>> 2. modprobe ipmi_watchdog timeout=120 action=reset
>>
>> 3. Load a driver, the driver will call panic() when ioctl to call into
>> the driver.
>>
>> 4. By ioctl call into the driver, panic the system.
>>
>> in wdog_panic_handler, I printk "ipmi_watchdog_state=WDOG_TIMEOUT_NONE"
>> so, the watchdog can never to record panic information to IPMI SEL.
>>
>>
>> static int wdog_panic_handler(struct notifier_block *this,
>> unsigned long event,
>> void *unused)
>> {
>> static int panic_event_handled = 0;
>>
>> /* On a panic, if we have a panic timeout, make sure to extend
>> the watchdog timer to a reasonable value to complete the
>> panic, if the watchdog timer is running. Plus the
>> pretimeout is meaningless at panic time. */
>> if (watchdog_user && !panic_event_handled &&
>> ipmi_watchdog_state != WDOG_TIMEOUT_NONE) {
>> /* Make sure we do this only once. */
>> panic_event_handled = 1;
>>
>> timeout = 255;
>> pretimeout = 0;
>> panic_halt_ipmi_set_timeout();
>> }
>>
>> return NOTIFY_OK;
>> }
>>