2005-11-04 00:36:22

by JaniD++

Subject: Reboot problem.

Hello list,

Is there any way to force reboot after this:

Nov 3 21:31:39 192.168.2.50 kernel: ------------[ cut here ]------------
Nov 3 21:31:39 192.168.2.50 kernel: kernel BUG at mm/highmem.c:183!
Nov 3 21:31:39 192.168.2.50 kernel: invalid operand: 0000 [#1]
Nov 3 21:31:39 192.168.2.50 kernel: SMP
Nov 3 21:31:39 192.168.2.50 kernel: Modules linked in: netconsole
Nov 3 21:31:39 192.168.2.50 kernel: CPU: 3
Nov 3 21:31:39 192.168.2.50 kernel: EIP: 0060:[<c015094f>] Not tainted VLI
Nov 3 21:31:39 192.168.2.50 kernel: EFLAGS: 00010246 (2.6.14)
Nov 3 21:31:39 192.168.2.50 kernel: EIP is at kunmap_high+0x1f/0x93
Nov 3 21:31:39 192.168.2.50 kernel: eax: 00000000 ebx: c30e1280 ecx: c0764a78 edx: 00000286
Nov 3 21:31:39 192.168.2.50 kernel: esi: 00001000 edi: 00000000 ebp: f6eade64 esp: f6eade5c
Nov 3 21:31:39 192.168.2.50 kernel: ds: 007b es: 007b ss: 0068
Nov 3 21:31:39 192.168.2.50 kernel: Process md4_raid1 (pid: 3367, threadinfo=f6eac000 task=f7bd8030)
Nov 3 21:31:39 192.168.2.50 kernel: Stack: c30e1280 cc31b8d0 f6eade6c c0118349 f6eadec4 c038f626 c30e1280 00000001
Nov 3 21:31:39 192.168.2.50 kernel: ed143000 00001000 00004000 c0104e81 f6eadeec f7292d00 c07768ec f179e600
Nov 3 21:31:39 192.168.2.50 kernel: 13956025 01000000 e25b6edc f6eadeec 22010000 00801bbc 00900100 c07768cc
Nov 3 21:31:39 192.168.2.50 kernel: Call Trace:
Nov 3 21:31:39 192.168.2.50 kernel: [<c0103bf2>] show_stack+0x9a/0xd0
Nov 3 21:31:39 192.168.2.50 kernel: [<c0103db2>] show_registers+0x16a/0x1fa
Nov 3 21:31:39 192.168.2.50 kernel: [<c0103fc3>] die+0xfa/0x17c
Nov 3 21:31:39 192.168.2.50 kernel: [<c055510e>] do_trap+0x7e/0xb2
Nov 3 21:31:39 192.168.2.50 kernel: [<c010432d>] do_invalid_op+0xa9/0xb3
Nov 3 21:31:39 192.168.2.50 kernel: [<c01038ab>] error_code+0x4f/0x54
Nov 3 21:31:39 192.168.2.50 kernel: [<c0118349>] kunmap+0x42/0x44
Nov 3 21:31:39 192.168.2.50 kernel: [<c038f626>] nbd_send_req+0x1fc/0x297
Nov 3 21:31:39 192.168.2.50 kernel: [<c038fb17>] do_nbd_request+0xf4/0x27d
Nov 3 21:31:39 192.168.2.50 kernel: [<c0380e0c>] __generic_unplug_device+0x28/0x2e
Nov 3 21:31:39 192.168.2.50 kernel: [<c0380e2f>] generic_unplug_device+0x1d/0x2e
Nov 3 21:31:39 192.168.2.50 kernel: [<c046910b>] unplug_slaves+0x54/0xaf
Nov 3 21:31:39 192.168.2.50 kernel: [<c046a81b>] raid1d+0x288/0x2cb
Nov 3 21:31:39 192.168.2.50 kernel: [<c0478d4a>] md_thread+0x5f/0x10b
Nov 3 21:31:39 192.168.2.50 kernel: [<c0132f07>] kthread+0xb1/0xb5
Nov 3 21:31:39 192.168.2.50 kernel: [<c0101145>] kernel_thread_helper+0x5/0xb
Nov 3 21:31:39 192.168.2.50 kernel: Code: e8 08 06 00 00 89 c7 e9 38 ff ff ff 55 89 e5 53 83 ec 04 89 c3 b8 80 6c 68 c0 e8 3e
Nov 3 21:31:39 192.168.2.50 kernel: <0>Fatal exception: panic in 5 seconds


At this point the system is frozen, and only a reset can help.

Thanks

Janos


2005-11-04 05:42:59

by Willy Tarreau

Subject: Re: Reboot problem.

Hello,

On Fri, Nov 04, 2005 at 01:33:01AM +0100, JaniD++ wrote:
> Hello list,
>
> Is there any way to force reboot after this:
>
> Nov 3 21:31:39 192.168.2.50 kernel: ------------[ cut here ]------------
> Nov 3 21:31:39 192.168.2.50 kernel: kernel BUG at mm/highmem.c:183!
> Nov 3 21:31:39 192.168.2.50 kernel: invalid operand: 0000 [#1]
> Nov 3 21:31:39 192.168.2.50 kernel: SMP
> Nov 3 21:31:39 192.168.2.50 kernel: Modules linked in: netconsole
> Nov 3 21:31:39 192.168.2.50 kernel: CPU: 3
(...)
> Nov 3 21:31:39 192.168.2.50 kernel: Code: e8 08 06 00 00 89 c7 e9 38 ff ff
> ff 55 89 e5 53 83 ec 04 89 c3 b8 80 6c 68 c0 e8 3e
> Nov 3 21:31:39 192.168.2.50 kernel: <0>Fatal exception: panic in 5 seconds
>
> At this point the system is frozen, and only a reset can help.

It should have rebooted, but the system is too unstable to do so. In
this case, the only thing that can help is a hardware watchdog. Your
motherboard's chipset may well include a watchdog that you can enable
simply by loading its module and running a small daemon to ping it
(I have one which takes 12 kB of RAM and which tries mallocs, forks and
FS accesses).

If the daemon stops pinging the hardware watchdog for too long, the chipset
will simply assert the RESET signal and the system will reboot.
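
To illustrate the idea, here is a minimal sketch of such a daemon in
Python. It is not the 12 kB daemon mentioned above, only an example under
a few assumptions: the watchdog driver exposes the usual Linux
/dev/watchdog device node, any write to that node counts as a ping, and
the names system_is_healthy and run, the 10 second interval and the
max_cycles parameter are made up for this example.

```python
import os
import tempfile
import time

# Usual Linux watchdog device node (assumed to be provided by the driver).
WATCHDOG_DEV = "/dev/watchdog"


def system_is_healthy(scratch_dir=None):
    """Return True if an allocation, a fork and a file write all succeed."""
    try:
        buf = bytearray(1 << 20)      # memory allocation check
        buf[-1] = 1
        pid = os.fork()               # process creation check
        if pid == 0:
            os._exit(0)               # child exits immediately
        _, status = os.waitpid(pid, 0)
        if status != 0:
            return False
        # Filesystem check: create, write and sync a scratch file.
        with tempfile.NamedTemporaryFile(dir=scratch_dir) as f:
            f.write(b"ok")
            f.flush()
            os.fsync(f.fileno())
        return True
    except (OSError, MemoryError):
        return False


def run(device=WATCHDOG_DEV, interval=10.0, max_cycles=None):
    """Ping the watchdog once per cycle, but only while checks pass.

    max_cycles is a hypothetical knob for trying the loop out; a real
    daemon would leave it at None and never return.
    """
    fd = os.open(device, os.O_WRONLY)
    pings = 0
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        if system_is_healthy():
            os.write(fd, b"\0")       # any write counts as a ping
            pings += 1
        # If the checks fail we skip the ping; once the hardware timeout
        # expires without one, the chipset resets the machine.
        time.sleep(interval)
        cycles += 1
    os.close(fd)                      # only reached in bounded trial runs
    return pings
```

Calling run() as root, after loading the watchdog module for your
chipset, would start pinging. The interval has to be shorter than the
timeout the hardware is configured with, otherwise a healthy system gets
reset too.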

> Thanks
>
> Janos

Regards,
Willy