2024-05-28 15:16:16

by David Wang

Subject: [BUG] 6.10.0-rc1: segfault at 0 when reboot with kernel config INIT_MLOCKED_ON_FREE_DEFAULT_ON=y

Hi,

My kernel is 6.10.0-rc1 with CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON=y, and
I got the following screen when I executed `systemctl reboot` on my system.
(The text was extracted from a console image, so there may be some parse errors. My kernel was tainted mostly because of the nvidia driver.)

[   42.855067] watchdog: watchdog0: watchdog did not stop!
[   42.905871] show_signal_msg: 14 callbacks suppressed
[   42.905874] systemd-shutdow[1]: segfault at 0 ip 0000000000000000 sp 00007ffcc8af7318 error 14 likely on CPU 6 (core 4, socke
[   42.906017] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[   42.906080] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   42.906143] CPU: 6 PID: 1 Comm: systemd-shutdow Tainted: P OE 6.10.0-rc1-linan-0 #244
[   42.906220] Hardware name: Micro-Star International Co., Ltd. MS-7889/B450M MORTAR MAX (MS-7889), BIOS 2.80 06/10/2020
[   42.906308] Call Trace:
[   42.906329] <TASK>
[   42.906346] panic+0x31d/0x350
[   42.906375] ? srso_return_thunk+0x5/0x5f
[   42.906411] do_exit+0x968/0xad0
[   42.906441] do_group_exit+0x2c/0x80
[   42.906472] get_signal+0x876/0x8a0
[   42.906502] arch_do_signal_or_restart+0x2a/0x240
[   42.906544] irqentry_exit_to_user_mode+0xc2/0x160
[   42.906585] asm_exc_page_fault+0x22/0x30
[   42.906619] RIP: 0033:0x0
[   42.906640] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[   42.906693] RSP: 002b:00007ffcc8af7318 EFLAGS: 00010206
[   42.906736] RAX: 0000000000000011 RBX: 000000000328adea RCX: 0000000000000005
[   42.906794] RDX: 00007ffcc8af73b0 RSI: 0000000000000ea8 RDI: 0000000000000001
[   42.906852] RBP: 00007ffcc8af73be R08: 003b459fca9238c4 R09: 0000000000000069
[   42.906910] R10: 0000000000000008 R11: 0000000000000202 R12: 00007ffcc8af7330
[   42.906968] R13: 0000000000000ea8 R14: 0000000000000000 R15: 0000000000000000
[   42.907029] </TASK>
[   42.907328] Kernel Offset: 0x21a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   43.081928] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---


After I rebuilt the kernel with `CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON not set`, the system reboots normally.
My guess is that some memory is improperly zeroed during reboot?
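
For reading the log: the `error 14` field in the segfault line is an x86 page-fault error code printed in hexadecimal. A minimal sketch for decoding it follows; the helper name `decode_pf_error` and the wording of the bit descriptions are illustrative assumptions, not kernel identifiers.

```python
# x86 page-fault error code bits, lowest five bits only.
# The descriptions are paraphrased for illustration.
PF_BITS = {
    0: "protection violation (page was present)",
    1: "write access",
    2: "user-mode access",
    3: "reserved bit set in page table",
    4: "instruction fetch",
}


def decode_pf_error(code: int) -> list[str]:
    """Return descriptions of the bits set in an x86 page-fault error code."""
    return [desc for bit, desc in PF_BITS.items() if code & (1 << bit)]


# "error 14" in the segfault line is hexadecimal, i.e. 0x14.
for flag in decode_pf_error(0x14):
    print(flag)
```

Bits 2 and 4 are set and bit 0 is clear, which reads as a user-mode instruction fetch from a page that was not present. Together with `ip 0000000000000000`, that looks like the process jumped to address 0, which would fit a pointer having been zeroed, although the log alone does not prove it.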


David



2024-06-04 13:50:58

by David Hildenbrand

Subject: Re: [BUG] 6.10.0-rc1: segfault at 0 when reboot with kernel config INIT_MLOCKED_ON_FREE_DEFAULT_ON=y

On 28.05.24 17:13, David Wang wrote:
> Hi,
>
> My kernel is 6.10.0-rc1 with CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON=y, and
> I got the following screen when I executed `systemctl reboot` on my system.
> (The text was extracted from a console image, so there may be some parse errors. My kernel was tainted mostly because of the nvidia driver.)
>
> [   42.855067] watchdog: watchdog0: watchdog did not stop!
> [   42.905871] show_signal_msg: 14 callbacks suppressed
> [   42.905874] systemd-shutdow[1]: segfault at 0 ip 0000000000000000 sp 00007ffcc8af7318 error 14 likely on CPU 6 (core 4, socke
> [   42.906017] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> [   42.906080] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [   42.906143] CPU: 6 PID: 1 Comm: systemd-shutdow Tainted: P OE 6.10.0-rc1-linan-0 #244
> [   42.906220] Hardware name: Micro-Star International Co., Ltd. MS-7889/B450M MORTAR MAX (MS-7889), BIOS 2.80 06/10/2020
> [   42.906308] Call Trace:
> [   42.906329] <TASK>
> [   42.906346] panic+0x31d/0x350
> [   42.906375] ? srso_return_thunk+0x5/0x5f
> [   42.906411] do_exit+0x968/0xad0
> [   42.906441] do_group_exit+0x2c/0x80
> [   42.906472] get_signal+0x876/0x8a0
> [   42.906502] arch_do_signal_or_restart+0x2a/0x240
> [   42.906544] irqentry_exit_to_user_mode+0xc2/0x160
> [   42.906585] asm_exc_page_fault+0x22/0x30
> [   42.906619] RIP: 0033:0x0
> [   42.906640] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> [   42.906693] RSP: 002b:00007ffcc8af7318 EFLAGS: 00010206
> [   42.906736] RAX: 0000000000000011 RBX: 000000000328adea RCX: 0000000000000005
> [   42.906794] RDX: 00007ffcc8af73b0 RSI: 0000000000000ea8 RDI: 0000000000000001
> [   42.906852] RBP: 00007ffcc8af73be R08: 003b459fca9238c4 R09: 0000000000000069
> [   42.906910] R10: 0000000000000008 R11: 0000000000000202 R12: 00007ffcc8af7330
> [   42.906968] R13: 0000000000000ea8 R14: 0000000000000000 R15: 0000000000000000
> [   42.907029] </TASK>
> [   42.907328] Kernel Offset: 0x21a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [   43.081928] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
>
>
> After I rebuilt the kernel with `CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON not set`, the system reboots normally.
> My guess is that some memory is improperly zeroed during reboot?

I stumbled over that as well [1], maybe because of the messed-up
interaction with fork().

This should be reverted. I am waiting for Andrew's reply to my mail before I
send a patch to do that.

[1]
https://lkml.kernel.org/r/[email protected]

--
Cheers,

David / dhildenb