Hello,
In some cases we've observed a kernel panic when trying to boot on
ppc64le with a kernel based on 5.13.0-rc3. We are not sure whether it
could be related to this patch:
https://lore.kernel.org/lkml/[email protected]/
[ 1.516075] wait_for_initramfs() called before rootfs_initcalls
[ 1.520467] raid6: skip pq benchmark and using algorithm vpermxor8
[ 1.520475] raid6: using intx1 recovery algorithm
[ 1.520654] iommu: Default domain type: Translated
[ 1.520733] vgaarb: loaded
[ 1.520937] SCSI subsystem initialized
[ 1.521104] usbcore: registered new interface driver usbfs
[ 1.521118] usbcore: registered new interface driver hub
[ 1.521179] usbcore: registered new device driver usb
[ 1.521207] pps_core: LinuxPPS API ver. 1 registered
[ 1.521212] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <[email protected]>
[ 1.521220] PTP clock support registered
[ 1.521388] EDAC MC: Ver: 3.0.0
[ 1.521784] NetLabel: Initializing
[ 1.521789] NetLabel: domain hash size = 128
[ 1.521793] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 1.521812] NetLabel: unlabeled traffic allowed by default
[ 1.523330] clocksource: Switched to clocksource timebase
[ 1.546838] VFS: Disk quotas dquot_6.6.0
[ 1.546981] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
[ 1.549556] NET: Registered protocol family 2
[ 1.549749] IP idents hash table entries: 262144 (order: 5, 2097152 bytes, vmalloc)
[ 1.555300] tcp_listen_portaddr_hash hash table entries: 65536 (order: 4, 1048576 bytes, vmalloc)
[ 1.555559] TCP established hash table entries: 524288 (order: 6, 4194304 bytes, vmalloc)
[ 1.556874] TCP bind hash table entries: 65536 (order: 4, 1048576 bytes, vmalloc)
[ 1.557053] TCP: Hash tables configured (established 524288 bind 65536)
[ 1.557788] MPTCP token hash table entries: 65536 (order: 4, 1572864 bytes, vmalloc)
[ 1.558051] UDP hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
[ 1.558449] UDP-Lite hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
[ 1.559536] NET: Registered protocol family 1
[ 1.559548] NET: Registered protocol family 44
[ 1.559681] pci 0005:03:00.0: enabling device (0140 -> 0142)
[ 1.559757] PCI: CLS 128 bytes, default 128
[ 1.560150] Trying to unpack rootfs image as initramfs...
[ 1.560203] rtas_flash: no firmware flash support
[ 1.575826] Initialise system trusted keyrings
[ 1.575859] Key type blacklist registered
[ 1.576031] workingset: timestamp_bits=38 max_order=24 bucket_order=0
[ 1.578756] zbud: loaded
[ 1.620652] NET: Registered protocol family 38
[ 1.620705] xor: measuring software checksum speed
[ 1.621978] 8regs : 7789 MB/sec
[ 1.623409] 8regs_prefetch : 7482 MB/sec
[ 1.624493] 32regs : 9208 MB/sec
[ 1.625666] 32regs_prefetch : 8471 MB/sec
[ 1.626333] altivec : 15060 MB/sec
[ 1.626339] xor: using function: altivec (15060 MB/sec)
[ 1.626348] Key type asymmetric registered
[ 1.626354] Asymmetric key parser 'x509' registered
[ 1.626417] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 245)
[ 1.626604] io scheduler mq-deadline registered
[ 1.626610] io scheduler kyber registered
[ 1.626812] io scheduler bfq registered
[ 1.635490] atomic64_test: passed
[ 1.636168] random: crng init done
[ 1.636563] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 1.636815] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
[ 1.637064] hvc1: hvsi protocol on /ibm,opal/consoles/serial@1
[ 1.637091] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[ 1.638860] Non-volatile memory driver v1.3
[ 1.640714] libphy: Fixed MDIO Bus: probed
[ 1.640864] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 1.640880] ehci-pci: EHCI PCI platform driver
[ 1.640895] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 1.640912] ohci-pci: OHCI PCI platform driver
[ 1.640925] uhci_hcd: USB Universal Host Controller Interface driver
[ 1.641099] xhci_hcd 0005:03:00.0: xHCI Host Controller
[ 1.641232] xhci_hcd 0005:03:00.0: new USB bus registered, assigned bus number 1
[ 1.641379] xhci_hcd 0005:03:00.0: hcc params 0x0270f06d hci version 0x96 quirks 0x0000000004000000
[ 1.641849] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.13
[ 1.641858] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 1.641865] usb usb1: Product: xHCI Host Controller
[ 1.641871] usb usb1: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
[ 1.641876] usb usb1: SerialNumber: 0005:03:00.0
[ 1.642025] hub 1-0:1.0: USB hub found
[ 1.642040] hub 1-0:1.0: 4 ports detected
[ 1.642211] xhci_hcd 0005:03:00.0: xHCI Host Controller
[ 1.642275] xhci_hcd 0005:03:00.0: new USB bus registered, assigned bus number 2
[ 1.642283] xhci_hcd 0005:03:00.0: Host supports USB 3.0 SuperSpeed
[ 1.642305] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 1.642383] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.13
[ 1.642391] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 1.642398] usb usb2: Product: xHCI Host Controller
[ 1.642403] usb usb2: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
[ 1.642409] usb usb2: SerialNumber: 0005:03:00.0
[ 1.642543] hub 2-0:1.0: USB hub found
[ 1.642557] hub 2-0:1.0: 4 ports detected
[ 1.642732] usbcore: registered new interface driver usbserial_generic
[ 1.642744] usbserial: USB Serial support registered for generic
[ 1.642786] mousedev: PS/2 mouse device common for all mice
[ 1.646583] device-mapper: uevent: version 1.0.3
[ 1.646730] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22) initialised: [email protected]
[ 1.647019] powernv-cpufreq: cpufreq pstate min 0xffffffda nominal 0xfffffff7 max 0x0
[ 1.647027] powernv-cpufreq: Workload Optimized Frequency is disabled in the platform
[ 1.652136] hid: raw HID events driver (C) Jiri Kosina
[ 1.652178] usbcore: registered new interface driver usbhid
[ 1.652183] usbhid: USB HID core driver
[ 1.652441] drop_monitor: Initializing network drop monitor service
[ 1.652599] Initializing XFRM netlink socket
[ 1.653049] NET: Registered protocol family 10
[ 1.764430] Initramfs unpacking failed: no cpio magic
[ 1.766204] Freeing initrd memory: 18176K
[ 1.768734] Segment Routing with IPv6
[ 1.768747] RPL Segment Routing with IPv6
[ 1.768792] mip6: Mobile IPv6
[ 1.768801] NET: Registered protocol family 17
[ 1.768912] secvar-sysfs: secvar: failed to retrieve secvar operations.
[ 1.768961] drmem: No dynamic reconfiguration memory found
[ 1.769297] registered taskstats version 1
[ 1.769329] Loading compiled-in X.509 certificates
[ 1.771836] Loaded X.509 cert 'Build time autogenerated kernel key: 97604d93c367cf27b215cdfd062467d582f7e126'
[ 1.774441] zswap: loaded using pool lzo/zbud
[ 1.774697] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers
[ 1.775440] Key type ._fscrypt registered
[ 1.775450] Key type .fscrypt registered
[ 1.775458] Key type fscrypt-provisioning registered
[ 1.778036] Btrfs loaded, crc32c=crc32c-generic, zoned=yes
[ 1.778099] pstore: Using crash dump compression: deflate
[ 1.778765] Key type encrypted registered
[ 1.779101] Secure boot mode disabled
[ 1.779113] ima: No TPM chip found, activating TPM-bypass!
[ 1.779126] Loading compiled-in module X.509 certificates
[ 1.781448] Loaded X.509 cert 'Build time autogenerated kernel key: 97604d93c367cf27b215cdfd062467d582f7e126'
[ 1.781465] ima: Allocated hash algorithm: sha256
[ 1.781727] Secure boot mode disabled
[ 1.781963] Trusted boot mode disabled
[ 1.781972] ima: No architecture policies found
[ 1.782008] evm: Initialising EVM extended attributes:
[ 1.782017] evm: security.selinux
[ 1.782024] evm: security.ima
[ 1.782032] evm: security.capability
[ 1.782040] evm: HMAC attrs: 0x1
[ 1.787618] Freeing unused kernel memory: 5760K
[ 1.787634] Kernel memory protection not selected by kernel config.
[ 1.787649] Run /init as init process
[ 1.787793] Failed to execute /init (error -2)
[ 1.787801] Run /sbin/init as init process
[ 1.787842] Run /etc/init as init process
[ 1.787880] Run /bin/init as init process
[ 1.787921] Run /bin/sh as init process
[ 1.787978] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
[ 1.787993] CPU: 24 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc3 #1
[ 1.788004] Call Trace:
[ 1.788009] [c000004001083cc0] [c0000000009f2640] dump_stack+0xc4/0x114 (unreliable)
[ 1.788027] [c000004001083d10] [c00000000014b9e0] panic+0x168/0x408
[ 1.788040] [c000004001083da0] [c000000000012964] kernel_init+0x14c/0x168
[ 1.788052] [c000004001083e10] [c00000000000d6ec] ret_from_kernel_thread+0x5c/0x70
[ 1.813436] ---[ end Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance. ]---
More logs:
https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/tests/10112063_ppc64le_2_console.log
https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/08/317255035/build_ppc64le_redhat:1329401048/tests/10107307_ppc64le_2_console.log
https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/08/105424895c02858922e1bf27ef01127e12caca9a/build_ppc64le_redhat:1326137946/tests/1/results_0001/console.log/console.log
Thanks,
Bruno Goncalves
On 10/06/2021 13.47, Bruno Goncalves wrote:
> Hello,
>
> We've observed in some cases kernel panic when trying to boot on
> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
> could be related to patch
> https://lore.kernel.org/lkml/[email protected]/
>
Thanks for the report. It's possible, but I'll need some help from you
to get more info.
First, can you send me the .config?
>
> [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
non-empty path (/sbin/hotplug perhaps). This did get reported once before:
https://lore.kernel.org/lkml/[email protected]/
I think I should go and prepare a patch that moves the
usermodehelper_enable() call to after initramfs unpacking has been
initiated.
But until then, can you check if you do have UEVENT_HELPER_PATH set, and
if so, does changing it to the empty string make a change wrt this crash?
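One way to run that check against a build's .config is sketched below in Python. The helper name `kconfig_value` and the sample text are made up for illustration; the sketch only assumes the usual .config convention that a disabled option shows up as a `# CONFIG_FOO is not set` comment.

```python
def kconfig_value(config_text, option):
    """Return the value of `option` from .config text, or None if unset or absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return None
        if line.startswith(option + "="):
            # Kconfig wraps string values in double quotes; drop them.
            return line.split("=", 1)[1].strip('"')
    return None


# Hypothetical sample, not taken from the reporter's .config.
sample = 'CONFIG_UEVENT_HELPER=y\nCONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"\n'
print(kconfig_value(sample, "CONFIG_UEVENT_HELPER_PATH"))
```

A result of None or an empty string would mean no uevent helper path is configured in that build.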
> [ 1.559757] PCI: CLS 128 bytes, default 128
> [ 1.560150] Trying to unpack rootfs image as initramfs...
OK, so now we got to populate_rootfs() and have kicked off a worker to
do the unpacking. Meanwhile, PID1 goes on to do other initcalls.
...
> [ 1.764430] Initramfs unpacking failed: no cpio magic
Whoa, that's not good. Did something scramble over the initramfs memory
while it was being unpacked? It's been .2 seconds since the start of the
unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
Can you try booting with initramfs_async=0 on the command line and see
if the kernel still crashes?
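For reference, "no cpio magic" should mean that the bytes where a cpio header was expected are not the newc magic "070701". A rough userspace sketch in Python of that kind of check on the head of an image follows. The function name and return strings are invented for illustration, and the two-byte compression signatures are the ones the kernel's decompressor table is believed to use, so treat that table as an assumption rather than a quote from the source.

```python
# Two-byte signatures, believed to match the kernel's decompressor table
# (an assumption for this sketch, not copied from the thread).
COMPRESSION_MAGIC = {
    b"\x1f\x8b": "gzip",
    b"\x42\x5a": "bzip2",
    b"\x5d\x00": "lzma",
    b"\xfd\x37": "xz",
    b"\x89\x4c": "lzo",
    b"\x02\x21": "lz4",
    b"\x28\xb5": "zstd",
}


def classify_initramfs_head(head):
    """Guess what the first bytes of an initramfs segment look like."""
    if head[:6] == b"070701":
        return "cpio newc"
    name = COMPRESSION_MAGIC.get(bytes(head[:2]))
    if name is not None:
        return name + " compressed"
    return "unrecognised"
```

Running it on the first few bytes of the initrd file that was handed to the bootloader would at least confirm that the image on disk starts out sane.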
> [ 1.766204] Freeing initrd memory: 18176K
...
> [ 1.787649] Run /init as init process
> [ 1.787793] Failed to execute /init (error -2)
> [ 1.787801] Run /sbin/init as init process
> [ 1.787842] Run /etc/init as init process
> [ 1.787880] Run /bin/init as init process
> [ 1.787921] Run /bin/sh as init process
> [ 1.787978] Kernel panic - not syncing: No working init found. Try
Yeah, well, this is expected when unpacking the initramfs failed.
So I think the problem is the "no cpio magic", i.e. the initramfs got
corrupted somehow. But I don't have any idea why that would happen -
freeing the initramfs memory only happens after unpacking is done
(naturally...).
Rasmus
On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
<[email protected]> wrote:
>
> On 10/06/2021 13.47, Bruno Goncalves wrote:
> > Hello,
> >
> > We've observed in some cases kernel panic when trying to boot on
> > ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
> > could be related to patch
> > https://lore.kernel.org/lkml/[email protected]/
> >
>
> Thanks for the report. It's possible, but I'll need some help from you
> to get more info.
>
> First, can you send me the .config?
The .config is on
https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config
>
> >
> > [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
>
> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
>
CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
CONFIG_UEVENT_HELPER is not set"
> https://lore.kernel.org/lkml/[email protected]/
>
> I think I should go and prepare a patch that moves the
> usermodehelper_enable() call to after initramfs unpacking has been
> initiated.
>
> But until then, can you check if you do have UEVENT_HELPER_PATH set, and
> if so, does changing it to the empty string make a change wrt this crash?
>
>
> > [ 1.559757] PCI: CLS 128 bytes, default 128
> > [ 1.560150] Trying to unpack rootfs image as initramfs...
>
> OK, so now we got to populate_rootfs() and have kicked off a worker to
> do the unpacking. Meanwhile, PID1 goes on to do other initcalls.
>
> ...
>
> > [ 1.764430] Initramfs unpacking failed: no cpio magic
>
> Whoa, that's not good. Did something scramble over the initramfs memory
> while it was being unpacked? It's been .2 seconds since the start of the
> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
>
> Can you try booting with initramfs_async=0 on the command line and see
> if the kernel still crashes?
We are not able to reproduce it 100% of the time, but sure I can try
with this option and see what happens.
We've also seen:
Initramfs unpacking failed: junk within compressed archive
This can be seen on the other 2 console logs that I provided the link to.
Bruno
On 10/06/2021 17.14, Bruno Goncalves wrote:
> On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
> <[email protected]> wrote:
>>
>> On 10/06/2021 13.47, Bruno Goncalves wrote:
>>> Hello,
>>>
>>> We've observed in some cases kernel panic when trying to boot on
>>> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
>>> could be related to patch
>>> https://lore.kernel.org/lkml/[email protected]/
>>>
>>
>> Thanks for the report. It's possible, but I'll need some help from you
>> to get more info.
>>
>> First, can you send me the .config?
>
> The .config is on
> https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config
Thanks.
>>
>>>
>>> [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
>>
>> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
>> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
>>
>
> CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
> CONFIG_UEVENT_HELPER is not set"
OK. Then I assume some quite early initcall does a request_module() or
request_firmware() (or similar). I don't think this matters - that call
would be done before the initramfs was unpacked with or without my
patch, so it won't find anything in the empty rootfs. It's just my patch
added a note. But just to figure out where that triggers, can you do
- pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
+ WARN_ONCE(1, "wait_for_initramfs() called before rootfs_initcalls\n");
in init/initramfs.c.
>>> [ 1.764430] Initramfs unpacking failed: no cpio magic
>>
>> Whoa, that's not good. Did something scramble over the initramfs memory
>> while it was being unpacked? It's been .2 seconds since the start of the
>> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
>>
>> Can you try booting with initramfs_async=0 on the command line and see
>> if the kernel still crashes?
>
> We are not able to reproduce it 100% of the time, but sure I can try
> with this option and see what happens.
>
> We've also seen:
> Initramfs unpacking failed: junk within compressed archive
>
> This can be seen on the other 2 console logs that I provided the link to.
Yes, I saw that. This, and the fact that it's not 100% reproducible, is
consistent with the problem being some race that happens to write over
the compressed initramfs image - sometimes, the decompressor can still
make sense of the bits, but the output is no longer a valid cpio
archive, and sometimes already the decompressor notices the corruption.
I wonder if there is some way to mark the pages occupied by the
compressed initramfs as read-only - that would hopefully trigger a nice
crash with a backtrace pointing at whoever writes to that memory.
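The two failure modes can be imitated in userspace. The Python sketch below is only an illustration under stated assumptions: it handles gzip only, the names `unpack_outcome` and `CPIO_MAGIC` are made up, and it is not how the kernel's unpacker is structured. It feeds a compressed image to the decompressor piece by piece and checks the output for the cpio magic as soon as enough bytes exist, so either the decompressor or the archive check can be the first to complain.

```python
import gzip
import zlib

CPIO_MAGIC = b"070701"


def unpack_outcome(image, chunk=64):
    """Feed a gzip image to the decompressor piecewise and report how far it gets."""
    inflater = zlib.decompressobj(wbits=31)  # 31 selects the gzip container
    out = b""
    try:
        for start in range(0, len(image), chunk):
            out += inflater.decompress(image[start:start + chunk])
            # The archive check runs on whatever output exists so far.
            if len(out) >= len(CPIO_MAGIC) and not out.startswith(CPIO_MAGIC):
                return "no cpio magic"
    except zlib.error:
        return "decompressor error"
    if not inflater.eof:
        return "decompressor error"  # compressed stream ended early
    return "ok" if out.startswith(CPIO_MAGIC) else "no cpio magic"


good = gzip.compress(CPIO_MAGIC + bytes(range(256)) * 4)
bad_crc = bytearray(good)
bad_crc[-8] ^= 0xFF  # damage the stored CRC32 in the gzip trailer
not_cpio = gzip.compress(b"not an archive at all")

for label, image in (("good", good), ("bad crc", bytes(bad_crc)), ("not cpio", not_cpio)):
    print(label, "->", unpack_outcome(image))
```

Which of the two errors a stray write produces depends on where in the compressed stream it lands, which would fit seeing both messages across different boots.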
Rasmus
On Fri, Jun 11, 2021 at 9:13 AM Rasmus Villemoes
<[email protected]> wrote:
>
> On 10/06/2021 17.14, Bruno Goncalves wrote:
> > On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
> > <[email protected]> wrote:
> >>
> >> On 10/06/2021 13.47, Bruno Goncalves wrote:
> >>> Hello,
> >>>
> >>> We've observed in some cases kernel panic when trying to boot on
> >>> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
> >>> could be related to patch
> >>> https://lore.kernel.org/lkml/[email protected]/
> >>>
> >>
> >> Thanks for the report. It's possible, but I'll need some help from you
> >> to get more info.
> >>
> >> First, can you send me the .config?
> >
> > The .config is on
> > https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config
>
> Thanks.
>
> >>
> >>>
> >>> [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
> >>
> >> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
> >> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
> >>
> >
> > CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
> > CONFIG_UEVENT_HELPER is not set"
>
> OK. Then I assume some quite early initcall does a request_module() or
> request_firmware() (or similar). I don't think this matters - that call
> would be done before the initramfs was unpacked with or without my
> patch, so it won't find anything in the empty rootfs. It's just my patch
> added a note. But just to figure out where that triggers, can you do
>
> - pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
> + WARN_ONCE(1, "wait_for_initramfs() called before rootfs_initcalls\n");
>
> in init/initramfs.c.
>
I've managed to reproduce the panic with the patch.
[ 1.303632] Kprobes globally optimized
[ 1.304131] HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
[ 1.304138] HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
[ 1.498349] alg: No test for 842 (842-generic)
[ 1.498395] alg: No test for 842 (842-scomp)
[ 1.498517] ------------[ cut here ]------------
[ 1.498524] wait_for_initramfs() called before rootfs_initcalls
[ 1.498532] WARNING: CPU: 13 PID: 1218 at init/initramfs.c:719 wait_for_initramfs+0x94/0xa4
[ 1.498545] Modules linked in:
[ 1.498550] CPU: 13 PID: 1218 Comm: kworker/u385:0 Not tainted 5.13.0-rc3 #1
[ 1.498558] NIP: c0000000000137d4 LR: c0000000000137d0 CTR: c0000000000c9e70
[ 1.498564] REGS: c000000027debac0 TRAP: 0700 Not tainted (5.13.0-rc3)
[ 1.498571] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28000202 XER: 20000000
[ 1.498586] CFAR: c00000000014b874 IRQMASK: 0
[ 1.498586] GPR00: c0000000000137d0 c000000027debd60 c000000001e30e00 0000000000000033
[ 1.498586] GPR04: 00000000ffff7fff c000000027deb9f8 c000000027deb9f0 0000000000000000
[ 1.498586] GPR08: 0000003feb5b0000 c000000001bb5b50 c000000001bb5b50 0000000000000001
[ 1.498586] GPR12: 0000000000000000 c000003fffff1680 c000000000172e58 c0000000267822e0
[ 1.498586] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1.498586] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1.498586] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1.498586] GPR28: 0000000000000000 0000000000000000 c000000028111c00 c0000000267822e0
[ 1.498654] NIP [c0000000000137d4] wait_for_initramfs+0x94/0xa4
[ 1.498661] LR [c0000000000137d0] wait_for_initramfs+0x90/0xa4
[ 1.498668] Call Trace:
[ 1.498671] [c000000027debd60] [c0000000000137d0] wait_for_initramfs+0x90/0xa4 (unreliable)
[ 1.498680] [c000000027debdc0] [c000000000172fc8] call_usermodehelper_exec_async+0x178/0x2c0
[ 1.498691] [c000000027debe10] [c00000000000d6ec] ret_from_kernel_thread+0x5c/0x70
[ 1.498699] Instruction dump:
[ 1.498703] 7c0803a6 4e800020 60420000 7c0802a6 39200001 3d42fff4 3c62ff5e 3863fb08
[ 1.498715] 992ab06c f8010070 48138041 60000000 <0fe00000> e8010070 7c0803a6 4bffff94
[ 1.498728] ---[ end trace dca36620a70fa99e ]---
[ 1.503062] raid6: skip pq benchmark and using algorithm vpermxor8
[ 1.503070] raid6: using intx1 recovery algorithm
[ 1.503258] iommu: Default domain type: Translated
[ 1.503339] vgaarb: loaded
[ 1.503535] SCSI subsystem initialized
[ 1.503699] usbcore: registered new interface driver usbfs
[ 1.503712] usbcore: registered new interface driver hub
[ 1.503789] usbcore: registered new device driver usb
[ 1.503816] pps_core: LinuxPPS API ver. 1 registered
[ 1.503820] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <[email protected]>
[ 1.503829] PTP clock support registered
[ 1.504012] EDAC MC: Ver: 3.0.0
[ 1.504425] NetLabel: Initializing
[ 1.504429] NetLabel: domain hash size = 128
[ 1.504433] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 1.504452] NetLabel: unlabeled traffic allowed by default
[ 1.505953] clocksource: Switched to clocksource timebase
[ 1.528908] VFS: Disk quotas dquot_6.6.0
[ 1.529034] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
[ 1.531252] NET: Registered protocol family 2
[ 1.531428] IP idents hash table entries: 262144 (order: 5, 2097152 bytes, vmalloc)
[ 1.536616] tcp_listen_portaddr_hash hash table entries: 65536 (order: 4, 1048576 bytes, vmalloc)
[ 1.536818] TCP established hash table entries: 524288 (order: 6, 4194304 bytes, vmalloc)
[ 1.537868] TCP bind hash table entries: 65536 (order: 4, 1048576 bytes, vmalloc)
[ 1.538003] TCP: Hash tables configured (established 524288 bind 65536)
[ 1.538632] MPTCP token hash table entries: 65536 (order: 4, 1572864 bytes, vmalloc)
[ 1.538841] UDP hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
[ 1.539112] UDP-Lite hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
[ 1.540030] NET: Registered protocol family 1
[ 1.540040] NET: Registered protocol family 44
[ 1.540140] pci 0005:03:00.0: enabling device (0140 -> 0142)
[ 1.540202] PCI: CLS 128 bytes, default 128
[ 1.540332] Trying to unpack rootfs image as initramfs...
[ 1.540407] rtas_flash: no firmware flash support
[ 1.550617] Initialise system trusted keyrings
[ 1.550641] Key type blacklist registered
[ 1.550787] workingset: timestamp_bits=38 max_order=24 bucket_order=0
[ 1.552891] zbud: loaded
[ 1.596393] NET: Registered protocol family 38
[ 1.596434] xor: measuring software checksum speed
[ 1.597187] 8regs : 13172 MB/sec
[ 1.598009] 8regs_prefetch : 12027 MB/sec
[ 1.598734] 32regs : 13657 MB/sec
[ 1.599504] 32regs_prefetch : 12869 MB/sec
[ 1.599889] altivec : 25910 MB/sec
[ 1.599893] xor: using function: altivec (25910 MB/sec)
[ 1.599899] Key type asymmetric registered
[ 1.599903] Asymmetric key parser 'x509' registered
[ 1.599934] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 245)
[ 1.600092] io scheduler mq-deadline registered
[ 1.600097] io scheduler kyber registered
[ 1.600233] io scheduler bfq registered
[ 1.602775] atomic64_test: passed
[ 1.603547] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 1.603754] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
[ 1.603994] hvc1: hvsi protocol on /ibm,opal/consoles/serial@1
[ 1.604021] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[ 1.605830] Non-volatile memory driver v1.3
[ 1.607633] libphy: Fixed MDIO Bus: probed
[ 1.607773] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 1.607790] ehci-pci: EHCI PCI platform driver
[ 1.607804] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 1.607822] ohci-pci: OHCI PCI platform driver
[ 1.607835] uhci_hcd: USB Universal Host Controller Interface driver
[ 1.608005] xhci_hcd 0005:03:00.0: xHCI Host Controller
[ 1.608116] xhci_hcd 0005:03:00.0: new USB bus registered, assigned bus number 1
[ 1.608265] xhci_hcd 0005:03:00.0: hcc params 0x0270f06d hci version 0x96 quirks 0x0000000004000000
[ 1.608728] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.13
[ 1.608737] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 1.608744] usb usb1: Product: xHCI Host Controller
[ 1.608749] usb usb1: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
[ 1.608755] usb usb1: SerialNumber: 0005:03:00.0
[ 1.608915] hub 1-0:1.0: USB hub found
[ 1.608931] hub 1-0:1.0: 4 ports detected
[ 1.609104] xhci_hcd 0005:03:00.0: xHCI Host Controller
[ 1.609162] xhci_hcd 0005:03:00.0: new USB bus registered, assigned bus number 2
[ 1.609179] xhci_hcd 0005:03:00.0: Host supports USB 3.0 SuperSpeed
[ 1.609203] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[ 1.609235] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.13
[ 1.609243] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 1.609250] usb usb2: Product: xHCI Host Controller
[ 1.609255] usb usb2: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
[ 1.609261] usb usb2: SerialNumber: 0005:03:00.0
[ 1.609378] hub 2-0:1.0: USB hub found
[ 1.609391] hub 2-0:1.0: 4 ports detected
[ 1.609565] usbcore: registered new interface driver usbserial_generic
[ 1.609576] usbserial: USB Serial support registered for generic
[ 1.609614] mousedev: PS/2 mouse device common for all mice
[ 1.613393] device-mapper: uevent: version 1.0.3
[ 1.613535] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22) initialised: [email protected]
[ 1.613831] powernv-cpufreq: cpufreq pstate min 0xffffffda nominal 0xfffffff7 max 0x0
[ 1.613838] powernv-cpufreq: Workload Optimized Frequency is disabled in the platform
[ 1.618654] hid: raw HID events driver (C) Jiri Kosina
[ 1.618692] usbcore: registered new interface driver usbhid
[ 1.618697] usbhid: USB HID core driver
[ 1.618938] drop_monitor: Initializing network drop monitor service
[ 1.619088] Initializing XFRM netlink socket
[ 1.619464] NET: Registered protocol family 10
[ 1.708294] Initramfs unpacking failed: junk within compressed archive
[ 1.709187] Freeing initrd memory: 18176K
[ 1.710883] Segment Routing with IPv6
[ 1.710893] RPL Segment Routing with IPv6
[ 1.710925] mip6: Mobile IPv6
[ 1.710932] NET: Registered protocol family 17
[ 1.711010] secvar-sysfs: secvar: failed to retrieve secvar operations.
[ 1.711047] drmem: No dynamic reconfiguration memory found
[ 1.711278] registered taskstats version 1
[ 1.711299] Loading compiled-in X.509 certificates
[ 1.712740] Loaded X.509 cert 'Build time autogenerated kernel key: ea17e1addc1b8e4973bc39f5bfb1273b281f4bec'
[ 1.714417] zswap: loaded using pool lzo/zbud
[ 1.714638] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers
[ 1.715174] Key type ._fscrypt registered
[ 1.715181] Key type .fscrypt registered
[ 1.715187] Key type fscrypt-provisioning registered
[ 1.716999] Btrfs loaded, crc32c=crc32c-generic, zoned=yes
[ 1.717047] pstore: Using crash dump compression: deflate
[ 1.717913] Key type encrypted registered
[ 1.718118] Secure boot mode disabled
[ 1.718127] ima: No TPM chip found, activating TPM-bypass!
[ 1.718135] Loading compiled-in module X.509 certificates
[ 1.719409] Loaded X.509 cert 'Build time autogenerated kernel key: ea17e1addc1b8e4973bc39f5bfb1273b281f4bec'
[ 1.719421] ima: Allocated hash algorithm: sha256
[ 1.719594] Secure boot mode disabled
[ 1.719748] Trusted boot mode disabled
[ 1.719755] ima: No architecture policies found
[ 1.719778] evm: Initialising EVM extended attributes:
[ 1.719785] evm: security.selinux
[ 1.719790] evm: security.ima
[ 1.719795] evm: security.capability
[ 1.719800] evm: HMAC attrs: 0x1
[ 1.722870] Freeing unused kernel memory: 5760K
[ 1.722880] Kernel memory protection not selected by kernel config.
[ 1.722891] Run /init as init process
[ 1.723033] Failed to execute /init (error -2)
[ 1.723041] Run /sbin/init as init process
[ 1.723082] Run /etc/init as init process
[ 1.723121] Run /bin/init as init process
[ 1.723161] Run /bin/sh as init process
[ 1.723219] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
[ 1.723234] CPU: 40 PID: 1 Comm: swapper/0 Tainted: G W 5.13.0-rc3 #1
[ 1.723246] Call Trace:
[ 1.723251] [c0000080010cfcc0] [c0000000009f2640] dump_stack+0xc4/0x114 (unreliable)
[ 1.723267] [c0000080010cfd10] [c00000000014b9e0] panic+0x168/0x408
[ 1.723281] [c0000080010cfda0] [c000000000012964] kernel_init+0x14c/0x168
[ 1.723293] [c0000080010cfe10] [c00000000000d6ec] ret_from_kernel_thread+0x5c/0x70
> >>> [ 1.764430] Initramfs unpacking failed: no cpio magic
> >>
> >> Whoa, that's not good. Did something scramble over the initramfs memory
> >> while it was being unpacked? It's been .2 seconds since the start of the
> >> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
> >>
> >> Can you try booting with initramfs_async=0 on the command line and see
> >> if the kernel still crashes?
Using initramfs_async=0 I was also able to reproduce the panic.
Bruno
On 11/06/2021 17.06, Bruno Goncalves wrote:
> On Fri, Jun 11, 2021 at 9:13 AM Rasmus Villemoes
> <[email protected]> wrote:
>>
>> On 10/06/2021 17.14, Bruno Goncalves wrote:
>>> On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
>>> <[email protected]> wrote:
>>>>
>>>> On 10/06/2021 13.47, Bruno Goncalves wrote:
>>>>> Hello,
>>>>>
>>>>> We've observed in some cases kernel panic when trying to boot on
>>>>> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
>>>>> could be related to patch
>>>>> https://lore.kernel.org/lkml/[email protected]/
>>>>>
>>>>
>>>> Thanks for the report. It's possible, but I'll need some help from you
>>>> to get more info.
>>>>
>>>> First, can you send me the .config?
>>>
>>> The .config is on
>>> https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config
>>
>> Thanks.
>>
>>>>
>>>>>
>>>>> [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
>>>>
>>>> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
>>>> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
>>>>
>>>
>>> CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
>>> CONFIG_UEVENT_HELPER is not set"
>>
>> OK. Then I assume some quite early initcall does a request_module() or
>> request_firmware() (or similar). I don't think this matters - that call
>> would be done before the initramfs was unpacked with or without my
>> patch, so it won't find anything in the empty rootfs. It's just my patch
>> added a note. But just to figure out where that triggers, can you do
>>
>> - pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
>> + WARN_ONCE(1, "wait_for_initramfs() called before rootfs_initcalls\n");
>>
>> in init/initramfs.c.
>>
>
> I've managed to reproduce the panic with the patch.
>
> [ 1.498654] NIP [c0000000000137d4] wait_for_initramfs+0x94/0xa4
> [ 1.498661] LR [c0000000000137d0] wait_for_initramfs+0x90/0xa4
> [ 1.498668] Call Trace:
> [ 1.498671] [c000000027debd60] [c0000000000137d0]
> wait_for_initramfs+0x90/0xa4 (unreliable)
> [ 1.498680] [c000000027debdc0] [c000000000172fc8]
> call_usermodehelper_exec_async+0x178/0x2c0
> [ 1.498691] [c000000027debe10] [c00000000000d6ec]
> ret_from_kernel_thread+0x5c/0x70
Thanks, but unfortunately (and I should have known better) that doesn't
tell us who actually initiated that call_usermodehelper - it's most
likely some request_module() call. But again, I don't think this is
related to the later crash.
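To make the point about the backtrace concrete, here is a small sketch that pulls the function names out of the quoted Call Trace. The helper name `frames` and the regular expression are assumptions made for illustration; the sample text is copied from the trace quoted in this message, including the mail quote markers and line wrapping.

```python
import re

# Call Trace lines as quoted in the mail (quote markers and wrapping included).
TRACE = """\
> [ 1.498668] Call Trace:
> [ 1.498671] [c000000027debd60] [c0000000000137d0]
> wait_for_initramfs+0x90/0xa4 (unreliable)
> [ 1.498680] [c000000027debdc0] [c000000000172fc8]
> call_usermodehelper_exec_async+0x178/0x2c0
> [ 1.498691] [c000000027debe10] [c00000000000d6ec]
> ret_from_kernel_thread+0x5c/0x70
"""

# A frame is: [stack address] [return address] symbol+0xoffset/0xsize
FRAME = re.compile(
    r"\[[0-9a-f]+\]\s+\[[0-9a-f]+\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frames(text: str) -> list[str]:
    """Return the symbol names of a ppc64 call trace, innermost first."""
    # Drop leading mail quote markers so wrapped lines can be matched.
    cleaned = re.sub(r"^(?:>[ \t]?)+", "", text, flags=re.MULTILINE)
    return FRAME.findall(cleaned)


for name in frames(TRACE):
    print(name)
```

The outermost frame is the kernel-thread entry point, so the trace ends at the helper thread itself and carries no information about the code that queued the usermode helper request.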
>>>>> [ 1.764430] Initramfs unpacking failed: no cpio magic
>>>>
>>>> Whoa, that's not good. Did something scramble over the initramfs memory
>>>> while it was being unpacked? It's been .2 seconds since the start of the
>>>> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
>>>>
>>>> Can you try booting with initramfs_async=0 on the command line and see
>>>> if the kernel still crashes?
>
> Using initramfs_async=0 I was also able to reproduce the panic.
Hm, that's very interesting. Can you share the log for that as well?
And, perhaps asking a silly question, does the crash go away if you
revert e7cb072eb988e46295512617c39d004f9e1c26f8 ?
Thanks,
Rasmus
On Fri, Jun 11, 2021 at 11:49 PM Rasmus Villemoes
<[email protected]> wrote:
>
> On 11/06/2021 17.06, Bruno Goncalves wrote:
> > On Fri, Jun 11, 2021 at 9:13 AM Rasmus Villemoes
> > <[email protected]> wrote:
> >>
> >> On 10/06/2021 17.14, Bruno Goncalves wrote:
> >>> On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
> >>> <[email protected]> wrote:
> >>>>
> >>>> On 10/06/2021 13.47, Bruno Goncalves wrote:
> >>>>> Hello,
> >>>>>
> >>>>> We've observed in some cases kernel panic when trying to boot on
> >>>>> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
> >>>>> could be related to patch
> >>>>> https://lore.kernel.org/lkml/[email protected]/
> >>>>>
> >>>>
> >>>> Thanks for the report. It's possible, but I'll need some help from you
> >>>> to get more info.
> >>>>
> >>>> First, can you send me the .config?
> >>>
> >>> The .config is on
> >>> https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config
> >>
> >> Thanks.
> >>
> >>>>
> >>>>>
> >>>>> [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
> >>>>
> >>>> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
> >>>> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
> >>>>
> >>>
> >>> CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
> >>> CONFIG_UEVENT_HELPER is not set"
> >>
> >> OK. Then I assume some quite early initcall does a request_module() or
> >> request_firmware() (or similar). I don't think this matters - that call
> >> would be done before the initramfs was unpacked with or without my
> >> patch, so it won't find anything in the empty rootfs. It's just my patch
> >> added a note. But just to figure out where that triggers, can you do
> >>
> >> - pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
> >> + WARN_ONCE(1, "wait_for_initramfs() called before rootfs_initcalls\n");
> >>
> >> in init/initramfs.c.
> >>
> >
> > I've managed to reproduce the panic with the patch.
> >
> > [ 1.498654] NIP [c0000000000137d4] wait_for_initramfs+0x94/0xa4
> > [ 1.498661] LR [c0000000000137d0] wait_for_initramfs+0x90/0xa4
> > [ 1.498668] Call Trace:
> > [ 1.498671] [c000000027debd60] [c0000000000137d0]
> > wait_for_initramfs+0x90/0xa4 (unreliable)
> > [ 1.498680] [c000000027debdc0] [c000000000172fc8]
> > call_usermodehelper_exec_async+0x178/0x2c0
> > [ 1.498691] [c000000027debe10] [c00000000000d6ec]
> > ret_from_kernel_thread+0x5c/0x70
>
> Thanks, but unfortunately (and I should have known better) that doesn't
> tell us who actually initiated that call_usermodehelper - it's most
> likely some request_module() call. But again, I don't think this is
> related to the later crash.
>
> >>>>> [ 1.764430] Initramfs unpacking failed: no cpio magic
> >>>>
> >>>> Whoa, that's not good. Did something scramble over the initramfs memory
> >>>> while it was being unpacked? It's been .2 seconds since the start of the
> >>>> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
> >>>>
> >>>> Can you try booting with initramfs_async=0 on the command line and see
> >>>> if the kernel still crashes?
> >
> > Using initramfs_async=0 I was also able to reproduce the panic.
>
> Hm, that's very interesting. Can you share the log for that as well?
[ 0.000000] Kernel command line:
root=UUID=72f391f6-e71f-41a6-ba16-2c25460203ed ro initramfs_async=0
[ 0.000000] printk: log_buf_len individual max cpu contribution: 4096 bytes
[ 0.000000] printk: log_buf_len total cpu_extra contributions: 782336 bytes
[ 0.000000] printk: log_buf_len min size: 262144 bytes
[ 0.000000] printk: log_buf_len: 1048576 bytes
[ 0.000000] printk: early log buf free: 253968(96%)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Memory: 1017540928K/1073741824K available (17472K
kernel code, 3072K rwdata, 4992K rodata, 5760K init, 1818K bss,
2510528K reserved, 53690368K cma-reserved)
[ 0.000000] random: get_random_u64 called from
__kmem_cache_create+0x3c/0x770 with crng_init=0
<snip>
[ 1.366341] HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
[ 1.366348] HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
[ 1.560635] alg: No test for 842 (842-generic)
[ 1.560677] alg: No test for 842 (842-scomp)
[ 1.560824] wait_for_initramfs() called before rootfs_initcalls
[ 1.565123] raid6: skip pq benchmark and using algorithm vpermxor8
[ 1.565132] raid6: using intx1 recovery algorithm
[ 1.565318] iommu: Default domain type: Translated
[ 1.565402] vgaarb: loaded
[ 1.565663] SCSI subsystem initialized
[ 1.565829] usbcore: registered new interface driver usbfs
[ 1.565841] usbcore: registered new interface driver hub
[ 1.565917] usbcore: registered new device driver usb
[ 1.565944] pps_core: LinuxPPS API ver. 1 registered
[ 1.565949] pps_core: Software ver. 5.3.6 - Copyright 2005-2007
Rodolfo Giometti <[email protected]>
[ 1.565957] PTP clock support registered
[ 1.566137] EDAC MC: Ver: 3.0.0
[ 1.566551] NetLabel: Initializing
[ 1.566555] NetLabel: domain hash size = 128
[ 1.566559] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 1.566578] NetLabel: unlabeled traffic allowed by default
[ 1.568021] clocksource: Switched to clocksource timebase
[ 1.591752] VFS: Disk quotas dquot_6.6.0
[ 1.591918] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
[ 1.594487] NET: Registered protocol family 2
[ 1.594673] IP idents hash table entries: 262144 (order: 5, 2097152
bytes, vmalloc)
[ 1.600286] tcp_listen_portaddr_hash hash table entries: 65536
(order: 4, 1048576 bytes, vmalloc)
[ 1.600585] TCP established hash table entries: 524288 (order: 6,
4194304 bytes, vmalloc)
[ 1.601814] TCP bind hash table entries: 65536 (order: 4, 1048576
bytes, vmalloc)
[ 1.601991] TCP: Hash tables configured (established 524288 bind 65536)
[ 1.602677] MPTCP token hash table entries: 65536 (order: 4,
1572864 bytes, vmalloc)
[ 1.602943] UDP hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
[ 1.603347] UDP-Lite hash table entries: 65536 (order: 5, 2097152
bytes, vmalloc)
[ 1.604431] NET: Registered protocol family 1
[ 1.604443] NET: Registered protocol family 44
[ 1.604571] pci 0005:03:00.0: enabling device (0140 -> 0142)
[ 1.604656] PCI: CLS 128 bytes, default 128
[ 1.604850] Trying to unpack rootfs image as initramfs...
[ 1.774342] Initramfs unpacking failed: no cpio magic
[ 1.775307] Freeing initrd memory: 18176K
[ 1.775594] rtas_flash: no firmware flash support
[ 1.780166] Initialise system trusted keyrings
[ 1.780190] Key type blacklist registered
[ 1.780364] workingset: timestamp_bits=38 max_order=24 bucket_order=0
[ 1.782469] zbud: loaded
[ 1.825156] NET: Registered protocol family 38
[ 1.825169] xor: measuring software checksum speed
[ 1.825925] 8regs : 13170 MB/sec
[ 1.826748] 8regs_prefetch : 12031 MB/sec
[ 1.827473] 32regs : 13662 MB/sec
[ 1.828249] 32regs_prefetch : 12820 MB/sec
[ 1.828635] altivec : 25906 MB/sec
[ 1.828639] xor: using function: altivec (25906 MB/sec)
[ 1.828645] Key type asymmetric registered
[ 1.828649] Asymmetric key parser 'x509' registered
[ 1.828661] Block layer SCSI generic (bsg) driver version 0.4
loaded (major 245)
[ 1.828820] io scheduler mq-deadline registered
[ 1.828825] io scheduler kyber registered
[ 1.828932] io scheduler bfq registered
[ 1.831304] atomic64_test: passed
[ 1.832070] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 1.832277] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
[ 1.832527] hvc1: hvsi protocol on /ibm,opal/consoles/serial@1
[ 1.832555] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[ 1.834347] Non-volatile memory driver v1.3
[ 1.836184] libphy: Fixed MDIO Bus: probed
[ 1.836295] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 1.836311] ehci-pci: EHCI PCI platform driver
[ 1.836325] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 1.836342] ohci-pci: OHCI PCI platform driver
[ 1.836356] uhci_hcd: USB Universal Host Controller Interface driver
[ 1.836523] xhci_hcd 0005:03:00.0: xHCI Host Controller
[ 1.836635] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
bus number 1
[ 1.836788] xhci_hcd 0005:03:00.0: hcc params 0x0270f06d hci
version 0x96 quirks 0x0000000004000000
[ 1.837236] usb usb1: New USB device found, idVendor=1d6b,
idProduct=0002, bcdDevice= 5.13
[ 1.837245] usb usb1: New USB device strings: Mfr=3, Product=2,
SerialNumber=1
[ 1.837252] usb usb1: Product: xHCI Host Controller
[ 1.837257] usb usb1: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
[ 1.837263] usb usb1: SerialNumber: 0005:03:00.0
[ 1.837418] hub 1-0:1.0: USB hub found
[ 1.837433] hub 1-0:1.0: 4 ports detected
[ 1.837605] xhci_hcd 0005:03:00.0: xHCI Host Controller
[ 1.837662] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
bus number 2
[ 1.837679] xhci_hcd 0005:03:00.0: Host supports USB 3.0 SuperSpeed
[ 1.837710] usb usb2: We don't know the algorithms for LPM for this
host, disabling LPM.
[ 1.837742] usb usb2: New USB device found, idVendor=1d6b,
idProduct=0003, bcdDevice= 5.13
[ 1.837750] usb usb2: New USB device strings: Mfr=3, Product=2,
SerialNumber=1
[ 1.837756] usb usb2: Product: xHCI Host Controller
[ 1.837761] usb usb2: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
[ 1.837767] usb usb2: SerialNumber: 0005:03:00.0
[ 1.837884] hub 2-0:1.0: USB hub found
[ 1.837897] hub 2-0:1.0: 4 ports detected
[ 1.838074] usbcore: registered new interface driver usbserial_generic
[ 1.838086] usbserial: USB Serial support registered for generic
[ 1.838123] mousedev: PS/2 mouse device common for all mice
[ 1.841900] device-mapper: uevent: version 1.0.3
[ 1.842050] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22)
initialised: [email protected]
[ 1.842352] powernv-cpufreq: cpufreq pstate min 0xffffffda nominal
0xfffffff7 max 0x0
[ 1.842358] powernv-cpufreq: Workload Optimized Frequency is
disabled in the platform
[ 1.847336] hid: raw HID events driver (C) Jiri Kosina
[ 1.847386] usbcore: registered new interface driver usbhid
[ 1.847391] usbhid: USB HID core driver
[ 1.847646] drop_monitor: Initializing network drop monitor service
[ 1.847817] Initializing XFRM netlink socket
[ 1.848236] NET: Registered protocol family 10
[ 1.861274] Segment Routing with IPv6
[ 1.861288] RPL Segment Routing with IPv6
[ 1.861313] mip6: Mobile IPv6
[ 1.861318] NET: Registered protocol family 17
[ 1.861392] secvar-sysfs: secvar: failed to retrieve secvar operations.
[ 1.861424] drmem: No dynamic reconfiguration memory found
[ 1.861631] registered taskstats version 1
[ 1.861650] Loading compiled-in X.509 certificates
[ 1.862965] Loaded X.509 cert 'Build time autogenerated kernel key:
97604d93c367cf27b215cdfd062467d582f7e126'
[ 1.864430] zswap: loaded using pool lzo/zbud
[ 1.864634] debug_vm_pgtable: [debug_vm_pgtable ]:
Validating architecture page table helpers
[ 1.865112] Key type ._fscrypt registered
[ 1.865118] Key type .fscrypt registered
[ 1.865124] Key type fscrypt-provisioning registered
[ 1.866849] Btrfs loaded, crc32c=crc32c-generic, zoned=yes
[ 1.866898] pstore: Using crash dump compression: deflate
[ 1.867485] Key type encrypted registered
[ 1.867690] Secure boot mode disabled
[ 1.867725] ima: No TPM chip found, activating TPM-bypass!
[ 1.867735] Loading compiled-in module X.509 certificates
[ 1.869037] Loaded X.509 cert 'Build time autogenerated kernel key:
97604d93c367cf27b215cdfd062467d582f7e126'
[ 1.869049] ima: Allocated hash algorithm: sha256
[ 1.869223] Secure boot mode disabled
[ 1.869378] Trusted boot mode disabled
[ 1.869385] ima: No architecture policies found
[ 1.869411] evm: Initialising EVM extended attributes:
[ 1.869418] evm: security.selinux
[ 1.869423] evm: security.ima
[ 1.869428] evm: security.capability
[ 1.869433] evm: HMAC attrs: 0x1
[ 1.872544] Freeing unused kernel memory: 5760K
[ 1.872555] Kernel memory protection not selected by kernel config.
[ 1.872566] Run /init as init process
[ 1.872706] Failed to execute /init (error -2)
[ 1.872713] Run /sbin/init as init process
[ 1.872755] Run /etc/init as init process
[ 1.872794] Run /bin/init as init process
[ 1.872834] Run /bin/sh as init process
[ 1.872891] Kernel panic - not syncing: No working init found. Try
passing init= option to kernel. See Linux
Documentation/admin-guide/init.rst for guidance.
[ 1.872906] CPU: 42 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc3 #1
[ 1.872917] Call Trace:
[ 1.872922] [c0000080010d7cc0] [c0000000009f2640]
dump_stack+0xc4/0x114 (unreliable)
[ 1.872939] [c0000080010d7d10] [c00000000014b9e0] panic+0x168/0x408
[ 1.872952] [c0000080010d7da0] [c000000000012964] kernel_init+0x14c/0x168
[ 1.872964] [c0000080010d7e10] [c00000000000d6ec]
ret_from_kernel_thread+0x5c/0x70
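As a quick check of how long unpacking ran before it failed in this log, the timestamps of the two relevant lines can be subtracted. This is a minimal sketch; the helper name `timestamp_of` is hypothetical, and the two log lines are copied from the log above.

```python
import re
from decimal import Decimal

# The two relevant lines from the initramfs_async=0 boot log.
LOG = """\
[ 1.604850] Trying to unpack rootfs image as initramfs...
[ 1.774342] Initramfs unpacking failed: no cpio magic
"""

LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")


def timestamp_of(log: str, needle: str) -> Decimal:
    """Return the dmesg timestamp of the first line containing needle."""
    for line in log.splitlines():
        m = LINE.match(line)
        if m and needle in m.group(2):
            return Decimal(m.group(1))
    raise ValueError(f"no log line contains {needle!r}")


start = timestamp_of(LOG, "Trying to unpack rootfs image")
failed = timestamp_of(LOG, "Initramfs unpacking failed")
elapsed = failed - start
print(elapsed)  # 0.169492
```

So the failure is reported roughly 0.17 seconds after unpacking starts, in line with the "about .2 seconds" estimate made earlier in the thread for the first report.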
>
> And, perhaps asking a silly question, does the crash go away if you
> revert e7cb072eb988e46295512617c39d004f9e1c26f8 ?
Sure, I'll try it and let you know.
Bruno
>
> Thanks,
> Rasmus
>
On Mon, Jun 14, 2021 at 7:47 AM Bruno Goncalves <[email protected]> wrote:
>
> On Fri, Jun 11, 2021 at 11:49 PM Rasmus Villemoes
> <[email protected]> wrote:
> >
> > On 11/06/2021 17.06, Bruno Goncalves wrote:
> > > On Fri, Jun 11, 2021 at 9:13 AM Rasmus Villemoes
> > > <[email protected]> wrote:
> > >>
> > >> On 10/06/2021 17.14, Bruno Goncalves wrote:
> > >>> On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
> > >>> <[email protected]> wrote:
> > >>>>
> > >>>> On 10/06/2021 13.47, Bruno Goncalves wrote:
> > >>>>> Hello,
> > >>>>>
> > >>>>> We've observed in some cases kernel panic when trying to boot on
> > >>>>> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
> > >>>>> could be related to patch
> > >>>>> https://lore.kernel.org/lkml/[email protected]/
> > >>>>>
> > >>>>
> > >>>> Thanks for the report. It's possible, but I'll need some help from you
> > >>>> to get more info.
> > >>>>
> > >>>> First, can you send me the .config?
> > >>>
> > >>> The .config is on
> > >>> https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config
> > >>
> > >> Thanks.
> > >>
> > >>>>
> > >>>>>
> > >>>>> [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
> > >>>>
> > >>>> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
> > >>>> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
> > >>>>
> > >>>
> > >>> CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
> > >>> CONFIG_UEVENT_HELPER is not set"
> > >>
> > >> OK. Then I assume some quite early initcall does a request_module() or
> > >> request_firmware() (or similar). I don't think this matters - that call
> > >> would be done before the initramfs was unpacked with or without my
> > >> patch, so it won't find anything in the empty rootfs. It's just my patch
> > >> added a note. But just to figure out where that triggers, can you do
> > >>
> > >> - pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
> > >> + WARN_ONCE(1, "wait_for_initramfs() called before rootfs_initcalls\n");
> > >>
> > >> in init/initramfs.c.
> > >>
> > >
> > > I've managed to reproduce the panic with the patch.
> > >
> > > [ 1.498654] NIP [c0000000000137d4] wait_for_initramfs+0x94/0xa4
> > > [ 1.498661] LR [c0000000000137d0] wait_for_initramfs+0x90/0xa4
> > > [ 1.498668] Call Trace:
> > > [ 1.498671] [c000000027debd60] [c0000000000137d0]
> > > wait_for_initramfs+0x90/0xa4 (unreliable)
> > > [ 1.498680] [c000000027debdc0] [c000000000172fc8]
> > > call_usermodehelper_exec_async+0x178/0x2c0
> > > [ 1.498691] [c000000027debe10] [c00000000000d6ec]
> > > ret_from_kernel_thread+0x5c/0x70
> >
> > Thanks, but unfortunately (and I should have known better) that doesn't
> > tell us who actually initiated that call_usermodehelper - it's most
> > likely some request_module() call. But again, I don't think this is
> > related to the later crash.
> >
> > >>>>> [ 1.764430] Initramfs unpacking failed: no cpio magic
> > >>>>
> > >>>> Whoa, that's not good. Did something scramble over the initramfs memory
> > >>>> while it was being unpacked? It's been .2 seconds since the start of the
> > >>>> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
> > >>>>
> > >>>> Can you try booting with initramfs_async=0 on the command line and see
> > >>>> if the kernel still crashes?
> > >
> > > Using initramfs_async=0 I was also able to reproduce the panic.
> >
> > Hm, that's very interesting. Can you share the log for that as well?
>
> <snip>
>
> >
> > And, perhaps asking a silly question, does the crash go away if you
> > revert e7cb072eb988e46295512617c39d004f9e1c26f8 ?
Okay, indeed the problem is not with this commit; I've just hit the
panic with the commit reverted.
[ 1.302017] EEH: Capable adapter found: recovery enabled.
[ 1.308206] opal-power: OPAL EPOW, DPO support detected.
[ 1.308866] powernv-rng: Registering arch random hook.
[ 1.310984] Kprobes globally optimized
[ 1.311466] HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
[ 1.311473] HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
[ 1.505658] alg: No test for 842 (842-generic)
[ 1.505699] alg: No test for 842 (842-scomp)
[ 1.510121] raid6: skip pq benchmark and using algorithm vpermxor8
[ 1.510128] raid6: using intx1 recovery algorithm
[ 1.510314] iommu: Default domain type: Translated
[ 1.510402] vgaarb: loaded
[ 1.510579] SCSI subsystem initialized
[ 1.510732] usbcore: registered new interface driver usbfs
[ 1.510745] usbcore: registered new interface driver hub
[ 1.510803] usbcore: registered new device driver usb
[ 1.510829] pps_core: LinuxPPS API ver. 1 registered
[ 1.510834] pps_core: Software ver. 5.3.6 - Copyright 2005-2007
Rodolfo Giometti <[email protected]>
[ 1.510843] PTP clock support registered
[ 1.511011] EDAC MC: Ver: 3.0.0
[ 1.511399] NetLabel: Initializing
[ 1.511404] NetLabel: domain hash size = 128
[ 1.511408] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 1.511427] NetLabel: unlabeled traffic allowed by default
[ 1.512949] clocksource: Switched to clocksource timebase
[ 1.536237] VFS: Disk quotas dquot_6.6.0
[ 1.536380] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
[ 1.538890] NET: Registered protocol family 2
[ 1.539074] IP idents hash table entries: 262144 (order: 5, 2097152
bytes, vmalloc)
[ 1.544621] tcp_listen_portaddr_hash hash table entries: 65536
(order: 4, 1048576 bytes, vmalloc)
[ 1.544887] TCP established hash table entries: 524288 (order: 6,
4194304 bytes, vmalloc)
[ 1.546117] TCP bind hash table entries: 65536 (order: 4, 1048576
bytes, vmalloc)
[ 1.546298] TCP: Hash tables configured (established 524288 bind 65536)
[ 1.546993] MPTCP token hash table entries: 65536 (order: 4,
1572864 bytes, vmalloc)
[ 1.547255] UDP hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
[ 1.547648] UDP-Lite hash table entries: 65536 (order: 5, 2097152
bytes, vmalloc)
[ 1.548726] NET: Registered protocol family 1
[ 1.548739] NET: Registered protocol family 44
[ 1.548864] pci 0005:03:00.0: enabling device (0140 -> 0142)
[ 1.548936] PCI: CLS 128 bytes, default 128
[ 1.549005] Trying to unpack rootfs image as initramfs...
[ 1.723810] Initramfs unpacking failed: junk within compressed archive
[ 1.725056] Freeing initrd memory: 20992K
[ 1.725342] rtas_flash: no firmware flash support
[ 1.729768] Initialise system trusted keyrings
[ 1.729792] Key type blacklist registered
[ 1.729965] workingset: timestamp_bits=38 max_order=24 bucket_order=0
[ 1.732064] zbud: loaded
[ 1.775945] NET: Registered protocol family 38
[ 1.775963] xor: measuring software checksum speed
[ 1.776740] 8regs : 13166 MB/sec
[ 1.777564] 8regs_prefetch : 12024 MB/sec
[ 1.778289] 32regs : 13652 MB/sec
[ 1.779061] 32regs_prefetch : 12819 MB/sec
[ 1.779447] altivec : 25821 MB/sec
[ 1.779452] xor: using function: altivec (25821 MB/sec)
[ 1.779457] Key type asymmetric registered
[ 1.779462] Asymmetric key parser 'x509' registered
[ 1.779495] Block layer SCSI generic (bsg) driver version 0.4
loaded (major 245)
[ 1.779652] io scheduler mq-deadline registered
[ 1.779657] io scheduler kyber registered
[ 1.779790] io scheduler bfq registered
[ 1.782225] atomic64_test: passed
[ 1.783089] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 1.783314] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
[ 1.783555] hvc1: hvsi protocol on /ibm,opal/consoles/serial@1
[ 1.783582] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[ 1.785366] Non-volatile memory driver v1.3
[ 1.787231] libphy: Fixed MDIO Bus: probed
[ 1.787345] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 1.787360] ehci-pci: EHCI PCI platform driver
[ 1.787374] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 1.787391] ohci-pci: OHCI PCI platform driver
[ 1.787404] uhci_hcd: USB Universal Host Controller Interface driver
[ 1.787580] xhci_hcd 0005:03:00.0: xHCI Host Controller
[ 1.787693] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
bus number 1
[ 1.787844] xhci_hcd 0005:03:00.0: hcc params 0x0270f06d hci
version 0x96 quirks 0x0000000004000000
[ 1.788305] usb usb1: New USB device found, idVendor=1d6b,
idProduct=0002, bcdDevice= 5.13
[ 1.788314] usb usb1: New USB device strings: Mfr=3, Product=2,
SerialNumber=1
[ 1.788322] usb usb1: Product: xHCI Host Controller
[ 1.788327] usb usb1: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
[ 1.788333] usb usb1: SerialNumber: 0005:03:00.0
[ 1.788498] hub 1-0:1.0: USB hub found
[ 1.788514] hub 1-0:1.0: 4 ports detected
[ 1.788685] xhci_hcd 0005:03:00.0: xHCI Host Controller
[ 1.788745] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
bus number 2
[ 1.788763] xhci_hcd 0005:03:00.0: Host supports USB 3.0 SuperSpeed
[ 1.788787] usb usb2: We don't know the algorithms for LPM for this
host, disabling LPM.
[ 1.788819] usb usb2: New USB device found, idVendor=1d6b,
idProduct=0003, bcdDevice= 5.13
[ 1.788827] usb usb2: New USB device strings: Mfr=3, Product=2,
SerialNumber=1
[ 1.788834] usb usb2: Product: xHCI Host Controller
[ 1.788839] usb usb2: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
[ 1.788845] usb usb2: SerialNumber: 0005:03:00.0
[ 1.788961] hub 2-0:1.0: USB hub found
[ 1.788974] hub 2-0:1.0: 4 ports detected
[ 1.789146] usbcore: registered new interface driver usbserial_generic
[ 1.789157] usbserial: USB Serial support registered for generic
[ 1.789196] mousedev: PS/2 mouse device common for all mice
[ 1.793122] device-mapper: uevent: version 1.0.3
[ 1.793282] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22)
initialised: [email protected]
[ 1.793588] powernv-cpufreq: cpufreq pstate min 0xffffffda nominal
0xfffffff7 max 0x0
[ 1.793595] powernv-cpufreq: Workload Optimized Frequency is
disabled in the platform
[ 1.798302] hid: raw HID events driver (C) Jiri Kosina
[ 1.798344] usbcore: registered new interface driver usbhid
[ 1.798349] usbhid: USB HID core driver
[ 1.798600] drop_monitor: Initializing network drop monitor service
[ 1.798750] Initializing XFRM netlink socket
[ 1.799148] NET: Registered protocol family 10
[ 1.811611] Segment Routing with IPv6
[ 1.811627] RPL Segment Routing with IPv6
[ 1.811653] mip6: Mobile IPv6
[ 1.811658] NET: Registered protocol family 17
[ 1.811729] secvar-sysfs: secvar: failed to retrieve secvar operations.
[ 1.811761] drmem: No dynamic reconfiguration memory found
[ 1.811937] registered taskstats version 1
[ 1.811955] Loading compiled-in X.509 certificates
[ 1.812905] Loaded X.509 cert 'Build time autogenerated kernel key:
5a49ad3d49566246c5ef57be0cf7d450502ed699'
[ 1.813953] zswap: loaded using pool lzo/zbud
[ 1.814131] debug_vm_pgtable: [debug_vm_pgtable ]:
Validating architecture page table helpers
[ 1.814498] Key type ._fscrypt registered
[ 1.814503] Key type .fscrypt registered
[ 1.814506] Key type fscrypt-provisioning registered
[ 1.815660] Btrfs loaded, crc32c=crc32c-generic, zoned=yes
[ 1.815695] pstore: Using crash dump compression: deflate
[ 1.816338] Key type encrypted registered
[ 1.816484] Secure boot mode disabled
[ 1.816490] ima: No TPM chip found, activating TPM-bypass!
[ 1.816496] Loading compiled-in module X.509 certificates
[ 1.817293] Loaded X.509 cert 'Build time autogenerated kernel key:
5a49ad3d49566246c5ef57be0cf7d450502ed699'
[ 1.817300] ima: Allocated hash algorithm: sha256
[ 1.817409] Secure boot mode disabled
[ 1.817506] Trusted boot mode disabled
[ 1.817510] ima: No architecture policies found
[ 1.817525] evm: Initialising EVM extended attributes:
[ 1.817529] evm: security.selinux
[ 1.817532] evm: security.ima
[ 1.817535] evm: security.capability
[ 1.817539] evm: HMAC attrs: 0x1
[ 1.819569] Freeing unused kernel memory: 5760K
[ 1.819575] Kernel memory protection not selected by kernel config.
[ 1.819583] Run /init as init process
[ 1.819652] Failed to execute /init (error -2)
[ 1.819658] Run /sbin/init as init process
[ 1.819684] Run /etc/init as init process
[ 1.819710] Run /bin/init as init process
[ 1.819738] Run /bin/sh as init process
[ 1.819777] Kernel panic - not syncing: No working init found. Try
passing init= option to kernel. See Linux
Documentation/admin-guide/init.rst for guidance.
[ 1.819787] CPU: 8 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc3 #1
[ 1.819794] Call Trace:
[ 1.819796] [c0000040010e3cc0] [c0000000009f25c0]
dump_stack+0xc4/0x114 (unreliable)
[ 1.819809] [c0000040010e3d10] [c00000000014b960] panic+0x168/0x408
[ 1.819818] [c0000040010e3da0] [c000000000012964] kernel_init+0x14c/0x168
[ 1.819825] [c0000040010e3e10] [c00000000000d6ec]
ret_from_kernel_thread+0x5c/0x70
[ 1.835866] ---[ end Kernel panic - not syncing: No working init
found. Try passing init= option to kernel. See Linux
Documentation/admin-guide/init.rst for guidance. ]
Bruno
>
> Sure, I'll try it and let you know.
>
> Bruno
>
> >
> > Thanks,
> > Rasmus
> >
On 14/06/2021 10.03, Bruno Goncalves wrote:
> On Mon, Jun 14, 2021 at 7:47 AM Bruno Goncalves <[email protected]> wrote:
>>
>> On Fri, Jun 11, 2021 at 11:49 PM Rasmus Villemoes
>> <[email protected]> wrote:
>>>
>>>
>>> And, perhaps asking a silly question, does the crash go away if you
>>> revert e7cb072eb988e46295512617c39d004f9e1c26f8 ?
>
> Okay, indeed the problem is not with this commit; I've just hit the
> panic with the commit reverted.
OK, thanks for trying and confirming. Unless I hear otherwise I'll
ignore the ppc64 issue (there's another report, with completely
different symptoms, on that patch which does go away when passing
initramfs_async=0, so something does seem to be weird around it).
Happy hunting.
Rasmus
Hi Bruno,
[Cc'ed the kexec and petitboot lists]
On 06/14/21 at 10:03am, Bruno Goncalves wrote:
> On Mon, Jun 14, 2021 at 7:47 AM Bruno Goncalves <[email protected]> wrote:
> >
> > On Fri, Jun 11, 2021 at 11:49 PM Rasmus Villemoes
> > <[email protected]> wrote:
> > >
> > > On 11/06/2021 17.06, Bruno Goncalves wrote:
> > > > On Fri, Jun 11, 2021 at 9:13 AM Rasmus Villemoes
> > > > <[email protected]> wrote:
> > > >>
> > > >> On 10/06/2021 17.14, Bruno Goncalves wrote:
> > > >>> On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
> > > >>> <[email protected]> wrote:
> > > >>>>
> > > >>>> On 10/06/2021 13.47, Bruno Goncalves wrote:
> > > >>>>> Hello,
> > > >>>>>
> > > >>>>> We've observed in some cases kernel panic when trying to boot on
> > > >>>>> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
> > > >>>>> could be related to patch
> > > >>>>> https://lore.kernel.org/lkml/[email protected]/
> > > >>>>>
> > > >>>>
> > > >>>> Thanks for the report. It's possible, but I'll need some help from you
> > > >>>> to get more info.
> > > >>>>
> > > >>>> First, can you send me the .config?
> > > >>>
> > > >>> The .config is on
> > > >>> https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config
> > > >>
> > > >> Thanks.
> > > >>
> > > >>>>
> > > >>>>>
> > > >>>>> [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
> > > >>>>
> > > >>>> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
> > > >>>> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
> > > >>>>
> > > >>>
> > > >>> CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
> > > >>> CONFIG_UEVENT_HELPER is not set"
> > > >>
> > > >> OK. Then I assume some quite early initcall does a request_module() or
> > > >> request_firmware() (or similar). I don't think this matters - that call
> > > >> would be done before the initramfs was unpacked with or without my
> > > >> patch, so it won't find anything in the empty rootfs. It's just my patch
> > > >> added a note. But just to figure out where that triggers, can you do
> > > >>
> > > >> - pr_warn_once("wait_for_initramfs() called before
> > > >> rootfs_initcalls\n");
> > > >> + WARN_ONCE(1, "wait_for_initramfs() called before
> > > >> rootfs_initcalls\n");
> > > >>
> > > >> in init/initramfs.c.
> > > >>
> > > >
> > > > I've managed to reproduce the panic with the patch.
> > > >
> > > > [ 1.498654] NIP [c0000000000137d4] wait_for_initramfs+0x94/0xa4
> > > > [ 1.498661] LR [c0000000000137d0] wait_for_initramfs+0x90/0xa4
> > > > [ 1.498668] Call Trace:
> > > > [ 1.498671] [c000000027debd60] [c0000000000137d0]
> > > > wait_for_initramfs+0x90/0xa4 (unreliable)
> > > > [ 1.498680] [c000000027debdc0] [c000000000172fc8]
> > > > call_usermodehelper_exec_async+0x178/0x2c0
> > > > [ 1.498691] [c000000027debe10] [c00000000000d6ec]
> > > > ret_from_kernel_thread+0x5c/0x70
> > >
> > > Thanks, but unfortunately (and I should have known better) that doesn't
> > > tell us who actually initiated that call_usermodehelper - it's most
> > > likely some request_module() call. But again, I don't think this is
> > > related to the later crash.
> > >
> > > >>>>> [ 1.764430] Initramfs unpacking failed: no cpio magic
> > > >>>>
> > > >>>> Whoa, that's not good. Did something scramble over the initramfs memory
> > > >>>> while it was being unpacked? It's been .2 seconds since the start of the
> > > >>>> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
> > > >>>>
> > > >>>> Can you try booting with initramfs_async=0 on the command line and see
> > > >>>> if the kernel still crashes?
> > > >
> > > > Using initramfs_async=0 I was also able to reproduce the panic.
> > >
> > > Hm, that's very interesting. Can you share the log for that as well?
> >
> > [ 0.000000] Kernel command line:
> > root=UUID=72f391f6-e71f-41a6-ba16-2c25460203ed ro initramfs_async=0
> > [ 0.000000] printk: log_buf_len individual max cpu contribution: 4096 bytes
> > [ 0.000000] printk: log_buf_len total cpu_extra contributions: 782336 bytes
> > [ 0.000000] printk: log_buf_len min size: 262144 bytes
> > [ 0.000000] printk: log_buf_len: 1048576 bytes
> > [ 0.000000] printk: early log buf free: 253968(96%)
> > [ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> > [ 0.000000] Memory: 1017540928K/1073741824K available (17472K
> > kernel code, 3072K rwdata, 4992K rodata, 5760K init, 1818K bss,
> > 2510528K reserved, 53690368K cma-reserved)
> > [ 0.000000] random: get_random_u64 called from
> > __kmem_cache_create+0x3c/0x770 with crng_init=0
> > <snip>
> > [ 1.366341] HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> > [ 1.366348] HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> > [ 1.560635] alg: No test for 842 (842-generic)
> > [ 1.560677] alg: No test for 842 (842-scomp)
> > [ 1.560824] wait_for_initramfs() called before rootfs_initcalls
> > [ 1.565123] raid6: skip pq benchmark and using algorithm vpermxor8
> > [ 1.565132] raid6: using intx1 recovery algorithm
> > [ 1.565318] iommu: Default domain type: Translated
> > [ 1.565402] vgaarb: loaded
> > [ 1.565663] SCSI subsystem initialized
> > [ 1.565829] usbcore: registered new interface driver usbfs
> > [ 1.565841] usbcore: registered new interface driver hub
> > [ 1.565917] usbcore: registered new device driver usb
> > [ 1.565944] pps_core: LinuxPPS API ver. 1 registered
> > [ 1.565949] pps_core: Software ver. 5.3.6 - Copyright 2005-2007
> > Rodolfo Giometti <[email protected]>
> > [ 1.565957] PTP clock support registered
> > [ 1.566137] EDAC MC: Ver: 3.0.0
> > [ 1.566551] NetLabel: Initializing
> > [ 1.566555] NetLabel: domain hash size = 128
> > [ 1.566559] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
> > [ 1.566578] NetLabel: unlabeled traffic allowed by default
> > [ 1.568021] clocksource: Switched to clocksource timebase
> > [ 1.591752] VFS: Disk quotas dquot_6.6.0
> > [ 1.591918] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
> > [ 1.594487] NET: Registered protocol family 2
> > [ 1.594673] IP idents hash table entries: 262144 (order: 5, 2097152
> > bytes, vmalloc)
> > [ 1.600286] tcp_listen_portaddr_hash hash table entries: 65536
> > (order: 4, 1048576 bytes, vmalloc)
> > [ 1.600585] TCP established hash table entries: 524288 (order: 6,
> > 4194304 bytes, vmalloc)
> > [ 1.601814] TCP bind hash table entries: 65536 (order: 4, 1048576
> > bytes, vmalloc)
> > [ 1.601991] TCP: Hash tables configured (established 524288 bind 65536)
> > [ 1.602677] MPTCP token hash table entries: 65536 (order: 4,
> > 1572864 bytes, vmalloc)
> > [ 1.602943] UDP hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
> > [ 1.603347] UDP-Lite hash table entries: 65536 (order: 5, 2097152
> > bytes, vmalloc)
> > [ 1.604431] NET: Registered protocol family 1
> > [ 1.604443] NET: Registered protocol family 44
> > [ 1.604571] pci 0005:03:00.0: enabling device (0140 -> 0142)
> > [ 1.604656] PCI: CLS 128 bytes, default 128
> > [ 1.604850] Trying to unpack rootfs image as initramfs...
> > [ 1.774342] Initramfs unpacking failed: no cpio magic
> > [ 1.775307] Freeing initrd memory: 18176K
> > [ 1.775594] rtas_flash: no firmware flash support
> > [ 1.780166] Initialise system trusted keyrings
> > [ 1.780190] Key type blacklist registered
> > [ 1.780364] workingset: timestamp_bits=38 max_order=24 bucket_order=0
> > [ 1.782469] zbud: loaded
> > [ 1.825156] NET: Registered protocol family 38
> > [ 1.825169] xor: measuring software checksum speed
> > [ 1.825925] 8regs : 13170 MB/sec
> > [ 1.826748] 8regs_prefetch : 12031 MB/sec
> > [ 1.827473] 32regs : 13662 MB/sec
> > [ 1.828249] 32regs_prefetch : 12820 MB/sec
> > [ 1.828635] altivec : 25906 MB/sec
> > [ 1.828639] xor: using function: altivec (25906 MB/sec)
> > [ 1.828645] Key type asymmetric registered
> > [ 1.828649] Asymmetric key parser 'x509' registered
> > [ 1.828661] Block layer SCSI generic (bsg) driver version 0.4
> > loaded (major 245)
> > [ 1.828820] io scheduler mq-deadline registered
> > [ 1.828825] io scheduler kyber registered
> > [ 1.828932] io scheduler bfq registered
> > [ 1.831304] atomic64_test: passed
> > [ 1.832070] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> > [ 1.832277] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
> > [ 1.832527] hvc1: hvsi protocol on /ibm,opal/consoles/serial@1
> > [ 1.832555] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> > [ 1.834347] Non-volatile memory driver v1.3
> > [ 1.836184] libphy: Fixed MDIO Bus: probed
> > [ 1.836295] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> > [ 1.836311] ehci-pci: EHCI PCI platform driver
> > [ 1.836325] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> > [ 1.836342] ohci-pci: OHCI PCI platform driver
> > [ 1.836356] uhci_hcd: USB Universal Host Controller Interface driver
> > [ 1.836523] xhci_hcd 0005:03:00.0: xHCI Host Controller
> > [ 1.836635] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
> > bus number 1
> > [ 1.836788] xhci_hcd 0005:03:00.0: hcc params 0x0270f06d hci
> > version 0x96 quirks 0x0000000004000000
> > [ 1.837236] usb usb1: New USB device found, idVendor=1d6b,
> > idProduct=0002, bcdDevice= 5.13
> > [ 1.837245] usb usb1: New USB device strings: Mfr=3, Product=2,
> > SerialNumber=1
> > [ 1.837252] usb usb1: Product: xHCI Host Controller
> > [ 1.837257] usb usb1: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
> > [ 1.837263] usb usb1: SerialNumber: 0005:03:00.0
> > [ 1.837418] hub 1-0:1.0: USB hub found
> > [ 1.837433] hub 1-0:1.0: 4 ports detected
> > [ 1.837605] xhci_hcd 0005:03:00.0: xHCI Host Controller
> > [ 1.837662] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
> > bus number 2
> > [ 1.837679] xhci_hcd 0005:03:00.0: Host supports USB 3.0 SuperSpeed
> > [ 1.837710] usb usb2: We don't know the algorithms for LPM for this
> > host, disabling LPM.
> > [ 1.837742] usb usb2: New USB device found, idVendor=1d6b,
> > idProduct=0003, bcdDevice= 5.13
> > [ 1.837750] usb usb2: New USB device strings: Mfr=3, Product=2,
> > SerialNumber=1
> > [ 1.837756] usb usb2: Product: xHCI Host Controller
> > [ 1.837761] usb usb2: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
> > [ 1.837767] usb usb2: SerialNumber: 0005:03:00.0
> > [ 1.837884] hub 2-0:1.0: USB hub found
> > [ 1.837897] hub 2-0:1.0: 4 ports detected
> > [ 1.838074] usbcore: registered new interface driver usbserial_generic
> > [ 1.838086] usbserial: USB Serial support registered for generic
> > [ 1.838123] mousedev: PS/2 mouse device common for all mice
> > [ 1.841900] device-mapper: uevent: version 1.0.3
> > [ 1.842050] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22)
> > initialised: [email protected]
> > [ 1.842352] powernv-cpufreq: cpufreq pstate min 0xffffffda nominal
> > 0xfffffff7 max 0x0
> > [ 1.842358] powernv-cpufreq: Workload Optimized Frequency is
> > disabled in the platform
> > [ 1.847336] hid: raw HID events driver (C) Jiri Kosina
> > [ 1.847386] usbcore: registered new interface driver usbhid
> > [ 1.847391] usbhid: USB HID core driver
> > [ 1.847646] drop_monitor: Initializing network drop monitor service
> > [ 1.847817] Initializing XFRM netlink socket
> > [ 1.848236] NET: Registered protocol family 10
> > [ 1.861274] Segment Routing with IPv6
> > [ 1.861288] RPL Segment Routing with IPv6
> > [ 1.861313] mip6: Mobile IPv6
> > [ 1.861318] NET: Registered protocol family 17
> > [ 1.861392] secvar-sysfs: secvar: failed to retrieve secvar operations.
> > [ 1.861424] drmem: No dynamic reconfiguration memory found
> > [ 1.861631] registered taskstats version 1
> > [ 1.861650] Loading compiled-in X.509 certificates
> > [ 1.862965] Loaded X.509 cert 'Build time autogenerated kernel key:
> > 97604d93c367cf27b215cdfd062467d582f7e126'
> > [ 1.864430] zswap: loaded using pool lzo/zbud
> > [ 1.864634] debug_vm_pgtable: [debug_vm_pgtable ]:
> > Validating architecture page table helpers
> > [ 1.865112] Key type ._fscrypt registered
> > [ 1.865118] Key type .fscrypt registered
> > [ 1.865124] Key type fscrypt-provisioning registered
> > [ 1.866849] Btrfs loaded, crc32c=crc32c-generic, zoned=yes
> > [ 1.866898] pstore: Using crash dump compression: deflate
> > [ 1.867485] Key type encrypted registered
> > [ 1.867690] Secure boot mode disabled
> > [ 1.867725] ima: No TPM chip found, activating TPM-bypass!
> > [ 1.867735] Loading compiled-in module X.509 certificates
> > [ 1.869037] Loaded X.509 cert 'Build time autogenerated kernel key:
> > 97604d93c367cf27b215cdfd062467d582f7e126'
> > [ 1.869049] ima: Allocated hash algorithm: sha256
> > [ 1.869223] Secure boot mode disabled
> > [ 1.869378] Trusted boot mode disabled
> > [ 1.869385] ima: No architecture policies found
> > [ 1.869411] evm: Initialising EVM extended attributes:
> > [ 1.869418] evm: security.selinux
> > [ 1.869423] evm: security.ima
> > [ 1.869428] evm: security.capability
> > [ 1.869433] evm: HMAC attrs: 0x1
> > [ 1.872544] Freeing unused kernel memory: 5760K
> > [ 1.872555] Kernel memory protection not selected by kernel config.
> > [ 1.872566] Run /init as init process
> > [ 1.872706] Failed to execute /init (error -2)
> > [ 1.872713] Run /sbin/init as init process
> > [ 1.872755] Run /etc/init as init process
> > [ 1.872794] Run /bin/init as init process
> > [ 1.872834] Run /bin/sh as init process
> > [ 1.872891] Kernel panic - not syncing: No working init found. Try
> > passing init= option to kernel. See Linux
> > Documentation/admin-guide/init.rst for guidance.
> > [ 1.872906] CPU: 42 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc3 #1
> > [ 1.872917] Call Trace:
> > [ 1.872922] [c0000080010d7cc0] [c0000000009f2640]
> > dump_stack+0xc4/0x114 (unreliable)
> > [ 1.872939] [c0000080010d7d10] [c00000000014b9e0] panic+0x168/0x408
> > [ 1.872952] [c0000080010d7da0] [c000000000012964] kernel_init+0x14c/0x168
> > [ 1.872964] [c0000080010d7e10] [c00000000000d6ec]
> > ret_from_kernel_thread+0x5c/0x70
> >
> > >
> > > And, perhaps asking a silly question, does the crash go away if you
> > > revert e7cb072eb988e46295512617c39d004f9e1c26f8 ?
>
> Okay, indeed the problem is not with this commit; I've just hit the
> panic with the commit reverted.
>
> [ 1.302017] EEH: Capable adapter found: recovery enabled.
> [ 1.308206] opal-power: OPAL EPOW, DPO support detected.
> [ 1.308866] powernv-rng: Registering arch random hook.
> [ 1.310984] Kprobes globally optimized
> [ 1.311466] HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> [ 1.311473] HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> [ 1.505658] alg: No test for 842 (842-generic)
> [ 1.505699] alg: No test for 842 (842-scomp)
> [ 1.510121] raid6: skip pq benchmark and using algorithm vpermxor8
> [ 1.510128] raid6: using intx1 recovery algorithm
> [ 1.510314] iommu: Default domain type: Translated
> [ 1.510402] vgaarb: loaded
> [ 1.510579] SCSI subsystem initialized
> [ 1.510732] usbcore: registered new interface driver usbfs
> [ 1.510745] usbcore: registered new interface driver hub
> [ 1.510803] usbcore: registered new device driver usb
> [ 1.510829] pps_core: LinuxPPS API ver. 1 registered
> [ 1.510834] pps_core: Software ver. 5.3.6 - Copyright 2005-2007
> Rodolfo Giometti <[email protected]>
> [ 1.510843] PTP clock support registered
> [ 1.511011] EDAC MC: Ver: 3.0.0
> [ 1.511399] NetLabel: Initializing
> [ 1.511404] NetLabel: domain hash size = 128
> [ 1.511408] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
> [ 1.511427] NetLabel: unlabeled traffic allowed by default
> [ 1.512949] clocksource: Switched to clocksource timebase
> [ 1.536237] VFS: Disk quotas dquot_6.6.0
> [ 1.536380] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
> [ 1.538890] NET: Registered protocol family 2
> [ 1.539074] IP idents hash table entries: 262144 (order: 5, 2097152
> bytes, vmalloc)
> [ 1.544621] tcp_listen_portaddr_hash hash table entries: 65536
> (order: 4, 1048576 bytes, vmalloc)
> [ 1.544887] TCP established hash table entries: 524288 (order: 6,
> 4194304 bytes, vmalloc)
> [ 1.546117] TCP bind hash table entries: 65536 (order: 4, 1048576
> bytes, vmalloc)
> [ 1.546298] TCP: Hash tables configured (established 524288 bind 65536)
> [ 1.546993] MPTCP token hash table entries: 65536 (order: 4,
> 1572864 bytes, vmalloc)
> [ 1.547255] UDP hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
> [ 1.547648] UDP-Lite hash table entries: 65536 (order: 5, 2097152
> bytes, vmalloc)
> [ 1.548726] NET: Registered protocol family 1
> [ 1.548739] NET: Registered protocol family 44
> [ 1.548864] pci 0005:03:00.0: enabling device (0140 -> 0142)
> [ 1.548936] PCI: CLS 128 bytes, default 128
> [ 1.549005] Trying to unpack rootfs image as initramfs...
> [ 1.723810] Initramfs unpacking failed: junk within compressed archive
This sounds similar to an old bug we had on Amazon instances, which was
fixed by the commit below:
commit 428c491332bca498c8eb2127669af51506c346c7
Author: Guilherme G. Piccoli <[email protected]>
Date: Fri Mar 20 09:55:34 2020 -0300
net: ena: Add PCI shutdown handler to allow safe kexec
I think it would be helpful if the petitboot people could do some debugging
to see whether anything went wrong in the petitboot kernel/drivers.
> [ 1.725056] Freeing initrd memory: 20992K
> [ 1.725342] rtas_flash: no firmware flash support
> [ 1.729768] Initialise system trusted keyrings
> [ 1.729792] Key type blacklist registered
> [ 1.729965] workingset: timestamp_bits=38 max_order=24 bucket_order=0
> [ 1.732064] zbud: loaded
> [ 1.775945] NET: Registered protocol family 38
> [ 1.775963] xor: measuring software checksum speed
> [ 1.776740] 8regs : 13166 MB/sec
> [ 1.777564] 8regs_prefetch : 12024 MB/sec
> [ 1.778289] 32regs : 13652 MB/sec
> [ 1.779061] 32regs_prefetch : 12819 MB/sec
> [ 1.779447] altivec : 25821 MB/sec
> [ 1.779452] xor: using function: altivec (25821 MB/sec)
> [ 1.779457] Key type asymmetric registered
> [ 1.779462] Asymmetric key parser 'x509' registered
> [ 1.779495] Block layer SCSI generic (bsg) driver version 0.4
> loaded (major 245)
> [ 1.779652] io scheduler mq-deadline registered
> [ 1.779657] io scheduler kyber registered
> [ 1.779790] io scheduler bfq registered
> [ 1.782225] atomic64_test: passed
> [ 1.783089] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> [ 1.783314] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
> [ 1.783555] hvc1: hvsi protocol on /ibm,opal/consoles/serial@1
> [ 1.783582] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> [ 1.785366] Non-volatile memory driver v1.3
> [ 1.787231] libphy: Fixed MDIO Bus: probed
> [ 1.787345] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> [ 1.787360] ehci-pci: EHCI PCI platform driver
> [ 1.787374] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> [ 1.787391] ohci-pci: OHCI PCI platform driver
> [ 1.787404] uhci_hcd: USB Universal Host Controller Interface driver
> [ 1.787580] xhci_hcd 0005:03:00.0: xHCI Host Controller
> [ 1.787693] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
> bus number 1
> [ 1.787844] xhci_hcd 0005:03:00.0: hcc params 0x0270f06d hci
> version 0x96 quirks 0x0000000004000000
> [ 1.788305] usb usb1: New USB device found, idVendor=1d6b,
> idProduct=0002, bcdDevice= 5.13
> [ 1.788314] usb usb1: New USB device strings: Mfr=3, Product=2,
> SerialNumber=1
> [ 1.788322] usb usb1: Product: xHCI Host Controller
> [ 1.788327] usb usb1: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
> [ 1.788333] usb usb1: SerialNumber: 0005:03:00.0
> [ 1.788498] hub 1-0:1.0: USB hub found
> [ 1.788514] hub 1-0:1.0: 4 ports detected
> [ 1.788685] xhci_hcd 0005:03:00.0: xHCI Host Controller
> [ 1.788745] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
> bus number 2
> [ 1.788763] xhci_hcd 0005:03:00.0: Host supports USB 3.0 SuperSpeed
> [ 1.788787] usb usb2: We don't know the algorithms for LPM for this
> host, disabling LPM.
> [ 1.788819] usb usb2: New USB device found, idVendor=1d6b,
> idProduct=0003, bcdDevice= 5.13
> [ 1.788827] usb usb2: New USB device strings: Mfr=3, Product=2,
> SerialNumber=1
> [ 1.788834] usb usb2: Product: xHCI Host Controller
> [ 1.788839] usb usb2: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
> [ 1.788845] usb usb2: SerialNumber: 0005:03:00.0
> [ 1.788961] hub 2-0:1.0: USB hub found
> [ 1.788974] hub 2-0:1.0: 4 ports detected
> [ 1.789146] usbcore: registered new interface driver usbserial_generic
> [ 1.789157] usbserial: USB Serial support registered for generic
> [ 1.789196] mousedev: PS/2 mouse device common for all mice
> [ 1.793122] device-mapper: uevent: version 1.0.3
> [ 1.793282] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22)
> initialised: [email protected]
> [ 1.793588] powernv-cpufreq: cpufreq pstate min 0xffffffda nominal
> 0xfffffff7 max 0x0
> [ 1.793595] powernv-cpufreq: Workload Optimized Frequency is
> disabled in the platform
> [ 1.798302] hid: raw HID events driver (C) Jiri Kosina
> [ 1.798344] usbcore: registered new interface driver usbhid
> [ 1.798349] usbhid: USB HID core driver
> [ 1.798600] drop_monitor: Initializing network drop monitor service
> [ 1.798750] Initializing XFRM netlink socket
> [ 1.799148] NET: Registered protocol family 10
> [ 1.811611] Segment Routing with IPv6
> [ 1.811627] RPL Segment Routing with IPv6
> [ 1.811653] mip6: Mobile IPv6
> [ 1.811658] NET: Registered protocol family 17
> [ 1.811729] secvar-sysfs: secvar: failed to retrieve secvar operations.
> [ 1.811761] drmem: No dynamic reconfiguration memory found
> [ 1.811937] registered taskstats version 1
> [ 1.811955] Loading compiled-in X.509 certificates
> [ 1.812905] Loaded X.509 cert 'Build time autogenerated kernel key:
> 5a49ad3d49566246c5ef57be0cf7d450502ed699'
> [ 1.813953] zswap: loaded using pool lzo/zbud
> [ 1.814131] debug_vm_pgtable: [debug_vm_pgtable ]:
> Validating architecture page table helpers
> [ 1.814498] Key type ._fscrypt registered
> [ 1.814503] Key type .fscrypt registered
> [ 1.814506] Key type fscrypt-provisioning registered
> [ 1.815660] Btrfs loaded, crc32c=crc32c-generic, zoned=yes
> [ 1.815695] pstore: Using crash dump compression: deflate
> [ 1.816338] Key type encrypted registered
> [ 1.816484] Secure boot mode disabled
> [ 1.816490] ima: No TPM chip found, activating TPM-bypass!
> [ 1.816496] Loading compiled-in module X.509 certificates
> [ 1.817293] Loaded X.509 cert 'Build time autogenerated kernel key:
> 5a49ad3d49566246c5ef57be0cf7d450502ed699'
> [ 1.817300] ima: Allocated hash algorithm: sha256
> [ 1.817409] Secure boot mode disabled
> [ 1.817506] Trusted boot mode disabled
> [ 1.817510] ima: No architecture policies found
> [ 1.817525] evm: Initialising EVM extended attributes:
> [ 1.817529] evm: security.selinux
> [ 1.817532] evm: security.ima
> [ 1.817535] evm: security.capability
> [ 1.817539] evm: HMAC attrs: 0x1
> [ 1.819569] Freeing unused kernel memory: 5760K
> [ 1.819575] Kernel memory protection not selected by kernel config.
> [ 1.819583] Run /init as init process
> [ 1.819652] Failed to execute /init (error -2)
> [ 1.819658] Run /sbin/init as init process
> [ 1.819684] Run /etc/init as init process
> [ 1.819710] Run /bin/init as init process
> [ 1.819738] Run /bin/sh as init process
> [ 1.819777] Kernel panic - not syncing: No working init found. Try
> passing init= option to kernel. See Linux
> Documentation/admin-guide/init.rst for guidance.
> [ 1.819787] CPU: 8 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc3 #1
> [ 1.819794] Call Trace:
> [ 1.819796] [c0000040010e3cc0] [c0000000009f25c0]
> dump_stack+0xc4/0x114 (unreliable)
> [ 1.819809] [c0000040010e3d10] [c00000000014b960] panic+0x168/0x408
> [ 1.819818] [c0000040010e3da0] [c000000000012964] kernel_init+0x14c/0x168
> [ 1.819825] [c0000040010e3e10] [c00000000000d6ec]
> ret_from_kernel_thread+0x5c/0x70
> [ 1.835866] ---[ end Kernel panic - not syncing: No working init
> found. Try passing init= option to kernel. See Linux
> Documentation/admin-guide/init.rst for guidance. ]
>
> Bruno
> >
> > Sure, I'll try it and let you know.
> >
> > Bruno
> >
> > >
> > > Thanks,
> > > Rasmus
> > >
>
Thanks
Dave
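
To help tell a bad image on disk apart from one damaged in memory after it was loaded, the initramfs file itself can be checked from user space. Below is a minimal sketch, not the kernel's unpacker. It assumes a single, already decompressed newc cpio archive (for example the output of `zcat` on a gzip-compressed image), and the name `walk_newc` is made up for illustration. It walks the headers and reports the first offset where the "070701" magic is missing, which is roughly the condition behind "no cpio magic" in the logs above.

```python
HEADER_LEN = 110          # newc header: 6-byte magic + 13 fields of 8 hex digits
NEWC_MAGIC = b"070701"    # magic of the "newc" cpio format


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4 (newc pads names and data to 4 bytes)."""
    return (n + 3) & ~3


def walk_newc(buf: bytes) -> int:
    """Count the entries before TRAILER!!! in one uncompressed newc archive.

    Raises ValueError naming the offset if a header is truncated or does
    not start with the newc magic. Hypothetical helper for illustration.
    """
    off = 0
    entries = 0
    while True:
        hdr = buf[off:off + HEADER_LEN]
        if len(hdr) < HEADER_LEN:
            raise ValueError(f"truncated header at offset {off}")
        if hdr[:6] != NEWC_MAGIC:
            raise ValueError(f"no cpio magic at offset {off}")
        # 13 ASCII-hex fields follow the magic; filesize is field 6, namesize field 11.
        fields = [int(hdr[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        filesize, namesize = fields[6], fields[11]
        name_start = off + HEADER_LEN
        name = buf[name_start:name_start + namesize].rstrip(b"\0")
        data_start = _align4(name_start + namesize)
        off = _align4(data_start + filesize)
        if name == b"TRAILER!!!":
            return entries
        entries += 1


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        print(walk_newc(f.read()), "entries, all headers carry the newc magic")
```

If the file walks cleanly on disk but the kernel still reports an unpacking failure, that would point towards the image being modified after it was loaded, though this sketch does not verify file contents and does not handle concatenated or compressed segments.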
[re-adding the kexec/petitboot lists]
On 06/15/21 at 11:20am, Dave Young wrote:
> Hi Bruno,
>
> [Cc'ed the kexec and petitboot lists]
> On 06/14/21 at 10:03am, Bruno Goncalves wrote:
> > On Mon, Jun 14, 2021 at 7:47 AM Bruno Goncalves <[email protected]> wrote:
> > >
> > > On Fri, Jun 11, 2021 at 11:49 PM Rasmus Villemoes
> > > <[email protected]> wrote:
> > > >
> > > > On 11/06/2021 17.06, Bruno Goncalves wrote:
> > > > > On Fri, Jun 11, 2021 at 9:13 AM Rasmus Villemoes
> > > > > <[email protected]> wrote:
> > > > >>
> > > > >> On 10/06/2021 17.14, Bruno Goncalves wrote:
> > > > >>> On Thu, Jun 10, 2021 at 3:02 PM Rasmus Villemoes
> > > > >>> <[email protected]> wrote:
> > > > >>>>
> > > > >>>> On 10/06/2021 13.47, Bruno Goncalves wrote:
> > > > >>>>> Hello,
> > > > >>>>>
> > > > >>>>> We've observed in some cases kernel panic when trying to boot on
> > > > >>>>> ppc64le using a kernel based on 5.13.0-rc3. We are not sure if it
> > > > >>>>> could be related to patch
> > > > >>>>> https://lore.kernel.org/lkml/[email protected]/
> > > > >>>>>
> > > > >>>>
> > > > >>>> Thanks for the report. It's possible, but I'll need some help from you
> > > > >>>> to get more info.
> > > > >>>>
> > > > >>>> First, can you send me the .config?
> > > > >>>
> > > > >>> The .config is on
> > > > >>> https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2021/06/09/317881801/build_ppc64le_redhat:1332368174/kernel-block-ppc64le-d3f02e52f5548006f04358d407bbb7fe51255c41.config
> > > > >>
> > > > >> Thanks.
> > > > >>
> > > > >>>>
> > > > >>>>>
> > > > >>>>> [ 1.516075] wait_for_initramfs() called before rootfs_initcalls
> > > > >>>>
> > > > >>>> This is likely because you have CONFIG_UEVENT_HELPER_PATH set to some
> > > > >>>> non-empty path (/sbin/hotplug perhaps). This did get reported once before:
> > > > >>>>
> > > > >>>
> > > > >>> CONFIG_UEVENT_HELPER_PATH is not set. In the .config we have "#
> > > > >>> CONFIG_UEVENT_HELPER is not set"
> > > > >>
> > > > >> OK. Then I assume some quite early initcall does a request_module() or
> > > > >> request_firmware() (or similar). I don't think this matters - that call
> > > > >> would be done before the initramfs was unpacked with or without my
> > > > > >> patch, so it won't find anything in the empty rootfs. It's just that my
> > > > > >> patch added a note. But just to figure out where that triggers, can you do
> > > > >>
> > > > > >> - pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
> > > > > >> + WARN_ONCE(1, "wait_for_initramfs() called before rootfs_initcalls\n");
> > > > >>
> > > > >> in init/initramfs.c.
> > > > >>
> > > > >
> > > > > I've managed to reproduce the panic with the patch.
> > > > >
> > > > > [ 1.498654] NIP [c0000000000137d4] wait_for_initramfs+0x94/0xa4
> > > > > [ 1.498661] LR [c0000000000137d0] wait_for_initramfs+0x90/0xa4
> > > > > [ 1.498668] Call Trace:
> > > > > [ 1.498671] [c000000027debd60] [c0000000000137d0]
> > > > > wait_for_initramfs+0x90/0xa4 (unreliable)
> > > > > [ 1.498680] [c000000027debdc0] [c000000000172fc8]
> > > > > call_usermodehelper_exec_async+0x178/0x2c0
> > > > > [ 1.498691] [c000000027debe10] [c00000000000d6ec]
> > > > > ret_from_kernel_thread+0x5c/0x70
> > > >
> > > > Thanks, but unfortunately (and I should have known better) that doesn't
> > > > tell us who actually initiated that call_usermodehelper - it's most
> > > > likely some request_module() call. But again, I don't think this is
> > > > related to the later crash.
> > > >
> > > > >>>>> [ 1.764430] Initramfs unpacking failed: no cpio magic
> > > > >>>>
> > > > >>>> Whoa, that's not good. Did something scramble over the initramfs memory
> > > > >>>> while it was being unpacked? It's been .2 seconds since the start of the
> > > > >>>> unpacking, so it's unlikely the very beginning of the initramfs is corrupt.
> > > > >>>>
> > > > >>>> Can you try booting with initramfs_async=0 on the command line and see
> > > > >>>> if the kernel still crashes?
> > > > >
> > > > > Using initramfs_async=0 I was also able to reproduce the panic.
> > > >
> > > > Hm, that's very interesting. Can you share the log for that as well?
> > >
> > > [ 0.000000] Kernel command line:
> > > root=UUID=72f391f6-e71f-41a6-ba16-2c25460203ed ro initramfs_async=0
> > > [ 0.000000] printk: log_buf_len individual max cpu contribution: 4096 bytes
> > > [ 0.000000] printk: log_buf_len total cpu_extra contributions: 782336 bytes
> > > [ 0.000000] printk: log_buf_len min size: 262144 bytes
> > > [ 0.000000] printk: log_buf_len: 1048576 bytes
> > > [ 0.000000] printk: early log buf free: 253968(96%)
> > > [ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> > > [ 0.000000] Memory: 1017540928K/1073741824K available (17472K
> > > kernel code, 3072K rwdata, 4992K rodata, 5760K init, 1818K bss,
> > > 2510528K reserved, 53690368K cma-reserved)
> > > [ 0.000000] random: get_random_u64 called from
> > > __kmem_cache_create+0x3c/0x770 with crng_init=0
> > > <snip>
> > > [ 1.366341] HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> > > [ 1.366348] HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> > > [ 1.560635] alg: No test for 842 (842-generic)
> > > [ 1.560677] alg: No test for 842 (842-scomp)
> > > [ 1.560824] wait_for_initramfs() called before rootfs_initcalls
> > > [ 1.565123] raid6: skip pq benchmark and using algorithm vpermxor8
> > > [ 1.565132] raid6: using intx1 recovery algorithm
> > > [ 1.565318] iommu: Default domain type: Translated
> > > [ 1.565402] vgaarb: loaded
> > > [ 1.565663] SCSI subsystem initialized
> > > [ 1.565829] usbcore: registered new interface driver usbfs
> > > [ 1.565841] usbcore: registered new interface driver hub
> > > [ 1.565917] usbcore: registered new device driver usb
> > > [ 1.565944] pps_core: LinuxPPS API ver. 1 registered
> > > [ 1.565949] pps_core: Software ver. 5.3.6 - Copyright 2005-2007
> > > Rodolfo Giometti <[email protected]>
> > > [ 1.565957] PTP clock support registered
> > > [ 1.566137] EDAC MC: Ver: 3.0.0
> > > [ 1.566551] NetLabel: Initializing
> > > [ 1.566555] NetLabel: domain hash size = 128
> > > [ 1.566559] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
> > > [ 1.566578] NetLabel: unlabeled traffic allowed by default
> > > [ 1.568021] clocksource: Switched to clocksource timebase
> > > [ 1.591752] VFS: Disk quotas dquot_6.6.0
> > > [ 1.591918] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
> > > [ 1.594487] NET: Registered protocol family 2
> > > [ 1.594673] IP idents hash table entries: 262144 (order: 5, 2097152
> > > bytes, vmalloc)
> > > [ 1.600286] tcp_listen_portaddr_hash hash table entries: 65536
> > > (order: 4, 1048576 bytes, vmalloc)
> > > [ 1.600585] TCP established hash table entries: 524288 (order: 6,
> > > 4194304 bytes, vmalloc)
> > > [ 1.601814] TCP bind hash table entries: 65536 (order: 4, 1048576
> > > bytes, vmalloc)
> > > [ 1.601991] TCP: Hash tables configured (established 524288 bind 65536)
> > > [ 1.602677] MPTCP token hash table entries: 65536 (order: 4,
> > > 1572864 bytes, vmalloc)
> > > [ 1.602943] UDP hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
> > > [ 1.603347] UDP-Lite hash table entries: 65536 (order: 5, 2097152
> > > bytes, vmalloc)
> > > [ 1.604431] NET: Registered protocol family 1
> > > [ 1.604443] NET: Registered protocol family 44
> > > [ 1.604571] pci 0005:03:00.0: enabling device (0140 -> 0142)
> > > [ 1.604656] PCI: CLS 128 bytes, default 128
> > > [ 1.604850] Trying to unpack rootfs image as initramfs...
> > > [ 1.774342] Initramfs unpacking failed: no cpio magic
> > > [ 1.775307] Freeing initrd memory: 18176K
> > > [ 1.775594] rtas_flash: no firmware flash support
> > > [ 1.780166] Initialise system trusted keyrings
> > > [ 1.780190] Key type blacklist registered
> > > [ 1.780364] workingset: timestamp_bits=38 max_order=24 bucket_order=0
> > > [ 1.782469] zbud: loaded
> > > [ 1.825156] NET: Registered protocol family 38
> > > [ 1.825169] xor: measuring software checksum speed
> > > [ 1.825925] 8regs : 13170 MB/sec
> > > [ 1.826748] 8regs_prefetch : 12031 MB/sec
> > > [ 1.827473] 32regs : 13662 MB/sec
> > > [ 1.828249] 32regs_prefetch : 12820 MB/sec
> > > [ 1.828635] altivec : 25906 MB/sec
> > > [ 1.828639] xor: using function: altivec (25906 MB/sec)
> > > [ 1.828645] Key type asymmetric registered
> > > [ 1.828649] Asymmetric key parser 'x509' registered
> > > [ 1.828661] Block layer SCSI generic (bsg) driver version 0.4
> > > loaded (major 245)
> > > [ 1.828820] io scheduler mq-deadline registered
> > > [ 1.828825] io scheduler kyber registered
> > > [ 1.828932] io scheduler bfq registered
> > > [ 1.831304] atomic64_test: passed
> > > [ 1.832070] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> > > [ 1.832277] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
> > > [ 1.832527] hvc1: hvsi protocol on /ibm,opal/consoles/serial@1
> > > [ 1.832555] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> > > [ 1.834347] Non-volatile memory driver v1.3
> > > [ 1.836184] libphy: Fixed MDIO Bus: probed
> > > [ 1.836295] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> > > [ 1.836311] ehci-pci: EHCI PCI platform driver
> > > [ 1.836325] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> > > [ 1.836342] ohci-pci: OHCI PCI platform driver
> > > [ 1.836356] uhci_hcd: USB Universal Host Controller Interface driver
> > > [ 1.836523] xhci_hcd 0005:03:00.0: xHCI Host Controller
> > > [ 1.836635] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
> > > bus number 1
> > > [ 1.836788] xhci_hcd 0005:03:00.0: hcc params 0x0270f06d hci
> > > version 0x96 quirks 0x0000000004000000
> > > [ 1.837236] usb usb1: New USB device found, idVendor=1d6b,
> > > idProduct=0002, bcdDevice= 5.13
> > > [ 1.837245] usb usb1: New USB device strings: Mfr=3, Product=2,
> > > SerialNumber=1
> > > [ 1.837252] usb usb1: Product: xHCI Host Controller
> > > [ 1.837257] usb usb1: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
> > > [ 1.837263] usb usb1: SerialNumber: 0005:03:00.0
> > > [ 1.837418] hub 1-0:1.0: USB hub found
> > > [ 1.837433] hub 1-0:1.0: 4 ports detected
> > > [ 1.837605] xhci_hcd 0005:03:00.0: xHCI Host Controller
> > > [ 1.837662] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
> > > bus number 2
> > > [ 1.837679] xhci_hcd 0005:03:00.0: Host supports USB 3.0 SuperSpeed
> > > [ 1.837710] usb usb2: We don't know the algorithms for LPM for this
> > > host, disabling LPM.
> > > [ 1.837742] usb usb2: New USB device found, idVendor=1d6b,
> > > idProduct=0003, bcdDevice= 5.13
> > > [ 1.837750] usb usb2: New USB device strings: Mfr=3, Product=2,
> > > SerialNumber=1
> > > [ 1.837756] usb usb2: Product: xHCI Host Controller
> > > [ 1.837761] usb usb2: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
> > > [ 1.837767] usb usb2: SerialNumber: 0005:03:00.0
> > > [ 1.837884] hub 2-0:1.0: USB hub found
> > > [ 1.837897] hub 2-0:1.0: 4 ports detected
> > > [ 1.838074] usbcore: registered new interface driver usbserial_generic
> > > [ 1.838086] usbserial: USB Serial support registered for generic
> > > [ 1.838123] mousedev: PS/2 mouse device common for all mice
> > > [ 1.841900] device-mapper: uevent: version 1.0.3
> > > [ 1.842050] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22)
> > > initialised: [email protected]
> > > [ 1.842352] powernv-cpufreq: cpufreq pstate min 0xffffffda nominal
> > > 0xfffffff7 max 0x0
> > > [ 1.842358] powernv-cpufreq: Workload Optimized Frequency is
> > > disabled in the platform
> > > [ 1.847336] hid: raw HID events driver (C) Jiri Kosina
> > > [ 1.847386] usbcore: registered new interface driver usbhid
> > > [ 1.847391] usbhid: USB HID core driver
> > > [ 1.847646] drop_monitor: Initializing network drop monitor service
> > > [ 1.847817] Initializing XFRM netlink socket
> > > [ 1.848236] NET: Registered protocol family 10
> > > [ 1.861274] Segment Routing with IPv6
> > > [ 1.861288] RPL Segment Routing with IPv6
> > > [ 1.861313] mip6: Mobile IPv6
> > > [ 1.861318] NET: Registered protocol family 17
> > > [ 1.861392] secvar-sysfs: secvar: failed to retrieve secvar operations.
> > > [ 1.861424] drmem: No dynamic reconfiguration memory found
> > > [ 1.861631] registered taskstats version 1
> > > [ 1.861650] Loading compiled-in X.509 certificates
> > > [ 1.862965] Loaded X.509 cert 'Build time autogenerated kernel key:
> > > 97604d93c367cf27b215cdfd062467d582f7e126'
> > > [ 1.864430] zswap: loaded using pool lzo/zbud
> > > [ 1.864634] debug_vm_pgtable: [debug_vm_pgtable ]:
> > > Validating architecture page table helpers
> > > [ 1.865112] Key type ._fscrypt registered
> > > [ 1.865118] Key type .fscrypt registered
> > > [ 1.865124] Key type fscrypt-provisioning registered
> > > [ 1.866849] Btrfs loaded, crc32c=crc32c-generic, zoned=yes
> > > [ 1.866898] pstore: Using crash dump compression: deflate
> > > [ 1.867485] Key type encrypted registered
> > > [ 1.867690] Secure boot mode disabled
> > > [ 1.867725] ima: No TPM chip found, activating TPM-bypass!
> > > [ 1.867735] Loading compiled-in module X.509 certificates
> > > [ 1.869037] Loaded X.509 cert 'Build time autogenerated kernel key:
> > > 97604d93c367cf27b215cdfd062467d582f7e126'
> > > [ 1.869049] ima: Allocated hash algorithm: sha256
> > > [ 1.869223] Secure boot mode disabled
> > > [ 1.869378] Trusted boot mode disabled
> > > [ 1.869385] ima: No architecture policies found
> > > [ 1.869411] evm: Initialising EVM extended attributes:
> > > [ 1.869418] evm: security.selinux
> > > [ 1.869423] evm: security.ima
> > > [ 1.869428] evm: security.capability
> > > [ 1.869433] evm: HMAC attrs: 0x1
> > > [ 1.872544] Freeing unused kernel memory: 5760K
> > > [ 1.872555] Kernel memory protection not selected by kernel config.
> > > [ 1.872566] Run /init as init process
> > > [ 1.872706] Failed to execute /init (error -2)
> > > [ 1.872713] Run /sbin/init as init process
> > > [ 1.872755] Run /etc/init as init process
> > > [ 1.872794] Run /bin/init as init process
> > > [ 1.872834] Run /bin/sh as init process
> > > [ 1.872891] Kernel panic - not syncing: No working init found. Try
> > > passing init= option to kernel. See Linux
> > > Documentation/admin-guide/init.rst for guidance.
> > > [ 1.872906] CPU: 42 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc3 #1
> > > [ 1.872917] Call Trace:
> > > [ 1.872922] [c0000080010d7cc0] [c0000000009f2640]
> > > dump_stack+0xc4/0x114 (unreliable)
> > > [ 1.872939] [c0000080010d7d10] [c00000000014b9e0] panic+0x168/0x408
> > > [ 1.872952] [c0000080010d7da0] [c000000000012964] kernel_init+0x14c/0x168
> > > [ 1.872964] [c0000080010d7e10] [c00000000000d6ec]
> > > ret_from_kernel_thread+0x5c/0x70
> > >
> > > >
> > > > And, perhaps asking a silly question, does the crash go away if you
> > > > revert e7cb072eb988e46295512617c39d004f9e1c26f8 ?
> >
> > Okay, indeed the problem is not with this commit; I've just hit the
> > panic with the commit reverted.
> >
> > [ 1.302017] EEH: Capable adapter found: recovery enabled.
> > [ 1.308206] opal-power: OPAL EPOW, DPO support detected.
> > [ 1.308866] powernv-rng: Registering arch random hook.
> > [ 1.310984] Kprobes globally optimized
> > [ 1.311466] HugeTLB registered 16.0 MiB page size, pre-allocated 0 pages
> > [ 1.311473] HugeTLB registered 16.0 GiB page size, pre-allocated 0 pages
> > [ 1.505658] alg: No test for 842 (842-generic)
> > [ 1.505699] alg: No test for 842 (842-scomp)
> > [ 1.510121] raid6: skip pq benchmark and using algorithm vpermxor8
> > [ 1.510128] raid6: using intx1 recovery algorithm
> > [ 1.510314] iommu: Default domain type: Translated
> > [ 1.510402] vgaarb: loaded
> > [ 1.510579] SCSI subsystem initialized
> > [ 1.510732] usbcore: registered new interface driver usbfs
> > [ 1.510745] usbcore: registered new interface driver hub
> > [ 1.510803] usbcore: registered new device driver usb
> > [ 1.510829] pps_core: LinuxPPS API ver. 1 registered
> > [ 1.510834] pps_core: Software ver. 5.3.6 - Copyright 2005-2007
> > Rodolfo Giometti <[email protected]>
> > [ 1.510843] PTP clock support registered
> > [ 1.511011] EDAC MC: Ver: 3.0.0
> > [ 1.511399] NetLabel: Initializing
> > [ 1.511404] NetLabel: domain hash size = 128
> > [ 1.511408] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
> > [ 1.511427] NetLabel: unlabeled traffic allowed by default
> > [ 1.512949] clocksource: Switched to clocksource timebase
> > [ 1.536237] VFS: Disk quotas dquot_6.6.0
> > [ 1.536380] VFS: Dquot-cache hash table entries: 8192 (order 0, 65536 bytes)
> > [ 1.538890] NET: Registered protocol family 2
> > [ 1.539074] IP idents hash table entries: 262144 (order: 5, 2097152
> > bytes, vmalloc)
> > [ 1.544621] tcp_listen_portaddr_hash hash table entries: 65536
> > (order: 4, 1048576 bytes, vmalloc)
> > [ 1.544887] TCP established hash table entries: 524288 (order: 6,
> > 4194304 bytes, vmalloc)
> > [ 1.546117] TCP bind hash table entries: 65536 (order: 4, 1048576
> > bytes, vmalloc)
> > [ 1.546298] TCP: Hash tables configured (established 524288 bind 65536)
> > [ 1.546993] MPTCP token hash table entries: 65536 (order: 4,
> > 1572864 bytes, vmalloc)
> > [ 1.547255] UDP hash table entries: 65536 (order: 5, 2097152 bytes, vmalloc)
> > [ 1.547648] UDP-Lite hash table entries: 65536 (order: 5, 2097152
> > bytes, vmalloc)
> > [ 1.548726] NET: Registered protocol family 1
> > [ 1.548739] NET: Registered protocol family 44
> > [ 1.548864] pci 0005:03:00.0: enabling device (0140 -> 0142)
> > [ 1.548936] PCI: CLS 128 bytes, default 128
> > [ 1.549005] Trying to unpack rootfs image as initramfs...
> > [ 1.723810] Initramfs unpacking failed: junk within compressed archive
>
> This sounds similar to an old bug we had on Amazon instances, which was
> fixed by the commit below:
> commit 428c491332bca498c8eb2127669af51506c346c7
> Author: Guilherme G. Piccoli <[email protected]>
> Date: Fri Mar 20 09:55:34 2020 -0300
>
> net: ena: Add PCI shutdown handler to allow safe kexec
>
> I think it would be helpful if the petitboot people could do some
> debugging to see if anything went wrong in the petitboot kernel/drivers.
>
>
> > [ 1.725056] Freeing initrd memory: 20992K
> > [ 1.725342] rtas_flash: no firmware flash support
> > [ 1.729768] Initialise system trusted keyrings
> > [ 1.729792] Key type blacklist registered
> > [ 1.729965] workingset: timestamp_bits=38 max_order=24 bucket_order=0
> > [ 1.732064] zbud: loaded
> > [ 1.775945] NET: Registered protocol family 38
> > [ 1.775963] xor: measuring software checksum speed
> > [ 1.776740] 8regs : 13166 MB/sec
> > [ 1.777564] 8regs_prefetch : 12024 MB/sec
> > [ 1.778289] 32regs : 13652 MB/sec
> > [ 1.779061] 32regs_prefetch : 12819 MB/sec
> > [ 1.779447] altivec : 25821 MB/sec
> > [ 1.779452] xor: using function: altivec (25821 MB/sec)
> > [ 1.779457] Key type asymmetric registered
> > [ 1.779462] Asymmetric key parser 'x509' registered
> > [ 1.779495] Block layer SCSI generic (bsg) driver version 0.4
> > loaded (major 245)
> > [ 1.779652] io scheduler mq-deadline registered
> > [ 1.779657] io scheduler kyber registered
> > [ 1.779790] io scheduler bfq registered
> > [ 1.782225] atomic64_test: passed
> > [ 1.783089] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> > [ 1.783314] hvc0: raw protocol on /ibm,opal/consoles/serial@0 (boot console)
> > [ 1.783555] hvc1: hvsi protocol on /ibm,opal/consoles/serial@1
> > [ 1.783582] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> > [ 1.785366] Non-volatile memory driver v1.3
> > [ 1.787231] libphy: Fixed MDIO Bus: probed
> > [ 1.787345] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> > [ 1.787360] ehci-pci: EHCI PCI platform driver
> > [ 1.787374] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> > [ 1.787391] ohci-pci: OHCI PCI platform driver
> > [ 1.787404] uhci_hcd: USB Universal Host Controller Interface driver
> > [ 1.787580] xhci_hcd 0005:03:00.0: xHCI Host Controller
> > [ 1.787693] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
> > bus number 1
> > [ 1.787844] xhci_hcd 0005:03:00.0: hcc params 0x0270f06d hci
> > version 0x96 quirks 0x0000000004000000
> > [ 1.788305] usb usb1: New USB device found, idVendor=1d6b,
> > idProduct=0002, bcdDevice= 5.13
> > [ 1.788314] usb usb1: New USB device strings: Mfr=3, Product=2,
> > SerialNumber=1
> > [ 1.788322] usb usb1: Product: xHCI Host Controller
> > [ 1.788327] usb usb1: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
> > [ 1.788333] usb usb1: SerialNumber: 0005:03:00.0
> > [ 1.788498] hub 1-0:1.0: USB hub found
> > [ 1.788514] hub 1-0:1.0: 4 ports detected
> > [ 1.788685] xhci_hcd 0005:03:00.0: xHCI Host Controller
> > [ 1.788745] xhci_hcd 0005:03:00.0: new USB bus registered, assigned
> > bus number 2
> > [ 1.788763] xhci_hcd 0005:03:00.0: Host supports USB 3.0 SuperSpeed
> > [ 1.788787] usb usb2: We don't know the algorithms for LPM for this
> > host, disabling LPM.
> > [ 1.788819] usb usb2: New USB device found, idVendor=1d6b,
> > idProduct=0003, bcdDevice= 5.13
> > [ 1.788827] usb usb2: New USB device strings: Mfr=3, Product=2,
> > SerialNumber=1
> > [ 1.788834] usb usb2: Product: xHCI Host Controller
> > [ 1.788839] usb usb2: Manufacturer: Linux 5.13.0-rc3 xhci-hcd
> > [ 1.788845] usb usb2: SerialNumber: 0005:03:00.0
> > [ 1.788961] hub 2-0:1.0: USB hub found
> > [ 1.788974] hub 2-0:1.0: 4 ports detected
> > [ 1.789146] usbcore: registered new interface driver usbserial_generic
> > [ 1.789157] usbserial: USB Serial support registered for generic
> > [ 1.789196] mousedev: PS/2 mouse device common for all mice
> > [ 1.793122] device-mapper: uevent: version 1.0.3
> > [ 1.793282] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22)
> > initialised: [email protected]
> > [ 1.793588] powernv-cpufreq: cpufreq pstate min 0xffffffda nominal
> > 0xfffffff7 max 0x0
> > [ 1.793595] powernv-cpufreq: Workload Optimized Frequency is
> > disabled in the platform
> > [ 1.798302] hid: raw HID events driver (C) Jiri Kosina
> > [ 1.798344] usbcore: registered new interface driver usbhid
> > [ 1.798349] usbhid: USB HID core driver
> > [ 1.798600] drop_monitor: Initializing network drop monitor service
> > [ 1.798750] Initializing XFRM netlink socket
> > [ 1.799148] NET: Registered protocol family 10
> > [ 1.811611] Segment Routing with IPv6
> > [ 1.811627] RPL Segment Routing with IPv6
> > [ 1.811653] mip6: Mobile IPv6
> > [ 1.811658] NET: Registered protocol family 17
> > [ 1.811729] secvar-sysfs: secvar: failed to retrieve secvar operations.
> > [ 1.811761] drmem: No dynamic reconfiguration memory found
> > [ 1.811937] registered taskstats version 1
> > [ 1.811955] Loading compiled-in X.509 certificates
> > [ 1.812905] Loaded X.509 cert 'Build time autogenerated kernel key:
> > 5a49ad3d49566246c5ef57be0cf7d450502ed699'
> > [ 1.813953] zswap: loaded using pool lzo/zbud
> > [ 1.814131] debug_vm_pgtable: [debug_vm_pgtable ]:
> > Validating architecture page table helpers
> > [ 1.814498] Key type ._fscrypt registered
> > [ 1.814503] Key type .fscrypt registered
> > [ 1.814506] Key type fscrypt-provisioning registered
> > [ 1.815660] Btrfs loaded, crc32c=crc32c-generic, zoned=yes
> > [ 1.815695] pstore: Using crash dump compression: deflate
> > [ 1.816338] Key type encrypted registered
> > [ 1.816484] Secure boot mode disabled
> > [ 1.816490] ima: No TPM chip found, activating TPM-bypass!
> > [ 1.816496] Loading compiled-in module X.509 certificates
> > [ 1.817293] Loaded X.509 cert 'Build time autogenerated kernel key:
> > 5a49ad3d49566246c5ef57be0cf7d450502ed699'
> > [ 1.817300] ima: Allocated hash algorithm: sha256
> > [ 1.817409] Secure boot mode disabled
> > [ 1.817506] Trusted boot mode disabled
> > [ 1.817510] ima: No architecture policies found
> > [ 1.817525] evm: Initialising EVM extended attributes:
> > [ 1.817529] evm: security.selinux
> > [ 1.817532] evm: security.ima
> > [ 1.817535] evm: security.capability
> > [ 1.817539] evm: HMAC attrs: 0x1
> > [ 1.819569] Freeing unused kernel memory: 5760K
> > [ 1.819575] Kernel memory protection not selected by kernel config.
> > [ 1.819583] Run /init as init process
> > [ 1.819652] Failed to execute /init (error -2)
> > [ 1.819658] Run /sbin/init as init process
> > [ 1.819684] Run /etc/init as init process
> > [ 1.819710] Run /bin/init as init process
> > [ 1.819738] Run /bin/sh as init process
> > [ 1.819777] Kernel panic - not syncing: No working init found. Try
> > passing init= option to kernel. See Linux
> > Documentation/admin-guide/init.rst for guidance.
> > [ 1.819787] CPU: 8 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc3 #1
> > [ 1.819794] Call Trace:
> > [ 1.819796] [c0000040010e3cc0] [c0000000009f25c0]
> > dump_stack+0xc4/0x114 (unreliable)
> > [ 1.819809] [c0000040010e3d10] [c00000000014b960] panic+0x168/0x408
> > [ 1.819818] [c0000040010e3da0] [c000000000012964] kernel_init+0x14c/0x168
> > [ 1.819825] [c0000040010e3e10] [c00000000000d6ec]
> > ret_from_kernel_thread+0x5c/0x70
> > [ 1.835866] ---[ end Kernel panic - not syncing: No working init
> > found. Try passing init= option to kernel. See Linux
> > Documentation/admin-guide/init.rst for guidance. ]
> >
> > Bruno
> > >
> > > Sure, I'll try it and let you know.
> > >
> > > Bruno
> > >
> > > >
> > > > Thanks,
> > > > Rasmus
> > > >
> >
>
> Thanks
> Dave
>
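
For anyone trying to narrow down the "no cpio magic" and "junk within
compressed archive" failures quoted above, one cheap first step is to rule
out a bad image on disk. The following is only a minimal sketch: the
function names are made up for illustration, and the two-byte compression
magics are an assumption about what the kernel's decompressor table
matches on, so please check them against lib/decompress.c in your tree. A
clean result says nothing about the copy in memory after kexec, which is
what the thread suspects.

```python
# Sketch: classify the first bytes of an initrd image on disk.
# All names here are hypothetical; this is not kernel code.

# Assumed two-byte magics (believed to mirror lib/decompress.c).
COMPRESSION_MAGICS = {
    b"\x1f\x8b": "gzip",
    b"\x1f\x9e": "gzip",
    b"BZ": "bzip2",
    b"\x5d\x00": "lzma",
    b"\xfd\x37": "xz",
    b"\x89\x4c": "lzo",
    b"\x02\x21": "lz4",
    b"\x28\xb5": "zstd",
}


def classify_initrd_head(head: bytes) -> str:
    """Return a rough guess at the format from the leading bytes."""
    if head[:6] == b"070701":
        # The "newc" cpio header, which the initramfs unpacker expects.
        return "cpio-newc"
    if head[:6] == b"070707":
        # Old portable cpio format; the kernel asks for -H newc instead.
        return "cpio-odc"
    return COMPRESSION_MAGICS.get(head[:2], "unknown")


def classify_initrd_file(path: str) -> str:
    """Read the first six bytes of the file at 'path' and classify them."""
    with open(path, "rb") as f:
        return classify_initrd_head(f.read(6))
```

Running classify_initrd_file() on the initramfs that petitboot loads
(whatever path your install uses) should report either "cpio-newc" or one
of the compression names. If it reports "unknown", the image itself is
suspect and the kexec angle may be a red herring.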