Hi,

I'm getting panics when booting from a QEMU hw/nvme device on an aarch64
guest in roughly 20% of boots on v6.2-rc4. Example panic below.

I've bisected it to commit eac3ef262941 ("nvme-pci: split the initial
probe from the rest path").

I'm not seeing this on any other emulated platforms that I'm currently
testing (x86_64, riscv32/64, mips32/64 and sparc64).
nvme nvme0: 1/0/0 default/read/poll queues
NET: Registered PF_VSOCK protocol family
registered taskstats version 1
nvme nvme0: Ignoring bogus Namespace Identifiers
/dev/root: Can't open blockdev
VFS: Cannot open root device "nvme0n1" or unknown-block(0,0): error -6
Please append a correct "root=" boot option; here are the available partitions:
103:00000 61440 nvme0n1
(driver?)
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc4 #22
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace.part.0+0xdc/0xf0
show_stack+0x18/0x30
dump_stack_lvl+0x7c/0xa0
dump_stack+0x18/0x34
panic+0x17c/0x328
mount_block_root+0x184/0x234
mount_root+0x178/0x198
prepare_namespace+0x124/0x164
kernel_init_freeable+0x2a0/0x2c8
kernel_init+0x2c/0x130
ret_from_fork+0x10/0x20
Kernel Offset: disabled
CPU features: 0x00000,01800100,0000420b
Memory Limit: none
---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---
On Mon, Jan 16, 2023 at 10:57:11PM +0100, Klaus Jensen wrote:
> Hi,
>
> I'm getting panics when booting from a QEMU hw/nvme device on an aarch64
> guest in roughly 20% of boots on v6.2-rc4. Example panic below.
This smells like your setup somehow doesn't wait for async driver
probe. Does the hack below work around it?
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b13baccedb4a95..f47e19c701d520 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3508,7 +3508,6 @@ static struct pci_driver nvme_driver = {
 	.remove		= nvme_remove,
 	.shutdown	= nvme_shutdown,
 	.driver		= {
-		.probe_type	= PROBE_PREFER_ASYNCHRONOUS,
 #ifdef CONFIG_PM_SLEEP
 		.pm		= &nvme_dev_pm_ops,
 #endif
On Jan 17 06:58, Christoph Hellwig wrote:
> On Mon, Jan 16, 2023 at 10:57:11PM +0100, Klaus Jensen wrote:
> > Hi,
> >
> > I'm getting panics when booting from a QEMU hw/nvme device on an aarch64
> > guest in roughly 20% of boots on v6.2-rc4. Example panic below.
>
> This smells like your setup somehow doesn't wait for async driver
> probe. Does the hack below work around it?
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index b13baccedb4a95..f47e19c701d520 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3508,7 +3508,6 @@ static struct pci_driver nvme_driver = {
> .remove = nvme_remove,
> .shutdown = nvme_shutdown,
> .driver = {
> - .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> #ifdef CONFIG_PM_SLEEP
> .pm = &nvme_dev_pm_ops,
> #endif
Good morning Christoph,
Yep, the above works.
My setup is a buildroot qemu_aarch64_virt_defconfig booting from an
emulated nvme device:
qemu-system-aarch64 -M "virt" -cpu "cortex-a53" -m 512M \
-nodefaults -nographic -snapshot -no-reboot \
-kernel images/Image \
-append "root=/dev/nvme0n1 console=ttyAMA0,115200" \
-drive file=images/rootfs.ext2,format=raw,if=none,id=d0 \
-device nvme,serial=default,drive=d0 \
-nic user,model=virtio \
-serial stdio
On Jan 17 07:37, Christoph Hellwig wrote:
> On Tue, Jan 17, 2023 at 07:31:59AM +0100, Klaus Jensen wrote:
> > Good morning Christoph,
> >
> > Yep, the above works.
>
> Context for the newly added: This is dropping the newly added
> PROBE_PREFER_ASYNCHRONOUS in nvme, which causes Klaus' arm64 (but not
> other boot tests) to fail. Any idea what could be going wrong there
> probably in userspace?
>
Adding 'rootwait' to the boot parameters does the trick as well.
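
For readers following along, here is a rough userspace sketch of why
'rootwait' hides the problem: instead of giving up when the first lookup
of the root device fails, the boot code keeps polling until the device
shows up. This is only an illustration, not the kernel's code; all toy_*
names, the callback signature, the 5 ms delay and the max_polls bound
are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical lookup callback: returns 0 while the root device is missing. */
typedef unsigned int (*toy_lookup_fn)(void *ctx);

/*
 * Look up the root device once; with rootwait set, keep polling until it
 * appears. max_polls is a bound added for this sketch so it always ends.
 */
static unsigned int toy_wait_for_root(toy_lookup_fn lookup, void *ctx,
				      bool rootwait, int max_polls)
{
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 5 * 1000 * 1000 };
	unsigned int dev = lookup(ctx);

	while (dev == 0 && rootwait && max_polls-- > 0) {
		nanosleep(&delay, NULL);
		dev = lookup(ctx);
	}
	return dev;
}

/* Toy disk that only becomes visible after a number of failed lookups. */
struct toy_late_disk {
	int calls_until_ready;
};

static unsigned int toy_late_lookup(void *ctx)
{
	struct toy_late_disk *disk = ctx;

	if (disk->calls_until_ready > 0) {
		disk->calls_until_ready--;
		return 0;
	}
	return 259;	/* arbitrary non-zero device number for the sketch */
}
```

With a toy disk that needs three failed lookups, calling
toy_wait_for_root() with rootwait=false returns 0 (the "Cannot open root
device" case), while rootwait=true returns the device number.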
On Tue, Jan 17, 2023 at 07:31:59AM +0100, Klaus Jensen wrote:
> Good morning Christoph,
>
> Yep, the above works.
Context for those newly added to Cc: this is about dropping the newly
added PROBE_PREFER_ASYNCHRONOUS in nvme, which causes Klaus' arm64 boot
tests (but not his other boot tests) to fail. Any idea what could be
going wrong there, probably in userspace?
On Tue, 2023-01-17 at 07:37 +0100, Christoph Hellwig wrote:
> On Tue, Jan 17, 2023 at 07:31:59AM +0100, Klaus Jensen wrote:
> > Good morning Christoph,
> >
> > Yep, the above works.
>
> Context for the newly added: This is dropping the newly added
> PROBE_PREFER_ASYNCHRONOUS in nvme, which causes Klaus' arm64 (but not
> other boot tests) to fail. Any idea what could be going wrong there
> probably in userspace?
If this is an aarch64 userspace issue, maybe related to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107678 ?
That bug causes segfaults of user space programs if for some reason the
unwind code is invoked. It happens only if libgcc_s.so is compiled with
gcc 13, and the pauth CPU feature is enabled in qemu.
Martin
On Jan 17 13:11, Martin Wilck wrote:
> On Tue, 2023-01-17 at 07:37 +0100, Christoph Hellwig wrote:
> > On Tue, Jan 17, 2023 at 07:31:59AM +0100, Klaus Jensen wrote:
> > > Good morning Christoph,
> > >
> > > Yep, the above works.
> >
> > Context for the newly added: This is dropping the newly added
> > PROBE_PREFER_ASYNCHRONOUS in nvme, which causes Klaus' arm64 (but not
> > other boot tests) to fail. Any idea what could be going wrong there
> > probably in userspace?
>
> If this is an aarch64 userspace issue, maybe related to
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107678 ?
>
> That bug causes segfaults of user space programs if for some reason the
> unwind code is invoked. It happens only if libgcc_s.so is compiled with
> gcc 13, and the pauth CPU feature is enabled in qemu.
>
> Martin
>
I just observed the same panic on QEMU-emulated ppc64 as well. It's
pretty rare, maybe 1 in 20. 'rootwait' or removing the prefer
asynchronous probe type fixes it there as well.
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]
On 16.01.23 22:57, Klaus Jensen wrote:
>
> I'm getting panics when booting from a QEMU hw/nvme device on an aarch64
> guest in roughly 20% of boots on v6.2-rc4. Example panic below.
>
> I've bisected it to commit eac3ef262941 ("nvme-pci: split the initial
> probe from the rest path").
>
> I'm not seeing this on any other emulated platforms that I'm currently
> testing (x86_64, riscv32/64, mips32/64 and sparc64).
> [...]
Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:
#regzbot ^introduced eac3ef262941
#regzbot title nvme: occasional boot problems due to the newly supported
async driver probe
#regzbot ignore-activity
This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.
Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
On Tue, Jan 17, 2023 at 07:37:35AM +0100, Christoph Hellwig wrote:
> On Tue, Jan 17, 2023 at 07:31:59AM +0100, Klaus Jensen wrote:
> > Good morning Christoph,
> >
> > Yep, the above works.
>
> Context for the newly added: This is dropping the newly added
> PROBE_PREFER_ASYNCHRONOUS in nvme, which causes Klaus' arm64 (but not
> other boot tests) to fail. Any idea what could be going wrong there
> probably in userspace?
Prior to 6.2, the driver would do its own async_schedule, and that
async probe function would flush the first scan work.
wait_for_device_probe() was then forced to wait for the scan_work to
complete, which brings up the root device.
We're not flushing the scan_work anymore from our probe, so this should
fix it for 6.2:
---
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b294b41a149a7..ff97426749976 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3046,6 +3046,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	nvme_start_ctrl(&dev->ctrl);
 	nvme_put_ctrl(&dev->ctrl);
+	flush_work(&dev->ctrl.scan_work);
 	return 0;
 
 out_disable:
--
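
As a rough userspace analogy of the ordering problem described above
(this is not kernel code), a thread can stand in for scan_work, with
pthread_join() playing the role of flush_work(). All toy_* names and the
50 ms scan delay are assumptions made for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the controller; ns_ready models "nvme0n1 exists". */
struct toy_ctrl {
	pthread_t scan_thread;
	pthread_mutex_t lock;
	bool ns_ready;
};

/* Models the deferred namespace scan: it takes a while, then publishes the disk. */
static void *toy_scan_work(void *arg)
{
	struct toy_ctrl *ctrl = arg;
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };

	nanosleep(&delay, NULL);
	pthread_mutex_lock(&ctrl->lock);
	ctrl->ns_ready = true;
	pthread_mutex_unlock(&ctrl->lock);
	return NULL;
}

/*
 * Models probe: start the scan, optionally wait for it, and report whether
 * the disk is visible at the moment probe returns.
 */
static bool toy_probe(bool flush)
{
	struct toy_ctrl ctrl = { .ns_ready = false };
	bool visible;
	int err;

	pthread_mutex_init(&ctrl.lock, NULL);
	err = pthread_create(&ctrl.scan_thread, NULL, toy_scan_work, &ctrl);
	assert(err == 0);

	if (flush)
		pthread_join(ctrl.scan_thread, NULL);	/* analogue of flush_work() */

	pthread_mutex_lock(&ctrl.lock);
	visible = ctrl.ns_ready;
	pthread_mutex_unlock(&ctrl.lock);

	if (!flush)
		pthread_join(ctrl.scan_thread, NULL);	/* cleanup only, after sampling */

	pthread_mutex_destroy(&ctrl.lock);
	return visible;
}
```

In this model toy_probe(true) always reports the disk as visible when
probe returns, so anything waiting for probe completion also waits for
the scan. toy_probe(false) usually returns before the scan has finished,
which corresponds to the intermittent "Cannot open root device" failure.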
On Thu, Jan 19, 2023 at 09:48:56AM -0700, Keith Busch wrote:
> On Tue, Jan 17, 2023 at 07:37:35AM +0100, Christoph Hellwig wrote:
> > On Tue, Jan 17, 2023 at 07:31:59AM +0100, Klaus Jensen wrote:
> > > Good morning Christoph,
> > >
> > > Yep, the above works.
> >
> > Context for the newly added: This is dropping the newly added
> > PROBE_PREFER_ASYNCHRONOUS in nvme, which causes Klaus' arm64 (but not
> > other boot tests) to fail. Any idea what could be going wrong there
> > probably in userspace?
>
> Prior to 6.2, the driver would do its own async_schedule, and that
> async probe function would flush the first scan work.
> wait_for_device_probe() was then forced to wait for the scan_work to
> complete, which brings up the root device.
>
> We're not flushing the scan_work anymore from our probe, so this should
> fix it for 6.2:
Appears to fix my Tigerlake Thinkpad T14 gen2.
Tested-by: Ville Syrjälä <[email protected]>
>
> ---
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index b294b41a149a7..ff97426749976 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3046,6 +3046,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>
> nvme_start_ctrl(&dev->ctrl);
> nvme_put_ctrl(&dev->ctrl);
> + flush_work(&dev->ctrl.scan_work);
> return 0;
>
> out_disable:
> --
>
--
Ville Syrjälä
Intel
[TLDR: there afaics is a fix for the regression discussed in this
thread, but its author did not use a Link: tag to point to the report,
as wanted by Linus and explained in the documentation; this forces me to
write this mail, whose sole purpose is to update the state of this
tracked Linux kernel regression.]
On 19.01.23 14:10, Linux kernel regression tracking (#adding) wrote:
> On 16.01.23 22:57, Klaus Jensen wrote:
>>
>> I'm getting panics when booting from a QEMU hw/nvme device on an aarch64
>> guest in roughly 20% of boots on v6.2-rc4. Example panic below.
>>
>> I've bisected it to commit eac3ef262941 ("nvme-pci: split the initial
>> probe from the rest path").
>>
>> I'm not seeing this on any other emulated platforms that I'm currently
>> testing (x86_64, riscv32/64, mips32/64 and sparc64).
>> [...]
>
> Thanks for the report. To be sure the issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced eac3ef262941
> #regzbot title nvme: occasional boot problems due to the newly supported
> async driver probe
> #regzbot ignore-activity
#regzbot monitor:
https://lore.kernel.org/all/[email protected]/
#regzbot fix: nvme-pci: flush initial scan_work for async probe
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)