2023-06-13 13:50:48

by Richard W.M. Jones

[permalink] [raw]
Subject: printk.time causes rare kernel boot hangs

[Being tracked in this bug which contains much more detail:
https://gitlab.com/qemu-project/qemu/-/issues/1696 ]

Recent kernels hang rarely when booted on qemu. Usually you need to
boot 100s or 1,000s of times to see the hang, compared to 292,612 [sic]
successful boots which I was able to do before the problematic commit.

A reproducer (you'll probably need to use Fedora) is:

$ while guestfish -a /dev/null -v run >& /tmp/log; do echo -n . ; done

You will need to leave it running for probably several hours, and
examine the /tmp/log file at the end.

I tracked this down to the following commit:

commit f31dcb152a3d0816e2f1deab4e64572336da197d
Author: Aaron Thompson <[email protected]>
Date: Thu Apr 13 17:50:12 2023 +0000

sched/clock: Fix local_clock() before sched_clock_init()

Have local_clock() return sched_clock() if sched_clock_init() has not
yet run. sched_clock_cpu() has this check but it was not included in the
new noinstr implementation of local_clock().

(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f31dcb152a3d0816e2f1deab4e64572336da197d)

Reverting this commit fixes the problem.

I don't know _why_ this commit is wrong, but can we revert it as it
causes serious problems with libguestfs hanging randomly.

Or if there's anything you want me to try out then let me know,
because I can reproduce the problem locally quite easily.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html



2023-06-13 14:38:50

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 13.06.23 15:41, Richard W.M. Jones wrote:
> [Being tracked in this bug which contains much more detail:
> https://gitlab.com/qemu-project/qemu/-/issues/1696 ]
>
> Recent kernels hang rarely when booted on qemu. Usually you need to
> boot 100s or 1,000s of times to see the hang, compared to 292,612 [sic]
> successful boots which I was able to do before the problematic commit.
>
> A reproducer (you'll probably need to use Fedora) is:
>
> $ while guestfish -a /dev/null -v run >& /tmp/log; do echo -n . ; done
>
> You will need to leave it running for probably several hours, and
> examine the /tmp/log file at the end.
>
> I tracked this down to the following commit:
>
> commit f31dcb152a3d0816e2f1deab4e64572336da197d
> Author: Aaron Thompson <[email protected]>
> Date: Thu Apr 13 17:50:12 2023 +0000
>
> sched/clock: Fix local_clock() before sched_clock_init()
>
> Have local_clock() return sched_clock() if sched_clock_init() has not
> yet run. sched_clock_cpu() has this check but it was not included in the
> new noinstr implementation of local_clock().
>
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f31dcb152a3d0816e2f1deab4e64572336da197d)
>
> Reverting this commit fixes the problem.
>
> I don't know _why_ this commit is wrong, but can we revert it as it
> causes serious problems with libguestfs hanging randomly.
>
> Or if there's anything you want me to try out then let me know,
> because I can reproduce the problem locally quite easily.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced f31dcb152a3d0816e2f1deab4e64572336da197d
#regzbot title sched/clock: printk.time causes rare kernel boot hangs
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

2023-06-14 09:31:15

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Tue, Jun 13, 2023 at 02:41:05PM +0100, Richard W.M. Jones wrote:
> [Being tracked in this bug which contains much more detail:
> https://gitlab.com/qemu-project/qemu/-/issues/1696 ]

Can I please just get the detail in mail instead of having to go look at
random websites?

> Recent kernels hang rarely when booted on qemu. Usually you need to
> boot 100s or 1,000s of times to see the hang, compared to 292,612 [sic]
> successful boots which I was able to do before the problematic commit.
>
> A reproducer (you'll probably need to use Fedora) is:

Debian only shop here... in fact, I still have machines without systemd.

> $ while guestfish -a /dev/null -v run >& /tmp/log; do echo -n . ; done
>
> You will need to leave it running for probably several hours, and
> examine the /tmp/log file at the end.
>
> I tracked this down to the following commit:
>
> commit f31dcb152a3d0816e2f1deab4e64572336da197d
> Author: Aaron Thompson <[email protected]>
> Date: Thu Apr 13 17:50:12 2023 +0000
>
> sched/clock: Fix local_clock() before sched_clock_init()
>
> Have local_clock() return sched_clock() if sched_clock_init() has not
> yet run. sched_clock_cpu() has this check but it was not included in the
> new noinstr implementation of local_clock().
>
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f31dcb152a3d0816e2f1deab4e64572336da197d)
>
> Reverting this commit fixes the problem.
>
> I don't know _why_ this commit is wrong, but can we revert it as it
> causes serious problems with libguestfs hanging randomly.
>
> Or if there's anything you want me to try out then let me know,
> because I can reproduce the problem locally quite easily.

Well, since it's virt and all, can you attach gdb to the gdb-stub and
see where it's at? Any clue is better than no clue.

2023-06-14 10:20:33

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 11:21:58AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 13, 2023 at 02:41:05PM +0100, Richard W.M. Jones wrote:
> > [Being tracked in this bug which contains much more detail:
> > https://gitlab.com/qemu-project/qemu/-/issues/1696 ]
>
> Can I please just get the detail in mail instead of having to go look at
> random websites?

Sure, the kernel hangs after printing:

[ 0.070120] x86/cpu: User Mode Instruction Prevention (UMIP) activated
[ 0.070120] Last level iTLB entries: 4KB 512, 2MB 255, 4MB 127
[ 0.070120] Last level dTLB entries: 4KB 512, 2MB 255, 4MB 127, 1GB 0
[ 0.070120] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[ 0.070120] Spectre V2 : Mitigation: Retpolines
[ 0.070120] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[ 0.070120] Spectre V2 : Spectre v2 / SpectreRSB : Filling RSB on VMEXIT
[ 0.070120] Spectre V2 : Enabling Speculation Barrier for firmware calls
[ 0.070120] RETBleed: Mitigation: untrained return thunk
[ 0.070120] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
[ 0.070120] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl
[ 0.070120] Freeing SMP alternatives memory: 48K

The next message we'd expect here would be:

[ 0.070794] smpboot: CPU0: AMD Ryzen 9 3900X 12-Core Processor (family: 0x17, model: 0x71, stepping: 0x0)

I believe this bug would affect baremetal too, basically any kernel
compiled with CONFIG_PRINTK_TIME. However the hang is very rare.

> > Recent kernels hang rarely when booted on qemu. Usually you need to
> > boot 100s or 1,000s of times to see the hang, compared to 292,612 [sic]
> > successful boots which I was able to do before the problematic commit.
> >
> > A reproducer (you'll probably need to use Fedora) is:
>
> Debian only shop here... in fact, I still have machines without systemd.

Debian should work too actually, just run the following command until
it hangs:

> > $ while guestfish -a /dev/null -v run >& /tmp/log; do echo -n . ; done
> >
> > You will need to leave it running for probably several hours, and
> > examine the /tmp/log file at the end.
> >
> > I tracked this down to the following commit:
> >
> > commit f31dcb152a3d0816e2f1deab4e64572336da197d
> > Author: Aaron Thompson <[email protected]>
> > Date: Thu Apr 13 17:50:12 2023 +0000
> >
> > sched/clock: Fix local_clock() before sched_clock_init()
> >
> > Have local_clock() return sched_clock() if sched_clock_init() has not
> > yet run. sched_clock_cpu() has this check but it was not included in the
> > new noinstr implementation of local_clock().
> >
> > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f31dcb152a3d0816e2f1deab4e64572336da197d)
> >
> > Reverting this commit fixes the problem.
> >
> > I don't know _why_ this commit is wrong, but can we revert it as it
> > causes serious problems with libguestfs hanging randomly.
> >
> > Or if there's anything you want me to try out then let me know,
> > because I can reproduce the problem locally quite easily.
>
> Well, since it's virt and all, can you attach gdb to the gdb-stub and
> see where it's at? Any clue is better than no clue.

I'll see if this is possible, but I didn't have much luck with gdb on
qemu guests in the past.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org


2023-06-14 10:34:57

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

OK I have a simpler test:

$ while true; do qemu-system-x86_64 -no-user-config -nodefaults -display none -machine accel=kvm:tcg,graphics=off -cpu max,la57=off -m 1280 -no-reboot -rtc driftfix=slew -no-hpet -global kvm-pit.lost_tick_policy=discard -kernel vmlinux -serial stdio -append "panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=7cd9caf4-6491-41a5-b387-38af6fa2d638 selinux=0 guestfs_verbose=1 TERM=xterm-256color" >& /tmp/log ; echo -n . ; done

where "vmlinux" is a locally compiled kernel with CONFIG_PRINTK_TIME=y
set. I'm using the current git head.

This will print dots about once a second. When it stops printing dots
(after potentially many thousands of iterations), take a look at
/tmp/log to see if it has hung at:

[ 0.071830] Freeing SMP alternatives memory: 48K

I tried connecting gdb to qemu but so far I've not successfully
managed to collect a usable stack trace. I can't seem to get it to
print symbols, even though I believe I have disabled kASLR, but still
trying.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


2023-06-14 10:49:03

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

Got it!

#0 arch_static_branch (branch=false, key=<optimized out>)
at ./arch/x86/include/asm/jump_label.h:27
#1 static_key_false (key=<optimized out>) at ./include/linux/jump_label.h:207
#2 native_write_msr (msr=1760, low=1876580734, high=106)
at ./arch/x86/include/asm/msr.h:147
#3 0xffffffff8107997c in paravirt_write_msr (high=<optimized out>,
low=1876580734, msr=1760) at ./arch/x86/include/asm/paravirt.h:196
#4 wrmsrl (val=<optimized out>, msr=1760)
at ./arch/x86/include/asm/paravirt.h:229
#5 lapic_next_deadline (delta=<optimized out>, evt=<optimized out>)
at arch/x86/kernel/apic/apic.c:491
#6 0xffffffff811f7b1d in clockevents_program_event (dev=0xffff88804e820dc0,
expires=<optimized out>, force=<optimized out>)
at kernel/time/clockevents.c:334
#7 0xffffffff811f81b0 in tick_handle_periodic (dev=0xffff88804e820dc0)
at kernel/time/tick-common.c:133
#8 0xffffffff810796c1 in local_apic_timer_interrupt ()
at arch/x86/kernel/apic/apic.c:1095
#9 __sysvec_apic_timer_interrupt (regs=regs@entry=0xffffc90000003ee8)
at arch/x86/kernel/apic/apic.c:1112
#10 0xffffffff81f9cf09 in sysvec_apic_timer_interrupt (regs=0xffffc90000003ee8)
at arch/x86/kernel/apic/apic.c:1106
#11 0xffffffff820015ca in asm_sysvec_apic_timer_interrupt ()
at ./arch/x86/include/asm/idtentry.h:645
#12 0x0000000000000000 in ?? ()

There's only 1 vCPU in the VM. However earlier I did try
with -smp 2 and that also has the same issue.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org


2023-06-14 11:31:55

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 01:16:44PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 14, 2023 at 10:45:22AM +0100, Richard W.M. Jones wrote:
> > Debian should work too actually, just run the following command until
> > it hangs:
> >
> > > > $ while guestfish -a /dev/null -v run >& /tmp/log; do echo -n . ; done
>
> What kernel with guestfish use?

I think it's easier to test using the qemu command I suggested in a
later email. (The command line is very long but should be simple to
understand.) That way you can just use a local vmlinux or vmlinuz
file:

https://lkml.org/lkml/2023/6/14/491

Note for gdb you will need to add '-s' to the qemu command line as well.

Unfortunately since the hang is so rare it's not really possible to
attach gdb at the beginning, since you'd have to do hundreds of runs
before seeing the problem. The stack trace seems like it isn't very
interesting (since the kernel still appears to be running while "hung").

If you have any other suggestions for things I can try please let me
know - this is a pretty important issue for us as it breaks CI tests,
so I can try whatever you like.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org


2023-06-14 11:36:39

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 11:30:11AM +0100, Richard W.M. Jones wrote:
> OK I have a simpler test:
>
> $ while true; do qemu-system-x86_64 -no-user-config -nodefaults -display none -machine accel=kvm:tcg,graphics=off -cpu max,la57=off -m 1280 -no-reboot -rtc driftfix=slew -no-hpet -global kvm-pit.lost_tick_policy=discard -kernel vmlinux -serial stdio -append "panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=7cd9caf4-6491-41a5-b387-38af6fa2d638 selinux=0 guestfs_verbose=1 TERM=xterm-256color" >& /tmp/log ; echo -n . ; done
>
> where "vmlinux" is a locally compiled kernel with CONFIG_PRINTK_TIME=y
> set. I'm using the current git head.

Ah, that's much easier than whatever guestfish does, let me go run this
then.

2023-06-14 11:37:12

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

Also here's the .config I'm using in case it's some other kernel
feature interacting:

http://oirase.annexia.org/tmp/config-bz2213346

It's the full-fat Fedora config, except I disabled kASLR.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


2023-06-14 11:37:40

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 10:45:22AM +0100, Richard W.M. Jones wrote:
> Debian should work too actually, just run the following command until
> it hangs:
>
> > > $ while guestfish -a /dev/null -v run >& /tmp/log; do echo -n . ; done

What kernel with guestfish use?

2023-06-14 11:47:14

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 11:39:53AM +0100, Richard W.M. Jones wrote:
> Got it!
>
> #0 arch_static_branch (branch=false, key=<optimized out>)
> at ./arch/x86/include/asm/jump_label.h:27
> #1 static_key_false (key=<optimized out>) at ./include/linux/jump_label.h:207
> #2 native_write_msr (msr=1760, low=1876580734, high=106)
> at ./arch/x86/include/asm/msr.h:147
> #3 0xffffffff8107997c in paravirt_write_msr (high=<optimized out>,
> low=1876580734, msr=1760) at ./arch/x86/include/asm/paravirt.h:196
> #4 wrmsrl (val=<optimized out>, msr=1760)
> at ./arch/x86/include/asm/paravirt.h:229
> #5 lapic_next_deadline (delta=<optimized out>, evt=<optimized out>)
> at arch/x86/kernel/apic/apic.c:491
> #6 0xffffffff811f7b1d in clockevents_program_event (dev=0xffff88804e820dc0,
> expires=<optimized out>, force=<optimized out>)
> at kernel/time/clockevents.c:334
> #7 0xffffffff811f81b0 in tick_handle_periodic (dev=0xffff88804e820dc0)
> at kernel/time/tick-common.c:133
> #8 0xffffffff810796c1 in local_apic_timer_interrupt ()
> at arch/x86/kernel/apic/apic.c:1095
> #9 __sysvec_apic_timer_interrupt (regs=regs@entry=0xffffc90000003ee8)
> at arch/x86/kernel/apic/apic.c:1112
> #10 0xffffffff81f9cf09 in sysvec_apic_timer_interrupt (regs=0xffffc90000003ee8)
> at arch/x86/kernel/apic/apic.c:1106
> #11 0xffffffff820015ca in asm_sysvec_apic_timer_interrupt ()
> at ./arch/x86/include/asm/idtentry.h:645
> #12 0x0000000000000000 in ?? ()

Uuuhh.. something is really fishy here. The thing in common between the
fingered patch and this stacktrace is the jump_label/static_branch
usage, but they're quite different paths.

There is no printk or local_clock() in sight here.

I've got that plain qemu thing running on:

defconfig + kvm_guest.config + CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y

and have added "nokaslr" to the -append string. Lets see if it wants to
go wobbly for me.



2023-06-14 11:48:24

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 01:35:36PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 14, 2023 at 11:39:53AM +0100, Richard W.M. Jones wrote:
> > Got it!
> >
> > #0 arch_static_branch (branch=false, key=<optimized out>)
> > at ./arch/x86/include/asm/jump_label.h:27
> > #1 static_key_false (key=<optimized out>) at ./include/linux/jump_label.h:207
> > #2 native_write_msr (msr=1760, low=1876580734, high=106)
> > at ./arch/x86/include/asm/msr.h:147
> > #3 0xffffffff8107997c in paravirt_write_msr (high=<optimized out>,
> > low=1876580734, msr=1760) at ./arch/x86/include/asm/paravirt.h:196
> > #4 wrmsrl (val=<optimized out>, msr=1760)
> > at ./arch/x86/include/asm/paravirt.h:229
> > #5 lapic_next_deadline (delta=<optimized out>, evt=<optimized out>)
> > at arch/x86/kernel/apic/apic.c:491
> > #6 0xffffffff811f7b1d in clockevents_program_event (dev=0xffff88804e820dc0,
> > expires=<optimized out>, force=<optimized out>)
> > at kernel/time/clockevents.c:334
> > #7 0xffffffff811f81b0 in tick_handle_periodic (dev=0xffff88804e820dc0)
> > at kernel/time/tick-common.c:133
> > #8 0xffffffff810796c1 in local_apic_timer_interrupt ()
> > at arch/x86/kernel/apic/apic.c:1095
> > #9 __sysvec_apic_timer_interrupt (regs=regs@entry=0xffffc90000003ee8)
> > at arch/x86/kernel/apic/apic.c:1112
> > #10 0xffffffff81f9cf09 in sysvec_apic_timer_interrupt (regs=0xffffc90000003ee8)
> > at arch/x86/kernel/apic/apic.c:1106
> > #11 0xffffffff820015ca in asm_sysvec_apic_timer_interrupt ()
> > at ./arch/x86/include/asm/idtentry.h:645
> > #12 0x0000000000000000 in ?? ()
>
> Uuuhh.. something is really fishy here. The thing in common between the
> fingered patch and this stacktrace is the jump_label/static_branch
> usage, but they're quite different paths.
>
> There is no printk or local_clock() in sight here.

So I should note that the stack trace is collected by connecting to
qemu after the kernel "hangs" ("hangs" in quotes, because it is
probably still running).

It's difficult to collect a stack trace at the moment of the hang
since I basically have to wait for hundreds of boot iterations until I
see the bug manifest.

I wonder if the KVM tracing features would be a better approach here?

> I've got that plain qemu thing running on:
>
> defconfig + kvm_guest.config + CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
>
> and have added "nokaslr" to the -append string. Lets see if it wants to
> go wobbly for me.

Thanks,

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


2023-06-14 13:09:00

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 01:35:36PM +0200, Peter Zijlstra wrote:
> I've got that plain qemu thing running on:
>
> defconfig + kvm_guest.config + CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y

With this config I hit the bug locally after ~ 400 iterations.
However this is on an AMD host, and for some reason on Intel host the
bug is rarer (although it still happens).

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v


2023-06-14 13:14:51

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 02:53:20PM +0200, Peter Zijlstra wrote:
> Ooooh, what qemu version do you have? There were some really dodgy
> reports all around self modifying code, all reported on 7.2, that seems
> to have gone away with 8.

> Now, all of them were using TCG, and I think you're using KVM.

I'm using qemu-system-x86-8.0.0-4.fc39.x86_64 with KVM. The host
kernel is 6.4.0-0.rc5.41.fc39.x86_64.

> I've at least 36000 cycles and still nothing :-(, let me go try your
> .config.

I can definitely reproduce this with your config (but AMD host), so
that's odd. Let me try with Intel hardware and your config. You
really should see this with only a few thousands iterations.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org


2023-06-14 14:23:49

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 01:35:36PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 14, 2023 at 11:39:53AM +0100, Richard W.M. Jones wrote:
> > Got it!
> >
> > #0 arch_static_branch (branch=false, key=<optimized out>)
> > at ./arch/x86/include/asm/jump_label.h:27
> > #1 static_key_false (key=<optimized out>) at ./include/linux/jump_label.h:207
> > #2 native_write_msr (msr=1760, low=1876580734, high=106)
> > at ./arch/x86/include/asm/msr.h:147
> > #3 0xffffffff8107997c in paravirt_write_msr (high=<optimized out>,
> > low=1876580734, msr=1760) at ./arch/x86/include/asm/paravirt.h:196
> > #4 wrmsrl (val=<optimized out>, msr=1760)
> > at ./arch/x86/include/asm/paravirt.h:229
> > #5 lapic_next_deadline (delta=<optimized out>, evt=<optimized out>)
> > at arch/x86/kernel/apic/apic.c:491
> > #6 0xffffffff811f7b1d in clockevents_program_event (dev=0xffff88804e820dc0,
> > expires=<optimized out>, force=<optimized out>)
> > at kernel/time/clockevents.c:334
> > #7 0xffffffff811f81b0 in tick_handle_periodic (dev=0xffff88804e820dc0)
> > at kernel/time/tick-common.c:133
> > #8 0xffffffff810796c1 in local_apic_timer_interrupt ()
> > at arch/x86/kernel/apic/apic.c:1095
> > #9 __sysvec_apic_timer_interrupt (regs=regs@entry=0xffffc90000003ee8)
> > at arch/x86/kernel/apic/apic.c:1112
> > #10 0xffffffff81f9cf09 in sysvec_apic_timer_interrupt (regs=0xffffc90000003ee8)
> > at arch/x86/kernel/apic/apic.c:1106
> > #11 0xffffffff820015ca in asm_sysvec_apic_timer_interrupt ()
> > at ./arch/x86/include/asm/idtentry.h:645
> > #12 0x0000000000000000 in ?? ()
>
> Uuuhh.. something is really fishy here. The thing in common between the
> fingered patch and this stacktrace is the jump_label/static_branch
> usage, but they're quite different paths.
>
> There is no printk or local_clock() in sight here.
>
> I've got that plain qemu thing running on:
>
> defconfig + kvm_guest.config + CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
>
> and have added "nokaslr" to the -append string. Lets see if it wants to
> go wobbly for me.


Ooooh, what qemu version do you have? There were some really dodgy
reports all around self modifying code, all reported on 7.2, that seems
to have gone away with 8.

Now, all of them were using TCG, and I think you're using KVM.

I've at least 36000 cycles and still nothing :-(, let me go try your
.config.



2023-06-14 14:41:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 02:03:48PM +0100, Richard W.M. Jones wrote:
> On Wed, Jun 14, 2023 at 02:53:20PM +0200, Peter Zijlstra wrote:
> > Ooooh, what qemu version do you have? There were some really dodgy
> > reports all around self modifying code, all reported on 7.2, that seems
> > to have gone away with 8.
>
> > Now, all of them were using TCG, and I think you're using KVM.
>
> I'm using qemu-system-x86-8.0.0-4.fc39.x86_64 with KVM. The host
> kernel is 6.4.0-0.rc5.41.fc39.x86_64.
>
> > I've at least 36000 cycles and still nothing :-(, let me go try your
> > .config.
>
> I can definitely reproduce this with your config (but AMD host), so
> that's odd. Let me try with Intel hardware and your config. You
> really should see this with only a few thousands iterations.

Yeah, I'm running on some ancient ivb-ep, let me try on the alderlake. I
don't really have AMD machines at hand :-/

2023-06-14 14:56:11

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 03:09:45PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 14, 2023 at 02:03:48PM +0100, Richard W.M. Jones wrote:
> > On Wed, Jun 14, 2023 at 02:53:20PM +0200, Peter Zijlstra wrote:
> > > Ooooh, what qemu version do you have? There were some really dodgy
> > > reports all around self modifying code, all reported on 7.2, that seems
> > > to have gone away with 8.
> >
> > > Now, all of them were using TCG, and I think you're using KVM.
> >
> > I'm using qemu-system-x86-8.0.0-4.fc39.x86_64 with KVM. The host
> > kernel is 6.4.0-0.rc5.41.fc39.x86_64.
> >
> > > I've at least 36000 cycles and still nothing :-(, let me go try your
> > > .config.
> >
> > I can definitely reproduce this with your config (but AMD host), so
> > that's odd. Let me try with Intel hardware and your config. You
> > really should see this with only a few thousands iterations.
>
> Yeah, I'm running on some ancient ivb-ep, let me try on the alderlake. I
> don't really have AMD machines at hand :-/

Using v6.4-rc6-37-gb6dad5178cea, and 36000+ cycles on the ADL (affine to
big cores) later and nothing :-(

Clearly I'm doing something wrong.

2023-06-14 15:28:16

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 04:07:17PM +0100, Richard W.M. Jones wrote:
> On Wed, Jun 14, 2023 at 04:53:48PM +0200, Peter Zijlstra wrote:

> > Clearly I'm doing something wrong.
>
> I'm suspecting version of qemu may have something to do with it, but
> honestly I don't know. Which qemu are you using?

ivb-ep: QEMU emulator version 7.2.0 (Debian 1:7.2+dfsg-3)
adl: QEMU emulator version 7.2.0 (Debian 1:7.2+dfsg-1+b2)

I'll try and upgrade the adl to 8.

2023-06-14 15:43:32

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 04:53:48PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 14, 2023 at 03:09:45PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 14, 2023 at 02:03:48PM +0100, Richard W.M. Jones wrote:
> > > On Wed, Jun 14, 2023 at 02:53:20PM +0200, Peter Zijlstra wrote:
> > > > Ooooh, what qemu version do you have? There were some really dodgy
> > > > reports all around self modifying code, all reported on 7.2, that seems
> > > > to have gone away with 8.
> > >
> > > > Now, all of them were using TCG, and I think you're using KVM.
> > >
> > > I'm using qemu-system-x86-8.0.0-4.fc39.x86_64 with KVM. The host
> > > kernel is 6.4.0-0.rc5.41.fc39.x86_64.
> > >
> > > > I've at least 36000 cycles and still nothing :-(, let me go try your
> > > > .config.
> > >
> > > I can definitely reproduce this with your config (but AMD host), so
> > > that's odd. Let me try with Intel hardware and your config. You
> > > really should see this with only a few thousands iterations.
> >
> > Yeah, I'm running on some ancient ivb-ep, let me try on the alderlake. I
> > don't really have AMD machines at hand :-/
>
> Using v6.4-rc6-37-gb6dad5178cea, and 36000+ cycles on the ADL (affine to
> big cores) later and nothing :-(

So I have just this minute reproduced it on Intel, after around ~2000
iterations.

On this machine I am using:

Host kernel: 6.3.7-200.fc38.x86_64
Guest kernel: git commit fb054096aea0576f0c0a61c598e5e9676443ee86
Guest config: defconfig + kvm_guest.config + CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
QEMU: qemu-system-x86-8.0.0-4.fc39.x86_64 (using kvm_intel)
Hardware: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz

> Clearly I'm doing something wrong.

I'm suspecting version of qemu may have something to do with it, but
honestly I don't know. Which qemu are you using?

I'll now try with upstream qemu from git in case there's something
about the Fedora qemu.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


2023-06-14 16:11:33

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 05:19:20PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 14, 2023 at 04:07:17PM +0100, Richard W.M. Jones wrote:
> > On Wed, Jun 14, 2023 at 04:53:48PM +0200, Peter Zijlstra wrote:
>
> > > Clearly I'm doing something wrong.
> >
> > I'm suspecting version of qemu may have something to do with it, but
> > honestly I don't know. Which qemu are you using?
>
> ivb-ep: QEMU emulator version 7.2.0 (Debian 1:7.2+dfsg-3)
> adl: QEMU emulator version 7.2.0 (Debian 1:7.2+dfsg-1+b2)

We had a report that it occurred on qemu 7.0:
https://bugzilla.redhat.com/show_bug.cgi?id=2213346#c5

I didn't try qemu 7.2, but I've been using 8.0 in my tests.

I'm now trying (on Intel again) with qemu @ 7efd65423ab22e6f5890,
the latest version as of now.

> I'll try and upgrade the adl to 8.

Thanks,

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


2023-06-14 16:30:34

by Peter Zijlstra

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Wed, Jun 14, 2023 at 05:19:20PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 14, 2023 at 04:07:17PM +0100, Richard W.M. Jones wrote:
> > On Wed, Jun 14, 2023 at 04:53:48PM +0200, Peter Zijlstra wrote:
>
> > > Clearly I'm doing something wrong.
> >
> > I'm suspecting version of qemu may have something to do with it, but
> > honestly I don't know. Which qemu are you using?
>
> ivb-ep: QEMU emulator version 7.2.0 (Debian 1:7.2+dfsg-3)
> adl: QEMU emulator version 7.2.0 (Debian 1:7.2+dfsg-1+b2)
>
> I'll try and upgrade the adl to 8.

adl: QEMU emulator version 8.0.2 (Debian 1:8.0.2+dfsg-1)

started it again..

2023-06-14 16:31:21

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs


Not very useful information, but FWIW I can reproduce it with qemu @
7efd65423a (latest upstream) configured like this:

../configure --target-list=" x86_64-softmmu arm-softmmu s390x-softmmu x86_64-linux-user" \
--disable-werror --enable-kvm --disable-tpm \
--enable-debug --enable-download

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v


2023-06-14 17:38:34

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs


FWIW attached is a test program that runs the qemu instances in
parallel (up to 8 threads), which seems to be a quicker way to hit the
problem for me. Even on Intel, with this test I can hit the bug in a
few hundred iteration.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


Attachments:
(No filename) (563.00 B)
bootbootboot.c (3.19 kB)
Download all attachments

2023-06-15 08:21:33

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Thu, Jun 15, 2023 at 09:40:40AM +0200, Alexandre Belloni wrote:
> Hello,
>
> On 14/06/2023 18:34:30+0100, Richard W.M. Jones wrote:
> >
> > FWIW attached is a test program that runs the qemu instances in
> > parallel (up to 8 threads), which seems to be a quicker way to hit the
> > problem for me. Even on Intel, with this test I can hit the bug in a
> > few hundred iteration.
> >
>
> I'm just chiming in to say that we do hit the same issue on the Yocto
> Project CI. We are using qemu 8.0.0 on Intel hardware and a 6.1 kernel.
>
> I see that f31dcb152a3d0816e2f1deab4e64572336da197d hasn't been
> backported so it may not be the culprit. However, this seems to have
> started happening when we switched from 5.15 to 6.1.

I don't know if it's related or not, or even valid, but it was pointed
out to me[1] that you can get the exact same failure this way:

- Linux git @ b6dad5178ceaf23f369c3711062ce1f2afc33644
- Revert f31dcb152a3d0816e2f1deab4e64572336da197d
- Add the following patch:

diff --git a/init/main.c b/init/main.c
index af50044deed5..c2774865a83f 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1552,6 +1552,7 @@ static noinline void __init kernel_init_freeable(void)

cad_pid = get_pid(task_pid(current));

+ msleep(1);
smp_prepare_cpus(setup_max_cpus);

workqueue_init();

So is sleeping in kernel_init_freeable valid? It seems as if it
wouldn't be an atomic context. And is the fact that the failure looks
precisely the same coincidence?

Rich.

[1] https://news.ycombinator.com/item?id=36336059

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


2023-06-15 08:22:10

by Alexandre Belloni

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

Hello,

On 14/06/2023 18:34:30+0100, Richard W.M. Jones wrote:
>
> FWIW attached is a test program that runs the qemu instances in
> parallel (up to 8 threads), which seems to be a quicker way to hit the
> problem for me. Even on Intel, with this test I can hit the bug in a
> few hundred iteration.
>

I'm just chiming in to say that we do hit the same issue on the Yocto
Project CI. We are using qemu 8.0.0 on Intel hardware and a 6.1 kernel.

I see that f31dcb152a3d0816e2f1deab4e64572336da197d hasn't been
backported so it may not be the culprit. However, this seems to have
started happening when we switched from 5.15 to 6.1.

Regards,

--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2023-06-15 11:23:14

by YiFei Zhu

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

> FWIW attached is a test program that runs the qemu instances in
> parallel (up to 8 threads), which seems to be a quicker way to hit the
> problem for me. Even on Intel, with this test I can hit the bug in a
> few hundred iteration.

A friend sent me here so I took a look.

I was unable to reproduce with this script after 10000 iterations,
on a AMD Gentoo Linux host:

Host kernel: 6.3.3 vanilla
Guest kernel: git commit f31dcb152a3d0816e2f1deab4e64572336da197d
Guest config: Provided full-fat Fedora config + CONFIG_GDB_SCRIPTS
QEMU: 8.0.2 (with kvm_amd)
Hardware: AMD Ryzen 7 PRO 5850U

I wonder if anything on the host side affects this, or could be some
sort of race condition.

> So is sleeping in kernel_init_freeable valid? It seems as if it
> wouldn't be an atomic context. And is the fact that the failure looks
> precisely the same coincidence?

I did a quick test on this and was able to reproduce. Attaching gdb
shows the CPU is idling in:
default_idle -> arch_safe_halt -> pv_native_safe_halt

By booting with an additional "idle=poll" and using gdb to invoke
"p show_state_filter(0)", I was able to dump the stack of kernel_init
at PID 1 (this is before it execve userspace init):

[ 0.322736] task:swapper/0 state:D stack:0 pid:1 ppid:0 flags:0x00004000
[ 0.322736] Call Trace:
[ 0.322736] <TASK>
[ 0.322736] __schedule+0x3eb/0x14c0
[ 0.322736] schedule+0x5e/0xd0
[ 0.322736] schedule_timeout+0x98/0x160
[ 0.322736] ? __pfx_process_timeout+0x10/0x10
[ 0.322736] msleep+0x2d/0x40
[ 0.322736] kernel_init_freeable+0x1f7/0x490
[ 0.322736] ? __pfx_kernel_init+0x10/0x10
[ 0.322736] kernel_init+0x1a/0x1c0
[ 0.322736] ret_from_fork+0x2c/0x50
[ 0.322736] </TASK>

It does not appear that msleep work at this stage. Also interestingly,
the printk timer shown does not reflect the time spent idling (a few
seconds), and shows the exact timestamp as:

[ 0.322736] Freeing SMP alternatives memory: 48K

I'm not exactly sure where msleep would start working. If it's just timers
then "init_timers" has already run my PID 0 at this point I believe.

I also tested a busy delay via:

diff --git a/init/main.c b/init/main.c
index bb87b789c543..f6d722917388 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1616,6 +1616,7 @@ static noinline void __init kernel_init_freeable(void)

cad_pid = get_pid(task_pid(current));

+ for (volatile long i = 0; i < 0x100000000ULL; i++);
smp_prepare_cpus(setup_max_cpus);

workqueue_init();

The kernel does boot and panic (due to lack of rootfs), taking around two
seconds longer. The dmesg printed still does not reflect that there was a
busy wait.

If you can reproduce the original bug (without the msleep or busy wait
patch), could you check if you can reproduce that with idle=poll? If so,
can you run "p show_state_filter(0)" so we get a stack trace of kernel_init,
assuming it hit a similar issue as if msleep was added. If idle=poll does
not work, or you can't call functions from within gdb (some old qemu versions
did not support this), see if you can send a alt-sysrq-w to show stacks of
blocked tasks.

YiFei Zhu

2023-06-15 11:51:44

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Thu, Jun 15, 2023 at 11:04:29AM +0000, YiFei Zhu wrote:
> see if you can send a alt-sysrq-w to show stacks of
> blocked tasks.

I guess this is a virtual console thing? I have to boot this hundreds
of times with a serial console to even get to the bug, so it's tricky.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


2023-06-15 12:11:06

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Thu, Jun 15, 2023 at 11:04:29AM +0000, YiFei Zhu wrote:
> > FWIW attached is a test program that runs the qemu instances in
> > parallel (up to 8 threads), which seems to be a quicker way to hit the
> > problem for me. Even on Intel, with this test I can hit the bug in a
> > few hundred iteration.
>
> A friend sent me here so I took a look.
>
> I was unable to reproduce with this script after 10000 iterations,
> on a AMD Gentoo Linux host:
>
> Host kernel: 6.3.3 vanilla
> Guest kernel: git commit f31dcb152a3d0816e2f1deab4e64572336da197d
> Guest config: Provided full-fat Fedora config + CONFIG_GDB_SCRIPTS
> QEMU: 8.0.2 (with kvm_amd)
> Hardware: AMD Ryzen 7 PRO 5850U
>
> I wonder if anything on the host side affects this, or could be some
> sort of race condition.

We've had multiple independent reports of reproducing the bug, since
this story (unfortunately) hit Hacker News. Your configuration above
should work, so I still don't know what the factor is.

[...]

> If you can reproduce the original bug (without the msleep or busy wait
> patch), could you check if you can reproduce that with idle=poll? If so,
> can you run "p show_state_filter(0)" so we get a stack trace of kernel_init,
> assuming it hit a similar issue as if msleep was added. If idle=poll does
> not work, or you can't call functions from within gdb (some old qemu versions
> did not support this), see if you can send a alt-sysrq-w to show stacks of
> blocked tasks.

(1) Adding idle=poll to the guest kernel

=> Bug still occurs, with about the same frequency as before.

(2) Connect with gdb to qemu's gdb-stub:

Trying to evaluate show_state_filter(0) didn't work for reasons I
don't understand:

(gdb) target remote localhost:1234
Remote debugging using localhost:1234
warning: Remote gdbserver does not support determining executable automatically.
RHEL <=6.8 and <=7.2 versions of gdbserver do not support such automatic execut.
The following versions of gdbserver support it:
- Upstream version of gdbserver (unsupported) 7.10 or later
- Red Hat Developer Toolset (DTS) version of gdbserver from DTS 4.0 or later (o)
- RHEL-7.3 versions of gdbserver (on any architecture)
arch_static_branch (branch=false, key=<optimized out>)
at ./arch/x86/include/asm/jump_label.h:27
27 asm_volatile_goto("1:"
(gdb) bt
#0 arch_static_branch (branch=false, key=<optimized out>)
at ./arch/x86/include/asm/jump_label.h:27
#1 static_key_false (key=<optimized out>) at ./include/linux/jump_label.h:207
#2 native_write_msr (high=222, low=719927812, msr=1760)
at ./arch/x86/include/asm/msr.h:147
#3 wrmsrl (val=954202667524, msr=1760) at ./arch/x86/include/asm/msr.h:262
#4 lapic_next_deadline (delta=474, evt=0xffff88804e81bf40)
at arch/x86/kernel/apic/apic.c:491
#5 0xffffffff81143667 in clockevents_program_event (dev=0xffff88804e81bf40,
expires=<optimized out>, force=<optimized out>)
at kernel/time/clockevents.c:334
#6 0xffffffff81143c0b in tick_handle_periodic (dev=0xffff88804e81bf40)
at kernel/time/tick-common.c:133
#7 0xffffffff8105d01c in local_apic_timer_interrupt ()
at arch/x86/kernel/apic/apic.c:1095
#8 __sysvec_apic_timer_interrupt (regs=regs@entry=0xffffc90000003ee8)
at arch/x86/kernel/apic/apic.c:1112
#9 0xffffffff81e61a91 in sysvec_apic_timer_interrupt (regs=0xffffc90000003ee8)
at arch/x86/kernel/apic/apic.c:1106
#10 0xffffffff8200144a in asm_sysvec_apic_timer_interrupt ()
at ./arch/x86/include/asm/idtentry.h:645
#11 0x0000000000000000 in ?? ()
(gdb) p show_state_filter(0)
[Inferior 1 (process 1) exited normally]
The program being debugged exited while in a function called from GDB.
Evaluation of the expression containing the function
(show_state_filter) will be abandoned.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


2023-06-15 12:52:50

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs


Never mind, I've tested the fix here, problem solved!

https://lore.kernel.org/all/87jzypq6gq.ffs@tglx/T/#m860cc015534899c7fcd12dc184c96441027f602b

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org


2023-06-15 13:04:12

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On Thu, Jun 15, 2023 at 01:21:46PM +0100, Richard W.M. Jones wrote:
>
> Never mind, I've tested the fix here, problem solved!
>

I mean:
https://lore.kernel.org/all/[email protected]/T/#m3ba5981ed4d5b534fa53589e226d832639584826

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


2023-06-15 13:11:27

by Dr. David Alan Gilbert

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

* Richard W.M. Jones ([email protected]) wrote:
> On Thu, Jun 15, 2023 at 11:04:29AM +0000, YiFei Zhu wrote:
> > see if you can send a alt-sysrq-w to show stacks of
> > blocked tasks.
>
> I guess this is a virtual console thing? I have to boot this hundreds
> of times with a serial console to even get to the bug, so it's tricky.

If you open a monitor device you should be able to do things to the qemu
after you detect the hang; and then I think a :

sendkey alt-sysrq-w

on HMP should do it.

However, some other thoughts:
a) You could try taking a dump and then importing it into crash rather
than using gdb; both ways are touchy, but that does mean you could
give the dump to someone else who can't trigger the bug.

b) You could try giving your kernel to someone who can't trigger it;
if they can trigger it on their host with your kernel that would
suggest something is different in your build environment of your
kernel (e.g. compiler version or flags) rather than the host kernel.

Dave

> Rich.
>
> --
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-builder quickly builds VMs from scratch
> http://libguestfs.org/virt-builder.1.html
>
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux | Happy \
\ dave @ treblig.org | | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/

2023-06-18 11:45:35

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: printk.time causes rare kernel boot hangs

On 13.06.23 16:07, Linux regression tracking #adding (Thorsten Leemhuis)
wrote:
>
> On 13.06.23 15:41, Richard W.M. Jones wrote:
>> [Being tracked in this bug which contains much more detail:
>> https://gitlab.com/qemu-project/qemu/-/issues/1696 ]
>>
>> Recent kernels hang rarely when booted on qemu. Usually you need to
>> boot 100s or 1,000s of times to see the hang, compared to 292,612 [sic]
>> successful boots which I was able to do before the problematic commit.
>>
>> A reproducer (you'll probably need to use Fedora) is:
>>
>> $ while guestfish -a /dev/null -v run >& /tmp/log; do echo -n . ; done
>>
>> You will need to leave it running for probably several hours, and
>> examine the /tmp/log file at the end.
>>
>> I tracked this down to the following commit:
>>
>> commit f31dcb152a3d0816e2f1deab4e64572336da197d
>> Author: Aaron Thompson <[email protected]>
>> Date: Thu Apr 13 17:50:12 2023 +0000
>>
>> sched/clock: Fix local_clock() before sched_clock_init()
>>
>> Have local_clock() return sched_clock() if sched_clock_init() has not
>> yet run. sched_clock_cpu() has this check but it was not included in the
>> new noinstr implementation of local_clock().
>>
>> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f31dcb152a3d0816e2f1deab4e64572336da197d)
>>
>> Reverting this commit fixes the problem.
>>
>> I don't know _why_ this commit is wrong, but can we revert it as it
>> causes serious problems with libguestfs hanging randomly.
>>
>> Or if there's anything you want me to try out then let me know,
>> because I can reproduce the problem locally quite easily.
>
> Thanks for the report. To be sure the issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced f31dcb152a3d0816e2f1deab4e64572336da197d
> #regzbot title sched/clock: printk.time causes rare kernel boot hangs
> #regzbot ignore-activity

#regzbot fix: tick/common: Align tick period during sched_timer setup
#regzbot monitor:
https://lore.kernel.org/all/[email protected]/
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.