2023-08-29 18:08:13

by Marc Haber

[permalink] [raw]
Subject: Re: Linux 6.5 speed regression, boot VERY slow with anything systemd related

On Tue, Aug 29, 2023 at 08:42:04AM -0700, Sean Christopherson wrote:
> On Tue, Aug 29, 2023, Marc Haber wrote:
> > On Tue, Aug 29, 2023 at 07:53:45AM -0700, Sean Christopherson wrote:
> > > What is different between the bad host(s) and the good host(s)? E.g. kernel, QEMU,
> >
> > The bad host is an APU ("AMD GX-412TC SOC") with 4 GB of RAM, one of the
> > good hosts is a "Xeon(R) CPU E3-1246 v3" with 32 GB of RAM.
>
> I don't expect it to help, but can you try booting the bad host with
> "spec_rstack_overflow=off"?

That is destined to go on the kernel command line of the host, not the
VM, right? I am asking because that host runs a set of VMs that are not
that easy to reboot without impact on other services, I'd rather not do
experiments with that.

The issue can be triggered and worked around by changing the VM only, I
didn't touch the host other than some virsh incantations.

>
> > system configuration is from the same ansible playbook, but of
> > course there are differences.
>
> Can you capture the QEMU command lines for the good and bad hosts?
> KVM doesn't get directly involved in serial port emulation; if the
> blamed commit in 6.5 is triggering unexpected behavior then QEMU is a
> better starting point than KVM.

the qmu command line of the test VM on the APU host is

[14/4946]mh@prom:~ $ pstree -apl | grep [l]asso
|-qemu-system-x86,251092 -name guest=lasso,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-20-lasso/master-key.aes -machine pc-i440fx-2.1,accel=kvm,usb=off,dump-guest-core=off -cpu Opteron_G3,monitor=off,x2apic=on,hypervisor=on -m 768 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 15338d79-b877-48da-b72f-f706bd05dadf -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=30,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/dev/prom/lasso,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on -drive file=/dev/prom/lasso-swap0,format=raw,if=none,id=drive-virtio-disk1,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=drive-virtio-disk1,id=virtio-disk1,write-cache=on -drive if=none,id=drive-ide0-0-0,readonly=on -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=33,id=hostnet0,vhost=on,vhostfd=34 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:9e:9a:15,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5902,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2,bus=usb.0,port=4 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3,bus=usb.0,port=5 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0xa -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
[15/4947]mh@prom:~ $

a refrence qemu command line of a test VM on the Xeon host is

[1/4992]mh@gancho:~ $ pstree -apl | grep '[w]hip'
|-qemu-system-x86,1478 -enable-kvm -name guest=whip,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-9-whip/master-key.aes -machine pc-i440fx-2.1,accel=kvm,usb=off,dump-guest-core=off -cpu Nehalem -m 512 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 3dd1f71d-3b84-44e2-808f-a5c67694f25c -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=34,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x7 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/dev/gancho/whip,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,aio=native -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on -netdev tap,fd=36,id=hostnet0,vhost=on,vhostfd=37 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5e:b4:42,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=38,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5908,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x9 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
[2/4993]mh@gancho:~ $

Both come from virt-manager, so if the XML helps more, I'll happy to
post that as well.

Greetings
Marc

--
-----------------------------------------------------------------------------
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany | lose things." Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature | How to make an American Quilt | Fax: *49 6224 1600421


2023-08-30 18:31:41

by Sean Christopherson

[permalink] [raw]
Subject: Re: Linux 6.5 speed regression, boot VERY slow with anything systemd related

On Tue, Aug 29, 2023, Marc Haber wrote:
> On Tue, Aug 29, 2023 at 08:42:04AM -0700, Sean Christopherson wrote:
> > On Tue, Aug 29, 2023, Marc Haber wrote:
> > > On Tue, Aug 29, 2023 at 07:53:45AM -0700, Sean Christopherson wrote:
> > > > What is different between the bad host(s) and the good host(s)? E.g. kernel, QEMU,
> > >
> > > The bad host is an APU ("AMD GX-412TC SOC") with 4 GB of RAM, one of the
> > > good hosts is a "Xeon(R) CPU E3-1246 v3" with 32 GB of RAM.
> >
> > I don't expect it to help, but can you try booting the bad host with
> > "spec_rstack_overflow=off"?
>
> That is destined to go on the kernel command line of the host, not the
> VM, right? I am asking because that host runs a set of VMs that are not
> that easy to reboot without impact on other services, I'd rather not do
> experiments with that.

Ah, yeah, in that case don't bother.

> The issue can be triggered and worked around by changing the VM only, I
> didn't touch the host other than some virsh incantations.
>
> >
> > > system configuration is from the same ansible playbook, but of
> > > course there are differences.
> >
> > Can you capture the QEMU command lines for the good and bad hosts?
> > KVM doesn't get directly involved in serial port emulation; if the
> > blamed commit in 6.5 is triggering unexpected behavior then QEMU is a
> > better starting point than KVM.
>
> the qmu command line of the test VM on the APU host is
>
> [14/4946]mh@prom:~ $ pstree -apl | grep [l]asso
> |-qemu-system-x86,251092 -name guest=lasso,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-20-lasso/master-key.aes -machine pc-i440fx-2.1,accel=kvm,usb=off,dump-guest-core=off -cpu Opteron_G3,monitor=off,x2apic=on,hypervisor=on -m 768 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 15338d79-b877-48da-b72f-f706bd05dadf -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=30,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/dev/prom/lasso,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on -drive file=/dev/prom/lasso-swap0,format=raw,if=none,id=drive-virtio-disk1,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=drive-virtio-disk1,id=virtio-disk1,write-cache=on -drive if=none,id=drive-ide0-0-0,readonly=on -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=33,id=hostnet0,vhost=on,vhostfd=34 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:9e:9a:15,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5902,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -chardev spicevmc,id=charredir2,name=usbredir -device usb-redir,chardev=charredir2,id=redir2,bus=usb.0,port=4 -chardev spicevmc,id=charredir3,name=usbredir -device usb-redir,chardev=charredir3,id=redir3,bus=usb.0,port=5 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0xa -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
> [15/4947]mh@prom:~ $
>
> a refrence qemu command line of a test VM on the Xeon host is
>
> [1/4992]mh@gancho:~ $ pstree -apl | grep '[w]hip'
> |-qemu-system-x86,1478 -enable-kvm -name guest=whip,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-9-whip/master-key.aes -machine pc-i440fx-2.1,accel=kvm,usb=off,dump-guest-core=off -cpu Nehalem -m 512 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 3dd1f71d-3b84-44e2-808f-a5c67694f25c -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=34,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x7 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/dev/gancho/whip,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,aio=native -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,write-cache=on -netdev tap,fd=36,id=hostnet0,vhost=on,vhostfd=37 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5e:b4:42,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=38,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5908,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x9 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
> [2/4993]mh@gancho:~ $
>
> Both come from virt-manager, so if the XML helps more, I'll happy to
> post that as well.

Those command lines are quite different, e.g. the Intel one has two serial ports
versus one for the AMD VM. Unless Tony jumps in with an idea, I would try massaging
either the good or bad VM's QEMU invocation, e.g. see if you can get the AMD VM
to "pass" by pulling in stuff from the Intel VM, or get the Intel VM to fail by
making its command line look more like the AMD VM.

2023-08-30 18:35:44

by Marc Haber

[permalink] [raw]
Subject: Re: Linux 6.5 speed regression, boot VERY slow with anything systemd related

Hi,

On Tue, Aug 29, 2023 at 10:14:23AM -0700, Sean Christopherson wrote:
> On Tue, Aug 29, 2023, Marc Haber wrote:
> > Both come from virt-manager, so if the XML helps more, I'll happy to
> > post that as well.
>
> Those command lines are quite different, e.g. the Intel one has two
> serial ports versus one for the AMD VM.

Indeed? I virt-manager, I don't see a second serial port. In either case,
only the one showing up in the VM as ttyS0 is being used. But thanks for
making me look, I discovered that the machine on the Intel host still
used emulated SCSI instead of VirtIO für the main disk. I changed that.

>Unless Tony jumps in with an
> idea, I would try massaging either the good or bad VM's QEMU
> invocation, e.g. see if you can get the AMD VM to "pass" by pulling in
> stuff from the Intel VM, or get the Intel VM to fail by making its
> command line look more like the AMD VM.

In Virt-Manager, both machines don't look THAT different tbh. I verified
the XML and the differences are not big at all.

Do you want me to try different vCPU types? Currently the VM is set to
"Opteron_G3", would you recommend a different vCPU for the host having a
"AMD GX-412TC SOC" host CPU?

Greetings
Marc

--
-----------------------------------------------------------------------------
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany | lose things." Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature | How to make an American Quilt | Fax: *49 6224 1600421