Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751716AbdC0JHr (ORCPT ); Mon, 27 Mar 2017 05:07:47 -0400 Received: from mout.gmx.net ([212.227.17.21]:50270 "EHLO mout.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751608AbdC0JHl (ORCPT ); Mon, 27 Mar 2017 05:07:41 -0400 Message-ID: <1490605644.14634.50.camel@gmx.de> Subject: Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues") From: Mike Galbraith To: Christoph Hellwig , Thorsten Leemhuis Cc: mst@redhat.com, virtio-dev@lists.oasis-open.org, Linux Kernel Mailing List , rjones@redhat.com Date: Mon, 27 Mar 2017 11:07:24 +0200 In-Reply-To: <20170323145622.GA31690@lst.de> References: <20170323145622.GA31690@lst.de> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.16.5 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K0:pnoQuXdlBpHHZtVYbOdWTs8mT6FcU8uHs6rxDUrsB69uWNJHzmQ n074FwGSWly/0SOzbKW6EwKrP8jKL5+4EKMMMfs4CQcLeu2/eOto6KVfXwuT1w6Mvii8FkN eI0CSpdE3duqvayUcPcmjPcckdvwxdj7nSWs8/25cfa5M0t+00PNfHTZ93nmAXli59NCSQM T5mESjal4NlU+aZxSc80g== X-UI-Out-Filterresults: notjunk:1;V01:K0:kP5Y4IqIrBM=:3BjlVWNfD6e9eiWelWHtfQ LLgoaut3SqxfRgoJKjxESGrXtDI7wPFIoFr8Szqmod2X03CF36rhvNY+u2EY5rY7wnnFY9C3g usGeCrufH1ZPExXX7TW5EaODpd9oeIRQS4rdvjcSDbxoV6flK3orv6KAmDcEj3pKr5K5SkG0d LzJNhgA3HULDv6dCVliXXmxWCpOtp8IpRFOzKpVE48WJtM2bLu7pBPlvqj4UarjC+IUbtN7Rq VoethX1aJhE0wDCNWJTdCrotBg+kEg47h6MPZqr6SxRbDXKyeM/EmW8MJL976DlgykHraW1DV kJ10dih8BIr3E62Hm3YtCYlFs4p+BqufBpg5POy3vgBAXCxQ0MQ85m2FLPGeemtIzHM6zSSLU QxLfyQPzXT87X0Tq9uoigAEe0p6we8Rbcm7mXoxGVhjHrmYinLbdcFmwhrhVPma6H8uCdqbPM psdEsLWc1zY+ek4cb9bdm/eBH0MYBfKxzf7YrlsyG0g8QntizLn6I02DwHUbwkH39+U+P9Z8h WMQDkhajoAMX5CyKP/1Tl7TSAG+418StprGuPQEtJaPMeG0WNBD9320F7VAXumVVmu/Ok55Ho lV6cSBOYlU+tb+d28P4DKlv3wjMTfZEMLdTUC5KhVefDmwhLcDSB0g3/whZZDpjhWZTKl/g3J PvU+SExzXxPV7aAbzpRbFjtMFjD2qkd6JLBuTScY0n8e4brStqkdXvn9/AuTm9n6o2ti8vHBT KIJ+i0NA4zYcHmB6yjsV6QRoDnjJplc/zhU0GBtrERSxZBvdO+jsz3LO/1/whbwfMWFmPuhVN 6fE40dX Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 12366 Lines: 173 On Thu, 2017-03-23 at 15:56 +0100, Christoph Hellwig wrote: > Does the patch from Jason in the > > "[REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest" > > thread fix the issue for you? That seems to eliminate explosions, but not the below. 07ec51480b5e causes me some kworker grief in -rt too (100% CPU), but that's as yet not been stared at (too darn [busy/lazy], pick one;). virgin 4.11-rc4+referenced patch, config=enterprise-ish. ... [ 158.400210] PM: Hibernation mode set to 'shutdown' [ 158.607439] PM: Syncing filesystems ... [ 158.986595] PM: done. [ 158.986771] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 158.988758] PM: Marking nosave pages: [mem 0x00000000-0x00000fff] [ 158.989156] PM: Marking nosave pages: [mem 0x0009f000-0x000fffff] [ 158.989550] PM: Marking nosave pages: [mem 0xbffde000-0xffffffff] [ 158.990200] PM: Basic memory bitmaps created [ 158.990468] PM: Preallocating image memory... done (allocated 395798 pages) [ 159.114650] PM: Allocated 1583192 kbytes in 0.12 seconds (13193.26 MB/s) [ 159.115203] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 159.119378] ------------[ cut here ]------------ [ 159.122606] WARNING: CPU: 3 PID: 509 at drivers/pci/msi.c:1251 pci_irq_vector+0xcf/0xe0 [ 159.123194] Modules linked in: fuse(E) ebtable_filter(E) ebtables(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) nf_log_ipv6(E) xt_pkttype(E) nf_log_ipv4(E) nf_log_common(E) xt_LOG(E) xt_limit(E) af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) ip6t_REJECT(E) xt_tcpudp(E) nf_conntrack_ipv6(E) nf_defrag_ipv6(E) ip6table_raw(E) ipt_REJECT(E) iptable_raw(E) xt_CT(E) iptable_filter(E) ip6table_mangle(E) nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) ip_tables(E) xt_conntrack(E) nf_conntrack(E) libcrc32c(E) ip6table_filter(E) ip6_tables(E) x_tables(E) snd_hda_codec_generic(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) joydev(E) snd_pcm(E) snd_timer(E) snd(E) crct10dif_pclmul(E) soundcore(E) crc32_pclmul(E) 8139too(E) ghash_clmulni_intel(E) [ 159.128123] pcbc(E) aesni_intel(E) ppdev(E) i2c_piix4(E) aes_x86_64(E) virtio_balloon(E) crypto_simd(E) parport_pc(E) glue_helper(E) parport(E) button(E) pcspkr(E) cryptd(E) serio_raw(E) acpi_cpufreq(E) nfsd(E) dm_mod(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ext4(E) crc16(E) jbd2(E) mbcache(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) ata_generic(E) virtio_rng(E) virtio_blk(E) virtio_console(E) ata_piix(E) floppy(E) ehci_pci(E) qxl(E) drm_kms_helper(E) syscopyarea(E) uhci_hcd(E) ahci(E) ehci_hcd(E) sysfillrect(E) crc32c_intel(E) sysimgblt(E) libahci(E) fb_sys_fops(E) ttm(E) virtio_pci(E) virtio_ring(E) usbcore(E) virtio(E) 8139cp(E) drm(E) libata(E) mii(E) sg(E) scsi_mod(E) autofs4(E) [ 159.132177] CPU: 3 PID: 509 Comm: kworker/u16:6 Tainted: G E 4.11.0-default #28 [ 159.132677] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20161202_174313-build11a 04/01/2014 [ 159.133428] Workqueue: events_unbound async_run_entry_fn [ 159.133768] Call Trace: [ 159.133933] dump_stack+0x63/0x90 [ 159.134161] __warn+0xd1/0xf0 [ 159.134360] ? pci_pm_poweroff+0x100/0x100 [ 159.134627] warn_slowpath_null+0x1d/0x20 [ 159.134889] pci_irq_vector+0xcf/0xe0 [ 159.135134] vp_synchronize_vectors+0x46/0x60 [virtio_pci] [ 159.135486] vp_reset+0x37/0x40 [virtio_pci] [ 159.135780] virtcons_freeze+0x23/0xa0 [virtio_console] [ 159.136116] virtio_device_freeze+0x6b/0x80 [virtio] [ 159.136431] virtio_pci_freeze+0x1d/0x40 [virtio_pci] [ 159.136756] pci_pm_freeze+0x5f/0xe0 [ 159.136999] dpm_run_callback+0x59/0x180 [ 159.137252] __device_suspend+0x127/0x3c0 [ 159.137513] ? pm_dev_dbg+0x80/0x80 [ 159.137740] async_suspend+0x1f/0xa0 [ 159.137973] async_run_entry_fn+0x39/0x170 [ 159.138250] process_one_work+0x16c/0x450 [ 159.138514] worker_thread+0x137/0x4e0 [ 159.138761] kthread+0x10c/0x140 [ 159.138970] ? rescuer_thread+0x3c0/0x3c0 [ 159.139235] ? kthread_park+0x90/0x90 [ 159.139476] ret_from_fork+0x2c/0x40 [ 159.139721] ---[ end trace d66daafbe82e66e7 ]--- [ 159.728658] PM: freeze of devices complete after 611.743 msecs [ 159.729321] PM: late freeze of devices complete after 0.243 msecs [ 159.730921] PM: noirq freeze of devices complete after 1.145 msecs [ 159.731507] Disabling non-boot CPUs ... [ 159.732004] Unregister pv shared memory for cpu 1 [ 159.739017] smpboot: CPU 1 is now offline [ 159.765702] Unregister pv shared memory for cpu 2 [ 159.770757] smpboot: CPU 2 is now offline [ 159.797684] Unregister pv shared memory for cpu 3 [ 159.799545] smpboot: CPU 3 is now offline [ 159.821934] Unregister pv shared memory for cpu 4 [ 159.823759] smpboot: CPU 4 is now offline [ 159.848588] Unregister pv shared memory for cpu 5 [ 159.850375] smpboot: CPU 5 is now offline [ 159.872907] Unregister pv shared memory for cpu 6 [ 159.874598] smpboot: CPU 6 is now offline [ 159.896898] Unregister pv shared memory for cpu 7 [ 159.898517] smpboot: CPU 7 is now offline [ 159.916620] PM: Creating hibernation image: [ 159.998033] PM: Need to copy 394338 pages [ 159.998370] PM: Normal pages needed: 394338 + 1024, available pages: 1702516 [1266874520.353525] kvm-clock: cpu 0, msr 2:3ff54001, primary cpu clock, resume [1266874520.354849] Suspended for 67.457 seconds [1266874520.354877] Enabling non-boot CPUs ... [1266874520.366683] x86: Booting SMP configuration: [1266874520.367028] smpboot: Booting Node 0 Processor 1 APIC 0x1 [1266874520.367514] kvm-clock: cpu 1, msr 2:3ff54041, secondary cpu clock [1266874520.370107] KVM setup async PF for cpu 1 [1266874520.370425] kvm-stealtime: cpu 1, msr 23fc4d900 [1266874520.370858] cache: parent cpu1 should not be sleeping [1266874520.371408] CPU1 is up [1266874520.386657] smpboot: Booting Node 0 Processor 2 APIC 0x2 [1266874520.387189] kvm-clock: cpu 2, msr 2:3ff54081, secondary cpu clock [1266874520.389787] KVM setup async PF for cpu 2 [1266874520.390142] kvm-stealtime: cpu 2, msr 23fc8d900 [1266874520.390604] cache: parent cpu2 should not be sleeping [1266874520.391178] CPU2 is up [1266874520.410687] smpboot: Booting Node 0 Processor 3 APIC 0x3 [1266874520.411288] kvm-clock: cpu 3, msr 2:3ff540c1, secondary cpu clock [1266874520.413851] KVM setup async PF for cpu 3 [1266874520.414162] kvm-stealtime: cpu 3, msr 23fccd900 [1266874520.414566] cache: parent cpu3 should not be sleeping [1266874520.415071] CPU3 is up [1266874520.430924] smpboot: Booting Node 0 Processor 4 APIC 0x4 [1266874520.431459] kvm-clock: cpu 4, msr 2:3ff54101, secondary cpu clock [1266874520.434048] KVM setup async PF for cpu 4 [1266874520.434381] kvm-stealtime: cpu 4, msr 23fd0d900 [1266874520.434822] cache: parent cpu4 should not be sleeping [1266874520.435456] CPU4 is up [1266874520.454957] smpboot: Booting Node 0 Processor 5 APIC 0x5 [1266874520.455822] kvm-clock: cpu 5, msr 2:3ff54141, secondary cpu clock [1266874520.458507] KVM setup async PF for cpu 5 [1266874520.458834] kvm-stealtime: cpu 5, msr 23fd4d900 [1266874520.459439] cache: parent cpu5 should not be sleeping [1266874520.460320] CPU5 is up [1266874520.478907] smpboot: Booting Node 0 Processor 6 APIC 0x6 [1266874520.479586] kvm-clock: cpu 6, msr 2:3ff54181, secondary cpu clock [1266874520.482176] KVM setup async PF for cpu 6 [1266874520.482510] kvm-stealtime: cpu 6, msr 23fd8d900 [1266874520.482936] cache: parent cpu6 should not be sleeping [1266874520.483466] CPU6 is up [1266874520.502988] smpboot: Booting Node 0 Processor 7 APIC 0x7 [1266874520.503775] kvm-clock: cpu 7, msr 2:3ff541c1, secondary cpu clock [1266874520.506519] KVM setup async PF for cpu 7 [1266874520.506884] kvm-stealtime: cpu 7, msr 23fdcd900 [1266874520.507438] cache: parent cpu7 should not be sleeping [1266874520.508141] CPU7 is up [1266874520.517708] PM: noirq restore of devices complete after 3.294 msecs [1266874520.518547] PM: early restore of devices complete after 0.129 msecs [1266874520.553012] rtc_cmos 00:00: System wakeup disabled by ACPI [1266874520.562260] usb usb2: root hub lost power or was reset [1266874520.562269] usb usb1: root hub lost power or was reset [1266874520.563296] usb usb4: root hub lost power or was reset [1266874520.564993] ------------[ cut here ]------------ [1266874520.564998] WARNING: CPU: 0 PID: 507 at drivers/pci/msi.c:1261 pci_irq_vector+0xb5/0xe0 [1266874520.564998] Modules linked in: fuse(E) ebtable_filter(E) ebtables(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) nf_log_ipv6(E) xt_pkttype(E) nf_log_ipv4(E) nf_log_common(E) xt_LOG(E) xt_limit(E) af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) ip6t_REJECT(E) xt_tcpudp(E) nf_conntrack_ipv6(E) nf_defrag_ipv6(E) ip6table_raw(E) ipt_REJECT(E) iptable_raw(E) xt_CT(E) iptable_filter(E) ip6table_mangle(E) nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) ip_tables(E) xt_conntrack(E) nf_conntrack(E) libcrc32c(E) ip6table_filter(E) ip6_tables(E) x_tables(E) snd_hda_codec_generic(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) joydev(E) snd_pcm(E) snd_timer(E) snd(E) crct10dif_pclmul(E) soundcore(E) crc32_pclmul(E) 8139too(E) ghash_clmulni_intel(E) [1266874520.565022] pcbc(E) aesni_intel(E) ppdev(E) i2c_piix4(E) aes_x86_64(E) virtio_balloon(E) crypto_simd(E) parport_pc(E) glue_helper(E) parport(E) button(E) pcspkr(E) cryptd(E) serio_raw(E) acpi_cpufreq(E) nfsd(E) dm_mod(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ext4(E) crc16(E) jbd2(E) mbcache(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) ata_generic(E) virtio_rng(E) virtio_blk(E) virtio_console(E) ata_piix(E) floppy(E) ehci_pci(E) qxl(E) drm_kms_helper(E) syscopyarea(E) uhci_hcd(E) ahci(E) ehci_hcd(E) sysfillrect(E) crc32c_intel(E) sysimgblt(E) libahci(E) fb_sys_fops(E) ttm(E) virtio_pci(E) virtio_ring(E) usbcore(E) virtio(E) 8139cp(E) drm(E) libata(E) mii(E) sg(E) scsi_mod(E) autofs4(E) [1266874520.565048] CPU: 0 PID: 507 Comm: kworker/u16:4 Tainted: G W E 4.11.0-default #28 [1266874520.565049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20161202_174313-build11a 04/01/2014 [1266874520.565052] Workqueue: events_unbound async_run_entry_fn [1266874520.565052] Call Trace: [1266874520.565058] dump_stack+0x63/0x90 [1266874520.565060] __warn+0xd1/0xf0 [1266874520.565064] ? pci_pm_suspend_noirq+0x190/0x190 [1266874520.565065] warn_slowpath_null+0x1d/0x20 [1266874520.565067] pci_irq_vector+0xb5/0xe0 [1266874520.565069] vp_synchronize_vectors+0x46/0x60 [virtio_pci] [1266874520.565071] vp_reset+0x37/0x40 [virtio_pci] [1266874520.565073] virtio_device_restore+0x22/0x110 [virtio] [1266874520.565074] virtio_pci_restore+0x36/0x40 [virtio_pci] [1266874520.565076] pci_pm_restore+0x79/0xb0 [1266874520.565078] dpm_run_callback+0x59/0x180 [1266874520.565079] device_resume+0xe7/0x210 [1266874520.565080] ? pm_dev_dbg+0x80/0x80 [1266874520.565082] async_resume+0x1d/0x50 [1266874520.565083] async_run_entry_fn+0x39/0x170 [1266874520.565084] process_one_work+0x16c/0x450 [1266874520.565085] worker_thread+0x137/0x4e0 [1266874520.565088] kthread+0x10c/0x140 [1266874520.565089] ? rescuer_thread+0x3c0/0x3c0 [1266874520.565090] ? kthread_park+0x90/0x90 [1266874520.565092] ret_from_fork+0x2c/0x40 [1266874520.565094] ---[ end trace d66daafbe82e66e8 ]--- [1266874520.600698] usb usb3: root hub lost power or was reset [1266874520.882457] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [1266874520.883385] ata5: SATA link down (SStatus 0 SControl 300) [1266874520.884202] ata4: SATA link down (SStatus 0 SControl 300) [1266874520.884923] ata6: SATA link down (SStatus 0 SControl 300) [1266874520.885592] ata3: SATA link down (SStatus 0 SControl 300) [1266874520.886347] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [1266874520.887101] ata1.00: configured for UDMA/100 [1266874520.887835] ata2.00: configured for UDMA/100 [1266874520.946407] usb 3-1: reset high-speed USB device number 2 using ehci-pci [1266874521.096211] PM: restore of devices complete after 543.539 msecs [1266874521.097984] PM: Image restored successfully. [1266874521.099356] PM: Basic memory bitmaps freed [1266874521.100669] Restarting tasks ... done.