Message-ID: <9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.com>
Date: Wed, 29 Nov 2023 16:22:41 +0100
From: Fiona Ebner
Subject: SCSI hotplug issues with UEFI VM with guest kernel >= 6.5
To: linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org, Igor Mammedov
Cc: linux-kernel@vger.kernel.org, bhelgaas@google.com, lenb@kernel.org, rafael@kernel.org, Thomas Lamprecht

Hi,

it seems that hot-plugging SCSI disks for QEMU virtual machines booting with UEFI and with guest kernels >= 6.5 might be broken. It's not consistently broken, hinting there might be a race somewhere.

Reverting the following two commits seems to make it work reliably again:

cc22522fd55e2 ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus")
40613da52b13f ("PCI: acpiphp: Reassign resources on bridge if necessary")

Of course, they might only expose some pre-existing issue, but this is my best lead. See below for some logs and details about an affected virtual machine. Happy to provide more information and to debug/test further.
Best Regards,
Fiona

Host kernel: 6.5.11-4-pve which is based on the one from Ubuntu
Guest kernel: 6.7.0-rc3 and 6.7.0-rc3 with above commits reverted
QEMU version: v8.1.0 built from source
EDK2 version: submodule in the QEMU v8.1 repository: edk2-stable202302

QEMU command line:

> ./qemu-system-x86_64 \
>   -accel 'kvm' \
>   -chardev 'socket,id=qmp,path=/var/run/qemu-server/104.qmp,server=on,wait=off' \
>   -mon 'chardev=qmp,mode=control' \
>   -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
>   -mon 'chardev=qmp-event,mode=control' \
>   -pidfile /var/run/qemu-server/104.pid \
>   -drive 'if=pflash,unit=0,format=raw,readonly=on,file=./pc-bios/edk2-x86_64-code.fd' \
>   -drive 'if=pflash,unit=1,id=drive-efidisk0,format=raw,file=/dev/u2nvme/vm-104-disk-0,size=540672' \
>   -smp '4,sockets=1,cores=4,maxcpus=4' \
>   -nodefaults \
>   -vnc 'unix:/var/run/qemu-server/104.vnc,password=on' \
>   -m 4096 \
>   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
>   -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
>   -device 'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' \
>   -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
>   -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1' \
>   -drive 'file=/dev/u2nvme/vm-104-disk-1,if=none,id=drive-scsi0,format=raw' \
>   -blockdev 'raw,file.driver=file,file.filename=/home/febner/plug.raw,node-name=drive-scsi1' \
>   -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' \
>   -netdev 'type=tap,id=net0,ifname=tap104i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
>   -device 'virtio-net-pci,mac=BC:24:11:89:6A:E6,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=102' \
>   -bios './pc-bios/edk2-x86_64-code.fd'

Script to issue hotplug command via QEMU monitor protocol (QMP):

> #!/bin/sh
>
> ID=$1
> CMD=$2
>
> if [ -z "$ID" ]; then
>     echo "need to specify ID";
>     exit 1;
> fi
>
> if [ -z "$CMD" ]; then
>     echo "need to specify command (plug or unplug)";
>     exit 1;
> fi
>
>
> if [ "$CMD" = "plug" ]; then
>     socat - /var/run/qemu-server/"$ID".qmp << END
> {"execute": "qmp_capabilities"}
> {"arguments":{"driver":"virtio-scsi-pci","bus":"pci.3","addr":"0x2","id":"virtioscsi1"},"execute":"device_add"}
> {"arguments":{"bus":"virtioscsi1.0","channel":"0","driver":"scsi-hd","id":"scsi1","drive":"drive-scsi1","scsi-id":"0","lun":"1"},"execute":"device_add"}
> END
> elif [ "$CMD" = "unplug" ]; then
>     socat - /var/run/qemu-server/"$ID".qmp << END
> {"execute": "qmp_capabilities"}
> {"arguments":{"id":"scsi1"},"execute":"device_del"}
> {"arguments":{"id":"virtioscsi1"},"execute":"device_del"}
> END
> fi

I've also tried adding a 10 second sleep between the two device_add commands just to be sure, but that didn't make a difference. (Our management stack does query via QMP and wait for the device to show up and is also affected; I was just too lazy to do that for the reproducer here.)
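For reference, the variant with a delay between the two device_add commands could look roughly like the sketch below. It is only an illustration: the function name qmp_plug_with_delay and its delay argument are made up here, while the QMP payloads are copied from the plug branch of the script above.

```shell
#!/bin/sh
# Sketch only: prints the "plug" QMP commands from the reproducer,
# sleeping between the two device_add commands.
# qmp_plug_with_delay is a hypothetical helper name, not part of any tool.
qmp_plug_with_delay() {
    delay=$1   # seconds to wait between the two device_add commands

    printf '%s\n' '{"execute": "qmp_capabilities"}'
    # First device_add: the virtio-scsi controller on bridge pci.3
    printf '%s\n' '{"arguments":{"driver":"virtio-scsi-pci","bus":"pci.3","addr":"0x2","id":"virtioscsi1"},"execute":"device_add"}'
    sleep "$delay"
    # Second device_add: the SCSI disk on the new controller
    printf '%s\n' '{"arguments":{"bus":"virtioscsi1.0","channel":"0","driver":"scsi-hd","id":"scsi1","drive":"drive-scsi1","scsi-id":"0","lun":"1"},"execute":"device_add"}'
}

# Usage, assuming the same socket path as in the reproducer:
#   qmp_plug_with_delay 10 | socat - /var/run/qemu-server/"$ID".qmp
```

Because the function only writes to stdout, the delay happens while socat keeps the QMP connection open, so both device_add commands go through the same session.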
I've attached some logs for guest using kernel 6.7.0-rc3 where hotplug works rarely and guest using kernel 6.7.0-rc3 with the previously mentioned commits reverted where hotplug works reliably:

6.7.0-rc3:

> Nov 29 15:12:02 hotplug kernel: pci 0000:01:02.0: [1af4:1004] type 00 class 0x010000
> Nov 29 15:12:02 hotplug kernel: pci 0000:01:02.0: reg 0x10: [io 0x0000-0x003f]
> Nov 29 15:12:02 hotplug kernel: pci 0000:01:02.0: reg 0x14: [mem 0x00000000-0x00000fff]
> Nov 29 15:12:02 hotplug kernel: pci 0000:01:02.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
> Nov 29 15:12:02 hotplug kernel: pci 0000:01:02.0: BAR 4: assigned [mem 0xc000004000-0xc000007fff 64bit pref]
> Nov 29 15:12:02 hotplug kernel: pci 0000:01:02.0: BAR 1: assigned [mem 0xc1401000-0xc1401fff]
> Nov 29 15:12:02 hotplug kernel: pci 0000:01:02.0: BAR 0: assigned [io 0xe040-0xe07f]
> Nov 29 15:12:02 hotplug kernel: pci 0000:00:05.0: PCI bridge to [bus 01]
> Nov 29 15:12:02 hotplug kernel: pci 0000:00:05.0: bridge window [io 0xe000-0xefff]
> Nov 29 15:12:02 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc1400000-0xc15fffff]
> Nov 29 15:12:02 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc000000000-0xc01fffffff 64bit pref]
> Nov 29 15:12:02 hotplug kernel: virtio-pci 0000:01:02.0: enabling device (0000 -> 0003)
> Nov 29 15:12:02 hotplug kernel: ACPI: \_SB_.LNKC: Enabled at IRQ 11
> Nov 29 15:12:02 hotplug kernel: scsi host3: Virtio SCSI HBA
> Nov 29 15:12:02 hotplug kernel: pci 0000:00:05.0: PCI bridge to [bus 01]
> Nov 29 15:12:02 hotplug kernel: pci 0000:00:05.0: bridge window [io 0xe000-0xefff]
> Nov 29 15:12:02 hotplug kernel: scsi 3:0:0:1: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5
> Nov 29 15:12:02 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc1400000-0xc15fffff]
> Nov 29 15:12:02 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc000000000-0xc01fffffff 64bit pref]

Reboot

> Nov 29 15:12:52 hotplug kernel: pci 0000:01:02.0: [1af4:1004] type 00 class 0x010000
> Nov 29 15:12:52 hotplug kernel: pci 0000:01:02.0: reg 0x10: [io 0x0000-0x003f]
> Nov 29 15:12:52 hotplug kernel: pci 0000:01:02.0: reg 0x14: [mem 0x00000000-0x00000fff]
> Nov 29 15:12:52 hotplug kernel: pci 0000:01:02.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
> Nov 29 15:12:52 hotplug kernel: pci 0000:01:02.0: BAR 4: assigned [mem 0xc000004000-0xc000007fff 64bit pref]
> Nov 29 15:12:52 hotplug kernel: pci 0000:01:02.0: BAR 1: assigned [mem 0xc1401000-0xc1401fff]
> Nov 29 15:12:52 hotplug kernel: pci 0000:01:02.0: BAR 0: assigned [io 0xe040-0xe07f]
> Nov 29 15:12:52 hotplug kernel: pci 0000:00:05.0: PCI bridge to [bus 01]
> Nov 29 15:12:52 hotplug kernel: pci 0000:00:05.0: bridge window [io 0xe000-0xefff]
> Nov 29 15:12:52 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc1400000-0xc15fffff]
> Nov 29 15:12:52 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc000000000-0xc01fffffff 64bit pref]
> Nov 29 15:12:52 hotplug kernel: virtio-pci 0000:01:02.0: enabling device (0000 -> 0003)
> Nov 29 15:12:52 hotplug kernel: ACPI: \_SB_.LNKC: Enabled at IRQ 11
> Nov 29 15:12:52 hotplug kernel: scsi host3: Virtio SCSI HBA
> Nov 29 15:12:52 hotplug kernel: pci 0000:00:05.0: PCI bridge to [bus 01]
> Nov 29 15:12:52 hotplug kernel: pci 0000:00:05.0: bridge window [io 0xe000-0xefff]
> Nov 29 15:12:52 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc1400000-0xc15fffff]
> Nov 29 15:12:52 hotplug kernel: scsi 3:0:0:1: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5
> Nov 29 15:12:52 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc000000000-0xc01fffffff 64bit pref]

Reboot

The one time it did work.
Note that the line with "QEMU HARDDISK" comes after all lines with "bridge window":

> Nov 29 15:13:51 hotplug kernel: pci 0000:01:02.0: [1af4:1004] type 00 class 0x010000
> Nov 29 15:13:51 hotplug kernel: pci 0000:01:02.0: reg 0x10: [io 0x0000-0x003f]
> Nov 29 15:13:51 hotplug kernel: pci 0000:01:02.0: reg 0x14: [mem 0x00000000-0x00000fff]
> Nov 29 15:13:51 hotplug kernel: pci 0000:01:02.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
> Nov 29 15:13:51 hotplug kernel: pci 0000:01:02.0: BAR 4: assigned [mem 0xc000004000-0xc000007fff 64bit pref]
> Nov 29 15:13:51 hotplug kernel: pci 0000:01:02.0: BAR 1: assigned [mem 0xc1401000-0xc1401fff]
> Nov 29 15:13:51 hotplug kernel: pci 0000:01:02.0: BAR 0: assigned [io 0xe040-0xe07f]
> Nov 29 15:13:51 hotplug kernel: pci 0000:00:05.0: PCI bridge to [bus 01]
> Nov 29 15:13:51 hotplug kernel: pci 0000:00:05.0: bridge window [io 0xe000-0xefff]
> Nov 29 15:13:51 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc1400000-0xc15fffff]
> Nov 29 15:13:51 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc000000000-0xc01fffffff 64bit pref]
> Nov 29 15:13:51 hotplug kernel: virtio-pci 0000:01:02.0: enabling device (0000 -> 0003)
> Nov 29 15:13:51 hotplug kernel: ACPI: \_SB_.LNKC: Enabled at IRQ 11
> Nov 29 15:13:51 hotplug kernel: scsi host3: Virtio SCSI HBA
> Nov 29 15:13:51 hotplug kernel: pci 0000:00:05.0: PCI bridge to [bus 01]
> Nov 29 15:13:51 hotplug kernel: pci 0000:00:05.0: bridge window [io 0xe000-0xefff]
> Nov 29 15:13:51 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc1400000-0xc15fffff]
> Nov 29 15:13:51 hotplug kernel: pci 0000:00:05.0: bridge window [mem 0xc000000000-0xc01fffffff 64bit pref]
> Nov 29 15:13:51 hotplug kernel: scsi 3:0:0:1: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5
> Nov 29 15:13:51 hotplug kernel: sd 3:0:0:1: Attached scsi generic sg1 type 0
> Nov 29 15:13:51 hotplug kernel: sd 3:0:0:1: Power-on or device reset occurred
> Nov 29 15:13:51 hotplug kernel: sd 3:0:0:1: [sdb] 2048 512-byte logical blocks: (1.05 MB/1.00 MiB)
> Nov 29 15:13:51 hotplug kernel: sd 3:0:0:1: [sdb] Write Protect is off
> Nov 29 15:13:51 hotplug kernel: sd 3:0:0:1: [sdb] Mode Sense: 63 00 00 08
> Nov 29 15:13:51 hotplug kernel: sd 3:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Nov 29 15:13:51 hotplug kernel: sd 3:0:0:1: [sdb] Attached SCSI disk
> Nov 29 15:14:08 hotplug systemd[1]: systemd-fsckd.service: Deactivated successfully.

6.7.0-rc3 with the following reverted:

cc22522fd55e2 ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus")
40613da52b13f ("PCI: acpiphp: Reassign resources on bridge if necessary")

> Nov 29 15:15:37 hotplug kernel: pci 0000:01:02.0: [1af4:1004] type 00 class 0x010000
> Nov 29 15:15:37 hotplug kernel: pci 0000:01:02.0: reg 0x10: [io 0x0000-0x003f]
> Nov 29 15:15:37 hotplug kernel: pci 0000:01:02.0: reg 0x14: [mem 0x00000000-0x00000fff]
> Nov 29 15:15:37 hotplug kernel: pci 0000:01:02.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
> Nov 29 15:15:37 hotplug kernel: pci 0000:01:02.0: BAR 4: assigned [mem 0xc000004000-0xc000007fff 64bit pref]
> Nov 29 15:15:37 hotplug kernel: pci 0000:01:02.0: BAR 1: assigned [mem 0xc1401000-0xc1401fff]
> Nov 29 15:15:37 hotplug kernel: pci 0000:01:02.0: BAR 0: assigned [io 0xe040-0xe07f]
> Nov 29 15:15:37 hotplug kernel: virtio-pci 0000:01:02.0: enabling device (0000 -> 0003)
> Nov 29 15:15:37 hotplug kernel: ACPI: \_SB_.LNKC: Enabled at IRQ 11
> Nov 29 15:15:37 hotplug kernel: scsi host3: Virtio SCSI HBA
> Nov 29 15:15:37 hotplug kernel: scsi 3:0:0:1: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5
> Nov 29 15:15:37 hotplug kernel: sd 3:0:0:1: Attached scsi generic sg1 type 0
> Nov 29 15:15:37 hotplug kernel: sd 3:0:0:1: Power-on or device reset occurred
> Nov 29 15:15:37 hotplug kernel: sd 3:0:0:1: [sdb] 2048 512-byte logical blocks: (1.05 MB/1.00 MiB)
> Nov 29 15:15:37 hotplug kernel: sd 3:0:0:1: [sdb] Write Protect is off
> Nov 29 15:15:37 hotplug kernel: sd 3:0:0:1: [sdb] Mode Sense: 63 00 00 08
> Nov 29 15:15:37 hotplug kernel: sd 3:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Nov 29 15:15:37 hotplug kernel: sd 3:0:0:1: [sdb] Attached SCSI disk
> Nov 29 15:15:38 hotplug systemd[1]: systemd-fsckd.service: Deactivated successfully.
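The ordering observation above (whether the "QEMU HARDDISK" line comes after all "bridge window" lines) could be checked mechanically with something like the following sketch. The function name check_hotplug_order and its output strings are invented for illustration, and the correlation between ordering and a working hotplug is only what these few logs suggest, not a confirmed diagnosis.

```shell
#!/bin/sh
# Sketch only: reads the kernel log lines of ONE hotplug attempt on stdin
# and reports where the first "QEMU HARDDISK" line sits relative to the
# last "bridge window" line.
# check_hotplug_order is a hypothetical helper name.
check_hotplug_order() {
    awk '
        /bridge window/          { last_bridge = NR }   # remember last bridge window line
        /QEMU HARDDISK/ && !disk { disk = NR }          # remember first disk detection line
        END {
            if (!disk) { print "no-disk"; exit 2 }
            if (disk > last_bridge) { print "disk-after-bridge-windows"; exit 0 }
            print "disk-before-bridge-windows"; exit 1
        }
    '
}

# Usage, with attempt.log being a saved excerpt covering one hotplug attempt
# (the file name is just an example):
#   check_hotplug_order < attempt.log
```

With no "bridge window" lines at all, as in the log with the commits reverted, the check also reports disk-after-bridge-windows, since there is nothing for the disk line to race against.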