Hello,

On kernel v6.6.0-rc5 we have discovered a lockdep warning when using PCIe
hotplug. The issue is reproducible in a QEMU environment.

When reverting the following commit, the warning no longer appears:

commit f5eff5591b8f9c5effd25c92c758a127765f74c1
Author: Lukas Wunner <[email protected]>
Date:   Tue Apr 11 08:21:02 2023 +0200

    PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock

We have also seen the issue on the v5.10 stable branch.

So far I have had difficulty determining whether this is a serious potential
deadlock or a false positive. Any help here would be greatly appreciated.

The warning output:

[ 18.660032] pcieport 0000:80:00.0: pciehp: Slot(0): Button press: will power on in 5 sec
[ 18.662205] pcieport 0000:80:00.0: pciehp: Slot(0): Card present
[ 18.662791] pcieport 0000:80:00.0: pciehp: Slot(0): Link Up
[ 19.805119] pci 0000:81:00.0: [1af4:1041] type 00 class 0x020000
[ 19.806830] pci 0000:81:00.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 19.808184] pci 0000:81:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
[ 19.809635] pci 0000:81:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
[ 19.813842] pcieport 0000:80:00.0: bridge window [io 0x1000-0x0fff] to [bus 81] add_size 1000
[ 19.817462] pcieport 0000:80:00.0: BAR 7: no space for [io size 0x1000]
[ 19.818737] pcieport 0000:80:00.0: BAR 7: failed to assign [io size 0x1000]
[ 19.819920] pcieport 0000:80:00.0: BAR 7: no space for [io size 0x1000]
[ 19.820876] pcieport 0000:80:00.0: BAR 7: failed to assign [io size 0x1000]
[ 19.821428] pci 0000:81:00.0: BAR 6: assigned [mem 0xfe800000-0xfe87ffff pref]
[ 19.821835] pci 0000:81:00.0: BAR 4: assigned [mem 0xfe000000-0xfe003fff 64bit pref]
[ 19.822460] pci 0000:81:00.0: BAR 1: assigned [mem 0xfe880000-0xfe880fff]
[ 19.822845] pcieport 0000:80:00.0: PCI bridge to [bus 81]
[ 19.823728] pcieport 0000:80:00.0: bridge window [mem 0xfe800000-0xfe9fffff]
[ 19.824466] pcieport 0000:80:00.0: bridge window [mem 0xfe000000-0xfe1fffff 64bit pref]
[ 19.825769] virtio-pci 0000:81:00.0: enabling device (0000 -> 0002)
[ 19.880141]
[ 19.880246] ======================================================
[ 19.880886] WARNING: possible circular locking dependency detected
[ 19.881494] 6.6.0-rc5 #11 Not tainted
[ 19.881789] ------------------------------------------------------
[ 19.882354] irq/24-pciehp/53 is trying to acquire lock:
[ 19.882803] ffff8881002a66f8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_configure_device+0xb1/0x100
[ 19.883601]
[ 19.883601] but task is already holding lock:
[ 19.884080] ffffffff82beb608 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_configure_device+0x1f/0x100
[ 19.884804]
[ 19.884804] which lock already depends on the new lock.
[ 19.884804]
[ 19.885280]
[ 19.885280] the existing dependency chain (in reverse order) is:
[ 19.885923]
[ 19.885923] -> #1 (pci_rescan_remove_lock){+.+.}-{3:3}:
[ 19.886623] __mutex_lock+0x81/0xcb0
[ 19.886889] pciehp_configure_device+0x1f/0x100
[ 19.887211] pciehp_handle_presence_or_link_change+0x16e/0x4d0
[ 19.887587] pciehp_ist+0x157/0x190
[ 19.887822] irq_thread_fn+0x1f/0x60
[ 19.888076] irq_thread+0xe5/0x1b0
[ 19.888306] kthread+0xe4/0x120
[ 19.888499] ret_from_fork+0x2f/0x50
[ 19.888728] ret_from_fork_asm+0x1b/0x30
[ 19.889018]
[ 19.889018] -> #0 (&ctrl->reset_lock){.+.+}-{3:3}:
[ 19.889382] __lock_acquire+0x1509/0x25f0
[ 19.889661] lock_acquire+0xc1/0x2b0
[ 19.889899] down_read_nested+0x2f/0x160
[ 19.890177] pciehp_configure_device+0xb1/0x100
[ 19.890492] pciehp_handle_presence_or_link_change+0x16e/0x4d0
[ 19.890876] pciehp_ist+0x157/0x190
[ 19.891085] irq_thread_fn+0x1f/0x60
[ 19.891301] irq_thread+0xe5/0x1b0
[ 19.891538] kthread+0xe4/0x120
[ 19.891764] ret_from_fork+0x2f/0x50
[ 19.891989] ret_from_fork_asm+0x1b/0x30
[ 19.892227]
[ 19.892227] other info that might help us debug this:
[ 19.892227]
[ 19.892635] Possible unsafe locking scenario:
[ 19.892635]
[ 19.892944] CPU0 CPU1
[ 19.893179] ---- ----
[ 19.893415] lock(pci_rescan_remove_lock);
[ 19.893633] lock(&ctrl->reset_lock);
[ 19.893962] lock(pci_rescan_remove_lock);
[ 19.894307] rlock(&ctrl->reset_lock);
[ 19.894506]
[ 19.894506] *** DEADLOCK ***
[ 19.894506]
[ 19.894830] 1 lock held by irq/24-pciehp/53:
[ 19.895143] #0: ffffffff82beb608 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_configure_device+0x1f/0x100
[ 19.895785]
[ 19.895785] stack backtrace:
[ 19.896014] CPU: 2 PID: 53 Comm: irq/24-pciehp Not tainted 6.6.0-rc5 #11
[ 19.896358] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[ 19.896770] Call Trace:
[ 19.896905] <TASK>
[ 19.897022] dump_stack_lvl+0x4b/0x80
[ 19.897217] check_noncircular+0x134/0x150
[ 19.897432] ? lock_acquire+0xc1/0x2b0
[ 19.897629] __lock_acquire+0x1509/0x25f0
[ 19.897843] lock_acquire+0xc1/0x2b0
[ 19.898033] ? pciehp_configure_device+0xb1/0x100
[ 19.898282] ? __device_attach+0xfb/0x1d0
[ 19.898494] down_read_nested+0x2f/0x160
[ 19.898700] ? pciehp_configure_device+0xb1/0x100
[ 19.898945] pciehp_configure_device+0xb1/0x100
[ 19.899182] ? __pfx_irq_thread_fn+0x10/0x10
[ 19.899406] pciehp_handle_presence_or_link_change+0x16e/0x4d0
[ 19.899708] pciehp_ist+0x157/0x190
[ 19.899895] irq_thread_fn+0x1f/0x60
[ 19.900090] irq_thread+0xe5/0x1b0
[ 19.900272] ? __pfx_irq_thread_dtor+0x10/0x10
[ 19.900507] ? __pfx_irq_thread+0x10/0x10
[ 19.900716] kthread+0xe4/0x120
[ 19.900887] ? __pfx_kthread+0x10/0x10
[ 19.901084] ret_from_fork+0x2f/0x50
[ 19.901274] ? __pfx_kthread+0x10/0x10
[ 19.901482] ret_from_fork_asm+0x1b/0x30
[ 19.901689] </TASK>

Kind regards,
Marcel

On Tue, Oct 10, 2023 at 11:47:23AM +0200, Marcel Hamer wrote:
> On kernel v6.6.0-rc5 we have discovered a lockdep warning when using PCIe
> hotplug. The issue is reproducible in a QEMU environment.
>
> When reverting the following commit, the warning no longer appears:
>
> commit f5eff5591b8f9c5effd25c92c758a127765f74c1
> Author: Lukas Wunner <[email protected]>
> Date:   Tue Apr 11 08:21:02 2023 +0200
>
>     PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock
>
> We have also seen the issue on the v5.10 stable branch.
>
> So far I have had difficulty determining whether this is a serious potential
> deadlock or a false positive. Any help here would be greatly appreciated.

Thanks a lot for the report.

It's a false positive: the two stack traces are identical, and
pciehp_ist() is single-threaded. There is only ever a single instance
of pciehp_ist() running per hotplug port, so the scenario lockdep
constructs, with two instances running on separate CPUs, can't happen:
> [ 19.885923] -> #1 (pci_rescan_remove_lock){+.+.}-{3:3}:
> [ 19.886623] __mutex_lock+0x81/0xcb0
> [ 19.886889] pciehp_configure_device+0x1f/0x100
> [ 19.887211] pciehp_handle_presence_or_link_change+0x16e/0x4d0
> [ 19.887587] pciehp_ist+0x157/0x190
> [ 19.887822] irq_thread_fn+0x1f/0x60
> [ 19.888076] irq_thread+0xe5/0x1b0
> [ 19.888306] kthread+0xe4/0x120
> [ 19.888499] ret_from_fork+0x2f/0x50
> [ 19.888728] ret_from_fork_asm+0x1b/0x30
> [ 19.889018]
> [ 19.889018] -> #0 (&ctrl->reset_lock){.+.+}-{3:3}:
> [ 19.889382] __lock_acquire+0x1509/0x25f0
> [ 19.889661] lock_acquire+0xc1/0x2b0
> [ 19.889899] down_read_nested+0x2f/0x160
> [ 19.890177] pciehp_configure_device+0xb1/0x100
> [ 19.890492] pciehp_handle_presence_or_link_change+0x16e/0x4d0
> [ 19.890876] pciehp_ist+0x157/0x190
> [ 19.891085] irq_thread_fn+0x1f/0x60
> [ 19.891301] irq_thread+0xe5/0x1b0
> [ 19.891538] kthread+0xe4/0x120
> [ 19.891764] ret_from_fork+0x2f/0x50
> [ 19.891989] ret_from_fork_asm+0x1b/0x30

lockdep doesn't appear to be smart enough to recognize that, and we do
not have an annotation that would tell it that a particular function is
always single-threaded.
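
This class of false positive is easy to reproduce in isolation.  Below
is a contrived, self-contained sketch (not pciehp code; the lock names
are merely stand-ins): a single kthread that takes two locks in both
orders trips the same detector, even though one thread can never
deadlock with itself.

	/* lockdep false-positive sketch -- NOT pciehp code, just an analogy */
	#include <linux/module.h>
	#include <linux/mutex.h>
	#include <linux/kthread.h>
	#include <linux/err.h>

	static DEFINE_MUTEX(lock_a);	/* loosely: pci_rescan_remove_lock */
	static DEFINE_MUTEX(lock_b);	/* loosely: ctrl->reset_lock */

	static int worker(void *unused)
	{
		/* First event: A, then B -- lockdep records A -> B. */
		mutex_lock(&lock_a);
		mutex_lock(&lock_b);
		mutex_unlock(&lock_b);
		mutex_unlock(&lock_a);

		/*
		 * Later event: B, then A -- closes the cycle, so lockdep
		 * reports a possible AB-BA deadlock.  But this thread is
		 * the only one ever touching these locks, so it cannot
		 * block on itself and no deadlock is actually possible.
		 */
		mutex_lock(&lock_b);
		mutex_lock(&lock_a);
		mutex_unlock(&lock_a);
		mutex_unlock(&lock_b);
		return 0;
	}

	static int __init demo_init(void)
	{
		/* The thread exits on its own after both orderings ran. */
		return PTR_ERR_OR_ZERO(kthread_run(worker, NULL, "lockdep-demo"));
	}

	static void __exit demo_exit(void)
	{
	}

	module_init(demo_init);
	module_exit(demo_exit);
	MODULE_LICENSE("GPL");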

From a brief look, amending lockdep to cope with such situations seems
non-trivial, and I'm not sure the situation arises frequently enough to
justify the additional complexity.

The only other option I see is to set lockdep_set_novalidate_class()
on the reset_lock. However, that would prevent us from detecting
*valid* issues with that lock.
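
For illustration, that annotation would sit next to the lock's
initialization, along these lines (treat this as a sketch; the exact
spot in pcie_init() is from memory):

	/* in pcie_init(): */
	init_rwsem(&ctrl->reset_lock);
	/* opt reset_lock out of all lockdep validation: */
	lockdep_set_novalidate_class(&ctrl->reset_lock);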

Hm, that's a difficult decision...

Thanks,
Lukas