2022-04-22 22:08:49

by Evan Green

Subject: [PATCH v3 0/2] USB: Quiesce interrupts across pm freeze

The documentation for the freeze() method says that it "should quiesce
the device so that it doesn't generate IRQs or DMA". The unspoken
consequence of not doing this is that MSIs aimed at non-boot CPUs may
get fully lost if they're sent during the period where the target CPU is
offline.

The current behavior of the USB subsystem still allows interrupts to
come in after freeze, both in terms of remote wakeups and HC events
related to things like root port plug activity. This can leave
controllers like XHCI, which are very sensitive to lost interrupts, in a
wedged state. This series attempts to fully quiesce interrupts coming
from USB across a freeze or quiesce transition.

These patches are grouped together because they serve a united purpose,
but are actually independent. They could be merged or reverted
individually.

You may be able to reproduce this issue on your own machine via the
following:
1. Disable runtime PM on your XHCI controller
2. Aim your XHCI IRQ at a non-boot CPU (replace 174):
   echo 2 > /proc/irq/174/smp_affinity
3. Attempt to hibernate (no need to actually go all the way down).
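
As a rough sketch only, steps 2 and 3 could be scripted along these lines.
Everything beyond the echo command above is an assumption for illustration:
the IRQ, CPU, and ITERATIONS variables, the helper function names, and the
use of /sys/power/pm_test (which needs CONFIG_PM_DEBUG) to back out of
hibernation after the non-boot CPUs have gone offline. The script only
prints the commands unless RUN=1 is set, and it does not cover step 1.

```shell
#!/bin/sh
# Sketch of repro steps 2 and 3. Dry run by default: it only prints the
# commands. Set RUN=1 (as root) to actually execute them.
# Assumptions, not taken from the original posting: the variable names,
# the iteration count, and the use of pm_test to stop hibernation early.

IRQ="${IRQ:-174}"               # replace with your XHCI IRQ (see /proc/interrupts)
CPU="${CPU:-1}"                 # any non-boot CPU
ITERATIONS="${ITERATIONS:-3}"   # hypothetical loop count

# smp_affinity takes a hex CPU bitmask: CPU n is bit n, so CPU 1 gives 2.
# This simple form is only meant for CPUs 0-31.
affinity_mask() {
    printf '%x\n' "$((1 << $1))"
}

# Write value $1 to file $2, or just print the command in dry-run mode.
write_sysfs() {
    if [ "${RUN:-0}" = "1" ]; then
        printf '%s\n' "$1" > "$2"
    else
        printf 'echo %s > %s\n' "$1" "$2"
    fi
}

repro_once() {
    # Step 2: aim the IRQ at the chosen non-boot CPU.
    write_sysfs "$(affinity_mask "$CPU")" "/proc/irq/$IRQ/smp_affinity"
    # Step 3: start hibernation, but have the PM core back out after the
    # non-boot CPUs have been taken offline (assumes CONFIG_PM_DEBUG).
    write_sysfs processors /sys/power/pm_test
    write_sysfs disk /sys/power/state
}

i=1
while [ "$i" -le "$ITERATIONS" ]; do
    repro_once
    i=$((i + 1))
done
```

With the defaults and no RUN=1, this prints the three commands once per
iteration, which is a cheap way to check the mask and paths before running
it for real.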

I run steps 2 and 3 in a loop, and can usually hit a hang or dead XHCI
controller within 1-2 iterations. I happened to notice this on an
Alder Lake system where runtime PM is accidentally disabled for one of
the XHCI controllers. Some more discussion and debugging can be found at
[1].

[1] https://lore.kernel.org/linux-pci/CAE=gft4a-QL82iFJE_xRQ3JrMmz-KZKWREtz=MghhjFbJeK=8A@mail.gmail.com/T/#u

Changes in v3:
- Fix comment formatting and line wrap

Changes in v2:
- Introduced the patch to always disable remote wakeups
- Removed the change to freeze_noirq/thaw_noirq

Evan Green (2):
  USB: core: Disable remote wakeup for freeze/quiesce
  USB: hcd-pci: Fully suspend across freeze/thaw cycle

 drivers/usb/core/driver.c  | 25 +++++++++++++------------
 drivers/usb/core/hcd-pci.c |  4 ++--
 2 files changed, 15 insertions(+), 14 deletions(-)

--
2.31.0