Testing ohci functionality with qemu's pci-ohci emulation often results
in ohci interface stalls, resulting in hung task timeouts.
The problem is caused by lost interrupts between the emulation and the
Linux kernel code. Additional interrupts raised while the ohci interrupt
handler in Linux is running and before the handler clears the interrupt
status are not handled. The fix for a similar problem in ehci suggests
that the problem is likely caused by edge-triggered MSI interrupts. See
commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
edge-triggered MSI") for details.
Ensure that the ohci interrupt code handles all pending interrupts before
returning to solve the problem.
Cc: Gerd Hoffmann <[email protected]>
Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
Signed-off-by: Guenter Roeck <[email protected]>
---
drivers/usb/host/ohci-hcd.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
index 4f9982ecfb58..4d764eb6c1e5 100644
--- a/drivers/usb/host/ohci-hcd.c
+++ b/drivers/usb/host/ohci-hcd.c
@@ -888,6 +888,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
/* Check for an all 1's result which is a typical consequence
* of dead, unclocked, or unplugged (CardBus...) devices
*/
+again:
if (ints == ~(u32)0) {
ohci->rh_state = OHCI_RH_HALTED;
ohci_dbg (ohci, "device removed!\n");
@@ -982,6 +983,11 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
}
spin_unlock(&ohci->lock);
+ /* repeat until all enabled interrupts are handled */
+ ints = ohci_readl(ohci, &regs->intrstatus);
+ if (ints & ohci_readl(ohci, &regs->intrenable))
+ goto again;
+
return IRQ_HANDLED;
}
--
2.39.2
On Wed, Apr 24, 2024 at 10:02:50AM -0700, Guenter Roeck wrote:
> Testing ohci functionality with qemu's pci-ohci emulation often results
> in ohci interface stalls, resulting in hung task timeouts.
>
> The problem is caused by lost interrupts between the emulation and the
> Linux kernel code. Additional interrupts raised while the ohci interrupt
> handler in Linux is running and before the handler clears the interrupt
> status are not handled. The fix for a similar problem in ehci suggests
> that the problem is likely caused by edge-triggered MSI interrupts. See
> commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
> edge-triggered MSI") for details.
>
> Ensure that the ohci interrupt code handles all pending interrupts before
> returning to solve the problem.
>
> Cc: Gerd Hoffmann <[email protected]>
> Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
> Signed-off-by: Guenter Roeck <[email protected]>
> ---
> drivers/usb/host/ohci-hcd.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
> index 4f9982ecfb58..4d764eb6c1e5 100644
> --- a/drivers/usb/host/ohci-hcd.c
> +++ b/drivers/usb/host/ohci-hcd.c
> @@ -888,6 +888,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
> /* Check for an all 1's result which is a typical consequence
> * of dead, unclocked, or unplugged (CardBus...) devices
> */
> +again:
> if (ints == ~(u32)0) {
> ohci->rh_state = OHCI_RH_HALTED;
> ohci_dbg (ohci, "device removed!\n");
> @@ -982,6 +983,11 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
> }
> spin_unlock(&ohci->lock);
>
> + /* repeat until all enabled interrupts are handled */
> + ints = ohci_readl(ohci, &regs->intrstatus);
> + if (ints & ohci_readl(ohci, &regs->intrenable))
> + goto again;
If we take the repeat, we don't want to return IRQ_NOTMINE by accident. To
prevent this, we should check that ohci->rh_state != OHCI_RH_HALTED before
re-reading ints and jumping back.
(If rh_state _is_ OHCI_RH_HALTED, it means the device is supposedly stopped
and disabled for generating further interrupt requests, so we shouldn't
need to worry about any outstanding intrstatus bits still set.)
Alan Stern
> +
> return IRQ_HANDLED;
> }
>
> --
> 2.39.2
>
On 4/24/24 11:15, Alan Stern wrote:
> On Wed, Apr 24, 2024 at 10:02:50AM -0700, Guenter Roeck wrote:
>> Testing ohci functionality with qemu's pci-ohci emulation often results
>> in ohci interface stalls, resulting in hung task timeouts.
>>
>> The problem is caused by lost interrupts between the emulation and the
>> Linux kernel code. Additional interrupts raised while the ohci interrupt
>> handler in Linux is running and before the handler clears the interrupt
>> status are not handled. The fix for a similar problem in ehci suggests
>> that the problem is likely caused by edge-triggered MSI interrupts. See
>> commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
>> edge-triggered MSI") for details.
>>
>> Ensure that the ohci interrupt code handles all pending interrupts before
>> returning to solve the problem.
>>
>> Cc: Gerd Hoffmann <[email protected]>
>> Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
>> Signed-off-by: Guenter Roeck <[email protected]>
>> ---
>> drivers/usb/host/ohci-hcd.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
>> index 4f9982ecfb58..4d764eb6c1e5 100644
>> --- a/drivers/usb/host/ohci-hcd.c
>> +++ b/drivers/usb/host/ohci-hcd.c
>> @@ -888,6 +888,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>> /* Check for an all 1's result which is a typical consequence
>> * of dead, unclocked, or unplugged (CardBus...) devices
>> */
>> +again:
>> if (ints == ~(u32)0) {
>> ohci->rh_state = OHCI_RH_HALTED;
>> ohci_dbg (ohci, "device removed!\n");
>> @@ -982,6 +983,11 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>> }
>> spin_unlock(&ohci->lock);
>>
>> + /* repeat until all enabled interrupts are handled */
>> + ints = ohci_readl(ohci, &regs->intrstatus);
>> + if (ints & ohci_readl(ohci, &regs->intrenable))
>> + goto again;
>
> If we take the repeat, we don't want to return IRQ_NOTMINE by accident. To
> prevent this, we should check that ohci->rh_state != OHCI_RH_HALTED before
> re-reading ints and jumping back.
>
> (If rh_state _is_ OHCI_RH_HALTED, it means the device is supposedly stopped
> and disabled for generating further interrupt requests, so we shouldn't
> need to worry about any outstanding intrstatus bits still set.)
>
Makes sense. I'll send v2.
Thanks,
Guenter
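
For reference, the behaviour discussed in this thread can be modelled outside the kernel. The following is only a userspace sketch, not the actual v2 patch and not the real ohci_irq(): struct mock_regs, struct mock_ohci, mock_handle() and mock_irq() are hypothetical names made up for illustration, and the "halted" flag merely stands in for rh_state == OHCI_RH_HALTED. It shows an event arriving after the status register was sampled, and how re-reading the status before returning (and skipping the re-read once halted) picks that event up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller registers (not struct ohci_regs). */
struct mock_regs {
	uint32_t intrstatus;	/* pending events */
	uint32_t intrenable;	/* events allowed to raise an interrupt */
};

struct mock_ohci {
	struct mock_regs regs;
	bool halted;		/* stands in for rh_state == OHCI_RH_HALTED */
	uint32_t late_event;	/* event that arrives while the handler runs */
	unsigned int passes;	/* how many times the handler body ran */
	uint32_t handled;	/* union of all events handled so far */
};

#define MOCK_DEAD (~(uint32_t)0)	/* all 1's: device gone */

/* One pass of the handler body: handle and acknowledge the sampled events. */
static void mock_handle(struct mock_ohci *o, uint32_t ints)
{
	assert(ints != 0);
	o->passes++;
	o->handled |= ints;
	/* A new event shows up after "ints" was sampled. */
	o->regs.intrstatus |= o->late_event;
	o->late_event = 0;
	/* Acknowledge only what was sampled; the late event stays pending. */
	o->regs.intrstatus &= ~ints;
}

/*
 * Returns true for "handled" and false for "not mine".
 * With drain == false the handler makes a single pass (old behaviour);
 * with drain == true it repeats until no enabled event is pending,
 * but never re-reads the status once the controller is halted.
 */
static bool mock_irq(struct mock_ohci *o, bool drain)
{
	uint32_t ints = o->regs.intrstatus;

	if (ints == MOCK_DEAD) {
		o->halted = true;
		return true;
	}
	ints &= o->regs.intrenable;
	if (ints == 0 || o->halted)
		return false;

	for (;;) {
		mock_handle(o, ints);
		if (!drain || o->halted)
			break;
		ints = o->regs.intrstatus;
		if (ints == MOCK_DEAD) {
			o->halted = true;
			break;
		}
		ints &= o->regs.intrenable;
		if (ints == 0)
			break;
	}
	return true;
}
```

In this model a single-pass call leaves the late event set in intrstatus when the handler returns, which with an edge-triggered interrupt means nothing prompts another call. The draining variant handles it in a second pass, and because the halted check comes before the re-read, a pass that has already done work cannot end up reporting "not mine".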