2016-12-10 00:26:22

by Bjorn Helgaas

Subject: Should xhci_irq() call usb_hc_died()?

Hi Mathias,

ehci_irq(), ohci_irq(), fotg210_irq(), and oxu210_hcd_irq() contain code
equivalent to this:

  status = ehci_readl(...);
  if (status == ~(u32) 0) {
    ...
    usb_hc_died(hcd);
    ...
    return IRQ_HANDLED;
  }

xhci_irq() has a similar check, but does not call usb_hc_died():

  status = readl(...);
  if (status == 0xffffffff) {
    ...
    return IRQ_HANDLED;
  }

Should xhci_irq() also call usb_hc_died()? Maybe there's some reason
for it to be different than the others, but it wasn't obvious to this
casual observer :)

Bjorn
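
The pattern in question can be sketched in userspace C. This is only a mock under stated assumptions, not the xhci driver: mock_hcd, mock_read_status(), mock_hc_died(), and mock_irq() are hypothetical stand-ins for the HCD, readl(), usb_hc_died(), and the interrupt handler. It shows the shape of the ehci/ohci-style check, where an all-ones status read is treated as "the controller is gone" and reported before returning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
enum mock_irqreturn { MOCK_IRQ_NONE, MOCK_IRQ_HANDLED };

struct mock_hcd {
	uint32_t status_reg;	/* value the next status read will return */
	bool died;		/* set once the controller is reported dead */
};

/* Stand-in for readl() on the status register. */
static uint32_t mock_read_status(const struct mock_hcd *hcd)
{
	return hcd->status_reg;
}

/* Stand-in for usb_hc_died(): here it only records the report. */
static void mock_hc_died(struct mock_hcd *hcd)
{
	hcd->died = true;
}

/* Handler shaped like the ehci/ohci snippet quoted in the thread. */
static enum mock_irqreturn mock_irq(struct mock_hcd *hcd)
{
	uint32_t status;

	assert(hcd != NULL);
	status = mock_read_status(hcd);
	if (status == ~(uint32_t)0) {
		/* All ones: assume the hardware is gone and report it. */
		mock_hc_died(hcd);
		return MOCK_IRQ_HANDLED;
	}

	/* Normal event handling is elided in this sketch. */
	return MOCK_IRQ_HANDLED;
}
```

With status_reg set to 0xffffffff, mock_irq() marks the mock controller as dead. Any other value leaves the died flag untouched.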


2016-12-12 08:44:52

by Felipe Balbi

Subject: Re: Should xhci_irq() call usb_hc_died()?


Hi,

Bjorn Helgaas writes:
> Hi Mathias,
>
> ehci_irq(), ohci_irq(), fotg210_irq(), and oxu210_hcd_irq() contain code
> equivalent to this:
>
>   status = ehci_readl(...);
>   if (status == ~(u32) 0) {
>     ...
>     usb_hc_died(hcd);
>     ...
>     return IRQ_HANDLED;
>   }
>
> xhci_irq() has a similar check, but does not call usb_hc_died():
>
>   status = readl(...);
>   if (status == 0xffffffff) {
>     ...
>     return IRQ_HANDLED;
>   }
>
> Should xhci_irq() also call usb_hc_died()? Maybe there's some reason
> for it to be different than the others, but it wasn't obvious to this
> casual observer :)

You might just have fixed several bugs in dealing with a dead HC :-)

Can you provide a patch? (well, unless Mathias has a strong reason not
to call usb_hc_died(), of course).

--
balbi



2016-12-12 10:47:57

by Mathias Nyman

Subject: Re: Should xhci_irq() call usb_hc_died()?

On 12.12.2016 10:43, Felipe Balbi wrote:
>
> Hi,
>
> Bjorn Helgaas writes:
>> Hi Mathias,
>>
>> ehci_irq(), ohci_irq(), fotg210_irq(), and oxu210_hcd_irq() contain code
>> equivalent to this:
>>
>>   status = ehci_readl(...);
>>   if (status == ~(u32) 0) {
>>     ...
>>     usb_hc_died(hcd);
>>     ...
>>     return IRQ_HANDLED;
>>   }
>>
>> xhci_irq() has a similar check, but does not call usb_hc_died():
>>
>>   status = readl(...);
>>   if (status == 0xffffffff) {
>>     ...
>>     return IRQ_HANDLED;
>>   }
>>
>> Should xhci_irq() also call usb_hc_died()? Maybe there's some reason
>> for it to be different than the others, but it wasn't obvious to this
>> casual observer :)

It probably should. I'm not aware of any reason why not, and a quick look at the
logs didn't reveal anything.

Currently we call usb_hc_died() in a couple of timeout cases if we read
0xffffffff from the PCI registers, so eventually usb_hc_died() will be called.

I'll take a look at this in more detail.
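
The timeout path described above can be sketched as a polling loop that distinguishes "register never reached the expected value" from "register reads all ones". This is a userspace mock under assumptions, not driver code: mock_reg, mock_reg_read(), and mock_handshake() are hypothetical names, and the scripted register values exist only for illustration. A caller that gets -ENODEV back would be the one to report the controller as dead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register source that replays a scripted list of values. */
struct mock_reg {
	const uint32_t *values;
	size_t count;	/* must be at least 1 */
	size_t next;
};

/* Return the next scripted value; repeat the last one when the script ends. */
static uint32_t mock_reg_read(struct mock_reg *reg)
{
	size_t i;

	assert(reg->count > 0);
	i = reg->next < reg->count ? reg->next++ : reg->count - 1;
	return reg->values[i];
}

/*
 * Poll until (value & mask) == done, for at most max_polls reads.
 * Returns 0 on success, -ENODEV if a read returns all ones,
 * and -ETIMEDOUT if the condition is never met.
 */
static int mock_handshake(struct mock_reg *reg, uint32_t mask,
			  uint32_t done, unsigned int max_polls)
{
	while (max_polls-- > 0) {
		uint32_t value = mock_reg_read(reg);

		/* Check for all ones first: it could otherwise match the mask. */
		if (value == ~(uint32_t)0)
			return -ENODEV;
		if ((value & mask) == done)
			return 0;
	}
	return -ETIMEDOUT;
}
```

The all-ones check comes before the mask comparison on purpose: a vanished device would otherwise look like success whenever done equals mask.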

>
> you might just have fixed several bugs in dealing with a dead HC :-)
>
> Can you provide a patch? (well, unless Mathias has a strong reason not
> to call usb_hc_died(), of course).

I don't think this is the worst case. There are a couple of other problem cases, such
as the normal PCI remove case, where we halt the host and reset the hardware after the
first HCD (USB2) is removed, while the secondary HCD (USB3) and all its devices are
still connected.

Then there is the abnormal case where the HC disappears: we may time out while giving
back a killed URB and end up never returning it. The USB core waits for the URB with
the roothub device lock held, and we try to tear down xhci, which also requires the
roothub device lock at some point -> deadlock.
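
The lock dependency in the abnormal case can be modelled with a small single-threaded sketch. Everything here is hypothetical and heavily simplified: mock_lock stands in for the roothub device lock, mock_core_wait_for_urb() for the USB core waiting on the killed URB, and mock_teardown() for the xhci teardown path. A try-lock is used so the sketch reports the conflict instead of hanging.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, single-threaded model of a lock: just a "held" flag. */
struct mock_lock {
	bool held;
};

struct mock_roothub {
	struct mock_lock lock;	/* stands in for the roothub device lock */
	bool urb_given_back;	/* stays false if the giveback timed out */
};

static bool mock_trylock(struct mock_lock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void mock_unlock(struct mock_lock *lock)
{
	lock->held = false;
}

/*
 * Core side: take the lock, then wait for the URB.
 * Returns true if it is still waiting, in which case the lock stays held.
 */
static bool mock_core_wait_for_urb(struct mock_roothub *rh)
{
	if (!mock_trylock(&rh->lock))
		return true;
	if (rh->urb_given_back) {
		mock_unlock(&rh->lock);
		return false;
	}
	return true;
}

/*
 * Teardown side: needs the same lock.
 * Returns -EBUSY where a blocking lock would wait forever.
 */
static int mock_teardown(struct mock_roothub *rh)
{
	if (!mock_trylock(&rh->lock))
		return -EBUSY;
	mock_unlock(&rh->lock);
	return 0;
}
```

If the URB is never given back, the waiter keeps the lock and mock_teardown() returns -EBUSY. With a blocking lock, that is the point where both sides would wait on each other.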

I am looking at these, but I need to make sure I fix it properly and don't cause even
more issues.

-Mathias