2018-10-21 17:42:50

by Aaron Ma

Subject: [PATCH 1/2] usb: xhci: fix uninitialized completion when USB3 port got wrong status

Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
Cannon Lake PCH USB3.1 xHCI [8086:a36d] after resume from S3.
After the port reset is cleared, it works fine.

Since this device is registered on the USB3 roothub at boot,
when the port status reports a non-SuperSpeed link, xhci_get_port_status
will use an uninitialized completion in bus_state[0].
The kernel will hang because of a NULL pointer.

Restrict the USB2 resume status check to the USB2 roothub to fix the hang.
There is no harm in also initializing the USB3 bus_state[0] completions
in case they are used.

Signed-off-by: Aaron Ma <[email protected]>
---
drivers/usb/host/xhci-hub.c | 2 +-
drivers/usb/host/xhci-mem.c | 1 +
drivers/usb/host/xhci-ring.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 7e2a531ba321..d30ca6ceffc9 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
 			status |= USB_PORT_STAT_SUSPEND;
 	}
 	if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
-		!DEV_SUPERSPEED_ANY(raw_port_status)) {
+		!DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
 		if ((raw_port_status & PORT_RESET) ||
 				!(raw_port_status & PORT_PE))
 			return 0xffffffff;
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index b1f27aa38b10..dd2ad50c5289 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2539,6 +2539,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
 		xhci->bus_state[0].resume_done[i] = 0;
 		xhci->bus_state[1].resume_done[i] = 0;
 		/* Only the USB 2.0 completions will ever be used. */
+		init_completion(&xhci->bus_state[0].rexit_done[i]);
 		init_completion(&xhci->bus_state[1].rexit_done[i]);
 	}

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f0a99aa0ac58..894d4625b8b9 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1634,7 +1634,7 @@ static void handle_port_status(struct xhci_hcd *xhci,
 	 * RExit to a disconnect state). If so, let the the driver know it's
 	 * out of the RExit state.
 	 */
-	if (!DEV_SUPERSPEED_ANY(portsc) &&
+	if (!DEV_SUPERSPEED_ANY(portsc) && 1 == hcd_index(hcd) &&
 			test_and_clear_bit(hcd_portnum,
 				&bus_state->rexit_ports)) {
 		complete(&bus_state->rexit_done[hcd_portnum]);
--
2.19.1



2018-10-21 17:45:06

by Aaron Ma

Subject: [PATCH 2/2] usb: xhci: fix timeout for transition from RExit to U0

This definition is passed to msecs_to_jiffies(), so it must be in
milliseconds. According to the comment, the maximum RExit timeout
should be 20 ms. Align the value with the comment so that the delay
is calculated properly.

Verified on Sunrise Point-LP and Cannon Lake.

Signed-off-by: Aaron Ma <[email protected]>
---
drivers/usb/host/xhci.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 6230a578324c..30225c53be1c 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1678,7 +1678,7 @@ struct xhci_bus_state {
* It can take up to 20 ms to transition from RExit to U0 on the
* Intel Lynx Point LP xHCI host.
*/
-#define XHCI_MAX_REXIT_TIMEOUT (20 * 1000)
+#define XHCI_MAX_REXIT_TIMEOUT 20

static inline unsigned int hcd_index(struct usb_hcd *hcd)
{
--
2.19.1


2018-10-21 18:31:20

by Greg Kroah-Hartman

Subject: Re: [PATCH 2/2] usb: xhci: fix timeout for transition from RExit to U0

On Mon, Oct 22, 2018 at 01:08:45AM +0800, Aaron Ma wrote:
> This definition is used by msecs_to_jiffies in milliseconds.
> According to the comments, max rexit timeout should be 20ms.
> Align with the comments to properly calculate the delay.
>
> Verified on Sunrise Point-LP and Cannon Lake.
>
> Signed-off-by: Aaron Ma <[email protected]>
> ---
> drivers/usb/host/xhci.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index 6230a578324c..30225c53be1c 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1678,7 +1678,7 @@ struct xhci_bus_state {
> * It can take up to 20 ms to transition from RExit to U0 on the
> * Intel Lynx Point LP xHCI host.
> */
> -#define XHCI_MAX_REXIT_TIMEOUT (20 * 1000)
> +#define XHCI_MAX_REXIT_TIMEOUT 20

Can we put the units in the #define itself so that this will be more
obvious in the future? Like XHCI_MAX_REXIT_TIMEOUT_MS?

thanks,

greg k-h
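
Note: the unit mismatch discussed above can be illustrated with a small
userspace sketch. This is not kernel code. SKETCH_HZ is a hypothetical
tick rate, sketch_msecs_to_jiffies() is a simplified stand-in for the
kernel's msecs_to_jiffies() (round up to whole ticks), and
XHCI_MAX_REXIT_TIMEOUT_MS is only the name proposed in the review, not
necessarily what was merged.

```c
#include <assert.h>

/* Hypothetical tick rate, for illustration only. */
#define SKETCH_HZ 250

/* Value before the patch: interpreted as milliseconds, this is 20 s. */
#define OLD_XHCI_MAX_REXIT_TIMEOUT (20 * 1000)
/* Value after the patch, using the unit-suffixed name proposed in review. */
#define XHCI_MAX_REXIT_TIMEOUT_MS 20

/* Simplified stand-in for msecs_to_jiffies(): ms to ticks, rounded up. */
static unsigned long sketch_msecs_to_jiffies(unsigned int ms)
{
	return ((unsigned long)ms * SKETCH_HZ + 999) / 1000;
}

/* Convert ticks back to whole milliseconds to show the effective wait. */
static unsigned long sketch_jiffies_to_msecs(unsigned long ticks)
{
	return ticks * 1000 / SKETCH_HZ;
}
```

With these assumptions the old value waits 1000 times longer than the
20 ms the comment describes, which is why carrying the unit in the
macro name helps a reader spot the mismatch at the call site.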

2018-10-22 03:45:07

by Aaron Ma

Subject: Re: [PATCH 2/2] usb: xhci: fix timeout for transition from RExit to U0


On 10/22/18 2:21 AM, Greg KH wrote:
> Can we put the units in the #define itself so that this will be more
> obvious in the future? Like XHCI_MAX_REXIT_TIMEOUT_MS?

If there are no other concerns about these two patches,
I will send a V2 of the second patch following your advice.

Thanks,
Aaron

2018-10-22 13:36:22

by Mathias Nyman

Subject: Re: [PATCH 1/2] usb: xhci: fix uninitialized completion when USB3 port got wrong status

On 21.10.2018 20:08, Aaron Ma wrote:
> Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
> Cannon lake PCH USB3.1 xHCI [8086:a36d] after resume from S3,
> after clear port reset it works fine.
>
> Since this device is registered on USB3 roothub at boot,
> when port status reports not superspeed, xhci_get_port_status will call
> an uninitialized completion in bus_state[0].
> Kernel will hang because of NULL pointer.
>
> Restrict the USB2 resume status check in USB2 roothub to fix hang issue.
> No harm to initialize USB3 bus_state[0] in case it is called.
>
> Signed-off-by: Aaron Ma <[email protected]>
> ---
> drivers/usb/host/xhci-hub.c | 2 +-
> drivers/usb/host/xhci-mem.c | 1 +
> drivers/usb/host/xhci-ring.c | 2 +-
> 3 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 7e2a531ba321..d30ca6ceffc9 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
> status |= USB_PORT_STAT_SUSPEND;
> }
> if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
> - !DEV_SUPERSPEED_ANY(raw_port_status)) {
> + !DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
> if ((raw_port_status & PORT_RESET) ||
> !(raw_port_status & PORT_PE))
> return 0xffffffff;

Nice catch.

Maybe use "hcd->speed < HCD_USB3" instead of "1 == hcd_index(hcd)".
It's easier to understand.

Turns out this isn't an issue with your Realtek device; it just happens to trigger
a driver issue.

The original !DEV_SUPERSPEED_ANY() check was not suitable here.
It checks the port-speed field of the portsc register (bits 13:10), which is only valid for USB3
ports once all link training is done and the port has reached its "enabled" state.
Otherwise it will return 0, and USB3 ports may be mistaken for USB2 ports.

Just to make sure: your device stays a USB3 device and is never
enumerated as USB2, right?

I'm in the middle of refactoring get_port_status(); it should solve this
as well, but we need your solution for stable releases.

Any chance you could check whether the refactored code works with the Realtek device?
I just created a "get_port_status_refactor" branch for it:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git get_port_status_refactor

> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
> index b1f27aa38b10..dd2ad50c5289 100644
> --- a/drivers/usb/host/xhci-mem.c
> +++ b/drivers/usb/host/xhci-mem.c
> @@ -2539,6 +2539,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
> xhci->bus_state[0].resume_done[i] = 0;
> xhci->bus_state[1].resume_done[i] = 0;
> /* Only the USB 2.0 completions will ever be used. */
> + init_completion(&xhci->bus_state[0].rexit_done[i]);
> init_completion(&xhci->bus_state[1].rexit_done[i]);
> }

I don't think we should unnecessarily init the completion for USB3 ports.

>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index f0a99aa0ac58..894d4625b8b9 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -1634,7 +1634,7 @@ static void handle_port_status(struct xhci_hcd *xhci,
> * RExit to a disconnect state). If so, let the the driver know it's
> * out of the RExit state.
> */
> - if (!DEV_SUPERSPEED_ANY(portsc) &&
> + if (!DEV_SUPERSPEED_ANY(portsc) && 1 == hcd_index(hcd) &&
> test_and_clear_bit(hcd_portnum,
> &bus_state->rexit_ports)) {
> complete(&bus_state->rexit_done[hcd_portnum]);
>

Same here, prefer hcd->speed < HCD_USB3

Thanks
Mathias
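
Note: the misclassification described in this reply can be sketched in
userspace C. The SK_* macros below are modeled on the port-speed field
at bits 13:10 mentioned above. Their exact values, the sk_roothub enum
and both helper functions are assumptions made for illustration and are
not copied from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Port-speed field of PORTSC, bits 13:10 (values assumed for this sketch). */
#define SK_DEV_SPEED_MASK	(0xfu << 10)
#define SK_XDEV_HS		(0x3u << 10)
#define SK_XDEV_SS		(0x4u << 10)
#define SK_DEV_SUPERSPEED_ANY(p) (((p) & SK_DEV_SPEED_MASK) >= SK_XDEV_SS)

/* Hypothetical tag for the roothub a port belongs to. */
enum sk_roothub { SK_ROOTHUB_USB2, SK_ROOTHUB_USB3 };

/* Original style of check: trusts the speed field alone. */
static bool sk_treated_as_usb2_old(uint32_t portsc)
{
	return !SK_DEV_SUPERSPEED_ANY(portsc);
}

/* Guarded check: the roothub decides, the speed field only refines. */
static bool sk_treated_as_usb2_guarded(uint32_t portsc, enum sk_roothub hub)
{
	return hub == SK_ROOTHUB_USB2 && !SK_DEV_SUPERSPEED_ANY(portsc);
}
```

A USB3 port whose speed field still reads 0 (link training not
finished) passes the old check and is handled as a USB2 port, while the
guarded check rejects it because the port sits on the USB3 roothub.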

2018-10-22 13:38:40

by Aaron Ma

Subject: Re: [PATCH 1/2] usb: xhci: fix uninitialized completion when USB3 port got wrong status



On 10/22/18 9:12 PM, Mathias Nyman wrote:
> On 21.10.2018 20:08, Aaron Ma wrote:
>> Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
>> Cannon lake PCH USB3.1 xHCI [8086:a36d] after resume from S3,
>> after clear port reset it works fine.
>>
>> Since this device is registered on USB3 roothub at boot,
>> when port status reports not superspeed, xhci_get_port_status will call
>> an uninitialized completion in bus_state[0].
>> Kernel will hang because of NULL pointer.
>>
>> Restrict the USB2 resume status check in USB2 roothub to fix hang issue.
>> No harm to initialize USB3 bus_state[0] in case it is called.
>>
>> Signed-off-by: Aaron Ma <[email protected]>
>> ---
>>   drivers/usb/host/xhci-hub.c  | 2 +-
>>   drivers/usb/host/xhci-mem.c  | 1 +
>>   drivers/usb/host/xhci-ring.c | 2 +-
>>   3 files changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
>> index 7e2a531ba321..d30ca6ceffc9 100644
>> --- a/drivers/usb/host/xhci-hub.c
>> +++ b/drivers/usb/host/xhci-hub.c
>> @@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
>>               status |= USB_PORT_STAT_SUSPEND;
>>       }
>>       if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
>> -        !DEV_SUPERSPEED_ANY(raw_port_status)) {
>> +        !DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
>>           if ((raw_port_status & PORT_RESET) ||
>>                   !(raw_port_status & PORT_PE))
>>               return 0xffffffff;
>
> Nice catch.
>
> Maybe use "hcd->speed < HCD_USB3" instead of "1 == hcd_index(hcd)"
> It's easier to understand.
>
> Turns out this isn't an issue with your Realtek device, it just happens
> to trigger
> a driver issue.
>
> The original !DEV_SUPERSPEED_ANY() check was not suitable here.
> It checks the port-speed field of portsc register (bits 13:10), which
> are only valid for USB3
> ports if all link training is done and port reached its "enabled" state.
> Otherwise it will return 0, and USB3 ports may be mistaken for USB2 ports.
>
> Just to make sure, Does your device stay as a USB 3 device, it's never
> enumerated as USB2, right?
>
> I'm in the middle of refactoring the get_port_status(), it should solve
> this
> as well, but we need your solution stable releases.
>
> Any chance you to check if the refactored code works with the Realtek
> device?
> I just created a "get_port_status_refactor" branch for it:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
> get_port_status_refactor

Let me try your branch, please wait a moment.

>
>> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
>> index b1f27aa38b10..dd2ad50c5289 100644
>> --- a/drivers/usb/host/xhci-mem.c
>> +++ b/drivers/usb/host/xhci-mem.c
>> @@ -2539,6 +2539,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t
>> flags)
>>           xhci->bus_state[0].resume_done[i] = 0;
>>           xhci->bus_state[1].resume_done[i] = 0;
>>           /* Only the USB 2.0 completions will ever be used. */
>> +        init_completion(&xhci->bus_state[0].rexit_done[i]);
>>           init_completion(&xhci->bus_state[1].rexit_done[i]);
>>       }
>
> I don't think we should init the completion unnecessary for USB3 ports.
>
>>   diff --git a/drivers/usb/host/xhci-ring.c
>> b/drivers/usb/host/xhci-ring.c
>> index f0a99aa0ac58..894d4625b8b9 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -1634,7 +1634,7 @@ static void handle_port_status(struct xhci_hcd
>> *xhci,
>>        * RExit to a disconnect state).  If so, let the the driver know
>> it's
>>        * out of the RExit state.
>>        */
>> -    if (!DEV_SUPERSPEED_ANY(portsc) &&
>> +    if (!DEV_SUPERSPEED_ANY(portsc) && 1 == hcd_index(hcd) &&
>>               test_and_clear_bit(hcd_portnum,
>>                   &bus_state->rexit_ports)) {
>>           complete(&bus_state->rexit_done[hcd_portnum]);
>>
>
> Same here, prefer hcd->speed < HCD_USB3

Yes, I thought about this. bus_state[1] and bus_state[0] are defined
for USB2 and USB3 respectively, so I used "1 == hcd_index(hcd)".

Anyway, I will send a V2 following your suggestion.

Thanks,
Aaron

>
> Thanks
> Mathias
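
Note: the two spellings of the guard discussed in this exchange should
select the same roothub. The sketch below assumes that hcd_index()
returns 0 for a USB3 (or faster) hcd and 1 otherwise, matching the
bus_state[1/0] mapping described above. The SK_HCD_* speed codes and
struct sk_hcd are stand-ins chosen for illustration, not the kernel
definitions.

```c
#include <assert.h>

/* Speed codes, assumed for this sketch (ordered so USB2 < USB3 < USB3.1). */
#define SK_HCD_USB2	0x0020
#define SK_HCD_USB3	0x0040
#define SK_HCD_USB31	0x0050

/* Minimal stand-in for struct usb_hcd: only the speed matters here. */
struct sk_hcd {
	int speed;
};

/* Assumed behaviour of hcd_index(): 0 is the USB3 roothub, 1 is USB2. */
static unsigned int sk_hcd_index(const struct sk_hcd *hcd)
{
	return hcd->speed >= SK_HCD_USB3 ? 0 : 1;
}

/* Guard as written in V1 of the patch. */
static int sk_is_usb2_by_index(const struct sk_hcd *hcd)
{
	return 1 == sk_hcd_index(hcd);
}

/* Guard as suggested in review. */
static int sk_is_usb2_by_speed(const struct sk_hcd *hcd)
{
	return hcd->speed < SK_HCD_USB3;
}
```

Under that assumption both guards agree for every speed, so the choice
between them is about readability rather than behaviour.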

2018-10-22 18:46:24

by Aaron Ma

Subject: Re: [PATCH 1/2] usb: xhci: fix uninitialized completion when USB3 port got wrong status



On 10/22/18 9:12 PM, Mathias Nyman wrote:
> On 21.10.2018 20:08, Aaron Ma wrote:
>> Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
>> Cannon lake PCH USB3.1 xHCI [8086:a36d] after resume from S3,
>> after clear port reset it works fine.
>>
>> Since this device is registered on USB3 roothub at boot,
>> when port status reports not superspeed, xhci_get_port_status will call
>> an uninitialized completion in bus_state[0].
>> Kernel will hang because of NULL pointer.
>>
>> Restrict the USB2 resume status check in USB2 roothub to fix hang issue.
>> No harm to initialize USB3 bus_state[0] in case it is called.
>>
>> Signed-off-by: Aaron Ma <[email protected]>
>> ---
>>   drivers/usb/host/xhci-hub.c  | 2 +-
>>   drivers/usb/host/xhci-mem.c  | 1 +
>>   drivers/usb/host/xhci-ring.c | 2 +-
>>   3 files changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
>> index 7e2a531ba321..d30ca6ceffc9 100644
>> --- a/drivers/usb/host/xhci-hub.c
>> +++ b/drivers/usb/host/xhci-hub.c
>> @@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
>>               status |= USB_PORT_STAT_SUSPEND;
>>       }
>>       if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
>> -        !DEV_SUPERSPEED_ANY(raw_port_status)) {
>> +        !DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
>>           if ((raw_port_status & PORT_RESET) ||
>>                   !(raw_port_status & PORT_PE))
>>               return 0xffffffff;
>
> Nice catch.
>
> Maybe use "hcd->speed < HCD_USB3" instead of "1 == hcd_index(hcd)"
> It's easier to understand.
>
> Turns out this isn't an issue with your Realtek device, it just happens
> to trigger
> a driver issue.
>
> The original !DEV_SUPERSPEED_ANY() check was not suitable here.
> It checks the port-speed field of portsc register (bits 13:10), which
> are only valid for USB3
> ports if all link training is done and port reached its "enabled" state.
> Otherwise it will return 0, and USB3 ports may be mistaken for USB2 ports.

PORT_ENABLE should already be set to one.
A card reader with the same device ID doesn't have this issue on Sunrise Point.
Maybe it is related to the Cannon Lake PCH USB controller?

>
> Just to make sure, Does your device stay as a USB 3 device, it's never
> enumerated as USB2, right?
>

Right, always USB3.

> I'm in the middle of refactoring the get_port_status(), it should solve
> this
> as well, but we need your solution stable releases.
>

V2 sent out. Cc-ed stable.

> Any chance you to check if the refactored code works with the Realtek
> device?
> I just created a "get_port_status_refactor" branch for it:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
> get_port_status_refactor

The hang does not reproduce on this kernel branch.

Thanks,
Aaron

>
>> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
>> index b1f27aa38b10..dd2ad50c5289 100644
>> --- a/drivers/usb/host/xhci-mem.c
>> +++ b/drivers/usb/host/xhci-mem.c
>> @@ -2539,6 +2539,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t
>> flags)
>>           xhci->bus_state[0].resume_done[i] = 0;
>>           xhci->bus_state[1].resume_done[i] = 0;
>>           /* Only the USB 2.0 completions will ever be used. */
>> +        init_completion(&xhci->bus_state[0].rexit_done[i]);
>>           init_completion(&xhci->bus_state[1].rexit_done[i]);
>>       }
>
> I don't think we should init the completion unnecessary for USB3 ports.
>
>>   diff --git a/drivers/usb/host/xhci-ring.c
>> b/drivers/usb/host/xhci-ring.c
>> index f0a99aa0ac58..894d4625b8b9 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -1634,7 +1634,7 @@ static void handle_port_status(struct xhci_hcd
>> *xhci,
>>        * RExit to a disconnect state).  If so, let the the driver know
>> it's
>>        * out of the RExit state.
>>        */
>> -    if (!DEV_SUPERSPEED_ANY(portsc) &&
>> +    if (!DEV_SUPERSPEED_ANY(portsc) && 1 == hcd_index(hcd) &&
>>               test_and_clear_bit(hcd_portnum,
>>                   &bus_state->rexit_ports)) {
>>           complete(&bus_state->rexit_done[hcd_portnum]);
>>
>
> Same here, prefer hcd->speed < HCD_USB3
>
> Thanks
> Mathias

2018-10-23 10:36:52

by Mathias Nyman

Subject: Re: [PATCH 1/2] usb: xhci: fix uninitialized completion when USB3 port got wrong status

On 22.10.2018 20:53, Aaron Ma wrote:
>
>
> On 10/22/18 9:12 PM, Mathias Nyman wrote:
>> On 21.10.2018 20:08, Aaron Ma wrote:
>>> Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
>>> Cannon lake PCH USB3.1 xHCI [8086:a36d] after resume from S3,
>>> after clear port reset it works fine.
>>>
>>> Since this device is registered on USB3 roothub at boot,
>>> when port status reports not superspeed, xhci_get_port_status will call
>>> an uninitialized completion in bus_state[0].
>>> Kernel will hang because of NULL pointer.
>>>
>>> Restrict the USB2 resume status check in USB2 roothub to fix hang issue.
>>> No harm to initialize USB3 bus_state[0] in case it is called.
>>>
>>> Signed-off-by: Aaron Ma <[email protected]>
>>> ---
>>>   drivers/usb/host/xhci-hub.c  | 2 +-
>>>   drivers/usb/host/xhci-mem.c  | 1 +
>>>   drivers/usb/host/xhci-ring.c | 2 +-
>>>   3 files changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
>>> index 7e2a531ba321..d30ca6ceffc9 100644
>>> --- a/drivers/usb/host/xhci-hub.c
>>> +++ b/drivers/usb/host/xhci-hub.c
>>> @@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
>>>               status |= USB_PORT_STAT_SUSPEND;
>>>       }
>>>       if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
>>> -        !DEV_SUPERSPEED_ANY(raw_port_status)) {
>>> +        !DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
>>>           if ((raw_port_status & PORT_RESET) ||
>>>                   !(raw_port_status & PORT_PE))
>>>               return 0xffffffff;
>>
>> The original !DEV_SUPERSPEED_ANY() check was not suitable here.
>> It checks the port-speed field of portsc register (bits 13:10), which
>> are only valid for USB3
>> ports if all link training is done and port reached its "enabled" state.
>> Otherwise it will return 0, and USB3 ports may be mistaken for USB2 ports.
>
> PORT_ENABLE should be already set to one.
> The same device ID card reader doesn't have issue on Sunrise Point.
> Maybe it is related to Cannon lake PCH USB controller?
>

Ok, thanks for the info

>
> V2 sent out. Cc-ed stable.
>
>> Any chance you to check if the refactored code works with the Realtek
>> device?
>> I just created a "get_port_status_refactor" branch for it:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
>> get_port_status_refactor
>
> The hang issue is not reproduced on this kernel branch.
>

Great, thanks for testing it

-Mathias