2021-05-10 14:55:38

by Chris Chiu

[permalink] [raw]
Subject: [PATCH v2 0/2] USB: propose a generic fix for PORT_SUSPEND set feature timeout

From: Chris Chiu <[email protected]>

For the Realtek Hub which fails to resume the port which has wakeup
enable descendants, trying to create a more generic and better option
to have the runtime suspend/resume work instead of a reset-resume quirk.

Chris Chiu (2):
USB: reset-resume the device when PORT_SUSPEND is set but timeout
Revert "USB: Add reset-resume quirk for WD19's Realtek Hub"

drivers/usb/core/hub.c | 15 +++++++++++++++
drivers/usb/core/quirks.c | 1 -
2 files changed, 15 insertions(+), 1 deletion(-)

--
2.20.1


2021-05-10 14:55:41

by Chris Chiu

[permalink] [raw]
Subject: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

From: Chris Chiu <[email protected]>

On the Realtek high-speed Hub(0bda:5487), the port which has wakeup
enabled_descendants will sometimes timeout when setting PORT_SUSPEND
feature. After checking the PORT_SUSPEND bit in wPortStatus, it is
already set. However, the hub will fail to activate because the
PORT_SUSPEND feature of that port is not cleared during resume. All
connected devices are lost after resume.

This commit force reset-resume the device connected to the timeout
but suspended port so that the hub will have chance to clear the
PORT_SUSPEND feature during resume.

Signed-off-by: Chris Chiu <[email protected]>
---

Changelog:
v2:
- create a new variable to keep the result of hub_port_status
when suspend timeout.

drivers/usb/core/hub.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index b2bc4b7c4289..3c823544e425 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -3385,6 +3385,21 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg)
status = 0;
}
if (status) {
+ if (status == -ETIMEDOUT) {
+ u16 portstatus, portchange;
+
+ int ret = hub_port_status(hub, port1, &portstatus,
+ &portchange);
+
+ dev_dbg(&port_dev->dev,
+ "suspend timeout, status %04x\n", portstatus);
+
+ if (ret == 0 && port_is_suspended(hub, portstatus)) {
+ udev->reset_resume = 1;
+ goto err_wakeup;
+ }
+ }
+
dev_dbg(&port_dev->dev, "can't suspend, status %d\n", status);

/* Try to enable USB3 LTM again */
--
2.20.1

2021-05-10 14:55:55

by Chris Chiu

[permalink] [raw]
Subject: [PATCH v2 2/2] Revert "USB: Add reset-resume quirk for WD19's Realtek Hub"

From: Chris Chiu <[email protected]>

This reverts commit ca91fd8c7643d93bfc18a6fec1a0d3972a46a18a. The
problematic hub should be taken care for each setting PORT_SUSPEND
feature timeout instead of reset-resume all the time.

Signed-off-by: Chris Chiu <[email protected]>
---
drivers/usb/core/quirks.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
index 21e7522655ac..6114cf83bb44 100644
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -406,7 +406,6 @@ static const struct usb_device_id usb_quirk_list[] = {

/* Realtek hub in Dell WD19 (Type-C) */
{ USB_DEVICE(0x0bda, 0x0487), .driver_info = USB_QUIRK_NO_LPM },
- { USB_DEVICE(0x0bda, 0x5487), .driver_info = USB_QUIRK_RESET_RESUME },

/* Generic RTL8153 based ethernet adapters */
{ USB_DEVICE(0x0bda, 0x8153), .driver_info = USB_QUIRK_NO_LPM },
--
2.20.1

2021-05-10 15:06:38

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

On Mon, May 10, 2021 at 10:50:29PM +0800, [email protected] wrote:
> From: Chris Chiu <[email protected]>
>
> On the Realtek high-speed Hub(0bda:5487), the port which has wakeup
> enabled_descendants will sometimes timeout when setting PORT_SUSPEND
> feature. After checking the PORT_SUSPEND bit in wPortStatus, it is
> already set. However, the hub will fail to activate because the
> PORT_SUSPEND feature of that port is not cleared during resume. All
> connected devices are lost after resume.
>
> This commit force reset-resume the device connected to the timeout
> but suspended port so that the hub will have chance to clear the
> PORT_SUSPEND feature during resume.

Are you certain that the reset-resume is needed? What happens if you
leave out the line that sets udev->reset_resume? The rest of the patch
will cause the kernel to realize that the port really is suspended, so
maybe the suspend feature will get cleared properly during resume.

It's worthwhile to try the experiement and see what happens.

Alan Stern

> Signed-off-by: Chris Chiu <[email protected]>
> ---
>
> Changelog:
> v2:
> - create a new variable to keep the result of hub_port_status
> when suspend timeout.
>
> drivers/usb/core/hub.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index b2bc4b7c4289..3c823544e425 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -3385,6 +3385,21 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg)
> status = 0;
> }
> if (status) {
> + if (status == -ETIMEDOUT) {
> + u16 portstatus, portchange;
> +
> + int ret = hub_port_status(hub, port1, &portstatus,
> + &portchange);
> +
> + dev_dbg(&port_dev->dev,
> + "suspend timeout, status %04x\n", portstatus);
> +
> + if (ret == 0 && port_is_suspended(hub, portstatus)) {
> + udev->reset_resume = 1;
> + goto err_wakeup;
> + }
> + }
> +
> dev_dbg(&port_dev->dev, "can't suspend, status %d\n", status);
>
> /* Try to enable USB3 LTM again */
> --
> 2.20.1
>

2021-05-11 05:06:55

by Chris Chiu

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

On Mon, May 10, 2021 at 11:02 PM Alan Stern <[email protected]> wrote:
>
> On Mon, May 10, 2021 at 10:50:29PM +0800, [email protected] wrote:
> > From: Chris Chiu <[email protected]>
> >
> > On the Realtek high-speed Hub(0bda:5487), the port which has wakeup
> > enabled_descendants will sometimes timeout when setting PORT_SUSPEND
> > feature. After checking the PORT_SUSPEND bit in wPortStatus, it is
> > already set. However, the hub will fail to activate because the
> > PORT_SUSPEND feature of that port is not cleared during resume. All
> > connected devices are lost after resume.
> >
> > This commit force reset-resume the device connected to the timeout
> > but suspended port so that the hub will have chance to clear the
> > PORT_SUSPEND feature during resume.
>
> Are you certain that the reset-resume is needed? What happens if you
> leave out the line that sets udev->reset_resume? The rest of the patch
> will cause the kernel to realize that the port really is suspended, so
> maybe the suspend feature will get cleared properly during resume.
>
> It's worthwhile to try the experiement and see what happens.
>
> Alan Stern
>

If I leave out the udev->reset_resume set, the resume will fail. Please refer
to the following kernel log. The usb 1-1 is the hub which has wakeup enabled
descendants.

[ 57.210472] usb 1-1: kworker/u32:7 timed out on ep0out len=0/0
[ 57.211022] usb 1-1-port3: suspend timeout, status 0507
[ 57.211130] hub 1-1:1.0: hub_suspend
[ 57.230500] usb 1-1: usb suspend, wakeup 0

The timeout happens at 57.210472 and you can see the PORT_SUSPEND
bit is actually set in the "status 0507". The following shows the resume log.

[ 58.046556] usb 1-1: usb resume
[ 58.114515] usb 1-1: Waited 0ms for CONNECT
[ 58.114524] usb 1-1: finish resume
[ 58.114928] hub 1-1:1.0: hub_resume
[ 58.116035] usb 1-1-port3: status 0507 change 0000
[ 58.116720] usb 1-1-port5: status 0503 change 0000
[ 58.116778] hub 1-1.3:1.0: hub_resume
[ 58.116908] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
[ 58.116952] usb 1-1.5: Waited 0ms for CONNECT
[ 58.116955] usb 1-1.5: finish resume
[ 58.117157] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
[ 58.117397] usb 1-1.3-port5: can't resume, status -71
[ 58.117782] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
[ 58.118147] usb 1-1.3-port2: can't resume, status -71
[ 58.118149] usb 1-1.3.2: Waited 0ms for CONNECT
[ 58.118151] usb 1-1.3-port2: status 07eb.906e after resume, -19
[ 58.118153] usb 1-1.3.2: can't resume, status -19
[ 58.118154] usb 1-1.3-port2: logical disconnect
[ 58.118526] usb 1-1.3-port2: cannot disable (err = -71)

As you see in the 58.116035, the hub_resume and activate is OK for the
usb 1-1. The "usb 1-1.3: finish resume" is not in the log because it's not
considered suspended and no chance to ClearPortFeature. Then it fails
the subsequent hub 1-1.3 resume and active because the usb 1-1.3 happens
to be a downstream hub. So this is why we need at least udev->reset_resume
to give this kind of timeout port/device a chance to clear port feature and
come back from an unknown state.

Chris

> > Signed-off-by: Chris Chiu <[email protected]>
> > ---
> >
> > Changelog:
> > v2:
> > - create a new variable to keep the result of hub_port_status
> > when suspend timeout.
> >
> > drivers/usb/core/hub.c | 15 +++++++++++++++
> > 1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> > index b2bc4b7c4289..3c823544e425 100644
> > --- a/drivers/usb/core/hub.c
> > +++ b/drivers/usb/core/hub.c
> > @@ -3385,6 +3385,21 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg)
> > status = 0;
> > }
> > if (status) {
> > + if (status == -ETIMEDOUT) {
> > + u16 portstatus, portchange;
> > +
> > + int ret = hub_port_status(hub, port1, &portstatus,
> > + &portchange);
> > +
> > + dev_dbg(&port_dev->dev,
> > + "suspend timeout, status %04x\n", portstatus);
> > +
> > + if (ret == 0 && port_is_suspended(hub, portstatus)) {
> > + udev->reset_resume = 1;
> > + goto err_wakeup;
> > + }
> > + }
> > +
> > dev_dbg(&port_dev->dev, "can't suspend, status %d\n", status);
> >
> > /* Try to enable USB3 LTM again */
> > --
> > 2.20.1
> >

2021-05-11 16:32:08

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

On Tue, May 11, 2021 at 01:04:36PM +0800, Chris Chiu wrote:
> On Mon, May 10, 2021 at 11:02 PM Alan Stern <[email protected]> wrote:
> >
> > On Mon, May 10, 2021 at 10:50:29PM +0800, [email protected] wrote:
> > > From: Chris Chiu <[email protected]>
> > >
> > > On the Realtek high-speed Hub(0bda:5487), the port which has wakeup
> > > enabled_descendants will sometimes timeout when setting PORT_SUSPEND
> > > feature. After checking the PORT_SUSPEND bit in wPortStatus, it is
> > > already set. However, the hub will fail to activate because the
> > > PORT_SUSPEND feature of that port is not cleared during resume. All
> > > connected devices are lost after resume.
> > >
> > > This commit force reset-resume the device connected to the timeout
> > > but suspended port so that the hub will have chance to clear the
> > > PORT_SUSPEND feature during resume.
> >
> > Are you certain that the reset-resume is needed? What happens if you
> > leave out the line that sets udev->reset_resume? The rest of the patch
> > will cause the kernel to realize that the port really is suspended, so
> > maybe the suspend feature will get cleared properly during resume.
> >
> > It's worthwhile to try the experiement and see what happens.
> >
> > Alan Stern
> >
>
> If I leave out the udev->reset_resume set, the resume will fail. Please refer
> to the following kernel log. The usb 1-1 is the hub which has wakeup enabled
> descendants.
>
> [ 57.210472] usb 1-1: kworker/u32:7 timed out on ep0out len=0/0
> [ 57.211022] usb 1-1-port3: suspend timeout, status 0507
> [ 57.211130] hub 1-1:1.0: hub_suspend
> [ 57.230500] usb 1-1: usb suspend, wakeup 0
>
> The timeout happens at 57.210472 and you can see the PORT_SUSPEND
> bit is actually set in the "status 0507". The following shows the resume log.
>
> [ 58.046556] usb 1-1: usb resume
> [ 58.114515] usb 1-1: Waited 0ms for CONNECT
> [ 58.114524] usb 1-1: finish resume
> [ 58.114928] hub 1-1:1.0: hub_resume
> [ 58.116035] usb 1-1-port3: status 0507 change 0000
> [ 58.116720] usb 1-1-port5: status 0503 change 0000
> [ 58.116778] hub 1-1.3:1.0: hub_resume
> [ 58.116908] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> [ 58.116952] usb 1-1.5: Waited 0ms for CONNECT
> [ 58.116955] usb 1-1.5: finish resume
> [ 58.117157] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> [ 58.117397] usb 1-1.3-port5: can't resume, status -71
> [ 58.117782] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> [ 58.118147] usb 1-1.3-port2: can't resume, status -71
> [ 58.118149] usb 1-1.3.2: Waited 0ms for CONNECT
> [ 58.118151] usb 1-1.3-port2: status 07eb.906e after resume, -19
> [ 58.118153] usb 1-1.3.2: can't resume, status -19
> [ 58.118154] usb 1-1.3-port2: logical disconnect
> [ 58.118526] usb 1-1.3-port2: cannot disable (err = -71)
>
> As you see in the 58.116035, the hub_resume and activate is OK for the
> usb 1-1. The "usb 1-1.3: finish resume" is not in the log because it's not
> considered suspended and no chance to ClearPortFeature.

Wait -- why isn't it considered suspended? We saw at 57.211022 that
1-1-port3's Suspend feature really was set, and thanks to your patch,
the kernel should now believe that the port is suspended.

> Then it fails
> the subsequent hub 1-1.3 resume and active because the usb 1-1.3 happens
> to be a downstream hub. So this is why we need at least udev->reset_resume
> to give this kind of timeout port/device a chance to clear port feature and
> come back from an unknown state.

Don't worry about this part. Naturally anything associated with the
1-1.3 hub will fail after the resume of 1-1-port3 is messed up. Fix the
first problem (failure to resume the port) and the second problem is
likely to go away.

Alan Stern

2021-05-12 04:19:07

by Chris Chiu

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

On Wed, May 12, 2021 at 12:30 AM Alan Stern <[email protected]> wrote:
>
> On Tue, May 11, 2021 at 01:04:36PM +0800, Chris Chiu wrote:
> > On Mon, May 10, 2021 at 11:02 PM Alan Stern <[email protected]> wrote:
> > >
> > > On Mon, May 10, 2021 at 10:50:29PM +0800, [email protected] wrote:
> > > > From: Chris Chiu <[email protected]>
> > > >
> > > > On the Realtek high-speed Hub(0bda:5487), the port which has wakeup
> > > > enabled_descendants will sometimes timeout when setting PORT_SUSPEND
> > > > feature. After checking the PORT_SUSPEND bit in wPortStatus, it is
> > > > already set. However, the hub will fail to activate because the
> > > > PORT_SUSPEND feature of that port is not cleared during resume. All
> > > > connected devices are lost after resume.
> > > >
> > > > This commit force reset-resume the device connected to the timeout
> > > > but suspended port so that the hub will have chance to clear the
> > > > PORT_SUSPEND feature during resume.
> > >
> > > Are you certain that the reset-resume is needed? What happens if you
> > > leave out the line that sets udev->reset_resume? The rest of the patch
> > > will cause the kernel to realize that the port really is suspended, so
> > > maybe the suspend feature will get cleared properly during resume.
> > >
> > > It's worthwhile to try the experiement and see what happens.
> > >
> > > Alan Stern
> > >
> >
> > If I leave out the udev->reset_resume set, the resume will fail. Please refer
> > to the following kernel log. The usb 1-1 is the hub which has wakeup enabled
> > descendants.
> >
> > [ 57.210472] usb 1-1: kworker/u32:7 timed out on ep0out len=0/0
> > [ 57.211022] usb 1-1-port3: suspend timeout, status 0507
> > [ 57.211130] hub 1-1:1.0: hub_suspend
> > [ 57.230500] usb 1-1: usb suspend, wakeup 0
> >
> > The timeout happens at 57.210472 and you can see the PORT_SUSPEND
> > bit is actually set in the "status 0507". The following shows the resume log.
> >
> > [ 58.046556] usb 1-1: usb resume
> > [ 58.114515] usb 1-1: Waited 0ms for CONNECT
> > [ 58.114524] usb 1-1: finish resume
> > [ 58.114928] hub 1-1:1.0: hub_resume
> > [ 58.116035] usb 1-1-port3: status 0507 change 0000
> > [ 58.116720] usb 1-1-port5: status 0503 change 0000
> > [ 58.116778] hub 1-1.3:1.0: hub_resume
> > [ 58.116908] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > [ 58.116952] usb 1-1.5: Waited 0ms for CONNECT
> > [ 58.116955] usb 1-1.5: finish resume
> > [ 58.117157] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > [ 58.117397] usb 1-1.3-port5: can't resume, status -71
> > [ 58.117782] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > [ 58.118147] usb 1-1.3-port2: can't resume, status -71
> > [ 58.118149] usb 1-1.3.2: Waited 0ms for CONNECT
> > [ 58.118151] usb 1-1.3-port2: status 07eb.906e after resume, -19
> > [ 58.118153] usb 1-1.3.2: can't resume, status -19
> > [ 58.118154] usb 1-1.3-port2: logical disconnect
> > [ 58.118526] usb 1-1.3-port2: cannot disable (err = -71)
> >
> > As you see in the 58.116035, the hub_resume and activate is OK for the
> > usb 1-1. The "usb 1-1.3: finish resume" is not in the log because it's not
> > considered suspended and no chance to ClearPortFeature.
>
> Wait -- why isn't it considered suspended? We saw at 57.211022 that
> 1-1-port3's Suspend feature really was set, and thanks to your patch,
> the kernel should now believe that the port is suspended.
>
But it's still in the `if (status)` branch so it will not get
usb_set_device_state
to USB_STATE_SUSPENDED, then usb_resume_both will not do the resume
process for it.

My original thought is, we still take this as an abnormal status
because we don't
really know the reason for the timeout. Set reset_resume for the udev
will make the
kernel to reset_resume it. Or I have to create a new `goto` name in
the `else` branch
to force it back to the successful suspended process. And should I
clean the status to
zero for pm_runtime_put_sync()? What's your suggestion?

Chris

> > Then it fails
> > the subsequent hub 1-1.3 resume and active because the usb 1-1.3 happens
> > to be a downstream hub. So this is why we need at least udev->reset_resume
> > to give this kind of timeout port/device a chance to clear port feature and
> > come back from an unknown state.
>
> Don't worry about this part. Naturally anything associated with the
> 1-1.3 hub will fail after the resume of 1-1-port3 is messed up. Fix the
> first problem (failure to resume the port) and the second problem is
> likely to go away.
>
> Alan Stern

2021-05-12 15:20:36

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

On Wed, May 12, 2021 at 12:17:12PM +0800, Chris Chiu wrote:
> On Wed, May 12, 2021 at 12:30 AM Alan Stern <[email protected]> wrote:
> >
> > On Tue, May 11, 2021 at 01:04:36PM +0800, Chris Chiu wrote:
> > > On Mon, May 10, 2021 at 11:02 PM Alan Stern <[email protected]> wrote:
> > > >
> > > > On Mon, May 10, 2021 at 10:50:29PM +0800, [email protected] wrote:
> > > > > From: Chris Chiu <[email protected]>
> > > > >
> > > > > On the Realtek high-speed Hub(0bda:5487), the port which has wakeup
> > > > > enabled_descendants will sometimes timeout when setting PORT_SUSPEND
> > > > > feature. After checking the PORT_SUSPEND bit in wPortStatus, it is
> > > > > already set. However, the hub will fail to activate because the
> > > > > PORT_SUSPEND feature of that port is not cleared during resume. All
> > > > > connected devices are lost after resume.
> > > > >
> > > > > This commit force reset-resume the device connected to the timeout
> > > > > but suspended port so that the hub will have chance to clear the
> > > > > PORT_SUSPEND feature during resume.
> > > >
> > > > Are you certain that the reset-resume is needed? What happens if you
> > > > leave out the line that sets udev->reset_resume? The rest of the patch
> > > > will cause the kernel to realize that the port really is suspended, so
> > > > maybe the suspend feature will get cleared properly during resume.
> > > >
> > > > It's worthwhile to try the experiement and see what happens.
> > > >
> > > > Alan Stern
> > > >
> > >
> > > If I leave out the udev->reset_resume set, the resume will fail. Please refer
> > > to the following kernel log. The usb 1-1 is the hub which has wakeup enabled
> > > descendants.
> > >
> > > [ 57.210472] usb 1-1: kworker/u32:7 timed out on ep0out len=0/0
> > > [ 57.211022] usb 1-1-port3: suspend timeout, status 0507
> > > [ 57.211130] hub 1-1:1.0: hub_suspend
> > > [ 57.230500] usb 1-1: usb suspend, wakeup 0
> > >
> > > The timeout happens at 57.210472 and you can see the PORT_SUSPEND
> > > bit is actually set in the "status 0507". The following shows the resume log.
> > >
> > > [ 58.046556] usb 1-1: usb resume
> > > [ 58.114515] usb 1-1: Waited 0ms for CONNECT
> > > [ 58.114524] usb 1-1: finish resume
> > > [ 58.114928] hub 1-1:1.0: hub_resume
> > > [ 58.116035] usb 1-1-port3: status 0507 change 0000
> > > [ 58.116720] usb 1-1-port5: status 0503 change 0000
> > > [ 58.116778] hub 1-1.3:1.0: hub_resume
> > > [ 58.116908] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > [ 58.116952] usb 1-1.5: Waited 0ms for CONNECT
> > > [ 58.116955] usb 1-1.5: finish resume
> > > [ 58.117157] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > [ 58.117397] usb 1-1.3-port5: can't resume, status -71
> > > [ 58.117782] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > [ 58.118147] usb 1-1.3-port2: can't resume, status -71
> > > [ 58.118149] usb 1-1.3.2: Waited 0ms for CONNECT
> > > [ 58.118151] usb 1-1.3-port2: status 07eb.906e after resume, -19
> > > [ 58.118153] usb 1-1.3.2: can't resume, status -19
> > > [ 58.118154] usb 1-1.3-port2: logical disconnect
> > > [ 58.118526] usb 1-1.3-port2: cannot disable (err = -71)
> > >
> > > As you see in the 58.116035, the hub_resume and activate is OK for the
> > > usb 1-1. The "usb 1-1.3: finish resume" is not in the log because it's not
> > > considered suspended and no chance to ClearPortFeature.
> >
> > Wait -- why isn't it considered suspended? We saw at 57.211022 that
> > 1-1-port3's Suspend feature really was set, and thanks to your patch,
> > the kernel should now believe that the port is suspended.
> >
> But it's still in the `if (status)` branch so it will not get
> usb_set_device_state
> to USB_STATE_SUSPENDED, then usb_resume_both will not do the resume
> process for it.

Ah, yes. I was mis-reading the patch.

> My original thought is, we still take this as an abnormal status
> because we don't
> really know the reason for the timeout. Set reset_resume for the udev
> will make the
> kernel to reset_resume it. Or I have to create a new `goto` name in
> the `else` branch
> to force it back to the successful suspended process. And should I
> clean the status to
> zero for pm_runtime_put_sync()? What's your suggestion?

For testing purposes, set status to 0 and jump to a new goto label in
the "else" branch. In other words, treat it as if the suspend really
had worked and go back to the successful pathway. Try this out and see
if it fixes the problem.

If it does then the reset-resume isn't needed. If it doesn't, post your
patch again, and mention in the patch description that testing shows the
reset-resume really is necessary.

Alan Stern

2021-05-13 04:25:46

by Chris Chiu

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

On Wed, May 12, 2021 at 11:04 PM Alan Stern <[email protected]> wrote:
>
> On Wed, May 12, 2021 at 12:17:12PM +0800, Chris Chiu wrote:
> > On Wed, May 12, 2021 at 12:30 AM Alan Stern <[email protected]> wrote:
> > >
> > > On Tue, May 11, 2021 at 01:04:36PM +0800, Chris Chiu wrote:
> > > > On Mon, May 10, 2021 at 11:02 PM Alan Stern <[email protected]> wrote:
> > > > >
> > > > > On Mon, May 10, 2021 at 10:50:29PM +0800, [email protected] wrote:
> > > > > > From: Chris Chiu <[email protected]>
> > > > > >
> > > > > > On the Realtek high-speed Hub(0bda:5487), the port which has wakeup
> > > > > > enabled_descendants will sometimes timeout when setting PORT_SUSPEND
> > > > > > feature. After checking the PORT_SUSPEND bit in wPortStatus, it is
> > > > > > already set. However, the hub will fail to activate because the
> > > > > > PORT_SUSPEND feature of that port is not cleared during resume. All
> > > > > > connected devices are lost after resume.
> > > > > >
> > > > > > This commit force reset-resume the device connected to the timeout
> > > > > > but suspended port so that the hub will have chance to clear the
> > > > > > PORT_SUSPEND feature during resume.
> > > > >
> > > > > Are you certain that the reset-resume is needed? What happens if you
> > > > > leave out the line that sets udev->reset_resume? The rest of the patch
> > > > > will cause the kernel to realize that the port really is suspended, so
> > > > > maybe the suspend feature will get cleared properly during resume.
> > > > >
> > > > > It's worthwhile to try the experiement and see what happens.
> > > > >
> > > > > Alan Stern
> > > > >
> > > >
> > > > If I leave out the udev->reset_resume set, the resume will fail. Please refer
> > > > to the following kernel log. The usb 1-1 is the hub which has wakeup enabled
> > > > descendants.
> > > >
> > > > [ 57.210472] usb 1-1: kworker/u32:7 timed out on ep0out len=0/0
> > > > [ 57.211022] usb 1-1-port3: suspend timeout, status 0507
> > > > [ 57.211130] hub 1-1:1.0: hub_suspend
> > > > [ 57.230500] usb 1-1: usb suspend, wakeup 0
> > > >
> > > > The timeout happens at 57.210472 and you can see the PORT_SUSPEND
> > > > bit is actually set in the "status 0507". The following shows the resume log.
> > > >
> > > > [ 58.046556] usb 1-1: usb resume
> > > > [ 58.114515] usb 1-1: Waited 0ms for CONNECT
> > > > [ 58.114524] usb 1-1: finish resume
> > > > [ 58.114928] hub 1-1:1.0: hub_resume
> > > > [ 58.116035] usb 1-1-port3: status 0507 change 0000
> > > > [ 58.116720] usb 1-1-port5: status 0503 change 0000
> > > > [ 58.116778] hub 1-1.3:1.0: hub_resume
> > > > [ 58.116908] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > > [ 58.116952] usb 1-1.5: Waited 0ms for CONNECT
> > > > [ 58.116955] usb 1-1.5: finish resume
> > > > [ 58.117157] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > > [ 58.117397] usb 1-1.3-port5: can't resume, status -71
> > > > [ 58.117782] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > > [ 58.118147] usb 1-1.3-port2: can't resume, status -71
> > > > [ 58.118149] usb 1-1.3.2: Waited 0ms for CONNECT
> > > > [ 58.118151] usb 1-1.3-port2: status 07eb.906e after resume, -19
> > > > [ 58.118153] usb 1-1.3.2: can't resume, status -19
> > > > [ 58.118154] usb 1-1.3-port2: logical disconnect
> > > > [ 58.118526] usb 1-1.3-port2: cannot disable (err = -71)
> > > >
> > > > As you see in the 58.116035, the hub_resume and activate is OK for the
> > > > usb 1-1. The "usb 1-1.3: finish resume" is not in the log because it's not
> > > > considered suspended and no chance to ClearPortFeature.
> > >
> > > Wait -- why isn't it considered suspended? We saw at 57.211022 that
> > > 1-1-port3's Suspend feature really was set, and thanks to your patch,
> > > the kernel should now believe that the port is suspended.
> > >
> > But it's still in the `if (status)` branch so it will not get
> > usb_set_device_state
> > to USB_STATE_SUSPENDED, then usb_resume_both will not do the resume
> > process for it.
>
> Ah, yes. I was mis-reading the patch.
>
> > My original thought is, we still take this as an abnormal status
> > because we don't
> > really know the reason for the timeout. Set reset_resume for the udev
> > will make the
> > kernel to reset_resume it. Or I have to create a new `goto` name in
> > the `else` branch
> > to force it back to the successful suspended process. And should I
> > clean the status to
> > zero for pm_runtime_put_sync()? What's your suggestion?
>
> For testing purposes, set status to 0 and jump to a new goto label in
> the "else" branch. In other words, treat it as if the suspend really
> had worked and go back to the successful pathway. Try this out and see
> if it fixes the problem.
>
> If it does then the reset-resume isn't needed. If it doesn't, post your
> patch again, and mention in the patch description that testing shows the
> reset-resume really is necessary.
>
> Alan Stern

Thanks for the suggestion. I revised the patch to create a new goto
label in the "else" branch and clear the status to zero if the port is
really suspended. It fixed the problem in my 100 time suspend/resume
test. I will send v3 patch w/ the modification.

Chris

2021-05-13 22:19:01

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

On Thu, May 13, 2021 at 12:21:52PM +0800, Chris Chiu wrote:
> On Wed, May 12, 2021 at 11:04 PM Alan Stern <[email protected]> wrote:
> >
> > For testing purposes, set status to 0 and jump to a new goto label in
> > the "else" branch. In other words, treat it as if the suspend really
> > had worked and go back to the successful pathway. Try this out and see
> > if it fixes the problem.
> >
> > If it does then the reset-resume isn't needed. If it doesn't, post your
> > patch again, and mention in the patch description that testing shows the
> > reset-resume really is necessary.
> >
> > Alan Stern
>
> Thanks for the suggestion. I revised the patch to create a new goto
> label in the "else" branch and clear the status to zero if the port is
> really suspended. It fixed the problem in my 100 time suspend/resume
> test. I will send v3 patch w/ the modification.

Great! This is a much better solution.

Alan Stern