2022-06-02 16:11:40

by Krishna Kurapati PSSNV

[permalink] [raw]
Subject: [PATCH v20 0/5] USB DWC3 host wake up support from system suspend

Avoiding phy powerdown in host mode when dwc3 is wakeup capable, so that
it can be wake up by devices. Keep usb30_prim gdsc active to retain
controller status during suspend/resume.

Changes in v20:
Fixed nitpicks in dwc3 qcom driver.
Fixed code changes in dwc3 core driver.

Changes in v19:
Fixed dwc3 driver code changes.

Changes in v18:
Fixed minor nit picks in v17 reported by Matthias.

Changes in v17:
Moved the speed check to glue driver.
Powering down phy's solely based on dwc3 wakeup capability.
Configuring the interrupt functions appropriately.

Changes in v16:
Added changes to power down the phy's during suspend only if dwc3
is not wakeup capable.

Changes in v15:
Added patch to enable wakeup for xhci-plat based on children wakeup status.
Used device_wakeup_path instead of device_children_wakeup_capable

Changes in v14:
Added patch for device_children_wakeup_capable.
Used device_children_wakeup_capable instead of usb_wakeup_enabled_descendants.
Fixed minor nit picks in v13 reported by Matthias.

Changes in v13:
Moved the dt bindings patch to start.
Changed dwc3_set_phy_speed_mode to dwc3_check_phy_speed_mode.
Check wakep-source property for dwc3 core node to set the
wakeup capability. Drop the device_init_wakeup call from
runtime suspend and resume.
Added GENPD_FLAG_RPM_ALWAYS_ON and set GENPD_FLAG_ALWAYS_ON if
wakeup is supported.

Changes in v12:
Squashed PATCH 1/5 and 2/5 of v11.
Added dt bindings and device tree entry for wakeup-source property
for dwc3 core node.
Dropped redundant phy_set_mode call.


Changes in v11:
Moving back to v8 version
https://patchwork.kernel.org/project/linux-arm-msm/cover/[email protected]
as we are getting interrupts during suspend
when enabling both DP hs phy irq and DM hs phy irq.
Moved the set phy mode function to dwc3/core.c from xhci-plat.c
We didn't find any other option other than accessing xhci from dwc.

Changes in v10:
PATCH 1/6: Change device_set_wakeup_capable to device_set_wakeup_enable
PATCH 2/6: Remove redundant else part in dwc3_resume_common
PATCH 4/6: Change the irg flags
PATCH 5/6: Set flag GENPD_FLAG_ALWAYS_ON
PATCH 6/6: Remove disable interrupts function and enable
interrupts in probe.


Changes in v9:
Checking with device_may_makeup property instead of phy_power_off flag.
Changed the IRQ flags and removed hs_phy_mode variable.

Changes in v8:
Moved the dwc3 suspend quirk code in dwc3/host.c to xhci-plat.c
Checking phy_power_off flag instead of usb_wakeup_enabled_descendants
to keep gdsc active.

Changes in v7:
Change in commit text and message in PATCH 1/5 and PATCH 5/5
as per Matthias suggestion.
Added curly braces for if and else if sections in PATCH 4/5.

Changes in v6:
Addressed comments in host.c and core.c
Separated the patches in dwc3-qcom.c to make it simple.
Dropped wakeup-source change as it is not related to this series.

Changes in v5:
Added phy_power_off flag to check presence of wakeup capable devices.
Dropped patch[v4,4/5] as it is present linux-next.
Addressed comments in host.c and dwc3-qcom.c.

Changes in v4:
Addressed Matthias comments raised in v3.

Changes in v3:
Removed need_phy_for_wakeup flag and by default avoiding phy powerdown.
Addressed Matthias comments and added entry for DEV_SUPERSPEED.
Added suspend_quirk in dwc3 host and moved the dwc3_set_phy_speed_flags.
Added wakeup-source dt entry and reading in dwc-qcom.c glue driver.

Changes in v2:
Dropped the patch in clock to set GENPD_FLAG_ACTIVE_WAKEUP flag and
setting in usb dwc3 driver.
Separated the core patch and glue driver patch.
Made need_phy_for_wakeup flag part of dwc structure and
hs_phy_flags as unsgined int.
Adrressed the comment on device_init_wakeup call.
Corrected offset for reading portsc register.
Added pacth to support wakeup in xo shutdown case.

Sandeep Maheswaram (5):
dt-bindings: usb: dwc3: Add wakeup-source property support
usb: dwc3: core: Host wake up support from system suspend
usb: dwc3: qcom: Add helper functions to enable,disable wake irqs
usb: dwc3: qcom: Configure wakeup interrupts during suspend
usb: dwc3: qcom: Keep power domain on to retain controller status

.../devicetree/bindings/usb/snps,dwc3.yaml | 5 +
drivers/usb/dwc3/core.c | 9 +-
drivers/usb/dwc3/dwc3-qcom.c | 140 +++++++++++++++------
3 files changed, 108 insertions(+), 46 deletions(-)

--
2.7.4



2022-06-02 20:34:27

by Krishna Kurapati PSSNV

[permalink] [raw]
Subject: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

From: Sandeep Maheswaram <[email protected]>

Check wakeup-source property for dwc3 core node to set the
wakeup capability. Drop the device_init_wakeup call from
runtime suspend and resume.

If the dwc3 is wakeup capable, don't power down the USB PHY(s).
The glue drivers are expected to take care of configuring the
additional wakeup settings if needed based on the dwc3 wakeup
capability status. In some SOC designs, powering off the PHY is
resulting in higher leakage, so this patch save power on such boards.

Signed-off-by: Sandeep Maheswaram <[email protected]>
Signed-off-by: Krishna Kurapati <[email protected]>
Reviewed-by: Pavankumar Kondeti <[email protected]>
---
drivers/usb/dwc3/core.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index e027c04..b99d3c2 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1787,6 +1787,7 @@ static int dwc3_probe(struct platform_device *pdev)

platform_set_drvdata(pdev, dwc);
dwc3_cache_hwparams(dwc);
+ device_init_wakeup(&pdev->dev, of_property_read_bool(dev->of_node, "wakeup-source"));

spin_lock_init(&dwc->lock);
mutex_init(&dwc->mutex);
@@ -1948,7 +1949,7 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
dwc3_core_exit(dwc);
break;
case DWC3_GCTL_PRTCAP_HOST:
- if (!PMSG_IS_AUTO(msg)) {
+ if (!PMSG_IS_AUTO(msg) && !device_can_wakeup(dwc->dev)) {
dwc3_core_exit(dwc);
break;
}
@@ -2009,7 +2010,7 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
spin_unlock_irqrestore(&dwc->lock, flags);
break;
case DWC3_GCTL_PRTCAP_HOST:
- if (!PMSG_IS_AUTO(msg)) {
+ if (!PMSG_IS_AUTO(msg) && !device_can_wakeup(dwc->dev)) {
ret = dwc3_core_init_for_resume(dwc);
if (ret)
return ret;
@@ -2086,8 +2087,6 @@ static int dwc3_runtime_suspend(struct device *dev)
if (ret)
return ret;

- device_init_wakeup(dev, true);
-
return 0;
}

@@ -2096,8 +2095,6 @@ static int dwc3_runtime_resume(struct device *dev)
struct dwc3 *dwc = dev_get_drvdata(dev);
int ret;

- device_init_wakeup(dev, false);
-
ret = dwc3_resume_common(dwc, PMSG_AUTO_RESUME);
if (ret)
return ret;
--
2.7.4


2022-06-04 19:32:55

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

Hi Krishna,

with this version I see xHCI errors on my SC7180 based system, like
these:

[ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit

[ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout

After resume a downstream hub isn't enumerated again.

So far I didn't see those with v13, but I aso saw the first error with
v16.

I can do some more digging next week.

On Thu, Jun 02, 2022 at 01:54:34PM +0530, Krishna Kurapati wrote:
> From: Sandeep Maheswaram <[email protected]>
>
> Check wakeup-source property for dwc3 core node to set the
> wakeup capability. Drop the device_init_wakeup call from
> runtime suspend and resume.
>
> If the dwc3 is wakeup capable, don't power down the USB PHY(s).
> The glue drivers are expected to take care of configuring the
> additional wakeup settings if needed based on the dwc3 wakeup
> capability status. In some SOC designs, powering off the PHY is
> resulting in higher leakage, so this patch save power on such boards.
>
> Signed-off-by: Sandeep Maheswaram <[email protected]>
> Signed-off-by: Krishna Kurapati <[email protected]>
> Reviewed-by: Pavankumar Kondeti <[email protected]>
> ---
> drivers/usb/dwc3/core.c | 9 +++------
> 1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index e027c04..b99d3c2 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -1787,6 +1787,7 @@ static int dwc3_probe(struct platform_device *pdev)
>
> platform_set_drvdata(pdev, dwc);
> dwc3_cache_hwparams(dwc);
> + device_init_wakeup(&pdev->dev, of_property_read_bool(dev->of_node, "wakeup-source"));
>
> spin_lock_init(&dwc->lock);
> mutex_init(&dwc->mutex);
> @@ -1948,7 +1949,7 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
> dwc3_core_exit(dwc);
> break;
> case DWC3_GCTL_PRTCAP_HOST:
> - if (!PMSG_IS_AUTO(msg)) {
> + if (!PMSG_IS_AUTO(msg) && !device_can_wakeup(dwc->dev)) {
> dwc3_core_exit(dwc);
> break;
> }
> @@ -2009,7 +2010,7 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
> spin_unlock_irqrestore(&dwc->lock, flags);
> break;
> case DWC3_GCTL_PRTCAP_HOST:
> - if (!PMSG_IS_AUTO(msg)) {
> + if (!PMSG_IS_AUTO(msg) && !device_can_wakeup(dwc->dev)) {
> ret = dwc3_core_init_for_resume(dwc);
> if (ret)
> return ret;
> @@ -2086,8 +2087,6 @@ static int dwc3_runtime_suspend(struct device *dev)
> if (ret)
> return ret;
>
> - device_init_wakeup(dev, true);
> -
> return 0;
> }
>
> @@ -2096,8 +2095,6 @@ static int dwc3_runtime_resume(struct device *dev)
> struct dwc3 *dwc = dev_get_drvdata(dev);
> int ret;
>
> - device_init_wakeup(dev, false);
> -
> ret = dwc3_resume_common(dwc, PMSG_AUTO_RESUME);
> if (ret)
> return ret;
> --
> 2.7.4
>

2022-06-07 00:29:14

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> Hi Krishna,
>
> with this version I see xHCI errors on my SC7180 based system, like
> these:
>
> [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
>
> [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
>
> After resume a downstream hub isn't enumerated again.
>
> So far I didn't see those with v13, but I aso saw the first error with
> v16.

It also happens with v13, but only when a wakeup capable vUSB <= 2
device is plugged in. Initially I used a wakeup capable USB3 to
Ethernet adapter to trigger the wakeup case, however older versions
of this series that use usb_wakeup_enabled_descendants() to check
for wakeup capable devices didn't actually check for vUSB > 2
devices.

So the case were the controller/PHYs is powered down works, but
the controller is unhappy when the runtime PM path is used during
system suspend.

> On Thu, Jun 02, 2022 at 01:54:34PM +0530, Krishna Kurapati wrote:
> > From: Sandeep Maheswaram <[email protected]>
> >
> > Check wakeup-source property for dwc3 core node to set the
> > wakeup capability. Drop the device_init_wakeup call from
> > runtime suspend and resume.
> >
> > If the dwc3 is wakeup capable, don't power down the USB PHY(s).
> > The glue drivers are expected to take care of configuring the
> > additional wakeup settings if needed based on the dwc3 wakeup
> > capability status. In some SOC designs, powering off the PHY is
> > resulting in higher leakage, so this patch save power on such boards.
> >
> > Signed-off-by: Sandeep Maheswaram <[email protected]>
> > Signed-off-by: Krishna Kurapati <[email protected]>
> > Reviewed-by: Pavankumar Kondeti <[email protected]>
> > ---
> > drivers/usb/dwc3/core.c | 9 +++------
> > 1 file changed, 3 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> > index e027c04..b99d3c2 100644
> > --- a/drivers/usb/dwc3/core.c
> > +++ b/drivers/usb/dwc3/core.c
> > @@ -1787,6 +1787,7 @@ static int dwc3_probe(struct platform_device *pdev)
> >
> > platform_set_drvdata(pdev, dwc);
> > dwc3_cache_hwparams(dwc);
> > + device_init_wakeup(&pdev->dev, of_property_read_bool(dev->of_node, "wakeup-source"));
> >
> > spin_lock_init(&dwc->lock);
> > mutex_init(&dwc->mutex);
> > @@ -1948,7 +1949,7 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg)
> > dwc3_core_exit(dwc);
> > break;
> > case DWC3_GCTL_PRTCAP_HOST:
> > - if (!PMSG_IS_AUTO(msg)) {
> > + if (!PMSG_IS_AUTO(msg) && !device_can_wakeup(dwc->dev)) {
> > dwc3_core_exit(dwc);
> > break;
> > }
> > @@ -2009,7 +2010,7 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
> > spin_unlock_irqrestore(&dwc->lock, flags);
> > break;
> > case DWC3_GCTL_PRTCAP_HOST:
> > - if (!PMSG_IS_AUTO(msg)) {
> > + if (!PMSG_IS_AUTO(msg) && !device_can_wakeup(dwc->dev)) {
> > ret = dwc3_core_init_for_resume(dwc);
> > if (ret)
> > return ret;
> > @@ -2086,8 +2087,6 @@ static int dwc3_runtime_suspend(struct device *dev)
> > if (ret)
> > return ret;
> >
> > - device_init_wakeup(dev, true);
> > -
> > return 0;
> > }
> >
> > @@ -2096,8 +2095,6 @@ static int dwc3_runtime_resume(struct device *dev)
> > struct dwc3 *dwc = dev_get_drvdata(dev);
> > int ret;
> >
> > - device_init_wakeup(dev, false);
> > -
> > ret = dwc3_resume_common(dwc, PMSG_AUTO_RESUME);
> > if (ret)
> > return ret;
> > --
> > 2.7.4
> >

2022-06-13 20:07:21

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
> On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> > Hi Krishna,
> >
> > with this version I see xHCI errors on my SC7180 based system, like
> > these:
> >
> > [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
> >
> > [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
> >
> > After resume a downstream hub isn't enumerated again.
> >
> > So far I didn't see those with v13, but I aso saw the first error with
> > v16.
>
> It also happens with v13, but only when a wakeup capable vUSB <= 2
> device is plugged in. Initially I used a wakeup capable USB3 to
> Ethernet adapter to trigger the wakeup case, however older versions
> of this series that use usb_wakeup_enabled_descendants() to check
> for wakeup capable devices didn't actually check for vUSB > 2
> devices.
>
> So the case were the controller/PHYs is powered down works, but
> the controller is unhappy when the runtime PM path is used during
> system suspend.

The issue isn't seen on all systems using dwc3-qcom and the problem starts
during probe(). The expected probe sequence is something like this:

dwc3_qcom_probe
dwc3_qcom_of_register_core
dwc3_probe

if (device_can_wakeup(&qcom->dwc3->dev))
...

The important part is that device_can_wakeup() is called after dwc3_probe()
has completed. That's what I see on a QC SC7280 system, where wakeup is
generally working with these patches.

However on a QC SC7180 system dwc3_probe() is deferred and only executed after
dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
With that the controller/driver ends up in an unhappy state after system
suspend.

Probing is deferred on SC7180 because device_links_check_suppliers() finds
that '88e3000.phy' isn't ready yet.

2022-06-14 18:27:31

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Mon, Jun 13, 2022 at 11:08:32AM -0700, Matthias Kaehlcke wrote:
> On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
> > On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> > > Hi Krishna,
> > >
> > > with this version I see xHCI errors on my SC7180 based system, like
> > > these:
> > >
> > > [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
> > >
> > > [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
> > >
> > > After resume a downstream hub isn't enumerated again.
> > >
> > > So far I didn't see those with v13, but I aso saw the first error with
> > > v16.
> >
> > It also happens with v13, but only when a wakeup capable vUSB <= 2
> > device is plugged in. Initially I used a wakeup capable USB3 to
> > Ethernet adapter to trigger the wakeup case, however older versions
> > of this series that use usb_wakeup_enabled_descendants() to check
> > for wakeup capable devices didn't actually check for vUSB > 2
> > devices.
> >
> > So the case were the controller/PHYs is powered down works, but
> > the controller is unhappy when the runtime PM path is used during
> > system suspend.
>
> The issue isn't seen on all systems using dwc3-qcom and the problem starts
> during probe(). The expected probe sequence is something like this:
>
> dwc3_qcom_probe
> dwc3_qcom_of_register_core
> dwc3_probe
>
> if (device_can_wakeup(&qcom->dwc3->dev))
> ...
>
> The important part is that device_can_wakeup() is called after dwc3_probe()
> has completed. That's what I see on a QC SC7280 system, where wakeup is
> generally working with these patches.
>
> However on a QC SC7180 system dwc3_probe() is deferred and only executed after
> dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
> With that the controller/driver ends up in an unhappy state after system
> suspend.
>
> Probing is deferred on SC7180 because device_links_check_suppliers() finds
> that '88e3000.phy' isn't ready yet.

It seems device links could be used to make sure the dwc3 core is present:

Another example for an inconsistent state would be a device link that
represents a driver presence dependency, yet is added from the consumer’s
->probe callback while the supplier hasn’t probed yet: Had the driver core
known about the device link earlier, it wouldn’t have probed the consumer
in the first place. The onus is thus on the consumer to check presence of
the supplier after adding the link, and defer probing on non-presence.

https://www.kernel.org/doc/html/v5.18/driver-api/device_link.html#usage


You could add something like this to dwc3_qcom_of_register_core():


device_link_add(dev, &qcom->dwc3->dev,
DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOPROBE_CONSUMER);

if (qcom->dwc3->dev.links.status != DL_DEV_DRIVER_BOUND)
ret = -EPROBE_DEFER;


From the doc it isn't clear how the consumer is supposed to check presence
of the supplier, the above check of the link status is also used in
drivers/cpufreq/mediatek-cpufreq.c , but not elsewhere outside of the
driver framework.

2022-06-14 20:32:27

by Krishna Kurapati PSSNV

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend


On 6/14/2022 11:23 PM, Matthias Kaehlcke wrote:
> On Mon, Jun 13, 2022 at 11:08:32AM -0700, Matthias Kaehlcke wrote:
>> On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
>>> On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
>>>> Hi Krishna,
>>>>
>>>> with this version I see xHCI errors on my SC7180 based system, like
>>>> these:
>>>>
>>>> [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
>>>>
>>>> [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
>>>>
>>>> After resume a downstream hub isn't enumerated again.
>>>>
>>>> So far I didn't see those with v13, but I aso saw the first error with
>>>> v16.
>>> It also happens with v13, but only when a wakeup capable vUSB <= 2
>>> device is plugged in. Initially I used a wakeup capable USB3 to
>>> Ethernet adapter to trigger the wakeup case, however older versions
>>> of this series that use usb_wakeup_enabled_descendants() to check
>>> for wakeup capable devices didn't actually check for vUSB > 2
>>> devices.
>>>
>>> So the case were the controller/PHYs is powered down works, but
>>> the controller is unhappy when the runtime PM path is used during
>>> system suspend.
>> The issue isn't seen on all systems using dwc3-qcom and the problem starts
>> during probe(). The expected probe sequence is something like this:
>>
>> dwc3_qcom_probe
>> dwc3_qcom_of_register_core
>> dwc3_probe
>>
>> if (device_can_wakeup(&qcom->dwc3->dev))
>> ...
>>
>> The important part is that device_can_wakeup() is called after dwc3_probe()
>> has completed. That's what I see on a QC SC7280 system, where wakeup is
>> generally working with these patches.
>>
>> However on a QC SC7180 system dwc3_probe() is deferred and only executed after
>> dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
>> With that the controller/driver ends up in an unhappy state after system
>> suspend.
>>
>> Probing is deferred on SC7180 because device_links_check_suppliers() finds
>> that '88e3000.phy' isn't ready yet.
> It seems device links could be used to make sure the dwc3 core is present:
>
> Another example for an inconsistent state would be a device link that
> represents a driver presence dependency, yet is added from the consumer’s
> ->probe callback while the supplier hasn’t probed yet: Had the driver core
> known about the device link earlier, it wouldn’t have probed the consumer
> in the first place. The onus is thus on the consumer to check presence of
> the supplier after adding the link, and defer probing on non-presence.
>
> https://www.kernel.org/doc/html/v5.18/driver-api/device_link.html#usage
>
>
> You could add something like this to dwc3_qcom_of_register_core():
>
>
> device_link_add(dev, &qcom->dwc3->dev,
> DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOPROBE_CONSUMER);
>
> if (qcom->dwc3->dev.links.status != DL_DEV_DRIVER_BOUND)
> ret = -EPROBE_DEFER;
>
>
> From the doc it isn't clear how the consumer is supposed to check presence
> of the supplier, the above check of the link status is also used in
> drivers/cpufreq/mediatek-cpufreq.c , but not elsewhere outside of the
> driver framework.
Hi Mathias,

    Thanks for the input. I will try the above snippet and confirm if
probe call happens in sync with of_platform_populate in
dwc3_qcom_of_register_core

Regards,
Krishna,

2022-06-16 09:32:14

by Pavan Kondeti

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

Hi Matthias/Krishna,

On Tue, Jun 14, 2022 at 10:53:35AM -0700, Matthias Kaehlcke wrote:
> On Mon, Jun 13, 2022 at 11:08:32AM -0700, Matthias Kaehlcke wrote:
> > On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
> > > On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> > > > Hi Krishna,
> > > >
> > > > with this version I see xHCI errors on my SC7180 based system, like
> > > > these:
> > > >
> > > > [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
> > > >
> > > > [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
> > > >
> > > > After resume a downstream hub isn't enumerated again.
> > > >
> > > > So far I didn't see those with v13, but I aso saw the first error with
> > > > v16.
> > >
> > > It also happens with v13, but only when a wakeup capable vUSB <= 2
> > > device is plugged in. Initially I used a wakeup capable USB3 to
> > > Ethernet adapter to trigger the wakeup case, however older versions
> > > of this series that use usb_wakeup_enabled_descendants() to check
> > > for wakeup capable devices didn't actually check for vUSB > 2
> > > devices.
> > >
> > > So the case were the controller/PHYs is powered down works, but
> > > the controller is unhappy when the runtime PM path is used during
> > > system suspend.
> >
> > The issue isn't seen on all systems using dwc3-qcom and the problem starts
> > during probe(). The expected probe sequence is something like this:
> >
> > dwc3_qcom_probe
> > dwc3_qcom_of_register_core
> > dwc3_probe
> >
> > if (device_can_wakeup(&qcom->dwc3->dev))
> > ...
> >
> > The important part is that device_can_wakeup() is called after dwc3_probe()
> > has completed. That's what I see on a QC SC7280 system, where wakeup is
> > generally working with these patches.
> >
> > However on a QC SC7180 system dwc3_probe() is deferred and only executed after
> > dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
> > With that the controller/driver ends up in an unhappy state after system
> > suspend.
> >
> > Probing is deferred on SC7180 because device_links_check_suppliers() finds
> > that '88e3000.phy' isn't ready yet.
>
> It seems device links could be used to make sure the dwc3 core is present:
>
> Another example for an inconsistent state would be a device link that
> represents a driver presence dependency, yet is added from the consumer’s
> ->probe callback while the supplier hasn’t probed yet: Had the driver core
> known about the device link earlier, it wouldn’t have probed the consumer
> in the first place. The onus is thus on the consumer to check presence of
> the supplier after adding the link, and defer probing on non-presence.
>
> https://www.kernel.org/doc/html/v5.18/driver-api/device_link.html#usage
>
>
> You could add something like this to dwc3_qcom_of_register_core():
>
>
> device_link_add(dev, &qcom->dwc3->dev,
> DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOPROBE_CONSUMER);
>
> if (qcom->dwc3->dev.links.status != DL_DEV_DRIVER_BOUND)
> ret = -EPROBE_DEFER;
>
>
I am not very sure how the device_link_add() API works. we are the parent and
creating a depdency on child probe. That does not sound correct to me. Any
ways, I have another question.

When dwc3_qcom_of_register_core() returns error back to dwc3_qcom_probe(), we
goto depopulate label which calls of_platform_depopulate() which destroy the
child devices that are populated. how does that ensure that child probe is
completed by the time, our probe is called again. The child device it self is
gone. Is this working because when our probe is called next time, the child
probe depenencies are resolved?

Thanks,
Pavan

2022-06-16 17:38:20

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Thu, Jun 16, 2022 at 02:41:10PM +0530, Pavan Kondeti wrote:
> Hi Matthias/Krishna,
>
> On Tue, Jun 14, 2022 at 10:53:35AM -0700, Matthias Kaehlcke wrote:
> > On Mon, Jun 13, 2022 at 11:08:32AM -0700, Matthias Kaehlcke wrote:
> > > On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
> > > > On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> > > > > Hi Krishna,
> > > > >
> > > > > with this version I see xHCI errors on my SC7180 based system, like
> > > > > these:
> > > > >
> > > > > [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
> > > > >
> > > > > [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
> > > > >
> > > > > After resume a downstream hub isn't enumerated again.
> > > > >
> > > > > So far I didn't see those with v13, but I aso saw the first error with
> > > > > v16.
> > > >
> > > > It also happens with v13, but only when a wakeup capable vUSB <= 2
> > > > device is plugged in. Initially I used a wakeup capable USB3 to
> > > > Ethernet adapter to trigger the wakeup case, however older versions
> > > > of this series that use usb_wakeup_enabled_descendants() to check
> > > > for wakeup capable devices didn't actually check for vUSB > 2
> > > > devices.
> > > >
> > > > So the case were the controller/PHYs is powered down works, but
> > > > the controller is unhappy when the runtime PM path is used during
> > > > system suspend.
> > >
> > > The issue isn't seen on all systems using dwc3-qcom and the problem starts
> > > during probe(). The expected probe sequence is something like this:
> > >
> > > dwc3_qcom_probe
> > > dwc3_qcom_of_register_core
> > > dwc3_probe
> > >
> > > if (device_can_wakeup(&qcom->dwc3->dev))
> > > ...
> > >
> > > The important part is that device_can_wakeup() is called after dwc3_probe()
> > > has completed. That's what I see on a QC SC7280 system, where wakeup is
> > > generally working with these patches.
> > >
> > > However on a QC SC7180 system dwc3_probe() is deferred and only executed after
> > > dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
> > > With that the controller/driver ends up in an unhappy state after system
> > > suspend.
> > >
> > > Probing is deferred on SC7180 because device_links_check_suppliers() finds
> > > that '88e3000.phy' isn't ready yet.
> >
> > It seems device links could be used to make sure the dwc3 core is present:
> >
> > Another example for an inconsistent state would be a device link that
> > represents a driver presence dependency, yet is added from the consumer’s
> > ->probe callback while the supplier hasn’t probed yet: Had the driver core
> > known about the device link earlier, it wouldn’t have probed the consumer
> > in the first place. The onus is thus on the consumer to check presence of
> > the supplier after adding the link, and defer probing on non-presence.
> >
> > https://www.kernel.org/doc/html/v5.18/driver-api/device_link.html#usage
> >
> >
> > You could add something like this to dwc3_qcom_of_register_core():
> >
> >
> > device_link_add(dev, &qcom->dwc3->dev,
> > DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOPROBE_CONSUMER);
> >
> > if (qcom->dwc3->dev.links.status != DL_DEV_DRIVER_BOUND)
> > ret = -EPROBE_DEFER;
> >
> >
> I am not very sure how the device_link_add() API works. we are the parent and
> creating a depdency on child probe. That does not sound correct to me.

The functional dependency is effectively there, the driver already assumes that
the dwc3 core was probed when of_platform_populate() returns.

The device link itself doesn't create the dependency on the probe(), the check
of the link status below does.

Another option would be to add a link to the PHYs to the dwc3-qcom node in
the device tree, but I don't think that would be a better solution (and I
expect Rob would oppose this).

I'm open to other solutions, so far the device link is the cleanest that came
to my mind.

I think the root issue is the driver architecture, with two interdependent
drivers for the same IP block, instead of a single framework driver with a
common part (dwc3 core) and vendor specific hooks/data.

> Any ways, I have another question.
>
> When dwc3_qcom_of_register_core() returns error back to dwc3_qcom_probe(), we
> goto depopulate label which calls of_platform_depopulate() which destroy the
> child devices that are populated. how does that ensure that child probe is
> completed by the time, our probe is called again. The child device it self is
> gone. Is this working because when our probe is called next time, the child
> probe depenencies are resolved?

Good point! It doesn't really ensure that the child is probed (actually it
won't be probed and DL_FLAG_AUTOPROBE_CONSUMER doesn't make sense here), it
could happen that dwc3_qcom_probe() is deferred multiple times, but eventually
the PHYs should be ready and dwc3_probe() be invoked through
of_platform_populate().

2022-06-20 09:20:55

by Pavan Kondeti

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

+Felipe, Bjorn

On Thu, Jun 16, 2022 at 10:15:49AM -0700, Matthias Kaehlcke wrote:
> On Thu, Jun 16, 2022 at 02:41:10PM +0530, Pavan Kondeti wrote:
> > Hi Matthias/Krishna,
> >
> > On Tue, Jun 14, 2022 at 10:53:35AM -0700, Matthias Kaehlcke wrote:
> > > On Mon, Jun 13, 2022 at 11:08:32AM -0700, Matthias Kaehlcke wrote:
> > > > On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
> > > > > On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> > > > > > Hi Krishna,
> > > > > >
> > > > > > with this version I see xHCI errors on my SC7180 based system, like
> > > > > > these:
> > > > > >
> > > > > > [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
> > > > > >
> > > > > > [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
> > > > > >
> > > > > > After resume a downstream hub isn't enumerated again.
> > > > > >
> > > > > > So far I didn't see those with v13, but I aso saw the first error with
> > > > > > v16.
> > > > >
> > > > > It also happens with v13, but only when a wakeup capable vUSB <= 2
> > > > > device is plugged in. Initially I used a wakeup capable USB3 to
> > > > > Ethernet adapter to trigger the wakeup case, however older versions
> > > > > of this series that use usb_wakeup_enabled_descendants() to check
> > > > > for wakeup capable devices didn't actually check for vUSB > 2
> > > > > devices.
> > > > >
> > > > > So the case were the controller/PHYs is powered down works, but
> > > > > the controller is unhappy when the runtime PM path is used during
> > > > > system suspend.
> > > >
> > > > The issue isn't seen on all systems using dwc3-qcom and the problem starts
> > > > during probe(). The expected probe sequence is something like this:
> > > >
> > > > dwc3_qcom_probe
> > > > dwc3_qcom_of_register_core
> > > > dwc3_probe
> > > >
> > > > if (device_can_wakeup(&qcom->dwc3->dev))
> > > > ...
> > > >
> > > > The important part is that device_can_wakeup() is called after dwc3_probe()
> > > > has completed. That's what I see on a QC SC7280 system, where wakeup is
> > > > generally working with these patches.
> > > >
> > > > However on a QC SC7180 system dwc3_probe() is deferred and only executed after
> > > > dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
> > > > With that the controller/driver ends up in an unhappy state after system
> > > > suspend.
> > > >
> > > > Probing is deferred on SC7180 because device_links_check_suppliers() finds
> > > > that '88e3000.phy' isn't ready yet.
> > >
> > > It seems device links could be used to make sure the dwc3 core is present:
> > >
> > > Another example for an inconsistent state would be a device link that
> > > represents a driver presence dependency, yet is added from the consumer’s
> > > ->probe callback while the supplier hasn’t probed yet: Had the driver core
> > > known about the device link earlier, it wouldn’t have probed the consumer
> > > in the first place. The onus is thus on the consumer to check presence of
> > > the supplier after adding the link, and defer probing on non-presence.
> > >
> > > https://www.kernel.org/doc/html/v5.18/driver-api/device_link.html#usage
> > >
> > >
> > > You could add something like this to dwc3_qcom_of_register_core():
> > >
> > >
> > > device_link_add(dev, &qcom->dwc3->dev,
> > > DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOPROBE_CONSUMER);
> > >
> > > if (qcom->dwc3->dev.links.status != DL_DEV_DRIVER_BOUND)
> > > ret = -EPROBE_DEFER;
> > >
> > >
> > I am not very sure how the device_link_add() API works. we are the parent and
> > creating a depdency on child probe. That does not sound correct to me.
>
> The functional dependency is effectively there, the driver already assumes that
> the dwc3 core was probed when of_platform_populate() returns.
>
> The device link itself doesn't create the dependency on the probe(), the check
> of the link status below does.
>
> Another option would be to add a link to the PHYs to the dwc3-qcom node in
> the device tree, but I don't think that would be a better solution (and I
> expect Rob would oppose this).
>
> I'm open to other solutions, so far the device link is the cleanest that came
> to my mind.
>
> I think the root issue is the driver architecture, with two interdependent
> drivers for the same IP block, instead of a single framework driver with a
> common part (dwc3 core) and vendor specific hooks/data.
>
> > Any ways, I have another question.
> >
> > When dwc3_qcom_of_register_core() returns error back to dwc3_qcom_probe(), we
> > goto depopulate label which calls of_platform_depopulate() which destroy the
> > child devices that are populated. how does that ensure that child probe is
> > completed by the time, our probe is called again. The child device it self is
> > gone. Is this working because when our probe is called next time, the child
> > probe depenencies are resolved?
>
> Good point! It doesn't really ensure that the child is probed (actually it
> won't be probed and DL_FLAG_AUTOPROBE_CONSUMER doesn't make sense here), it
> could happen that dwc3_qcom_probe() is deferred multiple times, but eventually
> the PHYs should be ready and dwc3_probe() be invoked through
> of_platform_populate().

This is a generic problem i.e if a parent can only proceed after the child
devices are bounded (i.e probed successfully), how to ensure this behavior
from the parent's probe? Since we can't block the parent probe (async probe is
not the default behavior), we have to identify the condition that the children
are deferring probe, so that parent also can do that.

Can we add a API in drivers core to tell if a device probe is deferred or
not? This can be done by testing list_empty(&dev->p->deferred_probe) under
deferred_probe_mutex mutex. The parent can return EPROBE_DEFER based on this
API return value.

Another alternative would be explicitly checking if the child device suppliers
are ready or not before adding child device. That would require decoupling
of_platform_populate() to creating devices and adding devices.

Note that this problem is not just limited to suppliers not ready. if the
dwc3-qcom is made asynchronous probe, then its child also probed
asynchronously and there is no guarantee that child would be probed by the
time of_platform_populate() is returned. The bus notifier might come handy
in this case. The parent can register for this notifier and waiting for
the children device's BUS_NOTIFY_BOUND_DRIVER/BUS_NOTIFY_DRIVER_NOT_BOUND
notifications. This would also work in our case, if we move to
of_platform_populate() outside the probe().

Would like to hear other people thoughts on this.

Thanks,
Pavan

2022-06-23 19:34:18

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Mon, Jun 20, 2022 at 02:24:15PM +0530, Pavan Kondeti wrote:
> +Felipe, Bjorn
>
> On Thu, Jun 16, 2022 at 10:15:49AM -0700, Matthias Kaehlcke wrote:
> > On Thu, Jun 16, 2022 at 02:41:10PM +0530, Pavan Kondeti wrote:
> > > Hi Matthias/Krishna,
> > >
> > > On Tue, Jun 14, 2022 at 10:53:35AM -0700, Matthias Kaehlcke wrote:
> > > > On Mon, Jun 13, 2022 at 11:08:32AM -0700, Matthias Kaehlcke wrote:
> > > > > On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
> > > > > > On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> > > > > > > Hi Krishna,
> > > > > > >
> > > > > > > with this version I see xHCI errors on my SC7180 based system, like
> > > > > > > these:
> > > > > > >
> > > > > > > [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
> > > > > > >
> > > > > > > [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
> > > > > > >
> > > > > > > After resume a downstream hub isn't enumerated again.
> > > > > > >
> > > > > > > So far I didn't see those with v13, but I aso saw the first error with
> > > > > > > v16.
> > > > > >
> > > > > > It also happens with v13, but only when a wakeup capable vUSB <= 2
> > > > > > device is plugged in. Initially I used a wakeup capable USB3 to
> > > > > > Ethernet adapter to trigger the wakeup case, however older versions
> > > > > > of this series that use usb_wakeup_enabled_descendants() to check
> > > > > > for wakeup capable devices didn't actually check for vUSB > 2
> > > > > > devices.
> > > > > >
> > > > > > So the case were the controller/PHYs is powered down works, but
> > > > > > the controller is unhappy when the runtime PM path is used during
> > > > > > system suspend.
> > > > >
> > > > > The issue isn't seen on all systems using dwc3-qcom and the problem starts
> > > > > during probe(). The expected probe sequence is something like this:
> > > > >
> > > > > dwc3_qcom_probe
> > > > > dwc3_qcom_of_register_core
> > > > > dwc3_probe
> > > > >
> > > > > if (device_can_wakeup(&qcom->dwc3->dev))
> > > > > ...
> > > > >
> > > > > The important part is that device_can_wakeup() is called after dwc3_probe()
> > > > > has completed. That's what I see on a QC SC7280 system, where wakeup is
> > > > > generally working with these patches.
> > > > >
> > > > > However on a QC SC7180 system dwc3_probe() is deferred and only executed after
> > > > > dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
> > > > > With that the controller/driver ends up in an unhappy state after system
> > > > > suspend.
> > > > >
> > > > > Probing is deferred on SC7180 because device_links_check_suppliers() finds
> > > > > that '88e3000.phy' isn't ready yet.
> > > >
> > > > It seems device links could be used to make sure the dwc3 core is present:
> > > >
> > > > Another example for an inconsistent state would be a device link that
> > > > represents a driver presence dependency, yet is added from the consumer’s
> > > > ->probe callback while the supplier hasn’t probed yet: Had the driver core
> > > > known about the device link earlier, it wouldn’t have probed the consumer
> > > > in the first place. The onus is thus on the consumer to check presence of
> > > > the supplier after adding the link, and defer probing on non-presence.
> > > >
> > > > https://www.kernel.org/doc/html/v5.18/driver-api/device_link.html#usage
> > > >
> > > >
> > > > You could add something like this to dwc3_qcom_of_register_core():
> > > >
> > > >
> > > > device_link_add(dev, &qcom->dwc3->dev,
> > > > DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOPROBE_CONSUMER);
> > > >
> > > > if (qcom->dwc3->dev.links.status != DL_DEV_DRIVER_BOUND)
> > > > ret = -EPROBE_DEFER;
> > > >
> > > >
> > > I am not very sure how the device_link_add() API works. we are the parent and
> > > creating a depdency on child probe. That does not sound correct to me.
> >
> > The functional dependency is effectively there, the driver already assumes that
> > the dwc3 core was probed when of_platform_populate() returns.
> >
> > The device link itself doesn't create the dependency on the probe(), the check
> > of the link status below does.
> >
> > Another option would be to add a link to the PHYs to the dwc3-qcom node in
> > the device tree, but I don't think that would be a better solution (and I
> > expect Rob would oppose this).
> >
> > I'm open to other solutions, so far the device link is the cleanest that came
> > to my mind.
> >
> > I think the root issue is the driver architecture, with two interdependent
> > drivers for the same IP block, instead of a single framework driver with a
> > common part (dwc3 core) and vendor specific hooks/data.
> >
> > > Any ways, I have another question.
> > >
> > > When dwc3_qcom_of_register_core() returns error back to dwc3_qcom_probe(), we
> > > goto depopulate label which calls of_platform_depopulate() which destroy the
> > > child devices that are populated. how does that ensure that child probe is
> > > completed by the time, our probe is called again. The child device it self is
> > > gone. Is this working because when our probe is called next time, the child
> > > probe depenencies are resolved?
> >
> > Good point! It doesn't really ensure that the child is probed (actually it
> > won't be probed and DL_FLAG_AUTOPROBE_CONSUMER doesn't make sense here), it
> > could happen that dwc3_qcom_probe() is deferred multiple times, but eventually
> > the PHYs should be ready and dwc3_probe() be invoked through
> > of_platform_populate().
>
> This is a generic problem i.e if a parent can only proceed after the child
> devices are bounded (i.e probed successfully), how to ensure this behavior
> from the parent's probe? Since we can't block the parent probe (async probe is
> not the default behavior), we have to identify the condition that the children
> are deferring probe, so that parent also can do that.
>
> Can we add a API in drivers core to tell if a device probe is deferred or
> not? This can be done by testing list_empty(&dev->p->deferred_probe) under
> deferred_probe_mutex mutex. The parent can return EPROBE_DEFER based on this
> API return value.

That could be an option.

> Another alternative would be explicitly checking if the child device suppliers
> are ready or not before adding child device. That would require decoupling
> of_platform_populate() to creating devices and adding devices.

It might require a new API since there are plenty of users of
of_platform_populate() that rely on the current behavior.

> Note that this problem is not just limited to suppliers not ready. if the
> dwc3-qcom is made asynchronous probe, then its child also probed
> asynchronously and there is no guarantee that child would be probed by the
> time of_platform_populate() is returned. The bus notifier might come handy
> in this case. The parent can register for this notifier and waiting for
> the children device's BUS_NOTIFY_BOUND_DRIVER/BUS_NOTIFY_DRIVER_NOT_BOUND
> notifications. This would also work in our case, if we move to
> of_platform_populate() outside the probe().

If I understand correctly the outcome would be a probe() in two stages. The
first does as much as it can do without the dwc3 core and leaves the device
in a state where it isn't really functional, and the second stage does the
rest when BUS_NOTIFY_BOUND_DRIVER is received for the dwc3 core device.

A concern could be the need for additional conditions in some code paths to
deal with the half-initialized device.

Why would of_platform_populate() be moved outside of probe()?

To avoid the half-initialized device probe() could block until
BUS_NOTIFY_BOUND_DRIVER is received. Probably that should be done with a
timeout to avoid blocking forever in case of a problem with probing the
dwc3 core.

2022-06-24 09:11:33

by Pavan Kondeti

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Thu, Jun 23, 2022 at 11:38:06AM -0700, Matthias Kaehlcke wrote:
> On Mon, Jun 20, 2022 at 02:24:15PM +0530, Pavan Kondeti wrote:
> > +Felipe, Bjorn
> >
> > On Thu, Jun 16, 2022 at 10:15:49AM -0700, Matthias Kaehlcke wrote:
> > > On Thu, Jun 16, 2022 at 02:41:10PM +0530, Pavan Kondeti wrote:
> > > > Hi Matthias/Krishna,
> > > >
> > > > On Tue, Jun 14, 2022 at 10:53:35AM -0700, Matthias Kaehlcke wrote:
> > > > > On Mon, Jun 13, 2022 at 11:08:32AM -0700, Matthias Kaehlcke wrote:
> > > > > > On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
> > > > > > > On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> > > > > > > > Hi Krishna,
> > > > > > > >
> > > > > > > > with this version I see xHCI errors on my SC7180 based system, like
> > > > > > > > these:
> > > > > > > >
> > > > > > > > [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
> > > > > > > >
> > > > > > > > [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
> > > > > > > >
> > > > > > > > After resume a downstream hub isn't enumerated again.
> > > > > > > >
> > > > > > > > So far I didn't see those with v13, but I aso saw the first error with
> > > > > > > > v16.
> > > > > > >
> > > > > > > It also happens with v13, but only when a wakeup capable vUSB <= 2
> > > > > > > device is plugged in. Initially I used a wakeup capable USB3 to
> > > > > > > Ethernet adapter to trigger the wakeup case, however older versions
> > > > > > > of this series that use usb_wakeup_enabled_descendants() to check
> > > > > > > for wakeup capable devices didn't actually check for vUSB > 2
> > > > > > > devices.
> > > > > > >
> > > > > > > So the case were the controller/PHYs is powered down works, but
> > > > > > > the controller is unhappy when the runtime PM path is used during
> > > > > > > system suspend.
> > > > > >
> > > > > > The issue isn't seen on all systems using dwc3-qcom and the problem starts
> > > > > > during probe(). The expected probe sequence is something like this:
> > > > > >
> > > > > > dwc3_qcom_probe
> > > > > > dwc3_qcom_of_register_core
> > > > > > dwc3_probe
> > > > > >
> > > > > > if (device_can_wakeup(&qcom->dwc3->dev))
> > > > > > ...
> > > > > >
> > > > > > The important part is that device_can_wakeup() is called after dwc3_probe()
> > > > > > has completed. That's what I see on a QC SC7280 system, where wakeup is
> > > > > > generally working with these patches.
> > > > > >
> > > > > > However on a QC SC7180 system dwc3_probe() is deferred and only executed after
> > > > > > dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
> > > > > > With that the controller/driver ends up in an unhappy state after system
> > > > > > suspend.
> > > > > >
> > > > > > Probing is deferred on SC7180 because device_links_check_suppliers() finds
> > > > > > that '88e3000.phy' isn't ready yet.
> > > > >
> > > > > It seems device links could be used to make sure the dwc3 core is present:
> > > > >
> > > > > Another example for an inconsistent state would be a device link that
> > > > > represents a driver presence dependency, yet is added from the consumer’s
> > > > > ->probe callback while the supplier hasn’t probed yet: Had the driver core
> > > > > known about the device link earlier, it wouldn’t have probed the consumer
> > > > > in the first place. The onus is thus on the consumer to check presence of
> > > > > the supplier after adding the link, and defer probing on non-presence.
> > > > >
> > > > > https://www.kernel.org/doc/html/v5.18/driver-api/device_link.html#usage
> > > > >
> > > > >
> > > > > You could add something like this to dwc3_qcom_of_register_core():
> > > > >
> > > > >
> > > > > device_link_add(dev, &qcom->dwc3->dev,
> > > > > DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOPROBE_CONSUMER);
> > > > >
> > > > > if (qcom->dwc3->dev.links.status != DL_DEV_DRIVER_BOUND)
> > > > > ret = -EPROBE_DEFER;
> > > > >
> > > > >
> > > > I am not very sure how the device_link_add() API works. we are the parent and
> > > > creating a depdency on child probe. That does not sound correct to me.
> > >
> > > The functional dependency is effectively there, the driver already assumes that
> > > the dwc3 core was probed when of_platform_populate() returns.
> > >
> > > The device link itself doesn't create the dependency on the probe(), the check
> > > of the link status below does.
> > >
> > > Another option would be to add a link to the PHYs to the dwc3-qcom node in
> > > the device tree, but I don't think that would be a better solution (and I
> > > expect Rob would oppose this).
> > >
> > > I'm open to other solutions, so far the device link is the cleanest that came
> > > to my mind.
> > >
> > > I think the root issue is the driver architecture, with two interdependent
> > > drivers for the same IP block, instead of a single framework driver with a
> > > common part (dwc3 core) and vendor specific hooks/data.
> > >
> > > > Any ways, I have another question.
> > > >
> > > > When dwc3_qcom_of_register_core() returns error back to dwc3_qcom_probe(), we
> > > > goto depopulate label which calls of_platform_depopulate() which destroy the
> > > > child devices that are populated. how does that ensure that child probe is
> > > > completed by the time, our probe is called again. The child device it self is
> > > > gone. Is this working because when our probe is called next time, the child
> > > > probe depenencies are resolved?
> > >
> > > Good point! It doesn't really ensure that the child is probed (actually it
> > > won't be probed and DL_FLAG_AUTOPROBE_CONSUMER doesn't make sense here), it
> > > could happen that dwc3_qcom_probe() is deferred multiple times, but eventually
> > > the PHYs should be ready and dwc3_probe() be invoked through
> > > of_platform_populate().
> >
> > This is a generic problem i.e if a parent can only proceed after the child
> > devices are bounded (i.e probed successfully), how to ensure this behavior
> > from the parent's probe? Since we can't block the parent probe (async probe is
> > not the default behavior), we have to identify the condition that the children
> > are deferring probe, so that parent also can do that.
> >
> > Can we add a API in drivers core to tell if a device probe is deferred or
> > not? This can be done by testing list_empty(&dev->p->deferred_probe) under
> > deferred_probe_mutex mutex. The parent can return EPROBE_DEFER based on this
> > API return value.
>
> That could be an option.
>
> > Another alternative would be explicitly checking if the child device suppliers
> > are ready or not before adding child device. That would require decoupling
> > of_platform_populate() to creating devices and adding devices.
>
> It might require a new API since there are plenty of users of
> of_platform_populate() that rely on the current behavior.

Agree. A new API is needed. we also have to consider multiple children
scenario, where one child is ready but others are not etc.

>
> > Note that this problem is not just limited to suppliers not ready. if the
> > dwc3-qcom is made asynchronous probe, then its child also probed
> > asynchronously and there is no guarantee that child would be probed by the
> > time of_platform_populate() is returned. The bus notifier might come handy
> > in this case. The parent can register for this notifier and waiting for
> > the children device's BUS_NOTIFY_BOUND_DRIVER/BUS_NOTIFY_DRIVER_NOT_BOUND
> > notifications. This would also work in our case, if we move to
> > of_platform_populate() outside the probe().
>
> If I understand correctly the outcome would be a probe() in two stages. The
> first does as much as it can do without the dwc3 core and leaves the device
> in a state where it isn't really functional, and the second stage does the
> rest when BUS_NOTIFY_BOUND_DRIVER is received for the dwc3 core device.
>
> A concern could be the need for additional conditions in some code paths to
> deal with the half-initialized device.
>
> Why would of_platform_populate() be moved outside of probe()?
>
> To avoid the half-initialized device probe() could block until
> BUS_NOTIFY_BOUND_DRIVER is received. Probably that should be done with a
> timeout to avoid blocking forever in case of a problem with probing the
> dwc3 core.

Right, we have to split the probe() into two parts. The second stage which
runs asynchronously should wait for the dwc3 core probe to happen. if at all
dwc3 core probe is falied (in which case also we get a notification) for any
reason other than EPROBE_DEFER, we have to undo the first stage.

Thansk,
Pavan

2022-06-27 20:21:54

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

Quoting Pavan Kondeti (2022-06-20 01:54:15)
> +Felipe, Bjorn
>
> On Thu, Jun 16, 2022 at 10:15:49AM -0700, Matthias Kaehlcke wrote:
> > On Thu, Jun 16, 2022 at 02:41:10PM +0530, Pavan Kondeti wrote:
> >
> > Good point! It doesn't really ensure that the child is probed (actually it
> > won't be probed and DL_FLAG_AUTOPROBE_CONSUMER doesn't make sense here), it
> > could happen that dwc3_qcom_probe() is deferred multiple times, but eventually
> > the PHYs should be ready and dwc3_probe() be invoked through
> > of_platform_populate().
>
> This is a generic problem i.e if a parent can only proceed after the child
> devices are bounded (i.e probed successfully), how to ensure this behavior
> from the parent's probe? Since we can't block the parent probe (async probe is
> not the default behavior), we have to identify the condition that the children
> are deferring probe, so that parent also can do that.
>
> Can we add a API in drivers core to tell if a device probe is deferred or
> not? This can be done by testing list_empty(&dev->p->deferred_probe) under
> deferred_probe_mutex mutex. The parent can return EPROBE_DEFER based on this
> API return value.
>
> Another alternative would be explicitly checking if the child device suppliers
> are ready or not before adding child device. That would require decoupling
> of_platform_populate() to creating devices and adding devices.
>
> Note that this problem is not just limited to suppliers not ready. if the
> dwc3-qcom is made asynchronous probe, then its child also probed
> asynchronously and there is no guarantee that child would be probed by the
> time of_platform_populate() is returned. The bus notifier might come handy
> in this case. The parent can register for this notifier and waiting for
> the children device's BUS_NOTIFY_BOUND_DRIVER/BUS_NOTIFY_DRIVER_NOT_BOUND
> notifications. This would also work in our case, if we move to
> of_platform_populate() outside the probe().
>
> Would like to hear other people thoughts on this.
>

I'm not following very closely but it sounds like a problem that may be
solved by using the component driver code (see
include/linux/component.h). That would let you move anything that needs
to be done once the child devices probe to the aggregate driver 'bind'
function (see struct component_master_ops::bind).

2022-06-28 06:15:09

by Pavan Kondeti

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Mon, Jun 27, 2022 at 01:02:49PM -0700, Stephen Boyd wrote:
> Quoting Pavan Kondeti (2022-06-20 01:54:15)
> > +Felipe, Bjorn
> >
> > On Thu, Jun 16, 2022 at 10:15:49AM -0700, Matthias Kaehlcke wrote:
> > > On Thu, Jun 16, 2022 at 02:41:10PM +0530, Pavan Kondeti wrote:
> > >
> > > Good point! It doesn't really ensure that the child is probed (actually it
> > > won't be probed and DL_FLAG_AUTOPROBE_CONSUMER doesn't make sense here), it
> > > could happen that dwc3_qcom_probe() is deferred multiple times, but eventually
> > > the PHYs should be ready and dwc3_probe() be invoked through
> > > of_platform_populate().
> >
> > This is a generic problem i.e if a parent can only proceed after the child
> > devices are bounded (i.e probed successfully), how to ensure this behavior
> > from the parent's probe? Since we can't block the parent probe (async probe is
> > not the default behavior), we have to identify the condition that the children
> > are deferring probe, so that parent also can do that.
> >
> > Can we add a API in drivers core to tell if a device probe is deferred or
> > not? This can be done by testing list_empty(&dev->p->deferred_probe) under
> > deferred_probe_mutex mutex. The parent can return EPROBE_DEFER based on this
> > API return value.
> >
> > Another alternative would be explicitly checking if the child device suppliers
> > are ready or not before adding child device. That would require decoupling
> > of_platform_populate() to creating devices and adding devices.
> >
> > Note that this problem is not just limited to suppliers not ready. if the
> > dwc3-qcom is made asynchronous probe, then its child also probed
> > asynchronously and there is no guarantee that child would be probed by the
> > time of_platform_populate() is returned. The bus notifier might come handy
> > in this case. The parent can register for this notifier and waiting for
> > the children device's BUS_NOTIFY_BOUND_DRIVER/BUS_NOTIFY_DRIVER_NOT_BOUND
> > notifications. This would also work in our case, if we move to
> > of_platform_populate() outside the probe().
> >
> > Would like to hear other people thoughts on this.
> >
>
> I'm not following very closely but it sounds like a problem that may be
> solved by using the component driver code (see
> include/linux/component.h). That would let you move anything that needs
> to be done once the child devices probe to the aggregate driver 'bind'
> function (see struct component_master_ops::bind).

Thanks Stephen for letting us know about the component device framework.

IIUC,

- dwc3-qcom (parent of the dwc3 core) registers as a component master by
calling component_master_add_with_match() before calling
of_platform_populate(). The match callback could be as simple as comparing
the device against our child device.

- The dwc3 core (child) at the end of its probe can add as a component by calling
component_add().

- The above triggers the component_master_ops::bind callback implemented in
dwc3-qcom driver which signals that we are good to go.

- The dwc-qcom can call component_bind_all() to finish the formality i.e
telling the dwc3 core that we are good to go.

Is my understanding correct? This is what we are looking for i.e a way for
the child device(s) to signal the parent when the former is bounded.

Also what happens when the child device probe fails for any reason. i.e
component_add() would never be called so the master driver i.e dwc3-qcom would
wait indefinitely. May be it needs to implement a timeout or runtime suspend
etc should take care of keeping the resoures in suspend state.

Thanks,
Pavan

2022-06-29 22:31:55

by Stephen Boyd

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

Quoting Pavan Kondeti (2022-06-27 22:31:48)
> On Mon, Jun 27, 2022 at 01:02:49PM -0700, Stephen Boyd wrote:
> > Quoting Pavan Kondeti (2022-06-20 01:54:15)
> > >
> > > Would like to hear other people thoughts on this.
> > >
> >
> > I'm not following very closely but it sounds like a problem that may be
> > solved by using the component driver code (see
> > include/linux/component.h). That would let you move anything that needs
> > to be done once the child devices probe to the aggregate driver 'bind'
> > function (see struct component_master_ops::bind).
>
> Thanks Stephen for letting us know about the component device framework.
>
> IIUC,
>
> - dwc3-qcom (parent of the dwc3 core) registers as a component master by
> calling component_master_add_with_match() before calling
> of_platform_populate(). The match callback could be as simple as comparing
> the device against our child device.
>
> - The dwc3 core (child) at the end of its probe can add as a component by calling
> component_add().
>
> - The above triggers the component_master_ops::bind callback implemented in
> dwc3-qcom driver which signals that we are good to go.
>
> - The dwc-qcom can call component_bind_all() to finish the formality i.e
> telling the dwc3 core that we are good to go.
>
> Is my understanding correct? This is what we are looking for i.e a way for
> the child device(s) to signal the parent when the former is bounded.

Sounds about right to me.

>
> Also what happens when the child device probe fails for any reason. i.e
> component_add() would never be called so the master driver i.e dwc3-qcom would
> wait indefinitely. May be it needs to implement a timeout or runtime suspend
> etc should take care of keeping the resoures in suspend state.

When the child fails probe, it should return -EPROBE_DEFER if probe
needs to be deferred. Then the driver will attempt probe at a later
time. If probe fails without defer then it will never work and dwc3-qcom
will wait indefinitely. Not much we can do in that situation.

dwc3-qcom should wait for dwc3 core to call component_add() and then do
whatever needs to be done once the dwc3 core is registered in the
dwc3-qcom bind callback. Honestly this may all be a little overkill if
there's only two drivers here, dwc3-qcom and dwc3 core. It could
probably just be some callback from dwc3 core at the end of probe that
calls some function in dwc3-qcom.

2022-06-30 18:27:45

by Krishna Kurapati PSSNV

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend


On 6/30/2022 3:45 AM, Stephen Boyd wrote:
> Quoting Pavan Kondeti (2022-06-27 22:31:48)
>> On Mon, Jun 27, 2022 at 01:02:49PM -0700, Stephen Boyd wrote:
>>> Quoting Pavan Kondeti (2022-06-20 01:54:15)
>>>> Would like to hear other people thoughts on this.
>>>>
>>> I'm not following very closely but it sounds like a problem that may be
>>> solved by using the component driver code (see
>>> include/linux/component.h). That would let you move anything that needs
>>> to be done once the child devices probe to the aggregate driver 'bind'
>>> function (see struct component_master_ops::bind).
>> Thanks Stephen for letting us know about the component device framework.
>>
>> IIUC,
>>
>> - dwc3-qcom (parent of the dwc3 core) registers as a component master by
>> calling component_master_add_with_match() before calling
>> of_platform_populate(). The match callback could be as simple as comparing
>> the device against our child device.
>>
>> - The dwc3 core (child) at the end of its probe can add as a component by calling
>> component_add().
>>
>> - The above triggers the component_master_ops::bind callback implemented in
>> dwc3-qcom driver which signals that we are good to go.
>>
>> - The dwc-qcom can call component_bind_all() to finish the formality i.e
>> telling the dwc3 core that we are good to go.
>>
>> Is my understanding correct? This is what we are looking for i.e a way for
>> the child device(s) to signal the parent when the former is bounded.
> Sounds about right to me.
>
>> Also what happens when the child device probe fails for any reason. i.e
>> component_add() would never be called so the master driver i.e dwc3-qcom would
>> wait indefinitely. May be it needs to implement a timeout or runtime suspend
>> etc should take care of keeping the resoures in suspend state.
> When the child fails probe, it should return -EPROBE_DEFER if probe
> needs to be deferred. Then the driver will attempt probe at a later
> time. If probe fails without defer then it will never work and dwc3-qcom
> will wait indefinitely. Not much we can do in that situation.
Hi Stephen,

  Thanks for the idea. But doesn't adding dwc3 as a component to an agg
driver meanthat this change needs to be done on all glue drivers, as
component_bind_all( ) from master componentis supposed to let the dwc3
core know that we are good to go ?
> dwc3-qcom should wait for dwc3 core to call component_add() and then do
> whatever needs to be done once the dwc3 core is registered in the
> dwc3-qcom bind callback. Honestly this may all be a little overkill if
> there's only two drivers here, dwc3-qcom and dwc3 core. It could
> probably just be some callback from dwc3 core at the end of probe that
> calls some function in dwc3-qcom.
Since the issue we are facing is that the ssphy device links are not ready
causing the dwc3 probe not being invoked, can we add an API as Pavan
suggested
to check if deferred_probe listfor dwc3 device is empty or not andbased on
that we can choose to defer our qcomprobe ? In this case, we don't need to
touch the dwc3 core driver and would be making changesonly in qcom glue
driver.

2022-07-01 01:37:52

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Thu, Jun 30, 2022 at 11:43:01PM +0530, Krishna Kurapati PSSNV wrote:
>
> On 6/30/2022 3:45 AM, Stephen Boyd wrote:
> > Quoting Pavan Kondeti (2022-06-27 22:31:48)
> > > On Mon, Jun 27, 2022 at 01:02:49PM -0700, Stephen Boyd wrote:
> > > > Quoting Pavan Kondeti (2022-06-20 01:54:15)
> > > > > Would like to hear other people thoughts on this.
> > > > >
> > > > I'm not following very closely but it sounds like a problem that may be
> > > > solved by using the component driver code (see
> > > > include/linux/component.h). That would let you move anything that needs
> > > > to be done once the child devices probe to the aggregate driver 'bind'
> > > > function (see struct component_master_ops::bind).
> > > Thanks Stephen for letting us know about the component device framework.
> > >
> > > IIUC,
> > >
> > > - dwc3-qcom (parent of the dwc3 core) registers as a component master by
> > > calling component_master_add_with_match() before calling
> > > of_platform_populate(). The match callback could be as simple as comparing
> > > the device against our child device.
> > >
> > > - The dwc3 core (child) at the end of its probe can add as a component by calling
> > > component_add().
> > >
> > > - The above triggers the component_master_ops::bind callback implemented in
> > > dwc3-qcom driver which signals that we are good to go.
> > >
> > > - The dwc-qcom can call component_bind_all() to finish the formality i.e
> > > telling the dwc3 core that we are good to go.
> > >
> > > Is my understanding correct? This is what we are looking for i.e a way for
> > > the child device(s) to signal the parent when the former is bounded.
> > Sounds about right to me.
> >
> > > Also what happens when the child device probe fails for any reason. i.e
> > > component_add() would never be called so the master driver i.e dwc3-qcom would
> > > wait indefinitely. May be it needs to implement a timeout or runtime suspend
> > > etc should take care of keeping the resoures in suspend state.
> > When the child fails probe, it should return -EPROBE_DEFER if probe
> > needs to be deferred. Then the driver will attempt probe at a later
> > time. If probe fails without defer then it will never work and dwc3-qcom
> > will wait indefinitely. Not much we can do in that situation.
> Hi Stephen,
>
>   Thanks for the idea. But doesn't adding dwc3 as a component to an agg
> driver meanthat this change needs to be done on all glue drivers, as
> component_bind_all( ) from master componentis supposed to let the dwc3
> core know that we are good to go ?

Ideally all glue drivers would add component support, however I don't think
it is strictly necessary. Currently the dwc3 core already assumes that
everything is in place when it is probed. The core could have empty bind()
and unbind() callbacks, with that things in the core would remain
essentially as they are and the core doesn't depend on the glue driver to
call component_bind_all().

> > dwc3-qcom should wait for dwc3 core to call component_add() and then do
> > whatever needs to be done once the dwc3 core is registered in the
> > dwc3-qcom bind callback. Honestly this may all be a little overkill if
> > there's only two drivers here, dwc3-qcom and dwc3 core. It could
> > probably just be some callback from dwc3 core at the end of probe that
> > calls some function in dwc3-qcom.
> Since the issue we are facing is that the ssphy device links are not ready
> causing the dwc3 probe not being invoked, can we add an API as Pavan
> suggested
> to check if deferred_probe listfor dwc3 device is empty or not andbased on
> that we can choose to defer our qcomprobe ? In this case, we don't need to
> touch the dwc3 core driver and would be making changesonly in qcom glue
> driver.

As mentioned above, it shouldn't be necessary to add component support to
all the glue drivers. An API to check for deferred probing is an option,
however there is a possible race condition: When the dwc3-qcom driver checks
for a deferred probe the core could still be probing, in that situation the
glue would proceed before the core driver is ready. That could be avoided
with the component based approach.

2022-07-01 10:49:57

by Pavan Kondeti

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Thu, Jun 30, 2022 at 06:10:50PM -0700, Matthias Kaehlcke wrote:
> > > dwc3-qcom should wait for dwc3 core to call component_add() and then do
> > > whatever needs to be done once the dwc3 core is registered in the
> > > dwc3-qcom bind callback. Honestly this may all be a little overkill if
> > > there's only two drivers here, dwc3-qcom and dwc3 core. It could
> > > probably just be some callback from dwc3 core at the end of probe that
> > > calls some function in dwc3-qcom.
> > Since the issue we are facing is that the ssphy device links are not ready
> > causing the dwc3 probe not being invoked, can we add an API as Pavan
> > suggested
> > to check if deferred_probe listfor dwc3 device is empty or not andbased on
> > that we can choose to defer our qcomprobe ? In this case, we don't need to
> > touch the dwc3 core driver and would be making changesonly in qcom glue
> > driver.
>
> As mentioned above, it shouldn't be necessary to add component support to
> all the glue drivers. An API to check for deferred probing is an option,
> however there is a possible race condition: When the dwc3-qcom driver checks
> for a deferred probe the core could still be probing, in that situation the
> glue would proceed before the core driver is ready. That could be avoided
> with the component based approach.

The race can happen only if asynchronous probe is enabled, otherwise the
child's probe happens synchronously in of_platform_populate()

OTOH, would the below condition suffice for our needs here? if our device
is not bounded to a driver, we check the state of initcalls and return
either error or -EPROBE_DEFER

diff --git a/drivers/usb/dwc3/dwc3-qcom.c b/drivers/usb/dwc3/dwc3-qcom.c
index 7b6eff5..519a503 100644
--- a/drivers/usb/dwc3/dwc3-qcom.c
+++ b/drivers/usb/dwc3/dwc3-qcom.c
@@ -722,6 +722,9 @@ static int dwc3_qcom_of_register_core(struct platform_device *pdev)
dev_err(dev, "failed to get dwc3 platform device\n");
}

+ if (!qcom->dwc3->dev.driver)
+ return driver_deferred_probe_check_state(&qcom->dwc3->dev);
+
node_put:
of_node_put(dwc3_np);

Thanks,
Pavan

2022-07-01 16:37:08

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Fri, Jul 01, 2022 at 03:45:26PM +0530, Pavan Kondeti wrote:
> On Thu, Jun 30, 2022 at 06:10:50PM -0700, Matthias Kaehlcke wrote:
> > > > dwc3-qcom should wait for dwc3 core to call component_add() and then do
> > > > whatever needs to be done once the dwc3 core is registered in the
> > > > dwc3-qcom bind callback. Honestly this may all be a little overkill if
> > > > there's only two drivers here, dwc3-qcom and dwc3 core. It could
> > > > probably just be some callback from dwc3 core at the end of probe that
> > > > calls some function in dwc3-qcom.
> > > Since the issue we are facing is that the ssphy device links are not ready
> > > causing the dwc3 probe not being invoked, can we add an API as Pavan
> > > suggested
> > > to check if deferred_probe listfor dwc3 device is empty or not andbased on
> > > that we can choose to defer our qcomprobe ? In this case, we don't need to
> > > touch the dwc3 core driver and would be making changesonly in qcom glue
> > > driver.
> >
> > As mentioned above, it shouldn't be necessary to add component support to
> > all the glue drivers. An API to check for deferred probing is an option,
> > however there is a possible race condition: When the dwc3-qcom driver checks
> > for a deferred probe the core could still be probing, in that situation the
> > glue would proceed before the core driver is ready. That could be avoided
> > with the component based approach.
>
> The race can happen only if asynchronous probe is enabled, otherwise the
> child's probe happens synchronously in of_platform_populate()

I was thinking about the case where the dwc3-qcom probe is initially deferred,
then the deferred probe starts shortly after (asynchronously) and now the
dwc3-qcom driver does its check. Probably it's not very likely to happen ...

> OTOH, would the below condition suffice for our needs here? if our device
> is not bounded to a driver, we check the state of initcalls and return
> either error or -EPROBE_DEFER
>
> diff --git a/drivers/usb/dwc3/dwc3-qcom.c b/drivers/usb/dwc3/dwc3-qcom.c
> index 7b6eff5..519a503 100644
> --- a/drivers/usb/dwc3/dwc3-qcom.c
> +++ b/drivers/usb/dwc3/dwc3-qcom.c
> @@ -722,6 +722,9 @@ static int dwc3_qcom_of_register_core(struct platform_device *pdev)
> dev_err(dev, "failed to get dwc3 platform device\n");
> }
>
> + if (!qcom->dwc3->dev.driver)
> + return driver_deferred_probe_check_state(&qcom->dwc3->dev);
> +
> node_put:
> of_node_put(dwc3_np);

I like the simplicity of it, no need for new APIs.

The components based approach would be slightly safer, but in practice I
think this should be good enough.

2022-07-13 01:53:05

by Matthias Kaehlcke

[permalink] [raw]
Subject: Re: [PATCH v20 2/5] usb: dwc3: core: Host wake up support from system suspend

On Fri, Jul 08, 2022 at 04:37:19PM +0530, Krishna Kurapati PSSNV wrote:
> On 7/6/2022 12:28 PM, Krishna Kurapati PSSNV wrote:
>
> On 7/1/2022 9:22 PM, Matthias Kaehlcke wrote:
>
> On Fri, Jul 01, 2022 at 03:45:26PM +0530, Pavan Kondeti wrote:
>
> On Thu, Jun 30, 2022 at 06:10:50PM -0700, Matthias Kaehlcke wrote:
>
> dwc3-qcom should wait for dwc3 core to call component_add() and then do
> whatever needs to be done once the dwc3 core is registered in the
> dwc3-qcom bind callback. Honestly this may all be a little overkill if
> there's only two drivers here, dwc3-qcom and dwc3 core. It could
> probably just be some callback from dwc3 core at the end of probe that
> calls some function in dwc3-qcom.
>
> Since the issue we are facing is that the ssphy device links are not ready
> causing the dwc3 probe not being invoked, can we add an API as Pavan
> suggested
> to check if deferred_probe listfor dwc3 device is empty or not andbased on
> that we can choose to defer our qcomprobe ? In this case, we don't need to
> touch the dwc3 core driver and would be making changesonly in qcom glue
> driver.
>
> As mentioned above, it shouldn't be necessary to add component support to
> all the glue drivers. An API to check for deferred probing is an option,
> however there is a possible race condition: When the dwc3-qcom driver checks
> for a deferred probe the core could still be probing, in that situation the
> glue would proceed before the core driver is ready. That could be avoided
> with the component based approach.
>
> The race can happen only if asynchronous probe is enabled, otherwise the
> child's probe happens synchronously in of_platform_populate()
>
> I was thinking about the case where the dwc3-qcom probe is initially deferred,
> then the deferred probe starts shortly after (asynchronously) and now the
> dwc3-qcom driver does its check. Probably it's not very likely to happen ...
>
>
> OTOH, would the below condition suffice for our needs here? if our device
> is not bounded to a driver, we check the state of initcalls and return
> either error or -EPROBE_DEFER
>
> diff --git a/drivers/usb/dwc3/dwc3-qcom.c b/drivers/usb/dwc3/dwc3-qcom.c
> index 7b6eff5..519a503 100644
> --- a/drivers/usb/dwc3/dwc3-qcom.c
> +++ b/drivers/usb/dwc3/dwc3-qcom.c
> @@ -722,6 +722,9 @@ static int dwc3_qcom_of_register_core(struct platform_device
> *pdev)
> dev_err(dev, "failed to get dwc3 platform device\n");
> }
>
> + if (!qcom->dwc3->dev.driver)
> + return driver_deferred_probe_check_state(&qcom->dwc3->dev);
> +
> node_put:
> of_node_put(dwc3_np);
>
> I like the simplicity of it, no need for new APIs.
>
> The components based approach would be slightly safer, but in practice I
> think this should be good enough.
>
> Hi Pavan, Mathias,
> I have tested the suggested code and verified that it works on
> sc7180. I see that the API has been removed recently in the following
> patch :\
> commit 9cbffc7a59561be950ecc675d19a3d2b45202b2b
> Author: Saravana Kannan [1]<[email protected]>
> Date: Wed Jun 1 00:07:05 2022 -0700
> driver core: Delete driver_deferred_probe_check_state()
> Can we make a patch and add it back to the kernel for this purpose ?
> Hi Saravana,
> Can you help suggest if we can revert your patch or make a new one to
> add back the function.

The cover letter [1] of the 'deferred_probe_timeout logic clean up'
series [2] has more context:


A lot of the deferred_probe_timeout logic is redundant with
fw_devlink=on. Also, enabling deferred_probe_timeout by default breaks
a few cases.

This series tries to delete the redundant logic, simplify the frameworks
that use driver_deferred_probe_check_state(), enable
deferred_probe_timeout=10 by default, and fixes the nfsroot failure
case.

The overall idea of this series is to replace the global behavior of
driver_deferred_probe_check_state() where all devices give up waiting on
supplier at the same time with a more granular behavior:

1. Devices with all their suppliers successfully probed by late_initcall
probe as usual and avoid unnecessary deferred probe attempts.

2. At or after late_initcall, in cases where boot would break because of
fw_devlink=on being strict about the ordering, we

a. Temporarily relax the enforcement to probe any unprobed devices
that can probe successfully in the current state of the system.
For example, when we boot with a NFS rootfs and no network device
has probed.
b. Go back to enforcing the ordering for any devices that haven't
probed.

3. After deferred probe timeout expires, we permanently give up waiting
on supplier devices without drivers. At this point, whatever devices
can probe without some of their optional suppliers end up probing.

In the case where module support is disabled, it's fairly
straightforward and all device probes are completed before the initcalls
are done.

[1] https://lore.kernel.org/all/[email protected]/
[2] https://patchwork.kernel.org/project/linux-pm/list/?series=646471&archive=both&state=*


Does anything speak against returning -EPROBE_DEFER directly from dwc3-qcom's
probe()? Now with deferred_probe_timeout > 0 there should be at least no endless
probing.