Hi Dan!

On Tue, Oct 18, 2022 at 10:14:54AM -0500, Dan Vacura wrote:
>On Tue, Oct 18, 2022 at 10:32:33AM -0400, Alan Stern wrote:
>> On Tue, Oct 18, 2022 at 02:27:13PM +0100, Dan Scally wrote:
>> > On 17/10/2022 21:54, Dan Vacura wrote:
>> > > The scatter gather support doesn't appear to work well with some UDC hw.
>> > > Add the ability to turn on the feature depending on the controller in
>> > > use.
>> > >
>> > > Signed-off-by: Dan Vacura <[email protected]>
>> >
>> >
>> > Nitpick: I would call it use_sg everywhere, but either way:
>> >
>> >
>> > Reviewed-by: Daniel Scally <[email protected]>
>> >
>> > Tested-by: Daniel Scally <[email protected]>
>> >
>> > > ---
>> > > V1 -> V2:
>> > > - no change, new patch in series
>> > > V2 -> V3:
>> > > - default on, same as baseline
>> > >
>> > > Documentation/ABI/testing/configfs-usb-gadget-uvc | 1 +
>> > > Documentation/usb/gadget-testing.rst | 2 ++
>> > > drivers/usb/gadget/function/f_uvc.c | 2 ++
>> > > drivers/usb/gadget/function/u_uvc.h | 1 +
>> > > drivers/usb/gadget/function/uvc_configfs.c | 2 ++
>> > > drivers/usb/gadget/function/uvc_queue.c | 4 ++--
>> > > 6 files changed, 10 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/Documentation/ABI/testing/configfs-usb-gadget-uvc b/Documentation/ABI/testing/configfs-usb-gadget-uvc
>> > > index 5dfaa3f7f6a4..839a75fc28ee 100644
>> > > --- a/Documentation/ABI/testing/configfs-usb-gadget-uvc
>> > > +++ b/Documentation/ABI/testing/configfs-usb-gadget-uvc
>> > > @@ -9,6 +9,7 @@ Description: UVC function directory
>> > > streaming_interval 1..16
>> > > function_name string [32]
>> > > req_int_skip_div unsigned int
>> > > + sg_supported 0..1
>> > > =================== =============================
>> > > What: /config/usb-gadget/gadget/functions/uvc.name/control
>> > > diff --git a/Documentation/usb/gadget-testing.rst b/Documentation/usb/gadget-testing.rst
>> > > index f9b5a09be1f4..8e3072d6a590 100644
>> > > --- a/Documentation/usb/gadget-testing.rst
>> > > +++ b/Documentation/usb/gadget-testing.rst
>> > > @@ -796,6 +796,8 @@ The uvc function provides these attributes in its function directory:
>> > > function_name name of the interface
>> > > req_int_skip_div divisor of total requests to aid in calculating
>> > > interrupt frequency, 0 indicates all interrupt
>> > > + sg_supported allow for scatter gather to be used if the UDC
>> > > + hw supports it
>>
>> Why is a configuration option needed for this? Why not always use SG
>> when the UDC supports it? Or at least, make the decision automatically
>> (say, based on the amount of data to be transferred) with no need for
>> any user input?
>
>Patches for a fix and to select to use SG depending on amount of data
>are already submitted and under review. I agree, ideally we don't need
>this patch, but there have been several regressions uncovered with
>enabling this support and it takes time to root cause these issues.
>

>In my specific environment, Android GKI 2.0, changes need to get
>upstreamed first here before they're pulled into Android device
>software.

In fact, this is a good policy, but adding workarounds to mainline in order
to "hopefully" fix the real problems later is probably not what this
policy is about. Hopefully!

Michael

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


2022-10-18 16:28:34

by Dan Vacura


Subject: Re: [PATCH v3 6/6] usb: gadget: uvc: add configfs option for sg support

Hi Alan,

On Tue, Oct 18, 2022 at 10:32:33AM -0400, Alan Stern wrote:
> On Tue, Oct 18, 2022 at 02:27:13PM +0100, Dan Scally wrote:
> > Hi Dan
> >
> > On 17/10/2022 21:54, Dan Vacura wrote:
> > > The scatter gather support doesn't appear to work well with some UDC hw.
> > > Add the ability to turn on the feature depending on the controller in
> > > use.
> > >
> > > Signed-off-by: Dan Vacura <[email protected]>
> >
> >
> > Nitpick: I would call it use_sg everywhere, but either way:
> >
> >
> > Reviewed-by: Daniel Scally <[email protected]>
> >
> > Tested-by: Daniel Scally <[email protected]>
> >
> > > ---
> > > V1 -> V2:
> > > > - no change, new patch in series
> > > V2 -> V3:
> > > - default on, same as baseline
> > >
> > > Documentation/ABI/testing/configfs-usb-gadget-uvc | 1 +
> > > Documentation/usb/gadget-testing.rst | 2 ++
> > > drivers/usb/gadget/function/f_uvc.c | 2 ++
> > > drivers/usb/gadget/function/u_uvc.h | 1 +
> > > drivers/usb/gadget/function/uvc_configfs.c | 2 ++
> > > drivers/usb/gadget/function/uvc_queue.c | 4 ++--
> > > 6 files changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/Documentation/ABI/testing/configfs-usb-gadget-uvc b/Documentation/ABI/testing/configfs-usb-gadget-uvc
> > > index 5dfaa3f7f6a4..839a75fc28ee 100644
> > > --- a/Documentation/ABI/testing/configfs-usb-gadget-uvc
> > > +++ b/Documentation/ABI/testing/configfs-usb-gadget-uvc
> > > @@ -9,6 +9,7 @@ Description: UVC function directory
> > > streaming_interval 1..16
> > > function_name string [32]
> > > req_int_skip_div unsigned int
> > > + sg_supported 0..1
> > > =================== =============================
> > > What: /config/usb-gadget/gadget/functions/uvc.name/control
> > > diff --git a/Documentation/usb/gadget-testing.rst b/Documentation/usb/gadget-testing.rst
> > > index f9b5a09be1f4..8e3072d6a590 100644
> > > --- a/Documentation/usb/gadget-testing.rst
> > > +++ b/Documentation/usb/gadget-testing.rst
> > > @@ -796,6 +796,8 @@ The uvc function provides these attributes in its function directory:
> > > function_name name of the interface
> > > req_int_skip_div divisor of total requests to aid in calculating
> > > interrupt frequency, 0 indicates all interrupt
> > > + sg_supported allow for scatter gather to be used if the UDC
> > > + hw supports it
>
> Why is a configuration option needed for this? Why not always use SG
> when the UDC supports it? Or at least, make the decision automatically
> (say, based on the amount of data to be transferred) with no need for
> any user input?

Patches for a fix and to select to use SG depending on amount of data
are already submitted and under review. I agree, ideally we don't need
this patch, but there have been several regressions uncovered with
enabling this support and it takes time to root cause these issues.

In my specific environment, Android GKI 2.0, changes need to get
upstreamed first here before they're pulled into Android device
software. Having this logic in place gives us the ability to turn off
this functionality without going through this process. A revert was also
considered until all the bugs are resolved, but the code is quite
entrenched now to take out, plus others seem to benefit from it being
enabled. Thus the configurability.

>
> Is this because the SG support in some UDC drivers is buggy? In that
> case the proper approach is to fix the UDC drivers, not add new options
> that users won't know when to use.
>
> Or is it because the UDC hardware itself is buggy? In that case the
> best approach is to fix the UDC drivers so that they don't advertise
> working SG support when the hardware is unable to handle it.
>
> Alan Stern

2022-10-18 19:16:53

by Thinh Nguyen


Subject: Re: [PATCH v3 2/6] usb: dwc3: gadget: cancel requests instead of release after missed isoc

Hi Dan,

On Mon, Oct 17, 2022, Dan Vacura wrote:
> Hi Thinh,
>
> On Mon, Oct 17, 2022 at 09:30:38PM +0000, Thinh Nguyen wrote:
> > On Mon, Oct 17, 2022, Dan Vacura wrote:
> > > From: Jeff Vanhoof <[email protected]>
> > >
> > > arm-smmu related crashes seen after a Missed ISOC interrupt when
> > > no_interrupt=1 is used. This can happen if the hardware is still using
> > > the data associated with a TRB after the usb_request's ->complete call
> > > has been made. Instead of immediately releasing a request when a Missed
> > > ISOC interrupt has occurred, this change will add logic to cancel the
> > > request instead where it will eventually be released when the
> > > END_TRANSFER command has completed. This logic is similar to some of the
> > > cleanup done in dwc3_gadget_ep_dequeue.
> >
> > This doesn't sound right. How did you determine that the hardware is
> > still using the data associated with the TRB? Did you check the TRB's
> > HWO bit?
>
> The problem we're seeing was mentioned in the summary of this patch
> series, issue #1. Basically, with the following patch
> https://patchwork.kernel.org/project/linux-usb/patch/[email protected]/
> integrated a smmu panic is occurring on our Android device with the 5.15
> kernel which is:
>
> <3>[ 718.314900][ T803] arm-smmu 15000000.apps-smmu: Unhandled arm-smmu context fault from a600000.dwc3!
>
> The uvc gadget driver appears to be the first (and only) gadget that
> uses the no_interrupt=1 logic, so this seems to be a new condition for
> the dwc3 driver. In our configuration, we have up to 64 requests and the
> no_interrupt=1 for up to 15 requests. The list size of dep->started_list
> would get up to that amount when looping through to cleanup the
> completed requests. From testing and debugging the smmu panic occurs
> when a -EXDEV status shows up and right after
> dwc3_gadget_ep_cleanup_completed_request() was visited. The conclusion
> we had was the requests were getting returned to the gadget too early.

As I mentioned, if the status is updated to missed isoc, that means that
the controller returned ownership of the TRB to the driver. At least for
the particular request with -EXDEV, its TRBs are completed. I'm not
clear on your conclusion.

Do we know where the crash occurred? Is it from the dwc3 driver or from the
uvc driver, and at what line? It'd be great if we could see the driver log.

>
> >
> > The dwc3 driver would only give back the requests if the TRBs of the
> > associated requests are completed or when the device is disconnected.
> > If the TRB indicated missed isoc, that means that the TRB is completed
> > and its status was updated.
>
> Interesting, the device is not disconnected as we don't get the
> -ESHUTDOWN status back and with this patch in place things continue
> after a -EXDEV status is received.
>

Actually, minor correction here: a recent change
b44c0e7fef51 ("usb: dwc3: gadget: conditionally remove requests")
changed the request status from -ESHUTDOWN to -ECONNRESET when an
endpoint is disabled. This doesn't look right.

While disabling an endpoint may also happen in other cases, such as
switching the alternate interface, in addition to disconnect, -ESHUTDOWN
seems more fitting there.

Hi Michael,

Can you help clarify the change above? It changed how request statuses
are used: requests returned on disconnection are no longer returned
with -ESHUTDOWN.

> >
> > There's a special case in which dwc3 may give back requests early: the
> > case of the device disconnecting. The requests should be returned with
> > -ESHUTDOWN, and the gadget driver shouldn't be re-using the requests on
> > de-initialization anyway.
> >
> > We should not issue End Transfer command just because of missed isoc. We
> > may want to issue End Transfer if the gadget driver is too slow and unable
> > to feed requests in time (causing underrun and missed isoc) to resync
> > with the host, but we already handle that.
>
> Hmm, isn't that what happens when we get into this
> condition in dwc3_gadget_endpoint_trbs_complete():
>
> if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> list_empty(&dep->started_list) &&
> (list_empty(&dep->pending_list) || status == -EXDEV))
> dwc3_stop_active_transfer(dep, true, true);
>

Yes, it's being handled there.

> >
> > I'm still not clear what's the problem you're seeing. Do you have the
> > crash log? Tracepoints?
> >
>
> Appreciate the support!
>

Thanks,
Thinh

2022-10-18 19:53:34

On Fri, Oct 21, 2022 at 07:09:55PM +0000, Thinh Nguyen wrote:
> On Fri, Oct 21, 2022, Jeff Vanhoof wrote:
> > Hi Thinh,
> >
> > On Fri, Oct 21, 2022 at 04:43:52PM +0000, Thinh Nguyen wrote:
> > > On Fri, Oct 21, 2022, Jeff Vanhoof wrote:
> > > > Hi Thinh,
> > > >
> > > > On Fri, Oct 21, 2022 at 12:55:51AM +0000, Thinh Nguyen wrote:
> > > > > On Thu, Oct 20, 2022, Thinh Nguyen wrote:
> > > > > > On Thu, Oct 20, 2022, Jeff Vanhoof wrote:
> > > > > > > Hi Thinh,
> > > > > > >
> > > > > > > On Wed, Oct 19, 2022 at 11:06:08PM +0000, Thinh Nguyen wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > On Wed, Oct 19, 2022, Jeff Vanhoof wrote:
> > > > > > > > > Hi Thinh,
> > > > > > > > > On Wed, Oct 19, 2022 at 07:08:27PM +0000, Thinh Nguyen wrote:
> > > > > > > > > > On Wed, Oct 19, 2022, Jeff Vanhoof wrote:
> > > > > > > >
> > > > > > > > <snip>
> > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > From what I can gather from the log, with the current changes it seems that
> > > > > > > > > > > after a missed isoc event few requests are staying longer than expected in the
> > > > > > > > > > > started_list (not getting reclaimed) and this is preventing the transmission
> > > > > > > > > > > from stopping/starting again, and opening the door for continuous stream of
> > > > > > > > > > > missed isoc events that cause what appears to the user as a frozen video.
> > > > > > > > > > >
> > > > > > > > > > > So one thought, if IOC bit is not set every frame, but IMI bit is, when a
> > > > > > > > > > > missed isoc related interrupt occurs it seems likely that more than one trb
> > > > > > > > > > > request will need to be reclaimed, but the current set of changes is not
> > > > > > > > > > > handling this.
> > > > > > > > > > >
> > > > > > > > > > > In the good transfer case this issue seems to be taken care of since the IOC
> > > > > > > > > > > > bit is not set every frame and the reclamation will loop through every item in
> > > > > > > > > > > the started_list and only stop if there are no additional trbs or if one has
> > > > > > > > > >
> > > > > > > > > > It should stop at the request that associated with the interrupt event,
> > > > > > > > > > whether it's because of IMI or IOC.
> > > > > > > > >
> > > > > > > > > In this case I was concerned that if multiple queued reqs did not have IOC bit
> > > > > > > > > set, but there was a missed isoc on one of the last reqs, whether or not we would
> > > > > > > > > reclaim all of the requests up to the missed isoc related req. I'm not sure if
> > > > > > > > > my concern is valid or not.
> > > > > > > > >
> > > > > > > >
> > > > > > > > There should be no problem. If there's an interrupt event indicating a
> > > > > > > > TRB completion, the driver will give back all the requests up to the
> > > > > > > > request associated with the interrupt event, and the controller will
> > > > > > > > continue processing the remaining TRBs. On the next TRB completion
> > > > > > > > event, the driver will again give back all the requests up to the
> > > > > > > > request associated with that event.
> > > > > > > >
> > > > > > >
> > > > > > > I was testing with the following patch you suggested:
> > > > > > >
> > > > > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > > > > > index 61fba2b7389b..8352f4b5dd9f 100644
> > > > > > > > --- a/drivers/usb/dwc3/gadget.c
> > > > > > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > > > > > @@ -3657,6 +3657,10 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
> > > > > > > > if (event->status & DEPEVT_STATUS_SHORT && !chain)
> > > > > > > > return 1;
> > > > > > > >
> > > > > > > > + if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> > > > > > > > + (event->status & DEPEVT_STATUS_MISSED_ISOC) && !chain)
> > > > > > > > + return 1;
> > > > > > > > +
> > > > > > > > if ((trb->ctrl & DWC3_TRB_CTRL_IOC) ||
> > > > > > > > (trb->ctrl & DWC3_TRB_CTRL_LST))
> > > > > > > > return 1;
> > > > > > > >
> > > > > > >
> > > > > > > At this time the IMI bit was set for every frame. With these changes it
> > > > > > > appeared in case of missed isoc that sometimes not all requests would be
> > > > > > > reclaimed (enqueued != dequeued even 100ms after the last interrupt was
> > > > > > > handled). If the 1st req in the started_list was fine (IMI set, but not IOC),
> > > > > > > and a later req was the one actually missed, because of this status check the
> > > > > > > reclamation could stop early and not clean up to the appropriate req. As
> > > > > >
> > > > > > Oops. You're right.
> > > > > >
> > > > > > > suggested yesterday, I also tried only setting the IMI bit when no_interrupt is
> > > > > > > not set, however I was still seeing the complete freezes. After analyzing this
> > > > > > > issue a bit, I have updated the diff to look more like this:
> > > > > > >
> > > > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > > > > index dfaf9ac24c4f..bb800a81815b 100644
> > > > > > > --- a/drivers/usb/dwc3/gadget.c
> > > > > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > > > > @@ -1230,8 +1230,9 @@ static void __dwc3_prepare_one_trb(struct dwc3_ep *dep, struct dwc3_trb *trb,
> > > > > > > trb->ctrl = DWC3_TRBCTL_ISOCHRONOUS;
> > > > > > > }
> > > > > > >
> > > > > > > - /* always enable Interrupt on Missed ISOC */
> > > > > > > - trb->ctrl |= DWC3_TRB_CTRL_ISP_IMI;
> > > > > > > + /* enable Interrupt on Missed ISOC */
> > > > > > > + if ((!no_interrupt && !chain) || must_interrupt)
> > > > > > > + trb->ctrl |= DWC3_TRB_CTRL_ISP_IMI;
> > > > > > > break;
> > > > > >
> > > > > > Either all or none of the TRBs of a request is set with IMI, and not
> > > > > > some.
> > > > > >
> > > > > > >
> > > > > > > case USB_ENDPOINT_XFER_BULK:
> > > > > > > @@ -3195,6 +3196,11 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
> > > > > > > if (event->status & DEPEVT_STATUS_SHORT && !chain)
> > > > > > > return 1;
> > > > > > >
> > > > > > > + if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> > > > > > > + (event->status & DEPEVT_STATUS_MISSED_ISOC) && !chain
> > > > > > > + && (trb->ctrl & DWC3_TRB_CTRL_ISP_IMI))
> > > > > > > + return 1;
> > > > > > > +
> > > > > > > if ((trb->ctrl & DWC3_TRB_CTRL_IOC) ||
> > > > > > > (trb->ctrl & DWC3_TRB_CTRL_LST))
> > > > > > > return 1;
> > > > > > >
> > > > > > > Where the trb must have the IMI set before returning early. This seemed to make
> > > > > > > the freezes recoverable.
> > > > > >
> > > > > > Can you try this revised change:
> > > > > >
> > > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > > > index 61fba2b7389b..a69d8c28d86b 100644
> > > > > > --- a/drivers/usb/dwc3/gadget.c
> > > > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > > > @@ -3654,7 +3654,7 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
> > > > > > if ((trb->ctrl & DWC3_TRB_CTRL_HWO) && status != -ESHUTDOWN)
> > > > > > return 1;
> > > > > >
> > > > > > - if (event->status & DEPEVT_STATUS_SHORT && !chain)
> > > > >
> > > > > I accidentally deleted a couple of lines here.
> > > > >
> > > > > > + if (DWC3_TRB_SIZE_TRBSTS(trb->size) == DWC3_TRBSTS_MISSED_ISOC && !chain)
> > > > > > return 1;
> > > > > >
> > > > > > if ((trb->ctrl & DWC3_TRB_CTRL_IOC) ||
> > > > >
> > > > > I meant to do this:
> > > > >
> > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > > index 61fba2b7389b..cb65371572ee 100644
> > > > > --- a/drivers/usb/dwc3/gadget.c
> > > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > > @@ -3657,6 +3657,9 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
> > > > > if (event->status & DEPEVT_STATUS_SHORT && !chain)
> > > > > return 1;
> > > > >
> > > > > + if (DWC3_TRB_SIZE_TRBSTS(trb->size) == DWC3_TRBSTS_MISSED_ISOC && !chain)
> > > > > + return 1;
> > > > > +
> > > > > if ((trb->ctrl & DWC3_TRB_CTRL_IOC) ||
> > > > > (trb->ctrl & DWC3_TRB_CTRL_LST))
> > > > > return 1;
> > > > > @@ -3673,6 +3676,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
> > > > > struct scatterlist *s;
> > > > > unsigned int num_queued = req->num_queued_sgs;
> > > > > unsigned int i;
> > > > > + bool missed_isoc = false;
> > > > > int ret = 0;
> > > > >
> > > > > for_each_sg(sg, s, num_queued, i) {
> > > > > @@ -3681,12 +3685,18 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
> > > > > req->sg = sg_next(s);
> > > > > req->num_queued_sgs--;
> > > > >
> > > > > + if (DWC3_TRB_SIZE_TRBSTS(trb->size) == DWC3_TRBSTS_MISSED_ISOC)
> > > > > + missed_isoc = true;
> > > > > +
> > > > > ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req,
> > > > > trb, event, status, true);
> > > > > if (ret)
> > > > > break;
> > > > > }
> > > > >
> > > > > + if (missed_isoc)
> > > > > + ret = 1;
> > > > > +
> > > > > return ret;
> > > > > }
> > > > >
> > > > >
> > > > > BR,
> > > > > Thinh
> > > >
> > > > I tried out the following patch diff you provided and I did not see any iommu
> > > > related crashes with these changes:
> > > >
> > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > index dfaf9ac24c4f..50287437d6de 100644
> > > > --- a/drivers/usb/dwc3/gadget.c
> > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > @@ -3195,6 +3195,9 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
> > > > if (event->status & DEPEVT_STATUS_SHORT && !chain)
> > > > return 1;
> > > >
> > > > + if (DWC3_TRB_SIZE_TRBSTS(trb->size) == DWC3_TRBSTS_MISSED_ISOC && !chain)
> > > > + return 1;
> > > > +
> > > > if ((trb->ctrl & DWC3_TRB_CTRL_IOC) ||
> > > > (trb->ctrl & DWC3_TRB_CTRL_LST))
> > > > return 1;
> > > > @@ -3211,6 +3214,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
> > > > struct scatterlist *s;
> > > > unsigned int num_queued = req->num_queued_sgs;
> > > > unsigned int i;
> > > > + bool missed_isoc = false;
> > > > int ret = 0;
> > > >
> > > > for_each_sg(sg, s, num_queued, i) {
> > > > @@ -3219,12 +3223,18 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
> > > > req->sg = sg_next(s);
> > > > req->num_queued_sgs--;
> > > >
> > > > + if (DWC3_TRB_SIZE_TRBSTS(trb->size) == DWC3_TRBSTS_MISSED_ISOC)
> > > > + missed_isoc = true;
> > > > +
> > > > ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req,
> > > > trb, event, status, true);
> > > > if (ret)
> > > > break;
> > > > }
> > > >
> > > > + if (missed_isoc)
> > > > + ret = 1;
> > > > +
> > > > return ret;
> > > > }
> > > >
> > > >
> > > > As we discussed earlier, when uvc's complete function is called, if an -EXDEV
> > > > is returned in the request's status, the uvc driver will begin to cancel its
> > > > queue. With the current skip interrupt implementation in the uvc driver, if
> > > > this occurs while the uvc driver is pumping the current frame, then there is no
> > > > guarantee that the last request(s) will have had 'no_interrupt=0'. If the last
> > > > requests passed to dwc3 had 'no_interrupt=1', these requests would eventually
> > > > be placed at the end of the started_list. Since the IOC bit will not be set,
> > > > and if no missed isoc event occurs on these requests, then the dwc3 driver will
> > > > not be interrupted, leaving those remaining requests sitting in the
> > > > started_list, and dwc3 will not perform an 'End Transfer' as expected. Once the
> > > > uvc driver begins to pump the requests for the next frame, then it most likely
> > > > will result in additional missed isoc events, with the result being an extended
> > > > video freeze seen by the user.
> > > >
> > > > I hope that other uvc driver maintainers can chime in here to help determine the
> > > > correct path forward. With the skip interrupt implementation, the uvc driver should
> > > > guarantee that the last request sent to dwc3 has 'no_interrupt=0', otherwise
> > >
> > > Rather than guaranteeing no_interrupt or not, it's more important that
> > > the UVC maintains a constant queue of requests to the controller driver.
> > > Isoc transfers are meant to be sent at the constant rate for which the
> > > endpoint is configured.
> > >
> >
> > I agree with you on this, but it will probably always be a race with uvc
> > queuing up one req at a time and dwc3 starting to transmit almost immediately.
> > We can configure the streaming_interval on a product to kind of slow down or
>
> No, it doesn't work that way. The controller would only send the data
> when the host requests for it. The host will request for data every
> 125us. So a UVC request will complete roughly every 125us. There should
> be no race with uvc.
>

I wasn't very clear here. I was talking about the UVC Gadget Driver, about how
uvcg_video_pump queues up requests to the dwc3 driver via uvcg_video_ep_queue
(->usb_ep_queue->dwc3_gadget_ep_queue).

> > delay the usb transfers, but between dwc3 and uvc driver it would be nice to
> > have an interface that would allow pre-queuing a certain number of reqs before
> > the transfer is actually started. If that is not possible, then uvc could
> > instead prepare a number of reqs ahead of time and attempt to queue them each
> > as fast as possible in a very tight loop.
>
> All the UVC gadget driver has to do is maintain multiple requests
> prepared ahead of time and not starve the controller driver. That is,
> don't let the queue reach 0. Let's say uvc can only pump 16 requests
> at a time, split at least every 8 request with no_interrupt=0. So that
> uvc will have time to feed more requests when it gets the notification
> of 8 requests completed. It would have roughly 8x125us to queue more
> requests before underrun.
>

I think we are saying the same thing. I'll try to be more clear going forward. :)
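
The batching Thinh describes above can be sketched as a small stand-alone C model. This is only an illustration, not code from the uvc gadget or dwc3 drivers: `toy_req`, `toy_mark_interrupts` and `toy_refill_budget_us` are hypothetical names, the 125us interval is the figure quoted in the mail, and forcing an interrupt on the last request of a batch is an assumption taken from the concern raised elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Service interval quoted in the thread: one isoc request per 125us. */
#define TOY_UFRAME_US 125u

/* Hypothetical stand-in for the one usb_request field that matters here. */
struct toy_req {
	bool no_interrupt;
};

/*
 * Mark a batch of n requests so that every `every`-th request raises a
 * completion interrupt (no_interrupt = false). The final request of the
 * batch always interrupts too, so nothing is left without a notification.
 * Returns the number of requests that will interrupt.
 */
static unsigned int toy_mark_interrupts(struct toy_req *reqs, size_t n,
					unsigned int every)
{
	unsigned int interrupts = 0;

	assert(reqs != NULL && every > 0);

	for (size_t i = 0; i < n; i++) {
		bool want_irq = ((i + 1) % every == 0) || (i + 1 == n);

		reqs[i].no_interrupt = !want_irq;
		if (want_irq)
			interrupts++;
	}
	return interrupts;
}

/*
 * Time left to queue more work when `outstanding` requests are still
 * queued at the moment the completion notification arrives.
 */
static unsigned int toy_refill_budget_us(unsigned int outstanding)
{
	return outstanding * TOY_UFRAME_US;
}
```

With 16 requests and an interrupt every 8, the first notification arrives while 8 requests are still queued, which is the "roughly 8x125us" refill window mentioned above.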

> >
> > > I recalled Dan mentioned that UVC gadget driver can queue up to 64
> > > requests with no_interrupt=1 up to 15 requests. But I keep seeing that
> > > the gadget driver only "pumps" 16 requests and doesn't continue until
> > > they are completed. We can almost guarantee that it's going to be
> > > underrun. Can UVC "pump" multiple times at once?
> > >
> >
> > uvc will usually pump when new frames come in or when a req's complete gets
> > called. uvc should fill up front all the reqs required to transfer a frame (up
> > to 64), but once the available reqs are filled or the frame completely queued
> > up, it would take a new incoming frame or a kick via the complete call to have
> > it attempt to fill any remaining reqs in the queue again. To me this looks
> > ok to do, but for heavy transfers we have a somewhat smaller queue as a buffer,
> > 48 reqs vs 64 reqs (64 - 16).
> >
> > To note, for the very last request of a frame/buffer (the end of it) the uvc
> > driver does set the no_interrupt to 0 for this request.
> >
> > > > if a missed isoc error occurs, it becomes very likely that the next immediate set of
> > > > frames could be dropped/cancelled because the dwc3 driver could not perform a timely
> > > > 'End Transfer'.
> > > >
> > > > For testing I implemented the following changes to see what I could do for this
> > > > issue. Note that I am on an older implementation and it's missing a lot of the
> > >
> > > Please use the latest kernel, there are a lot of fixes/improvement to
> > > dwc3 every kernel version.
> > >
> >
> > I've been debugging the sg implementation and the skip interrupt implementation
> > separately, backporting what can be backported. I'm working off of a 5.10
> > kernel debugging various issues Dan Vacura was seeing on a 5.15 kernel on a
> > newer product. What we had on our 5.10 based kernel was stable for uvc/dwc3, so
> > we needed to understand what came in since that time that broke stability.
> >
> > > > sg related implementation. The idea here is that if the queue is empty, and that
> > > > req_int_count is non-zero then the last request likely had 'no_interrupt=1' set.
> > > > And if this is the case then we will want to send some dummy request to dwc3 with
> > > > 'no_interrupt=0' set to make sure that no requests get stuck in its started_list.
> > >
> > > This is not efficient and unnecessary.
> > >
> >
> > Agree, but to fix this in the uvc driver the correct way seemed a bit more
> > complicated at first. I was thinking that the driver would always send one last
> > request of the frame buffer once an error is seen, but I'm now thinking of a
> > simpler solution. If we can update the uvc pump to prepare a number of requests
> > and make sure that the last request has no_interrupt set to 0 before queuing
> > them all up in a tight loop to the dwc3 driver, this would effectively solve
> > this problem too. A bit of extra smarts in this area might also address some
> > of your concerns about uvc not pumping a lot of data at once (especially for
> > the beginning of a frame). I believe we should prepare more reqs up front
> > if the req queue is empty and less reqs if its already busy.
> >
> > I'm hoping that someone can step up to help here :). If not, then this will be
> > my next activity.
> >
> > > <snip>
> > >
> > > >
> > > >
> > > > Alternatively we may just not want to cancel the queue upon receiving -EXDEV
> > > > and this could solve the problem too, but I don't think that it's such a great
> > > > idea, especially if things start falling behind.
> > > >
> > > > I hope that someone more fluent in this area of code can take a crack at
> > > > improving/fixing this issue.
> > > >
> > > > The changes above do seem to help dwc3 timely end its transfers, but mainly for
> > > > cases where some requests are missed but the next immediate ones are not (i'm
> > > > talking within a couple of hundred microseconds). Most of the time if missed
> > > > isocs occurs for a frame that the remaining reqs in the started_list will
> > > > likely also error out and the list will be emptied and dwc3 will still timely
> > > > send 'End Transfer'. In reality this is to cover a corner case that can
> > > > adversely affect the quality of the video being watched. Just wanted to be
> > > > upfront with these details.
> > > >
> > > > Thinh, any pointers on how we should proceed from here? It looks like your
> > > > changes are working well.
> > > >
> > >
> > > You can add the underrun detection check to dwc3 whenever it receives a
> > > new request.
> > >
> > > ie. When the new request comes, check if the last prepared TRB's HWO bit
> > > is cleared and if the endpoint is started, send End Transfer command to
> > > reschedule the isoc transfers for the incoming requests.
> > >
> > > This is probably the simpler workaround to the underrun issue of UVC.
> > >
> >
> > This sounds like a good optimization too by itself. Would it be possible for
> > you to implement something here to help get me started? Even if it's not
> > perfect, I'll take what I can get. We are running up against the clock for
> > trying to close things out (changes must be released to mainline and backported
> > to 5.15 for Android).
> >
>
> At the moment, I have very limited bandwidth to implement and run tests.
> However, if you or someone provide a patch, I can help review.
>
> It should look something like this (pseudo code):
>
> ep_queue(request) {
> add_pending_list(request)
> if (ep is isoc and enabled) {
> if (prev_trb.HWO == 0) {
> send_cmd(End Transfer)
> return;
> }
> }
> }
>
> On End Transfer command completion:
> reclaim_and_giveback_requests(started_list)
>
> BR,
> Thinh

Thanks,
Jeff
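
For readers following along, the pseudo code quoted above can be turned into a small stand-alone C model. It is a sketch under stated assumptions, not the dwc3 implementation: `toy_trb`, `toy_ep` and `toy_ep_queue` are hypothetical names, "enabled" in the pseudo code is read as "transfer started" as in the prose description, and issuing End Transfer is reduced to bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real dwc3 structures. */
struct toy_trb {
	bool hwo;			/* true while the hardware owns the TRB */
};

struct toy_ep {
	bool is_isoc;
	bool started;			/* a transfer is currently started */
	const struct toy_trb *prev_trb;	/* last prepared TRB, NULL if none */
	unsigned int pending;		/* requests on the pending list */
	unsigned int end_transfers;	/* End Transfer commands issued */
};

/*
 * Queue one request. If the endpoint is isoc, a transfer is started and
 * the last prepared TRB has already been handed back by the hardware
 * (HWO == 0), the controller has run dry: issue End Transfer so the
 * pending requests can be rescheduled. Returns true when an underrun
 * was detected.
 */
static bool toy_ep_queue(struct toy_ep *ep)
{
	assert(ep != NULL);

	ep->pending++;			/* add_pending_list(request) */

	if (ep->is_isoc && ep->started &&
	    ep->prev_trb != NULL && !ep->prev_trb->hwo) {
		ep->end_transfers++;	/* send_cmd(End Transfer) */
		ep->started = false;
		return true;
	}
	return false;
}
```

The new request stays on the pending list in both branches, so nothing queued during an underrun is given back before the End Transfer command completes.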

2024-02-22 00:04:15

On Thu, Feb 22, 2024 at 01:20:04AM +0000, Thinh Nguyen wrote:
>On Thu, Feb 22, 2024, Michael Grzeschik wrote:
>> For #2: I found an issue in the handling of the completion of requests in
>> the started list. When the interrupt handler is *explicitly* calling
>> stop_active_transfer if the overall event of the request was a missed
>> event. This event value only represents the value of the request that
>> was actually triggering the interrupt.
>>
>> It also calls ep_cleanup_completed_requests and is iterating over the
>> started requests and will call giveback/complete functions of the
>> requests with the proper request status.
>>
>> So this will also catch missed requests in the queue. However, since
>> there might be, lets say 5 good requests and one missed request, what
>> will happen is, that each complete call for the first good requests will
>> enqueue new requests into the started list and will also call the
>> updatecmd on that transfer that was already missed until the loop will
>> reach the one request with the MISSED status bit set.
>>
>> So in my opinion the patch from Jeff makes sense when adding the
>> following change aswell. With those both changes the underruns and
>> broken frames finally disappear. I am still unsure about the complete
>> solution about that, since with this the mentioned 5 good requests
>> will be cancelled aswell. So this is still a WIP status here.
>>
>
>When the dwc3 driver issues stop_active_transfer(), that means that the
>started_list is empty and there is an underrun.

At the moment this is only the case when both the pending and started lists
are empty, or when the interrupt event was EXDEV.

The main problem is that the function
dwc3_gadget_ep_cleanup_completed_requests(dep, event, status) will
issue a completion for each started request, which in turn will
refill the pending list, and therefore after that refill
stop_active_transfer is currently never hit.

>It treats the incoming requests as staled. However, for UVC, they are
>still "good".

Right, so in that case we can requeue them anyway. But this will have to
be done after the stop transfer cmd has finished.

>I think you can just check if the started_list is empty before queuing
>new requests. If it is, perform stop_active_transfer() to reschedule the
>incoming requests. None of the newly queue requests will be released
>yet since they are in the pending_list.

So that is basically what my patch is doing. However, in the case
of an underrun it is not safe to call dwc3_gadget_ep_cleanup_completed_requests,
as Jeff stated. So his underlying patch really is fixing an issue here.

>For UVC, perhaps you can introduce a new flag to usb_request called
>"ignore_queue_latency" or something equivalent. The dwc3 is already
>partially doing this for UVC. With this new flag, we can rework dwc3 to
>clearly separate the expected behavior from the function driver.

I don't know why this "extra" flag is even necessary. The code example
is already working without that extra flag.

Actually, I even came up with a better solution. In addition to checking whether
one of the requests in the started list was missed, we can actively check whether
the TRB ring has run dry and whether dwc3_gadget_endpoint_trbs_complete is
about to enqueue into the empty TRB ring.

So my whole change looks like this:

diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index efe6caf4d0e87..2c8047dcd1612 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -952,6 +952,7 @@ struct dwc3_request {
#define DWC3_REQUEST_STATUS_DEQUEUED 3
#define DWC3_REQUEST_STATUS_STALLED 4
#define DWC3_REQUEST_STATUS_COMPLETED 5
+#define DWC3_REQUEST_STATUS_MISSED_ISOC 6
#define DWC3_REQUEST_STATUS_UNKNOWN -1

u8 epnum;
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 858fe4c299b7a..a31f4d3502bd3 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2057,6 +2057,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
req = next_request(&dep->cancelled_list);
dwc3_gadget_ep_skip_trbs(dep, req);
switch (req->status) {
+ case 0:
+ dwc3_gadget_giveback(dep, req, 0);
+ break;
case DWC3_REQUEST_STATUS_DISCONNECTED:
dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
break;
@@ -2066,6 +2069,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
case DWC3_REQUEST_STATUS_STALLED:
dwc3_gadget_giveback(dep, req, -EPIPE);
break;
+ case DWC3_REQUEST_STATUS_MISSED_ISOC:
+ dwc3_gadget_giveback(dep, req, -EXDEV);
+ break;
default:
dev_err(dwc->dev, "request cancelled with wrong reason:%d\n", req->status);
dwc3_gadget_giveback(dep, req, -ECONNRESET);
@@ -3509,6 +3515,36 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
return ret;
}

+static int dwc3_gadget_ep_check_missed_requests(struct dwc3_ep *dep)
+{
+ struct dwc3_request *req;
+ struct dwc3_request *tmp;
+ int ret = 0;
+
+ list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
+ struct dwc3_trb *trb;
+
+ trb = req->trb;
+ switch (DWC3_TRB_SIZE_TRBSTS(trb->size)) {
+ case DWC3_TRBSTS_MISSED_ISOC:
+ /* Isoc endpoint only */
+ ret = -EXDEV;
+ break;
+ case DWC3_TRB_STS_XFER_IN_PROG:
+ /* Applicable when End Transfer with ForceRM=0 */
+ case DWC3_TRBSTS_SETUP_PENDING:
+ /* Control endpoint only */
+ case DWC3_TRBSTS_OK:
+ default:
+ ret = 0;
+ break;
+ }
+ }
+
+ return ret;
+}
+
static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
const struct dwc3_event_depevt *event, int status)
{
@@ -3565,22 +3601,51 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep,
{
struct dwc3 *dwc = dep->dwc;
bool no_started_trb = true;
+ unsigned int transfer_in_flight = 0;
+
+ /* It is possible that the interrupt thread was delayed by
+ * scheduling in the system, and therefor the HW has already
+ * run dry. In that case the last trb in the queue is already
+ * handled by the hw. By checking the HWO bit we know to restart
+ * the whole transfer. The condition to appear is more likelely
+ * if not every trb has the IOC bit set and therefor does not
+ * trigger the interrupt thread fewer.
+ */
+ if (dep->number && usb_endpoint_xfer_isoc(dep->endpoint.desc)) {
+ struct dwc3_trb *trb;

- dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
+ trb = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
+ transfer_in_flight = trb->ctrl & DWC3_TRB_CTRL_HWO;
+ }

- if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
- goto out;
+ if (status == -EXDEV || !transfer_in_flight) {
+ struct dwc3_request *tmp;
+ struct dwc3_request *req;

- if (!dep->endpoint.desc)
- return no_started_trb;
+ if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
+ dwc3_stop_active_transfer(dep, true, true);

- if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
- list_empty(&dep->started_list) &&
- (list_empty(&dep->pending_list) || status == -EXDEV))
- dwc3_stop_active_transfer(dep, true, true);
- else if (dwc3_gadget_ep_should_continue(dep))
- if (__dwc3_gadget_kick_transfer(dep) == 0)
- no_started_trb = false;
+ list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
+ dwc3_gadget_move_cancelled_request(req,
+ (DWC3_TRB_SIZE_TRBSTS(req->trb->size) == DWC3_TRBSTS_MISSED_ISOC) ?
+ DWC3_REQUEST_STATUS_MISSED_ISOC : 0);
+ }
+ } else {
+ dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
+
+ if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
+ goto out;
+
+ if (!dep->endpoint.desc)
+ return no_started_trb;
+
+ if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
+ list_empty(&dep->started_list) && list_empty(&dep->pending_list))
+ dwc3_stop_active_transfer(dep, true, true);
+ else if (dwc3_gadget_ep_should_continue(dep))
+ if (__dwc3_gadget_kick_transfer(dep) == 0)
+ no_started_trb = false;
+ }

out:
/*

I will separate the whole hunk into smaller changes and send a v1
for review in the next few days.
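
The run-dry check from the hunk above can also be tried outside the kernel. The following is a minimal sketch under stated assumptions: struct mock_ring, ring_prev_trb(), ring_transfer_in_flight() and ring_needs_restart() are hypothetical names, the ring is shrunk to 8 entries for readability, and the last slot plays the role of the link TRB that the enqueue index never points at.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TRB_NUM      8         /* toy ring size, last slot is the link TRB */
#define TRB_CTRL_HWO (1u << 0) /* hardware-owned bit of a TRB control word */

/* Hypothetical, simplified stand-ins; these are NOT the kernel structures. */
struct mock_trb {
	uint32_t size;
	uint32_t ctrl;
};

struct mock_ring {
	struct mock_trb pool[TRB_NUM];
	uint8_t enqueue; /* next free slot, always in 0..TRB_NUM-2 */
};

/* TRB prepared just before 'index', stepping back over the link TRB. */
static struct mock_trb *ring_prev_trb(struct mock_ring *ring, uint8_t index)
{
	assert(index < TRB_NUM - 1);

	if (index == 0)
		index = TRB_NUM - 1;

	return &ring->pool[index - 1];
}

/* True while the controller still owns the most recently prepared TRB. */
static bool ring_transfer_in_flight(struct mock_ring *ring)
{
	return (ring_prev_trb(ring, ring->enqueue)->ctrl & TRB_CTRL_HWO) != 0;
}

/* Restart when the event reported a missed isoc or the ring has run dry. */
static bool ring_needs_restart(struct mock_ring *ring, int status)
{
	return status == -EXDEV || !ring_transfer_in_flight(ring);
}
```

Note that ring_needs_restart() mirrors the condition used in the patch, which restarts on any -EXDEV status and not only on an empty ring.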

Regards,
Michael

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


2024-03-07 01:58:33

by Thinh Nguyen

Subject: Re: [PATCH v3 2/6] usb: dwc3: gadget: cancel requests instead of release after missed isoc

On Tue, Feb 27, 2024, Michael Grzeschik wrote:
> On Thu, Feb 22, 2024 at 01:20:04AM +0000, Thinh Nguyen wrote:
> > On Thu, Feb 22, 2024, Michael Grzeschik wrote:
> > > For #2: I found an issue in the handling of the completion of requests in
> > > the started list. When the interrupt handler is *explicitly* calling
> > > stop_active_transfer if the overall event of the request was an missed
> > > event. This event value only represents the value of the request that
> > > was actually triggering the interrupt.
> > >
> > > It also calls ep_cleanup_completed_requests and is iterating over the
> > > started requests and will call giveback/complete functions of the
> > > requests with the proper request status.
> > >
> > > So this will also catch missed requests in the queue. However, since
> > > there might be, lets say 5 good requests and one missed request, what
> > > will happen is, that each complete call for the first good requests will
> > > enqueue new requests into the started list and will also call the
> > > updatecmd on that transfer that was already missed until the loop will
> > > reach the one request with the MISSED status bit set.
> > >
> > > So in my opinion the patch from Jeff makes sense when adding the
> > > following change aswell. With those both changes the underruns and
> > > broken frames finally disappear. I am still unsure about the complete
> > > solution about that, since with this the mentioned 5 good requests
> > > will be cancelled aswell. So this is still a WIP status here.
> > >
> >
> > When the dwc3 driver issues stop_active_transfer(), that means that the
> > started_list is empty and there is an underrun.
>
> At this moment this is only the case when both, pending and started list
> are empty. Or the interrupt event was EXDEV.
>
> The main problem is that the function
> dwc3_gadget_ep_cleanup_completed_requests(dep, event, status); will
> issue an complete for each started request, which on the other hand will
> refill the pending list, and therefor after that refill the
> stop_active_transfer is currently never hit.
>
> > It treats the incoming requests as staled. However, for UVC, they are
> > still "good".
>
> Right, so in that case we can requeue them anyway. But this will have to
> be done after the stop transfer cmd has finished.
>
> > I think you can just check if the started_list is empty before queuing
> > new requests. If it is, perform stop_active_transfer() to reschedule the
> > incoming requests. None of the newly queue requests will be released
> > yet since they are in the pending_list.
>
> So that is basically exactly what my patch is doing. However in the case
> of an underrun it is not safe to call dwc3_gadget_ep_cleanup_completed_requests
> as jeff stated. So his underlying patch is really fixing an issue here.

What I mean is to actively check for started list on every
usb_ep_queue() call. Checking during
dwc3_gadget_ep_cleanup_completed_requests() is already too late.

>
> > For UVC, perhaps you can introduce a new flag to usb_request called
> > "ignore_queue_latency" or something equivalent. The dwc3 is already
> > partially doing this for UVC. With this new flag, we can rework dwc3 to
> > clearly separate the expected behavior from the function driver.
>
> I don't know why this "extra" flag is even necessary. The code example
> is already working without that extra flag.

The flag is for the controller driver to determine what kind of behavior the
function driver expects. My intention is that if this extra flag is not set,
the dwc3 driver will not attempt to reschedule isoc requests at all (i.e.
no stop_active_transfer()).
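
To make the intended split concrete, here is a minimal sketch under stated assumptions. The field ignore_queue_latency does not exist in struct usb_request today; it is only the name floated in this thread, and struct mock_request, enum isoc_action and isoc_on_queue() are hypothetical stand-ins for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request with the opt-in flag proposed in this thread. */
struct mock_request {
	bool ignore_queue_latency;
};

enum isoc_action {
	ISOC_CONTINUE,   /* keep feeding the running transfer */
	ISOC_RESCHEDULE, /* stop the active transfer and restart it */
};

/*
 * Decision taken on every queue call for an isoc endpoint. Without the flag
 * the controller driver never reschedules. With the flag it reschedules only
 * on an underrun, i.e. a started transfer whose started list is empty.
 */
static enum isoc_action isoc_on_queue(const struct mock_request *req,
				      bool transfer_started,
				      bool started_list_empty)
{
	assert(req);

	if (!req->ignore_queue_latency)
		return ISOC_CONTINUE;

	if (transfer_started && started_list_empty)
		return ISOC_RESCHEDULE;

	return ISOC_CONTINUE;
}
```

A function driver such as UVC would set the flag on its requests, while drivers that care about the exact service interval would leave it clear and keep the current behavior.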

>
> Actually I even came up with an better solution. Additionally of checking if
> one of the requests in the started list was missed, we can activly check if
> the trb ring did run dry and if dwc3_gadget_endpoint_trbs_complete is
> going to enqueue in to the empty trb ring.
>
> So my whole change looks like that:
>
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index efe6caf4d0e87..2c8047dcd1612 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -952,6 +952,7 @@ struct dwc3_request {
> #define DWC3_REQUEST_STATUS_DEQUEUED 3
> #define DWC3_REQUEST_STATUS_STALLED 4
> #define DWC3_REQUEST_STATUS_COMPLETED 5
> +#define DWC3_REQUEST_STATUS_MISSED_ISOC 6
> #define DWC3_REQUEST_STATUS_UNKNOWN -1
> u8 epnum;
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 858fe4c299b7a..a31f4d3502bd3 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -2057,6 +2057,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> req = next_request(&dep->cancelled_list);
> dwc3_gadget_ep_skip_trbs(dep, req);
> switch (req->status) {
> + case 0:
> + dwc3_gadget_giveback(dep, req, 0);
> + break;
> case DWC3_REQUEST_STATUS_DISCONNECTED:
> dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
> break;
> @@ -2066,6 +2069,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> case DWC3_REQUEST_STATUS_STALLED:
> dwc3_gadget_giveback(dep, req, -EPIPE);
> break;
> + case DWC3_REQUEST_STATUS_MISSED_ISOC:
> + dwc3_gadget_giveback(dep, req, -EXDEV);
> + break;
> default:
> dev_err(dwc->dev, "request cancelled with wrong reason:%d\n", req->status);
> dwc3_gadget_giveback(dep, req, -ECONNRESET);
> @@ -3509,6 +3515,36 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
> return ret;
> }
> +static int dwc3_gadget_ep_check_missed_requests(struct dwc3_ep *dep)
> +{
> + struct dwc3_request *req;
> + struct dwc3_request *tmp;
> + int ret = 0;
> +
> + list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> + struct dwc3_trb *trb;
> +
> + trb = req->trb;
> + switch (DWC3_TRB_SIZE_TRBSTS(trb->size)) {
> + case DWC3_TRBSTS_MISSED_ISOC:
> + /* Isoc endpoint only */
> + ret = -EXDEV;
> + break;
> + case DWC3_TRB_STS_XFER_IN_PROG:
> + /* Applicable when End Transfer with ForceRM=0 */
> + case DWC3_TRBSTS_SETUP_PENDING:
> + /* Control endpoint only */
> + case DWC3_TRBSTS_OK:
> + default:
> + ret = 0;
> + break;
> + }
> + }
> +
> + return ret;
> +}
> +
> static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
> const struct dwc3_event_depevt *event, int status)
> {
> @@ -3565,22 +3601,51 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep,
> {
> struct dwc3 *dwc = dep->dwc;
> bool no_started_trb = true;
> + unsigned int transfer_in_flight = 0;
> +
> + /* It is possible that the interrupt thread was delayed by
> + * scheduling in the system, and therefor the HW has already
> + * run dry. In that case the last trb in the queue is already
> + * handled by the hw. By checking the HWO bit we know to restart
> + * the whole transfer. The condition to appear is more likelely
> + * if not every trb has the IOC bit set and therefor does not
> + * trigger the interrupt thread fewer.
> + */
> + if (dep->number && usb_endpoint_xfer_isoc(dep->endpoint.desc)) {
> + struct dwc3_trb *trb;
> - dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
> + trb = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
> + transfer_in_flight = trb->ctrl & DWC3_TRB_CTRL_HWO;
> + }
> - if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
> - goto out;
> + if (status == -EXDEV || !transfer_in_flight) {
> + struct dwc3_request *tmp;
> + struct dwc3_request *req;
> - if (!dep->endpoint.desc)
> - return no_started_trb;
> + if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
> + dwc3_stop_active_transfer(dep, true, true);
> - if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> - list_empty(&dep->started_list) &&
> - (list_empty(&dep->pending_list) || status == -EXDEV))
> - dwc3_stop_active_transfer(dep, true, true);
> - else if (dwc3_gadget_ep_should_continue(dep))
> - if (__dwc3_gadget_kick_transfer(dep) == 0)
> - no_started_trb = false;
> + list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> + dwc3_gadget_move_cancelled_request(req,
> + (DWC3_TRB_SIZE_TRBSTS(req->trb->size) == DWC3_TRBSTS_MISSED_ISOC) ?
> + DWC3_REQUEST_STATUS_MISSED_ISOC : 0);
> + }
> + } else {
> + dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
> +
> + if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
> + goto out;
> +
> + if (!dep->endpoint.desc)
> + return no_started_trb;
> +
> + if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> + list_empty(&dep->started_list) && list_empty(&dep->pending_list))
> + dwc3_stop_active_transfer(dep, true, true);
> + else if (dwc3_gadget_ep_should_continue(dep))
> + if (__dwc3_gadget_kick_transfer(dep) == 0)
> + no_started_trb = false;
> + }
> out:
> /*
>
> I will seperate the whole hunk into smaller changes and send an v1
> the next days to review.
>

No, we should not reschedule for every missed isoc. We only want to
target the underrun condition.

Thanks,
Thinh

2024-03-07 16:15:52

by Michael Grzeschik

Subject: Re: [PATCH v3 2/6] usb: dwc3: gadget: cancel requests instead of release after missed isoc

On Thu, Mar 07, 2024 at 01:57:44AM +0000, Thinh Nguyen wrote:
>On Tue, Feb 27, 2024, Michael Grzeschik wrote:
>> On Thu, Feb 22, 2024 at 01:20:04AM +0000, Thinh Nguyen wrote:
>> > On Thu, Feb 22, 2024, Michael Grzeschik wrote:
>> > > For #2: I found an issue in the handling of the completion of requests in
>> > > the started list. When the interrupt handler is *explicitly* calling
>> > > stop_active_transfer if the overall event of the request was an missed
>> > > event. This event value only represents the value of the request that
>> > > was actually triggering the interrupt.
>> > >
>> > > It also calls ep_cleanup_completed_requests and is iterating over the
>> > > started requests and will call giveback/complete functions of the
>> > > requests with the proper request status.
>> > >
>> > > So this will also catch missed requests in the queue. However, since
>> > > there might be, lets say 5 good requests and one missed request, what
>> > > will happen is, that each complete call for the first good requests will
>> > > enqueue new requests into the started list and will also call the
>> > > updatecmd on that transfer that was already missed until the loop will
>> > > reach the one request with the MISSED status bit set.
>> > >
>> > > So in my opinion the patch from Jeff makes sense when adding the
>> > > following change aswell. With those both changes the underruns and
>> > > broken frames finally disappear. I am still unsure about the complete
>> > > solution about that, since with this the mentioned 5 good requests
>> > > will be cancelled aswell. So this is still a WIP status here.
>> > >
>> >
>> > When the dwc3 driver issues stop_active_transfer(), that means that the
>> > started_list is empty and there is an underrun.
>>
>> At this moment this is only the case when both, pending and started list
>> are empty. Or the interrupt event was EXDEV.
>>
>> The main problem is that the function
>> dwc3_gadget_ep_cleanup_completed_requests(dep, event, status); will
>> issue an complete for each started request, which on the other hand will
>> refill the pending list, and therefor after that refill the
>> stop_active_transfer is currently never hit.
>>
>> > It treats the incoming requests as staled. However, for UVC, they are
>> > still "good".
>>
>> Right, so in that case we can requeue them anyway. But this will have to
>> be done after the stop transfer cmd has finished.
>>
>> > I think you can just check if the started_list is empty before queuing
>> > new requests. If it is, perform stop_active_transfer() to reschedule the
>> > incoming requests. None of the newly queue requests will be released
>> > yet since they are in the pending_list.
>>
>> So that is basically exactly what my patch is doing. However in the case
>> of an underrun it is not safe to call dwc3_gadget_ep_cleanup_completed_requests
>> as jeff stated. So his underlying patch is really fixing an issue here.
>
>What I mean is to actively check for started list on every
>usb_ep_queue() call. Checking during
>dwc3_gadget_ep_cleanup_completed_requests() is already too late.

I see.

>>
>> > For UVC, perhaps you can introduce a new flag to usb_request called
>> > "ignore_queue_latency" or something equivalent. The dwc3 is already
>> > partially doing this for UVC. With this new flag, we can rework dwc3 to
>> > clearly separate the expected behavior from the function driver.
>>
>> I don't know why this "extra" flag is even necessary. The code example
>> is already working without that extra flag.
>
>The flag is for controller to determine what kinds of behavior the
>function driver expects. My intention is if this extra flag is not set,
>the dwc3 driver will not attempt to reshcedule isoc request at all (ie.
>no stop_active_transfer()).

Ok.

>>
>> Actually I even came up with an better solution. Additionally of checking if
>> one of the requests in the started list was missed, we can activly check if
>> the trb ring did run dry and if dwc3_gadget_endpoint_trbs_complete is
>> going to enqueue in to the empty trb ring.
>>
>> So my whole change looks like that:
>>
>> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
>> index efe6caf4d0e87..2c8047dcd1612 100644
>> --- a/drivers/usb/dwc3/core.h
>> +++ b/drivers/usb/dwc3/core.h
>> @@ -952,6 +952,7 @@ struct dwc3_request {
>> #define DWC3_REQUEST_STATUS_DEQUEUED 3
>> #define DWC3_REQUEST_STATUS_STALLED 4
>> #define DWC3_REQUEST_STATUS_COMPLETED 5
>> +#define DWC3_REQUEST_STATUS_MISSED_ISOC 6
>> #define DWC3_REQUEST_STATUS_UNKNOWN -1
>> u8 epnum;
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 858fe4c299b7a..a31f4d3502bd3 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -2057,6 +2057,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
>> req = next_request(&dep->cancelled_list);
>> dwc3_gadget_ep_skip_trbs(dep, req);
>> switch (req->status) {
>> + case 0:
>> + dwc3_gadget_giveback(dep, req, 0);
>> + break;
>> case DWC3_REQUEST_STATUS_DISCONNECTED:
>> dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
>> break;
>> @@ -2066,6 +2069,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
>> case DWC3_REQUEST_STATUS_STALLED:
>> dwc3_gadget_giveback(dep, req, -EPIPE);
>> break;
>> + case DWC3_REQUEST_STATUS_MISSED_ISOC:
>> + dwc3_gadget_giveback(dep, req, -EXDEV);
>> + break;
>> default:
>> dev_err(dwc->dev, "request cancelled with wrong reason:%d\n", req->status);
>> dwc3_gadget_giveback(dep, req, -ECONNRESET);
>> @@ -3509,6 +3515,36 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
>> return ret;
>> }
>> +static int dwc3_gadget_ep_check_missed_requests(struct dwc3_ep *dep)
>> +{
>> + struct dwc3_request *req;
>> + struct dwc3_request *tmp;
>> + int ret = 0;
>> +
>> + list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
>> + struct dwc3_trb *trb;
>> +
>> + trb = req->trb;
>> + switch (DWC3_TRB_SIZE_TRBSTS(trb->size)) {
>> + case DWC3_TRBSTS_MISSED_ISOC:
>> + /* Isoc endpoint only */
>> + ret = -EXDEV;
>> + break;
>> + case DWC3_TRB_STS_XFER_IN_PROG:
>> + /* Applicable when End Transfer with ForceRM=0 */
>> + case DWC3_TRBSTS_SETUP_PENDING:
>> + /* Control endpoint only */
>> + case DWC3_TRBSTS_OK:
>> + default:
>> + ret = 0;
>> + break;
>> + }
>> + }
>> +
>> + return ret;
>> +}
>> +
>> static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
>> const struct dwc3_event_depevt *event, int status)
>> {
>> @@ -3565,22 +3601,51 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep,
>> {
>> struct dwc3 *dwc = dep->dwc;
>> bool no_started_trb = true;
>> + unsigned int transfer_in_flight = 0;
>> +
>> + /* It is possible that the interrupt thread was delayed by
>> + * scheduling in the system, and therefor the HW has already
>> + * run dry. In that case the last trb in the queue is already
>> + * handled by the hw. By checking the HWO bit we know to restart
>> + * the whole transfer. The condition to appear is more likelely
>> + * if not every trb has the IOC bit set and therefor does not
>> + * trigger the interrupt thread fewer.
>> + */
>> + if (dep->number && usb_endpoint_xfer_isoc(dep->endpoint.desc)) {
>> + struct dwc3_trb *trb;
>> - dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
>> + trb = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
>> + transfer_in_flight = trb->ctrl & DWC3_TRB_CTRL_HWO;
>> + }
>> - if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>> - goto out;
>> + if (status == -EXDEV || !transfer_in_flight) {
>> + struct dwc3_request *tmp;
>> + struct dwc3_request *req;
>> - if (!dep->endpoint.desc)
>> - return no_started_trb;
>> + if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
>> + dwc3_stop_active_transfer(dep, true, true);
>> - if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
>> - list_empty(&dep->started_list) &&
>> - (list_empty(&dep->pending_list) || status == -EXDEV))

@[!!here!!]

>> - dwc3_stop_active_transfer(dep, true, true);
>> - else if (dwc3_gadget_ep_should_continue(dep))
>> - if (__dwc3_gadget_kick_transfer(dep) == 0)
>> - no_started_trb = false;
>> + list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
>> + dwc3_gadget_move_cancelled_request(req,
>> + (DWC3_TRB_SIZE_TRBSTS(req->trb->size) == DWC3_TRBSTS_MISSED_ISOC) ?
>> + DWC3_REQUEST_STATUS_MISSED_ISOC : 0);
>> + }
>> + } else {
>> + dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
>> +
>> + if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>> + goto out;
>> +
>> + if (!dep->endpoint.desc)
>> + return no_started_trb;
>> +
>> + if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
>> + list_empty(&dep->started_list) && list_empty(&dep->pending_list))
>> + dwc3_stop_active_transfer(dep, true, true);
>> + else if (dwc3_gadget_ep_should_continue(dep))
>> + if (__dwc3_gadget_kick_transfer(dep) == 0)
>> + no_started_trb = false;
>> + }
>> out:
>> /*
>>
>> I will seperate the whole hunk into smaller changes and send an v1
>> the next days to review.
>>

I finally sent a v1 of my series.

https://lore.kernel.org/linux-usb/20240307-dwc3-gadget-complete-irq-v1-0-4fe9ac0ba2b7@pengutronix.de/

For the rest of the discussion, I would like to move the conversation to
the newly sent series.

>No, we should not reschedule for every missed-isoc. We only want to
>target underrun condition.

As you stated above, by "reschedule" do you mean calling
stop_transfer after a missed transfer was seen?

If so, why is this condition already in there? (@[!!here!!])

Michael

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


2024-03-08 02:57:25

by Thinh Nguyen

Subject: Re: [PATCH v3 2/6] usb: dwc3: gadget: cancel requests instead of release after missed isoc

On Thu, Mar 07, 2024, Michael Grzeschik wrote:
> On Thu, Mar 07, 2024 at 01:57:44AM +0000, Thinh Nguyen wrote:
> > On Tue, Feb 27, 2024, Michael Grzeschik wrote:
> > > On Thu, Feb 22, 2024 at 01:20:04AM +0000, Thinh Nguyen wrote:
> > > > On Thu, Feb 22, 2024, Michael Grzeschik wrote:
> > > > > For #2: I found an issue in the handling of the completion of requests in
> > > > > the started list. When the interrupt handler is *explicitly* calling
> > > > > stop_active_transfer if the overall event of the request was an missed
> > > > > event. This event value only represents the value of the request that
> > > > > was actually triggering the interrupt.
> > > > >
> > > > > It also calls ep_cleanup_completed_requests and is iterating over the
> > > > > started requests and will call giveback/complete functions of the
> > > > > requests with the proper request status.
> > > > >
> > > > > So this will also catch missed requests in the queue. However, since
> > > > > there might be, lets say 5 good requests and one missed request, what
> > > > > will happen is, that each complete call for the first good requests will
> > > > > enqueue new requests into the started list and will also call the
> > > > > updatecmd on that transfer that was already missed until the loop will
> > > > > reach the one request with the MISSED status bit set.
> > > > >
> > > > > So in my opinion the patch from Jeff makes sense when adding the
> > > > > following change aswell. With those both changes the underruns and
> > > > > broken frames finally disappear. I am still unsure about the complete
> > > > > solution about that, since with this the mentioned 5 good requests
> > > > > will be cancelled aswell. So this is still a WIP status here.
> > > > >
> > > >
> > > > When the dwc3 driver issues stop_active_transfer(), that means that the
> > > > started_list is empty and there is an underrun.
> > >
> > > At this moment this is only the case when both, pending and started list
> > > are empty. Or the interrupt event was EXDEV.
> > >
> > > The main problem is that the function
> > > dwc3_gadget_ep_cleanup_completed_requests(dep, event, status); will
> > > issue an complete for each started request, which on the other hand will
> > > refill the pending list, and therefor after that refill the
> > > stop_active_transfer is currently never hit.
> > >
> > > > It treats the incoming requests as staled. However, for UVC, they are
> > > > still "good".
> > >
> > > Right, so in that case we can requeue them anyway. But this will have to
> > > be done after the stop transfer cmd has finished.
> > >
> > > > I think you can just check if the started_list is empty before queuing
> > > > new requests. If it is, perform stop_active_transfer() to reschedule the
> > > > incoming requests. None of the newly queue requests will be released
> > > > yet since they are in the pending_list.
> > >
> > > So that is basically exactly what my patch is doing. However in the case
> > > of an underrun it is not safe to call dwc3_gadget_ep_cleanup_completed_requests
> > > as jeff stated. So his underlying patch is really fixing an issue here.
> >
> > What I mean is to actively check for started list on every
> > usb_ep_queue() call. Checking during
> > dwc3_gadget_ep_cleanup_completed_requests() is already too late.
>
> I see.
>
> > >
> > > > For UVC, perhaps you can introduce a new flag to usb_request called
> > > > "ignore_queue_latency" or something equivalent. The dwc3 is already
> > > > partially doing this for UVC. With this new flag, we can rework dwc3 to
> > > > clearly separate the expected behavior from the function driver.
> > >
> > > I don't know why this "extra" flag is even necessary. The code example
> > > is already working without that extra flag.
> >
> > The flag is for controller to determine what kinds of behavior the
> > function driver expects. My intention is if this extra flag is not set,
> > the dwc3 driver will not attempt to reshcedule isoc request at all (ie.
> > no stop_active_transfer()).
>
> Ok.
>
> > >
> > > Actually I even came up with an better solution. Additionally of checking if
> > > one of the requests in the started list was missed, we can activly check if
> > > the trb ring did run dry and if dwc3_gadget_endpoint_trbs_complete is
> > > going to enqueue in to the empty trb ring.
> > >
> > > So my whole change looks like that:
> > >
> > > diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> > > index efe6caf4d0e87..2c8047dcd1612 100644
> > > --- a/drivers/usb/dwc3/core.h
> > > +++ b/drivers/usb/dwc3/core.h
> > > @@ -952,6 +952,7 @@ struct dwc3_request {
> > > #define DWC3_REQUEST_STATUS_DEQUEUED 3
> > > #define DWC3_REQUEST_STATUS_STALLED 4
> > > #define DWC3_REQUEST_STATUS_COMPLETED 5
> > > +#define DWC3_REQUEST_STATUS_MISSED_ISOC 6
> > > #define DWC3_REQUEST_STATUS_UNKNOWN -1
> > > u8 epnum;
> > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > index 858fe4c299b7a..a31f4d3502bd3 100644
> > > --- a/drivers/usb/dwc3/gadget.c
> > > +++ b/drivers/usb/dwc3/gadget.c
> > > @@ -2057,6 +2057,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> > > req = next_request(&dep->cancelled_list);
> > > dwc3_gadget_ep_skip_trbs(dep, req);
> > > switch (req->status) {
> > > + case 0:
> > > + dwc3_gadget_giveback(dep, req, 0);
> > > + break;
> > > case DWC3_REQUEST_STATUS_DISCONNECTED:
> > > dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
> > > break;
> > > @@ -2066,6 +2069,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
> > > case DWC3_REQUEST_STATUS_STALLED:
> > > dwc3_gadget_giveback(dep, req, -EPIPE);
> > > break;
> > > + case DWC3_REQUEST_STATUS_MISSED_ISOC:
> > > + dwc3_gadget_giveback(dep, req, -EXDEV);
> > > + break;
> > > default:
> > > dev_err(dwc->dev, "request cancelled with wrong reason:%d\n", req->status);
> > > dwc3_gadget_giveback(dep, req, -ECONNRESET);
> > > @@ -3509,6 +3515,36 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
> > > return ret;
> > > }
> > > +static int dwc3_gadget_ep_check_missed_requests(struct dwc3_ep *dep)
> > > +{
> > > + struct dwc3_request *req;
> > > + struct dwc3_request *tmp;
> > > + int ret = 0;
> > > +
> > > + list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> > > + struct dwc3_trb *trb;
> > > +
> > > + trb = req->trb;
> > > + switch (DWC3_TRB_SIZE_TRBSTS(trb->size)) {
> > > + case DWC3_TRBSTS_MISSED_ISOC:
> > > + /* Isoc endpoint only */
> > > + ret = -EXDEV;
> > > + break;
> > > + case DWC3_TRB_STS_XFER_IN_PROG:
> > > + /* Applicable when End Transfer with ForceRM=0 */
> > > + case DWC3_TRBSTS_SETUP_PENDING:
> > > + /* Control endpoint only */
> > > + case DWC3_TRBSTS_OK:
> > > + default:
> > > + ret = 0;
> > > + break;
> > > + }
> > > + }
> > > +
> > > + return ret;
> > > +}
> > > +
> > > static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
> > > const struct dwc3_event_depevt *event, int status)
> > > {
> > > @@ -3565,22 +3601,51 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep,
> > > {
> > > struct dwc3 *dwc = dep->dwc;
> > > bool no_started_trb = true;
> > > + unsigned int transfer_in_flight = 0;
> > > +
> > > > + /* It is possible that the interrupt thread was delayed by
> > > > + * scheduling in the system, and therefore the HW has already
> > > > + * run dry. In that case the last trb in the queue is already
> > > > + * handled by the hw. By checking the HWO bit we know to restart
> > > > + * the whole transfer. This condition is more likely to appear
> > > > + * if not every trb has the IOC bit set, since the interrupt
> > > > + * thread is then triggered less often.
> > > > + */
> > > + if (dep->number && usb_endpoint_xfer_isoc(dep->endpoint.desc)) {
> > > + struct dwc3_trb *trb;
> > > - dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
> > > + trb = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
> > > + transfer_in_flight = trb->ctrl & DWC3_TRB_CTRL_HWO;
> > > + }
> > > - if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
> > > - goto out;
> > > + if (status == -EXDEV || !transfer_in_flight) {
> > > + struct dwc3_request *tmp;
> > > + struct dwc3_request *req;
> > > - if (!dep->endpoint.desc)
> > > - return no_started_trb;
> > > + if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
> > > + dwc3_stop_active_transfer(dep, true, true);
> > > - if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> > > - list_empty(&dep->started_list) &&
> > > - (list_empty(&dep->pending_list) || status == -EXDEV))
>
> @[!!here!!]
>
> > > - dwc3_stop_active_transfer(dep, true, true);
> > > - else if (dwc3_gadget_ep_should_continue(dep))
> > > - if (__dwc3_gadget_kick_transfer(dep) == 0)
> > > - no_started_trb = false;
> > > + list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
> > > + dwc3_gadget_move_cancelled_request(req,
> > > + (DWC3_TRB_SIZE_TRBSTS(req->trb->size) == DWC3_TRBSTS_MISSED_ISOC) ?
> > > + DWC3_REQUEST_STATUS_MISSED_ISOC : 0);
> > > + }
> > > + } else {
> > > + dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
> > > +
> > > + if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
> > > + goto out;
> > > +
> > > + if (!dep->endpoint.desc)
> > > + return no_started_trb;
> > > +
> > > + if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
> > > + list_empty(&dep->started_list) && list_empty(&dep->pending_list))
> > > + dwc3_stop_active_transfer(dep, true, true);
> > > + else if (dwc3_gadget_ep_should_continue(dep))
> > > + if (__dwc3_gadget_kick_transfer(dep) == 0)
> > > + no_started_trb = false;
> > > + }
> > > out:
> > > /*
> > >
> > > > I will separate the whole hunk into smaller changes and send a v1
> > > > in the next few days for review.
> > >
>
> I finally sent a v1 of my series.
>
> https://lore.kernel.org/linux-usb/20240307-dwc3-gadget-complete-irq-v1-0-4fe9ac0ba2b7@pengutronix.de/
>
> For the rest of the discussion, I would like to move the conversation to
> the newly sent series.

I saw your pushes. Thanks. I'll review and move the discussion there.

>
> > No, we should not reschedule for every missed-isoc. We only want to
> > target underrun condition.
>
> As you stated above, by reschedule you mean calling stop_transfer
> after a missed transfer was seen?
>
> If so, why is this condition in there already? (@[!!here!!])
>

It's only to reschedule if started_list is empty _and_ if there's either
no pending request or there's a missed isoc. Not for every missed isoc.
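
To make the condition above concrete, here is a minimal standalone sketch of
the predicate. The struct and function names are hypothetical stand-ins for
illustration only, not the actual dwc3 types; the real check operates on
dep->started_list and dep->pending_list directly.

```c
#include <stdbool.h>
#include <errno.h>

/* Hypothetical, simplified view of the endpoint state (not struct dwc3_ep). */
struct ep_state {
	bool is_isoc;            /* endpoint is isochronous */
	bool started_list_empty; /* no request currently started on the HW */
	bool pending_list_empty; /* no request waiting to be queued */
};

/*
 * Returns true when the transfer should be stopped and rescheduled:
 * the started list is empty AND (nothing is pending OR the completion
 * status reports a missed isoc). A missed isoc alone is not enough.
 */
static bool should_stop_active_transfer(const struct ep_state *ep, int status)
{
	return ep->is_isoc &&
	       ep->started_list_empty &&
	       (ep->pending_list_empty || status == -EXDEV);
}
```

With this shape, a missed isoc (-EXDEV) while requests are still in the
started list does not trigger a reschedule; only the underrun case does.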

BR,
Thinh