2021-08-17 06:47:10

by Jeaho Hwang

Subject: [PATCH v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

If control EP priming fails (a very rare case in standard Linux),
hw_ep_set_halt() loops forever. Retrying up to 100 times was enough
on zynq7000.

Signed-off-by: Jeaho Hwang <[email protected]>

diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
index 8834ca613721..d73fadb18f32 100644
--- a/drivers/usb/chipidea/udc.c
+++ b/drivers/usb/chipidea/udc.c
@@ -209,6 +209,9 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
 	return 0;
 }

+/* enough for zynq7000 evaluation board */
+#define HW_EP_SET_HALT_COUNT_MAX 100
+
 /**
  * hw_ep_set_halt: configures ep halt & resets data toggle after clear (execute
  * without interruption)
@@ -221,6 +224,7 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
  */
 static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
 {
+	int count = HW_EP_SET_HALT_COUNT_MAX;
 	if (value != 0 && value != 1)
 		return -EINVAL;

@@ -232,9 +236,9 @@ static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
 		/* data toggle - reserved for EP0 but it's in ESS */
 		hw_write(ci, reg, mask_xs|mask_xr,
 			  value ? mask_xs : mask_xr);
-	} while (value != hw_ep_get_halt(ci, num, dir));
+	} while (value != hw_ep_get_halt(ci, num, dir) && --count > 0);

-	return 0;
+	return count ? 0 : -EAGAIN;
 }

/**
--
2.25.1
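
For readers who want to try the bounded loop outside the kernel, here is
a minimal userspace sketch of the same pattern. It is only a model under
stated assumptions: struct fake_ep, fake_write_stall(), fake_set_halt()
and SET_HALT_RETRY_MAX are hypothetical names that stand in for the
register access and are not part of the driver.

```c
#include <errno.h>
#include <stdbool.h>

#define SET_HALT_RETRY_MAX 100	/* hypothetical bound, mirrors the patch */

/* Hypothetical stand-in for one endpoint control register. */
struct fake_ep {
	bool stall;		/* models the stall bit */
	bool hw_clears_stall;	/* true: hardware drops every attempt to set it */
};

/* Models the register write; a set is lost when the hardware keeps clearing. */
static void fake_write_stall(struct fake_ep *ep, bool value)
{
	ep->stall = ep->hw_clears_stall ? false : value;
}

/* Same loop shape as the patch: retry until the bit sticks or the budget ends. */
static int fake_set_halt(struct fake_ep *ep, int value)
{
	int count = SET_HALT_RETRY_MAX;

	if (value != 0 && value != 1)
		return -EINVAL;

	do {
		fake_write_stall(ep, value);
	} while (value != ep->stall && --count > 0);

	/* count is only zero when every retry failed */
	return count ? 0 : -EAGAIN;
}
```

With hw_clears_stall set, a request to halt gives up after
SET_HALT_RETRY_MAX writes instead of spinning forever, while a request
to clear the halt still succeeds on the first write.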


2021-08-24 08:33:08

by Jeaho Hwang

Subject: Re: [PATCH v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <[email protected]> wrote:
>
> If ctrl EP priming is failed (very rare case in standard linux),
> hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> for zynq7000.
>

Hi Peter,
I found in the zynq7000 TRM that the hardware clears the Stall bit when
a setup packet is received on a control endpoint.
I think hw_ep_set_halt loops forever because:

1. hw_ep_prime for the control EP, which is called from
isr_tr_complete_handler -> isr_setup_status_phase, fails because a
setup packet has been received.
2. isr_tr_complete_handler calls _ep_set_halt if either
isr_tr_complete_low or isr_setup_status_phase returns an error.
3. Since the control EP got a setup packet, the HW clears the TXS bit
as soon as the driver sets it inside hw_ep_set_halt, so the loop never
terminates.

Does that make sense? If it is right, we shouldn't call _ep_set_halt
when err is -EAGAIN, which can be returned ONLY because of the setup
packet issue described above.
In that case the loop timeout is no longer required.

May I ask for your opinion on this, Peter and USB experts?
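
To make the proposal concrete, here is a rough sketch of the decision
the completion handler would have to make. should_stall_ep0() is a
hypothetical helper written only for illustration; it is not an
existing driver function, and the real handler may need to treat other
return values differently.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the driver: decide whether an error
 * from the completion path should stall the control endpoint.
 */
static bool should_stall_ep0(int err)
{
	if (err >= 0)
		return false;	/* no error, nothing to stall */
	if (err == -EAGAIN)
		return false;	/* setup packet pending: HW would clear the stall */
	return true;		/* any other error still stalls the endpoint */
}
```

The point of the sketch is only that -EAGAIN is filtered out before the
halt is attempted, so the retry loop is never entered in the case where
the stall bit cannot stick.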

Thanks.

> Signed-off-by: Jeaho Hwang <[email protected]>
>
> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> index 8834ca613721..d73fadb18f32 100644
> --- a/drivers/usb/chipidea/udc.c
> +++ b/drivers/usb/chipidea/udc.c
> @@ -209,6 +209,9 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> return 0;
> }
>
> +/* enough for zynq7000 evaluation board */
> +#define HW_EP_SET_HALT_COUNT_MAX 100
> +
> /**
> * hw_ep_set_halt: configures ep halt & resets data toggle after clear (execute
> * without interruption)
> @@ -221,6 +224,7 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> */
> static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
> {
> + int count = HW_EP_SET_HALT_COUNT_MAX;
> if (value != 0 && value != 1)
> return -EINVAL;
>
> @@ -232,9 +236,9 @@ static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
> /* data toggle - reserved for EP0 but it's in ESS */
> hw_write(ci, reg, mask_xs|mask_xr,
> value ? mask_xs : mask_xr);
> - } while (value != hw_ep_get_halt(ci, num, dir));
> + } while (value != hw_ep_get_halt(ci, num, dir) && --count > 0);
>
> - return 0;
> + return count ? 0 : -EAGAIN;
> }
>
> /**
> --
> 2.25.1
>

2021-08-26 11:18:58

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

On Tue, Aug 17, 2021 at 03:43:53PM +0900, Jeaho Hwang wrote:
> If ctrl EP priming is failed (very rare case in standard linux),
> hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> for zynq7000.
>
> Signed-off-by: Jeaho Hwang <[email protected]>
>
> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> index 8834ca613721..d73fadb18f32 100644
> --- a/drivers/usb/chipidea/udc.c
> +++ b/drivers/usb/chipidea/udc.c
> @@ -209,6 +209,9 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> return 0;
> }
>
> +/* enough for zynq7000 evaluation board */
> +#define HW_EP_SET_HALT_COUNT_MAX 100
> +
> /**
> * hw_ep_set_halt: configures ep halt & resets data toggle after clear (execute
> * without interruption)
> @@ -221,6 +224,7 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> */
> static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
> {
> + int count = HW_EP_SET_HALT_COUNT_MAX;
> if (value != 0 && value != 1)

You need a blank line after "int count..."


> return -EINVAL;
>
> @@ -232,9 +236,9 @@ static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
> /* data toggle - reserved for EP0 but it's in ESS */
> hw_write(ci, reg, mask_xs|mask_xr,
> value ? mask_xs : mask_xr);
> - } while (value != hw_ep_get_halt(ci, num, dir));
> + } while (value != hw_ep_get_halt(ci, num, dir) && --count > 0);
>
> - return 0;
> + return count ? 0 : -EAGAIN;

Please spell this out:
	if (count)
		return 0;
	return -EAGAIN;

And will the caller properly handle this?
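
As a rough illustration of what handling the new return value could
look like, here is a small userspace sketch. All names in it
(set_halt_ok, set_halt_stuck, halt_endpoint) are hypothetical and do
not correspond to functions in the driver; it only shows a caller that
reports and propagates the error instead of assuming the endpoint is
halted.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the low-level helper: one succeeds, one times out. */
static int set_halt_ok(int value)
{
	(void)value;
	return 0;
}

static int set_halt_stuck(int value)
{
	(void)value;
	return -EAGAIN;
}

/* Hypothetical caller: log the failure and hand the error code upwards. */
static int halt_endpoint(int (*set_halt)(int value))
{
	int ret = set_halt(1);

	if (ret < 0)
		fprintf(stderr, "set_halt failed: %d\n", ret);
	return ret;
}
```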

thanks,

greg k-h

2021-08-26 23:10:11

by Peter Chen

Subject: Re: [PATCH v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

On 21-08-24 17:31:44, Jeaho Hwang wrote:
> On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <[email protected]> wrote:
> >
> > If ctrl EP priming is failed (very rare case in standard linux),
> > hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> > for zynq7000.
> >
>
> Hi Peter.
> I found from zynq7000 TRM that the hardware clears Stall bit if a
> setup packet is received on a control endpoint.
> I think hw_ep_set_halt goes infinite loop since:
>
> 1. hw_ep_prime for control EP which is called from
> isr_tr_complete_handler -> isr_setup_status_phase is failed due to a
> setup packet received.

How do you know that? Did you observe a new setup packet on the bus
before the current status stage? Usually, the host doesn't begin a new
setup transfer before the current one has finished.

Peter

> 2. in isr_tr_complete_handler it tries to call _ep_set_halt if either
> isr_tr_complete_low or isr_setup_status_phase returns error.
> 3. Since the control EP got a setup packet, HW resets TXS bit just as
> the driver sets inside hw_ep_set_halt so it goes infinite loop.
>
> Does it make sense? If it is right, we shouldn't call _ep_set_halt if
> the err is -EAGAIN, which could be returned ONLY due to the setup
> packet issue described above.
> And the loop timeout is not required anymore.
>
> Can I ask your opinion on this, Peter and USB experts?
>
> Thanks.
>
> > Signed-off-by: Jeaho Hwang <[email protected]>
> >
> > diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> > index 8834ca613721..d73fadb18f32 100644
> > --- a/drivers/usb/chipidea/udc.c
> > +++ b/drivers/usb/chipidea/udc.c
> > @@ -209,6 +209,9 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> > return 0;
> > }
> >
> > +/* enough for zynq7000 evaluation board */
> > +#define HW_EP_SET_HALT_COUNT_MAX 100
> > +
> > /**
> > * hw_ep_set_halt: configures ep halt & resets data toggle after clear (execute
> > * without interruption)
> > @@ -221,6 +224,7 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> > */
> > static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
> > {
> > + int count = HW_EP_SET_HALT_COUNT_MAX;
> > if (value != 0 && value != 1)
> > return -EINVAL;
> >
> > @@ -232,9 +236,9 @@ static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
> > /* data toggle - reserved for EP0 but it's in ESS */
> > hw_write(ci, reg, mask_xs|mask_xr,
> > value ? mask_xs : mask_xr);
> > - } while (value != hw_ep_get_halt(ci, num, dir));
> > + } while (value != hw_ep_get_halt(ci, num, dir) && --count > 0);
> >
> > - return 0;
> > + return count ? 0 : -EAGAIN;
> > }
> >
> > /**
> > --
> > 2.25.1
> >

--

Thanks,
Peter Chen

2021-08-27 01:36:26

by Jeaho Hwang

Subject: Re: [PATCH v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

On Fri, Aug 27, 2021 at 8:08 AM, Peter Chen <[email protected]> wrote:
>
> On 21-08-24 17:31:44, Jeaho Hwang wrote:
> > On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <[email protected]> wrote:
> > >
> > > If ctrl EP priming is failed (very rare case in standard linux),
> > > hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> > > for zynq7000.
> > >
> >
> > Hi Peter.
> > I found from zynq7000 TRM that the hardware clears Stall bit if a
> > setup packet is received on a control endpoint.
> > I think hw_ep_set_halt goes infinite loop since:
> >
> > 1. hw_ep_prime for control EP which is called from
> > isr_tr_complete_handler -> isr_setup_status_phase is failed due to a
> > setup packet received.
>
> How do you know that? Do you observe the new setup packet on the bus
> before the current status stage? Usually, the host doesn't begin new setup
> transfer before current setup transfer has finished.
>
> Peter
>

I saw hw_ep_prime return an error from the second ENDPTSETUPSTAT
check, and after that, setting the stall bit (TXS) kept failing. So I
guessed it was caused by a received setup packet.
I didn't observe the setup packet directly, e.g. with HW probes. Is
there any other reason that could produce this symptom?

As a reminder, this only reproduces on a PREEMPT_RT kernel with a
Windows 10 RNDIS host.

thanks.

static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
{
	int n = hw_ep_bit(num, dir);

	/* Synchronize before ep prime */
	wmb();

	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
		return -EAGAIN;

	hw_write(ci, OP_ENDPTPRIME, ~0, BIT(n));

	while (hw_read(ci, OP_ENDPTPRIME, BIT(n)))
		cpu_relax();
	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
		return -EAGAIN;
		~~~~~~~~~~~~~~~~

	/* status shoult be tested according with manual but it doesn't work */
	return 0;
}
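
The path I mean can be modeled in userspace roughly as follows. This is
only a sketch under assumptions: struct fake_udc, its fields and
fake_ep_prime() are made-up names, the "hardware" is simulated inline,
and one bit position is used for both registers, which is a
simplification of the real register layout.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register model; the field names are illustrative only. */
struct fake_udc {
	uint32_t setupstat;		/* models ENDPTSETUPSTAT */
	uint32_t prime;			/* models ENDPTPRIME */
	bool setup_during_prime;	/* inject a setup packet while priming */
};

static int fake_ep_prime(struct fake_udc *udc, int num, bool is_ctrl_rx)
{
	uint32_t bit = UINT32_C(1) << num;

	/* First check: a setup packet is already pending. */
	if (is_ctrl_rx && (udc->setupstat & bit))
		return -EAGAIN;

	udc->prime |= bit;

	/* Simulated hardware: optionally latch a setup packet, then consume the prime. */
	if (udc->setup_during_prime)
		udc->setupstat |= bit;
	udc->prime &= ~bit;

	/* Second check: a setup packet arrived while the prime was in flight. */
	if (is_ctrl_rx && (udc->setupstat & bit))
		return -EAGAIN;

	return 0;
}
```

In this model the second check is the only place that can fail when no
setup packet was pending on entry, which matches the symptom I see.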

> > 2. in isr_tr_complete_handler it tries to call _ep_set_halt if either
> > isr_tr_complete_low or isr_setup_status_phase returns error.
> > 3. Since the control EP got a setup packet, HW resets TXS bit just as
> > the driver sets inside hw_ep_set_halt so it goes infinite loop.
> >
> > Does it make sense? If it is right, we shouldn't call _ep_set_halt if
> > the err is -EAGAIN, which could be returned ONLY due to the setup
> > packet issue described above.
> > And the loop timeout is not required anymore.
> >
> > Can I ask your opinion on this, Peter and USB experts?
> >
> > Thanks.
> >

2021-08-28 01:43:06

by Peter Chen

Subject: Re: [PATCH v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

On 21-08-27 10:35:05, Jeaho Hwang wrote:
> On Fri, Aug 27, 2021 at 8:08 AM, Peter Chen <[email protected]> wrote:
> >
> > On 21-08-24 17:31:44, Jeaho Hwang wrote:
> > > On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <[email protected]> wrote:
> > > >
> > > > If ctrl EP priming is failed (very rare case in standard linux),
> > > > hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> > > > for zynq7000.
> > > >
> > >
> > > Hi Peter.
> > > I found from zynq7000 TRM that the hardware clears Stall bit if a
> > > setup packet is received on a control endpoint.
> > > I think hw_ep_set_halt goes infinite loop since:
> > >
> > > 1. hw_ep_prime for control EP which is called from
> > > isr_tr_complete_handler -> isr_setup_status_phase is failed due to a
> > > setup packet received.
> >
> > How do you know that? Do you observe the new setup packet on the bus
> > before the current status stage? Usually, the host doesn't begin new setup
> > transfer before current setup transfer has finished.
> >
> > Peter
> >
>
> I found an error return from the second ENDPTSETUPSTAT checking
> routine, then setting the stall bit(TXS) kept failing. So I guessed it
> is due to a setup packet received.
> I didn't observe the setup packet by e.g. HW probes. Any other reason
> to produce that symptom?

I can think of two possible reasons for that:
- A new setup packet arrives after priming.
- An interrupt occurs after the prime, and on return from the
interrupt another thread doing a USB transfer is scheduled, e.g.
usb_ep_queue from RNDIS.

From your experiments and observations, it seems the first reason is not possible.
Did you get the failure on a UP system?

Peter

>
> For reminder, only reproduced on preemp_rt kernel and with Windows(10)
> RNDIS host.
>
> thanks.
>
> static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> {
> 	int n = hw_ep_bit(num, dir);
>
> 	/* Synchronize before ep prime */
> 	wmb();
>
> 	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
> 		return -EAGAIN;
>
> 	hw_write(ci, OP_ENDPTPRIME, ~0, BIT(n));
>
> 	while (hw_read(ci, OP_ENDPTPRIME, BIT(n)))
> 		cpu_relax();
> 	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
> 		return -EAGAIN;
> 		~~~~~~~~~~~~~~~~
>
> 	/* status shoult be tested according with manual but it doesn't work */
> 	return 0;
> }
>
> > > 2. in isr_tr_complete_handler it tries to call _ep_set_halt if either
> > > isr_tr_complete_low or isr_setup_status_phase returns error.
> > > 3. Since the control EP got a setup packet, HW resets TXS bit just as
> > > the driver sets inside hw_ep_set_halt so it goes infinite loop.
> > >
> > > Does it make sense? If it is right, we shouldn't call _ep_set_halt if
> > > the err is -EAGAIN, which could be returned ONLY due to the setup
> > > packet issue described above.
> > > And the loop timeout is not required anymore.
> > >
> > > Can I ask your opinion on this, Peter and USB experts?
> > >
> > > Thanks.
> > >

--

Thanks,
Peter Chen

2021-08-28 03:13:58

by Jeaho Hwang

Subject: Re: [PATCH v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

On Sat, Aug 28, 2021 at 10:38 AM, Peter Chen <[email protected]> wrote:
>
> On 21-08-27 10:35:05, Jeaho Hwang wrote:
> > On Fri, Aug 27, 2021 at 8:08 AM, Peter Chen <[email protected]> wrote:
> > >
> > > On 21-08-24 17:31:44, Jeaho Hwang wrote:
> > > > On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <[email protected]> wrote:
> > > > >
> > > > > If ctrl EP priming is failed (very rare case in standard linux),
> > > > > hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> > > > > for zynq7000.
> > > > >
> > > >
> > > > Hi Peter.
> > > > I found from zynq7000 TRM that the hardware clears Stall bit if a
> > > > setup packet is received on a control endpoint.
> > > > I think hw_ep_set_halt goes infinite loop since:
> > > >
> > > > 1. hw_ep_prime for control EP which is called from
> > > > isr_tr_complete_handler -> isr_setup_status_phase is failed due to a
> > > > setup packet received.
> > >
> > > How do you know that? Do you observe the new setup packet on the bus
> > > before the current status stage? Usually, the host doesn't begin new setup
> > > transfer before current setup transfer has finished.
> > >
> > > Peter
> > >
> >
> > I found an error return from the second ENDPTSETUPSTAT checking
> > routine, then setting the stall bit(TXS) kept failing. So I guessed it
> > is due to a setup packet received.
> > I didn't observe the setup packet by e.g. HW probes. Any other reason
> > to produce that symptom?
>
> I guess two possible reasons for that:
> - The new setup is coming after priming
> - The interrupt occurs after prime, and when the back from interrupt,
> other thread for USB transfer is scheduled, eg, usb_ep_queue from RNDIS
>
> From your experiments and observation, it seems the first reason is not possible.

I will check whether any other thread calls into the UDC driver, but
the only workload using RNDIS was a ping sender on the host side.
Thanks for the advice.

> Did your get failure with UP system?

I'm sorry, I don't understand what "UP system" means.

>
> Peter
>
> >
> > For reminder, only reproduced on preemp_rt kernel and with Windows(10)
> > RNDIS host.
> >
> > thanks.
> >
> >
> > > > 2. in isr_tr_complete_handler it tries to call _ep_set_halt if either
> > > > isr_tr_complete_low or isr_setup_status_phase returns error.
> > > > 3. Since the control EP got a setup packet, HW resets TXS bit just as
> > > > the driver sets inside hw_ep_set_halt so it goes infinite loop.
> > > >
> > > > Does it make sense? If it is right, we shouldn't call _ep_set_halt if
> > > > the err is -EAGAIN, which could be returned ONLY due to the setup
> > > > packet issue described above.
> > > > And the loop timeout is not required anymore.
> > > >
> > > > Can I ask your opinion on this, Peter and USB experts?
> > > >
> > > > Thanks.
> > > >
>
> --
>
> Thanks,
> Peter Chen
>


--
황재호, Jay Hwang, linux team manager of RTst
010-7242-1593