2018-10-30 09:32:35

by Kurt Kanzenbach

Subject: [PATCH 0/2] net: xlinx: mdio: recheck condition after timeout

Hi,

the Xilinx mdio wait functions may return false positives under certain
circumstances: If the functions get preempted between reading the corresponding
mdio register and checking for the timeout, they could falsely indicate a
timeout.

In order to avoid the issue, the condition should be rechecked in the timeout
case.

Kurt Kanzenbach (2):
net: axienet: recheck condition after timeout in mdio_wait()
net: xilinx_emaclite: recheck condition after timeout in mdio_wait()

drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c | 21 ++++++++++++++++-----
drivers/net/ethernet/xilinx/xilinx_emaclite.c | 20 +++++++++++++++-----
2 files changed, 31 insertions(+), 10 deletions(-)

--
2.11.0



2018-10-30 09:32:38

by Kurt Kanzenbach

Subject: [PATCH 2/2] net: xilinx_emaclite: recheck condition after timeout in mdio_wait()

The function could report a false positive if it gets preempted between reading
the XEL_MDIOCTRL_OFFSET register and checking for the timeout. In such a case,
the condition has to be rechecked to avoid false positives.

Therefore, check for expected condition even after the timeout occurred.

Signed-off-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/ethernet/xilinx/xilinx_emaclite.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 639e3e99af46..957d03085bd0 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -714,19 +714,29 @@ static irqreturn_t xemaclite_interrupt(int irq, void *dev_id)
 static int xemaclite_mdio_wait(struct net_local *lp)
 {
 	unsigned long end = jiffies + 2;
+	u32 val;

 	/* wait for the MDIO interface to not be busy or timeout
 	 * after some time.
 	 */
-	while (xemaclite_readl(lp->base_addr + XEL_MDIOCTRL_OFFSET) &
-			XEL_MDIOCTRL_MDIOSTS_MASK) {
+	while (1) {
+		val = xemaclite_readl(lp->base_addr + XEL_MDIOCTRL_OFFSET);
+
+		if (!(val & XEL_MDIOCTRL_MDIOSTS_MASK))
+			return 0;
+
 		if (time_before_eq(end, jiffies)) {
-			WARN_ON(1);
-			return -ETIMEDOUT;
+			val = xemaclite_readl(lp->base_addr + XEL_MDIOCTRL_OFFSET);
+			break;
 		}
+
 		msleep(1);
 	}
-	return 0;
+	if (!(val & XEL_MDIOCTRL_MDIOSTS_MASK))
+		return 0;
+
+	WARN_ON(1);
+	return -ETIMEDOUT;
 }

/**
--
2.11.0


2018-10-30 09:34:25

by Kurt Kanzenbach

Subject: [PATCH 1/2] net: axienet: recheck condition after timeout in mdio_wait()

The function could report a false positive if it gets preempted between reading
the XAE_MDIO_MCR_OFFSET register and checking for the timeout. In such a case,
the condition has to be rechecked to avoid false positives.

Therefore, check for expected condition even after the timeout occurred.

Signed-off-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
index 757a3b37ae8a..4f13125e4942 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
@@ -21,15 +21,26 @@
 int axienet_mdio_wait_until_ready(struct axienet_local *lp)
 {
 	unsigned long end = jiffies + 2;
-	while (!(axienet_ior(lp, XAE_MDIO_MCR_OFFSET) &
-		 XAE_MDIO_MCR_READY_MASK)) {
+	u32 val;
+
+	while (1) {
+		val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET);
+
+		if (val & XAE_MDIO_MCR_READY_MASK)
+			return 0;
+
 		if (time_before_eq(end, jiffies)) {
-			WARN_ON(1);
-			return -ETIMEDOUT;
+			val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET);
+			break;
 		}
+
 		udelay(1);
 	}
-	return 0;
+	if (val & XAE_MDIO_MCR_READY_MASK)
+		return 0;
+
+	WARN_ON(1);
+	return -ETIMEDOUT;
 }

/**
--
2.11.0


2018-10-30 12:09:35

by Andrew Lunn

Subject: Re: [PATCH 0/2] net: xlinx: mdio: recheck condition after timeout

On Tue, Oct 30, 2018 at 10:31:37AM +0100, Kurt Kanzenbach wrote:
> Hi,
>
> the Xilinx mdio wait functions may return false positives under certain
> circumstances: If the functions get preempted between reading the corresponding
> mdio register and checking for the timeout, they could falsely indicate a
> timeout.

Hi Kurt

I wonder if it would be possible to add a readx_poll_timeout() which
passes two parameters to op()?

I keep seeing this basic problem in various drivers, and try to point
people towards readx_poll_timeout(), but it is not the best fit.

Otherwise, could you add an axienet_ior_read_mcr(lp), and use
readx_poll_timeout() as is?

Andrew

2018-10-30 12:13:20

by Andrew Lunn

Subject: Re: [PATCH 2/2] net: xilinx_emaclite: recheck condition after timeout in mdio_wait()

On Tue, Oct 30, 2018 at 10:31:39AM +0100, Kurt Kanzenbach wrote:
> The function could report a false positive if it gets preempted between reading
> the XEL_MDIOCTRL_OFFSET register and checking for the timeout. In such a case,
> the condition has to be rechecked to avoid false positives.
>
> Therefore, check for expected condition even after the timeout occurred.
>
> Signed-off-by: Kurt Kanzenbach <[email protected]>
> ---
> drivers/net/ethernet/xilinx/xilinx_emaclite.c | 20 +++++++++++++++-----
> 1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> index 639e3e99af46..957d03085bd0 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> +++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> @@ -714,19 +714,29 @@ static irqreturn_t xemaclite_interrupt(int irq, void *dev_id)
> static int xemaclite_mdio_wait(struct net_local *lp)
> {
> unsigned long end = jiffies + 2;
> + u32 val;
>
> /* wait for the MDIO interface to not be busy or timeout
> * after some time.
> */
> - while (xemaclite_readl(lp->base_addr + XEL_MDIOCTRL_OFFSET) &
> - XEL_MDIOCTRL_MDIOSTS_MASK) {
> + while (1) {
> + val = xemaclite_readl(lp->base_addr + XEL_MDIOCTRL_OFFSET);

Hi Kurt

It looks like readx_poll_timeout() should work here.

Andrew

2018-10-30 12:59:26

by Radhey Shyam Pandey

Subject: RE: [PATCH 2/2] net: xilinx_emaclite: recheck condition after timeout in mdio_wait()

<snip>
>
> On Tue, Oct 30, 2018 at 10:31:39AM +0100, Kurt Kanzenbach wrote:
> > The function could report a false positive if it gets preempted between
> reading
> > the XEL_MDIOCTRL_OFFSET register and checking for the timeout. In such a
> case,
> > the condition has to be rechecked to avoid false positives.
> >
> > Therefore, check for expected condition even after the timeout occurred.
> >
> > Signed-off-by: Kurt Kanzenbach <[email protected]>
> > ---
> > drivers/net/ethernet/xilinx/xilinx_emaclite.c | 20 +++++++++++++++-----
> > 1 file changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> > index 639e3e99af46..957d03085bd0 100644
> > --- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> > +++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> > @@ -714,19 +714,29 @@ static irqreturn_t xemaclite_interrupt(int irq, void
> *dev_id)
> > static int xemaclite_mdio_wait(struct net_local *lp)
> > {
> > unsigned long end = jiffies + 2;
> > + u32 val;
> >
> > /* wait for the MDIO interface to not be busy or timeout
> > * after some time.
> > */
> > - while (xemaclite_readl(lp->base_addr + XEL_MDIOCTRL_OFFSET) &
> > - XEL_MDIOCTRL_MDIOSTS_MASK) {
> > + while (1) {
> > + val = xemaclite_readl(lp->base_addr +
> XEL_MDIOCTRL_OFFSET);
>
> Hi Kurt
>
> It looks like readx_poll_timeout() should work here.

Yes, valid point. The readx_poll_timeout() API re-polls the address
after the timeout. Reusing it would simplify the flow.

>
> Andrew

2018-10-30 13:48:27

by Kurt Kanzenbach

Subject: Re: [PATCH 0/2] net: xlinx: mdio: recheck condition after timeout

Hi Andrew,

On Tue, Oct 30, 2018 at 01:08:31PM +0100, Andrew Lunn wrote:
> On Tue, Oct 30, 2018 at 10:31:37AM +0100, Kurt Kanzenbach wrote:
> > Hi,
> >
> > the Xilinx mdio wait functions may return false positives under certain
> > circumstances: If the functions get preempted between reading the corresponding
> > mdio register and checking for the timeout, they could falsely indicate a
> > timeout.
>
> Hi Kurt
>
> I wonder if it would be possible to add a readx_poll_timeout() which
> passes two parameters to op()?

Actually, I was thinking about using readx_poll_timeout(). But as you
already pointed out, it expects only one parameter for op(). I'm not
sure about adding a new readx_poll_timeout() macro.

>
> I keep seeing this basic problem in various drivers, and try to point
> people towards readx_poll_timeout(), but it is not the best fit.
>
> Otherwise, could you add an axienet_ior_read_mcr(lp), and use
> readx_poll_timeout() as is?

I guess that would work.

I'll use readx_poll_timeout() for both wait functions and send a v2.
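
A rough userspace sketch of the wrapper idea discussed above: the polling
helper calls op() with exactly one argument, so a two-argument register
accessor is hidden behind a one-argument wrapper. Everything here is
hypothetical illustration: poll_sketch() only imitates the shape of
readx_poll_timeout() (it counts polls instead of measuring time), and
fake_dev, fake_ior() and fake_ior_read_mcr() stand in for the driver's
structures and accessors. It is not the v2 patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ETIMEDOUT 110	/* stand-in for the kernel's ETIMEDOUT */
#define FAKE_MCR_OFFSET 0x0u	/* hypothetical register offset */
#define FAKE_MCR_READY_MASK 0x80u	/* hypothetical "ready" bit */

/* Imitation of the poll-with-recheck shape: op() gets ONE argument, and
 * the value is read once more after the poll budget is exhausted.
 */
#define poll_sketch(op, arg, val, cond, max_polls, ret) \
	do { \
		unsigned int left_ = (max_polls); \
		for (;;) { \
			(val) = op(arg); \
			if (cond) \
				break; \
			if (left_-- == 0) { \
				(val) = op(arg); /* final recheck */ \
				break; \
			} \
		} \
		(ret) = (cond) ? 0 : -SKETCH_ETIMEDOUT; \
	} while (0)

/* Fake device: becomes ready after a given number of register reads. */
struct fake_dev {
	uint32_t mcr;
	unsigned int reads_until_ready;
};

/* Two-argument accessor, playing the role of axienet_ior(lp, offset). */
static uint32_t fake_ior(struct fake_dev *dev, unsigned int offset)
{
	(void)offset;
	if (dev->reads_until_ready > 0)
		dev->reads_until_ready--;
	else
		dev->mcr |= FAKE_MCR_READY_MASK;
	return dev->mcr;
}

/* One-argument wrapper, playing the role of axienet_ior_read_mcr(lp). */
static uint32_t fake_ior_read_mcr(struct fake_dev *dev)
{
	return fake_ior(dev, FAKE_MCR_OFFSET);
}

int fake_mdio_wait_until_ready(struct fake_dev *dev, unsigned int max_polls)
{
	uint32_t val;
	int ret;

	poll_sketch(fake_ior_read_mcr, dev, val,
		    val & FAKE_MCR_READY_MASK, max_polls, ret);
	return ret;
}

/* Small self-check of the sketch. */
void check_wrapper_sketch(void)
{
	struct fake_dev quick = { .mcr = 0, .reads_until_ready = 3 };
	struct fake_dev stuck = { .mcr = 0, .reads_until_ready = 100 };

	assert(fake_mdio_wait_until_ready(&quick, 10) == 0);
	assert(fake_mdio_wait_until_ready(&stuck, 2) == -SKETCH_ETIMEDOUT);
}
```

In the driver itself, the wrapper would presumably be a one-line inline
function around axienet_ior(), passed as op to readx_poll_timeout().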

Thanks,
Kurt

2018-10-30 18:34:34

by David Miller

Subject: Re: [PATCH 1/2] net: axienet: recheck condition after timeout in mdio_wait()

From: Kurt Kanzenbach <[email protected]>
Date: Tue, 30 Oct 2018 10:31:38 +0100

> The function could report a false positive if it gets preempted between reading
> the XAE_MDIO_MCR_OFFSET register and checking for the timeout. In such a case,
> the condition has to be rechecked to avoid false positives.
>
> Therefore, check for expected condition even after the timeout occurred.
>
> Signed-off-by: Kurt Kanzenbach <[email protected]>
...
> if (time_before_eq(end, jiffies)) {
> - WARN_ON(1);
> - return -ETIMEDOUT;
> + val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET);
> + break;
> }
> +
> udelay(1);
> }
> - return 0;
> + if (val & XAE_MDIO_MCR_READY_MASK)
> + return 0;
> +
> + WARN_ON(1);
> + return -ETIMEDOUT;

You are not fundamentally changing the situation at all.

The condition could change right after your last read of
XAE_MDIO_MCR_OFFSET, which is the same thing that happens before your
modifications to this code.

It sounds more like the timeout is slightly too short, and that's the
real problem that causes whatever behavior you think you are fixing
here.

I'm not applying this.

2018-10-30 18:34:37

by David Miller

Subject: Re: [PATCH 2/2] net: xilinx_emaclite: recheck condition after timeout in mdio_wait()

From: Kurt Kanzenbach <[email protected]>
Date: Tue, 30 Oct 2018 10:31:39 +0100

> The function could report a false positive if it gets preempted between reading
> the XEL_MDIOCTRL_OFFSET register and checking for the timeout. In such a case,
> the condition has to be rechecked to avoid false positives.
>
> Therefore, check for expected condition even after the timeout occurred.
>
> Signed-off-by: Kurt Kanzenbach <[email protected]>

Same objections as your previous patch.

This isn't fixing anything.

2018-10-31 13:08:43

by Kurt Kanzenbach

Subject: Re: [PATCH 1/2] net: axienet: recheck condition after timeout in mdio_wait()

On Tue, Oct 30, 2018 at 11:25:11AM -0700, David Miller wrote:
> From: Kurt Kanzenbach <[email protected]>
> Date: Tue, 30 Oct 2018 10:31:38 +0100
>
> > The function could report a false positive if it gets preempted between reading
> > the XAE_MDIO_MCR_OFFSET register and checking for the timeout. In such a case,
> > the condition has to be rechecked to avoid false positives.
> >
> > Therefore, check for expected condition even after the timeout occurred.
> >
> > Signed-off-by: Kurt Kanzenbach <[email protected]>
> ...
> > if (time_before_eq(end, jiffies)) {
> > - WARN_ON(1);
> > - return -ETIMEDOUT;
> > + val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET);
> > + break;
> > }
> > +
> > udelay(1);
> > }
> > - return 0;
> > + if (val & XAE_MDIO_MCR_READY_MASK)
> > + return 0;
> > +
> > + WARN_ON(1);
> > + return -ETIMEDOUT;
>
> You are not fundamentally changing the situation at all.
>
> The condition could change right after your last read of
> XAE_MDIO_MCR_OFFSET, which is the same thing that happens before your
> modifications to this code.

That's true. The problem is different: If the current task gets
preempted by a higher priority task between checking the condition and
the timeout code, then a timeout might be falsely detected. Consider the
following events:

loop:
check mdio condition
------------------------
task with real time priority may run for a long time
------------------------
check for timeout
wait

That's why I've added the recheck of the condition in the timeout case.
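
The event sequence above can be reproduced in a small userspace simulation.
This is only a sketch under stated assumptions: struct sim, sim_ready() and
sim_maybe_preempt() are made-up stand-ins for the MDIO register, jiffies and
the scheduler, and the plain <= comparison replaces time_before_eq(). It
compares the original loop shape with the rechecking one.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_ETIMEDOUT 110	/* stand-in for the kernel's ETIMEDOUT */

/* Hypothetical model of the clock, the device and one preemption. */
struct sim {
	unsigned long now;		/* fake jiffies */
	unsigned long ready_at;		/* tick at which the device is ready */
	unsigned long preempt_len;	/* ticks lost to a higher prio task */
	bool preempted;			/* preemption already happened? */
};

static bool sim_ready(const struct sim *s)
{
	return s->now >= s->ready_at;
}

/* Models being preempted once, between register read and timeout check. */
static void sim_maybe_preempt(struct sim *s)
{
	if (!s->preempted) {
		s->now += s->preempt_len;
		s->preempted = true;
	}
}

/* Original shape: check condition, check timeout, no recheck. */
int wait_no_recheck(struct sim *s, unsigned long timeout)
{
	unsigned long end = s->now + timeout;

	while (!sim_ready(s)) {
		sim_maybe_preempt(s);
		if (end <= s->now)
			return -SIM_ETIMEDOUT;
		s->now++;	/* models msleep(1) / udelay(1) */
	}
	return 0;
}

/* Patched shape: read the condition once more after the timeout hit. */
int wait_with_recheck(struct sim *s, unsigned long timeout)
{
	unsigned long end = s->now + timeout;
	bool ready;

	for (;;) {
		ready = sim_ready(s);
		if (ready)
			return 0;
		sim_maybe_preempt(s);
		if (end <= s->now) {
			ready = sim_ready(s);	/* recheck after timeout */
			break;
		}
		s->now++;
	}
	return ready ? 0 : -SIM_ETIMEDOUT;
}

/* Device ready at tick 1, task preempted for 5 ticks, timeout 2 ticks. */
void check_scenarios(void)
{
	struct sim a = { .now = 0, .ready_at = 1, .preempt_len = 5 };
	struct sim b = a;
	struct sim dead = { .now = 0, .ready_at = 100, .preempted = true };

	assert(wait_no_recheck(&a, 2) == -SIM_ETIMEDOUT);	/* false timeout */
	assert(wait_with_recheck(&b, 2) == 0);
	assert(wait_with_recheck(&dead, 2) == -SIM_ETIMEDOUT);	/* real one */
}
```

The recheck does not close every window (the device may still become ready
right after the final read), but it avoids reporting a timeout for a
condition that was already true when the task got back on the CPU.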

>
> It sounds more like the timeout is slightly too short, and that's the
> real problem that causes whatever behavior you think you are fixing
> here.

The timeout value is not the problem here.

Thanks,
Kurt

Subject: Re: [PATCH 1/2] net: axienet: recheck condition after timeout in mdio_wait()

On 2018-10-30 11:25:11 [-0700], David Miller wrote:
> From: Kurt Kanzenbach <[email protected]>
> Date: Tue, 30 Oct 2018 10:31:38 +0100
>
> > The function could report a false positive if it gets preempted between reading
> > the XAE_MDIO_MCR_OFFSET register and checking for the timeout. In such a case,
> > the condition has to be rechecked to avoid false positives.
> >
> > Therefore, check for expected condition even after the timeout occurred.
> >
> > Signed-off-by: Kurt Kanzenbach <[email protected]>
> ...
> > if (time_before_eq(end, jiffies)) {
> > - WARN_ON(1);
> > - return -ETIMEDOUT;
> > + val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET);
> > + break;
> > }
> > +
> > udelay(1);
> > }
> > - return 0;
> > + if (val & XAE_MDIO_MCR_READY_MASK)
> > + return 0;
> > +
> > + WARN_ON(1);
> > + return -ETIMEDOUT;
>
> You are not fundamentally changing the situation at all.
>
> The condition could change right after your last read of
> XAE_MDIO_MCR_OFFSET, which is the same thing that happens before your
> modifications to this code.
>
> It sounds more like the timeout is slightly too short, and that's the
> real problem that causes whatever behavior you think you are fixing
> here.

There is a timeout of two jiffies. If the condition is not true within
those two jiffies, it will check the condition one last time after the
timeout occurred.
If the task got preempted after reading from the register but before
the timeout check, it is possible that the task gets back on the CPU
after the timeout occurred. And since the timeout occurred, it won't
check whether the condition changed:
Time
0 +---+
| c | Check for condition (false)
| c |
| c |
| c |
| c |
| P | Task gets preempted
| |
| O | Condition is true, task still preempted, no check
| |
2 | T | The timeout is true
| |
| |
| |
| p | Task gets back on the CPU, no re-check of condition

In the last step, the condition is not checked after the timeout
occurred, and the code wrongly assumes that the condition is not true.
Increasing the timeout would help only as long as the task is not
preempted past the new timeout.
The same pattern (check the condition after the timeout) is also used in
wait_event_timeout() and readx_poll_timeout(). Would you prefer to
refactor this with readx_poll_timeout() instead?

> I'm not applying this.
Please reconsider.

Sebastian