Accidentally noticed, that this driver is the only user of
while (timer_after(jiffies...)).
It looks like typo, because likely this while loop will finish after 1st
iteration, because time_after() returns true when 1st argument _is after_
2nd one.
Fix it by negating time_after return value inside while loops statement
Fixes: 753a026cfec1 ("net: ocelot: add FDMA support")
Signed-off-by: Pavel Skripkin <[email protected]>
---
drivers/net/ethernet/mscc/ocelot_fdma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mscc/ocelot_fdma.c b/drivers/net/ethernet/mscc/ocelot_fdma.c
index dffa597bffe6..4500fed3ce5c 100644
--- a/drivers/net/ethernet/mscc/ocelot_fdma.c
+++ b/drivers/net/ethernet/mscc/ocelot_fdma.c
@@ -104,7 +104,7 @@ static int ocelot_fdma_wait_chan_safe(struct ocelot *ocelot, int chan)
safe = ocelot_fdma_readl(ocelot, MSCC_FDMA_CH_SAFE);
if (safe & BIT(chan))
return 0;
- } while (time_after(jiffies, timeout));
+ } while (!time_after(jiffies, timeout));
return -ETIMEDOUT;
}
--
2.36.1
On Thu, May 19, 2022 at 11:40:17PM +0300, Pavel Skripkin wrote:
> Accidentally noticed, that this driver is the only user of
> while (timer_after(jiffies...)).
>
> It looks like typo, because likely this while loop will finish after 1st
> iteration, because time_after() returns true when 1st argument _is after_
> 2nd one.
>
> Fix it by negating time_after return value inside while loops statement
A better fix would be to use one of the helpers in linux/iopoll.h.
There is a second bug in the current code:
static int ocelot_fdma_wait_chan_safe(struct ocelot *ocelot, int chan)
{
unsigned long timeout;
u32 safe;
timeout = jiffies + usecs_to_jiffies(OCELOT_FDMA_CH_SAFE_TIMEOUT_US);
do {
safe = ocelot_fdma_readl(ocelot, MSCC_FDMA_CH_SAFE);
if (safe & BIT(chan))
return 0;
} while (time_after(jiffies, timeout));
return -ETIMEDOUT;
}
The scheduler could put the thread to sleep, and it does not get woken
up for OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware
has done its thing, but you exit the while loop and return -ETIMEDOUT.
linux/iopoll.h handles this correctly by testing the state one more
time after the timeout has expired.
Andrew
Hi Andrew,
On 5/20/22 15:40, Andrew Lunn wrote:
> On Thu, May 19, 2022 at 11:40:17PM +0300, Pavel Skripkin wrote:
>> Accidentally noticed, that this driver is the only user of
>> while (timer_after(jiffies...)).
>>
>> It looks like typo, because likely this while loop will finish after 1st
>> iteration, because time_after() returns true when 1st argument _is after_
>> 2nd one.
>>
>> Fix it by negating time_after return value inside while loops statement
>
> A better fix would be to use one of the helpers in linux/iopoll.h.
>
> There is a second bug in the current code:
>
> static int ocelot_fdma_wait_chan_safe(struct ocelot *ocelot, int chan)
> {
> unsigned long timeout;
> u32 safe;
>
> timeout = jiffies + usecs_to_jiffies(OCELOT_FDMA_CH_SAFE_TIMEOUT_US);
> do {
> safe = ocelot_fdma_readl(ocelot, MSCC_FDMA_CH_SAFE);
> if (safe & BIT(chan))
> return 0;
> } while (time_after(jiffies, timeout));
>
> return -ETIMEDOUT;
> }
>
> The scheduler could put the thread to sleep, and it does not get woken
> up for OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware
> has done its thing, but you exit the while loop and return -ETIMEDOUT.
>
> linux/iopoll.h handles this correctly by testing the state one more
> time after the timeout has expired.
>
I wasn't aware about these macros. Thanks for pointing out to that header!
Will send v2 soon,
With regards,
Pavel Skripkin
Accidentally noticed, that this driver is the only user of
while (time_after(jiffies...)).
It looks like typo, because likely this while loop will finish after 1st
iteration, because time_after() returns true when 1st argument _is after_
2nd one.
There is one possible problem with this poll loop: the scheduler could put
the thread to sleep, and it does not get woken up for
OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware has done
its thing, but you exit the while loop and return -ETIMEDOUT.
Fix it by using sane poll API that avoids all problems described above
Fixes: 753a026cfec1 ("net: ocelot: add FDMA support")
Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: Pavel Skripkin <[email protected]>
---
I can't say if 0 is a good choise for 5th readx_poll_timeout() argument,
so this patch is build-tested only.
Testing and suggestions are welcomed!
Changes since v1:
- Fixed typos in title and commit message
- Remove while loop and use readx_poll_timeout as suggested by
Andrew
---
drivers/net/ethernet/mscc/ocelot_fdma.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/mscc/ocelot_fdma.c b/drivers/net/ethernet/mscc/ocelot_fdma.c
index dffa597bffe6..82abfa8394c8 100644
--- a/drivers/net/ethernet/mscc/ocelot_fdma.c
+++ b/drivers/net/ethernet/mscc/ocelot_fdma.c
@@ -94,19 +94,18 @@ static void ocelot_fdma_activate_chan(struct ocelot *ocelot, dma_addr_t dma,
ocelot_fdma_writel(ocelot, MSCC_FDMA_CH_ACTIVATE, BIT(chan));
}
+static u32 ocelot_fdma_read_ch_safe(struct ocelot *ocelot)
+{
+ return ocelot_fdma_readl(ocelot, MSCC_FDMA_CH_SAFE);
+}
+
static int ocelot_fdma_wait_chan_safe(struct ocelot *ocelot, int chan)
{
- unsigned long timeout;
u32 safe;
- timeout = jiffies + usecs_to_jiffies(OCELOT_FDMA_CH_SAFE_TIMEOUT_US);
- do {
- safe = ocelot_fdma_readl(ocelot, MSCC_FDMA_CH_SAFE);
- if (safe & BIT(chan))
- return 0;
- } while (time_after(jiffies, timeout));
-
- return -ETIMEDOUT;
+ return readx_poll_timeout(ocelot_fdma_read_ch_safe, ocelot, safe,
+ safe & BIT(chan), 0,
+ OCELOT_FDMA_CH_SAFE_TIMEOUT_US);
}
static void ocelot_fdma_dcb_set_data(struct ocelot_fdma_dcb *dcb,
--
2.36.1
On Sat, May 21, 2022 at 03:55:30PM +0200, Andrew Lunn wrote:
> On Sat, May 21, 2022 at 12:31:15AM +0300, Pavel Skripkin wrote:
> > Accidentally noticed, that this driver is the only user of
> > while (time_after(jiffies...)).
> >
> > It looks like typo, because likely this while loop will finish after 1st
> > iteration, because time_after() returns true when 1st argument _is after_
> > 2nd one.
> >
> > There is one possible problem with this poll loop: the scheduler could put
> > the thread to sleep, and it does not get woken up for
> > OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware has done
> > its thing, but you exit the while loop and return -ETIMEDOUT.
> >
> > Fix it by using sane poll API that avoids all problems described above
> >
> > Fixes: 753a026cfec1 ("net: ocelot: add FDMA support")
> > Suggested-by: Andrew Lunn <[email protected]>
> > Signed-off-by: Pavel Skripkin <[email protected]>
> > ---
> >
> > I can't say if 0 is a good choise for 5th readx_poll_timeout() argument,
> > so this patch is build-tested only.
>
> > Testing and suggestions are welcomed!
>
> If you had the hardware, i would suggest you profile how often it does
> complete on the first iteration. And when it does not complete on the
> first iteration, how many more iterations it needs.
>
> Tobias made an interesting observation with the mv88e6xxx switch. He
> found that two tight polls was enough 99% of the time. Putting a sleep
> in there doubles the time it took to setup the switch. So he ended up
> with a hybrid of open coded polling twice, followed by iopoll with a
> timer value set.
>
> That was with a heavily used poll function. How often is this function
> used? No point in overly optimising this if it is not used much.
If you're looking at me, I don't have the hardware to test, sorry.
Frame DMA is one of the components NXP removed when building their DSA
variants of these switches. But the function is called once or twice per
NAPI poll cycle, so it's worth optimizing as much as possible.
Clement, could you please do some testing? The patch that Andrew is
talking about is 35da1dfd9484 ("net: dsa: mv88e6xxx: Improve performance
of busy bit polling").
On Sat, May 21, 2022 at 12:31:15AM +0300, Pavel Skripkin wrote:
> Accidentally noticed, that this driver is the only user of
> while (time_after(jiffies...)).
>
> It looks like typo, because likely this while loop will finish after 1st
> iteration, because time_after() returns true when 1st argument _is after_
> 2nd one.
>
> There is one possible problem with this poll loop: the scheduler could put
> the thread to sleep, and it does not get woken up for
> OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware has done
> its thing, but you exit the while loop and return -ETIMEDOUT.
>
> Fix it by using sane poll API that avoids all problems described above
>
> Fixes: 753a026cfec1 ("net: ocelot: add FDMA support")
> Suggested-by: Andrew Lunn <[email protected]>
> Signed-off-by: Pavel Skripkin <[email protected]>
> ---
>
> I can't say if 0 is a good choise for 5th readx_poll_timeout() argument,
> so this patch is build-tested only.
> Testing and suggestions are welcomed!
If you had the hardware, i would suggest you profile how often it does
complete on the first iteration. And when it does not complete on the
first iteration, how many more iterations it needs.
Tobias made an interesting observation with the mv88e6xxx switch. He
found that two tight polls was enough 99% of the time. Putting a sleep
in there doubles the time it took to setup the switch. So he ended up
with a hybrid of open coded polling twice, followed by iopoll with a
timer value set.
That was with a heavily used poll function. How often is this function
used? No point in overly optimising this if it is not used much.
Andrew
Le Sat, 21 May 2022 16:21:09 +0000,
Vladimir Oltean <[email protected]> a écrit :
> On Sat, May 21, 2022 at 03:55:30PM +0200, Andrew Lunn wrote:
> > On Sat, May 21, 2022 at 12:31:15AM +0300, Pavel Skripkin wrote:
> > > Accidentally noticed, that this driver is the only user of
> > > while (time_after(jiffies...)).
> > >
> > > It looks like typo, because likely this while loop will finish after 1st
> > > iteration, because time_after() returns true when 1st argument _is after_
> > > 2nd one.
> > >
> > > There is one possible problem with this poll loop: the scheduler could put
> > > the thread to sleep, and it does not get woken up for
> > > OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware has done
> > > its thing, but you exit the while loop and return -ETIMEDOUT.
> > >
> > > Fix it by using sane poll API that avoids all problems described above
> > >
> > > Fixes: 753a026cfec1 ("net: ocelot: add FDMA support")
> > > Suggested-by: Andrew Lunn <[email protected]>
> > > Signed-off-by: Pavel Skripkin <[email protected]>
> > > ---
> > >
> > > I can't say if 0 is a good choise for 5th readx_poll_timeout() argument,
> > > so this patch is build-tested only.
> >
> > > Testing and suggestions are welcomed!
> >
> > If you had the hardware, i would suggest you profile how often it does
> > complete on the first iteration. And when it does not complete on the
> > first iteration, how many more iterations it needs.
> >
> > Tobias made an interesting observation with the mv88e6xxx switch. He
> > found that two tight polls was enough 99% of the time. Putting a sleep
> > in there doubles the time it took to setup the switch. So he ended up
> > with a hybrid of open coded polling twice, followed by iopoll with a
> > timer value set.
> >
> > That was with a heavily used poll function. How often is this function
> > used? No point in overly optimising this if it is not used much.
>
> If you're looking at me, I don't have the hardware to test, sorry.
> Frame DMA is one of the components NXP removed when building their DSA
> variants of these switches. But the function is called once or twice per
> NAPI poll cycle, so it's worth optimizing as much as possible.
>
> Clement, could you please do some testing? The patch that Andrew is
> talking about is 35da1dfd9484 ("net: dsa: mv88e6xxx: Improve performance
> of busy bit polling").
Ok, I'll have to wake up that ocelot board but I'll try to do
that.
--
Clément Léger,
Embedded Linux and Kernel engineer at Bootlin
https://bootlin.com
On Mon, 23 May 2022 16:00:04 +0200 Clément Léger wrote:
> > If you're looking at me, I don't have the hardware to test, sorry.
> > Frame DMA is one of the components NXP removed when building their DSA
> > variants of these switches. But the function is called once or twice per
> > NAPI poll cycle, so it's worth optimizing as much as possible.
> >
> > Clement, could you please do some testing? The patch that Andrew is
> > talking about is 35da1dfd9484 ("net: dsa: mv88e6xxx: Improve performance
> > of busy bit polling").
>
> Ok, I'll have to wake up that ocelot board but I'll try to do that.
We can't hold it in PW for much longer, please repost once tested.
Le Sat, 21 May 2022 16:21:09 +0000,
Vladimir Oltean <[email protected]> a écrit :
> On Sat, May 21, 2022 at 03:55:30PM +0200, Andrew Lunn wrote:
> > On Sat, May 21, 2022 at 12:31:15AM +0300, Pavel Skripkin wrote:
> > > Accidentally noticed, that this driver is the only user of
> > > while (time_after(jiffies...)).
> > >
> > > It looks like typo, because likely this while loop will finish after 1st
> > > iteration, because time_after() returns true when 1st argument _is after_
> > > 2nd one.
> > >
> > > There is one possible problem with this poll loop: the scheduler could put
> > > the thread to sleep, and it does not get woken up for
> > > OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware has done
> > > its thing, but you exit the while loop and return -ETIMEDOUT.
> > >
> > > Fix it by using sane poll API that avoids all problems described above
> > >
> > > Fixes: 753a026cfec1 ("net: ocelot: add FDMA support")
> > > Suggested-by: Andrew Lunn <[email protected]>
> > > Signed-off-by: Pavel Skripkin <[email protected]>
> > > ---
> > >
> > > I can't say if 0 is a good choise for 5th readx_poll_timeout() argument,
> > > so this patch is build-tested only.
> >
> > > Testing and suggestions are welcomed!
> >
> > If you had the hardware, i would suggest you profile how often it does
> > complete on the first iteration. And when it does not complete on the
> > first iteration, how many more iterations it needs.
> >
> > Tobias made an interesting observation with the mv88e6xxx switch. He
> > found that two tight polls was enough 99% of the time. Putting a sleep
> > in there doubles the time it took to setup the switch. So he ended up
> > with a hybrid of open coded polling twice, followed by iopoll with a
> > timer value set.
> >
> > That was with a heavily used poll function. How often is this function
> > used? No point in overly optimising this if it is not used much.
>
> If you're looking at me, I don't have the hardware to test, sorry.
> Frame DMA is one of the components NXP removed when building their DSA
> variants of these switches. But the function is called once or twice per
> NAPI poll cycle, so it's worth optimizing as much as possible.
>
> Clement, could you please do some testing? The patch that Andrew is
> talking about is 35da1dfd9484 ("net: dsa: mv88e6xxx: Improve performance
> of busy bit polling").
So I actually tested and added logging to see if the CH_SAFE
register bits are set for the channel on the first iteration. From
what I could test (iperf3 with huge/non huge packets, TCP/UDP), it
always return true on the first try. So since I think Pavel solution
is ok to go with.
However, since ocelot_fdma_wait_chan_safe() is also called in the napi
poll function of this driver, I don't think sleeping is allowed (softirq
context) and thus I would suggest using the readx_poll_timeout_atomic()
function instead.
Regarding the delay to wait between each read, I don't have any
information about that possible value, the datasheet only says "wait
for the bit to be set" so I guess we'll have to live with an
approximate value.
Thanks,
--
Clément Léger,
Embedded Linux and Kernel engineer at Bootlin
https://bootlin.com
On 6/24/22 18:14, Clément Léger wrote:
> So I actually tested and added logging to see if the CH_SAFE
> register bits are set for the channel on the first iteration. From
> what I could test (iperf3 with huge/non huge packets, TCP/UDP), it
> always return true on the first try. So since I think Pavel solution
> is ok to go with.
>
> However, since ocelot_fdma_wait_chan_safe() is also called in the napi
> poll function of this driver, I don't think sleeping is allowed (softirq
> context) and thus I would suggest using the readx_poll_timeout_atomic()
> function instead.
>
> Regarding the delay to wait between each read, I don't have any
> information about that possible value, the datasheet only says "wait
> for the bit to be set" so I guess we'll have to live with an
> approximate value.
>
Thank you for testing!
I will update update v3 with _atomic variant
Thanks,
--Pavel Skripkin