2019-03-18 11:09:40

by Lorenzo Bianconi

Subject: [RFC] mt76: usb: reduce locking in mt76u_tx_tasklet

Similar to the PCI counterpart, reduce locking in mt76u_tx_tasklet, since
q->head is managed only in mt76u_tx_tasklet and q->queued is updated
while holding q->lock.

Signed-off-by: Lorenzo Bianconi <[email protected]>
---
drivers/net/wireless/mediatek/mt76/usb.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
index ac03acdae279..8cd70c32d77a 100644
--- a/drivers/net/wireless/mediatek/mt76/usb.c
+++ b/drivers/net/wireless/mediatek/mt76/usb.c
@@ -634,29 +634,33 @@ static void mt76u_tx_tasklet(unsigned long data)
int i;

for (i = 0; i < IEEE80211_NUM_ACS; i++) {
+ u32 n_queued = 0, n_sw_queued = 0;
+
sq = &dev->q_tx[i];
q = sq->q;

- spin_lock_bh(&q->lock);
- while (true) {
+ while (q->queued > n_queued) {
buf = &q->entry[q->head].ubuf;
- if (!buf->done || !q->queued)
+ if (!buf->done)
break;

if (q->entry[q->head].schedule) {
q->entry[q->head].schedule = false;
- sq->swq_queued--;
+ n_sw_queued++;
}

entry = q->entry[q->head];
q->head = (q->head + 1) % q->ndesc;
- q->queued--;
+ n_queued++;

- spin_unlock_bh(&q->lock);
dev->drv->tx_complete_skb(dev, i, &entry);
- spin_lock_bh(&q->lock);
}

+ spin_lock_bh(&q->lock);
+
+ sq->swq_queued -= n_sw_queued;
+ q->queued -= n_queued;
+
wake = q->stopped && q->queued < q->ndesc - 8;
if (wake)
q->stopped = false;
--
2.20.1
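
The batching idea in the patch (count completions locally, then apply one locked update) can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: `struct demo_queue`, `demo_reap` and `DEMO_NDESC` are hypothetical names, a pthread mutex stands in for `spin_lock_bh`, and, unlike the RFC, the sketch snapshots the counter under the lock before walking the ring. It also assumes the enqueue path clears an entry's flag before it increments `queued` under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NDESC 8

/* Hypothetical stand-in for the driver's queue, not struct mt76_queue. */
struct demo_queue {
	pthread_mutex_t lock;		/* protects queued */
	atomic_bool done[DEMO_NDESC];	/* set by the completion path */
	uint32_t head;			/* used only by the reaper */
	uint32_t queued;		/* shared with the enqueue path */
};

/* Reap completed entries in order; returns how many were reaped. */
static uint32_t demo_reap(struct demo_queue *q)
{
	uint32_t n_queued = 0, limit;

	/* Assumption: snapshot under the lock (the RFC reads it unlocked). */
	pthread_mutex_lock(&q->lock);
	limit = q->queued;
	pthread_mutex_unlock(&q->lock);

	while (n_queued < limit) {
		if (!atomic_load_explicit(&q->done[q->head],
					  memory_order_acquire))
			break;
		q->head = (q->head + 1) % DEMO_NDESC;
		n_queued++;	/* local count, no lock needed */
	}

	/* One locked update instead of a lock round trip per entry. */
	pthread_mutex_lock(&q->lock);
	assert(q->queued >= n_queued);
	q->queued -= n_queued;
	pthread_mutex_unlock(&q->lock);

	return n_queued;
}
```

With three entries queued and the first two flagged as done, a call reaps two entries, leaves `queued` at 1 and stops at the first entry that is not yet complete.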



2019-03-19 11:07:15

by Stanislaw Gruszka

Subject: Re: [RFC] mt76: usb: reduce locking in mt76u_tx_tasklet

On Mon, Mar 18, 2019 at 12:09:32PM +0100, Lorenzo Bianconi wrote:
> Similar to pci counterpart, reduce locking in mt76u_tx_tasklet since
> q->head is managed just in mt76u_tx_tasklet and q->queued is updated
> holding q->lock
>
> Signed-off-by: Lorenzo Bianconi <[email protected]>
> ---
> drivers/net/wireless/mediatek/mt76/usb.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> index ac03acdae279..8cd70c32d77a 100644
> --- a/drivers/net/wireless/mediatek/mt76/usb.c
> +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> @@ -634,29 +634,33 @@ static void mt76u_tx_tasklet(unsigned long data)
> int i;
>
> for (i = 0; i < IEEE80211_NUM_ACS; i++) {
> + u32 n_queued = 0, n_sw_queued = 0;
> +
> sq = &dev->q_tx[i];
> q = sq->q;
>
> - spin_lock_bh(&q->lock);
> - while (true) {
> + while (q->queued > n_queued) {
> buf = &q->entry[q->head].ubuf;
> - if (!buf->done || !q->queued)
> + if (!buf->done)
> break;

I'm still thinking about whether this is safe. It is somewhat tricky to
read a variable outside the lock, because in that case there is no
guarantee of when a value written on one CPU becomes visible on a
different CPU. And for USB it is not only q->queued but also buf->done.

Stanislaw


2019-03-19 12:58:20

by Lorenzo Bianconi

Subject: Re: [RFC] mt76: usb: reduce locking in mt76u_tx_tasklet

> On Mon, Mar 18, 2019 at 12:09:32PM +0100, Lorenzo Bianconi wrote:
> > Similar to pci counterpart, reduce locking in mt76u_tx_tasklet since
> > q->head is managed just in mt76u_tx_tasklet and q->queued is updated
> > holding q->lock
> >
> > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > ---
> > drivers/net/wireless/mediatek/mt76/usb.c | 18 +++++++++++-------
> > 1 file changed, 11 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> > index ac03acdae279..8cd70c32d77a 100644
> > --- a/drivers/net/wireless/mediatek/mt76/usb.c
> > +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> > @@ -634,29 +634,33 @@ static void mt76u_tx_tasklet(unsigned long data)
> > int i;
> >
> > for (i = 0; i < IEEE80211_NUM_ACS; i++) {
> > + u32 n_queued = 0, n_sw_queued = 0;
> > +
> > sq = &dev->q_tx[i];
> > q = sq->q;
> >
> > - spin_lock_bh(&q->lock);
> > - while (true) {
> > + while (q->queued > n_queued) {
> > buf = &q->entry[q->head].ubuf;
> > - if (!buf->done || !q->queued)
> > + if (!buf->done)
> > break;
>
> I'm still thinking if this is safe or not. Is somewhat tricky to
> read variable outside the lock because in such case there is no time
> guarantee when variable written on one CPU gets updated value on
> different CPU. And for USB is not only q->queued but also buf->done.

Hi Stanislaw,

I was wondering whether this is safe as well, but q->queued is updated while
holding q->lock, and I guess that ensures the tx and status code paths do not overlap.
Regarding buf->done, it is already updated without holding the lock in mt76u_complete_tx.

Regards,
Lorenzo

>
> Stanislaw
>



2019-03-19 16:04:42

by Stanislaw Gruszka

Subject: Re: [RFC] mt76: usb: reduce locking in mt76u_tx_tasklet

On Tue, Mar 19, 2019 at 01:58:13PM +0100, Lorenzo Bianconi wrote:
> > On Mon, Mar 18, 2019 at 12:09:32PM +0100, Lorenzo Bianconi wrote:
> > > Similar to pci counterpart, reduce locking in mt76u_tx_tasklet since
> > > q->head is managed just in mt76u_tx_tasklet and q->queued is updated
> > > holding q->lock
> > >
> > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > ---
> > > drivers/net/wireless/mediatek/mt76/usb.c | 18 +++++++++++-------
> > > 1 file changed, 11 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> > > index ac03acdae279..8cd70c32d77a 100644
> > > --- a/drivers/net/wireless/mediatek/mt76/usb.c
> > > +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> > > @@ -634,29 +634,33 @@ static void mt76u_tx_tasklet(unsigned long data)
> > > int i;
> > >
> > > for (i = 0; i < IEEE80211_NUM_ACS; i++) {
> > > + u32 n_queued = 0, n_sw_queued = 0;
> > > +
> > > sq = &dev->q_tx[i];
> > > q = sq->q;
> > >
> > > - spin_lock_bh(&q->lock);
> > > - while (true) {
> > > + while (q->queued > n_queued) {
> > > buf = &q->entry[q->head].ubuf;
> > > - if (!buf->done || !q->queued)
> > > + if (!buf->done)
> > > break;
> >
> > I'm still thinking if this is safe or not. Is somewhat tricky to
> > read variable outside the lock because in such case there is no time
> > guarantee when variable written on one CPU gets updated value on
> > different CPU. And for USB is not only q->queued but also buf->done.
>
> Hi Stanislaw,
>
> I was wondering if this is safe as well, but q->queued is updated holding q->lock
> and I guess it will ensure to not overlap tx and status code path.

Overlap will not happen. At worst, q->queued will appear smaller in
tx_tasklet than in tx_queue_skb.

> Regarding buf->done, it is already updated without holding the lock in mt76u_complete_tx

That's actually a bug, but it's not important: if tx_tasklet does not
see the buf->done = true update made by mt76u_complete_tx on a different
CPU, it will not complete the skb. That will be done on the next tx_tasklet
iteration. The opposite situation would be worse.

Stanislaw

2019-03-19 16:23:31

by Lorenzo Bianconi

Subject: Re: [RFC] mt76: usb: reduce locking in mt76u_tx_tasklet

> On Tue, Mar 19, 2019 at 01:58:13PM +0100, Lorenzo Bianconi wrote:
> > > On Mon, Mar 18, 2019 at 12:09:32PM +0100, Lorenzo Bianconi wrote:
> > > > Similar to pci counterpart, reduce locking in mt76u_tx_tasklet since
> > > > q->head is managed just in mt76u_tx_tasklet and q->queued is updated
> > > > holding q->lock
> > > >
> > > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > > ---
> > > > drivers/net/wireless/mediatek/mt76/usb.c | 18 +++++++++++-------
> > > > 1 file changed, 11 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > index ac03acdae279..8cd70c32d77a 100644
> > > > --- a/drivers/net/wireless/mediatek/mt76/usb.c
> > > > +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > @@ -634,29 +634,33 @@ static void mt76u_tx_tasklet(unsigned long data)
> > > > int i;
> > > >
> > > > for (i = 0; i < IEEE80211_NUM_ACS; i++) {
> > > > + u32 n_queued = 0, n_sw_queued = 0;
> > > > +
> > > > sq = &dev->q_tx[i];
> > > > q = sq->q;
> > > >
> > > > - spin_lock_bh(&q->lock);
> > > > - while (true) {
> > > > + while (q->queued > n_queued) {
> > > > buf = &q->entry[q->head].ubuf;
> > > > - if (!buf->done || !q->queued)
> > > > + if (!buf->done)
> > > > break;
> > >
> > > I'm still thinking if this is safe or not. Is somewhat tricky to
> > > read variable outside the lock because in such case there is no time
> > > guarantee when variable written on one CPU gets updated value on
> > > different CPU. And for USB is not only q->queued but also buf->done.
> >
> > Hi Stanislaw,
> >
> > I was wondering if this is safe as well, but q->queued is updated holding q->lock
> > and I guess it will ensure to not overlap tx and status code path.
>
> Overlap will not happen, at worst what can happen is q->queued will be
> smaller on tx_tasklet than on tx_queue_skb.

Yes, that is the point :)

>
> > Regarding buf->done, it is already updated without holding the lock in mt76u_complete_tx
>
> That's actually a bug, but it's not important, if tx_tasklet will not
> see updated buf->done <- true value by mt76u_complete_tx on different
> cpu, it will not complete skb. It will be done on next tx_tasklet iteration.
> Worse thing would be opposite situation.

Can this really occur? (since queued is updated while holding the lock)

>
> Stanislaw



2019-03-20 08:11:34

by Stanislaw Gruszka

Subject: Re: [RFC] mt76: usb: reduce locking in mt76u_tx_tasklet

On Tue, Mar 19, 2019 at 05:23:25PM +0100, Lorenzo Bianconi wrote:
> > On Tue, Mar 19, 2019 at 01:58:13PM +0100, Lorenzo Bianconi wrote:
> > > > On Mon, Mar 18, 2019 at 12:09:32PM +0100, Lorenzo Bianconi wrote:
> > > > > Similar to pci counterpart, reduce locking in mt76u_tx_tasklet since
> > > > > q->head is managed just in mt76u_tx_tasklet and q->queued is updated
> > > > > holding q->lock
> > > > >
> > > > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > > > ---
> > > > > drivers/net/wireless/mediatek/mt76/usb.c | 18 +++++++++++-------
> > > > > 1 file changed, 11 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > index ac03acdae279..8cd70c32d77a 100644
> > > > > --- a/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > @@ -634,29 +634,33 @@ static void mt76u_tx_tasklet(unsigned long data)
> > > > > int i;
> > > > >
> > > > > for (i = 0; i < IEEE80211_NUM_ACS; i++) {
> > > > > + u32 n_queued = 0, n_sw_queued = 0;
> > > > > +
> > > > > sq = &dev->q_tx[i];
> > > > > q = sq->q;
> > > > >
> > > > > - spin_lock_bh(&q->lock);
> > > > > - while (true) {
> > > > > + while (q->queued > n_queued) {
> > > > > buf = &q->entry[q->head].ubuf;
> > > > > - if (!buf->done || !q->queued)
> > > > > + if (!buf->done)
> > > > > break;
> > > >
> > > > I'm still thinking if this is safe or not. Is somewhat tricky to
> > > > read variable outside the lock because in such case there is no time
> > > > guarantee when variable written on one CPU gets updated value on
> > > > different CPU. And for USB is not only q->queued but also buf->done.
> > >
> > > Hi Stanislaw,
> > >
> > > I was wondering if this is safe as well, but q->queued is updated holding q->lock
> > > and I guess it will ensure to not overlap tx and status code path.
> >
> > Overlap will not happen, at worst what can happen is q->queued will be
> > smaller on tx_tasklet than on tx_queue_skb.
>
> Yes, that is the point :)
>
> >
> > > Regarding buf->done, it is already updated without holding the lock in mt76u_complete_tx
> >
> > That's actually a bug, but it's not important, if tx_tasklet will not
> > see updated buf->done <- true value by mt76u_complete_tx on different
> > cpu, it will not complete skb. It will be done on next tx_tasklet iteration.
> > Worse thing would be opposite situation.
>
> Can this really occur?
I was thinking about that and yes, it can occur if the q->queued and
buf->done writes/reads are reordered by the CPUs. To prevent that you
would need an smp_wmb/smp_rmb pair, but it's simpler and more
convenient to use the lock.

> (since queued is update holding the lock)
Holding the lock in one thread without holding it in the concurrent thread
is irrelevant; it's the same as not holding any lock at all.

Stanislaw
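
The ordering problem described above can be illustrated with a small userspace model. It is a sketch, not driver code: C11 release/acquire operations are used here as a rough analogue of an smp_wmb/smp_rmb pair, and `struct demo_ring`, `demo_enqueue`, `demo_complete` and `demo_reap_one` are hypothetical names. The model assumes, as the thread implies, that the enqueue path clears the entry's flag before it makes the larger count visible; the reaper then reads the count before the flag, so it cannot combine a new count with a stale flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NDESC 4

/* Hypothetical model of the ring; names are not the driver's. */
struct demo_ring {
	atomic_bool done[DEMO_NDESC];
	atomic_uint queued;
	uint32_t head;	/* reaper only */
	uint32_t tail;	/* enqueue path only */
};

/* Enqueue path: clear the stale flag, then publish the new count. */
static void demo_enqueue(struct demo_ring *r)
{
	atomic_store_explicit(&r->done[r->tail], false, memory_order_relaxed);
	r->tail = (r->tail + 1) % DEMO_NDESC;
	/* Release: the cleared flag becomes visible before the new count. */
	atomic_fetch_add_explicit(&r->queued, 1, memory_order_release);
}

/* Completion path: mark one slot as finished. */
static void demo_complete(struct demo_ring *r, uint32_t idx)
{
	assert(idx < DEMO_NDESC);
	atomic_store_explicit(&r->done[idx], true, memory_order_release);
}

/* Reaper: read the count first, then the flag of the head entry. */
static bool demo_reap_one(struct demo_ring *r)
{
	/* Acquire: pairs with the release in demo_enqueue(). */
	if (atomic_load_explicit(&r->queued, memory_order_acquire) == 0)
		return false;
	if (!atomic_load_explicit(&r->done[r->head], memory_order_acquire))
		return false;
	r->head = (r->head + 1) % DEMO_NDESC;
	atomic_fetch_sub_explicit(&r->queued, 1, memory_order_relaxed);
	return true;
}
```

The pairing matters more than the individual calls: a barrier on the writer side alone, or on the reader side alone, would leave the other side free to reorder its accesses. The sketch models a single enqueuer and a single reaper; several concurrent enqueuers would still need to serialize among themselves.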

2019-03-21 09:02:56

by Lorenzo Bianconi

Subject: Re: [RFC] mt76: usb: reduce locking in mt76u_tx_tasklet

> On Tue, Mar 19, 2019 at 05:23:25PM +0100, Lorenzo Bianconi wrote:
> > > On Tue, Mar 19, 2019 at 01:58:13PM +0100, Lorenzo Bianconi wrote:
> > > > > On Mon, Mar 18, 2019 at 12:09:32PM +0100, Lorenzo Bianconi wrote:
> > > > > > Similar to pci counterpart, reduce locking in mt76u_tx_tasklet since
> > > > > > q->head is managed just in mt76u_tx_tasklet and q->queued is updated
> > > > > > holding q->lock
> > > > > >
> > > > > > Signed-off-by: Lorenzo Bianconi <[email protected]>
> > > > > > ---
> > > > > > drivers/net/wireless/mediatek/mt76/usb.c | 18 +++++++++++-------
> > > > > > 1 file changed, 11 insertions(+), 7 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > > index ac03acdae279..8cd70c32d77a 100644
> > > > > > --- a/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > > +++ b/drivers/net/wireless/mediatek/mt76/usb.c
> > > > > > @@ -634,29 +634,33 @@ static void mt76u_tx_tasklet(unsigned long data)
> > > > > > int i;
> > > > > >
> > > > > > for (i = 0; i < IEEE80211_NUM_ACS; i++) {
> > > > > > + u32 n_queued = 0, n_sw_queued = 0;
> > > > > > +
> > > > > > sq = &dev->q_tx[i];
> > > > > > q = sq->q;
> > > > > >
> > > > > > - spin_lock_bh(&q->lock);
> > > > > > - while (true) {
> > > > > > + while (q->queued > n_queued) {
> > > > > > buf = &q->entry[q->head].ubuf;
> > > > > > - if (!buf->done || !q->queued)
> > > > > > + if (!buf->done)
> > > > > > break;
> > > > >
> > > > > I'm still thinking if this is safe or not. Is somewhat tricky to
> > > > > read variable outside the lock because in such case there is no time
> > > > > guarantee when variable written on one CPU gets updated value on
> > > > > different CPU. And for USB is not only q->queued but also buf->done.
> > > >
> > > > Hi Stanislaw,
> > > >
> > > > I was wondering if this is safe as well, but q->queued is updated holding q->lock
> > > > and I guess it will ensure to not overlap tx and status code path.
> > >
> > > Overlap will not happen, at worst what can happen is q->queued will be
> > > smaller on tx_tasklet than on tx_queue_skb.
> >
> > Yes, that is the point :)
> >
> > >
> > > > Regarding buf->done, it is already updated without holding the lock in mt76u_complete_tx
> > >
> > > That's actually a bug, but it's not important, if tx_tasklet will not
> > > see updated buf->done <- true value by mt76u_complete_tx on different
> > > cpu, it will not complete skb. It will be done on next tx_tasklet iteration.
> > > Worse thing would be opposite situation.
> >
> > Can this really occur?
> I was thinking about that and yes it can occur. If q->queued and
> buf->done writes/read will be reordered by CPUs. To prevent that you
> will need to use smp_wmb/smp_rmb pair, but it's just simpler and more
> convenient to use lock.

Good point, I will look into it.

Regards,
Lorenzo

>
> > (since queued is update holding the lock)
> Holding the lock on one thread without holding it on concurrent thread
> is irrelevant, it's the same as not holding any lock at all.
>
> Stanislaw



2019-03-21 09:11:01

by Felix Fietkau

Subject: Re: [RFC] mt76: usb: reduce locking in mt76u_tx_tasklet

On 2019-03-21 10:02, Lorenzo Bianconi wrote:
>> On Tue, Mar 19, 2019 at 05:23:25PM +0100, Lorenzo Bianconi wrote:
>> > > On Tue, Mar 19, 2019 at 01:58:13PM +0100, Lorenzo Bianconi wrote:
>> > > > > On Mon, Mar 18, 2019 at 12:09:32PM +0100, Lorenzo Bianconi wrote:
>> > > > > > Similar to pci counterpart, reduce locking in mt76u_tx_tasklet since
>> > > > > > q->head is managed just in mt76u_tx_tasklet and q->queued is updated
>> > > > > > holding q->lock
>> > > > > >
>> > > > > > Signed-off-by: Lorenzo Bianconi <[email protected]>
>> > > > > > ---
>> > > > > > drivers/net/wireless/mediatek/mt76/usb.c | 18 +++++++++++-------
>> > > > > > 1 file changed, 11 insertions(+), 7 deletions(-)
>> > > > > >
>> > > > > > diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
>> > > > > > index ac03acdae279..8cd70c32d77a 100644
>> > > > > > --- a/drivers/net/wireless/mediatek/mt76/usb.c
>> > > > > > +++ b/drivers/net/wireless/mediatek/mt76/usb.c
>> > > > > > @@ -634,29 +634,33 @@ static void mt76u_tx_tasklet(unsigned long data)
>> > > > > > int i;
>> > > > > >
>> > > > > > for (i = 0; i < IEEE80211_NUM_ACS; i++) {
>> > > > > > + u32 n_queued = 0, n_sw_queued = 0;
>> > > > > > +
>> > > > > > sq = &dev->q_tx[i];
>> > > > > > q = sq->q;
>> > > > > >
>> > > > > > - spin_lock_bh(&q->lock);
>> > > > > > - while (true) {
>> > > > > > + while (q->queued > n_queued) {
>> > > > > > buf = &q->entry[q->head].ubuf;
>> > > > > > - if (!buf->done || !q->queued)
>> > > > > > + if (!buf->done)
>> > > > > > break;
>> > > > >
>> > > > > I'm still thinking if this is safe or not. Is somewhat tricky to
>> > > > > read variable outside the lock because in such case there is no time
>> > > > > guarantee when variable written on one CPU gets updated value on
>> > > > > different CPU. And for USB is not only q->queued but also buf->done.
>> > > >
>> > > > Hi Stanislaw,
>> > > >
>> > > > I was wondering if this is safe as well, but q->queued is updated holding q->lock
>> > > > and I guess it will ensure to not overlap tx and status code path.
>> > >
>> > > Overlap will not happen, at worst what can happen is q->queued will be
>> > > smaller on tx_tasklet than on tx_queue_skb.
>> >
>> > Yes, that is the point :)
>> >
>> > >
>> > > > Regarding buf->done, it is already updated without holding the lock in mt76u_complete_tx
>> > >
>> > > That's actually a bug, but it's not important, if tx_tasklet will not
>> > > see updated buf->done <- true value by mt76u_complete_tx on different
>> > > cpu, it will not complete skb. It will be done on next tx_tasklet iteration.
>> > > Worse thing would be opposite situation.
>> >
>> > Can this really occur?
>> I was thinking about that and yes it can occur. If q->queued and
>> buf->done writes/read will be reordered by CPUs. To prevent that you
>> will need to use smp_wmb/smp_rmb pair, but it's just simpler and more
>> convenient to use lock.
>
> good point, I will go through it.
Another simple solution would be to set buf->done = false in
mt76u_tx_tasklet after tx_complete_skb instead of doing it at enqueue time.

- Felix
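
One way to read this suggestion, sketched as a userspace model: if the reaper is the only code that resets the flag, and it does so right after finishing the entry, then the only `true` it can ever observe is the one written by the completion path for the current use of that slot. All names below (`struct demo_ring`, `demo_enqueue`, `demo_complete`, `demo_reap_one`) are hypothetical, C11 atomics stand in for the kernel primitives, and ring-full accounting is left out; the sketch assumes the queue's own accounting orders slot reuse after the reaper's clear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NDESC 2

/* Hypothetical model of the ring; names are not the driver's. */
struct demo_ring {
	atomic_bool done[DEMO_NDESC];
	uint32_t head;	/* reaper only */
	uint32_t tail;	/* enqueue path only */
};

/* Enqueue path: hands out a slot and no longer touches the flag. */
static uint32_t demo_enqueue(struct demo_ring *r)
{
	uint32_t idx = r->tail;

	r->tail = (r->tail + 1) % DEMO_NDESC;
	return idx;
}

/* Completion path: mark one slot as finished. */
static void demo_complete(struct demo_ring *r, uint32_t idx)
{
	assert(idx < DEMO_NDESC);
	atomic_store_explicit(&r->done[idx], true, memory_order_release);
}

/* Reaper: finish the head entry, then reset its flag itself. */
static bool demo_reap_one(struct demo_ring *r, uint32_t *completed)
{
	uint32_t idx = r->head;

	if (!atomic_load_explicit(&r->done[idx], memory_order_acquire))
		return false;
	*completed = idx;	/* stand-in for tx_complete_skb() */
	/* Assumption: reuse of this slot is ordered after this clear. */
	atomic_store_explicit(&r->done[idx], false, memory_order_relaxed);
	r->head = (idx + 1) % DEMO_NDESC;
	return true;
}
```

In this model a reused slot never looks complete until the completion path has run again for it, because the previous `true` was already consumed and cleared by the reaper.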