2015-01-22 21:30:55

by Grumbach, Emmanuel

[permalink] [raw]
Subject: [PATCH] mac80211: synchronize_net() before flushing the queues

When mac80211 disconnects, it drops all the packets on the
queues. This happens after the net stack has been notified
that we have no link anymore (netif_carrier_off).
netif_carrier_off ensures that no new packets are sent to
xmit() callback, but we might have older packets in the
middle of the Tx path. These packets will land in the
driver's queues after the latter have been flushed.
Synchronize_net() between netif_carrier_off and drv_flush()
will fix this.

Note that we can't call synchronize_net inside
ieee80211_flush_queues since there are flows that call
ieee80211_flush_queues and don't need synchronize_net()
which is an expensive operation.

Signed-off-by: Emmanuel Grumbach <[email protected]>
---
net/mac80211/mlme.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 1d6bf01..1eafa63 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2016,6 +2016,9 @@ static void ieee80211_set_disassoc(struct ieee80211_sub_if_data *sdata,
/* disable per-vif ps */
ieee80211_recalc_ps_vif(sdata);

+ /* flush out all packets */
+ synchronize_net();
+
/*
* drop any frame before deauth/disassoc, this can be data or
* management frame. Since we are disconnecting, we should not
--
1.9.1



2015-01-23 13:20:16

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: synchronize_net() before flushing the queues

On Fri, 2015-01-23 at 13:36 +0200, Emmanuel Grumbach wrote:

> > It's not actually just "one packet for each vif/ac" - it's a whole tail
> > of whatever other RCU usages there are. Back when this was discussed,
> > the wizery people measured this to hundreds of ms in many not too
> > uncommon cases.
>
> That's the part I didn't know. I wasn't aware that this
> synchronize_net() was there and that someone removed it.

This particular one might not have been there - I don't remember. There
were some in this path and we removed quite a few. However, if you have
keys, wpa_s is currently clearing them, which calls it for every single
key. I once had worked on an optimisation here (telling wpa_s not to
clear keys because we handle that when stations are deleted anyway) but
so far that didn't really go anywhere.

> I just wonder how we handle the case where we still have packets in
> the Tx path and we bring everything down. I can check the code, but
> not right now.

Well - I didn't expect you to check anything now :-)

I was more noting this down for 'future work'. It's possible, perhaps,
that we cannot do anything else here, but then we should think about the
other optimisations again...

johannes


2015-01-23 11:36:08

by Emmanuel Grumbach

[permalink] [raw]
Subject: Re: [PATCH] mac80211: synchronize_net() before flushing the queues

On Fri, Jan 23, 2015 at 12:33 PM, Johannes Berg
<[email protected]> wrote:
> On Fri, 2015-01-23 at 10:18 +0000, Grumbach, Emmanuel wrote:
>
>> I don't think it will introduce that much of latency.
>> Note that there is a call to flush() right after that, this means that
>> any driver which implements this call needs to wait there. So you have
>> the latency in either place. The only additional latency it adds is for
>> other RCU sections / packets on other interfaces.
>
> This is correct.
>
>> Also - since we just stopped the netif, there can possibly be only one
>> packet for each vif / ac processing. This is not too much data to
>> process.
>
> But this is the wrong conclusion.
>
> synchronize_rcu() is expensive, no matter what, especially if you have
> multiple cores. The RCU grace periods will practically never line up,
> and it needs to wait for every CPU to go through one.
>

I know.

> It's not actually just "one packet for each vif/ac" - it's a whole tail
> of whatever other RCU usages there are. Back when this was discussed,
> the wizery people measured this to hundreds of ms in many not too
> uncommon cases.
>

That's the part I didn't know. I wasn't aware that this
synchronize_net() was there and that someone removed it.

>> Your call, but to honest, I have been optimizing the roaming as well (as
>> you know). And it was really bad for drivers that implements the flush()
>> callback. Especially in cases where you are already far from the AP you
>> want to roam from. This is why I added the drop parameter and other
>> optimizations (don't flush() after probing etc...)
>
> Well, yes, but that's an upper bound - here with the synchronize_net()
> we're more talking about a lower bound.
>

I also have to admit I was wrong earlier: the flush() call in
set_disassoc has drop set to true, so this call will return
immediately.

I just wonder how we handle the case where we still have packets in
the Tx path and we bring everything down. I can check the code, but
not right now.

2015-01-23 10:33:52

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: synchronize_net() before flushing the queues

On Fri, 2015-01-23 at 10:18 +0000, Grumbach, Emmanuel wrote:

> I don't think it will introduce that much of latency.
> Note that there is a call to flush() right after that, this means that
> any driver which implements this call needs to wait there. So you have
> the latency in either place. The only additional latency it adds is for
> other RCU sections / packets on other interfaces.

This is correct.

> Also - since we just stopped the netif, there can possibly be only one
> packet for each vif / ac processing. This is not too much data to
> process.

But this is the wrong conclusion.

synchronize_rcu() is expensive, no matter what, especially if you have
multiple cores. The RCU grace periods will practically never line up,
and it needs to wait for every CPU to go through one.

It's not actually just "one packet for each vif/ac" - it's a whole tail
of whatever other RCU usages there are. Back when this was discussed,
the wizery people measured this to hundreds of ms in many not too
uncommon cases.

> Your call, but to honest, I have been optimizing the roaming as well (as
> you know). And it was really bad for drivers that implements the flush()
> callback. Especially in cases where you are already far from the AP you
> want to roam from. This is why I added the drop parameter and other
> optimizations (don't flush() after probing etc...)

Well, yes, but that's an upper bound - here with the synchronize_net()
we're more talking about a lower bound.

johannes


2015-01-23 10:18:36

by Grumbach, Emmanuel

[permalink] [raw]
Subject: Re: [PATCH] mac80211: synchronize_net() before flushing the queues

T24gRnJpLCAyMDE1LTAxLTIzIGF0IDEwOjU2ICswMTAwLCBKb2hhbm5lcyBCZXJnIHdyb3RlOg0K
PiBPbiBUaHUsIDIwMTUtMDEtMjIgYXQgMjM6MzAgKzAyMDAsIEVtbWFudWVsIEdydW1iYWNoIHdy
b3RlOg0KPiA+IFdoZW4gbWFjODAyMTEgZGlzY29ubmVjdHMsIGl0IGRyb3BzIGFsbCB0aGUgcGFj
a2V0cyBvbiB0aGUNCj4gPiBxdWV1ZXMuIFRoaXMgaGFwcGVucyBhZnRlciB0aGUgbmV0IHN0YWNr
IGhhcyBiZWVuIG5vdGlmaWVkDQo+ID4gdGhhdCB3ZSBoYXZlIG5vIGxpbmsgYW55bW9yZSAobmV0
aWZfY2Fycmllcl9vZmYpLg0KPiA+IG5ldGlmX2NhcnJpZXJfb2ZmIGVuc3VyZXMgdGhhdCBubyBu
ZXcgcGFja2V0cyBhcmUgc2VudCB0bw0KPiA+IHhtaXQoKSBjYWxsYmFjaywgYnV0IHdlIG1pZ2h0
IGhhdmUgb2xkZXIgcGFja2V0cyBpbiB0aGUNCj4gPiBtaWRkbGUgb2YgdGhlIFR4IHBhdGguIFRo
ZXNlIHBhY2tldHMgd2lsbCBsYW5kIGluIHRoZQ0KPiA+IGRyaXZlcidzIHF1ZXVlcyBhZnRlciB0
aGUgbGF0dGVyIGhhdmUgYmVlbiBmbHVzaGVkLg0KPiA+IFN5bmNocm9uaXplX25ldCgpIGJldHdl
ZW4gbmV0aWZfY2Fycmllcl9vZmYgYW5kIGRydl9mbHVzaCgpDQo+ID4gd2lsbCBmaXggdGhpcy4N
Cj4gPiANCj4gPiBOb3RlIHRoYXQgd2UgY2FuJ3QgY2FsbCBzeW5jaHJvbml6ZV9uZXQgaW5zaWRl
DQo+ID4gaWVlZTgwMjExX2ZsdXNoX3F1ZXVlcyBzaW5jZSB0aGVyZSBhcmUgZmxvd3MgdGhhdCBj
YWxsDQo+ID4gaWVlZTgwMjExX2ZsdXNoX3F1ZXVlcyBhbmQgZG9uJ3QgbmVlZCBzeW5jaHJvbml6
ZV9uZXQoKQ0KPiA+IHdoaWNoIGlzIGFuIGV4cGVuc2l2ZSBvcGVyYXRpb24uDQo+IA0KPiBJJ20g
LS0gd2l0aCBhIGhlYXZ5IGhlYXJ0IC0tIGFwcGx5aW5nIHRoaXMgZm9yIG5vdy4gVEkvd2l6ZXJ5
IHNwZW50IGENCj4gbG90IG9mIHRpbWUgb3B0aW1pc2luZyB0aGUgdGltZSBpdCB0YWtlcyB0byBy
b2FtLCBhbmQgdGhpcyBzaW5nbGUgbGluZQ0KPiB3aWxsIGludHJvZHVjZSBhIHBvdGVudGlhbCBm
b3IgaHVuZHJlZHMgb2YgbXMgb2YgbGF0ZW5jeSBpbnRvIHRoZQ0KPiByb2FtaW5nIGZsb3cuDQo+
IA0KPiBZb3UgKHdlKSByZWFsbHkgbmVlZCB0byB0aGluayBhYm91dCBob3cgdG8gYXZvaWQgdGhh
dC4NCg0KSSBkb24ndCB0aGluayBpdCB3aWxsIGludHJvZHVjZSB0aGF0IG11Y2ggb2YgbGF0ZW5j
eS4NCk5vdGUgdGhhdCB0aGVyZSBpcyBhIGNhbGwgdG8gZmx1c2goKSByaWdodCBhZnRlciB0aGF0
LCB0aGlzIG1lYW5zIHRoYXQNCmFueSBkcml2ZXIgd2hpY2ggaW1wbGVtZW50cyB0aGlzIGNhbGwg
bmVlZHMgdG8gd2FpdCB0aGVyZS4gU28geW91IGhhdmUNCnRoZSBsYXRlbmN5IGluIGVpdGhlciBw
bGFjZS4gVGhlIG9ubHkgYWRkaXRpb25hbCBsYXRlbmN5IGl0IGFkZHMgaXMgZm9yDQpvdGhlciBS
Q1Ugc2VjdGlvbnMgLyBwYWNrZXRzIG9uIG90aGVyIGludGVyZmFjZXMuDQpBbHNvIC0gc2luY2Ug
d2UganVzdCBzdG9wcGVkIHRoZSBuZXRpZiwgdGhlcmUgY2FuIHBvc3NpYmx5IGJlIG9ubHkgb25l
DQpwYWNrZXQgZm9yIGVhY2ggdmlmIC8gYWMgcHJvY2Vzc2luZy4gVGhpcyBpcyBub3QgdG9vIG11
Y2ggZGF0YSB0bw0KcHJvY2Vzcy4NCg0KWW91ciBjYWxsLCBidXQgdG8gaG9uZXN0LCBJIGhhdmUg
YmVlbiBvcHRpbWl6aW5nIHRoZSByb2FtaW5nIGFzIHdlbGwgKGFzDQp5b3Uga25vdykuIEFuZCBp
dCB3YXMgcmVhbGx5IGJhZCBmb3IgZHJpdmVycyB0aGF0IGltcGxlbWVudHMgdGhlIGZsdXNoKCkN
CmNhbGxiYWNrLiBFc3BlY2lhbGx5IGluIGNhc2VzIHdoZXJlIHlvdSBhcmUgYWxyZWFkeSBmYXIg
ZnJvbSB0aGUgQVAgeW91DQp3YW50IHRvIHJvYW0gZnJvbS4gVGhpcyBpcyB3aHkgSSBhZGRlZCB0
aGUgZHJvcCBwYXJhbWV0ZXIgYW5kIG90aGVyDQpvcHRpbWl6YXRpb25zIChkb24ndCBmbHVzaCgp
IGFmdGVyIHByb2JpbmcgZXRjLi4uKQ0KDQpBZ2FpbiAtIHlvdXIgY2FsbC4gDQoNCg==

2015-01-23 09:56:44

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] mac80211: synchronize_net() before flushing the queues

On Thu, 2015-01-22 at 23:30 +0200, Emmanuel Grumbach wrote:
> When mac80211 disconnects, it drops all the packets on the
> queues. This happens after the net stack has been notified
> that we have no link anymore (netif_carrier_off).
> netif_carrier_off ensures that no new packets are sent to
> xmit() callback, but we might have older packets in the
> middle of the Tx path. These packets will land in the
> driver's queues after the latter have been flushed.
> Synchronize_net() between netif_carrier_off and drv_flush()
> will fix this.
>
> Note that we can't call synchronize_net inside
> ieee80211_flush_queues since there are flows that call
> ieee80211_flush_queues and don't need synchronize_net()
> which is an expensive operation.

I'm -- with a heavy heart -- applying this for now. TI/wizery spent a
lot of time optimising the time it takes to roam, and this single line
will introduce a potential for hundreds of ms of latency into the
roaming flow.

You (we) really need to think about how to avoid that.

johannes