2017-10-10 20:54:21

by Ben Greear

[permalink] [raw]
Subject: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

At one point, you could set a single rate using 'iw' and
ath10k would convert that to a special firmware API that
fixed all data traffic to a particular rate set. (Management
frames and broadcast will not be affected by setting the rates
when using ath10k).

But, with the commit below, a command like this will fail:

#iw dev vap206 set bitrates legacy-5 ht-mcs-5 0 vht-mcs-5
command failed: Invalid argument (-22)

But, it actually *does* successfully set the rate in the driver first, which
is confusing at best.

So, I think we should relax this check, at least for ath10k.

commit e8e4f5280ddd0a7b43a795f90a0758e3c99df6a6
Author: Johannes Berg <[email protected]>
Date: Wed Mar 8 11:12:10 2017 +0100

mac80211: reject/clear user rate mask if not usable

If the user rate mask results in no (basic) rates being usable,
clear it. Also, if we're already operating when it's set, reject
it instead.

Technically, selecting basic rates as the criterion is a bit too
restrictive, but calculating the usable rates over all stations
(e.g. in AP mode) is harder, and all stations must support the
basic rates. Similarly, in client mode, the basic rates will be
used anyway for control frames.

This fixes the "no supported rates (...) in rate_mask ..." warning
that occurs on TX when you've selected a rate mask that's not
compatible with the connection (e.g. an AP that enables only the
rates 36, 48, 54 and you've selected only 6, 9, 12.)

Reported-by: Kirtika Ruchandani <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2017-10-11 14:51:46

by Ben Greear

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"



On 10/11/2017 01:02 AM, Johannes Berg wrote:
> Hi,
>
>> #iw dev vap206 set bitrates legacy-5 ht-mcs-5 0 vht-mcs-5
>> command failed: Invalid argument (-22)
>>
>> But, it actually *does* successfully set the rate in the driver
>> first, which is confusing at best.
>
> Huh?

The call to set the rate in the driver comes before the error
check.

if (ieee80211_hw_check(&local->hw, HAS_RATE_CONTROL)) {
ret = drv_set_bitrate_mask(local, sdata, mask);
if (ret) {
pr_err("%s: drv-set-bitrate-mask had error return: %d\n",
sdata->dev->name, ret);
return ret;
}
}

/*
* If active validate the setting and reject it if it doesn't leave
* at least one basic rate usable, since we really have to be able
* to send something, and if we're an AP we have to be able to do
* so at a basic rate so that all clients can receive it.
*/
if (rcu_access_pointer(sdata->vif.chanctx_conf) &&
sdata->vif.bss_conf.chandef.chan) {
u32 basic_rates = sdata->vif.bss_conf.basic_rates;
enum nl80211_band band = sdata->vif.bss_conf.chandef.chan->band;

if (!(mask->control[band].legacy & basic_rates)) {
#### I changed this code so I could set a single rate... --Ben
pr_err("%s: WARNING: no legacy rates for band[%d] in set-bitrate-mask.\n",
sdata->dev->name, band);
}
}

>
>> So, I think we should relax this check, at least for ath10k.
>
> Well, yes and no. I don't think we should make ath10k special here, and
> this fixes a real problem - namely that you can set up the system so
> that you have no usable rates at all, and then you just get a WARN_ON
> and start using the lowest possible rate...

Well, there are a million ways to set up a broken system, and setting a single
rate has a useful purpose, especially with ath10k since it has such limited
rate-setting capabilities.

There is basically never going to be a case where setting a single tx-rate on an
AP is a good idea in a general production environment, so maybe a possible WARN-ON is fine?

If you *do* set up an AP with a limited rateset, then it should simply fail to
allow a station to connect if it does not have any matching rates. This might go
back to a previous idea I had (and patches I posted and carry in my tree) to allow setting a different
advertise rateset vs usable tx-rateset.

For existing stations that might not match the new fixed rate, we could purposefully kick
them off I guess, but seems like a lot of work for a case that seems pretty irrelevant.

>
>> commit e8e4f5280ddd0a7b43a795f90a0758e3c99df6a6
>> Author: Johannes Berg <[email protected]>
>> Date: Wed Mar 8 11:12:10 2017 +0100
>>
>> mac80211: reject/clear user rate mask if not usable
>>
>> If the user rate mask results in no (basic) rates being usable,
>> clear it. Also, if we're already operating when it's set, reject
>> it instead.
>>
>> Technically, selecting basic rates as the criterion is a bit too
>> restrictive, but calculating the usable rates over all stations
>> (e.g. in AP mode) is harder, and all stations must support the
>> basic rates. Similarly, in client mode, the basic rates will be
>> used anyway for control frames.
>
> I guess you could implement this part? I.e. iterating the clients and
> checking that they all support the rate that is set. However, then you
> also need to implement that this gets reset when a new client that
> doesn't support this rate connects.
>
> Overall, this isn't very well defined for AP mode...
>
> Perhaps it'd be better - as you pointed out in the other thread - to
> have API to force a rate per station? We already have that for iwlwifi
> in debugfs, so perhaps that'd be something to consider for this too,
> I'm not sure there would be a real need to have it in nl80211?

I looked closely at the ath10k firmware yesterday, and it has no way to set a specific
single rate per peer. Sure, I could hack something into my CT firmware, but that
still leaves all the stock driver/firmware people out of luck, and my patches
won't make it upstream in the main kernel...

Thanks,
Ben


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2017-10-18 21:30:41

by Ben Greear

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

On 10/18/2017 02:02 PM, Johannes Berg wrote:
> On Wed, 2017-10-18 at 13:51 -0700, Ben Greear wrote:
>> "
>> I guess you could implement this part? I.e. iterating the clients and
>> checking that they all support the rate that is set. However, then you
>> also need to implement that this gets reset when a new client that
>> doesn't support this rate connects.
>> "
>>
>> The first part seems OK, but the second seems like a pain. Maybe just
>> keep a new client from being able to connect at all if it doesn't support
>> any available rates?
>
> I suppose if you reject the NEW_STATION command, then hostapd will
> reject the association though, so it's probably not hard to do.
> However, I'm not really sure why you'd want that? If you do want that
> then basically you're just implemented a very roundabout way of adding
> this rate to the basic rates, so you might as well just add it and work
> with the current basic rates check?

If a user specifies a specific rate or rate-set, then I do not think we
should quietly change it out from under them. So, we could check at the
time the rate-set is applied (all current peers). We can reject the change
at that point if one of the peers does not support any common rates.
And, whenever a peer tries to be associated, we can check that there is some common rateset
in place. If there is no common rateset, then reject the association
one way or another.

> We'll need to be a little careful what happens with client-mode
> interfaces, which is where we actually observed hitting the WARN_ON()
> about not having any rates at all, but I think I already put a reset of
> the rates there anyway if they're not compatible with the AP. At least
> that's easier because there's only one client.

It would be easy to configure a station to do VHT MCS 8 only, and then
it would never be able to associate with an HT AP, for instance. I don't
think this should be a WARN_ON event, just a failure. It would be great
if we could get a useful error message back to the caller, but I am not
sure how feasible that is with the current netlink and mac80211 code.

If your main concern is hitting a WARN_ON, maybe just change it to a
quieter error message or remove it entirely and just return a failure
code?

Thanks,
Ben

>
> johannes
>


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2017-10-27 20:15:14

by Johannes Berg

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

On Wed, 2017-10-25 at 09:13 -0700, Ben Greear wrote:

> > Well it resulted in a WARN_ON because if the AP didn't have those
> > rates, we'd not find any usable rates while trying to transmit a frame,
> > and then ended up warning and falling down to the lowest possible rate.
> >
> > I don't think I agree that configuring this should reject the
> > association, and I've already implemented the behaviour of dropping the
> > user's rate set in this case in the patch we're discussing.
>
> So, as long as we are associating to a VHT AP, then we could still set the
> tx-rateset to (only) VHT MCS 8 and allow association, even if there are no additional
> rates set. There is at least one common rate, so theoretically, the
> station and AP can communicate.
>
> If we tried to associate this same station to an HT AP, then your work-around
> code could kick in and throw away the user's rateset configuration, preferably
> with a fairly noticeable warning message since you are overriding the user's
> preferred tx rateset?

There's only a debug message now, and I actually compare to the basic
rates. Like I said, we could fix that to compare to real rates, but
that gets a little trickier since we don't select from all possible
rates when we try to send management frames, only from legacy rates, so
we'd have to modify that code too.

> > This kind of problem is why I absolutely dislike out-of-band state that
> > affects the connection, rather than giving it in the connection
> > command(s) (connect, auth, associate, whatever). We're stuck with it
> > now, and needed to redefine that this selection may be dropped if not
> > usable.
>
> You could start allowing the user to configure the full advertised and
> transmit rateset for each of these actions (and probe too), and user-space can add the fields
> to their netlink commands. At least going forward, this might help
> make the behaviour more as expected. If you would like to implement at
> least the basics in cfg/mac80211, I would be happy to work on the supplicant
> end of things.

I'm not sure we can really remove this, there are users out there (e.g.
Chrome's shill, IIRC), so we'd be stuck with two ways of doing things
... that's not so much better really :)

johannes

2017-10-25 16:13:19

by Ben Greear

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"



On 10/25/2017 08:17 AM, Johannes Berg wrote:
> Hi,
>
> So I fixed the part about the rate setting calling into the driver
> before rejecting.
>
>> If a user specifies a specific rate or rate-set, then I do not think we
>> should quietly change it out from under them. So, we could check at the
>> time the rate-set is applied (all current peers). We can reject the change
>> at that point if one of the peers does not support any common rates.
>> And, whenever a peer tries to be associated, we can check that there is some common rateset
>> in place. If there is no common rateset, then reject the association
>> one way or another.
>
> It's not trivial to do in the kernel, but if we reject adding the
> station then hostapd will turn around and reject the association. This
> might not end up being done nicely, but I think it does still happen
> before the association response, so a negative one would result.
>
>>> We'll need to be a little careful what happens with client-mode
>>> interfaces, which is where we actually observed hitting the WARN_ON()
>>> about not having any rates at all, but I think I already put a reset of
>>> the rates there anyway if they're not compatible with the AP. At least
>>> that's easier because there's only one client.
>>
>> It would be easy to configure a station to do VHT MCS 8 only, and then
>> it would never be able to associate with an HT AP, for instance. I don't
>> think this should be a WARN_ON event, just a failure.
>
> Well it resulted in a WARN_ON because if the AP didn't have those
> rates, we'd not find any usable rates while trying to transmit a frame,
> and then ended up warning and falling down to the lowest possible rate.
>
> I don't think I agree that configuring this should reject the
> association, and I've already implemented the behaviour of dropping the
> user's rate set in this case in the patch we're discussing.

So, as long as we are associating to a VHT AP, then we could still set the
tx-rateset to (only) VHT MCS 8 and allow association, even if there are no additional
rates set. There is at least one common rate, so theoretically, the
station and AP can communicate.

If we tried to associate this same station to an HT AP, then your work-around
code could kick in and throw away the user's rateset configuration, preferably
with a fairly noticeable warning message since you are overriding the user's
preferred tx rateset?


>> It would be great
>> if we could get a useful error message back to the caller, but I am not
>> sure how feasible that is with the current netlink and mac80211 code.
>
> If we reject the user setting, then we can easily more useful return
> error messages now that we have generic netlink extack support. The
> more "interesting" case is when the user set this and then reconnects.
>
> This kind of problem is why I absolutely dislike out-of-band state that
> affects the connection, rather than giving it in the connection
> command(s) (connect, auth, associate, whatever). We're stuck with it
> now, and needed to redefine that this selection may be dropped if not
> usable.

You could start allowing the user to configure the full advertised and
transmit rateset for each of these actions (and probe too), and user-space can add the fields
to their netlink commands. At least going forward, this might help
make the behaviour more as expected. If you would like to implement at
least the basics in cfg/mac80211, I would be happy to work on the supplicant
end of things.

>
>> If your main concern is hitting a WARN_ON, maybe just change it to a
>> quieter error message or remove it entirely and just return a failure
>> code?
>
> No, the warning serves a useful purpose, it's not useful to fall back
> to the lowest rate, even if the user selected something completely
> unusable. Arguably we should simply reject transmitting in that case,
> but that's not really better either ...
>
> johannes


Thanks,
Ben


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2017-10-18 14:50:36

by Ben Greear

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"



On 10/18/2017 12:33 AM, Johannes Berg wrote:
> Hi,
>
>> The call to set the rate in the driver comes before the error
>> check.
>>
>> if (ieee80211_hw_check(&local->hw, HAS_RATE_CONTROL)) {
>> ret = drv_set_bitrate_mask(local, sdata, mask);
>> if (ret) {
>> pr_err("%s: drv-set-bitrate-mask had error
>> return: %d\n",
>> sdata->dev->name, ret);
>> return ret;
>> }
>> }
>>
>> /*
>> * If active validate the setting and reject it if it doesn't
>> leave
>> * at least one basic rate usable, since we really have to be
>> able
>> * to send something, and if we're an AP we have to be able to
>> do
>> * so at a basic rate so that all clients can receive it.
>> */
>> if (rcu_access_pointer(sdata->vif.chanctx_conf) &&
>> sdata->vif.bss_conf.chandef.chan) {
>> u32 basic_rates = sdata->vif.bss_conf.basic_rates;
>> enum nl80211_band band = sdata-
>>> vif.bss_conf.chandef.chan->band;
>>
>> if (!(mask->control[band].legacy & basic_rates)) {
>> #### I changed this code so I could set a
>> single rate... --Ben
>> pr_err("%s: WARNING: no legacy rates for
>> band[%d] in set-bitrate-mask.\n",
>> sdata->dev->name, band);
>> }
>> }
>
> Heh, that's just dumb. I guess I'll fix that by putting the test first,
> no idea how that happened.
>
>>>
>>>> So, I think we should relax this check, at least for ath10k.
>>>
>>> Well, yes and no. I don't think we should make ath10k special here,
>>> and
>>> this fixes a real problem - namely that you can set up the system
>>> so
>>> that you have no usable rates at all, and then you just get a
>>> WARN_ON
>>> and start using the lowest possible rate...
>>
>> Well, there are a million ways to set up a broken system,
>
> True, but this one actually happened in practice, for some reason, and
> stopping the user from constantly shooting themselves in the foot seems
> like a good idea to me. Especially if the user (or application) can't
> really even know what they're getting into.
>
> Now, the case in question was _client_ mode, but still.
>
>> and setting a single rate has a useful purpose, especially with
>> ath10k since it has such limited rate-setting capabilities.
>
> You're stretching the definition of "useful purpose" a bit here though,
> you're about the only one who's ever going to need to set a single
> rate.

People trying to do regulatory testing want this feature, and other people
that are not me also like to test with specific rates. Still a small-ish set
of people, but bigger than just me at least.

This used to work, and now is broken, so it is a regression of at least somewhat
useful feature. I think we need a way to re-enable this, even if it requires
poking some sysfs or debugfs file to allow this check to be relaxed.

And for that matter, it might be nice to allow purposefully broken ratesets
(with part of basic missing, for instance), in order to test peer devices (including
other Linux stacks).

>> There is basically never going to be a case where setting a single
>> tx-rate on an AP is a good idea in a general production environment,
>> so maybe a possible WARN-ON is fine?
>
> A WARN_ON() for a user configuration is never fine.
>
> You're assuming that there's actually a user sitting there and doing
> this, which is not necessarily the case.
>
> Even rejecting a single rate setting wouldn't be enough because you can
> get into problems even when you enable multiple rates, e.g. if you
> enable all the CCK rates while connected on 5 GHz.
>
>> If you *do* set up an AP with a limited rateset, then it should
>> simply fail to allow a station to connect if it does not have any
>> matching rates.
>
> That's what requiring at least one basic rate to remain does. If you
> want to have basic rates 6,9,12 and then configure only 18, how would
> the client get rejected? Just configure basic rates differently
> beforehand, and then you can do this easily, and the right thing with
> rejecting clients will happen automatically (in fact, clients might not
> even attempt to connect - even better!)
>
>> This might go back to a previous idea I had (and patches I posted and
>> carry in my tree) to allow setting a different advertise rateset vs
>> usable tx-rateset.
>
> You still have the same problem with which clients support them and
> require them etc.

For regulatory testing purposes: You can advertise normal basic rateset,
and you can accept those (and even transmit mgt and bcast frames on them),
but for data tx, you use a single fixed rate. That is one reason it is nice
to have the advertised rateset different from the tx rateset.


>> For existing stations that might not match the new fixed rate, we
>> could purposefully kick them off I guess, but seems like a lot of
>> work for a case that seems pretty irrelevant.
>
> For better or worse, there are people using this API programmatically
> without a user baby-sitting it, so we need to make it work in all
> cases. We can let the user shoot themselves in the foot and have only a
> single usable rate left, but we can't let them hang themselves and have
> no rate left at all.

Out of curiosity, couldn't 'iw' do the same checks? It is a lot easier to
return a useful error code/message from a user-space app anyway, in case that
can work.

Thanks,
Ben


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2017-10-18 17:56:55

by Oleksij Rempel

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

Am 18.10.2017 um 16:50 schrieb Ben Greear:
>
>
> On 10/18/2017 12:33 AM, Johannes Berg wrote:
>> Hi,
>>
>>> The call to set the rate in the driver comes before the error
>>> check.
>>>
>>>     if (ieee80211_hw_check(&local->hw, HAS_RATE_CONTROL)) {
>>>         ret = drv_set_bitrate_mask(local, sdata, mask);
>>>         if (ret) {
>>>             pr_err("%s: drv-set-bitrate-mask had error
>>> return: %d\n",
>>>                    sdata->dev->name, ret);
>>>             return ret;
>>>         }
>>>     }
>>>
>>>     /*
>>>      * If active validate the setting and reject it if it doesn't
>>> leave
>>>      * at least one basic rate usable, since we really have to be
>>> able
>>>      * to send something, and if we're an AP we have to be able to
>>> do
>>>      * so at a basic rate so that all clients can receive it.
>>>      */
>>>     if (rcu_access_pointer(sdata->vif.chanctx_conf) &&
>>>         sdata->vif.bss_conf.chandef.chan) {
>>>         u32 basic_rates = sdata->vif.bss_conf.basic_rates;
>>>         enum nl80211_band band = sdata-
>>>> vif.bss_conf.chandef.chan->band;
>>>
>>>         if (!(mask->control[band].legacy & basic_rates)) {
>>>             #### I changed this code so I could set a
>>> single rate... --Ben
>>>             pr_err("%s:  WARNING: no legacy rates for
>>> band[%d] in set-bitrate-mask.\n",
>>>                    sdata->dev->name, band);
>>>         }
>>>     }
>>
>> Heh, that's just dumb. I guess I'll fix that by putting the test first,
>> no idea how that happened.
>>
>>>>
>>>>> So, I think we should relax this check, at least for ath10k.
>>>>
>>>> Well, yes and no. I don't think we should make ath10k special here,
>>>> and
>>>> this fixes a real problem - namely that you can set up the system
>>>> so
>>>> that you have no usable rates at all, and then you just get a
>>>> WARN_ON
>>>> and start using the lowest possible rate...
>>>
>>> Well, there are a million ways to set up a broken system,
>>
>> True, but this one actually happened in practice, for some reason, and
>> stopping the user from constantly shooting themselves in the foot seems
>> like a good idea to me. Especially if the user (or application) can't
>> really even know what they're getting into.
>>
>> Now, the case in question was _client_ mode, but still.
>>
>>> and setting a single rate has a useful purpose, especially with
>>> ath10k since it has such limited rate-setting capabilities.
>>
>> You're stretching the definition of "useful purpose" a bit here though,
>> you're about the only one who's ever going to need to set a single
>> rate.
>
> People trying to do regulatory testing want this feature, and other people
> that are not me also like to test with specific rates.  Still a
> small-ish set
> of people, but bigger than just me at least.

Till now i was interviewing different people who was asking for this for
ath9k-htc. So I would say we have:
- academical researchers
- testers
- R&D
- exploit and penetration testers
- HAM
- just hackers

As for me, it sounds a s lot.


--
Regards,
Oleksij

2017-10-18 20:34:32

by Johannes Berg

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

Hi,

On Wed, 2017-10-18 at 19:56 +0200, Oleksij Rempel wrote:
>
> > People trying to do regulatory testing want this feature, and other people
> > that are not me also like to test with specific rates. Still a
> > small-ish set of people, but bigger than just me at least.
>
> Till now i was interviewing different people who was asking for this for
> ath9k-htc. So I would say we have:
> - academical researchers
> - testers
> - R&D
> - exploit and penetration testers
> - HAM
> - just hackers
>
> As for me, it sounds a s lot.

Making (literally) millions of devices in the field hit a WARN_ON() is
not really acceptable either though.

You can argue that this introduced a regression, but putting the old
behaviour back would equally be a regression, for more systems by a few
orders of magnitude.

In any case, I've already suggested a way to fix this, but you've both
completely ignored that part of my email. All I've been reading is that
you're demanding that I fix this, and arguments about how much people
are allowed to shoot themselves in the foot, none of which is very
constructive.

I might even fix it myself eventually, if only to appease the people
who say we have a zero tolerance no regressions rule, but it's not
exactly the most important thing I'm doing right now (also, I'll be
going on vacation for a few days, and you can probably implement my
suggestion in that time, and then I can review it when I get back on
Monday.)

Let's just say that I think we're discussing the wrong thing here - we
ought to be discussing how it can be fixed, and perhaps you can even be
constructive in suggesting (and testing, which I can't really do)
changes.

johannes

2017-10-11 08:02:31

by Johannes Berg

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

Hi,

> #iw dev vap206 set bitrates legacy-5 ht-mcs-5 0 vht-mcs-5
> command failed: Invalid argument (-22)
>
> But, it actually *does* successfully set the rate in the driver
> first, which is confusing at best.

Huh?

> So, I think we should relax this check, at least for ath10k.

Well, yes and no. I don't think we should make ath10k special here, and
this fixes a real problem - namely that you can set up the system so
that you have no usable rates at all, and then you just get a WARN_ON
and start using the lowest possible rate...

> commit e8e4f5280ddd0a7b43a795f90a0758e3c99df6a6
> Author: Johannes Berg <[email protected]>
> Date: Wed Mar 8 11:12:10 2017 +0100
>
> mac80211: reject/clear user rate mask if not usable
>
> If the user rate mask results in no (basic) rates being usable,
> clear it. Also, if we're already operating when it's set, reject
> it instead.
>
> Technically, selecting basic rates as the criterion is a bit too
> restrictive, but calculating the usable rates over all stations
> (e.g. in AP mode) is harder, and all stations must support the
> basic rates. Similarly, in client mode, the basic rates will be
> used anyway for control frames.

I guess you could implement this part? I.e. iterating the clients and
checking that they all support the rate that is set. However, then you
also need to implement that this gets reset when a new client that
doesn't support this rate connects.

Overall, this isn't very well defined for AP mode...

Perhaps it'd be better - as you pointed out in the other thread - to
have API to force a rate per station? We already have that for iwlwifi
in debugfs, so perhaps that'd be something to consider for this too,
I'm not sure there would be a real need to have it in nl80211?

johannes

2017-10-18 21:02:09

by Johannes Berg

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

On Wed, 2017-10-18 at 13:51 -0700, Ben Greear wrote:
> "
> I guess you could implement this part? I.e. iterating the clients and
> checking that they all support the rate that is set. However, then you
> also need to implement that this gets reset when a new client that
> doesn't support this rate connects.
> "
>
> The first part seems OK, but the second seems like a pain. Maybe just
> keep a new client from being able to connect at all if it doesn't support
> any available rates?

I suppose if you reject the NEW_STATION command, then hostapd will
reject the association though, so it's probably not hard to do.
However, I'm not really sure why you'd want that? If you do want that
then basically you're just implemented a very roundabout way of adding
this rate to the basic rates, so you might as well just add it and work
with the current basic rates check?


We'll need to be a little careful what happens with client-mode
interfaces, which is where we actually observed hitting the WARN_ON()
about not having any rates at all, but I think I already put a reset of
the rates there anyway if they're not compatible with the AP. At least
that's easier because there's only one client.

johannes

2017-10-18 20:51:18

by Ben Greear

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

On 10/18/2017 01:34 PM, Johannes Berg wrote:
> Hi,
>
> On Wed, 2017-10-18 at 19:56 +0200, Oleksij Rempel wrote:
>>
>>> People trying to do regulatory testing want this feature, and other people
>>> that are not me also like to test with specific rates. Still a
>>> small-ish set of people, but bigger than just me at least.
>>
>> Till now i was interviewing different people who was asking for this for
>> ath9k-htc. So I would say we have:
>> - academical researchers
>> - testers
>> - R&D
>> - exploit and penetration testers
>> - HAM
>> - just hackers
>>
>> As for me, it sounds a s lot.
>
> Making (literally) millions of devices in the field hit a WARN_ON() is
> not really acceptable either though.
>
> You can argue that this introduced a regression, but putting the old
> behaviour back would equally be a regression, for more systems by a few
> orders of magnitude.
>
> In any case, I've already suggested a way to fix this, but you've both
> completely ignored that part of my email. All I've been reading is that
> you're demanding that I fix this, and arguments about how much people
> are allowed to shoot themselves in the foot, none of which is very
> constructive.


You mean this part? It wasn't clear to me that you thought this was a good
solution that should be implemented.

"
I guess you could implement this part? I.e. iterating the clients and
checking that they all support the rate that is set. However, then you
also need to implement that this gets reset when a new client that
doesn't support this rate connects.
"

The first part seems OK, but the second seems like a pain. Maybe just
keep a new client from being able to connect at all if it doesn't support
any available rates?

Thanks,
Ben

> I might even fix it myself eventually, if only to appease the people
> who say we have a zero tolerance no regressions rule, but it's not
> exactly the most important thing I'm doing right now (also, I'll be
> going on vacation for a few days, and you can probably implement my
> suggestion in that time, and then I can review it when I get back on
> Monday.)
>
> Let's just say that I think we're discussing the wrong thing here - we
> ought to be discussing how it can be fixed, and perhaps you can even be
> constructive in suggesting (and testing, which I can't really do)
> changes.
>
> johannes
>


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2017-10-27 20:41:39

by Ben Greear

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

On 10/27/2017 01:15 PM, Johannes Berg wrote:
> On Wed, 2017-10-25 at 09:13 -0700, Ben Greear wrote:
>
>>> Well it resulted in a WARN_ON because if the AP didn't have those
>>> rates, we'd not find any usable rates while trying to transmit a frame,
>>> and then ended up warning and falling down to the lowest possible rate.
>>>
>>> I don't think I agree that configuring this should reject the
>>> association, and I've already implemented the behaviour of dropping the
>>> user's rate set in this case in the patch we're discussing.
>>
>> So, as long as we are associating to a VHT AP, then we could still set the
>> tx-rateset to (only) VHT MCS 8 and allow association, even if there are no additional
>> rates set. There is at least one common rate, so theoretically, the
>> station and AP can communicate.
>>
>> If we tried to associate this same station to an HT AP, then your work-around
>> code could kick in and throw away the user's rateset configuration, preferably
>> with a fairly noticeable warning message since you are overriding the user's
>> preferred tx rateset?
>
> There's only a debug message now, and I actually compare to the basic
> rates. Like I said, we could fix that to compare to real rates, but
> that gets a little trickier since we don't select from all possible
> rates when we try to send management frames, only from legacy rates, so
> we'd have to modify that code too.

ath10k ignores the tx rateset pretty much entirely when sending management
frames, so even if you set the tx rateset to have only VHT MCS 8,
management frames are still sent with legacy ratesets.

My end goal about this part is to be able to configure a single tx rate
and have that be allowed again, at least with ath10k.

Maybe a new flag for drivers like ath10k that at least somewhat ignore
the tx-rateset for management frames, and this flag would allow us to
bypass the cannot-set-single-rate check?

>>> This kind of problem is why I absolutely dislike out-of-band state that
>>> affects the connection, rather than giving it in the connection
>>> command(s) (connect, auth, associate, whatever). We're stuck with it
>>> now, and needed to redefine that this selection may be dropped if not
>>> usable.
>>
>> You could start allowing the user to configure the full advertised and
>> transmit rateset for each of these actions (and probe too), and user-space can add the fields
>> to their netlink commands. At least going forward, this might help
>> make the behaviour more as expected. If you would like to implement at
>> least the basics in cfg/mac80211, I would be happy to work on the supplicant
>> end of things.
>
> I'm not sure we can really remove this, there are users out there (e.g.
> Chrome's shill, IIRC), so we'd be stuck with two ways of doing things
> ... that's not so much better really :)

I am not saying you should remove the existing support, but since it is clunky
at best, you can add support for specifying the rates and those rates, if specified,
can take precedence over whatever was previously set (or auto-configured).

The kernel code will still be somewhat convoluted, but at least user-space
using newer API might have a more deterministic behaviour.

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2017-10-11 08:07:18

by Johannes Berg

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

On Wed, 2017-10-11 at 10:02 +0200, Johannes Berg wrote:

> Overall, this isn't very well defined for AP mode...

No, that's not really correct. It *is* well defined (set the rate for
all clients to the same fixed rate), but that definition isn't very
good because if that rate isn't a basic rate, clients can connect that
don't support this rate...

johannes

2017-10-25 15:17:53

by Johannes Berg

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

Hi,

So I fixed the part about the rate setting calling into the driver
before rejecting.

> If a user specifies a specific rate or rate-set, then I do not think we
> should quietly change it out from under them. So, we could check at the
> time the rate-set is applied (all current peers). We can reject the change
> at that point if one of the peers does not support any common rates.
> And, whenever a peer tries to be associated, we can check that there is some common rateset
> in place. If there is no common rateset, then reject the association
> one way or another.

It's not trivial to do in the kernel, but if we reject adding the
station then hostapd will turn around and reject the association. This
might not end up being done nicely, but I think it does still happen
before the association response, so a negative one would result.

> > We'll need to be a little careful what happens with client-mode
> > interfaces, which is where we actually observed hitting the WARN_ON()
> > about not having any rates at all, but I think I already put a reset of
> > the rates there anyway if they're not compatible with the AP. At least
> > that's easier because there's only one client.
>
> It would be easy to configure a station to do VHT MCS 8 only, and then
> it would never be able to associate with an HT AP, for instance. I don't
> think this should be a WARN_ON event, just a failure.

Well it resulted in a WARN_ON because if the AP didn't have those
rates, we'd not find any usable rates while trying to transmit a frame,
and then ended up warning and falling down to the lowest possible rate.

I don't think I agree that configuring this should reject the
association, and I've already implemented the behaviour of dropping the
user's rate set in this case in the patch we're discussing.

> It would be great
> if we could get a useful error message back to the caller, but I am not
> sure how feasible that is with the current netlink and mac80211 code.

If we reject the user setting, then we can easily more useful return
error messages now that we have generic netlink extack support. The
more "interesting" case is when the user set this and then reconnects.

This kind of problem is why I absolutely dislike out-of-band state that
affects the connection, rather than giving it in the connection
command(s) (connect, auth, associate, whatever). We're stuck with it
now, and needed to redefine that this selection may be dropped if not
usable.

> If your main concern is hitting a WARN_ON, maybe just change it to a
> quieter error message or remove it entirely and just return a failure
> code?

No, the warning serves a useful purpose, it's not useful to fall back
to the lowest rate, even if the user selected something completely
unusable. Arguably we should simply reject transmitting in that case,
but that's not really better either ...

johannes

2017-10-18 07:33:56

by Johannes Berg

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

Hi,

> The call to set the rate in the driver comes before the error
> check.
>
> if (ieee80211_hw_check(&local->hw, HAS_RATE_CONTROL)) {
> ret = drv_set_bitrate_mask(local, sdata, mask);
> if (ret) {
> pr_err("%s: drv-set-bitrate-mask had error
> return: %d\n",
> sdata->dev->name, ret);
> return ret;
> }
> }
>
> /*
> * If active validate the setting and reject it if it doesn't
> leave
> * at least one basic rate usable, since we really have to be
> able
> * to send something, and if we're an AP we have to be able to
> do
> * so at a basic rate so that all clients can receive it.
> */
> if (rcu_access_pointer(sdata->vif.chanctx_conf) &&
> sdata->vif.bss_conf.chandef.chan) {
> u32 basic_rates = sdata->vif.bss_conf.basic_rates;
> enum nl80211_band band = sdata-
> >vif.bss_conf.chandef.chan->band;
>
> if (!(mask->control[band].legacy & basic_rates)) {
> #### I changed this code so I could set a
> single rate... --Ben
> pr_err("%s: WARNING: no legacy rates for
> band[%d] in set-bitrate-mask.\n",
> sdata->dev->name, band);
> }
> }

Heh, that's just dumb. I guess I'll fix that by putting the test first,
no idea how that happened.

> >
> > > So, I think we should relax this check, at least for ath10k.
> >
> > Well, yes and no. I don't think we should make ath10k special here,
> > and
> > this fixes a real problem - namely that you can set up the system
> > so
> > that you have no usable rates at all, and then you just get a
> > WARN_ON
> > and start using the lowest possible rate...
>
> Well, there are a million ways to set up a broken system,

True, but this one actually happened in practice, for some reason, and
stopping the user from constantly shooting themselves in the foot seems
like a good idea to me. Especially if the user (or application) can't
really even know what they're getting into.

Now, the case in question was _client_ mode, but still.

> and setting a single rate has a useful purpose, especially with
> ath10k since it has such limited rate-setting capabilities.

You're stretching the definition of "useful purpose" a bit here though,
you're about the only one who's ever going to need to set a single
rate.

> There is basically never going to be a case where setting a single
> tx-rate on an AP is a good idea in a general production environment,
> so maybe a possible WARN-ON is fine?

A WARN_ON() for a user configuration is never fine.

You're assuming that there's actually a user sitting there and doing
this, which is not necessarily the case.

Even rejecting a single rate setting wouldn't be enough because you can
get into problems even when you enable multiple rates, e.g. if you
enable all the CCK rates while connected on 5 GHz.

> If you *do* set up an AP with a limited rateset, then it should
> simply fail to allow a station to connect if it does not have any
> matching rates.

That's what requiring at least one basic rate to remain does. If you
want to have basic rates 6,9,12 and then configure only 18, how would
the client get rejected? Just configure basic rates differently
beforehand, and then you can do this easily, and the right thing with
rejecting clients will happen automatically (in fact, clients might not
even attempt to connect - even better!)

> This might go back to a previous idea I had (and patches I posted and
> carry in my tree) to allow setting a different advertise rateset vs
> usable tx-rateset.

You still have the same problem with which clients support them and
require them etc.

> For existing stations that might not match the new fixed rate, we
> could purposefully kick them off I guess, but seems like a lot of
> work for a case that seems pretty irrelevant.

For better or worse, there are people using this API programmatically
without a user baby-sitting it, so we need to make it work in all
cases. We can let the user shoot themselves in the foot and have only a
single usable rate left, but we can't let them hang themselves and have
no rate left at all.

johannes

2017-11-13 17:05:31

by Ben Greear

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

On 11/13/2017 02:09 AM, Johannes Berg wrote:
> On Fri, 2017-10-27 at 13:41 -0700, Ben Greear wrote:
>
>> ath10k ignores the tx rateset pretty much entirely when sending management
>> frames, so even if you set the tx rateset to have only VHT MCS 8,
>> management frames are still sent with legacy ratesets.
>
> So that's a driver bug.

The firmware gives the ability to set a single fixed rate for
multicast, and another for management frames. It is possibly to
set the tx-data frame rate to another fixed rate, or to a more
normal rateset. But, you do not have full control over setting
tx-data rates (no way to tell stock firmware to use 6Mbps /g rate and
mcs 8 (only), for instance). The multicast and mgt frame API is not hooked up in the
stock driver as far as I know.

But even if they were, I don't see a good way to make this fit
with the mac80211 txrate setting framework.

What is the suggested approach to propagate a rateset set with
'iw' to a firmware with these limitations?

For the Intel firmware NICs, how do they set management and bcast
tx rates?

>> My end goal about this part is to be able to configure a single tx rate
>> and have that be allowed again, at least with ath10k.
>>
>> Maybe a new flag for drivers like ath10k that at least somewhat ignore
>> the tx-rateset for management frames, and this flag would allow us to
>> bypass the cannot-set-single-rate check?
>
> What? No, I'm not going to put a driver bug into the API like that!

Thanks for the constructive feedback.

--Ben

>
> johannes
>


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2017-11-13 10:09:49

by Johannes Berg

[permalink] [raw]
Subject: Re: Setting single rate in ath10k broken by "reject/clear user rate mask if not usable"

On Fri, 2017-10-27 at 13:41 -0700, Ben Greear wrote:

> ath10k ignores the tx rateset pretty much entirely when sending management
> frames, so even if you set the tx rateset to have only VHT MCS 8,
> management frames are still sent with legacy ratesets.

So that's a driver bug.

> My end goal about this part is to be able to configure a single tx rate
> and have that be allowed again, at least with ath10k.
>
> Maybe a new flag for drivers like ath10k that at least somewhat ignore
> the tx-rateset for management frames, and this flag would allow us to
> bypass the cannot-set-single-rate check?

What? No, I'm not going to put a driver bug into the API like that!

johannes