2021-06-18 12:41:37

by petrosilius

[permalink] [raw]
Subject: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth

Hello Lorenzo Bianconi,

we are running a set of R11e-2HPnD devices and having trouble getting
dynack working properly.
Setup:
* linux-5.4.123
* OpenWRT (current development branch) with wireless backports-5.10.34-1
* distance 2m between ap and sta
* Low ambient noise wifi environment
We experienced some non working dynack or low performance in the
connection due to too high calculated ackto's.

Here is a ath9k debug output example for a non working dynack @ 10Mhz BW:

Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500427] ath: phy0:
{48:8f:5a:3c:bb:03} tx sample 44905341 [dur 8720][h 29-t 30]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500437] ath: phy0:
ack_ts 44844835 st_ts 44905341 st_dur 8720 [17-29]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500445] ath: phy0:
ack_ts 44923425 st_ts 44905341 st_dur 8720 [18-29]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554642] ath: phy0: rx
sample 44977693 [h 18-t 20]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554701] ath: phy0:
{48:8f:5a:3c:bb:03} tx sample 44964984 [dur 6032][h 30-t 31]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554710] ath: phy0:
ack_ts 44923425 st_ts 44964984 st_dur 6032 [18-30]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554718] ath: phy0:
ack_ts 44977693 st_ts 44964984 st_dur 6032 [19-30]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577890] ath: phy0: rx
sample 45000939 [h 19-t 21]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577946] ath: phy0:
{48:8f:5a:3c:bb:03} tx sample 44998471 [dur 912][h 31-t 32]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577956] ath: phy0:
ack_ts 44977693 st_ts 44998471 st_dur 912 [19-31]
Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577964] ath: phy0:
ack_ts 45000939 st_ts 44998471 st_dur 912 [20-31]

THe above output is generated in dynack.c by

        ath_dbg(ath9k_hw_common(ah), DYNACK,
            "ack_ts %u st_ts %u st_dur %u [%u-%u]\n",
            ack_ts, st_ts->tstamp, st_ts->dur,
            da->ack_rbf.h_rb, da->st_rbf.h_rb);

The ackto is afterwards calculated by

        if (ack_ts > st_ts->tstamp + st_ts->dur) {
            ackto = ack_ts - st_ts->tstamp - st_ts->dur;

Filling in the values of the first sample:

(ack_ts > st_ts->tstamp + st_ts->dur) ?
(44844835 > 44905341+8720) ?
(44844835 > 44914061) ? ... not given

Therefore a new ackto is not calculated and i can also see that in the
ack_to file:

600 A
600 A
600 A
...

These look like the default values to me (and do not change), but
ath_dynack_get_max_to() should return 750 A for our 10MHz BW case - this
looks also suspecious to me.

For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a bit too
high to me) but the performance is way below expectation (<1MBit)
For 20 MHz bandwidth there is a ackto calculated (51 A) and the
performance is fitting the expectation.
If you want to take a look at the logs for each of these cases for ap
and sta, you can download them here:
https://cloud.hs-augsburg.de/s/eworxkJoL6JXYzZ

Did anyone else experience such a behaviour on non 20MHz Channels or
does anyone have an idea on where this behaviour might originate from?
I am not experienced in the ath9k driver code, but a uneducated guess
might be that the ring buffer where the dynack algorithm is taking its
frame-samples from is not behaving as expected for the 5&10MHz case.

regards,
julian dorner


2021-06-22 09:55:27

by Koen Vandeputte

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth


On 18.06.21 13:13, Petrosilius wrote:
> Hello Lorenzo Bianconi,
>
> we are running a set of R11e-2HPnD devices and having trouble getting
> dynack working properly.
> Setup:
> * linux-5.4.123
> * OpenWRT (current development branch) with wireless backports-5.10.34-1
> * distance 2m between ap and sta
> * Low ambient noise wifi environment
> We experienced some non working dynack or low performance in the
> connection due to too high calculated ackto's.
>
> Here is a ath9k debug output example for a non working dynack @ 10Mhz BW:
>
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500427] ath: phy0:
> {48:8f:5a:3c:bb:03} tx sample 44905341 [dur 8720][h 29-t 30]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500437] ath: phy0:
> ack_ts 44844835 st_ts 44905341 st_dur 8720 [17-29]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500445] ath: phy0:
> ack_ts 44923425 st_ts 44905341 st_dur 8720 [18-29]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554642] ath: phy0: rx
> sample 44977693 [h 18-t 20]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554701] ath: phy0:
> {48:8f:5a:3c:bb:03} tx sample 44964984 [dur 6032][h 30-t 31]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554710] ath: phy0:
> ack_ts 44923425 st_ts 44964984 st_dur 6032 [18-30]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554718] ath: phy0:
> ack_ts 44977693 st_ts 44964984 st_dur 6032 [19-30]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577890] ath: phy0: rx
> sample 45000939 [h 19-t 21]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577946] ath: phy0:
> {48:8f:5a:3c:bb:03} tx sample 44998471 [dur 912][h 31-t 32]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577956] ath: phy0:
> ack_ts 44977693 st_ts 44998471 st_dur 912 [19-31]
> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577964] ath: phy0:
> ack_ts 45000939 st_ts 44998471 st_dur 912 [20-31]
>
> THe above output is generated in dynack.c by
>
>         ath_dbg(ath9k_hw_common(ah), DYNACK,
>             "ack_ts %u st_ts %u st_dur %u [%u-%u]\n",
>             ack_ts, st_ts->tstamp, st_ts->dur,
>             da->ack_rbf.h_rb, da->st_rbf.h_rb);
>
> The ackto is afterwards calculated by
>
>         if (ack_ts > st_ts->tstamp + st_ts->dur) {
>             ackto = ack_ts - st_ts->tstamp - st_ts->dur;
>
> Filling in the values of the first sample:
>
> (ack_ts > st_ts->tstamp + st_ts->dur) ?
> (44844835 > 44905341+8720) ?
> (44844835 > 44914061) ? ... not given
>
> Therefore a new ackto is not calculated and i can also see that in the
> ack_to file:
>
> 600 A
> 600 A
> 600 A
> ...
>
> These look like the default values to me (and do not change), but
> ath_dynack_get_max_to() should return 750 A for our 10MHz BW case - this
> looks also suspecious to me.
>
> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a bit too
> high to me) but the performance is way below expectation (<1MBit)
> For 20 MHz bandwidth there is a ackto calculated (51 A) and the
> performance is fitting the expectation.
> If you want to take a look at the logs for each of these cases for ap
> and sta, you can download them here:
> https://cloud.hs-augsburg.de/s/eworxkJoL6JXYzZ
>
> Did anyone else experience such a behaviour on non 20MHz Channels or
> does anyone have an idea on where this behaviour might originate from?
> I am not experienced in the ath9k driver code, but a uneducated guess
> might be that the ring buffer where the dynack algorithm is taking its
> frame-samples from is not behaving as expected for the 5&10MHz case.
>
> regards,
> julian dorner

Are you stressing the link?
I'll try to simulate this later on

Regards,

Koen

2021-06-22 10:12:38

by petrosilius

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth

On 22.06.21 11:54, Koen Vandeputte wrote:
>
> On 18.06.21 13:13, Petrosilius wrote:
>> Hello Lorenzo Bianconi,
>>
>> we are running a set of R11e-2HPnD devices and having trouble getting
>> dynack working properly.
>> Setup:
>> * linux-5.4.123
>> * OpenWRT (current development branch) with wireless backports-5.10.34-1
>> * distance 2m between ap and sta
>> * Low ambient noise wifi environment
>> We experienced some non working dynack or low performance in the
>> connection due to too high calculated ackto's.
>>
>> Here is a ath9k debug output example for a non working dynack @ 10Mhz
>> BW:
>>
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500427] ath: phy0:
>> {48:8f:5a:3c:bb:03} tx sample 44905341 [dur 8720][h 29-t 30]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500437] ath: phy0:
>> ack_ts 44844835 st_ts 44905341 st_dur 8720 [17-29]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500445] ath: phy0:
>> ack_ts 44923425 st_ts 44905341 st_dur 8720 [18-29]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554642] ath: phy0: rx
>> sample 44977693 [h 18-t 20]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554701] ath: phy0:
>> {48:8f:5a:3c:bb:03} tx sample 44964984 [dur 6032][h 30-t 31]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554710] ath: phy0:
>> ack_ts 44923425 st_ts 44964984 st_dur 6032 [18-30]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554718] ath: phy0:
>> ack_ts 44977693 st_ts 44964984 st_dur 6032 [19-30]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577890] ath: phy0: rx
>> sample 45000939 [h 19-t 21]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577946] ath: phy0:
>> {48:8f:5a:3c:bb:03} tx sample 44998471 [dur 912][h 31-t 32]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577956] ath: phy0:
>> ack_ts 44977693 st_ts 44998471 st_dur 912 [19-31]
>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577964] ath: phy0:
>> ack_ts 45000939 st_ts 44998471 st_dur 912 [20-31]
>>
>> THe above output is generated in dynack.c by
>>
>>          ath_dbg(ath9k_hw_common(ah), DYNACK,
>>              "ack_ts %u st_ts %u st_dur %u [%u-%u]\n",
>>              ack_ts, st_ts->tstamp, st_ts->dur,
>>              da->ack_rbf.h_rb, da->st_rbf.h_rb);
>>
>> The ackto is afterwards calculated by
>>
>>          if (ack_ts > st_ts->tstamp + st_ts->dur) {
>>              ackto = ack_ts - st_ts->tstamp - st_ts->dur;
>>
>> Filling in the values of the first sample:
>>
>> (ack_ts > st_ts->tstamp + st_ts->dur) ?
>> (44844835 > 44905341+8720) ?
>> (44844835 > 44914061) ? ... not given
>>
>> Therefore a new ackto is not calculated and i can also see that in the
>> ack_to file:
>>
>> 600 A
>> 600 A
>> 600 A
>> ...
>>
>> These look like the default values to me (and do not change), but
>> ath_dynack_get_max_to() should return 750 A for our 10MHz BW case - this
>> looks also suspecious to me.
>>
>> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a bit too
>> high to me) but the performance is way below expectation (<1MBit)
>> For 20 MHz bandwidth there is a ackto calculated (51 A) and the
>> performance is fitting the expectation.
>> If you want to take a look at the logs for each of these cases for ap
>> and sta, you can download them here:
>> https://cloud.hs-augsburg.de/s/eworxkJoL6JXYzZ
>>
>> Did anyone else experience such a behaviour on non 20MHz Channels or
>> does anyone have an idea on where this behaviour might originate from?
>> I am not experienced in the ath9k driver code, but a uneducated guess
>> might be that the ring buffer where the dynack algorithm is taking its
>> frame-samples from is not behaving as expected for the 5&10MHz case.
>>
>> regards,
>> julian dorner
>
> Are you stressing the link?
> I'll try to simulate this later on
>
> Regards,
>
> Koen
>
Hi Koen,

we didnt stress the link that much.

There was only SSH from the ap to the sta running to get access to the sta.

regards,

Julian

2021-06-22 10:45:10

by Sebastian Gottschall

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth

just some cents from me. i modified the algorithm a long time ago since
the dynack way ath9k was going was not correct.
i will look if it can make a patch out of my experiences, but dont
expect it within the next 2 days.

Am 22.06.2021 um 12:12 schrieb Petrosilius:
> On 22.06.21 11:54, Koen Vandeputte wrote:
>> On 18.06.21 13:13, Petrosilius wrote:
>>> Hello Lorenzo Bianconi,
>>>
>>> we are running a set of R11e-2HPnD devices and having trouble getting
>>> dynack working properly.
>>> Setup:
>>> * linux-5.4.123
>>> * OpenWRT (current development branch) with wireless backports-5.10.34-1
>>> * distance 2m between ap and sta
>>> * Low ambient noise wifi environment
>>> We experienced some non working dynack or low performance in the
>>> connection due to too high calculated ackto's.
>>>
>>> Here is a ath9k debug output example for a non working dynack @ 10Mhz
>>> BW:
>>>
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500427] ath: phy0:
>>> {48:8f:5a:3c:bb:03} tx sample 44905341 [dur 8720][h 29-t 30]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500437] ath: phy0:
>>> ack_ts 44844835 st_ts 44905341 st_dur 8720 [17-29]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500445] ath: phy0:
>>> ack_ts 44923425 st_ts 44905341 st_dur 8720 [18-29]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554642] ath: phy0: rx
>>> sample 44977693 [h 18-t 20]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554701] ath: phy0:
>>> {48:8f:5a:3c:bb:03} tx sample 44964984 [dur 6032][h 30-t 31]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554710] ath: phy0:
>>> ack_ts 44923425 st_ts 44964984 st_dur 6032 [18-30]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554718] ath: phy0:
>>> ack_ts 44977693 st_ts 44964984 st_dur 6032 [19-30]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577890] ath: phy0: rx
>>> sample 45000939 [h 19-t 21]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577946] ath: phy0:
>>> {48:8f:5a:3c:bb:03} tx sample 44998471 [dur 912][h 31-t 32]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577956] ath: phy0:
>>> ack_ts 44977693 st_ts 44998471 st_dur 912 [19-31]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577964] ath: phy0:
>>> ack_ts 45000939 st_ts 44998471 st_dur 912 [20-31]
>>>
>>> THe above output is generated in dynack.c by
>>>
>>>          ath_dbg(ath9k_hw_common(ah), DYNACK,
>>>              "ack_ts %u st_ts %u st_dur %u [%u-%u]\n",
>>>              ack_ts, st_ts->tstamp, st_ts->dur,
>>>              da->ack_rbf.h_rb, da->st_rbf.h_rb);
>>>
>>> The ackto is afterwards calculated by
>>>
>>>          if (ack_ts > st_ts->tstamp + st_ts->dur) {
>>>              ackto = ack_ts - st_ts->tstamp - st_ts->dur;
>>>
>>> Filling in the values of the first sample:
>>>
>>> (ack_ts > st_ts->tstamp + st_ts->dur) ?
>>> (44844835 > 44905341+8720) ?
>>> (44844835 > 44914061) ? ... not given
>>>
>>> Therefore a new ackto is not calculated and i can also see that in the
>>> ack_to file:
>>>
>>> 600 A
>>> 600 A
>>> 600 A
>>> ...
>>>
>>> These look like the default values to me (and do not change), but
>>> ath_dynack_get_max_to() should return 750 A for our 10MHz BW case - this
>>> looks also suspecious to me.
>>>
>>> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a bit too
>>> high to me) but the performance is way below expectation (<1MBit)
>>> For 20 MHz bandwidth there is a ackto calculated (51 A) and the
>>> performance is fitting the expectation.
>>> If you want to take a look at the logs for each of these cases for ap
>>> and sta, you can download them here:
>>> https://cloud.hs-augsburg.de/s/eworxkJoL6JXYzZ
>>>
>>> Did anyone else experience such a behaviour on non 20MHz Channels or
>>> does anyone have an idea on where this behaviour might originate from?
>>> I am not experienced in the ath9k driver code, but a uneducated guess
>>> might be that the ring buffer where the dynack algorithm is taking its
>>> frame-samples from is not behaving as expected for the 5&10MHz case.
>>>
>>> regards,
>>> julian dorner
>> Are you stressing the link?
>> I'll try to simulate this later on
>>
>> Regards,
>>
>> Koen
>>
> Hi Koen,
>
> we didnt stress the link that much.
>
> There was only SSH from the ap to the sta running to get access to the sta.
>
> regards,
>
> Julian
>
>

2021-06-22 11:53:50

by petrosilius

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth

On 22.06.21 13:52, Koen Vandeputte wrote:
>
> On 22.06.21 12:12, Petrosilius wrote:
>> On 22.06.21 11:54, Koen Vandeputte wrote:
>>> On 18.06.21 13:13, Petrosilius wrote:
>>>> Hello Lorenzo Bianconi,
>>>>
>>>> we are running a set of R11e-2HPnD devices and having trouble getting
>>>> dynack working properly.
>>>> Setup:
>>>> * linux-5.4.123
>>>> * OpenWRT (current development branch) with wireless
>>>> backports-5.10.34-1
>>>> * distance 2m between ap and sta
>>>> * Low ambient noise wifi environment
>>>> We experienced some non working dynack or low performance in the
>>>> connection due to too high calculated ackto's.
>>>>
>>>> Here is a ath9k debug output example for a non working dynack @ 10Mhz
>>>> BW:
>>>>
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500427] ath: phy0:
>>>> {48:8f:5a:3c:bb:03} tx sample 44905341 [dur 8720][h 29-t 30]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500437] ath: phy0:
>>>> ack_ts 44844835 st_ts 44905341 st_dur 8720 [17-29]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500445] ath: phy0:
>>>> ack_ts 44923425 st_ts 44905341 st_dur 8720 [18-29]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554642] ath:
>>>> phy0: rx
>>>> sample 44977693 [h 18-t 20]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554701] ath: phy0:
>>>> {48:8f:5a:3c:bb:03} tx sample 44964984 [dur 6032][h 30-t 31]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554710] ath: phy0:
>>>> ack_ts 44923425 st_ts 44964984 st_dur 6032 [18-30]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554718] ath: phy0:
>>>> ack_ts 44977693 st_ts 44964984 st_dur 6032 [19-30]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577890] ath:
>>>> phy0: rx
>>>> sample 45000939 [h 19-t 21]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577946] ath: phy0:
>>>> {48:8f:5a:3c:bb:03} tx sample 44998471 [dur 912][h 31-t 32]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577956] ath: phy0:
>>>> ack_ts 44977693 st_ts 44998471 st_dur 912 [19-31]
>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577964] ath: phy0:
>>>> ack_ts 45000939 st_ts 44998471 st_dur 912 [20-31]
>>>>
>>>> THe above output is generated in dynack.c by
>>>>
>>>>           ath_dbg(ath9k_hw_common(ah), DYNACK,
>>>>               "ack_ts %u st_ts %u st_dur %u [%u-%u]\n",
>>>>               ack_ts, st_ts->tstamp, st_ts->dur,
>>>>               da->ack_rbf.h_rb, da->st_rbf.h_rb);
>>>>
>>>> The ackto is afterwards calculated by
>>>>
>>>>           if (ack_ts > st_ts->tstamp + st_ts->dur) {
>>>>               ackto = ack_ts - st_ts->tstamp - st_ts->dur;
>>>>
>>>> Filling in the values of the first sample:
>>>>
>>>> (ack_ts > st_ts->tstamp + st_ts->dur) ?
>>>> (44844835 > 44905341+8720) ?
>>>> (44844835 > 44914061) ? ... not given
>>>>
>>>> Therefore a new ackto is not calculated and i can also see that in the
>>>> ack_to file:
>>>>
>>>> 600 A
>>>> 600 A
>>>> 600 A
>>>> ...
>>>>
>>>> These look like the default values to me (and do not change), but
>>>> ath_dynack_get_max_to() should return 750 A for our 10MHz BW case -
>>>> this
>>>> looks also suspecious to me.
>>>>
>>>> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a
>>>> bit too
>>>> high to me) but the performance is way below expectation (<1MBit)
>>>> For 20 MHz bandwidth there is a ackto calculated (51 A) and the
>>>> performance is fitting the expectation.
>>>> If you want to take a look at the logs for each of these cases for ap
>>>> and sta, you can download them here:
>>>> https://cloud.hs-augsburg.de/s/eworxkJoL6JXYzZ
>>>>
>>>> Did anyone else experience such a behaviour on non 20MHz Channels or
>>>> does anyone have an idea on where this behaviour might originate from?
>>>> I am not experienced in the ath9k driver code, but a uneducated guess
>>>> might be that the ring buffer where the dynack algorithm is taking its
>>>> frame-samples from is not behaving as expected for the 5&10MHz case.
>>>>
>>>> regards,
>>>> julian dorner
>>> Are you stressing the link?
>>> I'll try to simulate this later on
>>>
>>> Regards,
>>>
>>> Koen
>>>
>> Hi Koen,
>>
>> we didnt stress the link that much.
>>
>> There was only SSH from the ap to the sta running to get access to
>> the sta.
>>
>> regards,
>>
>> Julian
>
> Hi,
>
> Please retry while sending data over the link (iperf or so) and let me
> know the results. :)
>
> Thanks,
>
> Koen
>
Hi Koen,

we tried this

> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a bit too
> high to me) but the performance is way below expectation (<1MBit)
running iperf didnt help on this.

Regards,

Julian

2021-06-22 11:54:41

by Koen Vandeputte

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth


On 22.06.21 12:12, Petrosilius wrote:
> On 22.06.21 11:54, Koen Vandeputte wrote:
>> On 18.06.21 13:13, Petrosilius wrote:
>>> Hello Lorenzo Bianconi,
>>>
>>> we are running a set of R11e-2HPnD devices and having trouble getting
>>> dynack working properly.
>>> Setup:
>>> * linux-5.4.123
>>> * OpenWRT (current development branch) with wireless backports-5.10.34-1
>>> * distance 2m between ap and sta
>>> * Low ambient noise wifi environment
>>> We experienced some non working dynack or low performance in the
>>> connection due to too high calculated ackto's.
>>>
>>> Here is a ath9k debug output example for a non working dynack @ 10Mhz
>>> BW:
>>>
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500427] ath: phy0:
>>> {48:8f:5a:3c:bb:03} tx sample 44905341 [dur 8720][h 29-t 30]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500437] ath: phy0:
>>> ack_ts 44844835 st_ts 44905341 st_dur 8720 [17-29]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500445] ath: phy0:
>>> ack_ts 44923425 st_ts 44905341 st_dur 8720 [18-29]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554642] ath: phy0: rx
>>> sample 44977693 [h 18-t 20]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554701] ath: phy0:
>>> {48:8f:5a:3c:bb:03} tx sample 44964984 [dur 6032][h 30-t 31]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554710] ath: phy0:
>>> ack_ts 44923425 st_ts 44964984 st_dur 6032 [18-30]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554718] ath: phy0:
>>> ack_ts 44977693 st_ts 44964984 st_dur 6032 [19-30]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577890] ath: phy0: rx
>>> sample 45000939 [h 19-t 21]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577946] ath: phy0:
>>> {48:8f:5a:3c:bb:03} tx sample 44998471 [dur 912][h 31-t 32]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577956] ath: phy0:
>>> ack_ts 44977693 st_ts 44998471 st_dur 912 [19-31]
>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577964] ath: phy0:
>>> ack_ts 45000939 st_ts 44998471 st_dur 912 [20-31]
>>>
>>> THe above output is generated in dynack.c by
>>>
>>>          ath_dbg(ath9k_hw_common(ah), DYNACK,
>>>              "ack_ts %u st_ts %u st_dur %u [%u-%u]\n",
>>>              ack_ts, st_ts->tstamp, st_ts->dur,
>>>              da->ack_rbf.h_rb, da->st_rbf.h_rb);
>>>
>>> The ackto is afterwards calculated by
>>>
>>>          if (ack_ts > st_ts->tstamp + st_ts->dur) {
>>>              ackto = ack_ts - st_ts->tstamp - st_ts->dur;
>>>
>>> Filling in the values of the first sample:
>>>
>>> (ack_ts > st_ts->tstamp + st_ts->dur) ?
>>> (44844835 > 44905341+8720) ?
>>> (44844835 > 44914061) ? ... not given
>>>
>>> Therefore a new ackto is not calculated and i can also see that in the
>>> ack_to file:
>>>
>>> 600 A
>>> 600 A
>>> 600 A
>>> ...
>>>
>>> These look like the default values to me (and do not change), but
>>> ath_dynack_get_max_to() should return 750 A for our 10MHz BW case - this
>>> looks also suspecious to me.
>>>
>>> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a bit too
>>> high to me) but the performance is way below expectation (<1MBit)
>>> For 20 MHz bandwidth there is a ackto calculated (51 A) and the
>>> performance is fitting the expectation.
>>> If you want to take a look at the logs for each of these cases for ap
>>> and sta, you can download them here:
>>> https://cloud.hs-augsburg.de/s/eworxkJoL6JXYzZ
>>>
>>> Did anyone else experience such a behaviour on non 20MHz Channels or
>>> does anyone have an idea on where this behaviour might originate from?
>>> I am not experienced in the ath9k driver code, but a uneducated guess
>>> might be that the ring buffer where the dynack algorithm is taking its
>>> frame-samples from is not behaving as expected for the 5&10MHz case.
>>>
>>> regards,
>>> julian dorner
>> Are you stressing the link?
>> I'll try to simulate this later on
>>
>> Regards,
>>
>> Koen
>>
> Hi Koen,
>
> we didnt stress the link that much.
>
> There was only SSH from the ap to the sta running to get access to the sta.
>
> regards,
>
> Julian

Hi,

Please retry while sending data over the link (iperf or so) and let me
know the results. :)

Thanks,

Koen

2021-06-22 11:55:26

by Koen Vandeputte

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth


On 22.06.21 12:40, Sebastian Gottschall wrote:
> just some cents from me. i modified the algorithm a long time ago
> since the dynack way ath9k was going was not correct.
> i will look if it can make a patch out of my experiences, but dont
> expect it within the next 2 days.
>
Can you elaborate (next to the patch) on *why* dynack was going the
wrong way?
New insights are always welcome :)

Thanks,

Koen


2021-06-22 12:04:33

by Koen Vandeputte

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth


On 22.06.21 13:53, Petrosilius wrote:
> On 22.06.21 13:52, Koen Vandeputte wrote:
>> On 22.06.21 12:12, Petrosilius wrote:
>>> On 22.06.21 11:54, Koen Vandeputte wrote:
>>>> On 18.06.21 13:13, Petrosilius wrote:
>>>>> Hello Lorenzo Bianconi,
>>>>>
>>>>> we are running a set of R11e-2HPnD devices and having trouble getting
>>>>> dynack working properly.
>>>>> Setup:
>>>>> * linux-5.4.123
>>>>> * OpenWRT (current development branch) with wireless
>>>>> backports-5.10.34-1
>>>>> * distance 2m between ap and sta
>>>>> * Low ambient noise wifi environment
>>>>> We experienced some non working dynack or low performance in the
>>>>> connection due to too high calculated ackto's.
>>>>>
>>>>> Here is a ath9k debug output example for a non working dynack @ 10Mhz
>>>>> BW:
>>>>>
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500427] ath: phy0:
>>>>> {48:8f:5a:3c:bb:03} tx sample 44905341 [dur 8720][h 29-t 30]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500437] ath: phy0:
>>>>> ack_ts 44844835 st_ts 44905341 st_dur 8720 [17-29]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500445] ath: phy0:
>>>>> ack_ts 44923425 st_ts 44905341 st_dur 8720 [18-29]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554642] ath:
>>>>> phy0: rx
>>>>> sample 44977693 [h 18-t 20]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554701] ath: phy0:
>>>>> {48:8f:5a:3c:bb:03} tx sample 44964984 [dur 6032][h 30-t 31]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554710] ath: phy0:
>>>>> ack_ts 44923425 st_ts 44964984 st_dur 6032 [18-30]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554718] ath: phy0:
>>>>> ack_ts 44977693 st_ts 44964984 st_dur 6032 [19-30]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577890] ath:
>>>>> phy0: rx
>>>>> sample 45000939 [h 19-t 21]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577946] ath: phy0:
>>>>> {48:8f:5a:3c:bb:03} tx sample 44998471 [dur 912][h 31-t 32]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577956] ath: phy0:
>>>>> ack_ts 44977693 st_ts 44998471 st_dur 912 [19-31]
>>>>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577964] ath: phy0:
>>>>> ack_ts 45000939 st_ts 44998471 st_dur 912 [20-31]
>>>>>
>>>>> THe above output is generated in dynack.c by
>>>>>
>>>>>           ath_dbg(ath9k_hw_common(ah), DYNACK,
>>>>>               "ack_ts %u st_ts %u st_dur %u [%u-%u]\n",
>>>>>               ack_ts, st_ts->tstamp, st_ts->dur,
>>>>>               da->ack_rbf.h_rb, da->st_rbf.h_rb);
>>>>>
>>>>> The ackto is afterwards calculated by
>>>>>
>>>>>           if (ack_ts > st_ts->tstamp + st_ts->dur) {
>>>>>               ackto = ack_ts - st_ts->tstamp - st_ts->dur;
>>>>>
>>>>> Filling in the values of the first sample:
>>>>>
>>>>> (ack_ts > st_ts->tstamp + st_ts->dur) ?
>>>>> (44844835 > 44905341+8720) ?
>>>>> (44844835 > 44914061) ? ... not given
>>>>>
>>>>> Therefore a new ackto is not calculated and i can also see that in the
>>>>> ack_to file:
>>>>>
>>>>> 600 A
>>>>> 600 A
>>>>> 600 A
>>>>> ...
>>>>>
>>>>> These look like the default values to me (and do not change), but
>>>>> ath_dynack_get_max_to() should return 750 A for our 10MHz BW case -
>>>>> this
>>>>> looks also suspecious to me.
>>>>>
>>>>> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a
>>>>> bit too
>>>>> high to me) but the performance is way below expectation (<1MBit)
>>>>> For 20 MHz bandwidth there is a ackto calculated (51 A) and the
>>>>> performance is fitting the expectation.
>>>>> If you want to take a look at the logs for each of these cases for ap
>>>>> and sta, you can download them here:
>>>>> https://cloud.hs-augsburg.de/s/eworxkJoL6JXYzZ
>>>>>
>>>>> Did anyone else experience such a behaviour on non 20MHz Channels or
>>>>> does anyone have an idea on where this behaviour might originate from?
>>>>> I am not experienced in the ath9k driver code, but a uneducated guess
>>>>> might be that the ring buffer where the dynack algorithm is taking its
>>>>> frame-samples from is not behaving as expected for the 5&10MHz case.
>>>>>
>>>>> regards,
>>>>> julian dorner
>>>> Are you stressing the link?
>>>> I'll try to simulate this later on
>>>>
>>>> Regards,
>>>>
>>>> Koen
>>>>
>>> Hi Koen,
>>>
>>> we didnt stress the link that much.
>>>
>>> There was only SSH from the ap to the sta running to get access to
>>> the sta.
>>>
>>> regards,
>>>
>>> Julian
>> Hi,
>>
>> Please retry while sending data over the link (iperf or so) and let me
>> know the results. :)
>>
>> Thanks,
>>
>> Koen
>>
> Hi Koen,
>
> we tried this
>
>> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a bit too
>> high to me) but the performance is way below expectation (<1MBit)
> running iperf didnt help on this.
>
> Regards,
>
> Julian
>
Thanks for confirming that.

What would really help is a small table showing this:

Real physical distance? (in m)
ack_to reported while stressing the link:

20MHz: xx
10 MHz: yy
5 MHz: zz

I'll try to simulate the issue somewhere in the next days.


Please do note that ongoing effort is currently going on to improve
dynack on lower distances.

It was observed and reported by me to Lorenzo that ack_to was way higher
than fixed settings when
real distance is <6km

Some testing patches were cooked and tested in the field last month
covering long and short distances (1km up to 24km)
and these are matching fixed distance ack_to very close now. (speeds
using dynack were also higher than fixed settings)
It's not finalized yet.

Also do note that dynack only shows (any) benefit when having links >3km
Below that, timing jitter and processing time seems to have more
influence on ack_to than actual distance.

Regards,

Koen

2021-06-22 18:55:25

by petrosilius

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth

On 22.06.21 14:03, Koen Vandeputte wrote:
>
> Thanks for confirming that.
>
> What would really help is a small table showing this:
>
> Real physical distance? (in m)
> ack_to reported while stressing the link:
>
> 20MHz: xx
> 10 MHz: yy
> 5 MHz: zz
>
> I'll try to simulate the issue somewhere in the next days.
>
>
> Please do note that ongoing effort is currently going on to improve
> dynack on lower distances.
>
> It was observed and reported by me to Lorenzo that ack_to was way
> higher than fixed settings when
> real distance is <6km
>
> Some testing patches were cooked and tested in the field last month
> covering long and short distances (1km up to 24km)
> and these are matching fixed distance ack_to very close now. (speeds
> using dynack were also higher than fixed settings)
> It's not finalized yet.
>
> Also do note that dynack only shows (any) benefit when having links >3km
> Below that, timing jitter and processing time seems to have more
> influence on ack_to than actual distance.
>
> Regards,
>
> Koen
>
Here the testresults

real physical distance: 1m

BW: 20mhz
ackto: 50
iperf3
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  86.4 MBytes  72.4 Mbits/sec    0             sender
[  5]   0.00-10.01  sec  86.0 MBytes  72.0 Mbits/sec                 
receiver

BW: 5mhz
ackto: 50
iperf3
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.01  sec  45.2 KBytes  37.0 Kbits/sec   14             sender
[  5]   0.00-10.07  sec  5.66 KBytes  4.60 Kbits/sec                 
receiver

BW: 10mhz
ackto: 50
iperf3
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   127 KBytes   104 Kbits/sec   30             sender
[  5]   0.00-14.10  sec   107 KBytes  62.4 Kbits/sec                 
receiver

Also longer iperf3 tests didnt change anything on this behaviour.

After this test we were actually able to get the ack_to doing something
for 5mhz (ack_to 641 - ~12MBit iperf3) and 10mhz (ack_to 258 - ~14MBit
iperf3) by doing some random wifi-restarts (using openwrt 'wifi'
command) and reboots, but this is not really reproducible.

We got the gutfeeling that there might be some issue reseting of the
dynack part when changing the bandwidths. This might explain, why for
the test above the ack_to doesnt change at all from the 50 of the 20Mhz
BW. Or is this normal behaviour that the ack_to doesnt change after a
bandwidth change?

Another observation: Is it normal that 'iw dev' doesnt show 5 & 10 MHz,
but always 20 MHz? (We verified that we actually get 5/10MHz Channels by
checking visibility of the AP with other clients).

To rule out a hardware problem: We used the same hardware successfully
on a 8km/20MHz link with dynack.

Regards,

Julian

2021-06-22 21:02:59

by Koen Vandeputte

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth


On 22.06.21 20:54, Petrosilius wrote:
> On 22.06.21 14:03, Koen Vandeputte wrote:
>> Thanks for confirming that.
>>
>> What would really help is a small table showing this:
>>
>> Real physical distance? (in m)
>> ack_to reported while stressing the link:
>>
>> 20MHz: xx
>> 10 MHz: yy
>> 5 MHz: zz
>>
>> I'll try to simulate the issue somewhere in the next days.
>>
>>
>> Please do note that ongoing effort is currently going on to improve
>> dynack on lower distances.
>>
>> It was observed and reported by me to Lorenzo that ack_to was way
>> higher than fixed settings when
>> real distance is <6km
>>
>> Some testing patches were cooked and tested in the field last month
>> covering long and short distances (1km up to 24km)
>> and these are matching fixed distance ack_to very close now. (speeds
>> using dynack were also higher than fixed settings)
>> It's not finalized yet.
>>
>> Also do note that dynack only shows (any) benefit when having links >3km
>> Below that, timing jitter and processing time seems to have more
>> influence on ack_to than actual distance.
>>
>> Regards,
>>
>> Koen
>>
> Here the testresults
>
> real physical distance: 1m
>
> BW: 20mhz
> ackto: 50
> iperf3
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  86.4 MBytes  72.4 Mbits/sec    0             sender
> [  5]   0.00-10.01  sec  86.0 MBytes  72.0 Mbits/sec
> receiver
>
> BW: 5mhz
> ackto: 50
> iperf3
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.01  sec  45.2 KBytes  37.0 Kbits/sec   14             sender
> [  5]   0.00-10.07  sec  5.66 KBytes  4.60 Kbits/sec
> receiver
>
> BW: 10mhz
> ackto: 50
> iperf3
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec   127 KBytes   104 Kbits/sec   30             sender
> [  5]   0.00-14.10  sec   107 KBytes  62.4 Kbits/sec
> receiver
>
> Also longer iperf3 tests didnt change anything on this behaviour.
>
> After this test we were actually able to get the ack_to doing something
> for 5mhz (ack_to 641 - ~12MBit iperf3) and 10mhz (ack_to 258 - ~14MBit
> iperf3) by doing some random wifi-restarts (using openwrt 'wifi'
> command) and reboots, but this is not really reproducible.
>
> We got the gutfeeling that there might be some issue reseting of the
> dynack part when changing the bandwidths. This might explain, why for
> the test above the ack_to doesnt change at all from the 50 of the 20Mhz
> BW. Or is this normal behaviour that the ack_to doesnt change after a
> bandwidth change?
>
> Another observation: Is it normal that 'iw dev' doesnt show 5 & 10 MHz,
> but always 20 MHz? (We verified that we actually get 5/10MHz Channels by
> checking visibility of the AP with other clients).
>
> To rule out a hardware problem: We used the same hardware successfully
> on a 8km/20MHz link with dynack.
>
> Regards,
>
> Julian
>
Great!

Thanks for the details.
i'll have a look at it.

I can't comment for the iw part.

Regards,

Koen

2021-07-08 22:07:01

by Peter Seiderer

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth

Hello *,

On Tue, 22 Jun 2021 12:40:31 +0200, Sebastian Gottschall <[email protected]> wrote:

> just some cents from me. i modified the algorithm a long time ago since
> the dynack way ath9k was going was not correct.
> i will look if it can make a patch out of my experiences, but dont
> expect it within the next 2 days.
>
> Am 22.06.2021 um 12:12 schrieb Petrosilius:
> > On 22.06.21 11:54, Koen Vandeputte wrote:
> >> On 18.06.21 13:13, Petrosilius wrote:
> >>> Hello Lorenzo Bianconi,
> >>>
> >>> we are running a set of R11e-2HPnD devices and having trouble getting
> >>> dynack working properly.
> >>> Setup:
> >>> * linux-5.4.123
> >>> * OpenWRT (current development branch) with wireless backports-5.10.34-1
> >>> * distance 2m between ap and sta
> >>> * Low ambient noise wifi environment
> >>> We experienced some non working dynack or low performance in the
> >>> connection due to too high calculated ackto's.
> >>>
> >>> Here is a ath9k debug output example for a non working dynack @ 10Mhz
> >>> BW:
> >>>
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500427] ath: phy0:
> >>> {48:8f:5a:3c:bb:03} tx sample 44905341 [dur 8720][h 29-t 30]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500437] ath: phy0:
> >>> ack_ts 44844835 st_ts 44905341 st_dur 8720 [17-29]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.500445] ath: phy0:
> >>> ack_ts 44923425 st_ts 44905341 st_dur ackto.tar8720 [18-29]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554642] ath: phy0: rx
> >>> sample 44977693 [h 18-t 20]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554701] ath: phy0:
> >>> {48:8f:5a:3c:bb:03} tx sample 44964984 [dur 6032][h 30-t 31]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554710] ath: phy0:
> >>> ack_ts 44923425 st_ts 44964984 st_dur 6032 [18-30]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.554718] ath: phy0:
> >>> ack_ts 44977693 st_ts 44964984 st_dur 6032 [19-30]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577890] ath: phy0: rx
> >>> sample 45000939 [h 19-t 21]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577946] ath: phy0:
> >>> {48:8f:5a:3c:bb:03} tx sample 44998471 [dur 912][h 31-t 32]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577956] ath: phy0:
> >>> ack_ts 44977693 st_ts 44998471 st_dur 912 [19-31]
> >>> Wed Jun  2 19:08:50 2021 kern.debug kernel: [  400.577964] ath: phy0:
> >>> ack_ts 45000939 st_ts 44998471 st_dur 912 [20-31]
> >>>
> >>> THe above output is generated in dynack.c by
> >>>
> >>>          ath_dbg(ath9k_hw_common(ah), DYNACK,
> >>>              "ack_ts %u st_ts %u st_dur %u [%u-%u]\n",
> >>>              ack_ts, st_ts->tstamp, st_ts->dur,
> >>>              da->ack_rbf.h_rb, da->st_rbf.h_rb);
> >>>
> >>> The ackto is afterwards calculated by
> >>>
> >>>          if (ack_ts > st_ts->tstamp + st_ts->dur) {
> >>>              ackto = ack_ts - st_ts->tstamp - st_ts->dur;
> >>>
> >>> Filling in the values of the first sample:
> >>>
> >>> (ack_ts > st_ts->tstamp + st_ts->dur) ?
> >>> (44844835 > 44905341+8720) ?
> >>> (44844835 > 44914061) ? ... not given
> >>>
> >>> Therefore a new ackto is not calculated and i can also see that in the
> >>> ack_to file:
> >>>
> >>> 600 A
> >>> 600 A
> >>> 600 A
> >>> ...
> >>>
> >>> These look like the default values to me (and do not change), but
> >>> ath_dynack_get_max_to() should return 750 A for our 10MHz BW case - this
> >>> looks also suspecious to me.
> >>>
> >>> For 5 MHz bandwidth there is a ackto calculated (~382 A, looks a bit too
> >>> high to me) but the performance is way below expectation (<1MBit)
> >>> For 20 MHz bandwidth there is a ackto calculated (51 A) and the
> >>> performance is fitting the expectation.
> >>> If you want to take a look at the logs for each of these cases for ap
> >>> and sta, you can download them here:
> >>> https://cloud.hs-augsburg.de/s/eworxkJoL6JXYzZ
> >>>
> >>> Did anyone else experience such a behaviour on non 20MHz Channels or
> >>> does anyone have an idea on where this behaviour might originate from?
> >>> I am not experienced in the ath9k driver code, but a uneducated guess
> >>> might be that the ring buffer where the dynack algorithm is taking its
> >>> frame-samples from is not behaving as expected for the 5&10MHz case.
> >>>
> >>> regards,
> >>> julian dorner
> >> Are you stressing the link?
> >> I'll try to simulate this later on
> >>
> >> Regards,
> >>
> >> Koen
> >>
> > Hi Koen,
> >
> > we didnt stress the link that much.
> >
> > There was only SSH from the ap to the sta running to get access to the sta.
> >
> > regards,
> >
> > Julian
> >
> >

Did observe same/similar problem, in my case with IBSS mode and three nodes/
stations A, B, C. IP traffic only between A, B. Node C is 'passive' (sending
only beacons and broadcasts), but the current algorithm keeps ackto at 600
(no update of the ackto value for node/station C - added an debugfs entry
ack_to_detailed dumping the ackto values per station which are evaluated
in ath_dynack_compute_ackto() where the highest value wins).

Fixed it by setting the initial ackto value to zero for new nodes/stations:

diff --git a/drivers/net/wireless/ath/ath9k/dynack.c b/drivers/net/wireless/ath/ath9k/dynack.c
index fbeb4a739..ebf7800bd 100644
--- a/drivers/net/wireless/ath/ath9k/dynack.c
+++ b/drivers/net/wireless/ath/ath9k/dynack.c
@@ -235,7 +235,7 @@ void ath_dynack_sample_tx_ts(struct ath_hw *ah, struct sk_buff *skb,
struct ath_node *an;

an = (struct ath_node *)sta->drv_priv;
- an->ackto = -1;
+ an->ackto = 0; /* do not vote for ackto until real value is known */
}
da->lto = jiffies + LATEACK_DELAY;
}
@@ -323,7 +323,7 @@ void ath_dynack_node_init(struct ath_hw *ah, struct ath_node *an)
{
struct ath_dynack *da = &ah->dynack;

- an->ackto = da->ackto;
+ an->ackto = 0; /* do not vote for ackto until real value is known */

spin_lock_bh(&da->qlock);
list_add_tail(&an->list, &da->nodes);
@@ -368,7 +368,7 @@ void ath_dynack_reset(struct ath_hw *ah)

da->ackto = ath_dynack_get_max_to(ah);
list_for_each_entry(an, &da->nodes, list)
- an->ackto = da->ackto;
+ an->ackto = 0; /* do not vote for ackto until real value is known */

/* init acktimeout */
ath_dynack_set_timeout(ah, da->ackto);


Another observation is that the initial/default ackto value is dependent on
the order of the interface configure commands, e.g.:

$ iw phy0 set distance auto
$ iw wlan0 set type ibss
$ ifconfig wlan0 10.11.0.3 up
$ iw wlan0 ibss join test-ibss1 5180 1a:2b:3c:4d:5e:6f

results in an initial ackto value of 300,

$ iw wlan0 set type ibss
$ ifconfig wlan0 10.11.0.3 up
$ iw wlan0 ibss join test-ibss1 5180 1a:2b:3c:4d:5e:6f
$ iw phy0 set distance auto

results in an initial ackto value of 600...

Regards,
Peter


2021-07-13 14:38:48

by Koen Vandeputte

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth


On 22.06.21 20:54, Petrosilius wrote:
> On 22.06.21 14:03, Koen Vandeputte wrote:
>> Thanks for confirming that.
>>
>> What would really help is a small table showing this:
>>
>> Real physical distance? (in m)
>> ack_to reported while stressing the link:
>>
>> 20MHz: xx
>> 10 MHz: yy
>> 5 MHz: zz
>>
>> I'll try to simulate the issue somewhere in the next days.
>>
>>
>> Please do note that ongoing effort is currently going on to improve
>> dynack on lower distances.
>>
>> It was observed and reported by me to Lorenzo that ack_to was way
>> higher than fixed settings when
>> real distance is <6km
>>
>> Some testing patches were cooked and tested in the field last month
>> covering long and short distances (1km up to 24km)
>> and these are matching fixed distance ack_to very close now. (speeds
>> using dynack were also higher than fixed settings)
>> It's not finalized yet.
>>
>> Also do note that dynack only shows (any) benefit when having links >3km
>> Below that, timing jitter and processing time seems to have more
>> influence on ack_to than actual distance.
>>
>> Regards,
>>
>> Koen
>>
> Here the testresults
>
> real physical distance: 1m
>
> BW: 20mhz
> ackto: 50
> iperf3
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  86.4 MBytes  72.4 Mbits/sec    0             sender
> [  5]   0.00-10.01  sec  86.0 MBytes  72.0 Mbits/sec
> receiver
>
> BW: 5mhz
> ackto: 50
> iperf3
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.01  sec  45.2 KBytes  37.0 Kbits/sec   14             sender
> [  5]   0.00-10.07  sec  5.66 KBytes  4.60 Kbits/sec
> receiver
>
> BW: 10mhz
> ackto: 50
> iperf3
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec   127 KBytes   104 Kbits/sec   30             sender
> [  5]   0.00-14.10  sec   107 KBytes  62.4 Kbits/sec
> receiver
>
> Also longer iperf3 tests didnt change anything on this behaviour.
>
> After this test we were actually able to get the ack_to doing something
> for 5mhz (ack_to 641 - ~12MBit iperf3) and 10mhz (ack_to 258 - ~14MBit
> iperf3) by doing some random wifi-restarts (using openwrt 'wifi'
> command) and reboots, but this is not really reproducible.
>
> We got the gutfeeling that there might be some issue reseting of the
> dynack part when changing the bandwidths. This might explain, why for
> the test above the ack_to doesnt change at all from the 50 of the 20Mhz
> BW. Or is this normal behaviour that the ack_to doesnt change after a
> bandwidth change?
>
> Another observation: Is it normal that 'iw dev' doesnt show 5 & 10 MHz,
> but always 20 MHz? (We verified that we actually get 5/10MHz Channels by
> checking visibility of the AP with other clients).
>
> To rule out a hardware problem: We used the same hardware successfully
> on a 8km/20MHz link with dynack.
>
> Regards,
>
> Julian
>
I can simulate the issue.

The problem is that IBSS starts at the highest value (to guarantee
connection), but it takes forever to get enough valid samples to
actually reach the optimal value through EWMA:

[   48.060768] ath: phy0: ACK timeout 600 slottime 298
[   51.155507] wlan0: Trigger new scan to find an IBSS to join
[   53.104609] wlan0: Selected IBSS BSSID ac:c5:2a:a3:61:c9 based on
configured SSID
[   53.160471] ath: phy0: rx sample 46297 [h 0-t 1]
[   53.160522] ath: phy0: {74:4d:28:14:d9:94} tx sample 46163 [dur 72][h
0-t 1]
[   53.160533] ath: phy0: ack_ts 46297 st_ts 46163 st_dur 72 [0-0]
[   53.160542] ath: phy0: 74:4d:28:14:d9:94 to 465 [62]
[   53.160550] ath: phy0: ACK timeout 465 slottime 231
[   53.165973] ath: phy0: rx sample 51368 [h 1-t 2]
[   53.166027] ath: phy0: {74:4d:28:14:d9:94} tx sample 51098 [dur
208][h 1-t 2]
[   53.166038] ath: phy0: ack_ts 51368 st_ts 51098 st_dur 208 [1-1]
[   53.166047] ath: phy0: 74:4d:28:14:d9:94 to 364 [62]
[   53.170739] ath: phy0: rx sample 56568 [h 2-t 3]
[   53.170791] ath: phy0: {74:4d:28:14:d9:94} tx sample 56222 [dur
284][h 2-t 3]
[   53.170801] ath: phy0: ack_ts 56568 st_ts 56222 st_dur 284 [2-2]
[   53.170811] ath: phy0: 74:4d:28:14:d9:94 to 288 [62]
[   54.169095] ath: phy0: rx sample 1054919 [h 3-t 4]
[   54.169148] ath: phy0: {74:4d:28:14:d9:94} tx sample 1054598 [dur
260][h 3-t 4]
[   54.169159] ath: phy0: ack_ts 1054919 st_ts 1054598 st_dur 260 [3-3]
[   54.169169] ath: phy0: 74:4d:28:14:d9:94 to 231 [61]
[   54.173970] ath: phy0: rx sample 1059800 [h 4-t 5]
[   54.174024] ath: phy0: {74:4d:28:14:d9:94} tx sample 1059511 [dur
228][h 4-t 5]
[   54.174035] ath: phy0: ack_ts 1059800 st_ts 1059511 st_dur 228 [4-4]
[   54.174044] ath: phy0: 74:4d:28:14:d9:94 to 188 [61]
...
[  326.925123] ath: phy0: ACK timeout 156 slottime 76
...
[  653.145142] ath: phy0: ACK timeout 132 slottime 64
...
[  926.928758] ath: phy0: ACK timeout 114 slottime 55

(static optimal value is 28 .. so it will take a lot longer ..)


Above log is with an iperf session running in the background. :-s

Will look further into this ..

Regards,

Koen

2021-07-14 05:39:46

by Sebastian Gottschall

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth


>
>
> Above log is with an iperf session running in the background. :-s
>
> Will look further into this ..
>
> Regards,
>
> Koen
some cents from me. the chipset has a hw delay/latency which doubles and
quadroubles depending on the channel width. this special offset needs to
be considered for ack
timing calculation. especially at very short distances you will run into
a problem that if this calculation is wrong, the ack timing register
contains a lower value than the hw delay/latency.
this will result in a effective negative ack value. maybe its worth to
check if the values used for this latency offset are correct?
i found also a patch in my own code which i made many years ago (i
cannot remember why) but maybe it helps
@@ -1128,11 +1110,10 @@

                slottime = ah->slottime;
        }
+
        /* As defined by IEEE 802.11-2007 17.3.8.6 */
        slottime += 3 * ah->coverage_class;
        acktimeout = slottime + sifstime + ack_offset;
+       eifs += (6 * ah->coverage_class);
        ctstimeout = acktimeout;

        /*

Sebastian

>
>

2021-11-09 22:08:55

by petrosilius

[permalink] [raw]
Subject: Re: [Bugreport] ath9k dynack not working/low performance on 5 & 10MHz Bandwidth

Hi Koen,

did you find time to look further into this?

regards,

julian

On 13.07.21 16:34, Koen Vandeputte wrote:
>
> On 22.06.21 20:54, Petrosilius wrote:
>> On 22.06.21 14:03, Koen Vandeputte wrote:
>>> Thanks for confirming that.
>>>
>>> What would really help is a small table showing this:
>>>
>>> Real physical distance? (in m)
>>> ack_to reported while stressing the link:
>>>
>>> 20MHz: xx
>>> 10 MHz: yy
>>> 5 MHz: zz
>>>
>>> I'll try to simulate the issue somewhere in the next days.
>>>
>>>
>>> Please do note that ongoing effort is currently going on to improve
>>> dynack on lower distances.
>>>
>>> It was observed and reported by me to Lorenzo that ack_to was way
>>> higher than fixed settings when
>>> real distance is <6km
>>>
>>> Some testing patches were cooked and tested in the field last month
>>> covering long and short distances (1km up to 24km)
>>> and these are matching fixed distance ack_to very close now. (speeds
>>> using dynack were also higher than fixed settings)
>>> It's not finalized yet.
>>>
>>> Also do note that dynack only shows (any) benefit when having links
>>> >3km
>>> Below that, timing jitter and processing time seems to have more
>>> influence on ack_to than actual distance.
>>>
>>> Regards,
>>>
>>> Koen
>>>
>> Here the testresults
>>
>> real physical distance: 1m
>>
>> BW: 20mhz
>> ackto: 50
>> iperf3
>> [ ID] Interval           Transfer     Bitrate         Retr
>> [  5]   0.00-10.00  sec  86.4 MBytes  72.4 Mbits/sec 0            
>> sender
>> [  5]   0.00-10.01  sec  86.0 MBytes  72.0 Mbits/sec
>> receiver
>>
>> BW: 5mhz
>> ackto: 50
>> iperf3
>> [ ID] Interval           Transfer     Bitrate         Retr
>> [  5]   0.00-10.01  sec  45.2 KBytes  37.0 Kbits/sec 14            
>> sender
>> [  5]   0.00-10.07  sec  5.66 KBytes  4.60 Kbits/sec
>> receiver
>>
>> BW: 10mhz
>> ackto: 50
>> iperf3
>> [ ID] Interval           Transfer     Bitrate         Retr
>> [  5]   0.00-10.00  sec   127 KBytes   104 Kbits/sec 30            
>> sender
>> [  5]   0.00-14.10  sec   107 KBytes  62.4 Kbits/sec
>> receiver
>>
>> Also longer iperf3 tests didnt change anything on this behaviour.
>>
>> After this test we were actually able to get the ack_to doing something
>> for 5mhz (ack_to 641 - ~12MBit iperf3) and 10mhz (ack_to 258 - ~14MBit
>> iperf3) by doing some random wifi-restarts (using openwrt 'wifi'
>> command) and reboots, but this is not really reproducible.
>>
>> We got the gutfeeling that there might be some issue reseting of the
>> dynack part when changing the bandwidths. This might explain, why for
>> the test above the ack_to doesnt change at all from the 50 of the 20Mhz
>> BW. Or is this normal behaviour that the ack_to doesnt change after a
>> bandwidth change?
>>
>> Another observation: Is it normal that 'iw dev' doesnt show 5 & 10 MHz,
>> but always 20 MHz? (We verified that we actually get 5/10MHz Channels by
>> checking visibility of the AP with other clients).
>>
>> To rule out a hardware problem: We used the same hardware successfully
>> on a 8km/20MHz link with dynack.
>>
>> Regards,
>>
>> Julian
>>
> I can simulate the issue.
>
> The problem is that IBSS starts at the highest value (to guarantee
> connection), but it takes forever to get enough valid samples to
> actually reach the optimal value through EWMA:
>
> [   48.060768] ath: phy0: ACK timeout 600 slottime 298
> [   51.155507] wlan0: Trigger new scan to find an IBSS to join
> [   53.104609] wlan0: Selected IBSS BSSID ac:c5:2a:a3:61:c9 based on
> configured SSID
> [   53.160471] ath: phy0: rx sample 46297 [h 0-t 1]
> [   53.160522] ath: phy0: {74:4d:28:14:d9:94} tx sample 46163 [dur
> 72][h 0-t 1]
> [   53.160533] ath: phy0: ack_ts 46297 st_ts 46163 st_dur 72 [0-0]
> [   53.160542] ath: phy0: 74:4d:28:14:d9:94 to 465 [62]
> [   53.160550] ath: phy0: ACK timeout 465 slottime 231
> [   53.165973] ath: phy0: rx sample 51368 [h 1-t 2]
> [   53.166027] ath: phy0: {74:4d:28:14:d9:94} tx sample 51098 [dur
> 208][h 1-t 2]
> [   53.166038] ath: phy0: ack_ts 51368 st_ts 51098 st_dur 208 [1-1]
> [   53.166047] ath: phy0: 74:4d:28:14:d9:94 to 364 [62]
> [   53.170739] ath: phy0: rx sample 56568 [h 2-t 3]
> [   53.170791] ath: phy0: {74:4d:28:14:d9:94} tx sample 56222 [dur
> 284][h 2-t 3]
> [   53.170801] ath: phy0: ack_ts 56568 st_ts 56222 st_dur 284 [2-2]
> [   53.170811] ath: phy0: 74:4d:28:14:d9:94 to 288 [62]
> [   54.169095] ath: phy0: rx sample 1054919 [h 3-t 4]
> [   54.169148] ath: phy0: {74:4d:28:14:d9:94} tx sample 1054598 [dur
> 260][h 3-t 4]
> [   54.169159] ath: phy0: ack_ts 1054919 st_ts 1054598 st_dur 260 [3-3]
> [   54.169169] ath: phy0: 74:4d:28:14:d9:94 to 231 [61]
> [   54.173970] ath: phy0: rx sample 1059800 [h 4-t 5]
> [   54.174024] ath: phy0: {74:4d:28:14:d9:94} tx sample 1059511 [dur
> 228][h 4-t 5]
> [   54.174035] ath: phy0: ack_ts 1059800 st_ts 1059511 st_dur 228 [4-4]
> [   54.174044] ath: phy0: 74:4d:28:14:d9:94 to 188 [61]
> ...
> [  326.925123] ath: phy0: ACK timeout 156 slottime 76
> ...
> [  653.145142] ath: phy0: ACK timeout 132 slottime 64
> ...
> [  926.928758] ath: phy0: ACK timeout 114 slottime 55
>
> (static optimal value is 28 .. so it will take a lot longer ..)
>
>
> Above log is with an iperf session running in the background. :-s
>
> Will look further into this ..
>
> Regards,
>
> Koen
>