2019-11-18 21:09:18

by Dave Taht

Subject: Re: [Make-wifi-fast] [PATCH v8 0/2] Implement Airtime-based Queue Limit (AQL)

On Fri, Nov 15, 2019 at 4:10 PM Kan Yan wrote:
>
> > I do keep hoping for pretty pictures. Got any? :-P
>
> Certainly! I do have some :). Here is the link:
> https://drive.google.com/corp/drive/folders/14OIuQEHOUiIoNrVnKprj6rBYFNZ0Coif

Those were lovely, thanks!!!! Big win. Since you are on patch v10
now.... Any chance you could turn ecn on and off and give it a go
again in your next test run?

Also:

--step-size=.04 --socket-stats # the first helps to gain more
detail, the second shows the behavior of the tcp stack. You might need
to run as root for the latter (and have the right ss utility), and
it's only useful on the tcp_nup test.

Secondly - and AFTER this patchset stabilizes, I'd like us to look into
returning the codel default to 10ms or less,
from its current setting of 20ms or worse. 'Tis another easy test.

And y'all know how much I love the rrul_be and rrul tests.....


> >
> > Is there an ax QCAXXXX platform, m.2 card, or mini-pci card worth
> > testing at this point?
>
> It would be great if someone with an 802.11ax platform could give it a try.
>
> > How are they handling mu-mimo?
>
> I think it should still work. The queue length in airtime for each individual queue is unchanged, even though multiple queues are allowed to transmit concurrently with MU-MIMO.
>
>> I have a round of tests scheduled for intel's ax200 chips, soon. Not sure
>> what, if any, of this new work might apply.
>
> It will be very interesting to know how it performs on 802.11ax platforms. Supposedly 802.11ax already fixed the latency problem, so the benefit of this patch should be less significant.
>
>
> On Thu, Nov 14, 2019 at 6:07 PM Dave Taht wrote:
>>
>> On Thu, Nov 14, 2019 at 6:04 PM Kan Yan wrote:
>> >
>> > I have tested it with Toke's patch "[PATCH v6 4/4] mac80211: Use
>> > Airtime-based Queue Limits (AQL) on packet dequeue", but didn't
>> > include it here, as it is self-contained and Toke plans to update
>> > it.
>> >
>> > The platform (QCA9984) used in my test
>>
>> I do keep hoping for pretty pictures. Got any? :-P
>>
>> > doesn't support 802.11ax, so I
>> > was not able to test the HE mode support added in v7 update of "Import
>> > airtime calculation code from mt76" from Toke.
>>
>> Is there an ax QCAXXXX platform, m.2 card, or mini-pci card worth
>> testing at this point?
>>
>> How are they handling mu-mimo?
>>
>> I have a round of tests scheduled for intel's ax200 chips, soon. Not sure
>> what, if any, of this new work might apply.
>>
>> > On Thu, Nov 14, 2019 at 5:48 PM Kan Yan wrote:
>> > >
>> > > This patch series ports the Airtime Queue Limits concept from the out-of-tree
>> > > ath10k implementation[0] to mac80211. This version takes my patch to do the
>> > > throttling in mac80211, and replaces the driver API with the mechanism from
>> > > Toke's series, which instead calculates the expected airtime at dequeue time
>> > > inside mac80211, storing it in the SKB cb field.
>> > >
>> > > This version has been tested on QCA9984 platform.
>> > >
>> > > [0] https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1703105/7
>> > >
>> > > Changelog:
>> > >
>> > > v8:
>> > > - Includes Toke's v7 version of "mac80211: Import airtime calculation code from mt76"
>> > > - Don't clobber sta's customized queue limit when configuring the default via debugfs
>> > > - Fix a race condition when resetting aql_tx_pending.
>> > >
>> > > v7:
>> > > - Fix aql_total_pending_airtime underflow due to insufficient locking.
>> > >
>> > > v6:
>> > > - Fix sta lookup in ieee80211_report_used_skb().
>> > > - Move call to ieee80211_sta_update_pending_airtime() to a bit later in
>> > >   __ieee80211_tx_status()
>> > >
>> > > v5:
>> > > - Add missing export of ieee80211_calc_rx_airtime() and make
>> > >   ieee80211_calc_tx_airtime_rate() static (kbuildbot).
>> > > - Use skb_get_queue_mapping() to get the AC from the skb.
>> > > - Take basic rate configuration for the BSS into account when calculating
>> > >   multicast rate.
>> > >
>> > > v4:
>> > > - Fix calculation that clamps the maximum airtime to fit into 10 bits
>> > > - Incorporate Rich Brown's nits for the commit message in Kan's patch
>> > > - Add fewer local variables to ieee80211_tx_dequeue()
>> > >
>> > > v3:
>> > > - Move the tx_time_est field so it's shared with ack_frame_id, and use units
>> > >   of 4us for the value stored in it.
>> > > - Move the addition of the Ethernet header size into ieee80211_calc_expected_tx_airtime()
>> > >
>> > > v2:
>> > > - Integrate Kan's approach to airtime throttling.
>> > > - Hopefully fix the cb struct alignment on big-endian architectures.
>> > >
>> > >
>> > >
>> > > Kan Yan (1):
>> > >   mac80211: Implement Airtime-based Queue Limit (AQL)
>> > >
>> > > Toke Høiland-Jørgensen (1):
>> > >   mac80211: Import airtime calculation code from mt76
>> > >
>> > >  include/net/cfg80211.h     |   7 +
>> > >  include/net/mac80211.h     |  41 +++
>> > >  net/mac80211/Makefile      |   3 +-
>> > >  net/mac80211/airtime.c     | 597 +++++++++++++++++++++++++++++++++++++
>> > >  net/mac80211/debugfs.c     |  85 ++++++
>> > >  net/mac80211/debugfs_sta.c |  43 ++-
>> > >  net/mac80211/ieee80211_i.h |   8 +
>> > >  net/mac80211/main.c        |  10 +-
>> > >  net/mac80211/sta_info.c    |  38 +++
>> > >  net/mac80211/sta_info.h    |   8 +
>> > >  net/mac80211/tx.c          |  47 ++-
>> > >  11 files changed, 872 insertions(+), 15 deletions(-)
>> > >  create mode 100644 net/mac80211/airtime.c
>> > >
>> > > --
>> > > 2.24.0.rc1.363.gb1bccd3e3d-goog
>> > >



--

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


2019-11-20 00:41:29

by Kan Yan

Subject: Re: [Make-wifi-fast] [PATCH v8 0/2] Implement Airtime-based Queue Limit (AQL)

> Those were lovely, thanks!!!! Big win. Since you are on patch v10
> now.... Any chance you could turn ecn on and off and give it a go
> again in your next test run?
>
>
> Also:
>
> --step-size=.04 --socket-stats # the first is helpful to gain more
> detail, the second as to the behavior of the tcp stack.

Thanks for the feedback! I will do more tests in a few days.


> Secondly - and AFTER this patchset stabilizes, I'd like us to look into
> returning the codel default to 10ms or less,
> from its current setting of 20ms or worse. 'Tis another easy test.

A smaller CoDel "target" doesn't work well with wireless because the
dequeue behavior in the wireless driver is very bursty. It quite often
dequeues dozens of packets in one burst after one large aggregation is
completed, so a smaller CoDel "target" can cause unnecessary packet
drops.
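
To put rough numbers on that, here is a toy Python model (not a measurement). It assumes packets arrive every 250 us and the driver pulls 48 of them in one go each time an aggregate completes; both figures are made up for illustration. Note that CoDel only starts dropping once the sojourn time has stayed above "target" for a full interval, so this counts how many individual samples exceed a target, not how many packets would be dropped.

```python
def burst_sojourn_times_us(arrival_us, burst, bursts):
    """Per-packet sojourn times when packets arrive every arrival_us and the
    driver pulls `burst` packets at once, one batch per aggregate."""
    sojourns = []
    for b in range(bursts):
        # The whole batch is pulled at the moment the next batch starts arriving.
        dequeue_time = (b + 1) * burst * arrival_us
        for i in range(burst):
            enqueue_time = (b * burst + i) * arrival_us
            sojourns.append(dequeue_time - enqueue_time)
    return sojourns


def fraction_above(sojourns, target_us):
    """Share of samples whose sojourn time is strictly above target_us."""
    return sum(1 for s in sojourns if s > target_us) / len(sojourns)


if __name__ == "__main__":
    samples = burst_sojourn_times_us(arrival_us=250, burst=48, bursts=4)
    print("min/max sojourn (us):", min(samples), max(samples))
    for target_ms in (5, 10, 20):
        share = fraction_above(samples, target_ms * 1000)
        print(f"target {target_ms} ms: {share:.0%} of samples above")
```

In this model the oldest packet of every batch has waited a whole batch period before it is seen by the dequeue path, so the spread of samples grows with the burst size even when no standing queue builds up.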



2019-11-20 10:32:16

by Toke Høiland-Jørgensen

Subject: Re: [Make-wifi-fast] [PATCH v8 0/2] Implement Airtime-based Queue Limit (AQL)

Kan Yan writes:

>> Those were lovely, thanks!!!! Big win. Since you are on patch v10
>> now.... Any chance you could turn ecn on and off and give it a go
>> again in your next test run?
>>
>>
>> Also:
>>
>> --step-size=.04 --socket-stats # the first is helpful to gain more
>> detail, the second as to the behavior of the tcp stack.
>
> Thanks for the feedback! I will do more tests in a few days.
>
>
>> Secondly - and AFTER this patchset stabilizes, I'd like us to look into
>> returning the codel default to 10ms or less,
>> from its current setting of 20ms or worse. 'Tis another easy test.
>
> A smaller CoDel "target" doesn't work well with wireless because the
> dequeue behavior in the wireless driver is very bursty. It quite often
> dequeues dozens of packets in one burst after one large aggregation is
> completed, so a smaller CoDel "target" can cause unnecessary packet
> drops.

It would be interesting to get some samples of the actual sojourn time
as seen by CoDel in mac80211. Might be doable with bpftrace...

-Toke


2019-11-21 02:06:25

by Kan Yan

Subject: Re: [Make-wifi-fast] [PATCH v8 0/2] Implement Airtime-based Queue Limit (AQL)

> It would be interesting to get some samples of the actual sojourn time
> as seen by CoDel in mac80211. Might be doable with bpftrace...

I will try to add a trace event to get the sojourn time for the
next round of tests.



2019-11-21 10:09:23

by Toke Høiland-Jørgensen

Subject: Re: [Make-wifi-fast] [PATCH v8 0/2] Implement Airtime-based Queue Limit (AQL)

Kan Yan writes:

>> It would be interesting to get some samples of the actual sojourn time
>> as seen by CoDel in mac80211. Might be doable with bpftrace...
>
> I will try to add a trace event to get the sojourn time for the
> next round of tests.

In theory, this ought to produce a histogram of sojourn times (in
microseconds):

bpftrace -e 'kretprobe:codel_skb_time_func { @sojourn = lhist((nsecs - (retval << 10))/1000, 0, 100000, 1000); }'
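
The arithmetic in that one-liner can be sketched in Python as follows. This assumes the probed function returns a CoDel timestamp as defined in include/net/codel.h, i.e. nanoseconds shifted right by CODEL_SHIFT (10) and truncated to 32 bits; the helper names are made up. If the value really is truncated that way, a plain 64-bit subtraction is only exact while the clock is below 2^42 ns (about 73 minutes), so the sketch subtracts in 32-bit CoDel units instead.

```python
CODEL_SHIFT = 10          # CoDel time unit is 2**10 ns = 1.024 us
MASK32 = 0xFFFFFFFF       # CoDel timestamps are assumed to be 32 bits wide


def codel_time(ns):
    """Convert a nanosecond clock value to a 32-bit CoDel timestamp."""
    return (ns >> CODEL_SHIFT) & MASK32


def sojourn_us(now_ns, enqueue_codel_time):
    """Sojourn time in microseconds, safe across 32-bit timestamp wraparound."""
    delta = (codel_time(now_ns) - enqueue_codel_time) & MASK32
    return (delta << CODEL_SHIFT) // 1000


if __name__ == "__main__":
    enqueued = codel_time(5_000_000_000)           # packet queued at t = 5 s
    print(sojourn_us(5_012_345_678, enqueued))     # roughly 12.3 ms later
```

The result has a resolution of one CoDel unit (1.024 us), which is well below the 1000 us bucket width used in the histogram.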


Can't get the CoDel drop mechanism to trigger on my system at all,
though (a laptop running on iwl). I guess because there's queue
backpressure to userspace first?

It would be interesting to see if it works for you, assuming you can get
bpftrace to work on your test system :)

-Toke