2011-11-14 14:28:28

by Helmut Schaa

[permalink] [raw]
Subject: [PATCH 1/3] mac80211: Minor optimization in minstrel_ht tx status path

Under the assumption that minstrel_ht most likely picks a suitable rate
it makes sense to reorder the check for the "last" rate since in most
cases we won't hit the maximum number of tried rates.

Signed-off-by: Helmut Schaa <[email protected]>
---
net/mac80211/rc80211_minstrel_ht.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c
index cdb2853..63f1821 100644
--- a/net/mac80211/rc80211_minstrel_ht.c
+++ b/net/mac80211/rc80211_minstrel_ht.c
@@ -421,8 +421,8 @@ minstrel_ht_tx_status(void *priv, struct ieee80211_supported_band *sband,
mi->sample_packets += info->status.ampdu_len;

for (i = 0; !last; i++) {
- last = (i == IEEE80211_TX_MAX_RATES - 1) ||
- !minstrel_ht_txstat_valid(&ar[i + 1]);
+ last = !minstrel_ht_txstat_valid(&ar[i + 1]) ||
+ (i == IEEE80211_TX_MAX_RATES - 1);

if (!minstrel_ht_txstat_valid(&ar[i]))
break;
--
1.7.3.4



2011-11-14 14:28:30

by Helmut Schaa

[permalink] [raw]
Subject: [PATCH 2/3] mac80211: Check rate->idx before rate->count

The drivers are not required to fill in rate->count if rate->idx is set
to -1. Hence, we should first check rate->idx before accessing
rate->count.

Signed-off-by: Helmut Schaa <[email protected]>
---
net/mac80211/rc80211_minstrel_ht.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c
index 63f1821..3d76fd7 100644
--- a/net/mac80211/rc80211_minstrel_ht.c
+++ b/net/mac80211/rc80211_minstrel_ht.c
@@ -300,10 +300,10 @@ minstrel_ht_update_stats(struct minstrel_priv *mp, struct minstrel_ht_sta *mi)
static bool
minstrel_ht_txstat_valid(struct ieee80211_tx_rate *rate)
{
- if (!rate->count)
+ if (rate->idx < 0)
return false;

- if (rate->idx < 0)
+ if (!rate->count)
return false;

return !!(rate->flags & IEEE80211_TX_RC_MCS);
--
1.7.3.4


2011-11-16 15:37:59

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: [PATCH 1/3] mac80211: Minor optimization in minstrel_ht tx status path

On Mon, Nov 14, 2011 at 03:28:18PM +0100, Helmut Schaa wrote:
> @@ -421,8 +421,8 @@ minstrel_ht_tx_status(void *priv, struct ieee80211_supported_band *sband,
> mi->sample_packets += info->status.ampdu_len;
>
> for (i = 0; !last; i++) {
> - last = (i == IEEE80211_TX_MAX_RATES - 1) ||
> - !minstrel_ht_txstat_valid(&ar[i + 1]);
> + last = !minstrel_ht_txstat_valid(&ar[i + 1]) ||
> + (i == IEEE80211_TX_MAX_RATES - 1);
This make possible that we read outsite from ar[] border. Normally
that should not couse any troubles, but I think it could confuse
something like kmemcheck. Perhaps whould be better to just add
unlikely(i == IEEE80211_TX_MAX_RATES - 1) ?

Stanislaw

2011-11-14 14:40:53

by Felix Fietkau

[permalink] [raw]
Subject: Re: [PATCH 3/3] mac80211: Get rid of search loop for rate group index

On 2011-11-14 3:28 PM, Helmut Schaa wrote:
> Finding the group index for a specific rate is done by looping through
> all groups and returning if the correct one is found. This code is
> called for each tx'ed frame and thus it makes sense to reduce its
> runtime.
>
> Do this by calculating the group index by this formula based on the SGI
> and HT40 flags as well as the stream number:
>
> idx = (HT40 * 2 * MINSTREL_MAX_STREAMS) +
> (SGI * MINSTREL_MAX_STREAMS) +
> (streams - 1)
>
> Hence, the groups are ordered by th HT40 flag first, then by the SGI
> flag and afterwards by the number of used streams.
>
> This should reduce the runtime of minstrel_ht_get_group_idx
> considerable.
>
> Signed-off-by: Helmut Schaa<[email protected]>
For the whole series:
Acked-by: Felix Fietkau <[email protected]>

2011-11-16 16:02:06

by Helmut Schaa

[permalink] [raw]
Subject: Re: [PATCH 1/3] mac80211: Minor optimization in minstrel_ht tx status path

On Wed, Nov 16, 2011 at 4:39 PM, Stanislaw Gruszka <[email protected]> wrote:
> On Mon, Nov 14, 2011 at 03:28:18PM +0100, Helmut Schaa wrote:
>> @@ -421,8 +421,8 @@ minstrel_ht_tx_status(void *priv, struct ieee80211_supported_band *sband,
>> ? ? ? ? ? ? ? mi->sample_packets += info->status.ampdu_len;
>>
>> ? ? ? for (i = 0; !last; i++) {
>> - ? ? ? ? ? ? last = (i == IEEE80211_TX_MAX_RATES - 1) ||
>> - ? ? ? ? ? ? ? ? ? ?!minstrel_ht_txstat_valid(&ar[i + 1]);
>> + ? ? ? ? ? ? last = !minstrel_ht_txstat_valid(&ar[i + 1]) ||
>> + ? ? ? ? ? ? ? ? ? ?(i == IEEE80211_TX_MAX_RATES - 1);
> This make possible that we read outsite from ar[] border. Normally
> that should not couse any troubles, but I think it could confuse
> something like kmemcheck. Perhaps whould be better to just add
> unlikely(i == IEEE80211_TX_MAX_RATES - 1) ?

Indeed, you're right. John, please drop this patch. The other two are
still fine.

Thanks,
Helmut

2011-11-14 14:28:32

by Helmut Schaa

[permalink] [raw]
Subject: [PATCH 3/3] mac80211: Get rid of search loop for rate group index

Finding the group index for a specific rate is done by looping through
all groups and returning if the correct one is found. This code is
called for each tx'ed frame and thus it makes sense to reduce its
runtime.

Do this by calculating the group index by this formula based on the SGI
and HT40 flags as well as the stream number:

idx = (HT40 * 2 * MINSTREL_MAX_STREAMS) +
(SGI * MINSTREL_MAX_STREAMS) +
(streams - 1)

Hence, the groups are ordered by th HT40 flag first, then by the SGI
flag and afterwards by the number of used streams.

This should reduce the runtime of minstrel_ht_get_group_idx
considerable.

Signed-off-by: Helmut Schaa <[email protected]>
---
net/mac80211/rc80211_minstrel_ht.c | 32 ++++++++++++++++----------------
1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c
index 3d76fd7..bb228e7 100644
--- a/net/mac80211/rc80211_minstrel_ht.c
+++ b/net/mac80211/rc80211_minstrel_ht.c
@@ -36,8 +36,17 @@
/* Transmit duration for the raw data part of an average sized packet */
#define MCS_DURATION(streams, sgi, bps) MCS_SYMBOL_TIME(sgi, MCS_NSYMS((streams) * (bps)))

+/*
+ * Define group sort order: HT40 -> SGI -> #streams
+ */
+#define GROUP_IDX(_streams, _sgi, _ht40) \
+ MINSTREL_MAX_STREAMS * 2 * _ht40 + \
+ MINSTREL_MAX_STREAMS * _sgi + \
+ _streams - 1
+
/* MCS rate information for an MCS group */
-#define MCS_GROUP(_streams, _sgi, _ht40) { \
+#define MCS_GROUP(_streams, _sgi, _ht40) \
+ [GROUP_IDX(_streams, _sgi, _ht40)] = { \
.streams = _streams, \
.flags = \
(_sgi ? IEEE80211_TX_RC_SHORT_GI : 0) | \
@@ -58,6 +67,9 @@
* To enable sufficiently targeted rate sampling, MCS rates are divided into
* groups, based on the number of streams and flags (HT40, SGI) that they
* use.
+ *
+ * Sortorder has to be fixed for GROUP_IDX macro to be applicable:
+ * HT40 -> SGI -> #streams
*/
const struct mcs_group minstrel_mcs_groups[] = {
MCS_GROUP(1, 0, 0),
@@ -102,21 +114,9 @@ minstrel_ewma(int old, int new, int weight)
static int
minstrel_ht_get_group_idx(struct ieee80211_tx_rate *rate)
{
- int streams = (rate->idx / MCS_GROUP_RATES) + 1;
- u32 flags = IEEE80211_TX_RC_SHORT_GI | IEEE80211_TX_RC_40_MHZ_WIDTH;
- int i;
-
- for (i = 0; i < ARRAY_SIZE(minstrel_mcs_groups); i++) {
- if (minstrel_mcs_groups[i].streams != streams)
- continue;
- if (minstrel_mcs_groups[i].flags != (rate->flags & flags))
- continue;
-
- return i;
- }
-
- WARN_ON(1);
- return 0;
+ return GROUP_IDX((rate->idx / MCS_GROUP_RATES) + 1,
+ !!(rate->flags & IEEE80211_TX_RC_SHORT_GI),
+ !!(rate->flags & IEEE80211_TX_RC_40_MHZ_WIDTH));
}

static inline struct minstrel_rate_stats *
--
1.7.3.4