2017-02-14 20:27:26

by Rajkumar Manoharan

Subject: [PATCH 1/2] mac80211: use DECLARE_EWMA for mesh_fail_avg

The moving average does not take the fractional part into account,
so it gets stuck at the same level once it reaches a certain state.
For example, with the current values the moving average gets stuck
at 96 and does not move any further. Fortunately, the current
threshold checks against 95%. If the threshold were raised above
96, the mesh path would never be deactivated, even in the worst
case. Fix the failure average movement by using the EWMA helpers.

Signed-off-by: Rajkumar Manoharan <[email protected]>
---
net/mac80211/mesh_hwmp.c | 21 +++++++++++++++------
net/mac80211/mesh_pathtbl.c | 3 +++
net/mac80211/sta_info.h | 4 +++-
3 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index b747c9645e43..d07ee3ca07ee 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -307,10 +307,11 @@ void ieee80211s_update_metric(struct ieee80211_local *local,

failed = !(txinfo->flags & IEEE80211_TX_STAT_ACK);

- /* moving average, scaled to 100 */
- sta->mesh->fail_avg =
- ((80 * sta->mesh->fail_avg + 5) / 100 + 20 * failed);
- if (sta->mesh->fail_avg > 95)
+ /* moving average, scaled to 100.
+ * feed failure as 100 and success as 0
+ */
+ ewma_mesh_fail_avg_add(&sta->mesh->fail_avg, failed * 100);
+ if (ewma_mesh_fail_avg_read(&sta->mesh->fail_avg) > 95)
mesh_plink_broken(sta);
}

@@ -325,6 +326,8 @@ static u32 airtime_link_metric_get(struct ieee80211_local *local,
int rate, err;
u32 tx_time, estimated_retx;
u64 result;
+ unsigned long fail_avg =
+ ewma_mesh_fail_avg_read(&sta->mesh->fail_avg);

/* Try to get rate based on HW/SW RC algorithm.
* Rate is returned in units of Kbps, correct this
@@ -336,7 +339,7 @@ static u32 airtime_link_metric_get(struct ieee80211_local *local,
if (rate) {
err = 0;
} else {
- if (sta->mesh->fail_avg >= 100)
+ if (fail_avg >= 100)
return MAX_METRIC;

sta_set_rate_info_tx(sta, &sta->tx_stats.last_rate, &rinfo);
@@ -344,7 +347,7 @@ static u32 airtime_link_metric_get(struct ieee80211_local *local,
if (WARN_ON(!rate))
return MAX_METRIC;

- err = (sta->mesh->fail_avg << ARITH_SHIFT) / 100;
+ err = (fail_avg << ARITH_SHIFT) / 100;
}

/* bitrate is in units of 100 Kbps, while we need rate in units of
@@ -484,6 +487,9 @@ static u32 hwmp_route_info_get(struct ieee80211_sub_if_data *sdata,
? mpath->exp_time : exp_time;
mesh_path_activate(mpath);
spin_unlock_bh(&mpath->state_lock);
+ ewma_mesh_fail_avg_init(&sta->mesh->fail_avg);
+ /* init it at a low value - 0 start is tricky */
+ ewma_mesh_fail_avg_add(&sta->mesh->fail_avg, 1);
mesh_path_tx_pending(mpath);
/* draft says preq_id should be saved to, but there does
* not seem to be any use for it, skipping by now
@@ -522,6 +528,9 @@ static u32 hwmp_route_info_get(struct ieee80211_sub_if_data *sdata,
? mpath->exp_time : exp_time;
mesh_path_activate(mpath);
spin_unlock_bh(&mpath->state_lock);
+ ewma_mesh_fail_avg_init(&sta->mesh->fail_avg);
+ /* init it at a low value - 0 start is tricky */
+ ewma_mesh_fail_avg_add(&sta->mesh->fail_avg, 1);
mesh_path_tx_pending(mpath);
} else
spin_unlock_bh(&mpath->state_lock);
diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index f0e6175a9821..98a3b1c0c338 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -829,6 +829,9 @@ void mesh_path_fix_nexthop(struct mesh_path *mpath, struct sta_info *next_hop)
mpath->flags = MESH_PATH_FIXED | MESH_PATH_SN_VALID;
mesh_path_activate(mpath);
spin_unlock_bh(&mpath->state_lock);
+ ewma_mesh_fail_avg_init(&next_hop->mesh->fail_avg);
+ /* init it at a low value - 0 start is tricky */
+ ewma_mesh_fail_avg_add(&next_hop->mesh->fail_avg, 1);
mesh_path_tx_pending(mpath);
}

diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index dd06ef0b8861..d9010f29de3d 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -322,6 +322,8 @@ struct ieee80211_fast_rx {
struct rcu_head rcu_head;
};

+DECLARE_EWMA(mesh_fail_avg, 64, 8)
+
/**
* struct mesh_sta - mesh STA information
* @plink_lock: serialize access to plink fields
@@ -367,7 +369,7 @@ struct mesh_sta {
enum nl80211_mesh_power_mode nonpeer_pm;

/* moving percentage of failed MSDUs */
- unsigned int fail_avg;
+ struct ewma_mesh_fail_avg fail_avg;
};

DECLARE_EWMA(signal, 1024, 8)
--
1.9.1
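
The stall described in the [PATCH 1/2] commit message can be reproduced
outside the kernel. The following is a minimal user-space sketch of the
pre-patch formula taken from the removed lines of the diff; the function
names are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/*
 * One step of the pre-patch integer moving average (scaled to 100),
 * copied from the lines removed in mesh_hwmp.c. The integer division
 * drops the fractional part on every step.
 */
static unsigned int old_fail_avg_step(unsigned int avg, int failed)
{
	return (80 * avg + 5) / 100 + 20 * (failed ? 1u : 0u);
}

/* Hypothetical helper: start at 0 and feed n consecutive failures. */
static unsigned int old_fail_avg_after_failures(unsigned int n)
{
	unsigned int avg = 0;

	while (n--)
		avg = old_fail_avg_step(avg, 1);
	return avg;
}

/* Hypothetical helper: print the trajectory for the first n failures. */
static void old_fail_avg_print(unsigned int n)
{
	unsigned int avg = 0, i;

	for (i = 1; i <= n; i++) {
		avg = old_fail_avg_step(avg, 1);
		printf("failure %u -> fail_avg %u\n", i, avg);
	}
}
```

Under sustained failures the value reaches 96 after 17 steps and stays
there, because (80 * 96 + 5) / 100 truncates to 76 and 76 + 20 is 96
again. A threshold above 96 would therefore never be crossed.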


2017-02-14 20:28:37

by Rajkumar Manoharan

Subject: [PATCH 2/2] mac80211: fix mesh fail_avg check

The mesh failure average can never be more than 100. Only in the
case of a fixed path will the average exceed the threshold limit
(95%). With the recent EWMA changes it may go up to 99, as it is
scaled to 100. It makes sense to return the maximum metric when the
average is greater than the threshold limit.

Signed-off-by: Rajkumar Manoharan <[email protected]>
---
net/mac80211/mesh_hwmp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index d07ee3ca07ee..02c30a21eb66 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -339,7 +339,7 @@ static u32 airtime_link_metric_get(struct ieee80211_local *local,
if (rate) {
err = 0;
} else {
- if (fail_avg >= 100)
+ if (fail_avg >= 95)
return MAX_METRIC;

sta_set_rate_info_tx(sta, &sta->tx_stats.last_rate, &rinfo);
--
1.9.1

2017-02-15 08:14:42

by Johannes Berg

Subject: Re: [PATCH 2/2] mac80211: fix mesh fail_avg check

On Tue, 2017-02-14 at 12:27 -0800, Rajkumar Manoharan wrote:
> The mesh failure average can never be more than 100. Only in the
> case of a fixed path will the average exceed the threshold limit
> (95%). With the recent EWMA changes it may go up to 99, as it is
> scaled to 100. It makes sense to return the maximum metric when the
> average is greater than the threshold limit.
>
> Signed-off-by: Rajkumar Manoharan <[email protected]>
> ---
>  net/mac80211/mesh_hwmp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
> index d07ee3ca07ee..02c30a21eb66 100644
> --- a/net/mac80211/mesh_hwmp.c
> +++ b/net/mac80211/mesh_hwmp.c
> @@ -339,7 +339,7 @@ static u32 airtime_link_metric_get(struct
> ieee80211_local *local,
>   if (rate) {
>   err = 0;
>   } else {
> - if (fail_avg >= 100)
> + if (fail_avg >= 95)
>   return MAX_METRIC;

Why is this >= and the other place is >?

Also, I think it'd be good to introduce a #define for this value now,
perhaps something like "LINK_FAIL_THRESH".

johannes
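
The two review points above (the `>` versus `>=` mismatch and the missing
#define) can be sketched together. This is only an illustration: the
macro name is the one suggested in the review, the helper function is
hypothetical, and the choice of a strict `>` is an assumption, since the
thread does not settle which comparison is intended.

```c
#include <stdbool.h>

/* Name suggested in the review; 95 is the value used by both patches. */
#define LINK_FAIL_THRESH 95

/*
 * Hypothetical helper shared by both call sites, so that
 * ieee80211s_update_metric() and airtime_link_metric_get() cannot
 * disagree at the boundary value. Using '>' here is an assumption.
 */
static bool link_fail_exceeded(unsigned long fail_avg)
{
	return fail_avg > LINK_FAIL_THRESH;
}
```

With the patches as posted, a fail_avg of exactly 95 returns MAX_METRIC
from the metric code but does not yet mark the peer link as broken; a
single shared predicate removes that difference whichever comparison is
chosen.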

2017-02-15 08:14:01

by Johannes Berg

Subject: Re: [PATCH 1/2] mac80211: use DECLARE_EWMA for mesh_fail_avg

On Tue, 2017-02-14 at 12:27 -0800, Rajkumar Manoharan wrote:
> The moving average does not take the fractional part into account,
> so it gets stuck at the same level once it reaches a certain state.
> For example, with the current values the moving average gets stuck
> at 96 and does not move any further. Fortunately, the current
> threshold checks against 95%. If the threshold were raised above
> 96, the mesh path would never be deactivated, even in the worst
> case. Fix the failure average movement by using the EWMA helpers.

Thanks, applied.

> +DECLARE_EWMA(mesh_fail_avg, 64, 8)

Since we only feed in small values (0-100), I picked a much larger
factor (1<<20) to give more precision here.

johannes
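
The behaviour of the fixed-point average discussed in this thread can be
modelled in user space. The sketch below is not the kernel's
DECLARE_EWMA code: the struct, the function names and the parameters are
hypothetical, and it assumes that an empty (zero) average is seeded
directly with the first sample, which is an assumption about the helper
of that era and is not stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of a fixed-point EWMA. */
struct ewma_model {
	uint64_t internal;           /* average scaled by 2^precision_bits */
	unsigned int precision_bits; /* 6 models a factor of 64 */
	unsigned int weight_shift;   /* 3 models a weight of 8 */
};

static void ewma_model_init(struct ewma_model *e,
			    unsigned int precision_bits,
			    unsigned int weight_shift)
{
	/* Keep shifts small so the 64-bit state cannot overflow here. */
	assert(precision_bits <= 32 && weight_shift <= 8);
	e->internal = 0;
	e->precision_bits = precision_bits;
	e->weight_shift = weight_shift;
}

static void ewma_model_add(struct ewma_model *e, uint64_t val)
{
	uint64_t scaled = val << e->precision_bits;

	/* Assumed: a zero state takes the first sample as-is. */
	e->internal = e->internal ?
		(((e->internal << e->weight_shift) - e->internal) + scaled)
			>> e->weight_shift :
		scaled;
}

static uint64_t ewma_model_read(const struct ewma_model *e)
{
	return e->internal >> e->precision_bits;
}
```

Under these assumptions, a freshly initialised average that receives one
failure (100) reads 100 at once, which would cross the 95 threshold on
the first lost frame; that is one plausible reading of the "0 start is
tricky" comment. Seeding with 1 first, one failure brings the reading to
13, and sustained failures settle at 99 rather than 100 because the
truncating shift keeps the state just below the target. This matches the
"up to 99" remark in [PATCH 2/2], and in this model it holds for both 6
and 20 precision bits.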