2021-10-29 07:26:32

by Johannes Berg

[permalink] [raw]
Subject: [PATCH] nl80211: fix radio statistics in survey dump

From: Johannes Berg <[email protected]>

Even if userspace specifies the NL80211_ATTR_SURVEY_RADIO_STATS
attribute, we cannot get the statistics because we're not really
parsing the incoming attributes properly any more.

Fix this by passing the attrbuf to nl80211_prepare_wdev_dump()
and filling it there, if given, and using a local version only
if no output is desired.

Since I'm touching it anyway, make nl80211_prepare_wdev_dump()
static.

Fixes: 50508d941c18 ("cfg80211: use parallel_ops for genl")
Reported-by: Jan Fuchs <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
---
net/wireless/nl80211.c | 34 +++++++++++++++++++---------------
net/wireless/nl80211.h | 6 +-----
2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 25ee16558dfa..83a1ba96e172 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -937,33 +937,37 @@ nl80211_packet_pattern_policy[MAX_NL80211_PKTPAT + 1] = {
[NL80211_PKTPAT_OFFSET] = { .type = NLA_U32 },
};

-int nl80211_prepare_wdev_dump(struct netlink_callback *cb,
- struct cfg80211_registered_device **rdev,
- struct wireless_dev **wdev)
+static int nl80211_prepare_wdev_dump(struct netlink_callback *cb,
+ struct cfg80211_registered_device **rdev,
+ struct wireless_dev **wdev,
+ struct nlattr **attrbuf)
{
int err;

if (!cb->args[0]) {
- struct nlattr **attrbuf;
+ struct nlattr **attrbuf_free = NULL;

- attrbuf = kcalloc(NUM_NL80211_ATTR, sizeof(*attrbuf),
- GFP_KERNEL);
- if (!attrbuf)
- return -ENOMEM;
+ if (!attrbuf) {
+ attrbuf = kcalloc(NUM_NL80211_ATTR, sizeof(*attrbuf),
+ GFP_KERNEL);
+ if (!attrbuf)
+ return -ENOMEM;
+ attrbuf_free = attrbuf;
+ }

err = nlmsg_parse_deprecated(cb->nlh,
GENL_HDRLEN + nl80211_fam.hdrsize,
attrbuf, nl80211_fam.maxattr,
nl80211_policy, NULL);
if (err) {
- kfree(attrbuf);
+ kfree(attrbuf_free);
return err;
}

rtnl_lock();
*wdev = __cfg80211_wdev_from_attrs(NULL, sock_net(cb->skb->sk),
attrbuf);
- kfree(attrbuf);
+ kfree(attrbuf_free);
if (IS_ERR(*wdev)) {
rtnl_unlock();
return PTR_ERR(*wdev);
@@ -6198,7 +6202,7 @@ static int nl80211_dump_station(struct sk_buff *skb,
int sta_idx = cb->args[2];
int err;

- err = nl80211_prepare_wdev_dump(cb, &rdev, &wdev);
+ err = nl80211_prepare_wdev_dump(cb, &rdev, &wdev, NULL);
if (err)
return err;
/* nl80211_prepare_wdev_dump acquired it in the successful case */
@@ -7093,7 +7097,7 @@ static int nl80211_dump_mpath(struct sk_buff *skb,
int path_idx = cb->args[2];
int err;

- err = nl80211_prepare_wdev_dump(cb, &rdev, &wdev);
+ err = nl80211_prepare_wdev_dump(cb, &rdev, &wdev, NULL);
if (err)
return err;
/* nl80211_prepare_wdev_dump acquired it in the successful case */
@@ -7293,7 +7297,7 @@ static int nl80211_dump_mpp(struct sk_buff *skb,
int path_idx = cb->args[2];
int err;

- err = nl80211_prepare_wdev_dump(cb, &rdev, &wdev);
+ err = nl80211_prepare_wdev_dump(cb, &rdev, &wdev, NULL);
if (err)
return err;
/* nl80211_prepare_wdev_dump acquired it in the successful case */
@@ -9723,7 +9727,7 @@ static int nl80211_dump_scan(struct sk_buff *skb, struct netlink_callback *cb)
int start = cb->args[2], idx = 0;
int err;

- err = nl80211_prepare_wdev_dump(cb, &rdev, &wdev);
+ err = nl80211_prepare_wdev_dump(cb, &rdev, &wdev, NULL);
if (err)
return err;
/* nl80211_prepare_wdev_dump acquired it in the successful case */
@@ -9856,7 +9860,7 @@ static int nl80211_dump_survey(struct sk_buff *skb, struct netlink_callback *cb)
if (!attrbuf)
return -ENOMEM;

- res = nl80211_prepare_wdev_dump(cb, &rdev, &wdev);
+ res = nl80211_prepare_wdev_dump(cb, &rdev, &wdev, attrbuf);
if (res) {
kfree(attrbuf);
return res;
diff --git a/net/wireless/nl80211.h b/net/wireless/nl80211.h
index a3f387770f1b..d642e3be4ee7 100644
--- a/net/wireless/nl80211.h
+++ b/net/wireless/nl80211.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Portions of this file
- * Copyright (C) 2018, 2020 Intel Corporation
+ * Copyright (C) 2018, 2020-2021 Intel Corporation
*/
#ifndef __NET_WIRELESS_NL80211_H
#define __NET_WIRELESS_NL80211_H
@@ -22,10 +22,6 @@ static inline u64 wdev_id(struct wireless_dev *wdev)
((u64)wiphy_to_rdev(wdev->wiphy)->wiphy_idx << 32);
}

-int nl80211_prepare_wdev_dump(struct netlink_callback *cb,
- struct cfg80211_registered_device **rdev,
- struct wireless_dev **wdev);
-
int nl80211_parse_chandef(struct cfg80211_registered_device *rdev,
struct genl_info *info,
struct cfg80211_chan_def *chandef);
--
2.31.1


2021-10-29 08:47:21

by Sven Eckelmann

[permalink] [raw]
Subject: Re: [PATCH] nl80211: fix radio statistics in survey dump

On Friday, 29 October 2021 09:25:39 CEST Johannes Berg wrote:
> From: Johannes Berg <[email protected]>
>
> Even if userspace specifies the NL80211_ATTR_SURVEY_RADIO_STATS
> attribute, we cannot get the statistics because we're not really
> parsing the incoming attributes properly any more.
>
> Fix this by passing the attrbuf to nl80211_prepare_wdev_dump()
> and filling it there, if given, and using a local version only
> if no output is desired.
>
> Since I'm touching it anyway, make nl80211_prepare_wdev_dump()
> static.

Tested-by: Sven Eckelmann <[email protected]>

Thanks for checking the problem. I've tested it together with
https://patchwork.kernel.org/project/linux-wireless/patch/[email protected]/
on an AX200 and firmware version 59.601f3a66.0 cc-a0-59.ucode.

$ sudo ./iw dev wlan0 survey dump --radio
Survey data from wlan0
channel active time: 17205356 ms
channel receive time: 653 ms
channel transmit time: 74 ms

But you might have noticed that the channel active time is off by a lot. Just
after a couple of seconds:

sudo ./iw dev wlan0 survey dump --radio
Survey data from wlan0
channel active time: 38689859 ms
channel receive time: 904 ms
channel transmit time: 128 ms

Here are the times from a loop with a delay of roughly one second:

channel active time: 47284414 ms
channel active time: 47284455 ms
channel active time: 47284496 ms
channel active time: 47284637 ms
channel active time: 47284742 ms
channel active time: 47284788 ms
channel active time: 47284860 ms
channel active time: 47284871 ms
channel active time: 47284883 ms
channel active time: 47284911 ms
channel active time: 47285040 ms
channel active time: 47285280 ms
channel active time: 47285449 ms
channel active time: 47285745 ms
channel active time: 47285886 ms
channel active time: 47286074 ms
channel active time: 47286108 ms
channel active time: 47286120 ms
channel active time: 47286154 ms
channel active time: 47286235 ms
channel active time: 47286247 ms
channel active time: 47286405 ms
channel active time: 47286418 ms
channel active time: 51581468 ms
channel active time: 51581569 ms
channel active time: 51581765 ms
channel active time: 51581795 ms
channel active time: 51581807 ms
channel active time: 51581818 ms
channel active time: 51581836 ms
channel active time: 51581877 ms
channel active time: 51582130 ms
channel active time: 51582205 ms
channel active time: 51582219 ms
channel active time: 51582269 ms
channel active time: 51582319 ms
channel active time: 51582379 ms
channel active time: 51582430 ms
channel active time: 51582491 ms
channel active time: 51582694 ms
channel active time: 51582870 ms
channel active time: 51582926 ms
channel active time: 51582946 ms
channel active time: 51583078 ms
channel active time: 51583111 ms
channel active time: 51583120 ms
channel active time: 51583129 ms
channel active time: 51583142 ms
channel active time: 51583172 ms
channel active time: 51583186 ms
channel active time: 51583331 ms
channel active time: 51583449 ms
channel active time: 51583649 ms
channel active time: 51583776 ms
channel active time: 51583789 ms
channel active time: 51583801 ms
channel active time: 51583821 ms
channel active time: 55878795 ms
channel active time: 55878828 ms
channel active time: 55878853 ms
channel active time: 55878863 ms
channel active time: 55879005 ms
channel active time: 55879051 ms
channel active time: 55879114 ms
channel active time: 55879144 ms
channel active time: 55879169 ms
channel active time: 55879313 ms
channel active time: 55879610 ms
channel active time: 55879626 ms
channel active time: 55879641 ms
channel active time: 55879786 ms
channel active time: 55879931 ms
channel active time: 55880082 ms
channel active time: 55880114 ms
channel active time: 55880145 ms
channel active time: 55880304 ms
channel active time: 55880342 ms
channel active time: 55880397 ms
channel active time: 55880407 ms
channel active time: 55880595 ms
channel active time: 55880653 ms
channel active time: 60175670 ms
channel active time: 60175702 ms
channel active time: 60175714 ms
channel active time: 60175863 ms
channel active time: 60175876 ms
channel active time: 60175887 ms
channel active time: 60175925 ms
channel active time: 60175935 ms
channel active time: 60175950 ms
channel active time: 60175957 ms
channel active time: 60175968 ms
channel active time: 60176020 ms

I haven't checked if these jumps are related to anything specific or not. But
if you multiple the values by 1000 (msec -> usec) and then create a diff
between the previous values then you see that these values are also all over
the place and not roughly ~1000000.

If you just read the mvm->radio_stats.on_time_rf (in usec) then you see following:

channel active time: 2346292 us
channel active time: 3357380 us
channel active time: 4368516 us
channel active time: 4567172 us
channel active time: 126139 us
channel active time: 196605 us
channel active time: 214119 us
channel active time: 234556 us
channel active time: 293402 us
channel active time: 311971 us
channel active time: 322606 us
channel active time: 368173 us
channel active time: 382785 us
channel active time: 420408 us
channel active time: 430851 us
channel active time: 442990 us
channel active time: 458566 us
channel active time: 54960 us
channel active time: 67810 us
channel active time: 187747 us
channel active time: 4295188641 us
channel active time: 4295229424 us
channel active time: 16295 us
channel active time: 160736 us
channel active time: 330166 us
channel active time: 340849 us
channel active time: 356933 us
channel active time: 562579 us
channel active time: 628140 us
channel active time: 678766 us
channel active time: 736022 us
channel active time: 751099 us
channel active time: 769377 us
channel active time: 787728 us
channel active time: 822183 us
channel active time: 832478 us
channel active time: 863107 us
channel active time: 915469 us
channel active time: 52166 us
channel active time: 201828 us
channel active time: 364236 us
channel active time: 387916 us
channel active time: 422965 us
channel active time: 8560 us
channel active time: 4295006989 us
channel active time: 4295020943 us
channel active time: 4295051766 us
channel active time: 4295086037 us
channel active time: 4295119851 us
channel active time: 4295157051 us
channel active time: 4295193488 us
channel active time: 4295247769 us
channel active time: 4295302615 us
channel active time: 4295315627 us
channel active time: 4295352876 us
channel active time: 45385 us
channel active time: 121871 us
channel active time: 142972 us
channel active time: 262344 us
channel active time: 418666 us

So it also jumps all over the place. This could be investigated further but I
just wanted to mention it here.

Kind regards,
Sven


Attachments:
signature.asc (849.00 B)
This is a digitally signed message part.

2021-11-02 11:13:03

by Sven Eckelmann

[permalink] [raw]
Subject: Re: [PATCH] nl80211: fix radio statistics in survey dump

On Friday, 29 October 2021 10:46:43 CET Sven Eckelmann wrote:
> If you just read the mvm->radio_stats.on_time_rf (in usec) then you see following:
[...]
> channel active time: 8560 us
> channel active time: 4295006989 us
> channel active time: 4295020943 us
> channel active time: 4295051766 us
> channel active time: 4295086037 us
> channel active time: 4295119851 us
> channel active time: 4295157051 us
> channel active time: 4295193488 us
> channel active time: 4295247769 us
> channel active time: 4295302615 us
> channel active time: 4295315627 us
> channel active time: 4295352876 us
> channel active time: 45385 us
> channel active time: 121871 us
> channel active time: 142972 us
> channel active time: 262344 us
> channel active time: 418666 us
>
> So it also jumps all over the place. This could be investigated further but I
> just wanted to mention it here.

Sorry, wanted to write more about it last week but forgot about it. If I
basically filter out the upper 32 bit in mvm->radio_stats.on_time_rf then it
didn't look that bad on a AX210. It seems like the upper bits is sometimes
0x00000001 for unknown reasons. Like it would be some kind of flag which
should indicate some kind of change/event. So maybe the firmware team could
check what this means.

It is not really urgent - I just got interested in the problem :)

Thanks,
Sven


Attachments:
signature.asc (849.00 B)
This is a digitally signed message part.

2021-11-02 12:08:00

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] nl80211: fix radio statistics in survey dump

On Tue, 2021-11-02 at 12:12 +0100, Sven Eckelmann wrote:
>
> Sorry, wanted to write more about it last week but forgot about it. If I
> basically filter out the upper 32 bit in mvm->radio_stats.on_time_rf then it
> didn't look that bad on a AX210. It seems like the upper bits is sometimes
> 0x00000001 for unknown reasons. Like it would be some kind of flag which
> should indicate some kind of change/event. So maybe the firmware team could
> check what this means.

Well, I checked, and it *is* just a u64 value.

However, I suspect it sometimes underflows when powersave time is
accounted into it, or something?

johannes