2014-02-26 01:09:49

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH v2 0/3] cfg80211: respin reprocessing pending requests

Here's a respin. There are no code changes, I just adjusted the
commit logs as that's all the feedback I got from them, and on
the last patch I just don't think a timer is best. If a timer
is desired let me know how much time you want to set it for,
I just don't see how that's any better.

The commit log on the other ones should help let people make
decisions if the implications of not merging this, I'll let
folks make the call about whether or not some of these belong
to stable, I personally don't think so -- but some others may
disagree.

Luis R. Rodriguez (3):
cfg80211: allow reprocessing of pending requests
cfg80211: fix processing world regdomain when non modular
cfg80211: processing regulatory requests on netdev notifier

net/wireless/reg.c | 16 +++++++++-------
net/wireless/reg.h | 1 +
2 files changed, 10 insertions(+), 7 deletions(-)

--
1.8.5.3



2014-02-26 01:10:05

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH 3/3] cfg80211: processing regulatory requests on netdev notifier

This adds a trigger to review any pending regulatory
requests whenever an 802.11 device interface is brought
down or up. We use this as an opportunistic trigger
for checking the regulatory work queues as otherwise
they they're only checked upon an initial regulatory
request or when beacon hints are found.

This opportunistic mechanism can be used to trigger
kicking the queues regulatory queues at any time from
userspace without having to change the regulatory state.

A timer just waiting upon init is not that appropriate
as when it should be triggered will depend on systems,
a much better approach is to use and add opportunistic
triggers. The interface coming up is typically a good
indicator that filesystems have been mounted since
firmware is required for some devices.

Reported-by: Sander Eikelenboom <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
net/wireless/reg.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/wireless/reg.h b/net/wireless/reg.h
index 37c180d..51912b3 100644
--- a/net/wireless/reg.h
+++ b/net/wireless/reg.h
@@ -37,6 +37,7 @@ unsigned int reg_get_max_bandwidth(const struct ieee80211_regdomain *rd,
const struct ieee80211_reg_rule *rule);

bool reg_last_request_cell_base(void);
+void reg_process_pending_work(void);

/**
* regulatory_hint_found_beacon - hints a beacon was found on a channel
--
1.8.5.3


2014-02-27 17:21:02

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH 3/3] cfg80211: processing regulatory requests on netdev notifier

On Thu, Feb 27, 2014 at 5:21 AM, Arik Nemtsov <[email protected]> wrote:
> Seems the actual patch is missing?

Yeah sorry I sent out a v3.

Luis

2014-02-26 01:09:55

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH 1/3] cfg80211: allow reprocessing of pending requests

In certain situations we want to trigger reprocessing
of the last regulatory hint. One situation in which
this makes sense is the case where the cfg80211 was
built-in to the kernel, CFG80211_INTERNAL_REGDB was not
enabled and the CRDA binary is on a partition not availble
during early boot. In such a case we want to be able to
re-process the same request at some other point.

When we are asked to re-process the same request we need
to be careful to not kfree it, addresses that.

Reported-by: Sander Eikelenboom <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
net/wireless/reg.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index b95e9cf..f5b120f 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -240,19 +240,21 @@ static char user_alpha2[2];
module_param(ieee80211_regdom, charp, 0444);
MODULE_PARM_DESC(ieee80211_regdom, "IEEE 802.11 regulatory domain code");

-static void reg_kfree_last_request(void)
+static void reg_kfree_last_request(struct regulatory_request *lr)
{
- struct regulatory_request *lr;
-
- lr = get_last_request();
-
if (lr != &core_request_world && lr)
kfree_rcu(lr, rcu_head);
}

static void reg_update_last_request(struct regulatory_request *request)
{
- reg_kfree_last_request();
+ struct regulatory_request *lr;
+
+ lr = get_last_request();
+ if (lr == request)
+ return;
+
+ reg_kfree_last_request(lr);
rcu_assign_pointer(last_request, request);
}

--
1.8.5.3


2014-02-27 20:31:55

by Arik Nemtsov

[permalink] [raw]
Subject: Re: [PATCH 3/3] cfg80211: processing regulatory requests on netdev notifier

On Thu, Feb 27, 2014 at 7:20 PM, Luis R. Rodriguez
<[email protected]> wrote:
> On Thu, Feb 27, 2014 at 5:21 AM, Arik Nemtsov <[email protected]> wrote:
>> Seems the actual patch is missing?
>
> Yeah sorry I sent out a v3.

Yea finally noticed. Thanks.

Arik

2014-02-27 13:21:51

by Arik Nemtsov

[permalink] [raw]
Subject: Re: [PATCH 3/3] cfg80211: processing regulatory requests on netdev notifier

On Wed, Feb 26, 2014 at 3:09 AM, Luis R. Rodriguez
<[email protected]> wrote:
> This adds a trigger to review any pending regulatory
> requests whenever an 802.11 device interface is brought
> down or up. We use this as an opportunistic trigger
> for checking the regulatory work queues as otherwise
> they they're only checked upon an initial regulatory
> request or when beacon hints are found.
>
> This opportunistic mechanism can be used to trigger
> kicking the queues regulatory queues at any time from
> userspace without having to change the regulatory state.
>
> A timer just waiting upon init is not that appropriate
> as when it should be triggered will depend on systems,
> a much better approach is to use and add opportunistic
> triggers. The interface coming up is typically a good
> indicator that filesystems have been mounted since
> firmware is required for some devices.
>
> Reported-by: Sander Eikelenboom <[email protected]>
> Signed-off-by: Luis R. Rodriguez <[email protected]>
> ---
> net/wireless/reg.h | 1 +
> 1 file changed, 1 insertion(+)

Seems the actual patch is missing?

Arik

2014-02-26 07:41:44

by Sander Eikelenboom

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] cfg80211: respin reprocessing pending requests


Wednesday, February 26, 2014, 2:09:39 AM, you wrote:

> Here's a respin. There are no code changes, I just adjusted the
> commit logs as that's all the feedback I got from them, and on
> the last patch I just don't think a timer is best. If a timer
> is desired let me know how much time you want to set it for,
> I just don't see how that's any better.

> The commit log on the other ones should help let people make
> decisions if the implications of not merging this, I'll let
> folks make the call about whether or not some of these belong
> to stable, I personally don't think so -- but some others may
> disagree.

I tested the original series and these fixed my problem and i did not
notice any regressions. Since there are no code changes:

Tested-By: Sander Eikelenboom <[email protected]>

Thanks Luis !
Hopefully it is going somewhere now after so many months.

-
Sander

> Luis R. Rodriguez (3):
> cfg80211: allow reprocessing of pending requests
> cfg80211: fix processing world regdomain when non modular
> cfg80211: processing regulatory requests on netdev notifier

> net/wireless/reg.c | 16 +++++++++-------
> net/wireless/reg.h | 1 +
> 2 files changed, 10 insertions(+), 7 deletions(-)



2014-02-26 01:10:00

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

This allows processing of the last regulatory request when
we determine its still pending. Without this if a regulatory
request failed to get processed by userspace we wouldn't
be able to re-process it later. An example situation that can
lead to an unprocessed last_request is enabling cfg80211 to
be built-in to the kernel, not enabling CFG80211_INTERNAL_REGDB
and the CRDA binary not being available at the time the udev
rule that kicks of CRDA triggers.

In such a situation we want to let some cfg80211 triggers
eventually kick CRDA for us again. Without this if the first
cycle attempt to kick off CRDA failed we'd be stuck without
the ability to change process any further regulatory domains.

cfg80211 will trigger re-processing of the regulatory queue
whenever schedule_work(&reg_work) is called, currently this
happens when:

* suspend / resume
* disconnect
* a beacon hint gets triggered (non DFS 5 GHz AP found)
* a regulatory request gets added to the queue

We don't have any specific opportunistic late boot triggers
to address a late mount of where CRDA resides though, adding
that should be done separately through another patch.
Without an opportunistic fix then this fix relies at least
one of the triggeres above to happen.

Reported-by: Sander Eikelenboom <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
net/wireless/reg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index f5b120f..7203b74 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -1857,7 +1857,7 @@ static void reg_process_pending_hints(void)

/* When last_request->processed becomes true this will be rescheduled */
if (lr && !lr->processed) {
- REG_DBG_PRINT("Pending regulatory request, waiting for it to be processed...\n");
+ reg_process_hint(lr);
return;
}

--
1.8.5.3


2014-03-19 14:01:10

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Fri, 2014-03-14 at 13:30 -0700, Colleen T wrote:
> Hi guys,
>
> This commit -- 5a970df8990d173e7e4092952f2e3da1de69b27d -- is causing
> a regression on mac80211-next/master in our mesh test framework on
> qemu. We are using cfg80211 as a module.

I'm reverting that commit.

johannes


2014-03-03 13:10:51

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Tue, 2014-02-25 at 17:09 -0800, Luis R. Rodriguez wrote:
> This allows processing of the last regulatory request when
> we determine its still pending. Without this if a regulatory
> request failed to get processed by userspace we wouldn't
> be able to re-process it later. An example situation that can
> lead to an unprocessed last_request is enabling cfg80211 to
> be built-in to the kernel, not enabling CFG80211_INTERNAL_REGDB
> and the CRDA binary not being available at the time the udev
> rule that kicks of CRDA triggers.
>
> In such a situation we want to let some cfg80211 triggers
> eventually kick CRDA for us again. Without this if the first
> cycle attempt to kick off CRDA failed we'd be stuck without
> the ability to change process any further regulatory domains.
>
> cfg80211 will trigger re-processing of the regulatory queue
> whenever schedule_work(&reg_work) is called, currently this
> happens when:
>
> * suspend / resume
> * disconnect
> * a beacon hint gets triggered (non DFS 5 GHz AP found)
> * a regulatory request gets added to the queue
>
> We don't have any specific opportunistic late boot triggers
> to address a late mount of where CRDA resides though, adding
> that should be done separately through another patch.
> Without an opportunistic fix then this fix relies at least
> one of the triggeres above to happen.

Ok, applied. (with that typo there fixed)

johannes


2014-03-14 20:48:47

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Fri, Mar 14, 2014 at 1:30 PM, Colleen T <[email protected]> wrote:
> This commit -- 5a970df8990d173e7e4092952f2e3da1de69b27d -- is causing
> a regression on mac80211-next/master in our mesh test framework on
> qemu. We are using cfg80211 as a module.
>
> In /etc/default/crda, I have:
> REGDOMAIN=US
>
> I can trigger the oops by loading mac80211_hwsim with three or more radios:
>
>> modprobe mac80211_hwsim radios=3


Thanks for reporting this, I'll take a look at this right away. Are
you using wireless-testing or mac80211-next? What sha1sum are you on?

Luis

2014-03-14 20:30:27

by Colleen T

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

Hi guys,

This commit -- 5a970df8990d173e7e4092952f2e3da1de69b27d -- is causing
a regression on mac80211-next/master in our mesh test framework on
qemu. We are using cfg80211 as a module.

In /etc/default/crda, I have:
REGDOMAIN=US

I can trigger the oops by loading mac80211_hwsim with three or more radios:

> modprobe mac80211_hwsim radios=3

It seems to be caused by updating the pending regulatory_requests
while new regulatory requests are still being added.

Here's the dmesg output which shows warnings, followed by an oops:
[ 22.360102] ------------[ cut here ]------------
[ 22.361001] WARNING: CPU: 0 PID: 468 at net/wireless/reg.c:1832
reg_process_hint+0x19a/0x3c0 [cfg80211]()
[ 22.362758] invalid initiator -30720
[ 22.363440] Modules linked in: mac80211_hwsim mac80211 cfg80211
[ 22.364689] CPU: 0 PID: 468 Comm: kworker/0:1 Not tainted
3.14.0-rc2-5a970df+ #86
[ 22.366114] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 22.367420] Workqueue: events reg_todo [cfg80211]
[ 22.368465] 0000000000000009 ffff880007367c88 ffffffff8183ffeb
ffff880007367cd0
[ 22.370092] ffff880007367cc0 ffffffff8104cfbd ffff88000605f800
0000000000000000
[ 22.371534] ffff880007c16e00 0000000000000000 0000000000000000
ffff880007367d20
[ 22.372994] Call Trace:
[ 22.373487] [<ffffffff8183ffeb>] dump_stack+0x4d/0x66
[ 22.374454] [<ffffffff8104cfbd>] warn_slowpath_common+0x7d/0xa0
[ 22.375586] [<ffffffff8104d02c>] warn_slowpath_fmt+0x4c/0x50
[ 22.376669] [<ffffffffa0001401>] ?
cfg80211_rdev_by_wiphy_idx+0x11/0x80 [cfg80211]
[ 22.378009] [<ffffffffa00077ba>] reg_process_hint+0x19a/0x3c0 [cfg80211]
[ 22.378976] [<ffffffffa0007b87>] reg_todo+0x1a7/0x1c0 [cfg80211]
[ 22.379647] [<ffffffff8106f52c>] process_one_work+0x1fc/0x670
[ 22.380304] [<ffffffff8106f4c1>] ? process_one_work+0x191/0x670
[ 22.380958] [<ffffffff8106fac1>] worker_thread+0x121/0x3a0
[ 22.381675] [<ffffffff8106f9a0>] ? process_one_work+0x670/0x670
[ 22.382574] [<ffffffff8107767d>] kthread+0xed/0x110
[ 22.383140] [<ffffffff81077590>] ? insert_kthread_work+0x70/0x70
[ 22.384188] [<ffffffff8185392c>] ret_from_fork+0x7c/0xb0
[ 22.385209] [<ffffffff81077590>] ? insert_kthread_work+0x70/0x70
[ 22.386325] ---[ end trace a50e766039e79b68 ]---
[ 22.387245] ------------[ cut here ]------------
[ 22.388216] WARNING: CPU: 0 PID: 468 at net/wireless/reg.c:1832
reg_process_hint+0x19a/0x3c0 [cfg80211]()
[ 22.390026] invalid initiator -559087616
[ 22.390801] Modules linked in: mac80211_hwsim mac80211 cfg80211
[ 22.391993] CPU: 0 PID: 468 Comm: kworker/0:1 Tainted: G W
3.14.0-rc2-5a970df+ #86
[ 22.393512] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 22.394584] Workqueue: events reg_todo [cfg80211]
[ 22.395482] 0000000000000009 ffff880007367c88 ffffffff8183ffeb
ffff880007367cd0
[ 22.396915] ffff880007367cc0 ffffffff8104cfbd ffff88000605f800
0000000000000000
[ 22.398364] ffff880007c16e00 0000000000000000 0000000000000000
ffff880007367d20
[ 22.399808] Call Trace:
[ 22.400312] [<ffffffff8183ffeb>] dump_stack+0x4d/0x66
[ 22.401291] [<ffffffff8104cfbd>] warn_slowpath_common+0x7d/0xa0
[ 22.402426] [<ffffffff8104d02c>] warn_slowpath_fmt+0x4c/0x50
[ 22.403515] [<ffffffffa0001401>] ?
cfg80211_rdev_by_wiphy_idx+0x11/0x80 [cfg80211]
[ 22.404924] [<ffffffffa00077ba>] reg_process_hint+0x19a/0x3c0 [cfg80211]
[ 22.406177] [<ffffffffa0007b87>] reg_todo+0x1a7/0x1c0 [cfg80211]
[ 22.407321] [<ffffffff8106f52c>] process_one_work+0x1fc/0x670
[ 22.408382] [<ffffffff8106f4c1>] ? process_one_work+0x191/0x670
[ 22.409249] [<ffffffff8106fac1>] worker_thread+0x121/0x3a0
[ 22.409886] [<ffffffff8106f9a0>] ? process_one_work+0x670/0x670
[ 22.410551] [<ffffffff8107767d>] kthread+0xed/0x110
[ 22.411107] [<ffffffff81077590>] ? insert_kthread_work+0x70/0x70
[ 22.411809] [<ffffffff8185392c>] ret_from_fork+0x7c/0xb0
[ 22.412655] [<ffffffff81077590>] ? insert_kthread_work+0x70/0x70
[ 22.413618] ---[ end trace a50e766039e79b69 ]---
[ 25.503446] cfg80211: Calling CRDA to update world regulatory domain
[ 25.507041] kernel tried to execute NX-protected page - exploit
attempt? (uid: 0)
[ 25.508020] BUG: unable to handle kernel paging request at ffff8800062bfcf0
[ 25.508020] IP: [<ffff8800062bfcf0>] 0xffff8800062bfcf0
[ 25.508020] PGD 295c067 PUD 295d067 PMD 80000000062001e3
[ 25.508020] Oops: 0011 [#1] SMP
[ 25.508020] Modules linked in: mac80211_hwsim mac80211 cfg80211
[ 25.508020] CPU: 0 PID: 2648 Comm: modprobe Tainted: G W
3.14.0-rc2-5a970df+ #86
[ 25.508020] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 25.508020] task: ffff88000724c640 ti: ffff8800037c4000 task.ti:
ffff8800037c4000
[ 25.508020] RIP: 0010:[<ffff8800062bfcf0>] [<ffff8800062bfcf0>]
0xffff8800062bfcf0
[ 25.508020] RSP: 0000:ffff880007c03ea8 EFLAGS: 00010292
[ 25.508020] RAX: ffff88000724c640 RBX: ffff88000605f800 RCX: 0000000000000000
[ 25.508020] RDX: 0000000000000020 RSI: 0000000000000000 RDI: ffff88000605f800
[ 25.508020] RBP: ffff880007c03f18 R08: 0000000000000001 R09: 0000000000000000
[ 25.508020] R10: ffff88000724c640 R11: 0000000000000000 R12: 0000000000000001
[ 25.508020] R13: 000000000000000a R14: ffff8800062bfcf0 R15: 0000000000000000
[ 25.508020] FS: 00007f92aeb0e700(0000) GS:ffff880007c00000(0000)
knlGS:0000000000000000
[ 25.508020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 25.508020] CR2: ffff8800062bfcf0 CR3: 000000000636d000 CR4: 00000000000006f0
[ 25.508020] Stack:
[ 25.508020] ffffffff810baa12 ffffffff810ba9cf ffff88000605f800
ffff880007c0d660
[ 25.508020] ffff88000724c640 ffff8800037c5fd8 ffff880007c0d688
0000000000000001
[ 25.508020] ffffffff81e3be40 0000000000000009 ffffffff81e040c8
0000000000000009
[ 25.508020] Call Trace:
[ 25.508020] <IRQ>
[ 25.508020] [<ffffffff810baa12>] ? rcu_process_callbacks+0x272/0x7e0
[ 25.508020] [<ffffffff810ba9cf>] ? rcu_process_callbacks+0x22f/0x7e0
[ 25.508020] [<ffffffff8105359e>] __do_softirq+0x12e/0x440
[ 25.508020] [<ffffffff81053b65>] irq_exit+0xa5/0xb0
[ 25.508020] [<ffffffff818559d5>] smp_apic_timer_interrupt+0x45/0x60
[ 25.508020] [<ffffffff8185462f>] apic_timer_interrupt+0x6f/0x80
[ 25.508020] <EOI>
[ 25.508020] [<ffffffff81158a68>] ? handle_mm_fault+0x198/0x9b0
[ 25.508020] [<ffffffff8184e26b>] ? __do_page_fault+0x2ab/0x560
[ 25.508020] [<ffffffff8184e265>] ? __do_page_fault+0x2a5/0x560
[ 25.508020] [<ffffffff810a1a10>] ? lock_release_non_nested+0xa0/0x300
[ 25.508020] [<ffffffff8115edcf>] ? do_brk+0x2bf/0x350
[ 25.508020] [<ffffffff8184a889>] ? retint_swapgs+0xe/0x13
[ 25.508020] [<ffffffff813328ea>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 25.508020] [<ffffffff8184e52e>] do_page_fault+0xe/0x10
[ 25.508020] [<ffffffff8184aad2>] page_fault+0x22/0x30
[ 25.508020] Code: 00 00 00 00 00 00 00 00 00 00 00 17 e1 c7 81 ff
ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0 fc 2b 06
00 88 ff ff <60> dc b9 06 00 88 ff ff 00 00 00 00 00 00 00 00 00 00 00
00 ad
[ 25.508020] RIP [<ffff8800062bfcf0>] 0xffff8800062bfcf0
[ 25.508020] RIP [<ffff8800062bfcf0>] 0xffff8800062bfcf0
[ 25.508020] RSP <ffff880007c03ea8>
[ 25.508020] CR2: ffff8800062bfcf0
[ 25.508020] ---[ end trace a50e766039e79b6a ]---

After that, qemu locks hard. Seems like there might be a free on an
invalid pointer. The crash doesn't occur with this commit reverted.

Any advice?

Thanks,
Colleen

On Mon, Mar 3, 2014 at 5:10 AM, Johannes Berg <[email protected]> wrote:
> On Tue, 2014-02-25 at 17:09 -0800, Luis R. Rodriguez wrote:
>> This allows processing of the last regulatory request when
>> we determine its still pending. Without this if a regulatory
>> request failed to get processed by userspace we wouldn't
>> be able to re-process it later. An example situation that can
>> lead to an unprocessed last_request is enabling cfg80211 to
>> be built-in to the kernel, not enabling CFG80211_INTERNAL_REGDB
>> and the CRDA binary not being available at the time the udev
>> rule that kicks of CRDA triggers.
>>
>> In such a situation we want to let some cfg80211 triggers
>> eventually kick CRDA for us again. Without this if the first
>> cycle attempt to kick off CRDA failed we'd be stuck without
>> the ability to change process any further regulatory domains.
>>
>> cfg80211 will trigger re-processing of the regulatory queue
>> whenever schedule_work(&reg_work) is called, currently this
>> happens when:
>>
>> * suspend / resume
>> * disconnect
>> * a beacon hint gets triggered (non DFS 5 GHz AP found)
>> * a regulatory request gets added to the queue
>>
>> We don't have any specific opportunistic late boot triggers
>> to address a late mount of where CRDA resides though, adding
>> that should be done separately through another patch.
>> Without an opportunistic fix then this fix relies at least
>> one of the triggeres above to happen.
>
> Ok, applied. (with that typo there fixed)
>
> johannes
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2014-03-15 01:04:12

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Fri, Mar 14, 2014 at 3:12 PM, Colleen T <[email protected]> wrote:
> I'm on mac80211-next/master, sha is
> 5a970df8990d173e7e4092952f2e3da1de69b27d

I tried to reproduce by just merging the fixes in question onto Linus'
tree and using radios=3 but no go. Can you provide the full kernel
log, I'm particularly interested in what happened before. The
COUNTRY=US on debian, which I believe you're on, should just trigger a
regulatory domain setting to US upon initialization. Depending on how
Debian does it this could either trigger as a module parameter or as a
userspace event *after* the world regdom get set.

The radios=3 alone would not do anything other than add new radios so
I see no contention on the last_request by increasing or decreasing
that. last_request is also protected by rtnl and although I was
considering a possible race against processing the same last_request
twice if you see how we call reg_process_pending_hints() on reg_todo()
its protected by rtnl_lock() so I'm a bit puzzled as to how this is
being triggered as those operations should be atomic. The other corner
case I thought of was that for userspace requests which sets a
timeout, but our when the timeout hits we also rtnl_lock() under
reg_timeout_work().

The other corner I thought of was when we reset_regdomains() but all
callers are rtnl_lock()'d. This is also the same for callers of
set_regdom().

Do you get different results if you reduce the number of radios? If so
where's the trigger point?

Luis

2014-03-16 19:04:35

by Colleen T

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Sat, Mar 15, 2014 at 9:42 PM, Luis R. Rodriguez
<[email protected]> wrote:
> On Sat, Mar 15, 2014 at 8:59 AM, Janusz Dziedzic
> <[email protected]> wrote:
>> 2014-03-15 2:03 GMT+01:00 Luis R. Rodriguez <[email protected]>:
>>> On Fri, Mar 14, 2014 at 3:12 PM, Colleen T <[email protected]> wrote:
>>>> I'm on mac80211-next/master, sha is
>>>> 5a970df8990d173e7e4092952f2e3da1de69b27d
>>>
>>> I tried to reproduce by just merging the fixes in question onto Linus'
>>> tree and using radios=3 but no go. Can you provide the full kernel
>>> log, I'm particularly interested in what happened before. The
>>> COUNTRY=US on debian, which I believe you're on, should just trigger a
>>> regulatory domain setting to US upon initialization. Depending on how
>>> Debian does it this could either trigger as a module parameter or as a
>>> userspace event *after* the world regdom get set.

I'm attaching a full kernel log that shows the warning being
triggered. You are correct, I'm on debian. In the attached case I
ran:
$ modprobe mac80211_hwsim radios=4
I pulled the fixes onto Linus' tree and I end up with the same result.

>>> The radios=3 alone would not do anything other than add new radios so
>>> I see no contention on the last_request by increasing or decreasing
>>> that. last_request is also protected by rtnl and although I was
>>> considering a possible race against processing the same last_request
>>> twice if you see how we call reg_process_pending_hints() on reg_todo()
>>> its protected by rtnl_lock() so I'm a bit puzzled as to how this is
>>> being triggered as those operations should be atomic. The other corner
>>> case I thought of was that for userspace requests which sets a
>>> timeout, but our when the timeout hits we also rtnl_lock() under
>>> reg_timeout_work().
>>>
>>> The other corner I thought of was when we reset_regdomains() but all
>>> callers are rtnl_lock()'d. This is also the same for callers of
>>> set_regdom().
>>>
>>> Do you get different results if you reduce the number of radios? If so
>>> where's the trigger point?

I never see a problem for one radio or two radios. In doing some
additional testing, I noticed I don't *always* hit this warning, but I
*usually* get it. If I remove and load the mac80211_hwsim module
again (with three or four radios), I can quickly achieve the invalid
initiator warning. The number of the invalid initiator is not always
the same from one run to another.

>>
>> Just guessing, try add this one if not included in mac80211-next yet.
>>
>> [PATCH 1/2] cfg80211: regulatory, reset regdomain in case of error -
>> this goes to 3.14 directly.
>
> Thanks for poingint that out Janusz, that patch is not in
> mac80211-next. Colleen it'd be curious if that fixes your issue but I
> don't see how given that I don't see how your situation would hit the
> error case. It is certainly worth trying. In any case please do
> provide the full kernel log and please try the other things I
> mentioned, the more detail the better.

Janusz, thanks for the patch idea. I did try it, but I'm going down a
different code path, so it didn't fix my issue. Luis, please let me
know if there is other information that would help you.

> Luis

Colleen


Attachments:
full-kernel-log.txt (126.49 kB)

2014-03-14 22:12:37

by Colleen T

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Fri, Mar 14, 2014 at 1:48 PM, Luis R. Rodriguez
<[email protected]> wrote:
> On Fri, Mar 14, 2014 at 1:30 PM, Colleen T <[email protected]> wrote:
>> This commit -- 5a970df8990d173e7e4092952f2e3da1de69b27d -- is causing
>> a regression on mac80211-next/master in our mesh test framework on
>> qemu. We are using cfg80211 as a module.
>>
>> In /etc/default/crda, I have:
>> REGDOMAIN=US
>>
>> I can trigger the oops by loading mac80211_hwsim with three or more radios:
>>
>>> modprobe mac80211_hwsim radios=3
>
>
> Thanks for reporting this, I'll take a look at this right away. Are
> you using wireless-testing or mac80211-next? What sha1sum are you on?

Thanks a lot! I'm on mac80211-next/master, sha is
5a970df8990d173e7e4092952f2e3da1de69b27d
>
> Luis

Colleen

2014-03-16 04:43:06

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Sat, Mar 15, 2014 at 8:59 AM, Janusz Dziedzic
<[email protected]> wrote:
> 2014-03-15 2:03 GMT+01:00 Luis R. Rodriguez <[email protected]>:
>> On Fri, Mar 14, 2014 at 3:12 PM, Colleen T <[email protected]> wrote:
>>> I'm on mac80211-next/master, sha is
>>> 5a970df8990d173e7e4092952f2e3da1de69b27d
>>
>> I tried to reproduce by just merging the fixes in question onto Linus'
>> tree and using radios=3 but no go. Can you provide the full kernel
>> log, I'm particularly interested in what happened before. The
>> COUNTRY=US on debian, which I believe you're on, should just trigger a
>> regulatory domain setting to US upon initialization. Depending on how
>> Debian does it this could either trigger as a module parameter or as a
>> userspace event *after* the world regdom get set.
>>
>> The radios=3 alone would not do anything other than add new radios so
>> I see no contention on the last_request by increasing or decreasing
>> that. last_request is also protected by rtnl and although I was
>> considering a possible race against processing the same last_request
>> twice if you see how we call reg_process_pending_hints() on reg_todo()
>> its protected by rtnl_lock() so I'm a bit puzzled as to how this is
>> being triggered as those operations should be atomic. The other corner
>> case I thought of was that for userspace requests which sets a
>> timeout, but our when the timeout hits we also rtnl_lock() under
>> reg_timeout_work().
>>
>> The other corner I thought of was when we reset_regdomains() but all
>> callers are rtnl_lock()'d. This is also the same for callers of
>> set_regdom().
>>
>> Do you get different results if you reduce the number of radios? If so
>> where's the trigger point?
>>
>
> Just guessing, try add this one if not included in mac80211-next yet.
>
> [PATCH 1/2] cfg80211: regulatory, reset regdomain in case of error -
> this goes to 3.14 directly.

Thanks for poingint that out Janusz, that patch is not in
mac80211-next. Colleen it'd be curious if that fixes your issue but I
don't see how given that I don't see how your situation would hit the
error case. It is certainly worth trying. In any case please do
provide the full kernel log and please try the other things I
mentioned, the more detail the better.

Luis

2014-03-03 13:10:35

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 1/3] cfg80211: allow reprocessing of pending requests

On Tue, 2014-02-25 at 17:09 -0800, Luis R. Rodriguez wrote:
> In certain situations we want to trigger reprocessing
> of the last regulatory hint. One situation in which
> this makes sense is the case where the cfg80211 was
> built-in to the kernel, CFG80211_INTERNAL_REGDB was not
> enabled and the CRDA binary is on a partition not availble
> during early boot. In such a case we want to be able to
> re-process the same request at some other point.
>
> When we are asked to re-process the same request we need
> to be careful to not kfree it, addresses that.

This looks OK, I've applied it, but I've renamed the function since it
no longer has anything to do with the last request.

johannes



2014-03-15 15:59:38

by Janusz Dziedzic

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

2014-03-15 2:03 GMT+01:00 Luis R. Rodriguez <[email protected]>:
> On Fri, Mar 14, 2014 at 3:12 PM, Colleen T <[email protected]> wrote:
>> I'm on mac80211-next/master, sha is
>> 5a970df8990d173e7e4092952f2e3da1de69b27d
>
> I tried to reproduce by just merging the fixes in question onto Linus'
> tree and using radios=3 but no go. Can you provide the full kernel
> log, I'm particularly interested in what happened before. The
> COUNTRY=US on debian, which I believe you're on, should just trigger a
> regulatory domain setting to US upon initialization. Depending on how
> Debian does it this could either trigger as a module parameter or as a
> userspace event *after* the world regdom get set.
>
> The radios=3 alone would not do anything other than add new radios so
> I see no contention on the last_request by increasing or decreasing
> that. last_request is also protected by rtnl and although I was
> considering a possible race against processing the same last_request
> twice if you see how we call reg_process_pending_hints() on reg_todo()
> its protected by rtnl_lock() so I'm a bit puzzled as to how this is
> being triggered as those operations should be atomic. The other corner
> case I thought of was that for userspace requests which sets a
> timeout, but our when the timeout hits we also rtnl_lock() under
> reg_timeout_work().
>
> The other corner I thought of was when we reset_regdomains() but all
> callers are rtnl_lock()'d. This is also the same for callers of
> set_regdom().
>
> Do you get different results if you reduce the number of radios? If so
> where's the trigger point?
>

Just guessing, try add this one if not included in mac80211-next yet.

[PATCH 1/2] cfg80211: regulatory, reset regdomain in case of error -
this goes to 3.14 directly.

BR
Janusz

2014-04-10 06:13:41

by Arik Nemtsov

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Wed, Apr 9, 2014 at 10:16 PM, Johannes Berg
<[email protected]> wrote:
> +static void reg_free_last_request(struct regulatory_request *lr)
>
> Isn't it odd to pass the last request to a function called
> reg_free_last_request()?
>

It makes sense to me. Non-last requests are freed using kfree. The
last request needs RCU.

Arik

2014-04-14 19:27:38

by Colleen T

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

Hello,

Applying Arik's patch on top of Luis' resolves our problem.

You also get my:

Tested-by: Colleen Twitty <[email protected]>

Thanks,
-Colleen

On Sun, Apr 13, 2014 at 5:50 AM, Eliad Peller <[email protected]> wrote:
> On Wed, Apr 9, 2014 at 7:33 PM, Arik Nemtsov <[email protected]> wrote:
>>
>> Seems I might have found the culprit - reg_todo is called while the
>> request to CRDA is in flight and the patch in question causes the
>> already-in-process user-request to be handled again. Since it's the
>> same regdomain as the last request (it's the last request itself), we
>> get this:
>>
>> treatment = __reg_process_hint_user(user_request);
>> if (treatment == REG_REQ_IGNORE ||
>> treatment == REG_REQ_ALREADY_SET) {
>> kfree(user_request); <------
>> return treatment;
>> }
>>
>> Can you try adding the attached patch? It just replaced relevant
>> kfree-s with a function that avoids freeing the last request.
>>
> i encountered a similar panic, and this patch seems to fix it.
> so you get my:
>
> Tested-by: Eliad Peller <[email protected]>
>
> Eliad.

2014-04-09 16:33:25

by Arik Nemtsov

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

Hey Collen, Luis,

On Sun, Mar 16, 2014 at 9:04 PM, Colleen T <[email protected]> wrote:
> On Sat, Mar 15, 2014 at 9:42 PM, Luis R. Rodriguez
> <[email protected]> wrote:
>> On Sat, Mar 15, 2014 at 8:59 AM, Janusz Dziedzic
>> <[email protected]> wrote:
>>> 2014-03-15 2:03 GMT+01:00 Luis R. Rodriguez <[email protected]>:
>>>> On Fri, Mar 14, 2014 at 3:12 PM, Colleen T <[email protected]> wrote:
>>>>> I'm on mac80211-next/master, sha is
>>>>> 5a970df8990d173e7e4092952f2e3da1de69b27d
>>>>
>>>> I tried to reproduce by just merging the fixes in question onto Linus'
>>>> tree and using radios=3 but no go. Can you provide the full kernel
>>>> log, I'm particularly interested in what happened before. The
>>>> COUNTRY=US on debian, which I believe you're on, should just trigger a
>>>> regulatory domain setting to US upon initialization. Depending on how
>>>> Debian does it this could either trigger as a module parameter or as a
>>>> userspace event *after* the world regdom get set.
>
> I'm attaching a full kernel log that shows the warning being
> triggered. You are correct, I'm on debian. In the attached case I
> ran:
> $ modprobe mac80211_hwsim radios=4
> I pulled the fixes onto Linus' tree and I end up with the same result.
>

Seems I might have found the culprit - reg_todo is called while the
request to CRDA is in flight and the patch in question causes the
already-in-process user-request to be handled again. Since it's the
same regdomain as the last request (it's the last request itself), we
get this:

treatment = __reg_process_hint_user(user_request);
if (treatment == REG_REQ_IGNORE ||
treatment == REG_REQ_ALREADY_SET) {
kfree(user_request); <------
return treatment;
}

Can you try adding the attached patch? It just replaced relevant
kfree-s with a function that avoids freeing the last request.

Also, in the current scheme of things, CRDA might be called twice for
the same request in some corner cases, but that's not a problem, since
reg_is_valid_request() will block the second set_regdom().

Regards,
Arik


Attachments:
0001-cfg80211-avoid-freeing-last_request-while-in-flight.patch (2.75 kB)

2014-04-09 19:17:05

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

+static void reg_free_last_request(struct regulatory_request *lr)

Isn't it odd to pass the last request to a function called
reg_free_last_request()?

johannes


2014-04-10 08:17:40

by Arik Nemtsov

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Thu, Apr 10, 2014 at 11:01 AM, Johannes Berg
<[email protected]> wrote:
> On Thu, 2014-04-10 at 09:13 +0300, Arik Nemtsov wrote:
>> On Wed, Apr 9, 2014 at 10:16 PM, Johannes Berg
>> <[email protected]> wrote:
>> > +static void reg_free_last_request(struct regulatory_request *lr)
>> >
>> > Isn't it odd to pass the last request to a function called
>> > reg_free_last_request()?
>> >
>>
>> It makes sense to me. Non-last requests are freed using kfree. The
>> last request needs RCU.
>
> Yeah but there can only be one "last request", so why pass it?

oh. well that was weird before as well. :)
let's wait to see if it solves the bug? I can respin it without the arg later.

Arik

2014-04-16 11:01:26

by Janusz Dziedzic

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

2014-04-16 12:38 GMT+02:00 Arik Nemtsov <[email protected]>:
> Thanks for testing Colleen!
>
> I'll respin the patch on top of wireless-testing.
>
Please don't remove "Pending regulatory reques ... " dbg print. This
could be usefull for debuging.

BR
Janusz

> Arik
>
> On Mon, Apr 14, 2014 at 10:27 PM, Colleen T <[email protected]> wrote:
>> Hello,
>>
>> Applying Arik's patch on top of Luis' resolves our problem.
>>
>> You also get my:
>>
>> Tested-by: Colleen Twitty <[email protected]>
>>
>> Thanks,
>> -Colleen
>>
>> On Sun, Apr 13, 2014 at 5:50 AM, Eliad Peller <[email protected]> wrote:
>>> On Wed, Apr 9, 2014 at 7:33 PM, Arik Nemtsov <[email protected]> wrote:
>>>>
>>>> Seems I might have found the culprit - reg_todo is called while the
>>>> request to CRDA is in flight and the patch in question causes the
>>>> already-in-process user-request to be handled again. Since it's the
>>>> same regdomain as the last request (it's the last request itself), we
>>>> get this:
>>>>
>>>> treatment = __reg_process_hint_user(user_request);
>>>> if (treatment == REG_REQ_IGNORE ||
>>>> treatment == REG_REQ_ALREADY_SET) {
>>>> kfree(user_request); <------
>>>> return treatment;
>>>> }
>>>>
>>>> Can you try adding the attached patch? It just replaced relevant
>>>> kfree-s with a function that avoids freeing the last request.
>>>>
>>> i encountered a similar panic, and this patch seems to fix it.
>>> so you get my:
>>>
>>> Tested-by: Eliad Peller <[email protected]>
>>>
>>> Eliad.



--
Janusz Dziedzic

2014-04-13 12:50:42

by Eliad Peller

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Wed, Apr 9, 2014 at 7:33 PM, Arik Nemtsov <[email protected]> wrote:
>
> Seems I might have found the culprit - reg_todo is called while the
> request to CRDA is in flight and the patch in question causes the
> already-in-process user-request to be handled again. Since it's the
> same regdomain as the last request (it's the last request itself), we
> get this:
>
> treatment = __reg_process_hint_user(user_request);
> if (treatment == REG_REQ_IGNORE ||
> treatment == REG_REQ_ALREADY_SET) {
> kfree(user_request); <------
> return treatment;
> }
>
> Can you try adding the attached patch? It just replaced relevant
> kfree-s with a function that avoids freeing the last request.
>
i encountered a similar panic, and this patch seems to fix it.
so you get my:

Tested-by: Eliad Peller <[email protected]>

Eliad.

2014-04-10 08:23:23

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Thu, 2014-04-10 at 11:17 +0300, Arik Nemtsov wrote:

> >> It makes sense to me. Non-last requests are freed using kfree. The
> >> last request needs RCU.
> >
> > Yeah but there can only be one "last request", so why pass it?
>
> oh. well that was weird before as well. :)
> let's wait to see if it solves the bug? I can respin it without the arg later.

Sure, makes sense.

johannes


2014-04-16 10:38:52

by Arik Nemtsov

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

Thanks for testing Colleen!

I'll respin the patch on top of wireless-testing.

Arik

On Mon, Apr 14, 2014 at 10:27 PM, Colleen T <[email protected]> wrote:
> Hello,
>
> Applying Arik's patch on top of Luis' resolves our problem.
>
> You also get my:
>
> Tested-by: Colleen Twitty <[email protected]>
>
> Thanks,
> -Colleen
>
> On Sun, Apr 13, 2014 at 5:50 AM, Eliad Peller <[email protected]> wrote:
>> On Wed, Apr 9, 2014 at 7:33 PM, Arik Nemtsov <[email protected]> wrote:
>>>
>>> Seems I might have found the culprit - reg_todo is called while the
>>> request to CRDA is in flight and the patch in question causes the
>>> already-in-process user-request to be handled again. Since it's the
>>> same regdomain as the last request (it's the last request itself), we
>>> get this:
>>>
>>> treatment = __reg_process_hint_user(user_request);
>>> if (treatment == REG_REQ_IGNORE ||
>>> treatment == REG_REQ_ALREADY_SET) {
>>> kfree(user_request); <------
>>> return treatment;
>>> }
>>>
>>> Can you try adding the attached patch? It just replaced relevant
>>> kfree-s with a function that avoids freeing the last request.
>>>
>> i encountered a similar panic, and this patch seems to fix it.
>> so you get my:
>>
>> Tested-by: Eliad Peller <[email protected]>
>>
>> Eliad.

2014-04-16 11:07:36

by Arik Nemtsov

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Wed, Apr 16, 2014 at 2:01 PM, Janusz Dziedzic
<[email protected]> wrote:
> 2014-04-16 12:38 GMT+02:00 Arik Nemtsov <[email protected]>:
>> Thanks for testing Colleen!
>>
>> I'll respin the patch on top of wireless-testing.
>>
> Please don't remove "Pending regulatory reques ... " dbg print. This
> could be usefull for debuging.

Well that's a comment for Luis I guess. Removing the print is not
really part of my patch.

Arik

2014-04-10 08:01:57

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular

On Thu, 2014-04-10 at 09:13 +0300, Arik Nemtsov wrote:
> On Wed, Apr 9, 2014 at 10:16 PM, Johannes Berg
> <[email protected]> wrote:
> > +static void reg_free_last_request(struct regulatory_request *lr)
> >
> > Isn't it odd to pass the last request to a function called
> > reg_free_last_request()?
> >
>
> It makes sense to me. Non-last requests are freed using kfree. The
> last request needs RCU.

Yeah but there can only be one "last request", so why pass it?

johannes


2014-04-09 20:28:28

by Sander Eikelenboom

[permalink] [raw]
Subject: Re: [PATCH 2/3] cfg80211: fix processing world regdomain when non modular


Wednesday, April 9, 2014, 6:33:09 PM, you wrote:

> Hey Collen, Luis,

> On Sun, Mar 16, 2014 at 9:04 PM, Colleen T <[email protected]> wrote:
>> On Sat, Mar 15, 2014 at 9:42 PM, Luis R. Rodriguez
>> <[email protected]> wrote:
>>> On Sat, Mar 15, 2014 at 8:59 AM, Janusz Dziedzic
>>> <[email protected]> wrote:
>>>> 2014-03-15 2:03 GMT+01:00 Luis R. Rodriguez <[email protected]>:
>>>>> On Fri, Mar 14, 2014 at 3:12 PM, Colleen T <[email protected]> wrote:
>>>>>> I'm on mac80211-next/master, sha is
>>>>>> 5a970df8990d173e7e4092952f2e3da1de69b27d
>>>>>
>>>>> I tried to reproduce by just merging the fixes in question onto Linus'
>>>>> tree and using radios=3 but no go. Can you provide the full kernel
>>>>> log, I'm particularly interested in what happened before. The
>>>>> COUNTRY=US on debian, which I believe you're on, should just trigger a
>>>>> regulatory domain setting to US upon initialization. Depending on how
>>>>> Debian does it this could either trigger as a module parameter or as a
>>>>> userspace event *after* the world regdom get set.
>>
>> I'm attaching a full kernel log that shows the warning being
>> triggered. You are correct, I'm on debian. In the attached case I
>> ran:
>> $ modprobe mac80211_hwsim radios=4
>> I pulled the fixes onto Linus' tree and I end up with the same result.
>>

> Seems I might have found the culprit - reg_todo is called while the
> request to CRDA is in flight and the patch in question causes the
> already-in-process user-request to be handled again. Since it's the
> same regdomain as the last request (it's the last request itself), we
> get this:

> treatment = __reg_process_hint_user(user_request);
> if (treatment == REG_REQ_IGNORE ||
> treatment == REG_REQ_ALREADY_SET) {
> kfree(user_request); <------
> return treatment;
> }

> Can you try adding the attached patch? It just replaced relevant
> kfree-s with a function that avoids freeing the last request.

> Also, in the current scheme of things, CRDA might be called twice for
> the same request in some corner cases, but that's not a problem, since
> reg_is_valid_request() will block the second set_regdom().

> Regards,
> Arik

Thanks for looking in to this .. hope that this will make it go to upstream one
day :-)