2014-12-29 10:00:03

by Arik Nemtsov

Subject: [PATCH] cfg80211: fix deadlock during reg chan check

If a P2P GO is active, the cfg80211_reg_can_beacon function will take
the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock
is already taken by the parent channel-checking function, causing a
deadlock.
Split the checking code into two parts. The first part will check if the
wdev is active and save the channel under the wdev lock. The second part
will check actual channel validity according to type.

Signed-off-by: Arik Nemtsov <[email protected]>
Reviewed-by: Ilan Peer <[email protected]>
Reviewed-by: Emmanuel Grumbach <[email protected]>
---
Requires the patch "cfg80211: correctly check ad-hoc channels" to be applied
first.

net/wireless/reg.c | 56 +++++++++++++++++++++++++++++++++---------------------
1 file changed, 34 insertions(+), 22 deletions(-)

diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index 978a5fd..fde4e17 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -1533,45 +1533,40 @@ static void reg_call_notifier(struct wiphy *wiphy,

 static bool reg_wdev_chan_valid(struct wiphy *wiphy, struct wireless_dev *wdev)
 {
-	struct ieee80211_channel *ch;
 	struct cfg80211_chan_def chandef;
 	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
-	bool ret = true;
+	enum nl80211_iftype iftype;

 	wdev_lock(wdev);
+	iftype = wdev->iftype;

+	/* make sure the interface is active */
 	if (!wdev->netdev || !netif_running(wdev->netdev))
-		goto out;
+		goto wdev_inactive_unlock;

-	switch (wdev->iftype) {
+	switch (iftype) {
 	case NL80211_IFTYPE_AP:
 	case NL80211_IFTYPE_P2P_GO:
 		if (!wdev->beacon_interval)
-			goto out;
-
-		ret = cfg80211_reg_can_beacon(wiphy,
-					      &wdev->chandef, wdev->iftype);
+			goto wdev_inactive_unlock;
+		chandef = wdev->chandef;
 		break;
 	case NL80211_IFTYPE_ADHOC:
 		if (!wdev->ssid_len)
-			goto out;
-
-		ret = cfg80211_reg_can_beacon(wiphy,
-					      &wdev->chandef, wdev->iftype);
+			goto wdev_inactive_unlock;
+		chandef = wdev->chandef;
 		break;
 	case NL80211_IFTYPE_STATION:
 	case NL80211_IFTYPE_P2P_CLIENT:
 		if (!wdev->current_bss ||
 		    !wdev->current_bss->pub.channel)
-			goto out;
+			goto wdev_inactive_unlock;

-		ch = wdev->current_bss->pub.channel;
-		if (rdev->ops->get_channel &&
-		    !rdev_get_channel(rdev, wdev, &chandef))
-			ret = cfg80211_chandef_usable(wiphy, &chandef,
-						      IEEE80211_CHAN_DISABLED);
-		else
-			ret = !(ch->flags & IEEE80211_CHAN_DISABLED);
+		if (!rdev->ops->get_channel ||
+		    rdev_get_channel(rdev, wdev, &chandef))
+			cfg80211_chandef_create(&chandef,
+						wdev->current_bss->pub.channel,
+						NL80211_CHAN_NO_HT);
 		break;
 	case NL80211_IFTYPE_MONITOR:
 	case NL80211_IFTYPE_AP_VLAN:
@@ -1584,9 +1579,26 @@ static bool reg_wdev_chan_valid(struct wiphy *wiphy, struct wireless_dev *wdev)
 		break;
 	}

-out:
 	wdev_unlock(wdev);
-	return ret;
+
+	switch (iftype) {
+	case NL80211_IFTYPE_AP:
+	case NL80211_IFTYPE_P2P_GO:
+	case NL80211_IFTYPE_ADHOC:
+		return cfg80211_reg_can_beacon(wiphy, &chandef, iftype);
+	case NL80211_IFTYPE_STATION:
+	case NL80211_IFTYPE_P2P_CLIENT:
+		return cfg80211_chandef_usable(wiphy, &chandef,
+					       IEEE80211_CHAN_DISABLED);
+	default:
+		break;
+	}
+
+	return true;
+
+wdev_inactive_unlock:
+	wdev_unlock(wdev);
+	return true;
 }

static void reg_leave_invalid_chans(struct wiphy *wiphy)
--
2.1.0
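
Editorial illustration (not part of the patch): the locking problem described
in the commit message can be reproduced in miniature outside the kernel. The
sketch below is a user-space toy, assuming a non-recursive lock modeled by a
simple flag. Every toy_* name is hypothetical and only stands in for
wdev_lock(), cfg80211_reg_can_beacon() and reg_wdev_chan_valid(); none of it is
kernel code. It counts re-lock attempts instead of hanging, so the old
"validate under the lock" shape can be compared with the new "snapshot under
the lock, validate after unlocking" shape.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: it records a second acquisition by the same
 * caller instead of hanging, which is what a real mutex would do. */
struct toy_lock {
	bool held;
	int relock_attempts;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	if (l->held) {
		l->relock_attempts++;	/* a real mutex would deadlock here */
		return;
	}
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

/* Hypothetical stand-in for a wireless_dev. */
struct toy_wdev {
	struct toy_lock mtx;
	bool running;
	int channel;
};

/* Stand-in for the beacon check: it takes the wdev lock itself. */
static bool toy_can_beacon(struct toy_wdev *w, int channel)
{
	bool ok;

	toy_lock_acquire(&w->mtx);
	ok = channel > 0;	/* placeholder rule, not regulatory logic */
	toy_lock_release(&w->mtx);
	return ok;
}

/* Old shape: the check is called while the lock is still held. */
static bool toy_chan_valid_old(struct toy_wdev *w)
{
	bool ret = true;

	toy_lock_acquire(&w->mtx);
	if (w->running)
		ret = toy_can_beacon(w, w->channel);
	toy_lock_release(&w->mtx);
	return ret;
}

/* New shape: copy what is needed under the lock, then check unlocked. */
static bool toy_chan_valid_new(struct toy_wdev *w)
{
	bool running;
	int channel;

	toy_lock_acquire(&w->mtx);
	running = w->running;
	channel = w->channel;
	toy_lock_release(&w->mtx);

	if (!running)
		return true;	/* inactive interfaces are always "valid" */
	return toy_can_beacon(w, channel);
}
```

With an active toy_wdev, toy_chan_valid_old() registers one re-lock attempt,
while toy_chan_valid_new() registers none. The trade-off is that the check
then runs on a snapshot of the channel rather than on live state.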



2015-01-07 13:54:03

by Johannes Berg

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

On Mon, 2014-12-29 at 11:59 +0200, Arik Nemtsov wrote:
> If a P2P GO is active, the cfg80211_reg_can_beacon function will take
> the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock
> is already taken by the parent channel-checking function, causing a
> deadlock.
> Split the checking code into two parts. The first part will check if the
> wdev is active and save the channel under the wdev lock. The second part
> will check actual channel validity according to type.

Applied to mac80211.git.

johannes


2015-01-07 13:43:01

by Arik Nemtsov

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

On Wed, Jan 7, 2015 at 3:37 PM, Johannes Berg <[email protected]> wrote:
> On Wed, 2015-01-07 at 15:34 +0200, Arik Nemtsov wrote:
>
>> > I'm not convinced this is the right thing to do. When checking for the
>> > current wdev that it can use a channel, then it seems that its own
>> > current BSS connection (if any) shouldn't actually be taken into account
>> > - ergo the lock shouldn't have to be taken, that interface should be
>> > excluded from the "can beacon due to concurrent check" anyway.
>>
>> We have a couple of checks we want to add in the pipeline that also
>> need "this" wdev in the concurrent check, so I'd prefer to avoid this.
>
> Why would you need to check "this" wdev when doing something for "this"
> wdev? Seems odd? But I'm willing to learn :)

There's some convoluted regulatory logic where if this GO (or any
other) is operating on this GO_CONCURRENT (and not indoor-only)
channel, then it may continue in its operation even after the STA that
operated concurrently has disconnected.

>
>> > Also, the only reason this can happen anyway is when you call "can
>> > beacon" for a station interface - which seems nonsensical. Given that
>>
>> This is not true. This happens with current code for a p2p-go
>> interface during channel validity checks in reg.c.
>
> Not sure I see this? The only thing doing wdev locking is
> cfg80211_go_permissive_chan(), no? And that only for station interfaces.

cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
currently only for GO interfaces, but for STA also in the future
(hopefully).
The latter is called during channel validity checks for GO.

Arik

2015-01-07 13:46:37

by Johannes Berg

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

On Wed, 2015-01-07 at 15:42 +0200, Arik Nemtsov wrote:
> On Wed, Jan 7, 2015 at 3:37 PM, Johannes Berg <[email protected]> wrote:
> > On Wed, 2015-01-07 at 15:34 +0200, Arik Nemtsov wrote:
> >
> >> > I'm not convinced this is the right thing to do. When checking for the
> >> > current wdev that it can use a channel, then it seems that its own
> >> > current BSS connection (if any) shouldn't actually be taken into account
> >> > - ergo the lock shouldn't have to be taken, that interface should be
> >> > excluded from the "can beacon due to concurrent check" anyway.
> >>
> >> We have a couple of checks we want to add in the pipeline that also
> >> need "this" wdev in the concurrent check, so I'd prefer to avoid this.
> >
> > Why would you need to check "this" wdev when doing something for "this"
> > wdev? Seems odd? But I'm willing to learn :)
>
> There's some convoluted regulatory logic where if this GO (or any
> other) is operating on this GO_CONCURRENT (and not indoor-only)
> channel, then it may continue in its operation even after the STA that
> operated concurrently has disconnected.

Uh, ok, not sure I have that yet...

> >
> >> > Also, the only reason this can happen anyway is when you call "can
> >> > beacon" for a station interface - which seems nonsensical. Given that
> >>
> >> This is not true. This happens with current code for a p2p-go
> >> interface during channel validity checks in reg.c.
> >
> > Not sure I see this? The only thing doing wdev locking is
> > cfg80211_go_permissive_chan(), no? And that only for station interfaces.
>
> cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
> currently only for GO interfaces, but for STA also in the future
> (hopefully).
> The latter is called during channel validity checks for GO.

Ok.

Should I just apply the patch as it is then?

johannes


2015-01-07 13:52:39

by Arik Nemtsov

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

On Wed, Jan 7, 2015 at 3:50 PM, Johannes Berg <[email protected]> wrote:
> On Wed, 2015-01-07 at 15:48 +0200, Arik Nemtsov wrote:
>> >
>> >> >
>> >> >> > Also, the only reason this can happen anyway is when you call "can
>> >> >> > beacon" for a station interface - which seems nonsensical. Given that
>> >> >>
>> >> >> This is not true. This happens with current code for a p2p-go
>> >> >> interface during channel validity checks in reg.c.
>> >> >
>> >> > Not sure I see this? The only thing doing wdev locking is
>> >> > cfg80211_go_permissive_chan(), no? And that only for station interfaces.
>> >>
>> >> cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
>> >> currently only for GO interfaces, but for STA also in the future
>> >> (hopefully).
>> >> The latter is called during channel validity checks for GO.
>> >
>> > Ok.
>> >
>> > Should I just apply the patch as it is then?
>>
>> It fixes a real existing deadlock, so I think so, yea.
>
> Is it needed on 3.19?

Yes, since the channel validity checking is already there. Basically
everyone that sets up a GO and has some regulatory change afterwards
might deadlock.

Arik

2015-01-07 13:50:20

by Johannes Berg

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

On Wed, 2015-01-07 at 15:48 +0200, Arik Nemtsov wrote:
> >
> >> >
> >> >> > Also, the only reason this can happen anyway is when you call "can
> >> >> > beacon" for a station interface - which seems nonsensical. Given that
> >> >>
> >> >> This is not true. This happens with current code for a p2p-go
> >> >> interface during channel validity checks in reg.c.
> >> >
> >> > Not sure I see this? The only thing doing wdev locking is
> >> > cfg80211_go_permissive_chan(), no? And that only for station interfaces.
> >>
> >> cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
> >> currently only for GO interfaces, but for STA also in the future
> >> (hopefully).
> >> The latter is called during channel validity checks for GO.
> >
> > Ok.
> >
> > Should I just apply the patch as it is then?
>
> It fixes a real existing deadlock, so I think so, yea.

Is it needed on 3.19?

johannes


2015-01-07 13:37:56

by Johannes Berg

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

On Wed, 2015-01-07 at 15:34 +0200, Arik Nemtsov wrote:

> > I'm not convinced this is the right thing to do. When checking for the
> > current wdev that it can use a channel, then it seems that its own
> > current BSS connection (if any) shouldn't actually be taken into account
> > - ergo the lock shouldn't have to be taken, that interface should be
> > excluded from the "can beacon due to concurrent check" anyway.
>
> We have a couple of checks we want to add in the pipeline that also
> need "this" wdev in the concurrent check, so I'd prefer to avoid this.

Why would you need to check "this" wdev when doing something for "this"
wdev? Seems odd? But I'm willing to learn :)

> Unless we treat the exclude_wdev as "already locked wdev", which I
> think is uglier than what I do here.

Yeah that doesn't seem right, agree.

> > Also, the only reason this can happen anyway is when you call "can
> > beacon" for a station interface - which seems nonsensical. Given that
>
> This is not true. This happens with current code for a p2p-go
> interface during channel validity checks in reg.c.

Not sure I see this? The only thing doing wdev locking is
cfg80211_go_permissive_chan(), no? And that only for station interfaces.

johannes


2015-01-07 13:49:12

by Arik Nemtsov

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

>
>> >
>> >> > Also, the only reason this can happen anyway is when you call "can
>> >> > beacon" for a station interface - which seems nonsensical. Given that
>> >>
>> >> This is not true. This happens with current code for a p2p-go
>> >> interface during channel validity checks in reg.c.
>> >
>> > Not sure I see this? The only thing doing wdev locking is
>> > cfg80211_go_permissive_chan(), no? And that only for station interfaces.
>>
>> cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
>> currently only for GO interfaces, but for STA also in the future
>> (hopefully).
>> The latter is called during channel validity checks for GO.
>
> Ok.
>
> Should I just apply the patch as it is then?

It fixes a real existing deadlock, so I think so, yea.

Arik

2015-01-07 13:34:25

by Arik Nemtsov

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

On Tue, Jan 6, 2015 at 12:51 PM, Johannes Berg
<[email protected]> wrote:
> On Mon, 2014-12-29 at 11:59 +0200, Arik Nemtsov wrote:
>> If a P2P GO is active, the cfg80211_reg_can_beacon function will take
>> the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock
>> is already taken by the parent channel-checking function, causing a
>> deadlock.
>> Split the checking code into two parts. The first part will check if the
>> wdev is active and save the channel under the wdev lock. The second part
>> will check actual channel validity according to type.
>
> I'm not convinced this is the right thing to do. When checking for the
> current wdev that it can use a channel, then it seems that its own
> current BSS connection (if any) shouldn't actually be taken into account
> - ergo the lock shouldn't have to be taken, that interface should be
> excluded from the "can beacon due to concurrent check" anyway.

We have a couple of checks we want to add in the pipeline that also
need "this" wdev in the concurrent check, so I'd prefer to avoid this.

Unless we treat the exclude_wdev as "already locked wdev", which I
think is uglier than what I do here.

>
> Also, the only reason this can happen anyway is when you call "can
> beacon" for a station interface - which seems nonsensical. Given that

This is not true. This happens with current code for a p2p-go
interface during channel validity checks in reg.c.

Arik

2015-01-06 10:51:57

by Johannes Berg

Subject: Re: [PATCH] cfg80211: fix deadlock during reg chan check

On Mon, 2014-12-29 at 11:59 +0200, Arik Nemtsov wrote:
> If a P2P GO is active, the cfg80211_reg_can_beacon function will take
> the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock
> is already taken by the parent channel-checking function, causing a
> deadlock.
> Split the checking code into two parts. The first part will check if the
> wdev is active and save the channel under the wdev lock. The second part
> will check actual channel validity according to type.

I'm not convinced this is the right thing to do. When checking for the
current wdev that it can use a channel, then it seems that its own
current BSS connection (if any) shouldn't actually be taken into account
- ergo the lock shouldn't have to be taken, that interface should be
excluded from the "can beacon due to concurrent check" anyway.

Also, the only reason this can happen anyway is when you call "can
beacon" for a station interface - which seems nonsensical. Given that
this is now really becoming far more complex than originally designed
("can beacon") with TDLS ("can IR") perhaps this should be
* renamed to reg_can_ir()
* passed an "exclude wdev"
* (and use mutex_lock_nested with a big comment to explain to lockdep
  what's going on)

or so.

johannes
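
Editorial illustration (not from the thread): the "exclude wdev" idea proposed
above can be sketched in user space as follows. This is a toy under stated
assumptions, not the cfg80211 API: struct sketch_wdev, its fields, and
sketch_concurrent_sta_on_chan() are all hypothetical, and the lock is modeled
as a plain flag. The caller passes the interface whose lock it already holds,
and the scan skips that interface instead of trying to lock it again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a wireless_dev with a toy lock flag. */
struct sketch_wdev {
	bool locked;		/* true while some caller holds the lock */
	bool is_station;
	int channel;		/* 0 means "not connected" */
};

/*
 * Return true if a station interface other than 'exclude' is using
 * 'channel'. 'exclude' is the interface whose lock the caller already
 * holds, so it is skipped. '*lock_conflicts' counts interfaces that
 * were found locked, i.e. places where a real mutex would have hung
 * or blocked.
 */
static bool sketch_concurrent_sta_on_chan(struct sketch_wdev *wdevs, size_t n,
					  const struct sketch_wdev *exclude,
					  int channel, int *lock_conflicts)
{
	bool found = false;
	size_t i;

	for (i = 0; i < n; i++) {
		struct sketch_wdev *w = &wdevs[i];

		if (w == exclude)
			continue;	/* caller holds this lock already */

		if (w->locked) {
			(*lock_conflicts)++;
			continue;
		}

		w->locked = true;
		if (w->is_station && w->channel == channel)
			found = true;
		w->locked = false;
	}

	return found;
}
```

Passing the already-locked interface as 'exclude' yields no lock conflicts,
whereas passing NULL makes the scan run into the held lock. As the replies in
this thread point out, skipping "this" wdev also removes it from the
concurrency check, which was the stated reason for not taking this route.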