2019-08-28 21:17:07

by Denis Kenzior

[permalink] [raw]
Subject: [PATCH] cfg80211: Purge frame registrations on iftype change

Currently frame registrations are not purged, even when changing the
interface type. This can lead to potentially weird / dangerous
situations where frames possibly not relevant to a given interface
type remain registered and mgmt_frame_register is not called for the
no-longer-relevant frame types.

The kernel currently relies on userspace apps to actually purge the
registrations themselves, e.g. by closing the nl80211 socket associated
with those frames. However, this requires multiple nl80211 sockets to
be open by the userspace app, and for userspace to be aware of all state
changes. This is not something that the kernel should rely on.

This commit adds a call to cfg80211_mlme_purge_registrations() to
forcefully remove any registrations left over prior to switching the
iftype.

Cc: [email protected]

Signed-off-by: Denis Kenzior <[email protected]>
---
net/wireless/util.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/wireless/util.c b/net/wireless/util.c
index c99939067bb0..3fa092b78e62 100644
--- a/net/wireless/util.c
+++ b/net/wireless/util.c
@@ -964,6 +964,7 @@ int cfg80211_change_iface(struct cfg80211_registered_device *rdev,
}

cfg80211_process_rdev_events(rdev);
+ cfg80211_mlme_purge_registrations(dev->ieee80211_ptr);
}

err = rdev_change_virtual_intf(rdev, dev, ntype, params);
--
2.19.2


2019-08-30 08:53:32

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] cfg80211: Purge frame registrations on iftype change

On Wed, 2019-08-28 at 16:11 -0500, Denis Kenzior wrote:
> Currently frame registrations are not purged, even when changing the
> interface type. This can lead to potentially weird / dangerous
> situations where frames possibly not relevant to a given interface
> type remain registered and mgmt_frame_register is not called for the
> no-longer-relevant frame types.

I'd argue really just "weird and non-working", hardly dangerous. Even in
the mac80211 design where we want to not let you intercept e.g. AUTH
frames in client mode - if you did, then you'd just end up with a non-
working interface. Not sure I see any "dangerous situation". Not really
an all that important distinction though.

Depending on the design, it may also just be that those registrations
are *ignored*, because e.g. firmware intercepts the AUTH frame already,
which would just (maybe) confuse userspace - but that seems unlikely
since it switched interface type and has no real need for those frames
then.

> The kernel currently relies on userspace apps to actually purge the
> registrations themselves, e.g. by closing the nl80211 socket associated
> with those frames. However, this requires multiple nl80211 sockets to
> be open by the userspace app, and for userspace to be aware of all state
> changes. This is not something that the kernel should rely on.

I tend to agree with that the kernel shouldn't rely on it.

> This commit adds a call to cfg80211_mlme_purge_registrations() to
> forcefully remove any registrations left over prior to switching the
> iftype.

However, I do wonder if we should make this more transactional, and hang
on to them if the type switching fails. We're not notifying userspace
that the registrations have disappeared, so if type switching fails and
it continues to work with the old type rather than throwing its hands up
and quitting or something, it'd make a possibly bigger mess to just
silently have removed them already.

I *think* it should be safe to just move this after the switching
succeeds, since the switching can pretty much only be done at a point
where nothing is happening on the interface anyway, though that might
confuse the driver when the remove happens.

Also, perhaps it'd be better to actually hang on to those registrations
that *are* still possible afterwards? But to not confuse the driver I
guess that might require unregister/re-register to happen, all of which
requires hanging on to the list and going through it after the type
switch completed?

What do you think?

johannes

2019-08-30 15:31:37

by Denis Kenzior

[permalink] [raw]
Subject: Re: [PATCH] cfg80211: Purge frame registrations on iftype change

Hi Johannes,

On 8/30/19 3:53 AM, Johannes Berg wrote:
> On Wed, 2019-08-28 at 16:11 -0500, Denis Kenzior wrote:
>> Currently frame registrations are not purged, even when changing the
>> interface type. This can lead to potentially weird / dangerous
>> situations where frames possibly not relevant to a given interface
>> type remain registered and mgmt_frame_register is not called for the
>> no-longer-relevant frame types.
>
> I'd argue really just "weird and non-working", hardly dangerous. Even in
> the mac80211 design where we want to not let you intercept e.g. AUTH
> frames in client mode - if you did, then you'd just end up with a non-
> working interface. Not sure I see any "dangerous situation". Not really
> an all that important distinction though.

Fair enough, I'm happy to drop / reword this language. It seemed fishy
to me since the unregistration operation was not called at all, and the
driver does go to some lengths to set up the valid frame registration
types.

>
> Depending on the design, it may also just be that those registrations
> are *ignored*, because e.g. firmware intercepts the AUTH frame already,
> which would just (maybe) confuse userspace - but that seems unlikely
> since it switched interface type and has no real need for those frames
> then.

There might be corner cases where userspace gets confused and doesn't
update the frame registrations properly. For example, wpa_s/hostap does
not listen to SET_INTERFACE events that I can tell. So if some external
app sets the mode (particularly on a 'live' interface) then all kinds of
unexpected things might happen. This is one of the motivations for
restricting certain NL80211 commands to interface SOCKET_OWNER.

So really this patch is intended more as a hot-fix / backport to stable
to make sure the older kernels can deal with some of these situations.

>
>> The kernel currently relies on userspace apps to actually purge the
>> registrations themselves, e.g. by closing the nl80211 socket associated
>> with those frames. However, this requires multiple nl80211 sockets to
>> be open by the userspace app, and for userspace to be aware of all state
>> changes. This is not something that the kernel should rely on.
>
> I tend to agree with that the kernel shouldn't rely on it.
>
>> This commit adds a call to cfg80211_mlme_purge_registrations() to
>> forcefully remove any registrations left over prior to switching the
>> iftype.
>
> However, I do wonder if we should make this more transactional, and hang
> on to them if the type switching fails. We're not notifying userspace
> that the registrations have disappeared, so if type switching fails and
> it continues to work with the old type rather than throwing its hands up
> and quitting or something, it'd make a possibly bigger mess to just
> silently have removed them already.

I do like that idea, not sure how to go about implementing it though?
The failure case is a bit hard to deal with. Something like
NL80211_EXT_FEATURE_LIVE_IFTYPE_CHANGE would help, particularly if
nl80211/cfg80211 actually checked it prior to doing anything (e.g.
disconnecting, etc). That would then take care of the majority of the
'typical' failure paths. I didn't add such checking in the other patch
set since I felt you might find it overly intrusive on userspace. But
maybe we really should do this?

So playing devil's advocate, another argument might be that by the time
we got here, we've already tore down a bunch of state. E.g.
disconnected the station, stopped AP, etc. So we've already
side-effected state in a bunch of ways, what's one more?

>
> I *think* it should be safe to just move this after the switching
> succeeds, since the switching can pretty much only be done at a point
> where nothing is happening on the interface anyway, though that might
> confuse the driver when the remove happens.
>

I would concur as that is what happens today. But should it?

> Also, perhaps it'd be better to actually hang on to those registrations
> that *are* still possible afterwards? But to not confuse the driver I
> guess that might require unregister/re-register to happen, all of which
> requires hanging on to the list and going through it after the type
> switch completed?

Yes, I had those exact thoughts as well.

It isn't currently clear to me if there are any guarantees on the driver
operation call sequence that cfg80211 provides. E.g. can the driver
expect rdev_change_virtual_intf to be called only once all the old
registrations are purged and the new registrations are performed after
the fact? Or should it expect things to just happen in any order?

>
> What do you think?
>

A big part of me thinks that just wiping the slate clean and having
userspace set it up from scratch isn't that much to ask and it might
want to do that anyway. It might (a big maybe?) also make the driver's
life easier if it can rely on certain guarantees from cfg80211. E.g.
that all invalid registrations are purged.

I have seen wpa_s perform a bunch of register commands which bounce off
with an -EALREADY. So it may already be erring on the side of caution
and assuming that it needs to reset the state fully? Not sure.

But if the kernel wants to be nice and spends some cycles figuring out
which frame registrations to keep and re-register them, that is also
fine with me.

Regards,
-Denis

2019-09-11 09:54:45

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] cfg80211: Purge frame registrations on iftype change

Hi,

On Fri, 2019-08-30 at 01:32 -0500, Denis Kenzior wrote:
> Hi Johannes,
>
> On 8/30/19 3:53 AM, Johannes Berg wrote:
> > On Wed, 2019-08-28 at 16:11 -0500, Denis Kenzior wrote:
> > > Currently frame registrations are not purged, even when changing the
> > > interface type. This can lead to potentially weird / dangerous
> > > situations where frames possibly not relevant to a given interface
> > > type remain registered and mgmt_frame_register is not called for the
> > > no-longer-relevant frame types.
> >
> > I'd argue really just "weird and non-working", hardly dangerous.

I think I may just have found a way that's sort of "dangerous" in the
sense of breaking all of our tests, but hey.

> > Even in
> > the mac80211 design where we want to not let you intercept e.g. AUTH
> > frames in client mode - if you did, then you'd just end up with a non-
> > working interface. Not sure I see any "dangerous situation". Not really
> > an all that important distinction though.
>
> Fair enough, I'm happy to drop / reword this language. It seemed fishy
> to me since the unregistration operation was not called at all, and the
> driver does go to some lengths to set up the valid frame registration
> types.

Sure.

> > However, I do wonder if we should make this more transactional, and hang
> > on to them if the type switching fails. We're not notifying userspace
> > that the registrations have disappeared, so if type switching fails and
> > it continues to work with the old type rather than throwing its hands up
> > and quitting or something, it'd make a possibly bigger mess to just
> > silently have removed them already.
>
> I do like that idea, not sure how to go about implementing it though?
> The failure case is a bit hard to deal with. Something like
> NL80211_EXT_FEATURE_LIVE_IFTYPE_CHANGE would help, particularly if
> nl80211/cfg80211 actually checked it prior to doing anything (e.g.
> disconnecting, etc). That would then take care of the majority of the
> 'typical' failure paths. I didn't add such checking in the other patch
> set since I felt you might find it overly intrusive on userspace. But
> maybe we really should do this?

As I just said on the other patch, I think we probably should do that
there, if just to be able to advertise a correct set of interface types
that you can switch between there. I don't see how it'd be more
intrusive to userspace than failing later? :-)

> So playing devil's advocate, another argument might be that by the time
> we got here, we've already tore down a bunch of state. E.g.
> disconnected the station, stopped AP, etc. So we've already
> side-effected state in a bunch of ways, what's one more?

True, fair point.

> > I *think* it should be safe to just move this after the switching
> > succeeds, since the switching can pretty much only be done at a point
> > where nothing is happening on the interface anyway, though that might
> > confuse the driver when the remove happens.
> >
>
> I would concur as that is what happens today. But should it?

Well, dunno, what should happen? If you ask drivers they might want to
remove & re-register after, for those registrations that are still
possible.

> It isn't currently clear to me if there are any guarantees on the driver
> operation call sequence that cfg80211 provides. E.g. can the driver
> expect rdev_change_virtual_intf to be called only once all the old
> registrations are purged and the new registrations are performed after
> the fact? Or should it expect things to just happen in any order?

Well, evidently it cannot rely on anything today, and for the most part
I guess this is implemented in the software paths where it doesn't
really matter (the same way that mac80211 implements it).

But it probably should be defined better.

> > What do you think?
> >
>
> A big part of me thinks that just wiping the slate clean and having
> userspace set it up from scratch isn't that much to ask and it might
> want to do that anyway. It might (a big maybe?) also make the driver's
> life easier if it can rely on certain guarantees from cfg80211. E.g.
> that all invalid registrations are purged.

Yeah, fair point.

> I have seen wpa_s perform a bunch of register commands which bounce off
> with an -EALREADY. So it may already be erring on the side of caution
> and assuming that it needs to reset the state fully? Not sure.

I'm pretty sure that it does in fact go through a full reset (re-setup)
after switching things around.

> But if the kernel wants to be nice and spends some cycles figuring out
> which frame registrations to keep and re-register them, that is also
> fine with me.

Let's not then.

I've applied this patch now.

johannes

2019-09-11 14:06:48

by Denis Kenzior

[permalink] [raw]
Subject: Re: [PATCH] cfg80211: Purge frame registrations on iftype change

Hi Johannes,

>> 'typical' failure paths. I didn't add such checking in the other patch
>> set since I felt you might find it overly intrusive on userspace. But
>> maybe we really should do this?
>
> As I just said on the other patch, I think we probably should do that
> there, if just to be able to advertise a correct set of interface types
> that you can switch between there. I don't see how it'd be more
> intrusive to userspace than failing later? :-)

What I was worried about was that all the fullmac drivers would have had
to be updated to set the feature bit, and it would have caused
wpa_s/hostapd to no longer be able to do the whole set_iftype -> ebusy
-> ifdown & set_iftype retry logic until all were updated.

>> I would concur as that is what happens today. But should it?
>
> Well, dunno, what should happen? If you ask drivers they might want to
> remove & re-register after, for those registrations that are still
> possible.
>

I would think you'd want to define a clear order of operations that
cfg80211 / mac80211 would enforce :)

>
> Let's not then.
>
> I've applied this patch now.
>

Great, thanks.

Regards,
-Denis