Isn't there a race there when you remove interfaces and at the same time
__ieee80211_rx is running? I don't see anything that should stop that,
and if it happens we'll probably blow up pretty spectacularly with
accesses to a freed netdev, or even sending it frames...
johannes
On Tuesday 27 March 2007 01:05, Johannes Berg wrote:
> Isn't there a race there when you remove interfaces and at the same time
> __ieee80211_rx is running? I don't see anything that should stop that,
> and if it happens we'll probably blow up pretty spectacularly with
> accesses to a freed netdev, or even sending it frames...
Well, IMHO __ieee80211_rx should be internal to mac80211 and not be exported.
(Same goes for the IRQ-unsafe txstatus report). We can't (and we really don't
want to) call synchronously into the stack from any driver. We want to
defer the packet processing work to a tasklet. So we always want to
call the _irqsafe functions. Why do we? Because of locking issues.
1) There's no need to hold the lowlevel driver IRQ spinlock while
processing the packet in the upper layers. We want to release it. Easiest
and non-racy way to do this is to use the _irqsafe function and its tasklet.
2) We want to prevent recursion back into the driver through a callback,
which would result in lock recursion (we had these problems in softmac, as
it is synchronous in most rx parts).
So why do I tell you this?
I think there's no way we can call __ieee80211_rx after removing the
interface, as the tasklet should be stopped first. But someone should
verify if I'm right here.
But anyway, the driver shouldn't be able to call __ieee80211_rx after the
device is down, too, as the driver has to make sure IRQs are stopped
when the device is brought down.
So we have kind of implicit locking here, but I think it's ok.
--
Greetings Michael.
On Wed, 2007-03-28 at 12:59 +0200, Michael Buesch wrote:
> Oh, you mean virtual interfaces.. yeah, there's probably a race. I don't
> know the locking there.
Yes, I mean virtual interfaces, when the thing is in progress while we
iterate through.
> But my comment about the non-irqsafe-variants-should-be-hidden still applies :)
I'll reread it ;)
But I'm not sure I fully agree. If the driver calls __ieee80211_rx in
softirq context that should be good without requiring to reschedule in
mac80211.
johannes
On Wednesday 28 March 2007 12:52, Michael Buesch wrote:
> On Tuesday 27 March 2007 01:05, Johannes Berg wrote:
> > Isn't there a race there when you remove interfaces and at the same time
> > __ieee80211_rx is running? I don't see anything that should stop that,
> > and if it happens we'll probably blow up pretty spectacularly with
> > accesses to a freed netdev, or even sending it frames...
>
> Well, IMHO __ieee80211_rx should be internal to mac80211 and not be exported.
> (Same goes for the IRQ-unsafe txstatus report). We can't (and we really don't
> want to) call synchronously into the stack from any driver. We want to
> defer the packet processing work to a tasklet. So we always want to
> call the _irqsafe functions. Why do we? Because of locking issues.
> 1) There's no need to hold the lowlevel driver IRQ spinlock while
> processing the packet in the upper layers. We want to release it. Easiest
> and non-racy way to do this is to use the _irqsafe function and its tasklet.
> 2) We want to prevent recursion back into the driver through a callback,
> which would result in lock recursion (we had these problems in softmac, as
> it is synchronous in most rx parts).
>
> So why do I tell you this?
> I think there's no way we can call __ieee80211_rx after removing the
> interface, as the tasklet should be stopped first. But someone should
> verify if I'm right here.
> But anyway, the driver shouldn't be able to call __ieee80211_rx after the
> device is down, too, as the driver has to make sure IRQs are stopped
> when the device is brought down.
>
> So we have kind of implicit locking here, but I think it's ok.
Oh, you mean virtual interfaces.. yeah, there's probably a race. I don't
know the locking there.
But my comment about the non-irqsafe-variants-should-be-hidden still applies :)
--
Greetings Michael.
On Wednesday 28 March 2007 13:07, Johannes Berg wrote:
> On Wed, 2007-03-28 at 12:59 +0200, Michael Buesch wrote:
>
> > Oh, you mean virtual interfaces.. yeah, there's probably a race. I don't
> > know the locking there.
>
> Yes, I mean virtual interfaces, when the thing is in progress while we
> iterate through.
>
> > But my comment about the non-irqsafe-variants-should-be-hidden still applies :)
>
> I'll reread it ;)
>
> But I'm not sure I fully agree. If the driver calls __ieee80211_rx in
> softirq context that should be good without requiring to reschedule in
> mac80211.
I'm not sure there is an easy and non-racy way to drop the driver IRQ lock
while calling __ieee80211_rx synchronously.
--
Greetings Michael.
On Wed, 2007-03-28 at 13:32 +0200, Michael Buesch wrote:
> On Wednesday 28 March 2007 13:07, Johannes Berg wrote:
> > On Wed, 2007-03-28 at 12:59 +0200, Michael Buesch wrote:
> >
> > > Oh, you mean virtual interfaces.. yeah, there's probably a race. I don't
> > > know the locking there.
> >
> > Yes, I mean virtual interfaces, when the thing is in progress while we
> > iterate through.
> >
> > > But my comment about the non-irqsafe-variants-should-be-hidden still applies :)
> >
> > I'll reread it ;)
> >
> > But I'm not sure I fully agree. If the driver calls __ieee80211_rx in
> > softirq context that should be good without requiring to reschedule in
> > mac80211.
>
> I'm not sure there is an easy and non-racy way to drop the driver IRQ lock
> while calling __ieee80211_rx synchronously.
Depends on the driver, I guess. rt* seem to reschedule already and call
__ieee80211_rx.
johannes
On Tue, 27 Mar 2007 01:05:22 +0200, Johannes Berg wrote:
> Isn't there a race there when you remove interfaces and at the same time
> __ieee80211_rx is running? I don't see anything that should stop that,
> and if it happens we'll probably blow up pretty spectacularly with
> accesses to a freed netdev, or even sending it frames...
Yes, there is a race.
- sta_info should be holding a reference to a net_device in its dev
field (sta_info_add).
- walking through the local->sub_if_list in __ieee80211_rx should
happen under a lock
- while invoking rx handlers in the list_for_each_entry loop (they
shouldn't be called under the lock above - hm, another thing that
makes locking in mac80211 hard) we should hold a reference to the
appropriate net_device
Jiri
--
Jiri Benc
SUSE Labs
On Wednesday 18 April 2007 14:31, Jiri Benc wrote:
> Yes, there is a race.
>
> - sta_info should be holding a reference to a net_device in its dev
> field (sta_info_add).
> - walking through the local->sub_if_list in __ieee80211_rx should
> happen under a lock
> - while invoking rx handlers in the list_for_each_entry loop (they
> shouldn't be called under the lock above - hm, another thing that
> makes locking in mac80211 hard) we should hold a reference to the
> appropriate net_device
This should be all fixed in the attached patch. It applies on top of johill's
no-rtnl-in-wext patch (because RTNLs are eliminated) and my radiotap RX patch
(just because I have it in my tree).
Unfortunately, it introduces a deadlock in the wext interface for removing
virtual interfaces, which is why this isn't signed off. I should be able to
figure that one out..
-Michael Wu