2017-05-31 10:07:39

by Jia-Ju Bai

Subject: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

The driver may sleep under a spin lock, and the function call path is:
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
b43legacy_synchronize_irq
synchronize_irq --> may sleep

To fix it, the lock is released before b43legacy_synchronize_irq, and the
lock is acquired again after this function.

Signed-off-by: Jia-Ju Bai <[email protected]>
---
drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
 	b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);

 	if (changed & BSS_CHANGED_BSSID) {
+		spin_unlock_irqrestore(&wl->irq_lock, flags);
 		b43legacy_synchronize_irq(dev);
+		spin_lock_irqsave(&wl->irq_lock, flags);

 		if (conf->bssid)
 			memcpy(wl->bssid, conf->bssid, ETH_ALEN);
--
1.7.9.5


2017-05-31 15:33:00

by Michael Büsch

Subject: Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

On Wed, 31 May 2017 13:26:43 +0300
Kalle Valo <[email protected]> wrote:

> Jia-Ju Bai <[email protected]> writes:
>
> > The driver may sleep under a spin lock, and the function call path is:
> > b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
> > b43legacy_synchronize_irq
> > synchronize_irq --> may sleep
> >
> > To fix it, the lock is released before b43legacy_synchronize_irq, and the
> > lock is acquired again after this function.
> >
> > Signed-off-by: Jia-Ju Bai <[email protected]>
> > ---
> > drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
> > index f1e3dad..31ead21 100644
> > --- a/drivers/net/wireless/broadcom/b43legacy/main.c
> > +++ b/drivers/net/wireless/broadcom/b43legacy/main.c
> > @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
> > b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
> >
> > if (changed & BSS_CHANGED_BSSID) {
> > + spin_unlock_irqrestore(&wl->irq_lock, flags);
> > b43legacy_synchronize_irq(dev);
> > + spin_lock_irqsave(&wl->irq_lock, flags);
>
> To me this looks like a fragile workaround and not a real fix. You can
> easily add new race conditions with releasing the lock like this.
>


I think releasing the lock possibly is fine. It certainly is better than
sleeping with a lock held.
We disabled the device interrupts just before this line.

However I think the synchronize_irq should be outside of the
conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
two lines above)
I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
is set.


On the other hand b43 does not have this irq-disabling foobar anymore.
So somebody must have removed it. Maybe you can find the commit that
removed this stuff from b43 and port it to b43legacy?


So I would vote for moving the synchronize_irq up outside of the
conditional and put the unlock/lock sequence around it.
And as a second patch on top of that try to remove this stuff
altogether like b43 did.

--
Michael
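
For readers following the thread, the ordering suggested in the message above can be modeled in plain userspace C. This is only a toy sketch under stated assumptions, not kernel code: the model_* names and struct model_wl are hypothetical, a boolean stands in for wl->irq_lock, and model_synchronize_irq() merely records whether it was called with the lock held (the situation in which the real synchronize_irq() could sleep in atomic context).

```c
/* Userspace toy model, NOT kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_ETH_ALEN 6

struct model_wl {
	bool irq_lock_held;                  /* stands in for wl->irq_lock */
	unsigned int irq_mask;               /* stands in for the IRQ mask register */
	unsigned char bssid[MODEL_ETH_ALEN];
	int sleep_in_atomic;                 /* "may sleep" calls made under the lock */
};

static void model_lock(struct model_wl *wl)
{
	assert(!wl->irq_lock_held);          /* a spinlock must not be taken twice */
	wl->irq_lock_held = true;
}

static void model_unlock(struct model_wl *wl)
{
	assert(wl->irq_lock_held);
	wl->irq_lock_held = false;
}

/* Stands in for synchronize_irq(): it may sleep, so a call with the
 * lock held is counted as a sleep-in-atomic event. */
static void model_synchronize_irq(struct model_wl *wl)
{
	if (wl->irq_lock_held)
		wl->sleep_in_atomic++;
}

/* Ordering before the patch: synchronize under the lock, and only
 * when the BSSID changed. */
static void model_bss_info_changed_old(struct model_wl *wl, bool bssid_changed,
				       const unsigned char *bssid)
{
	model_lock(wl);
	wl->irq_mask = 0;                    /* mask all device interrupts */
	if (bssid_changed) {
		model_synchronize_irq(wl);   /* recorded as a bug: lock is held */
		if (bssid)
			memcpy(wl->bssid, bssid, MODEL_ETH_ALEN);
	}
	model_unlock(wl);
}

/* Ordering suggested in the reply: synchronize unconditionally, right
 * after masking the interrupts, with the lock dropped around the call. */
static void model_bss_info_changed_proposed(struct model_wl *wl, bool bssid_changed,
					    const unsigned char *bssid)
{
	model_lock(wl);
	wl->irq_mask = 0;                    /* mask all device interrupts */
	model_unlock(wl);
	model_synchronize_irq(wl);           /* lock not held, sleeping is allowed */
	model_lock(wl);
	if (bssid_changed && bssid)
		memcpy(wl->bssid, bssid, MODEL_ETH_ALEN);
	model_unlock(wl);
}
```

In this model, the old ordering with a BSSID change leaves sleep_in_atomic at 1, while the proposed ordering leaves it at 0. The model has no concurrency, so it says nothing about the race-condition concern raised elsewhere in the thread.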



2017-05-31 12:15:50

by Arend van Spriel

Subject: Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

On 31-05-17 12:26, Kalle Valo wrote:
> Jia-Ju Bai <[email protected]> writes:
>
>> The driver may sleep under a spin lock, and the function call path is:
>> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
>> b43legacy_synchronize_irq
>> synchronize_irq --> may sleep
>>
>> To fix it, the lock is released before b43legacy_synchronize_irq, and the
>> lock is acquired again after this function.
>>
>> Signed-off-by: Jia-Ju Bai <[email protected]>
>> ---
>> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
>> index f1e3dad..31ead21 100644
>> --- a/drivers/net/wireless/broadcom/b43legacy/main.c
>> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c
>> @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
>> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
>>
>> if (changed & BSS_CHANGED_BSSID) {
>> + spin_unlock_irqrestore(&wl->irq_lock, flags);
>> b43legacy_synchronize_irq(dev);
>> + spin_lock_irqsave(&wl->irq_lock, flags);
>
> To me this looks like a fragile workaround and not a real fix. You can
> easily add new race conditions with releasing the lock like this.

Hi Jia-Ju,

Agree with Kalle as I was about to say the same thing. You really need
to determine what is protected by the irq_lock. Here you are using the
lock because you are about to change wl->bssid a bit further down. Did
not check the entire function but it seems the lock perimeter is too wide.

Regards,
Arend

2017-05-31 10:27:20

by Kalle Valo

Subject: Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

Jia-Ju Bai <[email protected]> writes:

> The driver may sleep under a spin lock, and the function call path is:
> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
> b43legacy_synchronize_irq
> synchronize_irq --> may sleep
>
> To fix it, the lock is released before b43legacy_synchronize_irq, and the
> lock is acquired again after this function.
>
> Signed-off-by: Jia-Ju Bai <[email protected]>
> ---
> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
> index f1e3dad..31ead21 100644
> --- a/drivers/net/wireless/broadcom/b43legacy/main.c
> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c
> @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
>
> if (changed & BSS_CHANGED_BSSID) {
> + spin_unlock_irqrestore(&wl->irq_lock, flags);
> b43legacy_synchronize_irq(dev);
> + spin_lock_irqsave(&wl->irq_lock, flags);

To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.

--
Kalle Valo

2017-06-01 05:29:43

by Michael Büsch

Subject: Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

On Thu, 01 Jun 2017 07:27:20 +0300
Kalle Valo <[email protected]> wrote:

> Michael Büsch <[email protected]> writes:
>
> >> > --- a/drivers/net/wireless/broadcom/b43legacy/main.c
> >> > +++ b/drivers/net/wireless/broadcom/b43legacy/main.c
> >> > @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
> >> > b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
> >> >
> >> > if (changed & BSS_CHANGED_BSSID) {
> >> > + spin_unlock_irqrestore(&wl->irq_lock, flags);
> >> > b43legacy_synchronize_irq(dev);
> >> > + spin_lock_irqsave(&wl->irq_lock, flags);
> >>
> >> To me this looks like a fragile workaround and not a real fix. You can
> >> easily add new race conditions with releasing the lock like this.
> >>
> >
> >
> > I think releasing the lock possibly is fine. It certainly is better than
> > sleeping with a lock held.
>
> Sure, but IMHO in general I think the practise of releasing the lock
> like this in the middle of a function is dangerous as one can easily miss
> that upper and lower halves of the function are not actually atomic
> anymore. And in this case, the fact that it's under a conditional makes
> it even worse.
>


Yes in general I agree. Releasing and re-acquiring a lock is dangerous.
But I think in this special case here it might be harmless.
The irq_lock is used mostly (if not exclusively; I don't fully
remember) to protect against the IRQ top and bottom half.
But we disabled the device IRQs a line above and the purpose of this
synchronize is to make sure the handler will finish and thus make
dropping the lock safe.
Of course it does not make sense to do this with the lock held :)

--
Michael



2017-06-01 04:27:54

by Kalle Valo

Subject: Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

Michael Büsch <[email protected]> writes:

>> > --- a/drivers/net/wireless/broadcom/b43legacy/main.c
>> > +++ b/drivers/net/wireless/broadcom/b43legacy/main.c
>> > @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
>> > b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
>> >
>> > if (changed & BSS_CHANGED_BSSID) {
>> > + spin_unlock_irqrestore(&wl->irq_lock, flags);
>> > b43legacy_synchronize_irq(dev);
>> > + spin_lock_irqsave(&wl->irq_lock, flags);
>>
>> To me this looks like a fragile workaround and not a real fix. You can
>> easily add new race conditions with releasing the lock like this.
>>
>
>
> I think releasing the lock possibly is fine. It certainly is better than
> sleeping with a lock held.

Sure, but IMHO in general I think the practise of releasing the lock
like this in the middle of a function is dangerous as one can easily miss
that upper and lower halves of the function are not actually atomic
anymore. And in this case, the fact that it's under a conditional makes
it even worse.

--
Kalle Valo

2017-06-01 01:06:07

by Jia-Ju Bai

Subject: Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

On 06/01/2017 08:07 AM, Larry Finger wrote:
> On 05/31/2017 10:32 AM, Michael Büsch wrote:
>> On Wed, 31 May 2017 13:26:43 +0300
>> Kalle Valo <[email protected]> wrote:
>>
>>> Jia-Ju Bai <[email protected]> writes:
>>>
>>>> The driver may sleep under a spin lock, and the function call path is:
>>>> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
>>>> b43legacy_synchronize_irq
>>>> synchronize_irq --> may sleep
>>>>
>>>> To fix it, the lock is released before b43legacy_synchronize_irq,
>>>> and the
>>>> lock is acquired again after this function.
>>>>
>>>> Signed-off-by: Jia-Ju Bai <[email protected]>
>>>> ---
>>>> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c
>>>> b/drivers/net/wireless/broadcom/b43legacy/main.c
>>>> index f1e3dad..31ead21 100644
>>>> --- a/drivers/net/wireless/broadcom/b43legacy/main.c
>>>> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c
>>>> @@ -2859,7 +2859,9 @@ static void
>>>> b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
>>>> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
>>>> if (changed & BSS_CHANGED_BSSID) {
>>>> + spin_unlock_irqrestore(&wl->irq_lock, flags);
>>>> b43legacy_synchronize_irq(dev);
>>>> + spin_lock_irqsave(&wl->irq_lock, flags);
>>>
>>> To me this looks like a fragile workaround and not a real fix. You can
>>> easily add new race conditions with releasing the lock like this.
>>>
>>
>>
>> I think releasing the lock possibly is fine. It certainly is better than
>> sleeping with a lock held.
>> We disabled the device interrupts just before this line.
>>
>> However I think the synchronize_irq should be outside of the
>> conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
>> two lines above)
>> I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
>> is set.
>>
>>
>> On the other hand b43 does not have this irq-disabling foobar anymore.
>> So somebody must have removed it. Maybe you can find the commit that
>> removed this stuff from b43 and port it to b43legacy?
>>
>>
>> So I would vote for moving the synchronize_irq up outside of the
>> conditional and put the unlock/lock sequence around it.
>> And as a second patch on top of that try to remove this stuff
>> altogether like b43 did.
>
> The patch that removed it in b43 is
>
> commit 36dbd9548e92268127b0c31b0e121e63e9207108
> Author: Michael Buesch <[email protected]>
> Date: Fri Sep 4 22:51:29 2009 +0200
>
> b43: Use a threaded IRQ handler
>
> Use a threaded IRQ handler to allow locking the mutex and
> sleeping while executing an interrupt.
> This removes usage of the irq_lock spinlock, but introduces
> a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel
> hard-irq handler. Sleeping busses (SDIO) will use mutex instead.
>
> Signed-off-by: Michael Buesch <[email protected]>
> Tested-by: Larry Finger <[email protected]>
> Signed-off-by: John W. Linville <[email protected]>
>
> I vaguely remember this patch. Although it is roughly a 1000-line fix,
> I will try to port it to b43legacy. I still have an old BCM4306 PCMCIA
> card that I can test in a PowerBook G4.
>
> I agree with Michael that this is the way to go. Both of Jia-Ju's
> patches should be rejected.
>
> Larry
>
>

It is fine with me to fix the bug by porting that earlier patch.

Thanks,
Jia-Ju Bai

2017-06-01 05:32:48

by Michael Büsch

Subject: Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

On Wed, 31 May 2017 19:07:15 -0500
Larry Finger <[email protected]> wrote:

> On 05/31/2017 10:32 AM, Michael Büsch wrote:
> > On Wed, 31 May 2017 13:26:43 +0300
> > Kalle Valo <[email protected]> wrote:
> >
> >> Jia-Ju Bai <[email protected]> writes:
> >>
> >>> The driver may sleep under a spin lock, and the function call path is:
> >>> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
> >>> b43legacy_synchronize_irq
> >>> synchronize_irq --> may sleep
> >>>
> >>> To fix it, the lock is released before b43legacy_synchronize_irq, and the
> >>> lock is acquired again after this function.
> >>>
> >>> Signed-off-by: Jia-Ju Bai <[email protected]>
> >>> ---
> >>> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
> >>> 1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
> >>> index f1e3dad..31ead21 100644
> >>> --- a/drivers/net/wireless/broadcom/b43legacy/main.c
> >>> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c
> >>> @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
> >>> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
> >>>
> >>> if (changed & BSS_CHANGED_BSSID) {
> >>> + spin_unlock_irqrestore(&wl->irq_lock, flags);
> >>> b43legacy_synchronize_irq(dev);
> >>> + spin_lock_irqsave(&wl->irq_lock, flags);
> >>
> >> To me this looks like a fragile workaround and not a real fix. You can
> >> easily add new race conditions with releasing the lock like this.
> >>
> >
> >
> > I think releasing the lock possibly is fine. It certainly is better than
> > sleeping with a lock held.
> > We disabled the device interrupts just before this line.
> >
> > However I think the synchronize_irq should be outside of the
> > conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
> > two lines above)
> > I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
> > is set.
> >
> >
> > On the other hand b43 does not have this irq-disabling foobar anymore.
> > So somebody must have removed it. Maybe you can find the commit that
> > removed this stuff from b43 and port it to b43legacy?
> >
> >
> > So I would vote for moving the synchronize_irq up outside of the
> > conditional and put the unlock/lock sequence around it.
> > And as a second patch on top of that try to remove this stuff
> > altogether like b43 did.
>
> The patch that removed it in b43 is
>
> commit 36dbd9548e92268127b0c31b0e121e63e9207108
> Author: Michael Buesch <[email protected]>
> Date: Fri Sep 4 22:51:29 2009 +0200

Damn it :D

> b43: Use a threaded IRQ handler
>
> Use a threaded IRQ handler to allow locking the mutex and
> sleeping while executing an interrupt.
> This removes usage of the irq_lock spinlock, but introduces
> a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel
> hard-irq handler. Sleeping busses (SDIO) will use mutex instead.
>
> Signed-off-by: Michael Buesch <[email protected]>
> Tested-by: Larry Finger <[email protected]>
> Signed-off-by: John W. Linville <[email protected]>
>
> I vaguely remember this patch. Although it is roughly a 1000-line fix, I will
> try to port it to b43legacy. I still have an old BCM4306 PCMCIA card that I can
> test in a PowerBook G4.
>
> I agree with Michael that this is the way to go. Both of Jia-Ju's patches should
> be rejected.


I'm not sure if it's worth it. There is a risk that this would
introduce new bugs.
But sure, please feel free to try it. This way we can find out how big
this change becomes.

--
Michael



2017-06-01 00:07:18

by Larry Finger

Subject: Re: [PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed

On 05/31/2017 10:32 AM, Michael Büsch wrote:
> On Wed, 31 May 2017 13:26:43 +0300
> Kalle Valo <[email protected]> wrote:
>
>> Jia-Ju Bai <[email protected]> writes:
>>
>>> The driver may sleep under a spin lock, and the function call path is:
>>> b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
>>> b43legacy_synchronize_irq
>>> synchronize_irq --> may sleep
>>>
>>> To fix it, the lock is released before b43legacy_synchronize_irq, and the
>>> lock is acquired again after this function.
>>>
>>> Signed-off-by: Jia-Ju Bai <[email protected]>
>>> ---
>>> drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
>>> index f1e3dad..31ead21 100644
>>> --- a/drivers/net/wireless/broadcom/b43legacy/main.c
>>> +++ b/drivers/net/wireless/broadcom/b43legacy/main.c
>>> @@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
>>> b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
>>>
>>> if (changed & BSS_CHANGED_BSSID) {
>>> + spin_unlock_irqrestore(&wl->irq_lock, flags);
>>> b43legacy_synchronize_irq(dev);
>>> + spin_lock_irqsave(&wl->irq_lock, flags);
>>
>> To me this looks like a fragile workaround and not a real fix. You can
>> easily add new race conditions with releasing the lock like this.
>>
>
>
> I think releasing the lock possibly is fine. It certainly is better than
> sleeping with a lock held.
> We disabled the device interrupts just before this line.
>
> However I think the synchronize_irq should be outside of the
> conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
> two lines above)
> I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
> is set.
>
>
> On the other hand b43 does not have this irq-disabling foobar anymore.
> So somebody must have removed it. Maybe you can find the commit that
> removed this stuff from b43 and port it to b43legacy?
>
>
> So I would vote for moving the synchronize_irq up outside of the
> conditional and put the unlock/lock sequence around it.
> And as a second patch on top of that try to remove this stuff
> altogether like b43 did.

The patch that removed it in b43 is

commit 36dbd9548e92268127b0c31b0e121e63e9207108
Author: Michael Buesch <[email protected]>
Date: Fri Sep 4 22:51:29 2009 +0200

b43: Use a threaded IRQ handler

Use a threaded IRQ handler to allow locking the mutex and
sleeping while executing an interrupt.
This removes usage of the irq_lock spinlock, but introduces
a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel
hard-irq handler. Sleeping busses (SDIO) will use mutex instead.

Signed-off-by: Michael Buesch <[email protected]>
Tested-by: Larry Finger <[email protected]>
Signed-off-by: John W. Linville <[email protected]>

I vaguely remember this patch. Although it is roughly a 1000-line fix, I will
try to port it to b43legacy. I still have an old BCM4306 PCMCIA card that I can
test in a PowerBook G4.

I agree with Michael that this is the way to go. Both of Jia-Ju's patches should
be rejected.

Larry