2011-05-13 20:34:55

by James Hogan

Subject: BUG: scheduling while atomic 2.6.39-rc7 (iwl3945_irq_tasklet)

Hi,

On 2.6.39-rc7 I've seen a panic due to "BUG: scheduling while atomic"
with the backtrace below (not much detail as it was written in a text
message while it was displayed on the screen!). All worked fine in
2.6.38.

This was soon after resuming from suspend (enough time to unlock the
screen, but not much else). I think it was the same bug I saw in rc2 but
didn't have time to track down. I can probably get it to happen again if
more detail is needed. It doesn't happen every suspend (I think it had
survived a couple of suspend/resume cycles at this point).

I could bisect if necessary, but hopefully the backtrace will be enough
to see what's going on?

Cheers
James

...
die
do_general_protection
? restore_args
general protection
? iwl3945_irq_tasklet
? handle_irq_event
tasklet_action
__do_softirq
? lock_release
call_softirq
do_softirq
irq_exit
do_IRQ
common_interrupt


2011-05-16 17:31:02

by John W. Linville

Subject: Re: BUG: scheduling while atomic 2.6.39-rc7 (iwl3945_irq_tasklet)

On Fri, May 13, 2011 at 09:34:49PM +0100, James Hogan wrote:
> Hi,
>
> On 2.6.39-rc7 I've seen a panic due to "BUG: scheduling while atomic"
> with the backtrace below (not much detail as it was written in a text
> message while it was displayed on the screen!). All worked fine in
> 2.6.38.
>
> This was soon after resuming from suspend (enough time to unlock the
> screen, but not much else). I think it was the same bug I saw in rc2 but
> didn't have time to track down. I can probably get it to happen again if
> more detail is needed. It doesn't happen every suspend (I think it had
> survived a couple of suspend/resume cycles at this point).
>
> I could bisect if necessary, but hopefully the backtrace will be enough
> to see what's going on?

A bisect might be very helpful -- time is short for 2.6.39 already.

John
--
John W. Linville                Someday the world will need a hero, and you
[email protected]                  might be all we have.  Be ready.

2011-05-17 08:41:42

by James Hogan

Subject: Re: BUG: scheduling while atomic 2.6.39-rc7 (iwl3945_irq_tasklet)

On 16 May 2011 18:25, John W. Linville <[email protected]> wrote:
> On Fri, May 13, 2011 at 09:34:49PM +0100, James Hogan wrote:
>> Hi,
>>
>> On 2.6.39-rc7 I've seen a panic due to "BUG: scheduling while atomic"
>> with the backtrace below (not much detail as it was written in a text
>> message while it was displayed on the screen!). All worked fine in
>> 2.6.38.
>>
>> This was soon after resuming from suspend (enough time to unlock the
>> screen, but not much else). I think it was the same bug I saw in rc2 but
>> didn't have time to track down. I can probably get it to happen again if
>> more detail is needed. It doesn't happen every suspend (I think it had
>> survived a couple of suspend/resume cycles at this point).
>>
>> I could bisect if necessary, but hopefully the backtrace will be enough
>> to see what's going on?
>
> A bisect might be very helpful -- time is short for 2.6.39 already.

Hmm, it won't reproduce. I'll have to try and bisect this evening, as
it was in my home network that it hit the BUG before.

Cheers
James

>
> John
> --
> John W. Linville                Someday the world will need a hero, and you
> [email protected]                  might be all we have.  Be ready.
>

--
James Hogan

2011-06-01 18:02:03

by James Hogan

Subject: Re: BUG: scheduling while atomic 2.6.39-rc7 (iwl3945_irq_tasklet)

On 1 June 2011 15:43, Stanislaw Gruszka <[email protected]> wrote:
> On Tue, May 17, 2011 at 09:41:42AM +0100, James Hogan wrote:
>> On 16 May 2011 18:25, John W. Linville <[email protected]> wrote:
>> > On Fri, May 13, 2011 at 09:34:49PM +0100, James Hogan wrote:
>> >> On 2.6.39-rc7 I've seen a panic due to "BUG: scheduling while atomic"
>> >> with the backtrace below (not much detail as it was written in a text
>> >> message while it was displayed on the screen!). All worked fine in
>> >> 2.6.38.
>> >>
>> >> This was soon after resuming from suspend (enough time to unlock the
>> >> screen, but not much else). I think it was the same bug I saw in rc2 but
>> >> didn't have time to track down. I can probably get it to happen again if
>> >> more detail is needed. It doesn't happen every suspend (I think it had
>> >> survived a couple of suspend/resume cycles at this point).
>> >>
>> >> I could bisect if necessary, but hopefully the backtrace will be enough
>> >> to see what's going on?
>> >
>> > A bisect might be very helpful -- time is short for 2.6.39 already.
>>
>> Hmm, it won't reproduce. I'll have to try and bisect this evening, as
>> it was in my home network that it hit the BUG before.
>
> We use a mutex in atomic context when changing channel. I'm not sure if
> this is the particular problem you have. If you find a way to
> reproduce it, you may try this patch:

Thanks, I haven't managed to reproduce the problem at all
unfortunately so cannot bisect it (although I have had several failed
suspends, where it just boots up again from scratch instead of
resuming).

I'll try the patch when I get a chance.

Cheers
James

>
> diff --git a/drivers/net/wireless/iwlegacy/iwl-core.c b/drivers/net/wireless/iwlegacy/iwl-core.c
> index 42df832..01244b2 100644
> --- a/drivers/net/wireless/iwlegacy/iwl-core.c
> +++ b/drivers/net/wireless/iwlegacy/iwl-core.c
> @@ -861,9 +861,7 @@ void iwl_legacy_chswitch_done(struct iwl_priv *priv, bool is_success)
>
>  	if (priv->switch_rxon.switch_in_progress) {
>  		ieee80211_chswitch_done(ctx->vif, is_success);
> -		mutex_lock(&priv->mutex);
>  		priv->switch_rxon.switch_in_progress = false;
> -		mutex_unlock(&priv->mutex);
>  	}
>  }
>  EXPORT_SYMBOL(iwl_legacy_chswitch_done);
>



--
James Hogan

2011-06-01 14:44:06

by Stanislaw Gruszka

Subject: Re: BUG: scheduling while atomic 2.6.39-rc7 (iwl3945_irq_tasklet)

On Tue, May 17, 2011 at 09:41:42AM +0100, James Hogan wrote:
> On 16 May 2011 18:25, John W. Linville <[email protected]> wrote:
> > On Fri, May 13, 2011 at 09:34:49PM +0100, James Hogan wrote:
> >> On 2.6.39-rc7 I've seen a panic due to "BUG: scheduling while atomic"
> >> with the backtrace below (not much detail as it was written in a text
> >> message while it was displayed on the screen!). All worked fine in
> >> 2.6.38.
> >>
> >> This was soon after resuming from suspend (enough time to unlock the
> >> screen, but not much else). I think it was the same bug I saw in rc2 but
> >> didn't have time to track down. I can probably get it to happen again if
> >> more detail is needed. It doesn't happen every suspend (I think it had
> >> survived a couple of suspend/resume cycles at this point).
> >>
> >> I could bisect if necessary, but hopefully the backtrace will be enough
> >> to see what's going on?
> >
> > A bisect might be very helpful -- time is short for 2.6.39 already.
>
> Hmm, it won't reproduce. I'll have to try and bisect this evening, as
> it was in my home network that it hit the BUG before.

We use a mutex in atomic context when changing channel. I'm not sure if
this is the particular problem you have. If you find a way to
reproduce it, you may try this patch:

diff --git a/drivers/net/wireless/iwlegacy/iwl-core.c b/drivers/net/wireless/iwlegacy/iwl-core.c
index 42df832..01244b2 100644
--- a/drivers/net/wireless/iwlegacy/iwl-core.c
+++ b/drivers/net/wireless/iwlegacy/iwl-core.c
@@ -861,9 +861,7 @@ void iwl_legacy_chswitch_done(struct iwl_priv *priv, bool is_success)

 	if (priv->switch_rxon.switch_in_progress) {
 		ieee80211_chswitch_done(ctx->vif, is_success);
-		mutex_lock(&priv->mutex);
 		priv->switch_rxon.switch_in_progress = false;
-		mutex_unlock(&priv->mutex);
 	}
 }
 EXPORT_SYMBOL(iwl_legacy_chswitch_done);