2007-06-26 07:57:34

by gshan

Subject: bugs in __schedule()

Does anybody have suggestions on this crash?

[c02dc4dc] __schedule+0x654/0x788
[c02dc6f4] schedule+0x4c/0xe4
[c02dbe24] __compat_down+0xc8/0x12c
[c0226b60] mv_sw_read_reg+0x178/0x17c
[c02296fc] mvSwIntrTasklet+0x128/0x744
[c0020afc] tasklet_action+0x7c/0xec
[c00204f4] ___do_softirq+0x80/0x11c
[c00205cc] __do_softirq+0x3c/0x6c
[c00206a4] do_softirq+0x60/0x68
[c00207a8] irq_exit+0x6c/0x94
[c0005e1c] do_IRQ+0x88/0xa4
[c0004b70] ret_from_except+0x0/0x18
[c000680c] __delay+0xc/0x14
[c0226598] mv_sw_read_smi_reg+0x84/0x1ac
[c0226adc] mv_sw_read_reg+0xf4/0x17c

Thanks,
Gavin



2007-06-26 08:15:12

by Alexey Dobriyan

Subject: Re: bugs in __schedule()

On 6/26/07, gshan <[email protected]> wrote:
> Does anybody have suggestions on this crash?

talk to whoever supplied the following into your kernel:

> [c02dc4dc] __schedule+0x654/0x788
> [c02dc6f4] schedule+0x4c/0xe4
> [c02dbe24] __compat_down+0xc8/0x12c
> [c0226b60] mv_sw_read_reg+0x178/0x17c

this

> [c02296fc] mvSwIntrTasklet+0x128/0x744

and this.

> [c0020afc] tasklet_action+0x7c/0xec
> [c00204f4] ___do_softirq+0x80/0x11c
> [c00205cc] __do_softirq+0x3c/0x6c
> [c00206a4] do_softirq+0x60/0x68
> [c00207a8] irq_exit+0x6c/0x94
> [c0005e1c] do_IRQ+0x88/0xa4
> [c0004b70] ret_from_except+0x0/0x18
> [c000680c] __delay+0xc/0x14
> [c0226598] mv_sw_read_smi_reg+0x84/0x1ac

2007-06-26 08:18:16

by Satyam Sharma

Subject: Re: bugs in __schedule()

Hi Gavin,

On 6/26/07, gshan <[email protected]> wrote:
> Does anybody have suggestions on this crash?
>
> [c02dc4dc] __schedule+0x654/0x788
> [c02dc6f4] schedule+0x4c/0xe4
> [c02dbe24] __compat_down+0xc8/0x12c
> [c0226b60] mv_sw_read_reg+0x178/0x17c
> [c02296fc] mvSwIntrTasklet+0x128/0x744
> [c0020afc] tasklet_action+0x7c/0xec
> [c00204f4] ___do_softirq+0x80/0x11c
> [c00205cc] __do_softirq+0x3c/0x6c
> [c00206a4] do_softirq+0x60/0x68
> [c00207a8] irq_exit+0x6c/0x94
> [c0005e1c] do_IRQ+0x88/0xa4
> [c0004b70] ret_from_except+0x0/0x18
> [c000680c] __delay+0xc/0x14
> [c0226598] mv_sw_read_smi_reg+0x84/0x1ac

I bet you got more output than that, didn't you? :-)

Now I know this must be a "scheduling while interrupts disabled",
but why not post the whole error message?

[ Reminds me of the other thread where someone thought just mentioning
the EIP without any backtrace / other messages was enough to resolve a
boot-time panic. I agree they're all magicians on this list, but Gods they're
not ... ]

BTW I don't see "mvSwIntrTasklet" anywhere on -rc6. Is this a -mm kernel?

Cheers,
Satyam

2007-06-26 08:18:29

by gshan

Subject: Re: bugs in __schedule()

I wrote the code myself. Two questions:

1) Is it true that a tasklet can't sleep?
2) Is the crash caused by the tasklet taking a semaphore?

Thanks,
Gavin

Alexey Dobriyan wrote:
> On 6/26/07, gshan <[email protected]> wrote:
>> Does anybody have suggestions on this crash?
>
> talk to whoever supplied the following into your kernel:
>
>> [c02dc4dc] __schedule+0x654/0x788
>> [c02dc6f4] schedule+0x4c/0xe4
>> [c02dbe24] __compat_down+0xc8/0x12c
>> [c0226b60] mv_sw_read_reg+0x178/0x17c
>
> this
>
>> [c02296fc] mvSwIntrTasklet+0x128/0x744
>
> and this.
>
>> [c0020afc] tasklet_action+0x7c/0xec
>> [c00204f4] ___do_softirq+0x80/0x11c
>> [c00205cc] __do_softirq+0x3c/0x6c
>> [c00206a4] do_softirq+0x60/0x68
>> [c00207a8] irq_exit+0x6c/0x94
>> [c0005e1c] do_IRQ+0x88/0xa4
>> [c0004b70] ret_from_except+0x0/0x18
>> [c000680c] __delay+0xc/0x14
>> [c0226598] mv_sw_read_smi_reg+0x84/0x1ac

2007-06-26 08:20:30

by gshan

Subject: Re: bugs in __schedule()

Here is all the output I have:

# ifconfig mgt0 10.0.51.27
BUG: scheduling while atomic: exe/0x00000101/752
caller is schedule+0x4c/0xe4
Call trace:
[c02dc4dc] __schedule+0x654/0x788
[c02dc6f4] schedule+0x4c/0xe4
[c02dbe24] __compat_down+0xc8/0x12c
[c0226b60] mv_sw_read_reg+0x178/0x17c
[c02296fc] mvSwIntrTasklet+0x128/0x744
[c0020afc] tasklet_action+0x7c/0xec
[c00204f4] ___do_softirq+0x80/0x11c
[c00205cc] __do_softirq+0x3c/0x6c
[c00206a4] do_softirq+0x60/0x68
[c00207a8] irq_exit+0x6c/0x94
[c0005e1c] do_IRQ+0x88/0xa4
[c0004b70] ret_from_except+0x0/0x18
[c000680c] __delay+0xc/0x14
[c0226598] mv_sw_read_smi_reg+0x84/0x1ac
[c0226adc] mv_sw_read_reg+0xf4/0x17c


Satyam Sharma wrote:
> Hi Gavin,
>
> On 6/26/07, gshan <[email protected]> wrote:
>> Does anybody have suggestions on this crash?
>>
>> [c02dc4dc] __schedule+0x654/0x788
>> [c02dc6f4] schedule+0x4c/0xe4
>> [c02dbe24] __compat_down+0xc8/0x12c
>> [c0226b60] mv_sw_read_reg+0x178/0x17c
>> [c02296fc] mvSwIntrTasklet+0x128/0x744
>> [c0020afc] tasklet_action+0x7c/0xec
>> [c00204f4] ___do_softirq+0x80/0x11c
>> [c00205cc] __do_softirq+0x3c/0x6c
>> [c00206a4] do_softirq+0x60/0x68
>> [c00207a8] irq_exit+0x6c/0x94
>> [c0005e1c] do_IRQ+0x88/0xa4
>> [c0004b70] ret_from_except+0x0/0x18
>> [c000680c] __delay+0xc/0x14
>> [c0226598] mv_sw_read_smi_reg+0x84/0x1ac
>
> I bet you got more output than that, didn't you? :-)
>
> Now I know this must be a "scheduling while interrupts disabled",
> but why not post the whole error message?
>
> [ Reminds me of the other thread where someone thought just mentioning
> the EIP without any backtrace / other messages was enough to resolve a
> boot-time panic. I agree they're all magicians on this list, but Gods they're
> not ... ]
>
> BTW I don't see "mvSwIntrTasklet" anywhere on -rc6. Is this a -mm kernel?
>
> Cheers,
> Satyam

2007-06-26 08:25:18

by Satyam Sharma

Subject: Re: bugs in __schedule()

Hi Gavin,

On 6/26/07, gshan <[email protected]> wrote:
> Here is all the output I have:
>
> # ifconfig mgt0 10.0.51.27
> BUG: scheduling while atomic: exe/0x00000101/752

Yup, you can't sleep in tasklets, they're atomic.

> caller is schedule+0x4c/0xe4
> Call trace:
> [c02dc4dc] __schedule+0x654/0x788
> [c02dc6f4] schedule+0x4c/0xe4
> [c02dbe24] __compat_down+0xc8/0x12c

So, you can't use / acquire semaphores in them. Use spinlocks.
If the shared data is also accessed from process context, use
spin_lock_bh() from the process context code.

Satyam
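
A minimal sketch of the locking pattern described above, assuming the shared
state is touched only from process context and from a single tasklet (not from
a hard-IRQ handler). The names sw_reg_lock, sw_access_from_process and
sw_intr_tasklet are hypothetical and are not taken from Gavin's driver. This is
a kernel fragment: it only builds inside a 2.6-era kernel tree, not as a
standalone program.

```c
#include <linux/spinlock.h>
#include <linux/interrupt.h>

/* Hypothetical lock protecting the shared switch registers. */
static DEFINE_SPINLOCK(sw_reg_lock);

/* Process context (for example an ioctl path): take the lock with
 * softirqs disabled, so the tasklet cannot run on this CPU while
 * the lock is held. */
static void sw_access_from_process(void)
{
	spin_lock_bh(&sw_reg_lock);
	/* ... access the shared registers; no sleeping calls here ... */
	spin_unlock_bh(&sw_reg_lock);
}

/* Tasklet (softirq context): a plain spin_lock() is sufficient,
 * because the only other user is process context, which already
 * blocks softirqs while it holds the lock. */
static void sw_intr_tasklet(unsigned long data)
{
	spin_lock(&sw_reg_lock);
	/* ... access the shared registers; no sleeping calls here ... */
	spin_unlock(&sw_reg_lock);
}
```

The backtrace suggests why the _bh variant matters here: the tasklet ran on
top of mv_sw_read_reg() in process context and then tried to take the same
lock. With spin_lock_bh() in the process-context path, softirqs stay off on
that CPU until the unlock, so the tasklet cannot run in the middle of the
critical section.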

2007-06-26 08:36:40

by gshan

Subject: Re: bugs in __schedule()

Satyam Sharma wrote:
> Hi Gavin,
>
> On 6/26/07, gshan <[email protected]> wrote:
>> Here is all the output I have:
>>
>> # ifconfig mgt0 10.0.51.27
>> BUG: scheduling while atomic: exe/0x00000101/752
>
> Yup, you can't sleep in tasklets, they're atomic.
>
>> caller is schedule+0x4c/0xe4
>> Call trace:
>> [c02dc4dc] __schedule+0x654/0x788
>> [c02dc6f4] schedule+0x4c/0xe4
>> [c02dbe24] __compat_down+0xc8/0x12c
>
> So, you can't use / acquire semaphores in them. Use spinlocks.
> If the shared data is also accessed from process context, use
> spin_lock_bh() from the process context code.
>
> Satyam

Thanks, Satyam. So can I replace the tasklet with a kernel thread?

2007-06-26 08:50:46

by Satyam Sharma

Subject: Re: bugs in __schedule()

On 6/26/07, gshan <[email protected]> wrote:
> Thanks, Satyam. So can I replace the tasklet with a kernel thread?

That depends on what you want to do ...

[ BTW: I should correct a typo in my original reply -- I should've said
"scheduling while atomic", and not "scheduling while interrupts disabled"
because interrupts are _not_ disabled in bottom halves, and from your
stack backtrace it was evident that a tasklet was executing. :-) ]
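
The same distinction can be read off the BUG line itself: in 2.6-era kernels
the hex number in "scheduling while atomic: exe/0x00000101/752" is the value of
preempt_count(). Below is a small userspace sketch that splits such a value
into its fields. It assumes the mainline field widths of that period (8
preemption bits, 8 softirq bits, 12 hardirq bits); a patched kernel such as the
one in this thread may lay the counter out differently. The helper
decode_preempt_count and struct preempt_fields are made up for illustration.

```c
/* Assumed field widths, following the mainline 2.6-era layout. */
#define PREEMPT_BITS   8
#define SOFTIRQ_BITS   8
#define HARDIRQ_BITS  12

#define PREEMPT_SHIFT  0
#define SOFTIRQ_SHIFT  (PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT  (SOFTIRQ_SHIFT + SOFTIRQ_BITS)

/* Extract a bit field of width `bits` starting at bit `shift`. */
#define FIELD(v, shift, bits) (((v) >> (shift)) & ((1u << (bits)) - 1u))

/* Hypothetical container for the decoded counters. */
struct preempt_fields {
	unsigned int preempt;  /* preempt_disable() nesting depth */
	unsigned int softirq;  /* softirq / bottom-half nesting   */
	unsigned int hardirq;  /* hard interrupt nesting          */
};

/* Split a raw preempt_count() value into its three counters. */
static struct preempt_fields decode_preempt_count(unsigned int count)
{
	struct preempt_fields f;

	f.preempt = FIELD(count, PREEMPT_SHIFT, PREEMPT_BITS);
	f.softirq = FIELD(count, SOFTIRQ_SHIFT, SOFTIRQ_BITS);
	f.hardirq = FIELD(count, HARDIRQ_SHIFT, HARDIRQ_BITS);
	return f;
}
```

Under that assumed layout, decode_preempt_count(0x00000101) gives preempt 1,
softirq 1 and hardirq 0: the softirq field is set and the hardirq field is
not, which matches a tasklet running rather than a hard interrupt handler.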

2007-06-26 08:55:48

by gshan

Subject: Re: bugs in __schedule()

Satyam Sharma wrote:
> On 6/26/07, gshan <[email protected]> wrote:
>> Thanks, Satyam. So can I replace the tasklet with a kernel thread?
>
> That depends on what you want to do ...
>
> [ BTW: I should correct a typo in my original reply -- I should've said
> "scheduling while atomic", and not "scheduling while interrupts disabled"
> because interrupts are _not_ disabled in bottom halves, and from your
> stack backtrace it was evident that a tasklet was executing. :-) ]

I mean a kernel thread can sleep, but a tasklet can't. If so, a kernel thread
meets my requirements.

2007-06-26 09:08:45

by Satyam Sharma

Subject: Re: bugs in __schedule()

On 6/26/07, gshan <[email protected]> wrote:
> I mean a kernel thread can sleep, but a tasklet can't. If so, a kernel thread
> meets my requirements.

That's not really a justification to convert a tasklet to a kernel thread.
I suspect the simplest solution to your problem would be to simply
replace that semaphore with a spinlock. Other than that, we can't really
help without looking at code.

2007-06-26 09:13:23

by gshan

Subject: Re: bugs in __schedule()

Satyam Sharma wrote:
> On 6/26/07, gshan <[email protected]> wrote:
>> I mean a kernel thread can sleep, but a tasklet can't. If so, a kernel thread
>> meets my requirements.
>
> That's not really a justification to convert a tasklet to a kernel thread.
> I suspect the simplest solution to your problem would be to simply
> replace that semaphore with a spinlock. Other than that, we can't really
> help without looking at code.
Thank you very much, Satyam. I will run some experiments anyway. Thank you
again :-)