2007-11-20 00:19:19

by Harald Dunkel

[permalink] [raw]
Subject: 2.6.23.8, ondemand scaling governor: "BUG: soft lockup detected on CPU#0!"

Hi folks,

using the ondemand scaling governour I see some error messages
in kern.log, e.g.:

Nov 20 01:00:46 bugs kernel: BUG: soft lockup detected on CPU#0!
Nov 20 01:00:46 bugs kernel: [<c013cf8d>] softlockup_tick+0x91/0xa6
Nov 20 01:00:46 bugs kernel: [<c012269c>] update_process_times+0x3a/0x5d
Nov 20 01:00:46 bugs kernel: [<c0131219>] tick_sched_timer+0x115/0x164
Nov 20 01:00:46 bugs kernel: [<c012d311>] hrtimer_interrupt+0x102/0x191
Nov 20 01:00:46 bugs kernel: [<c0106cd6>] timer_interrupt+0x2e/0x34
Nov 20 01:00:46 bugs kernel: [<c013d1f6>] handle_IRQ_event+0x1a/0x3f
Nov 20 01:00:46 bugs kernel: [<c013e4e1>] handle_level_irq+0xa8/0xb7
Nov 20 01:00:46 bugs kernel: [<c0106367>] do_IRQ+0x53/0x6c
Nov 20 01:00:46 bugs kernel: [<c0104853>] common_interrupt+0x23/0x28
Nov 20 01:00:46 bugs kernel: [<c011007b>] smp_apic_timer_interrupt+0x1a/0x70
Nov 20 01:00:46 bugs kernel: [<c0102a36>] default_idle+0x27/0x39
Nov 20 01:00:46 bugs kernel: [<c010234c>] cpu_idle+0x46/0x68
Nov 20 01:00:46 bugs kernel: [<c032e9e8>] start_kernel+0x24d/0x252
Nov 20 01:00:46 bugs kernel: [<c032e317>] unknown_bootoption+0x0/0x196
Nov 20 01:00:46 bugs kernel: =======================

This seems to happen when the load drops below the threshold and
the ondemand governor changes the CPU from 2GHz to 400MHz. If I use
the "performance" governor instead, then there is no such message.
If I set it back to "ondemand", then the message is printed
immediately.

# uname -a
Linux bugs 2.6.23.8 #1 PREEMPT Sun Nov 18 09:14:13 CET 2007 i686 GNU/Linux
# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : Intel(R) Pentium(R) M processor 2.00GHz
stepping : 8
cpu MHz : 400.000
cache size : 2048 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx bts est tm2
bogomips : 798.34
clflush size : 64


Please mail if I can help to track this down.


Regards

Harri


2007-11-20 01:01:17

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: RE: 2.6.23.8, ondemand scaling governor: "BUG: soft lockup detected on CPU#0!"



>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of Harald Dunkel
>Sent: Monday, November 19, 2007 4:19 PM
>To: Linux Kernel list
>Subject: 2.6.23.8, ondemand scaling governor: "BUG: soft
>lockup detected on CPU#0!"
>
>Hi folks,
>
>using the ondemand scaling governour I see some error messages
>in kern.log, e.g.:
>
>Nov 20 01:00:46 bugs kernel: BUG: soft lockup detected on CPU#0!
>Nov 20 01:00:46 bugs kernel: [<c013cf8d>] softlockup_tick+0x91/0xa6
>Nov 20 01:00:46 bugs kernel: [<c012269c>]
>update_process_times+0x3a/0x5d
>Nov 20 01:00:46 bugs kernel: [<c0131219>] tick_sched_timer+0x115/0x164
>Nov 20 01:00:46 bugs kernel: [<c012d311>]
>hrtimer_interrupt+0x102/0x191
>Nov 20 01:00:46 bugs kernel: [<c0106cd6>] timer_interrupt+0x2e/0x34
>Nov 20 01:00:46 bugs kernel: [<c013d1f6>] handle_IRQ_event+0x1a/0x3f
>Nov 20 01:00:46 bugs kernel: [<c013e4e1>] handle_level_irq+0xa8/0xb7
>Nov 20 01:00:46 bugs kernel: [<c0106367>] do_IRQ+0x53/0x6c
>Nov 20 01:00:46 bugs kernel: [<c0104853>] common_interrupt+0x23/0x28
>Nov 20 01:00:46 bugs kernel: [<c011007b>]
>smp_apic_timer_interrupt+0x1a/0x70
>Nov 20 01:00:46 bugs kernel: [<c0102a36>] default_idle+0x27/0x39
>Nov 20 01:00:46 bugs kernel: [<c010234c>] cpu_idle+0x46/0x68
>Nov 20 01:00:46 bugs kernel: [<c032e9e8>] start_kernel+0x24d/0x252
>Nov 20 01:00:46 bugs kernel: [<c032e317>] unknown_bootoption+0x0/0x196
>Nov 20 01:00:46 bugs kernel: =======================
>
>This seems to happen when the load drops below the threshold and
>the ondemand governor changes the CPU from 2GHz to 400MHz. If I use
>the "performance" governor instead, then there is no such message.
>If I set it back to "ondemand", then the message is printed
>immediately.
>
># uname -a
>Linux bugs 2.6.23.8 #1 PREEMPT Sun Nov 18 09:14:13 CET 2007
>i686 GNU/Linux
># cat /proc/cpuinfo
>processor : 0
>vendor_id : GenuineIntel
>cpu family : 6
>model : 13
>model name : Intel(R) Pentium(R) M processor 2.00GHz
>stepping : 8
>cpu MHz : 400.000
>cache size : 2048 KB
>fdiv_bug : no
>hlt_bug : no
>f00f_bug : no
>coma_bug : no
>fpu : yes
>fpu_exception : yes
>cpuid level : 2
>wp : yes
>flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr
>pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe
>nx bts est tm2
>bogomips : 798.34
>clflush size : 64
>
>
>Please mail if I can help to track this down.
>
>

Can you try switching to powersave governor (which should always run CPU
at 400MHz) and see whether you see similar error?

Thanks,
Venki

2007-11-20 08:17:39

by Harald Dunkel

[permalink] [raw]
Subject: Re: 2.6.23.8, ondemand scaling governor: "BUG: soft lockup detected on CPU#0!"

Pallipadi, Venkatesh wrote:
>
> Can you try switching to powersave governor (which should always run CPU
> at 400MHz) and see whether you see similar error?
>

Yes, if I move from performance to powersave, then I see a similar
error:

Nov 20 09:06:48 bugs kernel: BUG: soft lockup detected on CPU#0!
Nov 20 09:06:48 bugs kernel: [<c013cf8d>] softlockup_tick+0x91/0xa6
Nov 20 09:06:48 bugs kernel: [<c012269c>] update_process_times+0x3a/0x5d
Nov 20 09:06:48 bugs kernel: [<c0131219>] tick_sched_timer+0x115/0x164
Nov 20 09:06:48 bugs kernel: [<c012d311>] hrtimer_interrupt+0x102/0x191
Nov 20 09:06:48 bugs kernel: [<c0106cd6>] timer_interrupt+0x2e/0x34
Nov 20 09:06:48 bugs kernel: [<c013d1f6>] handle_IRQ_event+0x1a/0x3f
Nov 20 09:06:48 bugs kernel: [<c013e4e1>] handle_level_irq+0xa8/0xb7
Nov 20 09:06:48 bugs kernel: [<c0106367>] do_IRQ+0x53/0x6c
Nov 20 09:06:48 bugs kernel: [<c0104853>] common_interrupt+0x23/0x28
Nov 20 09:06:48 bugs kernel: =======================


Please mail. Regards

Harri

2007-11-20 14:29:28

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: RE: 2.6.23.8, ondemand scaling governor: "BUG: soft lockup detected on CPU#0!"



>-----Original Message-----
>From: Harald Dunkel [mailto:[email protected]]
>Sent: Tuesday, November 20, 2007 12:17 AM
>To: Pallipadi, Venkatesh
>Cc: Linux Kernel list
>Subject: Re: 2.6.23.8, ondemand scaling governor: "BUG: soft
>lockup detected on CPU#0!"
>
>Pallipadi, Venkatesh wrote:
>>
>> Can you try switching to powersave governor (which should
>always run CPU
>> at 400MHz) and see whether you see similar error?
>>
>
>Yes, if I move from performance to powersave, then I see a similar
>error:
>
>Nov 20 09:06:48 bugs kernel: BUG: soft lockup detected on CPU#0!
>Nov 20 09:06:48 bugs kernel: [<c013cf8d>] softlockup_tick+0x91/0xa6
>Nov 20 09:06:48 bugs kernel: [<c012269c>]
>update_process_times+0x3a/0x5d
>Nov 20 09:06:48 bugs kernel: [<c0131219>] tick_sched_timer+0x115/0x164
>Nov 20 09:06:48 bugs kernel: [<c012d311>]
>hrtimer_interrupt+0x102/0x191
>Nov 20 09:06:48 bugs kernel: [<c0106cd6>] timer_interrupt+0x2e/0x34
>Nov 20 09:06:48 bugs kernel: [<c013d1f6>] handle_IRQ_event+0x1a/0x3f
>Nov 20 09:06:48 bugs kernel: [<c013e4e1>] handle_level_irq+0xa8/0xb7
>Nov 20 09:06:48 bugs kernel: [<c0106367>] do_IRQ+0x53/0x6c
>Nov 20 09:06:48 bugs kernel: [<c0104853>] common_interrupt+0x23/0x28
>Nov 20 09:06:48 bugs kernel: =======================
>
>

This looks like TSC related issue. Ingo's patch commit id
a3b13c23f186ecb57204580cc1f2dbe9c284953a
http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.g
it;a=commit;h=a3b13c23f186ecb57204580cc1f2dbe9c284953a
should help.

Thanks,
Venki

2007-11-20 21:39:34

by Harald Dunkel

[permalink] [raw]
Subject: Re: 2.6.23.8, ondemand scaling governor: "BUG: soft lockup detected on CPU#0!"

Pallipadi, Venkatesh wrote:
>
> This looks like TSC related issue. Ingo's patch commit id
> a3b13c23f186ecb57204580cc1f2dbe9c284953a
> http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.g
> it;a=commit;h=a3b13c23f186ecb57204580cc1f2dbe9c284953a
> should help.
>

Yes, after applying this patch the problem seems to be gone.


Many thanx

Harri