2008-08-30 13:07:19

by Thomas Backlund

[permalink] [raw]
Subject: BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!

Hi, (please cc me as I'm not subscribed)

thought I would post this right now, I'll try to reproduce it with
vanilla 2.6.27-rc5 as soon as the buildhost is back up...

(vanilla 2.6.27-rc5 x86_64 also locked up my laptop wich is a
Core2 Duo T8300 during kernel build with make -j3, but
I dont have any logs of that yet)

Kernel: 2.6.27-rc5 + Mandriva patches
Config:
http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/branches/rebase-to-2.6.27/PATCHES/configs/x86_64.config?revision=276760&view=markup
Arch: x86_64

System: Intel Quad Core Q9300

getting this with netconsole while building kernel rpms with make -j5:


BUG: soft lockup - CPU#0 stuck for 61s! [sh:24555]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 0:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24555, comm: sh Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f4>] [<ffffffff802246f4>]
__ticket_spin_lock+0x14/0x20
RSP: 0018:ffff8801ee5cbb68 EFLAGS: 00000287
RAX: 000000000000f0ef RBX: ffff8801ee5cbb68 RCX: 0000000000000f9f
RDX: ffff88022a538000 RSI: ffff88022d9e0600 RDI: ffffffff8068da00
RBP: ffff88022a05bd48 R08: 0000000000000000 R09: ffff88022a538000
R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8034bfcc
R13: ffff8801ee5cbad8 R14: ffff88022d9e0600 R15: ffff88022a0dc0c8
FS: 00002ba3cce3cf40(0000) GS:ffffffff8068dd80(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006b7550 CR3: 00000001cc5d4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff803319c0>] ? ccs_realpath_from_dentry2+0x140/0x430
[<ffffffff802b7bc6>] ? ksize+0x16/0xc0
[<ffffffff80331d8f>] ? ccs_realpath_from_dentry+0x3f/0x60
[<ffffffff80331dff>] ? ccs_realpath_nofollow+0x4f/0x60
[<ffffffff803306de>] ? ccs_check_flags+0x1e/0x30
[<ffffffff8032c888>] ? find_next_domain+0x98/0x970
[<ffffffff8029c6b2>] ? handle_mm_fault+0x322/0x960
[<ffffffff8029adb7>] ? follow_page+0x2f7/0x350
[<ffffffff8029cdf8>] ? get_user_pages+0x108/0x450
[<ffffffff802b7bc6>] ? ksize+0x16/0xc0
[<ffffffff8032da76>] ? search_binary_handler_with_transition+0x406/0x4d0
[<ffffffff802c5c0d>] ? do_execve+0x1fd/0x240
[<ffffffff8020a2d9>] ? sys_execve+0x49/0x70
[<ffffffff8020ca4a>] ? stub_execve+0x6a/0xc0

BUG: soft lockup - CPU#1 stuck for 61s! [cc1:24471]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 1:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24471, comm: cc1 Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f4>] [<ffffffff802246f4>]
__ticket_spin_lock+0x14/0x20
RSP: 0018:ffff8801f691dd48 EFLAGS: 00000293
RAX: 0000000000007774 RBX: ffff8801f691dd48 RCX: 0000000000000000
RDX: 0000000000000005 RSI: ffff8801bf4a9019 RDI: ffffffff8068d900
RBP: ffff88022e8ba7c8 R08: 0000000000000000 R09: ffff8801f695dd80
R10: 0000000000000000 R11: 000000000000002b R12: ffffffff802c8beb
R13: ffff8801f691dd28 R14: ffffffff802c7881 R15: ffff8801f691dca8
FS: 00002ba4b8f766f0(0000) GS:ffff88022fc02980(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004f5b90 CR3: 00000001e28c5000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff802d34a3>] ? d_alloc+0x133/0x1d0
[<ffffffff802c7b50>] ? __lookup_hash+0xe0/0x140
[<ffffffff802c7bf0>] ? lookup_hash+0x40/0x50
[<ffffffff802cc199>] ? do_filp_open+0x239/0x8c0
[<ffffffff802d7d26>] ? alloc_fd+0x116/0x140
[<ffffffff802bd50e>] ? do_sys_open+0x5e/0xf0
[<ffffffff802bd5cb>] ? sys_open+0x1b/0x20
[<ffffffff8020c50a>] ? system_call_fastpath+0x16/0x1b

BUG: soft lockup - CPU#2 stuck for 61s! [cc1:24472]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 2:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24472, comm: cc1 Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f4>] [<ffffffff802246f4>]
__ticket_spin_lock+0x14/0x20
RSP: 0018:ffff8801e289bbe8 EFLAGS: 00000293
RAX: 0000000000007674 RBX: ffff8801e289bbe8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88022ac2b019 RDI: ffffffff8068d900
RBP: 0000000000000020 R08: ffff8800280549c0 R09: ffff8801c886b240
R10: 0000000000000000 R11: 000000000000001f R12: ffffffff8028c451
R13: ffff8801e289bb88 R14: 00000000015ec06f R15: 0000000000000002
FS: 00002b5291ef56f0(0000) GS:ffff88022fc02d00(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002b529385f018 CR3: 00000001ee580000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff802d34a3>] ? d_alloc+0x133/0x1d0
[<ffffffff802c8a9c>] ? do_lookup+0x17c/0x240
[<ffffffff802c90bb>] ? __link_path_walk+0x55b/0x830
[<ffffffff802c93e7>] ? path_walk+0x57/0xb0
[<ffffffff802c9583>] ? do_path_lookup+0x83/0x1c0
[<ffffffff8034b001>] ? security_file_alloc+0x11/0x20
[<ffffffff802c9837>] ? __path_lookup_intent_open+0x67/0xc0
[<ffffffff802c989c>] ? path_lookup_open+0xc/0x10
[<ffffffff802cc01e>] ? do_filp_open+0xbe/0x8c0
[<ffffffff8023bc33>] ? hrtick_start_fair+0x173/0x190
[<ffffffff80373cc6>] ? __up_read+0x46/0xb0
[<ffffffff8025c229>] ? up_read+0x9/0x10
[<ffffffff8022a89b>] ? do_page_fault+0x3eb/0xa80
[<ffffffff802d7d26>] ? alloc_fd+0x116/0x140
[<ffffffff802bd50e>] ? do_sys_open+0x5e/0xf0
[<ffffffff802bd5cb>] ? sys_open+0x1b/0x20
[<ffffffff8020c50a>] ? system_call_fastpath+0x16/0x1b

BUG: soft lockup - CPU#0 stuck for 61s! [sh:24555]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 0:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24555, comm: sh Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f4>] [<ffffffff802246f4>]
__ticket_spin_lock+0x14/0x20
RSP: 0018:ffff8801ee5cbb68 EFLAGS: 00000287
RAX: 000000000000f0ef RBX: ffff8801ee5cbb68 RCX: 0000000000000f9f
RDX: ffff88022a538000 RSI: ffff88022d9e0600 RDI: ffffffff8068da00
RBP: ffff88022a05bd48 R08: 0000000000000000 R09: ffff88022a538000
R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8034bfcc
R13: ffff8801ee5cbad8 R14: ffff88022d9e0600 R15: ffff88022a0dc0c8
FS: 00002ba3cce3cf40(0000) GS:ffffffff8068dd80(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006b7550 CR3: 00000001cc5d4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff803319c0>] ? ccs_realpath_from_dentry2+0x140/0x430
[<ffffffff802b7bc6>] ? ksize+0x16/0xc0
[<ffffffff80331d8f>] ? ccs_realpath_from_dentry+0x3f/0x60
[<ffffffff80331dff>] ? ccs_realpath_nofollow+0x4f/0x60
[<ffffffff803306de>] ? ccs_check_flags+0x1e/0x30
[<ffffffff8032c888>] ? find_next_domain+0x98/0x970
[<ffffffff8029c6b2>] ? handle_mm_fault+0x322/0x960
[<ffffffff8029adb7>] ? follow_page+0x2f7/0x350
[<ffffffff8029cdf8>] ? get_user_pages+0x108/0x450
[<ffffffff802b7bc6>] ? ksize+0x16/0xc0
[<ffffffff8032da76>] ? search_binary_handler_with_transition+0x406/0x4d0
[<ffffffff802c5c0d>] ? do_execve+0x1fd/0x240
[<ffffffff8020a2d9>] ? sys_execve+0x49/0x70
[<ffffffff8020ca4a>] ? stub_execve+0x6a/0xc0

BUG: soft lockup - CPU#1 stuck for 61s! [cc1:24471]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 1:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24471, comm: cc1 Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f4>] [<ffffffff802246f4>]
__ticket_spin_lock+0x14/0x20
RSP: 0018:ffff8801f691dd48 EFLAGS: 00000293
RAX: 0000000000007774 RBX: ffff8801f691dd48 RCX: 0000000000000000
RDX: 0000000000000005 RSI: ffff8801bf4a9019 RDI: ffffffff8068d900
RBP: ffff88022e8ba7c8 R08: 0000000000000000 R09: ffff8801f695dd80
R10: 0000000000000000 R11: 000000000000002b R12: ffffffff802c8beb
R13: ffff8801f691dd28 R14: ffffffff802c7881 R15: ffff8801f691dca8
FS: 00002ba4b8f766f0(0000) GS:ffff88022fc02980(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004f5b90 CR3: 00000001e28c5000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff802d34a3>] ? d_alloc+0x133/0x1d0
[<ffffffff802c7b50>] ? __lookup_hash+0xe0/0x140
[<ffffffff802c7bf0>] ? lookup_hash+0x40/0x50
[<ffffffff802cc199>] ? do_filp_open+0x239/0x8c0
[<ffffffff802d7d26>] ? alloc_fd+0x116/0x140
[<ffffffff802bd50e>] ? do_sys_open+0x5e/0xf0
[<ffffffff802bd5cb>] ? sys_open+0x1b/0x20
[<ffffffff8020c50a>] ? system_call_fastpath+0x16/0x1b

BUG: soft lockup - CPU#3 stuck for 61s! [make:24549]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 3:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24549, comm: make Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f4>] [<ffffffff802246f4>]
__ticket_spin_lock+0x14/0x20
RSP: 0018:ffff8801cc42de98 EFLAGS: 00000297
RAX: 0000000000007574 RBX: ffff8801cc42de98 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff804f83c1 RDI: ffffffff8068d900
RBP: ffff8801cc42deb8 R08: ffff8801cc42dec8 R09: ffffffff804f83c0
R10: 0000000000000002 R11: 0000000000000202 R12: ffff880000002c00
R13: 00000000000200d0 R14: 0000000000000000 R15: ffff8801ee574290
FS: 00002b71862d66f0(0000) GS:ffff88022f0ce080(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000004217ba CR3: 00000001f6856000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff802d2ade>] ? __d_path+0x5e/0x1e0
[<ffffffff802d3e76>] ? sys_getcwd+0x106/0x170
[<ffffffff8020c50a>] ? system_call_fastpath+0x16/0x1b

BUG: soft lockup - CPU#2 stuck for 61s! [cc1:24472]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 2:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24472, comm: cc1 Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f0>] [<ffffffff802246f0>]
__ticket_spin_lock+0x10/0x20
RSP: 0018:ffff8801e289bbe8 EFLAGS: 00000293
RAX: 0000000000007674 RBX: ffff8801e289bbe8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88022ac2b019 RDI: ffffffff8068d900
RBP: 0000000000000020 R08: ffff8800280549c0 R09: ffff8801c886b240
R10: 0000000000000000 R11: 000000000000001f R12: ffffffff8028c451
R13: ffff8801e289bb88 R14: 00000000015ec06f R15: 0000000000000002
FS: 00002b5291ef56f0(0000) GS:ffff88022fc02d00(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002b529385f018 CR3: 00000001ee580000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff802d34a3>] ? d_alloc+0x133/0x1d0
[<ffffffff802c8a9c>] ? do_lookup+0x17c/0x240
[<ffffffff802c90bb>] ? __link_path_walk+0x55b/0x830
[<ffffffff802c93e7>] ? path_walk+0x57/0xb0
[<ffffffff802c9583>] ? do_path_lookup+0x83/0x1c0
[<ffffffff8034b001>] ? security_file_alloc+0x11/0x20
[<ffffffff802c9837>] ? __path_lookup_intent_open+0x67/0xc0
[<ffffffff802c989c>] ? path_lookup_open+0xc/0x10
[<ffffffff802cc01e>] ? do_filp_open+0xbe/0x8c0
[<ffffffff8023bc33>] ? hrtick_start_fair+0x173/0x190
[<ffffffff80373cc6>] ? __up_read+0x46/0xb0
[<ffffffff8025c229>] ? up_read+0x9/0x10
[<ffffffff8022a89b>] ? do_page_fault+0x3eb/0xa80
[<ffffffff802d7d26>] ? alloc_fd+0x116/0x140
[<ffffffff802bd50e>] ? do_sys_open+0x5e/0xf0
[<ffffffff802bd5cb>] ? sys_open+0x1b/0x20
[<ffffffff8020c50a>] ? system_call_fastpath+0x16/0x1b

BUG: soft lockup - CPU#0 stuck for 61s! [sh:24555]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 0:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24555, comm: sh Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f4>] [<ffffffff802246f4>]
__ticket_spin_lock+0x14/0x20
RSP: 0018:ffff8801ee5cbb68 EFLAGS: 00000287
RAX: 000000000000f0ef RBX: ffff8801ee5cbb68 RCX: 0000000000000f9f
RDX: ffff88022a538000 RSI: ffff88022d9e0600 RDI: ffffffff8068da00
RBP: ffff88022a05bd48 R08: 0000000000000000 R09: ffff88022a538000
R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8034bfcc
R13: ffff8801ee5cbad8 R14: ffff88022d9e0600 R15: ffff88022a0dc0c8
FS: 00002ba3cce3cf40(0000) GS:ffffffff8068dd80(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006b7550 CR3: 00000001cc5d4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff803319c0>] ? ccs_realpath_from_dentry2+0x140/0x430
[<ffffffff802b7bc6>] ? ksize+0x16/0xc0
[<ffffffff80331d8f>] ? ccs_realpath_from_dentry+0x3f/0x60
[<ffffffff80331dff>] ? ccs_realpath_nofollow+0x4f/0x60
[<ffffffff803306de>] ? ccs_check_flags+0x1e/0x30
[<ffffffff8032c888>] ? find_next_domain+0x98/0x970
[<ffffffff8029c6b2>] ? handle_mm_fault+0x322/0x960
[<ffffffff8029adb7>] ? follow_page+0x2f7/0x350
[<ffffffff8029cdf8>] ? get_user_pages+0x108/0x450
[<ffffffff802b7bc6>] ? ksize+0x16/0xc0
[<ffffffff8032da76>] ? search_binary_handler_with_transition+0x406/0x4d0
[<ffffffff802c5c0d>] ? do_execve+0x1fd/0x240
[<ffffffff8020a2d9>] ? sys_execve+0x49/0x70
[<ffffffff8020ca4a>] ? stub_execve+0x6a/0xc0

BUG: soft lockup - CPU#1 stuck for 61s! [cc1:24471]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 1:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
Pid: 24471, comm: cc1 Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f6>] [<ffffffff802246f6>]
__ticket_spin_lock+0x16/0x20
RSP: 0018:ffff8801f691dd48 EFLAGS: 00000293
RAX: 0000000000007774 RBX: ffff8801f691dd48 RCX: 0000000000000000
RDX: 0000000000000005 RSI: ffff8801bf4a9019 RDI: ffffffff8068d900
RBP: ffff88022e8ba7c8 R08: 0000000000000000 R09: ffff8801f695dd80
R10: 0000000000000000 R11: 000000000000002b R12: ffffffff802c8beb
R13: ffff8801f691dd28 R14: ffffffff802c7881 R15: ffff8801f691dca8
FS: 00002ba4b8f766f0(0000) GS:ffff88022fc02980(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004f5b90 CR3: 00000001e28c5000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff802d34a3>] ? d_alloc+0x133/0x1d0
[<ffffffff802c7b50>] ? __lookup_hash+0xe0/0x140
[<ffffffff802c7bf0>] ? lookup_hash+0x40/0x50
[<ffffffff802cc199>] ? do_filp_open+0x239/0x8c0
[<ffffffff802d7d26>] ? alloc_fd+0x116/0x140
[<ffffffff802bd50e>] ? do_sys_open+0x5e/0xf0
[<ffffffff802bd5cb>] ? sys_open+0x1b/0x20
[<ffffffff8020c50a>] ? system_call_fastpath+0x16/0x1b

BUG: soft lockup - CPU#3 stuck for 61s! [make:24549]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 3:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_core ata_piix ahci libata dock
sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy async_tx xor
raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last unloaded:
scsi_wait_scan]
Pid: 24549, comm: make Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f4>] [<ffffffff802246f4>]
__ticket_spin_lock+0x14/0x20
RSP: 0018:ffff8801cc42de98 EFLAGS: 00000297
RAX: 0000000000007574 RBX: ffff8801cc42de98 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff804f83c1 RDI: ffffffff8068d900
RBP: ffff8801cc42deb8 R08: ffff8801cc42dec8 R09: ffffffff804f83c0
R10: 0000000000000002 R11: 0000000000000202 R12: ffff880000002c00
R13: 00000000000200d0 R14: 0000000000000000 R15: ffff8801ee574290
FS: 00002b71862d66f0(0000) GS:ffff88022f0ce080(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000004217ba CR3: 00000001f6856000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff802d2ade>] ? __d_path+0x5e/0x1e0
[<ffffffff802d3e76>] ? sys_getcwd+0x106/0x170
[<ffffffff8020c50a>] ? system_call_fastpath+0x16/0x1b

BUG: soft lockup - CPU#2 stuck for 61s! [cc1:24472]
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_disk ide_core ata_piix ahci
libata dock sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy
async_tx xor raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last
unloaded: scsi_wait_scan]
CPU 2:
Modules linked in: af_packet ipv6 binfmt_misc loop jfs xfs dm_mod
pata_marvell floppy cpufreq_ondemand cpufreq_conservative
cpufreq_powersave acpi_cpufreq freq_table netconsole configfs raid0
snd_hda_intel snd_pcm snd_timer thermal snd_page_alloc usbhid snd_hwdep
ohci1394 sky2 skge button ff_memless snd processor ieee1394 rtc_cmos
evdev i2c_i801 i2c_core soundcore sg ide_core ata_piix ahci libata dock
sd_mod scsi_mod crc_t10dif raid456 async_xor async_memcpy async_tx xor
raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore [last unloaded:
scsi_wait_scan]
Pid: 24472, comm: cc1 Not tainted 2.6.27-desktop-0.rc5.1mnb #1
RIP: 0010:[<ffffffff802246f6>] [<ffffffff802246f6>]
__ticket_spin_lock+0x16/0x20
RSP: 0018:ffff8801e289bbe8 EFLAGS: 00000293
RAX: 0000000000007674 RBX: ffff8801e289bbe8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88022ac2b019 RDI: ffffffff8068d900
RBP: 0000000000000020 R08: ffff8800280549c0 R09: ffff8801c886b240
R10: 0000000000000000 R11: 000000000000001f R12: ffffffff8028c451
R13: ffff8801e289bb88 R14: 00000000015ec06f R15: 0000000000000002
FS: 00002b5291ef56f0(0000) GS:ffff88022fc02d00(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002b529385f018 CR3: 00000001ee580000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
[<ffffffff802d34a3>] ? d_alloc+0x133/0x1d0
[<ffffffff802c8a9c>] ? do_lookup+0x17c/0x240
[<ffffffff802c90bb>] ? __link_path_walk+0x55b/0x830
[<ffffffff802c93e7>] ? path_walk+0x57/0xb0
[<ffffffff802c9583>] ? do_path_lookup+0x83/0x1c0
[<ffffffff8034b001>] ? security_file_alloc+0x11/0x20
[<ffffffff802c9837>] ? __path_lookup_intent_open+0x67/0xc0
[<ffffffff802c989c>] ? path_lookup_open+0xc/0x10
[<ffffffff802cc01e>] ? do_filp_open+0xbe/0x8c0
[<ffffffff8023bc33>] ? hrtick_start_fair+0x173/0x190
[<ffffffff80373cc6>] ? __up_read+0x46/0xb0
[<ffffffff8025c229>] ? up_read+0x9/0x10
[<ffffffff8022a89b>] ? do_page_fault+0x3eb/0xa80
[<ffffffff802d7d26>] ? alloc_fd+0x116/0x140
[<ffffffff802bd50e>] ? do_sys_open+0x5e/0xf0
[<ffffffff802bd5cb>] ? sys_open+0x1b/0x20
[<ffffffff8020c50a>] ? system_call_fastpath+0x16/0x1b

--
Thomas


2008-08-30 15:30:19

by Vegard Nossum

[permalink] [raw]
Subject: Re: BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!

On Sat, Aug 30, 2008 at 2:46 PM, Thomas Backlund <[email protected]> wrote:
> Hi, (please cc me as I'm not subscribed)
>
> thought I would post this right now, I'll try to reproduce it with vanilla
> 2.6.27-rc5 as soon as the buildhost is back up...
>
> (vanilla 2.6.27-rc5 x86_64 also locked up my laptop wich is a
> Core2 Duo T8300 during kernel build with make -j3, but
> I dont have any logs of that yet)
>
> Kernel: 2.6.27-rc5 + Mandriva patches
> Config:
> http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/branches/rebase-to-2.6.27/PATCHES/configs/x86_64.config?revision=276760&view=markup
> Arch: x86_64
>
> System: Intel Quad Core Q9300
>
> getting this with netconsole while building kernel rpms with make -j5:

Hi,

I tried your recipe. Well, not exactly. But I also got this during a
kernel build. Notice that the kernel version is a clean v2.6.27-rc5
(i.e. latest Linus -git):

BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]
irq event stamp: 3585444
hardirqs last enabled at (3585443): [<c015927b>] trace_hardirqs_on+0xb/0x10
hardirqs last disabled at (3585444): [<c0290614>] trace_hardirqs_off_thunk+0xc/8
softirqs last enabled at (3585438): [<c0139fb1>] __do_softirq+0xe1/0x100
softirqs last disabled at (3585427): [<c013a075>] do_softirq+0xa5/0xb0

Pid: 0, comm: swapper Not tainted (2.6.27-rc5-00006-gbef69ea #3)
EIP: 0060:[<c011eb55>] EFLAGS: 00000202 CPU: 1
EIP is at native_safe_halt+0x5/0x10
EAX: 0036b5a3 EBX: c2161b80 ECX: 00000000 EDX: 00000000
ESI: 00000001 EDI: c090fb80 EBP: f7855f80 ESP: f7855f80
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 0807a0b0 CR3: 3695d000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
[<c010ad80>] default_idle+0x50/0x70
[<c01029c8>] cpu_idle+0x68/0x130
[<c059e120>] start_secondary+0x160/0x1c0
=======================

c011eb50 <native_safe_halt>:
c011eb50: 55 push %ebp
c011eb51: 89 e5 mov %esp,%ebp
c011eb53: fb sti
c011eb54: f4 hlt
c011eb55: 5d pop %ebp
c011eb56: c3 ret
c011eb57: 89 f6 mov %esi,%esi
c011eb59: 8d bc 27 00 00 00 00 lea 0x0(%edi),%edi

...there's not many clues here as to what went wrong. Curious.

(Yes, I played with CPU hotplug and hibernate as well.)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-08-30 15:54:53

by Vegard Nossum

[permalink] [raw]
Subject: Re: BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!

On Sat, Aug 30, 2008 at 5:30 PM, Vegard Nossum <[email protected]> wrote:
> On Sat, Aug 30, 2008 at 2:46 PM, Thomas Backlund <[email protected]> wrote:
>> Hi, (please cc me as I'm not subscribed)
>>
>> thought I would post this right now, I'll try to reproduce it with vanilla
>> 2.6.27-rc5 as soon as the buildhost is back up...
>>
>> (vanilla 2.6.27-rc5 x86_64 also locked up my laptop wich is a
>> Core2 Duo T8300 during kernel build with make -j3, but
>> I dont have any logs of that yet)
>>
>> Kernel: 2.6.27-rc5 + Mandriva patches
>> Config:
>> http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/branches/rebase-to-2.6.27/PATCHES/configs/x86_64.config?revision=276760&view=markup
>> Arch: x86_64
>>
>> System: Intel Quad Core Q9300
>>
>> getting this with netconsole while building kernel rpms with make -j5:
>
> Hi,
>
> I tried your recipe. Well, not exactly. But I also got this during a
> kernel build. Notice that the kernel version is a clean v2.6.27-rc5
> (i.e. latest Linus -git):
>
> BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]
> irq event stamp: 3585444
> hardirqs last enabled at (3585443): [<c015927b>] trace_hardirqs_on+0xb/0x10
> hardirqs last disabled at (3585444): [<c0290614>] trace_hardirqs_off_thunk+0xc/8
> softirqs last enabled at (3585438): [<c0139fb1>] __do_softirq+0xe1/0x100
> softirqs last disabled at (3585427): [<c013a075>] do_softirq+0xa5/0xb0
>
> Pid: 0, comm: swapper Not tainted (2.6.27-rc5-00006-gbef69ea #3)
> EIP: 0060:[<c011eb55>] EFLAGS: 00000202 CPU: 1
> EIP is at native_safe_halt+0x5/0x10
> EAX: 0036b5a3 EBX: c2161b80 ECX: 00000000 EDX: 00000000
> ESI: 00000001 EDI: c090fb80 EBP: f7855f80 ESP: f7855f80
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: 0807a0b0 CR3: 3695d000 CR4: 000006d0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400
> [<c010ad80>] default_idle+0x50/0x70
> [<c01029c8>] cpu_idle+0x68/0x130
> [<c059e120>] start_secondary+0x160/0x1c0
> =======================
>
> c011eb50 <native_safe_halt>:
> c011eb50: 55 push %ebp
> c011eb51: 89 e5 mov %esp,%ebp
> c011eb53: fb sti
> c011eb54: f4 hlt
> c011eb55: 5d pop %ebp
> c011eb56: c3 ret
> c011eb57: 89 f6 mov %esi,%esi
> c011eb59: 8d bc 27 00 00 00 00 lea 0x0(%edi),%edi
>
> ...there's not many clues here as to what went wrong. Curious.
>
> (Yes, I played with CPU hotplug and hibernate as well.)

Ok, mine is definitely because of hotplug. The issue you are seeing is
probably more important, though, sorry for hijacking your thread. You
should be able to do the SysRq thing as well, though. SysRq-w (dump
blocked tasks) and SysRq-d (held locks) here:

SysRq : Show Blocked State
task PC stack pid father
bash D d450a1ee 5912 3772 3720
f68e1df4 00200046 2fb00387 d450a1ee 2fb00387 00000004 f68e1df0 c015aa4a
c2036d80 f7830cc0 1e881e2a 000000bc f67d72c0 f67d7534 c2036d80 f68e0000
f86e8552 00000000 6db5f2ce 00000000 f86e8552 f67d781c f67d72c0 c0d1dcf4
Call Trace:
[<c015aa4a>] ? __lock_acquire+0x27a/0xa00
[<c05a1a38>] schedule_timeout+0x78/0xc0
[<c015904f>] ? mark_held_locks+0x6f/0x90
[<c015927b>] ? trace_hardirqs_on+0xb/0x10
[<c01591e4>] ? trace_hardirqs_on_caller+0xd4/0x160
[<c015927b>] ? trace_hardirqs_on+0xb/0x10
[<c05a0e33>] wait_for_common+0xb3/0x130
[<c012c860>] ? default_wake_function+0x0/0x10
[<c05a0f42>] wait_for_completion+0x12/0x20
[<c016c264>] __stop_machine+0x184/0x1f0
[<c016c310>] ? stop_cpu+0x0/0xa0
[<c016c0d0>] ? chill+0x0/0x10
[<c057e920>] ? take_cpu_down+0x0/0x30
[<c057ea77>] _cpu_down+0xd7/0x270
[<c0179d1d>] ? __synchronize_sched+0x2d/0x40
[<c0146ef0>] ? wakeme_after_rcu+0x0/0x10
[<c057ec5a>] cpu_down+0x4a/0x70
[<c057ff68>] store_online+0x38/0x80
[<c057ff30>] ? store_online+0x0/0x80
[<c0312f5c>] sysdev_store+0x2c/0x40
[<c01e67b2>] sysfs_write_file+0xa2/0x100
[<c01a8606>] vfs_write+0x96/0x130
[<c01e6710>] ? sysfs_write_file+0x0/0x100
[<c01a8b4d>] sys_write+0x3d/0x70
[<c01040db>] sysenter_do_call+0x12/0x3f
=======================


SysRq : Show Locks Held

Showing all locks held in the system:
1 lock held by mingetty/3520:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3521:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3522:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3523:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3524:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
4 locks held by bash/3772:
#0: (&buffer->mutex){--..}, at: [<c01e673b>] sysfs_write_file+0x2b/0x100
#1: (cpu_add_remove_lock){--..}, at: [<c01367af>] cpu_maps_update_begin+0xf/00
#2: (&cpu_hotplug.lock){--..}, at: [<c013680a>] cpu_hotplug_begin+0x1a/0x50
#3: (lock){--..}, at: [<c016c132>] __stop_machine+0x52/0x1f0
1 lock held by bash/5074:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
2 locks held by bash/6363:
#0: (sysrq_key_table_lock){++..}, at: [<c030094a>] __handle_sysrq+0x1a/0x120
#1: (tasklist_lock){..--}, at: [<c01574a1>] debug_show_all_locks+0x31/0x180

=============================================

So it seems (in my case) to be waiting for &finished to complete
(__stop_machine()):

/* This will release the thread on our CPU. */
put_cpu();
wait_for_completion(&finished);
mutex_unlock(&lock);


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-08-30 21:14:18

by Thomas Backlund

[permalink] [raw]
Subject: Re: BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!

Thomas Backlund skrev:
> Hi, (please cc me as I'm not subscribed)
>
> thought I would post this right now, I'll try to reproduce it with
> vanilla 2.6.27-rc5 as soon as the buildhost is back up...
>

Nevermind this "BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for
61s!." part ...

It was caused by a Tomoyo patch that was not correct...

I just assumed it was the same as the laptop locked up too...

> (vanilla 2.6.27-rc5 x86_64 also locked up my laptop wich is a
> Core2 Duo T8300 during kernel build with make -j3, but
> I dont have any logs of that yet)
>

This issue is still a real one, but I need to get some oops on it...

I'll report back when I know more

--
Thomas

2008-08-31 01:36:48

by Tetsuo Handa

[permalink] [raw]
Subject: Re: BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!

Hello.

> Kernel: 2.6.27-rc5 + Mandriva patches
Yes, it is caused by TOMOYO patches.

> Call Trace:
> [<ffffffff804d6b69>] ? _spin_lock+0x9/0x10
> [<ffffffff803319c0>] ? ccs_realpath_from_dentry2+0x140/0x430
> [<ffffffff802b7bc6>] ? ksize+0x16/0xc0
> [<ffffffff80331d8f>] ? ccs_realpath_from_dentry+0x3f/0x60
> [<ffffffff80331dff>] ? ccs_realpath_nofollow+0x4f/0x60
> [<ffffffff803306de>] ? ccs_check_flags+0x1e/0x30
> [<ffffffff8032c888>] ? find_next_domain+0x98/0x970
> [<ffffffff8029c6b2>] ? handle_mm_fault+0x322/0x960
> [<ffffffff8029adb7>] ? follow_page+0x2f7/0x350
> [<ffffffff8029cdf8>] ? get_user_pages+0x108/0x450
> [<ffffffff802b7bc6>] ? ksize+0x16/0xc0
> [<ffffffff8032da76>] ? search_binary_handler_with_transition+0x406/0x4d0
> [<ffffffff802c5c0d>] ? do_execve+0x1fd/0x240
> [<ffffffff8020a2d9>] ? sys_execve+0x49/0x70
> [<ffffffff8020ca4a>] ? stub_execve+0x6a/0xc0
Inside ccs_realpath_from_dentry2(), it will take dcache_lock and vfsmount_lock
lock. But, the order of calling spin_lock(&dcache_lock); and
spin_lock(&vfsmount_lock); was reversed by AppArmor's patches.
See __d_path() in fs/dcache.c in the patch
http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/branches/rebase-to-2.6.27/PATCHES/patches/security-apparmor-2.6.27.patch?revision=276651&view=markup

The solution is to reverse the order in TOMOYO's patches.
It is solved in TOMOYO 1.6.4 . You can download it from
http://svn.sourceforge.jp/cgi-bin/viewcvs.cgi/trunk/1.6.x/ccs-patch/?root=tomoyo&rev=1499

I responded to Mandriva's kernel maintainers, so it will be solved shortly.

Regards.