LinuxLists.cc - 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert

2011-04-20 12:58:31

Subject: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

Hi,
I am not able to boot with the current git tree
(2.6.39-rc4-00089-g2f666bc). 2.6.39-rc3 boots just fine.

The backtrace [1] looks really messy.

I have seen a similar backtrace at https://lkml.org/lkml/2011/3/25/169
and tried the patch from https://lkml.org/lkml/2011/3/25/173 but it
didn't help. The backtrace looks similar (much smaller) but the machine
ends up dead as well with unbalanced preempt counter[2].

The config is attached.

Is this a known problem?

---
* [1] backtrace with the current git tree (2f666bc)

[ 31.207283] BUG: unable to handle kernel NULL pointer dereference at 0000000c
NULL pointer dereference[ 31.207540] IP: at 0000000c
[ 31.207540] IP: [<c028dece>] cfq_insert_request+0x1d/0x3f5
[ 31.207720] *pde = 00000000 [<c028dece>] cfq_insert_request+0x1d/0x3f5
NULL pointer dereference[ 31.207720] *pde = 00000000
at 0000000c
[ 31.207888] Oops: 0000 [#1]
PREEMPT [ 31.207540] IP:[ 31.207888] Oops: 0000 [#1] SMP PREEMPT DEBUG_PAGEALLOC [<c028dece>] cfq_insert_request+0x1d/0x3f5
SMP DEBUG_PAGEALLOC[ 31.207720] *pde = 00000000
[ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
[ 31.208401] Modules linked in:[ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
SMP fuse[ 31.208401] Modules linked in: tun fuseDEBUG_PAGEALLOC coretemp tun hwmon coretemp
snd_hda_codec_realtek hwmon snd_hda_codec_realtek[ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
snd_hda_intel[ 31.208401] Modules linked in: snd_hda_codec snd_hda_intel fuse snd_pcm_oss snd_hda_codec snd_mixer_oss snd_pcm_oss tun snd_pcm snd_mixer_oss snd_seq_oss snd_pcm coretemp snd_seq_oss hwmon snd_seq_midi_event snd_seq snd_seq_midi_event snd_hda_codec_realtek arc4 snd_seq ecb arc4 snd_hda_intel iwl3945 ecb snd_timer iwl3945 snd_hda_codec snd_seq_device snd_timer snd snd_seq_device snd_pcm_oss iwl_legacy snd snd_mixer_oss iwl_legacy mac80211 snd_pcm cfg80211 mac80211 snd_page_alloc cfg80211 snd_seq_oss rtc_cmos snd_page_alloc rtc_core snd_seq_midi_event rtc_cmos fujitsu_laptop rtc_core snd_seq video fujitsu_laptop rtc_lib video arc4 rtc_lib ecb backlight [last unloaded: scsi_wait_scan] backlight iwl3945 [last unloaded: scsi_wait_scan] snd_timer

snd_seq_device[ 31.210204]
snd[ 31.210204]
[ 31.210326] Pid: 2780, comm: Xorg Not tainted 2.6.39-rc4-00089-g2f666bc #119[ 31.210326] Pid: 2780, comm: Xorg Not tainted 2.6.39-rc4-00089-g2f666bc #119 iwl_legacy FUJITSU SIEMENS LIFEBOOK S7110 mac80211/FJNB1B5 FUJITSU SIEMENS LIFEBOOK S7110/FJNB1B5 cfg80211
snd_page_alloc
[ 31.210664] EIP: 0060:[<c028dece>] EFLAGS: 00010086 CPU: 0
rtc_cmos[ 31.210664] EIP: 0060:[<c028dece>] EFLAGS: 00010086 CPU: 0
rtc_core[ 31.210794] EIP is at cfq_insert_request+0x1d/0x3f5
[ 31.210794] EIP is at cfq_insert_request+0x1d/0x3f5
fujitsu_laptop[ 31.210923] EAX: 00000000 EBX: f52c9680 ECX: c062c184 EDX: f52c9680
video[ 31.210923] EAX: 00000000 EBX: f52c9680 ECX: c062c184 EDX: f52c9680
rtc_lib[ 31.211052] ESI: 00000000 EDI: f54b9000 EBP: f5069da4 ESP: f5069d78
backlight[ 31.211052] ESI: 00000000 EDI: f54b9000 EBP: f5069da4 ESP: f5069d78
[ 31.211140] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[last unloaded: scsi_wait_scan][ 31.211140] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068

[ 31.210204]
[ 31.211140] Process Xorg (pid: 2780, ti=f5068000 task=f54c3200 task.ti=f5068000)
[ 31.210326] Pid: 2780, comm: Xorg Not tainted 2.6.39-rc4-00089-g2f666bc #119[ 31.211140] Process Xorg (pid: 2780, ti=f5068000 task=f54c3200 task.ti=f5068000)
[ 31.211140] Stack:
FUJITSU SIEMENS LIFEBOOK S7110[ 31.211140] 00000000[ 31.211140] Stack:
/FJNB1B5 f5069d7c[ 31.211140] 00000000 f5069d7c f5069d7c
f5069d84 f5069d7c f5069d84 f5069d84[ 31.210664] EIP: 0060:[<c028dece>] EFLAGS: 00010086 CPU: 0
f52c9680 f5069d84 c020a5b4 f52c9680[ 31.210794] EIP is at cfq_insert_request+0x1d/0x3f5
c020a5b4[ 31.210923] EAX: 00000000 EBX: f52c9680 ECX: c062c184 EDX: f52c9680
00000004 00000004[ 31.211052] ESI: 00000000 EDI: f54b9000 EBP: f5069da4 ESP: f5069d78

[ 31.211140] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 31.211140] f52c9680
f5551f40[ 31.211140] f52c9680[ 31.211140] Process Xorg (pid: 2780, ti=f5068000 task=f54c3200 task.ti=f5068000)
00000000 f5551f40 f5069dbc 00000000[ 31.211140] Stack:
c027f733 f5069dbc 00000006 c027f733[ 31.211140] 00000000 f52c9680 00000006 f5069d7c f52c9680 f5551f40 f5069d7c f5551f40 f5069d84
[ 31.211140] 00000000
f5069d84[ 31.211140] 00000000 f52c9680 c020a5b4 00000004
[ 31.211140] f52c9680 f5551f40 00000000[ 31.211140] [<c027f733>] __elv_add_request+0x18d/0x1bb
f5069dbc[ 31.211140] [<c027f733>] __elv_add_request+0x18d/0x1bb
c027f733[ 31.211140] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
00000006[ 31.211140] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.211140] [<c045f42a>] io_schedule+0x61/0xad
f52c9680[ 31.211140] [<c045f42a>] io_schedule+0x61/0xad
f5551f40[ 31.211140] [<c0192697>] sleep_on_page_killable+0xd/0x32

[ 31.211140] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.211140] 00000000[ 31.211140] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.211140] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.211140] [<c027f733>] __elv_add_request+0x18d/0x1bb
[ 31.211140] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.211140] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.211140] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.211140] [<c045f42a>] io_schedule+0x61/0xad
[ 31.211140] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.211140] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.211140] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.211140] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.211140] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.211140] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.211140] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.211140] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.211140] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.211140] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.211140] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.211140] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.211140] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.211140] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.211140] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.211140] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.211140] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.211140] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.211140] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.211140] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.211140] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.211140] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.211140] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.211140] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.211140] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.211140] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.211140] [<c01c3e21>] sys_read+0x40/0x65
[ 31.211140] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.211140] [<c01c3e21>] sys_read+0x40/0x65
[ 31.211140] [<c0461149>] syscall_call+0x7/0xb
[ 31.211140] [<c01c3e21>] sys_read+0x40/0x65
[ 31.211140] Code: [ 31.211140] [<c0461149>] syscall_call+0x7/0xb
[ 31.211140] Code: [ 31.211140] [<c0461149>] syscall_call+0x7/0xb
00 b8 [ 31.211140] Code: 00 01 b8 00 01 00 00 00 00 00 b8 8d 00 65 8d 01 f4 65 5b f4 00 5e 5b 5f 5e 00 5d 5f c3 5d 00 55 c3 89 8d 55 e5 89 57 65 e5 56 57 f4 56 53 5b 83 53 ec 83 5e 20 ec 3e 20 5f 8d 3e 74 8d 5d 26 74 00 c3 26 8b 00 40 55 8b 0c 40 8b 89 0c 72 e5 5c 89 55 8b 57 72 e8 5c 89 8b 56 55 78 04 53 8b e8 42 8b 83 58 78 04 <8b> 8b ec 42 50 58 20 <8b> 50 0c 3e 89 0c f0 89 8d 83 f0 c2 83 74 0c c2 e8 0c 26 42 e8 f9 00 42 ff f9 ff 8b 40 0c 8b 72 5c 89 55 e8 8b 78 04 8b 42 58 <8b> 50 0c 89 f0 83 c2 0c e8 42 f9 ff rtc_core fujitsu_laptop rtc_core video fujitsu_laptop rtc_lib video rtc_lib rtc_core backlight backlight [last unloaded: scsi_wait_scan] fujitsu_laptop [last unloaded: scsi_wait_scan] video
rtc_lib
backlight[ 31.222032] Pid: 2780, comm: Xorg Tainted: G D 2.6.39-rc4-00089-g2f666bc #119
[last unloaded: scsi_wait_scan][ 31.222032] Pid: 2780, comm: Xorg Tainted: G D 2.6.39-rc4-00089-g2f666bc #119

[ 31.222243] Call Trace:
[ 31.222032] Pid: 2780, comm: Xorg Tainted: G D 2.6.39-rc4-00089-g2f666bc #119
[ 31.222243] Call Trace:
[ 31.222243] Call Trace:
[ 31.222372] [<c0128494>] __schedule_bug+0x4d/0x52
[ 31.222504] [<c045e9db>] schedule+0x76/0xa64
[ 31.222372] [<c0128494>] __schedule_bug+0x4d/0x52
[ 31.222372] [<c0128494>] __schedule_bug+0x4d/0x52
[ 31.222504] [<c045e9db>] schedule+0x76/0xa64
[ 31.222634] [<c01976aa>] ? __pagevec_free+0x79/0x86
[ 31.222504] [<c045e9db>] schedule+0x76/0xa64
[ 31.222634] [<c01976aa>] ? __pagevec_free+0x79/0x86
[ 31.222771] [<c019a0db>] ? release_pages+0x179/0x181
[ 31.222634] [<c01976aa>] ? __pagevec_free+0x79/0x86
[ 31.222771] [<c019a0db>] ? release_pages+0x179/0x181
[ 31.222771] [<c019a0db>] ? release_pages+0x179/0x181
[ 31.222929] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.222929] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.223069] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.222929] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.223069] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.223069] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.223206] [<c019a81d>] ? lru_add_drain+0x96/0x9b
[ 31.223206] [<c019a81d>] ? lru_add_drain+0x96/0x9b
[ 31.223344] [<c012e105>] __cond_resched+0x17/0x23
[ 31.223206] [<c019a81d>] ? lru_add_drain+0x96/0x9b
[ 31.223344] [<c012e105>] __cond_resched+0x17/0x23
[ 31.223478] [<c045f5c4>] _cond_resched+0x25/0x2c
[ 31.223344] [<c012e105>] __cond_resched+0x17/0x23
[ 31.223478] [<c045f5c4>] _cond_resched+0x25/0x2c
[ 31.223613] [<c01a94b8>] unmap_vmas+0x506/0x5ef
[ 31.223478] [<c045f5c4>] _cond_resched+0x25/0x2c
[ 31.223613] [<c01a94b8>] unmap_vmas+0x506/0x5ef
[ 31.223613] [<c01a94b8>] unmap_vmas+0x506/0x5ef
[ 31.223746] [<c01ad21e>] exit_mmap+0xae/0x143
[ 31.223746] [<c01ad21e>] exit_mmap+0xae/0x143
[ 31.223902] [<c01329c3>] mmput+0x3d/0xa0
[ 31.223746] [<c01ad21e>] exit_mmap+0xae/0x143
[ 31.223902] [<c01329c3>] mmput+0x3d/0xa0
[ 31.223902] [<c01329c3>] mmput+0x3d/0xa0
[ 31.224073] [<c013648b>] exit_mm+0xfd/0x105
[ 31.224073] [<c013648b>] exit_mm+0xfd/0x105
[ 31.224204] [<c0137a0e>] do_exit+0x1f3/0x5fe
[ 31.224073] [<c013648b>] exit_mm+0xfd/0x105
[ 31.224204] [<c0137a0e>] do_exit+0x1f3/0x5fe
[ 31.224351] [<c0461fa8>] oops_end+0x97/0x9f
[ 31.224204] [<c0137a0e>] do_exit+0x1f3/0x5fe
[ 31.224351] [<c0461fa8>] oops_end+0x97/0x9f
[ 31.224483] [<c011e130>] no_context+0x113/0x11d
[ 31.224351] [<c0461fa8>] oops_end+0x97/0x9f
[ 31.224483] [<c011e130>] no_context+0x113/0x11d
[ 31.224613] [<c011e22c>] __bad_area_nosemaphore+0xf2/0xfa
[ 31.224483] [<c011e130>] no_context+0x113/0x11d
[ 31.224748] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.224613] [<c011e22c>] __bad_area_nosemaphore+0xf2/0xfa
[ 31.224613] [<c011e22c>] __bad_area_nosemaphore+0xf2/0xfa
[ 31.224887] [<c011e246>] bad_area_nosemaphore+0x12/0x15
[ 31.224748] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225020] [<c0463384>] do_page_fault+0x228/0x401
[ 31.224887] [<c011e246>] bad_area_nosemaphore+0x12/0x15
[ 31.225020] [<c0463384>] do_page_fault+0x228/0x401
[ 31.225166] [<c02847f2>] ? blk_finish_plug+0x12/0x2d
[ 31.225317] [<c01ea609>] ? mpage_readpages+0xc9/0xd4
[ 31.225166] [<c02847f2>] ? blk_finish_plug+0x12/0x2d
[ 31.224748] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.224887] [<c011e246>] bad_area_nosemaphore+0x12/0x15
[ 31.225020] [<c0463384>] do_page_fault+0x228/0x401
[ 31.225166] [<c02847f2>] ? blk_finish_plug+0x12/0x2d
[ 31.225317] [<c01ea609>] ? mpage_readpages+0xc9/0xd4
[ 31.225450] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225317] [<c01ea609>] ? mpage_readpages+0xc9/0xd4
[ 31.225450] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225582] [<c0461746>] error_code+0x5a/0x60
[ 31.225450] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225582] [<c0461746>] error_code+0x5a/0x60
[ 31.225712] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225582] [<c0461746>] error_code+0x5a/0x60
[ 31.225844] [<c028dece>] ? cfq_insert_request+0x1d/0x3f5
[ 31.225712] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225712] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225844] [<c028dece>] ? cfq_insert_request+0x1d/0x3f5
[ 31.225977] [<c020a5b4>] ? ext3_bmap+0x84/0x84
[ 31.225844] [<c028dece>] ? cfq_insert_request+0x1d/0x3f5
[ 31.225977] [<c020a5b4>] ? ext3_bmap+0x84/0x84
[ 31.225977] [<c020a5b4>] ? ext3_bmap+0x84/0x84
[ 31.226106] [<c027f733>] __elv_add_request+0x18d/0x1bb
[ 31.226106] [<c027f733>] __elv_add_request+0x18d/0x1bb
[ 31.226271] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.226106] [<c027f733>] __elv_add_request+0x18d/0x1bb
[ 31.226271] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.226403] [<c045f42a>] io_schedule+0x61/0xad
[ 31.226271] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.226403] [<c045f42a>] io_schedule+0x61/0xad
[ 31.226535] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.226403] [<c045f42a>] io_schedule+0x61/0xad
[ 31.226535] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.226665] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.226535] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.226665] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.226815] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.226665] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.226815] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.226945] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.226815] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.226945] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.227079] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.226945] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.227079] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.227079] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.227210] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.227210] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.227380] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.227210] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.227380] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.227514] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.227380] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.227514] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.227646] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.227514] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.227646] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.227776] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.227646] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.227776] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.227906] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.227776] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.227906] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.228065] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.227906] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.228065] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.228195] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.228065] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.228195] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.228345] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.228195] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.228345] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.228489] [<c01c3e21>] sys_read+0x40/0x65
[ 31.228489] [<c01c3e21>] sys_read+0x40/0x65
[ 31.228345] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.228618] [<c0461149>] syscall_call+0x7/0xb
[ 31.228489] [<c01c3e21>] sys_read+0x40/0x65
[ 31.228618] [<c0461149>] syscall_call+0x7/0xb
[ 31.228618] [<c0461149>] syscall_call+0x7/0xb

Nothing else shown in the netconsole after this point.

* [2] backtrace with the patch from https://lkml.org/lkml/2011/3/25/173
on top of 2f666bc

[ 31.182252] BUG: unable to handle kernel NULL pointer dereferenceNULL pointer dereference at 0000000c
[ 31.182475] IP: at 0000000c
[ 31.182475] IP: [<c028df12>] cfq_insert_request+0x1d/0x3f5
[ 31.182651] *pde = 00000000 [<c028df12>] cfq_insert_request+0x1d/0x3f5
[ 31.182651] *pde = 00000000 NULL pointer dereference
at 0000000c
[ 31.182818] Oops: 0000 [#1]
PREEMPT [ 31.182818] Oops: 0000 [#1] [ 31.182475] IP:SMP PREEMPT DEBUG_PAGEALLOCSMP [<c028df12>] cfq_insert_request+0x1d/0x3f5
DEBUG_PAGEALLOC[ 31.182651] *pde = 00000000
[ 31.182818] Oops: 0000 [#1] [ 31.183119] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
PREEMPT [ 31.183334] Modules linked in:[ 31.183119] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
fuse[ 31.183334] Modules linked in:SMP tun fuse coretemp tunDEBUG_PAGEALLOC hwmon snd_hda_codec_realtek coretemp
hwmon snd_hda_intel[ 31.183119] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
snd_hda_codec_realtek arc4 snd_hda_intel ecb[ 31.183334] Modules linked in: arc4 snd_hda_codec ecb fuse snd_pcm_oss snd_hda_codec snd_mixer_oss tun snd_pcm iwl3945 coretemp snd_pcm_oss snd_mixer_oss hwmon iwl_legacy snd_pcm iwl3945 snd_seq_oss iwl_legacy snd_seq_midi_event snd_seq_oss mac80211 snd_seq_midi_event snd_seq mac80211 cfg80211 snd_seq snd_hda_codec_realtek snd_timer cfg80211 snd_timer snd_hda_intel snd_seq_device arc4 snd snd_seq_device snd_page_alloc snd ecb rtc_cmos snd_page_alloc video rtc_cmos snd_hda_codec fujitsu_laptop video rtc_core fujitsu_laptop snd_pcm_oss rtc_lib rtc_core snd_mixer_oss rtc_lib backlight snd_pcm [last unloaded: scsi_wait_scan] iwl3945 backlight [last unloaded: scsi_wait_scan]
iwl_legacy
snd_seq_oss[ 31.185118]
snd_seq_midi_event[ 31.185118]
[ 31.185241] Pid: 2759, comm: Xorg Not tainted 2.6.39-rc4-lkml-fix-00090-gab11267 #121 mac80211 FUJITSU SIEMENS LIFEBOOK S7110[ 31.185241] Pid: 2759, comm: Xorg Not tainted 2.6.39-rc4-lkml-fix-00090-gab11267 #121 snd_seq/FJNB1B5 FUJITSU SIEMENS LIFEBOOK S7110/FJNB1B5 cfg80211
snd_timer
[ 31.185579] EIP: 0060:[<c028df12>] EFLAGS: 00010086 CPU: 0
snd_seq_device[ 31.185579] EIP: 0060:[<c028df12>] EFLAGS: 00010086 CPU: 0
snd[ 31.185706] EIP is at cfq_insert_request+0x1d/0x3f5
snd_page_alloc[ 31.185706] EIP is at cfq_insert_request+0x1d/0x3f5
rtc_cmos[ 31.185830] EAX: 00000000 EBX: f504c750 ECX: c062c184 EDX: f504c750
[ 31.185830] EAX: 00000000 EBX: f504c750 ECX: c062c184 EDX: f504c750
video[ 31.185878] ESI: 00000000 EDI: f5d7b800 EBP: f566dd38 ESP: f566dd0c
fujitsu_laptop[ 31.185878] ESI: 00000000 EDI: f5d7b800 EBP: f566dd38 ESP: f566dd0c
rtc_core[ 31.185878] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 31.185878] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
rtc_lib backlight[ 31.185878] Process Xorg (pid: 2759, ti=f566c000 task=f5cd9900 task.ti=f566c000)
[last unloaded: scsi_wait_scan][ 31.185878] Process Xorg (pid: 2759, ti=f566c000 task=f5cd9900 task.ti=f566c000)
[ 31.185878] Stack:
[ 31.185878] f4deaf2c[ 31.185878] Stack:
[ 31.185118]
00000020[ 31.185878] f4deaf2c 00000020 00000020[ 31.185241] Pid: 2759, comm: Xorg Not tainted 2.6.39-rc4-lkml-fix-00090-gab11267 #121 00000000 00000020 f566dd28 00000000 FUJITSU SIEMENS LIFEBOOK S7110 f504c750 f566dd28 0009be28 f504c750/FJNB1B5 0009be28
00000000 00000000[ 31.185579] EIP: 0060:[<c028df12>] EFLAGS: 00010086 CPU: 0
[ 31.185706] EIP is at cfq_insert_request+0x1d/0x3f5
[ 31.185878] f504c750
f5f39f40[ 31.185878] f504c750 00000000 f5f39f40[ 31.185830] EAX: 00000000 EBX: f504c750 ECX: c062c184 EDX: f504c750
f566dd50 00000000 c027f733 f566dd50[ 31.185878] ESI: 00000000 EDI: f5d7b800 EBP: f566dd38 ESP: f566dd0c
00000006 c027f733 f504c750 00000006[ 31.185878] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
f504c750[ 31.185878] Process Xorg (pid: 2759, ti=f566c000 task=f5cd9900 task.ti=f566c000)
f5f39f40 f5f39f40[ 31.185878] Stack:
[ 31.185878] f4deaf2c[ 31.185878] 00000000
[ 31.185878] 00000000 00000020 00000020 00000000 f566dd28 f504c750 0009be28 00000000
[ 31.185878] f504c750 f5f39f40 00000000 f566dd50 c027f733 00000006 f504c750 f5f39f40
[ 31.185878] 00000000[ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
[ 31.185878] [<c0461786>] error_code+0x5a/0x60
[ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
[ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
[ 31.185878] [<c0461786>] error_code+0x5a/0x60
[ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
[ 31.185878] Code: [ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
00 [ 31.185878] Code: b8 00 [ 31.185878] [<c0461786>] error_code+0x5a/0x60
b8 [ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
01 [ 31.185878] Code: 00 01 00 00 00 00 00 8d 00 b8 65 8d f4 65 5b 01 f4 5e 5b 5f 5e 00 5d 5f c3 5d 00 55 c3 89 55 00 e5 89 57 8d e5 56 57 65 56 53 f4 83 53 ec 5b 83 20 ec 3e 5e 8d 20 3e 74 5f 8d 26 74 00 26 5d 8b 00 40 c3 8b 0c 40 8b 55 0c 72 8b 89 5c 72 89 5c e5 55 89 e8 55 57 8b e8 78 8b 56 04 78 8b 04 53 42 8b 58 42 83 <8b> 58 50 <8b> ec 50 20 0c 89 0c 3e f0 89 83 f0 8d c2 83 0c c2 74 e8 0c 42 e8 26 f9 42 ff f9 00 ff ff 8b ff 8b 4d 8b e8 4d 40 a1 e8 40 a1 0c ba 40 5f ba 8b c0 5f c0 72
5c [ 31.185878] EIP: [<c028df12>]
cfq_insert_request+0x1d/0x3f5[ 31.185878] EIP: [<c028df12>] 89 cfq_insert_request+0x1d/0x3f555 e8 SS:ESP 0068:f566dd0c
8b SS:ESP 0068:f566dd0c
[ 31.185878] CR2: 000000000000000c
78 [ 31.185878] CR2: 000000000000000c
04 [ 31.185878] ---[ end trace 35f4df0d1e223ee0 ]---
[ 31.185878] ---[ end trace 35f4df0d1e223ee0 ]---
8b [ 31.185878] note: Xorg[2759] exited with preempt_count 1
42 [ 31.185878] note: Xorg[2759] exited with preempt_count 1
58 <8b> 50 0c 89 f0 83 c2 0c e8 42 f9 ff ff 8b 4d e8 a1 40 ba 5f c0
[ 31.185878] EIP: [<c028df12>] cfq_insert_request+0x1d/0x3f5 SS:ESP 0068:f566dd0c
[ 31.185878] CR2: 000000000000000c
[ 31.185878] ---[ end trace 35f4df0d1e223ee0 ]---
[ 31.185878] note: Xorg[2759] exited with preempt_count 1
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

Attachments:

(No filename) (22.66 kB)
config-2.6.39-rc4-00089-g2f666bc (69.29 kB)
Download all attachments

2011-04-20 13:03:26

by Jens Axboe

[permalink] [raw]

Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-20 14:58, Michal Hocko wrote:
> Hi,
> I am not able to boot with the current git tree
> (2.6.39-rc4-00089-g2f666bc). 2.6.39-rc3 boots just fine.
>
> The backtrace [1] looks really messy.
>
> I have seen a similar backtrace at https://lkml.org/lkml/2011/3/25/169
> and tried the patch from https://lkml.org/lkml/2011/3/25/173 but it
> didn't help. The backtrace looks similar (much smaller) but the machine
> ends up dead as well with unbalanced preempt counter[2].
>
> The config is attached.
>
> Is this a known problem?

Not sure, can you please try with the attached patch applied as well?

diff --git a/block/blk-core.c b/block/blk-core.c
index 5fa3dd2..a2e58ee 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -292,7 +292,6 @@ EXPORT_SYMBOL(blk_sync_queue);
/**
* __blk_run_queue - run a single device queue
* @q: The queue to run
- * @force_kblockd: Don't run @q->request_fn directly. Use kblockd.
*
* Description:
* See @blk_run_queue. This variant must be called with the queue lock
@@ -303,15 +302,7 @@ void __blk_run_queue(struct request_queue *q)
if (unlikely(blk_queue_stopped(q)))
return;

- /*
- * Only recurse once to avoid overrunning the stack, let the unplug
- * handling reinvoke the handler shortly if we already got there.
- */
- if (!queue_flag_test_and_set(QUEUE_FLAG_REENTER, q)) {
- q->request_fn(q);
- queue_flag_clear(QUEUE_FLAG_REENTER, q);
- } else
- queue_delayed_work(kblockd_workqueue, &q->delay_work, 0);
+ q->request_fn(q);
}
EXPORT_SYMBOL(__blk_run_queue);

@@ -328,6 +319,7 @@ void blk_run_queue_async(struct request_queue *q)
if (likely(!blk_queue_stopped(q)))
queue_delayed_work(kblockd_workqueue, &q->delay_work, 0);
}
+EXPORT_SYMBOL(blk_run_queue_async);

/**
* blk_run_queue - run a single device queue
@@ -2787,7 +2779,6 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)

local_irq_restore(flags);
}
-EXPORT_SYMBOL(blk_flush_plug_list);

void blk_finish_plug(struct blk_plug *plug)
{
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 6d73512..bd23631 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -66,14 +66,14 @@ queue_requests_store(struct request_queue *q, const char *page, size_t count)

if (rl->count[BLK_RW_SYNC] >= q->nr_requests) {
blk_set_queue_full(q, BLK_RW_SYNC);
- } else if (rl->count[BLK_RW_SYNC]+1 <= q->nr_requests) {
+ } else {
blk_clear_queue_full(q, BLK_RW_SYNC);
wake_up(&rl->wait[BLK_RW_SYNC]);
}

if (rl->count[BLK_RW_ASYNC] >= q->nr_requests) {
blk_set_queue_full(q, BLK_RW_ASYNC);
- } else if (rl->count[BLK_RW_ASYNC]+1 <= q->nr_requests) {
+ } else {
blk_clear_queue_full(q, BLK_RW_ASYNC);
wake_up(&rl->wait[BLK_RW_ASYNC]);
}
@@ -508,8 +508,10 @@ int blk_register_queue(struct gendisk *disk)
return ret;

ret = kobject_add(&q->kobj, kobject_get(&dev->kobj), "%s", "queue");
- if (ret < 0)
+ if (ret < 0) {
+ blk_trace_remove_sysfs(dev);
return ret;
+ }

kobject_uevent(&q->kobj, KOBJ_ADD);

diff --git a/block/blk.h b/block/blk.h
index c9df8fc..6126346 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -22,7 +22,6 @@ void blk_rq_timed_out_timer(unsigned long data);
void blk_delete_timer(struct request *);
void blk_add_timer(struct request *);
void __generic_unplug_device(struct request_queue *);
-void blk_run_queue_async(struct request_queue *q);

/*
* Internal atomic flags for request handling
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 46b0a1d..5b52011 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -2582,28 +2582,20 @@ static void cfq_put_queue(struct cfq_queue *cfqq)
}

/*
- * Must always be called with the rcu_read_lock() held
+ * Call func for each cic attached to this ioc.
*/
static void
-__call_for_each_cic(struct io_context *ioc,
- void (*func)(struct io_context *, struct cfq_io_context *))
+call_for_each_cic(struct io_context *ioc,
+ void (*func)(struct io_context *, struct cfq_io_context *))
{
struct cfq_io_context *cic;
struct hlist_node *n;

+ rcu_read_lock();
+
hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list)
func(ioc, cic);
-}

-/*
- * Call func for each cic attached to this ioc.
- */
-static void
-call_for_each_cic(struct io_context *ioc,
- void (*func)(struct io_context *, struct cfq_io_context *))
-{
- rcu_read_lock();
- __call_for_each_cic(ioc, func);
rcu_read_unlock();
}

@@ -2664,7 +2656,7 @@ static void cfq_free_io_context(struct io_context *ioc)
* should be ok to iterate over the known list, we will see all cic's
* since no new ones are added.
*/
- __call_for_each_cic(ioc, cic_free_func);
+ call_for_each_cic(ioc, cic_free_func);
}

static void cfq_put_cooperator(struct cfq_queue *cfqq)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ab55c2f..e9901b8 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -411,8 +411,6 @@ static void scsi_run_queue(struct request_queue *q)
list_splice_init(&shost->starved_list, &starved_list);

while (!list_empty(&starved_list)) {
- int flagset;
-
/*
* As long as shost is accepting commands and we have
* starved queues, call blk_run_queue. scsi_request_fn
@@ -435,20 +433,7 @@ static void scsi_run_queue(struct request_queue *q)
continue;
}

- spin_unlock(shost->host_lock);
-
- spin_lock(sdev->request_queue->queue_lock);
- flagset = test_bit(QUEUE_FLAG_REENTER, &q->queue_flags) &&
- !test_bit(QUEUE_FLAG_REENTER,
- &sdev->request_queue->queue_flags);
- if (flagset)
- queue_flag_set(QUEUE_FLAG_REENTER, sdev->request_queue);
- __blk_run_queue(sdev->request_queue);
- if (flagset)
- queue_flag_clear(QUEUE_FLAG_REENTER, sdev->request_queue);
- spin_unlock(sdev->request_queue->queue_lock);
-
- spin_lock(shost->host_lock);
+ blk_run_queue_async(sdev->request_queue);
}
/* put any unprocessed entries back */
list_splice(&starved_list, &shost->starved_list);
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 28c3350..815069d 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3816,28 +3816,17 @@ fail_host_msg:
static void
fc_bsg_goose_queue(struct fc_rport *rport)
{
- int flagset;
- unsigned long flags;
-
if (!rport->rqst_q)
return;

+ /*
+ * This get/put dance makes no sense
+ */
get_device(&rport->dev);
-
- spin_lock_irqsave(rport->rqst_q->queue_lock, flags);
- flagset = test_bit(QUEUE_FLAG_REENTER, &rport->rqst_q->queue_flags) &&
- !test_bit(QUEUE_FLAG_REENTER, &rport->rqst_q->queue_flags);
- if (flagset)
- queue_flag_set(QUEUE_FLAG_REENTER, rport->rqst_q);
- __blk_run_queue(rport->rqst_q);
- if (flagset)
- queue_flag_clear(QUEUE_FLAG_REENTER, rport->rqst_q);
- spin_unlock_irqrestore(rport->rqst_q->queue_lock, flags);
-
+ blk_run_queue_async(rport->rqst_q);
put_device(&rport->dev);
}

-
/**
* fc_bsg_rport_dispatch - process rport bsg requests and dispatch to LLDD
* @q: rport request queue
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index cbbfd98..2ad95fa 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -388,20 +388,19 @@ struct request_queue
#define QUEUE_FLAG_SYNCFULL 3 /* read queue has been filled */
#define QUEUE_FLAG_ASYNCFULL 4 /* write queue has been filled */
#define QUEUE_FLAG_DEAD 5 /* queue being torn down */
-#define QUEUE_FLAG_REENTER 6 /* Re-entrancy avoidance */
-#define QUEUE_FLAG_ELVSWITCH 7 /* don't use elevator, just do FIFO */
-#define QUEUE_FLAG_BIDI 8 /* queue supports bidi requests */
-#define QUEUE_FLAG_NOMERGES 9 /* disable merge attempts */
-#define QUEUE_FLAG_SAME_COMP 10 /* force complete on same CPU */
-#define QUEUE_FLAG_FAIL_IO 11 /* fake timeout */
-#define QUEUE_FLAG_STACKABLE 12 /* supports request stacking */
-#define QUEUE_FLAG_NONROT 13 /* non-rotational device (SSD) */
+#define QUEUE_FLAG_ELVSWITCH 6 /* don't use elevator, just do FIFO */
+#define QUEUE_FLAG_BIDI 7 /* queue supports bidi requests */
+#define QUEUE_FLAG_NOMERGES 8 /* disable merge attempts */
+#define QUEUE_FLAG_SAME_COMP 9 /* force complete on same CPU */
+#define QUEUE_FLAG_FAIL_IO 10 /* fake timeout */
+#define QUEUE_FLAG_STACKABLE 11 /* supports request stacking */
+#define QUEUE_FLAG_NONROT 12 /* non-rotational device (SSD) */
#define QUEUE_FLAG_VIRT QUEUE_FLAG_NONROT /* paravirt device */
-#define QUEUE_FLAG_IO_STAT 15 /* do IO stats */
-#define QUEUE_FLAG_DISCARD 16 /* supports DISCARD */
-#define QUEUE_FLAG_NOXMERGES 17 /* No extended merges */
-#define QUEUE_FLAG_ADD_RANDOM 18 /* Contributes to random pool */
-#define QUEUE_FLAG_SECDISCARD 19 /* supports SECDISCARD */
+#define QUEUE_FLAG_IO_STAT 13 /* do IO stats */
+#define QUEUE_FLAG_DISCARD 14 /* supports DISCARD */
+#define QUEUE_FLAG_NOXMERGES 15 /* No extended merges */
+#define QUEUE_FLAG_ADD_RANDOM 16 /* Contributes to random pool */
+#define QUEUE_FLAG_SECDISCARD 17 /* supports SECDISCARD */

#define QUEUE_FLAG_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) | \
(1 << QUEUE_FLAG_STACKABLE) | \
@@ -699,6 +698,7 @@ extern void blk_sync_queue(struct request_queue *q);
extern void __blk_stop_queue(struct request_queue *q);
extern void __blk_run_queue(struct request_queue *q);
extern void blk_run_queue(struct request_queue *);
+extern void blk_run_queue_async(struct request_queue *q);
extern int blk_rq_map_user(struct request_queue *, struct request *,
struct rq_map_data *, void __user *, unsigned long,
gfp_t);

--
Jens Axboe

2011-04-20 13:13:16

by Jens Axboe

[permalink] [raw]

On 2011-04-22 09:00, Michal Hocko wrote:
> On Thu 21-04-11 21:00:26, Jens Axboe wrote:
>> On 2011-04-21 20:51, Michal Hocko wrote:
>>> On Thu 21-04-11 07:38:57, Linus Torvalds wrote:
>>>> On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
>>>>>>
>>>>>> I am going to bisect, let's see if I can find anything.
>>>>>
>>>>> Thanks, that would be great!
>>>>
>>>> I'd expect it to be very timing-dependent, and thus could easily be
>>>> triggered (or hidden) by unrelated changes.
>>>>
>>>> Just happening to have a request added to the elevator at _just_ the
>>>> same moment that another CPU is changing it and getting rid of the
>>>> data structures for the old one.
>>>
>>> And it really looks like a timing issue. I have bisected down to
>>> e710d7d5a9cab1041b7a3cf9e655b75d92786857. I had to skip[1] some commits
>>> due to compile errors [2].
>>> At first it looked quite promising because I was able to boot after I
>>> reverted that patch but then I have tried to revert it on top of rc4
>>> (2f666bcf757cb72549f360ef6da02f03620a48b6) and saw the same problem
>>> again.
>>>
>>> So I do not think that bisecting will help here.
>>
>> It will be timing dependent. If there's no allocated IO requests when
>> the switch happens, it'll work.
>>
>> But the commit that caused this regression is 5e84ea3a. If you revert
>> that, it should work fine. Or just apply the patch I sent (or update to
>> Linus' tree, it's in now) and it'll work as well.
>
> Great. I can boot just fine with the current Linus tree
> (2.6.39-rc4-00149-g91e8549).
>
> Thanks a lot!

Great, with that confirmed from you, we have no pending bugs in this
area.

--
Jens Axboe

2011-04-22 20:30:48

by Hugh Dickins

[permalink] [raw]

Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Fri, Apr 22, 2011 at 4:35 AM, Jens Axboe <[email protected]> wrote:
> On 2011-04-22 09:00, Michal Hocko wrote:
>> On Thu 21-04-11 21:00:26, Jens Axboe wrote:
>>> On 2011-04-21 20:51, Michal Hocko wrote:
>>>> On Thu 21-04-11 07:38:57, Linus Torvalds wrote:
>>>>> On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
>>>>>>>
>>>>>>> I am going to bisect, let's see if I can find anything.
>>>>>>
>>>>>> Thanks, that would be great!
>>>>>
>>>>> I'd expect it to be very timing-dependent, and thus could easily be
>>>>> triggered (or hidden) by unrelated changes.
>>>>>
>>>>> Just happening to have a request added to the elevator at _just_ the
>>>>> same moment that another CPU is changing it and getting rid of the
>>>>> data structures for the old one.
>>>>
>>>> And it really looks like a timing issue. I have bisected down to
>>>> e710d7d5a9cab1041b7a3cf9e655b75d92786857. I had to skip[1] some commits
>>>> due to compile errors [2].
>>>> At first it looked quite promising because I was able to boot after I
>>>> reverted that patch but then I have tried to revert it on top of rc4
>>>> (2f666bcf757cb72549f360ef6da02f03620a48b6) and saw the same problem
>>>> again.
>>>>
>>>> So I do not think that bisecting will help here.
>>>
>>> It will be timing dependent. If there's no allocated IO requests when
>>> the switch happens, it'll work.
>>>
>>> But the commit that caused this regression is 5e84ea3a. If you revert
>>> that, it should work fine. Or just apply the patch I sent (or update to
>>> Linus' tree, it's in now) and it'll work as well.
>>
>> Great. I can boot just fine with the current Linus tree
>> (2.6.39-rc4-00149-g91e8549).
>>
>> Thanks a lot!
>
> Great, with that confirmed from you, we have no pending bugs in this
> area.

Yes, I booted up last night's git kernel on the PowerPC G5, and
confirm that CONFIG_IDE=y worked with that, whereas rc4 still had the
hang (even with CONFIG_SMP=y CONFIG_PREEMPT=y).

Hugh