2011-04-20 12:58:31

by Michal Hocko

[permalink] [raw]
Subject: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

Hi,
I am not able to boot with the current git tree
(2.6.39-rc4-00089-g2f666bc). 2.6.39-rc3 boots just fine.

The backtrace [1] looks really messy.

I have seen a similar backtrace at https://lkml.org/lkml/2011/3/25/169
and tried the patch from https://lkml.org/lkml/2011/3/25/173 but it
didn't help. The backtrace looks similar (much smaller) but the machine
ends up dead as well with unbalanced preempt counter[2].

The config is attached.

Is this a known problem?

---
* [1] backtrace with the current git tree (2f666bc)

[ 31.207283] BUG: unable to handle kernel NULL pointer dereference at 0000000c
NULL pointer dereference[ 31.207540] IP: at 0000000c
[ 31.207540] IP: [<c028dece>] cfq_insert_request+0x1d/0x3f5
[ 31.207720] *pde = 00000000 [<c028dece>] cfq_insert_request+0x1d/0x3f5
NULL pointer dereference[ 31.207720] *pde = 00000000
at 0000000c
[ 31.207888] Oops: 0000 [#1]
PREEMPT [ 31.207540] IP:[ 31.207888] Oops: 0000 [#1] SMP PREEMPT DEBUG_PAGEALLOC [<c028dece>] cfq_insert_request+0x1d/0x3f5
SMP DEBUG_PAGEALLOC[ 31.207720] *pde = 00000000
[ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
[ 31.208401] Modules linked in:[ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
SMP fuse[ 31.208401] Modules linked in: tun fuseDEBUG_PAGEALLOC coretemp tun hwmon coretemp
snd_hda_codec_realtek hwmon snd_hda_codec_realtek[ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
snd_hda_intel[ 31.208401] Modules linked in: snd_hda_codec snd_hda_intel fuse snd_pcm_oss snd_hda_codec snd_mixer_oss snd_pcm_oss tun snd_pcm snd_mixer_oss snd_seq_oss snd_pcm coretemp snd_seq_oss hwmon snd_seq_midi_event snd_seq snd_seq_midi_event snd_hda_codec_realtek arc4 snd_seq ecb arc4 snd_hda_intel iwl3945 ecb snd_timer iwl3945 snd_hda_codec snd_seq_device snd_timer snd snd_seq_device snd_pcm_oss iwl_legacy snd snd_mixer_oss iwl_legacy mac80211 snd_pcm cfg80211 mac80211 snd_page_alloc cfg80211 snd_seq_oss rtc_cmos snd_page_alloc rtc_core snd_seq_midi_event rtc_cmos fujitsu_laptop rtc_core snd_seq video fujitsu_laptop rtc_lib video arc4 rtc_lib ecb backlight [last unloaded: scsi_wait_scan] backlight iwl3945 [last unloaded: scsi_wait_scan] snd_timer

snd_seq_device[ 31.210204]
snd[ 31.210204]
[ 31.210326] Pid: 2780, comm: Xorg Not tainted 2.6.39-rc4-00089-g2f666bc #119[ 31.210326] Pid: 2780, comm: Xorg Not tainted 2.6.39-rc4-00089-g2f666bc #119 iwl_legacy FUJITSU SIEMENS LIFEBOOK S7110 mac80211/FJNB1B5 FUJITSU SIEMENS LIFEBOOK S7110/FJNB1B5 cfg80211
snd_page_alloc
[ 31.210664] EIP: 0060:[<c028dece>] EFLAGS: 00010086 CPU: 0
rtc_cmos[ 31.210664] EIP: 0060:[<c028dece>] EFLAGS: 00010086 CPU: 0
rtc_core[ 31.210794] EIP is at cfq_insert_request+0x1d/0x3f5
[ 31.210794] EIP is at cfq_insert_request+0x1d/0x3f5
fujitsu_laptop[ 31.210923] EAX: 00000000 EBX: f52c9680 ECX: c062c184 EDX: f52c9680
video[ 31.210923] EAX: 00000000 EBX: f52c9680 ECX: c062c184 EDX: f52c9680
rtc_lib[ 31.211052] ESI: 00000000 EDI: f54b9000 EBP: f5069da4 ESP: f5069d78
backlight[ 31.211052] ESI: 00000000 EDI: f54b9000 EBP: f5069da4 ESP: f5069d78
[ 31.211140] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[last unloaded: scsi_wait_scan][ 31.211140] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068

[ 31.210204]
[ 31.211140] Process Xorg (pid: 2780, ti=f5068000 task=f54c3200 task.ti=f5068000)
[ 31.210326] Pid: 2780, comm: Xorg Not tainted 2.6.39-rc4-00089-g2f666bc #119[ 31.211140] Process Xorg (pid: 2780, ti=f5068000 task=f54c3200 task.ti=f5068000)
[ 31.211140] Stack:
FUJITSU SIEMENS LIFEBOOK S7110[ 31.211140] 00000000[ 31.211140] Stack:
/FJNB1B5 f5069d7c[ 31.211140] 00000000 f5069d7c f5069d7c
f5069d84 f5069d7c f5069d84 f5069d84[ 31.210664] EIP: 0060:[<c028dece>] EFLAGS: 00010086 CPU: 0
f52c9680 f5069d84 c020a5b4 f52c9680[ 31.210794] EIP is at cfq_insert_request+0x1d/0x3f5
c020a5b4[ 31.210923] EAX: 00000000 EBX: f52c9680 ECX: c062c184 EDX: f52c9680
00000004 00000004[ 31.211052] ESI: 00000000 EDI: f54b9000 EBP: f5069da4 ESP: f5069d78

[ 31.211140] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 31.211140] f52c9680
f5551f40[ 31.211140] f52c9680[ 31.211140] Process Xorg (pid: 2780, ti=f5068000 task=f54c3200 task.ti=f5068000)
00000000 f5551f40 f5069dbc 00000000[ 31.211140] Stack:
c027f733 f5069dbc 00000006 c027f733[ 31.211140] 00000000 f52c9680 00000006 f5069d7c f52c9680 f5551f40 f5069d7c f5551f40 f5069d84
[ 31.211140] 00000000
f5069d84[ 31.211140] 00000000 f52c9680 c020a5b4 00000004
[ 31.211140] f52c9680 f5551f40 00000000[ 31.211140] [<c027f733>] __elv_add_request+0x18d/0x1bb
f5069dbc[ 31.211140] [<c027f733>] __elv_add_request+0x18d/0x1bb
c027f733[ 31.211140] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
00000006[ 31.211140] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.211140] [<c045f42a>] io_schedule+0x61/0xad
f52c9680[ 31.211140] [<c045f42a>] io_schedule+0x61/0xad
f5551f40[ 31.211140] [<c0192697>] sleep_on_page_killable+0xd/0x32

[ 31.211140] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.211140] 00000000[ 31.211140] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.211140] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.211140] [<c027f733>] __elv_add_request+0x18d/0x1bb
[ 31.211140] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.211140] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.211140] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.211140] [<c045f42a>] io_schedule+0x61/0xad
[ 31.211140] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.211140] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.211140] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.211140] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.211140] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.211140] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.211140] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.211140] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.211140] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.211140] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.211140] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.211140] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.211140] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.211140] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.211140] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.211140] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.211140] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.211140] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.211140] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.211140] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.211140] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.211140] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.211140] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.211140] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.211140] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.211140] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.211140] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.211140] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.211140] [<c01c3e21>] sys_read+0x40/0x65
[ 31.211140] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.211140] [<c01c3e21>] sys_read+0x40/0x65
[ 31.211140] [<c0461149>] syscall_call+0x7/0xb
[ 31.211140] [<c01c3e21>] sys_read+0x40/0x65
[ 31.211140] Code: [ 31.211140] [<c0461149>] syscall_call+0x7/0xb
[ 31.211140] Code: [ 31.211140] [<c0461149>] syscall_call+0x7/0xb
00 b8 [ 31.211140] Code: 00 01 b8 00 01 00 00 00 00 00 b8 8d 00 65 8d 01 f4 65 5b f4 00 5e 5b 5f 5e 00 5d 5f c3 5d 00 55 c3 89 8d 55 e5 89 57 65 e5 56 57 f4 56 53 5b 83 53 ec 83 5e 20 ec 3e 20 5f 8d 3e 74 8d 5d 26 74 00 c3 26 8b 00 40 55 8b 0c 40 8b 89 0c 72 e5 5c 89 55 8b 57 72 e8 5c 89 8b 56 55 78 04 53 8b e8 42 8b 83 58 78 04 <8b> 8b ec 42 50 58 20 <8b> 50 0c 3e 89 0c f0 89 8d 83 f0 c2 83 74 0c c2 e8 0c 26 42 e8 f9 00 42 ff f9 ff 8b 40 0c 8b 72 5c 89 55 e8 8b 78 04 8b 42 58 <8b> 50 0c 89 f0 83 c2 0c e8 42 f9 ff rtc_core fujitsu_laptop rtc_core video fujitsu_laptop rtc_lib video rtc_lib rtc_core backlight backlight [last unloaded: scsi_wait_scan] fujitsu_laptop [last unloaded: scsi_wait_scan] video
rtc_lib
backlight[ 31.222032] Pid: 2780, comm: Xorg Tainted: G D 2.6.39-rc4-00089-g2f666bc #119
[last unloaded: scsi_wait_scan][ 31.222032] Pid: 2780, comm: Xorg Tainted: G D 2.6.39-rc4-00089-g2f666bc #119

[ 31.222243] Call Trace:
[ 31.222032] Pid: 2780, comm: Xorg Tainted: G D 2.6.39-rc4-00089-g2f666bc #119
[ 31.222243] Call Trace:
[ 31.222243] Call Trace:
[ 31.222372] [<c0128494>] __schedule_bug+0x4d/0x52
[ 31.222504] [<c045e9db>] schedule+0x76/0xa64
[ 31.222372] [<c0128494>] __schedule_bug+0x4d/0x52
[ 31.222372] [<c0128494>] __schedule_bug+0x4d/0x52
[ 31.222504] [<c045e9db>] schedule+0x76/0xa64
[ 31.222634] [<c01976aa>] ? __pagevec_free+0x79/0x86
[ 31.222504] [<c045e9db>] schedule+0x76/0xa64
[ 31.222634] [<c01976aa>] ? __pagevec_free+0x79/0x86
[ 31.222771] [<c019a0db>] ? release_pages+0x179/0x181
[ 31.222634] [<c01976aa>] ? __pagevec_free+0x79/0x86
[ 31.222771] [<c019a0db>] ? release_pages+0x179/0x181
[ 31.222771] [<c019a0db>] ? release_pages+0x179/0x181
[ 31.222929] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.222929] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.223069] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.222929] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.223069] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.223069] [<c0463566>] ? sub_preempt_count+0x9/0x8e
[ 31.223206] [<c019a81d>] ? lru_add_drain+0x96/0x9b
[ 31.223206] [<c019a81d>] ? lru_add_drain+0x96/0x9b
[ 31.223344] [<c012e105>] __cond_resched+0x17/0x23
[ 31.223206] [<c019a81d>] ? lru_add_drain+0x96/0x9b
[ 31.223344] [<c012e105>] __cond_resched+0x17/0x23
[ 31.223478] [<c045f5c4>] _cond_resched+0x25/0x2c
[ 31.223344] [<c012e105>] __cond_resched+0x17/0x23
[ 31.223478] [<c045f5c4>] _cond_resched+0x25/0x2c
[ 31.223613] [<c01a94b8>] unmap_vmas+0x506/0x5ef
[ 31.223478] [<c045f5c4>] _cond_resched+0x25/0x2c
[ 31.223613] [<c01a94b8>] unmap_vmas+0x506/0x5ef
[ 31.223613] [<c01a94b8>] unmap_vmas+0x506/0x5ef
[ 31.223746] [<c01ad21e>] exit_mmap+0xae/0x143
[ 31.223746] [<c01ad21e>] exit_mmap+0xae/0x143
[ 31.223902] [<c01329c3>] mmput+0x3d/0xa0
[ 31.223746] [<c01ad21e>] exit_mmap+0xae/0x143
[ 31.223902] [<c01329c3>] mmput+0x3d/0xa0
[ 31.223902] [<c01329c3>] mmput+0x3d/0xa0
[ 31.224073] [<c013648b>] exit_mm+0xfd/0x105
[ 31.224073] [<c013648b>] exit_mm+0xfd/0x105
[ 31.224204] [<c0137a0e>] do_exit+0x1f3/0x5fe
[ 31.224073] [<c013648b>] exit_mm+0xfd/0x105
[ 31.224204] [<c0137a0e>] do_exit+0x1f3/0x5fe
[ 31.224351] [<c0461fa8>] oops_end+0x97/0x9f
[ 31.224204] [<c0137a0e>] do_exit+0x1f3/0x5fe
[ 31.224351] [<c0461fa8>] oops_end+0x97/0x9f
[ 31.224483] [<c011e130>] no_context+0x113/0x11d
[ 31.224351] [<c0461fa8>] oops_end+0x97/0x9f
[ 31.224483] [<c011e130>] no_context+0x113/0x11d
[ 31.224613] [<c011e22c>] __bad_area_nosemaphore+0xf2/0xfa
[ 31.224483] [<c011e130>] no_context+0x113/0x11d
[ 31.224748] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.224613] [<c011e22c>] __bad_area_nosemaphore+0xf2/0xfa
[ 31.224613] [<c011e22c>] __bad_area_nosemaphore+0xf2/0xfa
[ 31.224887] [<c011e246>] bad_area_nosemaphore+0x12/0x15
[ 31.224748] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225020] [<c0463384>] do_page_fault+0x228/0x401
[ 31.224887] [<c011e246>] bad_area_nosemaphore+0x12/0x15
[ 31.225020] [<c0463384>] do_page_fault+0x228/0x401
[ 31.225166] [<c02847f2>] ? blk_finish_plug+0x12/0x2d
[ 31.225317] [<c01ea609>] ? mpage_readpages+0xc9/0xd4
[ 31.225166] [<c02847f2>] ? blk_finish_plug+0x12/0x2d
[ 31.224748] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.224887] [<c011e246>] bad_area_nosemaphore+0x12/0x15
[ 31.225020] [<c0463384>] do_page_fault+0x228/0x401
[ 31.225166] [<c02847f2>] ? blk_finish_plug+0x12/0x2d
[ 31.225317] [<c01ea609>] ? mpage_readpages+0xc9/0xd4
[ 31.225450] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225317] [<c01ea609>] ? mpage_readpages+0xc9/0xd4
[ 31.225450] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225582] [<c0461746>] error_code+0x5a/0x60
[ 31.225450] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225582] [<c0461746>] error_code+0x5a/0x60
[ 31.225712] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225582] [<c0461746>] error_code+0x5a/0x60
[ 31.225844] [<c028dece>] ? cfq_insert_request+0x1d/0x3f5
[ 31.225712] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225712] [<c046315c>] ? spurious_fault+0xb6/0xb6
[ 31.225844] [<c028dece>] ? cfq_insert_request+0x1d/0x3f5
[ 31.225977] [<c020a5b4>] ? ext3_bmap+0x84/0x84
[ 31.225844] [<c028dece>] ? cfq_insert_request+0x1d/0x3f5
[ 31.225977] [<c020a5b4>] ? ext3_bmap+0x84/0x84
[ 31.225977] [<c020a5b4>] ? ext3_bmap+0x84/0x84
[ 31.226106] [<c027f733>] __elv_add_request+0x18d/0x1bb
[ 31.226106] [<c027f733>] __elv_add_request+0x18d/0x1bb
[ 31.226271] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.226106] [<c027f733>] __elv_add_request+0x18d/0x1bb
[ 31.226271] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.226403] [<c045f42a>] io_schedule+0x61/0xad
[ 31.226271] [<c02847ba>] blk_flush_plug_list+0x124/0x14a
[ 31.226403] [<c045f42a>] io_schedule+0x61/0xad
[ 31.226535] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.226403] [<c045f42a>] io_schedule+0x61/0xad
[ 31.226535] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.226665] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.226535] [<c0192697>] sleep_on_page_killable+0xd/0x32
[ 31.226665] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.226815] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.226665] [<c045f9d3>] __wait_on_bit_lock+0x39/0x75
[ 31.226815] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.226945] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.226815] [<c019268a>] ? sleep_on_page+0x11/0x11
[ 31.226945] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.227079] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.226945] [<c019261a>] __lock_page_killable+0x53/0x5a
[ 31.227079] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.227079] [<c014bb29>] ? autoremove_wake_function+0x36/0x36
[ 31.227210] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.227210] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.227380] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.227210] [<c0193a43>] generic_file_aio_read+0x3f2/0x593
[ 31.227380] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.227514] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.227380] [<c01c367c>] do_sync_read+0x8f/0xca
[ 31.227514] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.227646] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.227514] [<c01ce440>] ? do_filp_open+0x26/0x62
[ 31.227646] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.227776] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.227646] [<c01bbf5a>] ? kmem_cache_free+0xac/0xfd
[ 31.227776] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.227906] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.227776] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.227906] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.228065] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.227906] [<c01c2309>] ? fd_install+0x47/0x4d
[ 31.228065] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.228195] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.228065] [<c01c3318>] ? do_sys_open+0x16f/0x179
[ 31.228195] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.228345] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.228195] [<c01c3d4b>] vfs_read+0x8f/0x125
[ 31.228345] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.228489] [<c01c3e21>] sys_read+0x40/0x65
[ 31.228489] [<c01c3e21>] sys_read+0x40/0x65
[ 31.228345] [<c01c35ed>] ? do_sync_write+0xca/0xca
[ 31.228618] [<c0461149>] syscall_call+0x7/0xb
[ 31.228489] [<c01c3e21>] sys_read+0x40/0x65
[ 31.228618] [<c0461149>] syscall_call+0x7/0xb
[ 31.228618] [<c0461149>] syscall_call+0x7/0xb

Nothing else shown in the netconsole after this point.

* [2] backtrace with the patch from https://lkml.org/lkml/2011/3/25/173
on top of 2f666bc

[ 31.182252] BUG: unable to handle kernel NULL pointer dereferenceNULL pointer dereference at 0000000c
[ 31.182475] IP: at 0000000c
[ 31.182475] IP: [<c028df12>] cfq_insert_request+0x1d/0x3f5
[ 31.182651] *pde = 00000000 [<c028df12>] cfq_insert_request+0x1d/0x3f5
[ 31.182651] *pde = 00000000 NULL pointer dereference
at 0000000c
[ 31.182818] Oops: 0000 [#1]
PREEMPT [ 31.182818] Oops: 0000 [#1] [ 31.182475] IP:SMP PREEMPT DEBUG_PAGEALLOCSMP [<c028df12>] cfq_insert_request+0x1d/0x3f5
DEBUG_PAGEALLOC[ 31.182651] *pde = 00000000
[ 31.182818] Oops: 0000 [#1] [ 31.183119] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
PREEMPT [ 31.183334] Modules linked in:[ 31.183119] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
fuse[ 31.183334] Modules linked in:SMP tun fuse coretemp tunDEBUG_PAGEALLOC hwmon snd_hda_codec_realtek coretemp
hwmon snd_hda_intel[ 31.183119] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
snd_hda_codec_realtek arc4 snd_hda_intel ecb[ 31.183334] Modules linked in: arc4 snd_hda_codec ecb fuse snd_pcm_oss snd_hda_codec snd_mixer_oss tun snd_pcm iwl3945 coretemp snd_pcm_oss snd_mixer_oss hwmon iwl_legacy snd_pcm iwl3945 snd_seq_oss iwl_legacy snd_seq_midi_event snd_seq_oss mac80211 snd_seq_midi_event snd_seq mac80211 cfg80211 snd_seq snd_hda_codec_realtek snd_timer cfg80211 snd_timer snd_hda_intel snd_seq_device arc4 snd snd_seq_device snd_page_alloc snd ecb rtc_cmos snd_page_alloc video rtc_cmos snd_hda_codec fujitsu_laptop video rtc_core fujitsu_laptop snd_pcm_oss rtc_lib rtc_core snd_mixer_oss rtc_lib backlight snd_pcm [last unloaded: scsi_wait_scan] iwl3945 backlight [last unloaded: scsi_wait_scan]
iwl_legacy
snd_seq_oss[ 31.185118]
snd_seq_midi_event[ 31.185118]
[ 31.185241] Pid: 2759, comm: Xorg Not tainted 2.6.39-rc4-lkml-fix-00090-gab11267 #121 mac80211 FUJITSU SIEMENS LIFEBOOK S7110[ 31.185241] Pid: 2759, comm: Xorg Not tainted 2.6.39-rc4-lkml-fix-00090-gab11267 #121 snd_seq/FJNB1B5 FUJITSU SIEMENS LIFEBOOK S7110/FJNB1B5 cfg80211
snd_timer
[ 31.185579] EIP: 0060:[<c028df12>] EFLAGS: 00010086 CPU: 0
snd_seq_device[ 31.185579] EIP: 0060:[<c028df12>] EFLAGS: 00010086 CPU: 0
snd[ 31.185706] EIP is at cfq_insert_request+0x1d/0x3f5
snd_page_alloc[ 31.185706] EIP is at cfq_insert_request+0x1d/0x3f5
rtc_cmos[ 31.185830] EAX: 00000000 EBX: f504c750 ECX: c062c184 EDX: f504c750
[ 31.185830] EAX: 00000000 EBX: f504c750 ECX: c062c184 EDX: f504c750
video[ 31.185878] ESI: 00000000 EDI: f5d7b800 EBP: f566dd38 ESP: f566dd0c
fujitsu_laptop[ 31.185878] ESI: 00000000 EDI: f5d7b800 EBP: f566dd38 ESP: f566dd0c
rtc_core[ 31.185878] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 31.185878] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
rtc_lib backlight[ 31.185878] Process Xorg (pid: 2759, ti=f566c000 task=f5cd9900 task.ti=f566c000)
[last unloaded: scsi_wait_scan][ 31.185878] Process Xorg (pid: 2759, ti=f566c000 task=f5cd9900 task.ti=f566c000)
[ 31.185878] Stack:
[ 31.185878] f4deaf2c[ 31.185878] Stack:
[ 31.185118]
00000020[ 31.185878] f4deaf2c 00000020 00000020[ 31.185241] Pid: 2759, comm: Xorg Not tainted 2.6.39-rc4-lkml-fix-00090-gab11267 #121 00000000 00000020 f566dd28 00000000 FUJITSU SIEMENS LIFEBOOK S7110 f504c750 f566dd28 0009be28 f504c750/FJNB1B5 0009be28
00000000 00000000[ 31.185579] EIP: 0060:[<c028df12>] EFLAGS: 00010086 CPU: 0
[ 31.185706] EIP is at cfq_insert_request+0x1d/0x3f5
[ 31.185878] f504c750
f5f39f40[ 31.185878] f504c750 00000000 f5f39f40[ 31.185830] EAX: 00000000 EBX: f504c750 ECX: c062c184 EDX: f504c750
f566dd50 00000000 c027f733 f566dd50[ 31.185878] ESI: 00000000 EDI: f5d7b800 EBP: f566dd38 ESP: f566dd0c
00000006 c027f733 f504c750 00000006[ 31.185878] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
f504c750[ 31.185878] Process Xorg (pid: 2759, ti=f566c000 task=f5cd9900 task.ti=f566c000)
f5f39f40 f5f39f40[ 31.185878] Stack:
[ 31.185878] f4deaf2c[ 31.185878] 00000000
[ 31.185878] 00000000 00000020 00000020 00000000 f566dd28 f504c750 0009be28 00000000
[ 31.185878] f504c750 f5f39f40 00000000 f566dd50 c027f733 00000006 f504c750 f5f39f40
[ 31.185878] 00000000[ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
[ 31.185878] [<c0461786>] error_code+0x5a/0x60
[ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
[ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
[ 31.185878] [<c0461786>] error_code+0x5a/0x60
[ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
[ 31.185878] Code: [ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
00 [ 31.185878] Code: b8 00 [ 31.185878] [<c0461786>] error_code+0x5a/0x60
b8 [ 31.185878] [<c046319c>] ? spurious_fault+0xb6/0xb6
01 [ 31.185878] Code: 00 01 00 00 00 00 00 8d 00 b8 65 8d f4 65 5b 01 f4 5e 5b 5f 5e 00 5d 5f c3 5d 00 55 c3 89 55 00 e5 89 57 8d e5 56 57 65 56 53 f4 83 53 ec 5b 83 20 ec 3e 5e 8d 20 3e 74 5f 8d 26 74 00 26 5d 8b 00 40 c3 8b 0c 40 8b 55 0c 72 8b 89 5c 72 89 5c e5 55 89 e8 55 57 8b e8 78 8b 56 04 78 8b 04 53 42 8b 58 42 83 <8b> 58 50 <8b> ec 50 20 0c 89 0c 3e f0 89 83 f0 8d c2 83 0c c2 74 e8 0c 42 e8 26 f9 42 ff f9 00 ff ff 8b ff 8b 4d 8b e8 4d 40 a1 e8 40 a1 0c ba 40 5f ba 8b c0 5f c0 72
5c [ 31.185878] EIP: [<c028df12>]
cfq_insert_request+0x1d/0x3f5[ 31.185878] EIP: [<c028df12>] 89 cfq_insert_request+0x1d/0x3f555 e8 SS:ESP 0068:f566dd0c
8b SS:ESP 0068:f566dd0c
[ 31.185878] CR2: 000000000000000c
78 [ 31.185878] CR2: 000000000000000c
04 [ 31.185878] ---[ end trace 35f4df0d1e223ee0 ]---
[ 31.185878] ---[ end trace 35f4df0d1e223ee0 ]---
8b [ 31.185878] note: Xorg[2759] exited with preempt_count 1
42 [ 31.185878] note: Xorg[2759] exited with preempt_count 1
58 <8b> 50 0c 89 f0 83 c2 0c e8 42 f9 ff ff 8b 4d e8 a1 40 ba 5f c0
[ 31.185878] EIP: [<c028df12>] cfq_insert_request+0x1d/0x3f5 SS:ESP 0068:f566dd0c
[ 31.185878] CR2: 000000000000000c
[ 31.185878] ---[ end trace 35f4df0d1e223ee0 ]---
[ 31.185878] note: Xorg[2759] exited with preempt_count 1
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic


Attachments:
(No filename) (22.66 kB)
config-2.6.39-rc4-00089-g2f666bc (69.29 kB)
Download all attachments

2011-04-20 13:03:26

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-20 14:58, Michal Hocko wrote:
> Hi,
> I am not able to boot with the current git tree
> (2.6.39-rc4-00089-g2f666bc). 2.6.39-rc3 boots just fine.
>
> The backtrace [1] looks really messy.
>
> I have seen a similar backtrace at https://lkml.org/lkml/2011/3/25/169
> and tried the patch from https://lkml.org/lkml/2011/3/25/173 but it
> didn't help. The backtrace looks similar (much smaller) but the machine
> ends up dead as well with unbalanced preempt counter[2].
>
> The config is attached.
>
> Is this a known problem?

Not sure, can you please try with the attached patch applied as well?

diff --git a/block/blk-core.c b/block/blk-core.c
index 5fa3dd2..a2e58ee 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -292,7 +292,6 @@ EXPORT_SYMBOL(blk_sync_queue);
/**
* __blk_run_queue - run a single device queue
* @q: The queue to run
- * @force_kblockd: Don't run @q->request_fn directly. Use kblockd.
*
* Description:
* See @blk_run_queue. This variant must be called with the queue lock
@@ -303,15 +302,7 @@ void __blk_run_queue(struct request_queue *q)
if (unlikely(blk_queue_stopped(q)))
return;

- /*
- * Only recurse once to avoid overrunning the stack, let the unplug
- * handling reinvoke the handler shortly if we already got there.
- */
- if (!queue_flag_test_and_set(QUEUE_FLAG_REENTER, q)) {
- q->request_fn(q);
- queue_flag_clear(QUEUE_FLAG_REENTER, q);
- } else
- queue_delayed_work(kblockd_workqueue, &q->delay_work, 0);
+ q->request_fn(q);
}
EXPORT_SYMBOL(__blk_run_queue);

@@ -328,6 +319,7 @@ void blk_run_queue_async(struct request_queue *q)
if (likely(!blk_queue_stopped(q)))
queue_delayed_work(kblockd_workqueue, &q->delay_work, 0);
}
+EXPORT_SYMBOL(blk_run_queue_async);

/**
* blk_run_queue - run a single device queue
@@ -2787,7 +2779,6 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)

local_irq_restore(flags);
}
-EXPORT_SYMBOL(blk_flush_plug_list);

void blk_finish_plug(struct blk_plug *plug)
{
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 6d73512..bd23631 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -66,14 +66,14 @@ queue_requests_store(struct request_queue *q, const char *page, size_t count)

if (rl->count[BLK_RW_SYNC] >= q->nr_requests) {
blk_set_queue_full(q, BLK_RW_SYNC);
- } else if (rl->count[BLK_RW_SYNC]+1 <= q->nr_requests) {
+ } else {
blk_clear_queue_full(q, BLK_RW_SYNC);
wake_up(&rl->wait[BLK_RW_SYNC]);
}

if (rl->count[BLK_RW_ASYNC] >= q->nr_requests) {
blk_set_queue_full(q, BLK_RW_ASYNC);
- } else if (rl->count[BLK_RW_ASYNC]+1 <= q->nr_requests) {
+ } else {
blk_clear_queue_full(q, BLK_RW_ASYNC);
wake_up(&rl->wait[BLK_RW_ASYNC]);
}
@@ -508,8 +508,10 @@ int blk_register_queue(struct gendisk *disk)
return ret;

ret = kobject_add(&q->kobj, kobject_get(&dev->kobj), "%s", "queue");
- if (ret < 0)
+ if (ret < 0) {
+ blk_trace_remove_sysfs(dev);
return ret;
+ }

kobject_uevent(&q->kobj, KOBJ_ADD);

diff --git a/block/blk.h b/block/blk.h
index c9df8fc..6126346 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -22,7 +22,6 @@ void blk_rq_timed_out_timer(unsigned long data);
void blk_delete_timer(struct request *);
void blk_add_timer(struct request *);
void __generic_unplug_device(struct request_queue *);
-void blk_run_queue_async(struct request_queue *q);

/*
* Internal atomic flags for request handling
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 46b0a1d..5b52011 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -2582,28 +2582,20 @@ static void cfq_put_queue(struct cfq_queue *cfqq)
}

/*
- * Must always be called with the rcu_read_lock() held
+ * Call func for each cic attached to this ioc.
*/
static void
-__call_for_each_cic(struct io_context *ioc,
- void (*func)(struct io_context *, struct cfq_io_context *))
+call_for_each_cic(struct io_context *ioc,
+ void (*func)(struct io_context *, struct cfq_io_context *))
{
struct cfq_io_context *cic;
struct hlist_node *n;

+ rcu_read_lock();
+
hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list)
func(ioc, cic);
-}

-/*
- * Call func for each cic attached to this ioc.
- */
-static void
-call_for_each_cic(struct io_context *ioc,
- void (*func)(struct io_context *, struct cfq_io_context *))
-{
- rcu_read_lock();
- __call_for_each_cic(ioc, func);
rcu_read_unlock();
}

@@ -2664,7 +2656,7 @@ static void cfq_free_io_context(struct io_context *ioc)
* should be ok to iterate over the known list, we will see all cic's
* since no new ones are added.
*/
- __call_for_each_cic(ioc, cic_free_func);
+ call_for_each_cic(ioc, cic_free_func);
}

static void cfq_put_cooperator(struct cfq_queue *cfqq)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ab55c2f..e9901b8 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -411,8 +411,6 @@ static void scsi_run_queue(struct request_queue *q)
list_splice_init(&shost->starved_list, &starved_list);

while (!list_empty(&starved_list)) {
- int flagset;
-
/*
* As long as shost is accepting commands and we have
* starved queues, call blk_run_queue. scsi_request_fn
@@ -435,20 +433,7 @@ static void scsi_run_queue(struct request_queue *q)
continue;
}

- spin_unlock(shost->host_lock);
-
- spin_lock(sdev->request_queue->queue_lock);
- flagset = test_bit(QUEUE_FLAG_REENTER, &q->queue_flags) &&
- !test_bit(QUEUE_FLAG_REENTER,
- &sdev->request_queue->queue_flags);
- if (flagset)
- queue_flag_set(QUEUE_FLAG_REENTER, sdev->request_queue);
- __blk_run_queue(sdev->request_queue);
- if (flagset)
- queue_flag_clear(QUEUE_FLAG_REENTER, sdev->request_queue);
- spin_unlock(sdev->request_queue->queue_lock);
-
- spin_lock(shost->host_lock);
+ blk_run_queue_async(sdev->request_queue);
}
/* put any unprocessed entries back */
list_splice(&starved_list, &shost->starved_list);
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 28c3350..815069d 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3816,28 +3816,17 @@ fail_host_msg:
static void
fc_bsg_goose_queue(struct fc_rport *rport)
{
- int flagset;
- unsigned long flags;
-
if (!rport->rqst_q)
return;

+ /*
+ * This get/put dance makes no sense
+ */
get_device(&rport->dev);
-
- spin_lock_irqsave(rport->rqst_q->queue_lock, flags);
- flagset = test_bit(QUEUE_FLAG_REENTER, &rport->rqst_q->queue_flags) &&
- !test_bit(QUEUE_FLAG_REENTER, &rport->rqst_q->queue_flags);
- if (flagset)
- queue_flag_set(QUEUE_FLAG_REENTER, rport->rqst_q);
- __blk_run_queue(rport->rqst_q);
- if (flagset)
- queue_flag_clear(QUEUE_FLAG_REENTER, rport->rqst_q);
- spin_unlock_irqrestore(rport->rqst_q->queue_lock, flags);
-
+ blk_run_queue_async(rport->rqst_q);
put_device(&rport->dev);
}

-
/**
* fc_bsg_rport_dispatch - process rport bsg requests and dispatch to LLDD
* @q: rport request queue
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index cbbfd98..2ad95fa 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -388,20 +388,19 @@ struct request_queue
#define QUEUE_FLAG_SYNCFULL 3 /* read queue has been filled */
#define QUEUE_FLAG_ASYNCFULL 4 /* write queue has been filled */
#define QUEUE_FLAG_DEAD 5 /* queue being torn down */
-#define QUEUE_FLAG_REENTER 6 /* Re-entrancy avoidance */
-#define QUEUE_FLAG_ELVSWITCH 7 /* don't use elevator, just do FIFO */
-#define QUEUE_FLAG_BIDI 8 /* queue supports bidi requests */
-#define QUEUE_FLAG_NOMERGES 9 /* disable merge attempts */
-#define QUEUE_FLAG_SAME_COMP 10 /* force complete on same CPU */
-#define QUEUE_FLAG_FAIL_IO 11 /* fake timeout */
-#define QUEUE_FLAG_STACKABLE 12 /* supports request stacking */
-#define QUEUE_FLAG_NONROT 13 /* non-rotational device (SSD) */
+#define QUEUE_FLAG_ELVSWITCH 6 /* don't use elevator, just do FIFO */
+#define QUEUE_FLAG_BIDI 7 /* queue supports bidi requests */
+#define QUEUE_FLAG_NOMERGES 8 /* disable merge attempts */
+#define QUEUE_FLAG_SAME_COMP 9 /* force complete on same CPU */
+#define QUEUE_FLAG_FAIL_IO 10 /* fake timeout */
+#define QUEUE_FLAG_STACKABLE 11 /* supports request stacking */
+#define QUEUE_FLAG_NONROT 12 /* non-rotational device (SSD) */
#define QUEUE_FLAG_VIRT QUEUE_FLAG_NONROT /* paravirt device */
-#define QUEUE_FLAG_IO_STAT 15 /* do IO stats */
-#define QUEUE_FLAG_DISCARD 16 /* supports DISCARD */
-#define QUEUE_FLAG_NOXMERGES 17 /* No extended merges */
-#define QUEUE_FLAG_ADD_RANDOM 18 /* Contributes to random pool */
-#define QUEUE_FLAG_SECDISCARD 19 /* supports SECDISCARD */
+#define QUEUE_FLAG_IO_STAT 13 /* do IO stats */
+#define QUEUE_FLAG_DISCARD 14 /* supports DISCARD */
+#define QUEUE_FLAG_NOXMERGES 15 /* No extended merges */
+#define QUEUE_FLAG_ADD_RANDOM 16 /* Contributes to random pool */
+#define QUEUE_FLAG_SECDISCARD 17 /* supports SECDISCARD */

#define QUEUE_FLAG_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) | \
(1 << QUEUE_FLAG_STACKABLE) | \
@@ -699,6 +698,7 @@ extern void blk_sync_queue(struct request_queue *q);
extern void __blk_stop_queue(struct request_queue *q);
extern void __blk_run_queue(struct request_queue *q);
extern void blk_run_queue(struct request_queue *);
+extern void blk_run_queue_async(struct request_queue *q);
extern int blk_rq_map_user(struct request_queue *, struct request *,
struct rq_map_data *, void __user *, unsigned long,
gfp_t);

--
Jens Axboe

2011-04-20 13:13:16

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-20 14:58, Michal Hocko wrote:
> Hi,
> I am not able to boot with the current git tree
> (2.6.39-rc4-00089-g2f666bc). 2.6.39-rc3 boots just fine.
>
> The backtrace [1] looks really messy.
>
> I have seen a similar backtrace at https://lkml.org/lkml/2011/3/25/169
> and tried the patch from https://lkml.org/lkml/2011/3/25/173 but it
> didn't help. The backtrace looks similar (much smaller) but the machine
> ends up dead as well with unbalanced preempt counter[2].
>
> The config is attached.
>
> Is this a known problem?
>
> ---
> * [1] backtrace with the current git tree (2f666bc)
>
> [ 31.207283] BUG: unable to handle kernel NULL pointer dereference at 0000000c
> NULL pointer dereference[ 31.207540] IP: at 0000000c
> [ 31.207540] IP: [<c028dece>] cfq_insert_request+0x1d/0x3f5
> [ 31.207720] *pde = 00000000 [<c028dece>] cfq_insert_request+0x1d/0x3f5
> NULL pointer dereference[ 31.207720] *pde = 00000000
> at 0000000c
> [ 31.207888] Oops: 0000 [#1]
> PREEMPT [ 31.207540] IP:[ 31.207888] Oops: 0000 [#1] SMP PREEMPT DEBUG_PAGEALLOC [<c028dece>] cfq_insert_request+0x1d/0x3f5
> SMP DEBUG_PAGEALLOC[ 31.207720] *pde = 00000000
> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler

Ahh hang on, this may be a good clue. Does your boot scripts change the
IO scheduler?

--
Jens Axboe

2011-04-20 13:29:08

by Michal Hocko

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Wed 20-04-11 15:13:15, Jens Axboe wrote:
> On 2011-04-20 14:58, Michal Hocko wrote:
> > Hi,
> > I am not able to boot with the current git tree
> > (2.6.39-rc4-00089-g2f666bc). 2.6.39-rc3 boots just fine.
> >
> > The backtrace [1] looks really messy.
> >
> > I have seen a similar backtrace at https://lkml.org/lkml/2011/3/25/169
> > and tried the patch from https://lkml.org/lkml/2011/3/25/173 but it
> > didn't help. The backtrace looks similar (much smaller) but the machine
> > ends up dead as well with unbalanced preempt counter[2].
> >
> > The config is attached.
> >
> > Is this a known problem?
> >
> > ---
> > * [1] backtrace with the current git tree (2f666bc)
> >
> > [ 31.207283] BUG: unable to handle kernel NULL pointer dereference at 0000000c
> > NULL pointer dereference[ 31.207540] IP: at 0000000c
> > [ 31.207540] IP: [<c028dece>] cfq_insert_request+0x1d/0x3f5
> > [ 31.207720] *pde = 00000000 [<c028dece>] cfq_insert_request+0x1d/0x3f5
> > NULL pointer dereference[ 31.207720] *pde = 00000000
> > at 0000000c
> > [ 31.207888] Oops: 0000 [#1]
> > PREEMPT [ 31.207540] IP:[ 31.207888] Oops: 0000 [#1] SMP PREEMPT DEBUG_PAGEALLOC [<c028dece>] cfq_insert_request+0x1d/0x3f5
> > SMP DEBUG_PAGEALLOC[ 31.207720] *pde = 00000000
> > [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
>
> Ahh hang on, this may be a good clue. Does your boot scripts change the
> IO scheduler?

Good one...
Yes, I have:
echo deadline > /sys/block/sda/queue/scheduler
in /etc/rc.local

I am able to boot after I remove it. This is the first time I have seen
"last sysfs file" being useful.
Still want me to test the patch from the other email?
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-04-20 13:31:55

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-20 15:29, Michal Hocko wrote:
> On Wed 20-04-11 15:13:15, Jens Axboe wrote:
>> On 2011-04-20 14:58, Michal Hocko wrote:
>>> Hi,
>>> I am not able to boot with the current git tree
>>> (2.6.39-rc4-00089-g2f666bc). 2.6.39-rc3 boots just fine.
>>>
>>> The backtrace [1] looks really messy.
>>>
>>> I have seen a similar backtrace at https://lkml.org/lkml/2011/3/25/169
>>> and tried the patch from https://lkml.org/lkml/2011/3/25/173 but it
>>> didn't help. The backtrace looks similar (much smaller) but the machine
>>> ends up dead as well with unbalanced preempt counter[2].
>>>
>>> The config is attached.
>>>
>>> Is this a known problem?
>>>
>>> ---
>>> * [1] backtrace with the current git tree (2f666bc)
>>>
>>> [ 31.207283] BUG: unable to handle kernel NULL pointer dereference at 0000000c
>>> NULL pointer dereference[ 31.207540] IP: at 0000000c
>>> [ 31.207540] IP: [<c028dece>] cfq_insert_request+0x1d/0x3f5
>>> [ 31.207720] *pde = 00000000 [<c028dece>] cfq_insert_request+0x1d/0x3f5
>>> NULL pointer dereference[ 31.207720] *pde = 00000000
>>> at 0000000c
>>> [ 31.207888] Oops: 0000 [#1]
>>> PREEMPT [ 31.207540] IP:[ 31.207888] Oops: 0000 [#1] SMP PREEMPT DEBUG_PAGEALLOC [<c028dece>] cfq_insert_request+0x1d/0x3f5
>>> SMP DEBUG_PAGEALLOC[ 31.207720] *pde = 00000000
>>> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
>>
>> Ahh hang on, this may be a good clue. Does your boot scripts change the
>> IO scheduler?
>
> Good one...
> Yes, I have:
> echo deadline > /sys/block/sda/queue/scheduler
> in /etc/rc.local
>
> I am able to boot after I remove it. This is the first time I have seen
> "last sysfs file" being useful.

Haha, good point, it's the first time for me as well IO think.

> Still want me to test the patch from the other email?

No, that wont help you. I do wonder why this is a rc3 vs rc4 change.
I'll dig around and send you something to try.

--
Jens Axboe

2011-04-20 17:33:21

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-20 15:29, Michal Hocko wrote:
> On Wed 20-04-11 15:13:15, Jens Axboe wrote:
>> On 2011-04-20 14:58, Michal Hocko wrote:
>>> Hi,
>>> I am not able to boot with the current git tree
>>> (2.6.39-rc4-00089-g2f666bc). 2.6.39-rc3 boots just fine.
>>>
>>> The backtrace [1] looks really messy.
>>>
>>> I have seen a similar backtrace at https://lkml.org/lkml/2011/3/25/169
>>> and tried the patch from https://lkml.org/lkml/2011/3/25/173 but it
>>> didn't help. The backtrace looks similar (much smaller) but the machine
>>> ends up dead as well with unbalanced preempt counter[2].
>>>
>>> The config is attached.
>>>
>>> Is this a known problem?
>>>
>>> ---
>>> * [1] backtrace with the current git tree (2f666bc)
>>>
>>> [ 31.207283] BUG: unable to handle kernel NULL pointer dereference at 0000000c
>>> NULL pointer dereference[ 31.207540] IP: at 0000000c
>>> [ 31.207540] IP: [<c028dece>] cfq_insert_request+0x1d/0x3f5
>>> [ 31.207720] *pde = 00000000 [<c028dece>] cfq_insert_request+0x1d/0x3f5
>>> NULL pointer dereference[ 31.207720] *pde = 00000000
>>> at 0000000c
>>> [ 31.207888] Oops: 0000 [#1]
>>> PREEMPT [ 31.207540] IP:[ 31.207888] Oops: 0000 [#1] SMP PREEMPT DEBUG_PAGEALLOC [<c028dece>] cfq_insert_request+0x1d/0x3f5
>>> SMP DEBUG_PAGEALLOC[ 31.207720] *pde = 00000000
>>> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
>>
>> Ahh hang on, this may be a good clue. Does your boot scripts change the
>> IO scheduler?
>
> Good one...
> Yes, I have:
> echo deadline > /sys/block/sda/queue/scheduler
> in /etc/rc.local
>
> I am able to boot after I remove it. This is the first time I have seen
> "last sysfs file" being useful.
> Still want me to test the patch from the other email?

Is this a new addition to your system? IOW, how certain are you that
this is a regression that occured between rc3 and rc4?

--
Jens Axboe

2011-04-21 07:16:46

by Michal Hocko

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Wed 20-04-11 19:33:19, Jens Axboe wrote:
> On 2011-04-20 15:29, Michal Hocko wrote:
> > On Wed 20-04-11 15:13:15, Jens Axboe wrote:
> >> On 2011-04-20 14:58, Michal Hocko wrote:
[...]
> >>> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
> >>
> >> Ahh hang on, this may be a good clue. Does your boot scripts change the
> >> IO scheduler?
> >
> > Good one...
> > Yes, I have:
> > echo deadline > /sys/block/sda/queue/scheduler
> > in /etc/rc.local
> >
> > I am able to boot after I remove it. This is the first time I have seen
> > "last sysfs file" being useful.
> > Still want me to test the patch from the other email?
>
> Is this a new addition to your system? IOW, how certain are you that
> this is a regression that occured between rc3 and rc4?

No, I am setting the scheduler this way for quite some time. If I use it
rc4 explodes while rc3 boots just fine. I am wondering, can this be a
timing issue? I am able to set the scheduler after system settles down
after boot and kde starts.

I am going to bisect, let's see if I can find anything.
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-04-21 07:25:56

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 21/04/2011, at 09.16, Michal Hocko <[email protected]> wrote:

> On Wed 20-04-11 19:33:19, Jens Axboe wrote:
>> On 2011-04-20 15:29, Michal Hocko wrote:
>>> On Wed 20-04-11 15:13:15, Jens Axboe wrote:
>>>> On 2011-04-20 14:58, Michal Hocko wrote:
> [...]
>>>>> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
>>>>
>>>> Ahh hang on, this may be a good clue. Does your boot scripts change the
>>>> IO scheduler?
>>>
>>> Good one...
>>> Yes, I have:
>>> echo deadline > /sys/block/sda/queue/scheduler
>>> in /etc/rc.local
>>>
>>> I am able to boot after I remove it. This is the first time I have seen
>>> "last sysfs file" being useful.
>>> Still want me to test the patch from the other email?
>>
>> Is this a new addition to your system? IOW, how certain are you that
>> this is a regression that occured between rc3 and rc4?
>
> No, I am setting the scheduler this way for quite some time. If I use it
> rc4 explodes while rc3 boots just fine. I am wondering, can this be a
> timing issue? I am able to set the scheduler after system settles down
> after boot and kde starts.
>
> I am going to bisect, let's see if I can find anything.

Thanks, that would be great!

Jens

2011-04-21 14:39:24

by Linus Torvalds

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
>>
>> I am going to bisect, let's see if I can find anything.
>
> Thanks, that would be great!

I'd expect it to be very timing-dependent, and thus could easily be
triggered (or hidden) by unrelated changes.

Just happening to have a request added to the elevator at _just_ the
same moment that another CPU is changing it and getting rid of the
data structures for the old one.

Linus

2011-04-21 15:05:32

by Jeff Moyer

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

Jens Axboe <[email protected]> writes:

> On 21/04/2011, at 09.16, Michal Hocko <[email protected]> wrote:
>
>> On Wed 20-04-11 19:33:19, Jens Axboe wrote:
>>> On 2011-04-20 15:29, Michal Hocko wrote:
>>>> On Wed 20-04-11 15:13:15, Jens Axboe wrote:
>>>>> On 2011-04-20 14:58, Michal Hocko wrote:
>> [...]
>>>>>> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
>>>>>
>>>>> Ahh hang on, this may be a good clue. Does your boot scripts change the
>>>>> IO scheduler?
>>>>
>>>> Good one...
>>>> Yes, I have:
>>>> echo deadline > /sys/block/sda/queue/scheduler
>>>> in /etc/rc.local
>>>>
>>>> I am able to boot after I remove it. This is the first time I have seen
>>>> "last sysfs file" being useful.
>>>> Still want me to test the patch from the other email?
>>>
>>> Is this a new addition to your system? IOW, how certain are you that
>>> this is a regression that occured between rc3 and rc4?
>>
>> No, I am setting the scheduler this way for quite some time. If I use it
>> rc4 explodes while rc3 boots just fine. I am wondering, can this be a
>> timing issue? I am able to set the scheduler after system settles down
>> after boot and kde starts.
>>
>> I am going to bisect, let's see if I can find anything.
>
> Thanks, that would be great!

OK, this is a long shot, but in a derivative kernel, I saw what may be
the same issue. Is this kernel built with CONFIG_BLK_CGROUP=n by
chance? The exact problem I saw was a panic on boot in
cfq_insert_request+0x77, which mapped to this:

/usr/src/debug/kernel-2.6.32-135.el6/linux-2.6.32-135.el6.x86_64/block/cfq-iosched.c:1997
ffffffff8125c390: 49 8b 84 24 a8 00 00 mov 0xa8(%r12),%rax
ffffffff8125c397: 00 <---------------------
ffffffff8125c398: 83 80 ec 02 00 00 01 addl $0x1,0x2ec(%rax)

cfq-iosched.c:1997 looks like this:

(RQ_CFQG(rq))->dispatched++;

Enabling CONFIG_BLK_CGROUP made the problem go away. Again, not sure
it's the same thing, but I figured I'd speak up in case it helps.

Cheers,
Jeff

2011-04-21 15:27:49

by Vivek Goyal

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Thu, Apr 21, 2011 at 11:04:45AM -0400, Jeff Moyer wrote:
> Jens Axboe <[email protected]> writes:
>
> > On 21/04/2011, at 09.16, Michal Hocko <[email protected]> wrote:
> >
> >> On Wed 20-04-11 19:33:19, Jens Axboe wrote:
> >>> On 2011-04-20 15:29, Michal Hocko wrote:
> >>>> On Wed 20-04-11 15:13:15, Jens Axboe wrote:
> >>>>> On 2011-04-20 14:58, Michal Hocko wrote:
> >> [...]
> >>>>>> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
> >>>>>
> >>>>> Ahh hang on, this may be a good clue. Does your boot scripts change the
> >>>>> IO scheduler?
> >>>>
> >>>> Good one...
> >>>> Yes, I have:
> >>>> echo deadline > /sys/block/sda/queue/scheduler
> >>>> in /etc/rc.local
> >>>>
> >>>> I am able to boot after I remove it. This is the first time I have seen
> >>>> "last sysfs file" being useful.
> >>>> Still want me to test the patch from the other email?
> >>>
> >>> Is this a new addition to your system? IOW, how certain are you that
> >>> this is a regression that occured between rc3 and rc4?
> >>
> >> No, I am setting the scheduler this way for quite some time. If I use it
> >> rc4 explodes while rc3 boots just fine. I am wondering, can this be a
> >> timing issue? I am able to set the scheduler after system settles down
> >> after boot and kde starts.
> >>
> >> I am going to bisect, let's see if I can find anything.
> >
> > Thanks, that would be great!
>
> OK, this is a long shot, but in a derivative kernel, I saw what may be
> the same issue. Is this kernel built with CONFIG_BLK_CGROUP=n by
> chance? The exact problem I saw was a panic on boot in
> cfq_insert_request+0x77, which mapped to this:
>
> /usr/src/debug/kernel-2.6.32-135.el6/linux-2.6.32-135.el6.x86_64/block/cfq-iosched.c:1997
> ffffffff8125c390: 49 8b 84 24 a8 00 00 mov 0xa8(%r12),%rax
> ffffffff8125c397: 00 <---------------------
> ffffffff8125c398: 83 80 ec 02 00 00 01 addl $0x1,0x2ec(%rax)
>
> cfq-iosched.c:1997 looks like this:
>
> (RQ_CFQG(rq))->dispatched++;
>
> Enabling CONFIG_BLK_CGROUP made the problem go away. Again, not sure
> it's the same thing, but I figured I'd speak up in case it helps.

Jeff,

I think we had fixed this issue upstream with following commit.

commit 50eaeb323a170e231263ccb433bb2f99bd9e27ac
Author: Dmitry Monakhov <[email protected]>
Date: Wed Apr 28 19:50:33 2010 +0200

cfq-iosched: fix broken cfq_ref_get_cfqf() for CONFIG_BLK_CGROUP=y &&
CFQ_GR

We should return the cfq_group for this case, not NULL.

Signed-off-by: Jens Axboe <[email protected]>


I just booted my system with 39-rc4 with BLK_CGROUP=n. So I doubt that's the
issue here.

Thanks
Vivek

2011-04-21 15:29:21

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-21 16:38, Linus Torvalds wrote:
> On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
>>>
>>> I am going to bisect, let's see if I can find anything.
>>
>> Thanks, that would be great!
>
> I'd expect it to be very timing-dependent, and thus could easily be
> triggered (or hidden) by unrelated changes.
>
> Just happening to have a request added to the elevator at _just_ the
> same moment that another CPU is changing it and getting rid of the
> data structures for the old one.

This particular bug does seem to trigger very reliably. The switching
works by ensuring that we have no requests with elevator data associated
with it before shutting down the old scheduler and attaching the new
one. I smells more like a bug on doing an insert sort on a non-priv
request. I bet this is it, Michal can you give it a spin?

diff --git a/block/elevator.c b/block/elevator.c
index 6f6abc0..45ca1e3 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -671,7 +671,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where)
q->boundary_rq = rq;
}
} else if (!(rq->cmd_flags & REQ_ELVPRIV) &&
- where == ELEVATOR_INSERT_SORT)
+ (where == ELEVATOR_INSERT_SORT ||
+ where == ELEVATOR_INSERT_SORT_MERGE))
where = ELEVATOR_INSERT_BACK;

switch (where) {

--
Jens Axboe

2011-04-21 15:30:33

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-21 17:04, Jeff Moyer wrote:
> Jens Axboe <[email protected]> writes:
>
>> On 21/04/2011, at 09.16, Michal Hocko <[email protected]> wrote:
>>
>>> On Wed 20-04-11 19:33:19, Jens Axboe wrote:
>>>> On 2011-04-20 15:29, Michal Hocko wrote:
>>>>> On Wed 20-04-11 15:13:15, Jens Axboe wrote:
>>>>>> On 2011-04-20 14:58, Michal Hocko wrote:
>>> [...]
>>>>>>> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
>>>>>>
>>>>>> Ahh hang on, this may be a good clue. Does your boot scripts change the
>>>>>> IO scheduler?
>>>>>
>>>>> Good one...
>>>>> Yes, I have:
>>>>> echo deadline > /sys/block/sda/queue/scheduler
>>>>> in /etc/rc.local
>>>>>
>>>>> I am able to boot after I remove it. This is the first time I have seen
>>>>> "last sysfs file" being useful.
>>>>> Still want me to test the patch from the other email?
>>>>
>>>> Is this a new addition to your system? IOW, how certain are you that
>>>> this is a regression that occured between rc3 and rc4?
>>>
>>> No, I am setting the scheduler this way for quite some time. If I use it
>>> rc4 explodes while rc3 boots just fine. I am wondering, can this be a
>>> timing issue? I am able to set the scheduler after system settles down
>>> after boot and kde starts.
>>>
>>> I am going to bisect, let's see if I can find anything.
>>
>> Thanks, that would be great!
>
> OK, this is a long shot, but in a derivative kernel, I saw what may be
> the same issue. Is this kernel built with CONFIG_BLK_CGROUP=n by
> chance? The exact problem I saw was a panic on boot in
> cfq_insert_request+0x77, which mapped to this:
>
> /usr/src/debug/kernel-2.6.32-135.el6/linux-2.6.32-135.el6.x86_64/block/cfq-iosched.c:1997
> ffffffff8125c390: 49 8b 84 24 a8 00 00 mov 0xa8(%r12),%rax
> ffffffff8125c397: 00 <---------------------
> ffffffff8125c398: 83 80 ec 02 00 00 01 addl $0x1,0x2ec(%rax)
>
> cfq-iosched.c:1997 looks like this:
>
> (RQ_CFQG(rq))->dispatched++;
>
> Enabling CONFIG_BLK_CGROUP made the problem go away. Again, not sure
> it's the same thing, but I figured I'd speak up in case it helps.

I think this is just the canary in this particular case, I bet it does
not work reliably in general after the addition of the SORT_MERGE
insert.

--
Jens Axboe

2011-04-21 15:52:34

by Jeff Moyer

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

Vivek Goyal <[email protected]> writes:

> On Thu, Apr 21, 2011 at 11:04:45AM -0400, Jeff Moyer wrote:
>> Jens Axboe <[email protected]> writes:
>>
>> > On 21/04/2011, at 09.16, Michal Hocko <[email protected]> wrote:
>> >
>> >> On Wed 20-04-11 19:33:19, Jens Axboe wrote:
>> >>> On 2011-04-20 15:29, Michal Hocko wrote:
>> >>>> On Wed 20-04-11 15:13:15, Jens Axboe wrote:
>> >>>>> On 2011-04-20 14:58, Michal Hocko wrote:
>> >> [...]
>> >>>>>> [ 31.207888] Oops: 0000 [#1] PREEMPT [ 31.208186] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler
>> >>>>>
>> >>>>> Ahh hang on, this may be a good clue. Does your boot scripts change the
>> >>>>> IO scheduler?
>> >>>>
>> >>>> Good one...
>> >>>> Yes, I have:
>> >>>> echo deadline > /sys/block/sda/queue/scheduler
>> >>>> in /etc/rc.local
>> >>>>
>> >>>> I am able to boot after I remove it. This is the first time I have seen
>> >>>> "last sysfs file" being useful.
>> >>>> Still want me to test the patch from the other email?
>> >>>
>> >>> Is this a new addition to your system? IOW, how certain are you that
>> >>> this is a regression that occured between rc3 and rc4?
>> >>
>> >> No, I am setting the scheduler this way for quite some time. If I use it
>> >> rc4 explodes while rc3 boots just fine. I am wondering, can this be a
>> >> timing issue? I am able to set the scheduler after system settles down
>> >> after boot and kde starts.
>> >>
>> >> I am going to bisect, let's see if I can find anything.
>> >
>> > Thanks, that would be great!
>>
>> OK, this is a long shot, but in a derivative kernel, I saw what may be
>> the same issue. Is this kernel built with CONFIG_BLK_CGROUP=n by
>> chance? The exact problem I saw was a panic on boot in
>> cfq_insert_request+0x77, which mapped to this:
>>
>> /usr/src/debug/kernel-2.6.32-135.el6/linux-2.6.32-135.el6.x86_64/block/cfq-iosched.c:1997
>> ffffffff8125c390: 49 8b 84 24 a8 00 00 mov 0xa8(%r12),%rax
>> ffffffff8125c397: 00 <---------------------
>> ffffffff8125c398: 83 80 ec 02 00 00 01 addl $0x1,0x2ec(%rax)
>>
>> cfq-iosched.c:1997 looks like this:
>>
>> (RQ_CFQG(rq))->dispatched++;
>>
>> Enabling CONFIG_BLK_CGROUP made the problem go away. Again, not sure
>> it's the same thing, but I figured I'd speak up in case it helps.
>
> Jeff,
>
> I think we had fixed this issue upstream with following commit.
>
> commit 50eaeb323a170e231263ccb433bb2f99bd9e27ac
> Author: Dmitry Monakhov <[email protected]>
> Date: Wed Apr 28 19:50:33 2010 +0200
>
> cfq-iosched: fix broken cfq_ref_get_cfqf() for CONFIG_BLK_CGROUP=y &&
> CFQ_GR
>
> We should return the cfq_group for this case, not NULL.
>
> Signed-off-by: Jens Axboe <[email protected]>
>
>
> I just booted my system with 39-rc4 with BLK_CGROUP=n. So I doubt that's the
> issue here.

OK, thanks for checking.

Cheers,
Jeff

2011-04-21 18:51:16

by Michal Hocko

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Thu 21-04-11 07:38:57, Linus Torvalds wrote:
> On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
> >>
> >> I am going to bisect, let's see if I can find anything.
> >
> > Thanks, that would be great!
>
> I'd expect it to be very timing-dependent, and thus could easily be
> triggered (or hidden) by unrelated changes.
>
> Just happening to have a request added to the elevator at _just_ the
> same moment that another CPU is changing it and getting rid of the
> data structures for the old one.

And it really looks like a timing issue. I have bisected down to
e710d7d5a9cab1041b7a3cf9e655b75d92786857. I had to skip[1] some commits
due to compile errors [2].
At first it looked quite promising because I was able to boot after I
reverted that patch but then I have tried to revert it on top of rc4
(2f666bcf757cb72549f360ef6da02f03620a48b6) and saw the same problem
again.

So I do not think that bisecting will help here.

---
[1] git bisect log
git bisect start
# bad: [2f666bcf757cb72549f360ef6da02f03620a48b6] Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
git bisect bad 2f666bcf757cb72549f360ef6da02f03620a48b6
# good: [a6360dd37e1a144ed11e6548371bade559a1e4df] Linux 2.6.39-rc3
git bisect good a6360dd37e1a144ed11e6548371bade559a1e4df
# bad: [fdfc552abe2c465ab91b84fb65f36d86c6737ab4] Merge branches 'core-fixes-for-linus', 'perf-fixes-for-linus', 'sched-fixes-for-linus', 'timer-fixes-for-linus' and 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad fdfc552abe2c465ab91b84fb65f36d86c6737ab4
# bad: [01eda2e0c0cf035308308a19581e4979285b51ec] kstrtox: fix compile warnings in test
git bisect bad 01eda2e0c0cf035308308a19581e4979285b51ec
# bad: [aaa119a3d4e0b7421c82ae3fbe288e65d0f3485f] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
git bisect bad aaa119a3d4e0b7421c82ae3fbe288e65d0f3485f
# bad: [d77d9597ad8f2bd381a5168005a21e82df6f18eb] Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86
git bisect bad d77d9597ad8f2bd381a5168005a21e82df6f18eb
# good: [08a0799d5736f1494ef35d386570d177447acbfb] acer-wmi: Fix capitalisation of GUID in module alias
git bisect good 08a0799d5736f1494ef35d386570d177447acbfb
# skip: [be85bccaa5aa5a11dcaf85f9e945ffefd253f631] Revert "vfs: Export file system uuid via /proc/<pid>/mountinfo"
git bisect skip be85bccaa5aa5a11dcaf85f9e945ffefd253f631
# skip: [95042f9eb78a8d9a17455e2ef263f2f310ecef15] vm: fix mlock() on stack guard page
git bisect skip 95042f9eb78a8d9a17455e2ef263f2f310ecef15
# bad: [0bba01695b74fdd2f9286243bb39f88544d81401] vfs: Re-introduce s_uuid in the superblock
git bisect bad 0bba01695b74fdd2f9286243bb39f88544d81401
# good: [16ad56972ca3161eb97583897f17c1ead0c4ebd0] Merge branch 'stable/bug-fixes-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
git bisect good 16ad56972ca3161eb97583897f17c1ead0c4ebd0
# skip: [0d88f6e804c824454b5ed0d3034ed3dcf7467a87] nfs: don't call __mark_inode_dirty while holding i_lock
git bisect skip 0d88f6e804c824454b5ed0d3034ed3dcf7467a87
# skip: [e328f05dd6c528fcc9eb6da157dd865dfe611352] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
git bisect skip e328f05dd6c528fcc9eb6da157dd865dfe611352
# bad: [e710d7d5a9cab1041b7a3cf9e655b75d92786857] mfd: Fetch cell pointer from platform_device->mfd_cell
git bisect bad e710d7d5a9cab1041b7a3cf9e655b75d92786857

[2]
fs/ext3/super.c: In function 'ext3_fill_super':
fs/ext3/super.c:1946:2: error: 'struct super_block' has no member named 's_uuid'
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-04-21 18:52:13

by Michal Hocko

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Thu 21-04-11 17:29:14, Jens Axboe wrote:
> On 2011-04-21 16:38, Linus Torvalds wrote:
> > On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
> >>>
> >>> I am going to bisect, let's see if I can find anything.
> >>
> >> Thanks, that would be great!
> >
> > I'd expect it to be very timing-dependent, and thus could easily be
> > triggered (or hidden) by unrelated changes.
> >
> > Just happening to have a request added to the elevator at _just_ the
> > same moment that another CPU is changing it and getting rid of the
> > data structures for the old one.
>
> This particular bug does seem to trigger very reliably. The switching
> works by ensuring that we have no requests with elevator data associated
> with it before shutting down the old scheduler and attaching the new
> one. I smells more like a bug on doing an insert sort on a non-priv
> request. I bet this is it, Michal can you give it a spin?

Will try tomorrow.

>
> diff --git a/block/elevator.c b/block/elevator.c
> index 6f6abc0..45ca1e3 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -671,7 +671,8 @@ void __elv_add_request(struct request_queue *q, struct request *rq, int where)
> q->boundary_rq = rq;
> }
> } else if (!(rq->cmd_flags & REQ_ELVPRIV) &&
> - where == ELEVATOR_INSERT_SORT)
> + (where == ELEVATOR_INSERT_SORT ||
> + where == ELEVATOR_INSERT_SORT_MERGE))
> where = ELEVATOR_INSERT_BACK;
>
> switch (where) {
>
> --
> Jens Axboe
>

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-04-21 19:00:35

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-21 20:51, Michal Hocko wrote:
> On Thu 21-04-11 07:38:57, Linus Torvalds wrote:
>> On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
>>>>
>>>> I am going to bisect, let's see if I can find anything.
>>>
>>> Thanks, that would be great!
>>
>> I'd expect it to be very timing-dependent, and thus could easily be
>> triggered (or hidden) by unrelated changes.
>>
>> Just happening to have a request added to the elevator at _just_ the
>> same moment that another CPU is changing it and getting rid of the
>> data structures for the old one.
>
> And it really looks like a timing issue. I have bisected down to
> e710d7d5a9cab1041b7a3cf9e655b75d92786857. I had to skip[1] some commits
> due to compile errors [2].
> At first it looked quite promising because I was able to boot after I
> reverted that patch but then I have tried to revert it on top of rc4
> (2f666bcf757cb72549f360ef6da02f03620a48b6) and saw the same problem
> again.
>
> So I do not think that bisecting will help here.

It will be timing dependent. If there's no allocated IO requests when
the switch happens, it'll work.

But the commit that caused this regression is 5e84ea3a. If you revert
that, it should work fine. Or just apply the patch I sent (or update to
Linus' tree, it's in now) and it'll work as well.

--
Jens Axboe

2011-04-22 07:00:37

by Michal Hocko

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Thu 21-04-11 21:00:26, Jens Axboe wrote:
> On 2011-04-21 20:51, Michal Hocko wrote:
> > On Thu 21-04-11 07:38:57, Linus Torvalds wrote:
> >> On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
> >>>>
> >>>> I am going to bisect, let's see if I can find anything.
> >>>
> >>> Thanks, that would be great!
> >>
> >> I'd expect it to be very timing-dependent, and thus could easily be
> >> triggered (or hidden) by unrelated changes.
> >>
> >> Just happening to have a request added to the elevator at _just_ the
> >> same moment that another CPU is changing it and getting rid of the
> >> data structures for the old one.
> >
> > And it really looks like a timing issue. I have bisected down to
> > e710d7d5a9cab1041b7a3cf9e655b75d92786857. I had to skip[1] some commits
> > due to compile errors [2].
> > At first it looked quite promising because I was able to boot after I
> > reverted that patch but then I have tried to revert it on top of rc4
> > (2f666bcf757cb72549f360ef6da02f03620a48b6) and saw the same problem
> > again.
> >
> > So I do not think that bisecting will help here.
>
> It will be timing dependent. If there's no allocated IO requests when
> the switch happens, it'll work.
>
> But the commit that caused this regression is 5e84ea3a. If you revert
> that, it should work fine. Or just apply the patch I sent (or update to
> Linus' tree, it's in now) and it'll work as well.

Great. I can boot just fine with the current Linus tree
(2.6.39-rc4-00149-g91e8549).

Thanks a lot!
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

2011-04-22 11:36:08

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On 2011-04-22 09:00, Michal Hocko wrote:
> On Thu 21-04-11 21:00:26, Jens Axboe wrote:
>> On 2011-04-21 20:51, Michal Hocko wrote:
>>> On Thu 21-04-11 07:38:57, Linus Torvalds wrote:
>>>> On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
>>>>>>
>>>>>> I am going to bisect, let's see if I can find anything.
>>>>>
>>>>> Thanks, that would be great!
>>>>
>>>> I'd expect it to be very timing-dependent, and thus could easily be
>>>> triggered (or hidden) by unrelated changes.
>>>>
>>>> Just happening to have a request added to the elevator at _just_ the
>>>> same moment that another CPU is changing it and getting rid of the
>>>> data structures for the old one.
>>>
>>> And it really looks like a timing issue. I have bisected down to
>>> e710d7d5a9cab1041b7a3cf9e655b75d92786857. I had to skip[1] some commits
>>> due to compile errors [2].
>>> At first it looked quite promising because I was able to boot after I
>>> reverted that patch but then I have tried to revert it on top of rc4
>>> (2f666bcf757cb72549f360ef6da02f03620a48b6) and saw the same problem
>>> again.
>>>
>>> So I do not think that bisecting will help here.
>>
>> It will be timing dependent. If there's no allocated IO requests when
>> the switch happens, it'll work.
>>
>> But the commit that caused this regression is 5e84ea3a. If you revert
>> that, it should work fine. Or just apply the patch I sent (or update to
>> Linus' tree, it's in now) and it'll work as well.
>
> Great. I can boot just fine with the current Linus tree
> (2.6.39-rc4-00149-g91e8549).
>
> Thanks a lot!

Great, with that confirmed from you, we have no pending bugs in this
area.

--
Jens Axboe

2011-04-22 20:30:48

by Hugh Dickins

[permalink] [raw]
Subject: Re: 2.6.39-rc4 BUG: unable to handle kernel NULL pointer dereference at 0000000c IP: cfq_insert_request+0x1d/0x3f5

On Fri, Apr 22, 2011 at 4:35 AM, Jens Axboe <[email protected]> wrote:
> On 2011-04-22 09:00, Michal Hocko wrote:
>> On Thu 21-04-11 21:00:26, Jens Axboe wrote:
>>> On 2011-04-21 20:51, Michal Hocko wrote:
>>>> On Thu 21-04-11 07:38:57, Linus Torvalds wrote:
>>>>> On Thu, Apr 21, 2011 at 12:25 AM, Jens Axboe <[email protected]> wrote:
>>>>>>>
>>>>>>> I am going to bisect, let's see if I can find anything.
>>>>>>
>>>>>> Thanks, that would be great!
>>>>>
>>>>> I'd expect it to be very timing-dependent, and thus could easily be
>>>>> triggered (or hidden) by unrelated changes.
>>>>>
>>>>> Just happening to have a request added to the elevator at _just_ the
>>>>> same moment that another CPU is changing it and getting rid of the
>>>>> data structures for the old one.
>>>>
>>>> And it really looks like a timing issue. I have bisected down to
>>>> e710d7d5a9cab1041b7a3cf9e655b75d92786857. I had to skip[1] some commits
>>>> due to compile errors [2].
>>>> At first it looked quite promising because I was able to boot after I
>>>> reverted that patch but then I have tried to revert it on top of rc4
>>>> (2f666bcf757cb72549f360ef6da02f03620a48b6) and saw the same problem
>>>> again.
>>>>
>>>> So I do not think that bisecting will help here.
>>>
>>> It will be timing dependent. If there's no allocated IO requests when
>>> the switch happens, it'll work.
>>>
>>> But the commit that caused this regression is 5e84ea3a. If you revert
>>> that, it should work fine. Or just apply the patch I sent (or update to
>>> Linus' tree, it's in now) and it'll work as well.
>>
>> Great. I can boot just fine with the current Linus tree
>> (2.6.39-rc4-00149-g91e8549).
>>
>> Thanks a lot!
>
> Great, with that confirmed from you, we have no pending bugs in this
> area.

Yes, I booted up last night's git kernel on the PowerPC G5, and
confirm that CONFIG_IDE=y worked with that, whereas rc4 still had the
hang (even with CONFIG_SMP=y CONFIG_PREEMPT=y).

Hugh