2014-06-24 04:04:05

by Jet Chen

Subject: [block, blk] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

Hi Tejun,

We noticed the following changes on

git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-mq-percpu_ref
commit c924ec35e72ce0d6c289b858d323f7eb3f5076a5 ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero")

+------------------------------------------------------+------------+------------+
|                                                      | ea854572ee | c924ec35e7 |
+------------------------------------------------------+------------+------------+
| boot_successes                                       | 26         | 10         |
| boot_failures                                        | 0          | 6          |
| BUG:unable_to_handle_kernel_NULL_pointer_dereference | 0          | 6          |
| Oops                                                 | 0          | 6          |
| RIP:blk_throtl_drain                                 | 0          | 6          |
| kernel_BUG_at_arch/x86/mm/pageattr.c                 | 0          | 6          |
| invalid_opcode                                       | 0          | 6          |
| RIP:change_page_attr_set_clr                         | 0          | 6          |
| Kernel_panic-not_syncing:Fatal_exception             | 0          | 6          |
| backtrace:scsi_debug_exit                            | 0          | 6          |
| backtrace:SyS_delete_module                          | 0          | 6          |
+------------------------------------------------------+------------+------------+


[ 6254.898035] sda: unknown partition table
[ 6254.903049] sd 2:0:0:0: [sda] Attached SCSI disk
[ 6257.214012] sd 2:0:0:0: [sda] Synchronizing SCSI cache
[ 6257.216452] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[ 6257.217194] IP: [<ffffffff813ceae0>] blk_throtl_drain+0x30/0x150
[ 6257.217194] PGD 0
[ 6257.217194] Oops: 0000 [#1] SMP
[ 6257.217194] Modules linked in: sd_mod scsi_debug(-) crct10dif_generic crc_t10dif crct10dif_common loop dm_mod fuse sg sr_mod cdrom ata_generic pata_acpi parport_pc snd_pcm floppy parport snd_timer snd cirrus syscopyarea sysfillrect soundcore sysimgblt ata_piix ttm drm_kms_helper pcspkr i2c_piix4 libata drm
[ 6257.217194] CPU: 2 PID: 28645 Comm: rmmod Not tainted 3.16.0-rc1-01107-ge1fff86 #1
[ 6257.217194] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 6257.217194] task: ffff8801156c0000 ti: ffff88006af74000 task.ti: ffff88006af74000
[ 6257.217194] RIP: 0010:[<ffffffff813ceae0>] [<ffffffff813ceae0>] blk_throtl_drain+0x30/0x150
[ 6257.217194] RSP: 0018:ffff88006af77b60 EFLAGS: 00010046
[ 6257.217194] RAX: 0000000000000000 RBX: ffff88006aec0000 RCX: ffff880052240620
[ 6257.217194] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 6257.217194] RBP: ffff88006af77b78 R08: 0000000000000000 R09: 0000000000000046
[ 6257.240049] R10: ffff88006af77b78 R11: 0000000000000000 R12: ffff88006aec0000
[ 6257.240049] R13: ffff88007e093600 R14: ffff88006aec0658 R15: ffff88007eb8f120
[ 6257.240049] FS: 00007fbe3f39b700(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
[ 6257.240049] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6257.240049] CR2: 0000000000000028 CR3: 000000006aeaf000 CR4: 00000000000006e0
[ 6257.240049] Stack:
[ 6257.240049] ffff88006aec0000 0000000000000000 ffff88006aec0668 ffff88006af77b88
[ 6257.240049] ffffffff813cbb1e ffff88006af77bb8 ffffffff813b0c1c ffff88006aec0000
[ 6257.240049] ffffffff81cf7940 ffff88006aec0000 ffff88007eb8f000 ffff88006af77bd0
[ 6257.240049] Call Trace:
[ 6257.240049] [<ffffffff813cbb1e>] blkcg_drain_queue+0xe/0x10
[ 6257.240049] [<ffffffff813b0c1c>] __blk_drain_queue+0x7c/0x180
[ 6257.240049] [<ffffffff813b0dae>] blk_queue_bypass_start+0x8e/0xd0
[ 6257.240049] [<ffffffff813cacc8>] blkcg_deactivate_policy+0x38/0x140
[ 6257.240049] [<ffffffff813ced34>] blk_throtl_exit+0x34/0x50
[ 6257.240049] [<ffffffff813cbb68>] blkcg_exit_queue+0x48/0x70
[ 6257.240049] [<ffffffff813b4476>] blk_release_queue+0x26/0x100
[ 6257.240049] [<ffffffff813dcc47>] kobject_cleanup+0x77/0x1b0
[ 6257.240049] [<ffffffff813dcaf8>] kobject_put+0x28/0x60
[ 6257.240049] [<ffffffff813adab5>] blk_put_queue+0x15/0x20
[ 6257.240049] [<ffffffff8151d9cb>] scsi_device_dev_release_usercontext+0xbb/0x120
[ 6257.240049] [<ffffffff81087727>] execute_in_process_context+0x67/0x70
[ 6257.240049] [<ffffffff8151d90c>] scsi_device_dev_release+0x1c/0x20
[ 6257.240049] [<ffffffff814deda2>] device_release+0x32/0xa0
[ 6257.240049] [<ffffffff813dcc47>] kobject_cleanup+0x77/0x1b0
[ 6257.240049] [<ffffffff813dcaf8>] kobject_put+0x28/0x60
[ 6257.240049] [<ffffffff814df097>] put_device+0x17/0x20
[ 6257.240049] [<ffffffff8151e419>] __scsi_remove_device+0xa9/0xe0
[ 6257.240049] [<ffffffff8151c9c4>] scsi_forget_host+0x64/0x70
[ 6257.240049] [<ffffffff81510ec7>] scsi_remove_host+0x77/0x120
[ 6257.240049] [<ffffffffa01c25a9>] sdebug_driver_remove+0x29/0x90 [scsi_debug]
[ 6257.240049] [<ffffffff814e332f>] __device_release_driver+0x7f/0xf0
[ 6257.240049] [<ffffffff814e33c3>] device_release_driver+0x23/0x30
[ 6257.240049] [<ffffffff814e2cc8>] bus_remove_device+0x108/0x180
[ 6257.240049] [<ffffffff814df5c9>] device_del+0x129/0x1c0
[ 6257.240049] [<ffffffff814df67e>] device_unregister+0x1e/0x60
[ 6257.240049] [<ffffffffa01c1efc>] sdebug_remove_adapter+0x4c/0x70 [scsi_debug]
[ 6257.240049] [<ffffffffa01c652d>] scsi_debug_exit+0x19/0xaec [scsi_debug]
[ 6257.240049] [<ffffffff810ea52e>] SyS_delete_module+0x12e/0x1c0
[ 6257.240049] [<ffffffff81835162>] ? int_signal+0x12/0x17
[ 6257.240049] [<ffffffff81834ea9>] system_call_fastpath+0x16/0x1b
[ 6257.240049] Code: 55 65 ff 04 25 a0 c7 00 00 48 89 e5 41 55 41 54 49 89 fc 53 4c 8b af 40 07 00 00 49 8b 85 a0 00 00 00 31 ff 48 8b 80 c8 05 00 00 <48> 8b 70 28 e8 f7 8c d2 ff 48 85 c0 48 89 c3 74 61 0f 1f 80 00
[ 6257.240049] RIP [<ffffffff813ceae0>] blk_throtl_drain+0x30/0x150
[ 6257.240049] RSP <ffff88006af77b60>
[ 6257.240049] CR2: 0000000000000028
[ 6257.240049] ------------[ cut here ]------------



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Jet


Attachments:
reproduce (1.72 kB)
.dmesg (38.55 kB)
config-3.16.0-rc1-00023-gc924ec35 (121.00 kB)

2014-06-24 22:11:53

by Tejun Heo

Subject: Re: [block, blk] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

On Tue, Jun 24, 2014 at 12:01:22PM +0800, Jet Chen wrote:
> Hi Tejun,
>
> We noticed the following changes on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-mq-percpu_ref
> commit c924ec35e72ce0d6c289b858d323f7eb3f5076a5 ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero")

Yeah, this is from trying to drain NULL root blkcg. It's hit from
another path too and will soon be fixed.

Thanks!

--
tejun
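
The fault address can be cross-checked against the oops itself. In the
Code: line the faulting instruction starts at the byte marked <48>, and
48 8b 70 28 encodes a 64-bit load through RAX with an 8-bit displacement
of 0x28. The register dump shows RAX as 0, so the load targets address
0x28, which matches CR2 and is consistent with a NULL pointer being
dereferenced in blk_throtl_drain. The Python sketch below decodes only
that one instruction form; the helper name decode_mov_disp8 and the
register table are illustrative assumptions, not part of any kernel
tooling.

```python
# Sketch: decode the faulting instruction from the oops "Code:" line.
# Assumption: only the single encoding seen here is handled
# (REX.W + 8B /r, register base, 8-bit displacement, no SIB byte).

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_disp8(code):
    """Decode REX.W MOV r64, [base + disp8]; return (dest, base, disp)."""
    rex, opcode, modrm, disp = code[:4]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64 instruction")
    mod = modrm >> 6          # addressing mode
    reg = (modrm >> 3) & 7    # destination register
    rm = modrm & 7            # base register
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [base + disp8] without SIB is handled")
    if disp >= 0x80:          # disp8 is sign-extended
        disp -= 0x100
    return REG64[reg], REG64[rm], disp


# Bytes starting at the <48> marker in the Code: line.
faulting = bytes.fromhex("488b7028")
# RAX value taken from the register dump in the oops.
registers = {"rax": 0x0}

dest, base, disp = decode_mov_disp8(faulting)
fault_address = registers[base] + disp
print(f"mov {disp:#x}(%{base}),%{dest} -> fault at {fault_address:#018x}")
```

For the bytes above this prints the AT&T-style form of the load and the
computed fault address, which can be compared directly with the CR2
value in the log.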