Dear SCSI developers:
During an rcutorture test of linux-stable 5.15.y in a PPC VM at the
Open Source Lab of Oregon State University, a SCSI-related bug was
discovered [1]:
[ 5.178733][ C1] BUG: Kernel NULL pointer dereference on read at 0x00000008
...
[ 5.231013][ C1] [c00000001ff9fca0] [c0000000009ffbc8] scsi_end_request+0xd8/0x1f0 (unreliable)
[ 5.234961][ C1] [c00000001ff9fcf0] [c000000000a00e68] scsi_io_completion+0x88/0x700
[ 5.237863][ C1] [c00000001ff9fda0] [c0000000009f5028] scsi_finish_command+0xe8/0x150
[ 5.240089][ C1] [c00000001ff9fdf0] [c000000000a00c70] scsi_complete+0x90/0x140
[ 5.242481][ C1] [c00000001ff9fe20] [c0000000007e5170] blk_complete_reqs+0x80/0xa0
[ 5.245187][ C1] [c00000001ff9fe50] [c000000000f0b5d0] __do_softirq+0x1e0/0x4e0
[ 5.248479][ C1] [c00000001ff9ff90] [c0000000000170e8] do_softirq_own_stack+0x48/0x60
[ 5.250919][ C1] [c00000000a5e7c40] [c00000000a5e7c80] 0xc00000000a5e7c80
[ 5.253792][ C1] [c00000000a5e7c70] [c0000000001534c0] do_softirq+0xb0/0xc0
[ 5.256824][ C1] [c00000000a5e7ca0] [c0000000001535ac] __local_bh_enable_ip+0xdc/0x110
[ 5.259414][ C1] [c00000000a5e7cc0] [c0000000001d75e8] irq_forced_thread_fn+0xc8/0xf0
[ 5.261921][ C1] [c00000000a5e7d00] [c0000000001d7ae4] irq_thread+0x1b4/0x2a0
[ 5.265298][ C1] [c00000000a5e7da0] [c00000000017d8c8] kthread+0x1a8/0x1d0
[ 5.269184][ C1] [c00000000a5e7e10] [c00000000000cee4]
By adding printk statements to the SCSI subsystem and performing several
rounds of QEMU boots, I found that the bug is caused by the following
use-after-free scenario:
A)                                          B)
__scsi_scan_target
  scsi_probe_and_add_lun
    scsi_probe_lun
      scsi_execute_req
        __scsi_execute
          blk_execute_rq ---> req --->
time out
  __scsi_remove_device
    blk_cleanup_queue
      percpu_ref_exit(&q->q_usage_counter)
                                            scsi_end_request
                                              percpu_ref_put(&q->q_usage_counter)
                                              USE-AFTER-FREE
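
To make the ordering concrete, here is a minimal userspace sketch of the
interleaving described above. It is only a model under stated assumptions:
struct model_ref, struct model_queue and all model_*/replay_* names are
hypothetical stand-ins, not the kernel's percpu_ref API, and the "exited"
flag merely marks the point where the real counter storage would already
have been released. The extra get at request issue time is also an
assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for q->q_usage_counter (NOT the kernel percpu_ref). */
struct model_ref {
	long count;   /* outstanding references in this model */
	bool exited;  /* true once the counter's storage would be gone */
};

/* Hypothetical stand-in for the request queue that owns the counter. */
struct model_queue {
	struct model_ref usage;
};

enum put_result { PUT_OK, PUT_AFTER_EXIT };

static void model_ref_init(struct model_ref *ref)
{
	assert(ref != NULL);
	ref->count = 1;       /* initial reference held by the owner */
	ref->exited = false;
}

static void model_ref_get(struct model_ref *ref)
{
	ref->count++;
}

/* Models the teardown step: after this, the counter must not be touched. */
static void model_ref_exit(struct model_ref *ref)
{
	ref->exited = true;
}

/* Models the put; reports a put that arrives after teardown instead of
 * actually dereferencing freed memory. */
static enum put_result model_ref_put(struct model_ref *ref)
{
	if (ref->exited)
		return PUT_AFTER_EXIT;
	ref->count--;
	return PUT_OK;
}

/* Replays the reported order: A tears the counter down, then B puts. */
static enum put_result replay_reported_order(void)
{
	struct model_queue q;

	model_ref_init(&q.usage);
	model_ref_get(&q.usage);        /* A: request issued (model assumption) */
	/* A: the request times out while still in flight */
	model_ref_exit(&q.usage);       /* A: cleanup path exits the counter */
	return model_ref_put(&q.usage); /* B: late completion puts the counter */
}

/* Replays the order in which B's completion finishes before A's cleanup. */
static enum put_result replay_safe_order(void)
{
	struct model_queue q;
	enum put_result res;

	model_ref_init(&q.usage);
	model_ref_get(&q.usage);        /* A: request issued (model assumption) */
	res = model_ref_put(&q.usage);  /* B: completion puts the counter first */
	model_ref_exit(&q.usage);       /* A: cleanup runs afterwards */
	return res;
}
```

In this model replay_reported_order() returns PUT_AFTER_EXIT, which
corresponds to the put in B hitting a counter that A has already torn down,
while replay_safe_order() returns PUT_OK. The sketch says nothing about how
the kernel should enforce the safe order.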
Reported-by: Zhouyi Zhou <[email protected]>
Thanks for your attention
Zhouyi
[1] https://lore.kernel.org/lkml/CAABZP2wa_ZTHUr9tH_6OSpr+TgNACo4kMu3eawsGV5qkCDoAKg@mail.gmail.com/T/