2019-08-05 17:53:01

by syzbot

[permalink] [raw]
Subject: WARNING: refcount bug in blk_mq_free_request (2)

Hello,

syzbot found the following crash on:

HEAD commit: e21a712a Linux 5.3-rc3
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10cf349a600000
kernel config: https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
dashboard link: https://syzkaller.appspot.com/bug?extid=f4316dab9d4518b755eb
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=117a1906600000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11aa11aa600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]

------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 1 PID: 16 at lib/refcount.c:190 refcount_sub_and_test_checked
lib/refcount.c:190 [inline]
WARNING: CPU: 1 PID: 16 at lib/refcount.c:190
refcount_sub_and_test_checked+0x1d0/0x200 lib/refcount.c:180
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.3.0-rc3 #98
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
panic+0x2dc/0x755 kernel/panic.c:219
__warn.cold+0x20/0x4c kernel/panic.c:576
report_bug+0x263/0x2b0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:179 [inline]
fixup_bug arch/x86/kernel/traps.c:174 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1026
RIP: 0010:refcount_sub_and_test_checked lib/refcount.c:190 [inline]
RIP: 0010:refcount_sub_and_test_checked+0x1d0/0x200 lib/refcount.c:180
Code: 1d 7e b3 64 06 31 ff 89 de e8 9c a3 35 fe 84 db 75 94 e8 53 a2 35 fe
48 c7 c7 80 02 c6 87 c6 05 5e b3 64 06 01 e8 18 15 07 fe <0f> 0b e9 75 ff
ff ff e8 34 a2 35 fe e9 6e ff ff ff 48 89 df e8 b7
RSP: 0018:ffff8880a990fbb0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000100 RSI: ffffffff815c3ba6 RDI: ffffed1015321f68
RBP: ffff8880a990fc48 R08: ffff8880a9900440 R09: ffffed1015d24101
R10: ffffed1015d24100 R11: ffff8880ae920807 R12: 00000000ffffffff
R13: 0000000000000001 R14: ffff8880a990fc20 R15: 0000000000000000
refcount_dec_and_test_checked+0x1b/0x20 lib/refcount.c:220
blk_mq_free_request+0x3b8/0x580 block/blk-mq.c:524
__blk_mq_end_request block/blk-mq.c:550 [inline]
blk_mq_end_request+0x456/0x560 block/blk-mq.c:559
nbd_complete_rq+0x42/0x50 drivers/block/nbd.c:322
blk_done_softirq+0x2fe/0x4d0 block/blk-softirq.c:37
__do_softirq+0x262/0x98c kernel/softirq.c:292
run_ksoftirqd kernel/softirq.c:603 [inline]
run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595
smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165
kthread+0x361/0x430 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


2019-08-06 00:00:01

by syzbot

[permalink] [raw]
Subject: Re: WARNING: refcount bug in blk_mq_free_request (2)

syzbot has bisected this bug to:

commit 12f5b93145450c750f315657ef239a314811aeeb
Author: Keith Busch <[email protected]>
Date: Tue May 29 13:52:28 2018 +0000

blk-mq: Remove generation seqeunce

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1263808c600000
start commit: e21a712a Linux 5.3-rc3
git tree: upstream
final crash: https://syzkaller.appspot.com/x/report.txt?x=1163808c600000
console output: https://syzkaller.appspot.com/x/log.txt?x=1663808c600000
kernel config: https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
dashboard link: https://syzkaller.appspot.com/bug?extid=f4316dab9d4518b755eb
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=117a1906600000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11aa11aa600000

Reported-by: [email protected]
Fixes: 12f5b9314545 ("blk-mq: Remove generation seqeunce")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

2019-08-06 15:48:33

by Keith Busch

[permalink] [raw]
Subject: Re: WARNING: refcount bug in blk_mq_free_request (2)

On Mon, Aug 05, 2019 at 10:52:07AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: e21a712a Linux 5.3-rc3
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10cf349a600000
> kernel config: https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
> dashboard link: https://syzkaller.appspot.com/bug?extid=f4316dab9d4518b755eb
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=117a1906600000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11aa11aa600000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: [email protected]
>
> ------------[ cut here ]------------
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 1 PID: 16 at lib/refcount.c:190 refcount_sub_and_test_checked
> lib/refcount.c:190 [inline]
> WARNING: CPU: 1 PID: 16 at lib/refcount.c:190
> refcount_sub_and_test_checked+0x1d0/0x200 lib/refcount.c:180
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.3.0-rc3 #98
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> panic+0x2dc/0x755 kernel/panic.c:219
> __warn.cold+0x20/0x4c kernel/panic.c:576
> report_bug+0x263/0x2b0 lib/bug.c:186
> fixup_bug arch/x86/kernel/traps.c:179 [inline]
> fixup_bug arch/x86/kernel/traps.c:174 [inline]
> do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
> do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
> invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1026
> RIP: 0010:refcount_sub_and_test_checked lib/refcount.c:190 [inline]
> RIP: 0010:refcount_sub_and_test_checked+0x1d0/0x200 lib/refcount.c:180
> Code: 1d 7e b3 64 06 31 ff 89 de e8 9c a3 35 fe 84 db 75 94 e8 53 a2 35 fe
> 48 c7 c7 80 02 c6 87 c6 05 5e b3 64 06 01 e8 18 15 07 fe <0f> 0b e9 75 ff
> ff ff e8 34 a2 35 fe e9 6e ff ff ff 48 89 df e8 b7
> RSP: 0018:ffff8880a990fbb0 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000100 RSI: ffffffff815c3ba6 RDI: ffffed1015321f68
> RBP: ffff8880a990fc48 R08: ffff8880a9900440 R09: ffffed1015d24101
> R10: ffffed1015d24100 R11: ffff8880ae920807 R12: 00000000ffffffff
> R13: 0000000000000001 R14: ffff8880a990fc20 R15: 0000000000000000
> refcount_dec_and_test_checked+0x1b/0x20 lib/refcount.c:220
> blk_mq_free_request+0x3b8/0x580 block/blk-mq.c:524
> __blk_mq_end_request block/blk-mq.c:550 [inline]
> blk_mq_end_request+0x456/0x560 block/blk-mq.c:559
> nbd_complete_rq+0x42/0x50 drivers/block/nbd.c:322
> blk_done_softirq+0x2fe/0x4d0 block/blk-softirq.c:37
> __do_softirq+0x262/0x98c kernel/softirq.c:292
> run_ksoftirqd kernel/softirq.c:603 [inline]
> run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595
> smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165
> kthread+0x361/0x430 kernel/kthread.c:255
> ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

It looks like ndb's timeout handler may complete the same request twice:
first via ndb_config_put()->ndb_clear_sock(), then again explicitly
through a direct call to blk_mq_complete_request().

2019-08-06 22:40:21

by Jens Axboe

[permalink] [raw]
Subject: Re: WARNING: refcount bug in blk_mq_free_request (2)

On 8/6/19 8:43 AM, Keith Busch wrote:
> On Mon, Aug 05, 2019 at 10:52:07AM -0700, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit: e21a712a Linux 5.3-rc3
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=10cf349a600000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
>> dashboard link: https://syzkaller.appspot.com/bug?extid=f4316dab9d4518b755eb
>> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=117a1906600000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11aa11aa600000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: [email protected]
>>
>> ------------[ cut here ]------------
>> refcount_t: underflow; use-after-free.
>> WARNING: CPU: 1 PID: 16 at lib/refcount.c:190 refcount_sub_and_test_checked
>> lib/refcount.c:190 [inline]
>> WARNING: CPU: 1 PID: 16 at lib/refcount.c:190
>> refcount_sub_and_test_checked+0x1d0/0x200 lib/refcount.c:180
>> Kernel panic - not syncing: panic_on_warn set ...
>> CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.3.0-rc3 #98
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:77 [inline]
>> dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>> panic+0x2dc/0x755 kernel/panic.c:219
>> __warn.cold+0x20/0x4c kernel/panic.c:576
>> report_bug+0x263/0x2b0 lib/bug.c:186
>> fixup_bug arch/x86/kernel/traps.c:179 [inline]
>> fixup_bug arch/x86/kernel/traps.c:174 [inline]
>> do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
>> do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
>> invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1026
>> RIP: 0010:refcount_sub_and_test_checked lib/refcount.c:190 [inline]
>> RIP: 0010:refcount_sub_and_test_checked+0x1d0/0x200 lib/refcount.c:180
>> Code: 1d 7e b3 64 06 31 ff 89 de e8 9c a3 35 fe 84 db 75 94 e8 53 a2 35 fe
>> 48 c7 c7 80 02 c6 87 c6 05 5e b3 64 06 01 e8 18 15 07 fe <0f> 0b e9 75 ff
>> ff ff e8 34 a2 35 fe e9 6e ff ff ff 48 89 df e8 b7
>> RSP: 0018:ffff8880a990fbb0 EFLAGS: 00010286
>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>> RDX: 0000000000000100 RSI: ffffffff815c3ba6 RDI: ffffed1015321f68
>> RBP: ffff8880a990fc48 R08: ffff8880a9900440 R09: ffffed1015d24101
>> R10: ffffed1015d24100 R11: ffff8880ae920807 R12: 00000000ffffffff
>> R13: 0000000000000001 R14: ffff8880a990fc20 R15: 0000000000000000
>> refcount_dec_and_test_checked+0x1b/0x20 lib/refcount.c:220
>> blk_mq_free_request+0x3b8/0x580 block/blk-mq.c:524
>> __blk_mq_end_request block/blk-mq.c:550 [inline]
>> blk_mq_end_request+0x456/0x560 block/blk-mq.c:559
>> nbd_complete_rq+0x42/0x50 drivers/block/nbd.c:322
>> blk_done_softirq+0x2fe/0x4d0 block/blk-softirq.c:37
>> __do_softirq+0x262/0x98c kernel/softirq.c:292
>> run_ksoftirqd kernel/softirq.c:603 [inline]
>> run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595
>> smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165
>> kthread+0x361/0x430 kernel/kthread.c:255
>> ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..
>
> It looks like ndb's timeout handler may complete the same request twice:
> first via ndb_config_put()->ndb_clear_sock(), then again explicitly
> through a direct call to blk_mq_complete_request().

Adding Josef.

--
Jens Axboe