2021-12-05 13:16:15

by kernel test robot

[permalink] [raw]
Subject: [blk] 4575a8b36e: WARNING:at_kernel/sched/core.c:#__might_sleep



Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 4575a8b36e3a56fa87b1f77e0064fc2ec36ebb7c ("[PATCH V2 1/5] blk-mq: remove hctx_lock and hctx_unlock")
url: https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-quiesce-improvement/20211130-154015
base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/linux-block/[email protected]

in testcase: blktests
version: blktests-x86_64-f51ee53-1_20211202
with following parameters:

test: nbd-group-01
ucode: 0xe2



on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 36.018984][ T964] WARNING: CPU: 1 PID: 964 at kernel/sched/core.c:9461 __might_sleep (kernel/sched/core.c:9461 (discriminator 9))
[ 36.028135][ T964] Modules linked in: nbd loop xfs libcrc32c dm_multipath dm_mod sd_mod t10_pi sg ipmi_devintf ipmi_msghandler intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp i915 kvm_intel mei_wdt kvm intel_gtt ttm wmi_bmof drm_kms_helper irqbypass crct10dif_pclmul syscopyarea sysfillrect crc32_pclmul crc32c_intel ghash_clmulni_intel sysimgblt rapl fb_sys_fops intel_cstate ahci libahci drm intel_uncore mei_me intel_pch_thermal libata mei wmi video intel_pmc_core acpi_pad ip_tables
[ 36.048867][ T303] RESULT_ROOT=/result/blktests/nbd-group-01-ucode=0xe2/lkp-skl-d03/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3-func/gcc-9/4575a8b36e3a56fa87b1f77e0064fc2ec36ebb7c/6
[ 36.073878][ T303]
[ 36.073994][ T964] CPU: 1 PID: 964 Comm: parted Tainted: G I 5.16.0-rc3-00091-g4575a8b36e3a #1
[ 36.102953][ T964] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.1.1 10/07/2015
[ 36.111136][ T964] RIP: 0010:__might_sleep (kernel/sched/core.c:9461 (discriminator 9))
[ 36.116481][ T964] Code: 48 8d bb 00 23 00 00 48 89 fa 48 c1 ea 03 80 3c 02 00 75 31 48 8b 93 00 23 00 00 44 89 f6 48 c7 c7 a0 f6 a7 83 e8 96 40 26 02 <0f> 0b e9 6e ff ff ff e8 1a eb 65 00 e9 1d ff ff ff e8 50 eb 65 00
All code
========
0: 48 8d bb 00 23 00 00 lea 0x2300(%rbx),%rdi
7: 48 89 fa mov %rdi,%rdx
a: 48 c1 ea 03 shr $0x3,%rdx
e: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
12: 75 31 jne 0x45
14: 48 8b 93 00 23 00 00 mov 0x2300(%rbx),%rdx
1b: 44 89 f6 mov %r14d,%esi
1e: 48 c7 c7 a0 f6 a7 83 mov $0xffffffff83a7f6a0,%rdi
25: e8 96 40 26 02 callq 0x22640c0
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 6e ff ff ff jmpq 0xffffffffffffff9f
31: e8 1a eb 65 00 callq 0x65eb50
36: e9 1d ff ff ff jmpq 0xffffffffffffff58
3b: e8 50 eb 65 00 callq 0x65eb90

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 6e ff ff ff jmpq 0xffffffffffffff75
7: e8 1a eb 65 00 callq 0x65eb26
c: e9 1d ff ff ff jmpq 0xffffffffffffff2e
11: e8 50 eb 65 00 callq 0x65eb66
[ 36.136101][ T964] RSP: 0018:ffffc90000f87580 EFLAGS: 00010286
[ 36.142134][ T964] RAX: 0000000000000000 RBX: ffff888859690000 RCX: 0000000000000000
[ 36.150086][ T964] RDX: 0000000000000004 RSI: 0000000000000008 RDI: fffff520001f0ea2
[ 36.157941][ T964] RBP: ffffc90000f875a8 R08: 0000000000000001 R09: ffffed10fae96741
[ 36.165885][ T964] R10: ffff8887d74b3a07 R11: ffffed10fae96740 R12: ffffffff83c07c00
[ 36.173851][ T964] R13: 000000000000081b R14: 0000000000000002 R15: ffffffff84c9cfc0
[ 36.181821][ T964] FS: 00007fbfd9ece5c0(0000) GS:ffff8887d7480000(0000) knlGS:0000000000000000
[ 36.190633][ T964] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 36.197185][ T964] CR2: 000056330e10b000 CR3: 0000000857e7e002 CR4: 00000000003706e0
[ 36.205042][ T964] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 36.213012][ T964] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 36.220965][ T964] Call Trace:
[ 36.224218][ T964] <TASK>
[ 36.227024][ T964] blk_mq_run_hw_queue (block/blk-mq.c:2075 (discriminator 2))
[ 36.232000][ T964] blk_mq_sched_insert_requests (arch/x86/include/asm/preempt.h:80 include/linux/rcupdate.h:68 include/linux/rcupdate.h:686 include/linux/percpu-refcount.h:330 include/linux/percpu-refcount.h:351 block/blk-mq-sched.c:487)
[ 36.237871][ T964] blk_mq_flush_plug_list (block/blk-mq.c:2546)
[ 36.243227][ T964] ? update_io_ticks (block/blk-core.c:1040)
[ 36.248080][ T964] ? blk_mq_insert_requests (block/blk-mq.c:2546)
[ 36.253517][ T964] ? blk_mq_submit_bio (block/blk-mq.c:866 block/blk-mq.c:2386 block/blk-mq.c:2789)
[ 36.258603][ T964] ? blk_mq_try_issue_list_directly (block/blk-mq.c:2760)
[ 36.264718][ T964] blk_flush_plug (block/blk-core.c:1262)
[ 36.269269][ T964] ? _raw_spin_lock_irqsave (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:513 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
[ 36.274603][ T964] ? blk_start_plug_nr_ios (block/blk-core.c:1251)
[ 36.279943][ T964] ? __filemap_add_folio (mm/filemap.c:953)
[ 36.285106][ T964] io_schedule (kernel/sched/core.c:8371)
[ 36.287859][ T303] job=/lkp/jobs/scheduled/lkp-skl-d03/blktests-nbd-group-01-ucode=0xe2-debian-10.4-x86_64-20200603.cgz-4575a8b36e3a56fa87b1f77e0064fc2ec36ebb7c-20211203-44701-13xrrcj-6.yaml
[ 36.289315][ T964] bit_wait_io (include/linux/sched/signal.h:398 kernel/sched/wait_bit.c:210)
[ 36.289654][ T303]
[ 36.306422][ T964] __wait_on_bit (kernel/sched/wait_bit.c:49)
[ 36.317251][ T964] ? bit_wait_timeout (kernel/sched/wait_bit.c:208)
[ 36.322168][ T964] out_of_line_wait_on_bit (kernel/sched/wait_bit.c:60)
[ 36.327427][ T964] ? __wait_on_bit (kernel/sched/wait_bit.c:60)
[ 36.332063][ T964] ? var_wake_function (kernel/sched/wait_bit.c:22)
[ 36.337048][ T964] __block_write_begin_int (include/linux/wait_bit.h:101 fs/buffer.c:122 include/linux/buffer_head.h:356 fs/buffer.c:2048)
[ 36.342483][ T964] ? blkdev_direct_IO (block/fops.c:27)
[ 36.347382][ T964] ? invalidate_bh_lrus_cpu (fs/buffer.c:1974)
[ 36.352624][ T964] ? inode_owner_or_capable (fs/inode.c:2289)
[ 36.358042][ T964] ? blkdev_direct_IO (block/fops.c:27)
[ 36.362952][ T964] block_write_begin (fs/buffer.c:2060 fs/buffer.c:2120)
[ 36.367674][ T964] generic_perform_write (mm/filemap.c:3758)
[ 36.372832][ T964] ? trace_event_raw_event_file_check_and_advance_wb_err (mm/filemap.c:3721)
[ 36.380789][ T964] ? inode_update_time (fs/inode.c:1985)
[ 36.385599][ T964] ? mem_cgroup_oom_trylock (mm/memcontrol.c:2585)
[ 36.391016][ T964] __generic_file_write_iter (mm/filemap.c:3884)
[ 36.396524][ T964] blkdev_write_iter (block/fops.c:558)
[ 36.401358][ T964] ? blkdev_bio_end_io (block/fops.c:527)
[ 36.406415][ T964] ? folio_add_lru (mm/swap.c:468)
[ 36.410875][ T964] ? do_anonymous_page (mm/memory.c:3800)
[ 36.415947][ T964] new_sync_write (fs/read_write.c:504 (discriminator 1))
[ 36.420496][ T964] ? __handle_mm_fault (mm/memory.c:4549 mm/memory.c:4686)
[ 36.425670][ T964] ? new_sync_read (fs/read_write.c:493)
[ 36.430317][ T964] ? enqueue_hrtimer (kernel/time/hrtimer.c:1092 (discriminator 3))
[ 36.435041][ T964] vfs_write (fs/read_write.c:590)
[ 36.439168][ T964] ksys_write (fs/read_write.c:643)
[ 36.443280][ T964] ? __ia32_sys_read (fs/read_write.c:633)
[ 36.447914][ T964] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 36.452202][ T964] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)
[ 36.457985][ T964] RIP: 0033:0x7fbfda4fc504
[ 36.462271][ T964] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
All code
========
0: 00 f7 add %dh,%bh
2: d8 64 89 02 fsubs 0x2(%rcx,%rcx,4)
6: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
d: eb b3 jmp 0xffffffffffffffc2
f: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
16: 48 8d 05 f9 61 0d 00 lea 0xd61f9(%rip),%rax # 0xd6216
1d: 8b 00 mov (%rax),%eax
1f: 85 c0 test %eax,%eax
21: 75 13 jne 0x36
23: b8 01 00 00 00 mov $0x1,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 54 ja 0x86
32: c3 retq
33: 0f 1f 00 nopl (%rax)
36: 41 54 push %r12
38: 49 89 d4 mov %rdx,%r12
3b: 55 push %rbp
3c: 48 89 f5 mov %rsi,%rbp
3f: 53 push %rbx

Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 54 ja 0x5c
8: c3 retq
9: 0f 1f 00 nopl (%rax)
c: 41 54 push %r12
e: 49 89 d4 mov %rdx,%r12
11: 55 push %rbp
12: 48 89 f5 mov %rsi,%rbp
15: 53 push %rbx


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (9.96 kB)
config-5.16.0-rc3-00091-g4575a8b36e3a (173.59 kB)
job-script (5.49 kB)
dmesg.xz (41.21 kB)
blktests (2.83 kB)
job.yaml (4.49 kB)
reproduce (84.00 B)
Download all attachments