2024-04-12 03:12:15

by Oliver Sang

Subject: [bristot:dl_server_v6_try11] [sched/rt] 03100a344f: WARNING:at_kernel/sched/deadline.c:#task_contending



Hello,

kernel test robot noticed "WARNING:at_kernel/sched/deadline.c:#task_contending" on:

commit: 03100a344f14806e2e965fd79319b2bd8615601b ("sched/rt: Remove default bandwidth control")
git://git.kernel.org/cgit/linux/kernel/git/bristot/linux dl_server_v6_try11

in testcase: ltp
version: ltp-x86_64-14c1f76-1_20240406
with the following parameters:

test: sched



compiler: gcc-13
test machine: 8 threads 1 socket Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (Ivy Bridge) with 16G memory

(please refer to the attached dmesg/kmsg for the entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]


[ 185.555673][ T4639] ------------[ cut here ]------------
[ 185.555940][ T4639] dl_rq->running_bw > dl_rq->this_bw
[ 185.555952][ T4639] WARNING: CPU: 2 PID: 4639 at kernel/sched/deadline.c:254 task_contending (kernel/sched/deadline.c:254 (discriminator 1) kernel/sched/deadline.c:314 (discriminator 1) kernel/sched/deadline.c:511 (discriminator 1))
[ 185.556507][ T4639] Modules linked in: netconsole btrfs blake2b_generic xor zstd_compress raid6_pq libcrc32c sd_mod t10_pi intel_rapl_msr intel_rapl_common crc64_rocksoft_generic x86_pkg_temp_thermal intel_powerclamp coretemp crc64_rocksoft crc64 sg kvm_intel ipmi_devintf i915 ipmi_msghandler kvm drm_buddy crct10dif_pclmul intel_gtt crc32_pclmul crc32c_intel drm_display_helper ghash_clmulni_intel sha512_ssse3 ttm rapl mxm_wmi ahci drm_kms_helper intel_cstate firewire_ohci libahci video firewire_core i2c_i801 intel_uncore libata crc_itu_t lpc_ich i2c_smbus wmi binfmt_misc loop drm fuse dm_mod ip_tables
[ 185.557861][ T4639] CPU: 2 PID: 4639 Comm: cfs_bandwidth01 Not tainted 6.9.0-rc2-00006-g03100a344f14 #1
[ 185.558184][ T4639] Hardware name: /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
[ 185.558459][ T4639] RIP: 0010:task_contending (kernel/sched/deadline.c:254 (discriminator 1) kernel/sched/deadline.c:314 (discriminator 1) kernel/sched/deadline.c:511 (discriminator 1))
[ 185.558674][ T4639] Code: e8 f2 1c 75 00 e9 29 fd ff ff 80 3d 85 11 50 04 00 0f 85 ed fc ff ff 48 c7 c7 a0 05 e9 83 c6 05 71 11 50 04 01 e8 7d aa ec ff <0f> 0b e9 d3 fc ff ff 80 3d 5f 11 50 04 00 0f 85 97 fc ff ff 48 c7
All code
========
0: e8 f2 1c 75 00 callq 0x751cf7
5: e9 29 fd ff ff jmpq 0xfffffffffffffd33
a: 80 3d 85 11 50 04 00 cmpb $0x0,0x4501185(%rip) # 0x4501196
11: 0f 85 ed fc ff ff jne 0xfffffffffffffd04
17: 48 c7 c7 a0 05 e9 83 mov $0xffffffff83e905a0,%rdi
1e: c6 05 71 11 50 04 01 movb $0x1,0x4501171(%rip) # 0x4501196
25: e8 7d aa ec ff callq 0xffffffffffecaaa7
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 d3 fc ff ff jmpq 0xfffffffffffffd04
31: 80 3d 5f 11 50 04 00 cmpb $0x0,0x450115f(%rip) # 0x4501197
38: 0f 85 97 fc ff ff jne 0xfffffffffffffcd5
3e: 48 rex.W
3f: c7 .byte 0xc7

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 d3 fc ff ff jmpq 0xfffffffffffffcda
7: 80 3d 5f 11 50 04 00 cmpb $0x0,0x450115f(%rip) # 0x450116d
e: 0f 85 97 fc ff ff jne 0xfffffffffffffcab
14: 48 rex.W
15: c7 .byte 0xc7
[ 185.559149][ T4639] RSP: 0018:ffffc9000ac3fb18 EFLAGS: 00010086
[ 185.559382][ T4639] RAX: 0000000000000000 RBX: ffff8883447c52a8 RCX: 0000000000000027
[ 185.559652][ T4639] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffff888344530b08
[ 185.559938][ T4639] RBP: ffff8883447c4940 R08: 0000000000000001 R09: ffffed10688a6161
[ 185.560241][ T4639] R10: ffff888344530b0b R11: 0000000000000001 R12: 0000000000019998
[ 185.560540][ T4639] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000071
[ 185.560811][ T4639] FS: 00007f685e46f740(0000) GS:ffff888344500000(0000) knlGS:0000000000000000
[ 185.561114][ T4639] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 185.561374][ T4639] CR2: 00007f685e470068 CR3: 00000003ff104001 CR4: 00000000001706f0
[ 185.561667][ T4639] Call Trace:
[ 185.561840][ T4639] <TASK>
[ 185.562039][ T4639] ? __warn (kernel/panic.c:694)
[ 185.562270][ T4639] ? task_contending (kernel/sched/deadline.c:254 (discriminator 1) kernel/sched/deadline.c:314 (discriminator 1) kernel/sched/deadline.c:511 (discriminator 1))
[ 185.562502][ T4639] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 185.562731][ T4639] ? handle_bug (arch/x86/kernel/traps.c:239 (discriminator 1))
[ 185.562933][ T4639] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
[ 185.563142][ T4639] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
[ 185.563351][ T4639] ? task_contending (kernel/sched/deadline.c:254 (discriminator 1) kernel/sched/deadline.c:314 (discriminator 1) kernel/sched/deadline.c:511 (discriminator 1))
[ 185.563553][ T4639] ? task_contending (kernel/sched/deadline.c:254 (discriminator 1) kernel/sched/deadline.c:314 (discriminator 1) kernel/sched/deadline.c:511 (discriminator 1))
[ 185.563758][ T4639] enqueue_dl_entity (kernel/sched/deadline.c:75 kernel/sched/deadline.c:1071 kernel/sched/deadline.c:2053)
[ 185.563981][ T4639] dl_server_start (kernel/sched/deadline.c:1652)
[ 185.564185][ T4639] ? dl_server_update_idle_time (kernel/sched/deadline.c:1611 kernel/sched/deadline.c:1594)
[ 185.564410][ T4639] enqueue_task_fair (kernel/sched/fair.c:6749)
[ 185.564614][ T4639] activate_task (kernel/sched/core.c:2167 (discriminator 2))
[ 185.564834][ T4639] wake_up_new_task (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/sched.h:185 kernel/sched/core.c:4921)
[ 185.565042][ T4639] ? __pfx_wake_up_new_task (kernel/sched/core.c:4897)
[ 185.565266][ T4639] kernel_clone (kernel/fork.c:2830 (discriminator 1))
[ 185.565469][ T4639] ? __pfx_kernel_clone (kernel/fork.c:2756)
[ 185.565674][ T4639] ? __pfx_free_object_rcu (mm/kmemleak.c:508)
[ 185.565891][ T4639] ? rcu_segcblist_enqueue (arch/x86/include/asm/atomic64_64.h:25 include/linux/atomic/atomic-arch-fallback.h:2672 include/linux/atomic/atomic-long.h:121 include/linux/atomic/atomic-instrumented.h:3261 kernel/rcu/rcu_segcblist.c:214 kernel/rcu/rcu_segcblist.c:231 kernel/rcu/rcu_segcblist.c:343)
[ 185.566093][ T4639] ? __call_rcu_common+0x319/0xa00
[ 185.566310][ T4639] __do_sys_clone (kernel/fork.c:2928)
[ 185.566502][ T4639] ? __pfx___do_sys_clone (kernel/fork.c:2928)
[ 185.566715][ T4639] ? syscall_exit_to_user_mode (kernel/entry/common.c:221)
[ 185.566929][ T4639] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
[ 185.567119][ T4639] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
[ 185.567343][ T4639] RIP: 0033:0x7f685e546293
[ 185.567565][ T4639] Code: 00 00 00 00 00 66 90 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 89 c2 85 c0 75 2c 64 48 8b 04 25 10 00 00
All code
========
0: 00 00 add %al,(%rax)
2: 00 00 add %al,(%rax)
4: 00 66 90 add %ah,-0x70(%rsi)
7: 64 48 8b 04 25 10 00 mov %fs:0x10,%rax
e: 00 00
10: 45 31 c0 xor %r8d,%r8d
13: 31 d2 xor %edx,%edx
15: 31 f6 xor %esi,%esi
17: bf 11 00 20 01 mov $0x1200011,%edi
1c: 4c 8d 90 d0 02 00 00 lea 0x2d0(%rax),%r10
23: b8 38 00 00 00 mov $0x38,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 35 ja 0x67
32: 89 c2 mov %eax,%edx
34: 85 c0 test %eax,%eax
36: 75 2c jne 0x64
38: 64 fs
39: 48 rex.W
3a: 8b .byte 0x8b
3b: 04 25 add $0x25,%al
3d: 10 00 adc %al,(%rax)
...

Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 35 ja 0x3d
8: 89 c2 mov %eax,%edx
a: 85 c0 test %eax,%eax
c: 75 2c jne 0x3a
e: 64 fs
f: 48 rex.W
10: 8b .byte 0x8b
11: 04 25 add $0x25,%al
13: 10 00 adc %al,(%rax)
...
[ 185.568066][ T4639] RSP: 002b:00007ffc6abbefd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 185.568335][ T4639] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f685e546293
[ 185.568593][ T4639] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 185.568865][ T4639] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000064
[ 185.569126][ T4639] R10: 00007f685e46fa10 R11: 0000000000000246 R12: 0000000000000001
[ 185.569403][ T4639] R13: 0000555c55664069 R14: 0000555c55675cb8 R15: 0000000000000000
[ 185.569668][ T4639] </TASK>
[ 185.569823][ T4639] ---[ end trace 0000000000000000 ]---
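
The condition printed just before the WARNING line, dl_rq->running_bw > dl_rq->this_bw, fired while dl_server_start() was enqueueing a deadline entity via enqueue_dl_entity() and task_contending(). As a reading aid only, here is a minimal userspace sketch in Python (not kernel code) of that invariant. The class DlRq, its methods, and start_server() are hypothetical names chosen for illustration, and the "unaccounted" case is just one conceivable way to trip the check; this report does not identify the actual cause.

```python
class DlRq:
    """Toy model of the two per-runqueue counters named in the warning."""

    def __init__(self):
        self.this_bw = 0      # bandwidth of entities accounted to this runqueue
        self.running_bw = 0   # bandwidth of entities currently contending
        self.warnings = []    # stands in for the kernel log

    def add_rq_bw(self, bw):
        # Account an entity's bandwidth to the runqueue.
        self.this_bw += bw

    def add_running_bw(self, bw):
        # Mark an entity as contending, then check the invariant
        # running_bw <= this_bw (the condition reported above).
        self.running_bw += bw
        if self.running_bw > self.this_bw:
            self.warnings.append("dl_rq->running_bw > dl_rq->this_bw")


def start_server(rq, bw, accounted):
    """Hypothetical helper: start an entity with or without prior accounting."""
    if accounted:
        rq.add_rq_bw(bw)
    rq.add_running_bw(bw)
    return rq.warnings
```

In this model, start_server(DlRq(), 100, True) leaves the warning list empty, while start_server(DlRq(), 100, False) records the same message seen in the trace, because the running bandwidth grows without a matching increase in the runqueue total.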


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240412/[email protected]
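
When checking a reproduced dmesg for the same signature, a small script can pull out the WARNING lines. The following is a rough sketch in Python, assuming log lines shaped like the excerpt above ("[ seconds][ Tnnnn] message"); the function name find_warnings and the regular expression are illustrative and not part of lkp-tests.

```python
import re

# Matches lines such as "[ 185.555952][ T4639] WARNING: ...".
# Assumes a seconds timestamp followed by a task/CPU tag in brackets.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\[\s*([A-Z]\d+)\]\s+(.*)$")


def find_warnings(lines):
    """Return (timestamp, tag, message) for every line whose message starts with WARNING:."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group(3).startswith("WARNING:"):
            hits.append((float(m.group(1)), m.group(2), m.group(3)))
    return hits
```

For example, passing open("dmesg").read().splitlines() to find_warnings() would list each warning with its timestamp and task tag, which can then be compared against the task_contending entry reported here. The file name dmesg is a placeholder.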



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki