Greetings,
FYI, we noticed BUG:KASAN:stack-out-of-bounds_in_unwind_next_frame due to the following commit (built with gcc-11):
commit: ffb1b4a41016295e298409c9dbcacd55680bd6d4 ("x86/unwind/orc: Add 'signal' field to ORC metadata")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git objtool/core
[test failed on linux-next/master 9d9019bcea1aac7eed64a1a4966282b6b7b141c8]
in testcase: igt
version: igt-x86_64-d2ca8db8-1_20230211
with the following parameters:
group: gem_ctx_create
test: maximum-mem
on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tags:
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-lkp/[email protected]
[ 235.289948][ C1] BUG: KASAN: stack-out-of-bounds in unwind_next_frame (arch/x86/include/asm/ptrace.h:136 arch/x86/kernel/unwind_orc.c:455)
[ 235.297832][ C1] Read of size 8 at addr ffffc9000169f3a0 by task gem_ctx_create/601
[ 235.305714][ C1]
[ 235.307891][ C1] CPU: 1 PID: 601 Comm: gem_ctx_create Tainted: G I 6.2.0-rc2-00011-gffb1b4a41016 #1
[ 235.318536][ C1] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.1.1 10/07/2015
[ 235.326587][ C1] Call Trace:
[ 235.329714][ C1] <IRQ>
[ 235.332410][ C1] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[ 235.336750][ C1] print_address_description+0x87/0x2a1
[ 235.343163][ C1] print_report (mm/kasan/report.c:418)
[ 235.347499][ C1] ? kasan_addr_to_slab (mm/kasan/common.c:35)
[ 235.352276][ C1] ? unwind_next_frame (arch/x86/include/asm/ptrace.h:136 arch/x86/kernel/unwind_orc.c:455)
[ 235.357398][ C1] kasan_report (mm/kasan/report.c:519)
[ 235.361570][ C1] ? unwind_next_frame (arch/x86/include/asm/ptrace.h:136 arch/x86/kernel/unwind_orc.c:455)
[ 235.366694][ C1] unwind_next_frame (arch/x86/include/asm/ptrace.h:136 arch/x86/kernel/unwind_orc.c:455)
[ 235.371645][ C1] ? orc_find+0x1ed/0x330
[ 235.376424][ C1] ? orc_find+0x1ed/0x330
[ 235.381194][ C1] ? orc_find+0x1ed/0x330
[ 235.385966][ C1] ? kernel_text_address (kernel/extable.c:99)
[ 235.390909][ C1] ? orc_find+0x1ed/0x330
[ 235.395678][ C1] ? write_profile (kernel/stacktrace.c:83)
[ 235.400279][ C1] arch_stack_walk (arch/x86/kernel/stacktrace.c:24)
[ 235.404705][ C1] ? orc_find+0x1ed/0x330
[ 235.409474][ C1] stack_trace_save (kernel/stacktrace.c:123)
[ 235.413986][ C1] ? filter_irq_stacks (kernel/stacktrace.c:114)
[ 235.418757][ C1] ? perf_event_task_tick (arch/x86/include/asm/preempt.h:80 include/linux/rcupdate.h:94 include/linux/rcupdate.h:762 kernel/events/core.c:4305)
[ 235.423874][ C1] kasan_save_stack (mm/kasan/common.c:46)
[ 235.428385][ C1] ? kasan_save_stack (mm/kasan/common.c:46)
[ 235.433078][ C1] ? __kasan_record_aux_stack (mm/kasan/generic.c:488)
[ 235.438453][ C1] ? insert_work (include/linux/instrumented.h:72 include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:635 kernel/workqueue.c:642 kernel/workqueue.c:1361)
[ 235.442789][ C1] ? __queue_work (kernel/workqueue.c:1520)
[ 235.447300][ C1] ? queue_work_on (kernel/workqueue.c:1546)
[ 235.451725][ C1] ? intel_engine_add_retire (drivers/gpu/drm/i915/gt/intel_gt_requests.c:120) i915
[ 235.457918][ C1] ? __execlists_schedule_out (drivers/gpu/drm/i915/gt/intel_execlists_submission.c:613) i915
[ 235.464194][ C1] ? execlists_submission_tasklet (drivers/gpu/drm/i915/gt/intel_execlists_submission.c:660 drivers/gpu/drm/i915/gt/intel_execlists_submission.c:2045 drivers/gpu/drm/i915/gt/intel_execlists_submission.c:2476) i915
[ 235.470793][ C1] ? tasklet_action_common+0x21e/0x2b0
[ 235.477118][ C1] ? __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:572)
[ 235.481632][ C1] ? __irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650)
[ 235.486314][ C1] ? common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14))
[ 235.490999][ C1] ? asm_common_interrupt (arch/x86/include/asm/idtentry.h:640)
[ 235.496046][ C1] ? mutex_spin_on_owner (kernel/locking/mutex.c:387)
[ 235.501161][ C1] ? common_interrupt (arch/x86/kernel/irq.c:240)
[ 235.505843][ C1] ? asm_common_interrupt (arch/x86/include/asm/idtentry.h:640)
[ 235.510874][ C1] ? orc_find+0x1ed/0x330
[ 235.515645][ C1] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1700 kernel/time/hrtimer.c:1749)
[ 235.520847][ C1] ? enqueue_hrtimer (kernel/time/hrtimer.c:1719)
[ 235.525616][ C1] ? sched_clock_cpu (kernel/sched/clock.c:369)
[ 235.530299][ C1] ? clockevents_program_event (kernel/time/clockevents.c:336 (discriminator 3))
[ 235.535932][ C1] ? var_wake_function (kernel/sched/clock.c:364)
[ 235.540873][ C1] ? hrtimer_interrupt (kernel/time/hrtimer.c:1824)
[ 235.545815][ C1] __kasan_record_aux_stack (mm/kasan/generic.c:488)
[ 235.551018][ C1] insert_work (include/linux/instrumented.h:72 include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:635 kernel/workqueue.c:642 kernel/workqueue.c:1361)
[ 235.555199][ C1] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1107 (discriminator 13))
[ 235.560833][ C1] __queue_work (kernel/workqueue.c:1520)
[ 235.565172][ C1] ? check_preempt_curr (arch/x86/include/asm/bitops.h:207 arch/x86/include/asm/bitops.h:239 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/thread_info.h:118 include/linux/sched.h:2038 include/linux/sched.h:2053 kernel/sched/core.c:2185)
[ 235.570205][ C1] queue_work_on (kernel/workqueue.c:1546)
[ 235.574456][ C1] intel_engine_add_retire (drivers/gpu/drm/i915/gt/intel_gt_requests.c:120) i915
[ 235.580456][ C1] ? __i915_request_submit (arch/x86/include/asm/bitops.h:207 arch/x86/include/asm/bitops.h:239 include/asm-generic/bitops/instrumented-non-atomic.h:142 drivers/gpu/drm/i915/i915_request.c:695) i915
[ 235.586470][ C1] ? engine_retire (drivers/gpu/drm/i915/gt/intel_gt_requests.c:114) i915
[ 235.591790][ C1] __execlists_schedule_out (drivers/gpu/drm/i915/gt/intel_execlists_submission.c:613) i915
[ 235.597884][ C1] execlists_submission_tasklet (drivers/gpu/drm/i915/gt/intel_execlists_submission.c:660 drivers/gpu/drm/i915/gt/intel_execlists_submission.c:2045 drivers/gpu/drm/i915/gt/intel_execlists_submission.c:2476) i915
[ 235.604325][ C1] ? execlists_dequeue (drivers/gpu/drm/i915/gt/intel_execlists_submission.c:2422) i915
[ 235.610156][ C1] ? var_wake_function (kernel/sched/clock.c:364)
[ 235.615110][ C1] tasklet_action_common+0x21e/0x2b0
[ 235.621266][ C1] __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:572)
[ 235.625606][ C1] __irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650)
[ 235.630120][ C1] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14))
[ 235.634631][ C1] </IRQ>
[ 235.637412][ C1] <TASK>
[ 235.640194][ C1] asm_common_interrupt (arch/x86/include/asm/idtentry.h:640)
[ 235.645051][ C1] RIP: 0010:mutex_spin_on_owner (kernel/locking/mutex.c:387)
[ 235.650790][ C1] Code: 5e 00 4c 89 f8 48 c1 e8 03 80 3c 28 00 75 6c 49 8b 07 a8 01 74 d8 e9 f3 fe ff ff 48 83 c4 08 b8 01 00 00 00 5b 5d 41 5c 41 5d <41> 5e 41 5f c3 4c 89 ef e8 e1 83 5e 00 e9 2b ff ff ff e8 b7 83 5e
All code
========
0: 5e pop %rsi
1: 00 4c 89 f8 add %cl,-0x8(%rcx,%rcx,4)
5: 48 c1 e8 03 shr $0x3,%rax
9: 80 3c 28 00 cmpb $0x0,(%rax,%rbp,1)
d: 75 6c jne 0x7b
f: 49 8b 07 mov (%r15),%rax
12: a8 01 test $0x1,%al
14: 74 d8 je 0xffffffffffffffee
16: e9 f3 fe ff ff jmpq 0xffffffffffffff0e
1b: 48 83 c4 08 add $0x8,%rsp
1f: b8 01 00 00 00 mov $0x1,%eax
24: 5b pop %rbx
25: 5d pop %rbp
26: 41 5c pop %r12
28: 41 5d pop %r13
2a:* 41 5e pop %r14 <-- trapping instruction
2c: 41 5f pop %r15
2e: c3 retq
2f: 4c 89 ef mov %r13,%rdi
32: e8 e1 83 5e 00 callq 0x5e8418
37: e9 2b ff ff ff jmpq 0xffffffffffffff67
3c: e8 .byte 0xe8
3d: b7 83 mov $0x83,%bh
3f: 5e pop %rsi
Code starting with the faulting instruction
===========================================
0: 41 5e pop %r14
2: 41 5f pop %r15
4: c3 retq
5: 4c 89 ef mov %r13,%rdi
8: e8 e1 83 5e 00 callq 0x5e83ee
d: e9 2b ff ff ff jmpq 0xffffffffffffff3d
12: e8 .byte 0xe8
13: b7 83 mov $0x83,%bh
15: 5e pop %rsi
[ 235.670165][ C1] RSP: 0018:ffffc9000169f2d8 EFLAGS: 00000286
[ 235.676076][ C1] RAX: 0000000000000001 RBX: ffff8881acc1a980 RCX: ffffffff812cb45a
[ 235.683872][ C1] RDX: ffffed110b145033 RSI: 0000000000000008 RDI: ffff888858a28190
[ 235.691666][ C1] RBP: ffffc9000169f410 R08: 0000000000000000 R09: ffff888858a28197
[ 235.699462][ C1] R10: ffffed110b145032 R11: 0000000000000001 R12: ffff8881ab068000
[ 235.707253][ C1] R13: ffffed110b145032 R14: ffffed103560d000 R15: ffff888858a28190
[ 235.715050][ C1] ? mutex_spin_on_owner (arch/x86/include/asm/atomic64_64.h:22 include/linux/atomic/atomic-long.h:29 include/linux/atomic/atomic-instrumented.h:1266 kernel/locking/mutex.c:81 kernel/locking/mutex.c:359)
[ 235.720098][ C1] ? __mutex_lock+0x33a/0x1040
[ 235.725732][ C1] common_interrupt (arch/x86/kernel/irq.c:240)
[ 235.730242][ C1] asm_common_interrupt (arch/x86/include/asm/idtentry.h:640)
[ 235.735097][ C1] RIP: orc_find+0x1ed/0x330
[ 235.740471][ C1] Code: 00 48 89 fa 48 c1 ea 03 0f b6 14 1a 84 d2 74 09 80 fa 03 0f 8e ae 00 00 00 8b 90 e0 01 00 00 4c 89 e1 48 89 ef e8 23 f9 ff ff <48> 85 c0 74 0d 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e c3 4c 89 e7 e8
All code
========
0: 00 48 89 add %cl,-0x77(%rax)
3: fa cli
4: 48 c1 ea 03 shr $0x3,%rdx
8: 0f b6 14 1a movzbl (%rdx,%rbx,1),%edx
c: 84 d2 test %dl,%dl
e: 74 09 je 0x19
10: 80 fa 03 cmp $0x3,%dl
13: 0f 8e ae 00 00 00 jle 0xc7
19: 8b 90 e0 01 00 00 mov 0x1e0(%rax),%edx
1f: 4c 89 e1 mov %r12,%rcx
22: 48 89 ef mov %rbp,%rdi
25: e8 23 f9 ff ff callq 0xfffffffffffff94d
2a:* 48 85 c0 test %rax,%rax <-- trapping instruction
2d: 74 0d je 0x3c
2f: 48 83 c4 10 add $0x10,%rsp
33: 5b pop %rbx
34: 5d pop %rbp
35: 41 5c pop %r12
37: 41 5d pop %r13
39: 41 5e pop %r14
3b: c3 retq
3c: 4c 89 e7 mov %r12,%rdi
3f: e8 .byte 0xe8
Code starting with the faulting instruction
===========================================
0: 48 85 c0 test %rax,%rax
3: 74 0d je 0x12
5: 48 83 c4 10 add $0x10,%rsp
9: 5b pop %rbx
a: 5d pop %rbp
b: 41 5c pop %r12
d: 41 5d pop %r13
f: 41 5e pop %r14
11: c3 retq
12: 4c 89 e7 mov %r12,%rdi
15: e8 .byte 0xe8
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests