Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: da044747401fc16202e223c9da970ed4e84fd84d ("tasklets: Replace spin wait in tasklet_unlock_wait()")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git irq/core
in testcase: igt
version: igt-x86_64-e230cd8d-1_20210106
with the following parameters:
group: group-00
ucode: 0xe2
on test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz with 28G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <[email protected]>
kern :err : [ 37.420838] BUG: sleeping function called from invalid context at kernel/softirq.c:648
kern :err : [ 37.428697] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1430, name: i915_pm_rps
kern :warn : [ 37.436943] CPU: 4 PID: 1430 Comm: i915_pm_rps Tainted: G I 5.12.0-rc2-00009-gda044747401f #1
kern :warn : [ 37.446695] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
kern :warn : [ 37.454031] Call Trace:
kern :warn : [ 37.456459] dump_stack (kbuild/src/consumer/lib/dump_stack.c:122)
kern :warn : [ 37.459749] ___might_sleep.cold (kbuild/src/consumer/kernel/sched/core.c:8331 kbuild/src/consumer/kernel/sched/core.c:8288)
kern :warn : [ 37.463815] tasklet_unlock_wait (kbuild/src/consumer/kernel/softirq.c:648)
kern :warn : [ 37.467880] ? fw_domains_get_with_fallback (kbuild/src/consumer/drivers/gpu/drm/i915/intel_uncore.c:121 kbuild/src/consumer/drivers/gpu/drm/i915/intel_uncore.c:136 kbuild/src/consumer/drivers/gpu/drm/i915/intel_uncore.c:229 kbuild/src/consumer/drivers/gpu/drm/i915/intel_uncore.c:276) i915
kern :warn : [ 37.473656] ? intel_uncore_forcewake_get (kbuild/src/consumer/drivers/gpu/drm/i915/intel_uncore.c:644) i915
kern :warn : [ 37.479735] execlists_reset_prepare (kbuild/src/consumer/drivers/gpu/drm/i915/i915_gem.h:108 kbuild/src/consumer/drivers/gpu/drm/i915/gt/intel_execlists_submission.c:2778) i915
kern :warn : [ 37.484783] __intel_engine_reset_bh (kbuild/src/consumer/drivers/gpu/drm/i915/gt/intel_reset.c:1131) i915
kern :warn : [ 37.489931] intel_gt_handle_error (kbuild/src/consumer/drivers/gpu/drm/i915/gt/intel_reset.c:1284) i915
kern :warn : [ 37.494995] ? path_openat (kbuild/src/consumer/fs/namei.c:3499)
kern :warn : [ 37.498715] ? ttwu_do_wakeup (kbuild/src/consumer/kernel/sched/core.c:2942)
kern :warn : [ 37.502606] i915_wedged_set (kbuild/src/consumer/drivers/gpu/drm/i915/i915_debugfs.c:796 (discriminator 1)) i915
kern :warn : [ 37.506974] ? _kstrtoull (kbuild/src/consumer/lib/kstrtox.c:92)
kern :warn : [ 37.510451] simple_attr_write (kbuild/src/consumer/fs/libfs.c:985)
kern :warn : [ 37.514428] full_proxy_write (kbuild/src/consumer/fs/debugfs/file.c:234)
kern :warn : [ 37.518234] vfs_write (kbuild/src/consumer/fs/read_write.c:603)
kern :warn : [ 37.521523] ksys_write (kbuild/src/consumer/fs/read_write.c:658)
kern :warn : [ 37.524826] do_syscall_64 (kbuild/src/consumer/arch/x86/entry/common.c:46)
kern :warn : [ 37.528389] entry_SYSCALL_64_after_hwframe (kbuild/src/consumer/arch/x86/entry/entry_64.S:112)
kern :warn : [ 37.533402] RIP: 0033:0x7fcfd05a7471
kern :warn : [ 37.536949] Code: 00 00 75 05 48 83 c4 58 c3 e8 0b 4d ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 8b 05 da ef 00 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48
All code
========
0: 00 00 add %al,(%rax)
2: 75 05 jne 0x9
4: 48 83 c4 58 add $0x58,%rsp
8: c3 retq
9: e8 0b 4d ff ff callq 0xffffffffffff4d19
e: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
15: 00 00 00
18: 90 nop
19: 8b 05 da ef 00 00 mov 0xefda(%rip),%eax # 0xeff9
1f: 85 c0 test %eax,%eax
21: 75 16 jne 0x39
23: b8 01 00 00 00 mov $0x1,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 57 ja 0x89
32: c3 retq
33: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
39: 41 54 push %r12
3b: 49 89 d4 mov %rdx,%r12
3e: 55 push %rbp
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 57 ja 0x5f
8: c3 retq
9: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
f: 41 54 push %r12
11: 49 89 d4 mov %rdx,%r12
14: 55 push %rbp
15: 48 rex.W
kern :warn : [ 37.555600] RSP: 002b:00007fffc8083e88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
kern :warn : [ 37.563127] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcfd05a7471
kern :warn : [ 37.570207] RDX: 0000000000000014 RSI: 00007fffc8083ee0 RDI: 0000000000000010
kern :warn : [ 37.577286] RBP: 0000000000000014 R08: 0000000000000000 R09: 00007fffc8083ce4
kern :warn : [ 37.584366] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffc8083ee0
kern :warn : [ 37.591446] R13: 0000000000000010 R14: 00007fffc8083ee0 R15: 0000000000000000
kern :notice: [ 37.598530] i915 0000:00:02.0: [drm] Resetting rcs0 for Manually set wedged engine mask = ffffffffffffffff
user :notice: [ 37.905995] result_service: raw_upload, RESULT_MNT: /internal-lkp-server/result, RESULT_ROOT: /internal-lkp-server/result/igt/group-00-ucode=0xe2/lkp-skl-d01/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/da044747401fc16202e223c9da970ed4e84fd84d/3, TMP_RESULT_ROOT: /tmp/lkp/result
user :notice: [ 37.936455] run-job /lkp/jobs/scheduled/lkp-skl-d01/igt-group-00-ucode=0xe2-debian-10.4-x86_64-20200603.cgz-da044747401fc16202e223c9da970ed4e84fd84d-20210320-12495-93bzb0-4.yaml
user :notice: [ 39.163610] /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://internal-lkp-server:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-skl-d01/igt-group-00-ucode=0xe2-debian-10.4-x86_64-20200603.cgz-da044747401fc16202e223c9da970ed4e84fd84d-20210320-12495-93bzb0-4.yaml&job_state=running -O /dev/null
user :notice: [ 39.195287] target ucode: 0xe2
user :notice: [ 39.200681] current_version: e2, target_version: e2
user :notice: [ 39.208728] 2021-03-20 13:21:41 build/tests/gem_ctx_bad_destroy --run-subtest double-destroy
user :notice: [ 39.220183] IGT-Version: 1.25-ge230cd8d (x86_64) (Linux: 5.12.0-rc2-00009-gda044747401f x86_64)
user :notice: [ 39.230955] Starting subtest: double-destroy
user :notice: [ 39.237417] Subtest double-destroy: SUCCESS (0.000s)
user :notice: [ 39.245522] 2021-03-20 13:21:42 build/tests/gem_ctx_bad_destroy --run-subtest invalid-ctx
user :notice: [ 39.257025] IGT-Version: 1.25-ge230cd8d (x86_64) (Linux: 5.12.0-rc2-00009-gda044747401f x86_64)
user :notice: [ 39.267744] Starting subtest: invalid-ctx
user :notice: [ 39.273995] Subtest invalid-ctx: SUCCESS (0.000s)
user :notice: [ 39.281987] 2021-03-20 13:21:42 build/tests/gem_ctx_bad_destroy --run-subtest invalid-default-ctx
user :notice: [ 39.294258] IGT-Version: 1.25-ge230cd8d (x86_64) (Linux: 5.12.0-rc2-00009-gda044747401f x86_64)
user :notice: [ 39.305192] Starting subtest: invalid-default-ctx
user :notice: [ 39.312316] Subtest invalid-default-ctx: SUCCESS (0.000s)
user :notice: [ 39.320626] 2021-03-20 13:21:42 build/tests/gem_ctx_bad_destroy --run-subtest invalid-pad
user :notice: [ 39.332021] IGT-Version: 1.25-ge230cd8d (x86_64) (Linux: 5.12.0-rc2-00009-gda044747401f x86_64)
user :notice: [ 39.342723] Starting subtest: invalid-pad
user :notice: [ 39.348961] Subtest invalid-pad: SUCCESS (0.000s)
user :notice: [ 39.356751] 2021-03-20 13:21:42 build/tests/gem_exec_async --run-subtest concurrent-writes
user :notice: [ 39.368011] IGT-Version: 1.25-ge230cd8d (x86_64) (Linux: 5.12.0-rc2-00009-gda044747401f x86_64)
user :notice: [ 39.378906] Starting subtest: concurrent-writes
user :notice: [ 39.385445] Starting dynamic subtest: rcs0
user :notice: [ 39.391691] Dynamic subtest rcs0: SUCCESS (0.001s)
user :notice: [ 39.398575] Starting dynamic subtest: bcs0
user :notice: [ 39.404907] Dynamic subtest bcs0: SUCCESS (0.000s)
user :notice: [ 39.411731] Starting dynamic subtest: vcs0
user :notice: [ 39.418083] Dynamic subtest vcs0: SUCCESS (0.000s)
user :notice: [ 39.424920] Starting dynamic subtest: vecs0
user :notice: [ 39.431389] Dynamic subtest vecs0: SUCCESS (0.000s)
user :notice: [ 39.438633] Subtest concurrent-writes: SUCCESS (0.026s)
user :notice: [ 39.446812] 2021-03-20 13:21:42 build/tests/gem_exec_async --run-subtest forked-writes
user :notice: [ 39.457885] IGT-Version: 1.25-ge230cd8d (x86_64) (Linux: 5.12.0-rc2-00009-gda044747401f x86_64)
user :notice: [ 39.468582] Starting subtest: forked-writes
user :notice: [ 39.474762] Starting dynamic subtest: rcs0
user :notice: [ 39.481139] Dynamic subtest rcs0: SUCCESS (0.002s)
user :notice: [ 39.487990] Starting dynamic subtest: bcs0
user :notice: [ 39.494348] Dynamic subtest bcs0: SUCCESS (0.002s)
user :notice: [ 39.501205] Starting dynamic subtest: vcs0
user :notice: [ 39.507559] Dynamic subtest vcs0: SUCCESS (0.002s)
user :notice: [ 39.514436] Starting dynamic subtest: vecs0
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation
Thanks,
Oliver Sang
The i915 driver has its own tasklet interface which was overlooked in the
tasklet rework. __tasklet_disable_sync_once() is a wrapper around
tasklet_unlock_wait(). tasklet_unlock_wait() might sleep, but the i915
wrappers invoke it from non-preemptible contexts with bottom halves disabled.
Use tasklet_unlock_spin_wait() instead, which can be invoked from
non-preemptible contexts.
Fixes: da044747401fc ("tasklets: Replace spin wait in tasklet_unlock_wait()")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
drivers/gpu/drm/i915/i915_gem.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index e622aee6e4be9..440c35f1abc9e 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -105,7 +105,7 @@ static inline bool tasklet_is_locked(const struct tasklet_struct *t)
 static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
 {
 	if (!atomic_fetch_inc(&t->count))
-		tasklet_unlock_wait(t);
+		tasklet_unlock_spin_wait(t);
 }
 
 static inline bool __tasklet_is_enabled(const struct tasklet_struct *t)
--
2.31.0
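
For readers who want the shape of the change without the kernel tree at
hand, here is a minimal userspace sketch of a "disable once, then spin
until the callback has finished" helper. It is not the kernel
implementation: struct fake_tasklet, FAKE_STATE_RUN and the fake_*()
functions are hypothetical stand-ins built on C11 atomics, and the demo
is single-threaded, so it only exercises the case where the RUN bit is
already clear. The point it illustrates is that the wait is a plain busy
loop with nothing in it that could sleep.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the kernel's struct tasklet_struct. */
struct fake_tasklet {
	atomic_ulong state;	/* FAKE_STATE_RUN is set while the callback runs */
	atomic_int count;	/* non-zero means "disabled" */
};

/* Assumed bit position, chosen for this sketch only. */
#define FAKE_STATE_RUN	(1UL << 1)

/* Busy-wait until the RUN bit is clear; never sleeps or reschedules. */
static inline void fake_unlock_spin_wait(struct fake_tasklet *t)
{
	while (atomic_load_explicit(&t->state, memory_order_acquire) & FAKE_STATE_RUN)
		;	/* a real implementation would relax the CPU here */
}

/* Same shape as the i915 wrapper: only the first disabler waits. */
static inline void fake_disable_sync_once(struct fake_tasklet *t)
{
	if (!atomic_fetch_add(&t->count, 1))
		fake_unlock_spin_wait(t);
}

/* Single-threaded demo; returns the final disable count. */
static inline int fake_demo(void)
{
	struct fake_tasklet t;

	atomic_init(&t.state, 0);
	atomic_init(&t.count, 0);

	/* With no second thread, a set RUN bit would make the loop spin forever. */
	assert(!(atomic_load(&t.state) & FAKE_STATE_RUN));

	fake_disable_sync_once(&t);	/* count 0 -> 1, performs the wait */
	fake_disable_sync_once(&t);	/* count 1 -> 2, skips the wait */
	return atomic_load(&t.count);
}
```

Calling fake_demo() from a test program returns the disable count after
two nested disables. In the kernel case another CPU would clear the RUN
bit while the caller spins, which is why the spinning variant is usable
with bottom halves disabled.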
The following commit has been merged into the irq/core branch of tip:
Commit-ID: 6e457914935a3161eeb74e319abf9fd511aa1e4d
Gitweb: https://git.kernel.org/tip/6e457914935a3161eeb74e319abf9fd511aa1e4d
Author: Sebastian Andrzej Siewior <[email protected]>
AuthorDate: Tue, 23 Mar 2021 10:22:21 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Thu, 25 Mar 2021 18:21:03 +01:00
drm/i915: Use tasklet_unlock_spin_wait() in __tasklet_disable_sync_once()
The i915 driver has its own tasklet interface which was overlooked in the
tasklet rework. __tasklet_disable_sync_once() is a wrapper around
tasklet_unlock_wait(). tasklet_unlock_wait() might sleep, but the i915
wrappers invoke it from non-preemptible contexts with bottom halves disabled.
Use tasklet_unlock_spin_wait() instead, which can be invoked from
non-preemptible contexts.
Fixes: da044747401fc ("tasklets: Replace spin wait in tasklet_unlock_wait()")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/20210323092221.awq7g5b2muzypjw3@flow
---
drivers/gpu/drm/i915/i915_gem.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index e622aee..440c35f 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -105,7 +105,7 @@ static inline bool tasklet_is_locked(const struct tasklet_struct *t)
 static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
 {
 	if (!atomic_fetch_inc(&t->count))
-		tasklet_unlock_wait(t);
+		tasklet_unlock_spin_wait(t);
 }
 
 static inline bool __tasklet_is_enabled(const struct tasklet_struct *t)