2019-02-02 18:37:37

by Mikhail Gavrilov

[permalink] [raw]
Subject: [5.0.0 rc3 BUG] possible irq lock inversion dependency detected

Hi folks.
I at last year already caught bug with same signature at kvm subsystem
in 4.18.0 rc8 kernel. It happened when I started virtual machine. And
then Thomas was fixed this bug [1].
But now it happened when I changed DPI in ProtocaseDesigner (this is
java application which helps design computer cases). And it not
harmless issue because application stuck and I could terminate
ProtocaseDesigner only with `kill -9`.

Can anyone tell me which mailing list is more relevant to this bug? I
suppose this can be linux-smp or may be linux-x86_64.


[ 3534.478386] ========================================================
[ 3534.478387] WARNING: possible irq lock inversion dependency detected
[ 3534.478390] 5.0.0-0.rc4.git3.2.fc30.x86_64 #1 Tainted: G C
[ 3534.478392] --------------------------------------------------------
[ 3534.478393] java/13935 just changed the state of lock:
[ 3534.478395] 000000007eda2fa7 (&st->lock){+...}, at:
speculation_ctrl_update+0x164/0x1d0
[ 3534.478401] but this lock was taken by another, HARDIRQ-safe lock
in the past:
[ 3534.478402] (&(&sighand->siglock)->rlock){-.-.}
[ 3534.478403]

and interrupts could create inverse lock ordering between them.

[ 3534.478405]
other info that might help us debug this:
[ 3534.478406] Possible interrupt unsafe locking scenario:

[ 3534.478407] CPU0 CPU1
[ 3534.478408] ---- ----
[ 3534.478408] lock(&st->lock);
[ 3534.478410] local_irq_disable();
[ 3534.478411] lock(&(&sighand->siglock)->rlock);
[ 3534.478412] lock(&st->lock);
[ 3534.478413] <Interrupt>
[ 3534.478414] lock(&(&sighand->siglock)->rlock);
[ 3534.478415]
*** DEADLOCK ***

[ 3534.478417] no locks held by java/13935.
[ 3534.478417]
the shortest dependencies between 2nd lock and 1st lock:
[ 3534.478420] -> (&(&sighand->siglock)->rlock){-.-.} {
[ 3534.478422] IN-HARDIRQ-W at:
[ 3534.478426] _raw_spin_lock_irqsave+0x49/0x83
[ 3534.478429] __lock_task_sighand+0x85/0x180
[ 3534.478431] do_send_sig_info+0x35/0x90
[ 3534.478432] kill_pid_info+0x83/0x150
[ 3534.478435] it_real_fn+0x39/0xf0
[ 3534.478437] __hrtimer_run_queues+0x110/0x450
[ 3534.478439] hrtimer_interrupt+0x10e/0x240
[ 3534.478441] smp_apic_timer_interrupt+0x83/0x230
[ 3534.478443] apic_timer_interrupt+0xf/0x20
[ 3534.478446] cpuidle_enter_state+0xbf/0x470
[ 3534.478448] do_idle+0x1ec/0x280
[ 3534.478450] cpu_startup_entry+0x19/0x20
[ 3534.478452] start_secondary+0x1b3/0x200
[ 3534.478455] secondary_startup_64+0xa4/0xb0
[ 3534.478456] IN-SOFTIRQ-W at:
[ 3534.478458] _raw_spin_lock_irqsave+0x49/0x83
[ 3534.478460] __lock_task_sighand+0x85/0x180
[ 3534.478461] do_send_sig_info+0x35/0x90
[ 3534.478463] kill_pid_info+0x83/0x150
[ 3534.478465] it_real_fn+0x39/0xf0
[ 3534.478466] __hrtimer_run_queues+0x110/0x450
[ 3534.478468] hrtimer_interrupt+0x10e/0x240
[ 3534.478470] smp_apic_timer_interrupt+0x83/0x230
[ 3534.478472] apic_timer_interrupt+0xf/0x20
[ 3534.478474] _raw_spin_unlock_irqrestore+0x50/0x60
[ 3534.478476] run_rebalance_domains+0x70/0xa0
[ 3534.478478] __do_softirq+0xe2/0x46a
[ 3534.478480] irq_exit+0x119/0x120
[ 3534.478482] smp_apic_timer_interrupt+0xa4/0x230
[ 3534.478484] apic_timer_interrupt+0xf/0x20
[ 3534.478486] cpuidle_enter_state+0xbf/0x470
[ 3534.478488] do_idle+0x1ec/0x280
[ 3534.478489] cpu_startup_entry+0x19/0x20
[ 3534.478492] start_kernel+0x52a/0x548
[ 3534.478494] secondary_startup_64+0xa4/0xb0
[ 3534.478495] INITIAL USE at:
[ 3534.478497] _raw_spin_lock_irq+0x3d/0x80
[ 3534.478498] calculate_sigpending+0x22/0x40
[ 3534.478500] ret_from_fork+0x8/0x50
[ 3534.478501] }
[ 3534.478504] ... key at: [<ffffffff8e257290>] __key.68419+0x0/0x8
[ 3534.478505] ... acquired at:
[ 3534.478507] speculation_ctrl_update+0x164/0x1d0
[ 3534.478509] speculation_ctrl_update_current+0x22/0x30
[ 3534.478511] ssb_prctl_set.part.0+0x1a/0x20
[ 3534.478513] arch_seccomp_spec_mitigate+0x25/0x50
[ 3534.478515] do_seccomp+0x8f1/0x960
[ 3534.478517] do_syscall_64+0x60/0x1e0
[ 3534.478519] entry_SYSCALL_64_after_hwframe+0x49/0xbe

[ 3534.478521] -> (&st->lock){+...} {
[ 3534.478523] HARDIRQ-ON-W at:
[ 3534.478525] _raw_spin_lock+0x31/0x80
[ 3534.478526] speculation_ctrl_update+0x164/0x1d0
[ 3534.478528] speculation_ctrl_update_current+0x22/0x30
[ 3534.478530] arch_prctl_spec_ctrl_set+0xe5/0x110
[ 3534.478532] __x64_sys_prctl+0xce/0x53c
[ 3534.478533] do_syscall_64+0x60/0x1e0
[ 3534.478535] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3534.478536] INITIAL USE at:
[ 3534.478538] _raw_spin_lock+0x31/0x80
[ 3534.478540] speculation_ctrl_update+0x164/0x1d0
[ 3534.478542] speculation_ctrl_update_current+0x22/0x30
[ 3534.478543] ssb_prctl_set.part.0+0x1a/0x20
[ 3534.478545] arch_seccomp_spec_mitigate+0x25/0x50
[ 3534.478547] do_seccomp+0x8f1/0x960
[ 3534.478549] do_syscall_64+0x60/0x1e0
[ 3534.478551] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3534.478552] }
[ 3534.478554] ... key at: [<ffffffff8e1f69c8>] __key.60805+0x0/0x8
[ 3534.478555] ... acquired at:
[ 3534.478558] __lock_acquire+0x544/0x1660
[ 3534.478559] lock_acquire+0xa2/0x1b0
[ 3534.478561] _raw_spin_lock+0x31/0x80
[ 3534.478563] speculation_ctrl_update+0x164/0x1d0
[ 3534.478564] speculation_ctrl_update_current+0x22/0x30
[ 3534.478566] arch_prctl_spec_ctrl_set+0xe5/0x110
[ 3534.478567] __x64_sys_prctl+0xce/0x53c
[ 3534.478569] do_syscall_64+0x60/0x1e0
[ 3534.478571] entry_SYSCALL_64_after_hwframe+0x49/0xbe

[ 3534.478572]
stack backtrace:
[ 3534.478575] CPU: 10 PID: 13935 Comm: java Tainted: G C
5.0.0-0.rc4.git3.2.fc30.x86_64 #1
[ 3534.478577] Hardware name: System manufacturer System Product
Name/ROG STRIX X470-I GAMING, BIOS 1103 11/16/2018
[ 3534.478578] Call Trace:
[ 3534.478582] dump_stack+0x85/0xc0
[ 3534.478584] check_usage_backwards.cold+0x1d/0x26
[ 3534.478588] mark_lock+0x188/0x600
[ 3534.478590] ? print_shortest_lock_dependencies+0x40/0x40
[ 3534.478592] __lock_acquire+0x544/0x1660
[ 3534.478595] ? sched_clock+0x5/0x10
[ 3534.478597] ? sched_clock_cpu+0xc/0xc0
[ 3534.478600] lock_acquire+0xa2/0x1b0
[ 3534.478602] ? speculation_ctrl_update+0x164/0x1d0
[ 3534.478604] _raw_spin_lock+0x31/0x80
[ 3534.478606] ? speculation_ctrl_update+0x164/0x1d0
[ 3534.478608] speculation_ctrl_update+0x164/0x1d0
[ 3534.478610] speculation_ctrl_update_current+0x22/0x30
[ 3534.478612] arch_prctl_spec_ctrl_set+0xe5/0x110
[ 3534.478614] __x64_sys_prctl+0xce/0x53c
[ 3534.478616] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 3534.478619] do_syscall_64+0x60/0x1e0
[ 3534.478621] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3534.478623] RIP: 0033:0x7fb593d6b25e
[ 3534.478625] Code: 48 8b 0d 2d 3c 0c 00 f7 d8 64 89 01 48 83 c8 ff
c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 9d 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fa 3b 0c 00 f7 d8 64 89
01 48
[ 3534.478627] RSP: 002b:00007fff7fe07f78 EFLAGS: 00000246 ORIG_RAX:
000000000000009d
[ 3534.478629] RAX: ffffffffffffffda RBX: 00007fb593b2f088 RCX: 00007fb593d6b25e
[ 3534.478630] RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000035
[ 3534.478632] RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000000
[ 3534.478633] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff7fe0c688
[ 3534.478634] R13: 00007fff7fe0c6b8 R14: 00007fb593b2f158 R15: 0000000000000000

$ eu-addr2line -e
/lib/debug/lib/modules/5.0.0-0.rc4.git3.2.fc30.x86_64/vmlinux
speculation_ctrl_update+0x164
arch/x86/kernel/process.c:371:10

Thanks.

[1] https://www.spinics.net/lists/kvm/msg173197.html

--
Best Regards,
Mike Gavrilov.


Attachments:
dmesg.txt (138.59 kB)

2019-04-14 17:34:08

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: [5.0.0 rc3 BUG] possible irq lock inversion dependency detected

Hi!

This bug is still present in 5.1 rc4
Why no one answered me?
Maybe I wrote something wrong?
Please help me.


$ /usr/src/kernels/`uname -r`/scripts/faddr2line
/lib/debug/lib/modules/`uname -r`/vmlinux
speculation_ctrl_update+0x164
speculation_ctrl_update+0x164/0x1d0:
amd_set_core_ssb_state at arch/x86/kernel/process.c:383
(inlined by) __speculation_ctrl_update at arch/x86/kernel/process.c:439
(inlined by) speculation_ctrl_update at arch/x86/kernel/process.c:482


--
Best Regards,
Mike Gavrilov.


On Sat, 2 Feb 2019 at 23:33, Mikhail Gavrilov
<[email protected]> wrote:
>
> Hi folks.
> I at last year already caught bug with same signature at kvm subsystem
> in 4.18.0 rc8 kernel. It happened when I started virtual machine. And
> then Thomas was fixed this bug [1].
> But now it happened when I changed DPI in ProtocaseDesigner (this is
> java application which helps design computer cases). And it not
> harmless issue because application stuck and I could terminate
> ProtocaseDesigner only with `kill -9`.
>
> Can anyone tell me which mailing list is more relevant to this bug? I
> suppose this can be linux-smp or may be linux-x86_64.
>
>
> [ 3534.478386] ========================================================
> [ 3534.478387] WARNING: possible irq lock inversion dependency detected
> [ 3534.478390] 5.0.0-0.rc4.git3.2.fc30.x86_64 #1 Tainted: G C
> [ 3534.478392] --------------------------------------------------------
> [ 3534.478393] java/13935 just changed the state of lock:
> [ 3534.478395] 000000007eda2fa7 (&st->lock){+...}, at:
> speculation_ctrl_update+0x164/0x1d0
> [ 3534.478401] but this lock was taken by another, HARDIRQ-safe lock
> in the past:
> [ 3534.478402] (&(&sighand->siglock)->rlock){-.-.}
> [ 3534.478403]
>
> and interrupts could create inverse lock ordering between them.
>
> [ 3534.478405]
> other info that might help us debug this:
> [ 3534.478406] Possible interrupt unsafe locking scenario:
>
> [ 3534.478407] CPU0 CPU1
> [ 3534.478408] ---- ----
> [ 3534.478408] lock(&st->lock);
> [ 3534.478410] local_irq_disable();
> [ 3534.478411] lock(&(&sighand->siglock)->rlock);
> [ 3534.478412] lock(&st->lock);
> [ 3534.478413] <Interrupt>
> [ 3534.478414] lock(&(&sighand->siglock)->rlock);
> [ 3534.478415]
> *** DEADLOCK ***
>
> [ 3534.478417] no locks held by java/13935.
> [ 3534.478417]
> the shortest dependencies between 2nd lock and 1st lock:
> [ 3534.478420] -> (&(&sighand->siglock)->rlock){-.-.} {
> [ 3534.478422] IN-HARDIRQ-W at:
> [ 3534.478426] _raw_spin_lock_irqsave+0x49/0x83
> [ 3534.478429] __lock_task_sighand+0x85/0x180
> [ 3534.478431] do_send_sig_info+0x35/0x90
> [ 3534.478432] kill_pid_info+0x83/0x150
> [ 3534.478435] it_real_fn+0x39/0xf0
> [ 3534.478437] __hrtimer_run_queues+0x110/0x450
> [ 3534.478439] hrtimer_interrupt+0x10e/0x240
> [ 3534.478441] smp_apic_timer_interrupt+0x83/0x230
> [ 3534.478443] apic_timer_interrupt+0xf/0x20
> [ 3534.478446] cpuidle_enter_state+0xbf/0x470
> [ 3534.478448] do_idle+0x1ec/0x280
> [ 3534.478450] cpu_startup_entry+0x19/0x20
> [ 3534.478452] start_secondary+0x1b3/0x200
> [ 3534.478455] secondary_startup_64+0xa4/0xb0
> [ 3534.478456] IN-SOFTIRQ-W at:
> [ 3534.478458] _raw_spin_lock_irqsave+0x49/0x83
> [ 3534.478460] __lock_task_sighand+0x85/0x180
> [ 3534.478461] do_send_sig_info+0x35/0x90
> [ 3534.478463] kill_pid_info+0x83/0x150
> [ 3534.478465] it_real_fn+0x39/0xf0
> [ 3534.478466] __hrtimer_run_queues+0x110/0x450
> [ 3534.478468] hrtimer_interrupt+0x10e/0x240
> [ 3534.478470] smp_apic_timer_interrupt+0x83/0x230
> [ 3534.478472] apic_timer_interrupt+0xf/0x20
> [ 3534.478474] _raw_spin_unlock_irqrestore+0x50/0x60
> [ 3534.478476] run_rebalance_domains+0x70/0xa0
> [ 3534.478478] __do_softirq+0xe2/0x46a
> [ 3534.478480] irq_exit+0x119/0x120
> [ 3534.478482] smp_apic_timer_interrupt+0xa4/0x230
> [ 3534.478484] apic_timer_interrupt+0xf/0x20
> [ 3534.478486] cpuidle_enter_state+0xbf/0x470
> [ 3534.478488] do_idle+0x1ec/0x280
> [ 3534.478489] cpu_startup_entry+0x19/0x20
> [ 3534.478492] start_kernel+0x52a/0x548
> [ 3534.478494] secondary_startup_64+0xa4/0xb0
> [ 3534.478495] INITIAL USE at:
> [ 3534.478497] _raw_spin_lock_irq+0x3d/0x80
> [ 3534.478498] calculate_sigpending+0x22/0x40
> [ 3534.478500] ret_from_fork+0x8/0x50
> [ 3534.478501] }
> [ 3534.478504] ... key at: [<ffffffff8e257290>] __key.68419+0x0/0x8
> [ 3534.478505] ... acquired at:
> [ 3534.478507] speculation_ctrl_update+0x164/0x1d0
> [ 3534.478509] speculation_ctrl_update_current+0x22/0x30
> [ 3534.478511] ssb_prctl_set.part.0+0x1a/0x20
> [ 3534.478513] arch_seccomp_spec_mitigate+0x25/0x50
> [ 3534.478515] do_seccomp+0x8f1/0x960
> [ 3534.478517] do_syscall_64+0x60/0x1e0
> [ 3534.478519] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> [ 3534.478521] -> (&st->lock){+...} {
> [ 3534.478523] HARDIRQ-ON-W at:
> [ 3534.478525] _raw_spin_lock+0x31/0x80
> [ 3534.478526] speculation_ctrl_update+0x164/0x1d0
> [ 3534.478528] speculation_ctrl_update_current+0x22/0x30
> [ 3534.478530] arch_prctl_spec_ctrl_set+0xe5/0x110
> [ 3534.478532] __x64_sys_prctl+0xce/0x53c
> [ 3534.478533] do_syscall_64+0x60/0x1e0
> [ 3534.478535] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3534.478536] INITIAL USE at:
> [ 3534.478538] _raw_spin_lock+0x31/0x80
> [ 3534.478540] speculation_ctrl_update+0x164/0x1d0
> [ 3534.478542] speculation_ctrl_update_current+0x22/0x30
> [ 3534.478543] ssb_prctl_set.part.0+0x1a/0x20
> [ 3534.478545] arch_seccomp_spec_mitigate+0x25/0x50
> [ 3534.478547] do_seccomp+0x8f1/0x960
> [ 3534.478549] do_syscall_64+0x60/0x1e0
> [ 3534.478551] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3534.478552] }
> [ 3534.478554] ... key at: [<ffffffff8e1f69c8>] __key.60805+0x0/0x8
> [ 3534.478555] ... acquired at:
> [ 3534.478558] __lock_acquire+0x544/0x1660
> [ 3534.478559] lock_acquire+0xa2/0x1b0
> [ 3534.478561] _raw_spin_lock+0x31/0x80
> [ 3534.478563] speculation_ctrl_update+0x164/0x1d0
> [ 3534.478564] speculation_ctrl_update_current+0x22/0x30
> [ 3534.478566] arch_prctl_spec_ctrl_set+0xe5/0x110
> [ 3534.478567] __x64_sys_prctl+0xce/0x53c
> [ 3534.478569] do_syscall_64+0x60/0x1e0
> [ 3534.478571] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> [ 3534.478572]
> stack backtrace:
> [ 3534.478575] CPU: 10 PID: 13935 Comm: java Tainted: G C
> 5.0.0-0.rc4.git3.2.fc30.x86_64 #1
> [ 3534.478577] Hardware name: System manufacturer System Product
> Name/ROG STRIX X470-I GAMING, BIOS 1103 11/16/2018
> [ 3534.478578] Call Trace:
> [ 3534.478582] dump_stack+0x85/0xc0
> [ 3534.478584] check_usage_backwards.cold+0x1d/0x26
> [ 3534.478588] mark_lock+0x188/0x600
> [ 3534.478590] ? print_shortest_lock_dependencies+0x40/0x40
> [ 3534.478592] __lock_acquire+0x544/0x1660
> [ 3534.478595] ? sched_clock+0x5/0x10
> [ 3534.478597] ? sched_clock_cpu+0xc/0xc0
> [ 3534.478600] lock_acquire+0xa2/0x1b0
> [ 3534.478602] ? speculation_ctrl_update+0x164/0x1d0
> [ 3534.478604] _raw_spin_lock+0x31/0x80
> [ 3534.478606] ? speculation_ctrl_update+0x164/0x1d0
> [ 3534.478608] speculation_ctrl_update+0x164/0x1d0
> [ 3534.478610] speculation_ctrl_update_current+0x22/0x30
> [ 3534.478612] arch_prctl_spec_ctrl_set+0xe5/0x110
> [ 3534.478614] __x64_sys_prctl+0xce/0x53c
> [ 3534.478616] ? trace_hardirqs_off_thunk+0x1a/0x1c
> [ 3534.478619] do_syscall_64+0x60/0x1e0
> [ 3534.478621] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [ 3534.478623] RIP: 0033:0x7fb593d6b25e
> [ 3534.478625] Code: 48 8b 0d 2d 3c 0c 00 f7 d8 64 89 01 48 83 c8 ff
> c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 9d 00 00
> 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fa 3b 0c 00 f7 d8 64 89
> 01 48
> [ 3534.478627] RSP: 002b:00007fff7fe07f78 EFLAGS: 00000246 ORIG_RAX:
> 000000000000009d
> [ 3534.478629] RAX: ffffffffffffffda RBX: 00007fb593b2f088 RCX: 00007fb593d6b25e
> [ 3534.478630] RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000035
> [ 3534.478632] RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000000
> [ 3534.478633] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff7fe0c688
> [ 3534.478634] R13: 00007fff7fe0c6b8 R14: 00007fb593b2f158 R15: 0000000000000000
>
> $ eu-addr2line -e
> /lib/debug/lib/modules/5.0.0-0.rc4.git3.2.fc30.x86_64/vmlinux
> speculation_ctrl_update+0x164
> arch/x86/kernel/process.c:371:10
>
> Thanks.
>
> [1] https://www.spinics.net/lists/kvm/msg173197.html
>
> --
> Best Regards,
> Mike Gavrilov.

2019-04-14 17:52:34

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [5.0.0 rc3 BUG] possible irq lock inversion dependency detected

Hi,

On Sun, 14 Apr 2019, Mikhail Gavrilov wrote:

> This bug is still present in 5.1 rc4
> Why no one answered me?

Because mails fall through the cracks occasionally.

Does the patch below cure your problem?

Thanks,

tglx

8<------------------
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -426,6 +426,8 @@ static __always_inline void __speculatio
u64 msr = x86_spec_ctrl_base;
bool updmsr = false;

+ lockdep_assert_irqs_disabled();
+
/*
* If TIF_SSBD is different, select the proper mitigation
* method. Note that if SSBD mitigation is disabled or permanentely
@@ -477,10 +479,12 @@ static unsigned long speculation_ctrl_up

void speculation_ctrl_update(unsigned long tif)
{
+ unsigned long flags;
+
/* Forced update. Make sure all relevant TIF flags are different */
- preempt_disable();
+ local_irq_save(flags);
__speculation_ctrl_update(~tif, tif);
- preempt_enable();
+ local_irq_restore(flags);
}

/* Called from seccomp/prctl update */

2019-04-14 20:09:45

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: [5.0.0 rc3 BUG] possible irq lock inversion dependency detected

On Sun, 14 Apr 2019 at 22:51, Thomas Gleixner <[email protected]> wrote:
> Because mails fall through the cracks occasionally.
>
> Does the patch below cure your problem?
>
> Thanks,
>
> tglx
>
> 8<------------------
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -426,6 +426,8 @@ static __always_inline void __speculatio
> u64 msr = x86_spec_ctrl_base;
> bool updmsr = false;
>
> + lockdep_assert_irqs_disabled();
> +
> /*
> * If TIF_SSBD is different, select the proper mitigation
> * method. Note that if SSBD mitigation is disabled or permanentely
> @@ -477,10 +479,12 @@ static unsigned long speculation_ctrl_up
>
> void speculation_ctrl_update(unsigned long tif)
> {
> + unsigned long flags;
> +
> /* Forced update. Make sure all relevant TIF flags are different */
> - preempt_disable();
> + local_irq_save(flags);
> __speculation_ctrl_update(~tif, tif);
> - preempt_enable();
> + local_irq_restore(flags);
> }
>
> /* Called from seccomp/prctl update */

Thanks, with this patch problem was gone.
We have time land it in 5.1?

--
Best Regards,
Mike Gavrilov.

Subject: [tip:x86/urgent] x86/speculation: Prevent deadlock on ssb_state::lock

Commit-ID: 2f5fb19341883bb6e37da351bc3700489d8506a7
Gitweb: https://git.kernel.org/tip/2f5fb19341883bb6e37da351bc3700489d8506a7
Author: Thomas Gleixner <[email protected]>
AuthorDate: Sun, 14 Apr 2019 19:51:06 +0200
Committer: Thomas Gleixner <[email protected]>
CommitDate: Sun, 14 Apr 2019 23:05:52 +0200

x86/speculation: Prevent deadlock on ssb_state::lock

Mikhail reported a lockdep splat related to the AMD specific ssb_state
lock:

CPU0 CPU1
lock(&st->lock);
local_irq_disable();
lock(&(&sighand->siglock)->rlock);
lock(&st->lock);
<Interrupt>
lock(&(&sighand->siglock)->rlock);

*** DEADLOCK ***

The connection between sighand->siglock and st->lock comes through seccomp,
which takes st->lock while holding sighand->siglock.

Make sure interrupts are disabled when __speculation_ctrl_update() is
invoked via prctl() -> speculation_ctrl_update(). Add a lockdep assert to
catch future offenders.

Fixes: 1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD")
Reported-by: Mikhail Gavrilov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Mikhail Gavrilov <[email protected]>
Cc: Thomas Lendacky <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]

---
arch/x86/kernel/process.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 58ac7be52c7a..957eae13b370 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -426,6 +426,8 @@ static __always_inline void __speculation_ctrl_update(unsigned long tifp,
u64 msr = x86_spec_ctrl_base;
bool updmsr = false;

+ lockdep_assert_irqs_disabled();
+
/*
* If TIF_SSBD is different, select the proper mitigation
* method. Note that if SSBD mitigation is disabled or permanentely
@@ -477,10 +479,12 @@ static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk)

void speculation_ctrl_update(unsigned long tif)
{
+ unsigned long flags;
+
/* Forced update. Make sure all relevant TIF flags are different */
- preempt_disable();
+ local_irq_save(flags);
__speculation_ctrl_update(~tif, tif);
- preempt_enable();
+ local_irq_restore(flags);
}

/* Called from seccomp/prctl update */

2019-04-18 16:54:13

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: [5.0.0 rc3 BUG] possible irq lock inversion dependency detected

On Mon, 15 Apr 2019 at 01:07, Mikhail Gavrilov
<[email protected]> wrote:
>
>
> Thanks, with this patch problem was gone.
> We have time land it in 5.1?
>

I received automated email with follow content:

> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: 1f50ddb4f418 x86/speculation: Handle HT correctly on AMD.
>
> The bot has tested the following trees: v5.0.7, v4.19.34, v4.14.111, v4.9.168, v4.4.178.
>
> v5.0.7: Build OK!
> v4.19.34: Build OK!
> v4.14.111: Build failed! Errors:
> arch/x86/kernel/process.c:417:2: error: implicit declaration of function ‘lockdep_assert_irqs_disabled’; did you mean ‘lockdep_assert_cpus_held’? [-Werror=implicit-function-declaration]
>
> v4.9.168: Failed to apply! Possible dependencies:
> 01daf56875ee ("x86/speculation: Reorganize speculation control MSRs update")
> 26c4d75b2340 ("x86/speculation: Rename SSBD update functions")
> 5bfbe3ad5840 ("x86/speculation: Prepare for per task indirect branch speculation control")
>
> v4.4.178: Failed to apply! Possible dependencies:
> 01daf56875ee ("x86/speculation: Reorganize speculation control MSRs update")
> 26c4d75b2340 ("x86/speculation: Rename SSBD update functions")
> 5bfbe3ad5840 ("x86/speculation: Prepare for per task indirect branch speculation control")
> cc69b3498921 ("x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}")
>
>
> How should we proceed with this patch?

So I wonder is any further actions are required for merging patch to mainline?

--
Best Regards,
Mike Gavrilov.

2019-04-18 19:19:24

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [5.0.0 rc3 BUG] possible irq lock inversion dependency detected

On Thu, 18 Apr 2019, Mikhail Gavrilov wrote:
> On Mon, 15 Apr 2019 at 01:07, Mikhail Gavrilov
> <[email protected]> wrote:
> > How should we proceed with this patch?
>
> So I wonder is any further actions are required for merging patch to mainline?

You also received automated mail that it is queued in the x86/urgent branch
which is targeted for 5.1-rc6. So nothing to do for you.

Thanks,

tglx