Possible recursive locking is detected by lockdep when SMC
falls back to TCP. The corresponding warnings are as follows:

 ============================================
 WARNING: possible recursive locking detected
 5.16.0-rc1+ #18 Tainted: G E
 --------------------------------------------
 wrk/1391 is trying to acquire lock:
 ffff975246c8e7d8 (&ei->socket.wq.wait){..-.}-{3:3}, at: smc_switch_to_fallback+0x109/0x250 [smc]

 but task is already holding lock:
 ffff975246c8f918 (&ei->socket.wq.wait){..-.}-{3:3}, at: smc_switch_to_fallback+0xfe/0x250 [smc]

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&ei->socket.wq.wait);
   lock(&ei->socket.wq.wait);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 2 locks held by wrk/1391:
  #0: ffff975246040130 (sk_lock-AF_SMC){+.+.}-{0:0}, at: smc_connect+0x43/0x150 [smc]
  #1: ffff975246c8f918 (&ei->socket.wq.wait){..-.}-{3:3}, at: smc_switch_to_fallback+0xfe/0x250 [smc]

 stack backtrace:
 Call Trace:
  <TASK>
  dump_stack_lvl+0x56/0x7b
  __lock_acquire+0x951/0x11f0
  lock_acquire+0x27a/0x320
  ? smc_switch_to_fallback+0x109/0x250 [smc]
  ? smc_switch_to_fallback+0xfe/0x250 [smc]
  _raw_spin_lock_irq+0x3b/0x80
  ? smc_switch_to_fallback+0x109/0x250 [smc]
  smc_switch_to_fallback+0x109/0x250 [smc]
  smc_connect_fallback+0xe/0x30 [smc]
  __smc_connect+0xcf/0x1090 [smc]
  ? mark_held_locks+0x61/0x80
  ? __local_bh_enable_ip+0x77/0xe0
  ? lockdep_hardirqs_on+0xbf/0x130
  ? smc_connect+0x12a/0x150 [smc]
  smc_connect+0x12a/0x150 [smc]
  __sys_connect+0x8a/0xc0
  ? syscall_enter_from_user_mode+0x20/0x70
  __x64_sys_connect+0x16/0x20
  do_syscall_64+0x34/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae

lockdep treats the nested locking in smc_switch_to_fallback() as a
possible deadlock because smc_wait->lock and clc_wait->lock belong to
the same lock class. The nesting is currently safe, since no other
code path acquires smc_wait->lock while holding clc_wait->lock.
Replace spin_lock() with spin_lock_nested() so that lockdep no longer
emits this false report.

Link: https://lkml.org/lkml/2021/11/19/962
Fixes: 2153bd1e3d3d ("Transfer remaining wait queue entries during fallback")
Reported-by: [email protected]
Signed-off-by: Wen Gu <[email protected]>
---
net/smc/af_smc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index b61c802..2692cba 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -585,7 +585,7 @@ static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
 		 * to clcsocket->wq during the fallback.
 		 */
 		spin_lock_irqsave(&smc_wait->lock, flags);
-		spin_lock(&clc_wait->lock);
+		spin_lock_nested(&clc_wait->lock, SINGLE_DEPTH_NESTING);
 		list_splice_init(&smc_wait->head, &clc_wait->head);
 		spin_unlock(&clc_wait->lock);
 		spin_unlock_irqrestore(&smc_wait->lock, flags);
--
1.8.3.1
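
For readers unfamiliar with lock classes, the reasoning in the commit
message above can be illustrated with a small userspace toy. This is
only a sketch and not how lockdep is implemented: every toy_* name is
hypothetical, and the only rule modelled is the assumption that an
acquisition is flagged when a lock with the same (class, subclass) pair
is already held. TOY_SINGLE_DEPTH_NESTING stands in for the kernel's
SINGLE_DEPTH_NESTING, which is defined as 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: real lockdep tracks far more state than this. */
#define TOY_MAX_HELD 8
#define TOY_SINGLE_DEPTH_NESTING 1	/* stand-in for SINGLE_DEPTH_NESTING */

struct toy_lock {
	const void *class_key;	/* locks set up at one init site share a key */
};

struct toy_held {
	const void *class_key;
	unsigned int subclass;
};

static struct toy_held toy_held_stack[TOY_MAX_HELD];
static size_t toy_held_depth;

/*
 * Record an acquisition. Returns false when a lock with the same
 * (class, subclass) pair is already held, which is what this toy
 * treats as "possible recursive locking".
 */
bool toy_acquire(const struct toy_lock *lock, unsigned int subclass)
{
	bool ok = true;

	for (size_t i = 0; i < toy_held_depth; i++) {
		if (toy_held_stack[i].class_key == lock->class_key &&
		    toy_held_stack[i].subclass == subclass)
			ok = false;
	}

	assert(toy_held_depth < TOY_MAX_HELD);
	toy_held_stack[toy_held_depth].class_key = lock->class_key;
	toy_held_stack[toy_held_depth].subclass = subclass;
	toy_held_depth++;
	return ok;
}

/* Drop the most recently acquired lock. */
void toy_release(void)
{
	assert(toy_held_depth > 0);
	toy_held_depth--;
}

/* The address of this object serves as the shared class key. */
static const char toy_wq_class = 0;

/*
 * Mimic the nesting in smc_switch_to_fallback(): take smc_wait first,
 * then clc_wait with the given subclass. Returns true if the toy
 * checker raised no warning for either acquisition.
 */
bool toy_fallback_is_clean(unsigned int inner_subclass)
{
	struct toy_lock smc_wait = { &toy_wq_class };
	struct toy_lock clc_wait = { &toy_wq_class };
	bool outer = toy_acquire(&smc_wait, 0);
	bool inner = toy_acquire(&clc_wait, inner_subclass);

	toy_release();
	toy_release();
	return outer && inner;
}
```

Under these assumptions, toy_fallback_is_clean(0) corresponds to the
plain spin_lock() case and is flagged, while
toy_fallback_is_clean(TOY_SINGLE_DEPTH_NESTING) corresponds to the
spin_lock_nested() case and is not. As with the real annotation, the
subclass only tells the checker that the nesting is intentional; it
does not by itself make the lock ordering safe.
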
On Mon, Nov 22, 2021 at 08:32:53PM +0800, Wen Gu wrote:
> Possible recursive locking is detected by lockdep when SMC
> falls back to TCP. The corresponding warnings are as follows:
>
> [...]
>
> Link: https://lkml.org/lkml/2021/11/19/962
> Fixes: 2153bd1e3d3d ("Transfer remaining wait queue entries during fallback")
> Reported-by: [email protected]
> Signed-off-by: Wen Gu <[email protected]>
Acked-by: Tony Lu <[email protected]>
Hello:
This patch was applied to netdev/net.git (master)
by David S. Miller <[email protected]>:
On Mon, 22 Nov 2021 20:32:53 +0800 you wrote:
> Possible recursive locking is detected by lockdep when SMC
> falls back to TCP. The corresponding warnings are as follows:
>
> ============================================
> WARNING: possible recursive locking detected
> 5.16.0-rc1+ #18 Tainted: G E
>
> [...]
Here is the summary with links:
- [net] net/smc: Avoid warning of possible recursive locking
https://git.kernel.org/netdev/net/c/7a61432dc813
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html