2023-09-20 16:35:32

by Tomi Valkeinen

[permalink] [raw]
Subject: [PATCH 0/2] drm: Revert dma-fence annotations from omapdrm and tidss

Both omapdrm and tidss give quite similar lockdep warnings ever since
the "Annotate dma-fence critical section in commit path" patches.

I've tried to look at this, but I haven't gotten far with understanding
all this, and I feel that I'd need to really dive deep into the details
to understand all the locking dependencies relevant here.

With some git-log digging, I noticed this:

https://patchwork.freedesktop.org/patch/462170/

So maybe the "fix" is just to revert the patches. If yes, great, we can
apply the patches here. If not, not so great, and I'll just have to
start the diving =).

Signed-off-by: Tomi Valkeinen <[email protected]>
---
Tomi Valkeinen (2):
Revert "drm/tidss: Annotate dma-fence critical section in commit path"
Revert "drm/omapdrm: Annotate dma-fence critical section in commit path"

drivers/gpu/drm/omapdrm/omap_drv.c | 9 ++++-----
drivers/gpu/drm/tidss/tidss_kms.c | 4 ----
2 files changed, 4 insertions(+), 9 deletions(-)
---
base-commit: 9fc75c40faa29df14ba16066be6bdfaea9f39ce4
change-id: 20230920-dma-fence-annotation-revert-166d0efab368

Best regards,
--
Tomi Valkeinen <[email protected]>


2023-09-21 04:46:50

by Tomi Valkeinen

[permalink] [raw]
Subject: [PATCH 1/2] Revert "drm/tidss: Annotate dma-fence critical section in commit path"

This reverts commit 4d56a4f08391857ba93465de489707b66adad114.

The DMA-fence annotations cause a lockdep warning (see below). As per
https://patchwork.freedesktop.org/patch/462170/ it sounds like the
annotations don't work correctly.

======================================================
WARNING: possible circular locking dependency detected
6.6.0-rc2+ #1 Not tainted
------------------------------------------------------
kmstest/733 is trying to acquire lock:
ffff8000819377f0 (fs_reclaim){+.+.}-{0:0}, at: __kmem_cache_alloc_node+0x58/0x2d4

but task is already holding lock:
ffff800081a06aa0 (dma_fence_map){++++}-{0:0}, at: tidss_atomic_commit_tail+0x20/0xc0 [tidss]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (dma_fence_map){++++}-{0:0}:
__dma_fence_might_wait+0x5c/0xd0
dma_resv_lockdep+0x1a4/0x32c
do_one_initcall+0x84/0x2fc
kernel_init_freeable+0x28c/0x4c4
kernel_init+0x24/0x1dc
ret_from_fork+0x10/0x20

-> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
fs_reclaim_acquire+0x70/0xe4
__kmem_cache_alloc_node+0x58/0x2d4
kmalloc_trace+0x38/0x78
__kthread_create_worker+0x3c/0x150
kthread_create_worker+0x64/0x8c
workqueue_init+0x1e8/0x2f0
kernel_init_freeable+0x11c/0x4c4
kernel_init+0x24/0x1dc
ret_from_fork+0x10/0x20

-> #0 (fs_reclaim){+.+.}-{0:0}:
__lock_acquire+0x1370/0x20d8
lock_acquire+0x1e8/0x308
fs_reclaim_acquire+0xd0/0xe4
__kmem_cache_alloc_node+0x58/0x2d4
__kmalloc_node_track_caller+0x58/0xf0
kmemdup+0x34/0x60
regmap_bulk_write+0x64/0x2c0
tc358768_bridge_pre_enable+0x8c/0x12d0 [tc358768]
drm_atomic_bridge_call_pre_enable+0x68/0x80 [drm]
drm_atomic_bridge_chain_pre_enable+0x50/0x158 [drm]
drm_atomic_helper_commit_modeset_enables+0x164/0x264 [drm_kms_helper]
tidss_atomic_commit_tail+0x58/0xc0 [tidss]
commit_tail+0xa0/0x188 [drm_kms_helper]
drm_atomic_helper_commit+0x1a8/0x1c0 [drm_kms_helper]
drm_atomic_commit+0xa8/0xe0 [drm]
drm_mode_atomic_ioctl+0x9ec/0xc80 [drm]
drm_ioctl_kernel+0xc4/0x170 [drm]
drm_ioctl+0x234/0x4b0 [drm]
drm_compat_ioctl+0x110/0x12c [drm]
__arm64_compat_sys_ioctl+0x128/0x150
invoke_syscall+0x48/0x110
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc_compat+0x1c/0x38
el0_svc_compat+0x48/0xb4
el0t_32_sync_handler+0xb0/0x138
el0t_32_sync+0x194/0x198

other info that might help us debug this:

Chain exists of:
fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
rlock(dma_fence_map);
lock(mmu_notifier_invalidate_range_start);
lock(dma_fence_map);
lock(fs_reclaim);

*** DEADLOCK ***

3 locks held by kmstest/733:
#0: ffff800082e5bba0 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_mode_atomic_ioctl+0x118/0xc80 [drm]
#1: ffff000004224c88 (crtc_ww_class_mutex){+.+.}-{3:3}, at: modeset_lock+0xdc/0x1a0 [drm]
#2: ffff800081a06aa0 (dma_fence_map){++++}-{0:0}, at: tidss_atomic_commit_tail+0x20/0xc0 [tidss]

stack backtrace:
CPU: 0 PID: 733 Comm: kmstest Not tainted 6.6.0-rc2+ #1
Hardware name: Toradex Verdin AM62 on Verdin Development Board (DT)
Call trace:
dump_backtrace+0x98/0x118
show_stack+0x18/0x24
dump_stack_lvl+0x60/0xac
dump_stack+0x18/0x24
print_circular_bug+0x288/0x368
check_noncircular+0x168/0x17c
__lock_acquire+0x1370/0x20d8
lock_acquire+0x1e8/0x308
fs_reclaim_acquire+0xd0/0xe4
__kmem_cache_alloc_node+0x58/0x2d4
__kmalloc_node_track_caller+0x58/0xf0
kmemdup+0x34/0x60
regmap_bulk_write+0x64/0x2c0
tc358768_bridge_pre_enable+0x8c/0x12d0 [tc358768]
drm_atomic_bridge_call_pre_enable+0x68/0x80 [drm]
drm_atomic_bridge_chain_pre_enable+0x50/0x158 [drm]
drm_atomic_helper_commit_modeset_enables+0x164/0x264 [drm_kms_helper]
tidss_atomic_commit_tail+0x58/0xc0 [tidss]
commit_tail+0xa0/0x188 [drm_kms_helper]
drm_atomic_helper_commit+0x1a8/0x1c0 [drm_kms_helper]
drm_atomic_commit+0xa8/0xe0 [drm]
drm_mode_atomic_ioctl+0x9ec/0xc80 [drm]
drm_ioctl_kernel+0xc4/0x170 [drm]
drm_ioctl+0x234/0x4b0 [drm]
drm_compat_ioctl+0x110/0x12c [drm]
__arm64_compat_sys_ioctl+0x128/0x150
invoke_syscall+0x48/0x110
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc_compat+0x1c/0x38
el0_svc_compat+0x48/0xb4
el0t_32_sync_handler+0xb0/0x138
el0t_32_sync+0x194/0x198

Fixes: 4d56a4f08391 ("drm/tidss: Annotate dma-fence critical section in commit path")
Signed-off-by: Tomi Valkeinen <[email protected]>
---
drivers/gpu/drm/tidss/tidss_kms.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/tidss/tidss_kms.c b/drivers/gpu/drm/tidss/tidss_kms.c
index c979ad1af236..d096d8d2bc8f 100644
--- a/drivers/gpu/drm/tidss/tidss_kms.c
+++ b/drivers/gpu/drm/tidss/tidss_kms.c
@@ -4,8 +4,6 @@
* Author: Tomi Valkeinen <[email protected]>
*/

-#include <linux/dma-fence.h>
-
#include <drm/drm_atomic.h>
#include <drm/drm_atomic_helper.h>
#include <drm/drm_bridge.h>
@@ -25,7 +23,6 @@ static void tidss_atomic_commit_tail(struct drm_atomic_state *old_state)
{
struct drm_device *ddev = old_state->dev;
struct tidss_device *tidss = to_tidss(ddev);
- bool fence_cookie = dma_fence_begin_signalling();

dev_dbg(ddev->dev, "%s\n", __func__);

@@ -36,7 +33,6 @@ static void tidss_atomic_commit_tail(struct drm_atomic_state *old_state)
drm_atomic_helper_commit_modeset_enables(ddev, old_state);

drm_atomic_helper_commit_hw_done(old_state);
- dma_fence_end_signalling(fence_cookie);
drm_atomic_helper_wait_for_flip_done(ddev, old_state);

drm_atomic_helper_cleanup_planes(ddev, old_state);

--
2.34.1

2023-10-03 00:19:45

by Aradhya Bhatia

[permalink] [raw]
Subject: Re: [PATCH 1/2] Revert "drm/tidss: Annotate dma-fence critical section in commit path"



On 20-Sep-23 18:27, Tomi Valkeinen wrote:
> This reverts commit 4d56a4f08391857ba93465de489707b66adad114.
>
> The DMA-fence annotations cause a lockdep warning (see below). As per
> https://patchwork.freedesktop.org/patch/462170/ it sounds like the
> annotations don't work correctly.
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.6.0-rc2+ #1 Not tainted
> ------------------------------------------------------
> kmstest/733 is trying to acquire lock:
> ffff8000819377f0 (fs_reclaim){+.+.}-{0:0}, at: __kmem_cache_alloc_node+0x58/0x2d4
>
> but task is already holding lock:
> ffff800081a06aa0 (dma_fence_map){++++}-{0:0}, at: tidss_atomic_commit_tail+0x20/0xc0 [tidss]
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (dma_fence_map){++++}-{0:0}:
> __dma_fence_might_wait+0x5c/0xd0
> dma_resv_lockdep+0x1a4/0x32c
> do_one_initcall+0x84/0x2fc
> kernel_init_freeable+0x28c/0x4c4
> kernel_init+0x24/0x1dc
> ret_from_fork+0x10/0x20
>
> -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
> fs_reclaim_acquire+0x70/0xe4
> __kmem_cache_alloc_node+0x58/0x2d4
> kmalloc_trace+0x38/0x78
> __kthread_create_worker+0x3c/0x150
> kthread_create_worker+0x64/0x8c
> workqueue_init+0x1e8/0x2f0
> kernel_init_freeable+0x11c/0x4c4
> kernel_init+0x24/0x1dc
> ret_from_fork+0x10/0x20
>
> -> #0 (fs_reclaim){+.+.}-{0:0}:
> __lock_acquire+0x1370/0x20d8
> lock_acquire+0x1e8/0x308
> fs_reclaim_acquire+0xd0/0xe4
> __kmem_cache_alloc_node+0x58/0x2d4
> __kmalloc_node_track_caller+0x58/0xf0
> kmemdup+0x34/0x60
> regmap_bulk_write+0x64/0x2c0
> tc358768_bridge_pre_enable+0x8c/0x12d0 [tc358768]
> drm_atomic_bridge_call_pre_enable+0x68/0x80 [drm]
> drm_atomic_bridge_chain_pre_enable+0x50/0x158 [drm]
> drm_atomic_helper_commit_modeset_enables+0x164/0x264 [drm_kms_helper]
> tidss_atomic_commit_tail+0x58/0xc0 [tidss]
> commit_tail+0xa0/0x188 [drm_kms_helper]
> drm_atomic_helper_commit+0x1a8/0x1c0 [drm_kms_helper]
> drm_atomic_commit+0xa8/0xe0 [drm]
> drm_mode_atomic_ioctl+0x9ec/0xc80 [drm]
> drm_ioctl_kernel+0xc4/0x170 [drm]
> drm_ioctl+0x234/0x4b0 [drm]
> drm_compat_ioctl+0x110/0x12c [drm]
> __arm64_compat_sys_ioctl+0x128/0x150
> invoke_syscall+0x48/0x110
> el0_svc_common.constprop.0+0x40/0xe0
> do_el0_svc_compat+0x1c/0x38
> el0_svc_compat+0x48/0xb4
> el0t_32_sync_handler+0xb0/0x138
> el0t_32_sync+0x194/0x198
>
> other info that might help us debug this:
>
> Chain exists of:
> fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> rlock(dma_fence_map);
> lock(mmu_notifier_invalidate_range_start);
> lock(dma_fence_map);
> lock(fs_reclaim);
>
> *** DEADLOCK ***
>
> 3 locks held by kmstest/733:
> #0: ffff800082e5bba0 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_mode_atomic_ioctl+0x118/0xc80 [drm]
> #1: ffff000004224c88 (crtc_ww_class_mutex){+.+.}-{3:3}, at: modeset_lock+0xdc/0x1a0 [drm]
> #2: ffff800081a06aa0 (dma_fence_map){++++}-{0:0}, at: tidss_atomic_commit_tail+0x20/0xc0 [tidss]
>
> stack backtrace:
> CPU: 0 PID: 733 Comm: kmstest Not tainted 6.6.0-rc2+ #1
> Hardware name: Toradex Verdin AM62 on Verdin Development Board (DT)
> Call trace:
> dump_backtrace+0x98/0x118
> show_stack+0x18/0x24
> dump_stack_lvl+0x60/0xac
> dump_stack+0x18/0x24
> print_circular_bug+0x288/0x368
> check_noncircular+0x168/0x17c
> __lock_acquire+0x1370/0x20d8
> lock_acquire+0x1e8/0x308
> fs_reclaim_acquire+0xd0/0xe4
> __kmem_cache_alloc_node+0x58/0x2d4
> __kmalloc_node_track_caller+0x58/0xf0
> kmemdup+0x34/0x60
> regmap_bulk_write+0x64/0x2c0
> tc358768_bridge_pre_enable+0x8c/0x12d0 [tc358768]
> drm_atomic_bridge_call_pre_enable+0x68/0x80 [drm]
> drm_atomic_bridge_chain_pre_enable+0x50/0x158 [drm]
> drm_atomic_helper_commit_modeset_enables+0x164/0x264 [drm_kms_helper]
> tidss_atomic_commit_tail+0x58/0xc0 [tidss]
> commit_tail+0xa0/0x188 [drm_kms_helper]
> drm_atomic_helper_commit+0x1a8/0x1c0 [drm_kms_helper]
> drm_atomic_commit+0xa8/0xe0 [drm]
> drm_mode_atomic_ioctl+0x9ec/0xc80 [drm]
> drm_ioctl_kernel+0xc4/0x170 [drm]
> drm_ioctl+0x234/0x4b0 [drm]
> drm_compat_ioctl+0x110/0x12c [drm]
> __arm64_compat_sys_ioctl+0x128/0x150
> invoke_syscall+0x48/0x110
> el0_svc_common.constprop.0+0x40/0xe0
> do_el0_svc_compat+0x1c/0x38
> el0_svc_compat+0x48/0xb4
> el0t_32_sync_handler+0xb0/0x138
> el0t_32_sync+0x194/0x198
>
> Fixes: 4d56a4f08391 ("drm/tidss: Annotate dma-fence critical section in commit path")
> Signed-off-by: Tomi Valkeinen <[email protected]>

Reviewed-by: Aradhya Bhatia <[email protected]>

Regards
Aradhya

> ---
> drivers/gpu/drm/tidss/tidss_kms.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/tidss/tidss_kms.c b/drivers/gpu/drm/tidss/tidss_kms.c
> index c979ad1af236..d096d8d2bc8f 100644
> --- a/drivers/gpu/drm/tidss/tidss_kms.c
> +++ b/drivers/gpu/drm/tidss/tidss_kms.c
> @@ -4,8 +4,6 @@
> * Author: Tomi Valkeinen <[email protected]>
> */
>
> -#include <linux/dma-fence.h>
> -
> #include <drm/drm_atomic.h>
> #include <drm/drm_atomic_helper.h>
> #include <drm/drm_bridge.h>
> @@ -25,7 +23,6 @@ static void tidss_atomic_commit_tail(struct drm_atomic_state *old_state)
> {
> struct drm_device *ddev = old_state->dev;
> struct tidss_device *tidss = to_tidss(ddev);
> - bool fence_cookie = dma_fence_begin_signalling();
>
> dev_dbg(ddev->dev, "%s\n", __func__);
>
> @@ -36,7 +33,6 @@ static void tidss_atomic_commit_tail(struct drm_atomic_state *old_state)
> drm_atomic_helper_commit_modeset_enables(ddev, old_state);
>
> drm_atomic_helper_commit_hw_done(old_state);
> - dma_fence_end_signalling(fence_cookie);
> drm_atomic_helper_wait_for_flip_done(ddev, old_state);
>
> drm_atomic_helper_cleanup_planes(ddev, old_state);
>