2024-05-01 06:57:10

by Adrián Larumbe

[permalink] [raw]
Subject: [PATCH v3 0/2] drm: Fix dma_resv deadlock at drm object pin time

This is v3 of https://lore.kernel.org/dri-devel/[email protected]/

The goal of this patch series is fixing a deadlock upon locking the dma reservation
of a DRM gem object when pinning it, at a prime import operation.

Changes from v2:
- Removed comment explaining reason why an already-locked
pin function replaced the locked variant inside Panfrost's
object pin callback.
- Moved already-assigned attachment warning into generic
already-locked gem object pin function

Adrián Larumbe (2):
drm/panfrost: Fix dma_resv deadlock at drm object pin time
drm/gem-shmem: Add import attachment warning to locked pin function

drivers/gpu/drm/drm_gem_shmem_helper.c | 2 ++
drivers/gpu/drm/lima/lima_gem.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +-
3 files changed, 4 insertions(+), 2 deletions(-)


base-commit: 75b68f22e39aafb22f3d8e3071e1aba73560788c
--
2.44.0



2024-05-01 06:57:30

by Adrián Larumbe

[permalink] [raw]
Subject: [PATCH v3 2/2] drm/gem-shmem: Add import attachment warning to locked pin function

Commit ec144244a43f ("drm/gem-shmem: Acquire reservation lock in GEM
pin/unpin callbacks") moved locking DRM object's dma reservation to
drm_gem_shmem_object_pin, and made drm_gem_shmem_pin_locked public, so we
need to make sure the non-NULL check warning is also added to the latter.

Signed-off-by: Adrián Larumbe <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 177773bcdbfd..ad5d9f704e15 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -233,6 +233,8 @@ int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)

dma_resv_assert_held(shmem->base.resv);

+ drm_WARN_ON(shmem->base.dev, shmem->base.import_attach);
+
ret = drm_gem_shmem_get_pages(shmem);

return ret;
--
2.44.0


2024-05-01 06:58:03

by Adrián Larumbe

[permalink] [raw]
Subject: [PATCH v3 1/2] drm/panfrost: Fix dma_resv deadlock at drm object pin time

When Panfrost must pin an object that is being prepared a dma-buf
attachment for on behalf of another driver, the core drm gem object pinning
code already takes a lock on the object's dma reservation.

However, Panfrost GEM object's pinning callback would eventually try taking
the lock on the same dma reservation when delegating pinning of the object
onto the shmem subsystem, which led to a deadlock.

This can be shown by enabling CONFIG_DEBUG_WW_MUTEX_SLOWPATH, which throws
the following recursive locking situation:

weston/3440 is trying to acquire lock:
ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_pin+0x34/0xb8 [drm_shmem_helper]
but task is already holding lock:
ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_pin+0x2c/0x80 [drm]

Fix it by assuming the object's reservation had already been locked by the
time we reach panfrost_gem_pin.

Do the same thing for the Lima driver, as it most likely suffers from the
same issue.

Cc: Thomas Zimmermann <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Cc: Boris Brezillon <[email protected]>
Cc: Steven Price <[email protected]>
Fixes: a78027847226 ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()")
Signed-off-by: Adrián Larumbe <[email protected]>
---
drivers/gpu/drm/lima/lima_gem.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 7ea244d876ca..c4e0f9faaa47 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -185,7 +185,7 @@ static int lima_gem_pin(struct drm_gem_object *obj)
if (bo->heap_size)
return -EINVAL;

- return drm_gem_shmem_pin(&bo->base);
+ return drm_gem_shmem_object_pin(obj);
}

static int lima_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index d47b40b82b0b..f268bd5c2884 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -192,7 +192,7 @@ static int panfrost_gem_pin(struct drm_gem_object *obj)
if (bo->is_heap)
return -EINVAL;

- return drm_gem_shmem_pin(&bo->base);
+ return drm_gem_shmem_object_pin(obj);
}

static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj)
--
2.44.0


2024-05-02 07:02:06

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] drm/gem-shmem: Add import attachment warning to locked pin function

On Wed, 1 May 2024 07:56:00 +0100
Adrián Larumbe <[email protected]> wrote:

> Commit ec144244a43f ("drm/gem-shmem: Acquire reservation lock in GEM
> pin/unpin callbacks") moved locking DRM object's dma reservation to
> drm_gem_shmem_object_pin, and made drm_gem_shmem_pin_locked public, so we
> need to make sure the non-NULL check warning is also added to the latter.
>
> Signed-off-by: Adrián Larumbe <[email protected]>
> ---
> drivers/gpu/drm/drm_gem_shmem_helper.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 177773bcdbfd..ad5d9f704e15 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -233,6 +233,8 @@ int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
>
> dma_resv_assert_held(shmem->base.resv);
>
> + drm_WARN_ON(shmem->base.dev, shmem->base.import_attach);

If we add a WARN_ON() here, we can probably drop the one in
drm_gem_shmem_pin(), and do the same for drm_gem_shmem_unpin[_locked]()
to keep things consistent.

> +
> ret = drm_gem_shmem_get_pages(shmem);
>
> return ret;


2024-05-02 07:09:51

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] drm/panfrost: Fix dma_resv deadlock at drm object pin time

On Wed, 1 May 2024 07:55:59 +0100
Adrián Larumbe <[email protected]> wrote:

> When Panfrost must pin an object that is being prepared a dma-buf
> attachment for on behalf of another driver, the core drm gem object pinning
> code already takes a lock on the object's dma reservation.
>
> However, Panfrost GEM object's pinning callback would eventually try taking
> the lock on the same dma reservation when delegating pinning of the object
> onto the shmem subsystem, which led to a deadlock.
>
> This can be shown by enabling CONFIG_DEBUG_WW_MUTEX_SLOWPATH, which throws
> the following recursive locking situation:
>
> weston/3440 is trying to acquire lock:
> ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_pin+0x34/0xb8 [drm_shmem_helper]
> but task is already holding lock:
> ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_pin+0x2c/0x80 [drm]
>
> Fix it by assuming the object's reservation had already been locked by the
> time we reach panfrost_gem_pin.

You should probably mention that drm_gem_shmem_object_pin() assumes the
lock to be held, thus explaining the drm_gem_shmem_pin() ->
drm_gem_shmem_object_pin() transition.

Oh, and the commit message refers explicitly to Panfrost in a few places
even though you're fixing Lima as well.

>
> Do the same thing for the Lima driver, as it most likely suffers from the
> same issue.
>
> Cc: Thomas Zimmermann <[email protected]>
> Cc: Dmitry Osipenko <[email protected]>
> Cc: Boris Brezillon <[email protected]>
> Cc: Steven Price <[email protected]>
> Fixes: a78027847226 ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()")
> Signed-off-by: Adrián Larumbe <[email protected]>

With the commit message adjusted as suggested,

Reviewed-by: Boris Brezillon <[email protected]>

> ---
> drivers/gpu/drm/lima/lima_gem.c | 2 +-
> drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 7ea244d876ca..c4e0f9faaa47 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -185,7 +185,7 @@ static int lima_gem_pin(struct drm_gem_object *obj)
> if (bo->heap_size)
> return -EINVAL;
>
> - return drm_gem_shmem_pin(&bo->base);
> + return drm_gem_shmem_object_pin(obj);
> }
>
> static int lima_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
> index d47b40b82b0b..f268bd5c2884 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
> @@ -192,7 +192,7 @@ static int panfrost_gem_pin(struct drm_gem_object *obj)
> if (bo->is_heap)
> return -EINVAL;
>
> - return drm_gem_shmem_pin(&bo->base);
> + return drm_gem_shmem_object_pin(obj);
> }
>
> static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj)


2024-05-02 11:14:34

by Thomas Zimmermann

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] drm/panfrost: Fix dma_resv deadlock at drm object pin time

Hi

Am 01.05.24 um 08:55 schrieb Adrián Larumbe:
> When Panfrost must pin an object that is being prepared a dma-buf
> attachment for on behalf of another driver, the core drm gem object pinning
> code already takes a lock on the object's dma reservation.
>
> However, Panfrost GEM object's pinning callback would eventually try taking
> the lock on the same dma reservation when delegating pinning of the object
> onto the shmem subsystem, which led to a deadlock.
>
> This can be shown by enabling CONFIG_DEBUG_WW_MUTEX_SLOWPATH, which throws
> the following recursive locking situation:
>
> weston/3440 is trying to acquire lock:
> ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_pin+0x34/0xb8 [drm_shmem_helper]
> but task is already holding lock:
> ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_pin+0x2c/0x80 [drm]
>
> Fix it by assuming the object's reservation had already been locked by the
> time we reach panfrost_gem_pin.

Maybe say that the reservation lock has been taken in drm_gem_pin()

>
> Do the same thing for the Lima driver, as it most likely suffers from the
> same issue.

Please split this patch into one for panfrost and one for lima. To each
patch, you can add

Reviewed-by: Thomas Zimmermann <[email protected]>

Best regards
Thomas

>
> Cc: Thomas Zimmermann <[email protected]>
> Cc: Dmitry Osipenko <[email protected]>
> Cc: Boris Brezillon <[email protected]>
> Cc: Steven Price <[email protected]>
> Fixes: a78027847226 ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()")
> Signed-off-by: Adrián Larumbe <[email protected]>
> ---
> drivers/gpu/drm/lima/lima_gem.c | 2 +-
> drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 7ea244d876ca..c4e0f9faaa47 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -185,7 +185,7 @@ static int lima_gem_pin(struct drm_gem_object *obj)
> if (bo->heap_size)
> return -EINVAL;
>
> - return drm_gem_shmem_pin(&bo->base);
> + return drm_gem_shmem_object_pin(obj);
> }
>
> static int lima_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
> index d47b40b82b0b..f268bd5c2884 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
> @@ -192,7 +192,7 @@ static int panfrost_gem_pin(struct drm_gem_object *obj)
> if (bo->is_heap)
> return -EINVAL;
>
> - return drm_gem_shmem_pin(&bo->base);
> + return drm_gem_shmem_object_pin(obj);
> }
>
> static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj)

--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)


2024-05-02 11:51:32

by Thomas Zimmermann

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] drm: Fix dma_resv deadlock at drm object pin time

Hi,

ignoring my r-b on patch 1, I'd like to rethink the current patches in
general.

I think drm_gem_shmem_pin() should become the locked version of _pin(),
so that drm_gem_shmem_object_pin() can call it directly. The existing
_pin_unlocked() would not be needed any longer. Same for the _unpin()
functions. This change would also fix the consistency with the semantics
of the shmem _vmap() functions, which never take reservation locks.

There are only two external callers of drm_gem_shmem_pin(): the test
case and panthor. These assume that drm_gem_shmem_pin() acquires the
reservation lock. The test case should likely call drm_gem_pin()
instead. That would acquire the reservation lock and the test would
validate that shmem's pin helper integrates well into the overall GEM
framework. The way panthor uses drm_gem_shmem_pin() looks wrong to me.
For now, it could receive a wrapper that takes the lock and that's it.

Best regards
Thomas

Am 01.05.24 um 08:55 schrieb Adrián Larumbe:
> This is v3 of https://lore.kernel.org/dri-devel/[email protected]/
>
> The goal of this patch series is fixing a deadlock upon locking the dma reservation
> of a DRM gem object when pinning it, at a prime import operation.
>
> Changes from v2:
> - Removed comment explaining reason why an already-locked
> pin function replaced the locked variant inside Panfrost's
> object pin callback.
> - Moved already-assigned attachment warning into generic
> already-locked gem object pin function
>
> Adrián Larumbe (2):
> drm/panfrost: Fix dma_resv deadlock at drm object pin time
> drm/gem-shmem: Add import attachment warning to locked pin function
>
> drivers/gpu/drm/drm_gem_shmem_helper.c | 2 ++
> drivers/gpu/drm/lima/lima_gem.c | 2 +-
> drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +-
> 3 files changed, 4 insertions(+), 2 deletions(-)
>
>
> base-commit: 75b68f22e39aafb22f3d8e3071e1aba73560788c

--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)


2024-05-02 11:59:55

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] drm: Fix dma_resv deadlock at drm object pin time

Hi Thomas,

On Thu, 2 May 2024 13:51:16 +0200
Thomas Zimmermann <[email protected]> wrote:

> Hi,
>
> ignoring my r-b on patch 1, I'd like to rethink the current patches in
> general.
>
> I think drm_gem_shmem_pin() should become the locked version of _pin(),
> so that drm_gem_shmem_object_pin() can call it directly. The existing
> _pin_unlocked() would not be needed any longer. Same for the _unpin()
> functions. This change would also fix the consistency with the semantics
> of the shmem _vmap() functions, which never take reservation locks.
>
> There are only two external callers of drm_gem_shmem_pin(): the test
> case and panthor. These assume that drm_gem_shmem_pin() acquires the
> reservation lock. The test case should likely call drm_gem_pin()
> instead. That would acquire the reservation lock and the test would
> validate that shmem's pin helper integrates well into the overall GEM
> framework. The way panthor uses drm_gem_shmem_pin() looks wrong to me.
> For now, it could receive a wrapper that takes the lock and that's it.

I do agree that the current inconsistencies in the naming is
troublesome (sometimes _unlocked, sometimes _locked, with the version
without any suffix meaning either _locked or _unlocked depending on
what the suffixed version does), and that's the very reason I asked
Dmitry to address that in his shrinker series [1]. So, ideally I'd
prefer if patches from Dmitry's series were applied instead of
trying to fix that here (IIRC, we had an ack from Maxime).

Regards,

Boris

2024-05-02 12:00:48

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] drm: Fix dma_resv deadlock at drm object pin time

On Thu, 2 May 2024 13:59:41 +0200
Boris Brezillon <[email protected]> wrote:

> Hi Thomas,
>
> On Thu, 2 May 2024 13:51:16 +0200
> Thomas Zimmermann <[email protected]> wrote:
>
> > Hi,
> >
> > ignoring my r-b on patch 1, I'd like to rethink the current patches in
> > general.
> >
> > I think drm_gem_shmem_pin() should become the locked version of _pin(),
> > so that drm_gem_shmem_object_pin() can call it directly. The existing
> > _pin_unlocked() would not be needed any longer. Same for the _unpin()
> > functions. This change would also fix the consistency with the semantics
> > of the shmem _vmap() functions, which never take reservation locks.
> >
> > There are only two external callers of drm_gem_shmem_pin(): the test
> > case and panthor. These assume that drm_gem_shmem_pin() acquires the
> > reservation lock. The test case should likely call drm_gem_pin()
> > instead. That would acquire the reservation lock and the test would
> > validate that shmem's pin helper integrates well into the overall GEM
> > framework. The way panthor uses drm_gem_shmem_pin() looks wrong to me.
> > For now, it could receive a wrapper that takes the lock and that's it.
>
> I do agree that the current inconsistencies in the naming is
> troublesome (sometimes _unlocked, sometimes _locked, with the version
> without any suffix meaning either _locked or _unlocked depending on
> what the suffixed version does), and that's the very reason I asked
> Dmitry to address that in his shrinker series [1]. So, ideally I'd
> prefer if patches from Dmitry's series were applied instead of
> trying to fix that here (IIRC, we had an ack from Maxime).

With the link this time :-).

[1]https://lore.kernel.org/lkml/[email protected]/T/

>
> Regards,
>
> Boris


2024-05-02 12:18:54

by Thomas Zimmermann

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] drm: Fix dma_resv deadlock at drm object pin time

Hi

Am 02.05.24 um 14:00 schrieb Boris Brezillon:
> On Thu, 2 May 2024 13:59:41 +0200
> Boris Brezillon <[email protected]> wrote:
>
>> Hi Thomas,
>>
>> On Thu, 2 May 2024 13:51:16 +0200
>> Thomas Zimmermann <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> ignoring my r-b on patch 1, I'd like to rethink the current patches in
>>> general.
>>>
>>> I think drm_gem_shmem_pin() should become the locked version of _pin(),
>>> so that drm_gem_shmem_object_pin() can call it directly. The existing
>>> _pin_unlocked() would not be needed any longer. Same for the _unpin()
>>> functions. This change would also fix the consistency with the semantics
>>> of the shmem _vmap() functions, which never take reservation locks.
>>>
>>> There are only two external callers of drm_gem_shmem_pin(): the test
>>> case and panthor. These assume that drm_gem_shmem_pin() acquires the
>>> reservation lock. The test case should likely call drm_gem_pin()
>>> instead. That would acquire the reservation lock and the test would
>>> validate that shmem's pin helper integrates well into the overall GEM
>>> framework. The way panthor uses drm_gem_shmem_pin() looks wrong to me.
>>> For now, it could receive a wrapper that takes the lock and that's it.
>> I do agree that the current inconsistencies in the naming is
>> troublesome (sometimes _unlocked, sometimes _locked, with the version
>> without any suffix meaning either _locked or _unlocked depending on
>> what the suffixed version does), and that's the very reason I asked
>> Dmitry to address that in his shrinker series [1]. So, ideally I'd
>> prefer if patches from Dmitry's series were applied instead of
>> trying to fix that here (IIRC, we had an ack from Maxime).
> With the link this time :-).
>
> [1]https://lore.kernel.org/lkml/[email protected]/T/

Thanks. I remember these patches. Somehow I thought they would have been
merged already. I wasn't super happy about the naming changes in patch
5, because the names of the GEM object callbacks do no longer correspond
with their implementations. But anyway.

If we go that direction, we should here simply push drm_gem_shmem_pin()
and drm_gem_shmem_unpin() into panthor and update the shmem tests with
drm_gem_pin(). Panfrost and lima would call drm_gem_shmem_pin_locked().
IMHO we should not promote the use of drm_gem_shmem_object_*()
functions, as they are meant to be callbacks for struct
drm_gem_object_funcs. (Auto-generating them would be nice.)

Best regards
Thomas


>
>> Regards,
>>
>> Boris

--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)