2024-01-05 18:46:53

by Dmitry Osipenko

Subject: [PATCH v19 00/30] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

This series:

1. Adds common drm-shmem memory shrinker
2. Moves drm-shmem drivers to new SGT usage policy
3. Enables shrinker for VirtIO-GPU driver
4. Switches Panfrost driver to the common shrinker
5. Fixes bugs and improves/refactors drm-shmem code

Mesa: https://gitlab.freedesktop.org/digetx/mesa/-/commits/virgl-madvise
IGT: https://gitlab.freedesktop.org/digetx/igt-gpu-tools/-/commits/virtio-madvise
https://gitlab.freedesktop.org/digetx/igt-gpu-tools/-/commits/panfrost-madvise

Changelog:

v19:- Addressed v18 review comments from Boris Brezillon:

- Improved bisectability of the Panfrost and drm-shmem patches by
fixing interim warning splats related to the shrinker changes.

- Improved commit messages

- Reworked the Lima patch to avoid adding a `put_pages` flag

- Reworked the Panfrost patch that switches the driver to explicit
get/put of pages by moving drm_gem_shmem_put_pages() into
gem_mapping_release() instead of gem_free_object().

- Updated the Panfrost patch that switches the driver to the generic
shrinker with more comments and minor improvements.

- Added a new Panfrost patch from Boris that fixes handling of
partially mapped heap BOs.

- Added a link to the related Mesa MR to the commit msg of the patch
that adds the new DRM_VIRTGPU_MADVISE ioctl, as requested by
Gurchetan Singh.

- Added acks/r-bs from Boris Brezillon and Maxime Ripard

- New patches in v19:

drm/gem: Document locking rule of vmap and evict callbacks
drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()

- Factored out a couple of code changes into these new separate patches:

drm/shmem-helper: Avoid lockdep warning when pages are released
drm/shmem-helper: Turn warnings about imported GEM into errors

v18:- Added new patches that change the sgt allocation policy. Previously,
once the sgt was allocated, it existed until the GEM was freed. Now the sgt
is destroyed once pages are unpinned, and drivers have to manage the pages'
pinning refcounting themselves using the get/put() and pin/unpin() pages
helpers. This removes the pages-refcounting ambiguity from the drm-shmem core.

- Dropped the patch that changed drm_gem_shmem_vmap_locked() error
handling, as requested by Boris Brezillon.

- Added new patches that make minor improvements:

- Optimize unlocked get_pages_sgt()
- Don't free refcounted GEM

- Dropped the t-b from the Panfrost shrinker patch that was given for an
older patch version, since the code changed with the new sgt allocation policy.

v17:- Dropped the patches that added new drm-shmem sgt flags, fixing a dma-buf
UAF in the drm-prime error code path and preventing an invalid page_count
when a GEM is freed. Will revisit them later on and factor them out into a
separate patchset.

- Dropped the patches that replaced drm_gem_shmem_free() with
drm_gem_object_put(); they are no longer needed after changing
drm_gem_shmem_free() to not touch the reservation lock.

- Addressed review comments from Boris Brezillon:

- Added new patch to clean up error unwinding in
drm_gem_shmem_vmap_locked()

- Added a new __drm_gem_shmem_put_pages() to let the callers
assert the held reservation lock themselves

- Moved replacement of shmem->pages check with refcount_read()
in drm_gem_shmem_free() to the shrinker addition patch

- Improved commit message of the vmap_use_count patch

- Added r-bs from Boris Brezillon that he gave to v16

v16:- Added more comments to the code for the new drm-shmem flags

- Added r-bs from Boris Brezillon

- Fixed typos and made improvements pointed out by Boris Brezillon

- Replaced kref with refcount_t as was suggested by Boris Brezillon

- Corrected the placement of the got_sgt flag in the Lima driver and
renamed the flag to got_pages_sgt

- Removed drm_gem_shmem_resv_assert_held() and made drm_gem_shmem_free()
free pages without adding a new function that doesn't touch the resv lock,
as suggested by Boris Brezillon

- Added pages_pin_count to drm_gem_shmem_print_info()

v15:- Moved the drm-shmem reference counters to kref, which allows optimizing
the unlocked functions, as suggested by Boris Brezillon.

- Changed the drm/gem/shmem function names to use the _locked postfix and
dropped the _unlocked one, making the naming scheme consistent across
DRM code, as suggested by Boris Brezillon.

- Added patch that fixes UAF in drm-shmem for drivers that import
dma-buf and then release buffer in the import error code path.

- Added a patch that makes drm-shmem use a new flag for the SGT's
get_pages() refcounting, preventing unbalanced refcounting when a GEM is freed.

- Fixed guest blob pinning in virtio-gpu driver that was missed
previously in the shrinker patch.

- Moved the VC4 and virtio-gpu drivers to use drm_gem_put() in the
GEM-creation error code paths, which is now required by drm-shmem
and was missed in previous patch versions.

- Virtio-GPU now attaches shmem pages to the host on first use rather
than when the BO is created. In older patch versions there was a potential
race condition in the BO creation code path where get_sgt()+object_attach()
should have been done under the same resv lock; otherwise pages could be
evicted before the attachment is invoked.

- Virtio-GPU and drm-shmem shrinker patches are split into smaller
ones.

v14:- All the prerequisite reservation locking patches landed upstream; they
were previously part of this series in v13 and older.

https://lore.kernel.org/dri-devel/[email protected]/

- Added patches to improve the locked/unlocked function names, as
suggested by Boris Brezillon for v13.

- Made all exported drm-shmem symbols GPL, as previously
discussed with Thomas Zimmermann on this series.

- Improved the virtio-gpu shrinker patch. It no longer detaches a purged
BO when userspace closes the GEM. Crosvm (unlike qemu) checks res_id on
CMD_CTX_DETACH_RESOURCE and prints a noisy error message if the ID is
invalid, which wasn't noticed before.

v13:- Updated the virtio-gpu shrinker patch to use drm_gem_shmem_object_pin()
directly instead of drm_gem_pin() and dropped the patch that exported the
drm_gem_pin() functions, as requested by Thomas Zimmermann in v12.

v12:- Fixed the "no previous prototype for function" warning reported by
the kernel build bot for v11.

- Fixed the missing reservation lock reported by Intel CI for the VGEM
driver. Other drivers using drm-shmem were affected similarly to
VGEM. The problem was in the dma-buf attachment code path, which led
to the drm-shmem pinning function that assumed the reservation lock was
already held by drm_gem_pin(). In the past that code path was causing
trouble for the i915 driver, and we changed the locking scheme for the
attachment code path in the dma-buf core to let exporters handle the
locking themselves. After a closer investigation, I realized that my
assumption about testing the dma-buf export code path using the Panfrost
driver was incorrect. I have now created an additional local test to
exercise the Panfrost export path. I also reproduced the issue reported
by Intel CI for v10. It's all fixed now by making drm_gem_shmem_pin()
take the resv lock by itself.

- Patches are based on top of drm-tip, CC'd intel-gfx CI for testing.

v11:- Rebased on a recent linux-next. Added a new patch as a result:

drm/shmem-helper: Export drm_gem_shmem_get_pages_sgt_locked()

It's needed by the virtio-gpu driver to swap in/unevict shmem
objects; previously get_pages_sgt() didn't use locking.

- Separated the "Add memory shrinker" patch into smaller parts to ease
reviewing, as requested by Thomas Zimmermann:

drm/shmem-helper: Factor out pages alloc/release from
drm_gem_shmem_get/put_pages()
drm/shmem-helper: Add pages_pin_count field
drm/shmem-helper: Switch drm_gem_shmem_vmap/vunmap to use pin/unpin
drm/shmem-helper: Factor out unpinning part from drm_gem_shmem_purge()

- Addressed the v10 review comments from Thomas Zimmermann: return errno
instead of bool, sort code alphabetically, rename functions, and other
minor changes.

- Added a new patch to remove the "map->is_iomem" from drm-shmem, as
suggested by Thomas Zimmermann.

- Added acks and r-b's that were given to v10.

v10:- Was partially applied to misc-fixes/next.

https://lore.kernel.org/dri-devel/[email protected]/T/

Boris Brezillon (1):
drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()

Dmitry Osipenko (29):
drm/gem: Change locked/unlocked postfix of drm_gem_v/unmap() function
names
drm/gem: Add _locked postfix to functions that have unlocked
counterpart
drm/gem: Document locking rule of vmap and evict callbacks
drm/shmem-helper: Make all exported symbols GPL
drm/shmem-helper: Refactor locked/unlocked functions
drm/shmem-helper: Remove obsoleted is_iomem test
drm/shmem-helper: Add and use pages_pin_count
drm/shmem-helper: Use refcount_t for pages_use_count
drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()
drm/shmem-helper: Switch drm_gem_shmem_vmap/vunmap to use pin/unpin
drm/shmem-helper: Use refcount_t for vmap_use_count
drm/shmem-helper: Prepare drm_gem_shmem_free() to shrinker addition
drm/shmem-helper: Make drm_gem_shmem_get_pages() public
drm/shmem-helper: Add drm_gem_shmem_put_pages()
drm/shmem-helper: Avoid lockdep warning when pages are released
drm/lima: Explicitly get and put drm-shmem pages
drm/panfrost: Explicitly get and put drm-shmem pages
drm/virtio: Explicitly get and put drm-shmem pages
drm/v3d: Explicitly get and put drm-shmem pages
drm/shmem-helper: Change sgt allocation policy
drm/shmem-helper: Add common memory shrinker
drm/shmem-helper: Export drm_gem_shmem_get_pages_sgt_locked()
drm/shmem-helper: Optimize unlocked get_pages_sgt()
drm/shmem-helper: Don't free refcounted GEM
drm/shmem-helper: Turn warnings about imported GEM into errors
drm/virtio: Pin display framebuffer BO
drm/virtio: Attach shmem BOs dynamically
drm/virtio: Support shmem shrinking
drm/panfrost: Switch to generic memory shrinker

drivers/gpu/drm/drm_client.c | 6 +-
drivers/gpu/drm/drm_gem.c | 26 +-
drivers/gpu/drm/drm_gem_framebuffer_helper.c | 6 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 663 +++++++++++++++---
drivers/gpu/drm/drm_internal.h | 4 +-
drivers/gpu/drm/drm_prime.c | 4 +-
drivers/gpu/drm/lima/lima_gem.c | 23 +-
drivers/gpu/drm/lima/lima_sched.c | 4 +-
drivers/gpu/drm/panfrost/Makefile | 1 -
drivers/gpu/drm/panfrost/panfrost_device.h | 4 -
drivers/gpu/drm/panfrost/panfrost_drv.c | 31 +-
drivers/gpu/drm/panfrost/panfrost_dump.c | 4 +-
drivers/gpu/drm/panfrost/panfrost_gem.c | 110 ++-
drivers/gpu/drm/panfrost/panfrost_gem.h | 9 -
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 127 ----
drivers/gpu/drm/panfrost/panfrost_job.c | 18 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 39 +-
drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 6 +-
drivers/gpu/drm/v3d/v3d_bo.c | 11 +-
drivers/gpu/drm/virtio/virtgpu_drv.h | 22 +-
drivers/gpu/drm/virtio/virtgpu_gem.c | 85 +++
drivers/gpu/drm/virtio/virtgpu_ioctl.c | 57 +-
drivers/gpu/drm/virtio/virtgpu_kms.c | 8 +
drivers/gpu/drm/virtio/virtgpu_object.c | 143 +++-
drivers/gpu/drm/virtio/virtgpu_plane.c | 17 +-
drivers/gpu/drm/virtio/virtgpu_submit.c | 15 +-
drivers/gpu/drm/virtio/virtgpu_vq.c | 40 ++
include/drm/drm_device.h | 10 +-
include/drm/drm_gem.h | 15 +-
include/drm/drm_gem_shmem_helper.h | 117 +++-
include/uapi/drm/virtgpu_drm.h | 14 +
31 files changed, 1216 insertions(+), 423 deletions(-)
delete mode 100644 drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c

--
2.43.0



2024-01-05 18:47:02

by Dmitry Osipenko

Subject: [PATCH v19 01/30] drm/gem: Change locked/unlocked postfix of drm_gem_v/unmap() function names

Make drm/gem API function names consistent by having locked functions
use the _locked postfix in the name, while the unlocked variants don't
use an _unlocked postfix. Rename the drm_gem_v/unmap() functions to
make them consistent with the rest of the API functions.

Acked-by: Maxime Ripard <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Suggested-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_client.c | 6 +++---
drivers/gpu/drm/drm_gem.c | 20 ++++++++++----------
drivers/gpu/drm/drm_gem_framebuffer_helper.c | 6 +++---
drivers/gpu/drm/drm_internal.h | 4 ++--
drivers/gpu/drm/drm_prime.c | 4 ++--
drivers/gpu/drm/lima/lima_sched.c | 4 ++--
drivers/gpu/drm/panfrost/panfrost_dump.c | 4 ++--
drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 6 +++---
include/drm/drm_gem.h | 4 ++--
9 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index 9403b3f576f7..7ee9baf46eaa 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -255,7 +255,7 @@ void drm_client_dev_restore(struct drm_device *dev)
static void drm_client_buffer_delete(struct drm_client_buffer *buffer)
{
if (buffer->gem) {
- drm_gem_vunmap_unlocked(buffer->gem, &buffer->map);
+ drm_gem_vunmap(buffer->gem, &buffer->map);
drm_gem_object_put(buffer->gem);
}

@@ -339,7 +339,7 @@ drm_client_buffer_vmap(struct drm_client_buffer *buffer,
* fd_install step out of the driver backend hooks, to make that
* final step optional for internal users.
*/
- ret = drm_gem_vmap_unlocked(buffer->gem, map);
+ ret = drm_gem_vmap(buffer->gem, map);
if (ret)
return ret;

@@ -361,7 +361,7 @@ void drm_client_buffer_vunmap(struct drm_client_buffer *buffer)
{
struct iosys_map *map = &buffer->map;

- drm_gem_vunmap_unlocked(buffer->gem, map);
+ drm_gem_vunmap(buffer->gem, map);
}
EXPORT_SYMBOL(drm_client_buffer_vunmap);

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 44a948b80ee1..95327b003692 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1175,7 +1175,7 @@ void drm_gem_unpin(struct drm_gem_object *obj)
obj->funcs->unpin(obj);
}

-int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
+int drm_gem_vmap_locked(struct drm_gem_object *obj, struct iosys_map *map)
{
int ret;

@@ -1192,9 +1192,9 @@ int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)

return 0;
}
-EXPORT_SYMBOL(drm_gem_vmap);
+EXPORT_SYMBOL(drm_gem_vmap_locked);

-void drm_gem_vunmap(struct drm_gem_object *obj, struct iosys_map *map)
+void drm_gem_vunmap_locked(struct drm_gem_object *obj, struct iosys_map *map)
{
dma_resv_assert_held(obj->resv);

@@ -1207,27 +1207,27 @@ void drm_gem_vunmap(struct drm_gem_object *obj, struct iosys_map *map)
/* Always set the mapping to NULL. Callers may rely on this. */
iosys_map_clear(map);
}
-EXPORT_SYMBOL(drm_gem_vunmap);
+EXPORT_SYMBOL(drm_gem_vunmap_locked);

-int drm_gem_vmap_unlocked(struct drm_gem_object *obj, struct iosys_map *map)
+int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
{
int ret;

dma_resv_lock(obj->resv, NULL);
- ret = drm_gem_vmap(obj, map);
+ ret = drm_gem_vmap_locked(obj, map);
dma_resv_unlock(obj->resv);

return ret;
}
-EXPORT_SYMBOL(drm_gem_vmap_unlocked);
+EXPORT_SYMBOL(drm_gem_vmap);

-void drm_gem_vunmap_unlocked(struct drm_gem_object *obj, struct iosys_map *map)
+void drm_gem_vunmap(struct drm_gem_object *obj, struct iosys_map *map)
{
dma_resv_lock(obj->resv, NULL);
- drm_gem_vunmap(obj, map);
+ drm_gem_vunmap_locked(obj, map);
dma_resv_unlock(obj->resv);
}
-EXPORT_SYMBOL(drm_gem_vunmap_unlocked);
+EXPORT_SYMBOL(drm_gem_vunmap);

/**
* drm_gem_lock_reservations - Sets up the ww context and acquires
diff --git a/drivers/gpu/drm/drm_gem_framebuffer_helper.c b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
index 3bdb6ba37ff4..3808f47310bf 100644
--- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c
+++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
@@ -362,7 +362,7 @@ int drm_gem_fb_vmap(struct drm_framebuffer *fb, struct iosys_map *map,
ret = -EINVAL;
goto err_drm_gem_vunmap;
}
- ret = drm_gem_vmap_unlocked(obj, &map[i]);
+ ret = drm_gem_vmap(obj, &map[i]);
if (ret)
goto err_drm_gem_vunmap;
}
@@ -384,7 +384,7 @@ int drm_gem_fb_vmap(struct drm_framebuffer *fb, struct iosys_map *map,
obj = drm_gem_fb_get_obj(fb, i);
if (!obj)
continue;
- drm_gem_vunmap_unlocked(obj, &map[i]);
+ drm_gem_vunmap(obj, &map[i]);
}
return ret;
}
@@ -411,7 +411,7 @@ void drm_gem_fb_vunmap(struct drm_framebuffer *fb, struct iosys_map *map)
continue;
if (iosys_map_is_null(&map[i]))
continue;
- drm_gem_vunmap_unlocked(obj, &map[i]);
+ drm_gem_vunmap(obj, &map[i]);
}
}
EXPORT_SYMBOL(drm_gem_fb_vunmap);
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 8e4faf0a28e6..227f58e5b232 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -172,8 +172,8 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent,

int drm_gem_pin(struct drm_gem_object *obj);
void drm_gem_unpin(struct drm_gem_object *obj);
-int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map);
-void drm_gem_vunmap(struct drm_gem_object *obj, struct iosys_map *map);
+int drm_gem_vmap_locked(struct drm_gem_object *obj, struct iosys_map *map);
+void drm_gem_vunmap_locked(struct drm_gem_object *obj, struct iosys_map *map);

/* drm_debugfs.c drm_debugfs_crc.c */
#if defined(CONFIG_DEBUG_FS)
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 834a5e28abbe..4a5935a400ec 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -684,7 +684,7 @@ int drm_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct iosys_map *map)
{
struct drm_gem_object *obj = dma_buf->priv;

- return drm_gem_vmap(obj, map);
+ return drm_gem_vmap_locked(obj, map);
}
EXPORT_SYMBOL(drm_gem_dmabuf_vmap);

@@ -700,7 +700,7 @@ void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, struct iosys_map *map)
{
struct drm_gem_object *obj = dma_buf->priv;

- drm_gem_vunmap(obj, map);
+ drm_gem_vunmap_locked(obj, map);
}
EXPORT_SYMBOL(drm_gem_dmabuf_vunmap);

diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index c3bf8cda8498..3813f30480ba 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -371,7 +371,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task)
} else {
buffer_chunk->size = lima_bo_size(bo);

- ret = drm_gem_vmap_unlocked(&bo->base.base, &map);
+ ret = drm_gem_vmap(&bo->base.base, &map);
if (ret) {
kvfree(et);
goto out;
@@ -379,7 +379,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task)

memcpy(buffer_chunk + 1, map.vaddr, buffer_chunk->size);

- drm_gem_vunmap_unlocked(&bo->base.base, &map);
+ drm_gem_vunmap(&bo->base.base, &map);
}

buffer_chunk = (void *)(buffer_chunk + 1) + buffer_chunk->size;
diff --git a/drivers/gpu/drm/panfrost/panfrost_dump.c b/drivers/gpu/drm/panfrost/panfrost_dump.c
index 47751302f1bc..4042afe2fbf4 100644
--- a/drivers/gpu/drm/panfrost/panfrost_dump.c
+++ b/drivers/gpu/drm/panfrost/panfrost_dump.c
@@ -209,7 +209,7 @@ void panfrost_core_dump(struct panfrost_job *job)
goto dump_header;
}

- ret = drm_gem_vmap_unlocked(&bo->base.base, &map);
+ ret = drm_gem_vmap(&bo->base.base, &map);
if (ret) {
dev_err(pfdev->dev, "Panfrost Dump: couldn't map Buffer Object\n");
iter.hdr->bomap.valid = 0;
@@ -228,7 +228,7 @@ void panfrost_core_dump(struct panfrost_job *job)
vaddr = map.vaddr;
memcpy(iter.data, vaddr, bo->base.base.size);

- drm_gem_vunmap_unlocked(&bo->base.base, &map);
+ drm_gem_vunmap(&bo->base.base, &map);

iter.hdr->bomap.valid = 1;

diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
index ba9b6e2b2636..52befead08c6 100644
--- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
+++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
@@ -106,7 +106,7 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
goto err_close_bo;
}

- ret = drm_gem_vmap_unlocked(&bo->base, &map);
+ ret = drm_gem_vmap(&bo->base, &map);
if (ret)
goto err_put_mapping;
perfcnt->buf = map.vaddr;
@@ -165,7 +165,7 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
return 0;

err_vunmap:
- drm_gem_vunmap_unlocked(&bo->base, &map);
+ drm_gem_vunmap(&bo->base, &map);
err_put_mapping:
panfrost_gem_mapping_put(perfcnt->mapping);
err_close_bo:
@@ -195,7 +195,7 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev,
GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF));

perfcnt->user = NULL;
- drm_gem_vunmap_unlocked(&perfcnt->mapping->obj->base.base, &map);
+ drm_gem_vunmap(&perfcnt->mapping->obj->base.base, &map);
perfcnt->buf = NULL;
panfrost_gem_close(&perfcnt->mapping->obj->base.base, file_priv);
panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 369505447acd..decb19ffb2c8 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -527,8 +527,8 @@ struct page **drm_gem_get_pages(struct drm_gem_object *obj);
void drm_gem_put_pages(struct drm_gem_object *obj, struct page **pages,
bool dirty, bool accessed);

-int drm_gem_vmap_unlocked(struct drm_gem_object *obj, struct iosys_map *map);
-void drm_gem_vunmap_unlocked(struct drm_gem_object *obj, struct iosys_map *map);
+int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map);
+void drm_gem_vunmap(struct drm_gem_object *obj, struct iosys_map *map);

int drm_gem_objects_lookup(struct drm_file *filp, void __user *bo_handles,
int count, struct drm_gem_object ***objs_out);
--
2.43.0


2024-01-05 18:47:17

by Dmitry Osipenko

Subject: [PATCH v19 02/30] drm/gem: Add _locked postfix to functions that have unlocked counterpart

Add the _locked postfix to drm_gem functions that have unlocked counterpart
functions to make GEM function naming more consistent and intuitive with
regard to the locking requirements.

Acked-by: Maxime Ripard <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Suggested-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem.c | 6 +++---
include/drm/drm_gem.h | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 95327b003692..4523cd40fb2f 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1490,10 +1490,10 @@ drm_gem_lru_scan(struct drm_gem_lru *lru,
EXPORT_SYMBOL(drm_gem_lru_scan);

/**
- * drm_gem_evict - helper to evict backing pages for a GEM object
+ * drm_gem_evict_locked - helper to evict backing pages for a GEM object
* @obj: obj in question
*/
-int drm_gem_evict(struct drm_gem_object *obj)
+int drm_gem_evict_locked(struct drm_gem_object *obj)
{
dma_resv_assert_held(obj->resv);

@@ -1505,4 +1505,4 @@ int drm_gem_evict(struct drm_gem_object *obj)

return 0;
}
-EXPORT_SYMBOL(drm_gem_evict);
+EXPORT_SYMBOL(drm_gem_evict_locked);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index decb19ffb2c8..f835fdee6a5e 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -551,7 +551,7 @@ unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
unsigned long *remaining,
bool (*shrink)(struct drm_gem_object *obj));

-int drm_gem_evict(struct drm_gem_object *obj);
+int drm_gem_evict_locked(struct drm_gem_object *obj);

#ifdef CONFIG_LOCKDEP
/**
--
2.43.0


2024-01-05 18:47:56

by Dmitry Osipenko

Subject: [PATCH v19 04/30] drm/shmem-helper: Make all exported symbols GPL

Make all drm-shmem exported symbols GPL to make them consistent with
the rest of drm-shmem symbols.

Acked-by: Maxime Ripard <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index e435f986cd13..0d61f2b3e213 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -226,7 +226,7 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
shmem->pages_mark_accessed_on_put);
shmem->pages = NULL;
}
-EXPORT_SYMBOL(drm_gem_shmem_put_pages);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages);

static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
{
@@ -271,7 +271,7 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)

return ret;
}
-EXPORT_SYMBOL(drm_gem_shmem_pin);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_pin);

/**
* drm_gem_shmem_unpin - Unpin backing pages for a shmem GEM object
@@ -290,7 +290,7 @@ void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem)
drm_gem_shmem_unpin_locked(shmem);
dma_resv_unlock(shmem->base.resv);
}
-EXPORT_SYMBOL(drm_gem_shmem_unpin);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_unpin);

/*
* drm_gem_shmem_vmap - Create a virtual mapping for a shmem GEM object
@@ -360,7 +360,7 @@ int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem,

return ret;
}
-EXPORT_SYMBOL(drm_gem_shmem_vmap);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_vmap);

/*
* drm_gem_shmem_vunmap - Unmap a virtual mapping for a shmem GEM object
@@ -396,7 +396,7 @@ void drm_gem_shmem_vunmap(struct drm_gem_shmem_object *shmem,

shmem->vaddr = NULL;
}
-EXPORT_SYMBOL(drm_gem_shmem_vunmap);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_vunmap);

static int
drm_gem_shmem_create_with_handle(struct drm_file *file_priv,
@@ -435,7 +435,7 @@ int drm_gem_shmem_madvise(struct drm_gem_shmem_object *shmem, int madv)

return (madv >= 0);
}
-EXPORT_SYMBOL(drm_gem_shmem_madvise);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_madvise);

void drm_gem_shmem_purge(struct drm_gem_shmem_object *shmem)
{
@@ -467,7 +467,7 @@ void drm_gem_shmem_purge(struct drm_gem_shmem_object *shmem)

invalidate_mapping_pages(file_inode(obj->filp)->i_mapping, 0, (loff_t)-1);
}
-EXPORT_SYMBOL(drm_gem_shmem_purge);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_purge);

/**
* drm_gem_shmem_dumb_create - Create a dumb shmem buffer object
@@ -642,7 +642,7 @@ void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,
drm_printf_indent(p, indent, "vmap_use_count=%u\n", shmem->vmap_use_count);
drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
}
-EXPORT_SYMBOL(drm_gem_shmem_print_info);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_print_info);

/**
* drm_gem_shmem_get_sg_table - Provide a scatter/gather table of pinned
--
2.43.0


2024-01-05 18:48:01

by Dmitry Osipenko

Subject: [PATCH v19 03/30] drm/gem: Document locking rule of vmap and evict callbacks

The vmap/vunmap/evict GEM callbacks are always invoked with the GEM's
reservation lock held. Document this locking rule for clarity.
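
A minimal sketch of what this rule means for implementers (the driver and
function name below are hypothetical, not part of this patch): a ->vmap
callback may simply assert the reservation lock that the core vmap and
dma-buf paths already hold, instead of taking it again.

  static int mydrv_gem_vmap(struct drm_gem_object *obj,
                            struct iosys_map *map)
  {
          /* the caller (e.g. drm_gem_vmap_locked()) holds obj->resv */
          dma_resv_assert_held(obj->resv);

          /* ... build the kernel mapping and fill *map, no extra locking ... */

          return 0;
  }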

Signed-off-by: Dmitry Osipenko <[email protected]>
---
include/drm/drm_gem.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index f835fdee6a5e..021f64371056 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -156,7 +156,8 @@ struct drm_gem_object_funcs {
* @vmap:
*
* Returns a virtual address for the buffer. Used by the
- * drm_gem_dmabuf_vmap() helper.
+ * drm_gem_dmabuf_vmap() helper. Called with a held GEM reservation
+ * lock.
*
* This callback is optional.
*/
@@ -166,7 +167,8 @@ struct drm_gem_object_funcs {
* @vunmap:
*
* Releases the address previously returned by @vmap. Used by the
- * drm_gem_dmabuf_vunmap() helper.
+ * drm_gem_dmabuf_vunmap() helper. Called with a held GEM reservation
+ * lock.
*
* This callback is optional.
*/
@@ -189,7 +191,8 @@ struct drm_gem_object_funcs {
* @evict:
*
* Evicts gem object out from memory. Used by the drm_gem_object_evict()
- * helper. Returns 0 on success, -errno otherwise.
+ * helper. Returns 0 on success, -errno otherwise. Called with a held
+ * GEM reservation lock.
*
* This callback is optional.
*/
--
2.43.0


2024-01-05 18:48:15

by Dmitry Osipenko

Subject: [PATCH v19 05/30] drm/shmem-helper: Refactor locked/unlocked functions

Add locked and remove unlocked postfixes from drm-shmem function names,
making names consistent with the drm/gem core code.

Reviewed-by: Boris Brezillon <[email protected]>
Suggested-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 60 +++++++++----------
drivers/gpu/drm/lima/lima_gem.c | 6 +-
drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +-
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
include/drm/drm_gem_shmem_helper.h | 28 ++++-----
7 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 0d61f2b3e213..043e8e3b129c 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -153,7 +153,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
kfree(shmem->sgt);
}
if (shmem->pages)
- drm_gem_shmem_put_pages(shmem);
+ drm_gem_shmem_put_pages_locked(shmem);

drm_WARN_ON(obj->dev, shmem->pages_use_count);

@@ -165,7 +165,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_free);

-static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
+static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;
struct page **pages;
@@ -199,12 +199,12 @@ static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
}

/*
- * drm_gem_shmem_put_pages - Decrease use count on the backing pages for a shmem GEM object
+ * drm_gem_shmem_put_pages_locked - Decrease use count on the backing pages for a shmem GEM object
* @shmem: shmem GEM object
*
* This function decreases the use count and puts the backing pages when use drops to zero.
*/
-void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
+void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;

@@ -226,7 +226,7 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
shmem->pages_mark_accessed_on_put);
shmem->pages = NULL;
}
-EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);

static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
{
@@ -234,7 +234,7 @@ static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)

dma_resv_assert_held(shmem->base.resv);

- ret = drm_gem_shmem_get_pages(shmem);
+ ret = drm_gem_shmem_get_pages_locked(shmem);

return ret;
}
@@ -243,7 +243,7 @@ static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
{
dma_resv_assert_held(shmem->base.resv);

- drm_gem_shmem_put_pages(shmem);
+ drm_gem_shmem_put_pages_locked(shmem);
}

/**
@@ -293,7 +293,7 @@ void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem)
EXPORT_SYMBOL_GPL(drm_gem_shmem_unpin);

/*
- * drm_gem_shmem_vmap - Create a virtual mapping for a shmem GEM object
+ * drm_gem_shmem_vmap_locked - Create a virtual mapping for a shmem GEM object
* @shmem: shmem GEM object
* @map: Returns the kernel virtual address of the SHMEM GEM object's backing
* store.
@@ -302,13 +302,13 @@ EXPORT_SYMBOL_GPL(drm_gem_shmem_unpin);
* exists for the buffer backing the shmem GEM object. It hides the differences
* between dma-buf imported and natively allocated objects.
*
- * Acquired mappings should be cleaned up by calling drm_gem_shmem_vunmap().
+ * Acquired mappings should be cleaned up by calling drm_gem_shmem_vunmap_locked().
*
* Returns:
* 0 on success or a negative error code on failure.
*/
-int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem,
- struct iosys_map *map)
+int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
+ struct iosys_map *map)
{
struct drm_gem_object *obj = &shmem->base;
int ret = 0;
@@ -331,7 +331,7 @@ int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem,
return 0;
}

- ret = drm_gem_shmem_get_pages(shmem);
+ ret = drm_gem_shmem_get_pages_locked(shmem);
if (ret)
goto err_zero_use;

@@ -354,28 +354,28 @@ int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem,

err_put_pages:
if (!obj->import_attach)
- drm_gem_shmem_put_pages(shmem);
+ drm_gem_shmem_put_pages_locked(shmem);
err_zero_use:
shmem->vmap_use_count = 0;

return ret;
}
-EXPORT_SYMBOL_GPL(drm_gem_shmem_vmap);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_vmap_locked);

/*
- * drm_gem_shmem_vunmap - Unmap a virtual mapping for a shmem GEM object
+ * drm_gem_shmem_vunmap_locked - Unmap a virtual mapping for a shmem GEM object
* @shmem: shmem GEM object
* @map: Kernel virtual address where the SHMEM GEM object was mapped
*
* This function cleans up a kernel virtual address mapping acquired by
- * drm_gem_shmem_vmap(). The mapping is only removed when the use count drops to
- * zero.
+ * drm_gem_shmem_vmap_locked(). The mapping is only removed when the use count
+ * drops to zero.
*
* This function hides the differences between dma-buf imported and natively
* allocated objects.
*/
-void drm_gem_shmem_vunmap(struct drm_gem_shmem_object *shmem,
- struct iosys_map *map)
+void drm_gem_shmem_vunmap_locked(struct drm_gem_shmem_object *shmem,
+ struct iosys_map *map)
{
struct drm_gem_object *obj = &shmem->base;

@@ -391,12 +391,12 @@ void drm_gem_shmem_vunmap(struct drm_gem_shmem_object *shmem,
return;

vunmap(shmem->vaddr);
- drm_gem_shmem_put_pages(shmem);
+ drm_gem_shmem_put_pages_locked(shmem);
}

shmem->vaddr = NULL;
}
-EXPORT_SYMBOL_GPL(drm_gem_shmem_vunmap);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_vunmap_locked);

static int
drm_gem_shmem_create_with_handle(struct drm_file *file_priv,
@@ -424,7 +424,7 @@ drm_gem_shmem_create_with_handle(struct drm_file *file_priv,
/* Update madvise status, returns true if not purged, else
* false or -errno.
*/
-int drm_gem_shmem_madvise(struct drm_gem_shmem_object *shmem, int madv)
+int drm_gem_shmem_madvise_locked(struct drm_gem_shmem_object *shmem, int madv)
{
dma_resv_assert_held(shmem->base.resv);

@@ -435,9 +435,9 @@ int drm_gem_shmem_madvise(struct drm_gem_shmem_object *shmem, int madv)

return (madv >= 0);
}
-EXPORT_SYMBOL_GPL(drm_gem_shmem_madvise);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_madvise_locked);

-void drm_gem_shmem_purge(struct drm_gem_shmem_object *shmem)
+void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;
struct drm_device *dev = obj->dev;
@@ -451,7 +451,7 @@ void drm_gem_shmem_purge(struct drm_gem_shmem_object *shmem)
kfree(shmem->sgt);
shmem->sgt = NULL;

- drm_gem_shmem_put_pages(shmem);
+ drm_gem_shmem_put_pages_locked(shmem);

shmem->madv = -1;

@@ -467,7 +467,7 @@ void drm_gem_shmem_purge(struct drm_gem_shmem_object *shmem)

invalidate_mapping_pages(file_inode(obj->filp)->i_mapping, 0, (loff_t)-1);
}
-EXPORT_SYMBOL_GPL(drm_gem_shmem_purge);
+EXPORT_SYMBOL_GPL(drm_gem_shmem_purge_locked);

/**
* drm_gem_shmem_dumb_create - Create a dumb shmem buffer object
@@ -564,7 +564,7 @@ static void drm_gem_shmem_vm_close(struct vm_area_struct *vma)
struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);

dma_resv_lock(shmem->base.resv, NULL);
- drm_gem_shmem_put_pages(shmem);
+ drm_gem_shmem_put_pages_locked(shmem);
dma_resv_unlock(shmem->base.resv);

drm_gem_vm_close(vma);
@@ -611,7 +611,7 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
}

dma_resv_lock(shmem->base.resv, NULL);
- ret = drm_gem_shmem_get_pages(shmem);
+ ret = drm_gem_shmem_get_pages_locked(shmem);
dma_resv_unlock(shmem->base.resv);

if (ret)
@@ -679,7 +679,7 @@ static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_

drm_WARN_ON(obj->dev, obj->import_attach);

- ret = drm_gem_shmem_get_pages(shmem);
+ ret = drm_gem_shmem_get_pages_locked(shmem);
if (ret)
return ERR_PTR(ret);

@@ -701,7 +701,7 @@ static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_
sg_free_table(sgt);
kfree(sgt);
err_put_pages:
- drm_gem_shmem_put_pages(shmem);
+ drm_gem_shmem_put_pages_locked(shmem);
return ERR_PTR(ret);
}

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 4f9736e5f929..433bda72e59b 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -180,7 +180,7 @@ static int lima_gem_pin(struct drm_gem_object *obj)
if (bo->heap_size)
return -EINVAL;

- return drm_gem_shmem_pin(&bo->base);
+ return drm_gem_shmem_object_pin(obj);
}

static int lima_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
@@ -190,7 +190,7 @@ static int lima_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
if (bo->heap_size)
return -EINVAL;

- return drm_gem_shmem_vmap(&bo->base, map);
+ return drm_gem_shmem_object_vmap(obj, map);
}

static int lima_gem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
@@ -200,7 +200,7 @@ static int lima_gem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
if (bo->heap_size)
return -EINVAL;

- return drm_gem_shmem_mmap(&bo->base, vma);
+ return drm_gem_shmem_object_mmap(obj, vma);
}

static const struct drm_gem_object_funcs lima_gem_funcs = {
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index a926d71e8131..a15d62f19afb 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -438,7 +438,7 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,
}
}

- args->retained = drm_gem_shmem_madvise(&bo->base, args->madv);
+ args->retained = drm_gem_shmem_madvise_locked(&bo->base, args->madv);

if (args->retained) {
if (args->madv == PANFROST_MADV_DONTNEED)
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index d47b40b82b0b..f268bd5c2884 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -192,7 +192,7 @@ static int panfrost_gem_pin(struct drm_gem_object *obj)
if (bo->is_heap)
return -EINVAL;

- return drm_gem_shmem_pin(&bo->base);
+ return drm_gem_shmem_object_pin(obj);
}

static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj)
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
index 3d9f51bd48b6..02b60ea1433a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
@@ -51,7 +51,7 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj)
goto unlock_mappings;

panfrost_gem_teardown_mappings_locked(bo);
- drm_gem_shmem_purge(&bo->base);
+ drm_gem_shmem_purge_locked(&bo->base);
ret = true;

dma_resv_unlock(shmem->base.resv);
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index f38385fe76bb..1ab081bd81a8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -538,7 +538,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
err_map:
sg_free_table(sgt);
err_pages:
- drm_gem_shmem_put_pages(&bo->base);
+ drm_gem_shmem_put_pages_locked(&bo->base);
err_unlock:
dma_resv_unlock(obj->resv);
err_bo:
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index bf0c31aa8fbe..9e83212becbb 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -99,16 +99,16 @@ struct drm_gem_shmem_object {
struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t size);
void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem);

-void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem);
+void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem);
int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem);
void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem);
-int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem,
- struct iosys_map *map);
-void drm_gem_shmem_vunmap(struct drm_gem_shmem_object *shmem,
- struct iosys_map *map);
+int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
+ struct iosys_map *map);
+void drm_gem_shmem_vunmap_locked(struct drm_gem_shmem_object *shmem,
+ struct iosys_map *map);
int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct *vma);

-int drm_gem_shmem_madvise(struct drm_gem_shmem_object *shmem, int madv);
+int drm_gem_shmem_madvise_locked(struct drm_gem_shmem_object *shmem, int madv);

static inline bool drm_gem_shmem_is_purgeable(struct drm_gem_shmem_object *shmem)
{
@@ -117,7 +117,7 @@ static inline bool drm_gem_shmem_is_purgeable(struct drm_gem_shmem_object *shmem
!shmem->base.dma_buf && !shmem->base.import_attach;
}

-void drm_gem_shmem_purge(struct drm_gem_shmem_object *shmem);
+void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem);

struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem);
struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_shmem_object *shmem);
@@ -208,12 +208,12 @@ static inline struct sg_table *drm_gem_shmem_object_get_sg_table(struct drm_gem_
}

/*
- * drm_gem_shmem_object_vmap - GEM object function for drm_gem_shmem_vmap()
+ * drm_gem_shmem_object_vmap - GEM object function for drm_gem_shmem_vmap_locked()
* @obj: GEM object
* @map: Returns the kernel virtual address of the SHMEM GEM object's backing store.
*
- * This function wraps drm_gem_shmem_vmap(). Drivers that employ the shmem helpers should
- * use it as their &drm_gem_object_funcs.vmap handler.
+ * This function wraps drm_gem_shmem_vmap_locked(). Drivers that employ the shmem
+ * helpers should use it as their &drm_gem_object_funcs.vmap handler.
*
* Returns:
* 0 on success or a negative error code on failure.
@@ -223,7 +223,7 @@ static inline int drm_gem_shmem_object_vmap(struct drm_gem_object *obj,
{
struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);

- return drm_gem_shmem_vmap(shmem, map);
+ return drm_gem_shmem_vmap_locked(shmem, map);
}

/*
@@ -231,15 +231,15 @@ static inline int drm_gem_shmem_object_vmap(struct drm_gem_object *obj,
* @obj: GEM object
* @map: Kernel virtual address where the SHMEM GEM object was mapped
*
- * This function wraps drm_gem_shmem_vunmap(). Drivers that employ the shmem helpers should
- * use it as their &drm_gem_object_funcs.vunmap handler.
+ * This function wraps drm_gem_shmem_vunmap_locked(). Drivers that employ the shmem
+ * helpers should use it as their &drm_gem_object_funcs.vunmap handler.
*/
static inline void drm_gem_shmem_object_vunmap(struct drm_gem_object *obj,
struct iosys_map *map)
{
struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);

- drm_gem_shmem_vunmap(shmem, map);
+ drm_gem_shmem_vunmap_locked(shmem, map);
}

/**
--
2.43.0


2024-01-05 18:48:24

by Dmitry Osipenko

Subject: [PATCH v19 06/30] drm/shmem-helper: Remove obsoleted is_iomem test

Everything that uses the mapped buffer should be agnostic to is_iomem.
The only reason for the is_iomem test is that we're setting shmem->vaddr
to the returned map->vaddr. Now that the shmem->vaddr code is gone, remove
the obsoleted is_iomem test to clean up the code.

Acked-by: Maxime Ripard <[email protected]>
Suggested-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 043e8e3b129c..1f0a66386415 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -315,12 +315,6 @@ int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,

if (obj->import_attach) {
ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
- if (!ret) {
- if (drm_WARN_ON(obj->dev, map->is_iomem)) {
- dma_buf_vunmap(obj->import_attach->dmabuf, map);
- return -EIO;
- }
- }
} else {
pgprot_t prot = PAGE_KERNEL;

--
2.43.0


2024-01-05 18:48:46

by Dmitry Osipenko

Subject: [PATCH v19 07/30] drm/shmem-helper: Add and use pages_pin_count

Add a separate pages_pin_count for tracking whether drm-shmem pages are
movable or not. With the addition of memory shrinker support to drm-shmem,
pages_use_count will no longer determine whether pages are hard-pinned
in memory, but only whether pages exist and are soft-pinned (and could be
swapped out). A pages_pin_count > 0 will hard-pin pages in memory.
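
A hedged usage sketch (hypothetical driver code, not part of this patch):
a driver that needs the pages to stay resident for the duration of some
work takes a hard pin, while a plain pages reference leaves them eligible
for eviction once the shrinker lands.

  int mydrv_run_job(struct drm_gem_shmem_object *shmem)
  {
          int ret;

          /* pages_pin_count 0 -> 1 hard-pins the pages in memory */
          ret = drm_gem_shmem_pin(shmem);
          if (ret)
                  return ret;

          /* ... hardware accesses the pages; the shrinker must not evict them ... */

          /* dropping the last pin makes the pages evictable again */
          drm_gem_shmem_unpin(shmem);

          return 0;
  }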

Acked-by: Maxime Ripard <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Suggested-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 25 +++++++++++++++++--------
include/drm/drm_gem_shmem_helper.h | 11 +++++++++++
2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 1f0a66386415..55b9dd3d4b18 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -156,6 +156,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
drm_gem_shmem_put_pages_locked(shmem);

drm_WARN_ON(obj->dev, shmem->pages_use_count);
+ drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_pin_count));

dma_resv_unlock(shmem->base.resv);
}
@@ -234,18 +235,16 @@ static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)

dma_resv_assert_held(shmem->base.resv);

+ if (refcount_inc_not_zero(&shmem->pages_pin_count))
+ return 0;
+
ret = drm_gem_shmem_get_pages_locked(shmem);
+ if (!ret)
+ refcount_set(&shmem->pages_pin_count, 1);

return ret;
}

-static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
-{
- dma_resv_assert_held(shmem->base.resv);
-
- drm_gem_shmem_put_pages_locked(shmem);
-}
-
/**
* drm_gem_shmem_pin - Pin backing pages for a shmem GEM object
* @shmem: shmem GEM object
@@ -263,6 +262,9 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)

drm_WARN_ON(obj->dev, obj->import_attach);

+ if (refcount_inc_not_zero(&shmem->pages_pin_count))
+ return 0;
+
ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
if (ret)
return ret;
@@ -286,8 +288,14 @@ void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem)

drm_WARN_ON(obj->dev, obj->import_attach);

+ if (refcount_dec_not_one(&shmem->pages_pin_count))
+ return;
+
dma_resv_lock(shmem->base.resv, NULL);
- drm_gem_shmem_unpin_locked(shmem);
+
+ if (refcount_dec_and_test(&shmem->pages_pin_count))
+ drm_gem_shmem_put_pages_locked(shmem);
+
dma_resv_unlock(shmem->base.resv);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_unpin);
@@ -632,6 +640,7 @@ void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,
if (shmem->base.import_attach)
return;

+ drm_printf_indent(p, indent, "pages_pin_count=%u\n", refcount_read(&shmem->pages_pin_count));
drm_printf_indent(p, indent, "pages_use_count=%u\n", shmem->pages_use_count);
drm_printf_indent(p, indent, "vmap_use_count=%u\n", shmem->vmap_use_count);
drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index 9e83212becbb..c708a9f45cbd 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -39,6 +39,17 @@ struct drm_gem_shmem_object {
*/
unsigned int pages_use_count;

+ /**
+ * @pages_pin_count:
+ *
+ * Reference count on the pinned pages table.
+ *
+ * Pages are hard-pinned and reside in memory if count
+ * greater than zero. Otherwise, when count is zero, the pages are
+ * allowed to be evicted and purged by memory shrinker.
+ */
+ refcount_t pages_pin_count;
+
/**
* @madv: State for madvise
*
--
2.43.0


2024-01-05 18:49:03

by Dmitry Osipenko

Subject: [PATCH v19 08/30] drm/shmem-helper: Use refcount_t for pages_use_count

Use the atomic refcount_t helper for pages_use_count to optimize the
pin/unpin functions by skipping reservation locking while the GEM's pin
refcount is > 1.
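
Sketch of the lock-free fast path that this conversion makes possible (the
helper name is hypothetical; the real conversions are in the diff below):
an extra reference can be taken without the reservation lock whenever one
already exists, and only the 0 -> 1 transition needs the lock.

  static int mydrv_get_pages(struct drm_gem_shmem_object *shmem)
  {
          /* fast path: pages already allocated, no reservation lock needed */
          if (refcount_inc_not_zero(&shmem->pages_use_count))
                  return 0;

          dma_resv_lock(shmem->base.resv, NULL);
          /* slow path: allocate the pages, then refcount_set(&pages_use_count, 1) */
          dma_resv_unlock(shmem->base.resv);

          return 0;
  }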

Acked-by: Maxime Ripard <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Suggested-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 33 +++++++++++--------------
drivers/gpu/drm/lima/lima_gem.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
include/drm/drm_gem_shmem_helper.h | 2 +-
4 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 55b9dd3d4b18..cacf0f8c42e2 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -155,7 +155,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
if (shmem->pages)
drm_gem_shmem_put_pages_locked(shmem);

- drm_WARN_ON(obj->dev, shmem->pages_use_count);
+ drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count));
drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_pin_count));

dma_resv_unlock(shmem->base.resv);
@@ -173,14 +173,13 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)

dma_resv_assert_held(shmem->base.resv);

- if (shmem->pages_use_count++ > 0)
+ if (refcount_inc_not_zero(&shmem->pages_use_count))
return 0;

pages = drm_gem_get_pages(obj);
if (IS_ERR(pages)) {
drm_dbg_kms(obj->dev, "Failed to get pages (%ld)\n",
PTR_ERR(pages));
- shmem->pages_use_count = 0;
return PTR_ERR(pages);
}

@@ -196,6 +195,8 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)

shmem->pages = pages;

+ refcount_set(&shmem->pages_use_count, 1);
+
return 0;
}

@@ -211,21 +212,17 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)

dma_resv_assert_held(shmem->base.resv);

- if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
- return;
-
- if (--shmem->pages_use_count > 0)
- return;
-
+ if (refcount_dec_and_test(&shmem->pages_use_count)) {
#ifdef CONFIG_X86
- if (shmem->map_wc)
- set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+ if (shmem->map_wc)
+ set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
#endif

- drm_gem_put_pages(obj, shmem->pages,
- shmem->pages_mark_dirty_on_put,
- shmem->pages_mark_accessed_on_put);
- shmem->pages = NULL;
+ drm_gem_put_pages(obj, shmem->pages,
+ shmem->pages_mark_dirty_on_put,
+ shmem->pages_mark_accessed_on_put);
+ shmem->pages = NULL;
+ }
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);

@@ -552,8 +549,8 @@ static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
* mmap'd, vm_open() just grabs an additional reference for the new
* mm the vma is getting copied into (ie. on fork()).
*/
- if (!drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
- shmem->pages_use_count++;
+ drm_WARN_ON_ONCE(obj->dev,
+ !refcount_inc_not_zero(&shmem->pages_use_count));

dma_resv_unlock(shmem->base.resv);

@@ -641,7 +638,7 @@ void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,
return;

drm_printf_indent(p, indent, "pages_pin_count=%u\n", refcount_read(&shmem->pages_pin_count));
- drm_printf_indent(p, indent, "pages_use_count=%u\n", shmem->pages_use_count);
+ drm_printf_indent(p, indent, "pages_use_count=%u\n", refcount_read(&shmem->pages_use_count));
drm_printf_indent(p, indent, "vmap_use_count=%u\n", shmem->vmap_use_count);
drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
}
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 433bda72e59b..2a97aa85416b 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -47,7 +47,7 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
}

bo->base.pages = pages;
- bo->base.pages_use_count = 1;
+ refcount_set(&bo->base.pages_use_count, 1);

mapping_set_unevictable(mapping);
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 1ab081bd81a8..bd5a0073009d 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -489,7 +489,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
goto err_unlock;
}
bo->base.pages = pages;
- bo->base.pages_use_count = 1;
+ refcount_set(&bo->base.pages_use_count, 1);
} else {
pages = bo->base.pages;
if (pages[page_offset]) {
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index c708a9f45cbd..2c5dc62df20c 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -37,7 +37,7 @@ struct drm_gem_shmem_object {
* Reference count on the pages table.
* The pages are put when the count reaches zero.
*/
- unsigned int pages_use_count;
+ refcount_t pages_use_count;

/**
* @pages_pin_count:
--
2.43.0


2024-01-05 18:49:18

by Dmitry Osipenko

Subject: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

Add a lockless drm_gem_shmem_get_pages() helper that skips taking the
reservation lock if pages_use_count is non-zero, leveraging the atomicity
of refcount_t. Make drm_gem_shmem_mmap() use the new helper.

Acked-by: Maxime Ripard <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Suggested-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index cacf0f8c42e2..1c032513abf1 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);

+static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
+{
+ int ret;
+
+ if (refcount_inc_not_zero(&shmem->pages_use_count))
+ return 0;
+
+ dma_resv_lock(shmem->base.resv, NULL);
+ ret = drm_gem_shmem_get_pages_locked(shmem);
+ dma_resv_unlock(shmem->base.resv);
+
+ return ret;
+}
+
static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
{
int ret;
@@ -609,10 +623,7 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
return ret;
}

- dma_resv_lock(shmem->base.resv, NULL);
- ret = drm_gem_shmem_get_pages_locked(shmem);
- dma_resv_unlock(shmem->base.resv);
-
+ ret = drm_gem_shmem_get_pages(shmem);
if (ret)
return ret;

--
2.43.0


2024-01-05 18:49:38

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 10/30] drm/shmem-helper: Switch drm_gem_shmem_vmap/vunmap to use pin/unpin

Vmapped pages must be pinned in memory, and previously get/put_pages()
were implicitly hard-pinning/unpinning the pages. This will no longer be
the case with the addition of the memory shrinker, because pages_use_count > 0
will no longer determine whether pages are hard-pinned (they will be
soft-pinned), while the new pages_pin_count will do the hard-pinning.
Switch vmap/vunmap() to use the pin/unpin() functions in preparation for
adding memory shrinker support to drm-shmem.

Acked-by: Maxime Ripard <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 19 ++++++++++++-------
include/drm/drm_gem_shmem_helper.h | 2 +-
2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 1c032513abf1..9c89183f81b7 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -256,6 +256,14 @@ static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
return ret;
}

+static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
+{
+ dma_resv_assert_held(shmem->base.resv);
+
+ if (refcount_dec_and_test(&shmem->pages_pin_count))
+ drm_gem_shmem_put_pages_locked(shmem);
+}
+
/**
* drm_gem_shmem_pin - Pin backing pages for a shmem GEM object
* @shmem: shmem GEM object
@@ -303,10 +311,7 @@ void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem)
return;

dma_resv_lock(shmem->base.resv, NULL);
-
- if (refcount_dec_and_test(&shmem->pages_pin_count))
- drm_gem_shmem_put_pages_locked(shmem);
-
+ drm_gem_shmem_unpin_locked(shmem);
dma_resv_unlock(shmem->base.resv);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_unpin);
@@ -344,7 +349,7 @@ int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
return 0;
}

- ret = drm_gem_shmem_get_pages_locked(shmem);
+ ret = drm_gem_shmem_pin_locked(shmem);
if (ret)
goto err_zero_use;

@@ -367,7 +372,7 @@ int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,

err_put_pages:
if (!obj->import_attach)
- drm_gem_shmem_put_pages_locked(shmem);
+ drm_gem_shmem_unpin_locked(shmem);
err_zero_use:
shmem->vmap_use_count = 0;

@@ -404,7 +409,7 @@ void drm_gem_shmem_vunmap_locked(struct drm_gem_shmem_object *shmem,
return;

vunmap(shmem->vaddr);
- drm_gem_shmem_put_pages_locked(shmem);
+ drm_gem_shmem_unpin_locked(shmem);
}

shmem->vaddr = NULL;
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index 2c5dc62df20c..80623b897803 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -124,7 +124,7 @@ int drm_gem_shmem_madvise_locked(struct drm_gem_shmem_object *shmem, int madv);
static inline bool drm_gem_shmem_is_purgeable(struct drm_gem_shmem_object *shmem)
{
return (shmem->madv > 0) &&
- !shmem->vmap_use_count && shmem->sgt &&
+ !refcount_read(&shmem->pages_pin_count) && shmem->sgt &&
!shmem->base.dma_buf && !shmem->base.import_attach;
}

--
2.43.0


2024-01-05 18:50:05

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 11/30] drm/shmem-helper: Use refcount_t for vmap_use_count

Use the refcount_t helper for vmap_use_count to make refcounting
consistent with pages_use_count and pages_pin_count, which already use
refcount_t. This also lets vmapping benefit from refcount_t's overflow checks.

Acked-by: Maxime Ripard <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Suggested-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 28 +++++++++++---------------
include/drm/drm_gem_shmem_helper.h | 2 +-
2 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 9c89183f81b7..3403700780c3 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -144,7 +144,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
} else {
dma_resv_lock(shmem->base.resv, NULL);

- drm_WARN_ON(obj->dev, shmem->vmap_use_count);
+ drm_WARN_ON(obj->dev, refcount_read(&shmem->vmap_use_count));

if (shmem->sgt) {
dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
@@ -344,23 +344,25 @@ int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,

dma_resv_assert_held(shmem->base.resv);

- if (shmem->vmap_use_count++ > 0) {
+ if (refcount_inc_not_zero(&shmem->vmap_use_count)) {
iosys_map_set_vaddr(map, shmem->vaddr);
return 0;
}

ret = drm_gem_shmem_pin_locked(shmem);
if (ret)
- goto err_zero_use;
+ return ret;

if (shmem->map_wc)
prot = pgprot_writecombine(prot);
shmem->vaddr = vmap(shmem->pages, obj->size >> PAGE_SHIFT,
VM_MAP, prot);
- if (!shmem->vaddr)
+ if (!shmem->vaddr) {
ret = -ENOMEM;
- else
+ } else {
iosys_map_set_vaddr(map, shmem->vaddr);
+ refcount_set(&shmem->vmap_use_count, 1);
+ }
}

if (ret) {
@@ -373,8 +375,6 @@ int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
err_put_pages:
if (!obj->import_attach)
drm_gem_shmem_unpin_locked(shmem);
-err_zero_use:
- shmem->vmap_use_count = 0;

return ret;
}
@@ -402,14 +402,10 @@ void drm_gem_shmem_vunmap_locked(struct drm_gem_shmem_object *shmem,
} else {
dma_resv_assert_held(shmem->base.resv);

- if (drm_WARN_ON_ONCE(obj->dev, !shmem->vmap_use_count))
- return;
-
- if (--shmem->vmap_use_count > 0)
- return;
-
- vunmap(shmem->vaddr);
- drm_gem_shmem_unpin_locked(shmem);
+ if (refcount_dec_and_test(&shmem->vmap_use_count)) {
+ vunmap(shmem->vaddr);
+ drm_gem_shmem_unpin_locked(shmem);
+ }
}

shmem->vaddr = NULL;
@@ -655,7 +651,7 @@ void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,

drm_printf_indent(p, indent, "pages_pin_count=%u\n", refcount_read(&shmem->pages_pin_count));
drm_printf_indent(p, indent, "pages_use_count=%u\n", refcount_read(&shmem->pages_use_count));
- drm_printf_indent(p, indent, "vmap_use_count=%u\n", shmem->vmap_use_count);
+ drm_printf_indent(p, indent, "vmap_use_count=%u\n", refcount_read(&shmem->vmap_use_count));
drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_print_info);
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index 80623b897803..18020f653d7e 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -82,7 +82,7 @@ struct drm_gem_shmem_object {
* Reference count on the virtual address.
* The address are un-mapped when the count reaches zero.
*/
- unsigned int vmap_use_count;
+ refcount_t vmap_use_count;

/**
* @pages_mark_dirty_on_put:
--
2.43.0


2024-01-05 18:50:17

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 12/30] drm/shmem-helper: Prepare drm_gem_shmem_free() to shrinker addition

Prepare drm_gem_shmem_free() for the addition of memory shrinker support
to drm-shmem by adding and using a variant of put_pages() that doesn't
touch the reservation lock. The reservation lock shouldn't be taken here
because lockdep would emit a bogus warning about lock contention with the
fs_reclaim code paths; such contention cannot happen while a GEM is being
freed, but lockdep doesn't know that.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 40 ++++++++++++++------------
1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 3403700780c3..799a3c5015ad 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -128,6 +128,22 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_create);

+static void
+drm_gem_shmem_free_pages(struct drm_gem_shmem_object *shmem)
+{
+ struct drm_gem_object *obj = &shmem->base;
+
+#ifdef CONFIG_X86
+ if (shmem->map_wc)
+ set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+#endif
+
+ drm_gem_put_pages(obj, shmem->pages,
+ shmem->pages_mark_dirty_on_put,
+ shmem->pages_mark_accessed_on_put);
+ shmem->pages = NULL;
+}
+
/**
* drm_gem_shmem_free - Free resources associated with a shmem GEM object
* @shmem: shmem GEM object to free
@@ -142,8 +158,6 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
if (obj->import_attach) {
drm_prime_gem_destroy(obj, shmem->sgt);
} else {
- dma_resv_lock(shmem->base.resv, NULL);
-
drm_WARN_ON(obj->dev, refcount_read(&shmem->vmap_use_count));

if (shmem->sgt) {
@@ -152,13 +166,12 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
}
- if (shmem->pages)
- drm_gem_shmem_put_pages_locked(shmem);
+ if (shmem->pages &&
+ refcount_dec_and_test(&shmem->pages_use_count))
+ drm_gem_shmem_free_pages(shmem);

drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count));
drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_pin_count));
-
- dma_resv_unlock(shmem->base.resv);
}

drm_gem_object_release(obj);
@@ -208,21 +221,10 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
*/
void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
{
- struct drm_gem_object *obj = &shmem->base;
-
dma_resv_assert_held(shmem->base.resv);

- if (refcount_dec_and_test(&shmem->pages_use_count)) {
-#ifdef CONFIG_X86
- if (shmem->map_wc)
- set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
-#endif
-
- drm_gem_put_pages(obj, shmem->pages,
- shmem->pages_mark_dirty_on_put,
- shmem->pages_mark_accessed_on_put);
- shmem->pages = NULL;
- }
+ if (refcount_dec_and_test(&shmem->pages_use_count))
+ drm_gem_shmem_free_pages(shmem);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);

--
2.43.0


2024-01-05 18:50:21

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 13/30] drm/shmem-helper: Make drm_gem_shmem_get_pages() public

We're going to move away from the implicit get_pages() done by
get_pages_sgt() in order to simplify refcnt handling. Drivers will manage
get/put_pages() by themselves. Expose drm_gem_shmem_get_pages()
as a public drm-shmem API.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 10 +++++++++-
include/drm/drm_gem_shmem_helper.h | 1 +
2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 799a3c5015ad..dc416a4bce1b 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -228,7 +228,14 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);

-static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
+/*
+ * drm_gem_shmem_get_pages - Increase use count on the backing pages for a shmem GEM object
+ * @shmem: shmem GEM object
+ *
+ * This function increases the use count and allocates the backing pages if
+ * the use count is zero.
+ */
+int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
{
int ret;

@@ -241,6 +248,7 @@ static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)

return ret;
}
+EXPORT_SYMBOL_GPL(drm_gem_shmem_get_pages);

static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
{
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index 18020f653d7e..6dedc0739fbc 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -110,6 +110,7 @@ struct drm_gem_shmem_object {
struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t size);
void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem);

+int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem);
void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem);
int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem);
void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem);
--
2.43.0


2024-01-05 18:50:41

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 14/30] drm/shmem-helper: Add drm_gem_shmem_put_pages()

We're going to move away from the implicit get_pages() done by
get_pages_sgt() in order to simplify refcnt handling. Drivers will manage
get/put_pages() by themselves. Add drm_gem_shmem_put_pages().
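
As a rough illustration of the intended driver-side pairing (the mydrv_*
names are hypothetical; only the drm_gem_shmem_* helpers are part of the
drm-shmem API):

#include <drm/drm_gem_shmem_helper.h>

/* The driver takes its own pages reference at object-creation time... */
static int mydrv_bo_init(struct drm_gem_shmem_object *shmem)
{
	return drm_gem_shmem_get_pages(shmem);
}

/* ...and drops it when the object is destroyed, instead of relying on
 * get_pages_sgt() to manage the reference implicitly.
 */
static void mydrv_bo_fini(struct drm_gem_shmem_object *shmem)
{
	drm_gem_shmem_put_pages(shmem);
	drm_gem_shmem_free(shmem);
}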

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 20 ++++++++++++++++++++
include/drm/drm_gem_shmem_helper.h | 1 +
2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index dc416a4bce1b..f5ed64f78648 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -218,6 +218,7 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
* @shmem: shmem GEM object
*
* This function decreases the use count and puts the backing pages when use drops to zero.
+ * Caller must hold GEM's reservation lock.
*/
void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
{
@@ -228,6 +229,25 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);

+/*
+ * drm_gem_shmem_put_pages - Decrease use count on the backing pages for a shmem GEM object
+ * @shmem: shmem GEM object
+ *
+ * This function decreases the use count and puts the backing pages when use drops to zero.
+ * It's the unlocked version of drm_gem_shmem_put_pages_locked(); the caller
+ * must not hold the GEM's reservation lock.
+ */
+void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
+{
+ if (refcount_dec_not_one(&shmem->pages_use_count))
+ return;
+
+ dma_resv_lock(shmem->base.resv, NULL);
+ drm_gem_shmem_put_pages_locked(shmem);
+ dma_resv_unlock(shmem->base.resv);
+}
+EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages);
+
/*
* drm_gem_shmem_get_pages - Increase use count on the backing pages for a shmem GEM object
* @shmem: shmem GEM object
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index 6dedc0739fbc..525480488451 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -111,6 +111,7 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t
void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem);

int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem);
+void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem);
void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem);
int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem);
void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem);
--
2.43.0


2024-01-05 18:50:50

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 15/30] drm/shmem-helper: Avoid lockdep warning when pages are released

All drivers will be moved to explicit get/put of pages, and for some
drivers the last put_pages() will then be invoked at gem_free() time.
We can't take the reservation lock when the GEM is freed because that
would cause a spurious lockdep warning once shrinker support is added.
Lockdep doesn't know that fs_reclaim can't run for a freed object and
thus can't deadlock with it. Release the pages directly, without taking
the reservation lock, when the GEM is freed and its refcount is zero.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index f5ed64f78648..c7357110ca76 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -242,6 +242,22 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
if (refcount_dec_not_one(&shmem->pages_use_count))
return;

+ /*
+ * Destroying the object is a special case because acquiring
+ * the obj lock can cause a locking order inversion between
+ * reservation_ww_class_mutex and fs_reclaim.
+ *
+ * This deadlock is not actually possible, because no one should
+ * be already holding the lock when GEM is released. Unfortunately
+ * lockdep is not aware of this detail. So when the refcount drops
+ * to zero, we pretend it is already locked.
+ */
+ if (!kref_read(&shmem->base.refcount)) {
+ if (refcount_dec_and_test(&shmem->pages_use_count))
+ drm_gem_shmem_free_pages(shmem);
+ return;
+ }
+
dma_resv_lock(shmem->base.resv, NULL);
drm_gem_shmem_put_pages_locked(shmem);
dma_resv_unlock(shmem->base.resv);
--
2.43.0


2024-01-05 18:51:06

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 16/30] drm/lima: Explicitly get and put drm-shmem pages

To simplify the drm-shmem refcnt handling, we're moving away from the
implicit get_pages() done by get_pages_sgt(). From now on drivers will
have to pin the pages while they use the sgt. The Lima driver doesn't
have a shrinker, hence the pages stay pinned and the sgt is valid as long
as the pages' use-count > 0.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/lima/lima_gem.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 2a97aa85416b..9c3e34a7fbed 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -115,6 +115,7 @@ int lima_gem_create_handle(struct drm_device *dev, struct drm_file *file,
return PTR_ERR(shmem);

obj = &shmem->base;
+ bo = to_lima_bo(obj);

/* Mali Utgard GPU can only support 32bit address space */
mask = mapping_gfp_mask(obj->filp->f_mapping);
@@ -123,13 +124,17 @@ int lima_gem_create_handle(struct drm_device *dev, struct drm_file *file,
mapping_set_gfp_mask(obj->filp->f_mapping, mask);

if (is_heap) {
- bo = to_lima_bo(obj);
err = lima_heap_alloc(bo, NULL);
if (err)
goto out;
} else {
- struct sg_table *sgt = drm_gem_shmem_get_pages_sgt(shmem);
+ struct sg_table *sgt;

+ err = drm_gem_shmem_get_pages(shmem);
+ if (err)
+ goto out;
+
+ sgt = drm_gem_shmem_get_pages_sgt(shmem);
if (IS_ERR(sgt)) {
err = PTR_ERR(sgt);
goto out;
@@ -139,6 +144,9 @@ int lima_gem_create_handle(struct drm_device *dev, struct drm_file *file,
err = drm_gem_handle_create(file, obj, handle);

out:
+ if (err && refcount_read(&bo->base.pages_use_count))
+ drm_gem_shmem_put_pages(shmem);
+
/* drop reference from allocate - handle holds it now */
drm_gem_object_put(obj);

@@ -152,6 +160,9 @@ static void lima_gem_free_object(struct drm_gem_object *obj)
if (!list_empty(&bo->va))
dev_err(obj->dev->dev, "lima gem free bo still has va\n");

+ if (refcount_read(&bo->base.pages_use_count))
+ drm_gem_shmem_put_pages(&bo->base);
+
drm_gem_shmem_free(&bo->base);
}

--
2.43.0


2024-01-05 18:51:22

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 17/30] drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()

From: Boris Brezillon <[email protected]>

If some of the pages or the sgt allocation failed, we shouldn't release
the pages ref we got earlier; otherwise we will end up with unbalanced
get/put_pages() calls. We should instead leave everything in place and
let the BO release function deal with the extra cleanup when the object
is destroyed, or let the fault handler try again the next time it's called.

Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
Cc: <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Co-developed-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index bd5a0073009d..4a0b4bf03f1a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -502,11 +502,18 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
mapping_set_unevictable(mapping);

for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
+ /* Can happen if the last fault only partially filled this
+ * section of the pages array before failing. In that case
+ * we skip already filled pages.
+ */
+ if (pages[i])
+ continue;
+
pages[i] = shmem_read_mapping_page(mapping, i);
if (IS_ERR(pages[i])) {
ret = PTR_ERR(pages[i]);
pages[i] = NULL;
- goto err_pages;
+ goto err_unlock;
}
}

@@ -514,7 +521,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
ret = sg_alloc_table_from_pages(sgt, pages + page_offset,
NUM_FAULT_PAGES, 0, SZ_2M, GFP_KERNEL);
if (ret)
- goto err_pages;
+ goto err_unlock;

ret = dma_map_sgtable(pfdev->dev, sgt, DMA_BIDIRECTIONAL, 0);
if (ret)
@@ -537,8 +544,6 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,

err_map:
sg_free_table(sgt);
-err_pages:
- drm_gem_shmem_put_pages_locked(&bo->base);
err_unlock:
dma_resv_unlock(obj->resv);
err_bo:
--
2.43.0


2024-01-05 18:51:36

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 18/30] drm/panfrost: Explicitly get and put drm-shmem pages

To simplify the drm-shmem refcnt handling, we're moving away from the
implicit get_pages() done by get_pages_sgt(). From now on drivers will
have to pin the pages while they use the sgt. Panfrost's shrinker doesn't
support swapping out BOs, hence the pages stay pinned and the sgt is
valid as long as the pages' use-count > 0.

In Panfrost, panfrost_gem_mapping, which is the object representing a
GPU mapping of a BO, owns a pages ref. This guarantees that any BO being
mapped GPU-side has its pages retained until the mapping is destroyed.

Since pages are no longer guaranteed to stay pinned for the BO lifetime,
and MADVISE(DONT_NEED) flagging remains after the GEM handle has been
destroyed, we need to add an extra 'is_purgeable' check in
panfrost_gem_purge(), to make sure we're not trying to purge a BO that
already had its pages released.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_gem.c | 63 ++++++++++++++-----
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 6 ++
2 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index f268bd5c2884..7edfc12f7c1f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -35,20 +35,6 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj)
*/
WARN_ON_ONCE(!list_empty(&bo->mappings.list));

- if (bo->sgts) {
- int i;
- int n_sgt = bo->base.base.size / SZ_2M;
-
- for (i = 0; i < n_sgt; i++) {
- if (bo->sgts[i].sgl) {
- dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
- DMA_BIDIRECTIONAL, 0);
- sg_free_table(&bo->sgts[i]);
- }
- }
- kvfree(bo->sgts);
- }
-
drm_gem_shmem_free(&bo->base);
}

@@ -85,11 +71,40 @@ panfrost_gem_teardown_mapping(struct panfrost_gem_mapping *mapping)

static void panfrost_gem_mapping_release(struct kref *kref)
{
- struct panfrost_gem_mapping *mapping;
-
- mapping = container_of(kref, struct panfrost_gem_mapping, refcount);
+ struct panfrost_gem_mapping *mapping =
+ container_of(kref, struct panfrost_gem_mapping, refcount);
+ struct panfrost_gem_object *bo = mapping->obj;
+ struct panfrost_device *pfdev = bo->base.base.dev->dev_private;

panfrost_gem_teardown_mapping(mapping);
+
+ /* On heap BOs, release the sgts created in the fault handler path. */
+ if (bo->sgts) {
+ int i, n_sgt = bo->base.base.size / SZ_2M;
+
+ for (i = 0; i < n_sgt; i++) {
+ if (bo->sgts[i].sgl) {
+ dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
+ DMA_BIDIRECTIONAL, 0);
+ sg_free_table(&bo->sgts[i]);
+ }
+ }
+ kvfree(bo->sgts);
+ }
+
+ /* Pages ref is owned by the panfrost_gem_mapping object. We must
+ * release our pages ref (if any), before releasing the object
+ * ref.
+ * Non-heap BOs acquired the pages at panfrost_gem_mapping creation
+ * time, and heap BOs may have acquired pages if the fault handler
+ * was called, in which case bo->sgts should be non-NULL.
+ */
+ if (!bo->base.base.import_attach && (!bo->is_heap || bo->sgts) &&
+ bo->base.madv >= 0) {
+ drm_gem_shmem_put_pages(&bo->base);
+ bo->sgts = NULL;
+ }
+
drm_gem_object_put(&mapping->obj->base.base);
panfrost_mmu_ctx_put(mapping->mmu);
kfree(mapping);
@@ -125,6 +140,20 @@ int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv)
if (!mapping)
return -ENOMEM;

+ if (!bo->is_heap && !bo->base.base.import_attach) {
+ /* Pages ref is owned by the panfrost_gem_mapping object.
+ * For non-heap BOs, we request pages at mapping creation
+ * time, such that the panfrost_mmu_map() call, further down in
+ * this function, is guaranteed to have pages_use_count > 0
+ * when drm_gem_shmem_get_pages_sgt() is called.
+ */
+ ret = drm_gem_shmem_get_pages(&bo->base);
+ if (ret) {
+ kfree(mapping);
+ return ret;
+ }
+ }
+
INIT_LIST_HEAD(&mapping->node);
kref_init(&mapping->refcount);
drm_gem_object_get(obj);
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
index 02b60ea1433a..d4fb0854cf2f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
@@ -50,6 +50,12 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj)
if (!dma_resv_trylock(shmem->base.resv))
goto unlock_mappings;

+ /* BO might have become unpurgeable if the last pages_use_count ref
+ * was dropped, but the BO hasn't been destroyed yet.
+ */
+ if (!drm_gem_shmem_is_purgeable(shmem))
+ goto unlock_mappings;
+
panfrost_gem_teardown_mappings_locked(bo);
drm_gem_shmem_purge_locked(&bo->base);
ret = true;
--
2.43.0


2024-01-05 18:51:50

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 19/30] drm/virtio: Explicitly get and put drm-shmem pages

We're moving away from the implicit get_pages() done by get_pages_sgt()
to simplify the refcnt handling. Drivers will have to pin the pages while
they use the sgt. VirtIO-GPU doesn't have a shrinker, hence the pages stay
pinned and the sgt is valid as long as the pages' use-count > 0.

Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/virtio/virtgpu_object.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c
index c7e74cf13022..e58528c562ef 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -67,6 +67,7 @@ void virtio_gpu_cleanup_object(struct virtio_gpu_object *bo)

virtio_gpu_resource_id_put(vgdev, bo->hw_res_handle);
if (virtio_gpu_is_shmem(bo)) {
+ drm_gem_shmem_put_pages(&bo->base);
drm_gem_shmem_free(&bo->base);
} else if (virtio_gpu_is_vram(bo)) {
struct virtio_gpu_object_vram *vram = to_virtio_gpu_vram(bo);
@@ -196,9 +197,13 @@ int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
return PTR_ERR(shmem_obj);
bo = gem_to_virtio_gpu_obj(&shmem_obj->base);

+ ret = drm_gem_shmem_get_pages(shmem_obj);
+ if (ret)
+ goto err_free_gem;
+
ret = virtio_gpu_resource_id_get(vgdev, &bo->hw_res_handle);
if (ret < 0)
- goto err_free_gem;
+ goto err_put_pages;

bo->dumb = params->dumb;

@@ -243,6 +248,8 @@ int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
kvfree(ents);
err_put_id:
virtio_gpu_resource_id_put(vgdev, bo->hw_res_handle);
+err_put_pages:
+ drm_gem_shmem_put_pages(shmem_obj);
err_free_gem:
drm_gem_shmem_free(shmem_obj);
return ret;
--
2.43.0


2024-01-05 18:52:02

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 20/30] drm/v3d: Explicitly get and put drm-shmem pages

To simplify the drm-shmem refcnt handling, we're moving away from the
implicit get_pages() done by get_pages_sgt(). From now on drivers will
have to pin the pages while they use the sgt. The V3D driver doesn't have
a shrinker, hence the pages stay pinned and the sgt is valid as long as
the pages' use-count > 0.

Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/v3d/v3d_bo.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c
index 1bdfac8beafd..ccf04ce93e8c 100644
--- a/drivers/gpu/drm/v3d/v3d_bo.c
+++ b/drivers/gpu/drm/v3d/v3d_bo.c
@@ -50,6 +50,9 @@ void v3d_free_object(struct drm_gem_object *obj)
/* GPU execution may have dirtied any pages in the BO. */
bo->base.pages_mark_dirty_on_put = true;

+ if (!obj->import_attach)
+ drm_gem_shmem_put_pages(&bo->base);
+
drm_gem_shmem_free(&bo->base);
}

@@ -139,12 +142,18 @@ struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv,
bo = to_v3d_bo(&shmem_obj->base);
bo->vaddr = NULL;

- ret = v3d_bo_create_finish(&shmem_obj->base);
+ ret = drm_gem_shmem_get_pages(shmem_obj);
if (ret)
goto free_obj;

+ ret = v3d_bo_create_finish(&shmem_obj->base);
+ if (ret)
+ goto put_pages;
+
return bo;

+put_pages:
+ drm_gem_shmem_put_pages(shmem_obj);
free_obj:
drm_gem_shmem_free(shmem_obj);
return ERR_PTR(ret);
--
2.43.0


2024-01-05 18:52:17

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 21/30] drm/shmem-helper: Change sgt allocation policy

In preparation for the addition of drm-shmem memory shrinker support,
change the SGT allocation policy as follows:

1. An SGT can be allocated only if the shmem pages are pinned at the
time of allocation, otherwise the allocation fails.

2. Drivers must ensure that the pages stay pinned for as long as the SGT
is in use, and must get a new SGT if the pages were unpinned.

The new policy is required by the shrinker, which will move unpinned
pages to/from swap, invalidating the SGT pointer once the pages are
relocated.

Previous patches prepared the drivers for the new policy.
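
A sketch of the pattern the drivers were converted to in the previous
patches (mydrv_map_bo() is a made-up helper; the drm_gem_shmem_* calls are
the ones covered by the rules above):

#include <linux/err.h>
#include <drm/drm_gem_shmem_helper.h>

static int mydrv_map_bo(struct drm_gem_shmem_object *shmem)
{
	struct sg_table *sgt;
	int err;

	/* Hold a pages reference for as long as the SGT is in use. */
	err = drm_gem_shmem_get_pages(shmem);
	if (err)
		return err;

	sgt = drm_gem_shmem_get_pages_sgt(shmem);
	if (IS_ERR(sgt)) {
		drm_gem_shmem_put_pages(shmem);
		return PTR_ERR(sgt);
	}

	/* ... map sgt into the device address space ... */

	/* The SGT pointer must not be used after this put. */
	drm_gem_shmem_put_pages(shmem);

	return 0;
}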

Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 55 ++++++++++++++------------
1 file changed, 29 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index c7357110ca76..ff5437ab2c95 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -133,6 +133,14 @@ drm_gem_shmem_free_pages(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;

+ if (shmem->sgt) {
+ dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
+ DMA_BIDIRECTIONAL, 0);
+ sg_free_table(shmem->sgt);
+ kfree(shmem->sgt);
+ shmem->sgt = NULL;
+ }
+
#ifdef CONFIG_X86
if (shmem->map_wc)
set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
@@ -155,24 +163,12 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;

- if (obj->import_attach) {
+ if (obj->import_attach)
drm_prime_gem_destroy(obj, shmem->sgt);
- } else {
- drm_WARN_ON(obj->dev, refcount_read(&shmem->vmap_use_count));

- if (shmem->sgt) {
- dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
- DMA_BIDIRECTIONAL, 0);
- sg_free_table(shmem->sgt);
- kfree(shmem->sgt);
- }
- if (shmem->pages &&
- refcount_dec_and_test(&shmem->pages_use_count))
- drm_gem_shmem_free_pages(shmem);
-
- drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count));
- drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_pin_count));
- }
+ drm_WARN_ON(obj->dev, refcount_read(&shmem->vmap_use_count));
+ drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count));
+ drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_pin_count));

drm_gem_object_release(obj);
kfree(shmem);
@@ -722,6 +718,9 @@ struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem)

drm_WARN_ON(obj->dev, obj->import_attach);

+ if (drm_WARN_ON(obj->dev, !shmem->pages))
+ return ERR_PTR(-ENOMEM);
+
return drm_prime_pages_to_sg(obj->dev, shmem->pages, obj->size >> PAGE_SHIFT);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_get_sg_table);
@@ -737,15 +736,10 @@ static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_

drm_WARN_ON(obj->dev, obj->import_attach);

- ret = drm_gem_shmem_get_pages_locked(shmem);
- if (ret)
- return ERR_PTR(ret);
-
sgt = drm_gem_shmem_get_sg_table(shmem);
- if (IS_ERR(sgt)) {
- ret = PTR_ERR(sgt);
- goto err_put_pages;
- }
+ if (IS_ERR(sgt))
+ return sgt;
+
/* Map the pages for use by the h/w. */
ret = dma_map_sgtable(obj->dev->dev, sgt, DMA_BIDIRECTIONAL, 0);
if (ret)
@@ -758,8 +752,6 @@ static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_
err_free_sgt:
sg_free_table(sgt);
kfree(sgt);
-err_put_pages:
- drm_gem_shmem_put_pages_locked(shmem);
return ERR_PTR(ret);
}

@@ -776,6 +768,17 @@ static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_
* and difference between dma-buf imported and natively allocated objects.
* drm_gem_shmem_get_sg_table() should not be directly called by drivers.
*
+ * Drivers should adhere to these SGT usage rules:
+ *
+ * 1. SGT should be allocated only if shmem pages are pinned at the
+ * time of allocation, otherwise allocation will fail.
+ *
+ * 2. Drivers should ensure that the pages stay pinned for as long as the
+ * SGT is in use and should get a new SGT if the pages were unpinned.
+ *
+ * Drivers don't own the returned SGT and must take care of the SGT pointer
+ * lifetime. The SGT is valid only as long as the GEM pages backing it are pinned.
+ *
* Returns:
* A pointer to the scatter/gather table of pinned pages or errno on failure.
*/
--
2.43.0


2024-01-05 18:52:36

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

Introduce a common drm-shmem shrinker for DRM drivers.

To start using the drm-shmem shrinker, a driver should do the following
(a rough sketch follows the list):

1. Implement the evict() callback of the GEM object, in which the driver
checks whether the object is purgeable or evictable using the drm-shmem
helpers and performs the shrinking action

2. Initialize the drm-shmem internals using drmm_gem_shmem_init(drm_device),
which registers the drm-shmem shrinker

3. Implement a madvise IOCTL that uses drm_gem_shmem_madvise()
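
A condensed, illustrative sketch of those three steps (the mydrv_* names
are hypothetical, and the evict() hook is assumed to have the
int (*evict)(struct drm_gem_object *obj) prototype used elsewhere in this
series):

#include <drm/drm_gem_shmem_helper.h>

/* 1. evict() callback: the drm-shmem helpers do the actual work */
static int mydrv_gem_evict(struct drm_gem_object *obj)
{
	struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);

	if (drm_gem_shmem_is_purgeable(shmem))
		drm_gem_shmem_purge_locked(shmem);
	else
		drm_gem_shmem_evict_locked(shmem);

	return 0;
}

/* 2. register the drm-shmem internals (and the shrinker) at init time */
static int mydrv_init_shmem(struct drm_device *drm)
{
	return drmm_gem_shmem_init(drm);
}

/* 3. madvise IOCTL backend maps straight onto the drm-shmem helper */
static int mydrv_bo_madvise(struct drm_gem_object *obj, int madv)
{
	return drm_gem_shmem_object_madvise(obj, madv);
}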

Signed-off-by: Daniel Almeida <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 365 +++++++++++++++++-
drivers/gpu/drm/panfrost/panfrost_gem.c | 3 +-
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 13 +-
include/drm/drm_device.h | 10 +-
include/drm/drm_gem_shmem_helper.h | 68 +++-
5 files changed, 433 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index ff5437ab2c95..59cebd1e35af 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -20,6 +20,7 @@
#include <drm/drm_device.h>
#include <drm/drm_drv.h>
#include <drm/drm_gem_shmem_helper.h>
+#include <drm/drm_managed.h>
#include <drm/drm_prime.h>
#include <drm/drm_print.h>

@@ -128,11 +129,49 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_create);

+static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
+{
+ return (shmem->madv >= 0) && shmem->base.funcs->evict &&
+ refcount_read(&shmem->pages_use_count) &&
+ !refcount_read(&shmem->pages_pin_count) &&
+ !shmem->base.dma_buf && !shmem->base.import_attach &&
+ !shmem->evicted;
+}
+
+static void
+drm_gem_shmem_shrinker_update_lru_locked(struct drm_gem_shmem_object *shmem)
+{
+ struct drm_gem_object *obj = &shmem->base;
+ struct drm_gem_shmem *shmem_mm = obj->dev->shmem_mm;
+ struct drm_gem_shmem_shrinker *shmem_shrinker = &shmem_mm->shrinker;
+
+ dma_resv_assert_held(shmem->base.resv);
+
+ if (!shmem_shrinker || obj->import_attach)
+ return;
+
+ if (shmem->madv < 0)
+ drm_gem_lru_remove(&shmem->base);
+ else if (drm_gem_shmem_is_evictable(shmem) || drm_gem_shmem_is_purgeable(shmem))
+ drm_gem_lru_move_tail(&shmem_shrinker->lru_evictable, &shmem->base);
+ else if (shmem->evicted)
+ drm_gem_lru_move_tail(&shmem_shrinker->lru_evicted, &shmem->base);
+ else if (!shmem->pages)
+ drm_gem_lru_remove(&shmem->base);
+ else
+ drm_gem_lru_move_tail(&shmem_shrinker->lru_pinned, &shmem->base);
+}
+
static void
drm_gem_shmem_free_pages(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;

+ if (!shmem->pages) {
+ drm_WARN_ON(obj->dev, !shmem->evicted && shmem->madv >= 0);
+ return;
+ }
+
if (shmem->sgt) {
dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
DMA_BIDIRECTIONAL, 0);
@@ -175,15 +214,26 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_free);

-static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
+static int
+drm_gem_shmem_acquire_pages(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;
struct page **pages;

+ if (drm_WARN_ON(obj->dev, obj->import_attach))
+ return -EINVAL;
+
dma_resv_assert_held(shmem->base.resv);

- if (refcount_inc_not_zero(&shmem->pages_use_count))
+ if (shmem->madv < 0) {
+ drm_WARN_ON(obj->dev, shmem->pages);
+ return -ENOMEM;
+ }
+
+ if (shmem->pages) {
+ drm_WARN_ON(obj->dev, !shmem->evicted);
return 0;
+ }

pages = drm_gem_get_pages(obj);
if (IS_ERR(pages)) {
@@ -204,8 +254,29 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)

shmem->pages = pages;

+ return 0;
+}
+
+static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
+{
+ int err;
+
+ dma_resv_assert_held(shmem->base.resv);
+
+ if (shmem->madv < 0)
+ return -ENOMEM;
+
+ if (refcount_inc_not_zero(&shmem->pages_use_count))
+ return 0;
+
+ err = drm_gem_shmem_acquire_pages(shmem);
+ if (err)
+ return err;
+
refcount_set(&shmem->pages_use_count, 1);

+ drm_gem_shmem_shrinker_update_lru_locked(shmem);
+
return 0;
}

@@ -222,6 +293,8 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)

if (refcount_dec_and_test(&shmem->pages_use_count))
drm_gem_shmem_free_pages(shmem);
+
+ drm_gem_shmem_shrinker_update_lru_locked(shmem);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);

@@ -266,6 +339,11 @@ EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages);
*
 * This function increases the use count and allocates the backing pages if
 * the use count is zero.
+ *
+ * Note that this function doesn't pin the pages in memory. If your driver
+ * uses the drm-shmem shrinker, the shrinker is free to relocate the pages
+ * to swap. Getting pages only guarantees that the pages are allocated, not
+ * that they reside in memory. In order to pin the pages, use drm_gem_shmem_pin().
*/
int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
{
@@ -291,6 +369,10 @@ static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
if (refcount_inc_not_zero(&shmem->pages_pin_count))
return 0;

+ ret = drm_gem_shmem_swapin_locked(shmem);
+ if (ret)
+ return ret;
+
ret = drm_gem_shmem_get_pages_locked(shmem);
if (!ret)
refcount_set(&shmem->pages_pin_count, 1);
@@ -489,29 +571,48 @@ int drm_gem_shmem_madvise_locked(struct drm_gem_shmem_object *shmem, int madv)

madv = shmem->madv;

+ drm_gem_shmem_shrinker_update_lru_locked(shmem);
+
return (madv >= 0);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_madvise_locked);

-void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem)
+int drm_gem_shmem_madvise(struct drm_gem_shmem_object *shmem, int madv)
{
struct drm_gem_object *obj = &shmem->base;
- struct drm_device *dev = obj->dev;
+ int ret;

- dma_resv_assert_held(shmem->base.resv);
+ ret = dma_resv_lock_interruptible(obj->resv, NULL);
+ if (ret)
+ return ret;

- drm_WARN_ON(obj->dev, !drm_gem_shmem_is_purgeable(shmem));
+ ret = drm_gem_shmem_madvise_locked(shmem, madv);
+ dma_resv_unlock(obj->resv);

- dma_unmap_sgtable(dev->dev, shmem->sgt, DMA_BIDIRECTIONAL, 0);
- sg_free_table(shmem->sgt);
- kfree(shmem->sgt);
- shmem->sgt = NULL;
+ return ret;
+}
+EXPORT_SYMBOL_GPL(drm_gem_shmem_madvise);

- drm_gem_shmem_put_pages_locked(shmem);
+static void
+drm_gem_shmem_shrinker_put_pages_locked(struct drm_gem_shmem_object *shmem)
+{
+ struct drm_gem_object *obj = &shmem->base;
+ struct drm_device *dev = obj->dev;

- shmem->madv = -1;
+ dma_resv_assert_held(shmem->base.resv);

+ if (shmem->evicted)
+ return;
+
+ drm_gem_shmem_free_pages(shmem);
drm_vma_node_unmap(&obj->vma_node, dev->anon_inode->i_mapping);
+}
+
+void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem)
+{
+ struct drm_gem_object *obj = &shmem->base;
+
+ drm_gem_shmem_shrinker_put_pages_locked(shmem);
drm_gem_free_mmap_offset(obj);

/* Our goal here is to return as much of the memory as
@@ -522,9 +623,45 @@ void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem)
shmem_truncate_range(file_inode(obj->filp), 0, (loff_t)-1);

invalidate_mapping_pages(file_inode(obj->filp)->i_mapping, 0, (loff_t)-1);
+
+ shmem->madv = -1;
+ shmem->evicted = false;
+ drm_gem_shmem_shrinker_update_lru_locked(shmem);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_purge_locked);

+/**
+ * drm_gem_shmem_swapin_locked() - Moves shmem GEM back to memory and enables
+ * hardware access to the memory.
+ * @shmem: shmem GEM object
+ *
+ * This function moves shmem GEM back to memory if it was previously evicted
+ * by the memory shrinker. The GEM is ready to use on success.
+ *
+ * Returns:
+ * 0 on success or a negative error code on failure.
+ */
+int drm_gem_shmem_swapin_locked(struct drm_gem_shmem_object *shmem)
+{
+ int err;
+
+ dma_resv_assert_held(shmem->base.resv);
+
+ if (!shmem->evicted)
+ return 0;
+
+ err = drm_gem_shmem_acquire_pages(shmem);
+ if (err)
+ return err;
+
+ shmem->evicted = false;
+
+ drm_gem_shmem_shrinker_update_lru_locked(shmem);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(drm_gem_shmem_swapin_locked);
+
/**
* drm_gem_shmem_dumb_create - Create a dumb shmem buffer object
* @file: DRM file structure to create the dumb buffer for
@@ -571,22 +708,32 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
vm_fault_t ret;
struct page *page;
pgoff_t page_offset;
+ int err;

/* We don't use vmf->pgoff since that has the fake offset */
page_offset = (vmf->address - vma->vm_start) >> PAGE_SHIFT;

dma_resv_lock(shmem->base.resv, NULL);

- if (page_offset >= num_pages ||
- drm_WARN_ON_ONCE(obj->dev, !shmem->pages) ||
- shmem->madv < 0) {
+ err = drm_gem_shmem_swapin_locked(shmem);
+ if (err) {
+ ret = VM_FAULT_OOM;
+ goto unlock;
+ }
+
+ if (page_offset >= num_pages || !shmem->pages) {
ret = VM_FAULT_SIGBUS;
} else {
+ /*
+ * shmem->pages is guaranteed to be valid while reservation
+ * lock is held and drm_gem_shmem_swapin_locked() succeeds.
+ */
page = shmem->pages[page_offset];

ret = vmf_insert_pfn(vma, vmf->address, page_to_pfn(page));
}

+unlock:
dma_resv_unlock(shmem->base.resv);

return ret;
@@ -609,6 +756,7 @@ static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
drm_WARN_ON_ONCE(obj->dev,
!refcount_inc_not_zero(&shmem->pages_use_count));

+ drm_gem_shmem_shrinker_update_lru_locked(shmem);
dma_resv_unlock(shmem->base.resv);

drm_gem_vm_open(vma);
@@ -694,7 +842,9 @@ void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,
drm_printf_indent(p, indent, "pages_pin_count=%u\n", refcount_read(&shmem->pages_pin_count));
drm_printf_indent(p, indent, "pages_use_count=%u\n", refcount_read(&shmem->pages_use_count));
drm_printf_indent(p, indent, "vmap_use_count=%u\n", refcount_read(&shmem->vmap_use_count));
+ drm_printf_indent(p, indent, "evicted=%d\n", shmem->evicted);
drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
+ drm_printf_indent(p, indent, "madv=%d\n", shmem->madv);
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_print_info);

@@ -784,8 +934,13 @@ static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_
*/
struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_shmem_object *shmem)
{
- int ret;
+ struct drm_gem_object *obj = &shmem->base;
struct sg_table *sgt;
+ int ret;
+
+ if (drm_WARN_ON(obj->dev, drm_gem_shmem_is_evictable(shmem)) ||
+ drm_WARN_ON(obj->dev, drm_gem_shmem_is_purgeable(shmem)))
+ return ERR_PTR(-EBUSY);

ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
if (ret)
@@ -832,6 +987,184 @@ drm_gem_shmem_prime_import_sg_table(struct drm_device *dev,
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_sg_table);

+static unsigned long
+drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker,
+ struct shrink_control *sc)
+{
+ struct drm_gem_shmem_shrinker *shmem_shrinker = shrinker->private_data;
+ unsigned long count = shmem_shrinker->lru_evictable.count;
+
+ if (count >= SHRINK_EMPTY)
+ return SHRINK_EMPTY - 1;
+
+ return count ?: SHRINK_EMPTY;
+}
+
+void drm_gem_shmem_evict_locked(struct drm_gem_shmem_object *shmem)
+{
+ struct drm_gem_object *obj = &shmem->base;
+
+ drm_WARN_ON(obj->dev, !drm_gem_shmem_is_evictable(shmem));
+ drm_WARN_ON(obj->dev, shmem->evicted);
+
+ drm_gem_shmem_shrinker_put_pages_locked(shmem);
+
+ shmem->evicted = true;
+ drm_gem_shmem_shrinker_update_lru_locked(shmem);
+}
+EXPORT_SYMBOL_GPL(drm_gem_shmem_evict_locked);
+
+static bool drm_gem_shmem_shrinker_evict_locked(struct drm_gem_object *obj)
+{
+ struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
+ int err;
+
+ if (!drm_gem_shmem_is_evictable(shmem) ||
+ get_nr_swap_pages() < obj->size >> PAGE_SHIFT)
+ return false;
+
+ err = drm_gem_evict_locked(obj);
+ if (err)
+ return false;
+
+ return true;
+}
+
+static bool drm_gem_shmem_shrinker_purge_locked(struct drm_gem_object *obj)
+{
+ struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
+ int err;
+
+ if (!drm_gem_shmem_is_purgeable(shmem))
+ return false;
+
+ err = drm_gem_evict_locked(obj);
+ if (err)
+ return false;
+
+ return true;
+}
+
+static unsigned long
+drm_gem_shmem_shrinker_scan_objects(struct shrinker *shrinker,
+ struct shrink_control *sc)
+{
+ struct drm_gem_shmem_shrinker *shmem_shrinker = shrinker->private_data;
+ unsigned long nr_to_scan = sc->nr_to_scan;
+ unsigned long remaining = 0;
+ unsigned long freed = 0;
+
+ /* purge as many objects as we can */
+ freed += drm_gem_lru_scan(&shmem_shrinker->lru_evictable,
+ nr_to_scan, &remaining,
+ drm_gem_shmem_shrinker_purge_locked);
+
+ /* evict as many objects as we can */
+ if (freed < nr_to_scan)
+ freed += drm_gem_lru_scan(&shmem_shrinker->lru_evictable,
+ nr_to_scan - freed, &remaining,
+ drm_gem_shmem_shrinker_evict_locked);
+
+ return (freed > 0 && remaining > 0) ? freed : SHRINK_STOP;
+}
+
+static int drm_gem_shmem_shrinker_init(struct drm_gem_shmem *shmem_mm,
+ const char *shrinker_name)
+{
+ struct drm_gem_shmem_shrinker *shmem_shrinker = &shmem_mm->shrinker;
+ struct shrinker *shrinker;
+
+ shrinker = shrinker_alloc(0, shrinker_name);
+ if (!shrinker)
+ return -ENOMEM;
+
+ shrinker->count_objects = drm_gem_shmem_shrinker_count_objects;
+ shrinker->scan_objects = drm_gem_shmem_shrinker_scan_objects;
+ shrinker->private_data = shmem_shrinker;
+ shrinker->seeks = DEFAULT_SEEKS;
+
+ mutex_init(&shmem_shrinker->lock);
+ shmem_shrinker->shrinker = shrinker;
+ drm_gem_lru_init(&shmem_shrinker->lru_evictable, &shmem_shrinker->lock);
+ drm_gem_lru_init(&shmem_shrinker->lru_evicted, &shmem_shrinker->lock);
+ drm_gem_lru_init(&shmem_shrinker->lru_pinned, &shmem_shrinker->lock);
+
+ shrinker_register(shrinker);
+
+ return 0;
+}
+
+static void drm_gem_shmem_shrinker_release(struct drm_device *dev,
+ struct drm_gem_shmem *shmem_mm)
+{
+ struct drm_gem_shmem_shrinker *shmem_shrinker = &shmem_mm->shrinker;
+
+ shrinker_free(shmem_shrinker->shrinker);
+ drm_WARN_ON(dev, !list_empty(&shmem_shrinker->lru_evictable.list));
+ drm_WARN_ON(dev, !list_empty(&shmem_shrinker->lru_evicted.list));
+ drm_WARN_ON(dev, !list_empty(&shmem_shrinker->lru_pinned.list));
+ mutex_destroy(&shmem_shrinker->lock);
+}
+
+static int drm_gem_shmem_init(struct drm_device *dev)
+{
+ int err;
+
+ if (drm_WARN_ON(dev, dev->shmem_mm))
+ return -EBUSY;
+
+ dev->shmem_mm = kzalloc(sizeof(*dev->shmem_mm), GFP_KERNEL);
+ if (!dev->shmem_mm)
+ return -ENOMEM;
+
+ err = drm_gem_shmem_shrinker_init(dev->shmem_mm, dev->unique);
+ if (err)
+ goto free_gem_shmem;
+
+ return 0;
+
+free_gem_shmem:
+ kfree(dev->shmem_mm);
+ dev->shmem_mm = NULL;
+
+ return err;
+}
+
+static void drm_gem_shmem_release(struct drm_device *dev, void *ptr)
+{
+ struct drm_gem_shmem *shmem_mm = dev->shmem_mm;
+
+ drm_gem_shmem_shrinker_release(dev, shmem_mm);
+ dev->shmem_mm = NULL;
+ kfree(shmem_mm);
+}
+
+/**
+ * drmm_gem_shmem_init() - Initialize drm-shmem internals
+ * @dev: DRM device
+ *
+ * Cleanup is automatically managed as part of the DRM device release.
+ * Calling this function multiple times will result in an error.
+ *
+ * Returns:
+ * 0 on success or a negative error code on failure.
+ */
+int drmm_gem_shmem_init(struct drm_device *dev)
+{
+ int err;
+
+ err = drm_gem_shmem_init(dev);
+ if (err)
+ return err;
+
+ err = drmm_add_action_or_reset(dev, drm_gem_shmem_release, NULL);
+ if (err)
+ return err;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(drmm_gem_shmem_init);
+
MODULE_DESCRIPTION("DRM SHMEM memory-management helpers");
MODULE_IMPORT_NS(DMA_BUF);
MODULE_LICENSE("GPL v2");
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 7edfc12f7c1f..8c26b7e41b95 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -99,8 +99,7 @@ static void panfrost_gem_mapping_release(struct kref *kref)
* time, and heap BOs may have acquired pages if the fault handler
* was called, in which case bo->sgts should be non-NULL.
*/
- if (!bo->base.base.import_attach && (!bo->is_heap || bo->sgts) &&
- bo->base.madv >= 0) {
+ if (!bo->base.base.import_attach && (!bo->is_heap || bo->sgts)) {
drm_gem_shmem_put_pages(&bo->base);
bo->sgts = NULL;
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
index d4fb0854cf2f..7b4deba803ed 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
@@ -15,6 +15,13 @@
#include "panfrost_gem.h"
#include "panfrost_mmu.h"

+static bool panfrost_gem_shmem_is_purgeable(struct drm_gem_shmem_object *shmem)
+{
+ return (shmem->madv > 0) &&
+ !refcount_read(&shmem->pages_pin_count) && shmem->sgt &&
+ !shmem->base.dma_buf && !shmem->base.import_attach;
+}
+
static unsigned long
panfrost_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
{
@@ -26,7 +33,7 @@ panfrost_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc
return 0;

list_for_each_entry(shmem, &pfdev->shrinker_list, madv_list) {
- if (drm_gem_shmem_is_purgeable(shmem))
+ if (panfrost_gem_shmem_is_purgeable(shmem))
count += shmem->base.size >> PAGE_SHIFT;
}

@@ -53,7 +60,7 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj)
/* BO might have become unpurgeable if the last pages_use_count ref
* was dropped, but the BO hasn't been destroyed yet.
*/
- if (!drm_gem_shmem_is_purgeable(shmem))
+ if (!panfrost_gem_shmem_is_purgeable(shmem))
goto unlock_mappings;

panfrost_gem_teardown_mappings_locked(bo);
@@ -80,7 +87,7 @@ panfrost_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
list_for_each_entry_safe(shmem, tmp, &pfdev->shrinker_list, madv_list) {
if (freed >= sc->nr_to_scan)
break;
- if (drm_gem_shmem_is_purgeable(shmem) &&
+ if (panfrost_gem_shmem_is_purgeable(shmem) &&
panfrost_gem_purge(&shmem->base)) {
freed += shmem->base.size >> PAGE_SHIFT;
list_del_init(&shmem->madv_list);
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 63767cf24371..6e729e716505 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -15,6 +15,7 @@ struct drm_vblank_crtc;
struct drm_vma_offset_manager;
struct drm_vram_mm;
struct drm_fb_helper;
+struct drm_gem_shmem_shrinker;

struct inode;

@@ -289,8 +290,13 @@ struct drm_device {
/** @vma_offset_manager: GEM information */
struct drm_vma_offset_manager *vma_offset_manager;

- /** @vram_mm: VRAM MM memory manager */
- struct drm_vram_mm *vram_mm;
+ union {
+ /** @vram_mm: VRAM MM memory manager */
+ struct drm_vram_mm *vram_mm;
+
+ /** @shmem_mm: SHMEM GEM memory manager */
+ struct drm_gem_shmem *shmem_mm;
+ };

/**
* @switch_power_state:
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index 525480488451..df97c11fc99a 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -6,6 +6,7 @@
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/mutex.h>
+#include <linux/shrinker.h>

#include <drm/drm_file.h>
#include <drm/drm_gem.h>
@@ -13,6 +14,7 @@
#include <drm/drm_prime.h>

struct dma_buf_attachment;
+struct drm_device;
struct drm_mode_create_dumb;
struct drm_printer;
struct sg_table;
@@ -54,8 +56,8 @@ struct drm_gem_shmem_object {
* @madv: State for madvise
*
* 0 is active/inuse.
+ * 1 is not-needed/can-be-purged
* A negative value is the object is purged.
- * Positive values are driver specific and not used by the helpers.
*/
int madv;

@@ -102,6 +104,14 @@ struct drm_gem_shmem_object {
* @map_wc: map object write-combined (instead of using shmem defaults).
*/
bool map_wc : 1;
+
+ /**
+ * @evicted: True if shmem pages are evicted by the memory shrinker.
+ * Used internally by memory shrinker. The evicted pages can be
+ * moved back to memory using drm_gem_shmem_swapin_locked(), unlike
+ * the purged pages (madv < 0) that are destroyed permanently.
+ */
+ bool evicted : 1;
};

#define to_drm_gem_shmem_obj(obj) \
@@ -122,14 +132,19 @@ void drm_gem_shmem_vunmap_locked(struct drm_gem_shmem_object *shmem,
int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct *vma);

int drm_gem_shmem_madvise_locked(struct drm_gem_shmem_object *shmem, int madv);
+int drm_gem_shmem_madvise(struct drm_gem_shmem_object *shmem, int madv);

static inline bool drm_gem_shmem_is_purgeable(struct drm_gem_shmem_object *shmem)
{
- return (shmem->madv > 0) &&
- !refcount_read(&shmem->pages_pin_count) && shmem->sgt &&
+ return (shmem->madv > 0) && shmem->base.funcs->evict &&
+ refcount_read(&shmem->pages_use_count) &&
+ !refcount_read(&shmem->pages_pin_count) &&
!shmem->base.dma_buf && !shmem->base.import_attach;
}

+int drm_gem_shmem_swapin_locked(struct drm_gem_shmem_object *shmem);
+
+void drm_gem_shmem_evict_locked(struct drm_gem_shmem_object *shmem);
void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem);

struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem);
@@ -273,6 +288,53 @@ static inline int drm_gem_shmem_object_mmap(struct drm_gem_object *obj, struct v
return drm_gem_shmem_mmap(shmem, vma);
}

+/**
+ * drm_gem_shmem_object_madvise - unlocked GEM object function for drm_gem_shmem_madvise_locked()
+ * @obj: GEM object
+ * @madv: Madvise value
+ *
+ * This function wraps drm_gem_shmem_madvise_locked(), providing unlocked variant.
+ *
+ * Returns:
+ * 0 on success or a negative error code on failure.
+ */
+static inline int drm_gem_shmem_object_madvise(struct drm_gem_object *obj, int madv)
+{
+ struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
+
+ return drm_gem_shmem_madvise(shmem, madv);
+}
+
+/**
+ * struct drm_gem_shmem_shrinker - Memory shrinker of GEM shmem memory manager
+ */
+struct drm_gem_shmem_shrinker {
+ /** @lock: Protects @lru_* */
+ struct mutex lock;
+
+ /** @shrinker: Shrinker for purging shmem GEM objects */
+ struct shrinker *shrinker;
+
+ /** @lru_pinned: List of pinned shmem GEM objects */
+ struct drm_gem_lru lru_pinned;
+
+ /** @lru_evictable: List of shmem GEM objects to be evicted */
+ struct drm_gem_lru lru_evictable;
+
+ /** @lru_evicted: List of evicted shmem GEM objects */
+ struct drm_gem_lru lru_evicted;
+};
+
+/**
+ * struct drm_gem_shmem - GEM shmem memory manager
+ */
+struct drm_gem_shmem {
+ /** @shrinker: GEM shmem shrinker */
+ struct drm_gem_shmem_shrinker shrinker;
+};
+
+int drmm_gem_shmem_init(struct drm_device *dev);
+
/*
* Driver ops
*/
--
2.43.0


2024-01-05 18:52:44

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 23/30] drm/shmem-helper: Export drm_gem_shmem_get_pages_sgt_locked()

Export drm_gem_shmem_get_pages_sgt_locked(), which will be used by the
virtio-gpu shrinker during the GEM swap-in operation performed with the
reservation lock held.

Reviewed-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 22 +++++++++++++++++++++-
include/drm/drm_gem_shmem_helper.h | 1 +
2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 59cebd1e35af..8fd7851c088b 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -875,12 +875,31 @@ struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem)
}
EXPORT_SYMBOL_GPL(drm_gem_shmem_get_sg_table);

-static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_object *shmem)
+/**
+ * drm_gem_shmem_get_pages_sgt_locked - Provide a scatter/gather table of pinned
+ * pages for a shmem GEM object
+ * @shmem: shmem GEM object
+ *
+ * This is a locked version of @drm_gem_shmem_get_sg_table that exports a
+ * scatter/gather table suitable for PRIME usage by calling the standard
+ * DMA mapping API.
+ *
+ * Drivers must hold GEM's reservation lock when using this function.
+ *
+ * Drivers that need to acquire a scatter/gather table for objects need to call
+ * drm_gem_shmem_get_pages_sgt() instead.
+ *
+ * Returns:
+ * A pointer to the scatter/gather table of pinned pages or error pointer on failure.
+ */
+struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;
int ret;
struct sg_table *sgt;

+ dma_resv_assert_held(shmem->base.resv);
+
if (shmem->sgt)
return shmem->sgt;

@@ -904,6 +923,7 @@ static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_
kfree(sgt);
return ERR_PTR(ret);
}
+EXPORT_SYMBOL_GPL(drm_gem_shmem_get_pages_sgt_locked);

/**
* drm_gem_shmem_get_pages_sgt - Pin pages, dma map them, and return a
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index df97c11fc99a..167f00f089de 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -149,6 +149,7 @@ void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem);

struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem);
struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_shmem_object *shmem);
+struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_object *shmem);

void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,
struct drm_printer *p, unsigned int indent);
--
2.43.0


2024-01-05 18:53:05

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 24/30] drm/shmem-helper: Optimize unlocked get_pages_sgt()

The SGT isn't refcounted. Once the SGT pointer has been obtained, it
remains the same for both the locked and unlocked get_pages_sgt(). Return
the cached SGT directly without taking a potentially expensive lock.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 8fd7851c088b..e6e6e693ab95 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -962,6 +962,18 @@ struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_shmem_object *shmem)
drm_WARN_ON(obj->dev, drm_gem_shmem_is_purgeable(shmem)))
return ERR_PTR(-EBUSY);

+ /*
+ * Drivers that use shrinker should take into account that shrinker
+ * may relocate BO, thus invalidating the returned SGT pointer.
+ * Such drivers should pin GEM while they use SGT.
+ *
+ * Drivers that don't use shrinker should take into account that
+ * SGT is released together with the GEM pages. Pages should be kept
+ * alive while SGT is used.
+ */
+ if (shmem->sgt)
+ return shmem->sgt;
+
ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
if (ret)
return ERR_PTR(ret);
--
2.43.0


2024-01-05 18:53:17

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 25/30] drm/shmem-helper: Don't free refcounted GEM

Don't free the shmem object if its pages are still in use at the time
the GEM is freed, which can happen when a DRM driver doesn't manage the
GEM/pages lifetime properly. This prevents memory corruption caused by a
use-after-free bug, at the cost of leaking the GEM.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index e6e6e693ab95..0d95d723b90d 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -205,9 +205,15 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
if (obj->import_attach)
drm_prime_gem_destroy(obj, shmem->sgt);

- drm_WARN_ON(obj->dev, refcount_read(&shmem->vmap_use_count));
- drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count));
- drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_pin_count));
+ /*
+ * Prevent memory corruption caused by the use-after-free bug in a
+ * case where shmem user erroneously holds reference to pages while
+ * GEM is freed by leaking the GEM.
+ */
+ if (drm_WARN_ON(obj->dev, refcount_read(&shmem->vmap_use_count)) ||
+ drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count)) ||
+ drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_pin_count)))
+ return;

drm_gem_object_release(obj);
kfree(shmem);
--
2.43.0


2024-01-05 18:53:31

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 26/30] drm/shmem-helper: Turn warnings about imported GEM into errors

Turn sanity warnings about DRM-SHMEM API misuse into error conditions
for the cases where an imported GEM is used when it shouldn't be.

Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 0d95d723b90d..7d2fe12bd793 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -409,7 +409,8 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
struct drm_gem_object *obj = &shmem->base;
int ret;

- drm_WARN_ON(obj->dev, obj->import_attach);
+ if (drm_WARN_ON(obj->dev, obj->import_attach))
+ return -EINVAL;

if (refcount_inc_not_zero(&shmem->pages_pin_count))
return 0;
@@ -872,7 +873,8 @@ struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;

- drm_WARN_ON(obj->dev, obj->import_attach);
+ if (drm_WARN_ON(obj->dev, obj->import_attach))
+ return ERR_PTR(-EINVAL);

if (drm_WARN_ON(obj->dev, !shmem->pages))
return ERR_PTR(-ENOMEM);
@@ -909,7 +911,8 @@ struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_object
if (shmem->sgt)
return shmem->sgt;

- drm_WARN_ON(obj->dev, obj->import_attach);
+ if (drm_WARN_ON(obj->dev, obj->import_attach))
+ return ERR_PTR(-EINVAL);

sgt = drm_gem_shmem_get_sg_table(shmem);
if (IS_ERR(sgt))
--
2.43.0


2024-01-05 18:53:51

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 27/30] drm/virtio: Pin display framebuffer BO

Prepare for the addition of memory shrinker support by pinning display
framebuffer BO pages in memory while they are in use by the display on
the host. The shrinker is free to relocate framebuffer BO pages if it
doesn't know that the pages are in use, hence pin the pages to prevent
the shrinker from moving them.

Acked-by: Gerd Hoffmann <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/virtio/virtgpu_drv.h | 2 ++
drivers/gpu/drm/virtio/virtgpu_gem.c | 19 +++++++++++++++++++
drivers/gpu/drm/virtio/virtgpu_plane.c | 17 +++++++++++++++--
3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index bb7d86a0c6a1..83d1e4622292 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -318,6 +318,8 @@ void virtio_gpu_array_put_free(struct virtio_gpu_object_array *objs);
void virtio_gpu_array_put_free_delayed(struct virtio_gpu_device *vgdev,
struct virtio_gpu_object_array *objs);
void virtio_gpu_array_put_free_work(struct work_struct *work);
+int virtio_gpu_gem_pin(struct virtio_gpu_object *bo);
+void virtio_gpu_gem_unpin(struct virtio_gpu_object *bo);

/* virtgpu_vq.c */
int virtio_gpu_alloc_vbufs(struct virtio_gpu_device *vgdev);
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
index 7db48d17ee3a..625c05d625bf 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -294,3 +294,22 @@ void virtio_gpu_array_put_free_work(struct work_struct *work)
}
spin_unlock(&vgdev->obj_free_lock);
}
+
+int virtio_gpu_gem_pin(struct virtio_gpu_object *bo)
+{
+ int err;
+
+ if (virtio_gpu_is_shmem(bo)) {
+ err = drm_gem_shmem_pin(&bo->base);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+void virtio_gpu_gem_unpin(struct virtio_gpu_object *bo)
+{
+ if (virtio_gpu_is_shmem(bo))
+ drm_gem_shmem_unpin(&bo->base);
+}
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index a72a2dbda031..162fb8a44d71 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -248,20 +248,28 @@ static int virtio_gpu_plane_prepare_fb(struct drm_plane *plane,
struct virtio_gpu_device *vgdev = dev->dev_private;
struct virtio_gpu_framebuffer *vgfb;
struct virtio_gpu_object *bo;
+ int err;

if (!new_state->fb)
return 0;

vgfb = to_virtio_gpu_framebuffer(new_state->fb);
bo = gem_to_virtio_gpu_obj(vgfb->base.obj[0]);
- if (!bo || (plane->type == DRM_PLANE_TYPE_PRIMARY && !bo->guest_blob))
+
+ err = virtio_gpu_gem_pin(bo);
+ if (err)
+ return err;
+
+ if (plane->type == DRM_PLANE_TYPE_PRIMARY && !bo->guest_blob)
return 0;

if (bo->dumb && (plane->state->fb != new_state->fb)) {
vgfb->fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context,
0);
- if (!vgfb->fence)
+ if (!vgfb->fence) {
+ virtio_gpu_gem_unpin(bo);
return -ENOMEM;
+ }
}

return 0;
@@ -271,15 +279,20 @@ static void virtio_gpu_plane_cleanup_fb(struct drm_plane *plane,
struct drm_plane_state *state)
{
struct virtio_gpu_framebuffer *vgfb;
+ struct virtio_gpu_object *bo;

if (!state->fb)
return;

vgfb = to_virtio_gpu_framebuffer(state->fb);
+ bo = gem_to_virtio_gpu_obj(vgfb->base.obj[0]);
+
if (vgfb->fence) {
dma_fence_put(&vgfb->fence->f);
vgfb->fence = NULL;
}
+
+ virtio_gpu_gem_unpin(bo);
}

static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
--
2.43.0


2024-01-05 18:54:05

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 28/30] drm/virtio: Attach shmem BOs dynamically

Prepare for the addition of memory shrinker support by attaching shmem
pages to the host dynamically on first use. Previously the attachment vq
command wasn't fenced and no vq kick was made in the BO creation code
path, hence the attachment was already happening dynamically, but
implicitly. Making the attachment explicitly dynamic will allow
simplifying and reusing more code when the shrinker is added.
virtio_gpu_object_shmem_init() now works under the held reservation lock,
which will be important for the shrinker to avoid moving pages while they
are in active use by the driver.
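
As a condensed illustration of the new per-BO contract (a hypothetical
helper mirroring virtio_gpu_array_prepare() from the diff below): before
a BO is referenced by a host command, the driver re-attaches its shmem
pages under the held reservation lock if the BO is still detached:

static int example_prepare_bo_locked(struct virtio_gpu_object *bo)
{
        dma_resv_assert_held(bo->base.base.resv);

        /* Attach guest pages to the host on first use of the BO. */
        if (virtio_gpu_is_shmem(bo) && bo->detached)
                return virtio_gpu_reattach_shmem_object_locked(bo);

        return 0;
}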

Acked-by: Gerd Hoffmann <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/virtio/virtgpu_drv.h | 7 +++
drivers/gpu/drm/virtio/virtgpu_gem.c | 26 +++++++++
drivers/gpu/drm/virtio/virtgpu_ioctl.c | 32 +++++++----
drivers/gpu/drm/virtio/virtgpu_object.c | 73 ++++++++++++++++++++-----
drivers/gpu/drm/virtio/virtgpu_submit.c | 15 ++++-
5 files changed, 125 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 83d1e4622292..1837dc7ea9fb 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -92,6 +92,7 @@ struct virtio_gpu_object {
uint32_t hw_res_handle;
bool dumb;
bool created;
+ bool detached;
bool host3d_blob, guest_blob;
uint32_t blob_mem, blob_flags;

@@ -318,6 +319,8 @@ void virtio_gpu_array_put_free(struct virtio_gpu_object_array *objs);
void virtio_gpu_array_put_free_delayed(struct virtio_gpu_device *vgdev,
struct virtio_gpu_object_array *objs);
void virtio_gpu_array_put_free_work(struct work_struct *work);
+int virtio_gpu_array_prepare(struct virtio_gpu_device *vgdev,
+ struct virtio_gpu_object_array *objs);
int virtio_gpu_gem_pin(struct virtio_gpu_object *bo);
void virtio_gpu_gem_unpin(struct virtio_gpu_object *bo);

@@ -458,6 +461,10 @@ int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,

bool virtio_gpu_is_shmem(struct virtio_gpu_object *bo);

+int virtio_gpu_reattach_shmem_object_locked(struct virtio_gpu_object *bo);
+
+int virtio_gpu_reattach_shmem_object(struct virtio_gpu_object *bo);
+
int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
uint32_t *resid);
/* virtgpu_prime.c */
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
index 625c05d625bf..97e67064c97e 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -295,6 +295,26 @@ void virtio_gpu_array_put_free_work(struct work_struct *work)
spin_unlock(&vgdev->obj_free_lock);
}

+int virtio_gpu_array_prepare(struct virtio_gpu_device *vgdev,
+ struct virtio_gpu_object_array *objs)
+{
+ struct virtio_gpu_object *bo;
+ int ret = 0;
+ u32 i;
+
+ for (i = 0; i < objs->nents; i++) {
+ bo = gem_to_virtio_gpu_obj(objs->objs[i]);
+
+ if (virtio_gpu_is_shmem(bo) && bo->detached) {
+ ret = virtio_gpu_reattach_shmem_object_locked(bo);
+ if (ret)
+ break;
+ }
+ }
+
+ return ret;
+}
+
int virtio_gpu_gem_pin(struct virtio_gpu_object *bo)
{
int err;
@@ -303,6 +323,12 @@ int virtio_gpu_gem_pin(struct virtio_gpu_object *bo)
err = drm_gem_shmem_pin(&bo->base);
if (err)
return err;
+
+ err = virtio_gpu_reattach_shmem_object(bo);
+ if (err) {
+ drm_gem_shmem_unpin(&bo->base);
+ return err;
+ }
}

return 0;
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index e4f76f315550..c7da22006149 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -256,6 +256,10 @@ static int virtio_gpu_transfer_from_host_ioctl(struct drm_device *dev,
if (ret != 0)
goto err_put_free;

+ ret = virtio_gpu_array_prepare(vgdev, objs);
+ if (ret)
+ goto err_unlock;
+
fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
if (!fence) {
ret = -ENOMEM;
@@ -298,11 +302,25 @@ static int virtio_gpu_transfer_to_host_ioctl(struct drm_device *dev, void *data,
goto err_put_free;
}

+ ret = virtio_gpu_array_lock_resv(objs);
+ if (ret != 0)
+ goto err_put_free;
+
+ ret = virtio_gpu_array_prepare(vgdev, objs);
+ if (ret)
+ goto err_unlock;
+
+ fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
+ if (!fence) {
+ ret = -ENOMEM;
+ goto err_unlock;
+ }
+
if (!vgdev->has_virgl_3d) {
virtio_gpu_cmd_transfer_to_host_2d
(vgdev, offset,
args->box.w, args->box.h, args->box.x, args->box.y,
- objs, NULL);
+ objs, fence);
} else {
virtio_gpu_create_context(dev, file);

@@ -311,23 +329,13 @@ static int virtio_gpu_transfer_to_host_ioctl(struct drm_device *dev, void *data,
goto err_put_free;
}

- ret = virtio_gpu_array_lock_resv(objs);
- if (ret != 0)
- goto err_put_free;
-
- ret = -ENOMEM;
- fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context,
- 0);
- if (!fence)
- goto err_unlock;
-
virtio_gpu_cmd_transfer_to_host_3d
(vgdev,
vfpriv ? vfpriv->ctx_id : 0, offset, args->level,
args->stride, args->layer_stride, &args->box, objs,
fence);
- dma_fence_put(&fence->f);
}
+ dma_fence_put(&fence->f);
virtio_gpu_notify(vgdev);
return 0;

diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c
index e58528c562ef..de347aa3b9a8 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -143,7 +143,7 @@ static int virtio_gpu_object_shmem_init(struct virtio_gpu_device *vgdev,
struct sg_table *pages;
int si;

- pages = drm_gem_shmem_get_pages_sgt(&bo->base);
+ pages = drm_gem_shmem_get_pages_sgt_locked(&bo->base);
if (IS_ERR(pages))
return PTR_ERR(pages);

@@ -177,6 +177,40 @@ static int virtio_gpu_object_shmem_init(struct virtio_gpu_device *vgdev,
return 0;
}

+int virtio_gpu_reattach_shmem_object_locked(struct virtio_gpu_object *bo)
+{
+ struct virtio_gpu_device *vgdev = bo->base.base.dev->dev_private;
+ struct virtio_gpu_mem_entry *ents;
+ unsigned int nents;
+ int err;
+
+ if (!bo->detached)
+ return 0;
+
+ err = virtio_gpu_object_shmem_init(vgdev, bo, &ents, &nents);
+ if (err)
+ return err;
+
+ virtio_gpu_object_attach(vgdev, bo, ents, nents);
+
+ bo->detached = false;
+
+ return 0;
+}
+
+int virtio_gpu_reattach_shmem_object(struct virtio_gpu_object *bo)
+{
+ int ret;
+
+ ret = dma_resv_lock_interruptible(bo->base.base.resv, NULL);
+ if (ret)
+ return ret;
+ ret = virtio_gpu_reattach_shmem_object_locked(bo);
+ dma_resv_unlock(bo->base.base.resv);
+
+ return ret;
+}
+
int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
struct virtio_gpu_object_params *params,
struct virtio_gpu_object **bo_ptr,
@@ -207,45 +241,56 @@ int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,

bo->dumb = params->dumb;

- ret = virtio_gpu_object_shmem_init(vgdev, bo, &ents, &nents);
- if (ret != 0)
- goto err_put_id;
+ if (bo->blob_mem == VIRTGPU_BLOB_MEM_GUEST)
+ bo->guest_blob = true;

if (fence) {
ret = -ENOMEM;
objs = virtio_gpu_array_alloc(1);
if (!objs)
- goto err_free_entry;
+ goto err_put_id;
virtio_gpu_array_add_obj(objs, &bo->base.base);

ret = virtio_gpu_array_lock_resv(objs);
if (ret != 0)
goto err_put_objs;
+ } else {
+ ret = dma_resv_lock(bo->base.base.resv, NULL);
+ if (ret)
+ goto err_put_id;
}

if (params->blob) {
- if (params->blob_mem == VIRTGPU_BLOB_MEM_GUEST)
- bo->guest_blob = true;
+ ret = virtio_gpu_object_shmem_init(vgdev, bo, &ents, &nents);
+ if (ret)
+ goto err_unlock_objs;
+ } else {
+ bo->detached = true;
+ }

+ if (params->blob)
virtio_gpu_cmd_resource_create_blob(vgdev, bo, params,
ents, nents);
- } else if (params->virgl) {
+ else if (params->virgl)
virtio_gpu_cmd_resource_create_3d(vgdev, bo, params,
objs, fence);
- virtio_gpu_object_attach(vgdev, bo, ents, nents);
- } else {
+ else
virtio_gpu_cmd_create_resource(vgdev, bo, params,
objs, fence);
- virtio_gpu_object_attach(vgdev, bo, ents, nents);
- }
+
+ if (!fence)
+ dma_resv_unlock(bo->base.base.resv);

*bo_ptr = bo;
return 0;

+err_unlock_objs:
+ if (fence)
+ virtio_gpu_array_unlock_resv(objs);
+ else
+ dma_resv_unlock(bo->base.base.resv);
err_put_objs:
virtio_gpu_array_put_free(objs);
-err_free_entry:
- kvfree(ents);
err_put_id:
virtio_gpu_resource_id_put(vgdev, bo->hw_res_handle);
err_put_pages:
diff --git a/drivers/gpu/drm/virtio/virtgpu_submit.c b/drivers/gpu/drm/virtio/virtgpu_submit.c
index 5c514946bbad..6e4ef2593e8f 100644
--- a/drivers/gpu/drm/virtio/virtgpu_submit.c
+++ b/drivers/gpu/drm/virtio/virtgpu_submit.c
@@ -464,8 +464,19 @@ static void virtio_gpu_install_out_fence_fd(struct virtio_gpu_submit *submit)

static int virtio_gpu_lock_buflist(struct virtio_gpu_submit *submit)
{
- if (submit->buflist)
- return virtio_gpu_array_lock_resv(submit->buflist);
+ int err;
+
+ if (submit->buflist) {
+ err = virtio_gpu_array_lock_resv(submit->buflist);
+ if (err)
+ return err;
+
+ err = virtio_gpu_array_prepare(submit->vgdev, submit->buflist);
+ if (err) {
+ virtio_gpu_array_unlock_resv(submit->buflist);
+ return err;
+ }
+ }

return 0;
}
--
2.43.0


2024-01-05 18:54:21

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 29/30] drm/virtio: Support shmem shrinking

Support the generic drm-shmem memory shrinker and add a new madvise
IOCTL to the VirtIO-GPU driver. The BO cache manager of the Mesa driver
will mark BOs as "don't need" using the new IOCTL to let the shrinker
purge the marked BOs on OOM. The shrinker will also evict unpurgeable
shmem BOs from memory if the guest supports a SWAP file or partition.
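
A minimal userspace sketch of the new IOCTL, based on the uapi added
below (the helper name and error handling are illustrative only):

#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

/* Returns non-zero if the BO is retained, 0 if it was purged, < 0 on error. */
static int example_bo_madvise(int drm_fd, uint32_t bo_handle, uint32_t madv)
{
        struct drm_virtgpu_madvise args = {
                .bo_handle = bo_handle,
                .madv = madv, /* VIRTGPU_MADV_WILLNEED or VIRTGPU_MADV_DONTNEED */
        };

        if (ioctl(drm_fd, DRM_IOCTL_VIRTGPU_MADVISE, &args))
                return -errno;

        return args.retained;
}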

Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15278
Acked-by: Gerd Hoffmann <[email protected]>
Signed-off-by: Daniel Almeida <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/virtio/virtgpu_drv.h | 13 +++++-
drivers/gpu/drm/virtio/virtgpu_gem.c | 48 +++++++++++++++++--
drivers/gpu/drm/virtio/virtgpu_ioctl.c | 25 ++++++++++
drivers/gpu/drm/virtio/virtgpu_kms.c | 8 ++++
drivers/gpu/drm/virtio/virtgpu_object.c | 61 +++++++++++++++++++++++++
drivers/gpu/drm/virtio/virtgpu_vq.c | 40 ++++++++++++++++
include/uapi/drm/virtgpu_drm.h | 14 ++++++
7 files changed, 204 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 1837dc7ea9fb..37188c00e161 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -283,7 +283,7 @@ struct virtio_gpu_fpriv {
};

/* virtgpu_ioctl.c */
-#define DRM_VIRTIO_NUM_IOCTLS 12
+#define DRM_VIRTIO_NUM_IOCTLS 13
extern struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS];
void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file);

@@ -321,6 +321,8 @@ void virtio_gpu_array_put_free_delayed(struct virtio_gpu_device *vgdev,
void virtio_gpu_array_put_free_work(struct work_struct *work);
int virtio_gpu_array_prepare(struct virtio_gpu_device *vgdev,
struct virtio_gpu_object_array *objs);
+int virtio_gpu_gem_host_mem_release(struct virtio_gpu_object *bo);
+int virtio_gpu_gem_madvise(struct virtio_gpu_object *obj, int madv);
int virtio_gpu_gem_pin(struct virtio_gpu_object *bo);
void virtio_gpu_gem_unpin(struct virtio_gpu_object *bo);

@@ -334,6 +336,8 @@ void virtio_gpu_cmd_create_resource(struct virtio_gpu_device *vgdev,
struct virtio_gpu_fence *fence);
void virtio_gpu_cmd_unref_resource(struct virtio_gpu_device *vgdev,
struct virtio_gpu_object *bo);
+int virtio_gpu_cmd_release_resource(struct virtio_gpu_device *vgdev,
+ struct virtio_gpu_object *bo);
void virtio_gpu_cmd_transfer_to_host_2d(struct virtio_gpu_device *vgdev,
uint64_t offset,
uint32_t width, uint32_t height,
@@ -354,6 +358,9 @@ void virtio_gpu_object_attach(struct virtio_gpu_device *vgdev,
struct virtio_gpu_object *obj,
struct virtio_gpu_mem_entry *ents,
unsigned int nents);
+void virtio_gpu_object_detach(struct virtio_gpu_device *vgdev,
+ struct virtio_gpu_object *obj,
+ struct virtio_gpu_fence *fence);
void virtio_gpu_cursor_ping(struct virtio_gpu_device *vgdev,
struct virtio_gpu_output *output);
int virtio_gpu_cmd_get_display_info(struct virtio_gpu_device *vgdev);
@@ -497,4 +504,8 @@ void virtio_gpu_vram_unmap_dma_buf(struct device *dev,
int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);

+/* virtgpu_gem_shrinker.c */
+int virtio_gpu_gem_shrinker_init(struct virtio_gpu_device *vgdev);
+void virtio_gpu_gem_shrinker_fini(struct virtio_gpu_device *vgdev);
+
#endif
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
index 97e67064c97e..68d27ae582ba 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -147,10 +147,20 @@ void virtio_gpu_gem_object_close(struct drm_gem_object *obj,
struct virtio_gpu_device *vgdev = obj->dev->dev_private;
struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
struct virtio_gpu_object_array *objs;
+ struct virtio_gpu_object *bo;

if (!vgdev->has_virgl_3d)
return;

+ bo = gem_to_virtio_gpu_obj(obj);
+
+ /*
+ * Purged BO was already detached and released, the resource ID
+ * is invalid by now.
+ */
+ if (!virtio_gpu_gem_madvise(bo, VIRTGPU_MADV_WILLNEED))
+ return;
+
objs = virtio_gpu_array_alloc(1);
if (!objs)
return;
@@ -305,16 +315,46 @@ int virtio_gpu_array_prepare(struct virtio_gpu_device *vgdev,
for (i = 0; i < objs->nents; i++) {
bo = gem_to_virtio_gpu_obj(objs->objs[i]);

- if (virtio_gpu_is_shmem(bo) && bo->detached) {
- ret = virtio_gpu_reattach_shmem_object_locked(bo);
- if (ret)
- break;
+ if (virtio_gpu_is_shmem(bo)) {
+ if (bo->base.madv)
+ return -EINVAL;
+
+ if (bo->detached) {
+ ret = virtio_gpu_reattach_shmem_object_locked(bo);
+ if (ret)
+ break;
+ }
}
}

return ret;
}

+int virtio_gpu_gem_madvise(struct virtio_gpu_object *bo, int madv)
+{
+ if (virtio_gpu_is_shmem(bo))
+ return drm_gem_shmem_object_madvise(&bo->base.base, madv);
+
+ return 1;
+}
+
+int virtio_gpu_gem_host_mem_release(struct virtio_gpu_object *bo)
+{
+ struct virtio_gpu_device *vgdev = bo->base.base.dev->dev_private;
+ int err;
+
+ if (bo->created) {
+ err = virtio_gpu_cmd_release_resource(vgdev, bo);
+ if (err)
+ return err;
+
+ virtio_gpu_notify(vgdev);
+ bo->created = false;
+ }
+
+ return 0;
+}
+
int virtio_gpu_gem_pin(struct virtio_gpu_object *bo)
{
int err;
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index c7da22006149..a42799146090 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -701,6 +701,28 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev,
return ret;
}

+static int virtio_gpu_madvise_ioctl(struct drm_device *dev,
+ void *data,
+ struct drm_file *file)
+{
+ struct drm_virtgpu_madvise *args = data;
+ struct virtio_gpu_object *bo;
+ struct drm_gem_object *obj;
+
+ if (args->madv > VIRTGPU_MADV_DONTNEED)
+ return -EOPNOTSUPP;
+
+ obj = drm_gem_object_lookup(file, args->bo_handle);
+ if (!obj)
+ return -ENOENT;
+
+ bo = gem_to_virtio_gpu_obj(obj);
+ args->retained = virtio_gpu_gem_madvise(bo, args->madv);
+ drm_gem_object_put(obj);
+
+ return 0;
+}
+
struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS] = {
DRM_IOCTL_DEF_DRV(VIRTGPU_MAP, virtio_gpu_map_ioctl,
DRM_RENDER_ALLOW),
@@ -740,4 +762,7 @@ struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS] = {

DRM_IOCTL_DEF_DRV(VIRTGPU_CONTEXT_INIT, virtio_gpu_context_init_ioctl,
DRM_RENDER_ALLOW),
+
+ DRM_IOCTL_DEF_DRV(VIRTGPU_MADVISE, virtio_gpu_madvise_ioctl,
+ DRM_RENDER_ALLOW),
};
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c
index 5a3b5aaed1f3..43e237082cec 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -245,6 +245,12 @@ int virtio_gpu_init(struct virtio_device *vdev, struct drm_device *dev)
goto err_scanouts;
}

+ ret = drmm_gem_shmem_init(dev);
+ if (ret) {
+ DRM_ERROR("shmem init failed\n");
+ goto err_modeset;
+ }
+
virtio_device_ready(vgdev->vdev);

if (num_capsets)
@@ -259,6 +265,8 @@ int virtio_gpu_init(struct virtio_device *vdev, struct drm_device *dev)
}
return 0;

+err_modeset:
+ virtio_gpu_modeset_fini(vgdev);
err_scanouts:
virtio_gpu_free_vbufs(vgdev);
err_vbufs:
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c
index de347aa3b9a8..86888c1ae5d4 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -98,6 +98,60 @@ static void virtio_gpu_free_object(struct drm_gem_object *obj)
virtio_gpu_cleanup_object(bo);
}

+static int virtio_gpu_detach_object_fenced(struct virtio_gpu_object *bo)
+{
+ struct virtio_gpu_device *vgdev = bo->base.base.dev->dev_private;
+ struct virtio_gpu_fence *fence;
+
+ if (bo->detached)
+ return 0;
+
+ fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0);
+ if (!fence)
+ return -ENOMEM;
+
+ virtio_gpu_object_detach(vgdev, bo, fence);
+ virtio_gpu_notify(vgdev);
+
+ dma_fence_wait(&fence->f, false);
+ dma_fence_put(&fence->f);
+
+ bo->detached = true;
+
+ return 0;
+}
+
+static int virtio_gpu_shmem_evict(struct drm_gem_object *obj)
+{
+ struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
+ int err;
+
+ /* blob is not movable, it's impossible to detach it from host */
+ if (bo->blob_mem)
+ return -EBUSY;
+
+ /*
+ * At first tell host to stop using guest's memory to ensure that
+ * host won't touch the released guest's memory once it's gone.
+ */
+ err = virtio_gpu_detach_object_fenced(bo);
+ if (err)
+ return err;
+
+ if (drm_gem_shmem_is_purgeable(&bo->base)) {
+ err = virtio_gpu_gem_host_mem_release(bo);
+ if (err)
+ return err;
+
+ drm_gem_shmem_purge_locked(&bo->base);
+ } else {
+ bo->base.pages_mark_dirty_on_put = 1;
+ drm_gem_shmem_evict_locked(&bo->base);
+ }
+
+ return 0;
+}
+
static const struct drm_gem_object_funcs virtio_gpu_shmem_funcs = {
.free = virtio_gpu_free_object,
.open = virtio_gpu_gem_object_open,
@@ -111,6 +165,7 @@ static const struct drm_gem_object_funcs virtio_gpu_shmem_funcs = {
.vunmap = drm_gem_shmem_object_vunmap,
.mmap = drm_gem_shmem_object_mmap,
.vm_ops = &drm_gem_shmem_vm_ops,
+ .evict = virtio_gpu_shmem_evict,
};

bool virtio_gpu_is_shmem(struct virtio_gpu_object *bo)
@@ -187,6 +242,10 @@ int virtio_gpu_reattach_shmem_object_locked(struct virtio_gpu_object *bo)
if (!bo->detached)
return 0;

+ err = drm_gem_shmem_swapin_locked(&bo->base);
+ if (err)
+ return err;
+
err = virtio_gpu_object_shmem_init(vgdev, bo, &ents, &nents);
if (err)
return err;
@@ -240,6 +299,8 @@ int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
goto err_put_pages;

bo->dumb = params->dumb;
+ bo->blob_mem = params->blob_mem;
+ bo->blob_flags = params->blob_flags;

if (bo->blob_mem == VIRTGPU_BLOB_MEM_GUEST)
bo->guest_blob = true;
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c
index b1a00c0c25a7..14ab470f413a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -545,6 +545,21 @@ void virtio_gpu_cmd_unref_resource(struct virtio_gpu_device *vgdev,
virtio_gpu_cleanup_object(bo);
}

+int virtio_gpu_cmd_release_resource(struct virtio_gpu_device *vgdev,
+ struct virtio_gpu_object *bo)
+{
+ struct virtio_gpu_resource_unref *cmd_p;
+ struct virtio_gpu_vbuffer *vbuf;
+
+ cmd_p = virtio_gpu_alloc_cmd(vgdev, &vbuf, sizeof(*cmd_p));
+ memset(cmd_p, 0, sizeof(*cmd_p));
+
+ cmd_p->hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_RESOURCE_UNREF);
+ cmd_p->resource_id = cpu_to_le32(bo->hw_res_handle);
+
+ return virtio_gpu_queue_ctrl_buffer(vgdev, vbuf);
+}
+
void virtio_gpu_cmd_set_scanout(struct virtio_gpu_device *vgdev,
uint32_t scanout_id, uint32_t resource_id,
uint32_t width, uint32_t height,
@@ -645,6 +660,23 @@ virtio_gpu_cmd_resource_attach_backing(struct virtio_gpu_device *vgdev,
virtio_gpu_queue_fenced_ctrl_buffer(vgdev, vbuf, fence);
}

+static void
+virtio_gpu_cmd_resource_detach_backing(struct virtio_gpu_device *vgdev,
+ u32 resource_id,
+ struct virtio_gpu_fence *fence)
+{
+ struct virtio_gpu_resource_attach_backing *cmd_p;
+ struct virtio_gpu_vbuffer *vbuf;
+
+ cmd_p = virtio_gpu_alloc_cmd(vgdev, &vbuf, sizeof(*cmd_p));
+ memset(cmd_p, 0, sizeof(*cmd_p));
+
+ cmd_p->hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING);
+ cmd_p->resource_id = cpu_to_le32(resource_id);
+
+ virtio_gpu_queue_fenced_ctrl_buffer(vgdev, vbuf, fence);
+}
+
static void virtio_gpu_cmd_get_display_info_cb(struct virtio_gpu_device *vgdev,
struct virtio_gpu_vbuffer *vbuf)
{
@@ -1107,6 +1139,14 @@ void virtio_gpu_object_attach(struct virtio_gpu_device *vgdev,
ents, nents, NULL);
}

+void virtio_gpu_object_detach(struct virtio_gpu_device *vgdev,
+ struct virtio_gpu_object *obj,
+ struct virtio_gpu_fence *fence)
+{
+ virtio_gpu_cmd_resource_detach_backing(vgdev, obj->hw_res_handle,
+ fence);
+}
+
void virtio_gpu_cursor_ping(struct virtio_gpu_device *vgdev,
struct virtio_gpu_output *output)
{
diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h
index c2ce71987e9b..78255060bc9a 100644
--- a/include/uapi/drm/virtgpu_drm.h
+++ b/include/uapi/drm/virtgpu_drm.h
@@ -48,6 +48,7 @@ extern "C" {
#define DRM_VIRTGPU_GET_CAPS 0x09
#define DRM_VIRTGPU_RESOURCE_CREATE_BLOB 0x0a
#define DRM_VIRTGPU_CONTEXT_INIT 0x0b
+#define DRM_VIRTGPU_MADVISE 0x0c

#define VIRTGPU_EXECBUF_FENCE_FD_IN 0x01
#define VIRTGPU_EXECBUF_FENCE_FD_OUT 0x02
@@ -213,6 +214,15 @@ struct drm_virtgpu_context_init {
__u64 ctx_set_params;
};

+#define VIRTGPU_MADV_WILLNEED 0
+#define VIRTGPU_MADV_DONTNEED 1
+struct drm_virtgpu_madvise {
+ __u32 bo_handle;
+ __u32 retained; /* out, non-zero if BO can be used */
+ __u32 madv;
+ __u32 pad;
+};
+
/*
* Event code that's given when VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK is in
* effect. The event size is sizeof(drm_event), since there is no additional
@@ -263,6 +273,10 @@ struct drm_virtgpu_context_init {
DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_CONTEXT_INIT, \
struct drm_virtgpu_context_init)

+#define DRM_IOCTL_VIRTGPU_MADVISE \
+ DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_MADVISE, \
+ struct drm_virtgpu_madvise)
+
#if defined(__cplusplus)
}
#endif
--
2.43.0


2024-01-05 18:54:39

by Dmitry Osipenko

[permalink] [raw]
Subject: [PATCH v19 30/30] drm/panfrost: Switch to generic memory shrinker

Replace Panfrost's custom memory shrinker with a common drm-shmem
memory shrinker.

Co-developed-by: Boris Brezillon <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Signed-off-by: Dmitry Osipenko <[email protected]>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 4 +-
drivers/gpu/drm/panfrost/Makefile | 1 -
drivers/gpu/drm/panfrost/panfrost_device.h | 4 -
drivers/gpu/drm/panfrost/panfrost_drv.c | 29 ++--
drivers/gpu/drm/panfrost/panfrost_gem.c | 60 ++++----
drivers/gpu/drm/panfrost/panfrost_gem.h | 9 --
.../gpu/drm/panfrost/panfrost_gem_shrinker.c | 140 ------------------
drivers/gpu/drm/panfrost/panfrost_job.c | 18 ++-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 24 ++-
include/drm/drm_gem_shmem_helper.h | 7 -
10 files changed, 83 insertions(+), 213 deletions(-)
delete mode 100644 drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 7d2fe12bd793..56e88378079b 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -89,8 +89,6 @@ __drm_gem_shmem_create(struct drm_device *dev, size_t size, bool private)
if (ret)
goto err_release;

- INIT_LIST_HEAD(&shmem->madv_list);
-
if (!private) {
/*
* Our buffers are kept pinned, so allocating them
@@ -619,6 +617,8 @@ void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem)
{
struct drm_gem_object *obj = &shmem->base;

+ drm_WARN_ON_ONCE(obj->dev, !drm_gem_shmem_is_purgeable(shmem));
+
drm_gem_shmem_shrinker_put_pages_locked(shmem);
drm_gem_free_mmap_offset(obj);

diff --git a/drivers/gpu/drm/panfrost/Makefile b/drivers/gpu/drm/panfrost/Makefile
index 2c01c1e7523e..f2cb1ab0a32d 100644
--- a/drivers/gpu/drm/panfrost/Makefile
+++ b/drivers/gpu/drm/panfrost/Makefile
@@ -5,7 +5,6 @@ panfrost-y := \
panfrost_device.o \
panfrost_devfreq.o \
panfrost_gem.o \
- panfrost_gem_shrinker.o \
panfrost_gpu.o \
panfrost_job.o \
panfrost_mmu.o \
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index 62f7e3527385..cea6df9cd650 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -140,10 +140,6 @@ struct panfrost_device {
atomic_t pending;
} reset;

- struct mutex shrinker_lock;
- struct list_head shrinker_list;
- struct shrinker *shrinker;
-
struct panfrost_devfreq pfdevfreq;

struct {
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index a15d62f19afb..5c730d15a24d 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -171,7 +171,6 @@ panfrost_lookup_bos(struct drm_device *dev,
break;
}

- atomic_inc(&bo->gpu_usecount);
job->mappings[i] = mapping;
}

@@ -397,7 +396,6 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,
{
struct panfrost_file_priv *priv = file_priv->driver_priv;
struct drm_panfrost_madvise *args = data;
- struct panfrost_device *pfdev = dev->dev_private;
struct drm_gem_object *gem_obj;
struct panfrost_gem_object *bo;
int ret = 0;
@@ -410,11 +408,15 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,

bo = to_panfrost_bo(gem_obj);

+ if (bo->is_heap) {
+ args->retained = 1;
+ goto out_put_object;
+ }
+
ret = dma_resv_lock_interruptible(bo->base.base.resv, NULL);
if (ret)
goto out_put_object;

- mutex_lock(&pfdev->shrinker_lock);
mutex_lock(&bo->mappings.lock);
if (args->madv == PANFROST_MADV_DONTNEED) {
struct panfrost_gem_mapping *first;
@@ -440,17 +442,8 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,

args->retained = drm_gem_shmem_madvise_locked(&bo->base, args->madv);

- if (args->retained) {
- if (args->madv == PANFROST_MADV_DONTNEED)
- list_move_tail(&bo->base.madv_list,
- &pfdev->shrinker_list);
- else if (args->madv == PANFROST_MADV_WILLNEED)
- list_del_init(&bo->base.madv_list);
- }
-
out_unlock_mappings:
mutex_unlock(&bo->mappings.lock);
- mutex_unlock(&pfdev->shrinker_lock);
dma_resv_unlock(bo->base.base.resv);
out_put_object:
drm_gem_object_put(gem_obj);
@@ -635,9 +628,6 @@ static int panfrost_probe(struct platform_device *pdev)
ddev->dev_private = pfdev;
pfdev->ddev = ddev;

- mutex_init(&pfdev->shrinker_lock);
- INIT_LIST_HEAD(&pfdev->shrinker_list);
-
err = panfrost_device_init(pfdev);
if (err) {
if (err != -EPROBE_DEFER)
@@ -659,13 +649,13 @@ static int panfrost_probe(struct platform_device *pdev)
if (err < 0)
goto err_out1;

- err = panfrost_gem_shrinker_init(ddev);
- if (err)
- goto err_out2;
+ err = drmm_gem_shmem_init(ddev);
+ if (err < 0)
+ goto err_unregister_dev;

return 0;

-err_out2:
+err_unregister_dev:
drm_dev_unregister(ddev);
err_out1:
pm_runtime_disable(pfdev->dev);
@@ -682,7 +672,6 @@ static void panfrost_remove(struct platform_device *pdev)
struct drm_device *ddev = pfdev->ddev;

drm_dev_unregister(ddev);
- panfrost_gem_shrinker_cleanup(ddev);

pm_runtime_get_sync(pfdev->dev);
pm_runtime_disable(pfdev->dev);
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 8c26b7e41b95..05eb5a89c4ed 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -17,17 +17,6 @@
static void panfrost_gem_free_object(struct drm_gem_object *obj)
{
struct panfrost_gem_object *bo = to_panfrost_bo(obj);
- struct panfrost_device *pfdev = obj->dev->dev_private;
-
- /*
- * Make sure the BO is no longer inserted in the shrinker list before
- * taking care of the destruction itself. If we don't do that we have a
- * race condition between this function and what's done in
- * panfrost_gem_shrinker_scan().
- */
- mutex_lock(&pfdev->shrinker_lock);
- list_del_init(&bo->base.madv_list);
- mutex_unlock(&pfdev->shrinker_lock);

/*
* If we still have mappings attached to the BO, there's a problem in
@@ -57,26 +46,23 @@ panfrost_gem_mapping_get(struct panfrost_gem_object *bo,
return mapping;
}

-static void
-panfrost_gem_teardown_mapping(struct panfrost_gem_mapping *mapping)
+static void panfrost_gem_mapping_release(struct kref *kref)
{
+ struct panfrost_gem_mapping *mapping =
+ container_of(kref, struct panfrost_gem_mapping, refcount);
+ struct panfrost_gem_object *bo = mapping->obj;
+ struct panfrost_device *pfdev = bo->base.base.dev->dev_private;
+
+ /* Shrinker may purge the mapping at the same time. */
+ dma_resv_lock(mapping->obj->base.base.resv, NULL);
if (mapping->active)
panfrost_mmu_unmap(mapping);
+ dma_resv_unlock(mapping->obj->base.base.resv);

spin_lock(&mapping->mmu->mm_lock);
if (drm_mm_node_allocated(&mapping->mmnode))
drm_mm_remove_node(&mapping->mmnode);
spin_unlock(&mapping->mmu->mm_lock);
-}
-
-static void panfrost_gem_mapping_release(struct kref *kref)
-{
- struct panfrost_gem_mapping *mapping =
- container_of(kref, struct panfrost_gem_mapping, refcount);
- struct panfrost_gem_object *bo = mapping->obj;
- struct panfrost_device *pfdev = bo->base.base.dev->dev_private;
-
- panfrost_gem_teardown_mapping(mapping);

/* On heap BOs, release the sgts created in the fault handler path. */
if (bo->sgts) {
@@ -117,12 +103,14 @@ void panfrost_gem_mapping_put(struct panfrost_gem_mapping *mapping)
kref_put(&mapping->refcount, panfrost_gem_mapping_release);
}

-void panfrost_gem_teardown_mappings_locked(struct panfrost_gem_object *bo)
+void panfrost_gem_evict_mappings_locked(struct panfrost_gem_object *bo)
{
struct panfrost_gem_mapping *mapping;

- list_for_each_entry(mapping, &bo->mappings.list, node)
- panfrost_gem_teardown_mapping(mapping);
+ list_for_each_entry(mapping, &bo->mappings.list, node) {
+ if (mapping->active)
+ panfrost_mmu_unmap(mapping);
+ }
}

int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv)
@@ -251,6 +239,25 @@ static size_t panfrost_gem_rss(struct drm_gem_object *obj)
return 0;
}

+static int panfrost_shmem_evict(struct drm_gem_object *obj)
+{
+ struct panfrost_gem_object *bo = to_panfrost_bo(obj);
+
+ if (!drm_gem_shmem_is_purgeable(&bo->base))
+ return -EBUSY;
+
+ if (!mutex_trylock(&bo->mappings.lock))
+ return -EBUSY;
+
+ panfrost_gem_evict_mappings_locked(bo);
+
+ drm_gem_shmem_purge_locked(&bo->base);
+
+ mutex_unlock(&bo->mappings.lock);
+
+ return 0;
+}
+
static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.free = panfrost_gem_free_object,
.open = panfrost_gem_open,
@@ -265,6 +272,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.status = panfrost_gem_status,
.rss = panfrost_gem_rss,
.vm_ops = &drm_gem_shmem_vm_ops,
+ .evict = panfrost_shmem_evict,
};

/**
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
index 7516b7ecf7fe..8ddc2d310d29 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
@@ -30,12 +30,6 @@ struct panfrost_gem_object {
struct mutex lock;
} mappings;

- /*
- * Count the number of jobs referencing this BO so we don't let the
- * shrinker reclaim this object prematurely.
- */
- atomic_t gpu_usecount;
-
/*
* Object chunk size currently mapped onto physical memory
*/
@@ -86,7 +80,4 @@ panfrost_gem_mapping_get(struct panfrost_gem_object *bo,
void panfrost_gem_mapping_put(struct panfrost_gem_mapping *mapping);
void panfrost_gem_teardown_mappings_locked(struct panfrost_gem_object *bo);

-int panfrost_gem_shrinker_init(struct drm_device *dev);
-void panfrost_gem_shrinker_cleanup(struct drm_device *dev);
-
#endif /* __PANFROST_GEM_H__ */
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
deleted file mode 100644
index 7b4deba803ed..000000000000
--- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
+++ /dev/null
@@ -1,140 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2019 Arm Ltd.
- *
- * Based on msm_gem_freedreno.c:
- * Copyright (C) 2016 Red Hat
- * Author: Rob Clark <[email protected]>
- */
-
-#include <linux/list.h>
-
-#include <drm/drm_device.h>
-#include <drm/drm_gem_shmem_helper.h>
-
-#include "panfrost_device.h"
-#include "panfrost_gem.h"
-#include "panfrost_mmu.h"
-
-static bool panfrost_gem_shmem_is_purgeable(struct drm_gem_shmem_object *shmem)
-{
- return (shmem->madv > 0) &&
- !refcount_read(&shmem->pages_pin_count) && shmem->sgt &&
- !shmem->base.dma_buf && !shmem->base.import_attach;
-}
-
-static unsigned long
-panfrost_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
-{
- struct panfrost_device *pfdev = shrinker->private_data;
- struct drm_gem_shmem_object *shmem;
- unsigned long count = 0;
-
- if (!mutex_trylock(&pfdev->shrinker_lock))
- return 0;
-
- list_for_each_entry(shmem, &pfdev->shrinker_list, madv_list) {
- if (panfrost_gem_shmem_is_purgeable(shmem))
- count += shmem->base.size >> PAGE_SHIFT;
- }
-
- mutex_unlock(&pfdev->shrinker_lock);
-
- return count;
-}
-
-static bool panfrost_gem_purge(struct drm_gem_object *obj)
-{
- struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
- struct panfrost_gem_object *bo = to_panfrost_bo(obj);
- bool ret = false;
-
- if (atomic_read(&bo->gpu_usecount))
- return false;
-
- if (!mutex_trylock(&bo->mappings.lock))
- return false;
-
- if (!dma_resv_trylock(shmem->base.resv))
- goto unlock_mappings;
-
- /* BO might have become unpurgeable if the last pages_use_count ref
- * was dropped, but the BO hasn't been destroyed yet.
- */
- if (!panfrost_gem_shmem_is_purgeable(shmem))
- goto unlock_mappings;
-
- panfrost_gem_teardown_mappings_locked(bo);
- drm_gem_shmem_purge_locked(&bo->base);
- ret = true;
-
- dma_resv_unlock(shmem->base.resv);
-
-unlock_mappings:
- mutex_unlock(&bo->mappings.lock);
- return ret;
-}
-
-static unsigned long
-panfrost_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
-{
- struct panfrost_device *pfdev = shrinker->private_data;
- struct drm_gem_shmem_object *shmem, *tmp;
- unsigned long freed = 0;
-
- if (!mutex_trylock(&pfdev->shrinker_lock))
- return SHRINK_STOP;
-
- list_for_each_entry_safe(shmem, tmp, &pfdev->shrinker_list, madv_list) {
- if (freed >= sc->nr_to_scan)
- break;
- if (panfrost_gem_shmem_is_purgeable(shmem) &&
- panfrost_gem_purge(&shmem->base)) {
- freed += shmem->base.size >> PAGE_SHIFT;
- list_del_init(&shmem->madv_list);
- }
- }
-
- mutex_unlock(&pfdev->shrinker_lock);
-
- if (freed > 0)
- pr_info_ratelimited("Purging %lu bytes\n", freed << PAGE_SHIFT);
-
- return freed;
-}
-
-/**
- * panfrost_gem_shrinker_init - Initialize panfrost shrinker
- * @dev: DRM device
- *
- * This function registers and sets up the panfrost shrinker.
- */
-int panfrost_gem_shrinker_init(struct drm_device *dev)
-{
- struct panfrost_device *pfdev = dev->dev_private;
-
- pfdev->shrinker = shrinker_alloc(0, "drm-panfrost");
- if (!pfdev->shrinker)
- return -ENOMEM;
-
- pfdev->shrinker->count_objects = panfrost_gem_shrinker_count;
- pfdev->shrinker->scan_objects = panfrost_gem_shrinker_scan;
- pfdev->shrinker->private_data = pfdev;
-
- shrinker_register(pfdev->shrinker);
-
- return 0;
-}
-
-/**
- * panfrost_gem_shrinker_cleanup - Clean up panfrost shrinker
- * @dev: DRM device
- *
- * This function unregisters the panfrost shrinker.
- */
-void panfrost_gem_shrinker_cleanup(struct drm_device *dev)
-{
- struct panfrost_device *pfdev = dev->dev_private;
-
- if (pfdev->shrinker)
- shrinker_free(pfdev->shrinker);
-}
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 0c2dbf6ef2a5..9e26cb013191 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -289,6 +289,19 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos,
dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE);
}

+static int panfrost_objects_prepare(struct drm_gem_object **bos, int bo_count)
+{
+ struct panfrost_gem_object *bo;
+ int ret = 0;
+
+ while (!ret && bo_count--) {
+ bo = to_panfrost_bo(bos[bo_count]);
+ ret = bo->base.madv != PANFROST_MADV_WILLNEED ? -EINVAL : 0;
+ }
+
+ return ret;
+}
+
int panfrost_job_push(struct panfrost_job *job)
{
struct panfrost_device *pfdev = job->pfdev;
@@ -300,6 +313,10 @@ int panfrost_job_push(struct panfrost_job *job)
if (ret)
return ret;

+ ret = panfrost_objects_prepare(job->bos, job->bo_count);
+ if (ret)
+ goto unlock;
+
mutex_lock(&pfdev->sched_lock);
drm_sched_job_arm(&job->base);

@@ -341,7 +358,6 @@ static void panfrost_job_cleanup(struct kref *ref)
if (!job->mappings[i])
break;

- atomic_dec(&job->mappings[i]->obj->gpu_usecount);
panfrost_gem_mapping_put(job->mappings[i]);
}
kvfree(job->mappings);
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 4a0b4bf03f1a..22e18f7986e7 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -328,6 +328,7 @@ int panfrost_mmu_map(struct panfrost_gem_mapping *mapping)
struct panfrost_device *pfdev = to_panfrost_device(obj->dev);
struct sg_table *sgt;
int prot = IOMMU_READ | IOMMU_WRITE;
+ int ret = 0;

if (WARN_ON(mapping->active))
return 0;
@@ -335,15 +336,32 @@ int panfrost_mmu_map(struct panfrost_gem_mapping *mapping)
if (bo->noexec)
prot |= IOMMU_NOEXEC;

+ if (!obj->import_attach) {
+ /*
+ * Don't allow shrinker to move pages while pages are mapped.
+ * It's fine to move pages afterwards because shrinker will
+ * take care of unmapping pages during eviction.
+ */
+ ret = drm_gem_shmem_pin(shmem);
+ if (ret)
+ return ret;
+ }
+
sgt = drm_gem_shmem_get_pages_sgt(shmem);
- if (WARN_ON(IS_ERR(sgt)))
- return PTR_ERR(sgt);
+ if (WARN_ON(IS_ERR(sgt))) {
+ ret = PTR_ERR(sgt);
+ goto unpin;
+ }

mmu_map_sg(pfdev, mapping->mmu, mapping->mmnode.start << PAGE_SHIFT,
prot, sgt);
mapping->active = true;

- return 0;
+unpin:
+ if (!obj->import_attach)
+ drm_gem_shmem_unpin(shmem);
+
+ return ret;
}

void panfrost_mmu_unmap(struct panfrost_gem_mapping *mapping)
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index 167f00f089de..9c6bb00260fc 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -61,13 +61,6 @@ struct drm_gem_shmem_object {
*/
int madv;

- /**
- * @madv_list: List entry for madvise tracking
- *
- * Typically used by drivers to track purgeable objects
- */
- struct list_head madv_list;
-
/**
* @sgt: Scatter/gather table for imported PRIME buffers
*/
--
2.43.0


2024-01-25 07:55:24

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 03/30] drm/gem: Document locking rule of vmap and evict callbacks

On Fri, 5 Jan 2024 21:45:57 +0300
Dmitry Osipenko <[email protected]> wrote:

> The vmap/vunmap/evict GEM callbacks are always invoked with a held GEM's
> reservation lock. Document this locking rule for clarity.
>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Boris Brezillon <[email protected]>

> ---
> include/drm/drm_gem.h | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index f835fdee6a5e..021f64371056 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -156,7 +156,8 @@ struct drm_gem_object_funcs {
> * @vmap:
> *
> * Returns a virtual address for the buffer. Used by the
> - * drm_gem_dmabuf_vmap() helper.
> + * drm_gem_dmabuf_vmap() helper. Called with a held GEM reservation
> + * lock.
> *
> * This callback is optional.
> */
> @@ -166,7 +167,8 @@ struct drm_gem_object_funcs {
> * @vunmap:
> *
> * Releases the address previously returned by @vmap. Used by the
> - * drm_gem_dmabuf_vunmap() helper.
> + * drm_gem_dmabuf_vunmap() helper. Called with a held GEM reservation
> + * lock.
> *
> * This callback is optional.
> */
> @@ -189,7 +191,8 @@ struct drm_gem_object_funcs {
> * @evict:
> *
> * Evicts gem object out from memory. Used by the drm_gem_object_evict()
> - * helper. Returns 0 on success, -errno otherwise.
> + * helper. Returns 0 on success, -errno otherwise. Called with a held
> + * GEM reservation lock.
> *
> * This callback is optional.
> */


2024-01-25 08:08:12

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 12/30] drm/shmem-helper: Prepare drm_gem_shmem_free() to shrinker addition

On Fri, 5 Jan 2024 21:46:06 +0300
Dmitry Osipenko <[email protected]> wrote:

> Prepare drm_gem_shmem_free() to addition of memory shrinker support
> to drm-shmem by adding and using variant of put_pages() that doesn't
> touch reservation lock. Reservation shouldn't be touched because lockdep
> will trigger a bogus warning about locking contention with fs_reclaim
> code paths that can't happen during the time when GEM is freed and
> lockdep doesn't know about that.
>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Boris Brezillon <[email protected]>

> ---
> drivers/gpu/drm/drm_gem_shmem_helper.c | 40 ++++++++++++++------------
> 1 file changed, 21 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 3403700780c3..799a3c5015ad 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -128,6 +128,22 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t
> }
> EXPORT_SYMBOL_GPL(drm_gem_shmem_create);
>
> +static void
> +drm_gem_shmem_free_pages(struct drm_gem_shmem_object *shmem)
> +{
> + struct drm_gem_object *obj = &shmem->base;
> +
> +#ifdef CONFIG_X86
> + if (shmem->map_wc)
> + set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
> +#endif
> +
> + drm_gem_put_pages(obj, shmem->pages,
> + shmem->pages_mark_dirty_on_put,
> + shmem->pages_mark_accessed_on_put);
> + shmem->pages = NULL;
> +}
> +
> /**
> * drm_gem_shmem_free - Free resources associated with a shmem GEM object
> * @shmem: shmem GEM object to free
> @@ -142,8 +158,6 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
> if (obj->import_attach) {
> drm_prime_gem_destroy(obj, shmem->sgt);
> } else {
> - dma_resv_lock(shmem->base.resv, NULL);
> -
> drm_WARN_ON(obj->dev, refcount_read(&shmem->vmap_use_count));
>
> if (shmem->sgt) {
> @@ -152,13 +166,12 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
> sg_free_table(shmem->sgt);
> kfree(shmem->sgt);
> }
> - if (shmem->pages)
> - drm_gem_shmem_put_pages_locked(shmem);
> + if (shmem->pages &&
> + refcount_dec_and_test(&shmem->pages_use_count))
> + drm_gem_shmem_free_pages(shmem);
>
> drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count));
> drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_pin_count));
> -
> - dma_resv_unlock(shmem->base.resv);
> }
>
> drm_gem_object_release(obj);
> @@ -208,21 +221,10 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
> */
> void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> {
> - struct drm_gem_object *obj = &shmem->base;
> -
> dma_resv_assert_held(shmem->base.resv);
>
> - if (refcount_dec_and_test(&shmem->pages_use_count)) {
> -#ifdef CONFIG_X86
> - if (shmem->map_wc)
> - set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
> -#endif
> -
> - drm_gem_put_pages(obj, shmem->pages,
> - shmem->pages_mark_dirty_on_put,
> - shmem->pages_mark_accessed_on_put);
> - shmem->pages = NULL;
> - }
> + if (refcount_dec_and_test(&shmem->pages_use_count))
> + drm_gem_shmem_free_pages(shmem);
> }
> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>


2024-01-25 08:08:47

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 13/30] drm/shmem-helper: Make drm_gem_shmem_get_pages() public

On Fri, 5 Jan 2024 21:46:07 +0300
Dmitry Osipenko <[email protected]> wrote:

> We're going to move away from having implicit get_pages() done by
> get_pages_sgt() to simplify refcnt handling. Drivers will manage
> get/put_pages() by themselves. Expose the drm_gem_shmem_get_pages()
> in a public drm-shmem API.
>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Boris Brezillon <[email protected]>

> ---
> drivers/gpu/drm/drm_gem_shmem_helper.c | 10 +++++++++-
> include/drm/drm_gem_shmem_helper.h | 1 +
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 799a3c5015ad..dc416a4bce1b 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -228,7 +228,14 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> }
> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>
> -static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> +/*
> + * drm_gem_shmem_get_pages - Increase use count on the backing pages for a shmem GEM object
> + * @shmem: shmem GEM object
> + *
> + * This function Increases the use count and allocates the backing pages if
> + * use-count equals to zero.
> + */
> +int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> {
> int ret;
>
> @@ -241,6 +248,7 @@ static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
>
> return ret;
> }
> +EXPORT_SYMBOL_GPL(drm_gem_shmem_get_pages);
>
> static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
> {
> diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
> index 18020f653d7e..6dedc0739fbc 100644
> --- a/include/drm/drm_gem_shmem_helper.h
> +++ b/include/drm/drm_gem_shmem_helper.h
> @@ -110,6 +110,7 @@ struct drm_gem_shmem_object {
> struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t size);
> void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem);
>
> +int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem);
> void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem);
> int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem);
> void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem);


2024-01-25 08:09:01

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 14/30] drm/shmem-helper: Add drm_gem_shmem_put_pages()

On Fri, 5 Jan 2024 21:46:08 +0300
Dmitry Osipenko <[email protected]> wrote:

> We're going to move away from having implicit get_pages() done by
> get_pages_sgt() to ease simplify refcnt handling. Drivers will manage
> get/put_pages() by themselves. Add drm_gem_shmem_put_pages().
>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Boris Brezillon <[email protected]>

> ---
> drivers/gpu/drm/drm_gem_shmem_helper.c | 20 ++++++++++++++++++++
> include/drm/drm_gem_shmem_helper.h | 1 +
> 2 files changed, 21 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index dc416a4bce1b..f5ed64f78648 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -218,6 +218,7 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
> * @shmem: shmem GEM object
> *
> * This function decreases the use count and puts the backing pages when use drops to zero.
> + * Caller must hold GEM's reservation lock.
> */
> void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> {
> @@ -228,6 +229,25 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> }
> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>
> +/*
> + * drm_gem_shmem_put_pages - Decrease use count on the backing pages for a shmem GEM object
> + * @shmem: shmem GEM object
> + *
> + * This function decreases the use count and puts the backing pages when use drops to zero.
> + * It's unlocked version of drm_gem_shmem_put_pages_locked(), caller must not hold
> + * GEM's reservation lock.
> + */
> +void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
> +{
> + if (refcount_dec_not_one(&shmem->pages_use_count))
> + return;
> +
> + dma_resv_lock(shmem->base.resv, NULL);
> + drm_gem_shmem_put_pages_locked(shmem);
> + dma_resv_unlock(shmem->base.resv);
> +}
> +EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages);
> +
> /*
> * drm_gem_shmem_get_pages - Increase use count on the backing pages for a shmem GEM object
> * @shmem: shmem GEM object
> diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
> index 6dedc0739fbc..525480488451 100644
> --- a/include/drm/drm_gem_shmem_helper.h
> +++ b/include/drm/drm_gem_shmem_helper.h
> @@ -111,6 +111,7 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t
> void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem);
>
> int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem);
> +void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem);
> void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem);
> int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem);
> void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem);


2024-01-25 08:09:38

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 15/30] drm/shmem-helper: Avoid lockdep warning when pages are released

On Fri, 5 Jan 2024 21:46:09 +0300
Dmitry Osipenko <[email protected]> wrote:

> All drivers will be moved to get/put pages explicitly and then the last
> put_pages() will be invoked during gem_free() time by some drivers.
> We can't touch reservation lock when GEM is freed because that will cause
> a spurious warning from lockdep when shrinker support will be added.
> Lockdep doesn't know that fs_reclaim isn't functioning for a freed object,
> and thus, can't deadlock. Release pages directly without taking reservation
> lock if GEM is freed and its refcount is zero.
>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Boris Brezillon <[email protected]>

> ---
> drivers/gpu/drm/drm_gem_shmem_helper.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index f5ed64f78648..c7357110ca76 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -242,6 +242,22 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
> if (refcount_dec_not_one(&shmem->pages_use_count))
> return;
>
> + /*
> + * Destroying the object is a special case because acquiring
> + * the obj lock can cause a locking order inversion between
> + * reservation_ww_class_mutex and fs_reclaim.
> + *
> + * This deadlock is not actually possible, because no one should
> + * be already holding the lock when GEM is released. Unfortunately
> + * lockdep is not aware of this detail. So when the refcount drops
> + * to zero, we pretend it is already locked.
> + */
> + if (!kref_read(&shmem->base.refcount)) {
> + if (refcount_dec_and_test(&shmem->pages_use_count))
> + drm_gem_shmem_free_pages(shmem);
> + return;
> + }
> +
> dma_resv_lock(shmem->base.resv, NULL);
> drm_gem_shmem_put_pages_locked(shmem);
> dma_resv_unlock(shmem->base.resv);


2024-01-25 08:11:10

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 18/30] drm/panfrost: Explicitly get and put drm-shmem pages

On Fri, 5 Jan 2024 21:46:12 +0300
Dmitry Osipenko <[email protected]> wrote:

> To simplify the drm-shmem refcnt handling, we're moving away from
> the implicit get_pages() that is used by get_pages_sgt(). From now on
> drivers will have to pin pages while they use sgt. Panfrost's shrinker
> doesn't support swapping out BOs, hence pages are pinned and sgt is valid
> as long as pages' use-count > 0.
>
> In Panfrost, panfrost_gem_mapping, which is the object representing a
> GPU mapping of a BO, owns a pages ref. This guarantees that any BO being
> mapped GPU side has its pages retained till the mapping is destroyed.
>
> Since pages are no longer guaranteed to stay pinned for the BO lifetime,
> and MADVISE(DONT_NEED) flagging remains after the GEM handle has been
> destroyed, we need to add an extra 'is_purgeable' check in
> panfrost_gem_purge(), to make sure we're not trying to purge a BO that
> already had its pages released.
>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Boris Brezillon <[email protected]>

But I'd like to have Steve's review as well on that one.

> ---
> drivers/gpu/drm/panfrost/panfrost_gem.c | 63 ++++++++++++++-----
> .../gpu/drm/panfrost/panfrost_gem_shrinker.c | 6 ++
> 2 files changed, 52 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
> index f268bd5c2884..7edfc12f7c1f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
> @@ -35,20 +35,6 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj)
> */
> WARN_ON_ONCE(!list_empty(&bo->mappings.list));
>
> - if (bo->sgts) {
> - int i;
> - int n_sgt = bo->base.base.size / SZ_2M;
> -
> - for (i = 0; i < n_sgt; i++) {
> - if (bo->sgts[i].sgl) {
> - dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
> - DMA_BIDIRECTIONAL, 0);
> - sg_free_table(&bo->sgts[i]);
> - }
> - }
> - kvfree(bo->sgts);
> - }
> -
> drm_gem_shmem_free(&bo->base);
> }
>
> @@ -85,11 +71,40 @@ panfrost_gem_teardown_mapping(struct panfrost_gem_mapping *mapping)
>
> static void panfrost_gem_mapping_release(struct kref *kref)
> {
> - struct panfrost_gem_mapping *mapping;
> -
> - mapping = container_of(kref, struct panfrost_gem_mapping, refcount);
> + struct panfrost_gem_mapping *mapping =
> + container_of(kref, struct panfrost_gem_mapping, refcount);
> + struct panfrost_gem_object *bo = mapping->obj;
> + struct panfrost_device *pfdev = bo->base.base.dev->dev_private;
>
> panfrost_gem_teardown_mapping(mapping);
> +
> + /* On heap BOs, release the sgts created in the fault handler path. */
> + if (bo->sgts) {
> + int i, n_sgt = bo->base.base.size / SZ_2M;
> +
> + for (i = 0; i < n_sgt; i++) {
> + if (bo->sgts[i].sgl) {
> + dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
> + DMA_BIDIRECTIONAL, 0);
> + sg_free_table(&bo->sgts[i]);
> + }
> + }
> + kvfree(bo->sgts);
> + }
> +
> + /* Pages ref is owned by the panfrost_gem_mapping object. We must
> + * release our pages ref (if any), before releasing the object
> + * ref.
> + * Non-heap BOs acquired the pages at panfrost_gem_mapping creation
> + * time, and heap BOs may have acquired pages if the fault handler
> + * was called, in which case bo->sgts should be non-NULL.
> + */
> + if (!bo->base.base.import_attach && (!bo->is_heap || bo->sgts) &&
> + bo->base.madv >= 0) {
> + drm_gem_shmem_put_pages(&bo->base);
> + bo->sgts = NULL;
> + }
> +
> drm_gem_object_put(&mapping->obj->base.base);
> panfrost_mmu_ctx_put(mapping->mmu);
> kfree(mapping);
> @@ -125,6 +140,20 @@ int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv)
> if (!mapping)
> return -ENOMEM;
>
> + if (!bo->is_heap && !bo->base.base.import_attach) {
> + /* Pages ref is owned by the panfrost_gem_mapping object.
> + * For non-heap BOs, we request pages at mapping creation
> + * time, such that the panfrost_mmu_map() call, further down in
> + * this function, is guaranteed to have pages_use_count > 0
> + * when drm_gem_shmem_get_pages_sgt() is called.
> + */
> + ret = drm_gem_shmem_get_pages(&bo->base);
> + if (ret) {
> + kfree(mapping);
> + return ret;
> + }
> + }
> +
> INIT_LIST_HEAD(&mapping->node);
> kref_init(&mapping->refcount);
> drm_gem_object_get(obj);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> index 02b60ea1433a..d4fb0854cf2f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> @@ -50,6 +50,12 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj)
> if (!dma_resv_trylock(shmem->base.resv))
> goto unlock_mappings;
>
> + /* BO might have become unpurgeable if the last pages_use_count ref
> + * was dropped, but the BO hasn't been destroyed yet.
> + */
> + if (!drm_gem_shmem_is_purgeable(shmem))
> + goto unlock_mappings;
> +
> panfrost_gem_teardown_mappings_locked(bo);
> drm_gem_shmem_purge_locked(&bo->base);
> ret = true;


2024-01-25 09:17:32

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 23/30] drm/shmem-helper: Export drm_gem_shmem_get_pages_sgt_locked()

On Fri, 5 Jan 2024 21:46:17 +0300
Dmitry Osipenko <[email protected]> wrote:

> Export drm_gem_shmem_get_pages_sgt_locked() that will be used by virtio-gpu
> shrinker during GEM swap-in operation done under the held reservation lock.
>

Nit: I'd move that patch before "drm/shmem-helper: Add common memory
shrinker", because you'll need to call
drm_gem_shmem_get_pages_locked() and
drm_gem_shmem_get_pages_sgt_locked() if you want to repopulate the MMU
page table after an eviction has happened (see my comment on patch 22).

> Reviewed-by: Boris Brezillon <[email protected]>
> Signed-off-by: Dmitry Osipenko <[email protected]>
> ---
> drivers/gpu/drm/drm_gem_shmem_helper.c | 22 +++++++++++++++++++++-
> include/drm/drm_gem_shmem_helper.h | 1 +
> 2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 59cebd1e35af..8fd7851c088b 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -875,12 +875,31 @@ struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem)
> }
> EXPORT_SYMBOL_GPL(drm_gem_shmem_get_sg_table);
>
> -static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_object *shmem)
> +/**
> + * drm_gem_shmem_get_pages_sgt_locked - Provide a scatter/gather table of pinned
> + * pages for a shmem GEM object
> + * @shmem: shmem GEM object
> + *
> + * This is a locked version of @drm_gem_shmem_get_sg_table that exports a
> + * scatter/gather table suitable for PRIME usage by calling the standard
> + * DMA mapping API.
> + *
> + * Drivers must hold GEM's reservation lock when using this function.
> + *
> + * Drivers that need to acquire a scatter/gather table for objects need to call
> + * drm_gem_shmem_get_pages_sgt() instead.
> + *
> + * Returns:
> + * A pointer to the scatter/gather table of pinned pages or error pointer on failure.
> + */
> +struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_object *shmem)
> {
> struct drm_gem_object *obj = &shmem->base;
> int ret;
> struct sg_table *sgt;
>
> + dma_resv_assert_held(shmem->base.resv);
> +
> if (shmem->sgt)
> return shmem->sgt;
>
> @@ -904,6 +923,7 @@ static struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_
> kfree(sgt);
> return ERR_PTR(ret);
> }
> +EXPORT_SYMBOL_GPL(drm_gem_shmem_get_pages_sgt_locked);
>
> /**
> * drm_gem_shmem_get_pages_sgt - Pin pages, dma map them, and return a
> diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
> index df97c11fc99a..167f00f089de 100644
> --- a/include/drm/drm_gem_shmem_helper.h
> +++ b/include/drm/drm_gem_shmem_helper.h
> @@ -149,6 +149,7 @@ void drm_gem_shmem_purge_locked(struct drm_gem_shmem_object *shmem);
>
> struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem);
> struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_shmem_object *shmem);
> +struct sg_table *drm_gem_shmem_get_pages_sgt_locked(struct drm_gem_shmem_object *shmem);
>
> void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,
> struct drm_printer *p, unsigned int indent);


2024-01-25 09:19:30

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On Fri, 5 Jan 2024 21:46:16 +0300
Dmitry Osipenko <[email protected]> wrote:

> *
> * This function increases the use count and allocates the backing pages if
> * the use count equals zero.
> + *
> + * Note that this function doesn't pin pages in memory. If your driver
> + * uses drm-shmem shrinker, then it's free to relocate pages to swap.
> + * Getting pages only guarantees that pages are allocated, and not that
> + * pages reside in memory. In order to pin pages use drm_gem_shmem_pin().

I still find this explanation confusing: if pages are allocated, they
reside in memory. The only difference between drm_gem_shmem_get_pages()
and drm_gem_shmem_pin_pages() is that the former lets the system
reclaim the memory if the buffer is idle (no unsignalled fence attached
to the dma_resv).

We also need to describe the workflow for GEM validation (that's the
TTM term for the swapin process happening when a GPU job is submitted).

1. Prepare the GPU job and initialize its fence
2. Lock the GEM resv
3. Add the GPU job fence to the resv object
4. If the GEM is evicted
a. call drm_gem_shmem_swapin_locked()
b. get the new sgt with drm_gem_shmem_get_pages_sgt_locked()
c. repopulate the MMU table (driver internals)
5. Unlock the GEM dma_resv
6. Submit the GPU job

With this sequence, the GEM pages are guaranteed to stay around until
the GPU job is finished.
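
To make that concrete, below is a rough driver-side sketch of steps 1-6.
Everything named my_*() is a made-up driver helper, and
drm_gem_shmem_swapin_locked()/shmem->evicted are the interfaces introduced by
this series, so treat this as an illustration rather than a reference
implementation:

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>
#include <drm/drm_gem_shmem_helper.h>

struct my_job {
	struct dma_fence *done_fence;	/* initialized in step 1 */
	/* driver-specific state ... */
};

/* Hypothetical driver helpers, not part of any existing API. */
int my_repopulate_mmu(struct my_job *job, struct drm_gem_shmem_object *shmem,
		      struct sg_table *sgt);
int my_submit(struct my_job *job);

static int my_validate_and_submit(struct my_job *job,
				  struct drm_gem_shmem_object *shmem)
{
	int ret;

	/* step 2 */
	ret = dma_resv_lock(shmem->base.resv, NULL);
	if (ret)
		return ret;

	/* step 3: make the job fence visible to the shrinker */
	ret = dma_resv_reserve_fences(shmem->base.resv, 1);
	if (ret)
		goto out_unlock;
	dma_resv_add_fence(shmem->base.resv, job->done_fence,
			   DMA_RESV_USAGE_BOOKKEEP);

	/* step 4: swap the BO back in if the shrinker evicted it */
	if (shmem->evicted) {
		struct sg_table *sgt;

		ret = drm_gem_shmem_swapin_locked(shmem);
		if (ret)
			goto out_unlock;

		sgt = drm_gem_shmem_get_pages_sgt_locked(shmem);
		if (IS_ERR(sgt)) {
			ret = PTR_ERR(sgt);
			goto out_unlock;
		}

		ret = my_repopulate_mmu(job, shmem, sgt);
	}

out_unlock:
	/* step 5 */
	dma_resv_unlock(shmem->base.resv);

	/* step 6 */
	return ret ? ret : my_submit(job);
}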

> */
> int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)

2024-01-25 09:50:16

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 30/30] drm/panfrost: Switch to generic memory shrinker

On Fri, 5 Jan 2024 21:46:24 +0300
Dmitry Osipenko <[email protected]> wrote:

> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -328,6 +328,7 @@ int panfrost_mmu_map(struct panfrost_gem_mapping *mapping)
> struct panfrost_device *pfdev = to_panfrost_device(obj->dev);
> struct sg_table *sgt;
> int prot = IOMMU_READ | IOMMU_WRITE;
> + int ret = 0;
>
> if (WARN_ON(mapping->active))
> return 0;
> @@ -335,15 +336,32 @@ int panfrost_mmu_map(struct panfrost_gem_mapping *mapping)
> if (bo->noexec)
> prot |= IOMMU_NOEXEC;
>
> + if (!obj->import_attach) {
> + /*
> + * Don't allow shrinker to move pages while pages are mapped.
> + * It's fine to move pages afterwards because shrinker will
> + * take care of unmapping pages during eviction.
> + */

That's not exactly what this shmem_pin() is about, is it? I think it's
here to meet the drm_gem_shmem_get_pages_sgt() rule stating that pages
must be pinned while the sgt returned by drm_gem_shmem_get_pages_sgt()
is manipulated. You actually unpin the GEM just after the mmu_map_sg()
call, which means pages could very well be reclaimed while the MMU
still has a mapping referencing those physical pages. And that's fine,
because what's supposed to protect against that is the fence we
register to the GEM resv at job submission time.

> + ret = drm_gem_shmem_pin(shmem);
> + if (ret)
> + return ret;
> + }
> +
> sgt = drm_gem_shmem_get_pages_sgt(shmem);
> - if (WARN_ON(IS_ERR(sgt)))
> - return PTR_ERR(sgt);
> + if (WARN_ON(IS_ERR(sgt))) {
> + ret = PTR_ERR(sgt);
> + goto unpin;
> + }
>
> mmu_map_sg(pfdev, mapping->mmu, mapping->mmnode.start << PAGE_SHIFT,
> prot, sgt);
> mapping->active = true;
>
> - return 0;
> +unpin:
> + if (!obj->import_attach)
> + drm_gem_shmem_unpin(shmem);
> +
> + return ret;
> }

2024-01-25 10:07:00

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 24/30] drm/shmem-helper: Optimize unlocked get_pages_sgt()

On Fri, 5 Jan 2024 21:46:18 +0300
Dmitry Osipenko <[email protected]> wrote:

> The SGT isn't refcounted. Once the SGT pointer has been obtained, it remains
> the same for both the locked and unlocked get_pages_sgt(). Return the cached
> SGT directly without taking a potentially expensive lock.
>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Boris Brezillon <[email protected]>

but I'm wondering if we should have made this change directly in
'drm/shmem-helper: Change sgt allocation policy'.

> ---
> drivers/gpu/drm/drm_gem_shmem_helper.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 8fd7851c088b..e6e6e693ab95 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -962,6 +962,18 @@ struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_shmem_object *shmem)
> drm_WARN_ON(obj->dev, drm_gem_shmem_is_purgeable(shmem)))
> return ERR_PTR(-EBUSY);
>
> + /*
> + * Drivers that use shrinker should take into account that shrinker
> + * may relocate BO, thus invalidating the returned SGT pointer.
> + * Such drivers should pin GEM while they use SGT.
> + *
> + * Drivers that don't use shrinker should take into account that
> + * SGT is released together with the GEM pages. Pages should be kept
> + * alive while SGT is used.
> + */
> + if (shmem->sgt)
> + return shmem->sgt;
> +
> ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
> if (ret)
> return ERR_PTR(ret);


2024-01-25 10:47:41

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On Fri, 5 Jan 2024 21:46:16 +0300
Dmitry Osipenko <[email protected]> wrote:

> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> +{
> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
> + refcount_read(&shmem->pages_use_count) &&
> + !refcount_read(&shmem->pages_pin_count) &&
> + !shmem->base.dma_buf && !shmem->base.import_attach &&
> + !shmem->evicted;

Are we missing

&& dma_resv_test_signaled(shmem->base.resv,
DMA_RESV_USAGE_BOOKKEEP)

to make sure the GPU is done using the BO?
The same applies to drm_gem_shmem_is_purgeable() BTW.

If you don't want to do this test here, we need a way to let drivers
provide a custom is_{evictable,purgeable}() test.

I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
to let drivers move the GEMs that were used most recently (those
referenced by a GPU job) at the end of the evictable LRU.
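
For illustration, the suggested check would roughly turn the helper quoted
above into the following (a sketch only, matching the field layout shown in
the patch; it is not what the series does today):

static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
{
	return (shmem->madv >= 0) && shmem->base.funcs->evict &&
	       refcount_read(&shmem->pages_use_count) &&
	       !refcount_read(&shmem->pages_pin_count) &&
	       !shmem->base.dma_buf && !shmem->base.import_attach &&
	       !shmem->evicted &&
	       /* suggested addition: only treat the BO as evictable once
		* the GPU is done with it */
	       dma_resv_test_signaled(shmem->base.resv,
				      DMA_RESV_USAGE_BOOKKEEP);
}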

> +}
> +

2024-01-25 12:04:59

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 30/30] drm/panfrost: Switch to generic memory shrinker

On 1/25/24 12:49, Boris Brezillon wrote:
> On Fri, 5 Jan 2024 21:46:24 +0300
> Dmitry Osipenko <[email protected]> wrote:
>
>> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> @@ -328,6 +328,7 @@ int panfrost_mmu_map(struct panfrost_gem_mapping *mapping)
>> struct panfrost_device *pfdev = to_panfrost_device(obj->dev);
>> struct sg_table *sgt;
>> int prot = IOMMU_READ | IOMMU_WRITE;
>> + int ret = 0;
>>
>> if (WARN_ON(mapping->active))
>> return 0;
>> @@ -335,15 +336,32 @@ int panfrost_mmu_map(struct panfrost_gem_mapping *mapping)
>> if (bo->noexec)
>> prot |= IOMMU_NOEXEC;
>>
>> + if (!obj->import_attach) {
>> + /*
>> + * Don't allow shrinker to move pages while pages are mapped.
>> + * It's fine to move pages afterwards because shrinker will
>> + * take care of unmapping pages during eviction.
>> + */
>
> That's not exactly what this shmem_pin() is about, is it? I think it's
> here to meet the drm_gem_shmem_get_pages_sgt() rule stating that pages
> must be pinned while the sgt returned by drm_gem_shmem_get_pages_sgt()
> is manipulated. You actually unpin the GEM just after the mmu_map_sg()
> call, which means pages could very well be reclaimed while the MMU
> still has a mapping referencing those physical pages. And that's fine,
> because what's supposed to protect against that is the fence we
> register to the GEM resv at job submission time.

The comment indeed needs to be improved, thanks.

s/are mapped/in process of mapping creation/

--
Best regards,
Dmitry


2024-01-25 16:58:47

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH v19 18/30] drm/panfrost: Explicitly get and put drm-shmem pages

On 05/01/2024 18:46, Dmitry Osipenko wrote:
> To simplify the drm-shmem refcnt handling, we're moving away from
> the implicit get_pages() that is used by get_pages_sgt(). From now on
> drivers will have to pin pages while they use sgt. Panfrost's shrinker
> doesn't support swapping out BOs, hence pages are pinned and sgt is valid
> as long as pages' use-count > 0.
>
> In Panfrost, panfrost_gem_mapping, which is the object representing a
> GPU mapping of a BO, owns a pages ref. This guarantees that any BO being
> mapped GPU side has its pages retained till the mapping is destroyed.
>
> Since pages are no longer guaranteed to stay pinned for the BO lifetime,
> and MADVISE(DONT_NEED) flagging remains after the GEM handle has been
> destroyed, we need to add an extra 'is_purgeable' check in
> panfrost_gem_purge(), to make sure we're not trying to purge a BO that
> already had its pages released.
>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Steven Price <[email protected]>

Although I don't like the condition in panfrost_gem_mapping_release()
for drm_gem_shmem_put_pages() and assigning NULL to bo->sgts - it feels
very fragile. See below.

> ---
> drivers/gpu/drm/panfrost/panfrost_gem.c | 63 ++++++++++++++-----
> .../gpu/drm/panfrost/panfrost_gem_shrinker.c | 6 ++
> 2 files changed, 52 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
> index f268bd5c2884..7edfc12f7c1f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
> @@ -35,20 +35,6 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj)
> */
> WARN_ON_ONCE(!list_empty(&bo->mappings.list));
>
> - if (bo->sgts) {
> - int i;
> - int n_sgt = bo->base.base.size / SZ_2M;
> -
> - for (i = 0; i < n_sgt; i++) {
> - if (bo->sgts[i].sgl) {
> - dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
> - DMA_BIDIRECTIONAL, 0);
> - sg_free_table(&bo->sgts[i]);
> - }
> - }
> - kvfree(bo->sgts);
> - }
> -
> drm_gem_shmem_free(&bo->base);
> }
>
> @@ -85,11 +71,40 @@ panfrost_gem_teardown_mapping(struct panfrost_gem_mapping *mapping)
>
> static void panfrost_gem_mapping_release(struct kref *kref)
> {
> - struct panfrost_gem_mapping *mapping;
> -
> - mapping = container_of(kref, struct panfrost_gem_mapping, refcount);
> + struct panfrost_gem_mapping *mapping =
> + container_of(kref, struct panfrost_gem_mapping, refcount);
> + struct panfrost_gem_object *bo = mapping->obj;
> + struct panfrost_device *pfdev = bo->base.base.dev->dev_private;
>
> panfrost_gem_teardown_mapping(mapping);
> +
> + /* On heap BOs, release the sgts created in the fault handler path. */
> + if (bo->sgts) {
> + int i, n_sgt = bo->base.base.size / SZ_2M;
> +
> + for (i = 0; i < n_sgt; i++) {
> + if (bo->sgts[i].sgl) {
> + dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
> + DMA_BIDIRECTIONAL, 0);
> + sg_free_table(&bo->sgts[i]);
> + }
> + }
> + kvfree(bo->sgts);
> + }
> +
> + /* Pages ref is owned by the panfrost_gem_mapping object. We must
> + * release our pages ref (if any), before releasing the object
> + * ref.
> + * Non-heap BOs acquired the pages at panfrost_gem_mapping creation
> + * time, and heap BOs may have acquired pages if the fault handler
> + * was called, in which case bo->sgts should be non-NULL.
> + */
> + if (!bo->base.base.import_attach && (!bo->is_heap || bo->sgts) &&
> + bo->base.madv >= 0) {
> + drm_gem_shmem_put_pages(&bo->base);
> + bo->sgts = NULL;

The assignment of NULL here really ought to be unconditional - it isn't
a valid pointer because of the kvfree() above.

I also feel that the big condition above suggests there's a need for a
better state machine to keep track of what's going on.

But having said that I do think this series as a whole is an
improvement, it's nice to get the shrinker code generic. And sadly I
don't have an immediate idea for cleaning this up, hence my R-b.

Steve

> + }
> +
> drm_gem_object_put(&mapping->obj->base.base);
> panfrost_mmu_ctx_put(mapping->mmu);
> kfree(mapping);
> @@ -125,6 +140,20 @@ int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv)
> if (!mapping)
> return -ENOMEM;
>
> + if (!bo->is_heap && !bo->base.base.import_attach) {
> + /* Pages ref is owned by the panfrost_gem_mapping object.
> + * For non-heap BOs, we request pages at mapping creation
> + * time, such that the panfrost_mmu_map() call, further down in
> + * this function, is guaranteed to have pages_use_count > 0
> + * when drm_gem_shmem_get_pages_sgt() is called.
> + */
> + ret = drm_gem_shmem_get_pages(&bo->base);
> + if (ret) {
> + kfree(mapping);
> + return ret;
> + }
> + }
> +
> INIT_LIST_HEAD(&mapping->node);
> kref_init(&mapping->refcount);
> drm_gem_object_get(obj);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> index 02b60ea1433a..d4fb0854cf2f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> @@ -50,6 +50,12 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj)
> if (!dma_resv_trylock(shmem->base.resv))
> goto unlock_mappings;
>
> + /* BO might have become unpurgeable if the last pages_use_count ref
> + * was dropped, but the BO hasn't been destroyed yet.
> + */
> + if (!drm_gem_shmem_is_purgeable(shmem))
> + goto unlock_mappings;
> +
> panfrost_gem_teardown_mappings_locked(bo);
> drm_gem_shmem_purge_locked(&bo->base);
> ret = true;


2024-01-25 17:10:19

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH v19 17/30] drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()

On 05/01/2024 18:46, Dmitry Osipenko wrote:
> From: Boris Brezillon <[email protected]>
>
> If some of the pages or the sgt allocation failed, we shouldn't release the
> pages ref we got earlier, otherwise we will end up with unbalanced
> get/put_pages() calls. We should instead leave everything in place
> and let the BO release function deal with extra cleanup when the object
> is destroyed, or let the fault handler try again next time it's called.
>
> Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
> Cc: <[email protected]>
> Signed-off-by: Boris Brezillon <[email protected]>
> Co-developed-by: Dmitry Osipenko <[email protected]>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: Steven Price <[email protected]>

> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index bd5a0073009d..4a0b4bf03f1a 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -502,11 +502,18 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> mapping_set_unevictable(mapping);
>
> for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
> + /* Can happen if the last fault only partially filled this
> + * section of the pages array before failing. In that case
> + * we skip already filled pages.
> + */
> + if (pages[i])
> + continue;
> +
> pages[i] = shmem_read_mapping_page(mapping, i);
> if (IS_ERR(pages[i])) {
> ret = PTR_ERR(pages[i]);
> pages[i] = NULL;
> - goto err_pages;
> + goto err_unlock;
> }
> }
>
> @@ -514,7 +521,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> ret = sg_alloc_table_from_pages(sgt, pages + page_offset,
> NUM_FAULT_PAGES, 0, SZ_2M, GFP_KERNEL);
> if (ret)
> - goto err_pages;
> + goto err_unlock;
>
> ret = dma_map_sgtable(pfdev->dev, sgt, DMA_BIDIRECTIONAL, 0);
> if (ret)
> @@ -537,8 +544,6 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
>
> err_map:
> sg_free_table(sgt);
> -err_pages:
> - drm_gem_shmem_put_pages_locked(&bo->base);
> err_unlock:
> dma_resv_unlock(obj->resv);
> err_bo:


2024-01-25 21:44:51

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 17/30] drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()

On 1/26/24 00:41, Dmitry Osipenko wrote:
> On 1/5/24 21:46, Dmitry Osipenko wrote:
>> for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
>> + /* Can happen if the last fault only partially filled this
>> + * section of the pages array before failing. In that case
>> + * we skip already filled pages.
>> + */
>> + if (pages[i])
>> + continue;
>> +
>> pages[i] = shmem_read_mapping_page(mapping, i);
>
> Although, shmem_read_mapping_page() should return the same page if it
> was already allocated, shouldn't it? I.e. there was no bug here and the
> fixes/stable tags aren't needed.

Scratch that, I forgot that the patch is about the unbalanced
get/put_pages

--
Best regards,
Dmitry


2024-01-25 21:52:35

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

On 1/25/24 20:24, Daniel Vetter wrote:
> On Fri, Jan 05, 2024 at 09:46:03PM +0300, Dmitry Osipenko wrote:
>> Add a lockless drm_gem_shmem_get_pages() helper that skips taking the
>> reservation lock if pages_use_count is non-zero, leveraging the atomicity
>> of refcount_t. Make drm_gem_shmem_mmap() utilize the new helper.
>>
>> Acked-by: Maxime Ripard <[email protected]>
>> Reviewed-by: Boris Brezillon <[email protected]>
>> Suggested-by: Boris Brezillon <[email protected]>
>> Signed-off-by: Dmitry Osipenko <[email protected]>
>> ---
>> drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
>> 1 file changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
>> index cacf0f8c42e2..1c032513abf1 100644
>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
>> @@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
>> }
>> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>>
>> +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
>> +{
>> + int ret;
>
> Just random drive-by comment: a might_lock annotation here might be good,
> or people could hit some really interesting bugs that are rather hard to
> reproduce ...
> -Sima

Thanks for the suggestion!

--
Best regards,
Dmitry


2024-01-25 21:56:38

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 17/30] drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()

On 1/5/24 21:46, Dmitry Osipenko wrote:
> for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
> + /* Can happen if the last fault only partially filled this
> + * section of the pages array before failing. In that case
> + * we skip already filled pages.
> + */
> + if (pages[i])
> + continue;
> +
> pages[i] = shmem_read_mapping_page(mapping, i);

Although, shmem_read_mapping_page() should return the same page if it
was already allocated, shouldn't it? I.e. there was no bug here and the
fixes/stable tags aren't needed.

--
Best regards,
Dmitry


2024-01-25 21:58:07

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On 1/25/24 13:19, Boris Brezillon wrote:
> On Fri, 5 Jan 2024 21:46:16 +0300
> Dmitry Osipenko <[email protected]> wrote:
>
>> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
>> +{
>> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
>> + refcount_read(&shmem->pages_use_count) &&
>> + !refcount_read(&shmem->pages_pin_count) &&
>> + !shmem->base.dma_buf && !shmem->base.import_attach &&
>> + !shmem->evicted;
>
> Are we missing
>
> && dma_resv_test_signaled(shmem->base.resv,
> DMA_RESV_USAGE_BOOKKEEP)
>
> to make sure the GPU is done using the BO?
> The same applies to drm_gem_shmem_is_purgeable() BTW.
>
> If you don't want to do this test here, we need a way to let drivers
> provide a custom is_{evictable,purgeable}() test.
>
> I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
> to let drivers move the GEMs that were used most recently (those
> referenced by a GPU job) at the end of the evictable LRU.

We have the signaled-check in the common drm_gem_evict() helper:

https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496
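
For reference, the relevant part of that helper is roughly the following
(paraphrased from the link above rather than quoted verbatim):

int drm_gem_evict(struct drm_gem_object *obj)
{
	dma_resv_assert_held(obj->resv);

	/* the signaled-check mentioned above, currently with READ usage */
	if (!dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_READ))
		return -EBUSY;

	if (obj->funcs->evict)
		return obj->funcs->evict(obj);

	return 0;
}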

--
Best regards,
Dmitry


2024-01-25 23:45:48

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

On Fri, Jan 05, 2024 at 09:46:03PM +0300, Dmitry Osipenko wrote:
> Add a lockless drm_gem_shmem_get_pages() helper that skips taking the
> reservation lock if pages_use_count is non-zero, leveraging the atomicity
> of refcount_t. Make drm_gem_shmem_mmap() utilize the new helper.
>
> Acked-by: Maxime Ripard <[email protected]>
> Reviewed-by: Boris Brezillon <[email protected]>
> Suggested-by: Boris Brezillon <[email protected]>
> Signed-off-by: Dmitry Osipenko <[email protected]>
> ---
> drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
> 1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index cacf0f8c42e2..1c032513abf1 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> }
> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>
> +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> +{
> + int ret;

Just random drive-by comment: a might_lock annotation here might be good,
or people could hit some really interesting bugs that are rather hard to
reproduce ...
-Sima
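
A sketch of what the suggested annotation could look like on the helper being
added (whether it is actually wanted is discussed further down in the thread):

static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
{
	int ret;

	/* Suggested annotation: tell lockdep that this helper may take the
	 * reservation lock even when the fast path below skips it. */
	might_lock(&shmem->base.resv->lock.base);

	if (refcount_inc_not_zero(&shmem->pages_use_count))
		return 0;

	dma_resv_lock(shmem->base.resv, NULL);
	ret = drm_gem_shmem_get_pages_locked(shmem);
	dma_resv_unlock(shmem->base.resv);

	return ret;
}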

> +
> + if (refcount_inc_not_zero(&shmem->pages_use_count))
> + return 0;
> +
> + dma_resv_lock(shmem->base.resv, NULL);
> + ret = drm_gem_shmem_get_pages_locked(shmem);
> + dma_resv_unlock(shmem->base.resv);
> +
> + return ret;
> +}
> +
> static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
> {
> int ret;
> @@ -609,10 +623,7 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
> return ret;
> }
>
> - dma_resv_lock(shmem->base.resv, NULL);
> - ret = drm_gem_shmem_get_pages_locked(shmem);
> - dma_resv_unlock(shmem->base.resv);
> -
> + ret = drm_gem_shmem_get_pages(shmem);
> if (ret)
> return ret;
>
> --
> 2.43.0
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Subject: Re: [PATCH v19 17/30] drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()

On 05/01/24 19:46, Dmitry Osipenko wrote:
> From: Boris Brezillon <[email protected]>
>
> If some of the pages or the sgt allocation failed, we shouldn't release the
> pages ref we got earlier, otherwise we will end up with unbalanced
> get/put_pages() calls. We should instead leave everything in place
> and let the BO release function deal with extra cleanup when the object
> is destroyed, or let the fault handler try again next time it's called.
>
> Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
> Cc: <[email protected]>
> Signed-off-by: Boris Brezillon <[email protected]>
> Co-developed-by: Dmitry Osipenko <[email protected]>
> Signed-off-by: Dmitry Osipenko <[email protected]>

Reviewed-by: AngeloGioacchino Del Regno <[email protected]>



2024-01-26 10:26:37

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 18/30] drm/panfrost: Explicitly get and put drm-shmem pages

On Thu, 25 Jan 2024 16:47:24 +0000
Steven Price <[email protected]> wrote:

> On 05/01/2024 18:46, Dmitry Osipenko wrote:
> > To simplify the drm-shmem refcnt handling, we're moving away from
> > the implicit get_pages() that is used by get_pages_sgt(). From now on
> > drivers will have to pin pages while they use sgt. Panfrost's shrinker
> > doesn't support swapping out BOs, hence pages are pinned and sgt is valid
> > as long as pages' use-count > 0.
> >
> > In Panfrost, panfrost_gem_mapping, which is the object representing a
> > GPU mapping of a BO, owns a pages ref. This guarantees that any BO being
> > mapped GPU side has its pages retained till the mapping is destroyed.
> >
> > Since pages are no longer guaranteed to stay pinned for the BO lifetime,
> > and MADVISE(DONT_NEED) flagging remains after the GEM handle has been
> > destroyed, we need to add an extra 'is_purgeable' check in
> > panfrost_gem_purge(), to make sure we're not trying to purge a BO that
> > already had its pages released.
> >
> > Signed-off-by: Dmitry Osipenko <[email protected]>
>
> Reviewed-by: Steven Price <[email protected]>
>
> Although I don't like the condition in panfrost_gem_mapping_release()
> for drm_gem_shmem_put_pages() and assigning NULL to bo->sgts - it feels
> very fragile. See below.
>
> > ---
> > drivers/gpu/drm/panfrost/panfrost_gem.c | 63 ++++++++++++++-----
> > .../gpu/drm/panfrost/panfrost_gem_shrinker.c | 6 ++
> > 2 files changed, 52 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
> > index f268bd5c2884..7edfc12f7c1f 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
> > @@ -35,20 +35,6 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj)
> > */
> > WARN_ON_ONCE(!list_empty(&bo->mappings.list));
> >
> > - if (bo->sgts) {
> > - int i;
> > - int n_sgt = bo->base.base.size / SZ_2M;
> > -
> > - for (i = 0; i < n_sgt; i++) {
> > - if (bo->sgts[i].sgl) {
> > - dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
> > - DMA_BIDIRECTIONAL, 0);
> > - sg_free_table(&bo->sgts[i]);
> > - }
> > - }
> > - kvfree(bo->sgts);
> > - }
> > -
> > drm_gem_shmem_free(&bo->base);
> > }
> >
> > @@ -85,11 +71,40 @@ panfrost_gem_teardown_mapping(struct panfrost_gem_mapping *mapping)
> >
> > static void panfrost_gem_mapping_release(struct kref *kref)
> > {
> > - struct panfrost_gem_mapping *mapping;
> > -
> > - mapping = container_of(kref, struct panfrost_gem_mapping, refcount);
> > + struct panfrost_gem_mapping *mapping =
> > + container_of(kref, struct panfrost_gem_mapping, refcount);
> > + struct panfrost_gem_object *bo = mapping->obj;
> > + struct panfrost_device *pfdev = bo->base.base.dev->dev_private;
> >
> > panfrost_gem_teardown_mapping(mapping);
> > +
> > + /* On heap BOs, release the sgts created in the fault handler path. */
> > + if (bo->sgts) {
> > + int i, n_sgt = bo->base.base.size / SZ_2M;
> > +
> > + for (i = 0; i < n_sgt; i++) {
> > + if (bo->sgts[i].sgl) {
> > + dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
> > + DMA_BIDIRECTIONAL, 0);
> > + sg_free_table(&bo->sgts[i]);
> > + }
> > + }
> > + kvfree(bo->sgts);
> > + }
> > +
> > + /* Pages ref is owned by the panfrost_gem_mapping object. We must
> > + * release our pages ref (if any), before releasing the object
> > + * ref.
> > + * Non-heap BOs acquired the pages at panfrost_gem_mapping creation
> > + * time, and heap BOs may have acquired pages if the fault handler
> > + * was called, in which case bo->sgts should be non-NULL.
> > + */
> > + if (!bo->base.base.import_attach && (!bo->is_heap || bo->sgts) &&
> > + bo->base.madv >= 0) {
> > + drm_gem_shmem_put_pages(&bo->base);
> > + bo->sgts = NULL;
>
> The assignment of NULL here really ought to be unconditional - it isn't
> a valid pointer because of the kvfree() above.

Fair enough. How about we drop the '|| bo->sgts' and add a
drm_gem_shmem_put_pages() to the above if (bo->sgts) block, where we'll
also assign bo->sgts to NULL?

>
> I also feel that the big condition above suggests there's a need for a
> better state machine to keep track of what's going on.

I'm planning to extend drm_gem_shmem to support the alloc-on-fault use
case that all Mali GPUs seem to rely on (lima, panfrost and soon
panthor would use those helpers). The idea is to:

- make the allocation non-blocking, so we can kill the blocking
allocation in the dma signalling path (basically what intel does)
- allow dynamic extension of the pages array using an xarray instead of
a plain array

Hopefully this makes the state tracking a lot easier, and we can also
get rid of the hack we have in panfrost/lima where we manipulate
drm_gem_shmem_object refcounts directly.
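
Purely as an illustration of that direction (none of this exists yet, and all
names are made up), a sparse, non-blocking page store could look something
like:

#include <linux/mm_types.h>
#include <linux/xarray.h>

/* Hypothetical sketch: a sparsely populated page array that can grow
 * from the fault path without blocking allocations. */
struct shmem_sparse_pages {
	struct xarray pages;		/* index: page offset within the BO */
};

static int shmem_sparse_pages_add(struct shmem_sparse_pages *sp,
				  pgoff_t idx, struct page *page)
{
	/* GFP_NOWAIT so this can run in the dma-fence signalling path
	 * without triggering a blocking reclaim. */
	return xa_err(xa_store(&sp->pages, idx, page, GFP_NOWAIT));
}

static struct page *shmem_sparse_pages_get(struct shmem_sparse_pages *sp,
					   pgoff_t idx)
{
	return xa_load(&sp->pages, idx);
}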

>
> But having said that I do think this series as a whole is an
> improvement, it's nice to get the shrinker code generic. And sadly I
> don't have an immediate idea for cleaning this up, hence my R-b.
>
> Steve
>
> > + }
> > +
> > drm_gem_object_put(&mapping->obj->base.base);
> > panfrost_mmu_ctx_put(mapping->mmu);
> > kfree(mapping);
> > @@ -125,6 +140,20 @@ int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv)
> > if (!mapping)
> > return -ENOMEM;
> >
> > + if (!bo->is_heap && !bo->base.base.import_attach) {
> > + /* Pages ref is owned by the panfrost_gem_mapping object.
> > + * For non-heap BOs, we request pages at mapping creation
> > + * time, such that the panfrost_mmu_map() call, further down in
> > + * this function, is guaranteed to have pages_use_count > 0
> > + * when drm_gem_shmem_get_pages_sgt() is called.
> > + */
> > + ret = drm_gem_shmem_get_pages(&bo->base);
> > + if (ret) {
> > + kfree(mapping);
> > + return ret;
> > + }
> > + }
> > +
> > INIT_LIST_HEAD(&mapping->node);
> > kref_init(&mapping->refcount);
> > drm_gem_object_get(obj);
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > index 02b60ea1433a..d4fb0854cf2f 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
> > @@ -50,6 +50,12 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj)
> > if (!dma_resv_trylock(shmem->base.resv))
> > goto unlock_mappings;
> >
> > + /* BO might have become unpurgeable if the last pages_use_count ref
> > + * was dropped, but the BO hasn't been destroyed yet.
> > + */
> > + if (!drm_gem_shmem_is_purgeable(shmem))
> > + goto unlock_mappings;
> > +
> > panfrost_gem_teardown_mappings_locked(bo);
> > drm_gem_shmem_purge_locked(&bo->base);
> > ret = true;
>


2024-01-26 10:27:17

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On Fri, 26 Jan 2024 00:56:47 +0300
Dmitry Osipenko <[email protected]> wrote:

> On 1/25/24 13:19, Boris Brezillon wrote:
> > On Fri, 5 Jan 2024 21:46:16 +0300
> > Dmitry Osipenko <[email protected]> wrote:
> >
> >> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> >> +{
> >> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
> >> + refcount_read(&shmem->pages_use_count) &&
> >> + !refcount_read(&shmem->pages_pin_count) &&
> >> + !shmem->base.dma_buf && !shmem->base.import_attach &&
> >> + !shmem->evicted;
> >
> > Are we missing
> >
> > && dma_resv_test_signaled(shmem->base.resv,
> > DMA_RESV_USAGE_BOOKKEEP)
> >
> > to make sure the GPU is done using the BO?
> > The same applies to drm_gem_shmem_is_purgeable() BTW.
> >
> > If you don't want to do this test here, we need a way to let drivers
> > provide a custom is_{evictable,purgeable}() test.
> >
> > I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
> > to let drivers move the GEMs that were used most recently (those
> > referenced by a GPU job) at the end of the evictable LRU.
>
> We have the signaled-check in the common drm_gem_evict() helper:
>
> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496

Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of
DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific
->evict() hook (though that means calling dma_resv_test_signaled()
twice, which is not great, oh well).

The problem about the evictable LRU remains though: we need a way to let
drivers put their BOs at the end of the list when the BO has been used
by the GPU, don't we?


2024-01-26 10:35:43

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

On Thu, 25 Jan 2024 18:24:04 +0100
Daniel Vetter <[email protected]> wrote:

> On Fri, Jan 05, 2024 at 09:46:03PM +0300, Dmitry Osipenko wrote:
> > Add a lockless drm_gem_shmem_get_pages() helper that skips taking the
> > reservation lock if pages_use_count is non-zero, leveraging the atomicity
> > of refcount_t. Make drm_gem_shmem_mmap() utilize the new helper.
> >
> > Acked-by: Maxime Ripard <[email protected]>
> > Reviewed-by: Boris Brezillon <[email protected]>
> > Suggested-by: Boris Brezillon <[email protected]>
> > Signed-off-by: Dmitry Osipenko <[email protected]>
> > ---
> > drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
> > 1 file changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index cacf0f8c42e2..1c032513abf1 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> > }
> > EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
> >
> > +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> > +{
> > + int ret;
>
> Just random drive-by comment: a might_lock annotation here might be good,
> or people could hit some really interesting bugs that are rather hard to
> reproduce ...

Actually, being able to acquire a ref in a dma-signalling path on an
object we know for sure already has refcount >= 1 (because we previously
acquired a ref in a path where dma_resv_lock() was allowed), was the
primary reason I suggested moving to this atomic-refcount approach.

In the meantime, drm_gpuvm has evolved in a way that allows me to not
take the ref in the dma-signalling path (the gpuvm_bo object now holds
the ref, and it's acquired/released outside the dma-signalling path).

Not saying we shouldn't add this might_lock(), but others might have
good reasons to have this function called in a path where locking
is not allowed.

2024-01-26 11:57:44

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH v19 18/30] drm/panfrost: Explicitly get and put drm-shmem pages

On 26/01/2024 09:39, Boris Brezillon wrote:
> On Thu, 25 Jan 2024 16:47:24 +0000
> Steven Price <[email protected]> wrote:
>
>> On 05/01/2024 18:46, Dmitry Osipenko wrote:
>>> To simplify the drm-shmem refcnt handling, we're moving away from
>>> the implicit get_pages() that is used by get_pages_sgt(). From now on
>>> drivers will have to pin pages while they use sgt. Panfrost's shrinker
>>> doesn't support swapping out BOs, hence pages are pinned and sgt is valid
>>> as long as pages' use-count > 0.
>>>
>>> In Panfrost, panfrost_gem_mapping, which is the object representing a
>>> GPU mapping of a BO, owns a pages ref. This guarantees that any BO being
>>> mapped GPU side has its pages retained till the mapping is destroyed.
>>>
>>> Since pages are no longer guaranteed to stay pinned for the BO lifetime,
>>> and MADVISE(DONT_NEED) flagging remains after the GEM handle has been
>>> destroyed, we need to add an extra 'is_purgeable' check in
>>> panfrost_gem_purge(), to make sure we're not trying to purge a BO that
>>> already had its pages released.
>>>
>>> Signed-off-by: Dmitry Osipenko <[email protected]>
>>
>> Reviewed-by: Steven Price <[email protected]>
>>
>> Although I don't like the condition in panfrost_gem_mapping_release()
>> for drm_gem_shmem_put_pages() and assigning NULL to bo->sgts - it feels
>> very fragile. See below.
>>
>>> ---
>>> drivers/gpu/drm/panfrost/panfrost_gem.c | 63 ++++++++++++++-----
>>> .../gpu/drm/panfrost/panfrost_gem_shrinker.c | 6 ++
>>> 2 files changed, 52 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
>>> index f268bd5c2884..7edfc12f7c1f 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
>>> @@ -35,20 +35,6 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj)
>>> */
>>> WARN_ON_ONCE(!list_empty(&bo->mappings.list));
>>>
>>> - if (bo->sgts) {
>>> - int i;
>>> - int n_sgt = bo->base.base.size / SZ_2M;
>>> -
>>> - for (i = 0; i < n_sgt; i++) {
>>> - if (bo->sgts[i].sgl) {
>>> - dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
>>> - DMA_BIDIRECTIONAL, 0);
>>> - sg_free_table(&bo->sgts[i]);
>>> - }
>>> - }
>>> - kvfree(bo->sgts);
>>> - }
>>> -
>>> drm_gem_shmem_free(&bo->base);
>>> }
>>>
>>> @@ -85,11 +71,40 @@ panfrost_gem_teardown_mapping(struct panfrost_gem_mapping *mapping)
>>>
>>> static void panfrost_gem_mapping_release(struct kref *kref)
>>> {
>>> - struct panfrost_gem_mapping *mapping;
>>> -
>>> - mapping = container_of(kref, struct panfrost_gem_mapping, refcount);
>>> + struct panfrost_gem_mapping *mapping =
>>> + container_of(kref, struct panfrost_gem_mapping, refcount);
>>> + struct panfrost_gem_object *bo = mapping->obj;
>>> + struct panfrost_device *pfdev = bo->base.base.dev->dev_private;
>>>
>>> panfrost_gem_teardown_mapping(mapping);
>>> +
>>> + /* On heap BOs, release the sgts created in the fault handler path. */
>>> + if (bo->sgts) {
>>> + int i, n_sgt = bo->base.base.size / SZ_2M;
>>> +
>>> + for (i = 0; i < n_sgt; i++) {
>>> + if (bo->sgts[i].sgl) {
>>> + dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
>>> + DMA_BIDIRECTIONAL, 0);
>>> + sg_free_table(&bo->sgts[i]);
>>> + }
>>> + }
>>> + kvfree(bo->sgts);
>>> + }
>>> +
>>> + /* Pages ref is owned by the panfrost_gem_mapping object. We must
>>> + * release our pages ref (if any), before releasing the object
>>> + * ref.
>>> + * Non-heap BOs acquired the pages at panfrost_gem_mapping creation
>>> + * time, and heap BOs may have acquired pages if the fault handler
>>> + * was called, in which case bo->sgts should be non-NULL.
>>> + */
>>> + if (!bo->base.base.import_attach && (!bo->is_heap || bo->sgts) &&
>>> + bo->base.madv >= 0) {
>>> + drm_gem_shmem_put_pages(&bo->base);
>>> + bo->sgts = NULL;
>>
>> The assignment of NULL here really ought to be unconditional - it isn't
>> a valid pointer because of the kvfree() above.
>
> Fair enough. How about we drop the '|| bo->sgts' and add a
> drm_gem_shmem_put_pages() to the above if (bo->sgts) block, where we'll
> also assign bo->sgts to NULL?

Yes that would be good.
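
For clarity, the rework being agreed on here would make the release path look
roughly like this (a sketch of the direction discussed, not merged code):

	/* In panfrost_gem_mapping_release(), after
	 * panfrost_gem_teardown_mapping(mapping): */

	/* Heap BOs: the fault handler allocated bo->sgts and took the pages
	 * ref, so both are released together and bo->sgts is cleared
	 * unconditionally right after kvfree(). */
	if (bo->sgts) {
		int i, n_sgt = bo->base.base.size / SZ_2M;

		for (i = 0; i < n_sgt; i++) {
			if (bo->sgts[i].sgl) {
				dma_unmap_sgtable(pfdev->dev, &bo->sgts[i],
						  DMA_BIDIRECTIONAL, 0);
				sg_free_table(&bo->sgts[i]);
			}
		}
		kvfree(bo->sgts);
		bo->sgts = NULL;

		if (bo->base.madv >= 0)
			drm_gem_shmem_put_pages(&bo->base);
	}

	/* Non-heap BOs took their pages ref at mapping creation time. */
	if (!bo->base.base.import_attach && !bo->is_heap &&
	    bo->base.madv >= 0)
		drm_gem_shmem_put_pages(&bo->base);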

>>
>> I also feel that the big condition above suggests there's a need for a
>> better state machine to keep track of what's going on.
>
> I'm planning to extend drm_gem_shmem to support the alloc-on-fault use
> case that all Mali GPUs seem to rely on (lima, panfrost and soon
> panthor would use those helpers). The idea is to:
>
> - make the allocation non-blocking, so we can kill the blocking
> allocation in the dma signalling path (basically what intel does)
> - allow dynamic extension of the pages array using an xarray instead of
> a plain array
>
> Hopefully this makes the state tracking a lot easier, and we can also
> get rid of the hack we have in panfrost/lima where we manipulate
> drm_gem_shmem_object refcounts directly.

That sounds great - it would definitely be good to get rid of the
refcount hack, it confuses me every time ;)

Thanks,

Steve

>>
>> But having said that I do think this series as a whole is an
>> improvement, it's nice to get the shrinker code generic. And sadly I
>> don't have an immediate idea for cleaning this up, hence my R-b.
>>
>> Steve
>>
>>> + }
>>> +
>>> drm_gem_object_put(&mapping->obj->base.base);
>>> panfrost_mmu_ctx_put(mapping->mmu);
>>> kfree(mapping);
>>> @@ -125,6 +140,20 @@ int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv)
>>> if (!mapping)
>>> return -ENOMEM;
>>>
>>> + if (!bo->is_heap && !bo->base.base.import_attach) {
>>> + /* Pages ref is owned by the panfrost_gem_mapping object.
>>> + * For non-heap BOs, we request pages at mapping creation
>>> + * time, such that the panfrost_mmu_map() call, further down in
>>> + * this function, is guaranteed to have pages_use_count > 0
>>> + * when drm_gem_shmem_get_pages_sgt() is called.
>>> + */
>>> + ret = drm_gem_shmem_get_pages(&bo->base);
>>> + if (ret) {
>>> + kfree(mapping);
>>> + return ret;
>>> + }
>>> + }
>>> +
>>> INIT_LIST_HEAD(&mapping->node);
>>> kref_init(&mapping->refcount);
>>> drm_gem_object_get(obj);
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
>>> index 02b60ea1433a..d4fb0854cf2f 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
>>> @@ -50,6 +50,12 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj)
>>> if (!dma_resv_trylock(shmem->base.resv))
>>> goto unlock_mappings;
>>>
>>> + /* BO might have become unpurgeable if the last pages_use_count ref
>>> + * was dropped, but the BO hasn't been destroyed yet.
>>> + */
>>> + if (!drm_gem_shmem_is_purgeable(shmem))
>>> + goto unlock_mappings;
>>> +
>>> panfrost_gem_teardown_mappings_locked(bo);
>>> drm_gem_shmem_purge_locked(&bo->base);
>>> ret = true;
>>
>


2024-01-26 16:28:26

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On 1/26/24 12:55, Boris Brezillon wrote:
> On Fri, 26 Jan 2024 00:56:47 +0300
> Dmitry Osipenko <[email protected]> wrote:
>
>> On 1/25/24 13:19, Boris Brezillon wrote:
>>> On Fri, 5 Jan 2024 21:46:16 +0300
>>> Dmitry Osipenko <[email protected]> wrote:
>>>
>>>> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
>>>> +{
>>>> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
>>>> + refcount_read(&shmem->pages_use_count) &&
>>>> + !refcount_read(&shmem->pages_pin_count) &&
>>>> + !shmem->base.dma_buf && !shmem->base.import_attach &&
>>>> + !shmem->evicted;
>>>
>>> Are we missing
>>>
>>> && dma_resv_test_signaled(shmem->base.resv,
>>> DMA_RESV_USAGE_BOOKKEEP)
>>>
>>> to make sure the GPU is done using the BO?
>>> The same applies to drm_gem_shmem_is_purgeable() BTW.
>>>
>>> If you don't want to do this test here, we need a way to let drivers
>>> provide a custom is_{evictable,purgeable}() test.
>>>
>>> I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
>>> to let drivers move the GEMs that were used most recently (those
>>> referenced by a GPU job) at the end of the evictable LRU.
>>
>> We have the signaled-check in the common drm_gem_evict() helper:
>>
>> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496
>
> Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of
> DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific
> ->evict() hook (though that means calling dma_resv_test_signaled()
> twice, which is not great, oh well).

Maybe we should change drm_gem_evict() to use BOOKKEEP. The
test_signaled(BOOKKEEP) should be a "stronger" check than
test_signaled(READ)?

> The problem about the evictable LRU remains though: we need a way to let
> drivers put their BOs at the end of the list when the BO has been used
> by the GPU, don't we?

If a BO is in use, then it won't be evicted, while idling BOs will be
evicted. Hence, the used BOs will naturally move down the LRU list
each time the shrinker is invoked.

--
Best regards,
Dmitry


2024-01-26 16:44:16

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

On 1/26/24 13:18, Boris Brezillon wrote:
> On Thu, 25 Jan 2024 18:24:04 +0100
> Daniel Vetter <[email protected]> wrote:
>
>> On Fri, Jan 05, 2024 at 09:46:03PM +0300, Dmitry Osipenko wrote:
>>> Add a lockless drm_gem_shmem_get_pages() helper that skips taking the
>>> reservation lock if pages_use_count is non-zero, leveraging the atomicity
>>> of refcount_t. Make drm_gem_shmem_mmap() utilize the new helper.
>>>
>>> Acked-by: Maxime Ripard <[email protected]>
>>> Reviewed-by: Boris Brezillon <[email protected]>
>>> Suggested-by: Boris Brezillon <[email protected]>
>>> Signed-off-by: Dmitry Osipenko <[email protected]>
>>> ---
>>> drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
>>> 1 file changed, 15 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
>>> index cacf0f8c42e2..1c032513abf1 100644
>>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
>>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
>>> @@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
>>> }
>>> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>>>
>>> +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
>>> +{
>>> + int ret;
>>
>> Just random drive-by comment: a might_lock annotation here might be good,
>> or people could hit some really interesting bugs that are rather hard to
>> reproduce ...
>
> Actually, being able to acquire a ref in a dma-signalling path on an
> object we know for sure already has refcount >= 1 (because we previously
> acquired a ref in a path where dma_resv_lock() was allowed), was the
> primary reason I suggested moving to this atomic-refcount approach.
>
> In the meantime, drm_gpuvm has evolved in a way that allows me to not
> take the ref in the dma-signalling path (the gpuvm_bo object now holds
> the ref, and it's acquired/released outside the dma-signalling path).
>
> Not saying we shouldn't add this might_lock(), but others might have
> good reasons to have this function called in a path where locking
> is not allowed.

For Panthor the might_lock indeed won't be appropriate, thanks for the
reminder. I'll add an explanatory comment to the code.

--
Best regards,
Dmitry


2024-01-26 18:12:47

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On Fri, 26 Jan 2024 19:27:49 +0300
Dmitry Osipenko <[email protected]> wrote:

> On 1/26/24 12:55, Boris Brezillon wrote:
> > On Fri, 26 Jan 2024 00:56:47 +0300
> > Dmitry Osipenko <[email protected]> wrote:
> >
> >> On 1/25/24 13:19, Boris Brezillon wrote:
> >>> On Fri, 5 Jan 2024 21:46:16 +0300
> >>> Dmitry Osipenko <[email protected]> wrote:
> >>>
> >>>> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> >>>> +{
> >>>> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
> >>>> + refcount_read(&shmem->pages_use_count) &&
> >>>> + !refcount_read(&shmem->pages_pin_count) &&
> >>>> + !shmem->base.dma_buf && !shmem->base.import_attach &&
> >>>> + !shmem->evicted;
> >>>
> >>> Are we missing
> >>>
> >>> && dma_resv_test_signaled(shmem->base.resv,
> >>> DMA_RESV_USAGE_BOOKKEEP)
> >>>
> >>> to make sure the GPU is done using the BO?
> >>> The same applies to drm_gem_shmem_is_purgeable() BTW.
> >>>
> >>> If you don't want to do this test here, we need a way to let drivers
> >>> provide a custom is_{evictable,purgeable}() test.
> >>>
> >>> I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
> >>> to let drivers move the GEMs that were used most recently (those
> >>> referenced by a GPU job) at the end of the evictable LRU.
> >>
> >> We have the signaled-check in the common drm_gem_evict() helper:
> >>
> >> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496
> >
> > Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of
> > DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific
> > ->evict() hook (though that means calling dma_resv_test_signaled()
> > twice, which is not great, oh well).
>
> Maybe we should change drm_gem_evict() to use BOOKKEEP. The
> test_signaled(BOOKKEEP) should be a "stronger" check than
> test_signaled(READ)?

It is, just wondering if some users have a good reason to want
READ here.

>
> > The problem about the evictable LRU remains though: we need a way to let
> > drivers put their BOs at the end of the list when the BO has been used
> > by the GPU, don't we?
>
> If BO is use, then it won't be evicted, while idling BOs will be
> evicted. Hence, the used BOs will be naturally moved down the LRU list
> each time shrinker is invoked.
>

That only does the trick if the BOs being used most often are busy when
the shrinker kicks in, though. Let's take this scenario:


BO 1                        BO 2                        shrinker

                            busy
                             idle (first-pos-in-evictable-LRU)

busy
 idle (second-pos-in-evictable-LRU)

                            busy
                             idle

                            busy
                             idle

                            busy
                             idle

                                                        find a BO to evict
                                                        pick BO 2

                            busy (swapin)
                             idle

If the LRU had been updated at each busy event, BO 1 should have
been picked for eviction. But we evicted the BO that was first
recorded idle instead of the one that was least recently
recorded busy.

2024-01-29 06:16:24

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On 1/26/24 21:12, Boris Brezillon wrote:
> On Fri, 26 Jan 2024 19:27:49 +0300
> Dmitry Osipenko <[email protected]> wrote:
>
>> On 1/26/24 12:55, Boris Brezillon wrote:
>>> On Fri, 26 Jan 2024 00:56:47 +0300
>>> Dmitry Osipenko <[email protected]> wrote:
>>>
>>>> On 1/25/24 13:19, Boris Brezillon wrote:
>>>>> On Fri, 5 Jan 2024 21:46:16 +0300
>>>>> Dmitry Osipenko <[email protected]> wrote:
>>>>>
>>>>>> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
>>>>>> +{
>>>>>> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
>>>>>> + refcount_read(&shmem->pages_use_count) &&
>>>>>> + !refcount_read(&shmem->pages_pin_count) &&
>>>>>> + !shmem->base.dma_buf && !shmem->base.import_attach &&
>>>>>> + !shmem->evicted;
>>>>>
>>>>> Are we missing
>>>>>
>>>>> && dma_resv_test_signaled(shmem->base.resv,
>>>>> DMA_RESV_USAGE_BOOKKEEP)
>>>>>
>>>>> to make sure the GPU is done using the BO?
>>>>> The same applies to drm_gem_shmem_is_purgeable() BTW.
>>>>>
>>>>> If you don't want to do this test here, we need a way to let drivers
>>>>> provide a custom is_{evictable,purgeable}() test.
>>>>>
>>>>> I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
>>>>> to let drivers move the GEMs that were used most recently (those
>>>>> referenced by a GPU job) at the end of the evictable LRU.
>>>>
>>>> We have the signaled-check in the common drm_gem_evict() helper:
>>>>
>>>> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496
>>>
>>> Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of
>>> DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific
>>> ->evict() hook (though that means calling dma_resv_test_signaled()
>>> twice, which is not great, oh well).
>>
>> Maybe we should change drm_gem_evict() to use BOOKKEEP. The
>> test_signaled(BOOKKEEP) should be a "stronger" check than
>> test_signaled(READ)?
>
> It is, just wondering if some users have a good reason to want
> READ here.
>
>>
>>> The problem about the evictable LRU remains though: we need a way to let
>>> drivers put their BOs at the end of the list when the BO has been used
>>> by the GPU, don't we?
>>
>> If BO is use, then it won't be evicted, while idling BOs will be
>> evicted. Hence, the used BOs will be naturally moved down the LRU list
>> each time shrinker is invoked.
>>
>
> That only do the trick if the BOs being used most often are busy when
> the shrinker kicks in though. Let's take this scenario:
>
>
> BO 1 BO 2 shinker
>
> busy
> idle (first-pos-in-evictable-LRU)
>
> busy
> idle (second-pos-in-evictable-LRU)
>
> busy
> idle
>
> busy
> idle
>
> busy
> idle
>
> find a BO to evict
> pick BO 2
>
> busy (swapin)
> idle
>
> If the LRU had been updated at each busy event, BO 1 should have
> been picked for eviction. But we evicted the BO that was first
> recorded idle instead of the one that was least recently
> recorded busy.

You have to swapin(BO) every time a BO goes to the busy state, and swapin
does drm_gem_lru_move_tail(BO). Hence, each time a BO goes idle->busy,
it's moved down the LRU list.

For example, please see patch #29 where virtio-gpu invokes swapin for each job's BO in the submit()->virtio_gpu_array_prepare() code path.
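
The driver-side part is small; something along these lines (a sketch,
driver_prepare_job_bo() is a made-up name standing in for the driver's
job-preparation code):

static int driver_prepare_job_bo(struct drm_gem_shmem_object *shmem)
{
        int ret;

        dma_resv_lock(shmem->base.resv, NULL);

        /* Bring the BO back to memory if it was evicted, before the job
         * uses it; this is also where its position in the evictable LRU
         * is meant to be refreshed.
         */
        ret = drm_gem_shmem_swapin_locked(shmem);

        dma_resv_unlock(shmem->base.resv);

        return ret;
}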

--
Best regards,
Dmitry


2024-01-29 08:55:20

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On Mon, 29 Jan 2024 09:16:04 +0300
Dmitry Osipenko <[email protected]> wrote:

> On 1/26/24 21:12, Boris Brezillon wrote:
> > On Fri, 26 Jan 2024 19:27:49 +0300
> > Dmitry Osipenko <[email protected]> wrote:
> >
> >> On 1/26/24 12:55, Boris Brezillon wrote:
> >>> On Fri, 26 Jan 2024 00:56:47 +0300
> >>> Dmitry Osipenko <[email protected]> wrote:
> >>>
> >>>> On 1/25/24 13:19, Boris Brezillon wrote:
> >>>>> On Fri, 5 Jan 2024 21:46:16 +0300
> >>>>> Dmitry Osipenko <[email protected]> wrote:
> >>>>>
> >>>>>> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> >>>>>> +{
> >>>>>> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
> >>>>>> + refcount_read(&shmem->pages_use_count) &&
> >>>>>> + !refcount_read(&shmem->pages_pin_count) &&
> >>>>>> + !shmem->base.dma_buf && !shmem->base.import_attach &&
> >>>>>> + !shmem->evicted;
> >>>>>
> >>>>> Are we missing
> >>>>>
> >>>>> && dma_resv_test_signaled(shmem->base.resv,
> >>>>> DMA_RESV_USAGE_BOOKKEEP)
> >>>>>
> >>>>> to make sure the GPU is done using the BO?
> >>>>> The same applies to drm_gem_shmem_is_purgeable() BTW.
> >>>>>
> >>>>> If you don't want to do this test here, we need a way to let drivers
> >>>>> provide a custom is_{evictable,purgeable}() test.
> >>>>>
> >>>>> I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
> >>>>> to let drivers move the GEMs that were used most recently (those
> >>>>> referenced by a GPU job) at the end of the evictable LRU.
> >>>>
> >>>> We have the signaled-check in the common drm_gem_evict() helper:
> >>>>
> >>>> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496
> >>>
> >>> Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of
> >>> DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific
> >>> ->evict() hook (though that means calling dma_resv_test_signaled()
> >>> twice, which is not great, oh well).
> >>
> >> Maybe we should change drm_gem_evict() to use BOOKKEEP. The
> >> test_signaled(BOOKKEEP) should be a "stronger" check than
> >> test_signaled(READ)?
> >
> > It is, just wondering if some users have a good reason to want
> > READ here.
> >
> >>
> >>> The problem about the evictable LRU remains though: we need a way to let
> >>> drivers put their BOs at the end of the list when the BO has been used
> >>> by the GPU, don't we?
> >>
> >> If BO is use, then it won't be evicted, while idling BOs will be
> >> evicted. Hence, the used BOs will be naturally moved down the LRU list
> >> each time shrinker is invoked.
> >>
> >
> > That only do the trick if the BOs being used most often are busy when
> > the shrinker kicks in though. Let's take this scenario:
> >
> >
> > BO 1 BO 2 shinker
> >
> > busy
> > idle (first-pos-in-evictable-LRU)
> >
> > busy
> > idle (second-pos-in-evictable-LRU)
> >
> > busy
> > idle
> >
> > busy
> > idle
> >
> > busy
> > idle
> >
> > find a BO to evict
> > pick BO 2
> >
> > busy (swapin)
> > idle
> >
> > If the LRU had been updated at each busy event, BO 1 should have
> > been picked for eviction. But we evicted the BO that was first
> > recorded idle instead of the one that was least recently
> > recorded busy.
>
> You have to swapin(BO) every time BO goes to busy state, and swapin does drm_gem_lru_move_tail(BO). Hence, each time BO goes idle->busy, it's moved down the LRU list.

Ah, that's the bit I was missing. It makes sense now. I guess that's
good enough for now, we can sort out the BOOKKEEP vs READ in a
follow-up series.
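
To make the difference concrete, the check we're talking about would be
(sketch only, gem_is_idle() is a made-up name):

static bool gem_is_idle(struct drm_gem_object *obj)
{
        /* BOOKKEEP only reports "signaled" once every fence attached to
         * the reservation object has signaled, so it is the stricter of
         * the two checks: anything idle under BOOKKEEP is also idle
         * under READ.
         */
        return dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_BOOKKEEP);
}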

Reviewed-by: Boris Brezillon <[email protected]>

2024-01-29 09:01:26

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On Fri, 5 Jan 2024 21:46:16 +0300
Dmitry Osipenko <[email protected]> wrote:

> +/**
> + * drm_gem_shmem_swapin_locked() - Moves shmem GEM back to memory and enables
> + * hardware access to the memory.
> + * @shmem: shmem GEM object
> + *
> + * This function moves shmem GEM back to memory if it was previously evicted
> + * by the memory shrinker. The GEM is ready to use on success.
> + *
> + * Returns:
> + * 0 on success or a negative error code on failure.
> + */
> +int drm_gem_shmem_swapin_locked(struct drm_gem_shmem_object *shmem)
> +{
> + int err;
> +
> + dma_resv_assert_held(shmem->base.resv);
> +
> + if (!shmem->evicted)
> + return 0;

Shouldn't we have a drm_gem_shmem_shrinker_update_lru_locked() even if
the object wasn't evicted, such that idle->busy transition moves the BO
to the list tail?

> +
> + err = drm_gem_shmem_acquire_pages(shmem);
> + if (err)
> + return err;
> +
> + shmem->evicted = false;
> +
> + drm_gem_shmem_shrinker_update_lru_locked(shmem);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(drm_gem_shmem_swapin_locked);
> +

2024-01-29 09:06:57

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On Mon, 29 Jan 2024 09:55:11 +0100
Boris Brezillon <[email protected]> wrote:

> On Mon, 29 Jan 2024 09:16:04 +0300
> Dmitry Osipenko <[email protected]> wrote:
>
> > On 1/26/24 21:12, Boris Brezillon wrote:
> > > On Fri, 26 Jan 2024 19:27:49 +0300
> > > Dmitry Osipenko <[email protected]> wrote:
> > >
> > >> On 1/26/24 12:55, Boris Brezillon wrote:
> > >>> On Fri, 26 Jan 2024 00:56:47 +0300
> > >>> Dmitry Osipenko <[email protected]> wrote:
> > >>>
> > >>>> On 1/25/24 13:19, Boris Brezillon wrote:
> > >>>>> On Fri, 5 Jan 2024 21:46:16 +0300
> > >>>>> Dmitry Osipenko <[email protected]> wrote:
> > >>>>>
> > >>>>>> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> > >>>>>> +{
> > >>>>>> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
> > >>>>>> + refcount_read(&shmem->pages_use_count) &&
> > >>>>>> + !refcount_read(&shmem->pages_pin_count) &&
> > >>>>>> + !shmem->base.dma_buf && !shmem->base.import_attach &&
> > >>>>>> + !shmem->evicted;
> > >>>>>
> > >>>>> Are we missing
> > >>>>>
> > >>>>> && dma_resv_test_signaled(shmem->base.resv,
> > >>>>> DMA_RESV_USAGE_BOOKKEEP)
> > >>>>>
> > >>>>> to make sure the GPU is done using the BO?
> > >>>>> The same applies to drm_gem_shmem_is_purgeable() BTW.
> > >>>>>
> > >>>>> If you don't want to do this test here, we need a way to let drivers
> > >>>>> provide a custom is_{evictable,purgeable}() test.
> > >>>>>
> > >>>>> I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
> > >>>>> to let drivers move the GEMs that were used most recently (those
> > >>>>> referenced by a GPU job) at the end of the evictable LRU.
> > >>>>
> > >>>> We have the signaled-check in the common drm_gem_evict() helper:
> > >>>>
> > >>>> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496
> > >>>
> > >>> Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of
> > >>> DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific
> > >>> ->evict() hook (though that means calling dma_resv_test_signaled()
> > >>> twice, which is not great, oh well).
> > >>
> > >> Maybe we should change drm_gem_evict() to use BOOKKEEP. The
> > >> test_signaled(BOOKKEEP) should be a "stronger" check than
> > >> test_signaled(READ)?
> > >
> > > It is, just wondering if some users have a good reason to want
> > > READ here.
> > >
> > >>
> > >>> The problem about the evictable LRU remains though: we need a way to let
> > >>> drivers put their BOs at the end of the list when the BO has been used
> > >>> by the GPU, don't we?
> > >>
> > >> If BO is use, then it won't be evicted, while idling BOs will be
> > >> evicted. Hence, the used BOs will be naturally moved down the LRU list
> > >> each time shrinker is invoked.
> > >>
> > >
> > > That only do the trick if the BOs being used most often are busy when
> > > the shrinker kicks in though. Let's take this scenario:
> > >
> > >
> > > BO 1 BO 2 shinker
> > >
> > > busy
> > > idle (first-pos-in-evictable-LRU)
> > >
> > > busy
> > > idle (second-pos-in-evictable-LRU)
> > >
> > > busy
> > > idle
> > >
> > > busy
> > > idle
> > >
> > > busy
> > > idle
> > >
> > > find a BO to evict
> > > pick BO 2
> > >
> > > busy (swapin)
> > > idle
> > >
> > > If the LRU had been updated at each busy event, BO 1 should have
> > > been picked for eviction. But we evicted the BO that was first
> > > recorded idle instead of the one that was least recently
> > > recorded busy.
> >
> > You have to swapin(BO) every time BO goes to busy state, and swapin does drm_gem_lru_move_tail(BO). Hence, each time BO goes idle->busy, it's moved down the LRU list.
>
> Ah, that's the bit I was missing. It makes sense now. I guess that's
> good enough for now, we can sort out the BOOKKEEP vs READ in a
> follow-up series.

On second look, it seems drm_gem_shmem_swapin_locked() doesn't call
drm_gem_shmem_shrinker_update_lru_locked() if the BO was already
resident? Is there something else I'm overlooking here?

>
> Reviewed-by: Boris Brezillon <[email protected]>


2024-01-29 09:27:04

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On 1/29/24 12:01, Boris Brezillon wrote:
> On Fri, 5 Jan 2024 21:46:16 +0300
> Dmitry Osipenko <[email protected]> wrote:
>
>> +/**
>> + * drm_gem_shmem_swapin_locked() - Moves shmem GEM back to memory and enables
>> + * hardware access to the memory.
>> + * @shmem: shmem GEM object
>> + *
>> + * This function moves shmem GEM back to memory if it was previously evicted
>> + * by the memory shrinker. The GEM is ready to use on success.
>> + *
>> + * Returns:
>> + * 0 on success or a negative error code on failure.
>> + */
>> +int drm_gem_shmem_swapin_locked(struct drm_gem_shmem_object *shmem)
>> +{
>> + int err;
>> +
>> + dma_resv_assert_held(shmem->base.resv);
>> +
>> + if (!shmem->evicted)
>> + return 0;
>
> Shouldn't we have a drm_gem_shmem_shrinker_update_lru_locked() even if
> the object wasn't evicted, such that idle->busy transition moves the BO
> to the list tail?

Seems so, good catch. I'll double-check and remove it in the next version.
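
For reference, the suggestion amounts to refreshing the LRU position
before the early return, something like this (only a sketch, the actual
change may end up looking different):

int drm_gem_shmem_swapin_locked(struct drm_gem_shmem_object *shmem)
{
        int err;

        dma_resv_assert_held(shmem->base.resv);

        /* Refresh the LRU position on every idle->busy transition, even
         * when the BO is already resident.
         */
        drm_gem_shmem_shrinker_update_lru_locked(shmem);

        if (!shmem->evicted)
                return 0;

        err = drm_gem_shmem_acquire_pages(shmem);
        if (err)
                return err;

        shmem->evicted = false;

        return 0;
}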

--
Best regards,
Dmitry


2024-01-30 08:39:17

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On Thu, Jan 25, 2024 at 10:07:03AM +0100, Boris Brezillon wrote:
> On Fri, 5 Jan 2024 21:46:16 +0300
> Dmitry Osipenko <[email protected]> wrote:
>
> > *
> > * This function Increases the use count and allocates the backing pages if
> > * use-count equals to zero.
> > + *
> > + * Note that this function doesn't pin pages in memory. If your driver
> > + * uses drm-shmem shrinker, then it's free to relocate pages to swap.
> > + * Getting pages only guarantees that pages are allocated, and not that
> > + * pages reside in memory. In order to pin pages use drm_gem_shmem_pin().
>
> I still find this explanation confusing, if pages are allocated, they
> reside in memory. The only difference between drm_gem_shmem_get_pages()
> and drm_gem_shmem_pin_pages() is that the former lets the system
> reclaim the memory if the buffer is idle (no unsignalled fence attached
> to the dma_resv).
>
> We also need to describe the workflow for GEM validation (that's the
> TTM term for the swapin process happening when a GPU job is submitted).
>
> 1. Prepare the GPU job and initialize its fence
> 2. Lock the GEM resv
> 3. Add the GPU job fence to the resv object
> 4. If the GEM is evicted
> a. call drm_gem_shmem_swapin_locked()
> b. get the new sgt with drm_gem_shmem_get_pages_sgt_locked()
> c. repopulate the MMU table (driver internals)

Might be good to explain where to call drm_sched_job_arm() here for
drivers using drm/sched, since that also needs to be at a very specific
point. Probably best to flesh out the details here by linking to the
relevant drm/sched and gpuvm functions as examples.

> 5. Unlock the GEM dma_resv
> 6. Submit the GPU job
>
> With this sequence, the GEM pages are guaranteed to stay around until
> the GPU job is finished.

Yeah I think the comment needs to explain how this ties together with
dma_resv locking and dma_resv fences, otherwise it just doesn't make much
sense.

This holds even more so given that some of the earlier drivers derived
from i915-gem code (and i915-gem itself) use _pin() both for these more
permanent pinnings, and also to temporarily put the memory in place before
it all gets fenced and then unpinned&unlocked.

So it would be really good to have the sharpest possible nomenclature here
we can get, and link between all the related concepts and functions in the
kerneldoc.

Some overview flow like Boris sketched above in a DOC: section would also
be great.

Cheers, Sima
>
> > */
> > int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2024-01-30 10:07:50

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

On 1/30/24 11:39, Daniel Vetter wrote:
> On Thu, Jan 25, 2024 at 10:07:03AM +0100, Boris Brezillon wrote:
>> On Fri, 5 Jan 2024 21:46:16 +0300
>> Dmitry Osipenko <[email protected]> wrote:
>>
>>> *
>>> * This function Increases the use count and allocates the backing pages if
>>> * use-count equals to zero.
>>> + *
>>> + * Note that this function doesn't pin pages in memory. If your driver
>>> + * uses drm-shmem shrinker, then it's free to relocate pages to swap.
>>> + * Getting pages only guarantees that pages are allocated, and not that
>>> + * pages reside in memory. In order to pin pages use drm_gem_shmem_pin().
>>
>> I still find this explanation confusing, if pages are allocated, they
>> reside in memory. The only difference between drm_gem_shmem_get_pages()
>> and drm_gem_shmem_pin_pages() is that the former lets the system
>> reclaim the memory if the buffer is idle (no unsignalled fence attached
>> to the dma_resv).
>>
>> We also need to describe the workflow for GEM validation (that's the
>> TTM term for the swapin process happening when a GPU job is submitted).
>>
>> 1. Prepare the GPU job and initialize its fence
>> 2. Lock the GEM resv
>> 3. Add the GPU job fence to the resv object
>> 4. If the GEM is evicted
>> a. call drm_gem_shmem_swapin_locked()
>> b. get the new sgt with drm_gem_shmem_get_pages_sgt_locked()
>> c. repopulate the MMU table (driver internals)
>
> Might be good to explain where to call drm_sched_job_arm() here for
> drivers using drm/sched, since that also needs to be at a very specific
> point. Probably best to flesh out the details here by linking to the
> relevant drm/sched and gpuvm functions as examples.
>
>> 5. Unlock the GEM dma_resv
>> 6. Submit the GPU job
>>
>> With this sequence, the GEM pages are guaranteed to stay around until
>> the GPU job is finished.
>
> Yeah I think the comment needs to explain how this ties together with
> dma_resv locking and dma_resv fences, otherwise it just doesn't make much
> sense.
>
> This holds even more so given that some of the earlier drivers derived
> from i915-gem code (and i915-gem itself) use _pin() both for these more
> permanent pinnings, and also to temporarily put the memory in place before
> it all gets fenced and then unpinned&unlocked.
>
> So would be really good to have the sharpest possible nomeclatura here we
> can get, and link between all the related concepts and functions in the
> kerneldoc.
>
> Some overview flow like Boris sketched above in a DOC: section would also
> be great.

Thank you all for the feedback! I'll add all of this documentation in the
next version.
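
Roughly along the lines of the flow sketched by Boris, e.g. (a sketch
only; struct my_job, its done_fence, the chosen usage flag and the
MMU-repopulation step are all driver-specific placeholders):

static int my_driver_submit_bo(struct my_job *job,
                               struct drm_gem_shmem_object *shmem)
{
        struct sg_table *sgt;
        int ret;

        /* 1. The job and its fence are assumed to be prepared already;
         *    for drm/sched drivers the fence placement ties into
         *    drm_sched_job_arm(), see the note above.
         */

        /* 2. Lock the GEM reservation object. */
        ret = dma_resv_lock(shmem->base.resv, NULL);
        if (ret)
                return ret;

        /* 3. Attach the job fence, so the shrinker sees the BO as busy
         *    until the job completes. The usage flag depends on how the
         *    job uses the BO.
         */
        ret = dma_resv_reserve_fences(shmem->base.resv, 1);
        if (ret)
                goto out_unlock;
        dma_resv_add_fence(shmem->base.resv, job->done_fence,
                           DMA_RESV_USAGE_WRITE);

        /* 4. Swap the BO back in if it was evicted (swapin is a no-op
         *    for resident BOs) and get a fresh sgt for the GPU MMU.
         */
        ret = drm_gem_shmem_swapin_locked(shmem);
        if (ret)
                goto out_unlock;

        sgt = drm_gem_shmem_get_pages_sgt_locked(shmem);
        if (IS_ERR(sgt)) {
                ret = PTR_ERR(sgt);
                goto out_unlock;
        }

        /* ... repopulate the GPU MMU from sgt (driver internals) ... */

out_unlock:
        /* 5. Unlock the resv, then 6. hand the job to the hardware. */
        dma_resv_unlock(shmem->base.resv);
        return ret;
}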

--
Best regards,
Dmitry


2024-01-30 10:08:12

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

On 1/30/24 11:34, Daniel Vetter wrote:
> On Fri, Jan 26, 2024 at 07:43:29PM +0300, Dmitry Osipenko wrote:
>> On 1/26/24 13:18, Boris Brezillon wrote:
>>> On Thu, 25 Jan 2024 18:24:04 +0100
>>> Daniel Vetter <[email protected]> wrote:
>>>
>>>> On Fri, Jan 05, 2024 at 09:46:03PM +0300, Dmitry Osipenko wrote:
>>>>> Add lockless drm_gem_shmem_get_pages() helper that skips taking reservation
>>>>> lock if pages_use_count is non-zero, leveraging from atomicity of the
>>>>> refcount_t. Make drm_gem_shmem_mmap() to utilize the new helper.
>>>>>
>>>>> Acked-by: Maxime Ripard <[email protected]>
>>>>> Reviewed-by: Boris Brezillon <[email protected]>
>>>>> Suggested-by: Boris Brezillon <[email protected]>
>>>>> Signed-off-by: Dmitry Osipenko <[email protected]>
>>>>> ---
>>>>> drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
>>>>> 1 file changed, 15 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
>>>>> index cacf0f8c42e2..1c032513abf1 100644
>>>>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
>>>>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
>>>>> @@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>>>>>
>>>>> +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
>>>>> +{
>>>>> + int ret;
>>>>
>>>> Just random drive-by comment: a might_lock annotation here might be good,
>>>> or people could hit some really interesting bugs that are rather hard to
>>>> reproduce ...
>>>
>>> Actually, being able to acquire a ref in a dma-signalling path on an
>>> object we know for sure already has refcount >= 1 (because we previously
>>> acquired a ref in a path where dma_resv_lock() was allowed), was the
>>> primary reason I suggested moving to this atomic-refcount approach.
>>>
>>> In the meantime, drm_gpuvm has evolved in a way that allows me to not
>>> take the ref in the dma-signalling path (the gpuvm_bo object now holds
>>> the ref, and it's acquired/released outside the dma-signalling path).
>>>
>>> Not saying we shouldn't add this might_lock(), but others might have
>>> good reasons to have this function called in a path where locking
>>> is not allowed.
>>
>> For Panthor the might_lock indeed won't be a appropriate, thanks for
>> reminding about it. I'll add explanatory comment to the code.
>
> Hm these kind of tricks feel very dangerous to me. I think it would be
> good to split up the two cases into two functions:
>
> 1. first one does only the atomic_inc and splats if the refcount is zero.
> I think something in the name that denotes that we're incrementing a
> borrowed pages reference would be good here, so like get_borrowed_pages
> (there's not really a naming convention for these in the kernel).
> Unfortunately no rust so we can't enforce that you provide the right kind
> of borrowed reference at compile time.
>
> 2. second one has the might_lock.
>
> This way you force callers to think what they're doing and ideally
> document where the borrowed reference is from, and ideally document that
> in the code. Otherwise we'll end up with way too much "works in testing,
> but is a nice CVE" code :-/

We can indeed have both variants of the borrowed/non-borrowed functions.
Thanks again for the suggestions.

--
Best regards,
Dmitry


2024-01-30 10:11:49

by Boris Brezillon

[permalink] [raw]
Subject: Re: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

On Tue, 30 Jan 2024 09:34:29 +0100
Daniel Vetter <[email protected]> wrote:

> On Fri, Jan 26, 2024 at 07:43:29PM +0300, Dmitry Osipenko wrote:
> > On 1/26/24 13:18, Boris Brezillon wrote:
> > > On Thu, 25 Jan 2024 18:24:04 +0100
> > > Daniel Vetter <[email protected]> wrote:
> > >
> > >> On Fri, Jan 05, 2024 at 09:46:03PM +0300, Dmitry Osipenko wrote:
> > >>> Add lockless drm_gem_shmem_get_pages() helper that skips taking reservation
> > >>> lock if pages_use_count is non-zero, leveraging from atomicity of the
> > >>> refcount_t. Make drm_gem_shmem_mmap() to utilize the new helper.
> > >>>
> > >>> Acked-by: Maxime Ripard <[email protected]>
> > >>> Reviewed-by: Boris Brezillon <[email protected]>
> > >>> Suggested-by: Boris Brezillon <[email protected]>
> > >>> Signed-off-by: Dmitry Osipenko <[email protected]>
> > >>> ---
> > >>> drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
> > >>> 1 file changed, 15 insertions(+), 4 deletions(-)
> > >>>
> > >>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > >>> index cacf0f8c42e2..1c032513abf1 100644
> > >>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > >>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > >>> @@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> > >>> }
> > >>> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
> > >>>
> > >>> +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> > >>> +{
> > >>> + int ret;
> > >>
> > >> Just random drive-by comment: a might_lock annotation here might be good,
> > >> or people could hit some really interesting bugs that are rather hard to
> > >> reproduce ...
> > >
> > > Actually, being able to acquire a ref in a dma-signalling path on an
> > > object we know for sure already has refcount >= 1 (because we previously
> > > acquired a ref in a path where dma_resv_lock() was allowed), was the
> > > primary reason I suggested moving to this atomic-refcount approach.
> > >
> > > In the meantime, drm_gpuvm has evolved in a way that allows me to not
> > > take the ref in the dma-signalling path (the gpuvm_bo object now holds
> > > the ref, and it's acquired/released outside the dma-signalling path).
> > >
> > > Not saying we shouldn't add this might_lock(), but others might have
> > > good reasons to have this function called in a path where locking
> > > is not allowed.
> >
> > For Panthor the might_lock indeed won't be a appropriate, thanks for
> > reminding about it. I'll add explanatory comment to the code.
>
> Hm these kind of tricks feel very dangerous to me. I think it would be
> good to split up the two cases into two functions:
>
> 1. first one does only the atomic_inc and splats if the refcount is zero.
> I think something in the name that denotes that we're incrementing a
> borrowed pages reference would be good here, so like get_borrowed_pages
> (there's not really a naming convention for these in the kernel).
> Unfortunately no rust so we can't enforce that you provide the right kind
> of borrowed reference at compile time.

Yeah, I also considered adding a dedicated function for that use case
at some point, instead of abusing get_pages(). Given I no longer need
it, we can probably add this might_lock() and defer the addition of this
get_borrowed_pages() helper until someone actually needs it.

>
> 2. second one has the might_lock.
>
> This way you force callers to think what they're doing and ideally
> document where the borrowed reference is from, and ideally document that
> in the code. Otherwise we'll end up with way too much "works in testing,
> but is a nice CVE" code :-/

Totally agree with you on that point.

2024-01-30 12:22:38

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

On Fri, Jan 26, 2024 at 07:43:29PM +0300, Dmitry Osipenko wrote:
> On 1/26/24 13:18, Boris Brezillon wrote:
> > On Thu, 25 Jan 2024 18:24:04 +0100
> > Daniel Vetter <[email protected]> wrote:
> >
> >> On Fri, Jan 05, 2024 at 09:46:03PM +0300, Dmitry Osipenko wrote:
> >>> Add lockless drm_gem_shmem_get_pages() helper that skips taking reservation
> >>> lock if pages_use_count is non-zero, leveraging from atomicity of the
> >>> refcount_t. Make drm_gem_shmem_mmap() to utilize the new helper.
> >>>
> >>> Acked-by: Maxime Ripard <[email protected]>
> >>> Reviewed-by: Boris Brezillon <[email protected]>
> >>> Suggested-by: Boris Brezillon <[email protected]>
> >>> Signed-off-by: Dmitry Osipenko <[email protected]>
> >>> ---
> >>> drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
> >>> 1 file changed, 15 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> >>> index cacf0f8c42e2..1c032513abf1 100644
> >>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> >>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> >>> @@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> >>> }
> >>> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
> >>>
> >>> +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> >>> +{
> >>> + int ret;
> >>
> >> Just random drive-by comment: a might_lock annotation here might be good,
> >> or people could hit some really interesting bugs that are rather hard to
> >> reproduce ...
> >
> > Actually, being able to acquire a ref in a dma-signalling path on an
> > object we know for sure already has refcount >= 1 (because we previously
> > acquired a ref in a path where dma_resv_lock() was allowed), was the
> > primary reason I suggested moving to this atomic-refcount approach.
> >
> > In the meantime, drm_gpuvm has evolved in a way that allows me to not
> > take the ref in the dma-signalling path (the gpuvm_bo object now holds
> > the ref, and it's acquired/released outside the dma-signalling path).
> >
> > Not saying we shouldn't add this might_lock(), but others might have
> > good reasons to have this function called in a path where locking
> > is not allowed.
>
> For Panthor the might_lock indeed won't be a appropriate, thanks for
> reminding about it. I'll add explanatory comment to the code.

Hm these kind of tricks feel very dangerous to me. I think it would be
good to split up the two cases into two functions:

1. first one does only the atomic_inc and splats if the refcount is zero.
I think something in the name that denotes that we're incrementing a
borrowed pages reference would be good here, so like get_borrowed_pages
(there's not really a naming convention for these in the kernel).
Unfortunately no rust so we can't enforce that you provide the right kind
of borrowed reference at compile time.

2. second one has the might_lock.

This way you force callers to think what they're doing and ideally
document where the borrowed reference is from, and ideally document that
in the code. Otherwise we'll end up with way too much "works in testing,
but is a nice CVE" code :-/

Cheers, Sima
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2024-02-01 18:53:36

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 09/30] drm/shmem-helper: Add and use lockless drm_gem_shmem_get_pages()

On 1/30/24 13:10, Boris Brezillon wrote:
> On Tue, 30 Jan 2024 09:34:29 +0100
> Daniel Vetter <[email protected]> wrote:
>
>> On Fri, Jan 26, 2024 at 07:43:29PM +0300, Dmitry Osipenko wrote:
>>> On 1/26/24 13:18, Boris Brezillon wrote:
>>>> On Thu, 25 Jan 2024 18:24:04 +0100
>>>> Daniel Vetter <[email protected]> wrote:
>>>>
>>>>> On Fri, Jan 05, 2024 at 09:46:03PM +0300, Dmitry Osipenko wrote:
>>>>>> Add lockless drm_gem_shmem_get_pages() helper that skips taking reservation
>>>>>> lock if pages_use_count is non-zero, leveraging from atomicity of the
>>>>>> refcount_t. Make drm_gem_shmem_mmap() to utilize the new helper.
>>>>>>
>>>>>> Acked-by: Maxime Ripard <[email protected]>
>>>>>> Reviewed-by: Boris Brezillon <[email protected]>
>>>>>> Suggested-by: Boris Brezillon <[email protected]>
>>>>>> Signed-off-by: Dmitry Osipenko <[email protected]>
>>>>>> ---
>>>>>> drivers/gpu/drm/drm_gem_shmem_helper.c | 19 +++++++++++++++----
>>>>>> 1 file changed, 15 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
>>>>>> index cacf0f8c42e2..1c032513abf1 100644
>>>>>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
>>>>>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
>>>>>> @@ -226,6 +226,20 @@ void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
>>>>>> }
>>>>>> EXPORT_SYMBOL_GPL(drm_gem_shmem_put_pages_locked);
>>>>>>
>>>>>> +static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
>>>>>> +{
>>>>>> + int ret;
>>>>>
>>>>> Just random drive-by comment: a might_lock annotation here might be good,
>>>>> or people could hit some really interesting bugs that are rather hard to
>>>>> reproduce ...
>>>>
>>>> Actually, being able to acquire a ref in a dma-signalling path on an
>>>> object we know for sure already has refcount >= 1 (because we previously
>>>> acquired a ref in a path where dma_resv_lock() was allowed), was the
>>>> primary reason I suggested moving to this atomic-refcount approach.
>>>>
>>>> In the meantime, drm_gpuvm has evolved in a way that allows me to not
>>>> take the ref in the dma-signalling path (the gpuvm_bo object now holds
>>>> the ref, and it's acquired/released outside the dma-signalling path).
>>>>
>>>> Not saying we shouldn't add this might_lock(), but others might have
>>>> good reasons to have this function called in a path where locking
>>>> is not allowed.
>>>
>>> For Panthor the might_lock indeed won't be a appropriate, thanks for
>>> reminding about it. I'll add explanatory comment to the code.
>>
>> Hm these kind of tricks feel very dangerous to me. I think it would be
>> good to split up the two cases into two functions:
>>
>> 1. first one does only the atomic_inc and splats if the refcount is zero.
>> I think something in the name that denotes that we're incrementing a
>> borrowed pages reference would be good here, so like get_borrowed_pages
>> (there's not really a naming convention for these in the kernel).
>> Unfortunately no rust so we can't enforce that you provide the right kind
>> of borrowed reference at compile time.
>
> Yeah, I also considered adding a dedicated function for that use case
> at some point, instead of abusing get_pages(). Given I no longer need
> it, we can probably add this might_lock() and defer the addition of this
> get_borrowed_pages() helper until someone actually needs it.

Ack, I'll add the might_lock() then. I missed earlier that you don't need
to use get_pages() anymore. Thanks
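
I.e. something like this on top of the helper (a sketch; the exact
lockdep expression might end up different in the final patch):

static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
{
        int ret;

        /* The slow path takes the reservation lock. Annotate that up
         * front so lockdep complains even when the refcount fast path is
         * taken, catching callers from fence-signalling paths early. The
         * lock expression matches what dma_resv_assert_held() uses.
         */
        might_lock(&shmem->base.resv->lock.base);

        if (refcount_inc_not_zero(&shmem->pages_use_count))
                return 0;

        dma_resv_lock(shmem->base.resv, NULL);
        ret = drm_gem_shmem_get_pages_locked(shmem);
        dma_resv_unlock(shmem->base.resv);

        return ret;
}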

--
Best regards,
Dmitry


2024-04-04 15:25:45

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH v19 17/30] drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()

On 1/5/24 21:46, Dmitry Osipenko wrote:
> From: Boris Brezillon <[email protected]>
>
> If some the pages or sgt allocation failed, we shouldn't release the
> pages ref we got earlier, otherwise we will end up with unbalanced
> get/put_pages() calls. We should instead leave everything in place
> and let the BO release function deal with extra cleanup when the object
> is destroyed, or let the fault handler try again next time it's called.
>
> Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
> Cc: <[email protected]>
> Signed-off-by: Boris Brezillon <[email protected]>
> Co-developed-by: Dmitry Osipenko <[email protected]>
> Signed-off-by: Dmitry Osipenko <[email protected]>
> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index bd5a0073009d..4a0b4bf03f1a 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -502,11 +502,18 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> mapping_set_unevictable(mapping);
>
> for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
> + /* Can happen if the last fault only partially filled this
> + * section of the pages array before failing. In that case
> + * we skip already filled pages.
> + */
> + if (pages[i])
> + continue;
> +
> pages[i] = shmem_read_mapping_page(mapping, i);
> if (IS_ERR(pages[i])) {
> ret = PTR_ERR(pages[i]);
> pages[i] = NULL;
> - goto err_pages;
> + goto err_unlock;
> }
> }
>
> @@ -514,7 +521,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> ret = sg_alloc_table_from_pages(sgt, pages + page_offset,
> NUM_FAULT_PAGES, 0, SZ_2M, GFP_KERNEL);
> if (ret)
> - goto err_pages;
> + goto err_unlock;
>
> ret = dma_map_sgtable(pfdev->dev, sgt, DMA_BIDIRECTIONAL, 0);
> if (ret)
> @@ -537,8 +544,6 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
>
> err_map:
> sg_free_table(sgt);
> -err_pages:
> - drm_gem_shmem_put_pages_locked(&bo->base);
> err_unlock:
> dma_resv_unlock(obj->resv);
> err_bo:

Applied to misc-fixes

Forgot that this patch doesn't depend on the others in this series, sorry
for not applying it earlier.

--
Best regards,
Dmitry