2022-06-25 23:05:29

by Rob Clark

Subject: [PATCH 00/15] drm+msm: Shrinker and LRU rework

From: Rob Clark <[email protected]>

This is mostly motivated by getting drm/msm to pass an i-g-t shrinker
test that I've been working on. In particular the test creates and
cycles between more GEM buffers than will fit in RAM to force
eviction and re-pin. (There are sub-tests that cover this case both
single threaded and with many child processes in parallel.)

Getting this test to pass necessitated a few improvements:

1. Re-ordering the submit path to get rid of __GFP_NORETRY in the
common case (doing this for syncobjs is still TODO)
2. Decoupling locks needed in the retire path from locks that could
be held while hitting reclaim in the submit path
3. If necessary, allow stalling on active BOs for reclaim.

The latter point is because we pin objects in the synchronous part of
the submit path (before queuing on the drm gpu-scheduler), which means
that, in the parallel variant of the i-g-t test, we need to be able to
block in the reclaim path until some queued work has completed/retired.

In the process of re-working how drm/msm tracks buffer state in its
various LRU lists, I refactored out a drm_gem_lru helper which, in
theory, should be usable by other drivers and the drm shmem helpers for
implementing LRU tracking and a shrinker.
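
To give an idea of the intended usage, here is a minimal sketch against
the helper API added in patch 11 (the lock/LRU/function names on the
driver side are hypothetical, not code from this series):

  /* All LRUs the object can move between must share one lock: */
  static DEFINE_MUTEX(foo_lru_lock);
  static struct drm_gem_lru foo_pinned, foo_willneed;

  static void foo_lru_setup(void)
  {
          drm_gem_lru_init(&foo_pinned, &foo_lru_lock);
          drm_gem_lru_init(&foo_willneed, &foo_lru_lock);
  }

  /* As object state changes, move it to the tail of the matching LRU: */
  static void foo_obj_unpinned(struct drm_gem_object *obj)
  {
          drm_gem_lru_move_tail(&foo_willneed, obj);
  }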

Rob Clark (15):
drm/msm: Switch to pfn mappings
drm/msm: Make enable_eviction flag static
drm/msm: Reorder lock vs submit alloc
drm/msm: Small submit cleanup
drm/msm: Split out idr_lock
drm/msm/gem: Check for active in shrinker path
drm/msm/gem: Rename update_inactive
drm/msm/gem: Rename to pin/unpin_pages
drm/msm/gem: Consolidate pin/unpin paths
drm/msm/gem: Remove active refcnt
drm/gem: Add LRU/shrinker helper
drm/msm/gem: Convert to using drm_gem_lru
drm/msm/gem: Unpin buffers earlier
drm/msm/gem: Consolidate shrinker trace
drm/msm/gem: Evict active GEM objects when necessary

drivers/gpu/drm/drm_gem.c | 183 +++++++++++++++++++++++++
drivers/gpu/drm/msm/msm_drv.c | 18 ++-
drivers/gpu/drm/msm/msm_drv.h | 70 +++++++---
drivers/gpu/drm/msm/msm_gem.c | 154 +++++++--------------
drivers/gpu/drm/msm/msm_gem.h | 112 +--------------
drivers/gpu/drm/msm/msm_gem_prime.c | 4 +-
drivers/gpu/drm/msm/msm_gem_shrinker.c | 164 ++++++++++------------
drivers/gpu/drm/msm/msm_gem_submit.c | 78 ++++-------
drivers/gpu/drm/msm/msm_gpu.c | 3 -
drivers/gpu/drm/msm/msm_gpu.h | 10 +-
drivers/gpu/drm/msm/msm_gpu_trace.h | 36 +++--
drivers/gpu/drm/msm/msm_submitqueue.c | 1 +
include/drm/drm_gem.h | 56 ++++++++
13 files changed, 485 insertions(+), 404 deletions(-)

--
2.36.1


2022-06-25 23:07:02

by Rob Clark

Subject: [PATCH 11/15] drm/gem: Add LRU/shrinker helper

From: Rob Clark <[email protected]>

Add a simple LRU helper to assist with a driver's shrinker implementation.
It handles tracking the number of backing pages associated with a given
LRU, and provides a helper to implement shrinker_scan.

A driver can use multiple LRU instances to track objects in various
states, for example a dontneed LRU for purgeable objects, a willneed LRU
for evictable objects, and an unpinned LRU for objects without backing
pages.

All LRUs that the object can be moved between must share a single lock.
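
As an example, a driver's shrinker callbacks could then be roughly as
follows (a sketch using the API from this patch; the foo_* names are
made up):

  static unsigned long
  foo_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
  {
          /* lockless read of the page count maintained by the LRU: */
          return foo_dontneed.count;
  }

  static bool foo_purge(struct drm_gem_object *obj)
  {
          /* drop the backing pages and move obj out of this LRU ... */
          foo_gem_purge(obj);   /* hypothetical driver helper */
          return true;
  }

  static unsigned long
  foo_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
  {
          return drm_gem_lru_scan(&foo_dontneed, sc->nr_to_scan, foo_purge);
  }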

Cc: Daniel Vetter <[email protected]>
Cc: Thomas Zimmermann <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/drm_gem.c | 183 ++++++++++++++++++++++++++++++++++++++
include/drm/drm_gem.h | 56 ++++++++++++
2 files changed, 239 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index eb0c2d041f13..684db28cc71c 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -165,6 +165,7 @@ void drm_gem_private_object_init(struct drm_device *dev,
obj->resv = &obj->_resv;

drm_vma_node_reset(&obj->vma_node);
+ INIT_LIST_HEAD(&obj->lru_node);
}
EXPORT_SYMBOL(drm_gem_private_object_init);

@@ -951,6 +952,7 @@ drm_gem_object_release(struct drm_gem_object *obj)

dma_resv_fini(&obj->_resv);
drm_gem_free_mmap_offset(obj);
+ drm_gem_lru_remove(obj);
}
EXPORT_SYMBOL(drm_gem_object_release);

@@ -1274,3 +1276,184 @@ drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
ww_acquire_fini(acquire_ctx);
}
EXPORT_SYMBOL(drm_gem_unlock_reservations);
+
+/**
+ * drm_gem_lru_init - initialize a LRU
+ *
+ * @lru: The LRU to initialize
+ * @lock: The lock protecting the LRU
+ */
+void
+drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock)
+{
+ lru->lock = lock;
+ lru->count = 0;
+ INIT_LIST_HEAD(&lru->list);
+}
+EXPORT_SYMBOL(drm_gem_lru_init);
+
+static void
+lru_remove(struct drm_gem_object *obj)
+{
+ obj->lru->count -= obj->size >> PAGE_SHIFT;
+ WARN_ON(obj->lru->count < 0);
+ list_del(&obj->lru_node);
+ obj->lru = NULL;
+}
+
+/**
+ * drm_gem_lru_remove - remove object from whatever LRU it is in
+ *
+ * If the object is currently in any LRU, remove it.
+ *
+ * @obj: The GEM object to remove from current LRU
+ */
+void
+drm_gem_lru_remove(struct drm_gem_object *obj)
+{
+ struct drm_gem_lru *lru = obj->lru;
+
+ if (!lru)
+ return;
+
+ mutex_lock(lru->lock);
+ lru_remove(obj);
+ mutex_unlock(lru->lock);
+}
+EXPORT_SYMBOL(drm_gem_lru_remove);
+
+/**
+ * drm_gem_lru_move_tail - move the object to the tail of the LRU
+ *
+ * If the object is already in this LRU it will be moved to the
+ * tail. Otherwise it will be removed from whichever other LRU
+ * it is in (if any) and moved into this LRU.
+ *
+ * @lru: The LRU to move the object into.
+ * @obj: The GEM object to move into this LRU
+ */
+void
+drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj)
+{
+ mutex_lock(lru->lock);
+ drm_gem_lru_move_tail_locked(lru, obj);
+ mutex_unlock(lru->lock);
+}
+EXPORT_SYMBOL(drm_gem_lru_move_tail);
+
+/**
+ * drm_gem_lru_move_tail_locked - move the object to the tail of the LRU
+ *
+ * If the object is already in this LRU it will be moved to the
+ * tail. Otherwise it will be removed from whichever other LRU
+ * it is in (if any) and moved into this LRU.
+ *
+ * Call with LRU lock held.
+ *
+ * @lru: The LRU to move the object into.
+ * @obj: The GEM object to move into this LRU
+ */
+void
+drm_gem_lru_move_tail_locked(struct drm_gem_lru *lru, struct drm_gem_object *obj)
+{
+ WARN_ON(!mutex_is_locked(lru->lock));
+
+ if (obj->lru)
+ lru_remove(obj);
+
+ lru->count += obj->size >> PAGE_SHIFT;
+ list_add_tail(&obj->lru_node, &lru->list);
+ obj->lru = lru;
+}
+EXPORT_SYMBOL(drm_gem_lru_move_tail_locked);
+
+/**
+ * drm_gem_lru_scan - helper to implement shrinker.scan_objects
+ *
+ * If the shrink callback succeeds, it is expected that the driver
+ * move the object out of this LRU.
+ *
+ * If the LRU possibly contains active buffers, it is the responsibility
+ * of the shrink callback to check for this (ie. dma_resv_test_signaled())
+ * or if necessary block until the buffer becomes idle.
+ *
+ * @lru: The LRU to scan
+ * @nr_to_scan: The number of pages to try to reclaim
+ * @shrink: Callback to try to shrink/reclaim the object.
+ */
+unsigned long
+drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
+ bool (*shrink)(struct drm_gem_object *obj))
+{
+ struct drm_gem_lru still_in_lru;
+ struct drm_gem_object *obj;
+ unsigned freed = 0;
+
+ drm_gem_lru_init(&still_in_lru, lru->lock);
+
+ mutex_lock(lru->lock);
+
+ while (freed < nr_to_scan) {
+ obj = list_first_entry_or_null(&lru->list, typeof(*obj), lru_node);
+
+ if (!obj)
+ break;
+
+ drm_gem_lru_move_tail_locked(&still_in_lru, obj);
+
+ /*
+ * If it's in the process of being freed, gem_object->free()
+ * may be blocked on lock waiting to remove it. So just
+ * skip it.
+ */
+ if (!kref_get_unless_zero(&obj->refcount))
+ continue;
+
+ /*
+ * Now that we own a reference, we can drop the lock for the
+ * rest of the loop body, to reduce contention with other
+ * code paths that need the LRU lock
+ */
+ mutex_unlock(lru->lock);
+
+ /*
+ * Note that this still needs to be trylock, since we can
+ * hit shrinker in response to trying to get backing pages
+ * for this obj (ie. while its lock is already held)
+ */
+ if (!dma_resv_trylock(obj->resv))
+ goto tail;
+
+ if (shrink(obj)) {
+ freed += obj->size >> PAGE_SHIFT;
+
+ /*
+ * If we succeeded in releasing the object's backing
+ * pages, we expect the driver to have moved the object
+ * out of this LRU
+ */
+ WARN_ON(obj->lru == &still_in_lru);
+ WARN_ON(obj->lru == lru);
+ }
+
+ dma_resv_unlock(obj->resv);
+
+tail:
+ drm_gem_object_put(obj);
+ mutex_lock(lru->lock);
+ }
+
+ /*
+ * Move objects we've skipped over out of the temporary still_in_lru
+ * back into this LRU
+ */
+ list_for_each_entry (obj, &still_in_lru.list, lru_node)
+ obj->lru = lru;
+ list_splice_tail(&still_in_lru.list, &lru->list);
+ lru->count += still_in_lru.count;
+
+ mutex_unlock(lru->lock);
+
+ return freed;
+}
+EXPORT_SYMBOL(drm_gem_lru_scan);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 87cffc9efa85..f13a9080af37 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -174,6 +174,41 @@ struct drm_gem_object_funcs {
const struct vm_operations_struct *vm_ops;
};

+/**
+ * struct drm_gem_lru - A simple LRU helper
+ *
+ * A helper for tracking GEM objects in a given state, to aid in
+ * driver's shrinker implementation. Tracks the count of pages
+ * for lockless &shrinker.count_objects, and provides
+ * &drm_gem_lru_scan for driver's &shrinker.scan_objects
+ * implementation.
+ */
+struct drm_gem_lru {
+ /**
+ * @lock:
+ *
+ * Lock protecting movement of GEM objects between LRUs. All
+ * LRUs that the object can move between should be protected
+ * by the same lock.
+ */
+ struct mutex *lock;
+
+ /**
+ * @count:
+ *
+ * The total number of backing pages of the GEM objects in
+ * this LRU.
+ */
+ long count;
+
+ /**
+ * @list:
+ *
+ * The LRU list.
+ */
+ struct list_head list;
+};
+
/**
* struct drm_gem_object - GEM buffer object
*
@@ -312,6 +347,20 @@ struct drm_gem_object {
*
*/
const struct drm_gem_object_funcs *funcs;
+
+ /**
+ * @lru_node:
+ *
+ * List node in a &drm_gem_lru.
+ */
+ struct list_head lru_node;
+
+ /**
+ * @lru:
+ *
+ * The current LRU list that the GEM object is on.
+ */
+ struct drm_gem_lru *lru;
};

/**
@@ -420,4 +469,11 @@ void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
u32 handle, u64 *offset);

+void drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock);
+void drm_gem_lru_remove(struct drm_gem_object *obj);
+void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj);
+void drm_gem_lru_move_tail_locked(struct drm_gem_lru *lru, struct drm_gem_object *obj);
+unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
+ bool (*shrink)(struct drm_gem_object *obj));
+
#endif /* __DRM_GEM_H__ */
--
2.36.1

2022-06-25 23:19:21

by Rob Clark

Subject: [PATCH 14/15] drm/msm/gem: Consolidate shrinker trace

From: Rob Clark <[email protected]>

Combine the separate trace events for purge vs evict into one. When we
add support for purging/evicting active buffers, we'll just add more
info to this one trace event rather than adding a bunch more events.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem_shrinker.c | 19 ++++++---------
drivers/gpu/drm/msm/msm_gpu_trace.h | 32 +++++++++++---------------
2 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 530b1102b46d..5cc05d669a08 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -71,25 +71,20 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
struct msm_drm_private *priv =
container_of(shrinker, struct msm_drm_private, shrinker);
long nr = sc->nr_to_scan;
- unsigned long freed;
+ unsigned long freed, purged, evicted = 0;

- freed = drm_gem_lru_scan(&priv->lru.dontneed, nr, purge);
- nr -= freed;
-
- if (freed > 0)
- trace_msm_gem_purge(freed << PAGE_SHIFT);
+ purged = drm_gem_lru_scan(&priv->lru.dontneed, nr, purge);
+ nr -= purged;

if (can_swap() && nr > 0) {
- unsigned long evicted;
-
evicted = drm_gem_lru_scan(&priv->lru.willneed, nr, evict);
nr -= evicted;
+ }

- if (evicted > 0)
- trace_msm_gem_evict(evicted << PAGE_SHIFT);
+ freed = purged + evicted;

- freed += evicted;
- }
+ if (freed)
+ trace_msm_gem_shrink(sc->nr_to_scan, purged, evicted);

return (freed > 0) ? freed : SHRINK_STOP;
}
diff --git a/drivers/gpu/drm/msm/msm_gpu_trace.h b/drivers/gpu/drm/msm/msm_gpu_trace.h
index ca0b08d7875b..8867fa0a0306 100644
--- a/drivers/gpu/drm/msm/msm_gpu_trace.h
+++ b/drivers/gpu/drm/msm/msm_gpu_trace.h
@@ -115,29 +115,23 @@ TRACE_EVENT(msm_gmu_freq_change,
);


-TRACE_EVENT(msm_gem_purge,
- TP_PROTO(u32 bytes),
- TP_ARGS(bytes),
+TRACE_EVENT(msm_gem_shrink,
+ TP_PROTO(u32 nr_to_scan, u32 purged, u32 evicted),
+ TP_ARGS(nr_to_scan, purged, evicted),
TP_STRUCT__entry(
- __field(u32, bytes)
+ __field(u32, nr_to_scan)
+ __field(u32, purged)
+ __field(u32, evicted)
),
TP_fast_assign(
- __entry->bytes = bytes;
+ __entry->nr_to_scan = nr_to_scan;
+ __entry->purged = purged;
+ __entry->evicted = evicted;
),
- TP_printk("Purging %u bytes", __entry->bytes)
-);
-
-
-TRACE_EVENT(msm_gem_evict,
- TP_PROTO(u32 bytes),
- TP_ARGS(bytes),
- TP_STRUCT__entry(
- __field(u32, bytes)
- ),
- TP_fast_assign(
- __entry->bytes = bytes;
- ),
- TP_printk("Evicting %u bytes", __entry->bytes)
+ TP_printk("nr_to_scan=%u pages, purged=%u pages, evicted=%u pages",
+ __entry->nr_to_scan,
+ __entry->purged,
+ __entry->evicted)
);


--
2.36.1

2022-06-25 23:19:51

by Rob Clark

Subject: [PATCH 15/15] drm/msm/gem: Evict active GEM objects when necessary

From: Rob Clark <[email protected]>

If we are under enough memory pressure, we should stall, waiting for
active buffers to become idle, in order to purge or evict them.
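
For reference, the stall is bounded: wait_for_idle() below uses
dma_resv_wait_timeout(), whose timeout is in jiffies and which returns
the remaining time (> 0) once all matching fences have signaled, 0 on
timeout, or a negative error. A minimal sketch of such a bounded
idle-wait (hypothetical name, not the exact helper below):

  static bool foo_wait_idle(struct drm_gem_object *obj)
  {
          /* false means "still busy", so the scan skips the object: */
          return dma_resv_wait_timeout(obj->resv,
                                       dma_resv_usage_rw(true),
                                       false, msecs_to_jiffies(10)) > 0;
  }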

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem_shrinker.c | 68 +++++++++++++++++++++-----
drivers/gpu/drm/msm/msm_gpu_trace.h | 16 +++---
2 files changed, 66 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 5cc05d669a08..b0bee040432a 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -24,6 +24,11 @@ static bool can_swap(void)
return enable_eviction && get_nr_swap_pages() > 0;
}

+static bool can_block(struct shrink_control *sc)
+{
+ return current_is_kswapd() || (sc->gfp_mask & __GFP_RECLAIM);
+}
+
static unsigned long
msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
{
@@ -65,26 +70,65 @@ evict(struct drm_gem_object *obj)
return true;
}

+static bool
+wait_for_idle(struct drm_gem_object *obj)
+{
+ enum dma_resv_usage usage = dma_resv_usage_rw(true);
+ return dma_resv_wait_timeout(obj->resv, usage, false, 1000) > 0;
+}
+
+static bool
+active_purge(struct drm_gem_object *obj)
+{
+ if (!wait_for_idle(obj))
+ return false;
+
+ return purge(obj);
+}
+
+static bool
+active_evict(struct drm_gem_object *obj)
+{
+ if (!wait_for_idle(obj))
+ return false;
+
+ return evict(obj);
+}
+
static unsigned long
msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
{
struct msm_drm_private *priv =
container_of(shrinker, struct msm_drm_private, shrinker);
+ struct {
+ struct drm_gem_lru *lru;
+ bool (*shrink)(struct drm_gem_object *obj);
+ bool cond;
+ unsigned long freed;
+ } stages[] = {
+ /* Stages of progressively more aggressive/expensive reclaim: */
+ { &priv->lru.dontneed, purge, true },
+ { &priv->lru.willneed, evict, can_swap() },
+ { &priv->lru.dontneed, active_purge, can_block(sc) },
+ { &priv->lru.willneed, active_evict, can_swap() && can_block(sc) },
+ };
long nr = sc->nr_to_scan;
- unsigned long freed, purged, evicted = 0;
-
- purged = drm_gem_lru_scan(&priv->lru.dontneed, nr, purge);
- nr -= purged;
-
- if (can_swap() && nr > 0) {
- evicted = drm_gem_lru_scan(&priv->lru.willneed, nr, evict);
- nr -= evicted;
+ unsigned long freed = 0;
+
+ for (unsigned i = 0; (nr > 0) && (i < ARRAY_SIZE(stages)); i++) {
+ if (!stages[i].cond)
+ continue;
+ stages[i].freed =
+ drm_gem_lru_scan(stages[i].lru, nr, stages[i].shrink);
+ nr -= stages[i].freed;
+ freed += stages[i].freed;
}

- freed = purged + evicted;
-
- if (freed)
- trace_msm_gem_shrink(sc->nr_to_scan, purged, evicted);
+ if (freed) {
+ trace_msm_gem_shrink(sc->nr_to_scan, stages[0].freed,
+ stages[1].freed, stages[2].freed,
+ stages[3].freed);
+ }

return (freed > 0) ? freed : SHRINK_STOP;
}
diff --git a/drivers/gpu/drm/msm/msm_gpu_trace.h b/drivers/gpu/drm/msm/msm_gpu_trace.h
index 8867fa0a0306..ac40d857bc45 100644
--- a/drivers/gpu/drm/msm/msm_gpu_trace.h
+++ b/drivers/gpu/drm/msm/msm_gpu_trace.h
@@ -116,22 +116,26 @@ TRACE_EVENT(msm_gmu_freq_change,


TRACE_EVENT(msm_gem_shrink,
- TP_PROTO(u32 nr_to_scan, u32 purged, u32 evicted),
- TP_ARGS(nr_to_scan, purged, evicted),
+ TP_PROTO(u32 nr_to_scan, u32 purged, u32 evicted,
+ u32 active_purged, u32 active_evicted),
+ TP_ARGS(nr_to_scan, purged, evicted, active_purged, active_evicted),
TP_STRUCT__entry(
__field(u32, nr_to_scan)
__field(u32, purged)
__field(u32, evicted)
+ __field(u32, active_purged)
+ __field(u32, active_evicted)
),
TP_fast_assign(
__entry->nr_to_scan = nr_to_scan;
__entry->purged = purged;
__entry->evicted = evicted;
+ __entry->active_purged = active_purged;
+ __entry->active_evicted = active_evicted;
),
- TP_printk("nr_to_scan=%u pages, purged=%u pages, evicted=%u pages",
- __entry->nr_to_scan,
- __entry->purged,
- __entry->evicted)
+ TP_printk("nr_to_scan=%u pg, purged=%u pg, evicted=%u pg, active_purged=%u pg, active_evicted=%u pg",
+ __entry->nr_to_scan, __entry->purged, __entry->evicted,
+ __entry->active_purged, __entry->active_evicted)
);


--
2.36.1

2022-06-25 23:20:27

by Rob Clark

Subject: [PATCH 08/15] drm/msm/gem: Rename to pin/unpin_pages

From: Rob Clark <[email protected]>

Since that is what these functions actually do: they get *pinned*
pages (as opposed to cases where we need pages but don't need them
pinned, like CPU mappings).

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem.c | 18 +++++++++++++-----
drivers/gpu/drm/msm/msm_gem.h | 4 ++--
drivers/gpu/drm/msm/msm_gem_prime.c | 4 ++--
3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 97467364dc0a..3da64c7f65a2 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -177,30 +177,38 @@ static void put_pages(struct drm_gem_object *obj)
}
}

-struct page **msm_gem_get_pages(struct drm_gem_object *obj)
+static struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj)
{
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct page **p;

- msm_gem_lock(obj);
+ GEM_WARN_ON(!msm_gem_is_locked(obj));

if (GEM_WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED)) {
- msm_gem_unlock(obj);
return ERR_PTR(-EBUSY);
}

p = get_pages(obj);
-
if (!IS_ERR(p)) {
msm_obj->pin_count++;
update_lru(obj);
}

+ return p;
+}
+
+struct page **msm_gem_pin_pages(struct drm_gem_object *obj)
+{
+ struct page **p;
+
+ msm_gem_lock(obj);
+ p = msm_gem_pin_pages_locked(obj);
msm_gem_unlock(obj);
+
return p;
}

-void msm_gem_put_pages(struct drm_gem_object *obj)
+void msm_gem_unpin_pages(struct drm_gem_object *obj)
{
struct msm_gem_object *msm_obj = to_msm_bo(obj);

diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 0ab0dc4f8c25..6fe521ccda45 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -159,8 +159,8 @@ int msm_gem_get_and_pin_iova(struct drm_gem_object *obj,
struct msm_gem_address_space *aspace, uint64_t *iova);
void msm_gem_unpin_iova(struct drm_gem_object *obj,
struct msm_gem_address_space *aspace);
-struct page **msm_gem_get_pages(struct drm_gem_object *obj);
-void msm_gem_put_pages(struct drm_gem_object *obj);
+struct page **msm_gem_pin_pages(struct drm_gem_object *obj);
+void msm_gem_unpin_pages(struct drm_gem_object *obj);
int msm_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
struct drm_mode_create_dumb *args);
int msm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
diff --git a/drivers/gpu/drm/msm/msm_gem_prime.c b/drivers/gpu/drm/msm/msm_gem_prime.c
index dcc8a573bc76..c1d91863df05 100644
--- a/drivers/gpu/drm/msm/msm_gem_prime.c
+++ b/drivers/gpu/drm/msm/msm_gem_prime.c
@@ -63,12 +63,12 @@ struct drm_gem_object *msm_gem_prime_import_sg_table(struct drm_device *dev,
int msm_gem_prime_pin(struct drm_gem_object *obj)
{
if (!obj->import_attach)
- msm_gem_get_pages(obj);
+ msm_gem_pin_pages(obj);
return 0;
}

void msm_gem_prime_unpin(struct drm_gem_object *obj)
{
if (!obj->import_attach)
- msm_gem_put_pages(obj);
+ msm_gem_unpin_pages(obj);
}
--
2.36.1

2022-06-25 23:24:20

by Rob Clark

Subject: [PATCH 01/15] drm/msm: Switch to pfn mappings

From: Rob Clark <[email protected]>

I'm not entirely sure why we were using VM_MIXEDMAP; these are never
CoW mappings. Let's switch to VM_PFNMAP, to be more consistent with
what other drivers and the GEM shmem helpers do.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index ad7da2ca35ab..8ddbd2e001d4 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -259,7 +259,8 @@ static vm_fault_t msm_gem_fault(struct vm_fault *vmf)
VERB("Inserting %p pfn %lx, pa %lx", (void *)vmf->address,
pfn, pfn << PAGE_SHIFT);

- ret = vmf_insert_mixed(vma, vmf->address, __pfn_to_pfn_t(pfn, PFN_DEV));
+ ret = vmf_insert_pfn(vma, vmf->address, pfn);
+
out_unlock:
msm_gem_unlock(obj);
out:
@@ -1051,7 +1052,7 @@ static int msm_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_struct
{
struct msm_gem_object *msm_obj = to_msm_bo(obj);

- vma->vm_flags |= VM_IO | VM_MIXEDMAP | VM_DONTEXPAND | VM_DONTDUMP;
+ vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
vma->vm_page_prot = msm_gem_pgprot(msm_obj, vm_get_page_prot(vma->vm_flags));

return 0;
--
2.36.1

2022-06-25 23:26:01

by Rob Clark

Subject: [PATCH 03/15] drm/msm: Reorder lock vs submit alloc

From: Rob Clark <[email protected]>

Allocating the submit before taking the queue lock means the
allocation can block in reclaim, which lets us drop __GFP_NORETRY.
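
The shape of the change, as a simplified sketch (the full reordering is
in the diff below):

  /* Before: allocating under queue->lock, so reclaim must not block
   * there, hence __GFP_NORETRY: */
  mutex_lock(&queue->lock);
  submit = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);

  /* After: allocate first, so the allocation is free to enter (and
   * block in) reclaim, then take the lock: */
  submit = kzalloc(sz, GFP_KERNEL);
  ret = mutex_lock_interruptible(&queue->lock);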

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem_submit.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index c9e4aeb14f4a..b7c61a99d274 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -36,7 +36,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
if (sz > SIZE_MAX)
return ERR_PTR(-ENOMEM);

- submit = kzalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
+ submit = kzalloc(sz, GFP_KERNEL);
if (!submit)
return ERR_PTR(-ENOMEM);

@@ -771,25 +771,21 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
trace_msm_gpu_submit(pid_nr(pid), ring->id, submitid,
args->nr_bos, args->nr_cmds);

- ret = mutex_lock_interruptible(&queue->lock);
- if (ret)
- goto out_post_unlock;
-
if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
if (out_fence_fd < 0) {
ret = out_fence_fd;
- goto out_unlock;
+ return ret;
}
}

- submit = submit_create(dev, gpu, queue, args->nr_bos,
- args->nr_cmds);
- if (IS_ERR(submit)) {
- ret = PTR_ERR(submit);
- submit = NULL;
- goto out_unlock;
- }
+ submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds);
+ if (IS_ERR(submit))
+ return PTR_ERR(submit);
+
+ ret = mutex_lock_interruptible(&queue->lock);
+ if (ret)
+ goto out_post_unlock;

submit->pid = pid;
submit->ident = submitid;
@@ -965,9 +961,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
if (ret && (out_fence_fd >= 0))
put_unused_fd(out_fence_fd);
mutex_unlock(&queue->lock);
+out_post_unlock:
if (submit)
msm_gem_submit_put(submit);
-out_post_unlock:
if (!IS_ERR_OR_NULL(post_deps)) {
for (i = 0; i < args->nr_out_syncobjs; ++i) {
kfree(post_deps[i].chain);
--
2.36.1

2022-06-25 23:26:07

by Rob Clark

Subject: [PATCH 02/15] drm/msm: Make enable_eviction flag static

From: Rob Clark <[email protected]>

No need for it to be visible outside of this one src file.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem_shrinker.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 086dacf2f26a..6e39d959b9f0 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -15,7 +15,7 @@
/* Default disabled for now until it has some more testing on the different
* iommu combinations that can be paired with the driver:
*/
-bool enable_eviction = false;
+static bool enable_eviction = false;
MODULE_PARM_DESC(enable_eviction, "Enable swappable GEM buffers");
module_param(enable_eviction, bool, 0600);

--
2.36.1

2022-06-25 23:26:18

by Rob Clark

Subject: [PATCH 12/15] drm/msm/gem: Convert to using drm_gem_lru

From: Rob Clark <[email protected]>

This converts over to use the shared GEM LRU/shrinker helpers. Note
that this means we no longer separately track purgeable or willneed
buffers that are active. But the most recently pinned buffers should
be at the tail of the various LRUs, and the shrinker is already
prepared to encounter objects which are still active.
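
The resulting object lifecycle across the four LRUs, summarizing the
kerneldoc added below:

  /*
   *             pin                unpin
   *  unbacked -------> pinned -----------> willneed / dontneed
   *     ^                                  (per madvise state)
   *     |                                          |
   *     +----- evict (willneed) / purge (dontneed)-+
   */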

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_drv.c | 14 +--
drivers/gpu/drm/msm/msm_drv.h | 70 +++++++++++----
drivers/gpu/drm/msm/msm_gem.c | 58 ++++--------
drivers/gpu/drm/msm/msm_gem.h | 93 --------------------
drivers/gpu/drm/msm/msm_gem_shrinker.c | 117 ++++++-------------------
drivers/gpu/drm/msm/msm_gpu.c | 3 -
drivers/gpu/drm/msm/msm_gpu.h | 6 --
7 files changed, 104 insertions(+), 257 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index ace91ead2caf..46afdb4ac96e 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -373,14 +373,18 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
INIT_LIST_HEAD(&priv->objects);
mutex_init(&priv->obj_lock);

- INIT_LIST_HEAD(&priv->inactive_willneed);
- INIT_LIST_HEAD(&priv->inactive_dontneed);
- INIT_LIST_HEAD(&priv->inactive_unpinned);
- mutex_init(&priv->mm_lock);
+ /*
+ * Initialize the LRUs:
+ */
+ mutex_init(&priv->lru.lock);
+ drm_gem_lru_init(&priv->lru.unbacked, &priv->lru.lock);
+ drm_gem_lru_init(&priv->lru.pinned, &priv->lru.lock);
+ drm_gem_lru_init(&priv->lru.willneed, &priv->lru.lock);
+ drm_gem_lru_init(&priv->lru.dontneed, &priv->lru.lock);

/* Teach lockdep about lock ordering wrt. shrinker: */
fs_reclaim_acquire(GFP_KERNEL);
- might_lock(&priv->mm_lock);
+ might_lock(&priv->lru.lock);
fs_reclaim_release(GFP_KERNEL);

drm_mode_config_init(ddev);
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 099a67d10c3a..b5c789777e01 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -152,28 +152,60 @@ struct msm_drm_private {
struct mutex obj_lock;

/**
- * LRUs of inactive GEM objects. Every bo is either in one of the
- * inactive lists (depending on whether or not it is shrinkable) or
- * gpu->active_list (for the gpu it is active on[1]), or transiently
- * on a temporary list as the shrinker is running.
+ * lru:
*
- * Note that inactive_willneed also contains pinned and vmap'd bos,
- * but the number of pinned-but-not-active objects is small (scanout
- * buffers, ringbuffer, etc).
+ * The various LRU's that a GEM object is in at various stages of
+ * its lifetime. Objects start out in the unbacked LRU. When
+ * pinned (for scanout or permanently mapped GPU buffers, like
+ * ringbuffer, memptr, fw, etc) it moves to the pinned LRU. When
+ * unpinned, it moves into willneed or dontneed LRU depending on
+ * madvise state. When backing pages are evicted (willneed) or
+ * purged (dontneed) it moves back into the unbacked LRU.
*
- * These lists are protected by mm_lock (which should be acquired
- * before per GEM object lock). One should *not* hold mm_lock in
- * get_pages()/vmap()/etc paths, as they can trigger the shrinker.
- *
- * [1] if someone ever added support for the old 2d cores, there could be
- * more than one gpu object
+ * The dontneed LRU is considered by the shrinker for objects
+ * that are candidate for purging, and the willneed LRU is
+ * considered for objects that could be evicted.
*/
- struct list_head inactive_willneed; /* inactive + potentially unpin/evictable */
- struct list_head inactive_dontneed; /* inactive + shrinkable */
- struct list_head inactive_unpinned; /* inactive + purged or unpinned */
- long shrinkable_count; /* write access under mm_lock */
- long evictable_count; /* write access under mm_lock */
- struct mutex mm_lock;
+ struct {
+ /**
+ * unbacked:
+ *
+ * The LRU for GEM objects without backing pages allocated.
+ * This mostly exists so that objects are always in one
+ * LRU.
+ */
+ struct drm_gem_lru unbacked;
+
+ /**
+ * pinned:
+ *
+ * The LRU for pinned GEM objects
+ */
+ struct drm_gem_lru pinned;
+
+ /**
+ * willneed:
+ *
+ * The LRU for unpinned GEM objects which are in madvise
+ * WILLNEED state (ie. can be evicted)
+ */
+ struct drm_gem_lru willneed;
+
+ /**
+ * dontneed:
+ *
+ * The LRU for unpinned GEM objects which are in madvise
+ * DONTNEED state (ie. can be purged)
+ */
+ struct drm_gem_lru dontneed;
+
+ /**
+ * lock:
+ *
+ * Protects manipulation of all of the LRUs.
+ */
+ struct mutex lock;
+ } lru;

struct workqueue_struct *wq;

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 209438744bab..d4e8af46f4ef 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -174,6 +174,7 @@ static void put_pages(struct drm_gem_object *obj)
put_pages_vram(obj);

msm_obj->pages = NULL;
+ update_lru(obj);
}
}

@@ -210,8 +211,6 @@ struct page **msm_gem_pin_pages(struct drm_gem_object *obj)

void msm_gem_unpin_pages(struct drm_gem_object *obj)
{
- struct msm_gem_object *msm_obj = to_msm_bo(obj);
-
msm_gem_lock(obj);
msm_gem_unpin_locked(obj);
msm_gem_unlock(obj);
@@ -761,7 +760,6 @@ void msm_gem_purge(struct drm_gem_object *obj)
put_iova_vmas(obj);

msm_obj->madv = __MSM_MADV_PURGED;
- update_lru(obj);

drm_gem_free_mmap_offset(obj);

@@ -786,7 +784,6 @@ void msm_gem_evict(struct drm_gem_object *obj)

GEM_WARN_ON(!msm_gem_is_locked(obj));
GEM_WARN_ON(is_unevictable(msm_obj));
- GEM_WARN_ON(!msm_obj->evictable);

/* Get rid of any iommu mapping(s): */
put_iova_spaces(obj, false);
@@ -794,8 +791,6 @@ void msm_gem_evict(struct drm_gem_object *obj)
drm_vma_node_unmap(&obj->vma_node, dev->anon_inode->i_mapping);

put_pages(obj);
-
- update_lru(obj);
}

void msm_gem_vunmap(struct drm_gem_object *obj)
@@ -818,26 +813,20 @@ static void update_lru(struct drm_gem_object *obj)

GEM_WARN_ON(!msm_gem_is_locked(&msm_obj->base));

- mutex_lock(&priv->mm_lock);
-
- if (msm_obj->dontneed)
- mark_unpurgeable(msm_obj);
- if (msm_obj->evictable)
- mark_unevictable(msm_obj);
-
- list_del(&msm_obj->mm_list);
- if ((msm_obj->madv == MSM_MADV_WILLNEED) && msm_obj->sgt) {
- list_add_tail(&msm_obj->mm_list, &priv->inactive_willneed);
- mark_evictable(msm_obj);
- } else if (msm_obj->madv == MSM_MADV_DONTNEED) {
- list_add_tail(&msm_obj->mm_list, &priv->inactive_dontneed);
- mark_purgeable(msm_obj);
+ if (!msm_obj->pages) {
+ GEM_WARN_ON(msm_obj->pin_count);
+ GEM_WARN_ON(msm_obj->vmap_count);
+
+ drm_gem_lru_move_tail(&priv->lru.unbacked, obj);
+ } else if (msm_obj->pin_count || msm_obj->vmap_count) {
+ drm_gem_lru_move_tail(&priv->lru.pinned, obj);
+ } else if (msm_obj->madv == MSM_MADV_WILLNEED) {
+ drm_gem_lru_move_tail(&priv->lru.willneed, obj);
} else {
- GEM_WARN_ON((msm_obj->madv != __MSM_MADV_PURGED) && msm_obj->sgt);
- list_add_tail(&msm_obj->mm_list, &priv->inactive_unpinned);
- }
+ GEM_WARN_ON(msm_obj->madv != MSM_MADV_DONTNEED);

- mutex_unlock(&priv->mm_lock);
+ drm_gem_lru_move_tail(&priv->lru.dontneed, obj);
+ }
}

bool msm_gem_active(struct drm_gem_object *obj)
@@ -995,12 +984,6 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
list_del(&msm_obj->node);
mutex_unlock(&priv->obj_lock);

- mutex_lock(&priv->mm_lock);
- if (msm_obj->dontneed)
- mark_unpurgeable(msm_obj);
- list_del(&msm_obj->mm_list);
- mutex_unlock(&priv->mm_lock);
-
put_iova_spaces(obj, true);

if (obj->import_attach) {
@@ -1160,13 +1143,6 @@ struct drm_gem_object *msm_gem_new(struct drm_device *dev, uint32_t size, uint32

to_msm_bo(obj)->vram_node = &vma->node;

- /* Call chain get_pages() -> update_inactive() tries to
- * access msm_obj->mm_list, but it is not initialized yet.
- * To avoid NULL pointer dereference error, initialize
- * mm_list to be empty.
- */
- INIT_LIST_HEAD(&msm_obj->mm_list);
-
msm_gem_lock(obj);
pages = get_pages(obj);
msm_gem_unlock(obj);
@@ -1189,9 +1165,7 @@ struct drm_gem_object *msm_gem_new(struct drm_device *dev, uint32_t size, uint32
mapping_set_gfp_mask(obj->filp->f_mapping, GFP_HIGHUSER);
}

- mutex_lock(&priv->mm_lock);
- list_add_tail(&msm_obj->mm_list, &priv->inactive_unpinned);
- mutex_unlock(&priv->mm_lock);
+ drm_gem_lru_move_tail(&priv->lru.unbacked, obj);

mutex_lock(&priv->obj_lock);
list_add_tail(&msm_obj->node, &priv->objects);
@@ -1247,9 +1221,7 @@ struct drm_gem_object *msm_gem_import(struct drm_device *dev,

msm_gem_unlock(obj);

- mutex_lock(&priv->mm_lock);
- list_add_tail(&msm_obj->mm_list, &priv->inactive_unpinned);
- mutex_unlock(&priv->mm_lock);
+ drm_gem_lru_move_tail(&priv->lru.pinned, obj);

mutex_lock(&priv->obj_lock);
list_add_tail(&msm_obj->node, &priv->objects);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 420ba49bf21a..0403b27ff779 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -93,16 +93,6 @@ struct msm_gem_object {
*/
uint8_t madv;

- /**
- * Is object on inactive_dontneed list (ie. counted in priv->shrinkable_count)?
- */
- bool dontneed : 1;
-
- /**
- * Is object evictable (ie. counted in priv->evictable_count)?
- */
- bool evictable : 1;
-
/**
* count of active vmap'ing
*/
@@ -114,17 +104,6 @@ struct msm_gem_object {
*/
struct list_head node;

- /**
- * An object is either:
- * inactive - on priv->inactive_dontneed or priv->inactive_willneed
- * (depending on purgeability status)
- * active - on one one of the gpu's active_list.. well, at
- * least for now we don't have (I don't think) hw sync between
- * 2d and 3d one devices which have both, meaning we need to
- * block on submit if a bo is already on other ring
- */
- struct list_head mm_list;
-
struct page **pages;
struct sg_table *sgt;
void *vaddr;
@@ -206,12 +185,6 @@ msm_gem_lock(struct drm_gem_object *obj)
dma_resv_lock(obj->resv, NULL);
}

-static inline bool __must_check
-msm_gem_trylock(struct drm_gem_object *obj)
-{
- return dma_resv_trylock(obj->resv);
-}
-
static inline int
msm_gem_lock_interruptible(struct drm_gem_object *obj)
{
@@ -260,77 +233,11 @@ static inline bool is_vunmapable(struct msm_gem_object *msm_obj)
return (msm_obj->vmap_count == 0) && msm_obj->vaddr;
}

-static inline void mark_purgeable(struct msm_gem_object *msm_obj)
-{
- struct msm_drm_private *priv = msm_obj->base.dev->dev_private;
-
- GEM_WARN_ON(!mutex_is_locked(&priv->mm_lock));
-
- if (is_unpurgeable(msm_obj))
- return;
-
- if (GEM_WARN_ON(msm_obj->dontneed))
- return;
-
- priv->shrinkable_count += msm_obj->base.size >> PAGE_SHIFT;
- msm_obj->dontneed = true;
-}
-
-static inline void mark_unpurgeable(struct msm_gem_object *msm_obj)
-{
- struct msm_drm_private *priv = msm_obj->base.dev->dev_private;
-
- GEM_WARN_ON(!mutex_is_locked(&priv->mm_lock));
-
- if (is_unpurgeable(msm_obj))
- return;
-
- if (GEM_WARN_ON(!msm_obj->dontneed))
- return;
-
- priv->shrinkable_count -= msm_obj->base.size >> PAGE_SHIFT;
- GEM_WARN_ON(priv->shrinkable_count < 0);
- msm_obj->dontneed = false;
-}
-
static inline bool is_unevictable(struct msm_gem_object *msm_obj)
{
return is_unpurgeable(msm_obj) || msm_obj->vaddr;
}

-static inline void mark_evictable(struct msm_gem_object *msm_obj)
-{
- struct msm_drm_private *priv = msm_obj->base.dev->dev_private;
-
- WARN_ON(!mutex_is_locked(&priv->mm_lock));
-
- if (is_unevictable(msm_obj))
- return;
-
- if (WARN_ON(msm_obj->evictable))
- return;
-
- priv->evictable_count += msm_obj->base.size >> PAGE_SHIFT;
- msm_obj->evictable = true;
-}
-
-static inline void mark_unevictable(struct msm_gem_object *msm_obj)
-{
- struct msm_drm_private *priv = msm_obj->base.dev->dev_private;
-
- WARN_ON(!mutex_is_locked(&priv->mm_lock));
-
- if (is_unevictable(msm_obj))
- return;
-
- if (WARN_ON(!msm_obj->evictable))
- return;
-
- priv->evictable_count -= msm_obj->base.size >> PAGE_SHIFT;
- WARN_ON(priv->evictable_count < 0);
- msm_obj->evictable = false;
-}
-
void msm_gem_purge(struct drm_gem_object *obj);
void msm_gem_evict(struct drm_gem_object *obj);
void msm_gem_vunmap(struct drm_gem_object *obj);
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index ea8ed74982c1..530b1102b46d 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -29,121 +29,61 @@ msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
{
struct msm_drm_private *priv =
container_of(shrinker, struct msm_drm_private, shrinker);
- unsigned count = priv->shrinkable_count;
+ unsigned count = priv->lru.dontneed.count;

if (can_swap())
- count += priv->evictable_count;
+ count += priv->lru.willneed.count;

return count;
}

static bool
-purge(struct msm_gem_object *msm_obj)
+purge(struct drm_gem_object *obj)
{
- if (!is_purgeable(msm_obj))
+ if (!is_purgeable(to_msm_bo(obj)))
return false;

- if (msm_gem_active(&msm_obj->base))
+ if (msm_gem_active(obj))
return false;

- /*
- * This will move the obj out of still_in_list to
- * the purged list
- */
- msm_gem_purge(&msm_obj->base);
+ msm_gem_purge(obj);

return true;
}

static bool
-evict(struct msm_gem_object *msm_obj)
+evict(struct drm_gem_object *obj)
{
- if (is_unevictable(msm_obj))
+ if (is_unevictable(to_msm_bo(obj)))
return false;

- if (msm_gem_active(&msm_obj->base))
+ if (msm_gem_active(obj))
return false;

- msm_gem_evict(&msm_obj->base);
+ msm_gem_evict(obj);

return true;
}

-static unsigned long
-scan(struct msm_drm_private *priv, unsigned nr_to_scan, struct list_head *list,
- bool (*shrink)(struct msm_gem_object *msm_obj))
-{
- unsigned freed = 0;
- struct list_head still_in_list;
-
- INIT_LIST_HEAD(&still_in_list);
-
- mutex_lock(&priv->mm_lock);
-
- while (freed < nr_to_scan) {
- struct msm_gem_object *msm_obj = list_first_entry_or_null(
- list, typeof(*msm_obj), mm_list);
-
- if (!msm_obj)
- break;
-
- list_move_tail(&msm_obj->mm_list, &still_in_list);
-
- /*
- * If it is in the process of being freed, msm_gem_free_object
- * can be blocked on mm_lock waiting to remove it. So just
- * skip it.
- */
- if (!kref_get_unless_zero(&msm_obj->base.refcount))
- continue;
-
- /*
- * Now that we own a reference, we can drop mm_lock for the
- * rest of the loop body, to reduce contention with the
- * retire_submit path (which could make more objects purgeable)
- */
-
- mutex_unlock(&priv->mm_lock);
-
- /*
- * Note that this still needs to be trylock, since we can
- * hit shrinker in response to trying to get backing pages
- * for this obj (ie. while it's lock is already held)
- */
- if (!msm_gem_trylock(&msm_obj->base))
- goto tail;
-
- if (shrink(msm_obj))
- freed += msm_obj->base.size >> PAGE_SHIFT;
-
- msm_gem_unlock(&msm_obj->base);
-
-tail:
- drm_gem_object_put(&msm_obj->base);
- mutex_lock(&priv->mm_lock);
- }
-
- list_splice_tail(&still_in_list, list);
- mutex_unlock(&priv->mm_lock);
-
- return freed;
-}
-
static unsigned long
msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
{
struct msm_drm_private *priv =
container_of(shrinker, struct msm_drm_private, shrinker);
+ long nr = sc->nr_to_scan;
unsigned long freed;

- freed = scan(priv, sc->nr_to_scan, &priv->inactive_dontneed, purge);
+ freed = drm_gem_lru_scan(&priv->lru.dontneed, nr, purge);
+ nr -= freed;

if (freed > 0)
trace_msm_gem_purge(freed << PAGE_SHIFT);

- if (can_swap() && freed < sc->nr_to_scan) {
- int evicted = scan(priv, sc->nr_to_scan - freed,
- &priv->inactive_willneed, evict);
+ if (can_swap() && nr > 0) {
+ unsigned long evicted;
+
+ evicted = drm_gem_lru_scan(&priv->lru.willneed, nr, evict);
+ nr -= evicted;

if (evicted > 0)
trace_msm_gem_evict(evicted << PAGE_SHIFT);
@@ -179,12 +119,12 @@ msm_gem_shrinker_shrink(struct drm_device *dev, unsigned long nr_to_scan)
static const int vmap_shrink_limit = 15;

static bool
-vmap_shrink(struct msm_gem_object *msm_obj)
+vmap_shrink(struct drm_gem_object *obj)
{
- if (!is_vunmapable(msm_obj))
+ if (!is_vunmapable(to_msm_bo(obj)))
return false;

- msm_gem_vunmap(&msm_obj->base);
+ msm_gem_vunmap(obj);

return true;
}
@@ -194,17 +134,18 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)
{
struct msm_drm_private *priv =
container_of(nb, struct msm_drm_private, vmap_notifier);
- struct list_head *mm_lists[] = {
- &priv->inactive_dontneed,
- &priv->inactive_willneed,
- priv->gpu ? &priv->gpu->active_list : NULL,
+ struct drm_gem_lru *lrus[] = {
+ &priv->lru.dontneed,
+ &priv->lru.willneed,
+ &priv->lru.pinned,
NULL,
};
unsigned idx, unmapped = 0;

- for (idx = 0; mm_lists[idx] && unmapped < vmap_shrink_limit; idx++) {
- unmapped += scan(priv, vmap_shrink_limit - unmapped,
- mm_lists[idx], vmap_shrink);
+ for (idx = 0; lrus[idx] && unmapped < vmap_shrink_limit; idx++) {
+ unmapped += drm_gem_lru_scan(lrus[idx],
+ vmap_shrink_limit - unmapped,
+ vmap_shrink);
}

*(unsigned long *)ptr += unmapped;
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 8c00f9187c03..bdee6ea51b73 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -862,7 +862,6 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,

sched_set_fifo_low(gpu->worker->task);

- INIT_LIST_HEAD(&gpu->active_list);
mutex_init(&gpu->active_lock);
mutex_init(&gpu->lock);
init_waitqueue_head(&gpu->retire_event);
@@ -990,8 +989,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu)

DBG("%s", gpu->name);

- WARN_ON(!list_empty(&gpu->active_list));
-
for (i = 0; i < ARRAY_SIZE(gpu->rb); i++) {
msm_ringbuffer_destroy(gpu->rb[i]);
gpu->rb[i] = NULL;
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 4ca56d96344a..b837785cdb04 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -178,12 +178,6 @@ struct msm_gpu {
*/
int cur_ctx_seqno;

- /*
- * List of GEM active objects on this gpu. Protected by
- * msm_drm_private::mm_lock
- */
- struct list_head active_list;
-
/**
* lock:
*
--
2.36.1

2022-06-25 23:26:43

by Rob Clark

Subject: [PATCH 10/15] drm/msm/gem: Remove active refcnt

From: Rob Clark <[email protected]>

At this point the pinned refcnt is sufficient, and the shrinker is
already prepared to encounter objects which are still active according
to fences attached to the resv.
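
In other words, "is this object busy?" is now answered purely from the
fences on the reservation object, along the lines of (a sketch, not
necessarily the exact driver helper):

  static bool foo_gem_active(struct drm_gem_object *obj)
  {
          /* busy if any read/write fence has not yet signaled: */
          return !dma_resv_test_signaled(obj->resv,
                                         dma_resv_usage_rw(true));
  }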

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem.c | 45 ++--------------------------
drivers/gpu/drm/msm/msm_gem.h | 14 ++-------
drivers/gpu/drm/msm/msm_gem_submit.c | 22 ++------------
3 files changed, 8 insertions(+), 73 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 407b18a24dc4..209438744bab 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -734,8 +734,7 @@ int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv)
/* If the obj is inactive, we might need to move it
* between inactive lists
*/
- if (msm_obj->active_count == 0)
- update_lru(obj);
+ update_lru(obj);

msm_gem_unlock(obj);

@@ -788,7 +787,6 @@ void msm_gem_evict(struct drm_gem_object *obj)
GEM_WARN_ON(!msm_gem_is_locked(obj));
GEM_WARN_ON(is_unevictable(msm_obj));
GEM_WARN_ON(!msm_obj->evictable);
- GEM_WARN_ON(msm_obj->active_count);

/* Get rid of any iommu mapping(s): */
put_iova_spaces(obj, false);
@@ -813,37 +811,6 @@ void msm_gem_vunmap(struct drm_gem_object *obj)
msm_obj->vaddr = NULL;
}

-void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu)
-{
- struct msm_gem_object *msm_obj = to_msm_bo(obj);
- struct msm_drm_private *priv = obj->dev->dev_private;
-
- might_sleep();
- GEM_WARN_ON(!msm_gem_is_locked(obj));
- GEM_WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED);
- GEM_WARN_ON(msm_obj->dontneed);
-
- if (msm_obj->active_count++ == 0) {
- mutex_lock(&priv->mm_lock);
- if (msm_obj->evictable)
- mark_unevictable(msm_obj);
- list_move_tail(&msm_obj->mm_list, &gpu->active_list);
- mutex_unlock(&priv->mm_lock);
- }
-}
-
-void msm_gem_active_put(struct drm_gem_object *obj)
-{
- struct msm_gem_object *msm_obj = to_msm_bo(obj);
-
- might_sleep();
- GEM_WARN_ON(!msm_gem_is_locked(obj));
-
- if (--msm_obj->active_count == 0) {
- update_lru(obj);
- }
-}
-
static void update_lru(struct drm_gem_object *obj)
{
struct msm_drm_private *priv = obj->dev->dev_private;
@@ -851,9 +818,6 @@ static void update_lru(struct drm_gem_object *obj)

GEM_WARN_ON(!msm_gem_is_locked(&msm_obj->base));

- if (msm_obj->active_count != 0)
- return;
-
mutex_lock(&priv->mm_lock);

if (msm_obj->dontneed)
@@ -926,7 +890,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
stats->all.count++;
stats->all.size += obj->size;

- if (is_active(msm_obj)) {
+ if (msm_gem_active(obj)) {
stats->active.count++;
stats->active.size += obj->size;
}
@@ -954,7 +918,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m,
}

seq_printf(m, "%08x: %c %2d (%2d) %08llx %p",
- msm_obj->flags, is_active(msm_obj) ? 'A' : 'I',
+ msm_obj->flags, msm_gem_active(obj) ? 'A' : 'I',
obj->name, kref_read(&obj->refcount),
off, msm_obj->vaddr);

@@ -1037,9 +1001,6 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
list_del(&msm_obj->mm_list);
mutex_unlock(&priv->mm_lock);

- /* object should not be on active list: */
- GEM_WARN_ON(is_active(msm_obj));
-
put_iova_spaces(obj, true);

if (obj->import_attach) {
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 6fe521ccda45..420ba49bf21a 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -138,7 +138,6 @@ struct msm_gem_object {

char name[32]; /* Identifier to print for the debugfs files */

- int active_count;
int pin_count;
};
#define to_msm_bo(x) container_of(x, struct msm_gem_object, base)
@@ -171,8 +170,6 @@ void *msm_gem_get_vaddr_active(struct drm_gem_object *obj);
void msm_gem_put_vaddr_locked(struct drm_gem_object *obj);
void msm_gem_put_vaddr(struct drm_gem_object *obj);
int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv);
-void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu);
-void msm_gem_active_put(struct drm_gem_object *obj);
bool msm_gem_active(struct drm_gem_object *obj);
int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout);
int msm_gem_cpu_fini(struct drm_gem_object *obj);
@@ -245,12 +242,6 @@ msm_gem_is_locked(struct drm_gem_object *obj)
return dma_resv_is_locked(obj->resv) || (kref_read(&obj->refcount) == 0);
}

-static inline bool is_active(struct msm_gem_object *msm_obj)
-{
- GEM_WARN_ON(!msm_gem_is_locked(&msm_obj->base));
- return msm_obj->active_count;
-}
-
/* imported/exported objects are not purgeable: */
static inline bool is_unpurgeable(struct msm_gem_object *msm_obj)
{
@@ -391,9 +382,8 @@ struct msm_gem_submit {
/* make sure these don't conflict w/ MSM_SUBMIT_BO_x */
#define BO_VALID 0x8000 /* is current addr in cmdstream correct/valid? */
#define BO_LOCKED 0x4000 /* obj lock is held */
-#define BO_ACTIVE 0x2000 /* active refcnt is held */
-#define BO_OBJ_PINNED 0x1000 /* obj (pages) is pinned and on active list */
-#define BO_VMA_PINNED 0x0800 /* vma (virtual address) is pinned */
+#define BO_OBJ_PINNED 0x2000 /* obj (pages) is pinned and on active list */
+#define BO_VMA_PINNED 0x1000 /* vma (virtual address) is pinned */
uint32_t flags;
union {
struct msm_gem_object *obj;
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 16c662808522..adf358fb8e9d 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -243,17 +243,13 @@ static void submit_cleanup_bo(struct msm_gem_submit *submit, int i,
if (flags & BO_OBJ_PINNED)
msm_gem_unpin_locked(obj);

- if (flags & BO_ACTIVE)
- msm_gem_active_put(obj);
-
if (flags & BO_LOCKED)
dma_resv_unlock(obj->resv);
}

static void submit_unlock_unpin_bo(struct msm_gem_submit *submit, int i)
{
- unsigned cleanup_flags = BO_VMA_PINNED | BO_OBJ_PINNED |
- BO_ACTIVE | BO_LOCKED;
+ unsigned cleanup_flags = BO_VMA_PINNED | BO_OBJ_PINNED | BO_LOCKED;
submit_cleanup_bo(submit, i, cleanup_flags);

if (!(submit->bos[i].flags & BO_VALID))
@@ -358,18 +354,6 @@ static int submit_pin_objects(struct msm_gem_submit *submit)

submit->valid = true;

- /*
- * Increment active_count first, so if under memory pressure, we
- * don't inadvertently evict a bo needed by the submit in order
- * to pin an earlier bo in the same submit.
- */
- for (i = 0; i < submit->nr_bos; i++) {
- struct drm_gem_object *obj = &submit->bos[i].obj->base;
-
- msm_gem_active_get(obj, submit->gpu);
- submit->bos[i].flags |= BO_ACTIVE;
- }
-
for (i = 0; i < submit->nr_bos; i++) {
struct drm_gem_object *obj = &submit->bos[i].obj->base;
struct msm_gem_vma *vma;
@@ -521,7 +505,7 @@ static void submit_cleanup(struct msm_gem_submit *submit, bool error)
unsigned i;

if (error)
- cleanup_flags |= BO_VMA_PINNED | BO_OBJ_PINNED | BO_ACTIVE;
+ cleanup_flags |= BO_VMA_PINNED | BO_OBJ_PINNED;

for (i = 0; i < submit->nr_bos; i++) {
struct msm_gem_object *msm_obj = submit->bos[i].obj;
@@ -540,7 +524,7 @@ void msm_submit_retire(struct msm_gem_submit *submit)

msm_gem_lock(obj);
/* Note, VMA already fence-unpinned before submit: */
- submit_cleanup_bo(submit, i, BO_OBJ_PINNED | BO_ACTIVE);
+ submit_cleanup_bo(submit, i, BO_OBJ_PINNED);
msm_gem_unlock(obj);
drm_gem_object_put(obj);
}
--
2.36.1

2022-06-25 23:27:41

by Rob Clark

Subject: [PATCH 09/15] drm/msm/gem: Consolidate pin/unpin paths

From: Rob Clark <[email protected]>

Avoid having multiple spots where we increment/decrement pin_count
(and do the associated LRU updating).

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 3da64c7f65a2..407b18a24dc4 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -190,7 +190,7 @@ static struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj)

p = get_pages(obj);
if (!IS_ERR(p)) {
- msm_obj->pin_count++;
+ to_msm_bo(obj)->pin_count++;
update_lru(obj);
}

@@ -213,9 +213,7 @@ void msm_gem_unpin_pages(struct drm_gem_object *obj)
struct msm_gem_object *msm_obj = to_msm_bo(obj);

msm_gem_lock(obj);
- msm_obj->pin_count--;
- GEM_WARN_ON(msm_obj->pin_count < 0);
- update_lru(obj);
+ msm_gem_unpin_locked(obj);
msm_gem_unlock(obj);
}

@@ -436,14 +434,13 @@ int msm_gem_pin_vma_locked(struct drm_gem_object *obj, struct msm_gem_vma *vma)
if (GEM_WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED))
return -EBUSY;

- pages = get_pages(obj);
+ pages = msm_gem_pin_pages_locked(obj);
if (IS_ERR(pages))
return PTR_ERR(pages);

ret = msm_gem_map_vma(vma->aspace, vma, prot, msm_obj->sgt, obj->size);
-
- if (!ret)
- msm_obj->pin_count++;
+ if (ret)
+ msm_gem_unpin_locked(obj);

return ret;
}
--
2.36.1

2022-06-25 23:27:43

by Rob Clark

Subject: [PATCH 13/15] drm/msm/gem: Unpin buffers earlier

From: Rob Clark <[email protected]>

We've already attached the fences, so obj->resv (which the shrinker
checks) tells us whether the buffers are still active. So we can unpin
sooner, before we drop the queue lock.

This also avoids the need to grab the obj lock in the retire path,
avoiding potential for lock contention between submit and retire.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/msm_gem_submit.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index adf358fb8e9d..5599d93ec0d2 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -501,11 +501,11 @@ static int submit_reloc(struct msm_gem_submit *submit, struct msm_gem_object *ob
*/
static void submit_cleanup(struct msm_gem_submit *submit, bool error)
{
- unsigned cleanup_flags = BO_LOCKED;
+ unsigned cleanup_flags = BO_LOCKED | BO_OBJ_PINNED;
unsigned i;

if (error)
- cleanup_flags |= BO_VMA_PINNED | BO_OBJ_PINNED;
+ cleanup_flags |= BO_VMA_PINNED;

for (i = 0; i < submit->nr_bos; i++) {
struct msm_gem_object *msm_obj = submit->bos[i].obj;
@@ -522,10 +522,6 @@ void msm_submit_retire(struct msm_gem_submit *submit)
for (i = 0; i < submit->nr_bos; i++) {
struct drm_gem_object *obj = &submit->bos[i].obj->base;

- msm_gem_lock(obj);
- /* Note, VMA already fence-unpinned before submit: */
- submit_cleanup_bo(submit, i, BO_OBJ_PINNED);
- msm_gem_unlock(obj);
drm_gem_object_put(obj);
}
}
--
2.36.1