2023-09-29 19:37:52

by Adrián Larumbe

Subject: [PATCH v8 0/5] Add fdinfo support to Panfrost

This patch series adds fdinfo support to the Panfrost DRM driver. It will
display a series of key:value pairs under /proc/pid/fdinfo/fd for render
processes that open the Panfrost DRM file.

The pairs contain basic drm gpu engine and memory region information that
can either be read with cat by a privileged user or accessed with IGT's gputop
utility.
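
For reference, a minimal user-space sketch of reading those pairs for an
already-open Panfrost DRM file descriptor (only the /proc/pid/fdinfo/fd path
comes from this series; the helper below is purely illustrative):

#include <stdio.h>
#include <unistd.h>

/* Dump every fdinfo key:value pair exposed for the given DRM fd. */
static void dump_drm_fdinfo(int drm_fd)
{
        char path[64];
        char line[256];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%d/fdinfo/%d",
                 (int)getpid(), drm_fd);
        f = fopen(path, "r");
        if (!f)
                return;

        while (fgets(line, sizeof(line), f))
                fputs(line, stdout);    /* e.g. "drm-engine-fragment: ... ns" */

        fclose(f);
}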

Changelog:

v1: https://lore.kernel.org/lkml/[email protected]/T/

v2: https://lore.kernel.org/lkml/[email protected]/T/
- Changed the way gpu cycles and engine time are calculated, using GPU
registers and taking into account potential resets.
- Split render engine values into fragment and vertex/tiler ones.
- Added more fine-grained calculation of RSS size for BOs.
- Implemented selection of drm-memory region size units.
- Removed locking of shrinker's mutex in GEM obj status function.

v3: https://lore.kernel.org/lkml/[email protected]/
- Changed fdinfo engine names to something more descriptive.
- Mentioned GPU cycle counts aren't an exact measure.
- Handled the case when job->priv might be NULL.
- Handled 32 bit overflow of cycle register.
- Kept fdinfo drm memory stats size unit display within 10k times the
previous multiplier for more accurate BO size numbers.
- Removed special handling of Prime imported BO RSS.
- Use rss_size only for heap objects.
- Use bo->base.madv instead of specific purgeable flag.
- Fixed kernel test robot warnings.

v4: https://lore.kernel.org/lkml/[email protected]/
- Move cycle counter get and put to panfrost_job_hw_submit and
panfrost_job_handle_{err,done} for more accuracy.
- Make sure cycle counter refs are released in reset path
- Drop the module param for toggling cycle counting and leave it to the
debugfs file.
- Don't disable the cycle counter when toggling the debugfs file; let the
refcounting logic handle it instead.
- Remove the fdinfo data nested structure definition and 'names' field.
- When incrementing BO RSS size in the GPU MMU page fault IRQ handler, assume
a granularity of 2 MiB for every successful mapping.
- drm-file picks an fdinfo memory object size unit that doesn't lose precision.

v5: https://lore.kernel.org/lkml/[email protected]/
- Removed explicit initialisation of atomic variable for profiling mode,
as it's allocated with kzalloc.
- Pass the engine utilisation structure to jobs rather than the file context,
to avoid future misuse of the latter.
- Remove double reading of the cycle counter register and ktime in the job
dequeue function, as the scheduler will make sure these values are read again
in case of requeueing.
- Moved putting of the cycle counting refcount into the panfrost job dequeue
function to avoid repetition.

v6: https://lore.kernel.org/lkml/[email protected]/T/
- Fix swapped engine time and cycle values in fdinfo drm print statements.

v7: https://lore.kernel.org/lkml/[email protected]/T/
- Make sure an object's actual RSS size is added to fdinfo's overall purgeable
and active size tallies when it's both resident and purgeable or active.
- Create a drm/panfrost.rst documentation file with the meaning of fdinfo strings.
- Add a BUILD_BUG_ON check of the engine name array size for fdinfo.
- Added copyright notices for Amazon in Panfrost's new debugfs files.
- Discarded fdinfo memory stats unit size selection patch.

v8:
- Style improvements and addressing nitpicks.

Adrián Larumbe (5):
drm/panfrost: Add cycle count GPU register definitions
drm/panfrost: Add fdinfo support GPU load metrics
drm/panfrost: Add fdinfo support for memory stats
drm/drm_file: Add DRM obj's RSS reporting function for fdinfo
drm/panfrost: Implement generic DRM object RSS reporting function

Documentation/gpu/drm-usage-stats.rst | 1 +
Documentation/gpu/panfrost.rst | 38 +++++++++++++
drivers/gpu/drm/drm_file.c | 8 +--
drivers/gpu/drm/panfrost/Makefile | 2 +
drivers/gpu/drm/panfrost/panfrost_debugfs.c | 21 ++++++++
drivers/gpu/drm/panfrost/panfrost_debugfs.h | 14 +++++
drivers/gpu/drm/panfrost/panfrost_devfreq.c | 8 +++
drivers/gpu/drm/panfrost/panfrost_devfreq.h | 3 ++
drivers/gpu/drm/panfrost/panfrost_device.c | 2 +
drivers/gpu/drm/panfrost/panfrost_device.h | 13 +++++
drivers/gpu/drm/panfrost/panfrost_drv.c | 60 ++++++++++++++++++++-
drivers/gpu/drm/panfrost/panfrost_gem.c | 30 +++++++++++
drivers/gpu/drm/panfrost/panfrost_gem.h | 5 ++
drivers/gpu/drm/panfrost/panfrost_gpu.c | 41 ++++++++++++++
drivers/gpu/drm/panfrost/panfrost_gpu.h | 4 ++
drivers/gpu/drm/panfrost/panfrost_job.c | 24 +++++++++
drivers/gpu/drm/panfrost/panfrost_job.h | 5 ++
drivers/gpu/drm/panfrost/panfrost_mmu.c | 1 +
drivers/gpu/drm/panfrost/panfrost_regs.h | 5 ++
include/drm/drm_gem.h | 9 ++++
20 files changed, 290 insertions(+), 4 deletions(-)
create mode 100644 Documentation/gpu/panfrost.rst
create mode 100644 drivers/gpu/drm/panfrost/panfrost_debugfs.c
create mode 100644 drivers/gpu/drm/panfrost/panfrost_debugfs.h


base-commit: f45acf7acf75921c0409d452f0165f51a19a74fd
--
2.42.0


2023-09-29 22:32:19

by Adrián Larumbe

Subject: [PATCH v8 4/5] drm/drm_file: Add DRM obj's RSS reporting function for fdinfo

Some BOs might be mapped onto physical memory chunkwise and on demand,
like Panfrost's tiler heap. In this case, even though the
drm_gem_shmem_object page array might already be allocated, only a very
small fraction of the BO is currently backed by system memory, but
drm_show_memory_stats will then proceed to add its entire virtual size to
the file's total resident size regardless.

This led to very unrealistic RSS sizes being reckoned for Panfrost, where
said tiler heap buffer is initially allocated with a virtual size of 128
MiB, but only a small part of it will eventually be backed by system memory
after successive GPU page faults.

Provide a new generic DRM object function that allows drivers to return more
accurate RSS and purgeable sizes for their BOs.

Signed-off-by: Adrián Larumbe <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
---
drivers/gpu/drm/drm_file.c | 8 +++++---
include/drm/drm_gem.h | 9 +++++++++
2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 883d83bc0e3d..9a1bd8d0d785 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -930,6 +930,8 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
spin_lock(&file->table_lock);
idr_for_each_entry (&file->object_idr, obj, id) {
enum drm_gem_object_status s = 0;
+ size_t add_size = (obj->funcs && obj->funcs->rss) ?
+ obj->funcs->rss(obj) : obj->size;

if (obj->funcs && obj->funcs->status) {
s = obj->funcs->status(obj);
@@ -944,7 +946,7 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
}

if (s & DRM_GEM_OBJECT_RESIDENT) {
- status.resident += obj->size;
+ status.resident += add_size;
} else {
/* If already purged or not yet backed by pages, don't
* count it as purgeable:
@@ -953,14 +955,14 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
}

if (!dma_resv_test_signaled(obj->resv, dma_resv_usage_rw(true))) {
- status.active += obj->size;
+ status.active += add_size;

/* If still active, don't count as purgeable: */
s &= ~DRM_GEM_OBJECT_PURGEABLE;
}

if (s & DRM_GEM_OBJECT_PURGEABLE)
- status.purgeable += obj->size;
+ status.purgeable += add_size;
}
spin_unlock(&file->table_lock);

diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index bc9f6aa2f3fe..16364487fde9 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -208,6 +208,15 @@ struct drm_gem_object_funcs {
*/
enum drm_gem_object_status (*status)(struct drm_gem_object *obj);

+ /**
+ * @rss:
+ *
+ * Return resident size of the object in physical memory.
+ *
+ * Called by drm_show_memory_stats().
+ */
+ size_t (*rss)(struct drm_gem_object *obj);
+
/**
* @vm_ops:
*
--
2.42.0

2023-09-29 23:32:21

by Adrián Larumbe

Subject: [PATCH v8 2/5] drm/panfrost: Add fdinfo support GPU load metrics

The drm-stats fdinfo tags made available to user space are drm-engine,
drm-cycles, drm-maxfreq and drm-curfreq, one per job slot.

This deviates from standard practice in other DRM drivers, where a single
set of key:value pairs is provided for the whole render engine. However,
Panfrost has separate queues for fragment and vertex/tiler jobs, so a
decision was made to calculate bus cycles and workload times separately.

Maximum operating frequency is calculated at devfreq initialisation time.
Current frequency is made available to user space because nvtop uses it
when performing engine usage calculations.
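
As a rough illustration of how a tool like nvtop or gputop could combine these
values, here is a hedged sketch; the struct and helpers are hypothetical and
not part of this series. Utilisation can be derived either from the elapsed
engine time or from the cycle delta scaled by the current frequency:

#include <stdint.h>

/* One parsed fdinfo sample for a single engine (illustrative type). */
struct engine_sample {
        uint64_t engine_ns;     /* drm-engine-<name>, in ns */
        uint64_t cycles;        /* drm-cycles-<name> */
        uint64_t curfreq_hz;    /* drm-curfreq-<name>, in Hz */
        uint64_t wall_ns;       /* CLOCK_MONOTONIC at sampling time */
};

/* Busy percentage from elapsed engine time over wall-clock time. */
static double busy_pct_time(const struct engine_sample *a,
                            const struct engine_sample *b)
{
        return 100.0 * (double)(b->engine_ns - a->engine_ns) /
               (double)(b->wall_ns - a->wall_ns);
}

/* Alternative estimate from cycles, scaled by the current frequency. */
static double busy_pct_cycles(const struct engine_sample *a,
                              const struct engine_sample *b)
{
        double secs = (double)(b->wall_ns - a->wall_ns) / 1e9;

        return 100.0 * (double)(b->cycles - a->cycles) /
               ((double)b->curfreq_hz * secs);
}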

It is important to bear in mind that both the GPU cycle and kernel time
numbers provided are at best rough estimations, and are always reported in
excess of the actual figure, for two reasons:
- Excess time due to the delay between the end of a job's processing, the
subsequent job IRQ and the actual time of the sample.
- Time spent in the engine queue waiting for the GPU to pick up the next
job.

To avoid race conditions during enablement/disabling, a reference counting
mechanism was introduced, and a job flag that tells us whether a given job
increased the refcount. This is necessary, because user space can toggle
cycle counting through a debugfs file, and a given job might have been in
flight by the time cycle counting was disabled.

The main goal of the debugfs cycle counter knob is letting tools like nvtop
or IGT's gputop toggle it at any time, to avoid wasting power when no engine
usage measurements are needed.
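
A hedged example of flipping that knob from user space before sampling fdinfo;
the "dri/0" minor number is an assumption for illustration, while the file name
matches the debugfs entry added by this patch:

#include <fcntl.h>
#include <unistd.h>

/* Toggle Panfrost cycle counting through the debugfs "profile" knob. */
static int panfrost_profile_set(int enable)
{
        int fd = open("/sys/kernel/debug/dri/0/profile", O_WRONLY);
        ssize_t ret;

        if (fd < 0)
                return -1;

        ret = write(fd, enable ? "1" : "0", 1);
        close(fd);

        return ret == 1 ? 0 : -1;
}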

Also add a documentation file explaining the possible values for fdinfo's
engine keystrings and Panfrost-specific drm-curfreq-<keystr> pairs.

Signed-off-by: Adrián Larumbe <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
---
Documentation/gpu/drm-usage-stats.rst | 1 +
Documentation/gpu/panfrost.rst | 38 ++++++++++++++
drivers/gpu/drm/panfrost/Makefile | 2 +
drivers/gpu/drm/panfrost/panfrost_debugfs.c | 21 ++++++++
drivers/gpu/drm/panfrost/panfrost_debugfs.h | 14 +++++
drivers/gpu/drm/panfrost/panfrost_devfreq.c | 8 +++
drivers/gpu/drm/panfrost/panfrost_devfreq.h | 3 ++
drivers/gpu/drm/panfrost/panfrost_device.c | 2 +
drivers/gpu/drm/panfrost/panfrost_device.h | 13 +++++
drivers/gpu/drm/panfrost/panfrost_drv.c | 58 ++++++++++++++++++++-
drivers/gpu/drm/panfrost/panfrost_gpu.c | 41 +++++++++++++++
drivers/gpu/drm/panfrost/panfrost_gpu.h | 4 ++
drivers/gpu/drm/panfrost/panfrost_job.c | 24 +++++++++
drivers/gpu/drm/panfrost/panfrost_job.h | 5 ++
14 files changed, 233 insertions(+), 1 deletion(-)
create mode 100644 Documentation/gpu/panfrost.rst
create mode 100644 drivers/gpu/drm/panfrost/panfrost_debugfs.c
create mode 100644 drivers/gpu/drm/panfrost/panfrost_debugfs.h

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index fe35a291ff3e..8d963cd7c1b7 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -169,3 +169,4 @@ Driver specific implementations
-------------------------------

:ref:`i915-usage-stats`
+:ref:`panfrost-usage-stats`
diff --git a/Documentation/gpu/panfrost.rst b/Documentation/gpu/panfrost.rst
new file mode 100644
index 000000000000..ecc48ba5ac11
--- /dev/null
+++ b/Documentation/gpu/panfrost.rst
@@ -0,0 +1,38 @@
+===========================
+ drm/Panfrost Mali Driver
+===========================
+
+.. _panfrost-usage-stats:
+
+Panfrost DRM client usage stats implementation
+==========================================
+
+The drm/Panfrost driver implements the DRM client usage stats specification as
+documented in :ref:`drm-client-usage-stats`.
+
+Example of the output showing the implemented key value pairs and entirety of
+the currently possible format options:
+
+::
+ pos: 0
+ flags: 02400002
+ mnt_id: 27
+ ino: 531
+ drm-driver: panfrost
+ drm-client-id: 14
+ drm-engine-fragment: 1846584880 ns
+ drm-cycles-fragment: 1424359409
+ drm-maxfreq-fragment: 799999987 Hz
+ drm-curfreq-fragment: 799999987 Hz
+ drm-engine-vertex-tiler: 71932239 ns
+ drm-cycles-vertex-tiler: 52617357
+ drm-maxfreq-vertex-tiler: 799999987 Hz
+ drm-curfreq-vertex-tiler: 799999987 Hz
+ drm-total-memory: 290 MiB
+ drm-shared-memory: 0 MiB
+ drm-active-memory: 226 MiB
+ drm-resident-memory: 36496 KiB
+ drm-purgeable-memory: 128 KiB
+
+Possible `drm-engine-` key names are: `fragment`, and `vertex-tiler`.
+`drm-curfreq-` values convey the current operating frequency for that engine.
diff --git a/drivers/gpu/drm/panfrost/Makefile b/drivers/gpu/drm/panfrost/Makefile
index 7da2b3f02ed9..2c01c1e7523e 100644
--- a/drivers/gpu/drm/panfrost/Makefile
+++ b/drivers/gpu/drm/panfrost/Makefile
@@ -12,4 +12,6 @@ panfrost-y := \
panfrost_perfcnt.o \
panfrost_dump.o

+panfrost-$(CONFIG_DEBUG_FS) += panfrost_debugfs.o
+
obj-$(CONFIG_DRM_PANFROST) += panfrost.o
diff --git a/drivers/gpu/drm/panfrost/panfrost_debugfs.c b/drivers/gpu/drm/panfrost/panfrost_debugfs.c
new file mode 100644
index 000000000000..72d4286a6bf7
--- /dev/null
+++ b/drivers/gpu/drm/panfrost/panfrost_debugfs.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright 2023 Collabora ltd. */
+/* Copyright 2023 Amazon.com, Inc. or its affiliates. */
+
+#include <linux/debugfs.h>
+#include <linux/platform_device.h>
+#include <drm/drm_debugfs.h>
+#include <drm/drm_file.h>
+#include <drm/panfrost_drm.h>
+
+#include "panfrost_device.h"
+#include "panfrost_gpu.h"
+#include "panfrost_debugfs.h"
+
+void panfrost_debugfs_init(struct drm_minor *minor)
+{
+ struct drm_device *dev = minor->dev;
+ struct panfrost_device *pfdev = platform_get_drvdata(to_platform_device(dev->dev));
+
+ debugfs_create_atomic_t("profile", 0600, minor->debugfs_root, &pfdev->profile_mode);
+}
diff --git a/drivers/gpu/drm/panfrost/panfrost_debugfs.h b/drivers/gpu/drm/panfrost/panfrost_debugfs.h
new file mode 100644
index 000000000000..c5af5f35877f
--- /dev/null
+++ b/drivers/gpu/drm/panfrost/panfrost_debugfs.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2023 Collabora ltd.
+ * Copyright 2023 Amazon.com, Inc. or its affiliates.
+ */
+
+#ifndef PANFROST_DEBUGFS_H
+#define PANFROST_DEBUGFS_H
+
+#ifdef CONFIG_DEBUG_FS
+void panfrost_debugfs_init(struct drm_minor *minor);
+#endif
+
+#endif /* PANFROST_DEBUGFS_H */
diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
index 58dfb15a8757..28caffc689e2 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
@@ -58,6 +58,7 @@ static int panfrost_devfreq_get_dev_status(struct device *dev,
spin_lock_irqsave(&pfdevfreq->lock, irqflags);

panfrost_devfreq_update_utilization(pfdevfreq);
+ pfdevfreq->current_frequency = status->current_frequency;

status->total_time = ktime_to_ns(ktime_add(pfdevfreq->busy_time,
pfdevfreq->idle_time));
@@ -117,6 +118,7 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
struct devfreq *devfreq;
struct thermal_cooling_device *cooling;
struct panfrost_devfreq *pfdevfreq = &pfdev->pfdevfreq;
+ unsigned long freq = ULONG_MAX;

if (pfdev->comp->num_supplies > 1) {
/*
@@ -172,6 +174,12 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
return ret;
}

+ /* Find the fastest defined rate */
+ opp = dev_pm_opp_find_freq_floor(dev, &freq);
+ if (IS_ERR(opp))
+ return PTR_ERR(opp);
+ pfdevfreq->fast_rate = freq;
+
dev_pm_opp_put(opp);

/*
diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.h b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
index 1514c1f9d91c..48dbe185f206 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.h
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
@@ -19,6 +19,9 @@ struct panfrost_devfreq {
struct devfreq_simple_ondemand_data gov_data;
bool opp_of_table_added;

+ unsigned long current_frequency;
+ unsigned long fast_rate;
+
ktime_t busy_time;
ktime_t idle_time;
ktime_t time_last_update;
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
index fa1a086a862b..28f7046e1b1a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -207,6 +207,8 @@ int panfrost_device_init(struct panfrost_device *pfdev)

spin_lock_init(&pfdev->as_lock);

+ spin_lock_init(&pfdev->cycle_counter.lock);
+
err = panfrost_clk_init(pfdev);
if (err) {
dev_err(pfdev->dev, "clk init failed %d\n", err);
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index b0126b9fbadc..1e85656dc2f7 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -107,6 +107,7 @@ struct panfrost_device {
struct list_head scheduled_jobs;

struct panfrost_perfcnt *perfcnt;
+ atomic_t profile_mode;

struct mutex sched_lock;

@@ -121,6 +122,11 @@ struct panfrost_device {
struct shrinker shrinker;

struct panfrost_devfreq pfdevfreq;
+
+ struct {
+ atomic_t use_count;
+ spinlock_t lock;
+ } cycle_counter;
};

struct panfrost_mmu {
@@ -135,12 +141,19 @@ struct panfrost_mmu {
struct list_head list;
};

+struct panfrost_engine_usage {
+ unsigned long long elapsed_ns[NUM_JOB_SLOTS];
+ unsigned long long cycles[NUM_JOB_SLOTS];
+};
+
struct panfrost_file_priv {
struct panfrost_device *pfdev;

struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];

struct panfrost_mmu *mmu;
+
+ struct panfrost_engine_usage engine_usage;
};

static inline struct panfrost_device *to_panfrost_device(struct drm_device *ddev)
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index a2ab99698ca8..97e5bc4a82c8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -20,6 +20,7 @@
#include "panfrost_job.h"
#include "panfrost_gpu.h"
#include "panfrost_perfcnt.h"
+#include "panfrost_debugfs.h"

static bool unstable_ioctls;
module_param_unsafe(unstable_ioctls, bool, 0600);
@@ -267,6 +268,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
job->requirements = args->requirements;
job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
job->mmu = file_priv->mmu;
+ job->engine_usage = &file_priv->engine_usage;

slot = panfrost_job_get_slot(job);

@@ -523,7 +525,56 @@ static const struct drm_ioctl_desc panfrost_drm_driver_ioctls[] = {
PANFROST_IOCTL(MADVISE, madvise, DRM_RENDER_ALLOW),
};

-DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
+static void panfrost_gpu_show_fdinfo(struct panfrost_device *pfdev,
+ struct panfrost_file_priv *panfrost_priv,
+ struct drm_printer *p)
+{
+ int i;
+
+ /*
+ * IMPORTANT NOTE: drm-cycles and drm-engine measurements are not
+ * accurate, as they only provide a rough estimation of the number of
+ * GPU cycles and CPU time spent in a given context. This is due to two
+ * different factors:
+ * - Firstly, we must consider the time the CPU and then the kernel
+ * takes to process the GPU interrupt, which means additional time and
+ * GPU cycles will be added in excess to the real figure.
+ * - Secondly, the pipelining done by the Job Manager (2 job slots per
+ * engine) implies there is no way to know exactly how much time each
+ * job spent on the GPU.
+ */
+
+ static const char * const engine_names[] = {
+ "fragment", "vertex-tiler", "compute-only"
+ };
+
+ BUILD_BUG_ON(ARRAY_SIZE(engine_names) != NUM_JOB_SLOTS);
+
+ for (i = 0; i < NUM_JOB_SLOTS - 1; i++) {
+ drm_printf(p, "drm-engine-%s:\t%llu ns\n",
+ engine_names[i], panfrost_priv->engine_usage.elapsed_ns[i]);
+ drm_printf(p, "drm-cycles-%s:\t%llu\n",
+ engine_names[i], panfrost_priv->engine_usage.cycles[i]);
+ drm_printf(p, "drm-maxfreq-%s:\t%lu Hz\n",
+ engine_names[i], pfdev->pfdevfreq.fast_rate);
+ drm_printf(p, "drm-curfreq-%s:\t%lu Hz\n",
+ engine_names[i], pfdev->pfdevfreq.current_frequency);
+ }
+}
+
+static void panfrost_show_fdinfo(struct drm_printer *p, struct drm_file *file)
+{
+ struct drm_device *dev = file->minor->dev;
+ struct panfrost_device *pfdev = dev->dev_private;
+
+ panfrost_gpu_show_fdinfo(pfdev, file->driver_priv, p);
+}
+
+static const struct file_operations panfrost_drm_driver_fops = {
+ .owner = THIS_MODULE,
+ DRM_GEM_FOPS,
+ .show_fdinfo = drm_show_fdinfo,
+};

/*
* Panfrost driver version:
@@ -535,6 +586,7 @@ static const struct drm_driver panfrost_drm_driver = {
.driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ,
.open = panfrost_open,
.postclose = panfrost_postclose,
+ .show_fdinfo = panfrost_show_fdinfo,
.ioctls = panfrost_drm_driver_ioctls,
.num_ioctls = ARRAY_SIZE(panfrost_drm_driver_ioctls),
.fops = &panfrost_drm_driver_fops,
@@ -546,6 +598,10 @@ static const struct drm_driver panfrost_drm_driver = {

.gem_create_object = panfrost_gem_create_object,
.gem_prime_import_sg_table = panfrost_gem_prime_import_sg_table,
+
+#ifdef CONFIG_DEBUG_FS
+ .debugfs_init = panfrost_debugfs_init,
+#endif
};

static int panfrost_probe(struct platform_device *pdev)
diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
index 2faa344d89ee..f0be7e19b13e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
@@ -73,6 +73,13 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);

+ /*
+ * All in-flight jobs should have released their cycle
+ * counter references upon reset, but let us make sure
+ */
+ if (drm_WARN_ON(pfdev->ddev, atomic_read(&pfdev->cycle_counter.use_count) != 0))
+ atomic_set(&pfdev->cycle_counter.use_count, 0);
+
return 0;
}

@@ -321,6 +328,40 @@ static void panfrost_gpu_init_features(struct panfrost_device *pfdev)
pfdev->features.shader_present, pfdev->features.l2_present);
}

+void panfrost_cycle_counter_get(struct panfrost_device *pfdev)
+{
+ if (atomic_inc_not_zero(&pfdev->cycle_counter.use_count))
+ return;
+
+ spin_lock(&pfdev->cycle_counter.lock);
+ if (atomic_inc_return(&pfdev->cycle_counter.use_count) == 1)
+ gpu_write(pfdev, GPU_CMD, GPU_CMD_CYCLE_COUNT_START);
+ spin_unlock(&pfdev->cycle_counter.lock);
+}
+
+void panfrost_cycle_counter_put(struct panfrost_device *pfdev)
+{
+ if (atomic_add_unless(&pfdev->cycle_counter.use_count, -1, 1))
+ return;
+
+ spin_lock(&pfdev->cycle_counter.lock);
+ if (atomic_dec_return(&pfdev->cycle_counter.use_count) == 0)
+ gpu_write(pfdev, GPU_CMD, GPU_CMD_CYCLE_COUNT_STOP);
+ spin_unlock(&pfdev->cycle_counter.lock);
+}
+
+unsigned long long panfrost_cycle_counter_read(struct panfrost_device *pfdev)
+{
+ u32 hi, lo;
+
+ do {
+ hi = gpu_read(pfdev, GPU_CYCLE_COUNT_HI);
+ lo = gpu_read(pfdev, GPU_CYCLE_COUNT_LO);
+ } while (hi != gpu_read(pfdev, GPU_CYCLE_COUNT_HI));
+
+ return ((u64)hi << 32) | lo;
+}
+
void panfrost_gpu_power_on(struct panfrost_device *pfdev)
{
int ret;
diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.h b/drivers/gpu/drm/panfrost/panfrost_gpu.h
index 468c51e7e46d..876fdad9f721 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.h
@@ -16,6 +16,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev);
void panfrost_gpu_power_on(struct panfrost_device *pfdev);
void panfrost_gpu_power_off(struct panfrost_device *pfdev);

+void panfrost_cycle_counter_get(struct panfrost_device *pfdev);
+void panfrost_cycle_counter_put(struct panfrost_device *pfdev);
+unsigned long long panfrost_cycle_counter_read(struct panfrost_device *pfdev);
+
void panfrost_gpu_amlogic_quirk(struct panfrost_device *pfdev);

#endif
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 033f5e684707..fb16de2d0420 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -159,6 +159,16 @@ panfrost_dequeue_job(struct panfrost_device *pfdev, int slot)
struct panfrost_job *job = pfdev->jobs[slot][0];

WARN_ON(!job);
+ if (job->is_profiled) {
+ if (job->engine_usage) {
+ job->engine_usage->elapsed_ns[slot] +=
+ ktime_to_ns(ktime_sub(ktime_get(), job->start_time));
+ job->engine_usage->cycles[slot] +=
+ panfrost_cycle_counter_read(pfdev) - job->start_cycles;
+ }
+ panfrost_cycle_counter_put(job->pfdev);
+ }
+
pfdev->jobs[slot][0] = pfdev->jobs[slot][1];
pfdev->jobs[slot][1] = NULL;

@@ -233,6 +243,13 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
subslot = panfrost_enqueue_job(pfdev, js, job);
/* Don't queue the job if a reset is in progress */
if (!atomic_read(&pfdev->reset.pending)) {
+ if (atomic_read(&pfdev->profile_mode)) {
+ panfrost_cycle_counter_get(pfdev);
+ job->is_profiled = true;
+ job->start_time = ktime_get();
+ job->start_cycles = panfrost_cycle_counter_read(pfdev);
+ }
+
job_write(pfdev, JS_COMMAND_NEXT(js), JS_COMMAND_START);
dev_dbg(pfdev->dev,
"JS: Submitting atom %p to js[%d][%d] with head=0x%llx AS %d",
@@ -660,10 +677,14 @@ panfrost_reset(struct panfrost_device *pfdev,
* stuck jobs. Let's make sure the PM counters stay balanced by
* manually calling pm_runtime_put_noidle() and
* panfrost_devfreq_record_idle() for each stuck job.
+ * Let's also make sure the cycle counting register's refcnt is
+ * kept balanced to prevent it from running forever
*/
spin_lock(&pfdev->js->job_lock);
for (i = 0; i < NUM_JOB_SLOTS; i++) {
for (j = 0; j < ARRAY_SIZE(pfdev->jobs[0]) && pfdev->jobs[i][j]; j++) {
+ if (pfdev->jobs[i][j]->is_profiled)
+ panfrost_cycle_counter_put(pfdev->jobs[i][j]->pfdev);
pm_runtime_put_noidle(pfdev->dev);
panfrost_devfreq_record_idle(&pfdev->pfdevfreq);
}
@@ -926,6 +947,9 @@ void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
}

job_write(pfdev, JS_COMMAND(i), cmd);
+
+ /* Jobs can outlive their file context */
+ job->engine_usage = NULL;
}
}
spin_unlock(&pfdev->js->job_lock);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
index 8becc1ba0eb9..17ff808dba07 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.h
+++ b/drivers/gpu/drm/panfrost/panfrost_job.h
@@ -32,6 +32,11 @@ struct panfrost_job {

/* Fence to be signaled by drm-sched once its done with the job */
struct dma_fence *render_done_fence;
+
+ struct panfrost_engine_usage *engine_usage;
+ bool is_profiled;
+ ktime_t start_time;
+ u64 start_cycles;
};

int panfrost_job_init(struct panfrost_device *pfdev);
--
2.42.0

2023-09-30 00:18:03

by Adrián Larumbe

Subject: [PATCH v8 5/5] drm/panfrost: Implement generic DRM object RSS reporting function

BO's RSS is updated every time new pages are allocated on demand and mapped
for the object at GPU page fault's IRQ handler, but only for heap buffers.
The reason this is unnecessary for non-heap buffers is that they are mapped
onto the GPU's VA space and backed by physical memory in their entirety at
BO creation time.

This calculation is unnecessary for imported PRIME objects, since heap
buffers cannot be exported by our driver, and the actual BO RSS size is the
one reported in its attached dmabuf structure.

Signed-off-by: Adrián Larumbe <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_gem.c | 15 +++++++++++++++
drivers/gpu/drm/panfrost/panfrost_gem.h | 5 +++++
drivers/gpu/drm/panfrost/panfrost_mmu.c | 1 +
3 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index de238b71b321..0cf64456e29a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -209,6 +209,20 @@ static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj
return res;
}

+static size_t panfrost_gem_rss(struct drm_gem_object *obj)
+{
+ struct panfrost_gem_object *bo = to_panfrost_bo(obj);
+
+ if (bo->is_heap) {
+ return bo->heap_rss_size;
+ } else if (bo->base.pages) {
+ WARN_ON(bo->heap_rss_size);
+ return bo->base.base.size;
+ }
+
+ return 0;
+}
+
static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.free = panfrost_gem_free_object,
.open = panfrost_gem_open,
@@ -221,6 +235,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.vunmap = drm_gem_shmem_object_vunmap,
.mmap = drm_gem_shmem_object_mmap,
.status = panfrost_gem_status,
+ .rss = panfrost_gem_rss,
.vm_ops = &drm_gem_shmem_vm_ops,
};

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
index ad2877eeeccd..13c0a8149c3a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
@@ -36,6 +36,11 @@ struct panfrost_gem_object {
*/
atomic_t gpu_usecount;

+ /*
+ * Object chunk size currently mapped onto physical memory
+ */
+ size_t heap_rss_size;
+
bool noexec :1;
bool is_heap :1;
};
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index d54d4e7b2195..846dd697c410 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -522,6 +522,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt);

bomapping->active = true;
+ bo->heap_rss_size += SZ_2M;

dev_dbg(pfdev->dev, "mapped page fault @ AS%d %llx", as, addr);

--
2.42.0

2023-10-03 02:11:34

by kernel test robot

Subject: Re: [PATCH v8 2/5] drm/panfrost: Add fdinfo support GPU load metrics

Hi Adrián,

kernel test robot noticed the following build warnings:

[auto build test WARNING on f45acf7acf75921c0409d452f0165f51a19a74fd]

url: https://github.com/intel-lab-lkp/linux/commits/Adri-n-Larumbe/drm-panfrost-Add-cycle-count-GPU-register-definitions/20230930-041528
base: f45acf7acf75921c0409d452f0165f51a19a74fd
patch link: https://lore.kernel.org/r/20230929181616.2769345-3-adrian.larumbe%40collabora.com
patch subject: [PATCH v8 2/5] drm/panfrost: Add fdinfo support GPU load metrics
reproduce: (https://download.01.org/0day-ci/archive/20231003/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All warnings (new ones prefixed by >>):

>> Documentation/gpu/panfrost.rst:8: WARNING: Title underline too short.
>> Documentation/gpu/panfrost.rst: WARNING: document isn't included in any toctree

vim +8 Documentation/gpu/panfrost.rst

6
7 Panfrost DRM client usage stats implementation
> 8 ==========================================
9

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2023-10-03 15:42:10

by Rob Clark

Subject: Re: [PATCH v8 4/5] drm/drm_file: Add DRM obj's RSS reporting function for fdinfo

On Fri, Sep 29, 2023 at 11:16 AM Adrián Larumbe
<[email protected]> wrote:
> [...]

Reviewed-by: Rob Clark <[email protected]>


2023-10-04 11:06:25

by Boris Brezillon

Subject: Re: [PATCH v8 0/5] Add fdinfo support to Panfrost

On Fri, 29 Sep 2023 19:14:26 +0100
Adrián Larumbe <[email protected]> wrote:

> [...]
> Adrián Larumbe (5):
> drm/panfrost: Add cycle count GPU register definitions
> drm/panfrost: Add fdinfo support GPU load metrics
> drm/panfrost: Add fdinfo support for memory stats
> drm/drm_file: Add DRM obj's RSS reporting function for fdinfo
> drm/panfrost: Implement generic DRM object RSS reporting function

Queued to drm-misc-next.

Thanks!

Boris


2023-10-05 17:06:22

by Adrián Larumbe

Subject: [PATCH v8 3/5] drm/panfrost: Add fdinfo support for memory stats

A new DRM GEM object function is added so that drm_show_memory_stats can
provide more accurate memory usage numbers.

Ideally, in panfrost_gem_status, the BO's purgeable flag would be checked
after locking the driver's shrinker mutex, but drm_show_memory_stats takes
the drm file's object handle database spinlock, so there's potential for a
race condition here.
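
For completeness, a small sketch (not part of the patch) of how a monitoring
tool might turn one of the resulting fdinfo memory lines, for example
"drm-resident-memory:   36496 KiB", back into bytes; the helper name and
parsing approach are illustrative only:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse "drm-<something>-memory:<tab><value> <unit>" into bytes. */
static int parse_mem_line(const char *line, uint64_t *bytes)
{
        char key[64], unit[8];
        unsigned long long val;

        if (sscanf(line, "%63[^:]: %llu %7s", key, &val, unit) != 3)
                return -1;

        if (!strcmp(unit, "KiB"))
                val <<= 10;
        else if (!strcmp(unit, "MiB"))
                val <<= 20;
        else if (!strcmp(unit, "GiB"))
                val <<= 30;

        *bytes = val;
        return 0;
}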

Signed-off-by: Adrián Larumbe <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_drv.c | 2 ++
drivers/gpu/drm/panfrost/panfrost_gem.c | 15 +++++++++++++++
2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 97e5bc4a82c8..b834777b409b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -568,6 +568,8 @@ static void panfrost_show_fdinfo(struct drm_printer *p, struct drm_file *file)
struct panfrost_device *pfdev = dev->dev_private;

panfrost_gpu_show_fdinfo(pfdev, file->driver_priv, p);
+
+ drm_show_memory_stats(p, file);
}

static const struct file_operations panfrost_drm_driver_fops = {
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 3c812fbd126f..de238b71b321 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -195,6 +195,20 @@ static int panfrost_gem_pin(struct drm_gem_object *obj)
return drm_gem_shmem_pin(&bo->base);
}

+static enum drm_gem_object_status panfrost_gem_status(struct drm_gem_object *obj)
+{
+ struct panfrost_gem_object *bo = to_panfrost_bo(obj);
+ enum drm_gem_object_status res = 0;
+
+ if (bo->base.pages)
+ res |= DRM_GEM_OBJECT_RESIDENT;
+
+ if (bo->base.madv == PANFROST_MADV_DONTNEED)
+ res |= DRM_GEM_OBJECT_PURGEABLE;
+
+ return res;
+}
+
static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.free = panfrost_gem_free_object,
.open = panfrost_gem_open,
@@ -206,6 +220,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs = {
.vmap = drm_gem_shmem_object_vmap,
.vunmap = drm_gem_shmem_object_vunmap,
.mmap = drm_gem_shmem_object_mmap,
+ .status = panfrost_gem_status,
.vm_ops = &drm_gem_shmem_vm_ops,
};

--
2.42.0