2023-04-27 17:56:36

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 0/9] drm: fdinfo memory stats

From: Rob Clark <[email protected]>

Similar motivation to other similar recent attempt[1]. But with an
attempt to have some shared code for this. As well as documentation.

It is probably a bit UMA-centric, I guess devices with VRAM might want
some placement stats as well. But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

I've combined the separate series to add comm/cmdline override onto
the end of this, simply out of convenience (they would otherwise
conflict in a bunch of places).

v2: Extend things to allow for multiple regions other than just system
"memory", make drm_show_memory_stats() a helper so that, drivers
can use it or not based on their needs (but in either case, re-
use drm_print_memory_stats()

[1] https://patchwork.freedesktop.org/series/112397/


Rob Clark (9):
drm/docs: Fix usage stats typos
drm: Add common fdinfo helper
drm/msm: Switch to fdinfo helper
drm/amdgpu: Switch to fdinfo helper
drm: Add fdinfo memory stats
drm/msm: Add memory stats to fdinfo
drm/doc: Relax fdinfo string constraints
drm/fdinfo: Add comm/cmdline override fields
drm/msm: Wire up comm/cmdline override for fdinfo

Documentation/gpu/drm-usage-stats.rst | 109 +++++++++++----
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h | 2 +-
drivers/gpu/drm/drm_file.c | 147 +++++++++++++++++++++
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 24 +++-
drivers/gpu/drm/msm/msm_drv.c | 15 ++-
drivers/gpu/drm/msm/msm_gem.c | 15 +++
drivers/gpu/drm/msm/msm_gpu.c | 2 -
drivers/gpu/drm/msm/msm_gpu.h | 10 ++
include/drm/drm_drv.h | 7 +
include/drm/drm_file.h | 42 ++++++
include/drm/drm_gem.h | 30 +++++
13 files changed, 375 insertions(+), 47 deletions(-)

--
2.39.2


2023-04-27 17:57:11

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 1/9] drm/docs: Fix usage stats typos

From: Rob Clark <[email protected]>

Fix a couple missing ':'s.

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
---
Documentation/gpu/drm-usage-stats.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index b46327356e80..72d069e5dacb 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -105,7 +105,7 @@ object belong to this client, in the respective memory region.
Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
indicating kibi- or mebi-bytes.

-- drm-cycles-<str> <uint>
+- drm-cycles-<str>: <uint>

Engine identifier string must be the same as the one specified in the
drm-engine-<str> tag and shall contain the number of busy cycles for the given
@@ -117,7 +117,7 @@ larger value within a reasonable period. Upon observing a value lower than what
was previously read, userspace is expected to stay with that larger previous
value until a monotonic update is seen.

-- drm-maxfreq-<str> <uint> [Hz|MHz|KHz]
+- drm-maxfreq-<str>: <uint> [Hz|MHz|KHz]

Engine identifier string must be the same as the one specified in the
drm-engine-<str> tag and shall contain the maximum frequency for the given
--
2.39.2

2023-04-27 17:57:16

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 2/9] drm: Add common fdinfo helper

From: Rob Clark <[email protected]>

Handle a bit of the boiler-plate in a single case, and make it easier to
add some core tracked stats. This also ensures consistent behavior
across drivers for standardised fields.

v2: Update drm-usage-stats.rst, 64b client-id, rename drm_show_fdinfo

Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
---
Documentation/gpu/drm-usage-stats.rst | 10 +++++++-
drivers/gpu/drm/drm_file.c | 35 +++++++++++++++++++++++++++
include/drm/drm_drv.h | 7 ++++++
include/drm/drm_file.h | 4 +++
4 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 72d069e5dacb..552195fb1ea3 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -126,7 +126,15 @@ percentage utilization of the engine, whereas drm-engine-<str> only reflects
time active without considering what frequency the engine is operating as a
percentage of it's maximum frequency.

+Implementation Details
+======================
+
+Drivers should use drm_show_fdinfo() in their `struct file_operations`, and
+implement &drm_driver.show_fdinfo if they wish to provide any stats which
+are not provided by drm_show_fdinfo(). But even driver specific stats should
+be documented above and where possible, aligned with other drivers.
+
Driver specific implementations
-===============================
+-------------------------------

:ref:`i915-usage-stats`
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index a51ff8cee049..6d5bdd684ae2 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -148,6 +148,7 @@ bool drm_dev_needs_global_mutex(struct drm_device *dev)
*/
struct drm_file *drm_file_alloc(struct drm_minor *minor)
{
+ static atomic64_t ident = ATOMIC_INIT(0);
struct drm_device *dev = minor->dev;
struct drm_file *file;
int ret;
@@ -156,6 +157,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
if (!file)
return ERR_PTR(-ENOMEM);

+ /* Get a unique identifier for fdinfo: */
+ file->client_id = atomic64_inc_return(&ident);
file->pid = get_pid(task_pid(current));
file->minor = minor;

@@ -868,6 +871,38 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
}
EXPORT_SYMBOL(drm_send_event);

+/**
+ * drm_show_fdinfo - helper for drm file fops
+ * @seq_file: output stream
+ * @f: the device file instance
+ *
+ * Helper to implement fdinfo, for userspace to query usage stats, etc, of a
+ * process using the GPU. See also &drm_driver.show_fdinfo.
+ *
+ * For text output format description please see Documentation/gpu/drm-usage-stats.rst
+ */
+void drm_show_fdinfo(struct seq_file *m, struct file *f)
+{
+ struct drm_file *file = f->private_data;
+ struct drm_device *dev = file->minor->dev;
+ struct drm_printer p = drm_seq_file_printer(m);
+
+ drm_printf(&p, "drm-driver:\t%s\n", dev->driver->name);
+ drm_printf(&p, "drm-client-id:\t%llu\n", file->client_id);
+
+ if (dev_is_pci(dev->dev)) {
+ struct pci_dev *pdev = to_pci_dev(dev->dev);
+
+ drm_printf(&p, "drm-pdev:\t%04x:%02x:%02x.%d\n",
+ pci_domain_nr(pdev->bus), pdev->bus->number,
+ PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+ }
+
+ if (dev->driver->show_fdinfo)
+ dev->driver->show_fdinfo(&p, file);
+}
+EXPORT_SYMBOL(drm_show_fdinfo);
+
/**
* mock_drm_getfile - Create a new struct file for the drm device
* @minor: drm minor to wrap (e.g. #drm_device.primary)
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 5b86bb7603e7..5edf2a13733b 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -401,6 +401,13 @@ struct drm_driver {
struct drm_device *dev, uint32_t handle,
uint64_t *offset);

+ /**
+ * @show_fdinfo:
+ *
+ * Print device specific fdinfo. See Documentation/gpu/drm-usage-stats.rst.
+ */
+ void (*show_fdinfo)(struct drm_printer *p, struct drm_file *f);
+
/** @major: driver major number */
int major;
/** @minor: driver minor number */
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 0d1f853092ab..6de6d0e9c634 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -258,6 +258,9 @@ struct drm_file {
/** @pid: Process that opened this file. */
struct pid *pid;

+ /** @client_id: A unique id for fdinfo */
+ u64 client_id;
+
/** @magic: Authentication magic, see @authenticated. */
drm_magic_t magic;

@@ -437,6 +440,7 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
void drm_send_event_timestamp_locked(struct drm_device *dev,
struct drm_pending_event *e,
ktime_t timestamp);
+void drm_show_fdinfo(struct seq_file *m, struct file *f);

struct file *mock_drm_getfile(struct drm_minor *minor, unsigned int flags);

--
2.39.2

2023-04-27 17:57:22

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 3/9] drm/msm: Switch to fdinfo helper

From: Rob Clark <[email protected]>

Now that we have a common helper, use it.

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Dmitry Baryshkov <[email protected]>
---
drivers/gpu/drm/msm/msm_drv.c | 11 +++++------
drivers/gpu/drm/msm/msm_gpu.c | 2 --
2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 9b6f17b1261f..1e941aa77609 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1043,23 +1043,21 @@ static const struct drm_ioctl_desc msm_ioctls[] = {
DRM_IOCTL_DEF_DRV(MSM_SUBMITQUEUE_QUERY, msm_ioctl_submitqueue_query, DRM_RENDER_ALLOW),
};

-static void msm_fop_show_fdinfo(struct seq_file *m, struct file *f)
+static void msm_show_fdinfo(struct drm_printer *p, struct drm_file *file)
{
- struct drm_file *file = f->private_data;
struct drm_device *dev = file->minor->dev;
struct msm_drm_private *priv = dev->dev_private;
- struct drm_printer p = drm_seq_file_printer(m);

if (!priv->gpu)
return;

- msm_gpu_show_fdinfo(priv->gpu, file->driver_priv, &p);
+ msm_gpu_show_fdinfo(priv->gpu, file->driver_priv, p);
}

static const struct file_operations fops = {
.owner = THIS_MODULE,
DRM_GEM_FOPS,
- .show_fdinfo = msm_fop_show_fdinfo,
+ .show_fdinfo = drm_show_fdinfo,
};

static const struct drm_driver msm_driver = {
@@ -1069,7 +1067,7 @@ static const struct drm_driver msm_driver = {
DRIVER_MODESET |
DRIVER_SYNCOBJ,
.open = msm_open,
- .postclose = msm_postclose,
+ .postclose = msm_postclose,
.lastclose = drm_fb_helper_lastclose,
.dumb_create = msm_gem_dumb_create,
.dumb_map_offset = msm_gem_dumb_map_offset,
@@ -1080,6 +1078,7 @@ static const struct drm_driver msm_driver = {
#ifdef CONFIG_DEBUG_FS
.debugfs_init = msm_debugfs_init,
#endif
+ .show_fdinfo = msm_show_fdinfo,
.ioctls = msm_ioctls,
.num_ioctls = ARRAY_SIZE(msm_ioctls),
.fops = &fops,
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index b1647b851018..52db90e34ead 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -151,8 +151,6 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
struct drm_printer *p)
{
- drm_printf(p, "drm-driver:\t%s\n", gpu->dev->driver->name);
- drm_printf(p, "drm-client-id:\t%u\n", ctx->seqno);
drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu->fast_rate);
--
2.39.2

2023-04-27 17:57:24

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 4/9] drm/amdgpu: Switch to fdinfo helper

From: Rob Clark <[email protected]>

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Christian König <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++++++----------
drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h | 2 +-
3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f5ffca24def4..6c0e0c614b94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2752,7 +2752,7 @@ static const struct file_operations amdgpu_driver_kms_fops = {
.compat_ioctl = amdgpu_kms_compat_ioctl,
#endif
#ifdef CONFIG_PROC_FS
- .show_fdinfo = amdgpu_show_fdinfo
+ .show_fdinfo = drm_show_fdinfo,
#endif
};

@@ -2807,6 +2807,7 @@ static const struct drm_driver amdgpu_kms_driver = {
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
+ .show_fdinfo = amdgpu_show_fdinfo,

.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 99a7855ab1bc..c2fdd5e448d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -53,9 +53,8 @@ static const char *amdgpu_ip_name[AMDGPU_HW_IP_NUM] = {
[AMDGPU_HW_IP_VCN_JPEG] = "jpeg",
};

-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file)
{
- struct drm_file *file = f->private_data;
struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
struct amdgpu_fpriv *fpriv = file->driver_priv;
struct amdgpu_vm *vm = &fpriv->vm;
@@ -86,18 +85,15 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
* ******************************************************************
*/

- seq_printf(m, "pasid:\t%u\n", fpriv->vm.pasid);
- seq_printf(m, "drm-driver:\t%s\n", file->minor->dev->driver->name);
- seq_printf(m, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
- seq_printf(m, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
- seq_printf(m, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
- seq_printf(m, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
- seq_printf(m, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
+ drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
+ drm_printf(p, "drm-memory-vram:\t%llu KiB\n", vram_mem/1024UL);
+ drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", gtt_mem/1024UL);
+ drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", cpu_mem/1024UL);
for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
if (!usage[hw_ip])
continue;

- seq_printf(m, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
+ drm_printf(p, "drm-engine-%s:\t%Ld ns\n", amdgpu_ip_name[hw_ip],
ktime_to_ns(usage[hw_ip]));
}
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
index e86834bfea1d..0398f5a159ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h
@@ -37,6 +37,6 @@
#include "amdgpu_ids.h"

uint32_t amdgpu_get_ip_count(struct amdgpu_device *adev, int id);
-void amdgpu_show_fdinfo(struct seq_file *m, struct file *f);
+void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file);

#endif
--
2.39.2

2023-04-27 17:59:07

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 7/9] drm/doc: Relax fdinfo string constraints

From: Rob Clark <[email protected]>

The restriction about no whitespace, etc, really only applies to the
usage of strings in keys. Values can contain anything (other than
newline).

Signed-off-by: Rob Clark <[email protected]>
Acked-by: Tvrtko Ursulin <[email protected]>
---
Documentation/gpu/drm-usage-stats.rst | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index bfc14150452c..58dc0d3f8c58 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -24,7 +24,7 @@ File format specification
- All keys shall be prefixed with `drm-`.
- Whitespace between the delimiter and first non-whitespace character shall be
ignored when parsing.
-- Neither keys or values are allowed to contain whitespace characters.
+- Keys are not allowed to contain whitespace characters.
- Numerical key value pairs can end with optional unit string.
- Data type of the value is fixed as defined in the specification.

@@ -39,12 +39,13 @@ Data types
----------

- <uint> - Unsigned integer without defining the maximum value.
-- <str> - String excluding any above defined reserved characters or whitespace.
+- <keystr> - String excluding any above defined reserved characters or whitespace.
+- <valstr> - String.

Mandatory fully standardised keys
---------------------------------

-- drm-driver: <str>
+- drm-driver: <valstr>

String shall contain the name this driver registered as via the respective
`struct drm_driver` data structure.
@@ -75,10 +76,10 @@ the above described criteria in order to associate data to individual clients.
Utilization
^^^^^^^^^^^

-- drm-engine-<str>: <uint> ns
+- drm-engine-<keystr>: <uint> ns

GPUs usually contain multiple execution engines. Each shall be given a stable
-and unique name (str), with possible values documented in the driver specific
+and unique name (keystr), with possible values documented in the driver specific
documentation.

Value shall be in specified time units which the respective GPU engine spent
@@ -90,19 +91,19 @@ larger value within a reasonable period. Upon observing a value lower than what
was previously read, userspace is expected to stay with that larger previous
value until a monotonic update is seen.

-- drm-engine-capacity-<str>: <uint>
+- drm-engine-capacity-<keystr>: <uint>

Engine identifier string must be the same as the one specified in the
-drm-engine-<str> tag and shall contain a greater than zero number in case the
+drm-engine-<keystr> tag and shall contain a greater than zero number in case the
exported engine corresponds to a group of identical hardware engines.

In the absence of this tag parser shall assume capacity of one. Zero capacity
is not allowed.

-- drm-cycles-<str>: <uint>
+- drm-cycles-<keystr>: <uint>

Engine identifier string must be the same as the one specified in the
-drm-engine-<str> tag and shall contain the number of busy cycles for the given
+drm-engine-<keystr> tag and shall contain the number of busy cycles for the given
engine.

Values are not required to be constantly monotonic if it makes the driver
@@ -111,12 +112,12 @@ larger value within a reasonable period. Upon observing a value lower than what
was previously read, userspace is expected to stay with that larger previous
value until a monotonic update is seen.

-- drm-maxfreq-<str>: <uint> [Hz|MHz|KHz]
+- drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz]

Engine identifier string must be the same as the one specified in the
-drm-engine-<str> tag and shall contain the maximum frequency for the given
-engine. Taken together with drm-cycles-<str>, this can be used to calculate
-percentage utilization of the engine, whereas drm-engine-<str> only reflects
+drm-engine-<keystr> tag and shall contain the maximum frequency for the given
+engine. Taken together with drm-cycles-<keystr>, this can be used to calculate
+percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
time active without considering what frequency the engine is operating as a
percentage of it's maximum frequency.

--
2.39.2

2023-04-27 17:59:38

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 8/9] drm/fdinfo: Add comm/cmdline override fields

From: Rob Clark <[email protected]>

These are useful in particular for VM scenarios where the process which
has opened to drm device file is just a proxy for the real user in a VM
guest.

Signed-off-by: Rob Clark <[email protected]>
---
Documentation/gpu/drm-usage-stats.rst | 18 ++++++++++++++++++
drivers/gpu/drm/drm_file.c | 15 +++++++++++++++
include/drm/drm_file.h | 19 +++++++++++++++++++
3 files changed, 52 insertions(+)

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 58dc0d3f8c58..e4877cf8089c 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -73,6 +73,24 @@ scope of each device, in which case `drm-pdev` shall be present as well.
Userspace should make sure to not double account any usage statistics by using
the above described criteria in order to associate data to individual clients.

+- drm-comm-override: <valstr>
+
+Returns the client executable override string. Some drivers support letting
+userspace override this in cases where the userspace is simply a "proxy".
+Such as is the case with virglrenderer drm native context, where the host
+process is just forwarding command submission, etc, from guest userspace.
+This allows the proxy to make visible the executable name of the actual
+app in the VM guest.
+
+- drm-cmdline-override: <valstr>
+
+Returns the client cmdline override string. Some drivers support letting
+userspace override this in cases where the userspace is simply a "proxy".
+Such as is the case with virglrenderer drm native context, where the host
+process is just forwarding command submission, etc, from guest userspace.
+This allows the proxy to make visible the cmdline of the actual app in the
+VM guest.
+
Utilization
^^^^^^^^^^^

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 9321eb0bf020..d7514c313af1 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -178,6 +178,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
spin_lock_init(&file->master_lookup_lock);
mutex_init(&file->event_read_lock);

+ mutex_init(&file->override_lock);
+
if (drm_core_check_feature(dev, DRIVER_GEM))
drm_gem_open(dev, file);

@@ -292,6 +294,8 @@ void drm_file_free(struct drm_file *file)
WARN_ON(!list_empty(&file->event_list));

put_pid(file->pid);
+ kfree(file->override_comm);
+ kfree(file->override_cmdline);
kfree(file);
}

@@ -995,6 +999,17 @@ void drm_show_fdinfo(struct seq_file *m, struct file *f)
PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
}

+ mutex_lock(&file->override_lock);
+ if (file->override_comm) {
+ drm_printf(&p, "drm-comm-override:\t%s\n",
+ file->override_comm);
+ }
+ if (file->override_cmdline) {
+ drm_printf(&p, "drm-cmdline-override:\t%s\n",
+ file->override_cmdline);
+ }
+ mutex_unlock(&file->override_lock);
+
if (dev->driver->show_fdinfo)
dev->driver->show_fdinfo(&p, file);
}
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 1339e925af52..604d05fa6f0c 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -370,6 +370,25 @@ struct drm_file {
*/
struct drm_prime_file_private prime;

+ /**
+ * @comm: Overridden task comm
+ *
+ * Accessed under override_lock
+ */
+ char *override_comm;
+
+ /**
+ * @cmdline: Overridden task cmdline
+ *
+ * Accessed under override_lock
+ */
+ char *override_cmdline;
+
+ /**
+ * @override_lock: Serialize access to override_comm and override_cmdline
+ */
+ struct mutex override_lock;
+
/* private: */
#if IS_ENABLED(CONFIG_DRM_LEGACY)
unsigned long lock_count; /* DRI1 legacy lock count */
--
2.39.2

2023-04-27 18:00:05

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 5/9] drm: Add fdinfo memory stats

From: Rob Clark <[email protected]>

Add support to dump GEM stats to fdinfo.

v2: Fix typos, change size units to match docs, use div_u64
v3: Do it in core
v4: more kerneldoc

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
---
Documentation/gpu/drm-usage-stats.rst | 54 +++++++++++----
drivers/gpu/drm/drm_file.c | 99 ++++++++++++++++++++++++++-
include/drm/drm_file.h | 19 +++++
include/drm/drm_gem.h | 30 ++++++++
4 files changed, 189 insertions(+), 13 deletions(-)

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 552195fb1ea3..bfc14150452c 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -52,6 +52,9 @@ String shall contain the name this driver registered as via the respective
Optional fully standardised keys
--------------------------------

+Identification
+^^^^^^^^^^^^^^
+
- drm-pdev: <aaaa:bb.cc.d>

For PCI devices this should contain the PCI slot address of the device in
@@ -69,6 +72,9 @@ scope of each device, in which case `drm-pdev` shall be present as well.
Userspace should make sure to not double account any usage statistics by using
the above described criteria in order to associate data to individual clients.

+Utilization
+^^^^^^^^^^^
+
- drm-engine-<str>: <uint> ns

GPUs usually contain multiple execution engines. Each shall be given a stable
@@ -93,18 +99,6 @@ exported engine corresponds to a group of identical hardware engines.
In the absence of this tag parser shall assume capacity of one. Zero capacity
is not allowed.

-- drm-memory-<str>: <uint> [KiB|MiB]
-
-Each possible memory type which can be used to store buffer objects by the
-GPU in question shall be given a stable and unique name to be returned as the
-string here.
-
-Value shall reflect the amount of storage currently consumed by the buffer
-object belong to this client, in the respective memory region.
-
-Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
-indicating kibi- or mebi-bytes.
-
- drm-cycles-<str>: <uint>

Engine identifier string must be the same as the one specified in the
@@ -126,6 +120,42 @@ percentage utilization of the engine, whereas drm-engine-<str> only reflects
time active without considering what frequency the engine is operating as a
percentage of it's maximum frequency.

+Memory
+^^^^^^
+
+- drm-memory-<region>: <uint> [KiB|MiB]
+
+Each possible memory type which can be used to store buffer objects by the
+GPU in question shall be given a stable and unique name to be returned as the
+string here. The name "memory" is reserved to refer to normal system memory.
+
+Value shall reflect the amount of storage currently consumed by the buffer
+object belong to this client, in the respective memory region.
+
+Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
+indicating kibi- or mebi-bytes.
+
+- drm-shared-<region>: <uint> [KiB|MiB]
+
+The total size of buffers that are shared with another file (ie. have more
+than a single handle).
+
+- drm-private-<region>: <uint> [KiB|MiB]
+
+The total size of buffers that are not shared with another file.
+
+- drm-resident-<region>: <uint> [KiB|MiB]
+
+The total size of buffers that are resident in system memory.
+
+- drm-purgeable-<region>: <uint> [KiB|MiB]
+
+The total size of buffers that are purgeable.
+
+- drm-active-<region>: <uint> [KiB|MiB]
+
+The total size of buffers that are active on one or more rings.
+
Implementation Details
======================

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 6d5bdd684ae2..9321eb0bf020 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -42,6 +42,7 @@
#include <drm/drm_client.h>
#include <drm/drm_drv.h>
#include <drm/drm_file.h>
+#include <drm/drm_gem.h>
#include <drm/drm_print.h>

#include "drm_crtc_internal.h"
@@ -871,9 +872,105 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
}
EXPORT_SYMBOL(drm_send_event);

+static void print_size(struct drm_printer *p, const char *stat,
+ const char *region, size_t sz)
+{
+ const char *units[] = {"", " KiB", " MiB"};
+ unsigned u;
+
+ for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
+ if (sz < SZ_1K)
+ break;
+ sz = div_u64(sz, SZ_1K);
+ }
+
+ drm_printf(p, "drm-%s-%s:\t%zu%s\n", stat, region, sz, units[u]);
+}
+
+/**
+ * drm_print_memory_stats - A helper to print memory stats
+ * @p: The printer to print output to
+ * @stats: The collected memory stats
+ * @supported_status: Bitmask of optional stats which are available
+ * @region: The memory region
+ *
+ */
+void drm_print_memory_stats(struct drm_printer *p,
+ const struct drm_memory_stats *stats,
+ enum drm_gem_object_status supported_status,
+ const char *region)
+{
+ print_size(p, "total", region, stats->private + stats->shared);
+ print_size(p, "shared", region, stats->shared);
+ print_size(p, "active", region, stats->active);
+
+ if (supported_status & DRM_GEM_OBJECT_RESIDENT)
+ print_size(p, "resident", region, stats->resident);
+
+ if (supported_status & DRM_GEM_OBJECT_PURGEABLE)
+ print_size(p, "purgeable", region, stats->purgeable);
+}
+EXPORT_SYMBOL(drm_print_memory_stats);
+
+/**
+ * drm_show_memory_stats - Helper to collect and show standard fdinfo memory stats
+ * @p: the printer to print output to
+ * @file: the DRM file
+ *
+ * Helper to iterate over GEM objects with a handle allocated in the specified
+ * file.
+ */
+void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
+{
+ struct drm_gem_object *obj;
+ struct drm_memory_stats status = {};
+ enum drm_gem_object_status supported_status;
+ int id;
+
+ spin_lock(&file->table_lock);
+ idr_for_each_entry (&file->object_idr, obj, id) {
+ enum drm_gem_object_status s = 0;
+
+ if (obj->funcs && obj->funcs->status) {
+ s = obj->funcs->status(obj);
+ supported_status = DRM_GEM_OBJECT_RESIDENT |
+ DRM_GEM_OBJECT_PURGEABLE;
+ }
+
+ if (obj->handle_count > 1) {
+ status.shared += obj->size;
+ } else {
+ status.private += obj->size;
+ }
+
+ if (s & DRM_GEM_OBJECT_RESIDENT) {
+ status.resident += obj->size;
+ } else {
+ /* If already purged or not yet backed by pages, don't
+ * count it as purgeable:
+ */
+ s &= ~DRM_GEM_OBJECT_PURGEABLE;
+ }
+
+ if (!dma_resv_test_signaled(obj->resv, dma_resv_usage_rw(true))) {
+ status.active += obj->size;
+
+ /* If still active, don't count as purgeable: */
+ s &= ~DRM_GEM_OBJECT_PURGEABLE;
+ }
+
+ if (s & DRM_GEM_OBJECT_PURGEABLE)
+ status.purgeable += obj->size;
+ }
+ spin_unlock(&file->table_lock);
+
+ drm_print_memory_stats(p, &status, supported_status, "memory");
+}
+EXPORT_SYMBOL(drm_show_memory_stats);
+
/**
* drm_show_fdinfo - helper for drm file fops
- * @seq_file: output stream
+ * @m: output stream
* @f: the device file instance
*
* Helper to implement fdinfo, for userspace to query usage stats, etc, of a
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 6de6d0e9c634..1339e925af52 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -41,6 +41,7 @@
struct dma_fence;
struct drm_file;
struct drm_device;
+struct drm_printer;
struct device;
struct file;

@@ -440,6 +441,24 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
void drm_send_event_timestamp_locked(struct drm_device *dev,
struct drm_pending_event *e,
ktime_t timestamp);
+
+
+struct drm_memory_stats {
+ size_t shared;
+ size_t private;
+ size_t resident;
+ size_t purgeable;
+ size_t active;
+};
+
+enum drm_gem_object_status;
+
+void drm_print_memory_stats(struct drm_printer *p,
+ const struct drm_memory_stats *stats,
+ enum drm_gem_object_status supported_status,
+ const char *region);
+
+void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file);
void drm_show_fdinfo(struct seq_file *m, struct file *f);

struct file *mock_drm_getfile(struct drm_minor *minor, unsigned int flags);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 189fd618ca65..9ebd2820ad1f 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -42,6 +42,25 @@
struct iosys_map;
struct drm_gem_object;

+/**
+ * enum drm_gem_object_status - bitmask of object state for fdinfo reporting
+ * @DRM_GEM_OBJECT_RESIDENT: object is resident in memory (ie. not unpinned)
+ * @DRM_GEM_OBJECT_PURGEABLE: object marked as purgeable by userspace
+ *
+ * Bitmask of status used for fdinfo memory stats, see &drm_gem_object_funcs.status
+ * and drm_show_fdinfo(). Note that an object can DRM_GEM_OBJECT_PURGEABLE if
+ * it still active or not resident, in which case drm_show_fdinfo() will not
+ * account for it as purgeable. So drivers do not need to check if the buffer
+ * is idle and resident to return this bit. (Ie. userspace can mark a buffer
+ * as purgeable even while it is still busy on the GPU.. it does not _actually_
+ * become puregeable until it becomes idle. The status gem object func does
+ * not need to consider this.)
+ */
+enum drm_gem_object_status {
+ DRM_GEM_OBJECT_RESIDENT = BIT(0),
+ DRM_GEM_OBJECT_PURGEABLE = BIT(1),
+};
+
/**
* struct drm_gem_object_funcs - GEM object functions
*/
@@ -174,6 +193,17 @@ struct drm_gem_object_funcs {
*/
int (*evict)(struct drm_gem_object *obj);

+ /**
+ * @status:
+ *
+ * The optional status callback can return additional object state
+ * which determines which stats the object is counted against. The
+ * callback is called under table_lock. Racing against object status
+ * change is "harmless", and the callback can expect to not race
+ * against object destruction.
+ */
+ enum drm_gem_object_status (*status)(struct drm_gem_object *obj);
+
/**
* @vm_ops:
*
--
2.39.2

2023-04-27 18:00:05

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 6/9] drm/msm: Add memory stats to fdinfo

From: Rob Clark <[email protected]>

Use the new helper to export stats about memory usage.

v2: Drop unintended hunk
v3: Rebase

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
---
drivers/gpu/drm/msm/msm_drv.c | 2 ++
drivers/gpu/drm/msm/msm_gem.c | 15 +++++++++++++++
2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 1e941aa77609..81a1371c0307 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -1052,6 +1052,8 @@ static void msm_show_fdinfo(struct drm_printer *p, struct drm_file *file)
return;

msm_gpu_show_fdinfo(priv->gpu, file->driver_priv, p);
+
+ drm_show_memory_stats(p, file);
}

static const struct file_operations fops = {
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index cd39b9d8abdb..20cfd86d2b32 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1090,6 +1090,20 @@ int msm_gem_new_handle(struct drm_device *dev, struct drm_file *file,
return ret;
}

+static enum drm_gem_object_status msm_gem_status(struct drm_gem_object *obj)
+{
+ struct msm_gem_object *msm_obj = to_msm_bo(obj);
+ enum drm_gem_object_status status = 0;
+
+ if (msm_obj->pages)
+ status |= DRM_GEM_OBJECT_RESIDENT;
+
+ if (msm_obj->madv == MSM_MADV_DONTNEED)
+ status |= DRM_GEM_OBJECT_PURGEABLE;
+
+ return status;
+}
+
static const struct vm_operations_struct vm_ops = {
.fault = msm_gem_fault,
.open = drm_gem_vm_open,
@@ -1104,6 +1118,7 @@ static const struct drm_gem_object_funcs msm_gem_object_funcs = {
.vmap = msm_gem_prime_vmap,
.vunmap = msm_gem_prime_vunmap,
.mmap = msm_gem_object_mmap,
+ .status = msm_gem_status,
.vm_ops = &vm_ops,
};

--
2.39.2

2023-04-27 18:00:38

by Rob Clark

[permalink] [raw]
Subject: [PATCH v2 9/9] drm/msm: Wire up comm/cmdline override for fdinfo

From: Rob Clark <[email protected]>

Also store the override strings in drm_file so that fdinfo can display
them. We still need to keep our original copy as we could need these
override strings after the device file has been closed and drm_file
freed.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 24 +++++++++++++++++++++++-
drivers/gpu/drm/msm/msm_drv.c | 2 ++
drivers/gpu/drm/msm/msm_gpu.h | 10 ++++++++++
3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index bb38e728864d..a20c2622a61f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -16,6 +16,7 @@
#include <linux/soc/qcom/mdt_loader.h>
#include <linux/nvmem-consumer.h>
#include <soc/qcom/ocmem.h>
+#include <drm/drm_file.h>
#include "adreno_gpu.h"
#include "a6xx_gpu.h"
#include "msm_gem.h"
@@ -398,7 +399,8 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
switch (param) {
case MSM_PARAM_COMM:
case MSM_PARAM_CMDLINE: {
- char *str, **paramp;
+ char *str, *str2, **paramp;
+ struct drm_file *file = ctx->file;

str = kmalloc(len + 1, GFP_KERNEL);
if (!str)
@@ -412,6 +414,13 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
/* Ensure string is null terminated: */
str[len] = '\0';

+ /*
+ * We need a 2nd copy for drm_file.. this copy can't replace
+ * our internal copy in the ctx, because we may need it for
+ * recovery/devcoredump after the file is already closed.
+ */
+ str2 = kstrdup(str, GFP_KERNEL);
+
mutex_lock(&gpu->lock);

if (param == MSM_PARAM_COMM) {
@@ -425,6 +434,19 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,

mutex_unlock(&gpu->lock);

+ mutex_lock(&file->override_lock);
+
+ if (param == MSM_PARAM_COMM) {
+ paramp = &file->override_comm;
+ } else {
+ paramp = &file->override_cmdline;
+ }
+
+ kfree(*paramp);
+ *paramp = str2;
+
+ mutex_unlock(&file->override_lock);
+
return 0;
}
case MSM_PARAM_SYSPROF:
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 81a1371c0307..3a74b5653e96 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -581,6 +581,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
rwlock_init(&ctx->queuelock);

kref_init(&ctx->ref);
+ ctx->file = file;
msm_submitqueue_init(dev, ctx);

ctx->aspace = msm_gpu_create_private_address_space(priv->gpu, current);
@@ -603,6 +604,7 @@ static int msm_open(struct drm_device *dev, struct drm_file *file)

static void context_close(struct msm_file_private *ctx)
{
+ ctx->file = NULL;
msm_submitqueue_close(ctx);
msm_file_private_put(ctx);
}
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 7a4fa1b8655b..671ce89e61b0 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -359,6 +359,16 @@ struct msm_file_private {
struct kref ref;
int seqno;

+ /**
+ * @file: link back to the associated drm_file
+ *
+ * Note that msm_file_private can outlive the drm_file, ie.
+ * after the drm_file is closed but before jobs submitted have
+ * been cleaned up. After the drm_file is closed this will be
+ * NULL.
+ */
+ struct drm_file *file;
+
/**
* sysprof:
*
--
2.39.2

2023-04-28 08:52:53

by Christian König

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] drm/docs: Fix usage stats typos

Am 27.04.23 um 19:53 schrieb Rob Clark:
> From: Rob Clark <[email protected]>
>
> Fix a couple missing ':'s.
>
> Signed-off-by: Rob Clark <[email protected]>
> Reviewed-by: Rodrigo Vivi <[email protected]>

Reviewed-by: Christian König <[email protected]>

Since this is a pretty clear fix I suggest to get this pushed to reduce
the number of patches in the set.

Christian.

> ---
> Documentation/gpu/drm-usage-stats.rst | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> index b46327356e80..72d069e5dacb 100644
> --- a/Documentation/gpu/drm-usage-stats.rst
> +++ b/Documentation/gpu/drm-usage-stats.rst
> @@ -105,7 +105,7 @@ object belong to this client, in the respective memory region.
> Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> indicating kibi- or mebi-bytes.
>
> -- drm-cycles-<str> <uint>
> +- drm-cycles-<str>: <uint>
>
> Engine identifier string must be the same as the one specified in the
> drm-engine-<str> tag and shall contain the number of busy cycles for the given
> @@ -117,7 +117,7 @@ larger value within a reasonable period. Upon observing a value lower than what
> was previously read, userspace is expected to stay with that larger previous
> value until a monotonic update is seen.
>
> -- drm-maxfreq-<str> <uint> [Hz|MHz|KHz]
> +- drm-maxfreq-<str>: <uint> [Hz|MHz|KHz]
>
> Engine identifier string must be the same as the one specified in the
> drm-engine-<str> tag and shall contain the maximum frequency for the given

2023-04-28 11:00:37

by Tvrtko Ursulin

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] drm: Add fdinfo memory stats


On 27/04/2023 18:53, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> Add support to dump GEM stats to fdinfo.
>
> v2: Fix typos, change size units to match docs, use div_u64
> v3: Do it in core
> v4: more kerneldoc
>
> Signed-off-by: Rob Clark <[email protected]>
> Reviewed-by: Emil Velikov <[email protected]>
> Reviewed-by: Daniel Vetter <[email protected]>
> ---
> Documentation/gpu/drm-usage-stats.rst | 54 +++++++++++----
> drivers/gpu/drm/drm_file.c | 99 ++++++++++++++++++++++++++-
> include/drm/drm_file.h | 19 +++++
> include/drm/drm_gem.h | 30 ++++++++
> 4 files changed, 189 insertions(+), 13 deletions(-)
>
> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> index 552195fb1ea3..bfc14150452c 100644
> --- a/Documentation/gpu/drm-usage-stats.rst
> +++ b/Documentation/gpu/drm-usage-stats.rst
> @@ -52,6 +52,9 @@ String shall contain the name this driver registered as via the respective
> Optional fully standardised keys
> --------------------------------
>
> +Identification
> +^^^^^^^^^^^^^^
> +
> - drm-pdev: <aaaa:bb.cc.d>
>
> For PCI devices this should contain the PCI slot address of the device in
> @@ -69,6 +72,9 @@ scope of each device, in which case `drm-pdev` shall be present as well.
> Userspace should make sure to not double account any usage statistics by using
> the above described criteria in order to associate data to individual clients.
>
> +Utilization
> +^^^^^^^^^^^
> +
> - drm-engine-<str>: <uint> ns
>
> GPUs usually contain multiple execution engines. Each shall be given a stable
> @@ -93,18 +99,6 @@ exported engine corresponds to a group of identical hardware engines.
> In the absence of this tag parser shall assume capacity of one. Zero capacity
> is not allowed.
>
> -- drm-memory-<str>: <uint> [KiB|MiB]
> -
> -Each possible memory type which can be used to store buffer objects by the
> -GPU in question shall be given a stable and unique name to be returned as the
> -string here.
> -
> -Value shall reflect the amount of storage currently consumed by the buffer
> -object belong to this client, in the respective memory region.
> -
> -Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> -indicating kibi- or mebi-bytes.
> -
> - drm-cycles-<str>: <uint>
>
> Engine identifier string must be the same as the one specified in the
> @@ -126,6 +120,42 @@ percentage utilization of the engine, whereas drm-engine-<str> only reflects
> time active without considering what frequency the engine is operating as a
> percentage of it's maximum frequency.
>
> +Memory
> +^^^^^^
> +
> +- drm-memory-<region>: <uint> [KiB|MiB]
> +
> +Each possible memory type which can be used to store buffer objects by the
> +GPU in question shall be given a stable and unique name to be returned as the
> +string here. The name "memory" is reserved to refer to normal system memory.

How is the name memory reserved, I mean when which part of the key?
Obviously amdgpu exposes drm-memory-vram so it can't mean system memory
there.

[Comes back later]

Ah I see.. you meant the _region_ name "memory" is reserved. Which
applies to the below keys, not the one above. Hmm.. So for multi-region
drivers you meant like:

drm-total-memory:
drm-total-vram:

Etc. Okay I think that works. All prefixes "drm-$category" become
reserved ones effectively but I think that is okay.

> +
> +Value shall reflect the amount of storage currently consumed by the buffer
> +object belong to this client, in the respective memory region.

OMG it is all my fault for mentioning buffer objects here... :)

Maybe just fix the plural while moving.

Or maybe there is time to s/buffer objects/memory/ too? Why not I think.
It would leave things more future proof.

> +
> +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> +indicating kibi- or mebi-bytes.
> +
> +- drm-shared-<region>: <uint> [KiB|MiB]
> +
> +The total size of buffers that are shared with another file (ie. have more
> +than a single handle).
> +
> +- drm-private-<region>: <uint> [KiB|MiB]
> +
> +The total size of buffers that are not shared with another file.

You went back to private + shared for a specific reason? I thought we
agreed total + shared can be less confusing.

> +
> +- drm-resident-<region>: <uint> [KiB|MiB]
> +
> +The total size of buffers that are resident in system memory.

"..resident in the specified memory region."?

> +
> +- drm-purgeable-<region>: <uint> [KiB|MiB]
> +
> +The total size of buffers that are purgeable.
> +
> +- drm-active-<region>: <uint> [KiB|MiB]
> +
> +The total size of buffers that are active on one or more rings.

Under utilisation we used 'engines' so introducing 'rings' at least
needs clarification, maybe a terminology chapter? Or just use engines
for consistency?

> +
> Implementation Details
> ======================
>
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 6d5bdd684ae2..9321eb0bf020 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -42,6 +42,7 @@
> #include <drm/drm_client.h>
> #include <drm/drm_drv.h>
> #include <drm/drm_file.h>
> +#include <drm/drm_gem.h>
> #include <drm/drm_print.h>
>
> #include "drm_crtc_internal.h"
> @@ -871,9 +872,105 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
> }
> EXPORT_SYMBOL(drm_send_event);
>
> +static void print_size(struct drm_printer *p, const char *stat,
> + const char *region, size_t sz)
> +{
> + const char *units[] = {"", " KiB", " MiB"};
> + unsigned u;
> +
> + for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
> + if (sz < SZ_1K)
> + break;
> + sz = div_u64(sz, SZ_1K);
> + }
> +
> + drm_printf(p, "drm-%s-%s:\t%zu%s\n", stat, region, sz, units[u]);
> +}
> +
> +/**
> + * drm_print_memory_stats - A helper to print memory stats
> + * @p: The printer to print output to
> + * @stats: The collected memory stats
> + * @supported_status: Bitmask of optional stats which are available
> + * @region: The memory region
> + *
> + */
> +void drm_print_memory_stats(struct drm_printer *p,
> + const struct drm_memory_stats *stats,
> + enum drm_gem_object_status supported_status,
> + const char *region)
> +{
> + print_size(p, "total", region, stats->private + stats->shared);
> + print_size(p, "shared", region, stats->shared);

Ah just rst is out of date.

> + print_size(p, "active", region, stats->active);
> +
> + if (supported_status & DRM_GEM_OBJECT_RESIDENT)
> + print_size(p, "resident", region, stats->resident);
> +
> + if (supported_status & DRM_GEM_OBJECT_PURGEABLE)
> + print_size(p, "purgeable", region, stats->purgeable);
> +}
> +EXPORT_SYMBOL(drm_print_memory_stats);
> +
> +/**
> + * drm_show_memory_stats - Helper to collect and show standard fdinfo memory stats
> + * @p: the printer to print output to
> + * @file: the DRM file
> + *
> + * Helper to iterate over GEM objects with a handle allocated in the specified
> + * file.
> + */
> +void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
> +{
> + struct drm_gem_object *obj;
> + struct drm_memory_stats status = {};
> + enum drm_gem_object_status supported_status;
> + int id;
> +
> + spin_lock(&file->table_lock);
> + idr_for_each_entry (&file->object_idr, obj, id) {
> + enum drm_gem_object_status s = 0;
> +
> + if (obj->funcs && obj->funcs->status) {
> + s = obj->funcs->status(obj);
> + supported_status = DRM_GEM_OBJECT_RESIDENT |
> + DRM_GEM_OBJECT_PURGEABLE;

Whats the purpose of supported_status? It is never modified. Did you
intend for the vfunc to be returning this?

> + }
> +
> + if (obj->handle_count > 1) {
> + status.shared += obj->size;
> + } else {
> + status.private += obj->size;
> + }
> +
> + if (s & DRM_GEM_OBJECT_RESIDENT) {
> + status.resident += obj->size;
> + } else {
> + /* If already purged or not yet backed by pages, don't
> + * count it as purgeable:
> + */
> + s &= ~DRM_GEM_OBJECT_PURGEABLE;
> + }

Again, why couldn't a resident object also be purgeable?

> +
> + if (!dma_resv_test_signaled(obj->resv, dma_resv_usage_rw(true))) {
> + status.active += obj->size;
> +
> + /* If still active, don't count as purgeable: */
> + s &= ~DRM_GEM_OBJECT_PURGEABLE;

Also add it to resident if driver hasn't advertised
DRM_GEM_OBJECT_RESIDENT? Not much value so not sure.

> + }
> +
> + if (s & DRM_GEM_OBJECT_PURGEABLE)
> + status.purgeable += obj->size;
> + }
> + spin_unlock(&file->table_lock);
> +
> + drm_print_memory_stats(p, &status, supported_status, "memory");
> +}
> +EXPORT_SYMBOL(drm_show_memory_stats);
> +
> /**
> * drm_show_fdinfo - helper for drm file fops
> - * @seq_file: output stream
> + * @m: output stream
> * @f: the device file instance
> *
> * Helper to implement fdinfo, for userspace to query usage stats, etc, of a
> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> index 6de6d0e9c634..1339e925af52 100644
> --- a/include/drm/drm_file.h
> +++ b/include/drm/drm_file.h
> @@ -41,6 +41,7 @@
> struct dma_fence;
> struct drm_file;
> struct drm_device;
> +struct drm_printer;
> struct device;
> struct file;
>
> @@ -440,6 +441,24 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
> void drm_send_event_timestamp_locked(struct drm_device *dev,
> struct drm_pending_event *e,
> ktime_t timestamp);
> +
> +
> +struct drm_memory_stats {
> + size_t shared;
> + size_t private;
> + size_t resident;
> + size_t purgeable;
> + size_t active;
> +};

Is size_t enough? I'd be tempted to just make it u64.

> +
> +enum drm_gem_object_status;
> +
> +void drm_print_memory_stats(struct drm_printer *p,
> + const struct drm_memory_stats *stats,
> + enum drm_gem_object_status supported_status,
> + const char *region);
> +
> +void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file);
> void drm_show_fdinfo(struct seq_file *m, struct file *f);
>
> struct file *mock_drm_getfile(struct drm_minor *minor, unsigned int flags);
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index 189fd618ca65..9ebd2820ad1f 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -42,6 +42,25 @@
> struct iosys_map;
> struct drm_gem_object;
>
> +/**
> + * enum drm_gem_object_status - bitmask of object state for fdinfo reporting
> + * @DRM_GEM_OBJECT_RESIDENT: object is resident in memory (ie. not unpinned)
> + * @DRM_GEM_OBJECT_PURGEABLE: object marked as purgeable by userspace
> + *
> + * Bitmask of status used for fdinfo memory stats, see &drm_gem_object_funcs.status
> + * and drm_show_fdinfo(). Note that an object can DRM_GEM_OBJECT_PURGEABLE if

can be

> + * it still active or not resident, in which case drm_show_fdinfo() will not
it is

> + * account for it as purgeable. So drivers do not need to check if the buffer
> + * is idle and resident to return this bit. (Ie. userspace can mark a buffer
> + * as purgeable even while it is still busy on the GPU.. it does not _actually_
> + * become puregeable until it becomes idle. The status gem object func does
> + * not need to consider this.)
> + */
> +enum drm_gem_object_status {
> + DRM_GEM_OBJECT_RESIDENT = BIT(0),
> + DRM_GEM_OBJECT_PURGEABLE = BIT(1),
> +};

Why enum for a bitmask?

> +
> /**
> * struct drm_gem_object_funcs - GEM object functions
> */
> @@ -174,6 +193,17 @@ struct drm_gem_object_funcs {
> */
> int (*evict)(struct drm_gem_object *obj);
>
> + /**
> + * @status:
> + *
> + * The optional status callback can return additional object state
> + * which determines which stats the object is counted against. The
> + * callback is called under table_lock. Racing against object status
> + * change is "harmless", and the callback can expect to not race
> + * against object destruction.
> + */
> + enum drm_gem_object_status (*status)(struct drm_gem_object *obj);

Why not have this under driver vfuncs? Can you see an usecase where it
needs to be per object?

Modulo the details ie. on the high level I think this works. More
advanced drivers can re-use the exported drm_print_memory_stats and
amount of sharing-vs-duplication seems similar to my proposal so again,
I think it is an okay approach.

Regards,

Tvrtko

> +
> /**
> * @vm_ops:
> *

2023-04-28 11:21:52

by Tvrtko Ursulin

[permalink] [raw]
Subject: Re: [PATCH v2 8/9] drm/fdinfo: Add comm/cmdline override fields


On 27/04/2023 18:53, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> These are useful in particular for VM scenarios where the process which
> has opened to drm device file is just a proxy for the real user in a VM
> guest.
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> Documentation/gpu/drm-usage-stats.rst | 18 ++++++++++++++++++
> drivers/gpu/drm/drm_file.c | 15 +++++++++++++++
> include/drm/drm_file.h | 19 +++++++++++++++++++
> 3 files changed, 52 insertions(+)
>
> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> index 58dc0d3f8c58..e4877cf8089c 100644
> --- a/Documentation/gpu/drm-usage-stats.rst
> +++ b/Documentation/gpu/drm-usage-stats.rst
> @@ -73,6 +73,24 @@ scope of each device, in which case `drm-pdev` shall be present as well.
> Userspace should make sure to not double account any usage statistics by using
> the above described criteria in order to associate data to individual clients.
>
> +- drm-comm-override: <valstr>
> +
> +Returns the client executable override string. Some drivers support letting
> +userspace override this in cases where the userspace is simply a "proxy".
> +Such as is the case with virglrenderer drm native context, where the host
> +process is just forwarding command submission, etc, from guest userspace.
> +This allows the proxy to make visible the executable name of the actual
> +app in the VM guest.
> +
> +- drm-cmdline-override: <valstr>
> +
> +Returns the client cmdline override string. Some drivers support letting
> +userspace override this in cases where the userspace is simply a "proxy".
> +Such as is the case with virglrenderer drm native context, where the host
> +process is just forwarding command submission, etc, from guest userspace.
> +This allows the proxy to make visible the cmdline of the actual app in the
> +VM guest.

Perhaps it would be okay to save space here by not repeating the
description, like:

drm-comm-override: <valstr>
drm-cmdline-override: <valstr>

Long description blah blah...
This allows the proxy to make visible the _executable name *and* command
line_ blah blah..

> +
> Utilization
> ^^^^^^^^^^^
>
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 9321eb0bf020..d7514c313af1 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -178,6 +178,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
> spin_lock_init(&file->master_lookup_lock);
> mutex_init(&file->event_read_lock);
>
> + mutex_init(&file->override_lock);
> +
> if (drm_core_check_feature(dev, DRIVER_GEM))
> drm_gem_open(dev, file);
>
> @@ -292,6 +294,8 @@ void drm_file_free(struct drm_file *file)
> WARN_ON(!list_empty(&file->event_list));
>
> put_pid(file->pid);
> + kfree(file->override_comm);
> + kfree(file->override_cmdline);
> kfree(file);
> }
>
> @@ -995,6 +999,17 @@ void drm_show_fdinfo(struct seq_file *m, struct file *f)
> PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
> }
>
> + mutex_lock(&file->override_lock);

You could add a fast unlocked check before taking the mutex for no risk
apart a transient false negative. For 99.9999% of userspace it would
mean no pointless lock/unlock cycle.

> + if (file->override_comm) {
> + drm_printf(&p, "drm-comm-override:\t%s\n",
> + file->override_comm);
> + }
> + if (file->override_cmdline) {
> + drm_printf(&p, "drm-cmdline-override:\t%s\n",
> + file->override_cmdline);
> + }
> + mutex_unlock(&file->override_lock);
> +
> if (dev->driver->show_fdinfo)
> dev->driver->show_fdinfo(&p, file);
> }
> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> index 1339e925af52..604d05fa6f0c 100644
> --- a/include/drm/drm_file.h
> +++ b/include/drm/drm_file.h
> @@ -370,6 +370,25 @@ struct drm_file {
> */
> struct drm_prime_file_private prime;
>
> + /**
> + * @comm: Overridden task comm
> + *
> + * Accessed under override_lock
> + */
> + char *override_comm;
> +
> + /**
> + * @cmdline: Overridden task cmdline
> + *
> + * Accessed under override_lock
> + */
> + char *override_cmdline;
> +
> + /**
> + * @override_lock: Serialize access to override_comm and override_cmdline
> + */
> + struct mutex override_lock;
> +

I don't think this should go to drm just yet though. Only one driver can
make use of it so I'd leave it for later and print from msm_show_fdinfo
for now.

Regards,

Tvrtko

> /* private: */
> #if IS_ENABLED(CONFIG_DRM_LEGACY)
> unsigned long lock_count; /* DRI1 legacy lock count */

2023-04-28 14:36:57

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] drm/docs: Fix usage stats typos

On Fri, Apr 28, 2023 at 1:50 AM Christian König
<[email protected]> wrote:
>
> Am 27.04.23 um 19:53 schrieb Rob Clark:
> > From: Rob Clark <[email protected]>
> >
> > Fix a couple missing ':'s.
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > Reviewed-by: Rodrigo Vivi <[email protected]>
>
> Reviewed-by: Christian König <[email protected]>
>
> Since this is a pretty clear fix I suggest to get this pushed to reduce
> the number of patches in the set.

Thanks, this is fine by me if someone wants to push it for me. Note
that the later .rst updates in this series depend on this so if/when
they are merged it probably should be the same tree

BR,
-R

> Christian.
>
> > ---
> > Documentation/gpu/drm-usage-stats.rst | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> > index b46327356e80..72d069e5dacb 100644
> > --- a/Documentation/gpu/drm-usage-stats.rst
> > +++ b/Documentation/gpu/drm-usage-stats.rst
> > @@ -105,7 +105,7 @@ object belong to this client, in the respective memory region.
> > Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> > indicating kibi- or mebi-bytes.
> >
> > -- drm-cycles-<str> <uint>
> > +- drm-cycles-<str>: <uint>
> >
> > Engine identifier string must be the same as the one specified in the
> > drm-engine-<str> tag and shall contain the number of busy cycles for the given
> > @@ -117,7 +117,7 @@ larger value within a reasonable period. Upon observing a value lower than what
> > was previously read, userspace is expected to stay with that larger previous
> > value until a monotonic update is seen.
> >
> > -- drm-maxfreq-<str> <uint> [Hz|MHz|KHz]
> > +- drm-maxfreq-<str>: <uint> [Hz|MHz|KHz]
> >
> > Engine identifier string must be the same as the one specified in the
> > drm-engine-<str> tag and shall contain the maximum frequency for the given
>

2023-04-28 14:47:12

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] drm: Add fdinfo memory stats

On Fri, Apr 28, 2023 at 3:56 AM Tvrtko Ursulin
<[email protected]> wrote:
>
>
> On 27/04/2023 18:53, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > Add support to dump GEM stats to fdinfo.
> >
> > v2: Fix typos, change size units to match docs, use div_u64
> > v3: Do it in core
> > v4: more kerneldoc
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > Reviewed-by: Emil Velikov <[email protected]>
> > Reviewed-by: Daniel Vetter <[email protected]>
> > ---
> > Documentation/gpu/drm-usage-stats.rst | 54 +++++++++++----
> > drivers/gpu/drm/drm_file.c | 99 ++++++++++++++++++++++++++-
> > include/drm/drm_file.h | 19 +++++
> > include/drm/drm_gem.h | 30 ++++++++
> > 4 files changed, 189 insertions(+), 13 deletions(-)
> >
> > diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> > index 552195fb1ea3..bfc14150452c 100644
> > --- a/Documentation/gpu/drm-usage-stats.rst
> > +++ b/Documentation/gpu/drm-usage-stats.rst
> > @@ -52,6 +52,9 @@ String shall contain the name this driver registered as via the respective
> > Optional fully standardised keys
> > --------------------------------
> >
> > +Identification
> > +^^^^^^^^^^^^^^
> > +
> > - drm-pdev: <aaaa:bb.cc.d>
> >
> > For PCI devices this should contain the PCI slot address of the device in
> > @@ -69,6 +72,9 @@ scope of each device, in which case `drm-pdev` shall be present as well.
> > Userspace should make sure to not double account any usage statistics by using
> > the above described criteria in order to associate data to individual clients.
> >
> > +Utilization
> > +^^^^^^^^^^^
> > +
> > - drm-engine-<str>: <uint> ns
> >
> > GPUs usually contain multiple execution engines. Each shall be given a stable
> > @@ -93,18 +99,6 @@ exported engine corresponds to a group of identical hardware engines.
> > In the absence of this tag parser shall assume capacity of one. Zero capacity
> > is not allowed.
> >
> > -- drm-memory-<str>: <uint> [KiB|MiB]
> > -
> > -Each possible memory type which can be used to store buffer objects by the
> > -GPU in question shall be given a stable and unique name to be returned as the
> > -string here.
> > -
> > -Value shall reflect the amount of storage currently consumed by the buffer
> > -object belong to this client, in the respective memory region.
> > -
> > -Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> > -indicating kibi- or mebi-bytes.
> > -
> > - drm-cycles-<str>: <uint>
> >
> > Engine identifier string must be the same as the one specified in the
> > @@ -126,6 +120,42 @@ percentage utilization of the engine, whereas drm-engine-<str> only reflects
> > time active without considering what frequency the engine is operating as a
> > percentage of it's maximum frequency.
> >
> > +Memory
> > +^^^^^^
> > +
> > +- drm-memory-<region>: <uint> [KiB|MiB]
> > +
> > +Each possible memory type which can be used to store buffer objects by the
> > +GPU in question shall be given a stable and unique name to be returned as the
> > +string here. The name "memory" is reserved to refer to normal system memory.
>
> How is the name memory reserved, I mean when which part of the key?
> Obviously amdgpu exposes drm-memory-vram so it can't mean system memory
> there.
>
> [Comes back later]
>
> Ah I see.. you meant the _region_ name "memory" is reserved. Which
> applies to the below keys, not the one above. Hmm.. So for multi-region
> drivers you meant like:

right, I thought "drm-memory-memory" sounded silly, and otherwise
"memory" fit elsewhere as below, so "memory" seemed like a reasonable
region name ;-)

> drm-total-memory:
> drm-total-vram:
>
> Etc. Okay I think that works. All prefixes "drm-$category" become
> reserved ones effectively but I think that is okay.
>
> > +
> > +Value shall reflect the amount of storage currently consumed by the buffer
> > +object belong to this client, in the respective memory region.
>
> OMG it is all my fault for mentioning buffer objects here... :)
>
> Maybe just fix the plural while moving.
>
> Or maybe there is time to s/buffer objects/memory/ too? Why not I think.
> It would leave things more future proof.
>
> > +
> > +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> > +indicating kibi- or mebi-bytes.
> > +
> > +- drm-shared-<region>: <uint> [KiB|MiB]
> > +
> > +The total size of buffers that are shared with another file (ie. have more
> > +than a single handle).
> > +
> > +- drm-private-<region>: <uint> [KiB|MiB]
> > +
> > +The total size of buffers that are not shared with another file.
>
> You went back to private + shared for a specific reason? I thought we
> agreed total + shared can be less confusing.

opps, yes, I forgot to update the rst

> > +
> > +- drm-resident-<region>: <uint> [KiB|MiB]
> > +
> > +The total size of buffers that are resident in system memory.
>
> "..resident in the specified memory region."?
>
> > +
> > +- drm-purgeable-<region>: <uint> [KiB|MiB]
> > +
> > +The total size of buffers that are purgeable.
> > +
> > +- drm-active-<region>: <uint> [KiB|MiB]
> > +
> > +The total size of buffers that are active on one or more rings.
>
> Under utilisation we used 'engines' so introducing 'rings' at least
> needs clarification, maybe a terminology chapter? Or just use engines
> for consistency?

using "engines" works for me

> > +
> > Implementation Details
> > ======================
> >
> > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > index 6d5bdd684ae2..9321eb0bf020 100644
> > --- a/drivers/gpu/drm/drm_file.c
> > +++ b/drivers/gpu/drm/drm_file.c
> > @@ -42,6 +42,7 @@
> > #include <drm/drm_client.h>
> > #include <drm/drm_drv.h>
> > #include <drm/drm_file.h>
> > +#include <drm/drm_gem.h>
> > #include <drm/drm_print.h>
> >
> > #include "drm_crtc_internal.h"
> > @@ -871,9 +872,105 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
> > }
> > EXPORT_SYMBOL(drm_send_event);
> >
> > +static void print_size(struct drm_printer *p, const char *stat,
> > + const char *region, size_t sz)
> > +{
> > + const char *units[] = {"", " KiB", " MiB"};
> > + unsigned u;
> > +
> > + for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
> > + if (sz < SZ_1K)
> > + break;
> > + sz = div_u64(sz, SZ_1K);
> > + }
> > +
> > + drm_printf(p, "drm-%s-%s:\t%zu%s\n", stat, region, sz, units[u]);
> > +}
> > +
> > +/**
> > + * drm_print_memory_stats - A helper to print memory stats
> > + * @p: The printer to print output to
> > + * @stats: The collected memory stats
> > + * @supported_status: Bitmask of optional stats which are available
> > + * @region: The memory region
> > + *
> > + */
> > +void drm_print_memory_stats(struct drm_printer *p,
> > + const struct drm_memory_stats *stats,
> > + enum drm_gem_object_status supported_status,
> > + const char *region)
> > +{
> > + print_size(p, "total", region, stats->private + stats->shared);
> > + print_size(p, "shared", region, stats->shared);
>
> Ah just rst is out of date.
>
> > + print_size(p, "active", region, stats->active);
> > +
> > + if (supported_status & DRM_GEM_OBJECT_RESIDENT)
> > + print_size(p, "resident", region, stats->resident);
> > +
> > + if (supported_status & DRM_GEM_OBJECT_PURGEABLE)
> > + print_size(p, "purgeable", region, stats->purgeable);
> > +}
> > +EXPORT_SYMBOL(drm_print_memory_stats);
> > +
> > +/**
> > + * drm_show_memory_stats - Helper to collect and show standard fdinfo memory stats
> > + * @p: the printer to print output to
> > + * @file: the DRM file
> > + *
> > + * Helper to iterate over GEM objects with a handle allocated in the specified
> > + * file.
> > + */
> > +void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
> > +{
> > + struct drm_gem_object *obj;
> > + struct drm_memory_stats status = {};
> > + enum drm_gem_object_status supported_status;
> > + int id;
> > +
> > + spin_lock(&file->table_lock);
> > + idr_for_each_entry (&file->object_idr, obj, id) {
> > + enum drm_gem_object_status s = 0;
> > +
> > + if (obj->funcs && obj->funcs->status) {
> > + s = obj->funcs->status(obj);
> > + supported_status = DRM_GEM_OBJECT_RESIDENT |
> > + DRM_GEM_OBJECT_PURGEABLE;
>
> Whats the purpose of supported_status? It is never modified. Did you
> intend for the vfunc to be returning this?

for now, simply to avoid showing fields for drivers which don't
support the vfunc.. it could be made more fine grained later if the
need arises.

> > + }
> > +
> > + if (obj->handle_count > 1) {
> > + status.shared += obj->size;
> > + } else {
> > + status.private += obj->size;
> > + }
> > +
> > + if (s & DRM_GEM_OBJECT_RESIDENT) {
> > + status.resident += obj->size;
> > + } else {
> > + /* If already purged or not yet backed by pages, don't
> > + * count it as purgeable:
> > + */
> > + s &= ~DRM_GEM_OBJECT_PURGEABLE;
> > + }
>
> Again, why couldn't a resident object also be purgeable?

it is the other way around.. if it isn't backed by pages (ie. already
purged, etc) it shouldn't count as purgeable

> > +
> > + if (!dma_resv_test_signaled(obj->resv, dma_resv_usage_rw(true))) {
> > + status.active += obj->size;
> > +
> > + /* If still active, don't count as purgeable: */
> > + s &= ~DRM_GEM_OBJECT_PURGEABLE;
>
> Also add it to resident if driver hasn't advertised
> DRM_GEM_OBJECT_RESIDENT? Not much value so not sure.
>
> > + }
> > +
> > + if (s & DRM_GEM_OBJECT_PURGEABLE)
> > + status.purgeable += obj->size;
> > + }
> > + spin_unlock(&file->table_lock);
> > +
> > + drm_print_memory_stats(p, &status, supported_status, "memory");
> > +}
> > +EXPORT_SYMBOL(drm_show_memory_stats);
> > +
> > /**
> > * drm_show_fdinfo - helper for drm file fops
> > - * @seq_file: output stream
> > + * @m: output stream
> > * @f: the device file instance
> > *
> > * Helper to implement fdinfo, for userspace to query usage stats, etc, of a
> > diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> > index 6de6d0e9c634..1339e925af52 100644
> > --- a/include/drm/drm_file.h
> > +++ b/include/drm/drm_file.h
> > @@ -41,6 +41,7 @@
> > struct dma_fence;
> > struct drm_file;
> > struct drm_device;
> > +struct drm_printer;
> > struct device;
> > struct file;
> >
> > @@ -440,6 +441,24 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
> > void drm_send_event_timestamp_locked(struct drm_device *dev,
> > struct drm_pending_event *e,
> > ktime_t timestamp);
> > +
> > +
> > +struct drm_memory_stats {
> > + size_t shared;
> > + size_t private;
> > + size_t resident;
> > + size_t purgeable;
> > + size_t active;
> > +};
>
> Is size_t enough? I'd be tempted to just make it u64.

hmm, >4GB VRAM on 32b system? Seems dubious but I guess u64 would be
fine too...

> > +
> > +enum drm_gem_object_status;
> > +
> > +void drm_print_memory_stats(struct drm_printer *p,
> > + const struct drm_memory_stats *stats,
> > + enum drm_gem_object_status supported_status,
> > + const char *region);
> > +
> > +void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file);
> > void drm_show_fdinfo(struct seq_file *m, struct file *f);
> >
> > struct file *mock_drm_getfile(struct drm_minor *minor, unsigned int flags);
> > diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> > index 189fd618ca65..9ebd2820ad1f 100644
> > --- a/include/drm/drm_gem.h
> > +++ b/include/drm/drm_gem.h
> > @@ -42,6 +42,25 @@
> > struct iosys_map;
> > struct drm_gem_object;
> >
> > +/**
> > + * enum drm_gem_object_status - bitmask of object state for fdinfo reporting
> > + * @DRM_GEM_OBJECT_RESIDENT: object is resident in memory (ie. not unpinned)
> > + * @DRM_GEM_OBJECT_PURGEABLE: object marked as purgeable by userspace
> > + *
> > + * Bitmask of status used for fdinfo memory stats, see &drm_gem_object_funcs.status
> > + * and drm_show_fdinfo(). Note that an object can DRM_GEM_OBJECT_PURGEABLE if
>
> can be
>
> > + * it still active or not resident, in which case drm_show_fdinfo() will not
> it is
>
> > + * account for it as purgeable. So drivers do not need to check if the buffer
> > + * is idle and resident to return this bit. (Ie. userspace can mark a buffer
> > + * as purgeable even while it is still busy on the GPU.. it does not _actually_
> > + * become puregeable until it becomes idle. The status gem object func does
> > + * not need to consider this.)
> > + */
> > +enum drm_gem_object_status {
> > + DRM_GEM_OBJECT_RESIDENT = BIT(0),
> > + DRM_GEM_OBJECT_PURGEABLE = BIT(1),
> > +};
>
> Why enum for a bitmask?

I guess personal preference, #define's have no type so they aren't
linked to the variable

> > +
> > /**
> > * struct drm_gem_object_funcs - GEM object functions
> > */
> > @@ -174,6 +193,17 @@ struct drm_gem_object_funcs {
> > */
> > int (*evict)(struct drm_gem_object *obj);
> >
> > + /**
> > + * @status:
> > + *
> > + * The optional status callback can return additional object state
> > + * which determines which stats the object is counted against. The
> > + * callback is called under table_lock. Racing against object status
> > + * change is "harmless", and the callback can expect to not race
> > + * against object destruction.
> > + */
> > + enum drm_gem_object_status (*status)(struct drm_gem_object *obj);
>
> Why not have this under driver vfuncs? Can you see an usecase where it
> needs to be per object?

Probably doesn't need to be per object, but putting it in obj vfuncs
lets the driver keep it static in their foo_gem_stuff.c

> Modulo the details ie. on the high level I think this works. More
> advanced drivers can re-use the exported drm_print_memory_stats and
> amount of sharing-vs-duplication seems similar to my proposal so again,
> I think it is an okay approach.

yup, this was the intent w/ drm_print_memory_stats()

BR,
-R

> Regards,
>
> Tvrtko
>
> > +
> > /**
> > * @vm_ops:
> > *

2023-05-01 17:06:56

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v2 8/9] drm/fdinfo: Add comm/cmdline override fields

On Fri, Apr 28, 2023 at 4:05 AM Tvrtko Ursulin
<[email protected]> wrote:
>
>
> On 27/04/2023 18:53, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > These are useful in particular for VM scenarios where the process which
> > has opened to drm device file is just a proxy for the real user in a VM
> > guest.
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > Documentation/gpu/drm-usage-stats.rst | 18 ++++++++++++++++++
> > drivers/gpu/drm/drm_file.c | 15 +++++++++++++++
> > include/drm/drm_file.h | 19 +++++++++++++++++++
> > 3 files changed, 52 insertions(+)
> >
> > diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> > index 58dc0d3f8c58..e4877cf8089c 100644
> > --- a/Documentation/gpu/drm-usage-stats.rst
> > +++ b/Documentation/gpu/drm-usage-stats.rst
> > @@ -73,6 +73,24 @@ scope of each device, in which case `drm-pdev` shall be present as well.
> > Userspace should make sure to not double account any usage statistics by using
> > the above described criteria in order to associate data to individual clients.
> >
> > +- drm-comm-override: <valstr>
> > +
> > +Returns the client executable override string. Some drivers support letting
> > +userspace override this in cases where the userspace is simply a "proxy".
> > +Such as is the case with virglrenderer drm native context, where the host
> > +process is just forwarding command submission, etc, from guest userspace.
> > +This allows the proxy to make visible the executable name of the actual
> > +app in the VM guest.
> > +
> > +- drm-cmdline-override: <valstr>
> > +
> > +Returns the client cmdline override string. Some drivers support letting
> > +userspace override this in cases where the userspace is simply a "proxy".
> > +Such as is the case with virglrenderer drm native context, where the host
> > +process is just forwarding command submission, etc, from guest userspace.
> > +This allows the proxy to make visible the cmdline of the actual app in the
> > +VM guest.
>
> Perhaps it would be okay to save space here by not repeating the
> description, like:
>
> drm-comm-override: <valstr>
> drm-cmdline-override: <valstr>
>
> Long description blah blah...
> This allows the proxy to make visible the _executable name *and* command
> line_ blah blah..
>
> > +
> > Utilization
> > ^^^^^^^^^^^
> >
> > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > index 9321eb0bf020..d7514c313af1 100644
> > --- a/drivers/gpu/drm/drm_file.c
> > +++ b/drivers/gpu/drm/drm_file.c
> > @@ -178,6 +178,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
> > spin_lock_init(&file->master_lookup_lock);
> > mutex_init(&file->event_read_lock);
> >
> > + mutex_init(&file->override_lock);
> > +
> > if (drm_core_check_feature(dev, DRIVER_GEM))
> > drm_gem_open(dev, file);
> >
> > @@ -292,6 +294,8 @@ void drm_file_free(struct drm_file *file)
> > WARN_ON(!list_empty(&file->event_list));
> >
> > put_pid(file->pid);
> > + kfree(file->override_comm);
> > + kfree(file->override_cmdline);
> > kfree(file);
> > }
> >
> > @@ -995,6 +999,17 @@ void drm_show_fdinfo(struct seq_file *m, struct file *f)
> > PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
> > }
> >
> > + mutex_lock(&file->override_lock);
>
> You could add a fast unlocked check before taking the mutex for no risk
> apart a transient false negative. For 99.9999% of userspace it would
> mean no pointless lock/unlock cycle.

I'm not sure I get your point? This needs to be serialized against
userspace setting the override values

>
> > + if (file->override_comm) {
> > + drm_printf(&p, "drm-comm-override:\t%s\n",
> > + file->override_comm);
> > + }
> > + if (file->override_cmdline) {
> > + drm_printf(&p, "drm-cmdline-override:\t%s\n",
> > + file->override_cmdline);
> > + }
> > + mutex_unlock(&file->override_lock);
> > +
> > if (dev->driver->show_fdinfo)
> > dev->driver->show_fdinfo(&p, file);
> > }
> > diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> > index 1339e925af52..604d05fa6f0c 100644
> > --- a/include/drm/drm_file.h
> > +++ b/include/drm/drm_file.h
> > @@ -370,6 +370,25 @@ struct drm_file {
> > */
> > struct drm_prime_file_private prime;
> >
> > + /**
> > + * @comm: Overridden task comm
> > + *
> > + * Accessed under override_lock
> > + */
> > + char *override_comm;
> > +
> > + /**
> > + * @cmdline: Overridden task cmdline
> > + *
> > + * Accessed under override_lock
> > + */
> > + char *override_cmdline;
> > +
> > + /**
> > + * @override_lock: Serialize access to override_comm and override_cmdline
> > + */
> > + struct mutex override_lock;
> > +
>
> I don't think this should go to drm just yet though. Only one driver can
> make use of it so I'd leave it for later and print from msm_show_fdinfo
> for now.

This was my original approach but danvet asked that it be moved into
drm for consistency across drivers. (And really, I want the in-flight
amd and intel native-context stuff to motivate adding similar features
to amdgpu/i915/xe.)

BR,
-R

> Regards,
>
> Tvrtko
>
> > /* private: */
> > #if IS_ENABLED(CONFIG_DRM_LEGACY)
> > unsigned long lock_count; /* DRI1 legacy lock count */

2023-05-02 08:04:16

by Tvrtko Ursulin

[permalink] [raw]
Subject: Re: [PATCH v2 8/9] drm/fdinfo: Add comm/cmdline override fields


On 01/05/2023 17:58, Rob Clark wrote:
> On Fri, Apr 28, 2023 at 4:05 AM Tvrtko Ursulin
> <[email protected]> wrote:
>>
>>
>> On 27/04/2023 18:53, Rob Clark wrote:
>>> From: Rob Clark <[email protected]>
>>>
>>> These are useful in particular for VM scenarios where the process which
>>> has opened to drm device file is just a proxy for the real user in a VM
>>> guest.
>>>
>>> Signed-off-by: Rob Clark <[email protected]>
>>> ---
>>> Documentation/gpu/drm-usage-stats.rst | 18 ++++++++++++++++++
>>> drivers/gpu/drm/drm_file.c | 15 +++++++++++++++
>>> include/drm/drm_file.h | 19 +++++++++++++++++++
>>> 3 files changed, 52 insertions(+)
>>>
>>> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
>>> index 58dc0d3f8c58..e4877cf8089c 100644
>>> --- a/Documentation/gpu/drm-usage-stats.rst
>>> +++ b/Documentation/gpu/drm-usage-stats.rst
>>> @@ -73,6 +73,24 @@ scope of each device, in which case `drm-pdev` shall be present as well.
>>> Userspace should make sure to not double account any usage statistics by using
>>> the above described criteria in order to associate data to individual clients.
>>>
>>> +- drm-comm-override: <valstr>
>>> +
>>> +Returns the client executable override string. Some drivers support letting
>>> +userspace override this in cases where the userspace is simply a "proxy".
>>> +Such as is the case with virglrenderer drm native context, where the host
>>> +process is just forwarding command submission, etc, from guest userspace.
>>> +This allows the proxy to make visible the executable name of the actual
>>> +app in the VM guest.
>>> +
>>> +- drm-cmdline-override: <valstr>
>>> +
>>> +Returns the client cmdline override string. Some drivers support letting
>>> +userspace override this in cases where the userspace is simply a "proxy".
>>> +Such as is the case with virglrenderer drm native context, where the host
>>> +process is just forwarding command submission, etc, from guest userspace.
>>> +This allows the proxy to make visible the cmdline of the actual app in the
>>> +VM guest.
>>
>> Perhaps it would be okay to save space here by not repeating the
>> description, like:
>>
>> drm-comm-override: <valstr>
>> drm-cmdline-override: <valstr>
>>
>> Long description blah blah...
>> This allows the proxy to make visible the _executable name *and* command
>> line_ blah blah..
>>
>>> +
>>> Utilization
>>> ^^^^^^^^^^^
>>>
>>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>>> index 9321eb0bf020..d7514c313af1 100644
>>> --- a/drivers/gpu/drm/drm_file.c
>>> +++ b/drivers/gpu/drm/drm_file.c
>>> @@ -178,6 +178,8 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
>>> spin_lock_init(&file->master_lookup_lock);
>>> mutex_init(&file->event_read_lock);
>>>
>>> + mutex_init(&file->override_lock);
>>> +
>>> if (drm_core_check_feature(dev, DRIVER_GEM))
>>> drm_gem_open(dev, file);
>>>
>>> @@ -292,6 +294,8 @@ void drm_file_free(struct drm_file *file)
>>> WARN_ON(!list_empty(&file->event_list));
>>>
>>> put_pid(file->pid);
>>> + kfree(file->override_comm);
>>> + kfree(file->override_cmdline);
>>> kfree(file);
>>> }
>>>
>>> @@ -995,6 +999,17 @@ void drm_show_fdinfo(struct seq_file *m, struct file *f)
>>> PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
>>> }
>>>
>>> + mutex_lock(&file->override_lock);
>>
>> You could add a fast unlocked check before taking the mutex for no risk
>> apart a transient false negative. For 99.9999% of userspace it would
>> mean no pointless lock/unlock cycle.
>
> I'm not sure I get your point? This needs to be serialized against
> userspace setting the override values

if (file->override_comm || file->override_cmdline) {
mutex_lock(&file->override_lock);
if (file->override_comm)
drm_printf(&p, "drm-comm-override:\t%s\n",
file->override_comm);
if (file->override_cmdline)
drm_printf(&p, "drm-cmdline-override:\t%s\n",
file->override_cmdline);
mutext_unlock(&file->override_lock);
}

No risk apart for a transient false negative (which is immaterial for
userspace since fdinfo reads are not ordered versus the override setting
anyway) and 99.9% of deployments can get by not needing to pointlessly
cycle the lock.

>
>>
>>> + if (file->override_comm) {
>>> + drm_printf(&p, "drm-comm-override:\t%s\n",
>>> + file->override_comm);
>>> + }
>>> + if (file->override_cmdline) {
>>> + drm_printf(&p, "drm-cmdline-override:\t%s\n",
>>> + file->override_cmdline);
>>> + }
>>> + mutex_unlock(&file->override_lock);
>>> +
>>> if (dev->driver->show_fdinfo)
>>> dev->driver->show_fdinfo(&p, file);
>>> }
>>> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
>>> index 1339e925af52..604d05fa6f0c 100644
>>> --- a/include/drm/drm_file.h
>>> +++ b/include/drm/drm_file.h
>>> @@ -370,6 +370,25 @@ struct drm_file {
>>> */
>>> struct drm_prime_file_private prime;
>>>
>>> + /**
>>> + * @comm: Overridden task comm
>>> + *
>>> + * Accessed under override_lock
>>> + */
>>> + char *override_comm;
>>> +
>>> + /**
>>> + * @cmdline: Overridden task cmdline
>>> + *
>>> + * Accessed under override_lock
>>> + */
>>> + char *override_cmdline;
>>> +
>>> + /**
>>> + * @override_lock: Serialize access to override_comm and override_cmdline
>>> + */
>>> + struct mutex override_lock;
>>> +
>>
>> I don't think this should go to drm just yet though. Only one driver can
>> make use of it so I'd leave it for later and print from msm_show_fdinfo
>> for now.
>
> This was my original approach but danvet asked that it be moved into
> drm for consistency across drivers. (And really, I want the in-flight
> amd and intel native-context stuff to motivate adding similar features
> to amdgpu/i915/xe.)

IMO if implementation is not shared, not even by using helpers, I don't
think data storage should be either, but it's not a deal breaker.

Regards,

Tvrtko

>
> BR,
> -R
>
>> Regards,
>>
>> Tvrtko
>>
>>> /* private: */
>>> #if IS_ENABLED(CONFIG_DRM_LEGACY)
>>> unsigned long lock_count; /* DRI1 legacy lock count */

2023-05-02 08:37:03

by Tvrtko Ursulin

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] drm: Add fdinfo memory stats


On 28/04/2023 15:45, Rob Clark wrote:
> On Fri, Apr 28, 2023 at 3:56 AM Tvrtko Ursulin
> <[email protected]> wrote:
>>
>>
>> On 27/04/2023 18:53, Rob Clark wrote:
>>> From: Rob Clark <[email protected]>
>>>
>>> Add support to dump GEM stats to fdinfo.
>>>
>>> v2: Fix typos, change size units to match docs, use div_u64
>>> v3: Do it in core
>>> v4: more kerneldoc
>>>
>>> Signed-off-by: Rob Clark <[email protected]>
>>> Reviewed-by: Emil Velikov <[email protected]>
>>> Reviewed-by: Daniel Vetter <[email protected]>
>>> ---
>>> Documentation/gpu/drm-usage-stats.rst | 54 +++++++++++----
>>> drivers/gpu/drm/drm_file.c | 99 ++++++++++++++++++++++++++-
>>> include/drm/drm_file.h | 19 +++++
>>> include/drm/drm_gem.h | 30 ++++++++
>>> 4 files changed, 189 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
>>> index 552195fb1ea3..bfc14150452c 100644
>>> --- a/Documentation/gpu/drm-usage-stats.rst
>>> +++ b/Documentation/gpu/drm-usage-stats.rst
>>> @@ -52,6 +52,9 @@ String shall contain the name this driver registered as via the respective
>>> Optional fully standardised keys
>>> --------------------------------
>>>
>>> +Identification
>>> +^^^^^^^^^^^^^^
>>> +
>>> - drm-pdev: <aaaa:bb.cc.d>
>>>
>>> For PCI devices this should contain the PCI slot address of the device in
>>> @@ -69,6 +72,9 @@ scope of each device, in which case `drm-pdev` shall be present as well.
>>> Userspace should make sure to not double account any usage statistics by using
>>> the above described criteria in order to associate data to individual clients.
>>>
>>> +Utilization
>>> +^^^^^^^^^^^
>>> +
>>> - drm-engine-<str>: <uint> ns
>>>
>>> GPUs usually contain multiple execution engines. Each shall be given a stable
>>> @@ -93,18 +99,6 @@ exported engine corresponds to a group of identical hardware engines.
>>> In the absence of this tag parser shall assume capacity of one. Zero capacity
>>> is not allowed.
>>>
>>> -- drm-memory-<str>: <uint> [KiB|MiB]
>>> -
>>> -Each possible memory type which can be used to store buffer objects by the
>>> -GPU in question shall be given a stable and unique name to be returned as the
>>> -string here.
>>> -
>>> -Value shall reflect the amount of storage currently consumed by the buffer
>>> -object belong to this client, in the respective memory region.
>>> -
>>> -Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
>>> -indicating kibi- or mebi-bytes.
>>> -
>>> - drm-cycles-<str>: <uint>
>>>
>>> Engine identifier string must be the same as the one specified in the
>>> @@ -126,6 +120,42 @@ percentage utilization of the engine, whereas drm-engine-<str> only reflects
>>> time active without considering what frequency the engine is operating as a
>>> percentage of it's maximum frequency.
>>>
>>> +Memory
>>> +^^^^^^
>>> +
>>> +- drm-memory-<region>: <uint> [KiB|MiB]
>>> +
>>> +Each possible memory type which can be used to store buffer objects by the
>>> +GPU in question shall be given a stable and unique name to be returned as the
>>> +string here. The name "memory" is reserved to refer to normal system memory.
>>
>> How is the name memory reserved, I mean when which part of the key?
>> Obviously amdgpu exposes drm-memory-vram so it can't mean system memory
>> there.
>>
>> [Comes back later]
>>
>> Ah I see.. you meant the _region_ name "memory" is reserved. Which
>> applies to the below keys, not the one above. Hmm.. So for multi-region
>> drivers you meant like:
>
> right, I thought "drm-memory-memory" sounded silly, and otherwise
> "memory" fit elsewhere as below, so "memory" seemed like a reasonable
> region name ;-)
>
>> drm-total-memory:
>> drm-total-vram:
>>
>> Etc. Okay I think that works. All prefixes "drm-$category" become
>> reserved ones effectively but I think that is okay.
>>
>>> +
>>> +Value shall reflect the amount of storage currently consumed by the buffer
>>> +object belong to this client, in the respective memory region.
>>
>> OMG it is all my fault for mentioning buffer objects here... :)
>>
>> Maybe just fix the plural while moving.
>>
>> Or maybe there is time to s/buffer objects/memory/ too? Why not I think.
>> It would leave things more future proof.
>>
>>> +
>>> +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
>>> +indicating kibi- or mebi-bytes.
>>> +
>>> +- drm-shared-<region>: <uint> [KiB|MiB]
>>> +
>>> +The total size of buffers that are shared with another file (ie. have more
>>> +than a single handle).
>>> +
>>> +- drm-private-<region>: <uint> [KiB|MiB]
>>> +
>>> +The total size of buffers that are not shared with another file.
>>
>> You went back to private + shared for a specific reason? I thought we
>> agreed total + shared can be less confusing.
>
> opps, yes, I forgot to update the rst
>
>>> +
>>> +- drm-resident-<region>: <uint> [KiB|MiB]
>>> +
>>> +The total size of buffers that are resident in system memory.
>>
>> "..resident in the specified memory region."?
>>
>>> +
>>> +- drm-purgeable-<region>: <uint> [KiB|MiB]
>>> +
>>> +The total size of buffers that are purgeable.
>>> +
>>> +- drm-active-<region>: <uint> [KiB|MiB]
>>> +
>>> +The total size of buffers that are active on one or more rings.
>>
>> Under utilisation we used 'engines' so introducing 'rings' at least
>> needs clarification, maybe a terminology chapter? Or just use engines
>> for consistency?
>
> using "engines" works for me
>
>>> +
>>> Implementation Details
>>> ======================
>>>
>>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>>> index 6d5bdd684ae2..9321eb0bf020 100644
>>> --- a/drivers/gpu/drm/drm_file.c
>>> +++ b/drivers/gpu/drm/drm_file.c
>>> @@ -42,6 +42,7 @@
>>> #include <drm/drm_client.h>
>>> #include <drm/drm_drv.h>
>>> #include <drm/drm_file.h>
>>> +#include <drm/drm_gem.h>
>>> #include <drm/drm_print.h>
>>>
>>> #include "drm_crtc_internal.h"
>>> @@ -871,9 +872,105 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
>>> }
>>> EXPORT_SYMBOL(drm_send_event);
>>>
>>> +static void print_size(struct drm_printer *p, const char *stat,
>>> + const char *region, size_t sz)
>>> +{
>>> + const char *units[] = {"", " KiB", " MiB"};
>>> + unsigned u;
>>> +
>>> + for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
>>> + if (sz < SZ_1K)
>>> + break;
>>> + sz = div_u64(sz, SZ_1K);
>>> + }
>>> +
>>> + drm_printf(p, "drm-%s-%s:\t%zu%s\n", stat, region, sz, units[u]);
>>> +}
>>> +
>>> +/**
>>> + * drm_print_memory_stats - A helper to print memory stats
>>> + * @p: The printer to print output to
>>> + * @stats: The collected memory stats
>>> + * @supported_status: Bitmask of optional stats which are available
>>> + * @region: The memory region
>>> + *
>>> + */
>>> +void drm_print_memory_stats(struct drm_printer *p,
>>> + const struct drm_memory_stats *stats,
>>> + enum drm_gem_object_status supported_status,
>>> + const char *region)
>>> +{
>>> + print_size(p, "total", region, stats->private + stats->shared);
>>> + print_size(p, "shared", region, stats->shared);
>>
>> Ah just rst is out of date.
>>
>>> + print_size(p, "active", region, stats->active);
>>> +
>>> + if (supported_status & DRM_GEM_OBJECT_RESIDENT)
>>> + print_size(p, "resident", region, stats->resident);
>>> +
>>> + if (supported_status & DRM_GEM_OBJECT_PURGEABLE)
>>> + print_size(p, "purgeable", region, stats->purgeable);
>>> +}
>>> +EXPORT_SYMBOL(drm_print_memory_stats);
>>> +
>>> +/**
>>> + * drm_show_memory_stats - Helper to collect and show standard fdinfo memory stats
>>> + * @p: the printer to print output to
>>> + * @file: the DRM file
>>> + *
>>> + * Helper to iterate over GEM objects with a handle allocated in the specified
>>> + * file.
>>> + */
>>> +void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file)
>>> +{
>>> + struct drm_gem_object *obj;
>>> + struct drm_memory_stats status = {};
>>> + enum drm_gem_object_status supported_status;
>>> + int id;
>>> +
>>> + spin_lock(&file->table_lock);
>>> + idr_for_each_entry (&file->object_idr, obj, id) {
>>> + enum drm_gem_object_status s = 0;
>>> +
>>> + if (obj->funcs && obj->funcs->status) {
>>> + s = obj->funcs->status(obj);
>>> + supported_status = DRM_GEM_OBJECT_RESIDENT |
>>> + DRM_GEM_OBJECT_PURGEABLE;
>>
>> Whats the purpose of supported_status? It is never modified. Did you
>> intend for the vfunc to be returning this?
>
> for now, simply to avoid showing fields for drivers which don't
> support the vfunc.. it could be made more fine grained later if the
> need arises.

I think if vfunc returned a bitmask of supported statuses that would be
better.

>>> + }
>>> +
>>> + if (obj->handle_count > 1) {
>>> + status.shared += obj->size;
>>> + } else {
>>> + status.private += obj->size;
>>> + }
>>> +
>>> + if (s & DRM_GEM_OBJECT_RESIDENT) {
>>> + status.resident += obj->size;
>>> + } else {
>>> + /* If already purged or not yet backed by pages, don't
>>> + * count it as purgeable:
>>> + */
>>> + s &= ~DRM_GEM_OBJECT_PURGEABLE;
>>> + }
>>
>> Again, why couldn't a resident object also be purgeable?
>
> it is the other way around.. if it isn't backed by pages (ie. already
> purged, etc) it shouldn't count as purgeable

Oops, may bad.

>>> +
>>> + if (!dma_resv_test_signaled(obj->resv, dma_resv_usage_rw(true))) {
>>> + status.active += obj->size;
>>> +
>>> + /* If still active, don't count as purgeable: */
>>> + s &= ~DRM_GEM_OBJECT_PURGEABLE;
>>
>> Also add it to resident if driver hasn't advertised
>> DRM_GEM_OBJECT_RESIDENT? Not much value so not sure.
>>
>>> + }
>>> +
>>> + if (s & DRM_GEM_OBJECT_PURGEABLE)
>>> + status.purgeable += obj->size;
>>> + }
>>> + spin_unlock(&file->table_lock);
>>> +
>>> + drm_print_memory_stats(p, &status, supported_status, "memory");
>>> +}
>>> +EXPORT_SYMBOL(drm_show_memory_stats);
>>> +
>>> /**
>>> * drm_show_fdinfo - helper for drm file fops
>>> - * @seq_file: output stream
>>> + * @m: output stream
>>> * @f: the device file instance
>>> *
>>> * Helper to implement fdinfo, for userspace to query usage stats, etc, of a
>>> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
>>> index 6de6d0e9c634..1339e925af52 100644
>>> --- a/include/drm/drm_file.h
>>> +++ b/include/drm/drm_file.h
>>> @@ -41,6 +41,7 @@
>>> struct dma_fence;
>>> struct drm_file;
>>> struct drm_device;
>>> +struct drm_printer;
>>> struct device;
>>> struct file;
>>>
>>> @@ -440,6 +441,24 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e);
>>> void drm_send_event_timestamp_locked(struct drm_device *dev,
>>> struct drm_pending_event *e,
>>> ktime_t timestamp);
>>> +
>>> +
>>> +struct drm_memory_stats {
>>> + size_t shared;
>>> + size_t private;
>>> + size_t resident;
>>> + size_t purgeable;
>>> + size_t active;
>>> +};
>>
>> Is size_t enough? I'd be tempted to just make it u64.
>
> hmm, >4GB VRAM on 32b system? Seems dubious but I guess u64 would be
> fine too...

Or just two ioctls to overflow it with any driver which does delayed/on
first use allocation?

Regards,

Tvrtko

>>> +
>>> +enum drm_gem_object_status;
>>> +
>>> +void drm_print_memory_stats(struct drm_printer *p,
>>> + const struct drm_memory_stats *stats,
>>> + enum drm_gem_object_status supported_status,
>>> + const char *region);
>>> +
>>> +void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file);
>>> void drm_show_fdinfo(struct seq_file *m, struct file *f);
>>>
>>> struct file *mock_drm_getfile(struct drm_minor *minor, unsigned int flags);
>>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
>>> index 189fd618ca65..9ebd2820ad1f 100644
>>> --- a/include/drm/drm_gem.h
>>> +++ b/include/drm/drm_gem.h
>>> @@ -42,6 +42,25 @@
>>> struct iosys_map;
>>> struct drm_gem_object;
>>>
>>> +/**
>>> + * enum drm_gem_object_status - bitmask of object state for fdinfo reporting
>>> + * @DRM_GEM_OBJECT_RESIDENT: object is resident in memory (ie. not unpinned)
>>> + * @DRM_GEM_OBJECT_PURGEABLE: object marked as purgeable by userspace
>>> + *
>>> + * Bitmask of status used for fdinfo memory stats, see &drm_gem_object_funcs.status
>>> + * and drm_show_fdinfo(). Note that an object can DRM_GEM_OBJECT_PURGEABLE if
>>
>> can be
>>
>>> + * it still active or not resident, in which case drm_show_fdinfo() will not
>> it is
>>
>>> + * account for it as purgeable. So drivers do not need to check if the buffer
>>> + * is idle and resident to return this bit. (Ie. userspace can mark a buffer
>>> + * as purgeable even while it is still busy on the GPU.. it does not _actually_
>>> + * become puregeable until it becomes idle. The status gem object func does
>>> + * not need to consider this.)
>>> + */
>>> +enum drm_gem_object_status {
>>> + DRM_GEM_OBJECT_RESIDENT = BIT(0),
>>> + DRM_GEM_OBJECT_PURGEABLE = BIT(1),
>>> +};
>>
>> Why enum for a bitmask?
>
> I guess personal preference, #define's have no type so they aren't
> linked to the variable
>
>>> +
>>> /**
>>> * struct drm_gem_object_funcs - GEM object functions
>>> */
>>> @@ -174,6 +193,17 @@ struct drm_gem_object_funcs {
>>> */
>>> int (*evict)(struct drm_gem_object *obj);
>>>
>>> + /**
>>> + * @status:
>>> + *
>>> + * The optional status callback can return additional object state
>>> + * which determines which stats the object is counted against. The
>>> + * callback is called under table_lock. Racing against object status
>>> + * change is "harmless", and the callback can expect to not race
>>> + * against object destruction.
>>> + */
>>> + enum drm_gem_object_status (*status)(struct drm_gem_object *obj);
>>
>> Why not have this under driver vfuncs? Can you see an usecase where it
>> needs to be per object?
>
> Probably doesn't need to be per object, but putting it in obj vfuncs
> lets the driver keep it static in their foo_gem_stuff.c
>
>> Modulo the details ie. on the high level I think this works. More
>> advanced drivers can re-use the exported drm_print_memory_stats and
>> amount of sharing-vs-duplication seems similar to my proposal so again,
>> I think it is an okay approach.
>
> yup, this was the intent w/ drm_print_memory_stats()
>
> BR,
> -R
>
>> Regards,
>>
>> Tvrtko
>>
>>> +
>>> /**
>>> * @vm_ops:
>>> *

2023-05-18 09:52:44

by Tvrtko Ursulin

[permalink] [raw]
Subject: Re: [PATCH v2 8/9] drm/fdinfo: Add comm/cmdline override fields


In case you were waiting for me looking at the rest of the series, there
was this reply from the previous round I can expand on.

On 02/05/2023 08:50, Tvrtko Ursulin wrote:
>
> On 01/05/2023 17:58, Rob Clark wrote:
>> On Fri, Apr 28, 2023 at 4:05 AM Tvrtko Ursulin
>> <[email protected]> wrote:
>>>
>>>
>>> On 27/04/2023 18:53, Rob Clark wrote:
>>>> From: Rob Clark <[email protected]>
>>>>
>>>> These are useful in particular for VM scenarios where the process which
>>>> has opened to drm device file is just a proxy for the real user in a VM
>>>> guest.
>>>>
>>>> Signed-off-by: Rob Clark <[email protected]>
>>>> ---
>>>>    Documentation/gpu/drm-usage-stats.rst | 18 ++++++++++++++++++
>>>>    drivers/gpu/drm/drm_file.c            | 15 +++++++++++++++
>>>>    include/drm/drm_file.h                | 19 +++++++++++++++++++
>>>>    3 files changed, 52 insertions(+)
>>>>
>>>> diff --git a/Documentation/gpu/drm-usage-stats.rst
>>>> b/Documentation/gpu/drm-usage-stats.rst
>>>> index 58dc0d3f8c58..e4877cf8089c 100644
>>>> --- a/Documentation/gpu/drm-usage-stats.rst
>>>> +++ b/Documentation/gpu/drm-usage-stats.rst
>>>> @@ -73,6 +73,24 @@ scope of each device, in which case `drm-pdev`
>>>> shall be present as well.
>>>>    Userspace should make sure to not double account any usage
>>>> statistics by using
>>>>    the above described criteria in order to associate data to
>>>> individual clients.
>>>>
>>>> +- drm-comm-override: <valstr>
>>>> +
>>>> +Returns the client executable override string.  Some drivers
>>>> support letting
>>>> +userspace override this in cases where the userspace is simply a
>>>> "proxy".
>>>> +Such as is the case with virglrenderer drm native context, where
>>>> the host
>>>> +process is just forwarding command submission, etc, from guest
>>>> userspace.
>>>> +This allows the proxy to make visible the executable name of the
>>>> actual
>>>> +app in the VM guest.
>>>> +
>>>> +- drm-cmdline-override: <valstr>
>>>> +
>>>> +Returns the client cmdline override string.  Some drivers support
>>>> letting
>>>> +userspace override this in cases where the userspace is simply a
>>>> "proxy".
>>>> +Such as is the case with virglrenderer drm native context, where
>>>> the host
>>>> +process is just forwarding command submission, etc, from guest
>>>> userspace.
>>>> +This allows the proxy to make visible the cmdline of the actual app
>>>> in the
>>>> +VM guest.
>>>
>>> Perhaps it would be okay to save space here by not repeating the
>>> description, like:
>>>
>>> drm-comm-override: <valstr>
>>> drm-cmdline-override: <valstr>
>>>
>>> Long description blah blah...
>>> This allows the proxy to make visible the _executable name *and* command
>>> line_ blah blah..
>>>
>>>> +
>>>>    Utilization
>>>>    ^^^^^^^^^^^
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>>>> index 9321eb0bf020..d7514c313af1 100644
>>>> --- a/drivers/gpu/drm/drm_file.c
>>>> +++ b/drivers/gpu/drm/drm_file.c
>>>> @@ -178,6 +178,8 @@ struct drm_file *drm_file_alloc(struct drm_minor
>>>> *minor)
>>>>        spin_lock_init(&file->master_lookup_lock);
>>>>        mutex_init(&file->event_read_lock);
>>>>
>>>> +     mutex_init(&file->override_lock);
>>>> +
>>>>        if (drm_core_check_feature(dev, DRIVER_GEM))
>>>>                drm_gem_open(dev, file);
>>>>
>>>> @@ -292,6 +294,8 @@ void drm_file_free(struct drm_file *file)
>>>>        WARN_ON(!list_empty(&file->event_list));
>>>>
>>>>        put_pid(file->pid);
>>>> +     kfree(file->override_comm);
>>>> +     kfree(file->override_cmdline);
>>>>        kfree(file);
>>>>    }
>>>>
>>>> @@ -995,6 +999,17 @@ void drm_show_fdinfo(struct seq_file *m, struct
>>>> file *f)
>>>>                           PCI_SLOT(pdev->devfn),
>>>> PCI_FUNC(pdev->devfn));
>>>>        }
>>>>
>>>> +     mutex_lock(&file->override_lock);
>>>
>>> You could add a fast unlocked check before taking the mutex for no risk
>>> apart a transient false negative. For 99.9999% of userspace it would
>>> mean no pointless lock/unlock cycle.
>>
>> I'm not sure I get your point?  This needs to be serialized against
>> userspace setting the override values
>
> if (file->override_comm || file->override_cmdline) {
>     mutex_lock(&file->override_lock);
>     if (file->override_comm)
>         drm_printf(&p, "drm-comm-override:\t%s\n",
>                file->override_comm);
>     if (file->override_cmdline)
>         drm_printf(&p, "drm-cmdline-override:\t%s\n",
>                file->override_cmdline);
>     mutext_unlock(&file->override_lock);
> }
>
> No risk apart for a transient false negative (which is immaterial for
> userspace since fdinfo reads are not ordered versus the override setting
> anyway) and 99.9% of deployments can get by not needing to pointlessly
> cycle the lock.

This fast path bypass I think is worth it but up to you if you are
really opposed. It's just that I don't see a point for cycling the mutex
for nothing in majority of cases.

>>>
>>>> +     if (file->override_comm) {
>>>> +             drm_printf(&p, "drm-comm-override:\t%s\n",
>>>> +                        file->override_comm);
>>>> +     }
>>>> +     if (file->override_cmdline) {
>>>> +             drm_printf(&p, "drm-cmdline-override:\t%s\n",
>>>> +                        file->override_cmdline);
>>>> +     }
>>>> +     mutex_unlock(&file->override_lock);
>>>> +
>>>>        if (dev->driver->show_fdinfo)
>>>>                dev->driver->show_fdinfo(&p, file);
>>>>    }
>>>> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
>>>> index 1339e925af52..604d05fa6f0c 100644
>>>> --- a/include/drm/drm_file.h
>>>> +++ b/include/drm/drm_file.h
>>>> @@ -370,6 +370,25 @@ struct drm_file {
>>>>         */
>>>>        struct drm_prime_file_private prime;
>>>>
>>>> +     /**
>>>> +      * @comm: Overridden task comm
>>>> +      *
>>>> +      * Accessed under override_lock
>>>> +      */
>>>> +     char *override_comm;
>>>> +
>>>> +     /**
>>>> +      * @cmdline: Overridden task cmdline
>>>> +      *
>>>> +      * Accessed under override_lock
>>>> +      */
>>>> +     char *override_cmdline;
>>>> +
>>>> +     /**
>>>> +      * @override_lock: Serialize access to override_comm and
>>>> override_cmdline
>>>> +      */
>>>> +     struct mutex override_lock;
>>>> +
>>>
>>> I don't think this should go to drm just yet though. Only one driver can
>>> make use of it so I'd leave it for later and print from msm_show_fdinfo
>>> for now.
>>
>> This was my original approach but danvet asked that it be moved into
>> drm for consistency across drivers.  (And really, I want the in-flight
>> amd and intel native-context stuff to motivate adding similar features
>> to amdgpu/i915/xe.)
>
> IMO if implementation is not shared, not even by using helpers, I don't
> think data storage should be either, but it's not a deal breaker.

To summarise my thoughts on the patch (v4):

I am not really keen on the split of data fields in common and no common
implementation or helpers.

For what the drm-usage-stats.rst are concerned it looks completely fine.
And feature really will be useful in virtualised stacks.

Code in this patch is also completely fine.

Therefore you can have an r-b on those parts, but with reservations on
whether it makes sense to put the fields under drm_file just yet. That
should be fine under the r-b rules AFAIU. Ideally you can collect an ack
from someone else too.

Reviewed-by: Tvrtko Ursulin <[email protected]>

Regards,

Tvrtko

>
> Regards,
>
> Tvrtko
>
>>
>> BR,
>> -R
>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>>        /* private: */
>>>>    #if IS_ENABLED(CONFIG_DRM_LEGACY)
>>>>        unsigned long lock_count; /* DRI1 legacy lock count */

2023-05-18 16:37:07

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH v2 8/9] drm/fdinfo: Add comm/cmdline override fields

On Thu, May 18, 2023 at 2:43 AM Tvrtko Ursulin
<[email protected]> wrote:
>
>
> In case you were waiting for me looking at the rest of the series, there
> was this reply from the previous round I can expand on.
>
> On 02/05/2023 08:50, Tvrtko Ursulin wrote:
> >
> > On 01/05/2023 17:58, Rob Clark wrote:
> >> On Fri, Apr 28, 2023 at 4:05 AM Tvrtko Ursulin
> >> <[email protected]> wrote:
> >>>
> >>>
> >>> On 27/04/2023 18:53, Rob Clark wrote:
> >>>> From: Rob Clark <[email protected]>
> >>>>
> >>>> These are useful in particular for VM scenarios where the process which
> >>>> has opened to drm device file is just a proxy for the real user in a VM
> >>>> guest.
> >>>>
> >>>> Signed-off-by: Rob Clark <[email protected]>
> >>>> ---
> >>>> Documentation/gpu/drm-usage-stats.rst | 18 ++++++++++++++++++
> >>>> drivers/gpu/drm/drm_file.c | 15 +++++++++++++++
> >>>> include/drm/drm_file.h | 19 +++++++++++++++++++
> >>>> 3 files changed, 52 insertions(+)
> >>>>
> >>>> diff --git a/Documentation/gpu/drm-usage-stats.rst
> >>>> b/Documentation/gpu/drm-usage-stats.rst
> >>>> index 58dc0d3f8c58..e4877cf8089c 100644
> >>>> --- a/Documentation/gpu/drm-usage-stats.rst
> >>>> +++ b/Documentation/gpu/drm-usage-stats.rst
> >>>> @@ -73,6 +73,24 @@ scope of each device, in which case `drm-pdev`
> >>>> shall be present as well.
> >>>> Userspace should make sure to not double account any usage
> >>>> statistics by using
> >>>> the above described criteria in order to associate data to
> >>>> individual clients.
> >>>>
> >>>> +- drm-comm-override: <valstr>
> >>>> +
> >>>> +Returns the client executable override string. Some drivers
> >>>> support letting
> >>>> +userspace override this in cases where the userspace is simply a
> >>>> "proxy".
> >>>> +Such as is the case with virglrenderer drm native context, where
> >>>> the host
> >>>> +process is just forwarding command submission, etc, from guest
> >>>> userspace.
> >>>> +This allows the proxy to make visible the executable name of the
> >>>> actual
> >>>> +app in the VM guest.
> >>>> +
> >>>> +- drm-cmdline-override: <valstr>
> >>>> +
> >>>> +Returns the client cmdline override string. Some drivers support
> >>>> letting
> >>>> +userspace override this in cases where the userspace is simply a
> >>>> "proxy".
> >>>> +Such as is the case with virglrenderer drm native context, where
> >>>> the host
> >>>> +process is just forwarding command submission, etc, from guest
> >>>> userspace.
> >>>> +This allows the proxy to make visible the cmdline of the actual app
> >>>> in the
> >>>> +VM guest.
> >>>
> >>> Perhaps it would be okay to save space here by not repeating the
> >>> description, like:
> >>>
> >>> drm-comm-override: <valstr>
> >>> drm-cmdline-override: <valstr>
> >>>
> >>> Long description blah blah...
> >>> This allows the proxy to make visible the _executable name *and* command
> >>> line_ blah blah..
> >>>
> >>>> +
> >>>> Utilization
> >>>> ^^^^^^^^^^^
> >>>>
> >>>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> >>>> index 9321eb0bf020..d7514c313af1 100644
> >>>> --- a/drivers/gpu/drm/drm_file.c
> >>>> +++ b/drivers/gpu/drm/drm_file.c
> >>>> @@ -178,6 +178,8 @@ struct drm_file *drm_file_alloc(struct drm_minor
> >>>> *minor)
> >>>> spin_lock_init(&file->master_lookup_lock);
> >>>> mutex_init(&file->event_read_lock);
> >>>>
> >>>> + mutex_init(&file->override_lock);
> >>>> +
> >>>> if (drm_core_check_feature(dev, DRIVER_GEM))
> >>>> drm_gem_open(dev, file);
> >>>>
> >>>> @@ -292,6 +294,8 @@ void drm_file_free(struct drm_file *file)
> >>>> WARN_ON(!list_empty(&file->event_list));
> >>>>
> >>>> put_pid(file->pid);
> >>>> + kfree(file->override_comm);
> >>>> + kfree(file->override_cmdline);
> >>>> kfree(file);
> >>>> }
> >>>>
> >>>> @@ -995,6 +999,17 @@ void drm_show_fdinfo(struct seq_file *m, struct
> >>>> file *f)
> >>>> PCI_SLOT(pdev->devfn),
> >>>> PCI_FUNC(pdev->devfn));
> >>>> }
> >>>>
> >>>> + mutex_lock(&file->override_lock);
> >>>
> >>> You could add a fast unlocked check before taking the mutex for no risk
> >>> apart a transient false negative. For 99.9999% of userspace it would
> >>> mean no pointless lock/unlock cycle.
> >>
> >> I'm not sure I get your point? This needs to be serialized against
> >> userspace setting the override values
> >
> > if (file->override_comm || file->override_cmdline) {
> > mutex_lock(&file->override_lock);
> > if (file->override_comm)
> > drm_printf(&p, "drm-comm-override:\t%s\n",
> > file->override_comm);
> > if (file->override_cmdline)
> > drm_printf(&p, "drm-cmdline-override:\t%s\n",
> > file->override_cmdline);
> > mutext_unlock(&file->override_lock);
> > }
> >
> > No risk apart for a transient false negative (which is immaterial for
> > userspace since fdinfo reads are not ordered versus the override setting
> > anyway) and 99.9% of deployments can get by not needing to pointlessly
> > cycle the lock.
>
> This fast path bypass I think is worth it but up to you if you are
> really opposed. It's just that I don't see a point for cycling the mutex
> for nothing in majority of cases.

I think it is a premature optimization.. an uncontended lock is "just"
an atomic. Yes, atomics can be expensive in a hot path.. but in this
case it is going to be lost in the noise. I did look a bit at gputop
with `perf record` and it is very much not the problem.

> >>>
> >>>> + if (file->override_comm) {
> >>>> + drm_printf(&p, "drm-comm-override:\t%s\n",
> >>>> + file->override_comm);
> >>>> + }
> >>>> + if (file->override_cmdline) {
> >>>> + drm_printf(&p, "drm-cmdline-override:\t%s\n",
> >>>> + file->override_cmdline);
> >>>> + }
> >>>> + mutex_unlock(&file->override_lock);
> >>>> +
> >>>> if (dev->driver->show_fdinfo)
> >>>> dev->driver->show_fdinfo(&p, file);
> >>>> }
> >>>> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> >>>> index 1339e925af52..604d05fa6f0c 100644
> >>>> --- a/include/drm/drm_file.h
> >>>> +++ b/include/drm/drm_file.h
> >>>> @@ -370,6 +370,25 @@ struct drm_file {
> >>>> */
> >>>> struct drm_prime_file_private prime;
> >>>>
> >>>> + /**
> >>>> + * @comm: Overridden task comm
> >>>> + *
> >>>> + * Accessed under override_lock
> >>>> + */
> >>>> + char *override_comm;
> >>>> +
> >>>> + /**
> >>>> + * @cmdline: Overridden task cmdline
> >>>> + *
> >>>> + * Accessed under override_lock
> >>>> + */
> >>>> + char *override_cmdline;
> >>>> +
> >>>> + /**
> >>>> + * @override_lock: Serialize access to override_comm and
> >>>> override_cmdline
> >>>> + */
> >>>> + struct mutex override_lock;
> >>>> +
> >>>
> >>> I don't think this should go to drm just yet though. Only one driver can
> >>> make use of it so I'd leave it for later and print from msm_show_fdinfo
> >>> for now.
> >>
> >> This was my original approach but danvet asked that it be moved into
> >> drm for consistency across drivers. (And really, I want the in-flight
> >> amd and intel native-context stuff to motivate adding similar features
> >> to amdgpu/i915/xe.)
> >
> > IMO if implementation is not shared, not even by using helpers, I don't
> > think data storage should be either, but it's not a deal breaker.
>
> To summarise my thoughts on the patch (v4):
>
> I am not really keen on the split of data fields in common and no common
> implementation or helpers.

I can go either way on this.. it was danvet that suggested moving to
drm_file to encourage more standardization.

(But we can also land the meminfo parts of the series without this
part.. it was just convenient for me to keep them in the same series
to avoid conflicts)

BR,
-R

> For what the drm-usage-stats.rst are concerned it looks completely fine.
> And feature really will be useful in virtualised stacks.
>
> Code in this patch is also completely fine.
>
> Therefore you can have an r-b on those parts, but with reservations on
> whether it makes sense to put the fields under drm_file just yet. That
> should be fine under the r-b rules AFAIU. Ideally you can collect an ack
> from someone else too.
>
> Reviewed-by: Tvrtko Ursulin <[email protected]>
>
> Regards,
>
> Tvrtko
>
> >
> > Regards,
> >
> > Tvrtko
> >
> >>
> >> BR,
> >> -R
> >>
> >>> Regards,
> >>>
> >>> Tvrtko
> >>>
> >>>> /* private: */
> >>>> #if IS_ENABLED(CONFIG_DRM_LEGACY)
> >>>> unsigned long lock_count; /* DRI1 legacy lock count */