The existing gpu_mem_total tracepoint allows GPU drivers a unifrom way
to report the per-process and system-wide GPU memory usage. This
tracepoint reports a single total of the GPU private allocations and the
imported memory. [1]
To allow distinguishing GPU private vs imported memory, add
gpu_mem_imported tracepoint.
GPU drivers can use this tracepoint to report the per-process and global
GPU-imported memory in a uniform way.
For backward compatility with already existing implementations of
gpu_mem_total by various Android GPU drivers, this is proposed as a new
tracepoint rather than additional args to gpu_mem_total.
[1] https://lore.kernel.org/r/[email protected]/
Signed-off-by: Kalesh Singh <[email protected]>
---
include/trace/events/gpu_mem.h | 51 ++++++++++++++++++++++++----------
1 file changed, 36 insertions(+), 15 deletions(-)
diff --git a/include/trace/events/gpu_mem.h b/include/trace/events/gpu_mem.h
index 26d871f96e94..b9543abf1461 100644
--- a/include/trace/events/gpu_mem.h
+++ b/include/trace/events/gpu_mem.h
@@ -13,21 +13,7 @@
#include <linux/tracepoint.h>
-/*
- * The gpu_memory_total event indicates that there's an update to either the
- * global or process total gpu memory counters.
- *
- * This event should be emitted whenever the kernel device driver allocates,
- * frees, imports, unimports memory in the GPU addressable space.
- *
- * @gpu_id: This is the gpu id.
- *
- * @pid: Put 0 for global total, while positive pid for process total.
- *
- * @size: Size of the allocation in bytes.
- *
- */
-TRACE_EVENT(gpu_mem_total,
+DECLARE_EVENT_CLASS(gpu_mem_template,
TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
@@ -51,6 +37,41 @@ TRACE_EVENT(gpu_mem_total,
__entry->size)
);
+/*
+ * The gpu_memory_total event indicates that there's an update to either the
+ * global or process total gpu memory counters.
+ *
+ * This event should be emitted whenever the kernel device driver allocates,
+ * frees, imports, unimports memory in the GPU addressable space.
+ *
+ * @gpu_id: This is the gpu id.
+ *
+ * @pid: Put 0 for global total, while positive pid for process total.
+ *
+ * @size: Size of the allocation in bytes.
+ *
+ */
+DEFINE_EVENT(gpu_mem_template, gpu_mem_total,
+ TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
+ TP_ARGS(gpu_id, pid, size));
+
+/*
+ * The gpu_mem_imported event indicates that there's an update to the
+ * global or process imported gpu memory counters.
+ *
+ * This event should be emitted whenever the kernel device driver imports
+ * or unimports memory (allocated externally) in the GPU addressable space.
+ *
+ * @gpu_id: This is the gpu id.
+ *
+ * @pid: Put 0 for global total, while positive pid for process total.
+ *
+ * @size: Size of the imported memory in bytes.
+ */
+DEFINE_EVENT(gpu_mem_template, gpu_mem_imported,
+ TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
+ TP_ARGS(gpu_id, pid, size));
+
#endif /* _TRACE_GPU_MEM_H */
/* This part must be outside protection */
base-commit: ff1176468d368232b684f75e82563369208bc371
--
2.32.0.432.gabb21c7263-goog
On Mon, Jul 26, 2021 at 04:41:31PM +0000, Kalesh Singh wrote:
> The existing gpu_mem_total tracepoint allows GPU drivers a unifrom way
> to report the per-process and system-wide GPU memory usage. This
> tracepoint reports a single total of the GPU private allocations and the
> imported memory. [1]
>
> To allow distinguishing GPU private vs imported memory, add
> gpu_mem_imported tracepoint.
>
> GPU drivers can use this tracepoint to report the per-process and global
> GPU-imported memory in a uniform way.
Right, as imported dma-buf memory is usually shared between two or more
processes it will be hard to reconcile system memory usage with process
private GPU memory and imported dma-buf memory in a single total sum. It
will be good to improve this and having a separate imported GPU memory
size is good.
> For backward compatility with already existing implementations of
> gpu_mem_total by various Android GPU drivers, this is proposed as a new
> tracepoint rather than additional args to gpu_mem_total.
Is this for compatibility with kernel GPU drivers only, or are there
user space users of the existing tracepoint that can't cope with it
being extended?
The eBPF used by Android to track the GPU memory is the only established
user of the tracepoint that I know about. Can existing versions of the
eBPF can handle this OK?
If the only worry is GPU kernel drivers then I think that extending the
existing tracepoint, to add a new field with the imported memory sum,
will be better. There doesn't appear to be any in-kernel GPU drivers
implementing this yet, and other GPU kernel drivers can just be extended
to use the new extended tracepoint. As long as there's some mechanism to
detect the extended variant of the tracepoint for backports, if the new
tracepoint is ever backported to older kernels, that shouldn't be a
problem.
--
Ørjan
> [1] https://lore.kernel.org/r/[email protected]/
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---
> include/trace/events/gpu_mem.h | 51 ++++++++++++++++++++++++----------
> 1 file changed, 36 insertions(+), 15 deletions(-)
>
> diff --git a/include/trace/events/gpu_mem.h b/include/trace/events/gpu_mem.h
> index 26d871f96e94..b9543abf1461 100644
> --- a/include/trace/events/gpu_mem.h
> +++ b/include/trace/events/gpu_mem.h
> @@ -13,21 +13,7 @@
>
> #include <linux/tracepoint.h>
>
> -/*
> - * The gpu_memory_total event indicates that there's an update to either the
> - * global or process total gpu memory counters.
> - *
> - * This event should be emitted whenever the kernel device driver allocates,
> - * frees, imports, unimports memory in the GPU addressable space.
> - *
> - * @gpu_id: This is the gpu id.
> - *
> - * @pid: Put 0 for global total, while positive pid for process total.
> - *
> - * @size: Size of the allocation in bytes.
> - *
> - */
> -TRACE_EVENT(gpu_mem_total,
> +DECLARE_EVENT_CLASS(gpu_mem_template,
>
> TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
>
> @@ -51,6 +37,41 @@ TRACE_EVENT(gpu_mem_total,
> __entry->size)
> );
>
> +/*
> + * The gpu_memory_total event indicates that there's an update to either the
> + * global or process total gpu memory counters.
> + *
> + * This event should be emitted whenever the kernel device driver allocates,
> + * frees, imports, unimports memory in the GPU addressable space.
> + *
> + * @gpu_id: This is the gpu id.
> + *
> + * @pid: Put 0 for global total, while positive pid for process total.
> + *
> + * @size: Size of the allocation in bytes.
> + *
> + */
> +DEFINE_EVENT(gpu_mem_template, gpu_mem_total,
> + TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
> + TP_ARGS(gpu_id, pid, size));
> +
> +/*
> + * The gpu_mem_imported event indicates that there's an update to the
> + * global or process imported gpu memory counters.
> + *
> + * This event should be emitted whenever the kernel device driver imports
> + * or unimports memory (allocated externally) in the GPU addressable space.
> + *
> + * @gpu_id: This is the gpu id.
> + *
> + * @pid: Put 0 for global total, while positive pid for process total.
> + *
> + * @size: Size of the imported memory in bytes.
> + */
> +DEFINE_EVENT(gpu_mem_template, gpu_mem_imported,
> + TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
> + TP_ARGS(gpu_id, pid, size));
> +
> #endif /* _TRACE_GPU_MEM_H */
>
> /* This part must be outside protection */
>
> base-commit: ff1176468d368232b684f75e82563369208bc371
> --
> 2.32.0.432.gabb21c7263-goog
>
On Mon, Aug 16, 2021 at 10:22 AM Ørjan Eide <[email protected]> wrote:
>
> On Mon, Jul 26, 2021 at 04:41:31PM +0000, Kalesh Singh wrote:
> > The existing gpu_mem_total tracepoint allows GPU drivers a unifrom way
> > to report the per-process and system-wide GPU memory usage. This
> > tracepoint reports a single total of the GPU private allocations and the
> > imported memory. [1]
> >
> > To allow distinguishing GPU private vs imported memory, add
> > gpu_mem_imported tracepoint.
> >
> > GPU drivers can use this tracepoint to report the per-process and global
> > GPU-imported memory in a uniform way.
>
> Right, as imported dma-buf memory is usually shared between two or more
> processes it will be hard to reconcile system memory usage with process
> private GPU memory and imported dma-buf memory in a single total sum. It
> will be good to improve this and having a separate imported GPU memory
> size is good.
>
> > For backward compatility with already existing implementations of
> > gpu_mem_total by various Android GPU drivers, this is proposed as a new
> > tracepoint rather than additional args to gpu_mem_total.
>
> Is this for compatibility with kernel GPU drivers only, or are there
> user space users of the existing tracepoint that can't cope with it
> being extended?
>
> The eBPF used by Android to track the GPU memory is the only established
> user of the tracepoint that I know about. Can existing versions of the
> eBPF can handle this OK?
>
> If the only worry is GPU kernel drivers then I think that extending the
> existing tracepoint, to add a new field with the imported memory sum,
> will be better. There doesn't appear to be any in-kernel GPU drivers
> implementing this yet, and other GPU kernel drivers can just be extended
> to use the new extended tracepoint. As long as there's some mechanism to
> detect the extended variant of the tracepoint for backports, if the new
> tracepoint is ever backported to older kernels, that shouldn't be a
> problem.
Hi Ørjan. Thank you for your comments.
The compatibility concern was regarding existing user space tools when
devices with older kernels (having the older gpu_mem_total format)
upgrade to newer android releases.
In android, there are 2 user space tools that depend on this
tracepoint: the bpf program to capture the tracepoint data [1] and the
Perfetto profiling tool [2]. I’ve confirmed that Perfetto can handle
both the old and new formats if only a new field is added and the
types of existing fields remain unchanged. For the bpf program, it
turns out we can detect the format from gpu_mem_total/format and load
different versions of the bpf program accordingly.
I'll repost a new version adding only the new imported_mem field.
Thanks,
Kalesh
[1] https://cs.android.com/android/platform/superproject/+/92e677d80bc48772a36ef0d2d9db3195b2dfcf65:frameworks/native/services/gpuservice/bpfprogs/gpu_mem.c;l=51
[2] https://perfetto.dev
>
> --
> Ørjan
>
> > [1] https://lore.kernel.org/r/[email protected]/
> >
> > Signed-off-by: Kalesh Singh <[email protected]>
> > ---
> > include/trace/events/gpu_mem.h | 51 ++++++++++++++++++++++++----------
> > 1 file changed, 36 insertions(+), 15 deletions(-)
> >
> > diff --git a/include/trace/events/gpu_mem.h b/include/trace/events/gpu_mem.h
> > index 26d871f96e94..b9543abf1461 100644
> > --- a/include/trace/events/gpu_mem.h
> > +++ b/include/trace/events/gpu_mem.h
> > @@ -13,21 +13,7 @@
> >
> > #include <linux/tracepoint.h>
> >
> > -/*
> > - * The gpu_memory_total event indicates that there's an update to either the
> > - * global or process total gpu memory counters.
> > - *
> > - * This event should be emitted whenever the kernel device driver allocates,
> > - * frees, imports, unimports memory in the GPU addressable space.
> > - *
> > - * @gpu_id: This is the gpu id.
> > - *
> > - * @pid: Put 0 for global total, while positive pid for process total.
> > - *
> > - * @size: Size of the allocation in bytes.
> > - *
> > - */
> > -TRACE_EVENT(gpu_mem_total,
> > +DECLARE_EVENT_CLASS(gpu_mem_template,
> >
> > TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
> >
> > @@ -51,6 +37,41 @@ TRACE_EVENT(gpu_mem_total,
> > __entry->size)
> > );
> >
> > +/*
> > + * The gpu_memory_total event indicates that there's an update to either the
> > + * global or process total gpu memory counters.
> > + *
> > + * This event should be emitted whenever the kernel device driver allocates,
> > + * frees, imports, unimports memory in the GPU addressable space.
> > + *
> > + * @gpu_id: This is the gpu id.
> > + *
> > + * @pid: Put 0 for global total, while positive pid for process total.
> > + *
> > + * @size: Size of the allocation in bytes.
> > + *
> > + */
> > +DEFINE_EVENT(gpu_mem_template, gpu_mem_total,
> > + TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
> > + TP_ARGS(gpu_id, pid, size));
> > +
> > +/*
> > + * The gpu_mem_imported event indicates that there's an update to the
> > + * global or process imported gpu memory counters.
> > + *
> > + * This event should be emitted whenever the kernel device driver imports
> > + * or unimports memory (allocated externally) in the GPU addressable space.
> > + *
> > + * @gpu_id: This is the gpu id.
> > + *
> > + * @pid: Put 0 for global total, while positive pid for process total.
> > + *
> > + * @size: Size of the imported memory in bytes.
> > + */
> > +DEFINE_EVENT(gpu_mem_template, gpu_mem_imported,
> > + TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
> > + TP_ARGS(gpu_id, pid, size));
> > +
> > #endif /* _TRACE_GPU_MEM_H */
> >
> > /* This part must be outside protection */
> >
> > base-commit: ff1176468d368232b684f75e82563369208bc371
> > --
> > 2.32.0.432.gabb21c7263-goog
> >