From: Hridya Valsaraju <[email protected]>
This patch adds a proposal for a new GPU cgroup controller for
accounting/limiting GPU and GPU-related memory allocations.
The proposed controller is based on the DRM cgroup controller[1] and
follows the design of the RDMA cgroup controller.
The new cgroup controller would:
* Allow setting per-device limits on the total size of buffers
allocated by each device within a cgroup.
* Expose a per-device/allocator breakdown of the buffers charged to a
cgroup.
The prototype in the following patches implements only memory
accounting with the GPU cgroup controller; limit setting is not yet
implemented.
[1]: https://lore.kernel.org/amd-gfx/[email protected]/
Signed-off-by: Hridya Valsaraju <[email protected]>
Signed-off-by: T.J. Mercier <[email protected]>
---
v6 changes
Move documentation into cgroup-v2.rst per Tejun Heo.
v5 changes
Drop the global GPU cgroup "total" (sum of all device totals) portion
of the design, since there is currently no known use for it, per
Tejun Heo.
Update for renamed functions/variables.
v3 changes
Remove Upstreaming Plan from gpu-cgroup.rst per John Stultz.
Use more common dual author commit message format per John Stultz.
---
Documentation/admin-guide/cgroup-v2.rst | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 69d7a6983f78..baeec096f1d8 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2352,6 +2352,30 @@ first, and stays charged to that cgroup until that resource is freed. Migrating
a process to a different cgroup does not move the charge to the destination
cgroup where the process has moved.
+
+GPU
+---
+
+The GPU controller accounts for device and system memory allocated by the GPU
+and related subsystems for graphics use. Resource limits are not currently
+supported.
+
+GPU Interface Files
+~~~~~~~~~~~~~~~~~~~~
+
+ gpu.memory.current
+ A read-only file containing memory allocations in flat-keyed format. The key
+ is a string representing the device name. The value is the size in bytes of the
+ memory charged to the cgroup for that device. The device names are globally unique::
+
+ $ cat /sys/fs/cgroup/gpu.memory.current
+ dev1 4194304
+ dev2 104857600
+
+ The device name string is set by a device driver when it registers with the
+ GPU cgroup controller to participate in resource accounting. Non-unique names
+ will be rejected at the point of registration.
+
Others
------
--
2.36.0.464.gb9c8b46e94-goog
Hello.
On Mon, May 02, 2022 at 11:19:35PM +0000, "T.J. Mercier" <[email protected]> wrote:
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> [...]
> + The device name string is set by a device driver when it registers with the
> + GPU cgroup controller to participate in resource accounting.
Are these names available anywhere else for the user? (I.e. would
drivers add respective sysfs attributes or similar?)
> + Non-unique names will be rejected at the point of registration.
This doesn't seem relevant to the cgroupfs user, does it?
I think it should be mentioned at the respective API.
HTH,
Michal
On Wed, May 04, 2022 at 10:16:50AM -0700, "T.J. Mercier" <[email protected]> wrote:
> However the only names that would result from this series are the
> names of the dma-buf heaps, with "-heap" appended. So they are
> predictable from the /dev/dma_heap/* names, and only the system and
> cma heaps currently exist upstream.
It's not so important with the read-only stats currently posted (a
crafted sysfs file with these names would be overkill)...
>
> For other future uses of this controller I thought we were headed in
> the direction of "standardized" names which would be
> predefined/hardcoded and documented, so these names wouldn't really
> need to be made available to a user at runtime.
> https://lore.kernel.org/lkml/CABdmKX3gTAohaOwkNccGrQyXN9tzT-oEVibO5ZPF+eP+Vq=AOg@mail.gmail.com/
(Ah, I see.)
...but if writers (limits) are envisioned, the keys should represent
something that the user can derive/construct from available info -- e.g.
the documentation.
OK, so I understand the current form just presents some statistics.
Michal
On Wed, May 4, 2022 at 5:10 AM Michal Koutný <[email protected]> wrote:
>
> Hello.
>
> On Mon, May 02, 2022 at 11:19:35PM +0000, "T.J. Mercier" <[email protected]> wrote:
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > [...]
> > + The device name string is set by a device driver when it registers with the
> > + GPU cgroup controller to participate in resource accounting.
>
> Are these names available anywhere else for the user? (I.e. would
> drivers add respective sysfs attributes or similar?)
>
Hi, this sounds like it could be a good idea, but it'd probably be best
to do this inside gpucg_register_bucket instead of requiring drivers
to perform this externally, possibly in a non-uniform way. Maybe a
sysfs file that prints the name of each of the gpucg_buckets elements?
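For reference, a rough sketch of what I have in mind (the list and lock
names below are made up for illustration, and the actual
gpucg_register_bucket() signature in this series may differ):

#include <linux/errno.h>
#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/string.h>

struct gpucg_bucket {
	struct list_head bucket_node;
	const char *name;
};

static LIST_HEAD(gpucg_buckets);
static DEFINE_MUTEX(gpucg_mutex);

/* Assumed shape: reject non-unique names at registration time. */
int gpucg_register_bucket(struct gpucg_bucket *bucket, const char *name)
{
	struct gpucg_bucket *b;

	mutex_lock(&gpucg_mutex);
	list_for_each_entry(b, &gpucg_buckets, bucket_node) {
		if (!strcmp(b->name, name)) {
			mutex_unlock(&gpucg_mutex);
			return -EEXIST;
		}
	}
	bucket->name = name;
	list_add_tail(&bucket->bucket_node, &gpucg_buckets);
	mutex_unlock(&gpucg_mutex);

	/*
	 * A read-only sysfs file could walk gpucg_buckets under the same
	 * lock to publish the registered names, so drivers wouldn't have
	 * to expose them individually.
	 */
	return 0;
}
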
However the only names that would result from this series are the
names of the dma-buf heaps, with "-heap" appended. So they are
predictable from the /dev/dma_heap/* names, and only the system and
cma heaps currently exist upstream.
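For example, with just the system heap present, it would look something
like this (the size below is made up):

$ ls /dev/dma_heap/
system
$ cat /sys/fs/cgroup/<cgroup>/gpu.memory.current
system-heap 4194304
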
For other future uses of this controller I thought we were headed in
the direction of "standardized" names which would be
predefined/hardcoded and documented, so these names wouldn't really
need to be made available to a user at runtime.
https://lore.kernel.org/lkml/CABdmKX3gTAohaOwkNccGrQyXN9tzT-oEVibO5ZPF+eP+Vq=AOg@mail.gmail.com/
>
> > + Non-unique names will be rejected at the point of registration.
>
> This doesn't seem relevant to the cgroupfs user, does it?
> I think it should be mentioned at the respective API.
>
Yeah, you're right. Thank you.
> HTH,
> Michal
>
On Thu, May 5, 2022 at 4:29 AM Michal Koutný <[email protected]> wrote:
>
> On Wed, May 04, 2022 at 10:16:50AM -0700, "T.J. Mercier" <[email protected]> wrote:
> > However the only names that would result from this series are the
> > names of the dma-buf heaps, with "-heap" appended. So they are
> > predictable from the /dev/dma_heap/* names, and only the system and
> > cma heaps currently exist upstream.
>
> It's not so important with the read-only stats currently posted (a
> crafted sysfs file with these names would be overkill)...
>
> >
> > For other future uses of this controller I thought we were headed in
> > the direction of "standardized" names which would be
> > predefined/hardcoded and documented, so these names wouldn't really
> > need to be made available to a user at runtime.
> > https://lore.kernel.org/lkml/CABdmKX3gTAohaOwkNccGrQyXN9tzT-oEVibO5ZPF+eP+Vq=AOg@mail.gmail.com/
>
> (Ah, I see.)
>
> ...but if writers (limits) are envisioned, the keys should represent
> something that the user can derive/construct from available info -- e.g.
> the documentation.
>
> OK, so I understand the current form just presents some statistics.
>
Yup, thanks for taking a look.
> Michal