2022-01-27 09:00:47

Subject: rseq vcpu_id ideas

Hi Paul,

I remember our LPC discussions about your virtual CPU ID ideas, and I noticed some tcmalloc code
with "prototype" fields for vcpu_id and NUMA node ID
(https://github.com/google/tcmalloc/blob/master/tcmalloc/internal/linux_syscall_support.h#L34).

I'm currently toying with ideas very close to vcpu_ids to solve issues with overzealous
memory allocation in LTTng-UST (user-space tracer) for use cases where containers use only
a few cores.

My current thinking is that we could use your vcpu_id idea, but apply it on a per-pid-namespace
basis rather than per-process. We may have to be clever with NUMA as well to ensure good NUMA
locality.

Do you have any thoughts on this, and perhaps some prototype rseq extension code you could
share as a starting point?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

2022-01-31 11:21:48

by Peter Oskolkov

Subject: Re: rseq vcpu_id ideas

On Wed, Jan 26, 2022 at 5:22 PM Mathieu Desnoyers
<[email protected]> wrote:
>
> Hi Paul,
>
> I remember our LPC discussions about your virtual CPU ID ideas, and I noticed some tcmalloc code
> with "prototype" fields for vcpu_id and NUMA node ID
> (https://github.com/google/tcmalloc/blob/master/tcmalloc/internal/linux_syscall_support.h#L34).
>
> I'm currently toying with ideas very close to vcpu_ids to solve issues with overzealous
> memory allocation in LTTng-UST (user-space tracer) for use cases where containers use only
> a few cores.
>
> My current thinking is that we could use your vcpu_id idea, but apply it on a per-pid-namespace
> basis rather than per-process. We may have to be clever with NUMA as well to ensure good NUMA
> locality.
>
> Do you have any thoughts on this, and perhaps some prototype rseq extension code you could
> share as a starting point?

We've been using rseq vcpu extensions in production for more than a
year, with good results. We have a perfect use case, though: wide
machines (hundreds of CPUs) running many narrow processes (each
restricted to a small number of CPUs). Our extension can be configured
to do either "flat" vcpu accounting or "per NUMA node" vcpu accounting.
We currently only use "flat" accounting, I guess because most of our
processes are affined to a single NUMA node.
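
For readers unfamiliar with the idea, here is a minimal userspace
sketch of what the two accounting modes could look like. It is not the
extension described in this message, and every name in it (vcpu_pool,
vcpu_get, vcpu_put, vcpu_get_on_node, MAX_VCPUS, MAX_NODES) is
hypothetical. It assumes that a running thread takes the lowest free
id from a pool and returns it when it stops running, that "flat" means
one pool per process, and that "per NUMA node" means one pool per
(process, node) pair. It ignores concurrency entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch only: not the kernel or tcmalloc implementation. */
#define MAX_VCPUS 64 /* assumed bound so that one 64-bit word is enough */
#define MAX_NODES 4  /* assumed number of NUMA nodes */

/* One pool of vcpu ids; bit i set means id i is currently held. */
struct vcpu_pool {
    uint64_t in_use;
};

/* Take the lowest free id, or return -1 if all MAX_VCPUS ids are held. */
static int vcpu_get(struct vcpu_pool *p)
{
    for (int id = 0; id < MAX_VCPUS; id++) {
        uint64_t bit = UINT64_C(1) << id;
        if (!(p->in_use & bit)) {
            p->in_use |= bit;
            return id;
        }
    }
    return -1;
}

/* Release an id so that the next vcpu_get() can hand it out again. */
static void vcpu_put(struct vcpu_pool *p, int id)
{
    assert(id >= 0 && id < MAX_VCPUS);
    p->in_use &= ~(UINT64_C(1) << id);
}

/* "Per NUMA node" variant: ids are dense within each node's own pool. */
struct process_numa {
    struct vcpu_pool node[MAX_NODES];
};

static int vcpu_get_on_node(struct process_numa *p, int node)
{
    assert(node >= 0 && node < MAX_NODES);
    return vcpu_get(&p->node[node]);
}
```

With a single struct vcpu_pool per process ("flat"), the ids handed
out stay dense starting at 0, so per-vcpu data can be sized by the
number of threads that run concurrently rather than by the number of
CPUs in the machine. In the per-node variant the same id can be held
on two different nodes at once, so a user would index per-vcpu data by
the (node, id) pair.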

I plan to post the code to the list after the UMCG saga comes to a
clear resolution.

>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com