2022-05-13 09:27:22

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] sched/core: add forced idle accounting for cgroups

On Thu, May 12, 2022 at 05:54:27PM -0700, Josh Don wrote:
> 4feee7d1260 previously added per-task forced idle accounting. This patch
> extends this to also include cgroups.
>
> rstat is used for cgroup accounting, except for the root, which uses
> kcpustat in order to bypass the need for doing an rstat flush when
> reading root stats.
>
> Only cgroup v2 is supported. Similar to the task accounting, the cgroup
> accounting requires that schedstats is enabled.

We've been collecting scheduler stats in cgroup core so that we always have
them available whether cpu controller is enabled or not. There's nothing
actually specific to cpu controller, right? Would it make sense to collect
the cpu core stats the same way as the rest of scheduler stats?

Thanks.

--
tejun


2022-05-14 02:24:49

by Josh Don

[permalink] [raw]
Subject: Re: [PATCH] sched/core: add forced idle accounting for cgroups

Thanks Tejun,

On Thu, May 12, 2022 at 7:58 PM Tejun Heo <[email protected]> wrote:
>
> On Thu, May 12, 2022 at 05:54:27PM -0700, Josh Don wrote:
> > 4feee7d1260 previously added per-task forced idle accounting. This patch
> > extends this to also include cgroups.
> >
> > rstat is used for cgroup accounting, except for the root, which uses
> > kcpustat in order to bypass the need for doing an rstat flush when
> > reading root stats.
> >
> > Only cgroup v2 is supported. Similar to the task accounting, the cgroup
> > accounting requires that schedstats is enabled.
>
> We've been collecting scheduler stats in cgroup core so that we always have
> them available whether cpu controller is enabled or not. There's nothing
> actually specific to cpu controller, right? Would it make sense to collect
> the cpu core stats the same way as the rest of scheduler stats?

Yea, that's right, this doesn't require the cpu controller to be
enabled. Are you suggesting to add a new field to cgroup_base_stat?

One other weird artifact of collecting forceidle time is that a cpu
may account it on behalf of its hyperthread sibling. Currently, the
core rstat code always accounts to the current cpu's percpu rstat
field. I can add an accounting function to support writes to a
different cpu's field, in order to make sure that the per-cpu totals
are correct (the forceidle accounting code holds rq->__lock, which
protects all HT siblings of a core). percpu totals aren't currently
exported in cgroup v2, but this is useful information that we'll
consume, so it would be nice to keep it accurate.

Best,
Josh

2022-05-23 08:19:24

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] sched/core: add forced idle accounting for cgroups

Hello,

Sorry about late reply and thanks for the ping. I missed this one.

On Fri, May 13, 2022 at 12:23:16PM -0700, Josh Don wrote:
> Yea, that's right, this doesn't require the cpu controller to be
> enabled. Are you suggesting to add a new field to cgroup_base_stat?

Yes, that's what I meant. I think it'd fit there better.

> One other weird artifact of collecting forceidle time is that a cpu
> may account it on behalf of its hyperthread sibling. Currently, the
> core rstat code always accounts to the current cpu's percpu rstat
> field. I can add an accounting function to support writes to a
> different cpu's field, in order to make sure that the per-cpu totals
> are correct (the forceidle accounting code holds rq->__lock, which
> protects all HT siblings of a core). percpu totals aren't currently
> exported in cgroup v2, but this is useful information that we'll
> consume, so it would be nice to keep it accurate.

Sure, as long as it doesn't incur overhead when not used.

Thanks.

--
tejun