2020-10-12 16:33:34

by Yun Hsiang

[permalink] [raw]
Subject: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

If the user wants to stop controlling uclamp and let the task inherit
the value from the group, we need a method to reset.

Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
sched_setattr syscall.

Signed-off-by: Yun Hsiang <[email protected]>
---
include/uapi/linux/sched.h | 9 ++++++++-
kernel/sched/core.c | 16 ++++++++++++----
2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 3bac0a8ceab2..a12e88c362d8 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -132,6 +132,9 @@ struct clone_args {
#define SCHED_FLAG_KEEP_PARAMS 0x10
#define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
#define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
+#define SCHED_FLAG_UTIL_CLAMP_RESET_MIN 0x80
+#define SCHED_FLAG_UTIL_CLAMP_RESET_MAX 0x100
+

#define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
SCHED_FLAG_KEEP_PARAMS)
@@ -139,10 +142,14 @@ struct clone_args {
#define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
SCHED_FLAG_UTIL_CLAMP_MAX)

+#define SCHED_FLAG_UTIL_CLAMP_RESET (SCHED_FLAG_UTIL_CLAMP_RESET_MIN | \
+ SCHED_FLAG_UTIL_CLAMP_RESET_MAX)
+
#define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
SCHED_FLAG_RECLAIM | \
SCHED_FLAG_DL_OVERRUN | \
SCHED_FLAG_KEEP_ALL | \
- SCHED_FLAG_UTIL_CLAMP)
+ SCHED_FLAG_UTIL_CLAMP | \
+ SCHED_FLAG_UTIL_CLAMP_RESET)

#endif /* _UAPI_LINUX_SCHED_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9a2fbf98fd6f..ed4cb412dde7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1207,15 +1207,22 @@ static void __setscheduler_uclamp(struct task_struct *p,
uclamp_se_set(uc_se, clamp_value, false);
}

- if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
+ if (likely(!(attr->sched_flags &
+ (SCHED_FLAG_UTIL_CLAMP | SCHED_FLAG_UTIL_CLAMP_RESET))))
return;

- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET_MIN) {
+ uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
+ 0, false);
+ } else if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
attr->sched_util_min, true);
}

- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET_MAX) {
+ uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
+ 1024, false);
+ } else if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
attr->sched_util_max, true);
}
@@ -4901,7 +4908,8 @@ static int __sched_setscheduler(struct task_struct *p,
goto change;
if (dl_policy(policy) && dl_param_changed(p, attr))
goto change;
- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)
+ if (attr->sched_flags &
+ (SCHED_FLAG_UTIL_CLAMP | SCHED_FLAG_UTIL_CLAMP_RESET))
goto change;

p->sched_reset_on_fork = reset_on_fork;
--
2.25.1


2020-10-13 12:06:44

by Patrick Bellasi

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp


On Tue, Oct 13, 2020 at 12:29:51 +0200, Qais Yousef <[email protected]> wrote...

> On 10/13/20 10:21, Patrick Bellasi wrote:
>>

[...]

>> > +#define SCHED_FLAG_UTIL_CLAMP_RESET (SCHED_FLAG_UTIL_CLAMP_RESET_MIN | \
>> > + SCHED_FLAG_UTIL_CLAMP_RESET_MAX)
>> > +
>> > #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
>> > SCHED_FLAG_RECLAIM | \
>> > SCHED_FLAG_DL_OVERRUN | \
>> > SCHED_FLAG_KEEP_ALL | \
>> > - SCHED_FLAG_UTIL_CLAMP)
>> > + SCHED_FLAG_UTIL_CLAMP | \
>> > + SCHED_FLAG_UTIL_CLAMP_RESET)
>>
>>
>> ... and use it in conjunction with the existing _CLAMP_{MIN,MAX} to know
>> which clamp should be reset?
>
> I think the RESET should restore *both* MIN and MAX and reset the user_defined
> flag. Since the latter is the main purpose of this interface, I don't think you
> can reset the user_defined flag without resetting both MIN and MAX to
> uclamp_none[UCLAMP_MIN/MAX].

We can certainly set one clamp and not the other, and indeed the
user_defined flag is per-clamp_id, thus we can reset one clamp while
still keeping user-defined the other one.


>> > #endif /* _UAPI_LINUX_SCHED_H */
>> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> > index 9a2fbf98fd6f..ed4cb412dde7 100644
>> > --- a/kernel/sched/core.c
>> > +++ b/kernel/sched/core.c
>> > @@ -1207,15 +1207,22 @@ static void __setscheduler_uclamp(struct task_struct *p,
>> > uclamp_se_set(uc_se, clamp_value, false);
>> > }
>> >
>> > - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
>> > + if (likely(!(attr->sched_flags &
>> > + (SCHED_FLAG_UTIL_CLAMP | SCHED_FLAG_UTIL_CLAMP_RESET))))
>> > return;
>>
>> This check will not be changed, while we will have to add a bypass in
>> uclamp_validate().
>>
>> >
>> > - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
>> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET_MIN) {
>> > + uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
>> > + 0, false);
>> > + } else if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
>> > uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
>> > attr->sched_util_min, true);
>> > }
>> >
>>
>> These checks also will have to be updated to check _RESET and
>> _{MIN,MAX} combinations.
>>
>> Bonus point would be to be possible to pass in just the _RESET flag if
>> we want to reset both clamps. IOW, passing in _RESET only should be
>> consumed as if we passed in _RESET|_MIN|_MAX.
>>
>> Caveat, RT tasks have got a special 'reset value' for _MIN.
>> We should check and ensure __uclamp_update_uti_min_rt_default() is
>> property called for those tasks, which likely will require some
>> additional refactoring :/
>
> Hmm I am probably missing something. But if the SCHED_FLAG_UTIL_CLAMP_RESET is
> set, just reset uc_se->user_defined in the loop in __setscheduler_uclamp().
> This should take care of doing the reset properly then. Including for
> RT tasks.

Yes and no. Yes because in principle we can just reset the flag for a
clamp_id without updating the request values, as it is done by the
snippets above, and the internals should work.

However, we will end up reporting the old values when reading from
user-space. We should better check all those reporting code paths or...
just update the requested values as Yun is proposing above.

I like better Yun approach so that we keep internal data structures
aligned with features.

> So IMO you just need a single SCHED_FLAG_UTIL_CLAMP_RESET that if set in the
> attr, you just execute that loop in __setscheduler_uclamp() + reset
> uc_se->user_defined.
>
> It should be invalid to pass the SCHED_FLAG_UTIL_CLAMP_RESET with
> SCHED_FLAG_UTIL_CLAMP_MIN/MAX. Both have contradictory meaning IMO.
> If user passes both we should return an EINVAL error.

Passing in _CLAMP_RESET|_CLAMP_MIN will mean reset the min value while
keeping the max at whatever it is. I think there could be cases where
this support could be on hand.

However, as in my previous email, by passing in only _CLAMP_RESET, we
should go an reset both.

2020-10-13 16:35:25

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 10/13/20 13:46, Patrick Bellasi wrote:
> > So IMO you just need a single SCHED_FLAG_UTIL_CLAMP_RESET that if set in the
> > attr, you just execute that loop in __setscheduler_uclamp() + reset
> > uc_se->user_defined.
> >
> > It should be invalid to pass the SCHED_FLAG_UTIL_CLAMP_RESET with
> > SCHED_FLAG_UTIL_CLAMP_MIN/MAX. Both have contradictory meaning IMO.
> > If user passes both we should return an EINVAL error.
>
> Passing in _CLAMP_RESET|_CLAMP_MIN will mean reset the min value while
> keeping the max at whatever it is. I think there could be cases where
> this support could be on hand.

I am not convinced personally. I'm anxious about what this fine grained control
means and how it should be used. I think less is more in this case and we can
always relax the restriction (appropriately) later if it's *really* required.

Particularly the fact that this user_defined is per uclamp_se and that it
affects the cgroup behavior is implementation details this API shouldn't rely
on. A generic RESET my uclamp settings makes more sense for me as a long term
API to maintain.

Actually maybe we should even go for a more explicit
SCHED_FLAG_UTIL_CLAMP_INHERIT_CGROUP flag instead. If we decide to abandon the
support for this feature in the future, at least we can make it return an error
without affecting other functionality because of the implicit nature of
SCHED_FLAG_UTIL_CLAMP_RESET means inherit cgroup value too.

That being said, I am not strongly against the fine grained approach if that's
what Yun wants now or what you both prefer. I just think the name of the flag
needs to change to be more explicit too then.

It'd be good to hear what others think.

Cheers

--
Qais Yousef

2020-10-13 18:25:05

by Patrick Bellasi

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp


Hi Yun,
thanks for sharing this new implementation.

On Mon, Oct 12, 2020 at 18:31:40 +0200, Yun Hsiang <[email protected]> wrote...

> If the user wants to stop controlling uclamp and let the task inherit
> the value from the group, we need a method to reset.
>
> Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> sched_setattr syscall.

Looks like what you say here is not what you code, since you actually
add two new flags, _RESET_{MIN,MAX}.

I think we value instead a simple user-space interface where just the
additional one flag _RESET should be good enough.

> Signed-off-by: Yun Hsiang <[email protected]>
> ---
> include/uapi/linux/sched.h | 9 ++++++++-
> kernel/sched/core.c | 16 ++++++++++++----
> 2 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> index 3bac0a8ceab2..a12e88c362d8 100644
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -132,6 +132,9 @@ struct clone_args {
> #define SCHED_FLAG_KEEP_PARAMS 0x10
> #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> +#define SCHED_FLAG_UTIL_CLAMP_RESET_MIN 0x80
> +#define SCHED_FLAG_UTIL_CLAMP_RESET_MAX 0x100

What about adding just SCHED_FLAG_UTIL_CLAMP_RESET 0x08 ...

>
> #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> SCHED_FLAG_KEEP_PARAMS)
> @@ -139,10 +142,14 @@ struct clone_args {
> #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> SCHED_FLAG_UTIL_CLAMP_MAX)

... making it part of SCHED_FLAG_UTIL_CLAMP ...

>
> +#define SCHED_FLAG_UTIL_CLAMP_RESET (SCHED_FLAG_UTIL_CLAMP_RESET_MIN | \
> + SCHED_FLAG_UTIL_CLAMP_RESET_MAX)
> +
> #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> SCHED_FLAG_RECLAIM | \
> SCHED_FLAG_DL_OVERRUN | \
> SCHED_FLAG_KEEP_ALL | \
> - SCHED_FLAG_UTIL_CLAMP)
> + SCHED_FLAG_UTIL_CLAMP | \
> + SCHED_FLAG_UTIL_CLAMP_RESET)


... and use it in conjunction with the existing _CLAMP_{MIN,MAX} to know
which clamp should be reset?


> #endif /* _UAPI_LINUX_SCHED_H */
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9a2fbf98fd6f..ed4cb412dde7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1207,15 +1207,22 @@ static void __setscheduler_uclamp(struct task_struct *p,
> uclamp_se_set(uc_se, clamp_value, false);
> }
>
> - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> + if (likely(!(attr->sched_flags &
> + (SCHED_FLAG_UTIL_CLAMP | SCHED_FLAG_UTIL_CLAMP_RESET))))
> return;

This check will not be changed, while we will have to add a bypass in
uclamp_validate().

>
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET_MIN) {
> + uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> + 0, false);
> + } else if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> attr->sched_util_min, true);
> }
>

These checks also will have to be updated to check _RESET and
_{MIN,MAX} combinations.

Bonus point would be to be possible to pass in just the _RESET flag if
we want to reset both clamps. IOW, passing in _RESET only should be
consumed as if we passed in _RESET|_MIN|_MAX.

Caveat, RT tasks have got a special 'reset value' for _MIN.
We should check and ensure __uclamp_update_uti_min_rt_default() is
property called for those tasks, which likely will require some
additional refactoring :/

> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET_MAX) {
> + uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> + 1024, false);
> + } else if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> attr->sched_util_max, true);
> }
> @@ -4901,7 +4908,8 @@ static int __sched_setscheduler(struct task_struct *p,
> goto change;
> if (dl_policy(policy) && dl_param_changed(p, attr))
> goto change;
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)
> + if (attr->sched_flags &
> + (SCHED_FLAG_UTIL_CLAMP | SCHED_FLAG_UTIL_CLAMP_RESET))
> goto change;
>
> p->sched_reset_on_fork = reset_on_fork;

2020-10-13 18:25:33

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 10/13/20 10:21, Patrick Bellasi wrote:
>
> Hi Yun,
> thanks for sharing this new implementation.
>
> On Mon, Oct 12, 2020 at 18:31:40 +0200, Yun Hsiang <[email protected]> wrote...
>
> > If the user wants to stop controlling uclamp and let the task inherit
> > the value from the group, we need a method to reset.
> >
> > Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> > sched_setattr syscall.
>
> Looks like what you say here is not what you code, since you actually
> add two new flags, _RESET_{MIN,MAX}.
>
> I think we value instead a simple user-space interface where just the
> additional one flag _RESET should be good enough.
>
> > Signed-off-by: Yun Hsiang <[email protected]>
> > ---
> > include/uapi/linux/sched.h | 9 ++++++++-
> > kernel/sched/core.c | 16 ++++++++++++----
> > 2 files changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> > index 3bac0a8ceab2..a12e88c362d8 100644
> > --- a/include/uapi/linux/sched.h
> > +++ b/include/uapi/linux/sched.h
> > @@ -132,6 +132,9 @@ struct clone_args {
> > #define SCHED_FLAG_KEEP_PARAMS 0x10
> > #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> > #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> > +#define SCHED_FLAG_UTIL_CLAMP_RESET_MIN 0x80
> > +#define SCHED_FLAG_UTIL_CLAMP_RESET_MAX 0x100
>
> What about adding just SCHED_FLAG_UTIL_CLAMP_RESET 0x08 ...
>
> >
> > #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> > SCHED_FLAG_KEEP_PARAMS)
> > @@ -139,10 +142,14 @@ struct clone_args {
> > #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> > SCHED_FLAG_UTIL_CLAMP_MAX)
>
> ... making it part of SCHED_FLAG_UTIL_CLAMP ...
>
> >
> > +#define SCHED_FLAG_UTIL_CLAMP_RESET (SCHED_FLAG_UTIL_CLAMP_RESET_MIN | \
> > + SCHED_FLAG_UTIL_CLAMP_RESET_MAX)
> > +
> > #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> > SCHED_FLAG_RECLAIM | \
> > SCHED_FLAG_DL_OVERRUN | \
> > SCHED_FLAG_KEEP_ALL | \
> > - SCHED_FLAG_UTIL_CLAMP)
> > + SCHED_FLAG_UTIL_CLAMP | \
> > + SCHED_FLAG_UTIL_CLAMP_RESET)
>
>
> ... and use it in conjunction with the existing _CLAMP_{MIN,MAX} to know
> which clamp should be reset?

I think the RESET should restore *both* MIN and MAX and reset the user_defined
flag. Since the latter is the main purpose of this interface, I don't think you
can reset the user_defined flag without resetting both MIN and MAX to
uclamp_none[UCLAMP_MIN/MAX].

>
> > #endif /* _UAPI_LINUX_SCHED_H */
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 9a2fbf98fd6f..ed4cb412dde7 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1207,15 +1207,22 @@ static void __setscheduler_uclamp(struct task_struct *p,
> > uclamp_se_set(uc_se, clamp_value, false);
> > }
> >
> > - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> > + if (likely(!(attr->sched_flags &
> > + (SCHED_FLAG_UTIL_CLAMP | SCHED_FLAG_UTIL_CLAMP_RESET))))
> > return;
>
> This check will not be changed, while we will have to add a bypass in
> uclamp_validate().
>
> >
> > - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET_MIN) {
> > + uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> > + 0, false);
> > + } else if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> > uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> > attr->sched_util_min, true);
> > }
> >
>
> These checks also will have to be updated to check _RESET and
> _{MIN,MAX} combinations.
>
> Bonus point would be to be possible to pass in just the _RESET flag if
> we want to reset both clamps. IOW, passing in _RESET only should be
> consumed as if we passed in _RESET|_MIN|_MAX.
>
> Caveat, RT tasks have got a special 'reset value' for _MIN.
> We should check and ensure __uclamp_update_uti_min_rt_default() is
> property called for those tasks, which likely will require some
> additional refactoring :/

Hmm I am probably missing something. But if the SCHED_FLAG_UTIL_CLAMP_RESET is
set, just reset uc_se->user_defined in the loop in __setscheduler_uclamp().
This should take care of doing the reset properly then. Including for RT tasks.

So IMO you just need a single SCHED_FLAG_UTIL_CLAMP_RESET that if set in the
attr, you just execute that loop in __setscheduler_uclamp() + reset
uc_se->user_defined.

It should be invalid to pass the SCHED_FLAG_UTIL_CLAMP_RESET with
SCHED_FLAG_UTIL_CLAMP_MIN/MAX. Both have contradictory meaning IMO.
If user passes both we should return an EINVAL error.

Thanks

--
Qais Yousef

2020-10-13 18:51:31

by Patrick Bellasi

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp


On Tue, Oct 13, 2020 at 15:32:46 +0200, Qais Yousef <[email protected]> wrote...

> On 10/13/20 13:46, Patrick Bellasi wrote:
>> > So IMO you just need a single SCHED_FLAG_UTIL_CLAMP_RESET that if set in the
>> > attr, you just execute that loop in __setscheduler_uclamp() + reset
>> > uc_se->user_defined.
>> >
>> > It should be invalid to pass the SCHED_FLAG_UTIL_CLAMP_RESET with
>> > SCHED_FLAG_UTIL_CLAMP_MIN/MAX. Both have contradictory meaning IMO.
>> > If user passes both we should return an EINVAL error.
>>
>> Passing in _CLAMP_RESET|_CLAMP_MIN will mean reset the min value while
>> keeping the max at whatever it is. I think there could be cases where
>> this support could be on hand.
>
> I am not convinced personally. I'm anxious about what this fine grained control
> means and how it should be used. I think less is more in this case and we can
> always relax the restriction (appropriately) later if it's *really* required.
>
> Particularly the fact that this user_defined is per uclamp_se and that it
> affects the cgroup behavior is implementation details this API shouldn't rely
> on.

The user_defined flag is an implementation details: true, but since the
beginning uclamp _always_ allowed a task to set only one of its clamp
values.

That's why we have UTIL_CLAMP_{MIN,MAX} as separate flags and all the
logic in place to set only one of the two.


> A generic RESET my uclamp settings makes more sense for me as a long term
> API to maintain.
>
> Actually maybe we should even go for a more explicit
> SCHED_FLAG_UTIL_CLAMP_INHERIT_CGROUP flag instead. If we decide to abandon the
> support for this feature in the future, at least we can make it return an error
> without affecting other functionality because of the implicit nature of
> SCHED_FLAG_UTIL_CLAMP_RESET means inherit cgroup value too.

That's not true and it's an even worst implementation detail what you
want to expose.

A task without a task specific clamp _always_ inherits the system
defaults. Resetting a task specific clamp already makes sense also
_without_ cgroups. It means: just do whatever the system allows you to
do.

Only if you are running with CGRoups enabled and the task happens to be
_not_ in the root group, the "CGroups inheritance" happens.
But that's exactly an internal detail a task should not care about.


> That being said, I am not strongly against the fine grained approach if that's
> what Yun wants now or what you both prefer.

It's not a fine grained approach, it's just adding a reset mechanism for
what uclamp already allows to do: setting min and max clamps
independently.

Regarding use cases, I also believe we have many more use cases of tasks
interested in setting/resetting just one clamp than tasks interested in
"fine grain" controlling both clamps at the same time.


> I just think the name of the flag needs to change to be more explicit
> too then.

I don't agree on that and, again, I see much more fine grained details and
internals exposure in what you propose compared to a single generic
_RESET flag.

> It'd be good to hear what others think.

I agree on that ;)

2020-10-14 06:26:29

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

Hi Yun,

On 12/10/2020 18:31, Yun Hsiang wrote:
> If the user wants to stop controlling uclamp and let the task inherit
> the value from the group, we need a method to reset.
>
> Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> sched_setattr syscall.

before we decide on how to implement the 'uclamp user_defined reset'
feature, could we come back to your use case in
https://lkml.kernel.org/r/20201002053812.GA176142@ubuntu ?

Lets just consider uclamp min for now. We have:

(1) system-wide:

# cat /proc/sys/kernel/sched_util_clamp_min

1024

(2) tg (hierarchy) with top-app's cpu.uclamp.min to ~200 (20% of 1024):

# cat /sys/fs/cgroup/cpu/top-app/cpu.uclamp.min
20

(3) and 2 cfs tasks A and B in top-app:

# cat /sys/fs/cgroup/cpu/top-app/tasks

pid_A
pid_B

Then you set A and B's uclamp min to 100. A and B are now user_defined.
A and B's effective uclamp min value is 100.

Since the task uclamp min values (3) are less than (1) and (2), their
uclamp min value is not affected by (1) or (2).

If A doesn't want to control itself anymore, it can set its uclamp min
to e.g. 300. Now A's effective uclamp min value is ~200, i.e. controlled
by (2), the one of B stays 100.

So the policy is:

(a) If the user_defined task wants to control it's uclamp, use task
uclamp value less than the tg (hierarchy) (and the system-wide)
value.

(b) If the user_defined task doesn't want to control it's uclamp
anymore, use a uclamp value greater than or equal the tg (hierarchy)
(and the system-wide) value.

So where exactly is the use case which would require a 'uclamp
user_defined reset' functionality?

2020-10-14 17:41:09

by Yun Hsiang

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On Tue, Oct 13, 2020 at 06:52:03PM +0200, Patrick Bellasi wrote:

Hi Patrick,

>
> On Tue, Oct 13, 2020 at 15:32:46 +0200, Qais Yousef <[email protected]> wrote...
>
> > On 10/13/20 13:46, Patrick Bellasi wrote:
> >> > So IMO you just need a single SCHED_FLAG_UTIL_CLAMP_RESET that if set in the
> >> > attr, you just execute that loop in __setscheduler_uclamp() + reset
> >> > uc_se->user_defined.
> >> >
> >> > It should be invalid to pass the SCHED_FLAG_UTIL_CLAMP_RESET with
> >> > SCHED_FLAG_UTIL_CLAMP_MIN/MAX. Both have contradictory meaning IMO.
> >> > If user passes both we should return an EINVAL error.
> >>
> >> Passing in _CLAMP_RESET|_CLAMP_MIN will mean reset the min value while
> >> keeping the max at whatever it is. I think there could be cases where
> >> this support could be on hand.
> >
> > I am not convinced personally. I'm anxious about what this fine grained control
> > means and how it should be used. I think less is more in this case and we can
> > always relax the restriction (appropriately) later if it's *really* required.
> >
> > Particularly the fact that this user_defined is per uclamp_se and that it
> > affects the cgroup behavior is implementation details this API shouldn't rely
> > on.
>
> The user_defined flag is an implementation details: true, but since the
> beginning uclamp _always_ allowed a task to set only one of its clamp
> values.
>
> That's why we have UTIL_CLAMP_{MIN,MAX} as separate flags and all the
> logic in place to set only one of the two.

I agree that UTIL_CLAMP_{MIN,MAX} should be able to set separately.
So the flag usage might be
_CLAMP_RESET => ?
_CLAMP_RESET | _CLAMP_MIN => reset min value
_CLAMP_RESET | _CLAMP_MAX => reset max value
_CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
If user pass _CLAMP_RESET without _CLAMP_MIN or _CLAMP MAX,
Should we reset both min & max value or return EINVAL error?

>
>
> > A generic RESET my uclamp settings makes more sense for me as a long term
> > API to maintain.
> >
> > Actually maybe we should even go for a more explicit
> > SCHED_FLAG_UTIL_CLAMP_INHERIT_CGROUP flag instead. If we decide to abandon the
> > support for this feature in the future, at least we can make it return an error
> > without affecting other functionality because of the implicit nature of
> > SCHED_FLAG_UTIL_CLAMP_RESET means inherit cgroup value too.
>
> That's not true and it's an even worst implementation detail what you
> want to expose.
>
> A task without a task specific clamp _always_ inherits the system
> defaults. Resetting a task specific clamp already makes sense also
> _without_ cgroups. It means: just do whatever the system allows you to
> do.
>
> Only if you are running with CGRoups enabled and the task happens to be
> _not_ in the root group, the "CGroups inheritance" happens.
> But that's exactly an internal detail a task should not care about.

I prefer to use SCHED_FLAG_UTIL_CLAMP_RESET because cgroup inheritance
is default behavior when task doesn't set it's uclamp.

>
> > That being said, I am not strongly against the fine grained approach if that's
> > what Yun wants now or what you both prefer.
>
> It's not a fine grained approach, it's just adding a reset mechanism for
> what uclamp already allows to do: setting min and max clamps
> independently.
>
> Regarding use cases, I also believe we have many more use cases of tasks
> interested in setting/resetting just one clamp than tasks interested in
> "fine grain" controlling both clamps at the same time.
>
>
> > I just think the name of the flag needs to change to be more explicit
> > too then.
>
> I don't agree on that and, again, I see much more fine grained details and
> internals exposure in what you propose compared to a single generic
> _RESET flag.
>
> > It'd be good to hear what others think.
>
> I agree on that ;)
>

2020-10-14 21:54:02

by Patrick Bellasi

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp


On Tue, Oct 13, 2020 at 22:25:48 +0200, Dietmar Eggemann <[email protected]> wrote...

Hi Dietmar,

> Hi Yun,
>
> On 12/10/2020 18:31, Yun Hsiang wrote:
>> If the user wants to stop controlling uclamp and let the task inherit
>> the value from the group, we need a method to reset.
>>
>> Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
>> sched_setattr syscall.
>
> before we decide on how to implement the 'uclamp user_defined reset'
> feature, could we come back to your use case in
> https://lkml.kernel.org/r/20201002053812.GA176142@ubuntu ?
>
> Lets just consider uclamp min for now. We have:
>
> (1) system-wide:
>
> # cat /proc/sys/kernel/sched_util_clamp_min
>
> 1024
>
> (2) tg (hierarchy) with top-app's cpu.uclamp.min to ~200 (20% of 1024):
>
> # cat /sys/fs/cgroup/cpu/top-app/cpu.uclamp.min
> 20
>
> (3) and 2 cfs tasks A and B in top-app:
>
> # cat /sys/fs/cgroup/cpu/top-app/tasks
>
> pid_A
> pid_B
>
> Then you set A and B's uclamp min to 100. A and B are now user_defined.
> A and B's effective uclamp min value is 100.
>
> Since the task uclamp min values (3) are less than (1) and (2), their
> uclamp min value is not affected by (1) or (2).
>
> If A doesn't want to control itself anymore, it can set its uclamp min
> to e.g. 300. Now A's effective uclamp min value is ~200, i.e. controlled
> by (2), the one of B stays 100.
>
> So the policy is:
>
> (a) If the user_defined task wants to control it's uclamp, use task
> uclamp value less than the tg (hierarchy) (and the system-wide)
> value.
>
> (b) If the user_defined task doesn't want to control it's uclamp
> anymore, use a uclamp value greater than or equal the tg (hierarchy)
> (and the system-wide) value.
>
> So where exactly is the use case which would require a 'uclamp
> user_defined reset' functionality?

Not sure what's the specific use-case Yun is after, but I have at least
one in my mind.

Let say a task does not need boost at all, independently from
the cgroup it's configured to run into. We can go on and set its task
specific value to util_min=0.

In this case, when the task is running alone on a CPU, it will get
always the minimum OPP, independently from its utilization.

Now, after a while (e.g. some special event happens) we want to relax
this constraint and allow the task to run:
1. at whatever OPP is required by its utilization
2. with any additional boost possibly enforced by its cgroup

Right now we have only quite cumbersome or hack solution:
a) go check the current cgroup util_min value and set for the task
something higher than that
b) set task::util_min=1024 thus asking for the maximum possible boost

Solution a) is more code for userspace and it's also racy. Solution b)
is misleading since the task does not really want to run at 1024.
It's also potentially over-killing in case the task should be moved to
the root group, which is normally unbounded and thus the task will get
executed always at the max OPP without any specific reason why.

A simple _UCLAMP_RESET flag will allow user-space to easily switch a
tasks to the default behavior (follow utilization or recommended
boosts) which is what a task usually gets when it does not opt-in to
uclamp.

Looking forward to see if Yun has an even more specific use-case.

2020-10-14 23:05:08

by Yun Hsiang

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On Tue, Oct 13, 2020 at 10:25:48PM +0200, Dietmar Eggemann wrote:
> Hi Yun,
>
> On 12/10/2020 18:31, Yun Hsiang wrote:
> > If the user wants to stop controlling uclamp and let the task inherit
> > the value from the group, we need a method to reset.
> >
> > Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> > sched_setattr syscall.
>
> before we decide on how to implement the 'uclamp user_defined reset'
> feature, could we come back to your use case in
> https://lkml.kernel.org/r/20201002053812.GA176142@ubuntu ?
>
> Lets just consider uclamp min for now. We have:
>
> (1) system-wide:
>
> # cat /proc/sys/kernel/sched_util_clamp_min
>
> 1024
>
> (2) tg (hierarchy) with top-app's cpu.uclamp.min to ~200 (20% of 1024):
>
> # cat /sys/fs/cgroup/cpu/top-app/cpu.uclamp.min
> 20
>
> (3) and 2 cfs tasks A and B in top-app:
>
> # cat /sys/fs/cgroup/cpu/top-app/tasks
>
> pid_A
> pid_B
>
> Then you set A and B's uclamp min to 100. A and B are now user_defined.
> A and B's effective uclamp min value is 100.
>
> Since the task uclamp min values (3) are less than (1) and (2), their
> uclamp min value is not affected by (1) or (2).
>
> If A doesn't want to control itself anymore, it can set its uclamp min
> to e.g. 300. Now A's effective uclamp min value is ~200, i.e. controlled
> by (2), the one of B stays 100.

The tg uclamp value may also change. If top-app's cpu.uclamp.min change
to 50 (~500), then task A's effective uclamp min value is 300 not ~500.
We can set task A's uclamp to 1024, it will be restricted by the tg.
But when task A move to root group, it's effective uclamp min value
will be 1024 not 0. If a task is in root group and it doesn't want to
control it's uclamp, the effective uclamp min value of that task should be 0.
So I think reset functionality is needed.

>
> So the policy is:
>
> (a) If the user_defined task wants to control it's uclamp, use task
> uclamp value less than the tg (hierarchy) (and the system-wide)
> value.
>
> (b) If the user_defined task doesn't want to control it's uclamp
> anymore, use a uclamp value greater than or equal the tg (hierarchy)
> (and the system-wide) value.
>
> So where exactly is the use case which would require a 'uclamp
> user_defined reset' functionality?

2020-10-15 12:27:31

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 14/10/2020 16:50, Patrick Bellasi wrote:
>
> On Tue, Oct 13, 2020 at 22:25:48 +0200, Dietmar Eggemann <[email protected]> wrote...

[...]

>> On 12/10/2020 18:31, Yun Hsiang wrote:

[...]

> Not sure what's the specific use-case Yun is after, but I have at least
> one in my mind.
>
> Let say a task does not need boost at all, independently from
> the cgroup it's configured to run into. We can go on and set its task
> specific value to util_min=0.
>
> In this case, when the task is running alone on a CPU, it will get
> always the minimum OPP, independently from its utilization.
>
> Now, after a while (e.g. some special event happens) we want to relax
> this constraint and allow the task to run:
> 1. at whatever OPP is required by its utilization
> 2. with any additional boost possibly enforced by its cgroup
>
> Right now we have only quite cumbersome or hack solution:
> a) go check the current cgroup util_min value and set for the task
> something higher than that

IMHO, it's not only the current taskgroup. You would have to check the
whole hierarchy (including system-wide).

> b) set task::util_min=1024 thus asking for the maximum possible boost
>
> Solution a) is more code for userspace and it's also racy. Solution b)
> is misleading since the task does not really want to run at 1024.
> It's also potentially over-killing in case the task should be moved to
> the root group, which is normally unbounded and thus the task will get
> executed always at the max OPP without any specific reason why.
>
> A simple _UCLAMP_RESET flag will allow user-space to easily switch a
> tasks to the default behavior (follow utilization or recommended
> boosts) which is what a task usually gets when it does not opt-in to
> uclamp.

OK, so restricting (eclipsing) 'user_defined' task uclamp values to
taskgroup (hierarchy) values by setting the task value > the taskgroup
(hierarchy) value is cumbersome.

A task uclamp reset function will force this 'user_defined' task
automatically under taskgroup (hierarchy) control by turning it into a
'!user_defined' task again.

... and this will facilitate easier uclamp userspace controller.

2020-10-15 12:32:16

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 14/10/2020 17:00, Yun Hsiang wrote:
> On Tue, Oct 13, 2020 at 10:25:48PM +0200, Dietmar Eggemann wrote:
>> Hi Yun,
>>
>> On 12/10/2020 18:31, Yun Hsiang wrote:

[...]

> The tg uclamp value may also change. If top-app's cpu.uclamp.min change
> to 50 (~500), then task A's effective uclamp min value is 300 not ~500.
> We can set task A's uclamp to 1024, it will be restricted by the tg.
> But when task A move to root group, it's effective uclamp min value
> will be 1024 not 0. If a task is in root group and it doesn't want to
> control it's uclamp, the effective uclamp min value of that task should be 0.
> So I think reset functionality is needed.

OK, looks like the intended solution { a) or b) in
https://lkml.kernel.org/r/[email protected]} is not really
feasible.

So we do need the uclamp reset to enable throughout the entire lifetime
of task p 'p is !user_defined -> p is controlled by taskgroup hierarchy)'.