2020-11-03 02:40:15

by Yun Hsiang

[permalink] [raw]
Subject: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

If the user wants to stop controlling uclamp and let the task inherit
the value from the group, we need a method to reset.

Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
sched_setattr syscall.

The policy is
_CLAMP_RESET => reset both min and max
_CLAMP_RESET | _CLAMP_MIN => reset min value
_CLAMP_RESET | _CLAMP_MAX => reset max value
_CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max

Signed-off-by: Yun Hsiang <[email protected]>
Reported-by: kernel test robot <[email protected]>
---
include/uapi/linux/sched.h | 7 +++--
kernel/sched/core.c | 59 ++++++++++++++++++++++++++++----------
2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 3bac0a8ceab2..6c823ddb1a1e 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -132,17 +132,20 @@ struct clone_args {
#define SCHED_FLAG_KEEP_PARAMS 0x10
#define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
#define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
+#define SCHED_FLAG_UTIL_CLAMP_RESET 0x80

#define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
SCHED_FLAG_KEEP_PARAMS)

#define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
- SCHED_FLAG_UTIL_CLAMP_MAX)
+ SCHED_FLAG_UTIL_CLAMP_MAX | \
+ SCHED_FLAG_UTIL_CLAMP_RESET)

#define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
SCHED_FLAG_RECLAIM | \
SCHED_FLAG_DL_OVERRUN | \
SCHED_FLAG_KEEP_ALL | \
- SCHED_FLAG_UTIL_CLAMP)
+ SCHED_FLAG_UTIL_CLAMP | \
+ SCHED_FLAG_UTIL_CLAMP_RESET)

#endif /* _UAPI_LINUX_SCHED_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8160ab5263f8..6ae463b64834 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1004,7 +1004,7 @@ unsigned int uclamp_rq_max_value(struct rq *rq, enum uclamp_id clamp_id,
return uclamp_idle_value(rq, clamp_id, clamp_value);
}

-static void __uclamp_update_util_min_rt_default(struct task_struct *p)
+static inline void __uclamp_update_util_min_rt_default(struct task_struct *p)
{
unsigned int default_util_min;
struct uclamp_se *uc_se;
@@ -1413,8 +1413,14 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
static int uclamp_validate(struct task_struct *p,
const struct sched_attr *attr)
{
- unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
- unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
+ unsigned int lower_bound, upper_bound;
+
+ /* Do not check uclamp attributes values in reset case. */
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
+ return 0;
+
+ lower_bound = p->uclamp_req[UCLAMP_MIN].value;
+ upper_bound = p->uclamp_req[UCLAMP_MAX].value;

if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
lower_bound = attr->sched_util_min;
@@ -1438,20 +1444,43 @@ static int uclamp_validate(struct task_struct *p,
return 0;
}

+static int uclamp_reset(enum uclamp_id clamp_id, unsigned long flags)
+{
+ /* No _UCLAMP_RESET flag set: do not reset */
+ if (!(flags & SCHED_FLAG_UTIL_CLAMP_RESET))
+ return false;
+
+ /* Only _UCLAMP_RESET flag set: reset both clamps */
+ if (!(flags & (SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX)))
+ return true;
+
+ /* Both _UCLAMP_RESET and _UCLAMP_MIN flags are set: reset only min */
+ if ((flags & SCHED_FLAG_UTIL_CLAMP_MIN) && clamp_id == UCLAMP_MIN)
+ return true;
+
+ /* Both _UCLAMP_RESET and _UCLAMP_MAX flags are set: reset only max */
+ if ((flags & SCHED_FLAG_UTIL_CLAMP_MAX) && clamp_id == UCLAMP_MAX)
+ return true;
+
+ return false;
+}
+
static void __setscheduler_uclamp(struct task_struct *p,
const struct sched_attr *attr)
{
enum uclamp_id clamp_id;

/*
- * On scheduling class change, reset to default clamps for tasks
- * without a task-specific value.
+ * Reset to default clamps on forced _UCLAMP_RESET (always) and
+ * for tasks without a task-specific value (on scheduling class change).
*/
for_each_clamp_id(clamp_id) {
+ unsigned int clamp_value;
struct uclamp_se *uc_se = &p->uclamp_req[clamp_id];

/* Keep using defined clamps across class changes */
- if (uc_se->user_defined)
+ if (!uclamp_reset(clamp_id, attr->sched_flags) &&
+ uc_se->user_defined)
continue;

/*
@@ -1459,24 +1488,24 @@ static void __setscheduler_uclamp(struct task_struct *p,
* at runtime.
*/
if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
- __uclamp_update_util_min_rt_default(p);
+ clamp_value = sysctl_sched_uclamp_util_min_rt_default;
else
- uclamp_se_set(uc_se, uclamp_none(clamp_id), false);
+ clamp_value = uclamp_none(clamp_id);

+ uclamp_se_set(uc_se, clamp_value, false);
}

- if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
+ if (likely(!(attr->sched_flags && SCHED_FLAG_UTIL_CLAMP)) ||
+ attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
return;

- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
- attr->sched_util_min, true);
- }
+ attr->sched_util_min, true);

- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
- attr->sched_util_max, true);
- }
+ attr->sched_util_max, true);
}

static void uclamp_fork(struct task_struct *p)
--
2.25.1


2020-11-03 13:49:19

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

Hi Yun

+Juri (A question for you below)

On 11/03/20 10:37, Yun Hsiang wrote:
> If the user wants to stop controlling uclamp and let the task inherit
> the value from the group, we need a method to reset.
>
> Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> sched_setattr syscall.
>
> The policy is
> _CLAMP_RESET => reset both min and max
> _CLAMP_RESET | _CLAMP_MIN => reset min value
> _CLAMP_RESET | _CLAMP_MAX => reset max value
> _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
>
> Signed-off-by: Yun Hsiang <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> ---
> include/uapi/linux/sched.h | 7 +++--
> kernel/sched/core.c | 59 ++++++++++++++++++++++++++++----------
> 2 files changed, 49 insertions(+), 17 deletions(-)
>
> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> index 3bac0a8ceab2..6c823ddb1a1e 100644
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -132,17 +132,20 @@ struct clone_args {
> #define SCHED_FLAG_KEEP_PARAMS 0x10
> #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> +#define SCHED_FLAG_UTIL_CLAMP_RESET 0x80

The new flag needs documentation about how it should be used. It has a none
obvious policy that we can't expect users to just get it.

>
> #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> SCHED_FLAG_KEEP_PARAMS)
>
> #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> - SCHED_FLAG_UTIL_CLAMP_MAX)
> + SCHED_FLAG_UTIL_CLAMP_MAX | \
> + SCHED_FLAG_UTIL_CLAMP_RESET)

Either do this..

>
> #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> SCHED_FLAG_RECLAIM | \
> SCHED_FLAG_DL_OVERRUN | \
> SCHED_FLAG_KEEP_ALL | \
> - SCHED_FLAG_UTIL_CLAMP)
> + SCHED_FLAG_UTIL_CLAMP | \
> + SCHED_FLAG_UTIL_CLAMP_RESET)

Or this.

I checked glibc and it seems they don't use the sched.h from linux and more
surprisingly they don't seem to have a wrapper for sched_setattr(). bionic libc
from Android does take sched.h from linux, but didn't find any user. So we
might be okay with modifying these here.

I still would like us to document better what we expect from these defines.
Are they for internal kernel use or really for user space? If the former we
should take them out of here. If the latter, then adding the RESET is dangerous
as it'll cause an observable change in behavior; that is if an application was
using SCHED_FLAG_ALL or SCHED_FLAG_UTIL_CLAMP to update the UTIL_MIN/MAX of
a task, existing binaries will find out now that instead of modifying the value
they're actually resetting them.

Juri, it seems you originally introduced SCHED_FLAG_ALL. I assume this was some
sort of shorthand for user space, not the kernel?

If the latter, I think we should move SCHED_FLAG_ALL and SCHED_FLAG_UTIL_CLAMP
to core.c and hope no one is already relying on them.

>
> #endif /* _UAPI_LINUX_SCHED_H */
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8160ab5263f8..6ae463b64834 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1004,7 +1004,7 @@ unsigned int uclamp_rq_max_value(struct rq *rq, enum uclamp_id clamp_id,
> return uclamp_idle_value(rq, clamp_id, clamp_value);
> }
>
> -static void __uclamp_update_util_min_rt_default(struct task_struct *p)
> +static inline void __uclamp_update_util_min_rt_default(struct task_struct *p)

Seems unrelated change. Worth a mention in the commit message at least.

> {
> unsigned int default_util_min;
> struct uclamp_se *uc_se;
> @@ -1413,8 +1413,14 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> static int uclamp_validate(struct task_struct *p,
> const struct sched_attr *attr)
> {
> - unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> - unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> + unsigned int lower_bound, upper_bound;
> +
> + /* Do not check uclamp attributes values in reset case. */
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> + return 0;
> +
> + lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> + upper_bound = p->uclamp_req[UCLAMP_MAX].value;
>
> if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> lower_bound = attr->sched_util_min;
> @@ -1438,20 +1444,43 @@ static int uclamp_validate(struct task_struct *p,
> return 0;
> }
>
> +static int uclamp_reset(enum uclamp_id clamp_id, unsigned long flags)

Add the policy part of the commit message as a documentation to this function
please.

ie:

/*
* The policy is
* _CLAMP_RESET => reset both min and max
* _CLAMP_RESET | _CLAMP_MIN => reset min value
* _CLAMP_RESET | _CLAMP_MAX => reset max value
* _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
*/

> +{
> + /* No _UCLAMP_RESET flag set: do not reset */
> + if (!(flags & SCHED_FLAG_UTIL_CLAMP_RESET))
> + return false;
> +
> + /* Only _UCLAMP_RESET flag set: reset both clamps */
> + if (!(flags & (SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX)))
> + return true;
> +
> + /* Both _UCLAMP_RESET and _UCLAMP_MIN flags are set: reset only min */
> + if ((flags & SCHED_FLAG_UTIL_CLAMP_MIN) && clamp_id == UCLAMP_MIN)
> + return true;
> +
> + /* Both _UCLAMP_RESET and _UCLAMP_MAX flags are set: reset only max */
> + if ((flags & SCHED_FLAG_UTIL_CLAMP_MAX) && clamp_id == UCLAMP_MAX)
> + return true;
> +
> + return false;
> +}
> +
> static void __setscheduler_uclamp(struct task_struct *p,
> const struct sched_attr *attr)
> {
> enum uclamp_id clamp_id;
>
> /*
> - * On scheduling class change, reset to default clamps for tasks
> - * without a task-specific value.
> + * Reset to default clamps on forced _UCLAMP_RESET (always) and
> + * for tasks without a task-specific value (on scheduling class change).
> */
> for_each_clamp_id(clamp_id) {
> + unsigned int clamp_value;
> struct uclamp_se *uc_se = &p->uclamp_req[clamp_id];
>
> /* Keep using defined clamps across class changes */
> - if (uc_se->user_defined)
> + if (!uclamp_reset(clamp_id, attr->sched_flags) &&
> + uc_se->user_defined)
> continue;
>
> /*
> @@ -1459,24 +1488,24 @@ static void __setscheduler_uclamp(struct task_struct *p,
> * at runtime.
> */
> if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> - __uclamp_update_util_min_rt_default(p);
> + clamp_value = sysctl_sched_uclamp_util_min_rt_default;
> else
> - uclamp_se_set(uc_se, uclamp_none(clamp_id), false);
> + clamp_value = uclamp_none(clamp_id);
>
> + uclamp_se_set(uc_se, clamp_value, false);

This is another unrelated change. Add a comment in the commit message at least
please.

> }
>
> - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> + if (likely(!(attr->sched_flags && SCHED_FLAG_UTIL_CLAMP)) ||
> + attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> return;
>
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> - attr->sched_util_min, true);
> - }
> + attr->sched_util_min, true);
>
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> - attr->sched_util_max, true);
> - }
> + attr->sched_util_max, true);

These two hunks seem unrelated too. Multi line statement should still have
braces AFAIK. Why change it?

Generally personally I am not fond of mixing 'cleanup' and modifications.
Especially when they are unrelated. They come across as churn to me but
I won't insist on removing/splitting them but at least document them in the
commit message with good reasons please.

Thanks

--
Qais Yousef

> }
>
> static void uclamp_fork(struct task_struct *p)
> --
> 2.25.1
>

2020-11-03 13:50:12

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

Oops, +Juri for real this time.

On 11/03/20 13:46, Qais Yousef wrote:
> Hi Yun
>
> +Juri (A question for you below)
>
> On 11/03/20 10:37, Yun Hsiang wrote:
> > If the user wants to stop controlling uclamp and let the task inherit
> > the value from the group, we need a method to reset.
> >
> > Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> > sched_setattr syscall.
> >
> > The policy is
> > _CLAMP_RESET => reset both min and max
> > _CLAMP_RESET | _CLAMP_MIN => reset min value
> > _CLAMP_RESET | _CLAMP_MAX => reset max value
> > _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
> >
> > Signed-off-by: Yun Hsiang <[email protected]>
> > Reported-by: kernel test robot <[email protected]>
> > ---
> > include/uapi/linux/sched.h | 7 +++--
> > kernel/sched/core.c | 59 ++++++++++++++++++++++++++++----------
> > 2 files changed, 49 insertions(+), 17 deletions(-)
> >
> > diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> > index 3bac0a8ceab2..6c823ddb1a1e 100644
> > --- a/include/uapi/linux/sched.h
> > +++ b/include/uapi/linux/sched.h
> > @@ -132,17 +132,20 @@ struct clone_args {
> > #define SCHED_FLAG_KEEP_PARAMS 0x10
> > #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> > #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> > +#define SCHED_FLAG_UTIL_CLAMP_RESET 0x80
>
> The new flag needs documentation about how it should be used. It has a none
> obvious policy that we can't expect users to just get it.
>
> >
> > #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> > SCHED_FLAG_KEEP_PARAMS)
> >
> > #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> > - SCHED_FLAG_UTIL_CLAMP_MAX)
> > + SCHED_FLAG_UTIL_CLAMP_MAX | \
> > + SCHED_FLAG_UTIL_CLAMP_RESET)
>
> Either do this..
>
> >
> > #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> > SCHED_FLAG_RECLAIM | \
> > SCHED_FLAG_DL_OVERRUN | \
> > SCHED_FLAG_KEEP_ALL | \
> > - SCHED_FLAG_UTIL_CLAMP)
> > + SCHED_FLAG_UTIL_CLAMP | \
> > + SCHED_FLAG_UTIL_CLAMP_RESET)
>
> Or this.
>
> I checked glibc and it seems they don't use the sched.h from linux and more
> surprisingly they don't seem to have a wrapper for sched_setattr(). bionic libc
> from Android does take sched.h from linux, but didn't find any user. So we
> might be okay with modifying these here.
>
> I still would like us to document better what we expect from these defines.
> Are they for internal kernel use or really for user space? If the former we
> should take them out of here. If the latter, then adding the RESET is dangerous
> as it'll cause an observable change in behavior; that is if an application was
> using SCHED_FLAG_ALL or SCHED_FLAG_UTIL_CLAMP to update the UTIL_MIN/MAX of
> a task, existing binaries will find out now that instead of modifying the value
> they're actually resetting them.
>
> Juri, it seems you originally introduced SCHED_FLAG_ALL. I assume this was some
> sort of shorthand for user space, not the kernel?
>
> If the latter, I think we should move SCHED_FLAG_ALL and SCHED_FLAG_UTIL_CLAMP
> to core.c and hope no one is already relying on them.
>
> >
> > #endif /* _UAPI_LINUX_SCHED_H */
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 8160ab5263f8..6ae463b64834 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1004,7 +1004,7 @@ unsigned int uclamp_rq_max_value(struct rq *rq, enum uclamp_id clamp_id,
> > return uclamp_idle_value(rq, clamp_id, clamp_value);
> > }
> >
> > -static void __uclamp_update_util_min_rt_default(struct task_struct *p)
> > +static inline void __uclamp_update_util_min_rt_default(struct task_struct *p)
>
> Seems unrelated change. Worth a mention in the commit message at least.
>
> > {
> > unsigned int default_util_min;
> > struct uclamp_se *uc_se;
> > @@ -1413,8 +1413,14 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> > static int uclamp_validate(struct task_struct *p,
> > const struct sched_attr *attr)
> > {
> > - unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> > - unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> > + unsigned int lower_bound, upper_bound;
> > +
> > + /* Do not check uclamp attributes values in reset case. */
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> > + return 0;
> > +
> > + lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> > + upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> >
> > if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> > lower_bound = attr->sched_util_min;
> > @@ -1438,20 +1444,43 @@ static int uclamp_validate(struct task_struct *p,
> > return 0;
> > }
> >
> > +static int uclamp_reset(enum uclamp_id clamp_id, unsigned long flags)
>
> Add the policy part of the commit message as a documentation to this function
> please.
>
> ie:
>
> /*
> * The policy is
> * _CLAMP_RESET => reset both min and max
> * _CLAMP_RESET | _CLAMP_MIN => reset min value
> * _CLAMP_RESET | _CLAMP_MAX => reset max value
> * _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
> */
>
> > +{
> > + /* No _UCLAMP_RESET flag set: do not reset */
> > + if (!(flags & SCHED_FLAG_UTIL_CLAMP_RESET))
> > + return false;
> > +
> > + /* Only _UCLAMP_RESET flag set: reset both clamps */
> > + if (!(flags & (SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX)))
> > + return true;
> > +
> > + /* Both _UCLAMP_RESET and _UCLAMP_MIN flags are set: reset only min */
> > + if ((flags & SCHED_FLAG_UTIL_CLAMP_MIN) && clamp_id == UCLAMP_MIN)
> > + return true;
> > +
> > + /* Both _UCLAMP_RESET and _UCLAMP_MAX flags are set: reset only max */
> > + if ((flags & SCHED_FLAG_UTIL_CLAMP_MAX) && clamp_id == UCLAMP_MAX)
> > + return true;
> > +
> > + return false;
> > +}
> > +
> > static void __setscheduler_uclamp(struct task_struct *p,
> > const struct sched_attr *attr)
> > {
> > enum uclamp_id clamp_id;
> >
> > /*
> > - * On scheduling class change, reset to default clamps for tasks
> > - * without a task-specific value.
> > + * Reset to default clamps on forced _UCLAMP_RESET (always) and
> > + * for tasks without a task-specific value (on scheduling class change).
> > */
> > for_each_clamp_id(clamp_id) {
> > + unsigned int clamp_value;
> > struct uclamp_se *uc_se = &p->uclamp_req[clamp_id];
> >
> > /* Keep using defined clamps across class changes */
> > - if (uc_se->user_defined)
> > + if (!uclamp_reset(clamp_id, attr->sched_flags) &&
> > + uc_se->user_defined)
> > continue;
> >
> > /*
> > @@ -1459,24 +1488,24 @@ static void __setscheduler_uclamp(struct task_struct *p,
> > * at runtime.
> > */
> > if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> > - __uclamp_update_util_min_rt_default(p);
> > + clamp_value = sysctl_sched_uclamp_util_min_rt_default;
> > else
> > - uclamp_se_set(uc_se, uclamp_none(clamp_id), false);
> > + clamp_value = uclamp_none(clamp_id);
> >
> > + uclamp_se_set(uc_se, clamp_value, false);
>
> This is another unrelated change. Add a comment in the commit message at least
> please.
>
> > }
> >
> > - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> > + if (likely(!(attr->sched_flags && SCHED_FLAG_UTIL_CLAMP)) ||
> > + attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> > return;
> >
> > - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> > uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> > - attr->sched_util_min, true);
> > - }
> > + attr->sched_util_min, true);
> >
> > - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> > uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> > - attr->sched_util_max, true);
> > - }
> > + attr->sched_util_max, true);
>
> These two hunks seem unrelated too. Multi line statement should still have
> braces AFAIK. Why change it?
>
> Generally personally I am not fond of mixing 'cleanup' and modifications.
> Especially when they are unrelated. They come across as churn to me but
> I won't insist on removing/splitting them but at least document them in the
> commit message with good reasons please.
>
> Thanks
>
> --
> Qais Yousef
>
> > }
> >
> > static void uclamp_fork(struct task_struct *p)
> > --
> > 2.25.1
> >

2020-11-04 09:46:56

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 03/11/2020 14:48, Qais Yousef wrote:
> Oops, +Juri for real this time.
>
> On 11/03/20 13:46, Qais Yousef wrote:
>> Hi Yun
>>
>> +Juri (A question for you below)
>>
>> On 11/03/20 10:37, Yun Hsiang wrote:

[...]

>>> include/uapi/linux/sched.h | 7 +++--
>>> kernel/sched/core.c | 59 ++++++++++++++++++++++++++++----------
>>> 2 files changed, 49 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
>>> index 3bac0a8ceab2..6c823ddb1a1e 100644
>>> --- a/include/uapi/linux/sched.h
>>> +++ b/include/uapi/linux/sched.h
>>> @@ -132,17 +132,20 @@ struct clone_args {
>>> #define SCHED_FLAG_KEEP_PARAMS 0x10
>>> #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
>>> #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
>>> +#define SCHED_FLAG_UTIL_CLAMP_RESET 0x80
>>
>> The new flag needs documentation about how it should be used. It has a none
>> obvious policy that we can't expect users to just get it.

See (1) further below.

>>> #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
>>> SCHED_FLAG_KEEP_PARAMS)
>>>
>>> #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
>>> - SCHED_FLAG_UTIL_CLAMP_MAX)
>>> + SCHED_FLAG_UTIL_CLAMP_MAX | \
>>> + SCHED_FLAG_UTIL_CLAMP_RESET)
>>
>> Either do this..
>>
>>>
>>> #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
>>> SCHED_FLAG_RECLAIM | \
>>> SCHED_FLAG_DL_OVERRUN | \
>>> SCHED_FLAG_KEEP_ALL | \
>>> - SCHED_FLAG_UTIL_CLAMP)
>>> + SCHED_FLAG_UTIL_CLAMP | \
>>> + SCHED_FLAG_UTIL_CLAMP_RESET)
>>
>> Or this.
>>
>> I checked glibc and it seems they don't use the sched.h from linux and more
>> surprisingly they don't seem to have a wrapper for sched_setattr(). bionic libc
>> from Android does take sched.h from linux, but didn't find any user. So we
>> might be okay with modifying these here.

Schould be package linux-libc-dev. Debian 10 (buster-backports)
5.8.10-1~bpo10+1 contains the uclamp bits as well.

/usr/include/linux/sched/types.h
/usr/include/linux/sched.h

/usr/include/linux/sched.h contains SCHED_FLAG_UTIL_CLAMP and
SCHED_FLAG_ALL.

But there is no glibc wrapper for sched_[sg]etattr so syscall wrapping
is still needed.

>> I still would like us to document better what we expect from these defines.
>> Are they for internal kernel use or really for user space? If the former we
>> should take them out of here. If the latter, then adding the RESET is dangerous
>> as it'll cause an observable change in behavior; that is if an application was
>> using SCHED_FLAG_ALL or SCHED_FLAG_UTIL_CLAMP to update the UTIL_MIN/MAX of
>> a task, existing binaries will find out now that instead of modifying the value
>> they're actually resetting them.

I doubt that any application uses SCHED_FLAG_ALL so far since it already
mixes e.g. DL and UCLAMP stuff. AFAIK the only tools supporting uclamp
so far is rt-app and uclampset, which both use their own files for DL
and uclamp definition.

[..]

>> Add the policy part of the commit message as a documentation to this function
>> please.
>>
>> ie:
>>
>> /*
>> * The policy is
>> * _CLAMP_RESET => reset both min and max
>> * _CLAMP_RESET | _CLAMP_MIN => reset min value
>> * _CLAMP_RESET | _CLAMP_MAX => reset max value
>> * _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
>> */
>>

(1) Related to documentation, wouldn't it be better to document in
include/uapi/linux/sched.h, i.e. where the flags are defined, so it gets
exported via linux-libc-dev?
Like it's done for struct sched_attr members in
include/uapi/linux/sched/types.h.

[...]

>>> - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
>>> + if (likely(!(attr->sched_flags && SCHED_FLAG_UTIL_CLAMP)) ||
>>> + attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
>>> return;

Another multi line statement so the 'return' could go with braces.

>>>
>>> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
>>> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
>>> uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
>>> - attr->sched_util_min, true);
>>> - }
>>> + attr->sched_util_min, true);
>>>
>>> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
>>> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
>>> uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
>>> - attr->sched_util_max, true);
>>> - }
>>> + attr->sched_util_max, true);
>>
>> These two hunks seem unrelated too. Multi line statement should still have
>> braces AFAIK. Why change it?

+1

[...]

2020-11-06 10:42:37

by Patrick Bellasi

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp


Hi Yun,
thanks for keep improving this.

I'm replying here but still considering all other reviewers comments.

Best,
Patrick

On Tue, Nov 03, 2020 at 03:37:56 +0100, Yun Hsiang <[email protected]> wrote...

> If the user wants to stop controlling uclamp and let the task inherit
> the value from the group, we need a method to reset.
>
> Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> sched_setattr syscall.
>
> The policy is
> _CLAMP_RESET => reset both min and max
> _CLAMP_RESET | _CLAMP_MIN => reset min value
> _CLAMP_RESET | _CLAMP_MAX => reset max value
> _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max

This documentation should be added to the uapi header and, most
importantly, in:
include/uapi/linux/sched/types.h
where the documentation for struct sched_attr is provided.


> Signed-off-by: Yun Hsiang <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> ---
> include/uapi/linux/sched.h | 7 +++--
> kernel/sched/core.c | 59 ++++++++++++++++++++++++++++----------
> 2 files changed, 49 insertions(+), 17 deletions(-)
>
> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> index 3bac0a8ceab2..6c823ddb1a1e 100644
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -132,17 +132,20 @@ struct clone_args {
> #define SCHED_FLAG_KEEP_PARAMS 0x10
> #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> +#define SCHED_FLAG_UTIL_CLAMP_RESET 0x80
>
> #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> SCHED_FLAG_KEEP_PARAMS)
>

(Related to the following discussion point)
What about adding in a comment here to call out that the following
definitions are "internal only"?
Moreover, we could probably wrap the following two define within an
#ifdef __KERNEL__/#endif block


Something like:

+ /*
+ * The following definitions are internal only, do not use them to set
+ * set_{set,get}attr() params. Use instead a valid combination of the
+ * flags defined above.
+ */
+ #ifdef __KERNEL__

> #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> - SCHED_FLAG_UTIL_CLAMP_MAX)
> + SCHED_FLAG_UTIL_CLAMP_MAX | \
> + SCHED_FLAG_UTIL_CLAMP_RESET)

We need the _RESET flag only here (not below), since this is a UCLAMP
feature and it's worth/useful to have a single "all uclamp flags"
definition...

> #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> SCHED_FLAG_RECLAIM | \
> SCHED_FLAG_DL_OVERRUN | \
> SCHED_FLAG_KEEP_ALL | \
> - SCHED_FLAG_UTIL_CLAMP)
> + SCHED_FLAG_UTIL_CLAMP | \
> + SCHED_FLAG_UTIL_CLAMP_RESET)

... i.e., you can drop the chunk above.

+ #endif /* __KERNEL__ */

Regarding Qais comment, I had the same Dietmar's thought: I doubt there
are apps using _FLAGS_ALL from userspace. For DL tasks, since they
cannot fork, it makes no sense to specify, for example
_RESET_ON_FORK|_RECLAIM. For CFS/RT tasks, where UCLAMP is supported, it
makes no sense to specify DL flags.

It's true however that having this def here when it's supposed to be
used only internally, can be kind of "confusing", but it's also useful
to keep the definition aligned with the flags defined above.
The ifdef wrapping proposed above should make this even more explicit.

Perhaps we can also better call this out also with an additional note
right after:

https://elixir.bootlin.com/linux/v5.9.6/source/include/uapi/linux/sched/types.h#L43

In that file, I believe the "Task Utilization Attributes" section can
also be improved by adding a description of the _UCLAMP flags semantics.


> #endif /* _UAPI_LINUX_SCHED_H */
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8160ab5263f8..6ae463b64834 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1004,7 +1004,7 @@ unsigned int uclamp_rq_max_value(struct rq *rq, enum uclamp_id clamp_id,
> return uclamp_idle_value(rq, clamp_id, clamp_value);
> }
>
> -static void __uclamp_update_util_min_rt_default(struct task_struct *p)
> +static inline void __uclamp_update_util_min_rt_default(struct task_struct *p)
> {

Again, IMO, this is _not_ an unrelated change at all. Actually, I still
would like to do one step more and inline this function in the _only
place_ where it's used. Qais arguments for not doing that where [1]:

Updating the default rt value is done from different contexts. Hence
it is important to document the rules under which this update must
happen and ensure the update happens through a common path.

I don't see why these arguments are not satisfied when we inline, e.g.

---8<---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d2003a7d5ab5..369074154e1d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1004,25 +1004,9 @@ unsigned int uclamp_rq_max_value(struct rq *rq, enum uclamp_id clamp_id,
return uclamp_idle_value(rq, clamp_id, clamp_value);
}

-static void __uclamp_update_util_min_rt_default(struct task_struct *p)
-{
- unsigned int default_util_min;
- struct uclamp_se *uc_se;
-
- lockdep_assert_held(&p->pi_lock);
-
- uc_se = &p->uclamp_req[UCLAMP_MIN];
-
- /* Only sync if user didn't override the default */
- if (uc_se->user_defined)
- return;
-
- default_util_min = sysctl_sched_uclamp_util_min_rt_default;
- uclamp_se_set(uc_se, default_util_min, false);
-}
-
static void uclamp_update_util_min_rt_default(struct task_struct *p)
{
+ struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN];
struct rq_flags rf;
struct rq *rq;

@@ -1031,7 +1015,11 @@ static void uclamp_update_util_min_rt_default(struct task_struct *p)

/* Protect updates to p->uclamp_* */
rq = task_rq_lock(p, &rf);
- __uclamp_update_util_min_rt_default(p);
+
+ /* Only sync if user didn't override the default */
+ if (!uc_se->user_defined)
+ uclamp_se_set(uc_se, sysctl_sched_uclamp_util_min_rt_default, false);
+
task_rq_unlock(rq, p, &rf);
}
---8<---

[1] https://lore.kernel.org/lkml/20201028182946.6qfmt7q35ewrjjua@e107158-lin/

> unsigned int default_util_min;
> struct uclamp_se *uc_se;
> @@ -1413,8 +1413,14 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> static int uclamp_validate(struct task_struct *p,
> const struct sched_attr *attr)
> {
> - unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> - unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> + unsigned int lower_bound, upper_bound;
> +
> + /* Do not check uclamp attributes values in reset case. */
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> + return 0;
> +
> + lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> + upper_bound = p->uclamp_req[UCLAMP_MAX].value;
>
> if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> lower_bound = attr->sched_util_min;
> @@ -1438,20 +1444,43 @@ static int uclamp_validate(struct task_struct *p,
> return 0;
> }
>
> +static int uclamp_reset(enum uclamp_id clamp_id, unsigned long flags)
> +{
> + /* No _UCLAMP_RESET flag set: do not reset */

^^^^^^^^^^^^^
Maybe better using the full flag name for all these?

> + if (!(flags & SCHED_FLAG_UTIL_CLAMP_RESET))
> + return false;
> +
> + /* Only _UCLAMP_RESET flag set: reset both clamps */
> + if (!(flags & (SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX)))
> + return true;
> +
> + /* Both _UCLAMP_RESET and _UCLAMP_MIN flags are set: reset only min */
> + if ((flags & SCHED_FLAG_UTIL_CLAMP_MIN) && clamp_id == UCLAMP_MIN)
> + return true;
> +
> + /* Both _UCLAMP_RESET and _UCLAMP_MAX flags are set: reset only max */
> + if ((flags & SCHED_FLAG_UTIL_CLAMP_MAX) && clamp_id == UCLAMP_MAX)
> + return true;
> +
> + return false;

I was suggesting maybe we need READ_ONCE() in the if above but did not
addressed previous Dietmar's question [2] on why.

The function above has a correct semantics only when the ordering of the
if statement is strictly observed.

Now, since we don't have any data dependency on the multiple if cases,
are we sure an overzealous compiler will never come up with some
"optimized reordering" of the checks required?

IOW, if the compiler could scramble the checks on an optimization, we
would get a wrong semantics which is also very difficult do debug.
Hence the idea to use READ_ONCE, to both tell the compiler to not
even attempt reordering those checks and also to better document the
code about the importance of the ordering on those checks.

Is now more clear? Does that makes sense?

[2] https://lore.kernel.org/lkml/[email protected]/

> +}
> +
> static void __setscheduler_uclamp(struct task_struct *p,
> const struct sched_attr *attr)
> {
> enum uclamp_id clamp_id;
>
> /*
> - * On scheduling class change, reset to default clamps for tasks
> - * without a task-specific value.
> + * Reset to default clamps on forced _UCLAMP_RESET (always) and
> + * for tasks without a task-specific value (on scheduling class change).
> */
> for_each_clamp_id(clamp_id) {
> + unsigned int clamp_value;
> struct uclamp_se *uc_se = &p->uclamp_req[clamp_id];
>
> /* Keep using defined clamps across class changes */
> - if (uc_se->user_defined)
> + if (!uclamp_reset(clamp_id, attr->sched_flags) &&
> + uc_se->user_defined)
> continue;
>
> /*
> @@ -1459,24 +1488,24 @@ static void __setscheduler_uclamp(struct task_struct *p,
> * at runtime.
> */
> if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> - __uclamp_update_util_min_rt_default(p);
> + clamp_value = sysctl_sched_uclamp_util_min_rt_default;
> else
> - uclamp_se_set(uc_se, uclamp_none(clamp_id), false);
> + clamp_value = uclamp_none(clamp_id);
>
> + uclamp_se_set(uc_se, clamp_value, false);
> }
>
> - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> + if (likely(!(attr->sched_flags && SCHED_FLAG_UTIL_CLAMP)) ||
> + attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> return;

Parenthesis required for multi-line is statements.

Following chucks not required.

> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> - attr->sched_util_min, true);
> - }
> + attr->sched_util_min, true);
>
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> - attr->sched_util_max, true);
> - }
> + attr->sched_util_max, true);
> }
>
> static void uclamp_fork(struct task_struct *p)

2020-11-07 18:33:16

by Yun Hsiang

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On Wed, Nov 04, 2020 at 10:45:09AM +0100, Dietmar Eggemann wrote:
> On 03/11/2020 14:48, Qais Yousef wrote:
> > Oops, +Juri for real this time.
> >
> > On 11/03/20 13:46, Qais Yousef wrote:
> >> Hi Yun
> >>
> >> +Juri (A question for you below)
> >>
> >> On 11/03/20 10:37, Yun Hsiang wrote:
>
> [...]
>
> >>> include/uapi/linux/sched.h | 7 +++--
> >>> kernel/sched/core.c | 59 ++++++++++++++++++++++++++++----------
> >>> 2 files changed, 49 insertions(+), 17 deletions(-)
> >>>
> >>> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> >>> index 3bac0a8ceab2..6c823ddb1a1e 100644
> >>> --- a/include/uapi/linux/sched.h
> >>> +++ b/include/uapi/linux/sched.h
> >>> @@ -132,17 +132,20 @@ struct clone_args {
> >>> #define SCHED_FLAG_KEEP_PARAMS 0x10
> >>> #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> >>> #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> >>> +#define SCHED_FLAG_UTIL_CLAMP_RESET 0x80
> >>
> >> The new flag needs documentation about how it should be used. It has a none
> >> obvious policy that we can't expect users to just get it.
>
> See (1) further below.
>
> >>> #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> >>> SCHED_FLAG_KEEP_PARAMS)
> >>>
> >>> #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> >>> - SCHED_FLAG_UTIL_CLAMP_MAX)
> >>> + SCHED_FLAG_UTIL_CLAMP_MAX | \
> >>> + SCHED_FLAG_UTIL_CLAMP_RESET)
> >>
> >> Either do this..
> >>
> >>>
> >>> #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> >>> SCHED_FLAG_RECLAIM | \
> >>> SCHED_FLAG_DL_OVERRUN | \
> >>> SCHED_FLAG_KEEP_ALL | \
> >>> - SCHED_FLAG_UTIL_CLAMP)
> >>> + SCHED_FLAG_UTIL_CLAMP | \
> >>> + SCHED_FLAG_UTIL_CLAMP_RESET)
> >>
> >> Or this.
> >>
> >> I checked glibc and it seems they don't use the sched.h from linux and more
> >> surprisingly they don't seem to have a wrapper for sched_setattr(). bionic libc
> >> from Android does take sched.h from linux, but didn't find any user. So we
> >> might be okay with modifying these here.
>
> Schould be package linux-libc-dev. Debian 10 (buster-backports)
> 5.8.10-1~bpo10+1 contains the uclamp bits as well.
>
> /usr/include/linux/sched/types.h
> /usr/include/linux/sched.h
>
> /usr/include/linux/sched.h contains SCHED_FLAG_UTIL_CLAMP and
> SCHED_FLAG_ALL.
>
> But there is no glibc wrapper for sched_[sg]etattr so syscall wrapping
> is still needed.
>
> >> I still would like us to document better what we expect from these defines.
> >> Are they for internal kernel use or really for user space? If the former we
> >> should take them out of here. If the latter, then adding the RESET is dangerous
> >> as it'll cause an observable change in behavior; that is if an application was
> >> using SCHED_FLAG_ALL or SCHED_FLAG_UTIL_CLAMP to update the UTIL_MIN/MAX of
> >> a task, existing binaries will find out now that instead of modifying the value
> >> they're actually resetting them.
>
> I doubt that any application uses SCHED_FLAG_ALL so far since it already
> mixes e.g. DL and UCLAMP stuff. AFAIK the only tools supporting uclamp
> so far is rt-app and uclampset, which both use their own files for DL
> and uclamp definition.
>

I think SCHED_FLAG_ALL is for internal kernel use. So is it better to
make it within an #ifdef __KERNEL__ #endif block as Patrick said?

> [..]
>
> >> Add the policy part of the commit message as a documentation to this function
> >> please.
> >>
> >> ie:
> >>
> >> /*
> >> * The policy is
> >> * _CLAMP_RESET => reset both min and max
> >> * _CLAMP_RESET | _CLAMP_MIN => reset min value
> >> * _CLAMP_RESET | _CLAMP_MAX => reset max value
> >> * _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
> >> */
> >>
>
> (1) Related to documentation, wouldn't it be better to document in
> include/uapi/linux/sched.h, i.e. where the flags are defined, so it gets
> exported via linux-libc-dev?
> Like it's done for struct sched_attr members in
> include/uapi/linux/sched/types.h.
>

Ok, I'll put the comment for_CLAMP_MIN/_CLAMP_MAX/CLAMP_RESET in
include/uapi/linux/sched.h.

> [...]
>
> >>> - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> >>> + if (likely(!(attr->sched_flags && SCHED_FLAG_UTIL_CLAMP)) ||
> >>> + attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> >>> return;
>
> Another multi line statement so the 'return' could go with braces.
>
> >>>
> >>> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> >>> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> >>> uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> >>> - attr->sched_util_min, true);
> >>> - }
> >>> + attr->sched_util_min, true);
> >>>
> >>> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> >>> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> >>> uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> >>> - attr->sched_util_max, true);
> >>> - }
> >>> + attr->sched_util_max, true);
> >>
> >> These two hunks seem unrelated too. Multi line statement should still have
> >> braces AFAIK. Why change it?

Sorry for the wrong coding style, I'll fix it. And I'll split the
modifications to different patches if I want to do some cleanup.

> +1
>
> [...]

2020-11-07 19:17:16

by Yun Hsiang

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

Hi Patrick,

On Fri, Nov 06, 2020 at 11:36:48AM +0100, Patrick Bellasi wrote:
>
> Hi Yun,
> thanks for keep improving this.
>
> I'm replying here but still considering all other reviewers comments.
>
> Best,
> Patrick
>
> On Tue, Nov 03, 2020 at 03:37:56 +0100, Yun Hsiang <[email protected]> wrote...
>
> > If the user wants to stop controlling uclamp and let the task inherit
> > the value from the group, we need a method to reset.
> >
> > Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> > sched_setattr syscall.
> >
> > The policy is
> > _CLAMP_RESET => reset both min and max
> > _CLAMP_RESET | _CLAMP_MIN => reset min value
> > _CLAMP_RESET | _CLAMP_MAX => reset max value
> > _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
>
> This documentation should be added to the uapi header and, most
> importantly, in:
> include/uapi/linux/sched/types.h
> where the documentation for struct sched_attr is provided.
>
>
> > Signed-off-by: Yun Hsiang <[email protected]>
> > Reported-by: kernel test robot <[email protected]>
> > ---
> > include/uapi/linux/sched.h | 7 +++--
> > kernel/sched/core.c | 59 ++++++++++++++++++++++++++++----------
> > 2 files changed, 49 insertions(+), 17 deletions(-)
> >
> > diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> > index 3bac0a8ceab2..6c823ddb1a1e 100644
> > --- a/include/uapi/linux/sched.h
> > +++ b/include/uapi/linux/sched.h
> > @@ -132,17 +132,20 @@ struct clone_args {
> > #define SCHED_FLAG_KEEP_PARAMS 0x10
> > #define SCHED_FLAG_UTIL_CLAMP_MIN 0x20
> > #define SCHED_FLAG_UTIL_CLAMP_MAX 0x40
> > +#define SCHED_FLAG_UTIL_CLAMP_RESET 0x80
> >
> > #define SCHED_FLAG_KEEP_ALL (SCHED_FLAG_KEEP_POLICY | \
> > SCHED_FLAG_KEEP_PARAMS)
> >
>
> (Related to the following discussion point)
> What about adding in a comment here to call out that the following
> definitions are "internal only"?
> Moreover, we could probably wrap the following two define within an
> #ifdef __KERNEL__/#endif block
>
>
> Something like:
>
> + /*
> + * The following definitions are internal only, do not use them to set
> + * set_{set,get}attr() params. Use instead a valid combination of the
> + * flags defined above.
> + */
> + #ifdef __KERNEL__
>
> > #define SCHED_FLAG_UTIL_CLAMP (SCHED_FLAG_UTIL_CLAMP_MIN | \
> > - SCHED_FLAG_UTIL_CLAMP_MAX)
> > + SCHED_FLAG_UTIL_CLAMP_MAX | \
> > + SCHED_FLAG_UTIL_CLAMP_RESET)
>
> We need the _RESET flag only here (not below), since this is a UCLAMP
> feature and it's worth/useful to have a single "all uclamp flags"
> definition...
>
> > #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK | \
> > SCHED_FLAG_RECLAIM | \
> > SCHED_FLAG_DL_OVERRUN | \
> > SCHED_FLAG_KEEP_ALL | \
> > - SCHED_FLAG_UTIL_CLAMP)
> > + SCHED_FLAG_UTIL_CLAMP | \
> > + SCHED_FLAG_UTIL_CLAMP_RESET)
>
> ... i.e., you can drop the chunk above.

Yes, this chunk is redundant.

>
> + #endif /* __KERNEL__ */
>
> Regarding Qais comment, I had the same Dietmar's thought: I doubt there
> are apps using _FLAGS_ALL from userspace. For DL tasks, since they
> cannot fork, it makes no sense to specify, for example
> _RESET_ON_FORK|_RECLAIM. For CFS/RT tasks, where UCLAMP is supported, it
> makes no sense to specify DL flags.
>
> It's true however that having this def here when it's supposed to be
> used only internally, can be kind of "confusing", but it's also useful
> to keep the definition aligned with the flags defined above.
> The ifdef wrapping proposed above should make this even more explicit.
>
> Perhaps we can also better call this out also with an additional note
> right after:
>
> https://elixir.bootlin.com/linux/v5.9.6/source/include/uapi/linux/sched/types.h#L43
>
> In that file, I believe the "Task Utilization Attributes" section can
> also be improved by adding a description of the _UCLAMP flags semantics.

I think SCHED_FLAG_ALL is for internal kernel use. So I agree with not
exporting it to user.

For the "Task Utilization Attributes" section, should I send a patch to
add some descriptions for _UCLAMP_{MIN/MAX} first?

>
>
> > #endif /* _UAPI_LINUX_SCHED_H */
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 8160ab5263f8..6ae463b64834 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1004,7 +1004,7 @@ unsigned int uclamp_rq_max_value(struct rq *rq, enum uclamp_id clamp_id,
> > return uclamp_idle_value(rq, clamp_id, clamp_value);
> > }
> >
> > -static void __uclamp_update_util_min_rt_default(struct task_struct *p)
> > +static inline void __uclamp_update_util_min_rt_default(struct task_struct *p)
> > {
>
> Again, IMO, this is _not_ an unrelated change at all. Actually, I still
> would like to do one step more and inline this function in the _only
> place_ where it's used. Qais arguments for not doing that where [1]:
>
> Updating the default rt value is done from different contexts. Hence
> it is important to document the rules under which this update must
> happen and ensure the update happens through a common path.
>
> I don't see why these arguments are not satisfied when we inline, e.g.
>
> ---8<---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d2003a7d5ab5..369074154e1d 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1004,25 +1004,9 @@ unsigned int uclamp_rq_max_value(struct rq *rq, enum uclamp_id clamp_id,
> return uclamp_idle_value(rq, clamp_id, clamp_value);
> }
>
> -static void __uclamp_update_util_min_rt_default(struct task_struct *p)
> -{
> - unsigned int default_util_min;
> - struct uclamp_se *uc_se;
> -
> - lockdep_assert_held(&p->pi_lock);
> -
> - uc_se = &p->uclamp_req[UCLAMP_MIN];
> -
> - /* Only sync if user didn't override the default */
> - if (uc_se->user_defined)
> - return;
> -
> - default_util_min = sysctl_sched_uclamp_util_min_rt_default;
> - uclamp_se_set(uc_se, default_util_min, false);
> -}
> -
> static void uclamp_update_util_min_rt_default(struct task_struct *p)
> {
> + struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN];
> struct rq_flags rf;
> struct rq *rq;
>
> @@ -1031,7 +1015,11 @@ static void uclamp_update_util_min_rt_default(struct task_struct *p)
>
> /* Protect updates to p->uclamp_* */
> rq = task_rq_lock(p, &rf);
> - __uclamp_update_util_min_rt_default(p);
> +
> + /* Only sync if user didn't override the default */
> + if (!uc_se->user_defined)
> + uclamp_se_set(uc_se, sysctl_sched_uclamp_util_min_rt_default, false);
> +
> task_rq_unlock(rq, p, &rf);
> }
> ---8<---
>
> [1] https://lore.kernel.org/lkml/20201028182946.6qfmt7q35ewrjjua@e107158-lin/

I can split this modification to a different patch.

>
> > unsigned int default_util_min;
> > struct uclamp_se *uc_se;
> > @@ -1413,8 +1413,14 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> > static int uclamp_validate(struct task_struct *p,
> > const struct sched_attr *attr)
> > {
> > - unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> > - unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> > + unsigned int lower_bound, upper_bound;
> > +
> > + /* Do not check uclamp attributes values in reset case. */
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> > + return 0;
> > +
> > + lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> > + upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> >
> > if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> > lower_bound = attr->sched_util_min;
> > @@ -1438,20 +1444,43 @@ static int uclamp_validate(struct task_struct *p,
> > return 0;
> > }
> >
> > +static int uclamp_reset(enum uclamp_id clamp_id, unsigned long flags)
> > +{
> > + /* No _UCLAMP_RESET flag set: do not reset */
>
> ^^^^^^^^^^^^^
> Maybe better using the full flag name for all these?
>

Got it!

> > + if (!(flags & SCHED_FLAG_UTIL_CLAMP_RESET))
> > + return false;
> > +
> > + /* Only _UCLAMP_RESET flag set: reset both clamps */
> > + if (!(flags & (SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX)))
> > + return true;
> > +
> > + /* Both _UCLAMP_RESET and _UCLAMP_MIN flags are set: reset only min */
> > + if ((flags & SCHED_FLAG_UTIL_CLAMP_MIN) && clamp_id == UCLAMP_MIN)
> > + return true;
> > +
> > + /* Both _UCLAMP_RESET and _UCLAMP_MAX flags are set: reset only max */
> > + if ((flags & SCHED_FLAG_UTIL_CLAMP_MAX) && clamp_id == UCLAMP_MAX)
> > + return true;
> > +
> > + return false;
>
> I was suggesting maybe we need READ_ONCE() in the if above but did not
> addressed previous Dietmar's question [2] on why.
>
> The function above has a correct semantics only when the ordering of the
> if statement is strictly observed.
>
> Now, since we don't have any data dependency on the multiple if cases,
> are we sure an overzealous compiler will never come up with some
> "optimized reordering" of the checks required?
>
> IOW, if the compiler could scramble the checks on an optimization, we
> would get a wrong semantics which is also very difficult do debug.
> Hence the idea to use READ_ONCE, to both tell the compiler to not
> even attempt reordering those checks and also to better document the
> code about the importance of the ordering on those checks.
>
> Is now more clear? Does that makes sense?

I can undertand what your worries. But I'm not sure is it really needed.
If there is no other concern I can add it.

>
> [2] https://lore.kernel.org/lkml/[email protected]/
>
> > +}
> > +
> > static void __setscheduler_uclamp(struct task_struct *p,
> > const struct sched_attr *attr)
> > {
> > enum uclamp_id clamp_id;
> >
> > /*
> > - * On scheduling class change, reset to default clamps for tasks
> > - * without a task-specific value.
> > + * Reset to default clamps on forced _UCLAMP_RESET (always) and
> > + * for tasks without a task-specific value (on scheduling class change).
> > */
> > for_each_clamp_id(clamp_id) {
> > + unsigned int clamp_value;
> > struct uclamp_se *uc_se = &p->uclamp_req[clamp_id];
> >
> > /* Keep using defined clamps across class changes */
> > - if (uc_se->user_defined)
> > + if (!uclamp_reset(clamp_id, attr->sched_flags) &&
> > + uc_se->user_defined)
> > continue;
> >
> > /*
> > @@ -1459,24 +1488,24 @@ static void __setscheduler_uclamp(struct task_struct *p,
> > * at runtime.
> > */
> > if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> > - __uclamp_update_util_min_rt_default(p);
> > + clamp_value = sysctl_sched_uclamp_util_min_rt_default;
> > else
> > - uclamp_se_set(uc_se, uclamp_none(clamp_id), false);
> > + clamp_value = uclamp_none(clamp_id);
> >
> > + uclamp_se_set(uc_se, clamp_value, false);
> > }
> >
> > - if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> > + if (likely(!(attr->sched_flags && SCHED_FLAG_UTIL_CLAMP)) ||
> > + attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_RESET)
> > return;
>
> Parenthesis required for multi-line is statements.

Sorry for wrong coding style. I'll fix it.

>
> Following chucks not required.

I'll drop these chunks.

>
> > - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> > uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> > - attr->sched_util_min, true);
> > - }
> > + attr->sched_util_min, true);
> >
> > - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> > uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
> > - attr->sched_util_max, true);
> > - }
> > + attr->sched_util_max, true);
> > }
> >
> > static void uclamp_fork(struct task_struct *p)
>

2020-11-09 13:43:28

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 11/08/20 03:15, Yun Hsiang wrote:
> I think SCHED_FLAG_ALL is for internal kernel use. So I agree with not
> exporting it to user.

+1 for the #ifdef __kernel__

> > > + if (!(flags & SCHED_FLAG_UTIL_CLAMP_RESET))
> > > + return false;
> > > +
> > > + /* Only _UCLAMP_RESET flag set: reset both clamps */
> > > + if (!(flags & (SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX)))
> > > + return true;
> > > +
> > > + /* Both _UCLAMP_RESET and _UCLAMP_MIN flags are set: reset only min */
> > > + if ((flags & SCHED_FLAG_UTIL_CLAMP_MIN) && clamp_id == UCLAMP_MIN)
> > > + return true;
> > > +
> > > + /* Both _UCLAMP_RESET and _UCLAMP_MAX flags are set: reset only max */
> > > + if ((flags & SCHED_FLAG_UTIL_CLAMP_MAX) && clamp_id == UCLAMP_MAX)
> > > + return true;
> > > +
> > > + return false;
> >
> > I was suggesting maybe we need READ_ONCE() in the if above but did not
> > addressed previous Dietmar's question [2] on why.
> >
> > The function above has a correct semantics only when the ordering of the
> > if statement is strictly observed.
> >
> > Now, since we don't have any data dependency on the multiple if cases,
> > are we sure an overzealous compiler will never come up with some
> > "optimized reordering" of the checks required?
> >
> > IOW, if the compiler could scramble the checks on an optimization, we
> > would get a wrong semantics which is also very difficult do debug.
> > Hence the idea to use READ_ONCE, to both tell the compiler to not
> > even attempt reordering those checks and also to better document the
> > code about the importance of the ordering on those checks.
> >
> > Is now more clear? Does that makes sense?
>
> I can undertand what your worries. But I'm not sure is it really needed.
> If there is no other concern I can add it.

I too don't think the compiler has a power to do such an optimization. It must
preserve the order of the checks even if it found a more efficient way to
perform them.

Thanks

--
Qais Yousef

2020-11-10 12:20:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On Fri, Nov 06, 2020 at 11:36:48AM +0100, Patrick Bellasi wrote:

> > +static int uclamp_reset(enum uclamp_id clamp_id, unsigned long flags)
> > +{
> > + /* No _UCLAMP_RESET flag set: do not reset */
> > + if (!(flags & SCHED_FLAG_UTIL_CLAMP_RESET))
> > + return false;
> > +
> > + /* Only _UCLAMP_RESET flag set: reset both clamps */
> > + if (!(flags & (SCHED_FLAG_UTIL_CLAMP_MIN | SCHED_FLAG_UTIL_CLAMP_MAX)))
> > + return true;
> > +
> > + /* Both _UCLAMP_RESET and _UCLAMP_MIN flags are set: reset only min */
> > + if ((flags & SCHED_FLAG_UTIL_CLAMP_MIN) && clamp_id == UCLAMP_MIN)
> > + return true;
> > +
> > + /* Both _UCLAMP_RESET and _UCLAMP_MAX flags are set: reset only max */
> > + if ((flags & SCHED_FLAG_UTIL_CLAMP_MAX) && clamp_id == UCLAMP_MAX)
> > + return true;
> > +
> > + return false;
>
> I was suggesting maybe we need READ_ONCE() in the if above but did not
> addressed previous Dietmar's question [2] on why.
>
> The function above has a correct semantics only when the ordering of the
> if statement is strictly observed.
>
> Now, since we don't have any data dependency on the multiple if cases,
> are we sure an overzealous compiler will never come up with some
> "optimized reordering" of the checks required?
>
> IOW, if the compiler could scramble the checks on an optimization, we
> would get a wrong semantics which is also very difficult do debug.
> Hence the idea to use READ_ONCE, to both tell the compiler to not
> even attempt reordering those checks and also to better document the
> code about the importance of the ordering on those checks.
>
> Is now more clear? Does that makes sense?

I don't think the compiler is allowed to do as you fear. Specifically it
cannot move the first branch down because that would change the meaning
of this function and affect observable behaviour even in the traditional
single-threaded environment.


2020-11-10 12:23:36

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On Tue, Nov 03, 2020 at 10:37:56AM +0800, Yun Hsiang wrote:
> If the user wants to stop controlling uclamp and let the task inherit
> the value from the group, we need a method to reset.
>
> Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> sched_setattr syscall.
>
> The policy is
> _CLAMP_RESET => reset both min and max
> _CLAMP_RESET | _CLAMP_MIN => reset min value
> _CLAMP_RESET | _CLAMP_MAX => reset max value
> _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
>

The obvious alternative would be to use a magic value in
sched_util_{min,max} to indicate reset. After all, we strictly enforce
the values are inside [0,1024], which leaves us with many unused values.

Specifically -1 comes to mind. It would allow doing this without an
extra flag, OTOH the explicit flag is well, more explicit.

I don't have a strong preference either way, but I wanted to make sure
it was considered, and perhaps we can record why this isn't as nice a
solution, dunno.

2020-11-11 17:43:26

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 10/11/2020 13:21, Peter Zijlstra wrote:
> On Tue, Nov 03, 2020 at 10:37:56AM +0800, Yun Hsiang wrote:
>> If the user wants to stop controlling uclamp and let the task inherit
>> the value from the group, we need a method to reset.
>>
>> Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
>> sched_setattr syscall.
>>
>> The policy is
>> _CLAMP_RESET => reset both min and max
>> _CLAMP_RESET | _CLAMP_MIN => reset min value
>> _CLAMP_RESET | _CLAMP_MAX => reset max value
>> _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
>>
>
> The obvious alternative would be to use a magic value in
> sched_util_{min,max} to indicate reset. After all, we strictly enforce
> the values are inside [0,1024], which leaves us with many unused values.
>
> Specifically -1 comes to mind. It would allow doing this without an
> extra flag, OTOH the explicit flag is well, more explicit.
>
> I don't have a strong preference either way, but I wanted to make sure
> it was considered, and perhaps we can record why this isn't as nice a
> solution, dunno.

IMHO the '-1' magic value approach is cleaner. Did some light testing on it.

From 2e6a64fac4f2f66a2c6246de33db22c467fa7d33 Mon Sep 17 00:00:00 2001
From: Dietmar Eggemann <[email protected]>
Date: Wed, 11 Nov 2020 01:14:33 +0100
Subject: [PATCH] sched/uclamp: Allow to reset a task uclamp constraint value

In case the user wants to stop controlling a uclamp constraint value
for a task, use the magic value -1 in sched_util_{min,max} with the
appropriate sched_flags (SCHED_FLAG_UTIL_CLAMP_{MIN,MAX}) to indicate
the reset.

The advantage over the 'additional flag' approach (i.e. introducing
SCHED_FLAG_UTIL_CLAMP_RESET) is that no additional flag has to be
exported via uapi. This avoids the need to document how this new flag
has be used in conjunction with the existing uclamp related flags.

The following subtle issue is fixed as well. When a uclamp constraint
value is set on a !user_defined uclamp_se it is currently first reset
and then set.
Fix this by AND'ing !user_defined with !SCHED_FLAG_UTIL_CLAMP which
stands for the 'sched class change' case.
The related condition 'if (uc_se->user_defined)' moved from
__setscheduler_uclamp() into uclamp_reset().

Signed-off-by: Dietmar Eggemann <[email protected]>
---
include/uapi/linux/sched/types.h | 4 +-
kernel/sched/core.c | 70 +++++++++++++++++++++++---------
2 files changed, 53 insertions(+), 21 deletions(-)

diff --git a/include/uapi/linux/sched/types.h b/include/uapi/linux/sched/types.h
index c852153ddb0d..b9165f17dddc 100644
--- a/include/uapi/linux/sched/types.h
+++ b/include/uapi/linux/sched/types.h
@@ -115,8 +115,8 @@ struct sched_attr {
__u64 sched_period;

/* Utilization hints */
- __u32 sched_util_min;
- __u32 sched_util_max;
+ __s32 sched_util_min;
+ __s32 sched_util_max;

};

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3dc415f58bd7..caaa2a8434b9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1413,17 +1413,24 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
static int uclamp_validate(struct task_struct *p,
const struct sched_attr *attr)
{
- unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
- unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
+ int util_min = p->uclamp_req[UCLAMP_MIN].value;
+ int util_max = p->uclamp_req[UCLAMP_MAX].value;

- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
- lower_bound = attr->sched_util_min;
- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
- upper_bound = attr->sched_util_max;
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
+ util_min = attr->sched_util_min;

- if (lower_bound > upper_bound)
- return -EINVAL;
- if (upper_bound > SCHED_CAPACITY_SCALE)
+ if (util_min < -1 || util_min > SCHED_CAPACITY_SCALE)
+ return -EINVAL;
+ }
+
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
+ util_max = attr->sched_util_max;
+
+ if (util_max < -1 || util_max > SCHED_CAPACITY_SCALE)
+ return -EINVAL;
+ }
+
+ if (util_min != -1 && util_max != -1 && util_min > util_max)
return -EINVAL;

/*
@@ -1438,20 +1445,41 @@ static int uclamp_validate(struct task_struct *p,
return 0;
}

+static bool uclamp_reset(const struct sched_attr *attr,
+ enum uclamp_id clamp_id,
+ struct uclamp_se *uc_se)
+{
+ /* Reset on sched class change for a non user-defined clamp value. */
+ if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)) &&
+ !uc_se->user_defined)
+ return true;
+
+ /* Reset on sched_util_{min,max} == -1 */
+ if (clamp_id == UCLAMP_MIN &&
+ attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN &&
+ attr->sched_util_min == -1) {
+ return true;
+ }
+
+ if (clamp_id == UCLAMP_MAX &&
+ attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX &&
+ attr->sched_util_max == -1) {
+ return true;
+ }
+
+ return false;
+}
+
static void __setscheduler_uclamp(struct task_struct *p,
const struct sched_attr *attr)
{
enum uclamp_id clamp_id;

- /*
- * On scheduling class change, reset to default clamps for tasks
- * without a task-specific value.
- */
for_each_clamp_id(clamp_id) {
struct uclamp_se *uc_se = &p->uclamp_req[clamp_id];
+ unsigned int value;

- /* Keep using defined clamps across class changes */
- if (uc_se->user_defined)
+ if (!uclamp_reset(attr, clamp_id, uc_se))
continue;

/*
@@ -1459,21 +1487,25 @@ static void __setscheduler_uclamp(struct task_struct *p,
* at runtime.
*/
if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
- __uclamp_update_util_min_rt_default(p);
+ value = sysctl_sched_uclamp_util_min_rt_default;
else
- uclamp_se_set(uc_se, uclamp_none(clamp_id), false);
+ value = uclamp_none(clamp_id);
+
+ uclamp_se_set(uc_se, value, false);

}

if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
return;

- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN &&
+ attr->sched_util_min != -1) {
uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
attr->sched_util_min, true);
}

- if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
+ if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX &&
+ attr->sched_util_max != -1) {
uclamp_se_set(&p->uclamp_req[UCLAMP_MAX],
attr->sched_util_max, true);
}
--
2.17.1

2020-11-11 18:09:22

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On Wed, Nov 11, 2020 at 06:41:07PM +0100, Dietmar Eggemann wrote:
> diff --git a/include/uapi/linux/sched/types.h b/include/uapi/linux/sched/types.h
> index c852153ddb0d..b9165f17dddc 100644
> --- a/include/uapi/linux/sched/types.h
> +++ b/include/uapi/linux/sched/types.h
> @@ -115,8 +115,8 @@ struct sched_attr {
> __u64 sched_period;
>
> /* Utilization hints */
> - __u32 sched_util_min;
> - __u32 sched_util_max;
> + __s32 sched_util_min;
> + __s32 sched_util_max;

So that's UAPI, not sure we can change the type here.

> };
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3dc415f58bd7..caaa2a8434b9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1413,17 +1413,24 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> static int uclamp_validate(struct task_struct *p,
> const struct sched_attr *attr)
> {
> - unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> - unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> + int util_min = p->uclamp_req[UCLAMP_MIN].value;
> + int util_max = p->uclamp_req[UCLAMP_MAX].value;
>
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> - lower_bound = attr->sched_util_min;
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> - upper_bound = attr->sched_util_max;
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> + util_min = attr->sched_util_min;
>
> - if (lower_bound > upper_bound)
> - return -EINVAL;
> - if (upper_bound > SCHED_CAPACITY_SCALE)
> + if (util_min < -1 || util_min > SCHED_CAPACITY_SCALE)
> + return -EINVAL;
> + }
> +
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> + util_max = attr->sched_util_max;
> +
> + if (util_max < -1 || util_max > SCHED_CAPACITY_SCALE)
> + return -EINVAL;
> + }

Luckily we can write that range as a single branch like:

if (util_{min,max} + 1 > SCHED_CAPACITY_SCALE+1)

which assumes u32 :-)

> +
> + if (util_min != -1 && util_max != -1 && util_min > util_max)
> return -EINVAL;

I think that will compile as is, otherwise write it like ~0u, which is
the same bit pattern.

2020-11-12 05:47:57

by Yun Hsiang

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On Wed, Nov 11, 2020 at 07:04:41PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 11, 2020 at 06:41:07PM +0100, Dietmar Eggemann wrote:
> > diff --git a/include/uapi/linux/sched/types.h b/include/uapi/linux/sched/types.h
> > index c852153ddb0d..b9165f17dddc 100644
> > --- a/include/uapi/linux/sched/types.h
> > +++ b/include/uapi/linux/sched/types.h
> > @@ -115,8 +115,8 @@ struct sched_attr {
> > __u64 sched_period;
> >
> > /* Utilization hints */
> > - __u32 sched_util_min;
> > - __u32 sched_util_max;
> > + __s32 sched_util_min;
> > + __s32 sched_util_max;
>
> So that's UAPI, not sure we can change the type here.

+1
I am also concerned about changing UAPI.
But if we can chage sched_util_{min/max} to __s32, use -1 to reset is better than
adding flags.

>
> > };
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 3dc415f58bd7..caaa2a8434b9 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1413,17 +1413,24 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> > static int uclamp_validate(struct task_struct *p,
> > const struct sched_attr *attr)
> > {
> > - unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> > - unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> > + int util_min = p->uclamp_req[UCLAMP_MIN].value;
> > + int util_max = p->uclamp_req[UCLAMP_MAX].value;
> >
> > - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> > - lower_bound = attr->sched_util_min;
> > - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> > - upper_bound = attr->sched_util_max;
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> > + util_min = attr->sched_util_min;
> >
> > - if (lower_bound > upper_bound)
> > - return -EINVAL;
> > - if (upper_bound > SCHED_CAPACITY_SCALE)
> > + if (util_min < -1 || util_min > SCHED_CAPACITY_SCALE)
> > + return -EINVAL;
> > + }
> > +
> > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> > + util_max = attr->sched_util_max;
> > +
> > + if (util_max < -1 || util_max > SCHED_CAPACITY_SCALE)
> > + return -EINVAL;
> > + }
>
> Luckily we can write that range as a single branch like:
>
> if (util_{min,max} + 1 > SCHED_CAPACITY_SCALE+1)
>
> which assumes u32 :-)
>
> > +
> > + if (util_min != -1 && util_max != -1 && util_min > util_max)
> > return -EINVAL;
>
> I think that will compile as is, otherwise write it like ~0u, which is
> the same bit pattern.
>

2020-11-12 13:08:55

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 11/11/2020 19:04, Peter Zijlstra wrote:
> On Wed, Nov 11, 2020 at 06:41:07PM +0100, Dietmar Eggemann wrote:
>> diff --git a/include/uapi/linux/sched/types.h b/include/uapi/linux/sched/types.h
>> index c852153ddb0d..b9165f17dddc 100644
>> --- a/include/uapi/linux/sched/types.h
>> +++ b/include/uapi/linux/sched/types.h
>> @@ -115,8 +115,8 @@ struct sched_attr {
>> __u64 sched_period;
>>
>> /* Utilization hints */
>> - __u32 sched_util_min;
>> - __u32 sched_util_max;
>> + __s32 sched_util_min;
>> + __s32 sched_util_max;
>
> So that's UAPI, not sure we can change the type here.

Yes, will remove this chunk. Not needed.

I probably should add some documentation there:

diff --git a/include/uapi/linux/sched/types.h
b/include/uapi/linux/sched/types.h
index c852153ddb0d..f2c4589d4dbf 100644
--- a/include/uapi/linux/sched/types.h
+++ b/include/uapi/linux/sched/types.h
@@ -96,6 +96,8 @@ struct sched_param {
* on a CPU with a capacity big enough to fit the specified value.
* A task with a max utilization value smaller than 1024 is more likely
* scheduled on a CPU with no more capacity than the specified value.
+ *
+ * A task utilization boundary can be reset by setting the attribute to -1.
*/
struct sched_attr {
__u32 size;

[...]

>> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
>> + util_max = attr->sched_util_max;
>> +
>> + if (util_max < -1 || util_max > SCHED_CAPACITY_SCALE)
>> + return -EINVAL;
>> + }
>
> Luckily we can write that range as a single branch like:
>
> if (util_{min,max} + 1 > SCHED_CAPACITY_SCALE+1)
>
> which assumes u32 :-)

Cool, will change it.

>> +
>> + if (util_min != -1 && util_max != -1 && util_min > util_max)
>> return -EINVAL;
>
> I think that will compile as is, otherwise write it like ~0u, which is
> the same bit pattern.

Yes, it compiles for me (arm64, gcc 9.2 and arm, gcc 8.3). Started a
0-Day build job to make sure.

Will do some more testing before sending out the updated version.

2020-11-12 14:43:31

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 11/11/20 18:41, Dietmar Eggemann wrote:
> On 10/11/2020 13:21, Peter Zijlstra wrote:
> > On Tue, Nov 03, 2020 at 10:37:56AM +0800, Yun Hsiang wrote:
> >> If the user wants to stop controlling uclamp and let the task inherit
> >> the value from the group, we need a method to reset.
> >>
> >> Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via
> >> sched_setattr syscall.
> >>
> >> The policy is
> >> _CLAMP_RESET => reset both min and max
> >> _CLAMP_RESET | _CLAMP_MIN => reset min value
> >> _CLAMP_RESET | _CLAMP_MAX => reset max value
> >> _CLAMP_RESET | _CLAMP_MIN | _CLAMP_MAX => reset both min and max
> >>
> >
> > The obvious alternative would be to use a magic value in
> > sched_util_{min,max} to indicate reset. After all, we strictly enforce
> > the values are inside [0,1024], which leaves us with many unused values.
> >
> > Specifically -1 comes to mind. It would allow doing this without an
> > extra flag, OTOH the explicit flag is well, more explicit.
> >
> > I don't have a strong preference either way, but I wanted to make sure
> > it was considered, and perhaps we can record why this isn't as nice a
> > solution, dunno.
>
> IMHO the '-1' magic value approach is cleaner. Did some light testing on it.

I assume we agree then that we don't want to explicitly document this quirky
feature and keep it for advanced users?

I am wary of the UAPI change that is both explicit and implicit. It explicitly
requests a reset, but implicitly requests a cgroup behavior change.

With this magic value at least we can more easily return an error if we decided
to deprecate it, which has been my main ask so far. I don't want us to end up
not being able to easily modify this code in the future.

I don't have strong opinion too though.

If you or Yun would still like to send the patch to protect
SCHED_FLAG_UTIL_CLAMP and SCHED_FLAG_ALL with __kernel__ that'd be great.

> From 2e6a64fac4f2f66a2c6246de33db22c467fa7d33 Mon Sep 17 00:00:00 2001
> From: Dietmar Eggemann <[email protected]>
> Date: Wed, 11 Nov 2020 01:14:33 +0100
> Subject: [PATCH] sched/uclamp: Allow to reset a task uclamp constraint value
>
> In case the user wants to stop controlling a uclamp constraint value
> for a task, use the magic value -1 in sched_util_{min,max} with the
> appropriate sched_flags (SCHED_FLAG_UTIL_CLAMP_{MIN,MAX}) to indicate
> the reset.
>
> The advantage over the 'additional flag' approach (i.e. introducing
> SCHED_FLAG_UTIL_CLAMP_RESET) is that no additional flag has to be
> exported via uapi. This avoids the need to document how this new flag
> has be used in conjunction with the existing uclamp related flags.
>
> The following subtle issue is fixed as well. When a uclamp constraint
> value is set on a !user_defined uclamp_se it is currently first reset
> and then set.
> Fix this by AND'ing !user_defined with !SCHED_FLAG_UTIL_CLAMP which
> stands for the 'sched class change' case.
> The related condition 'if (uc_se->user_defined)' moved from
> __setscheduler_uclamp() into uclamp_reset().
>
> Signed-off-by: Dietmar Eggemann <[email protected]>
> ---
> include/uapi/linux/sched/types.h | 4 +-
> kernel/sched/core.c | 70 +++++++++++++++++++++++---------
> 2 files changed, 53 insertions(+), 21 deletions(-)
>
> diff --git a/include/uapi/linux/sched/types.h b/include/uapi/linux/sched/types.h
> index c852153ddb0d..b9165f17dddc 100644
> --- a/include/uapi/linux/sched/types.h
> +++ b/include/uapi/linux/sched/types.h
> @@ -115,8 +115,8 @@ struct sched_attr {
> __u64 sched_period;
>
> /* Utilization hints */
> - __u32 sched_util_min;
> - __u32 sched_util_max;
> + __s32 sched_util_min;
> + __s32 sched_util_max;
>
> };
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3dc415f58bd7..caaa2a8434b9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1413,17 +1413,24 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> static int uclamp_validate(struct task_struct *p,
> const struct sched_attr *attr)
> {
> - unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> - unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> + int util_min = p->uclamp_req[UCLAMP_MIN].value;
> + int util_max = p->uclamp_req[UCLAMP_MAX].value;
>
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> - lower_bound = attr->sched_util_min;
> - if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> - upper_bound = attr->sched_util_max;
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> + util_min = attr->sched_util_min;
>


> - if (lower_bound > upper_bound)
> - return -EINVAL;

You removed this check and didn't replace it with equivalent one?

> - if (upper_bound > SCHED_CAPACITY_SCALE)
> + if (util_min < -1 || util_min > SCHED_CAPACITY_SCALE)
> + return -EINVAL;
> + }
> +
> + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
> + util_max = attr->sched_util_max;
> +
> + if (util_max < -1 || util_max > SCHED_CAPACITY_SCALE)
> + return -EINVAL;
> + }
> +
> + if (util_min != -1 && util_max != -1 && util_min > util_max)
> return -EINVAL;

Ah, it is here. Never mind.

The approach looks good to me in general.

Thanks

--
Qais Yousef

2020-11-12 16:07:05

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 12/11/2020 15:41, Qais Yousef wrote:
> On 11/11/20 18:41, Dietmar Eggemann wrote:
>> On 10/11/2020 13:21, Peter Zijlstra wrote:
>>> On Tue, Nov 03, 2020 at 10:37:56AM +0800, Yun Hsiang wrote:

[...]

> I assume we agree then that we don't want to explicitly document this quirky
> feature and keep it for advanced users?
>
> I am wary of the UAPI change that is both explicit and implicit. It explicitly
> requests a reset, but implicitly requests a cgroup behavior change.
>
> With this magic value at least we can more easily return an error if we decided
> to deprecate it, which has been my main ask so far. I don't want us to end up
> not being able to easily modify this code in the future.

Another advantage for this 'magic value' approach.

> I don't have strong opinion too though.
>
> If you or Yun would still like to send the patch to protect
> SCHED_FLAG_UTIL_CLAMP and SCHED_FLAG_ALL with __kernel__ that'd be great.

Ah yes! Can add an extra patch for this when sending out the next version.

[...]

2020-11-13 12:06:04

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 12/11/2020 17:01, Dietmar Eggemann wrote:
> On 12/11/2020 15:41, Qais Yousef wrote:
>> On 11/11/20 18:41, Dietmar Eggemann wrote:
>>> On 10/11/2020 13:21, Peter Zijlstra wrote:
>>>> On Tue, Nov 03, 2020 at 10:37:56AM +0800, Yun Hsiang wrote:

[...]

>> If you or Yun would still like to send the patch to protect
>> SCHED_FLAG_UTIL_CLAMP and SCHED_FLAG_ALL with __kernel__ that'd be great.
>
> Ah yes! Can add an extra patch for this when sending out the next version.

On second thought, why should we risk a change in UAPI? Since we're now
not introducing a new flag the meaning of SCHED_FLAG_ALL or
SCHED_FLAG_UTIL_CLAMP won't change.

2020-11-13 12:31:01

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v5 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp

On 11/13/20 12:45, Dietmar Eggemann wrote:
> On 12/11/2020 17:01, Dietmar Eggemann wrote:
> > On 12/11/2020 15:41, Qais Yousef wrote:
> >> On 11/11/20 18:41, Dietmar Eggemann wrote:
> >>> On 10/11/2020 13:21, Peter Zijlstra wrote:
> >>>> On Tue, Nov 03, 2020 at 10:37:56AM +0800, Yun Hsiang wrote:
>
> [...]
>
> >> If you or Yun would still like to send the patch to protect
> >> SCHED_FLAG_UTIL_CLAMP and SCHED_FLAG_ALL with __kernel__ that'd be great.
> >
> > Ah yes! Can add an extra patch for this when sending out the next version.
>
> On second thought, why should we risk a change in UAPI? Since we're now
> not introducing a new flag the meaning of SCHED_FLAG_ALL or
> SCHED_FLAG_UTIL_CLAMP won't change.

It's a judgement call. Hide them now where it's likely there are no users and
hope we won't have to revert it. Or just ignore it and treat it as an ABI and
make sure no one change them later.

My judgement call it's better to introduce the __kernel__ while we can. But
I can't say for sure nothing will break. All I know it'd be easy to revert if
it does cause breakage.

You get to choose :-)

Thanks

--
Qais Yousef