2020-05-27 19:13:07

by Vincent Donnefort

Subject: [PATCH] sched/debug: Add new tracepoints to track util_est

From: Vincent Donnefort <[email protected]>

The util_est signals are key elements for EAS task placement and
frequency selection. Having tracepoints to track these signals enables
load-tracking and schedutil testing and/or debugging by a toolkit.

Signed-off-by: Vincent Donnefort <[email protected]>

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index ed168b0..04f9a4c 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -634,6 +634,14 @@ DECLARE_TRACE(sched_overutilized_tp,
        TP_PROTO(struct root_domain *rd, bool overutilized),
        TP_ARGS(rd, overutilized));

+DECLARE_TRACE(sched_util_est_cfs_tp,
+        TP_PROTO(struct cfs_rq *cfs_rq),
+        TP_ARGS(cfs_rq));
+
+DECLARE_TRACE(sched_util_est_se_tp,
+        TP_PROTO(struct sched_entity *se),
+        TP_ARGS(se));
+
#endif /* _TRACE_SCHED_H */

/* This part must be outside protection */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9228236..ecff02b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -35,6 +35,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_dl_tp);
EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_se_tp);
EXPORT_TRACEPOINT_SYMBOL_GPL(sched_overutilized_tp);
+EXPORT_TRACEPOINT_SYMBOL_GPL(sched_util_est_cfs_tp);
+EXPORT_TRACEPOINT_SYMBOL_GPL(sched_util_est_se_tp);

DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 174d2df..cfc0e06 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3922,6 +3922,8 @@ static inline void util_est_enqueue(struct cfs_rq *cfs_rq,
        enqueued = cfs_rq->avg.util_est.enqueued;
        enqueued += _task_util_est(p);
        WRITE_ONCE(cfs_rq->avg.util_est.enqueued, enqueued);
+
+        trace_sched_util_est_cfs_tp(cfs_rq);
}

/*
@@ -3952,6 +3954,8 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
        ue.enqueued -= min_t(unsigned int, ue.enqueued, _task_util_est(p));
        WRITE_ONCE(cfs_rq->avg.util_est.enqueued, ue.enqueued);

+        trace_sched_util_est_cfs_tp(cfs_rq);
+
        /*
         * Skip update of task's estimated utilization when the task has not
         * yet completed an activation, e.g. being migrated.
@@ -4017,6 +4021,8 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
        ue.ewma >>= UTIL_EST_WEIGHT_SHIFT;
done:
        WRITE_ONCE(p->se.avg.util_est, ue);
+
+        trace_sched_util_est_se_tp(&p->se);
}

static inline int task_fits_capacity(struct task_struct *p, long capacity)
--
2.7.4
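
Note: the tracepoints above are "bare" (DECLARE_TRACE without a matching
TRACE_EVENT), so they do not show up as events under tracefs. A toolkit
attaches to them from a kernel module through the register_trace_*()
helpers generated by the declarations. Below is a minimal, illustrative
probe module. It is not part of the patch; the module and function names
are made up, and it assumes the sched_trace_cfs_rq_avg() /
sched_trace_cfs_rq_cpu() accessors available in kernels of this era.

// SPDX-License-Identifier: GPL-2.0
#include <linux/module.h>
#include <linux/sched.h>
#include <trace/events/sched.h>

/* Runs on every cfs_rq util_est update (enqueue/dequeue). */
static void probe_util_est_cfs(void *data, struct cfs_rq *cfs_rq)
{
        const struct sched_avg *avg = sched_trace_cfs_rq_avg(cfs_rq);

        if (avg)
                trace_printk("cpu=%d util_est.enqueued=%u\n",
                             sched_trace_cfs_rq_cpu(cfs_rq),
                             avg->util_est.enqueued);
}

/* Runs when a task's util_est (enqueued/ewma) is updated at dequeue. */
static void probe_util_est_se(void *data, struct sched_entity *se)
{
        trace_printk("enqueued=%u ewma=%u\n",
                     se->avg.util_est.enqueued, se->avg.util_est.ewma);
}

static int __init util_est_probe_init(void)
{
        register_trace_sched_util_est_cfs_tp(probe_util_est_cfs, NULL);
        register_trace_sched_util_est_se_tp(probe_util_est_se, NULL);
        return 0;
}

static void __exit util_est_probe_exit(void)
{
        unregister_trace_sched_util_est_se_tp(probe_util_est_se, NULL);
        unregister_trace_sched_util_est_cfs_tp(probe_util_est_cfs, NULL);
        tracepoint_synchronize_unregister();
}

module_init(util_est_probe_init);
module_exit(util_est_probe_exit);
MODULE_LICENSE("GPL");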


2020-06-03 12:07:15

by Valentin Schneider

Subject: Re: [PATCH] sched/debug: Add new tracepoints to track util_est


On 27/05/20 17:39, [email protected] wrote:
> From: Vincent Donnefort <[email protected]>
>
> The util_est signals are key elements for EAS task placement and
> frequency selection. Having tracepoints to track these signals enables
> load-tracking and schedutil testing and/or debugging by a toolkit.
>
> Signed-off-by: Vincent Donnefort <[email protected]>
>

To put it more bluntly, we can't really do task placement / load tracking
testing if util_est is enabled (which reminds me we may want to get rid of
the SCHED_FEAT at some point, it's been default on since ~v4.17), since
there can be noticeable gaps between util_avg and util_est.

Reviewed-by: Valentin Schneider <[email protected]>
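
For context on the gap Valentin refers to: while a task sleeps, its
util_avg keeps decaying under PELT, roughly halving every 32ms with the
default half-life, whereas its util_est is held at the value sampled at
dequeue. A toy illustration of that divergence (a userspace sketch, not
kernel code):

#include <math.h>
#include <stdio.h>

/* PELT decay while a task sleeps: util_avg roughly halves every 32ms. */
static double util_avg_after_sleep(double util_at_dequeue, double sleep_ms)
{
        return util_at_dequeue * pow(0.5, sleep_ms / 32.0);
}

int main(void)
{
        /* Dequeued at util 256, asleep for 32ms: util_avg ~128, util_est still ~256. */
        printf("util_avg after 32ms of sleep: %.0f (util_est stays at 256)\n",
               util_avg_after_sleep(256, 32));
        return 0;
}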

2020-06-03 12:18:04

by Peter Zijlstra

Subject: Re: [PATCH] sched/debug: Add new tracepoints to track util_est

On Wed, Jun 03, 2020 at 01:04:26PM +0100, Valentin Schneider wrote:
>
> On 27/05/20 17:39, [email protected] wrote:
> > From: Vincent Donnefort <[email protected]>
> >
> > The util_est signals are key elements for EAS task placement and
> > frequency selection. Having tracepoints to track these signals enables
> > load-tracking and schedutil testing and/or debugging by a toolkit.
> >
> > Signed-off-by: Vincent Donnefort <[email protected]>
> >
>
> To put it more bluntly, we can't really do task placement / load tracking
> testing if util_est is enabled (which reminds me we may want to get rid of
> the SCHED_FEAT at some point, it's been default on since ~v4.17), since
> there can be noticeable gaps between util_avg and util_est.
>
> Reviewed-by: Valentin Schneider <[email protected]>

Thanks!

Subject: [tip: sched/core] sched/debug: Add new tracepoints to track util_est

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 4581bea8b4ec4de353369775dfef921191e393b3
Gitweb: https://git.kernel.org/tip/4581bea8b4ec4de353369775dfef921191e393b3
Author: Vincent Donnefort <[email protected]>
AuthorDate: Wed, 27 May 2020 17:39:14 +01:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Mon, 15 Jun 2020 14:10:02 +02:00

sched/debug: Add new tracepoints to track util_est

The util_est signals are key elements for EAS task placement and
frequency selection. Having tracepoints to track these signals enables
load-tracking and schedutil testing and/or debugging by a toolkit.

Signed-off-by: Vincent Donnefort <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
include/trace/events/sched.h | 8 ++++++++
kernel/sched/core.c          | 2 ++
kernel/sched/fair.c          | 6 ++++++
3 files changed, 16 insertions(+)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index ed168b0..04f9a4c 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -634,6 +634,14 @@ DECLARE_TRACE(sched_overutilized_tp,
        TP_PROTO(struct root_domain *rd, bool overutilized),
        TP_ARGS(rd, overutilized));

+DECLARE_TRACE(sched_util_est_cfs_tp,
+        TP_PROTO(struct cfs_rq *cfs_rq),
+        TP_ARGS(cfs_rq));
+
+DECLARE_TRACE(sched_util_est_se_tp,
+        TP_PROTO(struct sched_entity *se),
+        TP_ARGS(se));
+
#endif /* _TRACE_SCHED_H */

/* This part must be outside protection */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9c89b0e..0208b71 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -36,6 +36,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_dl_tp);
EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_se_tp);
EXPORT_TRACEPOINT_SYMBOL_GPL(sched_overutilized_tp);
+EXPORT_TRACEPOINT_SYMBOL_GPL(sched_util_est_cfs_tp);
+EXPORT_TRACEPOINT_SYMBOL_GPL(sched_util_est_se_tp);

DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 69da576..a785a9b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3922,6 +3922,8 @@ static inline void util_est_enqueue(struct cfs_rq *cfs_rq,
        enqueued = cfs_rq->avg.util_est.enqueued;
        enqueued += _task_util_est(p);
        WRITE_ONCE(cfs_rq->avg.util_est.enqueued, enqueued);
+
+        trace_sched_util_est_cfs_tp(cfs_rq);
}

/*
@@ -3952,6 +3954,8 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
        ue.enqueued -= min_t(unsigned int, ue.enqueued, _task_util_est(p));
        WRITE_ONCE(cfs_rq->avg.util_est.enqueued, ue.enqueued);

+        trace_sched_util_est_cfs_tp(cfs_rq);
+
        /*
         * Skip update of task's estimated utilization when the task has not
         * yet completed an activation, e.g. being migrated.
@@ -4017,6 +4021,8 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
        ue.ewma >>= UTIL_EST_WEIGHT_SHIFT;
done:
        WRITE_ONCE(p->se.avg.util_est, ue);
+
+        trace_sched_util_est_se_tp(&p->se);
}

static inline int task_fits_capacity(struct task_struct *p, long capacity)
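
A closing note on the last hunk: with UTIL_EST_WEIGHT_SHIFT equal to 2 in
this kernel, the shift/add/shift sequence around the "done:" label (only
its tail is visible in the hunk) implements an EWMA with weight 1/4 for
new samples, roughly ewma += (enqueued - ewma) / 4. A userspace sketch a
toolkit might use to replay that step against sched_util_est_se_tp
samples; illustrative only, mirroring the kernel logic of this era:

#include <stdio.h>

#define UTIL_EST_WEIGHT_SHIFT        2        /* kernel value in this era */

/* Replays the kernel's shift/add/shift EWMA step (weight 1/4 for new samples). */
static unsigned int util_est_ewma_step(unsigned int ewma, unsigned int enqueued)
{
        long last_ewma_diff = (long)enqueued - (long)ewma;

        ewma <<= UTIL_EST_WEIGHT_SHIFT;
        ewma += last_ewma_diff;
        ewma >>= UTIL_EST_WEIGHT_SHIFT;
        return ewma;
}

int main(void)
{
        unsigned int ewma = 100;
        int i;

        /* The estimate converges toward a new task size of 400 over a few activations. */
        for (i = 0; i < 5; i++) {
                ewma = util_est_ewma_step(ewma, 400);
                printf("activation %d: ewma=%u\n", i + 1, ewma);
        }
        return 0;
}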