2021-10-19 12:38:00

by Vincent Guittot

[permalink] [raw]
Subject: [PATCH v3 1/5] sched/fair: Account update_blocked_averages in newidle_balance cost

The time spent to update the blocked load can be significant depending of
the complexity fo the cgroup hierarchy. Take this time into account in
the cost of the 1st load balance of a newly idle cpu.

Also reduce the number of call to sched_clock_cpu() and track more actual
work.

Signed-off-by: Vincent Guittot <[email protected]>
---
kernel/sched/fair.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 87db481e8a56..c0145677ee99 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10840,9 +10840,9 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
{
unsigned long next_balance = jiffies + HZ;
int this_cpu = this_rq->cpu;
+ u64 t0, t1, curr_cost = 0;
struct sched_domain *sd;
int pulled_task = 0;
- u64 curr_cost = 0;

update_misfit_status(NULL, this_rq);

@@ -10887,11 +10887,13 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)

raw_spin_rq_unlock(this_rq);

+ t0 = sched_clock_cpu(this_cpu);
update_blocked_averages(this_cpu);
+
rcu_read_lock();
for_each_domain(this_cpu, sd) {
int continue_balancing = 1;
- u64 t0, domain_cost;
+ u64 domain_cost;

if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
update_next_balance(sd, &next_balance);
@@ -10899,17 +10901,18 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
}

if (sd->flags & SD_BALANCE_NEWIDLE) {
- t0 = sched_clock_cpu(this_cpu);

pulled_task = load_balance(this_cpu, this_rq,
sd, CPU_NEWLY_IDLE,
&continue_balancing);

- domain_cost = sched_clock_cpu(this_cpu) - t0;
+ t1 = sched_clock_cpu(this_cpu);
+ domain_cost = t1 - t0;
if (domain_cost > sd->max_newidle_lb_cost)
sd->max_newidle_lb_cost = domain_cost;

curr_cost += domain_cost;
+ t0 = t1;
}

update_next_balance(sd, &next_balance);
--
2.17.1


2021-10-21 09:47:36

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH v3 1/5] sched/fair: Account update_blocked_averages in newidle_balance cost

On Tue, Oct 19, 2021 at 02:35:33PM +0200, Vincent Guittot wrote:
> The time spent to update the blocked load can be significant depending of
> the complexity fo the cgroup hierarchy. Take this time into account in
> the cost of the 1st load balance of a newly idle cpu.
>
> Also reduce the number of call to sched_clock_cpu() and track more actual
> work.
>
> Signed-off-by: Vincent Guittot <[email protected]>

Acked-by: Mel Gorman <[email protected]>

--
Mel Gorman
SUSE Labs

2021-10-31 10:21:33

by tip-bot2 for Jacob Pan

[permalink] [raw]
Subject: [tip: sched/core] sched/fair: Account update_blocked_averages in newidle_balance cost

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 9e9af819db5dbe4bf99101628955a26e2a41a1a5
Gitweb: https://git.kernel.org/tip/9e9af819db5dbe4bf99101628955a26e2a41a1a5
Author: Vincent Guittot <[email protected]>
AuthorDate: Tue, 19 Oct 2021 14:35:33 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Sun, 31 Oct 2021 11:11:37 +01:00

sched/fair: Account update_blocked_averages in newidle_balance cost

The time spent to update the blocked load can be significant depending of
the complexity fo the cgroup hierarchy. Take this time into account in
the cost of the 1st load balance of a newly idle cpu.

Also reduce the number of call to sched_clock_cpu() and track more actual
work.

Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Dietmar Eggemann <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/sched/fair.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 87db481..c014567 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10840,9 +10840,9 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
{
unsigned long next_balance = jiffies + HZ;
int this_cpu = this_rq->cpu;
+ u64 t0, t1, curr_cost = 0;
struct sched_domain *sd;
int pulled_task = 0;
- u64 curr_cost = 0;

update_misfit_status(NULL, this_rq);

@@ -10887,11 +10887,13 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)

raw_spin_rq_unlock(this_rq);

+ t0 = sched_clock_cpu(this_cpu);
update_blocked_averages(this_cpu);
+
rcu_read_lock();
for_each_domain(this_cpu, sd) {
int continue_balancing = 1;
- u64 t0, domain_cost;
+ u64 domain_cost;

if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
update_next_balance(sd, &next_balance);
@@ -10899,17 +10901,18 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
}

if (sd->flags & SD_BALANCE_NEWIDLE) {
- t0 = sched_clock_cpu(this_cpu);

pulled_task = load_balance(this_cpu, this_rq,
sd, CPU_NEWLY_IDLE,
&continue_balancing);

- domain_cost = sched_clock_cpu(this_cpu) - t0;
+ t1 = sched_clock_cpu(this_cpu);
+ domain_cost = t1 - t0;
if (domain_cost > sd->max_newidle_lb_cost)
sd->max_newidle_lb_cost = domain_cost;

curr_cost += domain_cost;
+ t0 = t1;
}

update_next_balance(sd, &next_balance);