When the system doesn't have enough cycles for all tasks, the scheduler
must ensure a fair split of those CPU cycles between CFS tasks. For some
use cases, fairness can't be achieved with a static distribution of the
tasks across the system and requires periodic rebalancing, but this
dynamic behavior is not always optimal and a fair distribution of CPU
time is not always ensured.
The patchset improves fairness by relaxing the constraints on selecting
migratable tasks as the number of failed load balance attempts grows.
This in turn allows the imbalance threshold to be decreased, because the
first load balance attempt will then try to migrate tasks that fully
match the imbalance.
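The idea of relaxing the migration constraint with each failed attempt
can be sketched roughly as follows. This is a hypothetical, self-contained
illustration in plain C, not the actual kernel code; the name can_detach
and the exact comparison are made up for the example:

```c
#include <stdbool.h>

/*
 * Hypothetical sketch (not the kernel implementation): as the number
 * of failed load balance attempts grows, a task with a larger load
 * relative to the remaining imbalance becomes eligible for migration.
 */
static bool can_detach(unsigned long task_load,
		       unsigned long imbalance,
		       unsigned int nr_balance_failed)
{
	/* Clamp the shift to avoid undefined behavior on large counts. */
	unsigned int shift = nr_balance_failed < 8 ? nr_balance_failed : 8;

	/*
	 * Each failed attempt halves the effective load used for the
	 * comparison, progressively relaxing the constraint. On the
	 * first attempt (nr_balance_failed == 0) only tasks whose load
	 * fits the imbalance are migrated.
	 */
	return (task_load >> shift) <= imbalance;
}
```

So a task too heavy for the imbalance on the first pass can still be
picked once a couple of attempts have failed, which avoids the balancer
getting stuck.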
Some tests results:
- small 2 x 4 cores arm64 system
hackbench -l (256000/#grp) -g #grp
grp tip/sched/core +patchset improvement
1 1.420(+/- 11.72 %) 1.382(+/-10.50 %) 2.72 %
4 1.295(+/- 2.72 %) 1.218(+/- 2.97 %) 0.76 %
8 1.220(+/- 2.17 %) 1.218(+/- 1.60 %) 0.17 %
16 1.258(+/- 1.88 %) 1.250(+/- 1.78 %) 0.58 %
fairness tests: run always-running rt-app threads
monitor the ratio between the min and max work done by threads
v5.9-rc1 w/ patchset
9 threads avg 78.3% (+/- 6.60%) 91.20% (+/- 2.44%)
worst 68.6% 85.67%
11 threads avg 65.91% (+/- 8.26%) 91.34% (+/- 1.87%)
worst 53.52% 87.26%
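For reference, the min/max fairness ratio reported above can be computed
along these lines (an illustrative sketch, not the actual test harness):

```c
/*
 * Illustrative sketch of the fairness metric: the ratio of the least
 * to the most work done by any thread, as a percentage. 100% means
 * every thread made identical progress.
 */
static double fairness_ratio(const unsigned long *work, int nr_threads)
{
	unsigned long min = work[0], max = work[0];

	for (int i = 1; i < nr_threads; i++) {
		if (work[i] < min)
			min = work[i];
		if (work[i] > max)
			max = work[i];
	}
	return 100.0 * (double)min / (double)max;
}
```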
- large 2 nodes x 28 cores x 4 threads arm64 system
The hackbench tests that I usually run as well as the sp.C.x and lu.C.x
tests with 224 threads have not shown any difference with a mix of less
than 0.5% of improvements or regressions.
Vincent Guittot (4):
sched/fair: relax constraint on task's load during load balance
sched/fair: reduce minimal imbalance threshold
sched/fair: minimize concurrent LBs between domain level
sched/fair: reduce busy load balance interval
kernel/sched/fair.c | 7 +++++--
kernel/sched/topology.c | 4 ++--
2 files changed, 7 insertions(+), 4 deletions(-)
--
2.17.1
On Mon, Sep 14, 2020 at 12:03:36PM +0200, Vincent Guittot wrote:
> Vincent Guittot (4):
> sched/fair: relax constraint on task's load during load balance
> sched/fair: reduce minimal imbalance threshold
> sched/fair: minimize concurrent LBs between domain level
> sched/fair: reduce busy load balance interval
I see nothing objectionable there, a little more testing can't hurt, but
I'm tempted to apply them.
Phil, Mel, any chance you can run them through your respective setups?
On Mon, Sep 14, 2020 at 01:42:02PM +0200 [email protected] wrote:
> On Mon, Sep 14, 2020 at 12:03:36PM +0200, Vincent Guittot wrote:
> > Vincent Guittot (4):
> > sched/fair: relax constraint on task's load during load balance
> > sched/fair: reduce minimal imbalance threshold
> > sched/fair: minimize concurrent LBs between domain level
> > sched/fair: reduce busy load balance interval
>
> I see nothing objectionable there, a little more testing can't hurt, but
> I'm tempted to apply them.
>
> Phil, Mel, any chance you can run them through your respective setups?
>
Yep. I'll try to get something started today, results in a few days.
These look pretty innocuous. It'll be interesting to see what the effect is.
Cheers,
Phil
--
On Mon, Sep 14, 2020 at 01:42:02PM +0200, [email protected] wrote:
> On Mon, Sep 14, 2020 at 12:03:36PM +0200, Vincent Guittot wrote:
> > Vincent Guittot (4):
> > sched/fair: relax constraint on task's load during load balance
> > sched/fair: reduce minimal imbalance threshold
> > sched/fair: minimize concurrent LBs between domain level
> > sched/fair: reduce busy load balance interval
>
> I see nothing objectionable there, a little more testing can't hurt, but
> I'm tempted to apply them.
>
> Phil, Mel, any chance you can run them through your respective setups?
They're queued but the test grid is backlogged at the moment. It'll be
a few days before the tests complete.
--
Mel Gorman
SUSE Labs
On Mon, 14 Sep 2020 at 14:53, Phil Auld <[email protected]> wrote:
>
> On Mon, Sep 14, 2020 at 01:42:02PM +0200 [email protected] wrote:
> > On Mon, Sep 14, 2020 at 12:03:36PM +0200, Vincent Guittot wrote:
> > > Vincent Guittot (4):
> > > sched/fair: relax constraint on task's load during load balance
> > > sched/fair: reduce minimal imbalance threshold
> > > sched/fair: minimize concurrent LBs between domain level
> > > sched/fair: reduce busy load balance interval
> >
> > I see nothing objectionable there, a little more testing can't hurt, but
> > I'm tempted to apply them.
> >
> > Phil, Mel, any chance you can run them through your respective setups?
> >
>
> Yep. I'll try to get something started today, results in a few days.
Thanks Phil
>
> These look pretty innocuous. It'll be interesting to see what the effect is.
>
>
> Cheers,
> Phil
> --
>
On Mon, 14 Sep 2020 at 17:51, Mel Gorman <[email protected]> wrote:
>
> On Mon, Sep 14, 2020 at 01:42:02PM +0200, [email protected] wrote:
> > On Mon, Sep 14, 2020 at 12:03:36PM +0200, Vincent Guittot wrote:
> > > Vincent Guittot (4):
> > > sched/fair: relax constraint on task's load during load balance
> > > sched/fair: reduce minimal imbalance threshold
> > > sched/fair: minimize concurrent LBs between domain level
> > > sched/fair: reduce busy load balance interval
> >
> > I see nothing objectionable there, a little more testing can't hurt, but
> > I'm tempted to apply them.
> >
> > Phil, Mel, any chance you can run them through your respective setups?
>
> They're queued but the test grid is backlogged at the moment. It'll be
> a few days before the tests complete.
Thanks Mel
>
> --
> Mel Gorman
> SUSE Labs
Hi Vincent,
On 14/09/20 11:03, Vincent Guittot wrote:
> When the system doesn't have enough cycles for all tasks, the scheduler
> must ensure a fair split of those CPU cycles between CFS tasks. For some
> use cases, fairness can't be achieved with a static distribution of the
> tasks across the system and requires periodic rebalancing, but this
> dynamic behavior is not always optimal and a fair distribution of CPU
> time is not always ensured.
>
> The patchset improves fairness by relaxing the constraints on selecting
> migratable tasks as the number of failed load balance attempts grows.
> This in turn allows the imbalance threshold to be decreased, because the
> first load balance attempt will then try to migrate tasks that fully
> match the imbalance.
>
> Some tests results:
>
> - small 2 x 4 cores arm64 system
>
> hackbench -l (256000/#grp) -g #grp
>
> grp tip/sched/core +patchset improvement
> 1 1.420(+/- 11.72 %) 1.382(+/-10.50 %) 2.72 %
> 4 1.295(+/- 2.72 %) 1.218(+/- 2.97 %) 0.76 %
> 8 1.220(+/- 2.17 %) 1.218(+/- 1.60 %) 0.17 %
> 16 1.258(+/- 1.88 %) 1.250(+/- 1.78 %) 0.58 %
>
>
> fairness tests: run always-running rt-app threads
> monitor the ratio between the min and max work done by threads
>
> v5.9-rc1 w/ patchset
> 9 threads avg 78.3% (+/- 6.60%) 91.20% (+/- 2.44%)
> worst 68.6% 85.67%
>
> 11 threads avg 65.91% (+/- 8.26%) 91.34% (+/- 1.87%)
> worst 53.52% 87.26%
>
> - large 2 nodes x 28 cores x 4 threads arm64 system
>
> The hackbench tests that I usually run as well as the sp.C.x and lu.C.x
> tests with 224 threads have not shown any difference with a mix of less
> than 0.5% of improvements or regressions.
>
A few nitpicks from my end, but no major objections - this looks mostly
sane to me.
> Vincent Guittot (4):
> sched/fair: relax constraint on task's load during load balance
> sched/fair: reduce minimal imbalance threshold
> sched/fair: minimize concurrent LBs between domain level
> sched/fair: reduce busy load balance interval
>
> kernel/sched/fair.c | 7 +++++--
> kernel/sched/topology.c | 4 ++--
> 2 files changed, 7 insertions(+), 4 deletions(-)
Hi Peter,
On Mon, Sep 14, 2020 at 01:42:02PM +0200 [email protected] wrote:
> On Mon, Sep 14, 2020 at 12:03:36PM +0200, Vincent Guittot wrote:
> > Vincent Guittot (4):
> > sched/fair: relax constraint on task's load during load balance
> > sched/fair: reduce minimal imbalance threshold
> > sched/fair: minimize concurrent LBs between domain level
> > sched/fair: reduce busy load balance interval
>
> I see nothing objectionable there, a little more testing can't hurt, but
> I'm tempted to apply them.
>
> Phil, Mel, any chance you can run them through your respective setups?
>
Sorry for the delay. Things have been backing up...
We tested with this series and found there was no performance change in
our test suites. (We don't have a good way to share the actual numbers
outside right now, but since they aren't really different it probably
doesn't matter much here.)
The difference we did see was a slight decrease in the number of tasks
moved around at higher loads. That seems to be a good thing even though
it didn't directly show time-based performance benefits (and was pretty
minor).
So if this helps other use cases we've got no problems with it.
Thanks,
Phil
--
On Fri, Sep 18, 2020 at 12:39:28PM -0400 Phil Auld wrote:
> Hi Peter,
>
> On Mon, Sep 14, 2020 at 01:42:02PM +0200 [email protected] wrote:
> > On Mon, Sep 14, 2020 at 12:03:36PM +0200, Vincent Guittot wrote:
> > > Vincent Guittot (4):
> > > sched/fair: relax constraint on task's load during load balance
> > > sched/fair: reduce minimal imbalance threshold
> > > sched/fair: minimize concurrent LBs between domain level
> > > sched/fair: reduce busy load balance interval
> >
> > I see nothing objectionable there, a little more testing can't hurt, but
> > I'm tempted to apply them.
> >
> > Phil, Mel, any chance you can run them through your respective setups?
> >
>
> Sorry for the delay. Things have been backing up...
>
> We tested with this series and found there was no performance change in
> our test suites. (We don't have a good way to share the actual numbers
> outside right now, but since they aren't really different it probably
> doesn't matter much here.)
>
> The difference we did see was a slight decrease in the number of tasks
> moved around at higher loads. That seems to be a good thing even though
> it didn't directly show time-based performance benefits (and was pretty
> minor).
>
> So if this helps other use cases we've got no problems with it.
>
Feel free to add a
Reviewed-by: Phil Auld <[email protected]>
Jirka did the actual testing so he can speak up with a Tested-by if he
wants to.
> Thanks,
> Phil
>
> --
>
--