by Aaron Lu

[permalink] [raw]

Subject: Re: [PATCH v8 1/9] sched/fair: fix unfairness at wakeup

On Mon, Nov 14, 2022 at 12:05:18PM +0100, Vincent Guittot wrote:
> On Mon, 14 Nov 2022 at 04:06, Joel Fernandes <[email protected]> wrote:
> >
> > Hi Vincent,
> >
> > On Thu, Nov 10, 2022 at 06:50:01PM +0100, Vincent Guittot wrote:

... ...

> > > +static inline unsigned long get_latency_max(void)
> > > +{
> > > + unsigned long thresh = get_sched_latency(false);
> > > +
> > > + thresh -= sysctl_sched_min_granularity;
> >
> > Could you clarify, why are you subtracting sched_min_granularity here? Could
> > you add some comments here to make it clear?
>
> If the waking task failed to preempt current it could to wait up to
> sysctl_sched_min_granularity before preempting it during next tick.

check_preempt_tick() compares vdiff/delta between the leftmost se and
curr against curr's ideal_runtime, it doesn't use thresh here or the
adjusted wakeup_gran, so I don't see why reducing thresh here can help
se to preempt curr during next tick if it failed to preempt curr in its
wakeup path.

I can see reducing thresh here with whatever value can help the waking
se to preempt curr in wakeup_preempt_entity() though, because most
likely the waking se's vruntime is cfs_rq->min_vruntime -
sysctl_sched_latency/2 and curr->vruntime is near cfs_rq->min_vruntime
so vdiff is about sysctl_sched_latency/2, which is the same value as
get_sched_latency(false) and when thresh is reduced some bit, then vdiff
in wakeup_preempt_entity() will be larger than gran and make it possible
to preempt.

So I'm confused by your comment or I might misread the code.

2022-11-17 09:46:54

by Vincent Guittot

[permalink] [raw]

Subject: Re: [PATCH v8 1/9] sched/fair: fix unfairness at wakeup

On Wed, 16 Nov 2022 at 09:26, Aaron Lu <[email protected]> wrote:
>
> On Mon, Nov 14, 2022 at 12:05:18PM +0100, Vincent Guittot wrote:
> > On Mon, 14 Nov 2022 at 04:06, Joel Fernandes <[email protected]> wrote:
> > >
> > > Hi Vincent,
> > >
> > > On Thu, Nov 10, 2022 at 06:50:01PM +0100, Vincent Guittot wrote:
>
> ... ...
>
> > > > +static inline unsigned long get_latency_max(void)
> > > > +{
> > > > + unsigned long thresh = get_sched_latency(false);
> > > > +
> > > > + thresh -= sysctl_sched_min_granularity;
> > >
> > > Could you clarify, why are you subtracting sched_min_granularity here? Could
> > > you add some comments here to make it clear?
> >
> > If the waking task failed to preempt current it could to wait up to
> > sysctl_sched_min_granularity before preempting it during next tick.
>
> check_preempt_tick() compares vdiff/delta between the leftmost se and
> curr against curr's ideal_runtime, it doesn't use thresh here or the
> adjusted wakeup_gran, so I don't see why reducing thresh here can help
> se to preempt curr during next tick if it failed to preempt curr in its
> wakeup path.

If waking task doesn't preempt curr, it will wait for the next
check_preempt_tick(), but check_preempt_tick() ensures a minimum
runtime of sysctl_sched_min_granularity before comparing the vruntime.
Thresh doesn't help in check_preempt_tick() but anticipate the fact
that if it fails to preempt now, current can get an additional
sysctl_sched_min_granularity runtime before being preempted.

>
> I can see reducing thresh here with whatever value can help the waking
> se to preempt curr in wakeup_preempt_entity() though, because most
> likely the waking se's vruntime is cfs_rq->min_vruntime -
> sysctl_sched_latency/2 and curr->vruntime is near cfs_rq->min_vruntime
> so vdiff is about sysctl_sched_latency/2, which is the same value as
> get_sched_latency(false) and when thresh is reduced some bit, then vdiff
> in wakeup_preempt_entity() will be larger than gran and make it possible
> to preempt.
>
> So I'm confused by your comment or I might misread the code.