2024-01-05 22:20:41

by Qais Yousef

[permalink] [raw]
Subject: [PATCH v4 0/2] sched: Don't trigger misfit if affinity is restricted

Changes since v3:

* Update commit message of patch 2 to be less verbose

Changes since v2:

* Convert access to asym_cap_list to be RCU protected
* Add new patch to sort the list in descending order
* Move some declarations inside affinity check block
* Remove now redundant check against max_cpu_capacity in check_misfit_status()

(thanks Pierre!)

Changes since v1:

* Use asym_cap_list (thanks Dietmar) to iterate instead of iterating
through every cpu, which Vincent was concerned about.
* Use uclamped util to compare with capacity instead of util_fits_cpu()
when iterating through capacities (Dietmar).
* Update commit log with test results to better demonstrate the problem

v1 discussion: https://lore.kernel.org/lkml/[email protected]/
v2 discussion: https://lore.kernel.org/lkml/[email protected]/
v3 discussion: https://lore.kernel.org/lkml/[email protected]/

Food for thought:
-----------------

Should misfit cause balance_interval to double? Even if the answer is yes, this
patch will still be needed to avoid unnecessary misfit load balancing
triggering repeatedly.

Should the doubling be made independent of the tick value? As it stands,
3 failures with TICK = 1ms will increase the interval to 8ms, but with a 4ms
tick it becomes 32ms after 3 failures, which I think is too high too soon.

Should the balance_interval be capped to something more reasonable? On systems
that require fast response (an interactive desktop, for example),
a balance_interval of 64ms and above seems too high.
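
For reference, the tick dependence above can be reproduced with a tiny
standalone sketch (not kernel code; it simply assumes the interval starts at
one tick and doubles on every failed balance attempt):

#include <stdio.h>

int main(void)
{
        unsigned int ticks_ms[] = { 1, 4 };

        for (int i = 0; i < 2; i++) {
                unsigned int interval_ms = ticks_ms[i];

                /* Three failed balance attempts, each doubling the interval. */
                for (int failures = 0; failures < 3; failures++)
                        interval_ms *= 2;

                printf("TICK = %ums -> balance_interval = %ums after 3 failures\n",
                       ticks_ms[i], interval_ms);
        }

        return 0;
}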

Thanks!

--
Qais Yousef

Qais Yousef (2):
sched/fair: Check a task has a fitting cpu when updating misfit
sched/topology: Sort asym_cap_list in descending order

kernel/sched/fair.c | 65 ++++++++++++++++++++++++++++++++++-------
kernel/sched/sched.h | 14 +++++++++
kernel/sched/topology.c | 43 +++++++++++++++------------
3 files changed, 94 insertions(+), 28 deletions(-)

--
2.34.1



2024-01-05 22:20:46

by Qais Yousef

[permalink] [raw]
Subject: [PATCH v4 2/2] sched/topology: Sort asym_cap_list in descending order

So that searches always start from the biggest CPU, which helps the misfit
detection logic be more efficient.

Suggested-by: Pierre Gondois <[email protected]>
Signed-off-by: Qais Yousef (Google) <[email protected]>
---
kernel/sched/topology.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index ba4a0b18ae25..1505677e4247 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1384,18 +1384,30 @@ static void free_asym_cap_entry(struct rcu_head *head)
static inline void asym_cpu_capacity_update_data(int cpu)
{
unsigned long capacity = arch_scale_cpu_capacity(cpu);
- struct asym_cap_data *entry = NULL;
+ struct asym_cap_data *insert_entry = NULL;
+ struct asym_cap_data *entry;

+ /*
+ * Search if the capacity already exists. If not, track the entry after
+ * which we should insert to keep the list sorted in descending order.
+ */
list_for_each_entry(entry, &asym_cap_list, link) {
if (capacity == entry->capacity)
goto done;
+ else if (!insert_entry && capacity > entry->capacity)
+ insert_entry = list_prev_entry(entry, link);
}

entry = kzalloc(sizeof(*entry) + cpumask_size(), GFP_KERNEL);
if (WARN_ONCE(!entry, "Failed to allocate memory for asymmetry data\n"))
return;
entry->capacity = capacity;
- list_add_rcu(&entry->link, &asym_cap_list);
+
+ /* If NULL then the new capacity is the smallest, add last. */
+ if (!insert_entry)
+ list_add_tail_rcu(&entry->link, &asym_cap_list);
+ else
+ list_add_rcu(&entry->link, &insert_entry->link);
done:
__cpumask_set_cpu(cpu, cpu_capacity_span(entry));
}
--
2.34.1


2024-01-05 22:20:48

by Qais Yousef

[permalink] [raw]
Subject: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

From: Qais Yousef <[email protected]>

If a misfit task is affined to a subset of the possible cpus, we need to
verify that one of these cpus can fit it. Otherwise the load balancer
code will continuously trigger needlessly, leading the balance_interval
to increase in return, and we eventually end up in a situation where real
imbalances take a long time to address because of this impossible
imbalance situation.

This can happen in Android world where it's common for background tasks
to be restricted to little cores.

Similarly, if the task can't fit even the biggest core, triggering misfit is
pointless as that is the best it can ever get on this system.

To be able to detect that, we use asym_cap_list to iterate through the
capacities in the system and see if the task is able to run at a higher
capacity level based on its p->cpus_ptr. To do so safely, we convert the
list to be RCU protected.

To be able to iterate through capacity levels, export asym_cap_list to
allow for fast traversal of all available capacity levels in the system.

Test:
=====

Add

trace_printk("balance_interval = %lu\n", interval)

in get_sd_balance_interval().

run
if [ "$MASK" != "0" ]; then
adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
fi
sleep 10
// parse ftrace buffer counting the occurrence of each value

Where MASK is either:

* 0: no busy task running
* 1: busy task is pinned to 1 cpu; handled today to not cause
misfit
* f: busy task pinned to little cores, simulates busy background
task, demonstrates the problem to be fixed

Results:
========

Note how the occurrence of balance_interval = 128 overshoots for MASK = f.

BEFORE
------

MASK=0

1 balance_interval = 175
120 balance_interval = 128
846 balance_interval = 64
55 balance_interval = 63
215 balance_interval = 32
2 balance_interval = 31
2 balance_interval = 16
4 balance_interval = 8
1870 balance_interval = 4
65 balance_interval = 2

MASK=1

27 balance_interval = 175
37 balance_interval = 127
840 balance_interval = 64
167 balance_interval = 63
449 balance_interval = 32
84 balance_interval = 31
304 balance_interval = 16
1156 balance_interval = 8
2781 balance_interval = 4
428 balance_interval = 2

MASK=f

1 balance_interval = 175
1328 balance_interval = 128
44 balance_interval = 64
101 balance_interval = 63
25 balance_interval = 32
5 balance_interval = 31
23 balance_interval = 16
23 balance_interval = 8
4306 balance_interval = 4
177 balance_interval = 2

AFTER
-----

Note how the high values almost disappear for all MASK values. The
system has background tasks that could trigger the problem even without
simulating it, hence the improvement even with MASK=0.

MASK=0

103 balance_interval = 63
19 balance_interval = 31
194 balance_interval = 8
4827 balance_interval = 4
179 balance_interval = 2

MASK=1

131 balance_interval = 63
1 balance_interval = 31
87 balance_interval = 8
3600 balance_interval = 4
7 balance_interval = 2

MASK=f

8 balance_interval = 127
182 balance_interval = 63
3 balance_interval = 31
9 balance_interval = 16
415 balance_interval = 8
3415 balance_interval = 4
21 balance_interval = 2

Signed-off-by: Qais Yousef <[email protected]>
Signed-off-by: Qais Yousef (Google) <[email protected]>
---
kernel/sched/fair.c | 65 ++++++++++++++++++++++++++++++++++-------
kernel/sched/sched.h | 14 +++++++++
kernel/sched/topology.c | 29 ++++++++----------
3 files changed, 81 insertions(+), 27 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bcea3d55d95d..0830ceb7ca07 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5065,17 +5065,61 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)

static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
{
+ unsigned long uclamp_min, uclamp_max;
+ unsigned long util, cpu_cap;
+ int cpu = cpu_of(rq);
+
if (!sched_asym_cpucap_active())
return;

- if (!p || p->nr_cpus_allowed == 1) {
- rq->misfit_task_load = 0;
- return;
- }
+ if (!p || p->nr_cpus_allowed == 1)
+ goto out;

- if (task_fits_cpu(p, cpu_of(rq))) {
- rq->misfit_task_load = 0;
- return;
+ cpu_cap = arch_scale_cpu_capacity(cpu);
+
+ /* If we can't fit the biggest CPU, that's the best we can ever get. */
+ if (cpu_cap == SCHED_CAPACITY_SCALE)
+ goto out;
+
+ uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
+ uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
+ util = task_util_est(p);
+
+ if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
+ goto out;
+
+ /*
+ * If the task affinity is not set to default, make sure it is not
+ * restricted to a subset where no CPU can ever fit it. Triggering
+ * misfit in this case is pointless as it has nowhere better to move
+ * to. And it can lead balance_interval to grow too high as we'll
+ * continuously fail to move it anywhere.
+ */
+ if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
+ unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
+ bool has_fitting_cpu = false;
+ struct asym_cap_data *entry;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(entry, &asym_cap_list, link) {
+ if (entry->capacity > cpu_cap) {
+ cpumask_t *cpumask;
+
+ if (clamped_util > entry->capacity)
+ continue;
+
+ cpumask = cpu_capacity_span(entry);
+ if (!cpumask_intersects(p->cpus_ptr, cpumask))
+ continue;
+
+ has_fitting_cpu = true;
+ break;
+ }
+ }
+ rcu_read_unlock();
+
+ if (!has_fitting_cpu)
+ goto out;
}

/*
@@ -5083,6 +5127,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
* task_h_load() returns 0.
*/
rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
+ return;
+out:
+ rq->misfit_task_load = 0;
}

#else /* CONFIG_SMP */
@@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
*/
static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
{
- return rq->misfit_task_load &&
- (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
- check_cpu_capacity(rq, sd));
+ return rq->misfit_task_load && check_cpu_capacity(rq, sd);
}

/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e58a54bda77d..a653017a1b9b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -109,6 +109,20 @@ extern int sysctl_sched_rt_period;
extern int sysctl_sched_rt_runtime;
extern int sched_rr_timeslice;

+/*
+ * Asymmetric CPU capacity bits
+ */
+struct asym_cap_data {
+ struct list_head link;
+ struct rcu_head rcu;
+ unsigned long capacity;
+ unsigned long cpus[];
+};
+
+extern struct list_head asym_cap_list;
+
+#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus)
+
/*
* Helpers for converting nanosecond timing to jiffy resolution
*/
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 10d1391e7416..ba4a0b18ae25 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1329,24 +1329,13 @@ static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
update_group_capacity(sd, cpu);
}

-/*
- * Asymmetric CPU capacity bits
- */
-struct asym_cap_data {
- struct list_head link;
- unsigned long capacity;
- unsigned long cpus[];
-};
-
/*
* Set of available CPUs grouped by their corresponding capacities
* Each list entry contains a CPU mask reflecting CPUs that share the same
* capacity.
* The lifespan of data is unlimited.
*/
-static LIST_HEAD(asym_cap_list);
-
-#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus)
+LIST_HEAD(asym_cap_list);

/*
* Verify whether there is any CPU capacity asymmetry in a given sched domain.
@@ -1386,6 +1375,12 @@ asym_cpu_capacity_classify(const struct cpumask *sd_span,

}

+static void free_asym_cap_entry(struct rcu_head *head)
+{
+ struct asym_cap_data *entry = container_of(head, struct asym_cap_data, rcu);
+ kfree(entry);
+}
+
static inline void asym_cpu_capacity_update_data(int cpu)
{
unsigned long capacity = arch_scale_cpu_capacity(cpu);
@@ -1400,7 +1395,7 @@ static inline void asym_cpu_capacity_update_data(int cpu)
if (WARN_ONCE(!entry, "Failed to allocate memory for asymmetry data\n"))
return;
entry->capacity = capacity;
- list_add(&entry->link, &asym_cap_list);
+ list_add_rcu(&entry->link, &asym_cap_list);
done:
__cpumask_set_cpu(cpu, cpu_capacity_span(entry));
}
@@ -1423,8 +1418,8 @@ static void asym_cpu_capacity_scan(void)

list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
if (cpumask_empty(cpu_capacity_span(entry))) {
- list_del(&entry->link);
- kfree(entry);
+ list_del_rcu(&entry->link);
+ call_rcu(&entry->rcu, free_asym_cap_entry);
}
}

@@ -1434,8 +1429,8 @@ static void asym_cpu_capacity_scan(void)
*/
if (list_is_singular(&asym_cap_list)) {
entry = list_first_entry(&asym_cap_list, typeof(*entry), link);
- list_del(&entry->link);
- kfree(entry);
+ list_del_rcu(&entry->link);
+ call_rcu(&entry->rcu, free_asym_cap_entry);
}
}

--
2.34.1


2024-01-21 00:11:18

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 0/2] sched: Don't trigger misfit if affinity is restricted

Hi Vincent

On 01/05/24 22:20, Qais Yousef wrote:
> Changes since v3:
>
> * Update commit message of patch 2 to be less verbose
>
> Changes since v2:
>
> * Convert access of asym_cap_list to be rcu protected
> * Add new patch to sort the list in descending order
> * Move some declarations inside affinity check block
> * Remove now redundant check against max_cpu_capacity in check_misfit_status()
>
> (thanks Pierre!)
>
> Changes since v1:
>
> * Use asym_cap_list (thanks Dietmar) to iterate instead of iterating
> through every cpu which Vincent was concerned about.
> * Use uclamped util to compare with capacity instead of util_fits_cpu()
> when iterating through capcities (Dietmar).
> * Update commit log with test results to better demonstrate the problem
>
> v1 discussion: https://lore.kernel.org/lkml/[email protected]/
> v2 discussion: https://lore.kernel.org/lkml/[email protected]/
> v3 discussion: https://lore.kernel.org/lkml/[email protected]/
>
> Food for thoughts:
> ------------------
>
> Should misfit cause balance_interval to double? This patch will still be needed
> if the answer is yes to avoid unnecessary misfit-lb to trigger repeatedly
> anyway.
>
> Should the doubling be made independent of tick value? As it stands 3 failures
> for TICK = 1ms will increase it to 8ms. But for 4ms tick this will become 32ms
> after 3 failures. Which I think is too high too soon.
>
> Should the balance_interval be capped to something more reasonable? On systems
> that require fast response (interactive Desktop for example);
> a balance_interval of 64ms and above seem too high.

Does this series address your concerns about scalability now?

If you have thoughts on the above that'd be great to hear too.


Thanks

--
Qais Yousef

2024-01-22 10:26:09

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 05/01/2024 23:20, Qais Yousef wrote:
> From: Qais Yousef <[email protected]>
>
> If a misfit task is affined to a subset of the possible cpus, we need to
> verify that one of these cpus can fit it. Otherwise the load balancer
> code will continuously trigger needlessly leading the balance_interval
> to increase in return and eventually end up with a situation where real
> imbalances take a long time to address because of this impossible
> imbalance situation.
>
> This can happen in Android world where it's common for background tasks
> to be restricted to little cores.
>
> Similarly if we can't fit the biggest core, triggering misfit is
> pointless as it is the best we can ever get on this system.
>
> To be able to detect that; we use asym_cap_list to iterate through
> capacities in the system to see if the task is able to run at a higher
> capacity level based on its p->cpus_ptr. To do so safely, we convert the
> list to be RCU protected.
>
> To be able to iterate through capacity levels, export asym_cap_list to
> allow for fast traversal of all available capacity levels in the system.
>
> Test:
> =====
>
> Add
>
> trace_printk("balance_interval = %lu\n", interval)
>
> in get_sd_balance_interval().
>
> run
> if [ "$MASK" != "0" ]; then
> adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
> fi
> sleep 10
> // parse ftrace buffer counting the occurrence of each valaue
>
> Where MASK is either:
>
> * 0: no busy task running

... no busy task stands for no misfit scenario?

> * 1: busy task is pinned to 1 cpu; handled today to not cause
> misfit
> * f: busy task pinned to little cores, simulates busy background
> task, demonstrates the problem to be fixed
>

[...]

> + /*
> + * If the task affinity is not set to default, make sure it is not
> + * restricted to a subset where no CPU can ever fit it. Triggering
> + * misfit in this case is pointless as it has no where better to move
> + * to. And it can lead to balance_interval to grow too high as we'll
> + * continuously fail to move it anywhere.
> + */
> + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {

Shouldn't this be cpu_active_mask ?

include/linux/cpumask.h

* cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
* cpu_present_mask - has bit 'cpu' set iff cpu is populated
* cpu_online_mask - has bit 'cpu' set iff cpu available to scheduler
* cpu_active_mask - has bit 'cpu' set iff cpu available to migration


> + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> + bool has_fitting_cpu = false;
> + struct asym_cap_data *entry;
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(entry, &asym_cap_list, link) {
> + if (entry->capacity > cpu_cap) {
> + cpumask_t *cpumask;
> +
> + if (clamped_util > entry->capacity)
> + continue;
> +
> + cpumask = cpu_capacity_span(entry);
> + if (!cpumask_intersects(p->cpus_ptr, cpumask))
> + continue;
> +
> + has_fitting_cpu = true;
> + break;
> + }
> + }

What happen when we hotplug out all CPUs of one CPU capacity value?
IMHO, we don't call asym_cpu_capacity_scan() with !new_topology
(partition_sched_domains_locked()).

> + rcu_read_unlock();
> +
> + if (!has_fitting_cpu)
> + goto out;
> }
>
> /*
> @@ -5083,6 +5127,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> * task_h_load() returns 0.
> */
> rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
> + return;
> +out:
> + rq->misfit_task_load = 0;
> }
>
> #else /* CONFIG_SMP */
> @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> */
> static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> {
> - return rq->misfit_task_load &&
> - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> - check_cpu_capacity(rq, sd));
> + return rq->misfit_task_load && check_cpu_capacity(rq, sd);

You removed 'arch_scale_cpu_capacity(rq->cpu) <
rq->rd->max_cpu_capacity' here. Why? I can see that with the standard
setup (max CPU capacity equal 1024) which is what we probably use 100%
of the time now. It might get useful again when Vincent will introduce
his 'user space system pressure' implementation?

> }

[...]

> @@ -1423,8 +1418,8 @@ static void asym_cpu_capacity_scan(void)
>
> list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
> if (cpumask_empty(cpu_capacity_span(entry))) {
> - list_del(&entry->link);
> - kfree(entry);
> + list_del_rcu(&entry->link);
> + call_rcu(&entry->rcu, free_asym_cap_entry);

Looks like there could be brief moments in which one CPU capacity group
of CPUs could be twice in asym_cap_list. I'm thinking about initial
startup + max CPU frequency related adjustment of CPU capacity
(init_cpu_capacity_callback()) for instance. Not sure if this is really
an issue?

[...]


2024-01-22 18:29:57

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/22/24 09:59, Dietmar Eggemann wrote:
> On 05/01/2024 23:20, Qais Yousef wrote:
> > From: Qais Yousef <[email protected]>
> >
> > If a misfit task is affined to a subset of the possible cpus, we need to
> > verify that one of these cpus can fit it. Otherwise the load balancer
> > code will continuously trigger needlessly leading the balance_interval
> > to increase in return and eventually end up with a situation where real
> > imbalances take a long time to address because of this impossible
> > imbalance situation.
> >
> > This can happen in Android world where it's common for background tasks
> > to be restricted to little cores.
> >
> > Similarly if we can't fit the biggest core, triggering misfit is
> > pointless as it is the best we can ever get on this system.
> >
> > To be able to detect that; we use asym_cap_list to iterate through
> > capacities in the system to see if the task is able to run at a higher
> > capacity level based on its p->cpus_ptr. To do so safely, we convert the
> > list to be RCU protected.
> >
> > To be able to iterate through capacity levels, export asym_cap_list to
> > allow for fast traversal of all available capacity levels in the system.
> >
> > Test:
> > =====
> >
> > Add
> >
> > trace_printk("balance_interval = %lu\n", interval)
> >
> > in get_sd_balance_interval().
> >
> > run
> > if [ "$MASK" != "0" ]; then
> > adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
> > fi
> > sleep 10
> > // parse ftrace buffer counting the occurrence of each valaue
> >
> > Where MASK is either:
> >
> > * 0: no busy task running
>
> ... no busy task stands for no misfit scenario?

Yes

>
> > * 1: busy task is pinned to 1 cpu; handled today to not cause
> > misfit
> > * f: busy task pinned to little cores, simulates busy background
> > task, demonstrates the problem to be fixed
> >
>
> [...]
>
> > + /*
> > + * If the task affinity is not set to default, make sure it is not
> > + * restricted to a subset where no CPU can ever fit it. Triggering
> > + * misfit in this case is pointless as it has no where better to move
> > + * to. And it can lead to balance_interval to grow too high as we'll
> > + * continuously fail to move it anywhere.
> > + */
> > + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
>
> Shouldn't this be cpu_active_mask ?

Hmm. So the intention was to check if the affinity was changed from default.

If we hotplug out all but the little CPUs we could end up with the same
problem, yes you're right.

But if the affinity is set to only the littles and cpu_active_mask covers only
the littles too, then we'll also end up with the same problem as they both are
equal.

Better to drop this check then? With the sorted list the common case should be
quick to return as they'll have 1024 as a possible CPU.

>
> include/linux/cpumask.h
>
> * cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
> * cpu_present_mask - has bit 'cpu' set iff cpu is populated
> * cpu_online_mask - has bit 'cpu' set iff cpu available to scheduler
> * cpu_active_mask - has bit 'cpu' set iff cpu available to migration
>
>
> > + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> > + bool has_fitting_cpu = false;
> > + struct asym_cap_data *entry;
> > +
> > + rcu_read_lock();
> > + list_for_each_entry_rcu(entry, &asym_cap_list, link) {
> > + if (entry->capacity > cpu_cap) {
> > + cpumask_t *cpumask;
> > +
> > + if (clamped_util > entry->capacity)
> > + continue;
> > +
> > + cpumask = cpu_capacity_span(entry);
> > + if (!cpumask_intersects(p->cpus_ptr, cpumask))
> > + continue;
> > +
> > + has_fitting_cpu = true;
> > + break;
> > + }
> > + }
>
> What happen when we hotplug out all CPUs of one CPU capacity value?
> IMHO, we don't call asym_cpu_capacity_scan() with !new_topology
> (partition_sched_domains_locked()).

Right. I missed that. We can add another intersection check against
cpu_active_mask.
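
Something like the below on top of this patch is what I have in mind (just
a rough sketch, not tested; the only addition is the extra
cpumask_intersects() against cpu_active_mask):

	rcu_read_lock();
	list_for_each_entry_rcu(entry, &asym_cap_list, link) {
		if (entry->capacity > cpu_cap) {
			cpumask_t *cpumask;

			if (clamped_util > entry->capacity)
				continue;

			cpumask = cpu_capacity_span(entry);
			if (!cpumask_intersects(p->cpus_ptr, cpumask))
				continue;

			/* Sketch: skip capacity levels whose CPUs are all offline */
			if (!cpumask_intersects(cpumask, cpu_active_mask))
				continue;

			has_fitting_cpu = true;
			break;
		}
	}
	rcu_read_unlock();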

I am assuming the skipping was done by design, not a bug that needs fixing?
I see for suspend (cpuhp_tasks_frozen) the domains are rebuilt, but not for
hotplug.

>
> > + rcu_read_unlock();
> > +
> > + if (!has_fitting_cpu)
> > + goto out;
> > }
> >
> > /*
> > @@ -5083,6 +5127,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> > * task_h_load() returns 0.
> > */
> > rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
> > + return;
> > +out:
> > + rq->misfit_task_load = 0;
> > }
> >
> > #else /* CONFIG_SMP */
> > @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> > */
> > static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> > {
> > - return rq->misfit_task_load &&
> > - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> > - check_cpu_capacity(rq, sd));
> > + return rq->misfit_task_load && check_cpu_capacity(rq, sd);
>
> You removed 'arch_scale_cpu_capacity(rq->cpu) <
> rq->rd->max_cpu_capacity' here. Why? I can see that with the standard

Based on Pierre's review, since we no longer trigger misfit for big cores.
I thought Pierre's remark was correct so I made the change in v3:

https://lore.kernel.org/lkml/[email protected]/

> setup (max CPU capacity equal 1024) which is what we probably use 100%
> of the time now. It might get useful again when Vincent will introduce
> his 'user space system pressure' implementation?

I don't mind putting it back if you think it'd be required again in the near
future. I still didn't get a chance to look at Vincent's patches properly, but
if there's a clash let's reduce the duplicated work.

>
> > }
>
> [...]
>
> > @@ -1423,8 +1418,8 @@ static void asym_cpu_capacity_scan(void)
> >
> > list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
> > if (cpumask_empty(cpu_capacity_span(entry))) {
> > - list_del(&entry->link);
> > - kfree(entry);
> > + list_del_rcu(&entry->link);
> > + call_rcu(&entry->rcu, free_asym_cap_entry);
>
> Looks like there could be brief moments in which one CPU capacity group
> of CPUs could be twice in asym_cap_list. I'm thinking about initial
> startup + max CPU frequency related adjustment of CPU capacity
> (init_cpu_capacity_callback()) for instance. Not sure if this is really
> an issue?

I don't think so. As long as the reader sees a consistent value and no crashes
ensue, a momentarily wrong decision during the transient, or some extra work,
is fine IMO. I don't foresee a big impact.


Thanks!

--
Qais Yousef

>
> [...]
>

2024-01-23 08:27:28

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
>
> From: Qais Yousef <[email protected]>
>
> If a misfit task is affined to a subset of the possible cpus, we need to
> verify that one of these cpus can fit it. Otherwise the load balancer
> code will continuously trigger needlessly leading the balance_interval
> to increase in return and eventually end up with a situation where real
> imbalances take a long time to address because of this impossible
> imbalance situation.

If your problem is about increasing balance_interval, it would be
better to not increase the interval in such a case.
I mean that we are able to detect misfit_task conditions for the
periodic load balance so we should be able to not increase the
interval in such cases.

If I'm not wrong, your problem only happens when the system is
overutilized and we have disabled EAS.

>
> This can happen in Android world where it's common for background tasks
> to be restricted to little cores.
>
> Similarly if we can't fit the biggest core, triggering misfit is
> pointless as it is the best we can ever get on this system.
>
> To be able to detect that; we use asym_cap_list to iterate through
> capacities in the system to see if the task is able to run at a higher
> capacity level based on its p->cpus_ptr. To do so safely, we convert the
> list to be RCU protected.
>
> To be able to iterate through capacity levels, export asym_cap_list to
> allow for fast traversal of all available capacity levels in the system.
>
> Test:
> =====
>
> Add
>
> trace_printk("balance_interval = %lu\n", interval)
>
> in get_sd_balance_interval().
>
> run
> if [ "$MASK" != "0" ]; then
> adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
> fi
> sleep 10
> // parse ftrace buffer counting the occurrence of each valaue
>
> Where MASK is either:
>
> * 0: no busy task running
> * 1: busy task is pinned to 1 cpu; handled today to not cause
> misfit
> * f: busy task pinned to little cores, simulates busy background
> task, demonstrates the problem to be fixed
>
> Results:
> ========
>
> Note how occurrence of balance_interval = 128 overshoots for MASK = f.
>
> BEFORE
> ------
>
> MASK=0
>
> 1 balance_interval = 175
> 120 balance_interval = 128
> 846 balance_interval = 64
> 55 balance_interval = 63
> 215 balance_interval = 32
> 2 balance_interval = 31
> 2 balance_interval = 16
> 4 balance_interval = 8
> 1870 balance_interval = 4
> 65 balance_interval = 2
>
> MASK=1
>
> 27 balance_interval = 175
> 37 balance_interval = 127
> 840 balance_interval = 64
> 167 balance_interval = 63
> 449 balance_interval = 32
> 84 balance_interval = 31
> 304 balance_interval = 16
> 1156 balance_interval = 8
> 2781 balance_interval = 4
> 428 balance_interval = 2
>
> MASK=f
>
> 1 balance_interval = 175
> 1328 balance_interval = 128
> 44 balance_interval = 64
> 101 balance_interval = 63
> 25 balance_interval = 32
> 5 balance_interval = 31
> 23 balance_interval = 16
> 23 balance_interval = 8
> 4306 balance_interval = 4
> 177 balance_interval = 2
>
> AFTER
> -----
>
> Note how the high values almost disappear for all MASK values. The
> system has background tasks that could trigger the problem without
> simulate it even with MASK=0.
>
> MASK=0
>
> 103 balance_interval = 63
> 19 balance_interval = 31
> 194 balance_interval = 8
> 4827 balance_interval = 4
> 179 balance_interval = 2
>
> MASK=1
>
> 131 balance_interval = 63
> 1 balance_interval = 31
> 87 balance_interval = 8
> 3600 balance_interval = 4
> 7 balance_interval = 2
>
> MASK=f
>
> 8 balance_interval = 127
> 182 balance_interval = 63
> 3 balance_interval = 31
> 9 balance_interval = 16
> 415 balance_interval = 8
> 3415 balance_interval = 4
> 21 balance_interval = 2
>
> Signed-off-by: Qais Yousef <[email protected]>
> Signed-off-by: Qais Yousef (Google) <[email protected]>
> ---
> kernel/sched/fair.c | 65 ++++++++++++++++++++++++++++++++++-------
> kernel/sched/sched.h | 14 +++++++++
> kernel/sched/topology.c | 29 ++++++++----------
> 3 files changed, 81 insertions(+), 27 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bcea3d55d95d..0830ceb7ca07 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5065,17 +5065,61 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
>
> static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> {
> + unsigned long uclamp_min, uclamp_max;
> + unsigned long util, cpu_cap;
> + int cpu = cpu_of(rq);
> +
> if (!sched_asym_cpucap_active())
> return;
>
> - if (!p || p->nr_cpus_allowed == 1) {
> - rq->misfit_task_load = 0;
> - return;
> - }
> + if (!p || p->nr_cpus_allowed == 1)
> + goto out;
>
> - if (task_fits_cpu(p, cpu_of(rq))) {
> - rq->misfit_task_load = 0;
> - return;
> + cpu_cap = arch_scale_cpu_capacity(cpu);
> +
> + /* If we can't fit the biggest CPU, that's the best we can ever get. */
> + if (cpu_cap == SCHED_CAPACITY_SCALE)
> + goto out;
> +
> + uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> + uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> + util = task_util_est(p);
> +
> + if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> + goto out;
> +
> + /*
> + * If the task affinity is not set to default, make sure it is not
> + * restricted to a subset where no CPU can ever fit it. Triggering
> + * misfit in this case is pointless as it has no where better to move
> + * to. And it can lead to balance_interval to grow too high as we'll
> + * continuously fail to move it anywhere.
> + */
> + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> + bool has_fitting_cpu = false;
> + struct asym_cap_data *entry;
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(entry, &asym_cap_list, link) {
> + if (entry->capacity > cpu_cap) {
> + cpumask_t *cpumask;
> +
> + if (clamped_util > entry->capacity)
> + continue;
> +
> + cpumask = cpu_capacity_span(entry);
> + if (!cpumask_intersects(p->cpus_ptr, cpumask))
> + continue;
> +
> + has_fitting_cpu = true;
> + break;
> + }
> + }
> + rcu_read_unlock();
> +
> + if (!has_fitting_cpu)
> + goto out;
> }
>
> /*
> @@ -5083,6 +5127,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> * task_h_load() returns 0.
> */
> rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
> + return;
> +out:
> + rq->misfit_task_load = 0;
> }
>
> #else /* CONFIG_SMP */
> @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> */
> static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> {
> - return rq->misfit_task_load &&
> - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> - check_cpu_capacity(rq, sd));
> + return rq->misfit_task_load && check_cpu_capacity(rq, sd);
> }
>
> /*
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index e58a54bda77d..a653017a1b9b 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -109,6 +109,20 @@ extern int sysctl_sched_rt_period;
> extern int sysctl_sched_rt_runtime;
> extern int sched_rr_timeslice;
>
> +/*
> + * Asymmetric CPU capacity bits
> + */
> +struct asym_cap_data {
> + struct list_head link;
> + struct rcu_head rcu;
> + unsigned long capacity;
> + unsigned long cpus[];
> +};
> +
> +extern struct list_head asym_cap_list;
> +
> +#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus)
> +
> /*
> * Helpers for converting nanosecond timing to jiffy resolution
> */
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 10d1391e7416..ba4a0b18ae25 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1329,24 +1329,13 @@ static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
> update_group_capacity(sd, cpu);
> }
>
> -/*
> - * Asymmetric CPU capacity bits
> - */
> -struct asym_cap_data {
> - struct list_head link;
> - unsigned long capacity;
> - unsigned long cpus[];
> -};
> -
> /*
> * Set of available CPUs grouped by their corresponding capacities
> * Each list entry contains a CPU mask reflecting CPUs that share the same
> * capacity.
> * The lifespan of data is unlimited.
> */
> -static LIST_HEAD(asym_cap_list);
> -
> -#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus)
> +LIST_HEAD(asym_cap_list);
>
> /*
> * Verify whether there is any CPU capacity asymmetry in a given sched domain.
> @@ -1386,6 +1375,12 @@ asym_cpu_capacity_classify(const struct cpumask *sd_span,
>
> }
>
> +static void free_asym_cap_entry(struct rcu_head *head)
> +{
> + struct asym_cap_data *entry = container_of(head, struct asym_cap_data, rcu);
> + kfree(entry);
> +}
> +
> static inline void asym_cpu_capacity_update_data(int cpu)
> {
> unsigned long capacity = arch_scale_cpu_capacity(cpu);
> @@ -1400,7 +1395,7 @@ static inline void asym_cpu_capacity_update_data(int cpu)
> if (WARN_ONCE(!entry, "Failed to allocate memory for asymmetry data\n"))
> return;
> entry->capacity = capacity;
> - list_add(&entry->link, &asym_cap_list);
> + list_add_rcu(&entry->link, &asym_cap_list);
> done:
> __cpumask_set_cpu(cpu, cpu_capacity_span(entry));
> }
> @@ -1423,8 +1418,8 @@ static void asym_cpu_capacity_scan(void)
>
> list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
> if (cpumask_empty(cpu_capacity_span(entry))) {
> - list_del(&entry->link);
> - kfree(entry);
> + list_del_rcu(&entry->link);
> + call_rcu(&entry->rcu, free_asym_cap_entry);
> }
> }
>
> @@ -1434,8 +1429,8 @@ static void asym_cpu_capacity_scan(void)
> */
> if (list_is_singular(&asym_cap_list)) {
> entry = list_first_entry(&asym_cap_list, typeof(*entry), link);
> - list_del(&entry->link);
> - kfree(entry);
> + list_del_rcu(&entry->link);
> + call_rcu(&entry->rcu, free_asym_cap_entry);
> }
> }
>
> --
> 2.34.1
>

2024-01-23 08:32:38

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Mon, 22 Jan 2024 at 10:59, Dietmar Eggemann <[email protected]> wrote:
>
> On 05/01/2024 23:20, Qais Yousef wrote:
> > From: Qais Yousef <[email protected]>
> >
> > If a misfit task is affined to a subset of the possible cpus, we need to
> > verify that one of these cpus can fit it. Otherwise the load balancer
> > code will continuously trigger needlessly leading the balance_interval
> > to increase in return and eventually end up with a situation where real
> > imbalances take a long time to address because of this impossible
> > imbalance situation.
> >
> > This can happen in Android world where it's common for background tasks
> > to be restricted to little cores.
> >
> > Similarly if we can't fit the biggest core, triggering misfit is
> > pointless as it is the best we can ever get on this system.
> >
> > To be able to detect that; we use asym_cap_list to iterate through
> > capacities in the system to see if the task is able to run at a higher
> > capacity level based on its p->cpus_ptr. To do so safely, we convert the
> > list to be RCU protected.
> >
> > To be able to iterate through capacity levels, export asym_cap_list to
> > allow for fast traversal of all available capacity levels in the system.
> >
> > Test:
> > =====
> >
> > Add
> >
> > trace_printk("balance_interval = %lu\n", interval)
> >
> > in get_sd_balance_interval().
> >
> > run
> > if [ "$MASK" != "0" ]; then
> > adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
> > fi
> > sleep 10
> > // parse ftrace buffer counting the occurrence of each valaue
> >
> > Where MASK is either:
> >
> > * 0: no busy task running
>
> ... no busy task stands for no misfit scenario?
>
> > * 1: busy task is pinned to 1 cpu; handled today to not cause
> > misfit
> > * f: busy task pinned to little cores, simulates busy background
> > task, demonstrates the problem to be fixed
> >
>
> [...]
>
> > + /*
> > + * If the task affinity is not set to default, make sure it is not
> > + * restricted to a subset where no CPU can ever fit it. Triggering
> > + * misfit in this case is pointless as it has no where better to move
> > + * to. And it can lead to balance_interval to grow too high as we'll
> > + * continuously fail to move it anywhere.
> > + */
> > + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
>
> Shouldn't this be cpu_active_mask ?
>
> include/linux/cpumask.h
>
> * cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
> * cpu_present_mask - has bit 'cpu' set iff cpu is populated
> * cpu_online_mask - has bit 'cpu' set iff cpu available to scheduler
> * cpu_active_mask - has bit 'cpu' set iff cpu available to migration
>
>
> > + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> > + bool has_fitting_cpu = false;
> > + struct asym_cap_data *entry;
> > +
> > + rcu_read_lock();
> > + list_for_each_entry_rcu(entry, &asym_cap_list, link) {
> > + if (entry->capacity > cpu_cap) {
> > + cpumask_t *cpumask;
> > +
> > + if (clamped_util > entry->capacity)
> > + continue;
> > +
> > + cpumask = cpu_capacity_span(entry);
> > + if (!cpumask_intersects(p->cpus_ptr, cpumask))
> > + continue;
> > +
> > + has_fitting_cpu = true;
> > + break;
> > + }
> > + }
>
> What happen when we hotplug out all CPUs of one CPU capacity value?
> IMHO, we don't call asym_cpu_capacity_scan() with !new_topology
> (partition_sched_domains_locked()).
>
> > + rcu_read_unlock();
> > +
> > + if (!has_fitting_cpu)
> > + goto out;
> > }
> >
> > /*
> > @@ -5083,6 +5127,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> > * task_h_load() returns 0.
> > */
> > rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
> > + return;
> > +out:
> > + rq->misfit_task_load = 0;
> > }
> >
> > #else /* CONFIG_SMP */
> > @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> > */
> > static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> > {
> > - return rq->misfit_task_load &&
> > - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> > - check_cpu_capacity(rq, sd));
> > + return rq->misfit_task_load && check_cpu_capacity(rq, sd);
>
> You removed 'arch_scale_cpu_capacity(rq->cpu) <
> rq->rd->max_cpu_capacity' here. Why? I can see that with the standard
> setup (max CPU capacity equal 1024) which is what we probably use 100%
> of the time now. It might get useful again when Vincent will introduce
> his 'user space system pressure' implementation?

That's interesting because I'm doing the opposite in the user space
system pressure work that I'm preparing:
I keep something similar to (arch_scale_cpu_capacity(rq->cpu) <
rq->rd->max_cpu_capacity) but I remove check_cpu_capacity(rq, sd), which
seems to be useless because it's already used earlier in
nohz_balancer_kick().

>
> > }
>
> [...]
>
> > @@ -1423,8 +1418,8 @@ static void asym_cpu_capacity_scan(void)
> >
> > list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
> > if (cpumask_empty(cpu_capacity_span(entry))) {
> > - list_del(&entry->link);
> > - kfree(entry);
> > + list_del_rcu(&entry->link);
> > + call_rcu(&entry->rcu, free_asym_cap_entry);
>
> Looks like there could be brief moments in which one CPU capacity group
> of CPUs could be twice in asym_cap_list. I'm thinking about initial
> startup + max CPU frequency related adjustment of CPU capacity
> (init_cpu_capacity_callback()) for instance. Not sure if this is really
> an issue?
>
> [...]
>

2024-01-23 17:27:04

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
>
> From: Qais Yousef <[email protected]>
>
> If a misfit task is affined to a subset of the possible cpus, we need to
> verify that one of these cpus can fit it. Otherwise the load balancer
> code will continuously trigger needlessly leading the balance_interval
> to increase in return and eventually end up with a situation where real
> imbalances take a long time to address because of this impossible
> imbalance situation.
>
> This can happen in Android world where it's common for background tasks
> to be restricted to little cores.
>
> Similarly if we can't fit the biggest core, triggering misfit is
> pointless as it is the best we can ever get on this system.
>
> To be able to detect that; we use asym_cap_list to iterate through
> capacities in the system to see if the task is able to run at a higher
> capacity level based on its p->cpus_ptr. To do so safely, we convert the
> list to be RCU protected.
>
> To be able to iterate through capacity levels, export asym_cap_list to
> allow for fast traversal of all available capacity levels in the system.
>
> Test:
> =====
>
> Add
>
> trace_printk("balance_interval = %lu\n", interval)
>
> in get_sd_balance_interval().
>
> run
> if [ "$MASK" != "0" ]; then
> adb shell "taskset -a $MASK cat /dev/zero > /dev/null"
> fi
> sleep 10
> // parse ftrace buffer counting the occurrence of each valaue
>
> Where MASK is either:
>
> * 0: no busy task running
> * 1: busy task is pinned to 1 cpu; handled today to not cause
> misfit
> * f: busy task pinned to little cores, simulates busy background
> task, demonstrates the problem to be fixed
>
> Results:
> ========
>
> Note how occurrence of balance_interval = 128 overshoots for MASK = f.
>
> BEFORE
> ------
>
> MASK=0
>
> 1 balance_interval = 175
> 120 balance_interval = 128
> 846 balance_interval = 64
> 55 balance_interval = 63
> 215 balance_interval = 32
> 2 balance_interval = 31
> 2 balance_interval = 16
> 4 balance_interval = 8
> 1870 balance_interval = 4
> 65 balance_interval = 2
>
> MASK=1
>
> 27 balance_interval = 175
> 37 balance_interval = 127
> 840 balance_interval = 64
> 167 balance_interval = 63
> 449 balance_interval = 32
> 84 balance_interval = 31
> 304 balance_interval = 16
> 1156 balance_interval = 8
> 2781 balance_interval = 4
> 428 balance_interval = 2
>
> MASK=f
>
> 1 balance_interval = 175
> 1328 balance_interval = 128
> 44 balance_interval = 64
> 101 balance_interval = 63
> 25 balance_interval = 32
> 5 balance_interval = 31
> 23 balance_interval = 16
> 23 balance_interval = 8
> 4306 balance_interval = 4
> 177 balance_interval = 2
>
> AFTER
> -----
>
> Note how the high values almost disappear for all MASK values. The
> system has background tasks that could trigger the problem without
> simulate it even with MASK=0.
>
> MASK=0
>
> 103 balance_interval = 63
> 19 balance_interval = 31
> 194 balance_interval = 8
> 4827 balance_interval = 4
> 179 balance_interval = 2
>
> MASK=1
>
> 131 balance_interval = 63
> 1 balance_interval = 31
> 87 balance_interval = 8
> 3600 balance_interval = 4
> 7 balance_interval = 2
>
> MASK=f
>
> 8 balance_interval = 127
> 182 balance_interval = 63
> 3 balance_interval = 31
> 9 balance_interval = 16
> 415 balance_interval = 8
> 3415 balance_interval = 4
> 21 balance_interval = 2
>
> Signed-off-by: Qais Yousef <[email protected]>
> Signed-off-by: Qais Yousef (Google) <[email protected]>
> ---
> kernel/sched/fair.c | 65 ++++++++++++++++++++++++++++++++++-------
> kernel/sched/sched.h | 14 +++++++++
> kernel/sched/topology.c | 29 ++++++++----------
> 3 files changed, 81 insertions(+), 27 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bcea3d55d95d..0830ceb7ca07 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5065,17 +5065,61 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
>
> static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> {
> + unsigned long uclamp_min, uclamp_max;
> + unsigned long util, cpu_cap;
> + int cpu = cpu_of(rq);
> +
> if (!sched_asym_cpucap_active())
> return;
>
> - if (!p || p->nr_cpus_allowed == 1) {
> - rq->misfit_task_load = 0;
> - return;
> - }
> + if (!p || p->nr_cpus_allowed == 1)
> + goto out;
>
> - if (task_fits_cpu(p, cpu_of(rq))) {
> - rq->misfit_task_load = 0;
> - return;
> + cpu_cap = arch_scale_cpu_capacity(cpu);
> +
> + /* If we can't fit the biggest CPU, that's the best we can ever get. */
> + if (cpu_cap == SCHED_CAPACITY_SCALE)
> + goto out;
> +
> + uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> + uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> + util = task_util_est(p);
> +
> + if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> + goto out;
> +
> + /*
> + * If the task affinity is not set to default, make sure it is not
> + * restricted to a subset where no CPU can ever fit it. Triggering
> + * misfit in this case is pointless as it has no where better to move
> + * to. And it can lead to balance_interval to grow too high as we'll
> + * continuously fail to move it anywhere.
> + */
> + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> + bool has_fitting_cpu = false;
> + struct asym_cap_data *entry;
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(entry, &asym_cap_list, link) {

Do we really want to potentially do this loop at every pick_next_task?


> + if (entry->capacity > cpu_cap) {
> + cpumask_t *cpumask;
> +
> + if (clamped_util > entry->capacity)
> + continue;
> +
> + cpumask = cpu_capacity_span(entry);
> + if (!cpumask_intersects(p->cpus_ptr, cpumask))
> + continue;
> +
> + has_fitting_cpu = true;
> + break;
> + }
> + }
> + rcu_read_unlock();
> +
> + if (!has_fitting_cpu)
> + goto out;
> }
>
> /*
> @@ -5083,6 +5127,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> * task_h_load() returns 0.
> */
> rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
> + return;
> +out:
> + rq->misfit_task_load = 0;
> }
>
> #else /* CONFIG_SMP */
> @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> */
> static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> {
> - return rq->misfit_task_load &&
> - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> - check_cpu_capacity(rq, sd));
> + return rq->misfit_task_load && check_cpu_capacity(rq, sd);
> }
>
> /*
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index e58a54bda77d..a653017a1b9b 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -109,6 +109,20 @@ extern int sysctl_sched_rt_period;
> extern int sysctl_sched_rt_runtime;
> extern int sched_rr_timeslice;
>
> +/*
> + * Asymmetric CPU capacity bits
> + */
> +struct asym_cap_data {
> + struct list_head link;
> + struct rcu_head rcu;
> + unsigned long capacity;
> + unsigned long cpus[];
> +};
> +
> +extern struct list_head asym_cap_list;
> +
> +#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus)
> +
> /*
> * Helpers for converting nanosecond timing to jiffy resolution
> */
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 10d1391e7416..ba4a0b18ae25 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1329,24 +1329,13 @@ static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
> update_group_capacity(sd, cpu);
> }
>
> -/*
> - * Asymmetric CPU capacity bits
> - */
> -struct asym_cap_data {
> - struct list_head link;
> - unsigned long capacity;
> - unsigned long cpus[];
> -};
> -
> /*
> * Set of available CPUs grouped by their corresponding capacities
> * Each list entry contains a CPU mask reflecting CPUs that share the same
> * capacity.
> * The lifespan of data is unlimited.
> */
> -static LIST_HEAD(asym_cap_list);
> -
> -#define cpu_capacity_span(asym_data) to_cpumask((asym_data)->cpus)
> +LIST_HEAD(asym_cap_list);
>
> /*
> * Verify whether there is any CPU capacity asymmetry in a given sched domain.
> @@ -1386,6 +1375,12 @@ asym_cpu_capacity_classify(const struct cpumask *sd_span,
>
> }
>
> +static void free_asym_cap_entry(struct rcu_head *head)
> +{
> + struct asym_cap_data *entry = container_of(head, struct asym_cap_data, rcu);
> + kfree(entry);
> +}
> +
> static inline void asym_cpu_capacity_update_data(int cpu)
> {
> unsigned long capacity = arch_scale_cpu_capacity(cpu);
> @@ -1400,7 +1395,7 @@ static inline void asym_cpu_capacity_update_data(int cpu)
> if (WARN_ONCE(!entry, "Failed to allocate memory for asymmetry data\n"))
> return;
> entry->capacity = capacity;
> - list_add(&entry->link, &asym_cap_list);
> + list_add_rcu(&entry->link, &asym_cap_list);
> done:
> __cpumask_set_cpu(cpu, cpu_capacity_span(entry));
> }
> @@ -1423,8 +1418,8 @@ static void asym_cpu_capacity_scan(void)
>
> list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
> if (cpumask_empty(cpu_capacity_span(entry))) {
> - list_del(&entry->link);
> - kfree(entry);
> + list_del_rcu(&entry->link);
> + call_rcu(&entry->rcu, free_asym_cap_entry);
> }
> }
>
> @@ -1434,8 +1429,8 @@ static void asym_cpu_capacity_scan(void)
> */
> if (list_is_singular(&asym_cap_list)) {
> entry = list_first_entry(&asym_cap_list, typeof(*entry), link);
> - list_del(&entry->link);
> - kfree(entry);
> + list_del_rcu(&entry->link);
> + call_rcu(&entry->rcu, free_asym_cap_entry);
> }
> }
>
> --
> 2.34.1
>

2024-01-23 18:26:51

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 22/01/2024 19:02, Qais Yousef wrote:
> On 01/22/24 09:59, Dietmar Eggemann wrote:
>> On 05/01/2024 23:20, Qais Yousef wrote:
>>> From: Qais Yousef <[email protected]>

[...]

>>> + /*
>>> + * If the task affinity is not set to default, make sure it is not
>>> + * restricted to a subset where no CPU can ever fit it. Triggering
>>> + * misfit in this case is pointless as it has no where better to move
>>> + * to. And it can lead to balance_interval to grow too high as we'll
>>> + * continuously fail to move it anywhere.
>>> + */
>>> + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
>>
>> Shouldn't this be cpu_active_mask ?
>
> Hmm. So the intention was to check if the affinity was changed from default.
>
> If we hotplug all but little we could end up with the same problem, yes you're
> right.
>
> But if the affinity is set to only to littles and cpu_active_mask is only for
> littles too, then we'll also end up with the same problem as they both are
> equal.

Yes, that's true.

> Better to drop this check then? With the sorted list the common case should be
> quick to return as they'll have 1024 as a possible CPU.

Or you keep 'cpu_possible_mask' and rely on the fact that the
asym_cap_list entries are removed if those CPUs are hotplugged out. In
this case the !has_fitting_cpu path should prevent useless misfit load
balancing attempts.

[...]

>> What happen when we hotplug out all CPUs of one CPU capacity value?
>> IMHO, we don't call asym_cpu_capacity_scan() with !new_topology
>> (partition_sched_domains_locked()).
>
> Right. I missed that. We can add another intersection check against
> cpu_active_mask.
>
> I am assuming the skipping was done by design, not a bug that needs fixing?
> I see for suspend (cpuhp_tasks_frozen) the domains are rebuilt, but not for
> hotplug.

IMHO, it's by design. We set up asym_cap_list only when new_topology is
set (update_topology_flags_workfn() from init_cpu_capacity_callback() or
topology_init_cpu_capacity_cppc()), i.e. when the (max) CPU capacity can
change.
In all the other !new_topology cases we check `has_asym |= sd->flags &
SD_ASYM_CPUCAPACITY` and set sched_asym_cpucapacity accordingly in
build_sched_domains(). Before, we always reset sched_asym_cpucapacity in
detach_destroy_domains().
But now we would have to keep asym_cap_list in sync with the active CPUs,
I guess.

[...]

>>> #else /* CONFIG_SMP */
>>> @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
>>> */
>>> static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
>>> {
>>> - return rq->misfit_task_load &&
>>> - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
>>> - check_cpu_capacity(rq, sd));
>>> + return rq->misfit_task_load && check_cpu_capacity(rq, sd);
>>
>> You removed 'arch_scale_cpu_capacity(rq->cpu) <
>> rq->rd->max_cpu_capacity' here. Why? I can see that with the standard
>
> Based on Pierre review since we no longer trigger misfit for big cores.
> I thought Pierre's remark was correct so did the change in v3

Ah, this is the replacement:

- if (task_fits_cpu(p, cpu_of(rq))) { <- still MF for util > 0.8 * 1024
- rq->misfit_task_load = 0;
- return;
+ cpu_cap = arch_scale_cpu_capacity(cpu);
+
+ /* If we can't fit the biggest CPU, that's the best we can ever get */
+ if (cpu_cap == SCHED_CAPACITY_SCALE)
+ goto out;

>
> https://lore.kernel.org/lkml/[email protected]/
>
>> setup (max CPU capacity equal 1024) which is what we probably use 100%
>> of the time now. It might get useful again when Vincent will introduce
>> his 'user space system pressure' implementation?
>
> I don't mind putting it back if you think it'd be required again in the near
> future. I still didn't get a chance to look at Vincent patches properly, but if
> there's a clash let's reduce the work.

Vincent did already comment on this in this thread.

[...]

>>> @@ -1423,8 +1418,8 @@ static void asym_cpu_capacity_scan(void)
>>>
>>> list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
>>> if (cpumask_empty(cpu_capacity_span(entry))) {
>>> - list_del(&entry->link);
>>> - kfree(entry);
>>> + list_del_rcu(&entry->link);
>>> + call_rcu(&entry->rcu, free_asym_cap_entry);
>>
>> Looks like there could be brief moments in which one CPU capacity group
>> of CPUs could be twice in asym_cap_list. I'm thinking about initial
>> startup + max CPU frequency related adjustment of CPU capacity
>> (init_cpu_capacity_callback()) for instance. Not sure if this is really
>> an issue?
>
> I don't think so. As long as the reader sees a consistent value and no crashes
> ensued, a momentarily wrong decision in transient or extra work is fine IMO.
> I don't foresee a big impact.

OK.

2024-01-24 22:30:53

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/23/24 09:26, Vincent Guittot wrote:
> On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
> >
> > From: Qais Yousef <[email protected]>
> >
> > If a misfit task is affined to a subset of the possible cpus, we need to
> > verify that one of these cpus can fit it. Otherwise the load balancer
> > code will continuously trigger needlessly leading the balance_interval
> > to increase in return and eventually end up with a situation where real
> > imbalances take a long time to address because of this impossible
> > imbalance situation.
>
> If your problem is about increasing balance_interval, it would be
> better to not increase the interval is such case.
> I mean that we are able to detect misfit_task conditions for the
> periodic load balance so we should be able to not increase the
> interval in such cases.
>
> If I'm not wrong, your problem only happens when the system is
> overutilized and we have disable EAS

Yes and no. There are two concerns here:

1.

So this patch is a generalized form of 0ae78eec8aa6 ("sched/eas: Don't update
misfit status if the task is pinned"), which is when I originally noticed the
problem; this patch was written alongside it.

We have unlinked misfit from overutilized since then.

And to be honest I am not sure if the flattening of the topology matters either,
since I first noticed this on Juno, which doesn't have a flat topology.

FWIW I can still reproduce this, but I have a different setup now. On an M1 mac
mini, if I spawn a busy task affined to the littles and then expand the mask to
include a single big core, I see big delays (>500ms) without the patch. But with
the patch it moves within a few ms. The delay without the patch is too large and
I can't explain it. So the worry here is that misfit migration in general is not
happening fast enough because of these fake misfit cases.

I did hit cases where, even with this patch, I saw big delays sometimes. I have
no clue why this happens. So there are potentially more problems to chase.

My expectation is that newidle balance should be able to pull a misfit task
regardless of balance_interval. So the system has to be really busy or really
quiet to notice the delays. I think prior to the flat topology this pull was not
guaranteed, but with the flat topology it should happen.

On this system, if I expand the mask to all CPUs (instead of littles + a single
big), the issue is not as easy to reproduce, but I captured 35+ms delays
- which is long if this task was carrying important work and needed to
upmigrate. I thought newidle balance would be more likely to pull it sooner, but
I am not 100% sure.

It's a 6.6 kernel I am testing with.

2.

Here, yes, the concern is that when we are overutilized and load balancing is
required, this unnecessarily long delay can cause potential problems.


Cheers

--
Qais Yousef

2024-01-24 22:45:27

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/23/24 18:22, Vincent Guittot wrote:

> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index bcea3d55d95d..0830ceb7ca07 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5065,17 +5065,61 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
> >
> > static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> > {
> > + unsigned long uclamp_min, uclamp_max;
> > + unsigned long util, cpu_cap;
> > + int cpu = cpu_of(rq);
> > +
> > if (!sched_asym_cpucap_active())
> > return;
> >
> > - if (!p || p->nr_cpus_allowed == 1) {
> > - rq->misfit_task_load = 0;
> > - return;
> > - }
> > + if (!p || p->nr_cpus_allowed == 1)
> > + goto out;
> >
> > - if (task_fits_cpu(p, cpu_of(rq))) {
> > - rq->misfit_task_load = 0;
> > - return;
> > + cpu_cap = arch_scale_cpu_capacity(cpu);
> > +
> > + /* If we can't fit the biggest CPU, that's the best we can ever get. */
> > + if (cpu_cap == SCHED_CAPACITY_SCALE)
> > + goto out;
> > +
> > + uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> > + uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> > + util = task_util_est(p);
> > +
> > + if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> > + goto out;
> > +
> > + /*
> > + * If the task affinity is not set to default, make sure it is not
> > + * restricted to a subset where no CPU can ever fit it. Triggering
> > + * misfit in this case is pointless as it has no where better to move
> > + * to. And it can lead to balance_interval to grow too high as we'll
> > + * continuously fail to move it anywhere.
> > + */
> > + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> > + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> > + bool has_fitting_cpu = false;
> > + struct asym_cap_data *entry;
> > +
> > + rcu_read_lock();
> > + list_for_each_entry_rcu(entry, &asym_cap_list, link) {
>
> Do we really want to potentially do this loop at every pick_next task ?

The common case should return quickly, as the biggest CPU should be present in
every task's affinity by default. And after sorting, the biggest CPU will be the
first entry and we should return after one check.
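
To illustrate the early exit (a rough sketch only, not the exact hunk, which is
truncated in the quote above; it assumes cpu_capacity_span() is visible to
fair.c and uses a simplified fit check):

	rcu_read_lock();
	list_for_each_entry_rcu(entry, &asym_cap_list, link) {
		/* Skip capacity levels the task is not allowed to run on. */
		if (!cpumask_intersects(p->cpus_ptr, cpu_capacity_span(entry)))
			continue;

		/*
		 * With patch 2/2 the list is sorted by descending capacity,
		 * so a task allowed on the biggest CPUs terminates the walk
		 * after this first comparison.
		 */
		if (entry->capacity >= clamped_util) {
			has_fitting_cpu = true;
			break;
		}
	}
	rcu_read_unlock();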

Could we move the update to another less expensive location instead?

We could try to do better tracking for tasks that have their affinity changed,
but I am not keen on sprinkling more complexity elsewhere to deal with this.

We could keep the status quo and just prevent the misfit load balancing from
incrementing nr_failed, similar to newidle_balance, too. I think this should
have a similar effect. Not ideal, but if this is still considered too expensive
I can't think of other options that don't look ugly to me FWIW.
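
Just to illustrate what that alternative would look like conceptually, inside
load_balance() where the failure counter is bumped (misfit_is_impossible() is
a made-up predicate for the sketch, not an existing helper):

		/*
		 * Sketch: treat an impossible misfit like a newidle attempt
		 * and skip the failure accounting, so balance_interval does
		 * not keep doubling for a task that has nowhere better to go.
		 */
		if (idle != CPU_NEWLY_IDLE && !misfit_is_impossible(env))
			sd->nr_balance_failed++;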


Thanks

--
Qais Yousef

2024-01-24 22:49:38

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/23/24 18:07, Dietmar Eggemann wrote:
> On 22/01/2024 19:02, Qais Yousef wrote:
> > On 01/22/24 09:59, Dietmar Eggemann wrote:
> >> On 05/01/2024 23:20, Qais Yousef wrote:
> >>> From: Qais Yousef <[email protected]>
>
> [...]
>
> >>> + /*
> >>> + * If the task affinity is not set to default, make sure it is not
> >>> + * restricted to a subset where no CPU can ever fit it. Triggering
> >>> + * misfit in this case is pointless as it has no where better to move
> >>> + * to. And it can lead to balance_interval to grow too high as we'll
> >>> + * continuously fail to move it anywhere.
> >>> + */
> >>> + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> >>
> >> Shouldn't this be cpu_active_mask ?
> >
> > Hmm. So the intention was to check if the affinity was changed from default.
> >
> > If we hotplug all but little we could end up with the same problem, yes you're
> > right.
> >
> > But if the affinity is set to only to littles and cpu_active_mask is only for
> > littles too, then we'll also end up with the same problem as they both are
> > equal.
>
> Yes, that's true.
>
> > Better to drop this check then? With the sorted list the common case should be
> > quick to return as they'll have 1024 as a possible CPU.
>
> Or you keep 'cpu_possible_mask' and rely on the fact that the
> asym_cap_list entries are removed if those CPUs are hotplugged out. In
> this case the !has_fitting_cpu path should prevent useless Misfit load
> balancing approaches.

IIUC this removal doesn't happen today, as outlined below, but it is something
we need to do, right? That would be better, yes.

>
> [...]
>
> >> What happen when we hotplug out all CPUs of one CPU capacity value?
> >> IMHO, we don't call asym_cpu_capacity_scan() with !new_topology
> >> (partition_sched_domains_locked()).
> >
> > Right. I missed that. We can add another intersection check against
> > cpu_active_mask.
> >
> > I am assuming the skipping was done by design, not a bug that needs fixing?
> > I see for suspend (cpuhp_tasks_frozen) the domains are rebuilt, but not for
> > hotplug.
>
> IMHO, it's by design. We setup asym_cap_list only when new_topology is
> set (update_topology_flags_workfn() from init_cpu_capacity_callback() or
> topology_init_cpu_capacity_cppc()). I.e. when the (max) CPU capacity can
> change.
> In all the other !new_topology cases we check `has_asym |= sd->flags &
> SD_ASYM_CPUCAPACITY` and set sched_asym_cpucapacity accordingly in
> build_sched_domains(). Before we always reset sched_asym_cpucapacity in
> detach_destroy_domains().
> But now we would have to keep asym_cap_list in sync with the active CPUs
> I guess.

Okay, so you suggest we need to update the code to keep it in sync. Let's first
see whether Vincent is satisfied with this list traversal or whether we need to
go another way :-)

I think it is worth having this asym_capacity list available. It has seemed
several times that we needed it, and only a little work is required to make it
available for potential future users. Even if we don't merge this immediately.

>
> [...]
>
> >>> #else /* CONFIG_SMP */
> >>> @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> >>> */
> >>> static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> >>> {
> >>> - return rq->misfit_task_load &&
> >>> - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> >>> - check_cpu_capacity(rq, sd));
> >>> + return rq->misfit_task_load && check_cpu_capacity(rq, sd);
> >>
> >> You removed 'arch_scale_cpu_capacity(rq->cpu) <
> >> rq->rd->max_cpu_capacity' here. Why? I can see that with the standard
> >
> > Based on Pierre review since we no longer trigger misfit for big cores.
> > I thought Pierre's remark was correct so did the change in v3
>
> Ah, this is the replacement:
>
> - if (task_fits_cpu(p, cpu_of(rq))) { <- still MF for util > 0.8 * 1024
> - rq->misfit_task_load = 0;
> - return;
> + cpu_cap = arch_scale_cpu_capacity(cpu);
> +
> + /* If we can't fit the biggest CPU, that's the best we can ever get */
> + if (cpu_cap == SCHED_CAPACITY_SCALE)
> + goto out;

Yep.

>
> >
> > https://lore.kernel.org/lkml/[email protected]/
> >
> >> setup (max CPU capacity equal 1024) which is what we probably use 100%
> >> of the time now. It might get useful again when Vincent will introduce
> >> his 'user space system pressure' implementation?
> >
> > I don't mind putting it back if you think it'd be required again in the near
> > future. I still didn't get a chance to look at Vincent patches properly, but if
> > there's a clash let's reduce the work.
>
> Vincent did already comment on this in this thread.
>
> [...]
>
> >>> @@ -1423,8 +1418,8 @@ static void asym_cpu_capacity_scan(void)
> >>>
> >>> list_for_each_entry_safe(entry, next, &asym_cap_list, link) {
> >>> if (cpumask_empty(cpu_capacity_span(entry))) {
> >>> - list_del(&entry->link);
> >>> - kfree(entry);
> >>> + list_del_rcu(&entry->link);
> >>> + call_rcu(&entry->rcu, free_asym_cap_entry);
> >>
> >> Looks like there could be brief moments in which one CPU capacity group
> >> of CPUs could be twice in asym_cap_list. I'm thinking about initial
> >> startup + max CPU frequency related adjustment of CPU capacity
> >> (init_cpu_capacity_callback()) for instance. Not sure if this is really
> >> an issue?
> >
> > I don't think so. As long as the reader sees a consistent value and no crashes
> > ensued, a momentarily wrong decision in transient or extra work is fine IMO.
> > I don't foresee a big impact.
>
> OK.

Thanks!

--
Qais Yousef

2024-01-24 22:49:56

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/23/24 09:32, Vincent Guittot wrote:

> > > @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> > > */
> > > static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> > > {
> > > - return rq->misfit_task_load &&
> > > - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> > > - check_cpu_capacity(rq, sd));
> > > + return rq->misfit_task_load && check_cpu_capacity(rq, sd);
> >
> > You removed 'arch_scale_cpu_capacity(rq->cpu) <
> > rq->rd->max_cpu_capacity' here. Why? I can see that with the standard
> > setup (max CPU capacity equal 1024) which is what we probably use 100%
> > of the time now. It might get useful again when Vincent will introduce
> > his 'user space system pressure' implementation?
>
> That's interesting because I'm doing the opposite in the user space
> system pressure that I'm preparing:
> I keep something similar to (arch_scale_cpu_capacity(rq->cpu) <
> rq->rd->max_cpu_capacity but I remove check_cpu_capacity(rq, sd) which
> seems to be useless because it's already used earlier in
> nohz_balancer_kick()

Okay. I need to look at your patches anyway. I can potentially rebase on top of
your series.


Cheers

--
Qais Yousef

2024-01-25 10:36:13

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 24/01/2024 22:43, Qais Yousef wrote:
> On 01/23/24 18:07, Dietmar Eggemann wrote:
>> On 22/01/2024 19:02, Qais Yousef wrote:
>>> On 01/22/24 09:59, Dietmar Eggemann wrote:
>>>> On 05/01/2024 23:20, Qais Yousef wrote:
>>>>> From: Qais Yousef <[email protected]>

[...]

>>>> What happen when we hotplug out all CPUs of one CPU capacity value?
>>>> IMHO, we don't call asym_cpu_capacity_scan() with !new_topology
>>>> (partition_sched_domains_locked()).
>>>
>>> Right. I missed that. We can add another intersection check against
>>> cpu_active_mask.
>>>
>>> I am assuming the skipping was done by design, not a bug that needs fixing?
>>> I see for suspend (cpuhp_tasks_frozen) the domains are rebuilt, but not for
>>> hotplug.
>>
>> IMHO, it's by design. We setup asym_cap_list only when new_topology is
>> set (update_topology_flags_workfn() from init_cpu_capacity_callback() or
>> topology_init_cpu_capacity_cppc()). I.e. when the (max) CPU capacity can
>> change.
>> In all the other !new_topology cases we check `has_asym |= sd->flags &
>> SD_ASYM_CPUCAPACITY` and set sched_asym_cpucapacity accordingly in
>> build_sched_domains(). Before we always reset sched_asym_cpucapacity in
>> detach_destroy_domains().
>> But now we would have to keep asym_cap_list in sync with the active CPUs
>> I guess.
>
> Okay, so you suggest we need to update the code to keep it in sync. Let's see
> first if Vincent is satisfied with this list traversal or we need to go another
> way :-)

Yes, if preventing the 'increase of balance_interval' will cure this
issue as well, then this will definitely be the less invasive fix.

Can you not easily do a 'perf bench sched messaging -g X -l Y' test on
your M1 to get some numbers behind this additional list traversal in
pick_next_task_fair()?

> I think it is worth having this asym_capacity list available. It seemed several
> times we needed it and just a little work is required to make it available for
> potential future users. Even if we don't merge immediately.

I agree. It would give us this ordered (by max CPU capacity) list of
CPUs to iterate over.

[...]

2024-01-25 17:40:56

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Wed, 24 Jan 2024 at 23:30, Qais Yousef <[email protected]> wrote:
>
> On 01/23/24 09:26, Vincent Guittot wrote:
> > On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
> > >
> > > From: Qais Yousef <[email protected]>
> > >
> > > If a misfit task is affined to a subset of the possible cpus, we need to
> > > verify that one of these cpus can fit it. Otherwise the load balancer
> > > code will continuously trigger needlessly leading the balance_interval
> > > to increase in return and eventually end up with a situation where real
> > > imbalances take a long time to address because of this impossible
> > > imbalance situation.
> >
> > If your problem is about increasing balance_interval, it would be
> > better to not increase the interval is such case.
> > I mean that we are able to detect misfit_task conditions for the
> > periodic load balance so we should be able to not increase the
> > interval in such cases.
> >
> > If I'm not wrong, your problem only happens when the system is
> > overutilized and we have disable EAS
>
> Yes and no. There are two concerns here:
>
> 1.
>
> So this patch is a generalized form of 0ae78eec8aa6 ("sched/eas: Don't update
> misfit status if the task is pinned") which is when I originally noticed the
> problem and this patch was written along side it.
>
> We have unlinked misfit from overutilized since then.
>
> And to be honest I am not sure if flattening of topology matters too since
> I first noticed this, which was on Juno which doesn't have flat topology.
>
> FWIW I can still reproduce this, but I have a different setup now. On M1 mac
> mini if I spawn a busy task affined to littles then expand the mask for
> a single big core; I see big delays (>500ms) without the patch. But with the
> patch it moves in few ms. The delay without the patch is too large and I can't
> explain it. So the worry here is that generally misfit migration not happening
> fast enough due to this fake misfit cases.

I tried a similar scenario on RB5 but I don't see any difference with
your patch. And that could be me not testing it correctly...

I set the affinity of an always-running task to cpu[0-3] for a few
seconds, then extend it to [0-3,7], and the time to migrate is almost
the same.

I'm using tip/sched/core + [0]

[0] https://lore.kernel.org/all/[email protected]/


>
> I did hit issues where with this patch I saw big delays sometimes. I have no
> clue why this happens. So there are potentially more problems to chase.
>
> My expectations that newidle balance should be able to pull misfit regardless
> of balance_interval. So the system has to be really busy or really quite to
> notice delays. I think prior to flat topology this pull was not guaranteed, but
> with flat topology it should happen.
>
> On this system if I expand the mask to all cpus (instead of littles + single
> big), the issue is not as easy to reproduce, but I captured 35+ms delays
> - which is long if this task was carrying important work and needs to
> upmigrate. I thought newidle balance is more likely to pull it sooner, but I am
> not 100% sure.
>
> It's a 6.6 kernel I am testing with.
>
> 2.
>
> Here yes the concern is that when we are overutilized and load balance is
> required, this unnecessarily long delay can cause potential problems.
>
>
> Cheers
>
> --
> Qais Yousef

2024-01-25 17:44:35

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Wed, 24 Jan 2024 at 23:46, Qais Yousef <[email protected]> wrote:
>
> On 01/23/24 09:32, Vincent Guittot wrote:
>
> > > > @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> > > > */
> > > > static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> > > > {
> > > > - return rq->misfit_task_load &&
> > > > - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> > > > - check_cpu_capacity(rq, sd));
> > > > + return rq->misfit_task_load && check_cpu_capacity(rq, sd);

Coming back to this:
With your change above, misfit can't kick an idle load balance and must
wait for the CPU capacity to be noticeably reduced by something else.
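
For reference, the kick path in question looks roughly like this in
nohz_balancer_kick() (paraphrased from a recent kernel, not an exact quote):

	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
	if (sd) {
		/*
		 * When ASYM_CPUCAPACITY, see if there's a higher capacity
		 * CPU to run the misfit task on.
		 */
		if (check_misfit_status(rq, sd)) {
			flags = NOHZ_STATS_KICK | NOHZ_BALANCE_KICK;
			goto unlock;
		}
	}

With the max-capacity term removed, only a noticeable capacity reduction on the
current CPU can make this fire.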

> > >
> > > You removed 'arch_scale_cpu_capacity(rq->cpu) <
> > > rq->rd->max_cpu_capacity' here. Why? I can see that with the standard
> > > setup (max CPU capacity equal 1024) which is what we probably use 100%
> > > of the time now. It might get useful again when Vincent will introduce
> > > his 'user space system pressure' implementation?
> >
> > That's interesting because I'm doing the opposite in the user space
> > system pressure that I'm preparing:
> > I keep something similar to (arch_scale_cpu_capacity(rq->cpu) <
> > rq->rd->max_cpu_capacity but I remove check_cpu_capacity(rq, sd) which
> > seems to be useless because it's already used earlier in
> > nohz_balancer_kick()
>
> Okay. I need to look at your patches anyway. I can potentially rebase on top of
> your series.
>
>
> Cheers
>
> --
> Qais Yousef

2024-01-25 17:54:45

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Wed, 24 Jan 2024 at 23:38, Qais Yousef <[email protected]> wrote:
>
> On 01/23/24 18:22, Vincent Guittot wrote:
>
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index bcea3d55d95d..0830ceb7ca07 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -5065,17 +5065,61 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
> > >
> > > static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> > > {
> > > + unsigned long uclamp_min, uclamp_max;
> > > + unsigned long util, cpu_cap;
> > > + int cpu = cpu_of(rq);
> > > +
> > > if (!sched_asym_cpucap_active())
> > > return;
> > >
> > > - if (!p || p->nr_cpus_allowed == 1) {
> > > - rq->misfit_task_load = 0;
> > > - return;
> > > - }
> > > + if (!p || p->nr_cpus_allowed == 1)
> > > + goto out;
> > >
> > > - if (task_fits_cpu(p, cpu_of(rq))) {
> > > - rq->misfit_task_load = 0;
> > > - return;
> > > + cpu_cap = arch_scale_cpu_capacity(cpu);
> > > +
> > > + /* If we can't fit the biggest CPU, that's the best we can ever get. */
> > > + if (cpu_cap == SCHED_CAPACITY_SCALE)
> > > + goto out;
> > > +
> > > + uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> > > + uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> > > + util = task_util_est(p);
> > > +
> > > + if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> > > + goto out;
> > > +
> > > + /*
> > > + * If the task affinity is not set to default, make sure it is not
> > > + * restricted to a subset where no CPU can ever fit it. Triggering
> > > + * misfit in this case is pointless as it has no where better to move
> > > + * to. And it can lead to balance_interval to grow too high as we'll
> > > + * continuously fail to move it anywhere.
> > > + */
> > > + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> > > + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> > > + bool has_fitting_cpu = false;
> > > + struct asym_cap_data *entry;
> > > +
> > > + rcu_read_lock();
> > > + list_for_each_entry_rcu(entry, &asym_cap_list, link) {
> >
> > Do we really want to potentially do this loop at every pick_next task ?
>
> The common case should return quickly as the biggest CPU should be present
> in every task by default. And after sorting the biggest CPU will be the first
> entry and we should return after one check.
>
> Could we move the update to another less expensive location instead?

TBH, I don't know. I would need time to think about this...
Maybe when we set the new affinity of the task.

>
> We could try to do better tracking for CPUs that has their affinity changed,
> but I am not keen on sprinkling more complexity else where to deal with this.
>
> We could keep the status quouo and just prevent the misfit load balancing from
> increment nr_failed similar to newidle_balance too. I think this should have

One main advantage is that we put the complexity out of the fast path

> a similar effect. Not ideal but if this is considered too expensive still
> I can't think of other options that don't look ugly to me FWIW.
>
>
> Thanks
>
> --
> Qais Yousef

2024-01-26 00:37:29

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/25/24 18:44, Vincent Guittot wrote:
> On Wed, 24 Jan 2024 at 23:46, Qais Yousef <[email protected]> wrote:
> >
> > On 01/23/24 09:32, Vincent Guittot wrote:
> >
> > > > > @@ -9583,9 +9630,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
> > > > > */
> > > > > static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> > > > > {
> > > > > - return rq->misfit_task_load &&
> > > > > - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> > > > > - check_cpu_capacity(rq, sd));
> > > > > + return rq->misfit_task_load && check_cpu_capacity(rq, sd);
>
> Coming back to this:
> With your change above, misfit can't kick an idle load balance and
> must wait for the cpu capacity being noticeably reduced by something
> else

Good catch, yes. It's a subtle change. We need to keep this, and we should add
a comment that we move immediately for all CPUs except the biggest one, where we
need check_cpu_capacity().
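
Something along these lines, i.e. restoring the removed check with a comment on
top (a sketch of the direction, not a final hunk):

static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
{
	/*
	 * A misfit on any CPU below the biggest capacity level should kick an
	 * idle load balance immediately; on the biggest CPUs we only care if
	 * the capacity is noticeably reduced.
	 */
	return rq->misfit_task_load &&
	       (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
		check_cpu_capacity(rq, sd));
}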

2024-01-26 00:47:37

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/25/24 10:35, Dietmar Eggemann wrote:
> On 24/01/2024 22:43, Qais Yousef wrote:
> > On 01/23/24 18:07, Dietmar Eggemann wrote:
> >> On 22/01/2024 19:02, Qais Yousef wrote:
> >>> On 01/22/24 09:59, Dietmar Eggemann wrote:
> >>>> On 05/01/2024 23:20, Qais Yousef wrote:
> >>>>> From: Qais Yousef <[email protected]>
>
> [...]
>
> >>>> What happen when we hotplug out all CPUs of one CPU capacity value?
> >>>> IMHO, we don't call asym_cpu_capacity_scan() with !new_topology
> >>>> (partition_sched_domains_locked()).
> >>>
> >>> Right. I missed that. We can add another intersection check against
> >>> cpu_active_mask.
> >>>
> >>> I am assuming the skipping was done by design, not a bug that needs fixing?
> >>> I see for suspend (cpuhp_tasks_frozen) the domains are rebuilt, but not for
> >>> hotplug.
> >>
> >> IMHO, it's by design. We setup asym_cap_list only when new_topology is
> >> set (update_topology_flags_workfn() from init_cpu_capacity_callback() or
> >> topology_init_cpu_capacity_cppc()). I.e. when the (max) CPU capacity can
> >> change.
> >> In all the other !new_topology cases we check `has_asym |= sd->flags &
> >> SD_ASYM_CPUCAPACITY` and set sched_asym_cpucapacity accordingly in
> >> build_sched_domains(). Before we always reset sched_asym_cpucapacity in
> >> detach_destroy_domains().
> >> But now we would have to keep asym_cap_list in sync with the active CPUs
> >> I guess.
> >
> > Okay, so you suggest we need to update the code to keep it in sync. Let's see
> > first if Vincent is satisfied with this list traversal or we need to go another
> > way :-)
>
> Yes, if preventing the 'increase of balance_interval' will cure this
> issue as well, then this will be definitely the less invasive fix.
>
> Can you not easily do a 'perf bench sched messaging -g X -l Y' test on
> you M1 to get some numbers behind this additional list traversal in
> pick_next_task_fair()?

I can do that. But I noticed there are sometimes unexplainable variations in
the numbers when moving between kernels, and I am not 100% sure whether they
are due to random unrelated changes in caching behavior or due to something
I've done, i.e. they get better or worse in unexpected ways. I run `perf bench
sched pipe`, which is similar enough I guess? I don't know how to fill these
-g and -l numbers sensibly.

I had issues when running perf to collect stats. But maybe I wasn't specifying
the right options. I will try again.

>
> > I think it is worth having this asym_capacity list available. It seemed several
> > times we needed it and just a little work is required to make it available for
> > potential future users. Even if we don't merge immediately.
>
> I agree. It would give us this ordered (by max CPU capacity) list of
> CPUs to iterate over.

Okay. I need to figure out how to fix this hotplug issue to keep the list in
sync.


Thanks

--
Qais Yousef

2024-01-26 01:46:13

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/25/24 18:40, Vincent Guittot wrote:
> On Wed, 24 Jan 2024 at 23:30, Qais Yousef <[email protected]> wrote:
> >
> > On 01/23/24 09:26, Vincent Guittot wrote:
> > > On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
> > > >
> > > > From: Qais Yousef <[email protected]>
> > > >
> > > > If a misfit task is affined to a subset of the possible cpus, we need to
> > > > verify that one of these cpus can fit it. Otherwise the load balancer
> > > > code will continuously trigger needlessly leading the balance_interval
> > > > to increase in return and eventually end up with a situation where real
> > > > imbalances take a long time to address because of this impossible
> > > > imbalance situation.
> > >
> > > If your problem is about increasing balance_interval, it would be
> > > better to not increase the interval is such case.
> > > I mean that we are able to detect misfit_task conditions for the
> > > periodic load balance so we should be able to not increase the
> > > interval in such cases.
> > >
> > > If I'm not wrong, your problem only happens when the system is
> > > overutilized and we have disable EAS
> >
> > Yes and no. There are two concerns here:
> >
> > 1.
> >
> > So this patch is a generalized form of 0ae78eec8aa6 ("sched/eas: Don't update
> > misfit status if the task is pinned") which is when I originally noticed the
> > problem and this patch was written along side it.
> >
> > We have unlinked misfit from overutilized since then.
> >
> > And to be honest I am not sure if flattening of topology matters too since
> > I first noticed this, which was on Juno which doesn't have flat topology.
> >
> > FWIW I can still reproduce this, but I have a different setup now. On M1 mac
> > mini if I spawn a busy task affined to littles then expand the mask for
> > a single big core; I see big delays (>500ms) without the patch. But with the
> > patch it moves in few ms. The delay without the patch is too large and I can't
> > explain it. So the worry here is that generally misfit migration not happening
> > fast enough due to this fake misfit cases.
>
> I tried a similar scenario on RB5 but I don't see any difference with
> your patch. And that could be me not testing it correctly...
>
> I set the affinity of always running task to cpu[0-3] for a few
> seconds then extend it to [0-3,7] and the time to migrate is almost
> the same.

That matches what I do.

I write a trace_marker when I change affinity to help see when it should move.

>
> I'm using tip/sched/core + [0]
>
> [0] https://lore.kernel.org/all/[email protected]/

I tried on a Pinebook Pro, which has an rk3399, and I can't reproduce there
either.

On the M1 I get two sched domains, MC and DIE. But the pine64 has only MC.
Could this be the difference, since load balancing has sched-domain
dependencies?

It seems we flatten topologies but not sched domains. I see all CPUs shown as
core_siblings. The DT for Apple Silicon sets clusters in the cpu-map - which the
flat-topology code seems to use to detect the LLC correctly, but the sched
domains are still not flattened. Is this a bug? I thought we would end up with
one sched domain.

TBH I had a bit of confirmation bias that this is a problem based on the fix
(0ae78eec8aa6) that we had in the past. So for verification I looked at
balance_interval and this reproducer, which is not the same as the original one
and might be exposing another problem, and I didn't think twice about it.

The patch did help though. So maybe there is more than one problem. The delays
are longer than I expected, as I tried to highlight. I'll continue to probe.

2024-01-26 02:08:30

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/25/24 18:50, Vincent Guittot wrote:
> On Wed, 24 Jan 2024 at 23:38, Qais Yousef <[email protected]> wrote:
> >
> > On 01/23/24 18:22, Vincent Guittot wrote:
> >
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index bcea3d55d95d..0830ceb7ca07 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -5065,17 +5065,61 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
> > > >
> > > > static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> > > > {
> > > > + unsigned long uclamp_min, uclamp_max;
> > > > + unsigned long util, cpu_cap;
> > > > + int cpu = cpu_of(rq);
> > > > +
> > > > if (!sched_asym_cpucap_active())
> > > > return;
> > > >
> > > > - if (!p || p->nr_cpus_allowed == 1) {
> > > > - rq->misfit_task_load = 0;
> > > > - return;
> > > > - }
> > > > + if (!p || p->nr_cpus_allowed == 1)
> > > > + goto out;
> > > >
> > > > - if (task_fits_cpu(p, cpu_of(rq))) {
> > > > - rq->misfit_task_load = 0;
> > > > - return;
> > > > + cpu_cap = arch_scale_cpu_capacity(cpu);
> > > > +
> > > > + /* If we can't fit the biggest CPU, that's the best we can ever get. */
> > > > + if (cpu_cap == SCHED_CAPACITY_SCALE)
> > > > + goto out;
> > > > +
> > > > + uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> > > > + uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> > > > + util = task_util_est(p);
> > > > +
> > > > + if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> > > > + goto out;
> > > > +
> > > > + /*
> > > > + * If the task affinity is not set to default, make sure it is not
> > > > + * restricted to a subset where no CPU can ever fit it. Triggering
> > > > + * misfit in this case is pointless as it has no where better to move
> > > > + * to. And it can lead to balance_interval to grow too high as we'll
> > > > + * continuously fail to move it anywhere.
> > > > + */
> > > > + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> > > > + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> > > > + bool has_fitting_cpu = false;
> > > > + struct asym_cap_data *entry;
> > > > +
> > > > + rcu_read_lock();
> > > > + list_for_each_entry_rcu(entry, &asym_cap_list, link) {
> > >
> > > Do we really want to potentially do this loop at every pick_next task ?
> >
> > The common case should return quickly as the biggest CPU should be present
> > in every task by default. And after sorting the biggest CPU will be the first
> > entry and we should return after one check.
> >
> > Could we move the update to another less expensive location instead?
>
> TBH, I don't know. I would need time to think about this...
> May be when we set the new affinity of the task

I was thinking of actually calling update_misfit_status() from another less
expensive location.

We can certainly do something to make the check less expensive if we must do it
in pick_next_task(). For example, set a flag if the task is restricted to a
single capacity value, and store the highest capacity its affinity spans. But
with cpuset v1, v2 and hotplug I am wary that might get messy.

>
> >
> > We could try to do better tracking for CPUs that has their affinity changed,
> > but I am not keen on sprinkling more complexity else where to deal with this.
> >
> > We could keep the status quouo and just prevent the misfit load balancing from
> > increment nr_failed similar to newidle_balance too. I think this should have
>
> One main advantage is that we put the complexity out of the fast path

How about when we update_load_avg()? After all it's the util that decides if we
become misfit. So it makes sense to do the check when we update the util for
the task.

Which reminds me of another bug. We need to call update_misfit_status() when
uclamp values change too.

>
> > a similar effect. Not ideal but if this is considered too expensive still
> > I can't think of other options that don't look ugly to me FWIW.
> >
> >
> > Thanks
> >
> > --
> > Qais Yousef

2024-01-26 14:12:28

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Fri, 26 Jan 2024 at 02:46, Qais Yousef <[email protected]> wrote:
>
> On 01/25/24 18:40, Vincent Guittot wrote:
> > On Wed, 24 Jan 2024 at 23:30, Qais Yousef <[email protected]> wrote:
> > >
> > > On 01/23/24 09:26, Vincent Guittot wrote:
> > > > On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
> > > > >
> > > > > From: Qais Yousef <[email protected]>
> > > > >
> > > > > If a misfit task is affined to a subset of the possible cpus, we need to
> > > > > verify that one of these cpus can fit it. Otherwise the load balancer
> > > > > code will continuously trigger needlessly leading the balance_interval
> > > > > to increase in return and eventually end up with a situation where real
> > > > > imbalances take a long time to address because of this impossible
> > > > > imbalance situation.
> > > >
> > > > If your problem is about increasing balance_interval, it would be
> > > > better to not increase the interval is such case.
> > > > I mean that we are able to detect misfit_task conditions for the
> > > > periodic load balance so we should be able to not increase the
> > > > interval in such cases.
> > > >
> > > > If I'm not wrong, your problem only happens when the system is
> > > > overutilized and we have disable EAS
> > >
> > > Yes and no. There are two concerns here:
> > >
> > > 1.
> > >
> > > So this patch is a generalized form of 0ae78eec8aa6 ("sched/eas: Don't update
> > > misfit status if the task is pinned") which is when I originally noticed the
> > > problem and this patch was written along side it.
> > >
> > > We have unlinked misfit from overutilized since then.
> > >
> > > And to be honest I am not sure if flattening of topology matters too since
> > > I first noticed this, which was on Juno which doesn't have flat topology.
> > >
> > > FWIW I can still reproduce this, but I have a different setup now. On M1 mac
> > > mini if I spawn a busy task affined to littles then expand the mask for
> > > a single big core; I see big delays (>500ms) without the patch. But with the
> > > patch it moves in few ms. The delay without the patch is too large and I can't
> > > explain it. So the worry here is that generally misfit migration not happening
> > > fast enough due to this fake misfit cases.
> >
> > I tried a similar scenario on RB5 but I don't see any difference with
> > your patch. And that could be me not testing it correctly...
> >
> > I set the affinity of always running task to cpu[0-3] for a few
> > seconds then extend it to [0-3,7] and the time to migrate is almost
> > the same.
>
> That matches what I do.
>
> I write a trace_marker when I change affinity to help see when it should move.

same for me

>
> >
> > I'm using tip/sched/core + [0]
> >
> > [0] https://lore.kernel.org/all/[email protected]/
>
> I tried on pinebook pro which has a rk3399 and I can't reproduce there too.
>
> On the M1 I get two sched domains, MC and DIE. But on the pine64 it has only
> MC. Could this be the difference as lb has sched domains dependencies?
>
> It seems we flatten topologies but not sched domains. I see all cpus shown as
> core_siblings. The DT for apple silicon sets clusters in the cpu-map - which
> seems the flatten topology stuff detect LLC correctly but still keeps the
> sched-domains not flattened. Is this a bug? I thought we will end up with one
> sched domain still.
>
> TBH I had a bit of confirmation bias that this is a problem based on the fix
> (0ae78eec8aa6) that we had in the past. So on verification I looked at
> balance_interval and this reproducer which is a not the same as the original
> one and it might be exposing another problem and I didn't think twice about it.

I checked the behavior more deeply and I confirm that I don't see an
improvement for the use case described above. I would say that it's
even worse, as I can see some runs where the task stays on a little CPU
even though a big core has been added to the affinity. Bearing in mind
that my system is pretty idle, which means that there is almost no other
reason to trigger an ilb than the misfit task, the change in
check_misfit_status() is probably the reason an ilb is never kicked for
such a case.

>
> The patch did help though. So maybe there are more than one problem. The delays
> are longer than I expected as I tried to highlight. I'll continue to probe.

2024-01-26 14:16:00

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Fri, 26 Jan 2024 at 03:07, Qais Yousef <[email protected]> wrote:
>
> On 01/25/24 18:50, Vincent Guittot wrote:
> > On Wed, 24 Jan 2024 at 23:38, Qais Yousef <[email protected]> wrote:
> > >
> > > On 01/23/24 18:22, Vincent Guittot wrote:
> > >
> > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > > index bcea3d55d95d..0830ceb7ca07 100644
> > > > > --- a/kernel/sched/fair.c
> > > > > +++ b/kernel/sched/fair.c
> > > > > @@ -5065,17 +5065,61 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
> > > > >
> > > > > static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> > > > > {
> > > > > + unsigned long uclamp_min, uclamp_max;
> > > > > + unsigned long util, cpu_cap;
> > > > > + int cpu = cpu_of(rq);
> > > > > +
> > > > > if (!sched_asym_cpucap_active())
> > > > > return;
> > > > >
> > > > > - if (!p || p->nr_cpus_allowed == 1) {
> > > > > - rq->misfit_task_load = 0;
> > > > > - return;
> > > > > - }
> > > > > + if (!p || p->nr_cpus_allowed == 1)
> > > > > + goto out;
> > > > >
> > > > > - if (task_fits_cpu(p, cpu_of(rq))) {
> > > > > - rq->misfit_task_load = 0;
> > > > > - return;
> > > > > + cpu_cap = arch_scale_cpu_capacity(cpu);
> > > > > +
> > > > > + /* If we can't fit the biggest CPU, that's the best we can ever get. */
> > > > > + if (cpu_cap == SCHED_CAPACITY_SCALE)
> > > > > + goto out;
> > > > > +
> > > > > + uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> > > > > + uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> > > > > + util = task_util_est(p);
> > > > > +
> > > > > + if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> > > > > + goto out;
> > > > > +
> > > > > + /*
> > > > > + * If the task affinity is not set to default, make sure it is not
> > > > > + * restricted to a subset where no CPU can ever fit it. Triggering
> > > > > + * misfit in this case is pointless as it has no where better to move
> > > > > + * to. And it can lead to balance_interval to grow too high as we'll
> > > > > + * continuously fail to move it anywhere.
> > > > > + */
> > > > > + if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> > > > > + unsigned long clamped_util = clamp(util, uclamp_min, uclamp_max);
> > > > > + bool has_fitting_cpu = false;
> > > > > + struct asym_cap_data *entry;
> > > > > +
> > > > > + rcu_read_lock();
> > > > > + list_for_each_entry_rcu(entry, &asym_cap_list, link) {
> > > >
> > > > Do we really want to potentially do this loop at every pick_next task ?
> > >
> > > The common case should return quickly as the biggest CPU should be present
> > > in every task by default. And after sorting the biggest CPU will be the first
> > > entry and we should return after one check.
> > >
> > > Could we move the update to another less expensive location instead?
> >
> > TBH, I don't know. I would need time to think about this...
> > May be when we set the new affinity of the task
>
> I was thinking to actually call update_misfit_status() from another less
> expensive location.
>
> We can certainly do something to help the check less expensive if we must do it
> in pick_next_task(). For example set a flag if the task belongs to a single
> capacity value; and store the highest capacity its affinity belongs too. But
> with cpuset v1, v2 and hotplug I am wary that might get messy.

I think it is worth looking at such a solution, as this would mean parsing
the possible max capacity for the task only once per affinity change.

>
> >
> > >
> > > We could try to do better tracking for CPUs that has their affinity changed,
> > > but I am not keen on sprinkling more complexity else where to deal with this.
> > >
> > > We could keep the status quouo and just prevent the misfit load balancing from
> > > increment nr_failed similar to newidle_balance too. I think this should have
> >
> > One main advantage is that we put the complexity out of the fast path
>
> How about when we update_load_avg()? After all it's the util the decides if we
> become misfit. So it makes sense to do the check when we update the util for
> the task.
>
> Which reminds me of another bug. We need to call update_misfit_status() when
> uclamp values change too.
>
> >
> > > a similar effect. Not ideal but if this is considered too expensive still
> > > I can't think of other options that don't look ugly to me FWIW.
> > >
> > >
> > > Thanks
> > >
> > > --
> > > Qais Yousef

2024-01-28 23:33:51

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/26/24 15:15, Vincent Guittot wrote:

> > > TBH, I don't know. I would need time to think about this...
> > > May be when we set the new affinity of the task
> >
> > I was thinking to actually call update_misfit_status() from another less
> > expensive location.
> >
> > We can certainly do something to help the check less expensive if we must do it
> > in pick_next_task(). For example set a flag if the task belongs to a single
> > capacity value; and store the highest capacity its affinity belongs too. But
> > with cpuset v1, v2 and hotplug I am wary that might get messy.
>
> I think it worth looking at such solution as this would mean parsing
> the possible max capacity for the task only once per affinity change

Okay. It might not be that bad; we'd just need to do the parsing when we update
cpus_ptr, which seems to happen only in set_cpus_allowed_common(). I think I can
create a wrapper for fair where we do set_cpus_allowed_common() and then do the
checks to discover the max_allowed_capacity and whether the new affinity is
asymmetric or not.
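
A rough sketch of that direction (the fair wrapper and the
p->max_allowed_capacity field are hypothetical names here, purely to illustrate
the shape; it assumes asym_cap_list and cpu_capacity_span() are visible to
fair.c):

static void set_cpus_allowed_fair(struct task_struct *p, struct affinity_context *ctx)
{
	struct asym_cap_data *entry;
	unsigned long max_cap = 0;

	set_cpus_allowed_common(p, ctx);

	/*
	 * Re-derive, once per affinity change, the biggest CPU capacity the
	 * new mask can reach. The list is sorted by descending capacity, so
	 * the first intersecting entry is the answer.
	 */
	rcu_read_lock();
	list_for_each_entry_rcu(entry, &asym_cap_list, link) {
		if (cpumask_intersects(p->cpus_ptr, cpu_capacity_span(entry))) {
			max_cap = entry->capacity;
			break;
		}
	}
	rcu_read_unlock();

	p->max_allowed_capacity = max_cap;
}

update_misfit_status() could then compare the task's (clamped) util against
p->max_allowed_capacity instead of walking asym_cap_list on every pick.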


Cheers

--
Qais Yousef

2024-01-28 23:51:28

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/26/24 15:08, Vincent Guittot wrote:

> > TBH I had a bit of confirmation bias that this is a problem based on the fix
> > (0ae78eec8aa6) that we had in the past. So on verification I looked at
> > balance_interval and this reproducer which is a not the same as the original
> > one and it might be exposing another problem and I didn't think twice about it.
>
> I checked the behavior more deeply and I confirm that I don't see
> improvement for the use case described above. I would say that it's
> even worse as I can see some runs where the task stays on little
> whereas a big core has been added in the affinity. Having in mind that
> my system is pretty idle which means that there is almost no other
> reason to trigger an ilb than the misfit task, the change in
> check_misfit_status() is probably the reason for never kicking an ilb
> for such case

It seems I reproduced another problem while trying to reproduce the original
issue, eh.

I did dig more, and from what I see the issue is that rd->overload is not being
set correctly, which I believe is what causes the delays (see the attached
picture of how rd.overloaded is 0 with some spikes). Only when CPU7's
newidle_balance() coincides with rd->overload being 1 does the migration
happen. With the below hack I can see that rd->overload is 1 all the time
(even after the move, as we still trigger a misfit on the big CPU). With my
patch, rd->overload is set to 1 (because of this task) only for a short
period after we change the affinity.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index df348aa55d3c..86069fe527f9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9707,8 +9707,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
continue;
}

- if (local_group)
- continue;
+ /* if (local_group) */
+ /* continue; */

if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
/* Check for a misfit task on the cpu */

I am not sure what the right fix is, but it seems this local_group skip is only
required for the 2nd leg of the if condition, where we compare with load?
I don't think we should skip the misfit check.


Thanks

--
Qais Yousef


Attachments:
misfit_overloaded.png (146.25 kB)

2024-01-29 22:53:59

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/28/24 23:50, Qais Yousef wrote:
> On 01/26/24 15:08, Vincent Guittot wrote:
>
> > > TBH I had a bit of confirmation bias that this is a problem based on the fix
> > > (0ae78eec8aa6) that we had in the past. So on verification I looked at
> > > balance_interval and this reproducer which is a not the same as the original
> > > one and it might be exposing another problem and I didn't think twice about it.
> >
> > I checked the behavior more deeply and I confirm that I don't see
> > improvement for the use case described above. I would say that it's
> > even worse as I can see some runs where the task stays on little
> > whereas a big core has been added in the affinity. Having in mind that
> > my system is pretty idle which means that there is almost no other
> > reason to trigger an ilb than the misfit task, the change in
> > check_misfit_status() is probably the reason for never kicking an ilb
> > for such case
>
> It seems I reproduced another problem while trying to reproduce the original
> issue, eh.
>
> I did dig more and from what I see the issue is that the rd->overload is not
> being set correctly. Which I believe what causes the delays (see attached
> picture how rd.overloaded is 0 with some spikes). Only when CPU7
> newidle_balance() coincided with rd->overload being 1 that the migration
> happens. With the below hack I can see that rd->overload is 1 all the time
> (even after the move as we still trigger a misfit on the big CPU). With my
> patch only rd->overload is set to 1 (because of this task) only for a short
> period after we change affinity.
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index df348aa55d3c..86069fe527f9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9707,8 +9707,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> continue;
> }
>
> - if (local_group)
> - continue;
> + /* if (local_group) */
> + /* continue; */
>
> if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
> /* Check for a misfit task on the cpu */
>
> I am not sure what the right fix is, but it seems this condition is required
> for the 2nd leg of this if condition when we compare with load? I don't think
> we should skip the misfit check.

I'm still not sure I get the original intent of why we skip for local_group. We
need to set sg_status, which operates at the root domain level, to enable a CPU
to pull a misfit task.

AFAICS newidle_balance() will return without doing anything if rd->overload is
not set. So making sure we always update this flag, and for both legs, is
necessary IIUC.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index df348aa55d3c..bd2f402eac41 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9707,9 +9707,6 @@ static inline void update_sg_lb_stats(struct lb_env *env,
continue;
}

- if (local_group)
- continue;
-
if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
/* Check for a misfit task on the cpu */
if (sgs->group_misfit_task_load < rq->misfit_task_load) {
@@ -9719,8 +9716,10 @@ static inline void update_sg_lb_stats(struct lb_env *env,
} else if ((env->idle != CPU_NOT_IDLE) &&
sched_reduced_capacity(rq, env->sd)) {
/* Check for a task running on a CPU with reduced capacity */
- if (sgs->group_misfit_task_load < load)
+ if (sgs->group_misfit_task_load < load) {
sgs->group_misfit_task_load = load;
+ *sg_status |= SG_OVERLOAD;
+ }
}
}

I was wondering why we never pull at the tick in rebalance_domains(), where no
such check is made. But when newidle_balance() returns early, it calls
update_next_balance(), which adds balance_interval - which is already long. So
we end up delaying things further, thinking we've 'attempted' a load balance
and it wasn't necessary - when in reality we failed to see it, and we keep
rebalance_domains() from seeing it too by continuing to push the next balance
forward.
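
For reference, this is roughly the early-return path in newidle_balance() being
described (paraphrased from a 6.6-era kernel, simplified):

	rcu_read_lock();
	sd = rcu_dereference_check_sched_domain(this_rq->sd);

	if (!READ_ONCE(this_rq->rd->overload) ||
	    (sd && this_rq->avg_idle < sd->max_newidle_lb_cost)) {

		/*
		 * Bail out without balancing, but still record
		 * sd->last_balance + interval as the next balance time.
		 */
		if (sd)
			update_next_balance(sd, &next_balance);
		rcu_read_unlock();

		goto out;
	}
	rcu_read_unlock();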

2024-01-30 09:57:31

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Mon, 29 Jan 2024 at 00:50, Qais Yousef <[email protected]> wrote:
>
> On 01/26/24 15:08, Vincent Guittot wrote:
>
> > > TBH I had a bit of confirmation bias that this is a problem based on the fix
> > > (0ae78eec8aa6) that we had in the past. So on verification I looked at
> > > balance_interval and this reproducer which is a not the same as the original
> > > one and it might be exposing another problem and I didn't think twice about it.
> >
> > I checked the behavior more deeply and I confirm that I don't see
> > improvement for the use case described above. I would say that it's
> > even worse as I can see some runs where the task stays on little
> > whereas a big core has been added in the affinity. Having in mind that
> > my system is pretty idle which means that there is almost no other
> > reason to trigger an ilb than the misfit task, the change in
> > check_misfit_status() is probably the reason for never kicking an ilb
> > for such case
>
> It seems I reproduced another problem while trying to reproduce the original
> issue, eh.
>
> I did dig more and from what I see the issue is that the rd->overload is not
> being set correctly. Which I believe what causes the delays (see attached
> picture how rd.overloaded is 0 with some spikes). Only when CPU7
> newidle_balance() coincided with rd->overload being 1 that the migration
> happens. With the below hack I can see that rd->overload is 1 all the time

But here you rely on other activity happening on CPU7, whereas the
misfit should trigger the load balance by itself and not expect
another task waking up and then sleeping on CPU7 to trigger a newidle
balance. We want a normal idle load balance, not a newidle_balance.

> (even after the move as we still trigger a misfit on the big CPU). With my
> patch only rd->overload is set to 1 (because of this task) only for a short
> period after we change affinity.
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index df348aa55d3c..86069fe527f9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9707,8 +9707,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> continue;
> }
>
> - if (local_group)
> - continue;
> + /* if (local_group) */
> + /* continue; */
>
> if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
> /* Check for a misfit task on the cpu */
>
> I am not sure what the right fix is, but it seems this condition is required
> for the 2nd leg of this if condition when we compare with load? I don't think
> we should skip the misfit check.
>
>
> Thanks
>
> --
> Qais Yousef

2024-01-30 23:57:42

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/30/24 10:41, Vincent Guittot wrote:
> On Mon, 29 Jan 2024 at 00:50, Qais Yousef <[email protected]> wrote:
> >
> > On 01/26/24 15:08, Vincent Guittot wrote:
> >
> > > > TBH I had a bit of confirmation bias that this is a problem based on the fix
> > > > (0ae78eec8aa6) that we had in the past. So on verification I looked at
> > > > balance_interval and this reproducer which is a not the same as the original
> > > > one and it might be exposing another problem and I didn't think twice about it.
> > >
> > > I checked the behavior more deeply and I confirm that I don't see
> > > improvement for the use case described above. I would say that it's
> > > even worse as I can see some runs where the task stays on little
> > > whereas a big core has been added in the affinity. Having in mind that
> > > my system is pretty idle which means that there is almost no other
> > > reason to trigger an ilb than the misfit task, the change in
> > > check_misfit_status() is probably the reason for never kicking an ilb
> > > for such case
> >
> > It seems I reproduced another problem while trying to reproduce the original
> > issue, eh.
> >
> > I did dig more and from what I see the issue is that the rd->overload is not
> > being set correctly. Which I believe what causes the delays (see attached
> > picture how rd.overloaded is 0 with some spikes). Only when CPU7
> > newidle_balance() coincided with rd->overload being 1 that the migration
> > happens. With the below hack I can see that rd->overload is 1 all the time
>
> But here you rely on another activity happening in CPU7 whereas the

I don't want to rely on that. I think this is a problem too. And this is what
ends up happening from what I see, sometimes at least.

When is it expected for newidle_balance to pull anyway? I agree we shouldn't
rely on it to randomly happen, but if it happens sooner, it should pull, no?

> misfit should trigger by itself the load balance and not expect
> another task waking up then sleeping on cpu7 to trigger a newidle
> balance. We want a normal idle load balance not a newidle_balance

I think there's a terminology problem. I thought you meant newidle_balance() by
ilb. It seems you're referring to load_balance() called from
rebalance_domains() when the tick happens while idle?

I thought this was not kicking in. But I just double checked my traces, and I
was getting confused because I was looking at where run_rebalance_domains()
would happen, for example on CPU2, while the balance would actually be for CPU7.

No clue why it fails to pull... I can actually see that we call load_balance()
twice for some (not all) entries into rebalance_domains(). So we don't always
operate on the two domains. But that's not necessarily a problem.

I think it's a good opportunity to add some tracepoints to help break this path
down. If you have suggestions of things to record that'd be helpful.

2024-01-31 13:56:06

by Vincent Guittot

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On Wed, 31 Jan 2024 at 00:57, Qais Yousef <[email protected]> wrote:
>
> On 01/30/24 10:41, Vincent Guittot wrote:
> > On Mon, 29 Jan 2024 at 00:50, Qais Yousef <[email protected]> wrote:
> > >
> > > On 01/26/24 15:08, Vincent Guittot wrote:
> > >
> > > > > TBH I had a bit of confirmation bias that this is a problem based on the fix
> > > > > (0ae78eec8aa6) that we had in the past. So on verification I looked at
> > > > > balance_interval and this reproducer which is a not the same as the original
> > > > > one and it might be exposing another problem and I didn't think twice about it.
> > > >
> > > > I checked the behavior more deeply and I confirm that I don't see
> > > > improvement for the use case described above. I would say that it's
> > > > even worse as I can see some runs where the task stays on little
> > > > whereas a big core has been added in the affinity. Having in mind that
> > > > my system is pretty idle which means that there is almost no other
> > > > reason to trigger an ilb than the misfit task, the change in
> > > > check_misfit_status() is probably the reason for never kicking an ilb
> > > > for such case
> > >
> > > It seems I reproduced another problem while trying to reproduce the original
> > > issue, eh.
> > >
> > > I did dig more and from what I see the issue is that the rd->overload is not
> > > being set correctly. Which I believe what causes the delays (see attached
> > > picture how rd.overloaded is 0 with some spikes). Only when CPU7
> > > newidle_balance() coincided with rd->overload being 1 that the migration
> > > happens. With the below hack I can see that rd->overload is 1 all the time
> >
> > But here you rely on another activity happening in CPU7 whereas the
>
> I don't want to rely on that. I think this is a problem too. And this is what
> ends up happening from what I see, sometimes at least.
>
> When is it expected for newidle_balance to pull anyway? I agree we shouldn't
> rely on it to randomly happen, but if it happens sooner, it should pull, no?
>
> > misfit should trigger by itself the load balance and not expect
> > another task waking up then sleeping on cpu7 to trigger a newidle
> > balance. We want a normal idle load balance not a newidle_balance
>
> I think there's a terminology problems. I thought you mean newidle_balnce() by
> ilb. It seems you're referring to load_balance() called from
> rebalance_domains() when tick happens at idle?

newidle_balance is different from idle load balance. newidle_balance
happens when the cpu becomes idle whereas busy and idle load balance
happen at tick.
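
To make sure we're talking about the same paths, here is a rough sketch of the
two entry points as I understand them (simplified call graph from memory of
~v6.6, so treat it as an approximation rather than the exact code):

/*
 * newidle balance: runs synchronously when a CPU is about to go idle.
 *
 *   __schedule()
 *     pick_next_task_fair()
 *       newidle_balance(this_rq, rf)    // try to pull work before going idle
 *
 * busy/idle load balance: driven by the tick through SCHED_SOFTIRQ.
 *
 *   scheduler_tick()
 *     trigger_load_balance(rq)
 *       // raises SCHED_SOFTIRQ when rq->next_balance is due
 *       nohz_balancer_kick(rq)          // may kick one idle CPU to balance
 *                                       // on behalf of all idle CPUs
 *
 *   run_rebalance_domains()             // SCHED_SOFTIRQ handler
 *     nohz_idle_balance(this_rq, idle)  // balance on behalf of stopped-tick CPUs
 *     rebalance_domains(this_rq, idle)  // otherwise, balance this CPU's own
 *                                       // domains (MC, then PKG)
 */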

>
> I thought this is not kicking. But I just double checked in my traces and I was
> getting confused because I was looking at where run_rebalance_domains() would
> happen, for example, on CPU2 but the balance would actually be for CPU7.

An idle load balance happens either on the target CPU itself, if its tick is
not stopped, or we kick one idle CPU to run the idle load balance on
behalf of all idle CPUs. It is the latter case that no longer happens
with your patch and the change in check_misfit_status().

>
> No clue why it fails to pull.. I can see actually we call load_balance() twice
> for some (not all) entries to rebalance_domains(). So we don't always operate
> on the two domains. But that's not necessarily a problem.

We have 3 different reasons for kicking an idle load balance:
- to do an actual balance of tasks
- to update stats, i.e. blocked load
- to update nohz.next_balance

You are interested in the 1st one, but it's most probably for the last
two reasons that this happens.
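
For reference, these roughly map onto the NOHZ kick flags like below (sketch
from memory against ~v6.6, so double check against your tree):

/*
 * nohz_balancer_kick() decides which flags to set, then kick_ilb(flags)
 * picks one idle CPU via find_new_ilb() and IPIs it to do the work:
 *
 *   NOHZ_BALANCE_KICK - do an actual balance of tasks for the idle CPUs
 *   NOHZ_STATS_KICK   - only update blocked load / stats
 *   NOHZ_NEXT_KICK    - only update nohz.next_balance
 *
 * A misfit task should lead to NOHZ_BALANCE_KICK being set via the
 * check_misfit_status() check in nohz_balancer_kick(), which is the path
 * affected by the change discussed here.
 */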

>
> I think it's a good opportunity to add some tracepoints to help break this path
> down. If you have suggestions of things to record that'd be helpful.

2024-02-01 22:21:45

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 01/31/24 14:55, Vincent Guittot wrote:
> On Wed, 31 Jan 2024 at 00:57, Qais Yousef <[email protected]> wrote:
> >
> > On 01/30/24 10:41, Vincent Guittot wrote:
> > > On Mon, 29 Jan 2024 at 00:50, Qais Yousef <[email protected]> wrote:
> > > >
> > > > On 01/26/24 15:08, Vincent Guittot wrote:
> > > >
> > > > > > TBH I had a bit of confirmation bias that this is a problem based on the fix
> > > > > > (0ae78eec8aa6) that we had in the past. So on verification I looked at
> > > > > > balance_interval and this reproducer which is a not the same as the original
> > > > > > one and it might be exposing another problem and I didn't think twice about it.
> > > > >
> > > > > I checked the behavior more deeply and I confirm that I don't see
> > > > > improvement for the use case described above. I would say that it's
> > > > > even worse as I can see some runs where the task stays on little
> > > > > whereas a big core has been added in the affinity. Having in mind that
> > > > > my system is pretty idle which means that there is almost no other
> > > > > reason to trigger an ilb than the misfit task, the change in
> > > > > check_misfit_status() is probably the reason for never kicking an ilb
> > > > > for such case
> > > >
> > > > It seems I reproduced another problem while trying to reproduce the original
> > > > issue, eh.
> > > >
> > > > I did dig more and from what I see the issue is that the rd->overload is not
> > > > being set correctly. Which I believe what causes the delays (see attached
> > > > picture how rd.overloaded is 0 with some spikes). Only when CPU7
> > > > newidle_balance() coincided with rd->overload being 1 that the migration
> > > > happens. With the below hack I can see that rd->overload is 1 all the time
> > >
> > > But here you rely on another activity happening in CPU7 whereas the
> >
> > I don't want to rely on that. I think this is a problem too. And this is what
> > ends up happening from what I see, sometimes at least.
> >
> > When is it expected for newidle_balance to pull anyway? I agree we shouldn't
> > rely on it to randomly happen, but if it happens sooner, it should pull, no?
> >
> > > misfit should trigger by itself the load balance and not expect
> > > another task waking up then sleeping on cpu7 to trigger a newidle
> > > balance. We want a normal idle load balance not a newidle_balance
> >
> > I think there's a terminology problems. I thought you mean newidle_balnce() by
> > ilb. It seems you're referring to load_balance() called from
> > rebalance_domains() when tick happens at idle?
>
> newidle_balance is different from idle load balance. newidle_balance
> happens when the cpu becomes idle whereas busy and idle load balance
> happen at tick.

Yes. newidle_balance() is not supposed to pull a misfit task then?

>
> >
> > I thought this is not kicking. But I just double checked in my traces and I was
> > getting confused because I was looking at where run_rebalance_domains() would
> > happen, for example, on CPU2 but the balance would actually be for CPU7.
>
> An idle load balance happens either on the target CPU if its tick is
> not stopped or we kick one idle CPU to run the idle load balance on
> behalf of all idle CPUs. This is the latter case that doesn't happen
> anymore with your patch and the change in check_misfit_status.

Yes. I just got confused while looking at the log. FWIW, I'm testing without my
patch. The kernel should be 6.6 from the Asahi folks, which should contain
nothing but the not-yet-fully-upstreamed bits necessary to make the machine run.

>
> >
> > No clue why it fails to pull.. I can see actually we call load_balance() twice
> > for some (not all) entries to rebalance_domains(). So we don't always operate
> > on the two domains. But that's not necessarily a problem.
>
> We have 3 different reasons for kicking an idle load balance :
> - to do an actual balance of tasks
> - to update stats ie blocked load
> - to update nohz.next_balance
>
> You are interested by the 1st one but it's most probably for the 2
> last reasons that this happen

Okay, thanks for the info. I need to figure out why the 1st one fails although
there's a misfit task to pull.

>
> >
> > I think it's a good opportunity to add some tracepoints to help break this path
> > down. If you have suggestions of things to record that'd be helpful.

2024-02-05 21:34:01

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 26/01/2024 02:46, Qais Yousef wrote:
> On 01/25/24 18:40, Vincent Guittot wrote:
>> On Wed, 24 Jan 2024 at 23:30, Qais Yousef <[email protected]> wrote:
>>>
>>> On 01/23/24 09:26, Vincent Guittot wrote:
>>>> On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
>>>>>
>>>>> From: Qais Yousef <[email protected]>

[...]

>>> And to be honest I am not sure if flattening of topology matters too since
>>> I first noticed this, which was on Juno which doesn't have flat topology.
>>>
>>> FWIW I can still reproduce this, but I have a different setup now. On M1 mac
>>> mini if I spawn a busy task affined to littles then expand the mask for
>>> a single big core; I see big delays (>500ms) without the patch. But with the
>>> patch it moves in few ms. The delay without the patch is too large and I can't
>>> explain it. So the worry here is that generally misfit migration not happening
>>> fast enough due to this fake misfit cases.
>>
>> I tried a similar scenario on RB5 but I don't see any difference with
>> your patch. And that could be me not testing it correctly...
>>
>> I set the affinity of always running task to cpu[0-3] for a few
>> seconds then extend it to [0-3,7] and the time to migrate is almost
>> the same.
>
> That matches what I do.
>
> I write a trace_marker when I change affinity to help see when it should move.
>
>>
>> I'm using tip/sched/core + [0]
>>
>> [0] https://lore.kernel.org/all/[email protected]/
>
> I tried on pinebook pro which has a rk3399 and I can't reproduce there too.
>
> On the M1 I get two sched domains, MC and DIE. But on the pine64 it has only
> MC. Could this be the difference as lb has sched domains dependencies?
>
> It seems we flatten topologies but not sched domains. I see all cpus shown as
> core_siblings. The DT for apple silicon sets clusters in the cpu-map - which
> seems the flatten topology stuff detect LLC correctly but still keeps the
> sched-domains not flattened. Is this a bug? I thought we will end up with one
> sched domain still.

IMHO, if you have a cpu_map entry with > 1 cluster in your dtb, you end
up with MC and PKG (former DIE) Sched Domain (SD) levels. And misfit load
balance potentially takes longer on PKG than on MC.

(1) Vanilla Juno-r0 [L b b L L L]

root@juno:~# echo 1 > /sys/kernel/debug/sched/verbose
root@juno:~# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
MC
PKG

root@juno:~# cat /proc/schedstat | head -5 | grep ^[cd]
cpu0 0 0 0 0 0 0 2441100800 251426780 6694
domain0 39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 3f 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

(2) flattened topology (including SDs):

Remove cluster1 from cpu_map and A57_L2 $ entry.

--- a/arch/arm64/boot/dts/arm/juno.dts
+++ b/arch/arm64/boot/dts/arm/juno.dts
@@ -44,19 +44,16 @@ core0 {
core1 {
cpu = <&A57_1>;
};
- };
-
- cluster1 {
- core0 {
+ core2 {
cpu = <&A53_0>;
};
- core1 {
+ core3 {
cpu = <&A53_1>;
};
- core2 {
+ core4 {
cpu = <&A53_2>;
};
- core3 {
+ core5 {
cpu = <&A53_3>;
};
};
@@ -95,7 +92,7 @@ A57_0: cpu@0 {
d-cache-size = <0x8000>;
d-cache-line-size = <64>;
d-cache-sets = <256>;
- next-level-cache = <&A57_L2>;
+ next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 0>;
cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
capacity-dmips-mhz = <1024>;
@@ -113,7 +110,7 @@ A57_1: cpu@1 {
d-cache-size = <0x8000>;
d-cache-line-size = <64>;
d-cache-sets = <256>;
- next-level-cache = <&A57_L2>;
+ next-level-cache = <&A53_L2>;
clocks = <&scpi_dvfs 0>;
cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
capacity-dmips-mhz = <1024>;
@@ -192,15 +189,6 @@ A53_3: cpu@103 {
dynamic-power-coefficient = <140>;
};

- A57_L2: l2-cache0 {
- compatible = "cache";
- cache-unified;
- cache-size = <0x200000>;
- cache-line-size = <64>;
- cache-sets = <2048>;
- cache-level = <2>;
- };
-
A53_L2: l2-cache1 {
compatible = "cache";
cache-unified;

root@juno:~# echo 1 > /sys/kernel/debug/sched/verbose
root@juno:~# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
MC

root@juno:~# cat /proc/schedstat | head -4 | grep ^[cd]
cpu0 0 0 0 0 0 0 2378087600 310618500 8152
domain0 3f 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[...]


2024-02-06 15:39:24

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 02/05/24 20:49, Dietmar Eggemann wrote:
> On 26/01/2024 02:46, Qais Yousef wrote:
> > On 01/25/24 18:40, Vincent Guittot wrote:
> >> On Wed, 24 Jan 2024 at 23:30, Qais Yousef <[email protected]> wrote:
> >>>
> >>> On 01/23/24 09:26, Vincent Guittot wrote:
> >>>> On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
> >>>>>
> >>>>> From: Qais Yousef <[email protected]>
>
> [...]
>
> >>> And to be honest I am not sure if flattening of topology matters too since
> >>> I first noticed this, which was on Juno which doesn't have flat topology.
> >>>
> >>> FWIW I can still reproduce this, but I have a different setup now. On M1 mac
> >>> mini if I spawn a busy task affined to littles then expand the mask for
> >>> a single big core; I see big delays (>500ms) without the patch. But with the
> >>> patch it moves in few ms. The delay without the patch is too large and I can't
> >>> explain it. So the worry here is that generally misfit migration not happening
> >>> fast enough due to this fake misfit cases.
> >>
> >> I tried a similar scenario on RB5 but I don't see any difference with
> >> your patch. And that could be me not testing it correctly...
> >>
> >> I set the affinity of always running task to cpu[0-3] for a few
> >> seconds then extend it to [0-3,7] and the time to migrate is almost
> >> the same.
> >
> > That matches what I do.
> >
> > I write a trace_marker when I change affinity to help see when it should move.
> >
> >>
> >> I'm using tip/sched/core + [0]
> >>
> >> [0] https://lore.kernel.org/all/[email protected]/
> >
> > I tried on pinebook pro which has a rk3399 and I can't reproduce there too.
> >
> > On the M1 I get two sched domains, MC and DIE. But on the pine64 it has only
> > MC. Could this be the difference as lb has sched domains dependencies?
> >
> > It seems we flatten topologies but not sched domains. I see all cpus shown as
> > core_siblings. The DT for apple silicon sets clusters in the cpu-map - which
> > seems the flatten topology stuff detect LLC correctly but still keeps the
> > sched-domains not flattened. Is this a bug? I thought we will end up with one
> > sched domain still.
>
> IMHO, if you have a cpu_map entry with > 1 cluster in your dtb, you end
> up with MC and PKG (former DIE) Sched Domain (SD) level. And misfit load

Hmm, okay. I thought that detecting a topology where we know the LLC is shared
would cause the sched domains to collapse too.

> balance takes potentially longer on PKG than to MC.

Why potentially longer? We iterate through the domains the CPU belongs to. If
the first iteration (at MC) pulled something, then by the time we go to PKG
we're less likely to pull again?

Anyway. I think I am hitting a bug here. The behavior doesn't look right to me
given the delays I'm seeing and the fact that we do the ilb but for some reason
fail to pull.

>
> (1) Vanilla Juno-r0 [L b b L L L)
>
> root@juno:~# echo 1 > /sys/kernel/debug/sched/verbose
> root@juno:~# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
> MC
> PKG
>
> root@juno:~# cat /proc/schedstat | head -5 | grep ^[cd]
> cpu0 0 0 0 0 0 0 2441100800 251426780 6694
> domain0 39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> domain1 3f 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> (2) flattened topology (including SDs):
>
> Remove cluster1 from cpu_map and A57_L2 $ entry.
>
> --- a/arch/arm64/boot/dts/arm/juno.dts
> +++ b/arch/arm64/boot/dts/arm/juno.dts
> @@ -44,19 +44,16 @@ core0 {
> core1 {
> cpu = <&A57_1>;
> };
> - };
> -
> - cluster1 {
> - core0 {
> + core2 {
> cpu = <&A53_0>;
> };
> - core1 {
> + core3 {
> cpu = <&A53_1>;
> };
> - core2 {
> + core4 {
> cpu = <&A53_2>;
> };
> - core3 {
> + core5 {
> cpu = <&A53_3>;
> };
> };
> @@ -95,7 +92,7 @@ A57_0: cpu@0 {
> d-cache-size = <0x8000>;
> d-cache-line-size = <64>;
> d-cache-sets = <256>;
> - next-level-cache = <&A57_L2>;
> + next-level-cache = <&A53_L2>;
> clocks = <&scpi_dvfs 0>;
> cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
> capacity-dmips-mhz = <1024>;
> @@ -113,7 +110,7 @@ A57_1: cpu@1 {
> d-cache-size = <0x8000>;
> d-cache-line-size = <64>;
> d-cache-sets = <256>;
> - next-level-cache = <&A57_L2>;
> + next-level-cache = <&A53_L2>;
> clocks = <&scpi_dvfs 0>;
> cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
> capacity-dmips-mhz = <1024>;
> @@ -192,15 +189,6 @@ A53_3: cpu@103 {
> dynamic-power-coefficient = <140>;
> };
>
> - A57_L2: l2-cache0 {
> - compatible = "cache";
> - cache-unified;
> - cache-size = <0x200000>;
> - cache-line-size = <64>;
> - cache-sets = <2048>;
> - cache-level = <2>;
> - };
> -
> A53_L2: l2-cache1 {
> compatible = "cache";
> cache-unified;
>
> root@juno:~# echo 1 > /sys/kernel/debug/sched/verbose
> root@juno:~# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
> MC
>
> root@juno:~# cat /proc/schedstat | head -4 | grep ^[cd]
> cpu0 0 0 0 0 0 0 2378087600 310618500 8152
> domain0 3f 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> [...]
>

2024-02-06 17:17:40

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 06/02/2024 16:06, Qais Yousef wrote:
> On 02/05/24 20:49, Dietmar Eggemann wrote:
>> On 26/01/2024 02:46, Qais Yousef wrote:
>>> On 01/25/24 18:40, Vincent Guittot wrote:
>>>> On Wed, 24 Jan 2024 at 23:30, Qais Yousef <[email protected]> wrote:
>>>>>
>>>>> On 01/23/24 09:26, Vincent Guittot wrote:
>>>>>> On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
>>>>>>>
>>>>>>> From: Qais Yousef <[email protected]>

[...]

>>> It seems we flatten topologies but not sched domains. I see all cpus shown as
>>> core_siblings. The DT for apple silicon sets clusters in the cpu-map - which
>>> seems the flatten topology stuff detect LLC correctly but still keeps the
>>> sched-domains not flattened. Is this a bug? I thought we will end up with one
>>> sched domain still.
>>
>> IMHO, if you have a cpu_map entry with > 1 cluster in your dtb, you end
>> up with MC and PKG (former DIE) Sched Domain (SD) level. And misfit load
>
> Hmm, okay. I thought the detection of topology where we know the LLC is shared
> will cause the sched domains to collapse too.
>
>> balance takes potentially longer on PKG than to MC.
>
> Why potentially longer? We iterate through the domains the CPU belong to. If
> the first iteration (at MC) pulled something, then once we go to PKG then we're
> less likely to pull again?

There are a couple of mechanisms in place to let load balancing on higher
sd levels happen less frequently, e.g. (rough sketch below):

load_balance() -> should_we_balance() + continue_balancing

interval = get_sd_balance_interval(sd, busy) in rebalance_domains()

rq->avg_idle versus sd->max_newidle_lb_cost
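
Roughly where these checks sit (paraphrased from ~v6.6, not the exact code):

/*
 * rebalance_domains():
 *   for_each_domain(cpu, sd) {
 *       if (!continue_balancing)        // a lower level decided we're done
 *           break;                      // (should_we_balance() said no, or
 *                                       //  we already pulled enough)
 *       interval = get_sd_balance_interval(sd, busy);
 *       if (time_after_eq(jiffies, sd->last_balance + interval))
 *           load_balance(cpu, rq, sd, idle, &continue_balancing);
 *   }
 *
 * newidle_balance():
 *   for_each_domain(this_cpu, sd) {
 *       if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost)
 *           break;                      // balancing would cost more than the
 *                                       // expected idle time
 *       ...
 *   }
 */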

> Anyway. I think I am hitting a bug here. The behavior doesn't look right to me
> given the delays I'm seeing and the fact we do the ilb but for some reason fail
> to pull

[...]


2024-02-20 16:07:49

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] sched/fair: Check a task has a fitting cpu when updating misfit

On 02/06/24 18:17, Dietmar Eggemann wrote:
> On 06/02/2024 16:06, Qais Yousef wrote:
> > On 02/05/24 20:49, Dietmar Eggemann wrote:
> >> On 26/01/2024 02:46, Qais Yousef wrote:
> >>> On 01/25/24 18:40, Vincent Guittot wrote:
> >>>> On Wed, 24 Jan 2024 at 23:30, Qais Yousef <[email protected]> wrote:
> >>>>>
> >>>>> On 01/23/24 09:26, Vincent Guittot wrote:
> >>>>>> On Fri, 5 Jan 2024 at 23:20, Qais Yousef <[email protected]> wrote:
> >>>>>>>
> >>>>>>> From: Qais Yousef <[email protected]>
>
> [...]
>
> >>> It seems we flatten topologies but not sched domains. I see all cpus shown as
> >>> core_siblings. The DT for apple silicon sets clusters in the cpu-map - which
> >>> seems the flatten topology stuff detect LLC correctly but still keeps the
> >>> sched-domains not flattened. Is this a bug? I thought we will end up with one
> >>> sched domain still.
> >>
> >> IMHO, if you have a cpu_map entry with > 1 cluster in your dtb, you end
> >> up with MC and PKG (former DIE) Sched Domain (SD) level. And misfit load
> >
> > Hmm, okay. I thought the detection of topology where we know the LLC is shared
> > will cause the sched domains to collapse too.
> >
> >> balance takes potentially longer on PKG than to MC.
> >
> > Why potentially longer? We iterate through the domains the CPU belong to. If
> > the first iteration (at MC) pulled something, then once we go to PKG then we're
> > less likely to pull again?
>
> There are a couple of mechanisms in place to let load-balance on higher
> sd levels happen less frequently, eg:
>
> load_balance() -> should_we_balance() + continue_balancing
>
> interval = get_sd_balance_interval(sd, busy) in rebalance_domains()
>
> rq->avg_idle versus sd->max_newidle_lb_cost

Okay thanks. That last one I missed.

>
> > Anyway. I think I am hitting a bug here. The behavior doesn't look right to me
> > given the delays I'm seeing and the fact we do the ilb but for some reason fail
> > to pull
>
> [...]
>