From: Vincent Guittot
Date: Tue, 29 Mar 2022 11:03:44 +0200
Subject: Re: [PATCH 2/2] sched/fair: Simple runqueue order on migrate
To: Tim Chen
Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Yu Chen, Walter Mack, Mel Gorman, linux-kernel@vger.kernel.org

On Sat, 26 Mar 2022 at 00:52, Tim Chen wrote:
>
> From: Peter Zijlstra (Intel)
>
> There's a number of problems with SMP migration of fair tasks, but
> basically it boils down to a task not receiving equal service on each
> runqueue (consider the trivial 3 tasks 2 cpus infeasible weight
> scenario).
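[The "3 tasks 2 cpus" scenario the changelog refers to can be made concrete with a little arithmetic. The following is a toy illustration added for readers outside the thread, not part of the patch; the function names are invented:]

```c
#include <assert.h>

/* Toy arithmetic for the "3 tasks, 2 CPUs" infeasible-weight case:
 * with a static split of (2 tasks, 1 task), compute per-task CPU
 * share in percent, and the feasible fair share of 2/3 CPU each. */
static int share_pct(int tasks_on_cpu)
{
    return 100 / tasks_on_cpu;   /* equal-weight tasks split one CPU */
}

static int fair_share_pct(int ntasks, int ncpus)
{
    return ncpus * 100 / ntasks; /* 2 CPUs / 3 tasks = 66% each */
}
```

With any static split, the two co-scheduled tasks get share_pct(2) = 50% each while the lone task gets share_pct(1) = 100%, against a fair share of fair_share_pct(3, 2) = 66%. No fixed placement matches the fair share, so tasks must keep migrating between the queues, which is why placement on migration matters.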
> Fully solving that with vruntime placement is 'hard', not least
> because a task might be very under-served on a busy runqueue and
> would need to be placed so far left on the new runqueue that it would
> significantly impact latency on the existing tasks.
>
> Instead, do minimal / basic placement: when moving to a less busy
> queue, place at the front of the queue to receive time sooner. When
> moving to a busier queue, place at the end of the queue to receive
> time later.
>
> Signed-off-by: Peter Zijlstra (Intel)
> Signed-off-by: Tim Chen
> Tested-by: Chen Yu
> Tested-by: Walter Mack
> ---
>  kernel/sched/fair.c     | 33 +++++++++++++++++++++++++++++----
>  kernel/sched/features.h |  2 ++
>  2 files changed, 31 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2498e97804fd..c5d2cb3a8f42 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4223,6 +4223,27 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>  	se->vruntime = max_vruntime(se->vruntime, vruntime);
>  }
>
> +static void place_entity_migrate(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +	if (!sched_feat(PLACE_MIGRATE))
> +		return;
> +
> +	if (cfs_rq->nr_running < se->migrated) {
> +		/*
> +		 * Migrated to a shorter runqueue, go first because
> +		 * we were under-served on the old runqueue.
> +		 */
> +		se->vruntime = cfs_rq->min_vruntime;
> +		return;
> +	}
> +
> +	/*
> +	 * Migrated to a longer runqueue, go last because
> +	 * we got over-served on the old runqueue.
> +	 */
> +	se->vruntime = cfs_rq->min_vruntime + sched_vslice(cfs_rq, se);
> +}
> +
>  static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
>
>  static inline bool cfs_bandwidth_used(void);
> @@ -4296,6 +4317,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>
>  	if (flags & ENQUEUE_WAKEUP)
>  		place_entity(cfs_rq, se, 0);
> +	else if (se->migrated)
> +		place_entity_migrate(cfs_rq, se);
>
>  	check_schedstat_required();
>  	update_stats_enqueue_fair(cfs_rq, se, flags);
> @@ -6930,6 +6953,7 @@ static void detach_entity_cfs_rq(struct sched_entity *se);
>   */
>  static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>  {
> +	struct sched_entity *se = &p->se;
>  	/*
>  	 * As blocked tasks retain absolute vruntime the migration needs to
>  	 * deal with this by subtracting the old and adding the new
> @@ -6962,7 +6986,7 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>  	 * rq->lock and can modify state directly.
>  	 */
>  	lockdep_assert_rq_held(task_rq(p));
> -	detach_entity_cfs_rq(&p->se);
> +	detach_entity_cfs_rq(se);
>
>  	} else {
>  		/*
> @@ -6973,14 +6997,15 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>  		 * wakee task is less decayed, but giving the wakee more load
>  		 * sounds not bad.
>  		 */
> -		remove_entity_load_avg(&p->se);
> +		remove_entity_load_avg(se);
>  	}
>
>  	/* Tell new CPU we are migrated */
> -	p->se.avg.last_update_time = 0;
> +	se->avg.last_update_time = 0;
>
>  	/* We have migrated, no longer consider this task hot */
> -	p->se.migrated = 1;
> +	for_each_sched_entity(se)
> +		se->migrated = READ_ONCE(cfs_rq_of(se)->nr_running) + !se->on_rq;

Why do we need to loop on se? Isn't p->se enough?
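[For readers following along: the value that place_entity_migrate() compares against nr_running is recorded at migration time from the source queue. A toy model of that encoding, with invented names rather than kernel code:]

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of how the patch encodes queue length in
 * se->migrated: the source runqueue's nr_running, plus 1 if the
 * migrating task was not itself enqueued there (!se->on_rq). */
static unsigned int encode_migrated(unsigned int src_nr_running, bool on_rq)
{
    return src_nr_running + (on_rq ? 0U : 1U);
}

/* The destination-side test from place_entity_migrate(): the new
 * queue counts as "shorter" when it has fewer tasks than the value
 * recorded when the task left its old queue. */
static bool dest_is_shorter(unsigned int dst_nr_running, unsigned int migrated)
{
    return dst_nr_running < migrated;
}
```

A nonzero value also doubles as the "recently migrated" flag tested by enqueue_entity(), which is how a single field drives both the feature gate and the length comparison.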
>
>  	update_scan_period(p, new_cpu);
>  }
>
> diff --git a/kernel/sched/features.h b/kernel/sched/features.h
> index 1cf435bbcd9c..681c84fd062c 100644
> --- a/kernel/sched/features.h
> +++ b/kernel/sched/features.h
> @@ -100,3 +100,5 @@ SCHED_FEAT(LATENCY_WARN, false)
>
>  SCHED_FEAT(ALT_PERIOD, true)
>  SCHED_FEAT(BASE_SLICE, true)
> +
> +SCHED_FEAT(PLACE_MIGRATE, true)
> --
> 2.32.0
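[The placement rule described in the changelog can be sketched outside the kernel as follows. This is a simplified userspace model, not the kernel code: the toy_* types and the fixed 4ms vslice are stand-ins for the real cfs_rq / sched_entity fields and sched_vslice(), and the feature-flag check is dropped.]

```c
#include <assert.h>

/* Userspace sketch of the patch's placement-on-migration rule. */
struct toy_cfs_rq {
    unsigned int  nr_running;    /* tasks queued on this runqueue */
    unsigned long min_vruntime;  /* leftmost vruntime on this queue */
};

struct toy_entity {
    unsigned long vruntime;
    unsigned int  migrated;      /* source queue length recorded at migration */
};

/* Fixed stand-in for sched_vslice(): one 4ms slice, in nanoseconds.
 * The real helper scales the slice by entity weight. */
static unsigned long toy_vslice(void)
{
    return 4000000UL;
}

static void toy_place_migrate(struct toy_cfs_rq *cfs_rq, struct toy_entity *se)
{
    if (cfs_rq->nr_running < se->migrated) {
        /* Shorter runqueue than the one we left: we were
         * under-served there, so go to the front. */
        se->vruntime = cfs_rq->min_vruntime;
        return;
    }
    /* Longer (or equally long) runqueue: we were over-served,
     * so queue up behind roughly one slice of work. */
    se->vruntime = cfs_rq->min_vruntime + toy_vslice();
}
```

Placing at min_vruntime makes the entity the leftmost (soonest-run) task, while min_vruntime plus one vslice pushes it behind roughly one round of the queue; that mirrors the "receive time sooner / later" wording in the changelog.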