From: Tim Chen
To: Peter Zijlstra, Vincent Guittot, Ingo Molnar, Juri Lelli
Cc: Yu Chen, Walter Mack, Mel Gorman, linux-kernel@vger.kernel.org, Tim Chen
Subject: [PATCH 2/2] sched/fair: Simple runqueue order on migrate
Date: Fri, 25 Mar 2022 15:54:17 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peter Zijlstra (Intel)

There's a number of problems with SMP migration of fair tasks, but
basically it boils down to a task not receiving equal service on each
runqueue (consider the trivial 3 tasks, 2 CPUs infeasible weight
scenario).

Fully solving that with vruntime placement is 'hard', not least
because a task might be very under-served on a busy runqueue and would
need to be placed so far left on the new runqueue that it would
significantly impact latency on the existing tasks.

Instead, do minimal / basic placement: when moving to a less busy
queue, place at the front of the queue to receive time sooner. When
moving to a busier queue, place at the end of the queue to receive
time later.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Tim Chen
Tested-by: Chen Yu
Tested-by: Walter Mack
---
 kernel/sched/fair.c     | 33 +++++++++++++++++++++++++++++----
 kernel/sched/features.h |  2 ++
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2498e97804fd..c5d2cb3a8f42 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4223,6 +4223,27 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 	se->vruntime = max_vruntime(se->vruntime, vruntime);
 }
 
+static void place_entity_migrate(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	if (!sched_feat(PLACE_MIGRATE))
+		return;
+
+	if (cfs_rq->nr_running < se->migrated) {
+		/*
+		 * Migrated to a shorter runqueue, go first because
+		 * we were under-served on the old runqueue.
+		 */
+		se->vruntime = cfs_rq->min_vruntime;
+		return;
+	}
+
+	/*
+	 * Migrated to a longer runqueue, go last because
+	 * we got over-served on the old runqueue.
+	 */
+	se->vruntime = cfs_rq->min_vruntime + sched_vslice(cfs_rq, se);
+}
+
 static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
 
 static inline bool cfs_bandwidth_used(void);
@@ -4296,6 +4317,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 
 	if (flags & ENQUEUE_WAKEUP)
 		place_entity(cfs_rq, se, 0);
+	else if (se->migrated)
+		place_entity_migrate(cfs_rq, se);
 
 	check_schedstat_required();
 	update_stats_enqueue_fair(cfs_rq, se, flags);
@@ -6930,6 +6953,7 @@ static void detach_entity_cfs_rq(struct sched_entity *se);
  */
 static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 {
+	struct sched_entity *se = &p->se;
 	/*
 	 * As blocked tasks retain absolute vruntime the migration needs to
 	 * deal with this by subtracting the old and adding the new
@@ -6962,7 +6986,7 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 		 * rq->lock and can modify state directly.
 		 */
 		lockdep_assert_rq_held(task_rq(p));
-		detach_entity_cfs_rq(&p->se);
+		detach_entity_cfs_rq(se);
 
 	} else {
 		/*
@@ -6973,14 +6997,15 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 		 * wakee task is less decayed, but giving the wakee more load
 		 * sounds not bad.
 		 */
-		remove_entity_load_avg(&p->se);
+		remove_entity_load_avg(se);
 	}
 
 	/* Tell new CPU we are migrated */
-	p->se.avg.last_update_time = 0;
+	se->avg.last_update_time = 0;
 
 	/* We have migrated, no longer consider this task hot */
-	p->se.migrated = 1;
+	for_each_sched_entity(se)
+		se->migrated = READ_ONCE(cfs_rq_of(se)->nr_running) + !se->on_rq;
 
 	update_scan_period(p, new_cpu);
 }
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 1cf435bbcd9c..681c84fd062c 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -100,3 +100,5 @@ SCHED_FEAT(LATENCY_WARN, false)
 
 SCHED_FEAT(ALT_PERIOD, true)
 SCHED_FEAT(BASE_SLICE, true)
+
+SCHED_FEAT(PLACE_MIGRATE, true)
-- 
2.32.0
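
For readers who want to poke at the placement rule outside the kernel, here is a rough userspace sketch of what the patch above does on migration. It is not kernel code: `toy_cfs_rq`, `toy_se`, `toy_migrate_snapshot()`, `toy_place_entity_migrate()` and the fixed `TOY_VSLICE` are all made-up stand-ins (the patch uses `sched_vslice(cfs_rq, se)` rather than a constant), and the `PLACE_MIGRATE` feature check is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct cfs_rq / struct sched_entity. */
struct toy_cfs_rq {
	unsigned int nr_running;	/* entities queued on this runqueue */
	uint64_t min_vruntime;		/* left edge of the queue */
};

struct toy_se {
	unsigned int migrated;		/* 0: not migrated, else source queue length */
	uint64_t vruntime;
};

/* Assumed constant slice; the patch computes sched_vslice(cfs_rq, se). */
#define TOY_VSLICE 3000000ULL

/*
 * Mirrors the snapshot taken in migrate_task_rq_fair(): source queue
 * length, plus one if the entity was not queued there at the time.
 */
static unsigned int toy_migrate_snapshot(unsigned int src_nr_running, int on_rq)
{
	return src_nr_running + !on_rq;
}

/* Mirrors place_entity_migrate(): front if the new queue is shorter, else back. */
static void toy_place_entity_migrate(const struct toy_cfs_rq *cfs_rq,
				     struct toy_se *se)
{
	/* The patch only calls this when se->migrated is non-zero. */
	assert(se->migrated != 0);

	if (cfs_rq->nr_running < se->migrated) {
		/* Shorter destination: go first. */
		se->vruntime = cfs_rq->min_vruntime;
		return;
	}

	/* Longer (or equal) destination: go last, one slice past the left edge. */
	se->vruntime = cfs_rq->min_vruntime + TOY_VSLICE;
}
```

With a source queue of 3 and a destination of 1, the entity lands at `min_vruntime`; with a source snapshot of 2 and a destination of 5, it lands at `min_vruntime + TOY_VSLICE`. Note that an equal-length destination takes the "go last" branch, since the comparison is a strict `<`.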