Subject: Re: [RFC PATCH v4 00/19] Core scheduling v4
From: "Li, Aubrey"
To: Tim Chen, Vineeth Remanan Pillai, Aubrey Li
Cc: Aaron Lu, Julien Desfossez, Nishanth Aravamudan, Peter Zijlstra,
    Ingo Molnar, Thomas Gleixner, Paul Turner, Linus Torvalds,
    Linux List Kernel Mailing, Dario Faggioli, Frédéric Weisbecker,
    Kees Cook, Greg Kerr, Phil Auld, Valentin Schneider, Mel Gorman,
    Pawan Gupta, Paolo Bonzini
Date: Wed, 4 Mar 2020 07:54:39 +0800
Message-ID: <67e46f79-51c2-5b69-71c6-133ec10b68c4@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/3/3 22:59, Li, Aubrey wrote:
> On 2020/2/29 7:55, Tim Chen wrote:
>> On 2/26/20 1:54 PM, Vineeth Remanan Pillai wrote:
>>
>>> rq->curr being NULL can mean that the sibling is idle or forced idle.
>>> In both cases, I think it makes sense to migrate a task so that it
>>> can compete with the other sibling for a chance to run. This
>>> function, can_migrate_task, actually only says whether the task is
>>> eligible; a later part of the code decides whether to migrate it
>>> based on factors like load, util, and capacity. So I think it's fine
>>> to declare the task eligible if the dest core is running idle. Does
>>> this thinking make sense?
>>>
>>> In our testing, this change did not show much degradation in
>>> performance. I am reworking the fix by removing the check for
>>> task_est_util; it doesn't seem valid to check util when deciding
>>> whether to migrate the task.
>>>
>>
>> In Aaron's test case, there is a great imbalance between the load on
>> one core, where all the grp A tasks are, and the other cores, where
>> the grp B tasks are spread around. Normally, the load balancer would
>> move the grp A tasks.
>>
>> Aubrey's can_migrate_task patch prevented the load balancer from
>> migrating tasks if the core cookie on the target queue doesn't match.
>> The thought was it would induce forced idle and reduce cpu
>> utilization if we migrate a task there.
>> That kept all the grp A tasks from getting migrated and kept the
>> imbalance indefinitely in Aaron's test case.
>>
>> Perhaps we should also look at the load imbalance between the src rq
>> and target rq. If the imbalance is big (say, two full cpu-bound
>> tasks' worth of load), we should migrate anyway despite the cookie
>> mismatch; we are willing to pay a bit of forced idle to balance the
>> load out more. I think Aubrey's patch on can_migrate_task would be
>> more friendly to Aaron's test scenario if such logic were
>> incorporated.
>>
>> In Vineeth's fix, we only look at the currently running task's weight
>> in the src and dst rq. Perhaps the load on the src and dst rq needs
>> to be considered to prevent too great an imbalance between the run
>> queues?
>
> We are trying to migrate a task; can we just use cfs.h_nr_running?
> This signal is used to find the busiest run queue as well.

How about this one? The cgroup weight issue seems fixed on my side.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f42ceec..90024cf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1767,6 +1767,8 @@ static void task_numa_compare(struct task_numa_env *env,
 	rcu_read_unlock();
 }
 
+static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p);
+
 static void task_numa_find_cpu(struct task_numa_env *env,
 				long taskimp, long groupimp)
 {
@@ -5650,6 +5652,44 @@ static struct sched_group *
 find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 		  int this_cpu, int sd_flag);
 
+#ifdef CONFIG_SCHED_CORE
+static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
+{
+	struct rq *src_rq = task_rq(p);
+	bool idle_core = true;
+	int cpu;
+
+	/* Ignore cookie match if core scheduler is not enabled on the CPU. */
+	if (!sched_core_enabled(rq))
+		return true;
+
+	if (rq->core->core_cookie == p->core_cookie)
+		return true;
+
+	for_each_cpu(cpu, cpu_smt_mask(cpu_of(rq))) {
+		if (!available_idle_cpu(cpu)) {
+			idle_core = false;
+			break;
+		}
+	}
+	/*
+	 * A CPU in an idle core is always the best choice for tasks with
+	 * cookies.
+	 */
+	if (idle_core)
+		return true;
+
+	/*
+	 * Ignore cookie match if there is a big imbalance between the src rq
+	 * and dst rq.
+	 */
+	if ((src_rq->cfs.h_nr_running - rq->cfs.h_nr_running) > 1)
+		return true;
+
+	return false;
+}
+#endif
+
 /*
  * find_idlest_group_cpu - find the idlest CPU among the CPUs in the group.
  */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 7ae6858..8c607e9 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1061,28 +1061,6 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
 	return &rq->__lock;
 }
 
-static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
-{
-	bool idle_core = true;
-	int cpu;
-
-	/* Ignore cookie match if core scheduler is not enabled on the CPU. */
-	if (!sched_core_enabled(rq))
-		return true;
-
-	for_each_cpu(cpu, cpu_smt_mask(cpu_of(rq))) {
-		if (!available_idle_cpu(cpu)) {
-			idle_core = false;
-			break;
-		}
-	}
-	/*
-	 * A CPU in an idle core is always the best choice for tasks with
-	 * cookies.
-	 */
-	return idle_core || rq->core->core_cookie == p->core_cookie;
-}
-
 extern void queue_core_balance(struct rq *rq);
 
 void sched_core_add(struct rq *rq, struct task_struct *p);