Subject: Re: [PATCH v2 2/5] sched/numa: Replace runnable_load_avg by load_avg
To: Vincent Guittot <vincent.guittot@linaro.org>, mingo@redhat.com,
 peterz@infradead.org, juri.lelli@redhat.com, dietmar.eggemann@arm.com,
 rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
 linux-kernel@vger.kernel.org
Cc: pauld@redhat.com, parth@linux.ibm.com, hdanton@sina.com
References: <20200214152729.6059-1-vincent.guittot@linaro.org>
 <20200214152729.6059-3-vincent.guittot@linaro.org>
From: Valentin Schneider
Date: Tue, 18 Feb 2020 14:54:14 +0000
In-Reply-To: <20200214152729.6059-3-vincent.guittot@linaro.org>
On 2/14/20 3:27 PM, Vincent Guittot wrote:
> @@ -1473,38 +1473,35 @@ bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
>  		group_faults_cpu(ng, src_nid) * group_faults(p, dst_nid) * 4;
>  }
>  
> -static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq);
> -
> -static unsigned long cpu_runnable_load(struct rq *rq)
> -{
> -	return cfs_rq_runnable_load_avg(&rq->cfs);
> -}
> +/*
> + * 'numa_type' describes the node at the moment of load balancing.
> + */
> +enum numa_type {
> +	/* The node has spare capacity that can be used to run more tasks. */
> +	node_has_spare = 0,
> +	/*
> +	 * The node is fully used and the tasks don't compete for more CPU
> +	 * cycles. Nevertheless, some tasks might wait before running.
> +	 */
> +	node_fully_busy,
> +	/*
> +	 * The node is overloaded and can't provide expected CPU cycles to all
> +	 * tasks.
> +	 */
> +	node_overloaded
> +};

Could we reuse group_type instead? The definitions are the same modulo
s/group/node/.
> 
>  /* Cached statistics for all CPUs within a node */
>  struct numa_stats {
>  	unsigned long load;
> -
> +	unsigned long util;
>  	/* Total compute capacity of CPUs on a node */
>  	unsigned long compute_capacity;
> +	unsigned int nr_running;
> +	unsigned int weight;
> +	enum numa_type node_type;
>  };
>  
> -/*
> - * XXX borrowed from update_sg_lb_stats
> - */
> -static void update_numa_stats(struct numa_stats *ns, int nid)
> -{
> -	int cpu;
> -
> -	memset(ns, 0, sizeof(*ns));
> -	for_each_cpu(cpu, cpumask_of_node(nid)) {
> -		struct rq *rq = cpu_rq(cpu);
> -
> -		ns->load += cpu_runnable_load(rq);
> -		ns->compute_capacity += capacity_of(cpu);
> -	}
> -
> -}
> -
>  struct task_numa_env {
>  	struct task_struct *p;
>  
> @@ -1521,6 +1518,47 @@ struct task_numa_env {
>  	int best_cpu;
>  };
>  
> +static unsigned long cpu_load(struct rq *rq);
> +static unsigned long cpu_util(int cpu);
> +
> +static inline enum
> +numa_type numa_classify(unsigned int imbalance_pct,
> +			struct numa_stats *ns)
> +{
> +	if ((ns->nr_running > ns->weight) &&
> +	    ((ns->compute_capacity * 100) < (ns->util * imbalance_pct)))
> +		return node_overloaded;
> +
> +	if ((ns->nr_running < ns->weight) ||
> +	    ((ns->compute_capacity * 100) > (ns->util * imbalance_pct)))
> +		return node_has_spare;
> +
> +	return node_fully_busy;
> +}
> +

As Mel pointed out, this is group_is_overloaded() and group_has_capacity().

@Mel, you mentioned having a common helper, do you have that lying around?
I haven't seen it in your reconciliation series.

What I'm naively thinking here is that we could either move the whole thing
over to sg_lb_stats (AFAICT the fields of numa_stats are a subset of it),
or, if we really care about the stack, tweak the ordering to ensure we can
cast one into the other (not too enticed by that one, though).
> +/*
> + * XXX borrowed from update_sg_lb_stats
> + */
> +static void update_numa_stats(struct task_numa_env *env,
> +			      struct numa_stats *ns, int nid)
> +{
> +	int cpu;
> +
> +	memset(ns, 0, sizeof(*ns));
> +	for_each_cpu(cpu, cpumask_of_node(nid)) {
> +		struct rq *rq = cpu_rq(cpu);
> +
> +		ns->load += cpu_load(rq);
> +		ns->util += cpu_util(cpu);
> +		ns->nr_running += rq->cfs.h_nr_running;
> +		ns->compute_capacity += capacity_of(cpu);
> +	}
> +
> +	ns->weight = cpumask_weight(cpumask_of_node(nid));
> +
> +	ns->node_type = numa_classify(env->imbalance_pct, ns);
> +}
> +
>  static void task_numa_assign(struct task_numa_env *env,
>  			     struct task_struct *p, long imp)
>  {