From: Waiman Long <longman@redhat.com>
To: Tejun Heo, Li Zefan, Johannes Weiner, Peter Zijlstra, Ingo Molnar
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, kernel-team@fb.com, pjt@google.com,
    luto@amacapital.net, Mike Galbraith, torvalds@linux-foundation.org,
    Roman Gushchin, Juri Lelli, Patrick Bellasi, Waiman Long
Subject: [PATCH v13 08/11] cpuset: Make generate_sched_domains() work with partition
Date: Fri, 12 Oct 2018 13:55:48 -0400
Message-Id: <1539366951-8498-9-git-send-email-longman@redhat.com>
In-Reply-To: <1539366951-8498-1-git-send-email-longman@redhat.com>
References:
<1539366951-8498-1-git-send-email-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The generate_sched_domains() function is modified to make it work
correctly with the newly introduced subparts_cpus mask for scheduling
domains generation.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 578e6ae..c52074e 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -763,13 +763,14 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	int ndoms = 0;		/* number of sched domains in result */
 	int nslot;		/* next empty doms[] struct cpumask slot */
 	struct cgroup_subsys_state *pos_css;
+	bool root_load_balance = is_sched_load_balance(&top_cpuset);
 
 	doms = NULL;
 	dattr = NULL;
 	csa = NULL;
 
 	/* Special case for the 99% of systems with one, full, sched domain */
-	if (is_sched_load_balance(&top_cpuset)) {
+	if (root_load_balance && !top_cpuset.nr_subparts_cpus) {
 		ndoms = 1;
 		doms = alloc_sched_domains(ndoms);
 		if (!doms)
@@ -792,6 +793,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	csn = 0;
 
 	rcu_read_lock();
+	if (root_load_balance)
+		csa[csn++] = &top_cpuset;
 	cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) {
 		if (cp == &top_cpuset)
 			continue;
@@ -802,6 +805,9 @@ static int generate_sched_domains(cpumask_var_t **domains,
 		 * parent's cpus, so just skip them, and then we call
 		 * update_domain_attr_tree() to calc relax_domain_level of
 		 * the corresponding sched domain.
+		 *
+		 * If root is load-balancing, we can skip @cp if it
+		 * is a subset of the root's effective_cpus.
 		 */
 		if (!cpumask_empty(cp->cpus_allowed) &&
 		    !(is_sched_load_balance(cp) &&
@@ -809,11 +815,16 @@ static int generate_sched_domains(cpumask_var_t **domains,
 		      housekeeping_cpumask(HK_FLAG_DOMAIN))))
 			continue;
 
+		if (root_load_balance &&
+		    cpumask_subset(cp->cpus_allowed, top_cpuset.effective_cpus))
+			continue;
+
 		if (is_sched_load_balance(cp))
 			csa[csn++] = cp;
 
-		/* skip @cp's subtree */
-		pos_css = css_rightmost_descendant(pos_css);
+		/* skip @cp's subtree if not a partition root */
+		if (!is_partition_root(cp))
+			pos_css = css_rightmost_descendant(pos_css);
 	}
 	rcu_read_unlock();
@@ -941,7 +952,12 @@ static void rebuild_sched_domains_locked(void)
 	 * passing doms with offlined cpu to partition_sched_domains().
 	 * Anyways, hotplug work item will rebuild sched domains.
 	 */
-	if (!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
+	if (!top_cpuset.nr_subparts_cpus &&
+	    !cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
+		goto out;
+
+	if (top_cpuset.nr_subparts_cpus &&
+	    !cpumask_subset(top_cpuset.effective_cpus, cpu_active_mask))
 		goto out;
 
 	/* Generate domain masks and attrs */
@@ -1356,11 +1372,15 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp)
 		update_tasks_cpumask(cp);
 
 		/*
-		 * If the effective cpumask of any non-empty cpuset is changed,
-		 * we need to rebuild sched domains.
+		 * On legacy hierarchy, if the effective cpumask of any non-
+		 * empty cpuset is changed, we need to rebuild sched domains.
+		 * On default hierarchy, the cpuset needs to be a partition
+		 * root as well.
 		 */
 		if (!cpumask_empty(cp->cpus_allowed) &&
-		    is_sched_load_balance(cp))
+		    is_sched_load_balance(cp) &&
+		    (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) ||
+		     is_partition_root(cp)))
 			need_rebuild_sched_domains = true;
 
 		rcu_read_lock();
-- 
1.8.3.1