Received: by 2002:a05:6a10:6d25:0:0:0:0 with SMTP id gq37csp1844537pxb; Mon, 13 Sep 2021 06:47:52 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyPoJsTNES0h5EstU1HVsLie+uauiPSPU1fneIkCoh1Jq5HGSyWCINpB8gBwoh5y5qJ9LJL X-Received: by 2002:a92:d2ca:: with SMTP id w10mr149103ilg.222.1631540872201; Mon, 13 Sep 2021 06:47:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1631540872; cv=none; d=google.com; s=arc-20160816; b=DDjAzpdg7f6AXhYYp77QsK887E6g4/YO0Qhbw3ZOCNv4AYgMb338YxZue0VJ0UJztx kn6TwQP4A/9eNpswoHseB+e3K69NIu+gNk0bTd7mrtEjN8rm4AZPBvYjp+aCCr+orCAy 0q+7A48X0KxC5NvJSjqa98I1PEmwO9TxDZt3nqgar8ik39SWZnmpDT53zxrUrJbuzURK 6P2XauB9sxhiBTQ1OMuAuFxKSdAVBgky1InIvTER1Iu3Xz+FGdzpupjw/c6iuU4BDwbI OCe2UikDJDKPM1bV9n/W9A3ACDlzWkH8NzsAcseGV2k+PM34tfvaPCqmL4ppUCFBCIC/ P4MQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=hbFvarmSXVOqf8YE74b5Vq0+BQlz9nmFK0n+cZ1cbYs=; b=wWqhyZ+AjHoeQd7IQRbr7qHbfjuHDSuvMMEPZysAYBPuvT7yirrrjWnd0hlA3R2jn9 kVfLpBMA30Vg0I67AYjnr+00sOwcZz5RaxedhRxF5iaQxtI6GWPrcs5a8gKlUyEZ1K2z 9+XPjIpwSooAqPA4O/sIeyFmyQ5FE9Uxwk18cKumNkQNE+im1ghfC4iwV1Ma2nYKNl1G pkjjhTbF1Jb7UrCC7en006wcSaEr0SVRCO4qbBYm42ymdZR8Vj14mR7HdG/S0xz0J1kZ F5of5bbMq+Wt+hOVWKt94yvuhbJnJPqgw79h5euhUjJvvjZTyb7mY03+CEER+2AHuPst IAfQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=X4o7y55j; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id c4si6112285iok.10.2021.09.13.06.47.39; Mon, 13 Sep 2021 06:47:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=X4o7y55j; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243226AbhIMNrm (ORCPT + 99 others); Mon, 13 Sep 2021 09:47:42 -0400 Received: from mail.kernel.org ([198.145.29.99]:43056 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243317AbhIMNmM (ORCPT ); Mon, 13 Sep 2021 09:42:12 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 445AA6135E; Mon, 13 Sep 2021 13:30:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1631539816; bh=O2FnxtmJllFLGIGlN+bgV0E7ARe/vR0+Nd/u/nZK1tM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=X4o7y55j5uwHFBgjbek/qlDS2k7mFbq9reneCxU6A3XX1lzbfQwAneMm/Olm+TFU+ btEyFFhmuHlQmy4teEp4OdJA/kCZ1uDbVymWw16IPHhyqaYR51yKwE/sf0jOqSSdTh 1fq2C8c1EFZPBn3lzAO+rXh/pSiNS8g+HsMnEEuQ= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Waiman Long , Tejun Heo , Sasha Levin Subject: [PATCH 5.10 139/236] cgroup/cpuset: Fix violation of cpuset locking rule Date: Mon, 13 Sep 2021 15:14:04 +0200 Message-Id: <20210913131105.089517158@linuxfoundation.org> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20210913131100.316353015@linuxfoundation.org> References: <20210913131100.316353015@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Waiman Long [ Upstream commit 6ba34d3c73674e46d9e126e4f0cee79e5ef2481c ] The cpuset fields that manage partition root state do not strictly follow the cpuset locking rule that update to cpuset has to be done with both the callback_lock and cpuset_mutex held. This is now fixed by making sure that the locking rule is upheld. Fixes: 3881b86128d0 ("cpuset: Add an error state to cpuset.sched.partition") Fixes: 4b842da276a8 ("cpuset: Make CPU hotplug work with partition") Signed-off-by: Waiman Long Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin --- kernel/cgroup/cpuset.c | 58 +++++++++++++++++++++++++----------------- 1 file changed, 35 insertions(+), 23 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 190355aae7ee..1999fcec45c7 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1148,6 +1148,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, struct cpuset *parent = parent_cs(cpuset); int adding; /* Moving cpus from effective_cpus to subparts_cpus */ int deleting; /* Moving cpus from subparts_cpus to effective_cpus */ + int new_prs; bool part_error = false; /* Partition error? */ percpu_rwsem_assert_held(&cpuset_rwsem); @@ -1183,6 +1184,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, * A cpumask update cannot make parent's effective_cpus become empty. */ adding = deleting = false; + new_prs = cpuset->partition_root_state; if (cmd == partcmd_enable) { cpumask_copy(tmp->addmask, cpuset->cpus_allowed); adding = true; @@ -1247,11 +1249,11 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, switch (cpuset->partition_root_state) { case PRS_ENABLED: if (part_error) - cpuset->partition_root_state = PRS_ERROR; + new_prs = PRS_ERROR; break; case PRS_ERROR: if (!part_error) - cpuset->partition_root_state = PRS_ENABLED; + new_prs = PRS_ENABLED; break; } /* @@ -1260,10 +1262,10 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, part_error = (prev_prs == PRS_ERROR); } - if (!part_error && (cpuset->partition_root_state == PRS_ERROR)) + if (!part_error && (new_prs == PRS_ERROR)) return 0; /* Nothing need to be done */ - if (cpuset->partition_root_state == PRS_ERROR) { + if (new_prs == PRS_ERROR) { /* * Remove all its cpus from parent's subparts_cpus. */ @@ -1272,7 +1274,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, parent->subparts_cpus); } - if (!adding && !deleting) + if (!adding && !deleting && (new_prs == cpuset->partition_root_state)) return 0; /* @@ -1299,6 +1301,9 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, } parent->nr_subparts_cpus = cpumask_weight(parent->subparts_cpus); + + if (cpuset->partition_root_state != new_prs) + cpuset->partition_root_state = new_prs; spin_unlock_irq(&callback_lock); return cmd == partcmd_update; @@ -1321,6 +1326,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) struct cpuset *cp; struct cgroup_subsys_state *pos_css; bool need_rebuild_sched_domains = false; + int new_prs; rcu_read_lock(); cpuset_for_each_descendant_pre(cp, pos_css, cs) { @@ -1360,7 +1366,8 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) * update_tasks_cpumask() again for tasks in the parent * cpuset if the parent's subparts_cpus changes. */ - if ((cp != cs) && cp->partition_root_state) { + new_prs = cp->partition_root_state; + if ((cp != cs) && new_prs) { switch (parent->partition_root_state) { case PRS_DISABLED: /* @@ -1370,7 +1377,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) */ WARN_ON_ONCE(cp->partition_root_state != PRS_ERROR); - cp->partition_root_state = PRS_DISABLED; + new_prs = PRS_DISABLED; /* * clear_bit() is an atomic operation and @@ -1391,11 +1398,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) /* * When parent is invalid, it has to be too. */ - cp->partition_root_state = PRS_ERROR; - if (cp->nr_subparts_cpus) { - cp->nr_subparts_cpus = 0; - cpumask_clear(cp->subparts_cpus); - } + new_prs = PRS_ERROR; break; } } @@ -1407,8 +1410,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) spin_lock_irq(&callback_lock); cpumask_copy(cp->effective_cpus, tmp->new_cpus); - if (cp->nr_subparts_cpus && - (cp->partition_root_state != PRS_ENABLED)) { + if (cp->nr_subparts_cpus && (new_prs != PRS_ENABLED)) { cp->nr_subparts_cpus = 0; cpumask_clear(cp->subparts_cpus); } else if (cp->nr_subparts_cpus) { @@ -1435,6 +1437,10 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) = cpumask_weight(cp->subparts_cpus); } } + + if (new_prs != cp->partition_root_state) + cp->partition_root_state = new_prs; + spin_unlock_irq(&callback_lock); WARN_ON(!is_in_v2_mode() && @@ -1944,25 +1950,25 @@ out: */ static int update_prstate(struct cpuset *cs, int new_prs) { - int err; + int err, old_prs = cs->partition_root_state; struct cpuset *parent = parent_cs(cs); struct tmpmasks tmpmask; - if (new_prs == cs->partition_root_state) + if (old_prs == new_prs) return 0; /* * Cannot force a partial or invalid partition root to a full * partition root. */ - if (new_prs && (cs->partition_root_state < 0)) + if (new_prs && (old_prs == PRS_ERROR)) return -EINVAL; if (alloc_cpumasks(NULL, &tmpmask)) return -ENOMEM; err = -EINVAL; - if (!cs->partition_root_state) { + if (!old_prs) { /* * Turning on partition root requires setting the * CS_CPU_EXCLUSIVE bit implicitly as well and cpus_allowed @@ -1981,14 +1987,12 @@ static int update_prstate(struct cpuset *cs, int new_prs) update_flag(CS_CPU_EXCLUSIVE, cs, 0); goto out; } - cs->partition_root_state = PRS_ENABLED; } else { /* * Turning off partition root will clear the * CS_CPU_EXCLUSIVE bit. */ - if (cs->partition_root_state == PRS_ERROR) { - cs->partition_root_state = PRS_DISABLED; + if (old_prs == PRS_ERROR) { update_flag(CS_CPU_EXCLUSIVE, cs, 0); err = 0; goto out; @@ -1999,8 +2003,6 @@ static int update_prstate(struct cpuset *cs, int new_prs) if (err) goto out; - cs->partition_root_state = PRS_DISABLED; - /* Turning off CS_CPU_EXCLUSIVE will not return error */ update_flag(CS_CPU_EXCLUSIVE, cs, 0); } @@ -2017,6 +2019,12 @@ static int update_prstate(struct cpuset *cs, int new_prs) rebuild_sched_domains_locked(); out: + if (!err) { + spin_lock_irq(&callback_lock); + cs->partition_root_state = new_prs; + spin_unlock_irq(&callback_lock); + } + free_cpumasks(NULL, &tmpmask); return err; } @@ -3080,8 +3088,10 @@ retry: if (is_partition_root(cs) && (cpumask_empty(&new_cpus) || (parent->partition_root_state == PRS_ERROR))) { if (cs->nr_subparts_cpus) { + spin_lock_irq(&callback_lock); cs->nr_subparts_cpus = 0; cpumask_clear(cs->subparts_cpus); + spin_unlock_irq(&callback_lock); compute_effective_cpumask(&new_cpus, cs, parent); } @@ -3095,7 +3105,9 @@ retry: cpumask_empty(&new_cpus)) { update_parent_subparts_cpumask(cs, partcmd_disable, NULL, tmp); + spin_lock_irq(&callback_lock); cs->partition_root_state = PRS_ERROR; + spin_unlock_irq(&callback_lock); } cpuset_force_rebuild(); } -- 2.30.2