From: Waiman Long
To: Tejun Heo, Zefan Li, Johannes Weiner, Frederic Weisbecker,
    Jonathan Corbet, "Paul E. McKenney", Neeraj Upadhyay, Joel Fernandes,
    Josh Triplett, Boqun Feng, Steven Rostedt, Mathieu Desnoyers,
    Lai Jiangshan, Zqiang, Davidlohr Bueso, Shuah Khan
Cc: cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
    linux-kselftest@vger.kernel.org, Mrunal Patel, Ryan Phillips,
    Brent Rowsell, Peter Hunt, Cestmir Kalina, Nicolas Saenz Julienne,
    Alex Gladkov, Marcelo Tosatti, Phil Auld, Paul Gortmaker,
    Daniel Bristot de Oliveira, Juri Lelli, Peter Zijlstra,
    Costa Shulyupin, Waiman Long
Subject: [RFC PATCH 4/8] cgroup/cpuset: Better tracking of addition/deletion of isolated CPUs
Date: Wed, 17 Jan 2024 11:35:07 -0500
Message-Id: <20240117163511.88173-5-longman@redhat.com>
In-Reply-To: <20240117163511.88173-1-longman@redhat.com>
References: <20240117163511.88173-1-longman@redhat.com>

The process of updating the workqueue unbound cpumask to exclude isolated
CPUs in cpuset requires only the aggregated isolated_cpus cpumask. Other
types of CPU isolation, like the RCU no-callback CPU mode, may need to
know which isolated CPUs are being added or deleted. To enable these
types of CPU isolation at run time, we need better tracking of the
addition and deletion of isolated CPUs.

This patch adds a new isolated_cpus_modifiers enum type to track the
addition and deletion of isolated CPUs, and renames
update_unbound_workqueue_cpumask() to update_isolation_cpumasks() to
accommodate additional CPU isolation modes in the future. There is no
functional change.

Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 113 +++++++++++++++++++++++++----------------
 1 file changed, 69 insertions(+), 44 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index dfbb16aca9f4..0479af76a5dc 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -206,6 +206,13 @@ struct cpuset {
  */
 static cpumask_var_t subpartitions_cpus;
 
+/* Enum types for possible changes to the set of isolated CPUs */
+enum isolated_cpus_modifiers {
+	ISOL_CPUS_NONE = 0,
+	ISOL_CPUS_ADD,
+	ISOL_CPUS_DELETE,
+};
+
 /*
  * Exclusive CPUs in isolated partitions
  */
@@ -1446,14 +1453,14 @@ static void partition_xcpus_newstate(int old_prs, int new_prs, struct cpumask *x
  * @new_prs: new partition_root_state
  * @parent: parent cpuset
  * @xcpus: exclusive CPUs to be added
- * Return: true if isolated_cpus modified, false otherwise
+ * Return: isolated_cpus modifier
  *
  * Remote partition if parent == NULL
  */
-static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
-				struct cpumask *xcpus)
+static int partition_xcpus_add(int new_prs, struct cpuset *parent,
+			       struct cpumask *xcpus)
 {
-	bool isolcpus_updated;
+	int icpus_mod = ISOL_CPUS_NONE;
 
 	WARN_ON_ONCE(new_prs < 0);
 	lockdep_assert_held(&callback_lock);
@@ -1464,13 +1471,14 @@ static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
 	if (parent == &top_cpuset)
 		cpumask_or(subpartitions_cpus, subpartitions_cpus, xcpus);
 
-	isolcpus_updated = (new_prs != parent->partition_root_state);
-	if (isolcpus_updated)
+	if (new_prs != parent->partition_root_state) {
 		partition_xcpus_newstate(parent->partition_root_state, new_prs,
 					 xcpus);
-
+		icpus_mod = (new_prs == PRS_ISOLATED)
+			  ? ISOL_CPUS_ADD : ISOL_CPUS_DELETE;
+	}
 	cpumask_andnot(parent->effective_cpus, parent->effective_cpus, xcpus);
-	return isolcpus_updated;
+	return icpus_mod;
 }
 
 /*
@@ -1478,14 +1486,14 @@ static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
  * @old_prs: old partition_root_state
  * @parent: parent cpuset
  * @xcpus: exclusive CPUs to be removed
- * Return: true if isolated_cpus modified, false otherwise
+ * Return: isolated_cpus modifier
  *
  * Remote partition if parent == NULL
  */
-static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
+static int partition_xcpus_del(int old_prs, struct cpuset *parent,
 				struct cpumask *xcpus)
 {
-	bool isolcpus_updated;
+	int icpus_mod;
 
 	WARN_ON_ONCE(old_prs < 0);
 	lockdep_assert_held(&callback_lock);
@@ -1495,27 +1503,40 @@ static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
 	if (parent == &top_cpuset)
 		cpumask_andnot(subpartitions_cpus, subpartitions_cpus, xcpus);
 
-	isolcpus_updated = (old_prs != parent->partition_root_state);
-	if (isolcpus_updated)
+	if (old_prs != parent->partition_root_state) {
 		partition_xcpus_newstate(old_prs, parent->partition_root_state,
 					 xcpus);
-
+		icpus_mod = (old_prs == PRS_ISOLATED)
+			  ? ISOL_CPUS_DELETE : ISOL_CPUS_ADD;
+	}
 	cpumask_and(xcpus, xcpus, cpu_active_mask);
 	cpumask_or(parent->effective_cpus, parent->effective_cpus, xcpus);
-	return isolcpus_updated;
+	return icpus_mod;
 }
 
-static void update_unbound_workqueue_cpumask(bool isolcpus_updated)
+/**
+ * update_isolation_cpumasks - Add or remove CPUs to/from full isolation state
+ * @mask: cpumask of the CPUs to be added or removed
+ * @modifier: enum isolated_cpus_modifiers
+ * Return: 0 if successful, error code otherwise
+ *
+ * Workqueue unbound cpumask update is applied irrespective of isolation_full
+ * state and the whole isolated_cpus is passed. Repeated calls with the same
+ * isolated_cpus will not cause further action other than a wasted mutex
+ * lock/unlock.
+ */
+static int update_isolation_cpumasks(struct cpumask *mask, int modifier)
 {
-	int ret;
+	int err;
 
 	lockdep_assert_cpus_held();
-	if (!isolcpus_updated)
-		return;
+	if (!modifier)
+		return 0;	/* No change in isolated CPUs */
 
-	ret = workqueue_unbound_exclude_cpumask(isolated_cpus);
-	WARN_ON_ONCE(ret < 0);
+	err = workqueue_unbound_exclude_cpumask(isolated_cpus);
+	WARN_ON_ONCE(err);
+	return err;
 }
 
 /**
@@ -1577,7 +1598,7 @@ static inline bool is_local_partition(struct cpuset *cs)
 static int remote_partition_enable(struct cpuset *cs, int new_prs,
 				   struct tmpmasks *tmp)
 {
-	bool isolcpus_updated;
+	int icpus_mod;
 
 	/*
 	 * The user must have sysadmin privilege.
@@ -1600,7 +1621,7 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
 		return 0;
 
 	spin_lock_irq(&callback_lock);
-	isolcpus_updated = partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
+	icpus_mod = partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
 	list_add(&cs->remote_sibling, &remote_children);
 	if (cs->use_parent_ecpus) {
 		struct cpuset *parent = parent_cs(cs);
@@ -1609,7 +1630,7 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
 		parent->child_ecpus_count--;
 	}
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
+	update_isolation_cpumasks(tmp->new_cpus, icpus_mod);
 
 	/*
 	 * Proprogate changes in top_cpuset's effective_cpus down the hierarchy.
@@ -1630,7 +1651,7 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
  */
 static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
 {
-	bool isolcpus_updated;
+	int icpus_mod;
 
 	compute_effective_exclusive_cpumask(cs, tmp->new_cpus);
 	WARN_ON_ONCE(!is_remote_partition(cs));
@@ -1638,14 +1659,14 @@ static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
 
 	spin_lock_irq(&callback_lock);
 	list_del_init(&cs->remote_sibling);
-	isolcpus_updated = partition_xcpus_del(cs->partition_root_state,
-					       NULL, tmp->new_cpus);
+	icpus_mod = partition_xcpus_del(cs->partition_root_state, NULL,
+					tmp->new_cpus);
 	cs->partition_root_state = -cs->partition_root_state;
 	if (!cs->prs_err)
 		cs->prs_err = PERR_INVCPUS;
 	reset_partition_data(cs);
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
+	update_isolation_cpumasks(tmp->new_cpus, icpus_mod);
 
 	/*
 	 * Proprogate changes in top_cpuset's effective_cpus down the hierarchy.
@@ -1668,7 +1689,8 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *newmask,
 {
 	bool adding, deleting;
 	int prs = cs->partition_root_state;
-	int isolcpus_updated = 0;
+	int icpus_add_mod = ISOL_CPUS_NONE;
+	int icpus_del_mod = ISOL_CPUS_NONE;
 
 	if (WARN_ON_ONCE(!is_remote_partition(cs)))
 		return;
@@ -1693,12 +1715,12 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *newmask,
 
 	spin_lock_irq(&callback_lock);
 	if (adding)
-		isolcpus_updated += partition_xcpus_add(prs, NULL, tmp->addmask);
+		icpus_add_mod = partition_xcpus_add(prs, NULL, tmp->addmask);
 	if (deleting)
-		isolcpus_updated += partition_xcpus_del(prs, NULL, tmp->delmask);
+		icpus_del_mod = partition_xcpus_del(prs, NULL, tmp->delmask);
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
-
+	update_isolation_cpumasks(tmp->addmask, icpus_add_mod);
+	update_isolation_cpumasks(tmp->delmask, icpus_del_mod);
 	/*
 	 * Proprogate changes in top_cpuset's effective_cpus down the hierarchy.
 	 */
@@ -1819,7 +1841,8 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
 	int part_error = PERR_NONE;	/* Partition error? */
 	int subparts_delta = 0;
 	struct cpumask *xcpus;		/* cs effective_xcpus */
-	int isolcpus_updated = 0;
+	int icpus_add_mod = ISOL_CPUS_NONE;
+	int icpus_del_mod = ISOL_CPUS_NONE;
 	bool nocpu;
 
 	lockdep_assert_held(&cpuset_mutex);
@@ -2052,22 +2075,23 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
 		cs->nr_subparts = 0;
 	}
 
 	/*
-	 * Adding to parent's effective_cpus means deletion CPUs from cs
+	 * Adding to parent's effective_cpus means deleting CPUs from cs
 	 * and vice versa.
 	 */
 	if (adding)
-		isolcpus_updated += partition_xcpus_del(old_prs, parent,
-							tmp->addmask);
+		icpus_add_mod = partition_xcpus_del(old_prs, parent,
+						    tmp->addmask);
 	if (deleting)
-		isolcpus_updated += partition_xcpus_add(new_prs, parent,
-							tmp->delmask);
+		icpus_del_mod = partition_xcpus_add(new_prs, parent,
+						    tmp->delmask);
 	if (is_partition_valid(parent)) {
 		parent->nr_subparts += subparts_delta;
 		WARN_ON_ONCE(parent->nr_subparts < 0);
 	}
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
+	update_isolation_cpumasks(tmp->addmask, icpus_add_mod);
+	update_isolation_cpumasks(tmp->delmask, icpus_del_mod);
 
 	if ((old_prs != new_prs) && (cmd == partcmd_update))
 		update_partition_exclusive(cs, new_prs);
@@ -3044,7 +3068,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
 	int err = PERR_NONE, old_prs = cs->partition_root_state;
 	struct cpuset *parent = parent_cs(cs);
 	struct tmpmasks tmpmask;
-	bool new_xcpus_state = false;
+	int icpus_mod = ISOL_CPUS_NONE;
 
 	if (old_prs == new_prs)
 		return 0;
@@ -3096,7 +3120,8 @@ static int update_prstate(struct cpuset *cs, int new_prs)
 		/*
 		 * A change in load balance state only, no change in cpumasks.
 		 */
-		new_xcpus_state = true;
+		icpus_mod = (new_prs == PRS_ISOLATED)
+			  ? ISOL_CPUS_ADD : ISOL_CPUS_DELETE;
 	} else {
 		/*
 		 * Switching back to member is always allowed even if it
@@ -3128,10 +3153,10 @@ static int update_prstate(struct cpuset *cs, int new_prs)
 	WRITE_ONCE(cs->prs_err, err);
 	if (!is_partition_valid(cs))
 		reset_partition_data(cs);
-	else if (new_xcpus_state)
+	else if (icpus_mod)
 		partition_xcpus_newstate(old_prs, new_prs, cs->effective_xcpus);
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(new_xcpus_state);
+	update_isolation_cpumasks(cs->effective_xcpus, icpus_mod);
 
 	/* Force update if switching back to member */
 	update_cpumasks_hier(cs, &tmpmask, !new_prs ? HIER_CHECKALL : 0);
-- 
2.39.3