From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, "Paul E. McKenney",
    Paul Gortmaker, Johannes Weiner, Marcelo Tosatti, Phil Auld, Zefan Li,
    Waiman Long, Daniel Bristot de Oliveira, Nicolas Saenz Julienne,
    rcu@vger.kernel.org
Subject: [RFC PATCH 4/4] cpuset: Support RCU-NOCB toggle on v2 root partitions
Date: Thu, 26 May 2022 00:10:55 +0200
Message-Id: <20220525221055.1152307-5-frederic@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220525221055.1152307-1-frederic@kernel.org>
References: <20220525221055.1152307-1-frederic@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Introduce a new "isolation.rcu_nocb" file within a cgroup2/cpuset
directory which lets a set of CPUs either enable ("1") or disable ("0")
RCU callback offloading (aka RCU NOCB). This can override previous boot
settings made through the "rcu_nocbs=" kernel parameter.

The file is only writeable on "root" type partitions to exclude any
overlap. The deepest root type partition has the highest priority.
This means that given the following setting:

                    Top cpuset (CPUs: 0-7)
                    cpuset.isolation.rcu_nocb = 0
                              |
                              |
                    Subdirectory A (CPUs: 5-7)
                    cpuset.cpus.partition = root
                    cpuset.isolation.rcu_nocb = 0
                              |
                              |
                    Subdirectory B (CPUs: 7)
                    cpuset.cpus.partition = root
                    cpuset.isolation.rcu_nocb = 1

the result is that only CPU 7 is in rcu_nocb mode.
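
For illustration only (this is not part of the patch): the "deepest root
partition wins" rule can be modelled in userspace roughly as below. The
struct model_cpuset type and resolve_nocb_mask() are hypothetical names
made up for this sketch, CPUs are reduced to a 32-bit mask, and the boot
time "rcu_nocbs=" setting is ignored. The kernel code instead applies the
flag to each partition's effective_cpus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a cpuset; NOT the kernel's struct cpuset. */
struct model_cpuset {
	uint32_t cpus;		/* bit n set => CPU n belongs to this cpuset */
	bool partition_root;	/* models cpuset.cpus.partition == "root" */
	bool rcu_nocb;		/* models cpuset.isolation.rcu_nocb */
};

/*
 * Resolve which CPUs end up in nocb mode for a chain of nested cpusets,
 * ordered from the top cpuset (index 0) down to the deepest one.
 * Settings are applied top-down, so a deeper root partition overrides
 * whatever its ancestors decided for the CPUs it owns.
 */
static uint32_t resolve_nocb_mask(const struct model_cpuset *chain, size_t n)
{
	uint32_t nocb = 0;
	size_t i;

	assert(chain != NULL && n > 0);

	for (i = 0; i < n; i++) {
		/* Below the top cpuset, only root partitions carry their own setting. */
		if (i > 0 && !chain[i].partition_root)
			continue;

		if (chain[i].rcu_nocb)
			nocb |= chain[i].cpus;	/* offload these CPUs */
		else
			nocb &= ~chain[i].cpus;	/* de-offload these CPUs */
	}

	return nocb;
}
```

Fed with the hierarchy above (masks 0xff, 0xe0 and 0x80, with only the
last entry setting rcu_nocb), resolve_nocb_mask() returns 0x80, that is
CPU 7 alone.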

Note that the "rcu_nocbs" kernel parameter must be passed on boot, even
without a cpulist, so that nocb support is enabled.

Signed-off-by: Frederic Weisbecker
Cc: Zefan Li
Cc: Tejun Heo
Cc: Johannes Weiner
Cc: Paul E. McKenney
Cc: Phil Auld
Cc: Nicolas Saenz Julienne
Cc: Marcelo Tosatti
Cc: Paul Gortmaker
Cc: Waiman Long
Cc: Daniel Bristot de Oliveira
Cc: Peter Zijlstra
---
 kernel/cgroup/cpuset.c | 95 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 92 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 9390bfd9f1cd..2d9f019bb590 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -225,6 +225,7 @@ typedef enum {
 	CS_SCHED_LOAD_BALANCE,
 	CS_SPREAD_PAGE,
 	CS_SPREAD_SLAB,
+	CS_RCU_NOCB,
 } cpuset_flagbits_t;

 /* convenient tests for these bits */
@@ -268,6 +269,11 @@ static inline int is_spread_slab(const struct cpuset *cs)
 	return test_bit(CS_SPREAD_SLAB, &cs->flags);
 }

+static inline int is_rcu_nocb(const struct cpuset *cs)
+{
+	return test_bit(CS_RCU_NOCB, &cs->flags);
+}
+
 static inline int is_partition_root(const struct cpuset *cs)
 {
 	return cs->partition_root_state > 0;
@@ -590,6 +596,62 @@ static inline void free_cpuset(struct cpuset *cs)
 	kfree(cs);
 }

+#ifdef CONFIG_RCU_NOCB_CPU
+static int cpuset_rcu_nocb_apply(struct cpuset *root)
+{
+	int err;
+
+	if (is_rcu_nocb(root))
+		err = housekeeping_cpumask_set(root->effective_cpus, HK_TYPE_RCU);
+	else
+		err = housekeeping_cpumask_clear(root->effective_cpus, HK_TYPE_RCU);
+
+	return err;
+}
+
+static int cpuset_rcu_nocb_update(struct cpuset *cur, struct cpuset *trialcs)
+{
+	struct cgroup_subsys_state *des_css;
+	struct cpuset *des;
+	int err;
+
+	if (cur->partition_root_state != PRS_ENABLED)
+		return -EINVAL;
+
+	err = cpuset_rcu_nocb_apply(trialcs);
+	if (err < 0)
+		return err;
+
+	rcu_read_lock();
+	cpuset_for_each_descendant_pre(des, des_css, cur) {
+		if (des == cur)
+			continue;
+		if (des->partition_root_state == PRS_ENABLED)
+			break;
+		spin_lock_irq(&callback_lock);
+		if (is_rcu_nocb(trialcs))
+			set_bit(CS_RCU_NOCB, &des->flags);
+		else
+			clear_bit(CS_RCU_NOCB, &des->flags);
+		spin_unlock_irq(&callback_lock);
+	}
+	rcu_read_unlock();
+
+	return 0;
+}
+#else
+static inline int cpuset_rcu_nocb_apply(struct cpuset *root)
+{
+	return 0;
+}
+
+static inline int cpuset_rcu_nocb_update(struct cpuset *cur,
+					 struct cpuset *trialcs)
+{
+	return 0;
+}
+#endif /* #ifdef CONFIG_RCU_NOCB_CPU */
+
 /*
  * validate_change_legacy() - Validate conditions specific to legacy (v1)
  *                            behavior.
@@ -1655,6 +1717,9 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
 	if (cs->partition_root_state) {
 		struct cpuset *parent = parent_cs(cs);

+		WARN_ON_ONCE(cpuset_rcu_nocb_apply(parent) < 0);
+		WARN_ON_ONCE(cpuset_rcu_nocb_apply(cs) < 0);
+
 		/*
 		 * For partition root, update the cpumasks of sibling
 		 * cpusets if they use parent's effective_cpus.
@@ -2012,6 +2077,12 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
 	spread_flag_changed = ((is_spread_slab(cs) != is_spread_slab(trialcs))
 			|| (is_spread_page(cs) != is_spread_page(trialcs)));

+	if (is_rcu_nocb(cs) != is_rcu_nocb(trialcs)) {
+		err = cpuset_rcu_nocb_update(cs, trialcs);
+		if (err < 0)
+			goto out;
+	}
+
 	spin_lock_irq(&callback_lock);
 	cs->flags = trialcs->flags;
 	spin_unlock_irq(&callback_lock);
@@ -2365,6 +2436,7 @@ typedef enum {
 	FILE_MEMORY_PRESSURE,
 	FILE_SPREAD_PAGE,
 	FILE_SPREAD_SLAB,
+	FILE_RCU_NOCB,
 } cpuset_filetype_t;

 static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
@@ -2406,6 +2478,9 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	case FILE_SPREAD_SLAB:
 		retval = update_flag(CS_SPREAD_SLAB, cs, val);
 		break;
+	case FILE_RCU_NOCB:
+		retval = update_flag(CS_RCU_NOCB, cs, val);
+		break;
 	default:
 		retval = -EINVAL;
 		break;
@@ -2573,6 +2648,8 @@ static u64 cpuset_read_u64(struct cgroup_subsys_state *css, struct cftype *cft)
 		return is_spread_page(cs);
 	case FILE_SPREAD_SLAB:
 		return is_spread_slab(cs);
+	case FILE_RCU_NOCB:
+		return is_rcu_nocb(cs);
 	default:
 		BUG();
 	}
@@ -2803,7 +2880,14 @@ static struct cftype dfl_files[] = {
 		.private = FILE_SUBPARTS_CPULIST,
 		.flags = CFTYPE_DEBUG,
 	},
-
+#ifdef CONFIG_RCU_NOCB_CPU
+	{
+		.name = "isolation.rcu_nocb",
+		.read_u64 = cpuset_read_u64,
+		.write_u64 = cpuset_write_u64,
+		.private = FILE_RCU_NOCB,
+	},
+#endif
 	{ }	/* terminate */
 };

@@ -2861,6 +2945,8 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 		set_bit(CS_SPREAD_PAGE, &cs->flags);
 	if (is_spread_slab(parent))
 		set_bit(CS_SPREAD_SLAB, &cs->flags);
+	if (is_rcu_nocb(parent))
+		set_bit(CS_RCU_NOCB, &cs->flags);

 	cpuset_inc();

@@ -3227,12 +3313,15 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
 	if (mems_updated)
 		check_insane_mems_config(&new_mems);

-	if (is_in_v2_mode())
+	if (is_in_v2_mode()) {
 		hotplug_update_tasks(cs, &new_cpus, &new_mems,
 				     cpus_updated, mems_updated);
-	else
+		if (cpus_updated)
+			WARN_ON_ONCE(cpuset_rcu_nocb_apply(cs) < 0);
+	} else {
 		hotplug_update_tasks_legacy(cs, &new_cpus, &new_mems,
 					    cpus_updated, mems_updated);
+	}

 	percpu_up_write(&cpuset_rwsem);
 }
-- 
2.25.1