2023-11-27 19:51:33

by Waiman Long

Subject: [PATCH] cgroup/cpuset: Expose cpuset.cpus.isolated

The root-only cpuset.cpus.isolated control file shows the current set
of isolated CPUs in isolated partitions. This control file is currently
exposed only with the cgroup_debug boot command line option which also
adds the ".__DEBUG__." prefix. This is actually a useful control file if
users want to find out which CPUs are currently in an isolated state by
the cpuset controller. Remove CFTYPE_DEBUG flag for this control file and
make it available by default without any prefix.

The test_cpuset_prs.sh test script and the cgroup-v2.rst documentation
file are also updated accordingly. Minor code change is also made in
test_cpuset_prs.sh to avoid false test failure when running on debug
kernel.

Signed-off-by: Waiman Long <[email protected]>
---
Documentation/admin-guide/cgroup-v2.rst | 7 ++++
kernel/cgroup/cpuset.c | 2 +-
.../selftests/cgroup/test_cpuset_prs.sh | 32 +++++++++++--------
3 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index cf5651a11df8..30f6ff2eba47 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2316,6 +2316,13 @@ Cpuset Interface Files
treated to have an implicit value of "cpuset.cpus" in the
formation of local partition.

+ cpuset.cpus.isolated
+ A read-only and root cgroup only multiple values file.
+
+ This file shows the set of all isolated CPUs used in existing
+ isolated partitions. It will be empty if no isolated partition
+ is created.
+
cpuset.cpus.partition
A read-write single value file which exists on non-root
cpuset-enabled cgroups. This flag is owned by the parent cgroup
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 1bad4007ff4b..2a16df86c55c 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3974,7 +3974,7 @@ static struct cftype dfl_files[] = {
.name = "cpus.isolated",
.seq_show = cpuset_common_seq_show,
.private = FILE_ISOLATED_CPULIST,
- .flags = CFTYPE_ONLY_ON_ROOT | CFTYPE_DEBUG,
+ .flags = CFTYPE_ONLY_ON_ROOT,
},

{ } /* terminate */
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 7b7c4c2b6d85..b5eb1be2248c 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -508,7 +508,7 @@ dump_states()
XECPUS=$DIR/cpuset.cpus.exclusive.effective
PRS=$DIR/cpuset.cpus.partition
PCPUS=$DIR/.__DEBUG__.cpuset.cpus.subpartitions
- ISCPUS=$DIR/.__DEBUG__.cpuset.cpus.isolated
+ ISCPUS=$DIR/cpuset.cpus.isolated
[[ -e $CPUS ]] && echo "$CPUS: $(cat $CPUS)"
[[ -e $XCPUS ]] && echo "$XCPUS: $(cat $XCPUS)"
[[ -e $ECPUS ]] && echo "$ECPUS: $(cat $ECPUS)"
@@ -593,17 +593,17 @@ check_cgroup_states()

#
# Get isolated (including offline) CPUs by looking at
-# /sys/kernel/debug/sched/domains and *cpuset.cpus.isolated control file,
+# /sys/kernel/debug/sched/domains and cpuset.cpus.isolated control file,
# if available, and compare that with the expected value.
#
# Note that isolated CPUs from the sched/domains context include offline
# CPUs as well as CPUs in non-isolated 1-CPU partition. Those CPUs may
-# not be included in the *cpuset.cpus.isolated control file which contains
+# not be included in the cpuset.cpus.isolated control file which contains
# only CPUs in isolated partitions.
#
# $1 - expected isolated cpu list(s) <isolcpus1>{,<isolcpus2>}
# <isolcpus1> - expected sched/domains value
-# <isolcpus2> - *cpuset.cpus.isolated value = <isolcpus1> if not defined
+# <isolcpus2> - cpuset.cpus.isolated value = <isolcpus1> if not defined
#
check_isolcpus()
{
@@ -611,7 +611,7 @@ check_isolcpus()
ISOLCPUS=
LASTISOLCPU=
SCHED_DOMAINS=/sys/kernel/debug/sched/domains
- ISCPUS=${CGROUP2}/.__DEBUG__.cpuset.cpus.isolated
+ ISCPUS=${CGROUP2}/cpuset.cpus.isolated
if [[ $EXPECT_VAL = . ]]
then
EXPECT_VAL=
@@ -692,14 +692,18 @@ test_fail()
null_isolcpus_check()
{
[[ $VERBOSE -gt 0 ]] || return 0
- pause 0.02
- check_isolcpus "."
- if [[ $? -ne 0 ]]
- then
- echo "Unexpected isolated CPUs: $ISOLCPUS"
- dump_states
- exit 1
- fi
+ # Retry a few times before printing error
+ RETRY=0
+ while [[ $RETRY -lt 5 ]]
+ do
+ pause 0.01
+ check_isolcpus "."
+ [[ $? -eq 0 ]] && return 0
+ ((RETRY++))
+ done
+ echo "Unexpected isolated CPUs: $ISOLCPUS"
+ dump_states
+ exit 1
}

#
@@ -776,7 +780,7 @@ run_state_test()
#
NEWLIST=$(cat cpuset.cpus.effective)
RETRY=0
- while [[ $NEWLIST != $CPULIST && $RETRY -lt 5 ]]
+ while [[ $NEWLIST != $CPULIST && $RETRY -lt 8 ]]
do
# Wait a bit longer & recheck a few times
pause 0.01
--
2.39.3

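The retry change in null_isolcpus_check() above follows a general pattern: poll a check a bounded number of times with a short delay, and only report failure once every attempt has failed. Below is a minimal sketch of that pattern as a standalone bash helper. The name retry_check and its argument order are hypothetical and not part of the patch. It uses plain sleep where the test script uses its own pause helper, and fractional delays assume a sleep that accepts them (GNU coreutils does).

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch.
# Usage: retry_check <max_tries> <delay_seconds> <command> [args...]
# Runs the command up to <max_tries> times, sleeping before each attempt.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry_check()
{
	local max=$1 delay=$2 try=0
	shift 2
	while [[ $try -lt $max ]]
	do
		sleep "$delay"
		# Stop at the first successful attempt
		"$@" && return 0
		try=$((try + 1))
	done
	return 1
}
```

With such a helper, a caller could write something like "retry_check 5 0.01 check_isolcpus . || { dump_states; exit 1; }", which would behave like the loop in the patch. The retry count and delay are tuning values and may need to be larger on slow debug kernels.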

2023-11-28 16:47:10

by Tejun Heo

Subject: Re: [PATCH] cgroup/cpuset: Expose cpuset.cpus.isolated

Hello,

On Mon, Nov 27, 2023 at 02:51:05PM -0500, Waiman Long wrote:
> The root-only cpuset.cpus.isolated control file shows the current set
> of isolated CPUs in isolated partitions. This control file is currently
> exposed only with the cgroup_debug boot command line option which also
> adds the ".__DEBUG__." prefix. This is actually a useful control file if
> users want to find out which CPUs are currently in an isolated state by
> the cpuset controller. Remove CFTYPE_DEBUG flag for this control file and
> make it available by default without any prefix.
>
> The test_cpuset_prs.sh test script and the cgroup-v2.rst documentation
> file are also updated accordingly. Minor code change is also made in
> test_cpuset_prs.sh to avoid false test failure when running on debug
> kernel.

Applied to cgroup/for-6.8 but I wonder whether this would be useful in
non-root cgroups too. e.g. In a delegated partition which is namespaced,
wouldn't this be useful too?

Thanks.

--
tejun

2023-11-28 18:19:23

by Waiman Long

Subject: Re: [PATCH] cgroup/cpuset: Expose cpuset.cpus.isolated


On 11/28/23 11:46, Tejun Heo wrote:
> Hello,
>
> On Mon, Nov 27, 2023 at 02:51:05PM -0500, Waiman Long wrote:
>> The root-only cpuset.cpus.isolated control file shows the current set
>> of isolated CPUs in isolated partitions. This control file is currently
>> exposed only with the cgroup_debug boot command line option which also
>> adds the ".__DEBUG__." prefix. This is actually a useful control file if
>> users want to find out which CPUs are currently in an isolated state by
>> the cpuset controller. Remove CFTYPE_DEBUG flag for this control file and
>> make it available by default without any prefix.
>>
>> The test_cpuset_prs.sh test script and the cgroup-v2.rst documentation
>> file are also updated accordingly. Minor code change is also made in
>> test_cpuset_prs.sh to avoid false test failure when running on debug
>> kernel.
> Applied to cgroup/for-6.8 but I wonder whether this would be useful in
> non-root cgroups too. e.g. In a delegated partition which is namespaced,
> wouldn't this be useful too?
>
> Thanks.

For simplicity, we only maintain one cpumask of isolated CPUs that
includes all the exclusive CPUs in isolated partitions. We don't
maintain separate masks for delegation purposes. We can certainly extend
that if the need arises. At this point, the set of isolated CPUs is
mainly used for determining which kernel background services can be
disabled to reduce interference, from a whole-kernel point of view.

Cheers,
Longman
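
Since the file exists only in the root cgroup, a script that wants the isolated CPU list has to read it from the top of the cgroup v2 mount. The sketch below shows one way to do that in bash. The function name read_isolated_cpus is hypothetical, and /sys/fs/cgroup is only an assumed default mount point. It falls back to the ".__DEBUG__." prefixed name so that it also works on an older kernel booted with cgroup_debug.

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch.
# Usage: read_isolated_cpus [cgroup2_root]
# Prints the isolated CPU list (possibly empty) and returns 0,
# or returns 1 if neither control file is readable.
read_isolated_cpus()
{
	# Assumed default mount point; pass another path if cgroup v2
	# is mounted elsewhere.
	local root=${1:-/sys/fs/cgroup}
	local f
	# Prefer the unprefixed name exposed by this patch, then fall
	# back to the name used under the cgroup_debug boot option.
	for f in "$root/cpuset.cpus.isolated" \
		 "$root/.__DEBUG__.cpuset.cpus.isolated"
	do
		if [[ -r $f ]]
		then
			cat "$f"
			return 0
		fi
	done
	return 1
}
```

For example, "ISOL=$(read_isolated_cpus) || echo 'cpuset.cpus.isolated not available'" would leave ISOL empty when no isolated partition exists and print the message when the kernel does not expose the file at all.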