2023-06-01 19:18:28

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v5 0/8] x86/resctrl: Miscellaneous resctrl features

These series adds support few minor features.
1. Support assigning multiple tasks to control/mon groups in one command.
2. Add debug mount option for resctrl interface.
3. Add RMID and CLOSID in resctrl interface when mounted with debug option.
4. Moves the default control group creation during the mount instead of during init.
5. While doing these above changes, found that rftype flags needed some cleanup.
They were named inconsistently. Re-arranged them much more cleanly now and added
few comments. Hope it can help future additions.

---
v5:
Changes since v4:
Moved the default group creation during mount instead of kernel init.
Tried to address most of the comments on commit log. Added more context and details.
Addressed feedback about the patch4. Removed the code changes and only kept the comments.
I am ok to drop patch4. But I will wait for the comment on that.
There were lots of comments. Hope I did not miss anything. Even if I missed, it is
not intentional.

v4: Changes since v3
Addressed comments from Reinette and others.
Removed newline requirement when adding tasks.
Dropped one of the changes on flags. Kept the flag names mostly same.
Changed the names of closid and rmid to ctrl_hw_id and mon_hw_id respectively.
James had some concerns about adding these files. Addressed it by making these
files x86 specific.
Tried to address Reinette's comment on patch 7. But due to current code design
I could not do it exact way. But changed it little bit to make it easy debug
file additions in the future.

v3: Changes since v2
Still waiting for more comments. While waiting, addressed few comments from Fenghua.
Added few more texts in the documentation about multiple tasks assignment feature.
Added pid in last_cmd_status when applicable.
Introduced static resctrl_debug to save the debug option.
Few minor text changes.

v2: Changes since v1
a. Removed the changes to add the task's threads automatically. It required
book keeping to handle the failures and gets complicated. Removed that change
for now.
b. Added -o debug option to mount in debug mode(comment from Fenghua)
c. Added debug files rmid and closid. Stephane wanted to rename them more
generic to accommodate ARM. It kind of loses meaning if is renamed differently.
Kept it same for now. Will change if he feels strong about it.

v4: https://lore.kernel.org/lkml/168177435378.1758847.8317743523931859131.stgit@bmoger-ubuntu/
v3: https://lore.kernel.org/lkml/167778850105.1053859.14596357862185564029.stgit@bmoger-ubuntu/
v2: https://lore.kernel.org/lkml/167537433143.647488.9641864719195184123.stgit@bmoger-ubuntu/
v1: https://lore.kernel.org/lkml/167278351577.34228.12803395505584557101.stgit@bmoger-ubuntu/

Babu Moger (8):
x86/resctrl: Add multiple tasks to the resctrl group at once
x86/resctrl: Simplify rftype flag definitions
x86/resctrl: Rename rftype flags for consistency
x86/resctrl: Add comments on RFTYPE flags hierarchy
x86/resctrl: Introduce "-o debug" mount option
x86/resctrl: Display CLOSID and RMID for the resctrl groups
x86/resctrl: Move default control group creation during mount
x86/resctrl: Introduce RFTYPE_DEBUG flag


Documentation/arch/x86/resctrl.rst | 19 ++-
arch/x86/kernel/cpu/resctrl/internal.h | 66 ++++++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 199 +++++++++++++++++--------
3 files changed, 212 insertions(+), 72 deletions(-)

--



2023-06-01 19:22:22

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v5 4/8] x86/resctrl: Add comments on RFTYPE flags hierarchy

resctrl uses RFTYPE flags for creating resctrl directory structure.

Definitions and directory structures are not documented. Add
comments to improve the readability and help future additions.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/internal.h | 45 ++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 2051179a3b91..c20cd6acb7a3 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -240,6 +240,51 @@ struct rdtgroup {

/*
* Define the file type flags for base and info directories.
+ *
+ * RESCTRL filesystem has two main components
+ * a. info
+ * b. base
+ *
+ * /sys/fs/resctrl/
+ * |
+ * --> info (directory and provides details on control
+ * | and monitoring resources)
+ * |
+ * --> base (Lists the files and information to interact with control
+ * or monitor groups. Provides details on default control
+ * group when filesystem is created. There is no directory
+ * with name base)
+ *
+ * info structure
+ * -------------------------------------------------------------
+ * --> RFTYPE_INFO
+ * --> <info> directory
+ * --> RFTYPE_TOP_INFO
+ * Files: last_cmd_status
+ *
+ * --> RFTYPE_MON_INFO
+ * --> <L3_MON> directory
+ * Files: max_threshold_occupancy, mbm_local_bytes_config,
+ * mbm_total_bytes_config, mon_features, num_rmids
+ *
+ * --> RFTYPE_CTRL_INFO
+ * --> RFTYPE_RES_CACHE
+ * --> <L2/L3> directory
+ * Files: bit_usage, cbm_mask, min_cbm_bits,
+ * num_closids, shareable_bits
+ *
+ * --> RFTYPE_RES_MB
+ * --> <MB/SMBA> directory
+ * Files: bandwidth_gran, delay_linear, min_bandwidth,
+ * num_closids
+ *
+ * base structure
+ * -----------------------------------------------------------
+ * --> RFTYPE_BASE (Files common for both MON and CTRL groups)
+ * Files: cpus, cpus_list, tasks
+ *
+ * --> RFTYPE_CTRL_BASE (Files only for CTRL group)
+ * Files: mode, schemata, size
*/
#define RFTYPE_INFO BIT(0)
#define RFTYPE_BASE BIT(1)



2023-06-01 19:31:56

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v5 1/8] x86/resctrl: Add multiple tasks to the resctrl group at once

The resctrl task assignment for monitor or control group needs to be
done one at a time. For example:

$mount -t resctrl resctrl /sys/fs/resctrl/
$mkdir /sys/fs/resctrl/ctrl_grp1
$echo 123 > /sys/fs/resctrl/ctrl_grp1/tasks
$echo 456 > /sys/fs/resctrl/ctrl_grp1/tasks
$echo 789 > /sys/fs/resctrl/ctrl_grp1/tasks

This is not user-friendly when dealing with hundreds of tasks.

Support multiple task assignment in one command with tasks ids separated
by commas. For example:
$echo 123,456,789 > /sys/fs/resctrl/ctrl_grp1/tasks

Signed-off-by: Babu Moger <[email protected]>
---
Documentation/arch/x86/resctrl.rst | 8 +++++++-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 30 ++++++++++++++++++++----------
2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 387ccbcb558f..290f01aa3002 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -292,7 +292,13 @@ All groups contain the following files:
"tasks":
Reading this file shows the list of all tasks that belong to
this group. Writing a task id to the file will add a task to the
- group. If the group is a CTRL_MON group the task is removed from
+ group. Multiple tasks can be added by separating the task ids
+ with commas. Tasks will be assigned sequentially. Multiple
+ failures are not supported. A single failure encountered while
+ attempting to assign a task will cause the operation to abort.
+ Failures will be logged to /sys/fs/resctrl/info/last_cmd_status.
+
+ If the group is a CTRL_MON group the task is removed from
whichever previous CTRL_MON group owned the task and also from
any MON group that owned the task. If the group is a MON group,
then the task must already belong to the CTRL_MON parent of this
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6ad33f355861..504137a5d31f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -696,11 +696,10 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct rdtgroup *rdtgrp;
+ char *pid_str;
int ret = 0;
pid_t pid;

- if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
- return -EINVAL;
rdtgrp = rdtgroup_kn_lock_live(of->kn);
if (!rdtgrp) {
rdtgroup_kn_unlock(of->kn);
@@ -708,16 +707,27 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
}
rdt_last_cmd_clear();

- if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED ||
- rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
- ret = -EINVAL;
- rdt_last_cmd_puts("Pseudo-locking in progress\n");
- goto unlock;
- }
+ while (buf && buf[0] != '\0') {
+ pid_str = strim(strsep(&buf, ","));

- ret = rdtgroup_move_task(pid, rdtgrp, of);
+ if (kstrtoint(pid_str, 0, &pid)) {
+ rdt_last_cmd_puts("Task list parsing error\n");
+ ret = -EINVAL;
+ break;
+ }

-unlock:
+ if (pid < 0) {
+ rdt_last_cmd_printf("Invalid pid %d value\n", pid);
+ ret = -EINVAL;
+ break;
+ }
+
+ ret = rdtgroup_move_task(pid, rdtgrp, of);
+ if (ret) {
+ rdt_last_cmd_printf("Error while processing task %d\n", pid);
+ break;
+ }
+ }
rdtgroup_kn_unlock(of->kn);

return ret ?: nbytes;



2023-06-01 19:32:02

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v5 8/8] x86/resctrl: Introduce RFTYPE_DEBUG flag

Introduce RFTYPE_DEBUG flag which can be used to add files when
resctrl is mounted with "-o debug" option. RFTYPE_DEBUG is OR'd
with other RFTYPE_ flags. These other flags decide where in resctrl
structure these files should be created.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 +++++
2 files changed, 6 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c07c6517d856..5bc8d371fc3e 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -294,6 +294,7 @@ struct rdtgroup {
#define RFTYPE_TOP BIT(6)
#define RFTYPE_RES_CACHE BIT(8)
#define RFTYPE_RES_MB BIT(9)
+#define RFTYPE_DEBUG BIT(10)
#define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
#define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
#define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e03cb01c4742..9e42ecbb9063 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1862,6 +1862,7 @@ static struct rftype res_common_files[] = {
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdtgroup_rmid_show,
+ .fflags = RFTYPE_BASE | RFTYPE_DEBUG,
},
{
.name = "schemata",
@@ -1891,6 +1892,7 @@ static struct rftype res_common_files[] = {
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdtgroup_closid_show,
+ .fflags = RFTYPE_CTRL_BASE | RFTYPE_DEBUG,
},

};
@@ -1905,6 +1907,9 @@ static int rdtgroup_add_files(struct kernfs_node *kn, unsigned long fflags)

lockdep_assert_held(&rdtgroup_mutex);

+ if (resctrl_debug)
+ fflags |= RFTYPE_DEBUG;
+
for (rft = rfts; rft < rfts + len; rft++) {
if (rft->fflags && ((fflags & rft->fflags) == rft->fflags)) {
ret = rdtgroup_add_file(kn, rft);



2023-06-01 19:32:35

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v5 7/8] x86/resctrl: Move default control group creation during mount

Currently, the resctrl default control group is created during kernel
init time and rest of the files are added during mount. If the new
files are to be added to the default group during the mount then it
has to be done separately again.

This can avoided if all the files are created during the mount and
destroyed during the umount. Move the default group creation in
rdt_get_tree and removal in rdt_kill_sb.

Suggested-by: Reinette Chatre <[email protected]>
Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 59 ++++++++++++++++----------------
1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 2f5cdc638607..e03cb01c4742 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -57,6 +57,7 @@ static char last_cmd_status_buf[512];
struct dentry *debugfs_resctrl;

static bool resctrl_debug;
+static int rdtgroup_setup_root(void);

void rdt_last_cmd_clear(void)
{
@@ -2515,13 +2516,6 @@ static int rdt_get_tree(struct fs_context *fc)

cpus_read_lock();
mutex_lock(&rdtgroup_mutex);
- /*
- * resctrl file system can only be mounted once.
- */
- if (static_branch_unlikely(&rdt_enable_key)) {
- ret = -EBUSY;
- goto out;
- }

ret = rdt_enable_ctx(ctx);
if (ret < 0)
@@ -2535,9 +2529,15 @@ static int rdt_get_tree(struct fs_context *fc)

closid_init();

+ ret = rdtgroup_add_files(rdtgroup_default.kn, RFTYPE_CTRL_BASE);
+ if (ret)
+ goto out_schemata_free;
+
+ kernfs_activate(rdtgroup_default.kn);
+
ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
if (ret < 0)
- goto out_schemata_free;
+ goto out_default;

if (rdt_mon_capable) {
ret = mongroup_create_dir(rdtgroup_default.kn,
@@ -2587,6 +2587,8 @@ static int rdt_get_tree(struct fs_context *fc)
kernfs_remove(kn_mongrp);
out_info:
kernfs_remove(kn_info);
+out_default:
+ kernfs_remove(rdtgroup_default.kn);
out_schemata_free:
schemata_list_destroy();
out_mba:
@@ -2664,10 +2666,23 @@ static const struct fs_context_operations rdt_fs_context_ops = {
static int rdt_init_fs_context(struct fs_context *fc)
{
struct rdt_fs_context *ctx;
+ int ret;
+
+ /*
+ * resctrl file system can only be mounted once.
+ */
+ if (static_branch_unlikely(&rdt_enable_key))
+ return -EBUSY;
+
+ ret = rdtgroup_setup_root();
+ if (ret)
+ return ret;

ctx = kzalloc(sizeof(struct rdt_fs_context), GFP_KERNEL);
- if (!ctx)
+ if (!ctx) {
+ kernfs_destroy_root(rdt_root);
return -ENOMEM;
+ }

ctx->kfc.root = rdt_root;
ctx->kfc.magic = RDTGROUP_SUPER_MAGIC;
@@ -2845,6 +2860,9 @@ static void rdt_kill_sb(struct super_block *sb)
static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
static_branch_disable_cpuslocked(&rdt_mon_enable_key);
static_branch_disable_cpuslocked(&rdt_enable_key);
+ /* Remove the default group and cleanup the root */
+ list_del(&rdtgroup_default.rdtgroup_list);
+ kernfs_destroy_root(rdt_root);
kernfs_kill_sb(sb);
mutex_unlock(&rdtgroup_mutex);
cpus_read_unlock();
@@ -3598,10 +3616,8 @@ static struct kernfs_syscall_ops rdtgroup_kf_syscall_ops = {
.show_options = rdtgroup_show_options,
};

-static int __init rdtgroup_setup_root(void)
+static int rdtgroup_setup_root(void)
{
- int ret;
-
rdt_root = kernfs_create_root(&rdtgroup_kf_syscall_ops,
KERNFS_ROOT_CREATE_DEACTIVATED |
KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK,
@@ -3618,19 +3634,11 @@ static int __init rdtgroup_setup_root(void)

list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);

- ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_CTRL_BASE);
- if (ret) {
- kernfs_destroy_root(rdt_root);
- goto out;
- }
-
rdtgroup_default.kn = kernfs_root_to_node(rdt_root);
- kernfs_activate(rdtgroup_default.kn);

-out:
mutex_unlock(&rdtgroup_mutex);

- return ret;
+ return 0;
}

static void domain_destroy_mon_state(struct rdt_domain *d)
@@ -3752,13 +3760,9 @@ int __init rdtgroup_init(void)
seq_buf_init(&last_cmd_status, last_cmd_status_buf,
sizeof(last_cmd_status_buf));

- ret = rdtgroup_setup_root();
- if (ret)
- return ret;
-
ret = sysfs_create_mount_point(fs_kobj, "resctrl");
if (ret)
- goto cleanup_root;
+ return ret;

ret = register_filesystem(&rdt_fs_type);
if (ret)
@@ -3791,8 +3795,6 @@ int __init rdtgroup_init(void)

cleanup_mountpoint:
sysfs_remove_mount_point(fs_kobj, "resctrl");
-cleanup_root:
- kernfs_destroy_root(rdt_root);

return ret;
}
@@ -3802,5 +3804,4 @@ void __exit rdtgroup_exit(void)
debugfs_remove_recursive(debugfs_resctrl);
unregister_filesystem(&rdt_fs_type);
sysfs_remove_mount_point(fs_kobj, "resctrl");
- kernfs_destroy_root(rdt_root);
}



2023-06-01 19:52:27

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v5 5/8] x86/resctrl: Introduce "-o debug" mount option

Add "-o debug" option to mount resctrl filesystem in debug mode.
This option can be used for adding extra files to help debug
resctrl issues.

For example, hardware uses CLOSID and RMIDs to control and monitor
resctrl resources. These numbers are not visible to the user. These
details can help to debug issues. Debug option provides a method to
add these files to resctrl.

Signed-off-by: Babu Moger <[email protected]>
---
Documentation/arch/x86/resctrl.rst | 3 +++
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
3 files changed, 19 insertions(+)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 290f01aa3002..afdee4d1f691 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -46,6 +46,9 @@ mount options are:
"mba_MBps":
Enable the MBA Software Controller(mba_sc) to specify MBA
bandwidth in MBps
+"debug":
+ Make debug files accessible. Available debug files are annotated with
+ "Available only with debug option".

L2 and L3 CDP are controlled separately.

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c20cd6acb7a3..c07c6517d856 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -59,6 +59,7 @@ struct rdt_fs_context {
bool enable_cdpl2;
bool enable_cdpl3;
bool enable_mba_mbps;
+ bool enable_debug;
};

static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index baed20b2d788..be91dea5f927 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -56,6 +56,8 @@ static char last_cmd_status_buf[512];

struct dentry *debugfs_resctrl;

+static bool resctrl_debug;
+
void rdt_last_cmd_clear(void)
{
lockdep_assert_held(&rdtgroup_mutex);
@@ -2368,6 +2370,9 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
if (!ret && ctx->enable_mba_mbps)
ret = set_mba_sc(true);

+ if (!ret && ctx->enable_debug)
+ resctrl_debug = true;
+
return ret;
}

@@ -2556,6 +2561,7 @@ enum rdt_param {
Opt_cdp,
Opt_cdpl2,
Opt_mba_mbps,
+ Opt_debug,
nr__rdt_params
};

@@ -2563,6 +2569,7 @@ static const struct fs_parameter_spec rdt_fs_parameters[] = {
fsparam_flag("cdp", Opt_cdp),
fsparam_flag("cdpl2", Opt_cdpl2),
fsparam_flag("mba_MBps", Opt_mba_mbps),
+ fsparam_flag("debug", Opt_debug),
{}
};

@@ -2588,6 +2595,9 @@ static int rdt_parse_param(struct fs_context *fc, struct fs_parameter *param)
return -EINVAL;
ctx->enable_mba_mbps = true;
return 0;
+ case Opt_debug:
+ ctx->enable_debug = true;
+ return 0;
}

return -EINVAL;
@@ -2778,6 +2788,8 @@ static void rdt_kill_sb(struct super_block *sb)

set_mba_sc(false);

+ resctrl_debug = false;
+
/*Put everything back to default values. */
for_each_alloc_capable_rdt_resource(r)
reset_all_ctrls(r);
@@ -3530,6 +3542,9 @@ static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
seq_puts(seq, ",mba_MBps");

+ if (resctrl_debug)
+ seq_puts(seq, ",debug");
+
return 0;
}




2023-06-27 15:09:35

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v5 0/8] x86/resctrl: Miscellaneous resctrl features

Gentle ping, Any comments on this series.
Thanks
Babu

> -----Original Message-----
> From: Moger, Babu <[email protected]>
> Sent: Thursday, June 1, 2023 2:01 PM
> To: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]; Moger,
> Babu <[email protected]>; [email protected];
> [email protected]; [email protected];
> [email protected]; Das1, Sandipan <[email protected]>;
> [email protected]; [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Moger, Babu <[email protected]>
> Subject: [PATCH v5 0/8] x86/resctrl: Miscellaneous resctrl features
>
> These series adds support few minor features.
> 1. Support assigning multiple tasks to control/mon groups in one command.
> 2. Add debug mount option for resctrl interface.
> 3. Add RMID and CLOSID in resctrl interface when mounted with debug option.
> 4. Moves the default control group creation during the mount instead of during
> init.
> 5. While doing these above changes, found that rftype flags needed some
> cleanup.
> They were named inconsistently. Re-arranged them much more cleanly now
> and added
> few comments. Hope it can help future additions.
>
> ---
> v5:
> Changes since v4:
> Moved the default group creation during mount instead of kernel init.
> Tried to address most of the comments on commit log. Added more context
> and details.
> Addressed feedback about the patch4. Removed the code changes and only
> kept the comments.
> I am ok to drop patch4. But I will wait for the comment on that.
> There were lots of comments. Hope I did not miss anything. Even if I missed, it
> is
> not intentional.
>
> v4: Changes since v3
> Addressed comments from Reinette and others.
> Removed newline requirement when adding tasks.
> Dropped one of the changes on flags. Kept the flag names mostly same.
> Changed the names of closid and rmid to ctrl_hw_id and mon_hw_id
> respectively.
> James had some concerns about adding these files. Addressed it by making
> these
> files x86 specific.
> Tried to address Reinette's comment on patch 7. But due to current code
> design
> I could not do it exact way. But changed it little bit to make it easy debug
> file additions in the future.
>
> v3: Changes since v2
> Still waiting for more comments. While waiting, addressed few comments
> from Fenghua.
> Added few more texts in the documentation about multiple tasks assignment
> feature.
> Added pid in last_cmd_status when applicable.
> Introduced static resctrl_debug to save the debug option.
> Few minor text changes.
>
> v2: Changes since v1
> a. Removed the changes to add the task's threads automatically. It required
> book keeping to handle the failures and gets complicated. Removed that
> change
> for now.
> b. Added -o debug option to mount in debug mode(comment from Fenghua)
> c. Added debug files rmid and closid. Stephane wanted to rename them more
> generic to accommodate ARM. It kind of loses meaning if is renamed
> differently.
> Kept it same for now. Will change if he feels strong about it.
>
> v4:
> https://lore.kernel.org/lkml/168177435378.1758847.8317743523931859131.stg
> it@bmoger-ubuntu/
> v3:
> https://lore.kernel.org/lkml/167778850105.1053859.14596357862185564029.st
> git@bmoger-ubuntu/
> v2:
> https://lore.kernel.org/lkml/167537433143.647488.9641864719195184123.stgit
> @bmoger-ubuntu/
> v1:
> https://lore.kernel.org/lkml/167278351577.34228.12803395505584557101.stgit
> @bmoger-ubuntu/
>
> Babu Moger (8):
> x86/resctrl: Add multiple tasks to the resctrl group at once
> x86/resctrl: Simplify rftype flag definitions
> x86/resctrl: Rename rftype flags for consistency
> x86/resctrl: Add comments on RFTYPE flags hierarchy
> x86/resctrl: Introduce "-o debug" mount option
> x86/resctrl: Display CLOSID and RMID for the resctrl groups
> x86/resctrl: Move default control group creation during mount
> x86/resctrl: Introduce RFTYPE_DEBUG flag
>
>
> Documentation/arch/x86/resctrl.rst | 19 ++-
> arch/x86/kernel/cpu/resctrl/internal.h | 66 ++++++--
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 199 +++++++++++++++++--------
> 3 files changed, 212 insertions(+), 72 deletions(-)
>
> --


2023-06-28 02:30:56

by Shaopeng Tan (Fujitsu)

[permalink] [raw]
Subject: RE: [PATCH v5 0/8] x86/resctrl: Miscellaneous resctrl features

Hi Babu,

I reviewed this patch series and it looks fine.
I tested these features and ran the selftests/resctrl test set,
and there is no problem.

<Reviewed-by:[email protected]>
<Tested-by:[email protected]>

Best regards,
Shaopeng TAN


2023-07-07 21:55:30

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v5 4/8] x86/resctrl: Add comments on RFTYPE flags hierarchy

Hi Babu,

On 6/1/2023 12:01 PM, Babu Moger wrote:
> resctrl uses RFTYPE flags for creating resctrl directory structure.
>
> Definitions and directory structures are not documented. Add
> comments to improve the readability and help future additions.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 45 ++++++++++++++++++++++++++++++++
> 1 file changed, 45 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 2051179a3b91..c20cd6acb7a3 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -240,6 +240,51 @@ struct rdtgroup {
>
> /*
> * Define the file type flags for base and info directories.
> + *
> + * RESCTRL filesystem has two main components
> + * a. info
> + * b. base
> + *
> + * /sys/fs/resctrl/
> + * |
> + * --> info (directory and provides details on control
> + * | and monitoring resources)
> + * |
> + * --> base (Lists the files and information to interact with control
> + * or monitor groups. Provides details on default control
> + * group when filesystem is created. There is no directory
> + * with name base)
> + *

In the above I think it may help to split the comment into two parts:
(a) which directory/directories are referred to, and (b) which files can be
found in the directory/directories.

For example,

--> info (Top level directory named "info". Contains files that provide
details on control and monitoring resources.)

--> base (Root directory associated with default resource group as well as
directories created by user for MON and CTRL groups. Contains
files to interact with MON and CTRL groups.)

Please feel free to improve.

> + * info structure
> + * -------------------------------------------------------------
> + * --> RFTYPE_INFO
> + * --> <info> directory
> + * --> RFTYPE_TOP_INFO
> + * Files: last_cmd_status
> + *
> + * --> RFTYPE_MON_INFO
> + * --> <L3_MON> directory
> + * Files: max_threshold_occupancy, mbm_local_bytes_config,
> + * mbm_total_bytes_config, mon_features, num_rmids
> + *
> + * --> RFTYPE_CTRL_INFO
> + * --> RFTYPE_RES_CACHE
> + * --> <L2/L3> directory

Maybe a nitpick but I wonder if it should be "L2,L3" to not create impression
that it is either/or?

> + * Files: bit_usage, cbm_mask, min_cbm_bits,
> + * num_closids, shareable_bits
> + *
> + * --> RFTYPE_RES_MB
> + * --> <MB/SMBA> directory

Same here ... perhaps "MB,SMBA"

> + * Files: bandwidth_gran, delay_linear, min_bandwidth,
> + * num_closids

Missing thread_throttle_mode

> + *
> + * base structure
> + * -----------------------------------------------------------
> + * --> RFTYPE_BASE (Files common for both MON and CTRL groups)
> + * Files: cpus, cpus_list, tasks
> + *
> + * --> RFTYPE_CTRL_BASE (Files only for CTRL group)
> + * Files: mode, schemata, size
> */
> #define RFTYPE_INFO BIT(0)
> #define RFTYPE_BASE BIT(1)
>
>

I think this is a helpful addition. Thanks for doing this.

Reinette

2023-07-07 21:56:18

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v5 1/8] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Babu,

On 6/1/2023 12:00 PM, Babu Moger wrote:
> The resctrl task assignment for monitor or control group needs to be
> done one at a time. For example:
>
> $mount -t resctrl resctrl /sys/fs/resctrl/
> $mkdir /sys/fs/resctrl/ctrl_grp1
> $echo 123 > /sys/fs/resctrl/ctrl_grp1/tasks
> $echo 456 > /sys/fs/resctrl/ctrl_grp1/tasks
> $echo 789 > /sys/fs/resctrl/ctrl_grp1/tasks
>
> This is not user-friendly when dealing with hundreds of tasks.
>
> Support multiple task assignment in one command with tasks ids separated
> by commas. For example:
> $echo 123,456,789 > /sys/fs/resctrl/ctrl_grp1/tasks
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 6ad33f355861..504137a5d31f 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -696,11 +696,10 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
> char *buf, size_t nbytes, loff_t off)
> {
> struct rdtgroup *rdtgrp;
> + char *pid_str;
> int ret = 0;
> pid_t pid;
>
> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
> - return -EINVAL;
> rdtgrp = rdtgroup_kn_lock_live(of->kn);
> if (!rdtgrp) {
> rdtgroup_kn_unlock(of->kn);
> @@ -708,16 +707,27 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
> }
> rdt_last_cmd_clear();
>
> - if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED ||
> - rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
> - ret = -EINVAL;
> - rdt_last_cmd_puts("Pseudo-locking in progress\n");
> - goto unlock;
> - }

Please do not drop this snippet. I think there may have been misunderstanding
during previous comments - this snippet is required, it just does not need
to be run for every pid the user provides since it depends on the resource
group, not the pid.

> + while (buf && buf[0] != '\0') {

I think it may help to add a check for '\n' here also. It looks to me
that a user (shell) that provides "pid,", which is "pid,\n" would encounter
error and this may not actually be an error.

> + pid_str = strim(strsep(&buf, ","));
>
> - ret = rdtgroup_move_task(pid, rdtgrp, of);
> + if (kstrtoint(pid_str, 0, &pid)) {
> + rdt_last_cmd_puts("Task list parsing error\n");
> + ret = -EINVAL;
> + break;
> + }
>
> -unlock:
> + if (pid < 0) {
> + rdt_last_cmd_printf("Invalid pid %d value\n", pid);
> + ret = -EINVAL;
> + break;
> + }

I'm trying to image a possible error and it does not look right. For example,
the above could be "Invalid pid 123 value". How about just "Invalid pid %d".

> +
> + ret = rdtgroup_move_task(pid, rdtgrp, of);
> + if (ret) {
> + rdt_last_cmd_printf("Error while processing task %d\n", pid);
> + break;
> + }
> + }
> rdtgroup_kn_unlock(of->kn);
>
> return ret ?: nbytes;
>
>

Reinette

2023-07-07 21:57:03

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] x86/resctrl: Introduce "-o debug" mount option

Hi Babu,

On 6/1/2023 12:01 PM, Babu Moger wrote:
> Add "-o debug" option to mount resctrl filesystem in debug mode.
> This option can be used for adding extra files to help debug
> resctrl issues.

Please avoid uncertainty in the changelog (re. "can be used"). I
think it will help to be more specific if the first and last
hunks of patch 8 is merged into this patch, making it clear
that the debug mount option is in support of debug files with
this changelog written to support that.

> For example, hardware uses CLOSID and RMIDs to control and monitor
> resctrl resources. These numbers are not visible to the user. These
> details can help to debug issues. Debug option provides a method to
> add these files to resctrl.

With the debug file support added here this extra motivation should
not be necessary (remaining hunks of patch 8 can be moved to where
these files are introduced).
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> Documentation/arch/x86/resctrl.rst | 3 +++
> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
> 3 files changed, 19 insertions(+)
>
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 290f01aa3002..afdee4d1f691 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -46,6 +46,9 @@ mount options are:
> "mba_MBps":
> Enable the MBA Software Controller(mba_sc) to specify MBA
> bandwidth in MBps
> +"debug":
> + Make debug files accessible. Available debug files are annotated with
> + "Available only with debug option".
>

At the risk of becoming unreadable, the earlier documentation of the mount
command should also change.

> L2 and L3 CDP are controlled separately.
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index c20cd6acb7a3..c07c6517d856 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -59,6 +59,7 @@ struct rdt_fs_context {
> bool enable_cdpl2;
> bool enable_cdpl3;
> bool enable_mba_mbps;
> + bool enable_debug;
> };
>
> static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index baed20b2d788..be91dea5f927 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -56,6 +56,8 @@ static char last_cmd_status_buf[512];
>
> struct dentry *debugfs_resctrl;
>
> +static bool resctrl_debug;
> +
> void rdt_last_cmd_clear(void)
> {
> lockdep_assert_held(&rdtgroup_mutex);
> @@ -2368,6 +2370,9 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
> if (!ret && ctx->enable_mba_mbps)
> ret = set_mba_sc(true);
>
> + if (!ret && ctx->enable_debug)
> + resctrl_debug = true;
> +
> return ret;
> }

Looks like unwinding of rdt_enable_ctx() errors is done in the caller, this is
not ideal and additions like above cause some error unwinding to be omitted.
I cannot see why rdt_enable_ctx() cannot do its own error unwinding. This
may be better as a separate patch.

Reinette


2023-07-07 21:57:26

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v5 7/8] x86/resctrl: Move default control group creation during mount

Hi Babu,

On 6/1/2023 12:02 PM, Babu Moger wrote:
> Currently, the resctrl default control group is created during kernel
> init time and rest of the files are added during mount. If the new

Please drop the word "Currently"

> files are to be added to the default group during the mount then it
> has to be done separately again.
>
> This can avoided if all the files are created during the mount and
> destroyed during the umount. Move the default group creation in

"creation in" -> "creation to"?

> rdt_get_tree and removal in rdt_kill_sb.

I think it would be simpler if this patch is moved earlier in series
then patch 8 can more easily be squashed where appropriate.

>
> Suggested-by: Reinette Chatre <[email protected]>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 59 ++++++++++++++++----------------
> 1 file changed, 30 insertions(+), 29 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 2f5cdc638607..e03cb01c4742 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -57,6 +57,7 @@ static char last_cmd_status_buf[512];
> struct dentry *debugfs_resctrl;
>
> static bool resctrl_debug;
> +static int rdtgroup_setup_root(void);
>
> void rdt_last_cmd_clear(void)
> {
> @@ -2515,13 +2516,6 @@ static int rdt_get_tree(struct fs_context *fc)
>
> cpus_read_lock();
> mutex_lock(&rdtgroup_mutex);
> - /*
> - * resctrl file system can only be mounted once.
> - */
> - if (static_branch_unlikely(&rdt_enable_key)) {
> - ret = -EBUSY;
> - goto out;
> - }
>

This change is unexpected.

> ret = rdt_enable_ctx(ctx);
> if (ret < 0)
> @@ -2535,9 +2529,15 @@ static int rdt_get_tree(struct fs_context *fc)
>
> closid_init();
>
> + ret = rdtgroup_add_files(rdtgroup_default.kn, RFTYPE_CTRL_BASE);
> + if (ret)
> + goto out_schemata_free;
> +
> + kernfs_activate(rdtgroup_default.kn);
> +
> ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
> if (ret < 0)
> - goto out_schemata_free;
> + goto out_default;
>
> if (rdt_mon_capable) {
> ret = mongroup_create_dir(rdtgroup_default.kn,
> @@ -2587,6 +2587,8 @@ static int rdt_get_tree(struct fs_context *fc)
> kernfs_remove(kn_mongrp);
> out_info:
> kernfs_remove(kn_info);
> +out_default:
> + kernfs_remove(rdtgroup_default.kn);
> out_schemata_free:
> schemata_list_destroy();
> out_mba:
> @@ -2664,10 +2666,23 @@ static const struct fs_context_operations rdt_fs_context_ops = {
> static int rdt_init_fs_context(struct fs_context *fc)
> {
> struct rdt_fs_context *ctx;
> + int ret;
> +
> + /*
> + * resctrl file system can only be mounted once.
> + */
> + if (static_branch_unlikely(&rdt_enable_key))
> + return -EBUSY;
> +
> + ret = rdtgroup_setup_root();
> + if (ret)
> + return ret;
>

Why was it necessary to move this code?

> ctx = kzalloc(sizeof(struct rdt_fs_context), GFP_KERNEL);
> - if (!ctx)
> + if (!ctx) {
> + kernfs_destroy_root(rdt_root);
> return -ENOMEM;
> + }
>
> ctx->kfc.root = rdt_root;
> ctx->kfc.magic = RDTGROUP_SUPER_MAGIC;
> @@ -2845,6 +2860,9 @@ static void rdt_kill_sb(struct super_block *sb)
> static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
> static_branch_disable_cpuslocked(&rdt_mon_enable_key);
> static_branch_disable_cpuslocked(&rdt_enable_key);
> + /* Remove the default group and cleanup the root */
> + list_del(&rdtgroup_default.rdtgroup_list);
> + kernfs_destroy_root(rdt_root);

Why not just add kernfs_remove(rdtgroup_default.kn) to rmdir_all_sub()?

> kernfs_kill_sb(sb);
> mutex_unlock(&rdtgroup_mutex);
> cpus_read_unlock();
> @@ -3598,10 +3616,8 @@ static struct kernfs_syscall_ops rdtgroup_kf_syscall_ops = {
> .show_options = rdtgroup_show_options,
> };
>
> -static int __init rdtgroup_setup_root(void)
> +static int rdtgroup_setup_root(void)
> {
> - int ret;
> -
> rdt_root = kernfs_create_root(&rdtgroup_kf_syscall_ops,
> KERNFS_ROOT_CREATE_DEACTIVATED |
> KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK,
> @@ -3618,19 +3634,11 @@ static int __init rdtgroup_setup_root(void)
>
> list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);
>
> - ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_CTRL_BASE);
> - if (ret) {
> - kernfs_destroy_root(rdt_root);
> - goto out;
> - }
> -
> rdtgroup_default.kn = kernfs_root_to_node(rdt_root);
> - kernfs_activate(rdtgroup_default.kn);
>
> -out:
> mutex_unlock(&rdtgroup_mutex);
>
> - return ret;
> + return 0;
> }
>
> static void domain_destroy_mon_state(struct rdt_domain *d)
> @@ -3752,13 +3760,9 @@ int __init rdtgroup_init(void)
> seq_buf_init(&last_cmd_status, last_cmd_status_buf,
> sizeof(last_cmd_status_buf));
>
> - ret = rdtgroup_setup_root();
> - if (ret)
> - return ret;
> -
> ret = sysfs_create_mount_point(fs_kobj, "resctrl");
> if (ret)
> - goto cleanup_root;
> + return ret;
>

It is not clear to me why this change is required, could you
please elaborate? It seems that all that is needed is for
rdtgroup_add_files() to move to rdt_get_tree() (which you have done)
and then an additional call to kernfs_remove() in rmdir_all_sub().
I must be missing something, could you please help me understand?

> ret = register_filesystem(&rdt_fs_type);
> if (ret)
> @@ -3791,8 +3795,6 @@ int __init rdtgroup_init(void)
>
> cleanup_mountpoint:
> sysfs_remove_mount_point(fs_kobj, "resctrl");
> -cleanup_root:
> - kernfs_destroy_root(rdt_root);
>
> return ret;
> }
> @@ -3802,5 +3804,4 @@ void __exit rdtgroup_exit(void)
> debugfs_remove_recursive(debugfs_resctrl);
> unregister_filesystem(&rdt_fs_type);
> sysfs_remove_mount_point(fs_kobj, "resctrl");
> - kernfs_destroy_root(rdt_root);
> }
>
>

Reinette

2023-07-07 22:16:25

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v5 8/8] x86/resctrl: Introduce RFTYPE_DEBUG flag

Hi Babu,

On 6/1/2023 12:02 PM, Babu Moger wrote:
> Introduce RFTYPE_DEBUG flag which can be used to add files when
> resctrl is mounted with "-o debug" option. RFTYPE_DEBUG is OR'd
> with other RFTYPE_ flags. These other flags decide where in resctrl
> structure these files should be created.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 +++++
> 2 files changed, 6 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index c07c6517d856..5bc8d371fc3e 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -294,6 +294,7 @@ struct rdtgroup {
> #define RFTYPE_TOP BIT(6)
> #define RFTYPE_RES_CACHE BIT(8)
> #define RFTYPE_RES_MB BIT(9)
> +#define RFTYPE_DEBUG BIT(10)
> #define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
> #define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
> #define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index e03cb01c4742..9e42ecbb9063 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1862,6 +1862,7 @@ static struct rftype res_common_files[] = {
> .mode = 0444,
> .kf_ops = &rdtgroup_kf_single_ops,
> .seq_show = rdtgroup_rmid_show,
> + .fflags = RFTYPE_BASE | RFTYPE_DEBUG,
> },
> {
> .name = "schemata",
> @@ -1891,6 +1892,7 @@ static struct rftype res_common_files[] = {
> .mode = 0444,
> .kf_ops = &rdtgroup_kf_single_ops,
> .seq_show = rdtgroup_closid_show,
> + .fflags = RFTYPE_CTRL_BASE | RFTYPE_DEBUG,
> },
>
> };
> @@ -1905,6 +1907,9 @@ static int rdtgroup_add_files(struct kernfs_node *kn, unsigned long fflags)
>
> lockdep_assert_held(&rdtgroup_mutex);
>
> + if (resctrl_debug)
> + fflags |= RFTYPE_DEBUG;
> +
> for (rft = rfts; rft < rfts + len; rft++) {
> if (rft->fflags && ((fflags & rft->fflags) == rft->fflags)) {
> ret = rdtgroup_add_file(kn, rft);
>
>

I think that the first and last hunk of this patch can be
squashed with patch 5. From what I can tell it would help
the motivation of patch 5 and fit nicely with what its
changelog aims to describe. The remaining hunks can be
moved to patch 6.

Reinette

2023-07-11 17:04:49

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v5 0/8] x86/resctrl: Miscellaneous resctrl features



On 6/27/23 21:13, Shaopeng Tan (Fujitsu) wrote:
> Hi Babu,
>
> I reviewed this patch series and it looks fine.
> I tested these features and ran the selftests/resctrl test set,
> and there is no problem.
>
> <Reviewed-by:[email protected]>
> <Tested-by:[email protected]>
>

Thanks Tan. I will add Reviewed and Tested by to the patches that are final.
--
Thanks
Babu Moger

2023-07-11 18:43:06

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v5 1/8] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Reinette,

On 7/7/23 16:38, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/1/2023 12:00 PM, Babu Moger wrote:
>> The resctrl task assignment for monitor or control group needs to be
>> done one at a time. For example:
>>
>> $mount -t resctrl resctrl /sys/fs/resctrl/
>> $mkdir /sys/fs/resctrl/ctrl_grp1
>> $echo 123 > /sys/fs/resctrl/ctrl_grp1/tasks
>> $echo 456 > /sys/fs/resctrl/ctrl_grp1/tasks
>> $echo 789 > /sys/fs/resctrl/ctrl_grp1/tasks
>>
>> This is not user-friendly when dealing with hundreds of tasks.
>>
>> Support multiple task assignment in one command with tasks ids separated
>> by commas. For example:
>> $echo 123,456,789 > /sys/fs/resctrl/ctrl_grp1/tasks
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 6ad33f355861..504137a5d31f 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -696,11 +696,10 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
>> char *buf, size_t nbytes, loff_t off)
>> {
>> struct rdtgroup *rdtgrp;
>> + char *pid_str;
>> int ret = 0;
>> pid_t pid;
>>
>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
>> - return -EINVAL;
>> rdtgrp = rdtgroup_kn_lock_live(of->kn);
>> if (!rdtgrp) {
>> rdtgroup_kn_unlock(of->kn);
>> @@ -708,16 +707,27 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
>> }
>> rdt_last_cmd_clear();
>>
>> - if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED ||
>> - rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
>> - ret = -EINVAL;
>> - rdt_last_cmd_puts("Pseudo-locking in progress\n");
>> - goto unlock;
>> - }
>
> Please do not drop this snippet. I think there may have been misunderstanding
> during previous comments - this snippet is required, it just does not need
> to be run for every pid the user provides since it depends on the resource
> group, not the pid.

Ok. Got it.

>
>> + while (buf && buf[0] != '\0') {
>
> I think it may help to add a check for '\n' here also. It looks to me
> that a user (shell) that provides "pid,", which is "pid,\n" would encounter
> error and this may not actually be an error.

Ok Sounds good. I have verified it. New check will look like this below.

while (buf && buf[0] != '\0' && buf[0] != '\n') {

>
>> + pid_str = strim(strsep(&buf, ","));
>>
>> - ret = rdtgroup_move_task(pid, rdtgrp, of);
>> + if (kstrtoint(pid_str, 0, &pid)) {
>> + rdt_last_cmd_puts("Task list parsing error\n");
>> + ret = -EINVAL;
>> + break;
>> + }
>>
>> -unlock:
>> + if (pid < 0) {
>> + rdt_last_cmd_printf("Invalid pid %d value\n", pid);
>> + ret = -EINVAL;
>> + break;
>> + }
>
> I'm trying to image a possible error and it does not look right. For example,
> the above could be "Invalid pid 123 value". How about just "Invalid pid %d".

Sure.

Thanks
Babu

2023-07-11 23:55:52

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v5 4/8] x86/resctrl: Add comments on RFTYPE flags hierarchy

Hi Reinette,

On 7/7/23 16:39, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/1/2023 12:01 PM, Babu Moger wrote:
>> resctrl uses RFTYPE flags for creating resctrl directory structure.
>>
>> Definitions and directory structures are not documented. Add
>> comments to improve the readability and help future additions.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/internal.h | 45 ++++++++++++++++++++++++++++++++
>> 1 file changed, 45 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 2051179a3b91..c20cd6acb7a3 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -240,6 +240,51 @@ struct rdtgroup {
>>
>> /*
>> * Define the file type flags for base and info directories.
>> + *
>> + * RESCTRL filesystem has two main components
>> + * a. info
>> + * b. base
>> + *
>> + * /sys/fs/resctrl/
>> + * |
>> + * --> info (directory and provides details on control
>> + * | and monitoring resources)
>> + * |
>> + * --> base (Lists the files and information to interact with control
>> + * or monitor groups. Provides details on default control
>> + * group when filesystem is created. There is no directory
>> + * with name base)
>> + *
>
> In the above I think it may help to split the comment into two parts:
> (a) which directory/directories are referred to, and (b) which files can be
> found in the directory/directories.
>
> For example,
>
> --> info (Top level directory named "info". Contains files that provide
> details on control and monitoring resources.)
>
> --> base (Root directory associated with default resource group as well as
> directories created by user for MON and CTRL groups. Contains
> files to interact with MON and CTRL groups.)
>
> Please feel free to improve.

Looks good.

>
>> + * info structure
>> + * -------------------------------------------------------------
>> + * --> RFTYPE_INFO
>> + * --> <info> directory
>> + * --> RFTYPE_TOP_INFO
>> + * Files: last_cmd_status
>> + *
>> + * --> RFTYPE_MON_INFO
>> + * --> <L3_MON> directory
>> + * Files: max_threshold_occupancy, mbm_local_bytes_config,
>> + * mbm_total_bytes_config, mon_features, num_rmids
>> + *
>> + * --> RFTYPE_CTRL_INFO
>> + * --> RFTYPE_RES_CACHE
>> + * --> <L2/L3> directory
>
> Maybe a nitpick but I wonder if it should be "L2,L3" to not create impression
> that it is either/or?

Sure.
>
>> + * Files: bit_usage, cbm_mask, min_cbm_bits,
>> + * num_closids, shareable_bits
>> + *
>> + * --> RFTYPE_RES_MB
>> + * --> <MB/SMBA> directory
>
> Same here ... perhaps "MB,SMBA"

Sure.
>
>> + * Files: bandwidth_gran, delay_linear, min_bandwidth,
>> + * num_closids
>
> Missing thread_throttle_mode

Will add it.
>
>> + *
>> + * base structure
>> + * -----------------------------------------------------------
>> + * --> RFTYPE_BASE (Files common for both MON and CTRL groups)
>> + * Files: cpus, cpus_list, tasks
>> + *
>> + * --> RFTYPE_CTRL_BASE (Files only for CTRL group)
>> + * Files: mode, schemata, size
>> */
>> #define RFTYPE_INFO BIT(0)
>> #define RFTYPE_BASE BIT(1)
>>
>>
>
> I think this is a helpful addition. Thanks for doing this.

Sure. Welcome.Thanks
Babu Moger

2023-07-12 17:02:57

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] x86/resctrl: Introduce "-o debug" mount option

Hi Reinette,

On 7/7/23 16:42, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/1/2023 12:01 PM, Babu Moger wrote:
>> Add "-o debug" option to mount resctrl filesystem in debug mode.
>> This option can be used for adding extra files to help debug
>> resctrl issues.
>
> Please avoid uncertainty in the changelog (re. "can be used"). I

ok Sure.

> think it will help to be more specific if the first and last
> hunks of patch 8 is merged into this patch, making it clear
> that the debug mount option is in support of debug files with
> this changelog written to support that.

Sure. Will do that.

>
>> For example, hardware uses CLOSID and RMIDs to control and monitor
>> resctrl resources. These numbers are not visible to the user. These
>> details can help to debug issues. Debug option provides a method to
>> add these files to resctrl.
>
> With the debug file support added here this extra motivation should
> not be necessary (remaining hunks of patch 8 can be moved to where
> these files are introduced).

Sure.

>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> Documentation/arch/x86/resctrl.rst | 3 +++
>> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
>> 3 files changed, 19 insertions(+)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index 290f01aa3002..afdee4d1f691 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -46,6 +46,9 @@ mount options are:
>> "mba_MBps":
>> Enable the MBA Software Controller(mba_sc) to specify MBA
>> bandwidth in MBps
>> +"debug":
>> + Make debug files accessible. Available debug files are annotated with
>> + "Available only with debug option".
>>
>
> At the risk of becoming unreadable, the earlier documentation of the mount
> command should also change.

Ok. Sure

>
>> L2 and L3 CDP are controlled separately.
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index c20cd6acb7a3..c07c6517d856 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -59,6 +59,7 @@ struct rdt_fs_context {
>> bool enable_cdpl2;
>> bool enable_cdpl3;
>> bool enable_mba_mbps;
>> + bool enable_debug;
>> };
>>
>> static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index baed20b2d788..be91dea5f927 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -56,6 +56,8 @@ static char last_cmd_status_buf[512];
>>
>> struct dentry *debugfs_resctrl;
>>
>> +static bool resctrl_debug;
>> +
>> void rdt_last_cmd_clear(void)
>> {
>> lockdep_assert_held(&rdtgroup_mutex);
>> @@ -2368,6 +2370,9 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
>> if (!ret && ctx->enable_mba_mbps)
>> ret = set_mba_sc(true);
>>
>> + if (!ret && ctx->enable_debug)
>> + resctrl_debug = true;
>> +
>> return ret;
>> }
>
> Looks like unwinding of rdt_enable_ctx() errors is done in the caller, this is
> not ideal and additions like above cause some error unwinding to be omitted.
> I cannot see why rdt_enable_ctx() cannot do its own error unwinding. This
> may be better as a separate patch.
>

Sure. Will do error unwinding of rdt_enable_ctx as a separate patch before
this patch. That seems more appropriate.
--
Thanks
Babu Moger

2023-07-14 17:12:09

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v5 7/8] x86/resctrl: Move default control group creation during mount

Hi Reinette,
Sorry.. Took a while to respond. I had to recreate the issue to refresh my
memory.

On 7/7/23 16:46, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/1/2023 12:02 PM, Babu Moger wrote:
>> Currently, the resctrl default control group is created during kernel
>> init time and rest of the files are added during mount. If the new
>
> Please drop the word "Currently"

Sure
>
>> files are to be added to the default group during the mount then it
>> has to be done separately again.
>>
>> This can avoided if all the files are created during the mount and
>> destroyed during the umount. Move the default group creation in
>
> "creation in" -> "creation to"?

Sure
>
>> rdt_get_tree and removal in rdt_kill_sb.
>
> I think it would be simpler if this patch is moved earlier in series
> then patch 8 can more easily be squashed where appropriate.

Yes, I was thinking about that.
>
>>
>> Suggested-by: Reinette Chatre <[email protected]>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 59 ++++++++++++++++----------------
>> 1 file changed, 30 insertions(+), 29 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 2f5cdc638607..e03cb01c4742 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -57,6 +57,7 @@ static char last_cmd_status_buf[512];
>> struct dentry *debugfs_resctrl;
>>
>> static bool resctrl_debug;
>> +static int rdtgroup_setup_root(void);
>>
>> void rdt_last_cmd_clear(void)
>> {
>> @@ -2515,13 +2516,6 @@ static int rdt_get_tree(struct fs_context *fc)
>>
>> cpus_read_lock();
>> mutex_lock(&rdtgroup_mutex);
>> - /*
>> - * resctrl file system can only be mounted once.
>> - */
>> - if (static_branch_unlikely(&rdt_enable_key)) {
>> - ret = -EBUSY;
>> - goto out;
>> - }
>>
>
> This change is unexpected.

Please see my comments below.

>
>> ret = rdt_enable_ctx(ctx);
>> if (ret < 0)
>> @@ -2535,9 +2529,15 @@ static int rdt_get_tree(struct fs_context *fc)
>>
>> closid_init();
>>
>> + ret = rdtgroup_add_files(rdtgroup_default.kn, RFTYPE_CTRL_BASE);
>> + if (ret)
>> + goto out_schemata_free;
>> +
>> + kernfs_activate(rdtgroup_default.kn);
>> +
>> ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
>> if (ret < 0)
>> - goto out_schemata_free;
>> + goto out_default;
>>
>> if (rdt_mon_capable) {
>> ret = mongroup_create_dir(rdtgroup_default.kn,
>> @@ -2587,6 +2587,8 @@ static int rdt_get_tree(struct fs_context *fc)
>> kernfs_remove(kn_mongrp);
>> out_info:
>> kernfs_remove(kn_info);
>> +out_default:
>> + kernfs_remove(rdtgroup_default.kn);
>> out_schemata_free:
>> schemata_list_destroy();
>> out_mba:
>> @@ -2664,10 +2666,23 @@ static const struct fs_context_operations rdt_fs_context_ops = {
>> static int rdt_init_fs_context(struct fs_context *fc)
>> {
>> struct rdt_fs_context *ctx;
>> + int ret;
>> +
>> + /*
>> + * resctrl file system can only be mounted once.
>> + */
>> + if (static_branch_unlikely(&rdt_enable_key))
>> + return -EBUSY;
>> +
>> + ret = rdtgroup_setup_root();
>> + if (ret)
>> + return ret;
>>
>
> Why was it necessary to move this code?

Please see my comments below..
>
>> ctx = kzalloc(sizeof(struct rdt_fs_context), GFP_KERNEL);
>> - if (!ctx)
>> + if (!ctx) {
>> + kernfs_destroy_root(rdt_root);
>> return -ENOMEM;
>> + }
>>
>> ctx->kfc.root = rdt_root;
>> ctx->kfc.magic = RDTGROUP_SUPER_MAGIC;
>> @@ -2845,6 +2860,9 @@ static void rdt_kill_sb(struct super_block *sb)
>> static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
>> static_branch_disable_cpuslocked(&rdt_mon_enable_key);
>> static_branch_disable_cpuslocked(&rdt_enable_key);
>> + /* Remove the default group and cleanup the root */
>> + list_del(&rdtgroup_default.rdtgroup_list);
>> + kernfs_destroy_root(rdt_root);
>
> Why not just add kernfs_remove(rdtgroup_default.kn) to rmdir_all_sub()?

List rdtgroup_default.rdtgroup_list is added during the mount and had to
be removed during umount and rdt_root is destroyed here.
Please see more comments below.

>
>> kernfs_kill_sb(sb);
>> mutex_unlock(&rdtgroup_mutex);
>> cpus_read_unlock();
>> @@ -3598,10 +3616,8 @@ static struct kernfs_syscall_ops rdtgroup_kf_syscall_ops = {
>> .show_options = rdtgroup_show_options,
>> };
>>
>> -static int __init rdtgroup_setup_root(void)
>> +static int rdtgroup_setup_root(void)
>> {
>> - int ret;
>> -
>> rdt_root = kernfs_create_root(&rdtgroup_kf_syscall_ops,
>> KERNFS_ROOT_CREATE_DEACTIVATED |
>> KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK,
>> @@ -3618,19 +3634,11 @@ static int __init rdtgroup_setup_root(void)
>>
>> list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);
>>
>> - ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_CTRL_BASE);
>> - if (ret) {
>> - kernfs_destroy_root(rdt_root);
>> - goto out;
>> - }
>> -
>> rdtgroup_default.kn = kernfs_root_to_node(rdt_root);
>> - kernfs_activate(rdtgroup_default.kn);
>>
>> -out:
>> mutex_unlock(&rdtgroup_mutex);
>>
>> - return ret;
>> + return 0;
>> }
>>
>> static void domain_destroy_mon_state(struct rdt_domain *d)
>> @@ -3752,13 +3760,9 @@ int __init rdtgroup_init(void)
>> seq_buf_init(&last_cmd_status, last_cmd_status_buf,
>> sizeof(last_cmd_status_buf));
>>
>> - ret = rdtgroup_setup_root();
>> - if (ret)
>> - return ret;
>> -
>> ret = sysfs_create_mount_point(fs_kobj, "resctrl");
>> if (ret)
>> - goto cleanup_root;
>> + return ret;
>>
>
> It is not clear to me why this change is required, could you
> please elaborate? It seems that all that is needed is for
> rdtgroup_add_files() to move to rdt_get_tree() (which you have done)
> and then an additional call to kernfs_remove() in rmdir_all_sub().
> I must be missing something, could you please help me understand?
>

Yes. I started with that approach. But there are issues with that approach.

Currently, rdt_root(which is rdtgroup_default.kn) is created during
rdtgroup_init. At the same time the root files are created. Also, default
group is added to rdt_all_groups. Basically, the root files and
rdtgroup_default group is always there even though filesystem is never
mounted. Also mbm_over and cqm_limbo workqueues are always running even
though filesystem is not mounted.

I changed rdtgroup_add_files() to move to rdt_get_tree() and added
kernfs_remove() in rmdir_all_sub(). This caused problems. The
kernfs_remove(rdtgroup_default.kn) removes all the reference counts and
releases the root. When we mount again, we hit this this problem below.

[ 404.558461] ------------[ cut here ]------------
[ 404.563631] WARNING: CPU: 35 PID: 7728 at fs/kernfs/dir.c:522
kernfs_new_node+0x63/0x70

404.778793] ? __warn+0x81/0x140
[ 404.782535] ? kernfs_new_node+0x63/0x70
[ 404.787036] ? report_bug+0x102/0x200
[ 404.791247] ? handle_bug+0x3f/0x70
[ 404.795269] ? exc_invalid_op+0x13/0x60
[ 404.799671] ? asm_exc_invalid_op+0x16/0x20
[ 404.804461] ? kernfs_new_node+0x63/0x70
[ 404.808954] ? snprintf+0x49/0x70
[ 404.812762] __kernfs_create_file+0x30/0xc0
[ 404.817534] rdtgroup_add_files+0x6c/0x100

Basically kernel says your rdt_root is not initialized. That is the reason
I had to move everything to mount time. The rdt_root is created and
initialized during the mount and also destroyed during the umount.
And I had to move rdt_enable_key check during rdt_root creation.

--
Thanks
Babu Moger

2023-07-14 17:12:32

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v5 8/8] x86/resctrl: Introduce RFTYPE_DEBUG flag

Hi Reinette,

On 7/7/23 16:47, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/1/2023 12:02 PM, Babu Moger wrote:
>> Introduce RFTYPE_DEBUG flag which can be used to add files when
>> resctrl is mounted with "-o debug" option. RFTYPE_DEBUG is OR'd
>> with other RFTYPE_ flags. These other flags decide where in resctrl
>> structure these files should be created.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 +++++
>> 2 files changed, 6 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index c07c6517d856..5bc8d371fc3e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -294,6 +294,7 @@ struct rdtgroup {
>> #define RFTYPE_TOP BIT(6)
>> #define RFTYPE_RES_CACHE BIT(8)
>> #define RFTYPE_RES_MB BIT(9)
>> +#define RFTYPE_DEBUG BIT(10)
>> #define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
>> #define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
>> #define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index e03cb01c4742..9e42ecbb9063 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1862,6 +1862,7 @@ static struct rftype res_common_files[] = {
>> .mode = 0444,
>> .kf_ops = &rdtgroup_kf_single_ops,
>> .seq_show = rdtgroup_rmid_show,
>> + .fflags = RFTYPE_BASE | RFTYPE_DEBUG,
>> },
>> {
>> .name = "schemata",
>> @@ -1891,6 +1892,7 @@ static struct rftype res_common_files[] = {
>> .mode = 0444,
>> .kf_ops = &rdtgroup_kf_single_ops,
>> .seq_show = rdtgroup_closid_show,
>> + .fflags = RFTYPE_CTRL_BASE | RFTYPE_DEBUG,
>> },
>>
>> };
>> @@ -1905,6 +1907,9 @@ static int rdtgroup_add_files(struct kernfs_node *kn, unsigned long fflags)
>>
>> lockdep_assert_held(&rdtgroup_mutex);
>>
>> + if (resctrl_debug)
>> + fflags |= RFTYPE_DEBUG;
>> +
>> for (rft = rfts; rft < rfts + len; rft++) {
>> if (rft->fflags && ((fflags & rft->fflags) == rft->fflags)) {
>> ret = rdtgroup_add_file(kn, rft);
>>
>>
>
> I think that the first and last hunk of this patch can be
> squashed with patch 5. From what I can tell it would help
> the motivation of patch 5 and fit nicely with what its
> changelog aims to describe. The remaining hunks can be
> moved to patch 6.
Sure.. Will do.
Thanks
Babu Moger

2023-07-14 22:11:42

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v5 7/8] x86/resctrl: Move default control group creation during mount

Hi Babu,

On 7/14/2023 9:26 AM, Moger, Babu wrote:
> Hi Reinette,
> Sorry.. Took a while to respond. I had to recreate the issue to refresh my
> memory.

No problem!

> On 7/7/23 16:46, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 6/1/2023 12:02 PM, Babu Moger wrote:


>>> ctx = kzalloc(sizeof(struct rdt_fs_context), GFP_KERNEL);
>>> - if (!ctx)
>>> + if (!ctx) {
>>> + kernfs_destroy_root(rdt_root);
>>> return -ENOMEM;
>>> + }
>>>
>>> ctx->kfc.root = rdt_root;
>>> ctx->kfc.magic = RDTGROUP_SUPER_MAGIC;
>>> @@ -2845,6 +2860,9 @@ static void rdt_kill_sb(struct super_block *sb)
>>> static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
>>> static_branch_disable_cpuslocked(&rdt_mon_enable_key);
>>> static_branch_disable_cpuslocked(&rdt_enable_key);
>>> + /* Remove the default group and cleanup the root */
>>> + list_del(&rdtgroup_default.rdtgroup_list);
>>> + kernfs_destroy_root(rdt_root);
>>
>> Why not just add kernfs_remove(rdtgroup_default.kn) to rmdir_all_sub()?
>
> List rdtgroup_default.rdtgroup_list is added during the mount and had to
> be removed during umount and rdt_root is destroyed here.

I do not think it is required for default resource group management to
be tied with the resctrl files associated with default resource group.

I think rdtgroup_setup_root can be split in two, one for all the
resctrl files that should be done at mount/unmount and one for the
default group init done at __init.

>>> kernfs_kill_sb(sb);
>>> mutex_unlock(&rdtgroup_mutex);
>>> cpus_read_unlock();
>>> @@ -3598,10 +3616,8 @@ static struct kernfs_syscall_ops rdtgroup_kf_syscall_ops = {
>>> .show_options = rdtgroup_show_options,
>>> };
>>>
>>> -static int __init rdtgroup_setup_root(void)
>>> +static int rdtgroup_setup_root(void)
>>> {
>>> - int ret;
>>> -
>>> rdt_root = kernfs_create_root(&rdtgroup_kf_syscall_ops,
>>> KERNFS_ROOT_CREATE_DEACTIVATED |
>>> KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK,
>>> @@ -3618,19 +3634,11 @@ static int __init rdtgroup_setup_root(void)
>>>
>>> list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);
>>>
>>> - ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_CTRL_BASE);
>>> - if (ret) {
>>> - kernfs_destroy_root(rdt_root);
>>> - goto out;
>>> - }
>>> -
>>> rdtgroup_default.kn = kernfs_root_to_node(rdt_root);
>>> - kernfs_activate(rdtgroup_default.kn);
>>>
>>> -out:
>>> mutex_unlock(&rdtgroup_mutex);
>>>
>>> - return ret;
>>> + return 0;
>>> }
>>>
>>> static void domain_destroy_mon_state(struct rdt_domain *d)
>>> @@ -3752,13 +3760,9 @@ int __init rdtgroup_init(void)
>>> seq_buf_init(&last_cmd_status, last_cmd_status_buf,
>>> sizeof(last_cmd_status_buf));
>>>
>>> - ret = rdtgroup_setup_root();
>>> - if (ret)
>>> - return ret;
>>> -
>>> ret = sysfs_create_mount_point(fs_kobj, "resctrl");
>>> if (ret)
>>> - goto cleanup_root;
>>> + return ret;
>>>
>>
>> It is not clear to me why this change is required, could you
>> please elaborate? It seems that all that is needed is for
>> rdtgroup_add_files() to move to rdt_get_tree() (which you have done)
>> and then an additional call to kernfs_remove() in rmdir_all_sub().
>> I must be missing something, could you please help me understand?
>>
>
> Yes. I started with that approach. But there are issues with that approach.
>
> Currently, rdt_root(which is rdtgroup_default.kn) is created during
> rdtgroup_init. At the same time the root files are created. Also, default
> group is added to rdt_all_groups. Basically, the root files and
> rdtgroup_default group is always there even though filesystem is never
> mounted. Also mbm_over and cqm_limbo workqueues are always running even
> though filesystem is not mounted.
>
> I changed rdtgroup_add_files() to move to rdt_get_tree() and added
> kernfs_remove() in rmdir_all_sub(). This caused problems. The
> kernfs_remove(rdtgroup_default.kn) removes all the reference counts and
> releases the root. When we mount again, we hit this this problem below.
>
> [ 404.558461] ------------[ cut here ]------------
> [ 404.563631] WARNING: CPU: 35 PID: 7728 at fs/kernfs/dir.c:522
> kernfs_new_node+0x63/0x70
>
> 404.778793] ? __warn+0x81/0x140
> [ 404.782535] ? kernfs_new_node+0x63/0x70
> [ 404.787036] ? report_bug+0x102/0x200
> [ 404.791247] ? handle_bug+0x3f/0x70
> [ 404.795269] ? exc_invalid_op+0x13/0x60
> [ 404.799671] ? asm_exc_invalid_op+0x16/0x20
> [ 404.804461] ? kernfs_new_node+0x63/0x70
> [ 404.808954] ? snprintf+0x49/0x70
> [ 404.812762] __kernfs_create_file+0x30/0xc0
> [ 404.817534] rdtgroup_add_files+0x6c/0x100
>
> Basically kernel says your rdt_root is not initialized. That is the reason
> I had to move everything to mount time. The rdt_root is created and
> initialized during the mount and also destroyed during the umount.
> And I had to move rdt_enable_key check during rdt_root creation.
>

ok, thank you for the additional details. I see now how this patch evolved.
I understand how rdt_root needs to be created/destroyed
during mount/unmount. If I understand correctly the changes to
rdt_init_fs_context() was motivated by this line:

ctx->kfc.root = rdt_root;

... that prompted you to move rdt_root creation there in order to have
it present for this assignment and that prompted the
rdt_enable_key check to follow. Is this correct?

I am concerned about the changes to rdt_init_fs_context() since it further
separates the resctrl file management, it breaks the symmetry of the
key checked and set, and finally these new actions seem unrelated to a function
named "init_fs_context". I looked at other examples and from what I can tell
it is not required that ctx->kfc.root be initialized within
rdt_init_fs_context(). Looks like the value is required by kernfs_get_tree()
that is called from rdt_get_tree(). For comparison I found cgroup_do_get_tree().
Note how cgroup_do_get_tree(), within the .get_tree callback,
initializes kernfs_fs_context.root and then call kernfs_get_tree()?

It thus looks to me as though things can be simplified significantly
if the kernfs_fs_context.root assignment is moved from rdt_init_fs_context()
to rdt_get_tree(). rdt_get_tree() can then create rdt_root (and add all needed
files), assign it to kernfs_fs_context.root and call kernfs_get_tree().

What do you think?

Reinette



2023-07-14 22:57:51

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v5 7/8] x86/resctrl: Move default control group creation during mount

Hi Reinette,

On 7/14/23 16:54, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/14/2023 9:26 AM, Moger, Babu wrote:
>> Hi Reinette,
>> Sorry.. Took a while to respond. I had to recreate the issue to refresh my
>> memory.
>
> No problem!
>
>> On 7/7/23 16:46, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 6/1/2023 12:02 PM, Babu Moger wrote:
>
>
>>>> ctx = kzalloc(sizeof(struct rdt_fs_context), GFP_KERNEL);
>>>> - if (!ctx)
>>>> + if (!ctx) {
>>>> + kernfs_destroy_root(rdt_root);
>>>> return -ENOMEM;
>>>> + }
>>>>
>>>> ctx->kfc.root = rdt_root;
>>>> ctx->kfc.magic = RDTGROUP_SUPER_MAGIC;
>>>> @@ -2845,6 +2860,9 @@ static void rdt_kill_sb(struct super_block *sb)
>>>> static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
>>>> static_branch_disable_cpuslocked(&rdt_mon_enable_key);
>>>> static_branch_disable_cpuslocked(&rdt_enable_key);
>>>> + /* Remove the default group and cleanup the root */
>>>> + list_del(&rdtgroup_default.rdtgroup_list);
>>>> + kernfs_destroy_root(rdt_root);
>>>
>>> Why not just add kernfs_remove(rdtgroup_default.kn) to rmdir_all_sub()?
>>
>> List rdtgroup_default.rdtgroup_list is added during the mount and had to
>> be removed during umount and rdt_root is destroyed here.
>
> I do not think it is required for default resource group management to
> be tied with the resctrl files associated with default resource group.
>
> I think rdtgroup_setup_root can be split in two, one for all the
> resctrl files that should be done at mount/unmount and one for the
> default group init done at __init.

Ok.
>
>>>> kernfs_kill_sb(sb);
>>>> mutex_unlock(&rdtgroup_mutex);
>>>> cpus_read_unlock();
>>>> @@ -3598,10 +3616,8 @@ static struct kernfs_syscall_ops rdtgroup_kf_syscall_ops = {
>>>> .show_options = rdtgroup_show_options,
>>>> };
>>>>
>>>> -static int __init rdtgroup_setup_root(void)
>>>> +static int rdtgroup_setup_root(void)
>>>> {
>>>> - int ret;
>>>> -
>>>> rdt_root = kernfs_create_root(&rdtgroup_kf_syscall_ops,
>>>> KERNFS_ROOT_CREATE_DEACTIVATED |
>>>> KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK,
>>>> @@ -3618,19 +3634,11 @@ static int __init rdtgroup_setup_root(void)
>>>>
>>>> list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);
>>>>
>>>> - ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_CTRL_BASE);
>>>> - if (ret) {
>>>> - kernfs_destroy_root(rdt_root);
>>>> - goto out;
>>>> - }
>>>> -
>>>> rdtgroup_default.kn = kernfs_root_to_node(rdt_root);
>>>> - kernfs_activate(rdtgroup_default.kn);
>>>>
>>>> -out:
>>>> mutex_unlock(&rdtgroup_mutex);
>>>>
>>>> - return ret;
>>>> + return 0;
>>>> }
>>>>
>>>> static void domain_destroy_mon_state(struct rdt_domain *d)
>>>> @@ -3752,13 +3760,9 @@ int __init rdtgroup_init(void)
>>>> seq_buf_init(&last_cmd_status, last_cmd_status_buf,
>>>> sizeof(last_cmd_status_buf));
>>>>
>>>> - ret = rdtgroup_setup_root();
>>>> - if (ret)
>>>> - return ret;
>>>> -
>>>> ret = sysfs_create_mount_point(fs_kobj, "resctrl");
>>>> if (ret)
>>>> - goto cleanup_root;
>>>> + return ret;
>>>>
>>>
>>> It is not clear to me why this change is required, could you
>>> please elaborate? It seems that all that is needed is for
>>> rdtgroup_add_files() to move to rdt_get_tree() (which you have done)
>>> and then an additional call to kernfs_remove() in rmdir_all_sub().
>>> I must be missing something, could you please help me understand?
>>>
>>
>> Yes. I started with that approach. But there are issues with that approach.
>>
>> Currently, rdt_root(which is rdtgroup_default.kn) is created during
>> rdtgroup_init. At the same time the root files are created. Also, default
>> group is added to rdt_all_groups. Basically, the root files and
>> rdtgroup_default group is always there even though filesystem is never
>> mounted. Also mbm_over and cqm_limbo workqueues are always running even
>> though filesystem is not mounted.
>>
>> I changed rdtgroup_add_files() to move to rdt_get_tree() and added
>> kernfs_remove() in rmdir_all_sub(). This caused problems. The
>> kernfs_remove(rdtgroup_default.kn) removes all the reference counts and
>> releases the root. When we mount again, we hit this this problem below.
>>
>> [ 404.558461] ------------[ cut here ]------------
>> [ 404.563631] WARNING: CPU: 35 PID: 7728 at fs/kernfs/dir.c:522
>> kernfs_new_node+0x63/0x70
>>
>> 404.778793] ? __warn+0x81/0x140
>> [ 404.782535] ? kernfs_new_node+0x63/0x70
>> [ 404.787036] ? report_bug+0x102/0x200
>> [ 404.791247] ? handle_bug+0x3f/0x70
>> [ 404.795269] ? exc_invalid_op+0x13/0x60
>> [ 404.799671] ? asm_exc_invalid_op+0x16/0x20
>> [ 404.804461] ? kernfs_new_node+0x63/0x70
>> [ 404.808954] ? snprintf+0x49/0x70
>> [ 404.812762] __kernfs_create_file+0x30/0xc0
>> [ 404.817534] rdtgroup_add_files+0x6c/0x100
>>
>> Basically kernel says your rdt_root is not initialized. That is the reason
>> I had to move everything to mount time. The rdt_root is created and
>> initialized during the mount and also destroyed during the umount.
>> And I had to move rdt_enable_key check during rdt_root creation.
>>
>
> ok, thank you for the additional details. I see now how this patch evolved.
> I understand how rdt_root needs to be created/destroyed
> during mount/unmount. If I understand correctly the changes to
> rdt_init_fs_context() was motivated by this line:
>
> ctx->kfc.root = rdt_root;
>
> ... that prompted you to move rdt_root creation there in order to have
> it present for this assignment and that prompted the
> rdt_enable_key check to follow. Is this correct?

That is correct.

>
> I am concerned about the changes to rdt_init_fs_context() since it further
> separates the resctrl file management, it breaks the symmetry of the
> key checked and set, and finally these new actions seem unrelated to a function
> named "init_fs_context". I looked at other examples and from what I can tell
> it is not required that ctx->kfc.root be initialized within
> rdt_init_fs_context(). Looks like the value is required by kernfs_get_tree()
> that is called from rdt_get_tree(). For comparison I found cgroup_do_get_tree().
> Note how cgroup_do_get_tree(), within the .get_tree callback,
> initializes kernfs_fs_context.root and then call kernfs_get_tree()?

Yes. I see that. Thanks for pointing.

>
> It thus looks to me as though things can be simplified significantly
> if the kernfs_fs_context.root assignment is moved from rdt_init_fs_context()
> to rdt_get_tree(). rdt_get_tree() can then create rdt_root (and add all needed
> files), assign it to kernfs_fs_context.root and call kernfs_get_tree().
>
> What do you think?

Yes. I think we can do that. Let me try it. Will let you know how it goes.
Thanks for the suggestion.
--
Thanks
Babu Moger