2023-03-02 20:24:30

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 0/7] x86/resctrl: Miscellaneous resctrl features

These series adds support few minor features.
1. Support assigning multiple tasks to control/mon groups in one command.
2. Add debug mount option for resctrl interface.
3. Add RMID and CLOSID in resctrl interface when mounted with debug option.
4. While doing these above changes, found that rftype flags needed some cleanup.
They were named inconsistently. Re-arranged them much more cleanly now.
Hope it can help future additions.

Code is built on top of tip master (commit 7fa08de735e41001).
---
v3: Changes since v2
Still waiting for more comments. While waiting, addressed few comments from Fenghua.
Added few more texts in the documentation about multiple tasks assignment feature.
Added pid in last_cmd_status when applicable.
Introduced static resctrl_debug to save the debug option.
Few minor text changes.

v2: Changes since v1
a. Removed the changes to add the task's threads automatically. It required
book keeping to handle the failures and gets complicated. Removed that change
for now.
b. Added -o debug option to mount in debug mode(comment from Fenghua)
c. Added debug files rmid and closid. Stephane wanted to rename them more
generic to accommodate ARM. It kind of loses meaning if is renamed differently.
Kept it same for now. Will change if he feels strong about it.

v2: https://lore.kernel.org/lkml/167537433143.647488.9641864719195184123.stgit@bmoger-ubuntu/
v1: https://lore.kernel.org/lkml/167278351577.34228.12803395505584557101.stgit@bmoger-ubuntu/

Babu Moger (7):
x86/resctrl: Add multiple tasks to the resctrl group at once
x86/resctrl: Remove few unnecessary rftype flags
x86/resctrl: Rename rftype flags for consistency
x86/resctrl: Re-arrange RFTYPE flags based on hierarchy
x86/resctrl: Display the RMID and COSID for resctrl groups
x86/resctrl: Introduce -o debug mount option
x86/resctrl: Add debug files when mounted with debug option


Documentation/x86/resctrl.rst | 30 ++++-
arch/x86/kernel/cpu/resctrl/core.c | 8 +-
arch/x86/kernel/cpu/resctrl/internal.h | 68 ++++++++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 179 +++++++++++++++++++++----
4 files changed, 242 insertions(+), 43 deletions(-)

--



2023-03-02 20:24:37

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

The resctrl task assignment for MONITOR or CONTROL group needs to be
done one at a time. For example:

$mount -t resctrl resctrl /sys/fs/resctrl/
$mkdir /sys/fs/resctrl/clos1
$echo 123 > /sys/fs/resctrl/clos1/tasks
$echo 456 > /sys/fs/resctrl/clos1/tasks
$echo 789 > /sys/fs/resctrl/clos1/tasks

This is not user-friendly when dealing with hundreds of tasks. Also,
there is a syscall overhead for each command executed from user space.

It can be improved by supporting the multiple task assignment in one
command with the tasks separated by commas. For example:

$echo 123,456,789 > /sys/fs/resctrl/clos1/tasks

Signed-off-by: Babu Moger <[email protected]>
---
Documentation/x86/resctrl.rst | 11 +++++++++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 +++++++++++++++++++++++-
2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 058257dc56c8..25203f20002d 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -292,13 +292,20 @@ All groups contain the following files:
"tasks":
Reading this file shows the list of all tasks that belong to
this group. Writing a task id to the file will add a task to the
- group. If the group is a CTRL_MON group the task is removed from
+ group. Multiple tasks can be assigned together in one command by
+ inputting the tasks separated by commas. Tasks will be assigned
+ sequentially in the order it is provided. Failure while assigning
+ the tasks will be aborted immediately. The tasks before the failure
+ will be assigned and the tasks next in the sequence will not be
+ assigned. Users may need to retry them again. The failure details
+ will be logged in resctrl/info/last_cmd_status file.
+
+ If the group is a CTRL_MON group the task is removed from
whichever previous CTRL_MON group owned the task and also from
any MON group that owned the task. If the group is a MON group,
then the task must already belong to the CTRL_MON parent of this
group. The task is removed from any previous MON group.

-
"cpus":
Reading this file shows a bitmask of the logical CPUs owned by
this group. Writing a mask to this file will add and remove
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e2c1599d1b37..15ea5b550fe9 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct rdtgroup *rdtgrp;
+ char *pid_str;
int ret = 0;
pid_t pid;

- if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
+ /* Valid input requires a trailing newline */
+ if (nbytes == 0 || buf[nbytes - 1] != '\n')
return -EINVAL;
+
+ buf[nbytes - 1] = '\0';
+
rdtgrp = rdtgroup_kn_lock_live(of->kn);
if (!rdtgrp) {
rdtgroup_kn_unlock(of->kn);
return -ENOENT;
}
+
+next:
+ if (!buf || buf[0] == '\0')
+ goto unlock;
+
+ pid_str = strim(strsep(&buf, ","));
+
+ if (kstrtoint(pid_str, 0, &pid) || pid < 0) {
+ rdt_last_cmd_printf("Invalid pid %d value\n", pid);
+ ret = -EINVAL;
+ goto unlock;
+ }
+
rdt_last_cmd_clear();

if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED ||
@@ -703,6 +721,10 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
}

ret = rdtgroup_move_task(pid, rdtgrp, of);
+ if (ret)
+ goto unlock;
+ else
+ goto next;

unlock:
rdtgroup_kn_unlock(of->kn);



2023-03-02 20:24:48

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 2/7] x86/resctrl: Remove few unnecessary rftype flags

Remove few unnecessary rftype flags and simplify the code. This is done
to further cleanup the code eventually.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/internal.h | 9 +++------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++++---
2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 8edecc5763d8..571145d75d29 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -243,12 +243,9 @@ struct rdtgroup {
*/
#define RFTYPE_INFO BIT(0)
#define RFTYPE_BASE BIT(1)
-#define RF_CTRLSHIFT 4
-#define RF_MONSHIFT 5
-#define RF_TOPSHIFT 6
-#define RFTYPE_CTRL BIT(RF_CTRLSHIFT)
-#define RFTYPE_MON BIT(RF_MONSHIFT)
-#define RFTYPE_TOP BIT(RF_TOPSHIFT)
+#define RFTYPE_CTRL BIT(4)
+#define RFTYPE_MON BIT(5)
+#define RFTYPE_TOP BIT(6)
#define RFTYPE_RES_CACHE BIT(8)
#define RFTYPE_RES_MB BIT(9)
#define RF_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 15ea5b550fe9..3c86506e54c1 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3163,7 +3163,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
{
struct rdtgroup *prdtgrp, *rdtgrp;
struct kernfs_node *kn;
- uint files = 0;
+ uint fflags = 0;
int ret;

prdtgrp = rdtgroup_kn_lock_live(parent_kn);
@@ -3215,8 +3215,12 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
goto out_destroy;
}

- files = RFTYPE_BASE | BIT(RF_CTRLSHIFT + rtype);
- ret = rdtgroup_add_files(kn, files);
+ if (rtype == RDTCTRL_GROUP)
+ fflags = RFTYPE_BASE | RFTYPE_CTRL;
+ else
+ fflags = RFTYPE_BASE | RFTYPE_MON;
+
+ ret = rdtgroup_add_files(kn, fflags);
if (ret) {
rdt_last_cmd_puts("kernfs fill error\n");
goto out_destroy;



2023-03-02 20:25:35

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 3/7] x86/resctrl: Rename rftype flags for consistency

The rftype flags have two different prefixes even though they are used
for the same purpose. Change the prefix to RFTYPE_ for all the flags.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/internal.h | 8 +++---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 42 ++++++++++++++++----------------
2 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 571145d75d29..2cfc3c431d5b 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -248,10 +248,10 @@ struct rdtgroup {
#define RFTYPE_TOP BIT(6)
#define RFTYPE_RES_CACHE BIT(8)
#define RFTYPE_RES_MB BIT(9)
-#define RF_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
-#define RF_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
-#define RF_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
-#define RF_CTRL_BASE (RFTYPE_BASE | RFTYPE_CTRL)
+#define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
+#define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
+#define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
+#define RFTYPE_CTRL_BASE (RFTYPE_BASE | RFTYPE_CTRL)

/* List of all resource groups */
extern struct list_head rdt_all_groups;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 3c86506e54c1..fc7d1e778bff 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1691,77 +1691,77 @@ static struct rftype res_common_files[] = {
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_last_cmd_status_show,
- .fflags = RF_TOP_INFO,
+ .fflags = RFTYPE_TOP_INFO,
},
{
.name = "num_closids",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_num_closids_show,
- .fflags = RF_CTRL_INFO,
+ .fflags = RFTYPE_CTRL_INFO,
},
{
.name = "mon_features",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_mon_features_show,
- .fflags = RF_MON_INFO,
+ .fflags = RFTYPE_MON_INFO,
},
{
.name = "num_rmids",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_num_rmids_show,
- .fflags = RF_MON_INFO,
+ .fflags = RFTYPE_MON_INFO,
},
{
.name = "cbm_mask",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_default_ctrl_show,
- .fflags = RF_CTRL_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
},
{
.name = "min_cbm_bits",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_min_cbm_bits_show,
- .fflags = RF_CTRL_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
},
{
.name = "shareable_bits",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_shareable_bits_show,
- .fflags = RF_CTRL_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
},
{
.name = "bit_usage",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_bit_usage_show,
- .fflags = RF_CTRL_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
},
{
.name = "min_bandwidth",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_min_bw_show,
- .fflags = RF_CTRL_INFO | RFTYPE_RES_MB,
+ .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB,
},
{
.name = "bandwidth_gran",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_bw_gran_show,
- .fflags = RF_CTRL_INFO | RFTYPE_RES_MB,
+ .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB,
},
{
.name = "delay_linear",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_delay_linear_show,
- .fflags = RF_CTRL_INFO | RFTYPE_RES_MB,
+ .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB,
},
/*
* Platform specific which (if any) capabilities are provided by
@@ -1780,7 +1780,7 @@ static struct rftype res_common_files[] = {
.kf_ops = &rdtgroup_kf_single_ops,
.write = max_threshold_occ_write,
.seq_show = max_threshold_occ_show,
- .fflags = RF_MON_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE,
},
{
.name = "mbm_total_bytes_config",
@@ -1827,7 +1827,7 @@ static struct rftype res_common_files[] = {
.kf_ops = &rdtgroup_kf_single_ops,
.write = rdtgroup_schemata_write,
.seq_show = rdtgroup_schemata_show,
- .fflags = RF_CTRL_BASE,
+ .fflags = RFTYPE_CTRL_BASE,
},
{
.name = "mode",
@@ -1835,14 +1835,14 @@ static struct rftype res_common_files[] = {
.kf_ops = &rdtgroup_kf_single_ops,
.write = rdtgroup_mode_write,
.seq_show = rdtgroup_mode_show,
- .fflags = RF_CTRL_BASE,
+ .fflags = RFTYPE_CTRL_BASE,
},
{
.name = "size",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdtgroup_size_show,
- .fflags = RF_CTRL_BASE,
+ .fflags = RFTYPE_CTRL_BASE,
},

};
@@ -1899,7 +1899,7 @@ void __init thread_throttle_mode_init(void)
if (!rft)
return;

- rft->fflags = RF_CTRL_INFO | RFTYPE_RES_MB;
+ rft->fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB;
}

void __init mbm_config_rftype_init(const char *config)
@@ -1908,7 +1908,7 @@ void __init mbm_config_rftype_init(const char *config)

rft = rdtgroup_get_rftype_by_name(config);
if (rft)
- rft->fflags = RF_MON_INFO | RFTYPE_RES_CACHE;
+ rft->fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE;
}

/**
@@ -2043,21 +2043,21 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
if (IS_ERR(kn_info))
return PTR_ERR(kn_info);

- ret = rdtgroup_add_files(kn_info, RF_TOP_INFO);
+ ret = rdtgroup_add_files(kn_info, RFTYPE_TOP_INFO);
if (ret)
goto out_destroy;

/* loop over enabled controls, these are all alloc_capable */
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- fflags = r->fflags | RF_CTRL_INFO;
+ fflags = r->fflags | RFTYPE_CTRL_INFO;
ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
if (ret)
goto out_destroy;
}

for_each_mon_capable_rdt_resource(r) {
- fflags = r->fflags | RF_MON_INFO;
+ fflags = r->fflags | RFTYPE_MON_INFO;
sprintf(name, "%s_MON", r->name);
ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
if (ret)
@@ -3554,7 +3554,7 @@ static int __init rdtgroup_setup_root(void)

list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);

- ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RF_CTRL_BASE);
+ ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_CTRL_BASE);
if (ret) {
kernfs_destroy_root(rdt_root);
goto out;



2023-03-02 20:26:08

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

When a user creates a control or monitor group, the CLOSID or RMID
are not visible to the user. These are architecturally defined entities.
There is no harm in displaying these in resctrl groups. Sometimes it
can help to debug the issues.

Add CLOSID and RMID to the control/monitor groups display in resctrl
interface.

$cat /sys/fs/resctrl/clos1/closid
1
$cat /sys/fs/resctrl/mon_groups/mon1/rmid
3

Signed-off-by: Babu Moger <[email protected]>
---
Documentation/x86/resctrl.rst | 17 ++++++++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 ++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)

diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 25203f20002d..67eae74fe40c 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -321,6 +321,15 @@ All groups contain the following files:
Just like "cpus", only using ranges of CPUs instead of bitmasks.


+"rmid":
+ Available only with debug option.Reading this file shows the
+ Resource Monitoring ID (RMID) for monitoring the resource
+ utilization. Monitoring is performed by tagging each core (or
+ thread) or process via a RMID. Kernel assigns a new RMID when
+ a group is created depending on the available RMIDs. Multiple
+ cores (or threads) or processes can share a same RMID in a resctrl
+ domain.
+
When control is enabled all CTRL_MON groups will also contain:

"schemata":
@@ -342,6 +351,14 @@ When control is enabled all CTRL_MON groups will also contain:
file. On successful pseudo-locked region creation the mode will
automatically change to "pseudo-locked".

+"closid":
+ Available only with debug option. Reading this file shows the
+ Class of Service (CLOS) id which acts as a resource control tag
+ on which the resources can be throttled. Kernel assigns a new
+ CLOSID a control group is created depending on the available
+ CLOSIDs. Multiple cores(or threads) or processes can share a
+ same CLOSID in a resctrl domain.
+
When monitoring is enabled all MON groups will also contain:

"mon_data":
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1eb538965bd3..389d64b42704 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -760,6 +760,38 @@ static int rdtgroup_tasks_show(struct kernfs_open_file *of,
return ret;
}

+static int rdtgroup_closid_show(struct kernfs_open_file *of,
+ struct seq_file *s, void *v)
+{
+ struct rdtgroup *rdtgrp;
+ int ret = 0;
+
+ rdtgrp = rdtgroup_kn_lock_live(of->kn);
+ if (rdtgrp)
+ seq_printf(s, "%u\n", rdtgrp->closid);
+ else
+ ret = -ENOENT;
+ rdtgroup_kn_unlock(of->kn);
+
+ return ret;
+}
+
+static int rdtgroup_rmid_show(struct kernfs_open_file *of,
+ struct seq_file *s, void *v)
+{
+ struct rdtgroup *rdtgrp;
+ int ret = 0;
+
+ rdtgrp = rdtgroup_kn_lock_live(of->kn);
+ if (rdtgrp)
+ seq_printf(s, "%u\n", rdtgrp->mon.rmid);
+ else
+ ret = -ENOENT;
+ rdtgroup_kn_unlock(of->kn);
+
+ return ret;
+}
+
#ifdef CONFIG_PROC_CPU_RESCTRL

/*
@@ -1821,6 +1853,12 @@ static struct rftype res_common_files[] = {
.seq_show = rdtgroup_tasks_show,
.fflags = RFTYPE_BASE,
},
+ {
+ .name = "rmid",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = rdtgroup_rmid_show,
+ },
{
.name = "schemata",
.mode = 0644,
@@ -1844,6 +1882,12 @@ static struct rftype res_common_files[] = {
.seq_show = rdtgroup_size_show,
.fflags = RFTYPE_BASE_CTRL,
},
+ {
+ .name = "closid",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = rdtgroup_closid_show,
+ },

};




2023-03-02 20:26:08

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 4/7] x86/resctrl: Re-arrange RFTYPE flags based on hierarchy

RESCTRL filesystem has two main components:
a. info (Details on resources and monitoring)
b. base (Details on CONTROL and MON groups)

The rftype flags can be renamed accordingly for better understanding.
For example:
RFTYPE_INFO : Files with these flags go in info directory
RFTYPE_INFO_MON : Files with these flags go in info/L3_MON
RFTYPE_BASE : Files with these flags go in group's(control or mon)
base directory
RFTYPE_BASE_CTRL: Files with these flags go in only CONTROL groups

Add comments to make it easy for future additions.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 8 ++--
arch/x86/kernel/cpu/resctrl/internal.h | 64 ++++++++++++++++++++++++++++----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 +++++++++++-----------
3 files changed, 81 insertions(+), 35 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 030d3b409768..d1c6b2cc8611 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -69,7 +69,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.domains = domain_init(RDT_RESOURCE_L3),
.parse_ctrlval = parse_cbm,
.format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_CACHE,
},
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -83,7 +83,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.domains = domain_init(RDT_RESOURCE_L2),
.parse_ctrlval = parse_cbm,
.format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_CACHE,
},
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
@@ -97,7 +97,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.domains = domain_init(RDT_RESOURCE_MBA),
.parse_ctrlval = parse_bw,
.format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
+ .fflags = RFTYPE_MB,
},
},
[RDT_RESOURCE_SMBA] =
@@ -109,7 +109,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.domains = domain_init(RDT_RESOURCE_SMBA),
.parse_ctrlval = parse_bw,
.format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
+ .fflags = RFTYPE_MB,
},
},
};
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 2cfc3c431d5b..6767c85b9699 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -240,18 +240,64 @@ struct rdtgroup {

/*
* Define the file type flags for base and info directories.
+ *
+ * RESCTRL filesystem has two main components
+ * a. info
+ * b. base.
+ *
+ * /sys/fs/resctrl/
+ * |
+ * --> info (Displays information about control and monitoring)
+ * |
+ * --> base (Displays the details on resctrl groups)
+ *
+ * -------------------------------------------------------------
+ * info directory structure
+ * --> INFO
+ * --> TOP
+ * last_cmd_status
+ *
+ * --> MON
+ * --> (L2_MON)
+ * --> (L3_MON)
+ * max_threshold_occupancy, mbm_local_bytes_config,
+ * mbm_total_bytes_config, mon_features, num_rmids
+ *
+ * --> RES
+ * --> CACHE (L2, L3)
+ * bit_usage, cbm_mask, min_cbm_bits, num_closids,
+ * shareable_bits
+ *
+ * --> MB (MB, SMBA)
+ * bandwidth_gran, delay_linear, min_bandwidth,
+ * num_closids
+ *
+ * group structure
+ * -----------------------------------------------------------
+ * --> BASE (Files common for both MON and CTRL groups)
+ * cpus, cpus_list, tasks
+ *
+ * --> CTRL (Files only for CTRL group)
+ * mode, schemata, size
+ *
*/
#define RFTYPE_INFO BIT(0)
#define RFTYPE_BASE BIT(1)
-#define RFTYPE_CTRL BIT(4)
-#define RFTYPE_MON BIT(5)
-#define RFTYPE_TOP BIT(6)
-#define RFTYPE_RES_CACHE BIT(8)
-#define RFTYPE_RES_MB BIT(9)
-#define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
-#define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
-#define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
-#define RFTYPE_CTRL_BASE (RFTYPE_BASE | RFTYPE_CTRL)
+
+#define RFTYPE_TOP BIT(2)
+#define RFTYPE_MON BIT(3)
+#define RFTYPE_RES BIT(4)
+
+#define RFTYPE_CACHE BIT(5)
+#define RFTYPE_MB BIT(6)
+
+#define RFTYPE_CTRL BIT(8)
+
+#define RFTYPE_INFO_TOP (RFTYPE_INFO | RFTYPE_TOP)
+#define RFTYPE_INFO_MON (RFTYPE_INFO | RFTYPE_MON)
+#define RFTYPE_INFO_RES (RFTYPE_INFO | RFTYPE_RES)
+
+#define RFTYPE_BASE_CTRL (RFTYPE_BASE | RFTYPE_CTRL)

/* List of all resource groups */
extern struct list_head rdt_all_groups;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index fc7d1e778bff..1eb538965bd3 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1691,77 +1691,77 @@ static struct rftype res_common_files[] = {
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_last_cmd_status_show,
- .fflags = RFTYPE_TOP_INFO,
+ .fflags = RFTYPE_INFO_TOP,
},
{
.name = "num_closids",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_num_closids_show,
- .fflags = RFTYPE_CTRL_INFO,
+ .fflags = RFTYPE_INFO_RES,
},
{
.name = "mon_features",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_mon_features_show,
- .fflags = RFTYPE_MON_INFO,
+ .fflags = RFTYPE_INFO_MON,
},
{
.name = "num_rmids",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_num_rmids_show,
- .fflags = RFTYPE_MON_INFO,
+ .fflags = RFTYPE_INFO_MON,
},
{
.name = "cbm_mask",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_default_ctrl_show,
- .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_INFO_RES | RFTYPE_CACHE,
},
{
.name = "min_cbm_bits",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_min_cbm_bits_show,
- .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_INFO_RES | RFTYPE_CACHE,
},
{
.name = "shareable_bits",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_shareable_bits_show,
- .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_INFO_RES | RFTYPE_CACHE,
},
{
.name = "bit_usage",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_bit_usage_show,
- .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_INFO_RES | RFTYPE_CACHE,
},
{
.name = "min_bandwidth",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_min_bw_show,
- .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB,
+ .fflags = RFTYPE_INFO_RES | RFTYPE_MB,
},
{
.name = "bandwidth_gran",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_bw_gran_show,
- .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB,
+ .fflags = RFTYPE_INFO_RES | RFTYPE_MB,
},
{
.name = "delay_linear",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdt_delay_linear_show,
- .fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB,
+ .fflags = RFTYPE_INFO_RES | RFTYPE_MB,
},
/*
* Platform specific which (if any) capabilities are provided by
@@ -1780,7 +1780,7 @@ static struct rftype res_common_files[] = {
.kf_ops = &rdtgroup_kf_single_ops,
.write = max_threshold_occ_write,
.seq_show = max_threshold_occ_show,
- .fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE,
+ .fflags = RFTYPE_INFO_MON | RFTYPE_CACHE,
},
{
.name = "mbm_total_bytes_config",
@@ -1827,7 +1827,7 @@ static struct rftype res_common_files[] = {
.kf_ops = &rdtgroup_kf_single_ops,
.write = rdtgroup_schemata_write,
.seq_show = rdtgroup_schemata_show,
- .fflags = RFTYPE_CTRL_BASE,
+ .fflags = RFTYPE_BASE_CTRL,
},
{
.name = "mode",
@@ -1835,14 +1835,14 @@ static struct rftype res_common_files[] = {
.kf_ops = &rdtgroup_kf_single_ops,
.write = rdtgroup_mode_write,
.seq_show = rdtgroup_mode_show,
- .fflags = RFTYPE_CTRL_BASE,
+ .fflags = RFTYPE_BASE_CTRL,
},
{
.name = "size",
.mode = 0444,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = rdtgroup_size_show,
- .fflags = RFTYPE_CTRL_BASE,
+ .fflags = RFTYPE_BASE_CTRL,
},

};
@@ -1899,7 +1899,7 @@ void __init thread_throttle_mode_init(void)
if (!rft)
return;

- rft->fflags = RFTYPE_CTRL_INFO | RFTYPE_RES_MB;
+ rft->fflags = RFTYPE_INFO_RES | RFTYPE_MB;
}

void __init mbm_config_rftype_init(const char *config)
@@ -1908,7 +1908,7 @@ void __init mbm_config_rftype_init(const char *config)

rft = rdtgroup_get_rftype_by_name(config);
if (rft)
- rft->fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE;
+ rft->fflags = RFTYPE_INFO_MON | RFTYPE_CACHE;
}

/**
@@ -2043,21 +2043,21 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
if (IS_ERR(kn_info))
return PTR_ERR(kn_info);

- ret = rdtgroup_add_files(kn_info, RFTYPE_TOP_INFO);
+ ret = rdtgroup_add_files(kn_info, RFTYPE_INFO_TOP);
if (ret)
goto out_destroy;

/* loop over enabled controls, these are all alloc_capable */
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- fflags = r->fflags | RFTYPE_CTRL_INFO;
+ fflags = r->fflags | RFTYPE_INFO_RES;
ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
if (ret)
goto out_destroy;
}

for_each_mon_capable_rdt_resource(r) {
- fflags = r->fflags | RFTYPE_MON_INFO;
+ fflags = r->fflags | RFTYPE_INFO_MON;
sprintf(name, "%s_MON", r->name);
ret = rdtgroup_mkdir_info_resdir(r, name, fflags);
if (ret)
@@ -3218,7 +3218,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
if (rtype == RDTCTRL_GROUP)
fflags = RFTYPE_BASE | RFTYPE_CTRL;
else
- fflags = RFTYPE_BASE | RFTYPE_MON;
+ fflags = RFTYPE_BASE;

ret = rdtgroup_add_files(kn, fflags);
if (ret) {
@@ -3554,7 +3554,7 @@ static int __init rdtgroup_setup_root(void)

list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);

- ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_CTRL_BASE);
+ ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_BASE_CTRL);
if (ret) {
kernfs_destroy_root(rdt_root);
goto out;



2023-03-02 20:26:39

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 7/7] x86/resctrl: Add debug files when mounted with debug option

Add the debug files to the resctrl hierarchy.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 48 +++++++++++++++++++++++++++++++-
1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index a1f13715a65c..790c6b9f9031 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2400,6 +2400,45 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn,
struct rdtgroup *prgrp,
struct kernfs_node **mon_data_kn);

+void resctrl_add_debug_file(struct kernfs_node *parent_kn, const char *config,
+ unsigned long fflags)
+{
+ struct rftype *rft;
+
+ rft = rdtgroup_get_rftype_by_name(config);
+ if (rft) {
+ rft->fflags |= fflags;
+ rdtgroup_add_file(parent_kn, rft);
+ }
+}
+
+static void resctrl_add_debug_files(void)
+{
+ resctrl_add_debug_file(rdtgroup_default.kn, "rmid", RFTYPE_BASE);
+ resctrl_add_debug_file(rdtgroup_default.kn, "closid", RFTYPE_BASE_CTRL);
+ kernfs_activate(rdtgroup_default.kn);
+}
+
+void resctrl_remove_debug_file(struct kernfs_node *parent_kn,
+ const char *config, unsigned long fflags)
+{
+ struct rftype *rft;
+
+ rft = rdtgroup_get_rftype_by_name(config);
+ if (rft) {
+ rft->fflags &= ~fflags;
+ kernfs_remove_by_name(parent_kn, config);
+ }
+}
+
+static void resctrl_remove_debug_files(void)
+{
+ resctrl_remove_debug_file(rdtgroup_default.kn, "rmid", RFTYPE_BASE);
+ resctrl_remove_debug_file(rdtgroup_default.kn, "closid",
+ RFTYPE_BASE_CTRL);
+ kernfs_activate(rdtgroup_default.kn);
+}
+
static int rdt_enable_ctx(struct rdt_fs_context *ctx)
{
int ret = 0;
@@ -2413,8 +2452,10 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
if (!ret && ctx->enable_mba_mbps)
ret = set_mba_sc(true);

- if (!ret && ctx->debug)
+ if (!ret && ctx->debug) {
resctrl_debug = true;
+ resctrl_add_debug_files();
+ }

return ret;
}
@@ -2831,6 +2872,11 @@ static void rdt_kill_sb(struct super_block *sb)

set_mba_sc(false);

+ if (resctrl_debug) {
+ resctrl_remove_debug_files();
+ resctrl_debug = false;
+ }
+
/*Put everything back to default values. */
for_each_alloc_capable_rdt_resource(r)
reset_all_ctrls(r);



2023-03-02 20:26:39

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v3 6/7] x86/resctrl: Introduce -o debug mount option

Add -o debug option to mount resctrl filesystem in debug mode. Debug
option adds the files for debug purposes.

Signed-off-by: Babu Moger <[email protected]>
---
Documentation/x86/resctrl.rst | 2 ++
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 13 +++++++++++++
3 files changed, 16 insertions(+)

diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 67eae74fe40c..1ada4e0650dc 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -46,6 +46,8 @@ mount options are:
"mba_MBps":
Enable the MBA Software Controller(mba_sc) to specify MBA
bandwidth in MBps
+"debug":
+ Lists the debug files in resctrl interface

L2 and L3 CDP are controlled separately.

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6767c85b9699..35a9ee343fe0 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -59,6 +59,7 @@ struct rdt_fs_context {
bool enable_cdpl2;
bool enable_cdpl3;
bool enable_mba_mbps;
+ bool debug;
};

static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 389d64b42704..a1f13715a65c 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -56,6 +56,8 @@ static char last_cmd_status_buf[512];

struct dentry *debugfs_resctrl;

+static bool resctrl_debug;
+
void rdt_last_cmd_clear(void)
{
lockdep_assert_held(&rdtgroup_mutex);
@@ -2411,6 +2413,9 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
if (!ret && ctx->enable_mba_mbps)
ret = set_mba_sc(true);

+ if (!ret && ctx->debug)
+ resctrl_debug = true;
+
return ret;
}

@@ -2599,6 +2604,7 @@ enum rdt_param {
Opt_cdp,
Opt_cdpl2,
Opt_mba_mbps,
+ Opt_debug,
nr__rdt_params
};

@@ -2606,6 +2612,7 @@ static const struct fs_parameter_spec rdt_fs_parameters[] = {
fsparam_flag("cdp", Opt_cdp),
fsparam_flag("cdpl2", Opt_cdpl2),
fsparam_flag("mba_MBps", Opt_mba_mbps),
+ fsparam_flag("debug", Opt_debug),
{}
};

@@ -2631,6 +2638,9 @@ static int rdt_parse_param(struct fs_context *fc, struct fs_parameter *param)
return -EINVAL;
ctx->enable_mba_mbps = true;
return 0;
+ case Opt_debug:
+ ctx->debug = true;
+ return 0;
}

return -EINVAL;
@@ -3569,6 +3579,9 @@ static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
seq_puts(seq, ",mba_MBps");

+ if (resctrl_debug)
+ seq_puts(seq, ",debug");
+
return 0;
}




2023-03-15 18:33:11

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Babu,

On 3/2/2023 12:24 PM, Babu Moger wrote:
> The resctrl task assignment for MONITOR or CONTROL group needs to be
> done one at a time. For example:
>
> $mount -t resctrl resctrl /sys/fs/resctrl/
> $mkdir /sys/fs/resctrl/clos1
> $echo 123 > /sys/fs/resctrl/clos1/tasks
> $echo 456 > /sys/fs/resctrl/clos1/tasks
> $echo 789 > /sys/fs/resctrl/clos1/tasks
>
> This is not user-friendly when dealing with hundreds of tasks. Also,
> there is a syscall overhead for each command executed from user space.

To support this change it may also be helpful to add that
moving tasks take the mutex so attempting to move tasks in
parallel will not achieve a significant performance gain.

>
> It can be improved by supporting the multiple task assignment in one
> command with the tasks separated by commas. For example:
>
> $echo 123,456,789 > /sys/fs/resctrl/clos1/tasks
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> Documentation/x86/resctrl.rst | 11 +++++++++--
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 +++++++++++++++++++++++-
> 2 files changed, 32 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 058257dc56c8..25203f20002d 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -292,13 +292,20 @@ All groups contain the following files:
> "tasks":
> Reading this file shows the list of all tasks that belong to
> this group. Writing a task id to the file will add a task to the
> - group. If the group is a CTRL_MON group the task is removed from
> + group. Multiple tasks can be assigned together in one command by
> + inputting the tasks separated by commas. Tasks will be assigned

How about "tasks separated" -> "task ids separated" or "by inputting the tasks
separated by commas" -> "by separating the task ids with commas"

> + sequentially in the order it is provided. Failure while assigning
> + the tasks will be aborted immediately. The tasks before the failure
> + will be assigned and the tasks next in the sequence will not be
> + assigned. Users may need to retry them again. The failure details
> + will be logged in resctrl/info/last_cmd_status file.

Please use full path as is done in rest of doc.

> +
> + If the group is a CTRL_MON group the task is removed from
> whichever previous CTRL_MON group owned the task and also from
> any MON group that owned the task. If the group is a MON group,
> then the task must already belong to the CTRL_MON parent of this
> group. The task is removed from any previous MON group.
>
> -

Why is this line removal needed?

> "cpus":
> Reading this file shows a bitmask of the logical CPUs owned by
> this group. Writing a mask to this file will add and remove
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index e2c1599d1b37..15ea5b550fe9 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
> char *buf, size_t nbytes, loff_t off)
> {
> struct rdtgroup *rdtgrp;
> + char *pid_str;
> int ret = 0;
> pid_t pid;
>
> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
> + /* Valid input requires a trailing newline */
> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
> return -EINVAL;

The resctrl files should be seen as user space API. With the above change
you take an interface that did not require a newline and dictate that it
should have a trailing newline. How convinced are you that this does not
break any current user space scripts or applications? Why does this
feature require a trailing newline?

> +
> + buf[nbytes - 1] = '\0';
> +
> rdtgrp = rdtgroup_kn_lock_live(of->kn);
> if (!rdtgrp) {
> rdtgroup_kn_unlock(of->kn);
> return -ENOENT;
> }
> +
> +next:
> + if (!buf || buf[0] == '\0')
> + goto unlock;
> +
> + pid_str = strim(strsep(&buf, ","));
> +

Could lib/cmdline.c:get_option() be useful?

> + if (kstrtoint(pid_str, 0, &pid) || pid < 0) {
> + rdt_last_cmd_printf("Invalid pid %d value\n", pid);

This is risky. What will pid be if kstrtoint() failed?

> + ret = -EINVAL;
> + goto unlock;
> + }
> +
> rdt_last_cmd_clear();
>
> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED ||
> @@ -703,6 +721,10 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
> }
>
> ret = rdtgroup_move_task(pid, rdtgrp, of);
> + if (ret)
> + goto unlock;
> + else
> + goto next;
>

The documentation states "The failure details will be logged
in resctrl/info/last_cmd_status file." but I do not see how
this is happening here. From what I can tell this implementation
does not do anything beyond what last_cmd_status already does
so any special mention in the docs is not clear to me. The
cover letter stated "Added pid in last_cmd_status when
applicable." - it sounded as though last_cmd_status would
contain the error with the pid that encountered the error but
I do not see this happening here.

> unlock:
> rdtgroup_kn_unlock(of->kn);
>
>

Reinette

2023-03-15 18:34:32

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] x86/resctrl: Remove few unnecessary rftype flags

Hi Babu,

On 3/2/2023 12:24 PM, Babu Moger wrote:
> Remove few unnecessary rftype flags and simplify the code. This is done

Please drop "few" (here and in subject).

> to further cleanup the code eventually.

Could you please be specific? "further cleanup the code
eventually" is too vague. I do not understand what is meant
with the second sentence.

>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 9 +++------
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++++---
> 2 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 8edecc5763d8..571145d75d29 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -243,12 +243,9 @@ struct rdtgroup {
> */
> #define RFTYPE_INFO BIT(0)
> #define RFTYPE_BASE BIT(1)
> -#define RF_CTRLSHIFT 4
> -#define RF_MONSHIFT 5
> -#define RF_TOPSHIFT 6
> -#define RFTYPE_CTRL BIT(RF_CTRLSHIFT)
> -#define RFTYPE_MON BIT(RF_MONSHIFT)
> -#define RFTYPE_TOP BIT(RF_TOPSHIFT)
> +#define RFTYPE_CTRL BIT(4)
> +#define RFTYPE_MON BIT(5)
> +#define RFTYPE_TOP BIT(6)
> #define RFTYPE_RES_CACHE BIT(8)
> #define RFTYPE_RES_MB BIT(9)
> #define RF_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 15ea5b550fe9..3c86506e54c1 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -3163,7 +3163,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
> {
> struct rdtgroup *prdtgrp, *rdtgrp;
> struct kernfs_node *kn;
> - uint files = 0;
> + uint fflags = 0;

Hoe does changing the variable name from "files" to "fflags" simplify
the code?

> int ret;
>
> prdtgrp = rdtgroup_kn_lock_live(parent_kn);
> @@ -3215,8 +3215,12 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
> goto out_destroy;
> }
>
> - files = RFTYPE_BASE | BIT(RF_CTRLSHIFT + rtype);
> - ret = rdtgroup_add_files(kn, files);
> + if (rtype == RDTCTRL_GROUP)
> + fflags = RFTYPE_BASE | RFTYPE_CTRL;
> + else
> + fflags = RFTYPE_BASE | RFTYPE_MON;
> +
> + ret = rdtgroup_add_files(kn, fflags);
> if (ret) {
> rdt_last_cmd_puts("kernfs fill error\n");
> goto out_destroy;
>
>

Reinette

2023-03-15 18:40:13

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 4/7] x86/resctrl: Re-arrange RFTYPE flags based on hierarchy

Hi Babu,

On 3/2/2023 12:24 PM, Babu Moger wrote:
> RESCTRL filesystem has two main components:
> a. info (Details on resources and monitoring)
> b. base (Details on CONTROL and MON groups)
>
> The rftype flags can be renamed accordingly for better understanding.
> For example:
> RFTYPE_INFO : Files with these flags go in info directory

This is not a rename but the current name.

> RFTYPE_INFO_MON : Files with these flags go in info/L3_MON

How does this improve the current RFTYPE_MON_INFO?

> RFTYPE_BASE : Files with these flags go in group's(control or mon)
> base directory
This is not a rename but the current name.

> RFTYPE_BASE_CTRL: Files with these flags go in only CONTROL groups

How does this improve current RFTYPE_CTRL_BASE ?

>
> Add comments to make it easy for future additions.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 8 ++--
> arch/x86/kernel/cpu/resctrl/internal.h | 64 ++++++++++++++++++++++++++++----
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 +++++++++++-----------
> 3 files changed, 81 insertions(+), 35 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 030d3b409768..d1c6b2cc8611 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -69,7 +69,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
> .domains = domain_init(RDT_RESOURCE_L3),
> .parse_ctrlval = parse_cbm,
> .format_str = "%d=%0*x",
> - .fflags = RFTYPE_RES_CACHE,
> + .fflags = RFTYPE_CACHE,
> },

How does this rename improve understanding?

...

> @@ -97,7 +97,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
> .domains = domain_init(RDT_RESOURCE_MBA),
> .parse_ctrlval = parse_bw,
> .format_str = "%d=%*u",
> - .fflags = RFTYPE_RES_MB,
> + .fflags = RFTYPE_MB,
> },
> },
> [RDT_RESOURCE_SMBA] =

ditto.


...

> + *
> */
> #define RFTYPE_INFO BIT(0)
> #define RFTYPE_BASE BIT(1)
> -#define RFTYPE_CTRL BIT(4)
> -#define RFTYPE_MON BIT(5)
> -#define RFTYPE_TOP BIT(6)
> -#define RFTYPE_RES_CACHE BIT(8)
> -#define RFTYPE_RES_MB BIT(9)
> -#define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
> -#define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
> -#define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
> -#define RFTYPE_CTRL_BASE (RFTYPE_BASE | RFTYPE_CTRL)
> +
> +#define RFTYPE_TOP BIT(2)
> +#define RFTYPE_MON BIT(3)
> +#define RFTYPE_RES BIT(4)
> +
> +#define RFTYPE_CACHE BIT(5)
> +#define RFTYPE_MB BIT(6)
> +
> +#define RFTYPE_CTRL BIT(8)
> +
> +#define RFTYPE_INFO_TOP (RFTYPE_INFO | RFTYPE_TOP)
> +#define RFTYPE_INFO_MON (RFTYPE_INFO | RFTYPE_MON)
> +#define RFTYPE_INFO_RES (RFTYPE_INFO | RFTYPE_RES)
> +
> +#define RFTYPE_BASE_CTRL (RFTYPE_BASE | RFTYPE_CTRL)
>

It is not clear to me how any of the renames improves understanding.

How does renaming RFTYPE_CTRL_BASE to RFTYPE_BASE_CTRL improve
understanding? Renaming RFTYPE_MON_INFO to RFTYPE_INFO_MON?

This all seems unnecessary.

...

> @@ -3218,7 +3218,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
> if (rtype == RDTCTRL_GROUP)
> fflags = RFTYPE_BASE | RFTYPE_CTRL;
> else
> - fflags = RFTYPE_BASE | RFTYPE_MON;
> + fflags = RFTYPE_BASE;
>

Is this intended?

Reinette

2023-03-15 18:45:11

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] x86/resctrl: Introduce -o debug mount option

Hi Babu,

On 3/2/2023 12:25 PM, Babu Moger wrote:
> Add -o debug option to mount resctrl filesystem in debug mode. Debug
> option adds the files for debug purposes.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> Documentation/x86/resctrl.rst | 2 ++
> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 13 +++++++++++++
> 3 files changed, 16 insertions(+)
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 67eae74fe40c..1ada4e0650dc 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -46,6 +46,8 @@ mount options are:
> "mba_MBps":
> Enable the MBA Software Controller(mba_sc) to specify MBA
> bandwidth in MBps
> +"debug":
> + Lists the debug files in resctrl interface
>

This seems to imply that a listing of available debug flags
will be displayed. How about something like "Make debug files
accessible. Available debug files are annotated with "Available
only with debug option"." (please feel free to improve).

> L2 and L3 CDP are controlled separately.
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 6767c85b9699..35a9ee343fe0 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -59,6 +59,7 @@ struct rdt_fs_context {
> bool enable_cdpl2;
> bool enable_cdpl3;
> bool enable_mba_mbps;
> + bool debug;

Why not follow the prefix of existing flags?

> };
>
> static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)

Reinette

2023-03-15 18:46:11

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

Hi Babu,

Please note "COSID" (CLOSID?) in the subject. Even so, you already
know and have been reminded when you submitted an earlier version
that resctrl needs to support Arm. Adding x86 specific bits to the
user interface complicates this enabling.

What happened to:
https://lore.kernel.org/lkml/[email protected]/

On 3/2/2023 12:24 PM, Babu Moger wrote:
> When a user creates a control or monitor group, the CLOSID or RMID
> are not visible to the user. These are architecturally defined entities.
> There is no harm in displaying these in resctrl groups. Sometimes it
> can help to debug the issues.
>
> Add CLOSID and RMID to the control/monitor groups display in resctrl
> interface.
>
> $cat /sys/fs/resctrl/clos1/closid
> 1
> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
> 3
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> Documentation/x86/resctrl.rst | 17 ++++++++++++
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 ++++++++++++++++++++++++++++++++
> 2 files changed, 61 insertions(+)
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 25203f20002d..67eae74fe40c 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -321,6 +321,15 @@ All groups contain the following files:
> Just like "cpus", only using ranges of CPUs instead of bitmasks.
>
>
> +"rmid":
> + Available only with debug option.Reading this file shows the

Where can the user find documentation about "debug option"?

Also please watch spacing.

> + Resource Monitoring ID (RMID) for monitoring the resource
> + utilization. Monitoring is performed by tagging each core (or
> + thread) or process via a RMID. Kernel assigns a new RMID when
> + a group is created depending on the available RMIDs. Multiple
> + cores (or threads) or processes can share a same RMID in a resctrl
> + domain.

Please make this not to be x86 specific. You can surely document the
x86 details but that should start with something like "on x86 this ..."

What does "a resctrl domain" mean to user space? Did you mean "resource
group"?

> +
> When control is enabled all CTRL_MON groups will also contain:
>
> "schemata":
> @@ -342,6 +351,14 @@ When control is enabled all CTRL_MON groups will also contain:
> file. On successful pseudo-locked region creation the mode will
> automatically change to "pseudo-locked".
>
> +"closid":
> + Available only with debug option. Reading this file shows the
> + Class of Service (CLOS) id which acts as a resource control tag
> + on which the resources can be throttled. Kernel assigns a new
> + CLOSID a control group is created depending on the available
> + CLOSIDs. Multiple cores(or threads) or processes can share a
> + same CLOSID in a resctrl domain.
> +

This also should not be x86 specific. Also, please check the language
and watch spacing.
for example, "Kernel assigns a new CLOSID a control group" -> "Kernel
assigns a new CLOSID to a control group", "can share a same" -> "can
share the same".
What does "a resctrl domain" mean to user space?


> When monitoring is enabled all MON groups will also contain:
>

Shouldn't the "rmid" (to be renamed) entry be in this section of the
documentation?

> "mon_data":

Reinette


2023-03-15 18:47:52

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] x86/resctrl: Add debug files when mounted with debug option

Hi Babu,

On 3/2/2023 12:25 PM, Babu Moger wrote:
> Add the debug files to the resctrl hierarchy.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 48 +++++++++++++++++++++++++++++++-
> 1 file changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index a1f13715a65c..790c6b9f9031 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2400,6 +2400,45 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn,
> struct rdtgroup *prgrp,
> struct kernfs_node **mon_data_kn);
>
> +void resctrl_add_debug_file(struct kernfs_node *parent_kn, const char *config,
> + unsigned long fflags)
> +{
> + struct rftype *rft;
> +
> + rft = rdtgroup_get_rftype_by_name(config);
> + if (rft) {
> + rft->fflags |= fflags;
> + rdtgroup_add_file(parent_kn, rft);
> + }
> +}
> +
> +static void resctrl_add_debug_files(void)
> +{
> + resctrl_add_debug_file(rdtgroup_default.kn, "rmid", RFTYPE_BASE);
> + resctrl_add_debug_file(rdtgroup_default.kn, "closid", RFTYPE_BASE_CTRL);
> + kernfs_activate(rdtgroup_default.kn);
> +}
> +

I think that separating this from all the other resctrl file creation
can be a source of confusion and bugs. Why not add the debug files
at the same time as all the other files belonging to a resource group?

How about introducing new flags, perhaps RFTYPE_MON_DEBUG and
RFTYPE_CTRL_DEBUG. When the debug option is enabled (when resctrl_debug
is true) then the appropriate flag can be OR'd with the other flags
before rdtgroup_add_files() is called.

It sounds to me if there are plans to add more of these files. A function
like resctrl_add_debug_files() requires a lot of changes and care (and thus
potential for errors) every time a new debug file is added.

On another note ... what are the plans with this debug area? At some
point it may be better to expand resctrl debugfs.

Reinette

2023-03-16 16:27:36

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

[AMD Official Use Only - General]

Hi Reinette,

> -----Original Message-----
> From: Reinette Chatre <[email protected]>
> Sent: Wednesday, March 15, 2023 1:33 PM
> To: Moger, Babu <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; Das1, Sandipan
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
> at once
>
> Hi Babu,
>
> On 3/2/2023 12:24 PM, Babu Moger wrote:
> > The resctrl task assignment for MONITOR or CONTROL group needs to be
> > done one at a time. For example:
> >
> > $mount -t resctrl resctrl /sys/fs/resctrl/
> > $mkdir /sys/fs/resctrl/clos1
> > $echo 123 > /sys/fs/resctrl/clos1/tasks
> > $echo 456 > /sys/fs/resctrl/clos1/tasks
> > $echo 789 > /sys/fs/resctrl/clos1/tasks
> >
> > This is not user-friendly when dealing with hundreds of tasks. Also,
> > there is a syscall overhead for each command executed from user space.
>
> To support this change it may also be helpful to add that moving tasks take the
> mutex so attempting to move tasks in parallel will not achieve a significant
> performance gain.

Agree. It may not be significant performance gain. Will remove this line.
>
> >
> > It can be improved by supporting the multiple task assignment in one
> > command with the tasks separated by commas. For example:
> >
> > $echo 123,456,789 > /sys/fs/resctrl/clos1/tasks
> >
> > Signed-off-by: Babu Moger <[email protected]>
> > ---
> > Documentation/x86/resctrl.rst | 11 +++++++++--
> > arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 +++++++++++++++++++++++-
> > 2 files changed, 32 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/x86/resctrl.rst
> > b/Documentation/x86/resctrl.rst index 058257dc56c8..25203f20002d
> > 100644
> > --- a/Documentation/x86/resctrl.rst
> > +++ b/Documentation/x86/resctrl.rst
> > @@ -292,13 +292,20 @@ All groups contain the following files:
> > "tasks":
> > Reading this file shows the list of all tasks that belong to
> > this group. Writing a task id to the file will add a task to the
> > - group. If the group is a CTRL_MON group the task is removed from
> > + group. Multiple tasks can be assigned together in one command by
> > + inputting the tasks separated by commas. Tasks will be assigned
>
> How about "tasks separated" -> "task ids separated" or "by inputting the tasks
> separated by commas" -> "by separating the task ids with commas"


Will change it to " Multiple tasks can be assigned together in one command by separating the task ids with commas."

> > + sequentially in the order it is provided. Failure while assigning
> > + the tasks will be aborted immediately. The tasks before the failure
> > + will be assigned and the tasks next in the sequence will not be
> > + assigned. Users may need to retry them again. The failure details
> > + will be logged in resctrl/info/last_cmd_status file.
>
> Please use full path as is done in rest of doc.

Ok. Sure

>
> > +
> > + If the group is a CTRL_MON group the task is removed from
> > whichever previous CTRL_MON group owned the task and also from
> > any MON group that owned the task. If the group is a MON group,
> > then the task must already belong to the CTRL_MON parent of this
> > group. The task is removed from any previous MON group.
> >
> > -
>
> Why is this line removal needed?

Not needed.
>
> > "cpus":
> > Reading this file shows a bitmask of the logical CPUs owned by
> > this group. Writing a mask to this file will add and remove diff
> > --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > index e2c1599d1b37..15ea5b550fe9 100644
> > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct
> kernfs_open_file *of,
> > char *buf, size_t nbytes, loff_t off) {
> > struct rdtgroup *rdtgrp;
> > + char *pid_str;
> > int ret = 0;
> > pid_t pid;
> >
> > - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
> > + /* Valid input requires a trailing newline */
> > + if (nbytes == 0 || buf[nbytes - 1] != '\n')
> > return -EINVAL;
>
> The resctrl files should be seen as user space API. With the above change you
> take an interface that did not require a newline and dictate that it should have
> a trailing newline. How convinced are you that this does not break any current
> user space scripts or applications? Why does this feature require a trailing
> newline?

I have tested these changes with intel_cmt_cat tool. It didn’t have any problems.
We are already doing newline check for few other inputs.

>
> > +
> > + buf[nbytes - 1] = '\0';
> > +
> > rdtgrp = rdtgroup_kn_lock_live(of->kn);
> > if (!rdtgrp) {
> > rdtgroup_kn_unlock(of->kn);
> > return -ENOENT;
> > }
> > +
> > +next:
> > + if (!buf || buf[0] == '\0')
> > + goto unlock;
> > +
> > + pid_str = strim(strsep(&buf, ","));
> > +
>
> Could lib/cmdline.c:get_option() be useful?

Yes. We could that also. May not be required for the simple case like this.

>
> > + if (kstrtoint(pid_str, 0, &pid) || pid < 0) {
> > + rdt_last_cmd_printf("Invalid pid %d value\n", pid);
>
> This is risky. What will pid be if kstrtoint() failed?

Yea. I need to separate these failure cases. One is parsing error, and another is invalid pid.

>
> > + ret = -EINVAL;
> > + goto unlock;
> > + }
> > +
> > rdt_last_cmd_clear();
> >
> > if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6
> +721,10 @@
> > static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
> > }
> >
> > ret = rdtgroup_move_task(pid, rdtgrp, of);
> > + if (ret)
> > + goto unlock;
> > + else
> > + goto next;
> >
>
> The documentation states "The failure details will be logged in
> resctrl/info/last_cmd_status file." but I do not see how this is happening here.
> From what I can tell this implementation does not do anything beyond what
> last_cmd_status already does so any special mention in the docs is not clear to
> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it
> sounded as though last_cmd_status would contain the error with the pid that
> encountered the error but I do not see this happening here.

You are right we are not doing anything special here. pid failures error was already there.
I will have to change the text here.
Thanks
Babu

>
> > unlock:
> > rdtgroup_kn_unlock(of->kn);
> >
> >
>
> Reinette

2023-03-16 17:13:27

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Babu,

On 3/16/2023 9:27 AM, Moger, Babu wrote:
>> -----Original Message-----
>> From: Reinette Chatre <[email protected]>
>> Sent: Wednesday, March 15, 2023 1:33 PM
>> To: Moger, Babu <[email protected]>; [email protected];
>> [email protected]; [email protected]; [email protected]
>> Cc: [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected];
>> [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected];
>> [email protected]; [email protected]; Das1, Sandipan
>> <[email protected]>; [email protected]; [email protected];
>> [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]
>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
>> at once
>>
>> Hi Babu,
>>
>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>> The resctrl task assignment for MONITOR or CONTROL group needs to be
>>> done one at a time. For example:
>>>
>>> $mount -t resctrl resctrl /sys/fs/resctrl/
>>> $mkdir /sys/fs/resctrl/clos1
>>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>
>>> This is not user-friendly when dealing with hundreds of tasks. Also,
>>> there is a syscall overhead for each command executed from user space.
>>
>> To support this change it may also be helpful to add that moving tasks take the
>> mutex so attempting to move tasks in parallel will not achieve a significant
>> performance gain.
>
> Agree. It may not be significant performance gain. Will remove this line.

It does not sound as though you are actually responding to my comment.

>>> --- a/Documentation/x86/resctrl.rst
>>> +++ b/Documentation/x86/resctrl.rst
>>> @@ -292,13 +292,20 @@ All groups contain the following files:
>>> "tasks":
>>> Reading this file shows the list of all tasks that belong to
>>> this group. Writing a task id to the file will add a task to the
>>> - group. If the group is a CTRL_MON group the task is removed from
>>> + group. Multiple tasks can be assigned together in one command by
>>> + inputting the tasks separated by commas. Tasks will be assigned
>>
>> How about "tasks separated" -> "task ids separated" or "by inputting the tasks
>> separated by commas" -> "by separating the task ids with commas"
>
>
> Will change it to " Multiple tasks can be assigned together in one command by separating the task ids with commas."

I would drop the "together" since it implies that this is
somehow atomic. It will also improve reading by using consistent terminology -
note how the text switches from "add" to "assign". Something like
"Multiple tasks can be added by separating the task ids with commas." I think
using "command" is confusing since what is actually done is writing
text to a file.

...

>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct
>> kernfs_open_file *of,
>>> char *buf, size_t nbytes, loff_t off) {
>>> struct rdtgroup *rdtgrp;
>>> + char *pid_str;
>>> int ret = 0;
>>> pid_t pid;
>>>
>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
>>> + /* Valid input requires a trailing newline */
>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>> return -EINVAL;
>>
>> The resctrl files should be seen as user space API. With the above change you
>> take an interface that did not require a newline and dictate that it should have
>> a trailing newline. How convinced are you that this does not break any current
>> user space scripts or applications? Why does this feature require a trailing
>> newline?
>
> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems.
> We are already doing newline check for few other inputs.

You tested this with the _one_ user space tool that you use. This is not sufficient
to be convincing that this change has no impact. I do not believe that it is a valid
argument that other inputs do a newline check. This input never required a newline
check and it is not clear why this change now requires it. It seems that this is an
unnecessary new requirement that runs the risk of breaking an existing application.

I would like to ask again: How convinced are you that this does not break _any_ current
user space scripts or applications? Why does this feature require a trailing
newline?

>
>>
>>> +
>>> + buf[nbytes - 1] = '\0';
>>> +
>>> rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>> if (!rdtgrp) {
>>> rdtgroup_kn_unlock(of->kn);
>>> return -ENOENT;
>>> }
>>> +
>>> +next:
>>> + if (!buf || buf[0] == '\0')
>>> + goto unlock;
>>> +
>>> + pid_str = strim(strsep(&buf, ","));
>>> +
>>
>> Could lib/cmdline.c:get_option() be useful?
>
> Yes. We could that also. May not be required for the simple case like this.

Please keep an eye out for how much of it you end up duplicating ....

>>> + ret = -EINVAL;
>>> + goto unlock;
>>> + }
>>> +
>>> rdt_last_cmd_clear();
>>>
>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6
>> +721,10 @@
>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
>>> }
>>>
>>> ret = rdtgroup_move_task(pid, rdtgrp, of);
>>> + if (ret)
>>> + goto unlock;
>>> + else
>>> + goto next;
>>>
>>
>> The documentation states "The failure details will be logged in
>> resctrl/info/last_cmd_status file." but I do not see how this is happening here.
>> From what I can tell this implementation does not do anything beyond what
>> last_cmd_status already does so any special mention in the docs is not clear to
>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it
>> sounded as though last_cmd_status would contain the error with the pid that
>> encountered the error but I do not see this happening here.
>
> You are right we are not doing anything special here. pid failures error was already there.
> I will have to change the text here.

What do you mean with "pid failures error was already there"? From what
I understand your goal is to communicate to the user which pid
encountered the error and I do not see that done. How will user know
which pid encountered a failure?

Reinette

2023-03-16 19:51:36

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Reinette,

On 3/16/23 12:12, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/16/2023 9:27 AM, Moger, Babu wrote:
>>> -----Original Message-----
>>> From: Reinette Chatre <[email protected]>
>>> Sent: Wednesday, March 15, 2023 1:33 PM
>>> To: Moger, Babu <[email protected]>; [email protected];
>>> [email protected]; [email protected]; [email protected]
>>> Cc: [email protected]; [email protected]; [email protected];
>>> [email protected]; [email protected]; [email protected];
>>> [email protected]; [email protected];
>>> [email protected]; [email protected];
>>> [email protected]; [email protected]; [email protected];
>>> [email protected]; [email protected];
>>> [email protected]; [email protected]; Das1, Sandipan
>>> <[email protected]>; [email protected]; [email protected];
>>> [email protected]; [email protected];
>>> [email protected]; [email protected]; [email protected];
>>> [email protected]; [email protected]; [email protected];
>>> [email protected]
>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
>>> at once
>>>
>>> Hi Babu,
>>>
>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be
>>>> done one at a time. For example:
>>>>
>>>> $mount -t resctrl resctrl /sys/fs/resctrl/
>>>> $mkdir /sys/fs/resctrl/clos1
>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>>
>>>> This is not user-friendly when dealing with hundreds of tasks. Also,
>>>> there is a syscall overhead for each command executed from user space.
>>>
>>> To support this change it may also be helpful to add that moving tasks take the
>>> mutex so attempting to move tasks in parallel will not achieve a significant
>>> performance gain.
>>
>> Agree. It may not be significant performance gain. Will remove this line.
>
> It does not sound as though you are actually responding to my comment.

I am confused. I am already saying there is syscall overhead for each
command if we move the tasks one by one. Now do you want me to add "moving
tasks take the mutex so attempting to move tasks in parallel will not
achieve a significant performance gain".

It is contradictory, So, I wanted to remove the line about performance.
Did I still miss something?

>
>>>> --- a/Documentation/x86/resctrl.rst
>>>> +++ b/Documentation/x86/resctrl.rst
>>>> @@ -292,13 +292,20 @@ All groups contain the following files:
>>>> "tasks":
>>>> Reading this file shows the list of all tasks that belong to
>>>> this group. Writing a task id to the file will add a task to the
>>>> - group. If the group is a CTRL_MON group the task is removed from
>>>> + group. Multiple tasks can be assigned together in one command by
>>>> + inputting the tasks separated by commas. Tasks will be assigned
>>>
>>> How about "tasks separated" -> "task ids separated" or "by inputting the tasks
>>> separated by commas" -> "by separating the task ids with commas"
>>
>>
>> Will change it to " Multiple tasks can be assigned together in one command by separating the task ids with commas."
>
> I would drop the "together" since it implies that this is
> somehow atomic. It will also improve reading by using consistent terminology -
> note how the text switches from "add" to "assign". Something like
> "Multiple tasks can be added by separating the task ids with commas." I think
> using "command" is confusing since what is actually done is writing
> text to a file.

Ok. Sure. Will change it.

>
> ...
>
>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct
>>> kernfs_open_file *of,
>>>> char *buf, size_t nbytes, loff_t off) {
>>>> struct rdtgroup *rdtgrp;
>>>> + char *pid_str;
>>>> int ret = 0;
>>>> pid_t pid;
>>>>
>>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
>>>> + /* Valid input requires a trailing newline */
>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>>> return -EINVAL;
>>>
>>> The resctrl files should be seen as user space API. With the above change you
>>> take an interface that did not require a newline and dictate that it should have
>>> a trailing newline. How convinced are you that this does not break any current
>>> user space scripts or applications? Why does this feature require a trailing
>>> newline?
>>
>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems.
>> We are already doing newline check for few other inputs.
>
> You tested this with the _one_ user space tool that you use. This is not sufficient
> to be convincing that this change has no impact. I do not believe that it is a valid
> argument that other inputs do a newline check. This input never required a newline
> check and it is not clear why this change now requires it. It seems that this is an
> unnecessary new requirement that runs the risk of breaking an existing application.
>
> I would like to ask again: How convinced are you that this does not break _any_ current
> user space scripts or applications? Why does this feature require a trailing
> newline?

I do not know of any other tool using resctrl fs.
So, you want me to drop the newline requirement for this. I can try that.
Will let you know how it goes.


>
>>
>>>
>>>> +
>>>> + buf[nbytes - 1] = '\0';
>>>> +
>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>>> if (!rdtgrp) {
>>>> rdtgroup_kn_unlock(of->kn);
>>>> return -ENOENT;
>>>> }
>>>> +
>>>> +next:
>>>> + if (!buf || buf[0] == '\0')
>>>> + goto unlock;
>>>> +
>>>> + pid_str = strim(strsep(&buf, ","));
>>>> +
>>>
>>> Could lib/cmdline.c:get_option() be useful?
>>
>> Yes. We could that also. May not be required for the simple case like this.
>
> Please keep an eye out for how much of it you end up duplicating ....

Using the get_options will require at least two calls(one to get the
length and then read the integers). Also need to allocate the integers
array dynamically. That is lot code if we are going that route.

>
>>>> + ret = -EINVAL;
>>>> + goto unlock;
>>>> + }
>>>> +
>>>> rdt_last_cmd_clear();
>>>>
>>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6
>>> +721,10 @@
>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
>>>> }
>>>>
>>>> ret = rdtgroup_move_task(pid, rdtgrp, of);
>>>> + if (ret)
>>>> + goto unlock;
>>>> + else
>>>> + goto next;
>>>>
>>>
>>> The documentation states "The failure details will be logged in
>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here.
>>> From what I can tell this implementation does not do anything beyond what
>>> last_cmd_status already does so any special mention in the docs is not clear to
>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it
>>> sounded as though last_cmd_status would contain the error with the pid that
>>> encountered the error but I do not see this happening here.
>>
>> You are right we are not doing anything special here. pid failures error was already there.
>> I will have to change the text here.
>
> What do you mean with "pid failures error was already there"? From what
> I understand your goal is to communicate to the user which pid
> encountered the error and I do not see that done. How will user know
> which pid encountered a failure?

We only have couple of failures to take here. Those failures are already
handled by rdtgroup_move_task. It logs the pid for failure(using
rdt_last_cmd_printf).

I can say "The failure pid will be logged in
/sys/fs/resctrl/info/last_cmd_status file."
--
Thanks
Babu Moger

2023-03-16 20:31:38

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] x86/resctrl: Remove few unnecessary rftype flags

Hi Reinette,

On 3/15/23 13:33, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/2/2023 12:24 PM, Babu Moger wrote:
>> Remove few unnecessary rftype flags and simplify the code. This is done
>
> Please drop "few" (here and in subject).

Sure.

>
>> to further cleanup the code eventually.
>
> Could you please be specific? "further cleanup the code
> eventually" is too vague. I do not understand what is meant
> with the second sentence.

I can just say "Remove unnecessary rftype flags and improve code readability."

>
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/internal.h | 9 +++------
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++++---
>> 2 files changed, 10 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 8edecc5763d8..571145d75d29 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -243,12 +243,9 @@ struct rdtgroup {
>> */
>> #define RFTYPE_INFO BIT(0)
>> #define RFTYPE_BASE BIT(1)
>> -#define RF_CTRLSHIFT 4
>> -#define RF_MONSHIFT 5
>> -#define RF_TOPSHIFT 6
>> -#define RFTYPE_CTRL BIT(RF_CTRLSHIFT)
>> -#define RFTYPE_MON BIT(RF_MONSHIFT)
>> -#define RFTYPE_TOP BIT(RF_TOPSHIFT)
>> +#define RFTYPE_CTRL BIT(4)
>> +#define RFTYPE_MON BIT(5)
>> +#define RFTYPE_TOP BIT(6)
>> #define RFTYPE_RES_CACHE BIT(8)
>> #define RFTYPE_RES_MB BIT(9)
>> #define RF_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 15ea5b550fe9..3c86506e54c1 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -3163,7 +3163,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>> {
>> struct rdtgroup *prdtgrp, *rdtgrp;
>> struct kernfs_node *kn;
>> - uint files = 0;
>> + uint fflags = 0;
>
> Hoe does changing the variable name from "files" to "fflags" simplify
> the code?

I should have said readability of the code. Its actually fflags we are
passing to rdtgroup_add_files. While changing flags below, I changed the
variable name to fflags.


>
>> int ret;
>>
>> prdtgrp = rdtgroup_kn_lock_live(parent_kn);
>> @@ -3215,8 +3215,12 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>> goto out_destroy;
>> }
>>
>> - files = RFTYPE_BASE | BIT(RF_CTRLSHIFT + rtype);
>> - ret = rdtgroup_add_files(kn, files);
>> + if (rtype == RDTCTRL_GROUP)
>> + fflags = RFTYPE_BASE | RFTYPE_CTRL;
>> + else
>> + fflags = RFTYPE_BASE | RFTYPE_MON;
>> +
>> + ret = rdtgroup_add_files(kn, fflags);
>> if (ret) {
>> rdt_last_cmd_puts("kernfs fill error\n");
>> goto out_destroy;
>>
>>
>
> Reinette

--
Thanks
Babu Moger

2023-03-16 20:33:45

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Babu,

On 3/16/2023 12:51 PM, Moger, Babu wrote:
> On 3/16/23 12:12, Reinette Chatre wrote:
>> On 3/16/2023 9:27 AM, Moger, Babu wrote:
>>>> -----Original Message-----
>>>> From: Reinette Chatre <[email protected]>
>>>> Sent: Wednesday, March 15, 2023 1:33 PM
>>>> To: Moger, Babu <[email protected]>; [email protected];
>>>> [email protected]; [email protected]; [email protected]
>>>> Cc: [email protected]; [email protected]; [email protected];
>>>> [email protected]; [email protected]; [email protected];
>>>> [email protected]; [email protected];
>>>> [email protected]; [email protected];
>>>> [email protected]; [email protected]; [email protected];
>>>> [email protected]; [email protected];
>>>> [email protected]; [email protected]; Das1, Sandipan
>>>> <[email protected]>; [email protected]; [email protected];
>>>> [email protected]; [email protected];
>>>> [email protected]; [email protected]; [email protected];
>>>> [email protected]; [email protected]; [email protected];
>>>> [email protected]
>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
>>>> at once
>>>>
>>>> Hi Babu,
>>>>
>>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be
>>>>> done one at a time. For example:
>>>>>
>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/
>>>>> $mkdir /sys/fs/resctrl/clos1
>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>>>
>>>>> This is not user-friendly when dealing with hundreds of tasks. Also,
>>>>> there is a syscall overhead for each command executed from user space.
>>>>
>>>> To support this change it may also be helpful to add that moving tasks take the
>>>> mutex so attempting to move tasks in parallel will not achieve a significant
>>>> performance gain.
>>>
>>> Agree. It may not be significant performance gain. Will remove this line.
>>
>> It does not sound as though you are actually responding to my comment.
>
> I am confused. I am already saying there is syscall overhead for each
> command if we move the tasks one by one. Now do you want me to add "moving
> tasks take the mutex so attempting to move tasks in parallel will not
> achieve a significant performance gain".
>
> It is contradictory, So, I wanted to remove the line about performance.
> Did I still miss something?

Where is the contradiction?

Consider your example:
$echo 123 > /sys/fs/resctrl/clos1/tasks
$echo 456 > /sys/fs/resctrl/clos1/tasks
$echo 789 > /sys/fs/resctrl/clos1/tasks

Yes, there is syscall overhead for each of the above lines. My statement was in
support of this work by stating that a user aiming to improve performance by
attempting the above in parallel would not be able to see achieve significant
performance gain since the calls would end up being serialized.

You are providing two motivations (a) "user-friendly when dealing with
hundreds of tasks", and (b) syscall overhead. Have you measured the
improvement this solution provides?

>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct
>>>> kernfs_open_file *of,
>>>>> char *buf, size_t nbytes, loff_t off) {
>>>>> struct rdtgroup *rdtgrp;
>>>>> + char *pid_str;
>>>>> int ret = 0;
>>>>> pid_t pid;
>>>>>
>>>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
>>>>> + /* Valid input requires a trailing newline */
>>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>>>> return -EINVAL;
>>>>
>>>> The resctrl files should be seen as user space API. With the above change you
>>>> take an interface that did not require a newline and dictate that it should have
>>>> a trailing newline. How convinced are you that this does not break any current
>>>> user space scripts or applications? Why does this feature require a trailing
>>>> newline?
>>>
>>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems.
>>> We are already doing newline check for few other inputs.
>>
>> You tested this with the _one_ user space tool that you use. This is not sufficient
>> to be convincing that this change has no impact. I do not believe that it is a valid
>> argument that other inputs do a newline check. This input never required a newline
>> check and it is not clear why this change now requires it. It seems that this is an
>> unnecessary new requirement that runs the risk of breaking an existing application.
>>
>> I would like to ask again: How convinced are you that this does not break _any_ current
>> user space scripts or applications? Why does this feature require a trailing
>> newline?
>
> I do not know of any other tool using resctrl fs.
> So, you want me to drop the newline requirement for this. I can try that.
> Will let you know how it goes.

You continue to avoid my question about why this requires a newline. Until
I learn why this is required, yes, from what I can tell based on this patch
this requirement can and should be dropped.

>>>>> +
>>>>> + buf[nbytes - 1] = '\0';
>>>>> +
>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>>>> if (!rdtgrp) {
>>>>> rdtgroup_kn_unlock(of->kn);
>>>>> return -ENOENT;
>>>>> }
>>>>> +
>>>>> +next:
>>>>> + if (!buf || buf[0] == '\0')
>>>>> + goto unlock;
>>>>> +
>>>>> + pid_str = strim(strsep(&buf, ","));
>>>>> +
>>>>
>>>> Could lib/cmdline.c:get_option() be useful?
>>>
>>> Yes. We could that also. May not be required for the simple case like this.
>>
>> Please keep an eye out for how much of it you end up duplicating ....
>
> Using the get_options will require at least two calls(one to get the
> length and then read the integers). Also need to allocate the integers
> array dynamically. That is lot code if we are going that route.
>

I did not ask about get_options(), I asked about get_option().

>>
>>>>> + ret = -EINVAL;
>>>>> + goto unlock;
>>>>> + }
>>>>> +
>>>>> rdt_last_cmd_clear();
>>>>>
>>>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6
>>>> +721,10 @@
>>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
>>>>> }
>>>>>
>>>>> ret = rdtgroup_move_task(pid, rdtgrp, of);
>>>>> + if (ret)
>>>>> + goto unlock;
>>>>> + else
>>>>> + goto next;
>>>>>
>>>>
>>>> The documentation states "The failure details will be logged in
>>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here.
>>>> From what I can tell this implementation does not do anything beyond what
>>>> last_cmd_status already does so any special mention in the docs is not clear to
>>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it
>>>> sounded as though last_cmd_status would contain the error with the pid that
>>>> encountered the error but I do not see this happening here.
>>>
>>> You are right we are not doing anything special here. pid failures error was already there.
>>> I will have to change the text here.
>>
>> What do you mean with "pid failures error was already there"? From what
>> I understand your goal is to communicate to the user which pid
>> encountered the error and I do not see that done. How will user know
>> which pid encountered a failure?
>
> We only have couple of failures to take here. Those failures are already
> handled by rdtgroup_move_task. It logs the pid for failure(using
> rdt_last_cmd_printf).

The pid is only logged if there is no task with that pid. How about the
error in __rdtgroup_move_task() - how will the user know which pid triggered
that error?

>
> I can say "The failure pid will be logged in
> /sys/fs/resctrl/info/last_cmd_status file."

That will not be accurate. Not all errors include the pid.

Reinette

2023-03-16 20:42:05

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] x86/resctrl: Remove few unnecessary rftype flags

Hi Babu,

On 3/16/2023 1:31 PM, Moger, Babu wrote:
> On 3/15/23 13:33, Reinette Chatre wrote:
>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index 15ea5b550fe9..3c86506e54c1 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -3163,7 +3163,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>>> {
>>> struct rdtgroup *prdtgrp, *rdtgrp;
>>> struct kernfs_node *kn;
>>> - uint files = 0;
>>> + uint fflags = 0;
>>
>> Hoe does changing the variable name from "files" to "fflags" simplify
>> the code?
>
> I should have said readability of the code. Its actually fflags we are
> passing to rdtgroup_add_files. While changing flags below, I changed the
> variable name to fflags.

You are correct in that it is the actual fflags parameter but what it
reflects is which files will be created. I do not find that using "files"
makes the code unreadable.

Reinette

2023-03-16 21:11:27

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] x86/resctrl: Remove few unnecessary rftype flags

Hi Reinette,

On 3/16/23 15:41, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/16/2023 1:31 PM, Moger, Babu wrote:
>> On 3/15/23 13:33, Reinette Chatre wrote:
>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> index 15ea5b550fe9..3c86506e54c1 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>> @@ -3163,7 +3163,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>>>> {
>>>> struct rdtgroup *prdtgrp, *rdtgrp;
>>>> struct kernfs_node *kn;
>>>> - uint files = 0;
>>>> + uint fflags = 0;
>>>
>>> Hoe does changing the variable name from "files" to "fflags" simplify
>>> the code?
>>
>> I should have said readability of the code. Its actually fflags we are
>> passing to rdtgroup_add_files. While changing flags below, I changed the
>> variable name to fflags.
>
> You are correct in that it is the actual fflags parameter but what it
> reflects is which files will be created. I do not find that using "files"
> makes the code unreadable.

Everything helps. I changed it because I was already changing few things
in this function. That is not the only change in this function. Here is
the main change in this function.

- files = RFTYPE_BASE | BIT(RF_CTRLSHIFT + rtype);
- ret = rdtgroup_add_files(kn, files);
+ if (rtype == RDTCTRL_GROUP)
+ fflags = RFTYPE_BASE | RFTYPE_CTRL;
+ else
+ fflags = RFTYPE_BASE | RFTYPE_MON;
+
+ ret = rdtgroup_add_files(kn, fflags);

In my opinion this is more clearer. Also I can delete some of these
un-necessary definitions below.

#define RFTYPE_INFO BIT(0)
#define RFTYPE_BASE BIT(1)
-#define RF_CTRLSHIFT 4
-#define RF_MONSHIFT 5
-#define RF_TOPSHIFT 6
-#define RFTYPE_CTRL BIT(RF_CTRLSHIFT)
-#define RFTYPE_MON BIT(RF_MONSHIFT)
-#define RFTYPE_TOP BIT(RF_TOPSHIFT)
+#define RFTYPE_CTRL BIT(4)
+#define RFTYPE_MON BIT(5)
+#define RFTYPE_TOP BIT(6)

Thanks
Babu

2023-03-16 21:18:52

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] x86/resctrl: Remove few unnecessary rftype flags

Hi Babu,

On 3/16/2023 2:11 PM, Moger, Babu wrote:
> Hi Reinette,
>
> On 3/16/23 15:41, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 3/16/2023 1:31 PM, Moger, Babu wrote:
>>> On 3/15/23 13:33, Reinette Chatre wrote:
>>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> index 15ea5b550fe9..3c86506e54c1 100644
>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> @@ -3163,7 +3163,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>>>>> {
>>>>> struct rdtgroup *prdtgrp, *rdtgrp;
>>>>> struct kernfs_node *kn;
>>>>> - uint files = 0;
>>>>> + uint fflags = 0;
>>>>
>>>> Hoe does changing the variable name from "files" to "fflags" simplify
>>>> the code?
>>>
>>> I should have said readability of the code. Its actually fflags we are
>>> passing to rdtgroup_add_files. While changing flags below, I changed the
>>> variable name to fflags.
>>
>> You are correct in that it is the actual fflags parameter but what it
>> reflects is which files will be created. I do not find that using "files"
>> makes the code unreadable.
>
> Everything helps. I changed it because I was already changing few things
> in this function. That is not the only change in this function. Here is
> the main change in this function.

I did read the patch Babu. I trimmed my response to focus on what
I was responding to. Our opinions differ on readability of the current
variable name. This patch can focus on just removing the unnecessary rftype
flags.

Reinette

2023-03-20 13:35:48

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] x86/resctrl: Remove few unnecessary rftype flags



On 3/16/23 16:17, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/16/2023 2:11 PM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 3/16/23 15:41, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 3/16/2023 1:31 PM, Moger, Babu wrote:
>>>> On 3/15/23 13:33, Reinette Chatre wrote:
>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> index 15ea5b550fe9..3c86506e54c1 100644
>>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> @@ -3163,7 +3163,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>>>>>> {
>>>>>> struct rdtgroup *prdtgrp, *rdtgrp;
>>>>>> struct kernfs_node *kn;
>>>>>> - uint files = 0;
>>>>>> + uint fflags = 0;
>>>>>
>>>>> Hoe does changing the variable name from "files" to "fflags" simplify
>>>>> the code?
>>>>
>>>> I should have said readability of the code. Its actually fflags we are
>>>> passing to rdtgroup_add_files. While changing flags below, I changed the
>>>> variable name to fflags.
>>>
>>> You are correct in that it is the actual fflags parameter but what it
>>> reflects is which files will be created. I do not find that using "files"
>>> makes the code unreadable.
>>
>> Everything helps. I changed it because I was already changing few things
>> in this function. That is not the only change in this function. Here is
>> the main change in this function.
>
> I did read the patch Babu. I trimmed my response to focus on what
> I was responding to. Our opinions differ on readability of the current
> variable name. This patch can focus on just removing the unnecessary rftype
> flags.

Ok. Fine with me.
--
Thanks
Babu Moger

2023-03-20 15:13:11

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Reinette,

On 3/16/23 15:33, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/16/2023 12:51 PM, Moger, Babu wrote:
>> On 3/16/23 12:12, Reinette Chatre wrote:
>>> On 3/16/2023 9:27 AM, Moger, Babu wrote:
>>>>> -----Original Message-----
>>>>> From: Reinette Chatre <[email protected]>
>>>>> Sent: Wednesday, March 15, 2023 1:33 PM
>>>>> To: Moger, Babu <[email protected]>; [email protected];
>>>>> [email protected]; [email protected]; [email protected]
>>>>> Cc: [email protected]; [email protected]; [email protected];
>>>>> [email protected]; [email protected]; [email protected];
>>>>> [email protected]; [email protected];
>>>>> [email protected]; [email protected];
>>>>> [email protected]; [email protected]; [email protected];
>>>>> [email protected]; [email protected];
>>>>> [email protected]; [email protected]; Das1, Sandipan
>>>>> <[email protected]>; [email protected]; [email protected];
>>>>> [email protected]; [email protected];
>>>>> [email protected]; [email protected]; [email protected];
>>>>> [email protected]; [email protected]; [email protected];
>>>>> [email protected]
>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
>>>>> at once
>>>>>
>>>>> Hi Babu,
>>>>>
>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be
>>>>>> done one at a time. For example:
>>>>>>
>>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/
>>>>>> $mkdir /sys/fs/resctrl/clos1
>>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>>>>
>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also,
>>>>>> there is a syscall overhead for each command executed from user space.
>>>>>
>>>>> To support this change it may also be helpful to add that moving tasks take the
>>>>> mutex so attempting to move tasks in parallel will not achieve a significant
>>>>> performance gain.
>>>>
>>>> Agree. It may not be significant performance gain. Will remove this line.
>>>
>>> It does not sound as though you are actually responding to my comment.
>>
>> I am confused. I am already saying there is syscall overhead for each
>> command if we move the tasks one by one. Now do you want me to add "moving
>> tasks take the mutex so attempting to move tasks in parallel will not
>> achieve a significant performance gain".
>>
>> It is contradictory, So, I wanted to remove the line about performance.
>> Did I still miss something?
>
> Where is the contradiction?
>
> Consider your example:
> $echo 123 > /sys/fs/resctrl/clos1/tasks
> $echo 456 > /sys/fs/resctrl/clos1/tasks
> $echo 789 > /sys/fs/resctrl/clos1/tasks
>
> Yes, there is syscall overhead for each of the above lines. My statement was in
> support of this work by stating that a user aiming to improve performance by
> attempting the above in parallel would not be able to see achieve significant
> performance gain since the calls would end up being serialized.

ok. Sure. Will add the text. I may modify little bit.
>
> You are providing two motivations (a) "user-friendly when dealing with
> hundreds of tasks", and (b) syscall overhead. Have you measured the
> improvement this solution provides?

No. I have not measured the performance improvement.

>
>>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct
>>>>> kernfs_open_file *of,
>>>>>> char *buf, size_t nbytes, loff_t off) {
>>>>>> struct rdtgroup *rdtgrp;
>>>>>> + char *pid_str;
>>>>>> int ret = 0;
>>>>>> pid_t pid;
>>>>>>
>>>>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
>>>>>> + /* Valid input requires a trailing newline */
>>>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>>>>> return -EINVAL;
>>>>>
>>>>> The resctrl files should be seen as user space API. With the above change you
>>>>> take an interface that did not require a newline and dictate that it should have
>>>>> a trailing newline. How convinced are you that this does not break any current
>>>>> user space scripts or applications? Why does this feature require a trailing
>>>>> newline?
>>>>
>>>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems.
>>>> We are already doing newline check for few other inputs.
>>>
>>> You tested this with the _one_ user space tool that you use. This is not sufficient
>>> to be convincing that this change has no impact. I do not believe that it is a valid
>>> argument that other inputs do a newline check. This input never required a newline
>>> check and it is not clear why this change now requires it. It seems that this is an
>>> unnecessary new requirement that runs the risk of breaking an existing application.
>>>
>>> I would like to ask again: How convinced are you that this does not break _any_ current
>>> user space scripts or applications? Why does this feature require a trailing
>>> newline?

I dont think this feature required trailing newline. I may have carried
away from similar code in the area. I will remove that requirement.

>>
>> I do not know of any other tool using resctrl fs.
>> So, you want me to drop the newline requirement for this. I can try that.
>> Will let you know how it goes.
>
> You continue to avoid my question about why this requires a newline. Until
> I learn why this is required, yes, from what I can tell based on this patch
> this requirement can and should be dropped.
>
>>>>>> +
>>>>>> + buf[nbytes - 1] = '\0';
>>>>>> +
>>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>>>>> if (!rdtgrp) {
>>>>>> rdtgroup_kn_unlock(of->kn);
>>>>>> return -ENOENT;
>>>>>> }
>>>>>> +
>>>>>> +next:
>>>>>> + if (!buf || buf[0] == '\0')
>>>>>> + goto unlock;
>>>>>> +
>>>>>> + pid_str = strim(strsep(&buf, ","));
>>>>>> +
>>>>>
>>>>> Could lib/cmdline.c:get_option() be useful?
>>>>
>>>> Yes. We could that also. May not be required for the simple case like this.
>>>
>>> Please keep an eye out for how much of it you end up duplicating ....
>>
>> Using the get_options will require at least two calls(one to get the
>> length and then read the integers). Also need to allocate the integers
>> array dynamically. That is lot code if we are going that route.
>>
>
> I did not ask about get_options(), I asked about get_option().

If you insist, will use get_option. But we still have to loop thru all the
string till get_option returns 0. I can try that.

>
>>>
>>>>>> + ret = -EINVAL;
>>>>>> + goto unlock;
>>>>>> + }
>>>>>> +
>>>>>> rdt_last_cmd_clear();
>>>>>>
>>>>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6
>>>>> +721,10 @@
>>>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
>>>>>> }
>>>>>>
>>>>>> ret = rdtgroup_move_task(pid, rdtgrp, of);
>>>>>> + if (ret)
>>>>>> + goto unlock;
>>>>>> + else
>>>>>> + goto next;
>>>>>>
>>>>>
>>>>> The documentation states "The failure details will be logged in
>>>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here.
>>>>> From what I can tell this implementation does not do anything beyond what
>>>>> last_cmd_status already does so any special mention in the docs is not clear to
>>>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it
>>>>> sounded as though last_cmd_status would contain the error with the pid that
>>>>> encountered the error but I do not see this happening here.
>>>>
>>>> You are right we are not doing anything special here. pid failures error was already there.
>>>> I will have to change the text here.
>>>
>>> What do you mean with "pid failures error was already there"? From what
>>> I understand your goal is to communicate to the user which pid
>>> encountered the error and I do not see that done. How will user know
>>> which pid encountered a failure?
>>
>> We only have couple of failures to take here. Those failures are already
>> handled by rdtgroup_move_task. It logs the pid for failure(using
>> rdt_last_cmd_printf).
>
> The pid is only logged if there is no task with that pid. How about the
> error in __rdtgroup_move_task() - how will the user know which pid triggered
> that error?

Yes. These cases we may be able to report the pid.

>
>>
>> I can say "The failure pid will be logged in
>> /sys/fs/resctrl/info/last_cmd_status file."
>
> That will not be accurate. Not all errors include the pid.

Can you please suggest?
Thanks
Babu Moger

2023-03-20 15:23:50

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once



On 3/20/23 10:07, Moger, Babu wrote:
> Hi Reinette,
>
> On 3/16/23 15:33, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 3/16/2023 12:51 PM, Moger, Babu wrote:
>>> On 3/16/23 12:12, Reinette Chatre wrote:
>>>> On 3/16/2023 9:27 AM, Moger, Babu wrote:
>>>>>> -----Original Message-----
>>>>>> From: Reinette Chatre <[email protected]>
>>>>>> Sent: Wednesday, March 15, 2023 1:33 PM
>>>>>> To: Moger, Babu <[email protected]>; [email protected];
>>>>>> [email protected]; [email protected]; [email protected]
>>>>>> Cc: [email protected]; [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]; [email protected];
>>>>>> [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; Das1, Sandipan
>>>>>> <[email protected]>; [email protected]; [email protected];
>>>>>> [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]
>>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
>>>>>> at once
>>>>>>
>>>>>> Hi Babu,
>>>>>>
>>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be
>>>>>>> done one at a time. For example:
>>>>>>>
>>>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/
>>>>>>> $mkdir /sys/fs/resctrl/clos1
>>>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>>>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>>>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>>>>>
>>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also,
>>>>>>> there is a syscall overhead for each command executed from user space.
>>>>>>
>>>>>> To support this change it may also be helpful to add that moving tasks take the
>>>>>> mutex so attempting to move tasks in parallel will not achieve a significant
>>>>>> performance gain.
>>>>>
>>>>> Agree. It may not be significant performance gain. Will remove this line.
>>>>
>>>> It does not sound as though you are actually responding to my comment.
>>>
>>> I am confused. I am already saying there is syscall overhead for each
>>> command if we move the tasks one by one. Now do you want me to add "moving
>>> tasks take the mutex so attempting to move tasks in parallel will not
>>> achieve a significant performance gain".
>>>
>>> It is contradictory, So, I wanted to remove the line about performance.
>>> Did I still miss something?
>>
>> Where is the contradiction?
>>
>> Consider your example:
>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>
>> Yes, there is syscall overhead for each of the above lines. My statement was in
>> support of this work by stating that a user aiming to improve performance by
>> attempting the above in parallel would not be able to see achieve significant
>> performance gain since the calls would end up being serialized.
>
> ok. Sure. Will add the text. I may modify little bit.
>>
>> You are providing two motivations (a) "user-friendly when dealing with
>> hundreds of tasks", and (b) syscall overhead. Have you measured the
>> improvement this solution provides?
>
> No. I have not measured the performance improvement.
>
>>
>>>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct
>>>>>> kernfs_open_file *of,
>>>>>>> char *buf, size_t nbytes, loff_t off) {
>>>>>>> struct rdtgroup *rdtgrp;
>>>>>>> + char *pid_str;
>>>>>>> int ret = 0;
>>>>>>> pid_t pid;
>>>>>>>
>>>>>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
>>>>>>> + /* Valid input requires a trailing newline */
>>>>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n')
>>>>>>> return -EINVAL;
>>>>>>
>>>>>> The resctrl files should be seen as user space API. With the above change you
>>>>>> take an interface that did not require a newline and dictate that it should have
>>>>>> a trailing newline. How convinced are you that this does not break any current
>>>>>> user space scripts or applications? Why does this feature require a trailing
>>>>>> newline?
>>>>>
>>>>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems.
>>>>> We are already doing newline check for few other inputs.
>>>>
>>>> You tested this with the _one_ user space tool that you use. This is not sufficient
>>>> to be convincing that this change has no impact. I do not believe that it is a valid
>>>> argument that other inputs do a newline check. This input never required a newline
>>>> check and it is not clear why this change now requires it. It seems that this is an
>>>> unnecessary new requirement that runs the risk of breaking an existing application.
>>>>
>>>> I would like to ask again: How convinced are you that this does not break _any_ current
>>>> user space scripts or applications? Why does this feature require a trailing
>>>> newline?
>
> I dont think this feature required trailing newline. I may have carried
> away from similar code in the area. I will remove that requirement.
>
>>>
>>> I do not know of any other tool using resctrl fs.
>>> So, you want me to drop the newline requirement for this. I can try that.
>>> Will let you know how it goes.
>>
>> You continue to avoid my question about why this requires a newline. Until
>> I learn why this is required, yes, from what I can tell based on this patch
>> this requirement can and should be dropped.
>>
>>>>>>> +
>>>>>>> + buf[nbytes - 1] = '\0';
>>>>>>> +
>>>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>>>>>> if (!rdtgrp) {
>>>>>>> rdtgroup_kn_unlock(of->kn);
>>>>>>> return -ENOENT;
>>>>>>> }
>>>>>>> +
>>>>>>> +next:
>>>>>>> + if (!buf || buf[0] == '\0')
>>>>>>> + goto unlock;
>>>>>>> +
>>>>>>> + pid_str = strim(strsep(&buf, ","));
>>>>>>> +
>>>>>>
>>>>>> Could lib/cmdline.c:get_option() be useful?
>>>>>
>>>>> Yes. We could that also. May not be required for the simple case like this.
>>>>
>>>> Please keep an eye out for how much of it you end up duplicating ....
>>>
>>> Using the get_options will require at least two calls(one to get the
>>> length and then read the integers). Also need to allocate the integers
>>> array dynamically. That is lot code if we are going that route.
>>>
>>
>> I did not ask about get_options(), I asked about get_option().
>
> If you insist, will use get_option. But we still have to loop thru all the
> string till get_option returns 0. I can try that.
>
>>
>>>>
>>>>>>> + ret = -EINVAL;
>>>>>>> + goto unlock;
>>>>>>> + }
>>>>>>> +
>>>>>>> rdt_last_cmd_clear();
>>>>>>>
>>>>>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6
>>>>>> +721,10 @@
>>>>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of,
>>>>>>> }
>>>>>>>
>>>>>>> ret = rdtgroup_move_task(pid, rdtgrp, of);
>>>>>>> + if (ret)
>>>>>>> + goto unlock;
>>>>>>> + else
>>>>>>> + goto next;
>>>>>>>
>>>>>>
>>>>>> The documentation states "The failure details will be logged in
>>>>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here.
>>>>>> From what I can tell this implementation does not do anything beyond what
>>>>>> last_cmd_status already does so any special mention in the docs is not clear to
>>>>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it
>>>>>> sounded as though last_cmd_status would contain the error with the pid that
>>>>>> encountered the error but I do not see this happening here.
>>>>>
>>>>> You are right we are not doing anything special here. pid failures error was already there.
>>>>> I will have to change the text here.
>>>>
>>>> What do you mean with "pid failures error was already there"? From what
>>>> I understand your goal is to communicate to the user which pid
>>>> encountered the error and I do not see that done. How will user know
>>>> which pid encountered a failure?
>>>
>>> We only have couple of failures to take here. Those failures are already
>>> handled by rdtgroup_move_task. It logs the pid for failure(using
>>> rdt_last_cmd_printf).
>>
>> The pid is only logged if there is no task with that pid. How about the
>> error in __rdtgroup_move_task() - how will the user know which pid triggered
>> that error?
>
> Yes. These cases we may be able to report the pid.

I meant "we may not"
>
>>
>>>
>>> I can say "The failure pid will be logged in
>>> /sys/fs/resctrl/info/last_cmd_status file."
>>
>> That will not be accurate. Not all errors include the pid.
>
> Can you please suggest?
> Thanks
> Babu Moger

--
Thanks
Babu Moger

2023-03-20 17:01:18

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

Hi Reinette,

On 3/15/23 13:42, Reinette Chatre wrote:
> Hi Babu,
>
> Please note "COSID" (CLOSID?) in the subject. Even so, you already
> know and have been reminded when you submitted an earlier version
> that resctrl needs to support Arm. Adding x86 specific bits to the
> user interface complicates this enabling.
>
> What happened to:
> https://lore.kernel.org/lkml/[email protected]/

Yes. It kind of loses meaning if is renamed differently. Kept it same.
I will change it to mon_hw_id and ctrl_hw_id respectively. Will change
subject line accordingly.

>
> On 3/2/2023 12:24 PM, Babu Moger wrote:
>> When a user creates a control or monitor group, the CLOSID or RMID
>> are not visible to the user. These are architecturally defined entities.
>> There is no harm in displaying these in resctrl groups. Sometimes it
>> can help to debug the issues.
>>
>> Add CLOSID and RMID to the control/monitor groups display in resctrl
>> interface.
>>
>> $cat /sys/fs/resctrl/clos1/closid
>> 1
>> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
>> 3
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> Documentation/x86/resctrl.rst | 17 ++++++++++++
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 ++++++++++++++++++++++++++++++++
>> 2 files changed, 61 insertions(+)
>>
>> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
>> index 25203f20002d..67eae74fe40c 100644
>> --- a/Documentation/x86/resctrl.rst
>> +++ b/Documentation/x86/resctrl.rst
>> @@ -321,6 +321,15 @@ All groups contain the following files:
>> Just like "cpus", only using ranges of CPUs instead of bitmasks.
>>
>>
>> +"rmid":
>> + Available only with debug option.Reading this file shows the
>
> Where can the user find documentation about "debug option"?

I can change the patch order. I can bring patch 7 before this.

>
> Also please watch spacing.

Ok sure.

>
>> + Resource Monitoring ID (RMID) for monitoring the resource
>> + utilization. Monitoring is performed by tagging each core (or
>> + thread) or process via a RMID. Kernel assigns a new RMID when
>> + a group is created depending on the available RMIDs. Multiple
>> + cores (or threads) or processes can share a same RMID in a resctrl
>> + domain.
>
> Please make this not to be x86 specific. You can surely document the
> x86 details but that should start with something like "on x86 this ..."

ok. Sure
>
> What does "a resctrl domain" mean to user space? Did you mean "resource
> group"?

Yes. it should have been "resource group".
>
>> +
>> When control is enabled all CTRL_MON groups will also contain:
>>
>> "schemata":
>> @@ -342,6 +351,14 @@ When control is enabled all CTRL_MON groups will also contain:
>> file. On successful pseudo-locked region creation the mode will
>> automatically change to "pseudo-locked".
>>
>> +"closid":
>> + Available only with debug option. Reading this file shows the
>> + Class of Service (CLOS) id which acts as a resource control tag
>> + on which the resources can be throttled. Kernel assigns a new
>> + CLOSID a control group is created depending on the available
>> + CLOSIDs. Multiple cores(or threads) or processes can share a
>> + same CLOSID in a resctrl domain.
>> +
>
> This also should not be x86 specific. Also, please check the language
> and watch spacing.

ok
> for example, "Kernel assigns a new CLOSID a control group" -> "Kernel
> assigns a new CLOSID to a control group", "can share a same" -> "can
> share the same".

Sure
> What does "a resctrl domain" mean to user space?

It should have been "resource group".

>
>
>> When monitoring is enabled all MON groups will also contain:
>>
>
> Shouldn't the "rmid" (to be renamed) entry be in this section of the
> documentation?

Not sure about this comment. Did you mean to move the rmid (to be renamed
mon_hw_id) documentation here?

Thanks
Babu

>
>> "mon_data":
>
> Reinette
>

--
Thanks
Babu Moger

2023-03-20 17:03:51

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Babu,

On 3/20/2023 8:07 AM, Moger, Babu wrote:
> On 3/16/23 15:33, Reinette Chatre wrote:
>> On 3/16/2023 12:51 PM, Moger, Babu wrote:
>>> On 3/16/23 12:12, Reinette Chatre wrote:
>>>> On 3/16/2023 9:27 AM, Moger, Babu wrote:
>>>>>> -----Original Message-----
>>>>>> From: Reinette Chatre <[email protected]>
>>>>>> Sent: Wednesday, March 15, 2023 1:33 PM
>>>>>> To: Moger, Babu <[email protected]>; [email protected];
>>>>>> [email protected]; [email protected]; [email protected]
>>>>>> Cc: [email protected]; [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]; [email protected];
>>>>>> [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; Das1, Sandipan
>>>>>> <[email protected]>; [email protected]; [email protected];
>>>>>> [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]; [email protected]; [email protected];
>>>>>> [email protected]
>>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
>>>>>> at once
>>>>>>
>>>>>> Hi Babu,
>>>>>>
>>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be
>>>>>>> done one at a time. For example:
>>>>>>>
>>>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/
>>>>>>> $mkdir /sys/fs/resctrl/clos1
>>>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>>>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>>>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>>>>>
>>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also,
>>>>>>> there is a syscall overhead for each command executed from user space.
>>>>>>
>>>>>> To support this change it may also be helpful to add that moving tasks take the
>>>>>> mutex so attempting to move tasks in parallel will not achieve a significant
>>>>>> performance gain.
>>>>>
>>>>> Agree. It may not be significant performance gain. Will remove this line.
>>>>
>>>> It does not sound as though you are actually responding to my comment.
>>>
>>> I am confused. I am already saying there is syscall overhead for each
>>> command if we move the tasks one by one. Now do you want me to add "moving
>>> tasks take the mutex so attempting to move tasks in parallel will not
>>> achieve a significant performance gain".
>>>
>>> It is contradictory, So, I wanted to remove the line about performance.
>>> Did I still miss something?
>>
>> Where is the contradiction?
>>
>> Consider your example:
>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>
>> Yes, there is syscall overhead for each of the above lines. My statement was in
>> support of this work by stating that a user aiming to improve performance by
>> attempting the above in parallel would not be able to see achieve significant
>> performance gain since the calls would end up being serialized.
>
> ok. Sure. Will add the text. I may modify little bit.
>>
>> You are providing two motivations (a) "user-friendly when dealing with
>> hundreds of tasks", and (b) syscall overhead. Have you measured the
>> improvement this solution provides?
>
> No. I have not measured the performance improvement.

The changelog makes a claim that the current implementation has overhead
that is removed with this change. There is no data to support this claim.

...

>>>>>>> +
>>>>>>> + buf[nbytes - 1] = '\0';
>>>>>>> +
>>>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>>>>>> if (!rdtgrp) {
>>>>>>> rdtgroup_kn_unlock(of->kn);
>>>>>>> return -ENOENT;
>>>>>>> }
>>>>>>> +
>>>>>>> +next:
>>>>>>> + if (!buf || buf[0] == '\0')
>>>>>>> + goto unlock;
>>>>>>> +
>>>>>>> + pid_str = strim(strsep(&buf, ","));
>>>>>>> +
>>>>>>
>>>>>> Could lib/cmdline.c:get_option() be useful?
>>>>>
>>>>> Yes. We could that also. May not be required for the simple case like this.
>>>>
>>>> Please keep an eye out for how much of it you end up duplicating ....
>>>
>>> Using the get_options will require at least two calls(one to get the
>>> length and then read the integers). Also need to allocate the integers
>>> array dynamically. That is lot code if we are going that route.
>>>
>>
>> I did not ask about get_options(), I asked about get_option().
>
> If you insist, will use get_option. But we still have to loop thru all the
> string till get_option returns 0. I can try that.


I just asked whether get_option() could be useful. Could you please point out what
I said that made you think that I insist on this change being made? If it matches
your usage, then know it is available, if it does not, then don't use it.

...

>>> I can say "The failure pid will be logged in
>>> /sys/fs/resctrl/info/last_cmd_status file."
>>
>> That will not be accurate. Not all errors include the pid.
>
> Can you please suggest?

last_cmd_status provides a 512 char buffer to communicate details
to the user. The buffer is cleared before the loop that moves all the
tasks start. If an error is encountered, a detailed message is written
to the buffer. One option may be to append a string to the buffer that
includes the pid? Perhaps something like:
rdt_last_cmd_printf("Error encountered while moving task %d\n", pid);

Please feel free to improve.

Reinette



2023-03-20 17:08:02

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

Hi Babu,

On 3/20/2023 9:52 AM, Moger, Babu wrote:
> On 3/15/23 13:42, Reinette Chatre wrote:

...

>>
>>
>>> When monitoring is enabled all MON groups will also contain:
>>>
>>
>> Shouldn't the "rmid" (to be renamed) entry be in this section of the
>> documentation?
>
> Not sure about this comment. Did you mean to move the rmid (to be renamed
> mon_hw_id) documentation here?


Yes. Note the header reads "When monitoring is enabled all MON groups
will also contain:". The new file is only relevant to monitoring groups, so
it seems appropriate that it falls under this section.

Reinette

2023-03-20 17:22:10

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

Hi Babu,

On 02/03/2023 20:24, Babu Moger wrote:
> When a user creates a control or monitor group, the CLOSID or RMID
> are not visible to the user. These are architecturally defined entities.

On x86. Any other architecture is going to have a hard time supporting this.


> There is no harm in displaying these in resctrl groups. Sometimes it
> can help to debug the issues.

By comparing it with what? Unless user-space can see into the hardware, resctrl is the
only gateway to this stuff. What difference does the allocated value here make?

Could you elaborate on what issues this can help debug?


> Add CLOSID and RMID to the control/monitor groups display in resctrl
> interface.
>
> $cat /sys/fs/resctrl/clos1/closid
> 1
> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
> 3

Er. Please don't expose this to user-space!
MPAM has no equivalent value to RMID, so whatever this is for, can't work on MPAM.


Where I have needed this value for MPAM is to pass the closid/rmid to another kernel
interface. Because the user-space interface needs to be architecture agnostic, I proposed
it as a u64 called 'id' that each architecture can encode/decode as appropriate. [0]

To prevent user-space trying to base anything on the raw closid/rmid values, I went as far
as obfuscating them with a random value picked at boot, to ensure scripts always read the
current value when passing the control/monitor group.


I'm curious what the raw value is useful for.


Thanks,

James

[0]
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.2&id=d568cf2ba58b7c4970ce41a8d4d6224e285a177e





2023-03-20 18:39:05

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once

Hi Reinette,

On 3/20/23 11:52, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/20/2023 8:07 AM, Moger, Babu wrote:
>> On 3/16/23 15:33, Reinette Chatre wrote:
>>> On 3/16/2023 12:51 PM, Moger, Babu wrote:
>>>> On 3/16/23 12:12, Reinette Chatre wrote:
>>>>> On 3/16/2023 9:27 AM, Moger, Babu wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Reinette Chatre <[email protected]>
>>>>>>> Sent: Wednesday, March 15, 2023 1:33 PM
>>>>>>> To: Moger, Babu <[email protected]>; [email protected];
>>>>>>> [email protected]; [email protected]; [email protected]
>>>>>>> Cc: [email protected]; [email protected]; [email protected];
>>>>>>> [email protected]; [email protected]; [email protected];
>>>>>>> [email protected]; [email protected];
>>>>>>> [email protected]; [email protected];
>>>>>>> [email protected]; [email protected]; [email protected];
>>>>>>> [email protected]; [email protected];
>>>>>>> [email protected]; [email protected]; Das1, Sandipan
>>>>>>> <[email protected]>; [email protected]; [email protected];
>>>>>>> [email protected]; [email protected];
>>>>>>> [email protected]; [email protected]; [email protected];
>>>>>>> [email protected]; [email protected]; [email protected];
>>>>>>> [email protected]
>>>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group
>>>>>>> at once
>>>>>>>
>>>>>>> Hi Babu,
>>>>>>>
>>>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be
>>>>>>>> done one at a time. For example:
>>>>>>>>
>>>>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/
>>>>>>>> $mkdir /sys/fs/resctrl/clos1
>>>>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>>>>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>>>>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>>>>>>
>>>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also,
>>>>>>>> there is a syscall overhead for each command executed from user space.
>>>>>>>
>>>>>>> To support this change it may also be helpful to add that moving tasks take the
>>>>>>> mutex so attempting to move tasks in parallel will not achieve a significant
>>>>>>> performance gain.
>>>>>>
>>>>>> Agree. It may not be significant performance gain. Will remove this line.
>>>>>
>>>>> It does not sound as though you are actually responding to my comment.
>>>>
>>>> I am confused. I am already saying there is syscall overhead for each
>>>> command if we move the tasks one by one. Now do you want me to add "moving
>>>> tasks take the mutex so attempting to move tasks in parallel will not
>>>> achieve a significant performance gain".
>>>>
>>>> It is contradictory, So, I wanted to remove the line about performance.
>>>> Did I still miss something?
>>>
>>> Where is the contradiction?
>>>
>>> Consider your example:
>>> $echo 123 > /sys/fs/resctrl/clos1/tasks
>>> $echo 456 > /sys/fs/resctrl/clos1/tasks
>>> $echo 789 > /sys/fs/resctrl/clos1/tasks
>>>
>>> Yes, there is syscall overhead for each of the above lines. My statement was in
>>> support of this work by stating that a user aiming to improve performance by
>>> attempting the above in parallel would not be able to see achieve significant
>>> performance gain since the calls would end up being serialized.
>>
>> ok. Sure. Will add the text. I may modify little bit.
>>>
>>> You are providing two motivations (a) "user-friendly when dealing with
>>> hundreds of tasks", and (b) syscall overhead. Have you measured the
>>> improvement this solution provides?
>>
>> No. I have not measured the performance improvement.
>
> The changelog makes a claim that the current implementation has overhead
> that is removed with this change. There is no data to support this claim.

My main motivation for this change is to make it user-friendly. So that
users can search the pid's and assign multiple tasks at a time. Originally
I did not have the line for performance. Actually, I don't want to claim
performance benefits. I will remove the performance claims.

>
> ...
>
>>>>>>>> +
>>>>>>>> + buf[nbytes - 1] = '\0';
>>>>>>>> +
>>>>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>>>>>>> if (!rdtgrp) {
>>>>>>>> rdtgroup_kn_unlock(of->kn);
>>>>>>>> return -ENOENT;
>>>>>>>> }
>>>>>>>> +
>>>>>>>> +next:
>>>>>>>> + if (!buf || buf[0] == '\0')
>>>>>>>> + goto unlock;
>>>>>>>> +
>>>>>>>> + pid_str = strim(strsep(&buf, ","));
>>>>>>>> +
>>>>>>>
>>>>>>> Could lib/cmdline.c:get_option() be useful?
>>>>>>
>>>>>> Yes. We could that also. May not be required for the simple case like this.
>>>>>
>>>>> Please keep an eye out for how much of it you end up duplicating ....
>>>>
>>>> Using the get_options will require at least two calls(one to get the
>>>> length and then read the integers). Also need to allocate the integers
>>>> array dynamically. That is lot code if we are going that route.
>>>>
>>>
>>> I did not ask about get_options(), I asked about get_option().
>>
>> If you insist, will use get_option. But we still have to loop thru all the
>> string till get_option returns 0. I can try that.
>
>
> I just asked whether get_option() could be useful. Could you please point out what
> I said that made you think that I insist on this change being made? If it matches
> your usage, then know it is available, if it does not, then don't use it.

Ok. I dont see a major benefit using get_option here. So, not planning to
to use it.

>
> ...
>
>>>> I can say "The failure pid will be logged in
>>>> /sys/fs/resctrl/info/last_cmd_status file."
>>>
>>> That will not be accurate. Not all errors include the pid.
>>
>> Can you please suggest?
>
> last_cmd_status provides a 512 char buffer to communicate details
> to the user. The buffer is cleared before the loop that moves all the
> tasks start. If an error is encountered, a detailed message is written
> to the buffer. One option may be to append a string to the buffer that
> includes the pid? Perhaps something like:
> rdt_last_cmd_printf("Error encountered while moving task %d\n", pid);

ok. Will try to add and test it.

>
> Please feel free to improve.
>
> Reinette
>
>

--
Thanks
Babu Moger

2023-03-20 19:53:44

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

Hi James,

On 3/20/23 12:10, James Morse wrote:
> Hi Babu,
>
> On 02/03/2023 20:24, Babu Moger wrote:
>> When a user creates a control or monitor group, the CLOSID or RMID
>> are not visible to the user. These are architecturally defined entities.
>
> On x86. Any other architecture is going to have a hard time supporting this.
>
>
>> There is no harm in displaying these in resctrl groups. Sometimes it
>> can help to debug the issues.
>
> By comparing it with what? Unless user-space can see into the hardware, resctrl is the
> only gateway to this stuff. What difference does the allocated value here make?
>
> Could you elaborate on what issues this can help debug?

While ago, we had an issue with one of the RMID's event reporting. There
were numerous active RMIDs on the system. As a kernel developer, we
couldn't pinpoint which RMID was reporting wrong information. That
information was important for hardware guys to debug further. We had to
patch the kernel to print that information for debugging. This is one of
the cases.

>
>
>> Add CLOSID and RMID to the control/monitor groups display in resctrl
>> interface.
>>
>> $cat /sys/fs/resctrl/clos1/closid
>> 1
>> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
>> 3
>
> Er. Please don't expose this to user-space!
> MPAM has no equivalent value to RMID, so whatever this is for, can't work on MPAM.
>
>
> Where I have needed this value for MPAM is to pass the closid/rmid to another kernel
> interface. Because the user-space interface needs to be architecture agnostic, I proposed
> it as a u64 called 'id' that each architecture can encode/decode as appropriate. [0]
>
> To prevent user-space trying to base anything on the raw closid/rmid values, I went as far
> as obfuscating them with a random value picked at boot, to ensure scripts always read the
> current value when passing the control/monitor group.
>
>
> I'm curious what the raw value is useful for.

It is mostly for debugging when there are issues.

I think we need to have a way to print generic as well as architecture
specific details.

Thanks
Babu
>
>
> Thanks,
>
> James
>
> [0]
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.2&id=d568cf2ba58b7c4970ce41a8d4d6224e285a177e
>
>
>
>

2023-03-20 20:59:40

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

On Thu, Mar 2, 2023 at 3:25 PM Babu Moger <[email protected]> wrote:
>
> When a user creates a control or monitor group, the CLOSID or RMID
> are not visible to the user. These are architecturally defined entities.
> There is no harm in displaying these in resctrl groups. Sometimes it
> can help to debug the issues.
>
> Add CLOSID and RMID to the control/monitor groups display in resctrl
> interface.
>
> $cat /sys/fs/resctrl/clos1/closid
> 1
> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
> 3
>
Is the intent here to be x86 specific?
How would that work on ARM?
Shouldn't we be using more generic names such as monitoring_id, control_id?


> Signed-off-by: Babu Moger <[email protected]>
> ---
> Documentation/x86/resctrl.rst | 17 ++++++++++++
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 ++++++++++++++++++++++++++++++++
> 2 files changed, 61 insertions(+)
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 25203f20002d..67eae74fe40c 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -321,6 +321,15 @@ All groups contain the following files:
> Just like "cpus", only using ranges of CPUs instead of bitmasks.
>
>
> +"rmid":
> + Available only with debug option.Reading this file shows the
> + Resource Monitoring ID (RMID) for monitoring the resource
> + utilization. Monitoring is performed by tagging each core (or
> + thread) or process via a RMID. Kernel assigns a new RMID when
> + a group is created depending on the available RMIDs. Multiple
> + cores (or threads) or processes can share a same RMID in a resctrl
> + domain.
> +
> When control is enabled all CTRL_MON groups will also contain:
>
> "schemata":
> @@ -342,6 +351,14 @@ When control is enabled all CTRL_MON groups will also contain:
> file. On successful pseudo-locked region creation the mode will
> automatically change to "pseudo-locked".
>
> +"closid":
> + Available only with debug option. Reading this file shows the
> + Class of Service (CLOS) id which acts as a resource control tag
> + on which the resources can be throttled. Kernel assigns a new
> + CLOSID a control group is created depending on the available
> + CLOSIDs. Multiple cores(or threads) or processes can share a
> + same CLOSID in a resctrl domain.
> +
> When monitoring is enabled all MON groups will also contain:
>
> "mon_data":
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 1eb538965bd3..389d64b42704 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -760,6 +760,38 @@ static int rdtgroup_tasks_show(struct kernfs_open_file *of,
> return ret;
> }
>
> +static int rdtgroup_closid_show(struct kernfs_open_file *of,
> + struct seq_file *s, void *v)
> +{
> + struct rdtgroup *rdtgrp;
> + int ret = 0;
> +
> + rdtgrp = rdtgroup_kn_lock_live(of->kn);
> + if (rdtgrp)
> + seq_printf(s, "%u\n", rdtgrp->closid);
> + else
> + ret = -ENOENT;
> + rdtgroup_kn_unlock(of->kn);
> +
> + return ret;
> +}
> +
> +static int rdtgroup_rmid_show(struct kernfs_open_file *of,
> + struct seq_file *s, void *v)
> +{
> + struct rdtgroup *rdtgrp;
> + int ret = 0;
> +
> + rdtgrp = rdtgroup_kn_lock_live(of->kn);
> + if (rdtgrp)
> + seq_printf(s, "%u\n", rdtgrp->mon.rmid);
> + else
> + ret = -ENOENT;
> + rdtgroup_kn_unlock(of->kn);
> +
> + return ret;
> +}
> +
> #ifdef CONFIG_PROC_CPU_RESCTRL
>
> /*
> @@ -1821,6 +1853,12 @@ static struct rftype res_common_files[] = {
> .seq_show = rdtgroup_tasks_show,
> .fflags = RFTYPE_BASE,
> },
> + {
> + .name = "rmid",
> + .mode = 0444,
> + .kf_ops = &rdtgroup_kf_single_ops,
> + .seq_show = rdtgroup_rmid_show,
> + },
> {
> .name = "schemata",
> .mode = 0644,
> @@ -1844,6 +1882,12 @@ static struct rftype res_common_files[] = {
> .seq_show = rdtgroup_size_show,
> .fflags = RFTYPE_BASE_CTRL,
> },
> + {
> + .name = "closid",
> + .mode = 0444,
> + .kf_ops = &rdtgroup_kf_single_ops,
> + .seq_show = rdtgroup_closid_show,
> + },
>
> };
>
>
>

2023-03-21 15:55:05

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 4/7] x86/resctrl: Re-arrange RFTYPE flags based on hierarchy

Hi Reinette,
To be honest, I had tough time understanding these flags. Also, I need to
add more files in the future. So, I am trying make these these things
clear before I do those changes.

These flags decoding is pretty confusing. Also, there are some flags which
are duplicate. Not really required.

For example:
In group structure, we have control group or mon group. We just need two
bits here. The code uses combination of 3 flags here.
#define RFTYPE_BASE BIT(1)
#define RFTYPE_CTRL BIT(4)
#define RFTYPE_MON BIT(5)

Also, the flag RFTYPE_MON again used in creation on info directory.
Basically, very confusing to add anything new.

I will try to minimize the changes in the next version but still make it
clear.


On 3/15/23 13:37, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/2/2023 12:24 PM, Babu Moger wrote:
>> RESCTRL filesystem has two main components:
>> a. info (Details on resources and monitoring)
>> b. base (Details on CONTROL and MON groups)
>>
>> The rftype flags can be renamed accordingly for better understanding.
>> For example:
>> RFTYPE_INFO : Files with these flags go in info directory
>
> This is not a rename but the current name.

Agree. I am giving some example here. I may not need to change the text here.
>
>> RFTYPE_INFO_MON : Files with these flags go in info/L3_MON
>
> How does this improve the current RFTYPE_MON_INFO?

RFTYPE_INFO_MON -> info/L3_MON.

I tried to improve some readability based on hierarchy. Basically, looking
at the flags we know exaclty where these files land.


>
>> RFTYPE_BASE : Files with these flags go in group's(control or mon)
>> base directory
> This is not a rename but the current name.
>
>> RFTYPE_BASE_CTRL: Files with these flags go in only CONTROL groups
>
> How does this improve current RFTYPE_CTRL_BASE ?

Again, same explanation as above. Started with RFTYPE_BASE and added
RFTYPE_BASE_CTRL to say these files are on top of base.


>
>>
>> Add comments to make it easy for future additions.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/core.c | 8 ++--
>> arch/x86/kernel/cpu/resctrl/internal.h | 64 ++++++++++++++++++++++++++++----
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 +++++++++++-----------
>> 3 files changed, 81 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 030d3b409768..d1c6b2cc8611 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -69,7 +69,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
>> .domains = domain_init(RDT_RESOURCE_L3),
>> .parse_ctrlval = parse_cbm,
>> .format_str = "%d=%0*x",
>> - .fflags = RFTYPE_RES_CACHE,
>> + .fflags = RFTYPE_CACHE,
>> },
>
> How does this rename improve understanding?

Agree. This change may not be required. I can actually remove these changes
>
> ...
>
>> @@ -97,7 +97,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
>> .domains = domain_init(RDT_RESOURCE_MBA),
>> .parse_ctrlval = parse_bw,
>> .format_str = "%d=%*u",
>> - .fflags = RFTYPE_RES_MB,
>> + .fflags = RFTYPE_MB,
>> },
>> },
>> [RDT_RESOURCE_SMBA] =
>
> ditto.

Agree. This change may not be required. I can actually remove these changes
.
>
>
> ...
>
>> + *
>> */
>> #define RFTYPE_INFO BIT(0)
>> #define RFTYPE_BASE BIT(1)
>> -#define RFTYPE_CTRL BIT(4)
>> -#define RFTYPE_MON BIT(5)
>> -#define RFTYPE_TOP BIT(6)
>> -#define RFTYPE_RES_CACHE BIT(8)
>> -#define RFTYPE_RES_MB BIT(9)
>> -#define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
>> -#define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
>> -#define RFTYPE_TOP_INFO (RFTYPE_INFO | RFTYPE_TOP)
>> -#define RFTYPE_CTRL_BASE (RFTYPE_BASE | RFTYPE_CTRL)
>> +
>> +#define RFTYPE_TOP BIT(2)
>> +#define RFTYPE_MON BIT(3)
>> +#define RFTYPE_RES BIT(4)
>> +
>> +#define RFTYPE_CACHE BIT(5)
>> +#define RFTYPE_MB BIT(6)
>> +
>> +#define RFTYPE_CTRL BIT(8)
>> +
>> +#define RFTYPE_INFO_TOP (RFTYPE_INFO | RFTYPE_TOP)
>> +#define RFTYPE_INFO_MON (RFTYPE_INFO | RFTYPE_MON)
>> +#define RFTYPE_INFO_RES (RFTYPE_INFO | RFTYPE_RES)
>> +
>> +#define RFTYPE_BASE_CTRL (RFTYPE_BASE | RFTYPE_CTRL)
>>
>
> It is not clear to me how any of the renames improves understanding.
>
> How does renaming RFTYPE_CTRL_BASE to RFTYPE_BASE_CTRL improve
> understanding? Renaming RFTYPE_MON_INFO to RFTYPE_INFO_MON?
>
> This all seems unnecessary.

Again see my comments in the beginning.
>
> ...
>
>> @@ -3218,7 +3218,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>> if (rtype == RDTCTRL_GROUP)
>> fflags = RFTYPE_BASE | RFTYPE_CTRL;
>> else
>> - fflags = RFTYPE_BASE | RFTYPE_MON;
>> + fflags = RFTYPE_BASE;
>>
>
> Is this intended?

Yes. We don't need this extra flag (RFTYPE_MON) here.
Thanks
Babu Moger

2023-03-22 13:51:06

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

Hi Stephane,

On 3/20/23 15:59, Stephane Eranian wrote:
> On Thu, Mar 2, 2023 at 3:25 PM Babu Moger <[email protected]> wrote:
>>
>> When a user creates a control or monitor group, the CLOSID or RMID
>> are not visible to the user. These are architecturally defined entities.
>> There is no harm in displaying these in resctrl groups. Sometimes it
>> can help to debug the issues.
>>
>> Add CLOSID and RMID to the control/monitor groups display in resctrl
>> interface.
>>
>> $cat /sys/fs/resctrl/clos1/closid
>> 1
>> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
>> 3
>>
> Is the intent here to be x86 specific?
> How would that work on ARM?
> Shouldn't we be using more generic names such as monitoring_id, control_id?

Yes. We can do that.
Thanks
Babu

2023-03-22 14:06:35

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] x86/resctrl: Introduce -o debug mount option



On 3/15/23 13:43, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/2/2023 12:25 PM, Babu Moger wrote:
>> Add -o debug option to mount resctrl filesystem in debug mode. Debug
>> option adds the files for debug purposes.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> Documentation/x86/resctrl.rst | 2 ++
>> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 13 +++++++++++++
>> 3 files changed, 16 insertions(+)
>>
>> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
>> index 67eae74fe40c..1ada4e0650dc 100644
>> --- a/Documentation/x86/resctrl.rst
>> +++ b/Documentation/x86/resctrl.rst
>> @@ -46,6 +46,8 @@ mount options are:
>> "mba_MBps":
>> Enable the MBA Software Controller(mba_sc) to specify MBA
>> bandwidth in MBps
>> +"debug":
>> + Lists the debug files in resctrl interface
>>
>
> This seems to imply that a listing of available debug flags
> will be displayed. How about something like "Make debug files
> accessible. Available debug files are annotated with "Available
> only with debug option"." (please feel free to improve).

Sure.

>
>> L2 and L3 CDP are controlled separately.
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 6767c85b9699..35a9ee343fe0 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -59,6 +59,7 @@ struct rdt_fs_context {
>> bool enable_cdpl2;
>> bool enable_cdpl3;
>> bool enable_mba_mbps;
>> + bool debug;
>
> Why not follow the prefix of existing flags?

Ok. Sure. Will change it.

--
Thanks
Babu Moger

2023-03-22 15:16:58

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] x86/resctrl: Add debug files when mounted with debug option



On 3/15/23 13:45, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/2/2023 12:25 PM, Babu Moger wrote:
>> Add the debug files to the resctrl hierarchy.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 48 +++++++++++++++++++++++++++++++-
>> 1 file changed, 47 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index a1f13715a65c..790c6b9f9031 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2400,6 +2400,45 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn,
>> struct rdtgroup *prgrp,
>> struct kernfs_node **mon_data_kn);
>>
>> +void resctrl_add_debug_file(struct kernfs_node *parent_kn, const char *config,
>> + unsigned long fflags)
>> +{
>> + struct rftype *rft;
>> +
>> + rft = rdtgroup_get_rftype_by_name(config);
>> + if (rft) {
>> + rft->fflags |= fflags;
>> + rdtgroup_add_file(parent_kn, rft);
>> + }
>> +}
>> +
>> +static void resctrl_add_debug_files(void)
>> +{
>> + resctrl_add_debug_file(rdtgroup_default.kn, "rmid", RFTYPE_BASE);
>> + resctrl_add_debug_file(rdtgroup_default.kn, "closid", RFTYPE_BASE_CTRL);
>> + kernfs_activate(rdtgroup_default.kn);
>> +}
>> +
>
> I think that separating this from all the other resctrl file creation
> can be a source of confusion and bugs. Why not add the debug files
> at the same time as all the other files belonging to a resource group?
>
> How about introducing new flags, perhaps RFTYPE_MON_DEBUG and
> RFTYPE_CTRL_DEBUG. When the debug option is enabled (when resctrl_debug
> is true) then the appropriate flag can be OR'd with the other flags
> before rdtgroup_add_files() is called.

Yes. I could try that.
>
> It sounds to me if there are plans to add more of these files. A function
> like resctrl_add_debug_files() requires a lot of changes and care (and thus
> potential for errors) every time a new debug file is added.
>
> On another note ... what are the plans with this debug area? At some
> point it may be better to expand resctrl debugfs.

Hmm.. I have not thought about that.

--
Thanks
Babu Moger

2023-03-22 18:47:32

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v3 4/7] x86/resctrl: Re-arrange RFTYPE flags based on hierarchy

Hi Babu,

On 3/21/2023 8:54 AM, Moger, Babu wrote:
> Hi Reinette,
> To be honest, I had tough time understanding these flags. Also, I need to
> add more files in the future. So, I am trying make these these things
> clear before I do those changes.
>
> These flags decoding is pretty confusing. Also, there are some flags which
> are duplicate. Not really required.
>
> For example:
> In group structure, we have control group or mon group. We just need two
> bits here. The code uses combination of 3 flags here.
> #define RFTYPE_BASE BIT(1)
> #define RFTYPE_CTRL BIT(4)
> #define RFTYPE_MON BIT(5)

Two bits are used to distinguish between files being for control or
monitoring groups respectively.

The third bit you list is used to indicate where the file is located. The
RFTYPE_BASE is the bit used to specify that the file is located within
a resource group, as opposed to RFTYPE_INFO that specifies that the file
is located within the info directory.


> Also, the flag RFTYPE_MON again used in creation on info directory.
> Basically, very confusing to add anything new.

The RFTYPE_MON indicates a monitoring file. Whether it is in the info
directory or as part of a monitoring group is specified using RFTYPE_INFO
or RFTYPE_BASE respectively.

A file can be specific to a monitoring or control group by setting
the RFTYPE_CTRL or RFTYPE_MON bits, but if it does not then control and
monitoring groups will have the file. For example, the "tasks" and "cpus"
files.

> I will try to minimize the changes in the next version but still make it
> clear.

ok

>
>
> On 3/15/23 13:37, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 3/2/2023 12:24 PM, Babu Moger wrote:
>>> RESCTRL filesystem has two main components:
>>> a. info (Details on resources and monitoring)
>>> b. base (Details on CONTROL and MON groups)
>>>
>>> The rftype flags can be renamed accordingly for better understanding.
>>> For example:
>>> RFTYPE_INFO : Files with these flags go in info directory
>>
>> This is not a rename but the current name.
>
> Agree. I am giving some example here. I may not need to change the text here.
>>
>>> RFTYPE_INFO_MON : Files with these flags go in info/L3_MON
>>
>> How does this improve the current RFTYPE_MON_INFO?
>
> RFTYPE_INFO_MON -> info/L3_MON.
>
> I tried to improve some readability based on hierarchy. Basically, looking
> at the flags we know exaclty where these files land.

It is not clear to me how switching around terms in the flag
accomplishes this. The meaning ends up the same.

>>> @@ -3218,7 +3218,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>>> if (rtype == RDTCTRL_GROUP)
>>> fflags = RFTYPE_BASE | RFTYPE_CTRL;
>>> else
>>> - fflags = RFTYPE_BASE | RFTYPE_MON;
>>> + fflags = RFTYPE_BASE;
>>>
>>
>> Is this intended?
>
> Yes. We don't need this extra flag (RFTYPE_MON) here.

It is not used, but it reflects the resctrl design in support
of adding monitoring files. Future additions of files need not think
how such integration needs to happen since it has been solved and
is supported.

Reinette