2020-01-09 16:02:01

by Chen Yu

[permalink] [raw]
Subject: [PATCH][RESEND v5] x86/resctrl: Add task resctrl information display

Monitoring tools that want to find out which resctrl control
and monitor groups a task belongs to must currently read
the "tasks" file in every group until they locate the process
ID.

Add an additional file /proc/{pid}/resctrl to provide this
information.

The output is as followed according to Thomas's suggestion,
for example:

1) ""
Resctrl is not available.

2) "/"
Task is part of the root group, task is not associated to
any monitoring group.

3) "/mon_groups/mon0"
Task is part of the root group and monitoring group mon0.

4) "/group0"
Task is part of control group group0, task is not associated
to any monitoring group.

5) "/group0/mon_groups/mon1"
Task is part of control group group0 and monitoring group mon1.

Tested-by: Jinshi Chen <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Reinette Chatre <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Chen Yu <[email protected]>
---
v2: Reduce indentation level in proc_resctrl_show()
according to Boris's suggestion.
Create the include/linux/resctrl.h header and
declare proc_resctrl_show() in this file, so
that other architectures would probably use it
in the future. Different architectures should
implement architectural specific proc_resctrl_show()
accordingly.

v3: Return empty string if the resctrl filesystem has
not been mounted per Boris's suggestion.
Rename the config from CPU_RESCTRL to PROC_CPU_RESCTRL
to better represent its usage. Move PROC_CPU_RESCTRL
from arch/Kconfig to fs/proc/Kconfig.
And let PROC_CPU_RESCTRL to be depended on PROC_FS.

v4: According to Thomas's suggestion, changed the output
from multiple lines to one single line.

v5: According to Alexey's feedback, removed the header file
proc_fs.h in resctrl.h, and changed seq_puts() to
seq_putc() for simplicity.
---
arch/x86/Kconfig | 1 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 78 ++++++++++++++++++++++++++
fs/proc/Kconfig | 4 ++
fs/proc/base.c | 7 +++
include/linux/resctrl.h | 14 +++++
5 files changed, 104 insertions(+)
create mode 100644 include/linux/resctrl.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8ef85139553f..252364d18887 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -455,6 +455,7 @@ config X86_CPU_RESCTRL
bool "x86 CPU resource control support"
depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
select KERNFS
+ select PROC_CPU_RESCTRL if PROC_FS
help
Enable x86 CPU resource control support.

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 2e3b06d6bbc6..f786e7626a65 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -725,6 +725,84 @@ static int rdtgroup_tasks_show(struct kernfs_open_file *of,
return ret;
}

+#ifdef CONFIG_PROC_CPU_RESCTRL
+
+/*
+ * A task can only be part of one control
+ * group and of one monitoring group which
+ * is associated to that control group.
+ * So one line is simple and clear enough:
+ *
+ * 1) ""
+ * Resctrl is not available.
+ *
+ * 2) "/"
+ * Task is part of the root group, and it is
+ * not associated to any monitoring group.
+ *
+ * 3) "/mon_groups/mon0"
+ * Task is part of the root group and monitoring
+ * group mon0.
+ *
+ * 4) "/group0"
+ * Task is part of control group group0, and it is
+ * not associated to any monitoring group.
+ *
+ * 5) "/group0/mon_groups/mon1"
+ * Task is part of control group group0 and monitoring
+ * group mon1.
+ */
+int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns,
+ struct pid *pid, struct task_struct *tsk)
+{
+ struct rdtgroup *rdtg;
+ int ret = 0;
+
+ mutex_lock(&rdtgroup_mutex);
+
+ /* Return empty if resctrl has not been mounted. */
+ if (!static_branch_unlikely(&rdt_enable_key))
+ goto unlock;
+
+ list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
+ struct rdtgroup *crg;
+
+ /*
+ * Task information is only relevant for shareable
+ * and exclusive groups.
+ */
+ if (rdtg->mode != RDT_MODE_SHAREABLE &&
+ rdtg->mode != RDT_MODE_EXCLUSIVE)
+ continue;
+
+ if (rdtg->closid != tsk->closid)
+ continue;
+
+ seq_printf(s, "/%s", rdtg->kn->name);
+ list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
+ mon.crdtgrp_list) {
+ if (tsk->rmid != crg->mon.rmid)
+ continue;
+ seq_printf(s, "%smon_groups/%s",
+ rdtg == &rdtgroup_default ? "" : "/",
+ crg->kn->name);
+ break;
+ }
+ seq_putc(s, '\n');
+ goto unlock;
+ }
+ /*
+ * The above search should succeed. Otherwise return
+ * with an error.
+ */
+ ret = -ENOENT;
+unlock:
+ mutex_unlock(&rdtgroup_mutex);
+
+ return ret;
+}
+#endif
+
static int rdt_last_cmd_status_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
diff --git a/fs/proc/Kconfig b/fs/proc/Kconfig
index cb5629bd5fff..ae96a339d24d 100644
--- a/fs/proc/Kconfig
+++ b/fs/proc/Kconfig
@@ -103,3 +103,7 @@ config PROC_CHILDREN
config PROC_PID_ARCH_STATUS
def_bool n
depends on PROC_FS
+
+config PROC_CPU_RESCTRL
+ def_bool n
+ depends on PROC_FS
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ebea9501afb8..0e4b8bf2b986 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -94,6 +94,7 @@
#include <linux/sched/debug.h>
#include <linux/sched/stat.h>
#include <linux/posix-timers.h>
+#include <linux/resctrl.h>
#include <trace/events/oom.h>
#include "internal.h"
#include "fd.h"
@@ -3060,6 +3061,9 @@ static const struct pid_entry tgid_base_stuff[] = {
#endif
#ifdef CONFIG_CGROUPS
ONE("cgroup", S_IRUGO, proc_cgroup_show),
+#endif
+#ifdef CONFIG_PROC_CPU_RESCTRL
+ ONE("resctrl", S_IRUGO, proc_resctrl_show),
#endif
ONE("oom_score", S_IRUGO, proc_oom_score),
REG("oom_adj", S_IRUGO|S_IWUSR, proc_oom_adj_operations),
@@ -3460,6 +3464,9 @@ static const struct pid_entry tid_base_stuff[] = {
#endif
#ifdef CONFIG_CGROUPS
ONE("cgroup", S_IRUGO, proc_cgroup_show),
+#endif
+#ifdef CONFIG_PROC_CPU_RESCTRL
+ ONE("resctrl", S_IRUGO, proc_resctrl_show),
#endif
ONE("oom_score", S_IRUGO, proc_oom_score),
REG("oom_adj", S_IRUGO|S_IWUSR, proc_oom_adj_operations),
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
new file mode 100644
index 000000000000..daf5cf64c6a6
--- /dev/null
+++ b/include/linux/resctrl.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _RESCTRL_H
+#define _RESCTRL_H
+
+#ifdef CONFIG_PROC_CPU_RESCTRL
+
+int proc_resctrl_show(struct seq_file *m,
+ struct pid_namespace *ns,
+ struct pid *pid,
+ struct task_struct *tsk);
+
+#endif
+
+#endif /* _RESCTRL_H */
--
2.17.1


2020-01-09 16:08:01

by Chris Down

[permalink] [raw]
Subject: Re: [PATCH][RESEND v5] x86/resctrl: Add task resctrl information display

Chen Yu writes:
>+#ifdef CONFIG_PROC_CPU_RESCTRL
>+
>+/*
>+ * A task can only be part of one control
>+ * group and of one monitoring group which
>+ * is associated to that control group.
>+ * So one line is simple and clear enough:

Can we please avoid using the word "control group" to describe these? It's
extremely confusing for readers since it's exactly the same word as used for
actual cgroups, especially since those are also a form of "resource control"...

Doesn't official documentation refer to them as "resource groups" to avoid
this?

2020-01-09 16:10:27

by Chris Down

[permalink] [raw]
Subject: Re: [PATCH][RESEND v5] x86/resctrl: Add task resctrl information display

Chen Yu writes:
>+#ifdef CONFIG_PROC_CPU_RESCTRL
>+ ONE("resctrl", S_IRUGO, proc_resctrl_show),

There was already some discussion about "resctrl" by itself being a misleading
name, hence why the CONFIG option eventually became CONFIG_X86_CPU_RESCTRL. Can
you please rethink the name of this /proc file, and have it at least be
"cpu_resctrl" or "x86_resctrl" or similar? :-)

2020-01-09 20:37:26

by Chen Yu

[permalink] [raw]
Subject: Re: [PATCH][RESEND v5] x86/resctrl: Add task resctrl information display

Hi Chris,
Thanks for looking at this patch.
On Thu, Jan 09, 2020 at 02:53:55PM +0000, Chris Down wrote:
> Chen Yu writes:
> > +#ifdef CONFIG_PROC_CPU_RESCTRL
> > + ONE("resctrl", S_IRUGO, proc_resctrl_show),
>
> There was already some discussion about "resctrl" by itself being a
> misleading name, hence why the CONFIG option eventually became
> CONFIG_X86_CPU_RESCTRL. Can you please rethink the name of this /proc file,
> and have it at least be "cpu_resctrl" or "x86_resctrl" or similar? :-)
Ok, I'll change the name from "resctrl" to "cpu_resctrl", because:
1. it is a CPU feature so a 'cpu' prefixed seems to be more obvious,
2. other CPUs than x86 could also use this file in the future.

Thanks,
Chenyu

2020-01-09 21:01:23

by Chen Yu

[permalink] [raw]
Subject: Re: [PATCH][RESEND v5] x86/resctrl: Add task resctrl information display

Hi Chris,
On Thu, Jan 09, 2020 at 02:24:44PM +0000, Chris Down wrote:
> Chen Yu writes:
> > +#ifdef CONFIG_PROC_CPU_RESCTRL
> > +
> > +/*
> > + * A task can only be part of one control
> > + * group and of one monitoring group which
> > + * is associated to that control group.
> > + * So one line is simple and clear enough:
>
> Can we please avoid using the word "control group" to describe these? It's
> extremely confusing for readers since it's exactly the same word as used for
> actual cgroups, especially since those are also a form of "resource
> control"...
>
> Doesn't official documentation refer to them as "resource groups" to avoid
> this?
Right, how abut changing change this description to:
control group -> resctrl allocation group
monitor group -> resctrl monitor group?

Yes, the Documentation/x86/resctrl_ui.rst described them as
"Resource Control feature", which is composed of allocation
and monitor, so distinguish them as above seems to be appropriate.

Thanks,
Chenyu