2017-04-03 21:43:52

by Shivappa Vikas

[permalink] [raw]
Subject: [PATCH 0/3 V3] x86/intel_rdt: Improvements/Fixes to RDT framework

Add some improvements to the existing RDT(Resource director technology)
framework and fixes.

Patches are based on 4.11-rc4

Thanks to Thomas for all the feedback and below are changes in V3:

- Reformatted change log to make it more readable and follow
the format context, problem, solution format.
- Made a lot of coding convention changes and to make it more
readable.
- Removed hotcpu change as it was cqm specific and to be sent later.
- Including Tony's patch which removes the restriction of ordering the
resources and to specify data for all domains while editing schemata.
This includes the changes in previous version patches 1/5 and 3/5 -
https://marc.info/?l=linux-kernel&m=148736049210659

[PATCH 1/3] x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid
[PATCH 2/3] x86/intel_rdt: Implement "update" mode when writing
[PATCH 3/3] x86/intel_rdt: Update schemata read to show data in


2017-04-03 21:43:57

by Shivappa Vikas

[permalink] [raw]
Subject: [PATCH 2/3] x86/intel_rdt: Implement "update" mode when writing schemata file

From: Tony Luck <[email protected]>

The schemata file can have multiple lines and it is cumbersome to
update all lines.

Remove code that requires that the user provide values for every
resource (in the right order). If the user provides values for
just a few resources, update them and leave the rest unchanged.

Side benefit: we now check which values were updated and only send
IPIs to cpus that actually have updates.

Cc: Vikas Shivappa <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: [email protected]
Cc: [email protected]
Tested-by: Sai Praneeth Prakhya <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Vikas Shivappa <[email protected]>
---
Documentation/x86/intel_rdt_ui.txt | 14 ++++++
arch/x86/include/asm/intel_rdt.h | 10 ++---
arch/x86/kernel/cpu/intel_rdt.c | 2 -
arch/x86/kernel/cpu/intel_rdt_schemata.c | 76 ++++++++++++++------------------
4 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index 51cf6fa..3ea1984 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -129,6 +129,20 @@ schemata format is always:

L2:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

+Reading/writing the schemata file
+---------------------------------
+Reading the schemata file will show the state of all resources
+on all domains. When writing you only need to specify those values
+which you wish to change. E.g.
+
+# cat schemata
+L3DATA:0=fffff;1=fffff;2=fffff;3=fffff
+L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+# echo "L3DATA:2=3c0;" > schemata
+# cat schemata
+L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
+L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+
Example 1
---------
On a two socket machine (one L3 cache per socket) with just four bits
diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 0d64397..d770527 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -76,10 +76,7 @@ struct rftype {
* @min_cbm_bits: Minimum number of consecutive bits to be set
* in a cache bit mask
* @domains: All domains for this resource
- * @num_domains: Number of domains active
* @msr_base: Base MSR address for CBMs
- * @tmp_cbms: Scratch space when updating schemata
- * @num_tmp_cbms: Number of CBMs in tmp_cbms
* @cache_level: Which cache level defines scope of this domain
* @cbm_idx_multi: Multiplier of CBM index
* @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
@@ -94,10 +91,7 @@ struct rdt_resource {
int min_cbm_bits;
u32 max_cbm;
struct list_head domains;
- int num_domains;
int msr_base;
- u32 *tmp_cbms;
- int num_tmp_cbms;
int cache_level;
int cbm_idx_multi;
int cbm_idx_offset;
@@ -109,12 +103,16 @@ struct rdt_resource {
* @id: unique id for this instance
* @cpu_mask: which cpus share this resource
* @cbm: array of cache bit masks (indexed by CLOSID)
+ * @new_cbm: new cbm value to be loaded
+ * @have_new_cbm: did user provide new_cbm for this domain
*/
struct rdt_domain {
struct list_head list;
int id;
struct cpumask cpu_mask;
u32 *cbm;
+ u32 new_cbm;
+ bool have_new_cbm;
};

/**
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 5a533fe..329b887 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -309,7 +309,6 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)

cpumask_set_cpu(cpu, &d->cpu_mask);
list_add_tail(&d->list, add_pos);
- r->num_domains++;
}

static void domain_remove_cpu(int cpu, struct rdt_resource *r)
@@ -325,7 +324,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)

cpumask_clear_cpu(cpu, &d->cpu_mask);
if (cpumask_empty(&d->cpu_mask)) {
- r->num_domains--;
kfree(d->cbm);
list_del(&d->list);
kfree(d);
diff --git a/arch/x86/kernel/cpu/intel_rdt_schemata.c b/arch/x86/kernel/cpu/intel_rdt_schemata.c
index f369cb8..52e83ea 100644
--- a/arch/x86/kernel/cpu/intel_rdt_schemata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_schemata.c
@@ -56,17 +56,21 @@ static bool cbm_validate(unsigned long var, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-static int parse_cbm(char *buf, struct rdt_resource *r)
+static int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d)
{
unsigned long data;
int ret;

+ if (d->have_new_cbm)
+ return -EINVAL;
+
ret = kstrtoul(buf, 16, &data);
if (ret)
return ret;
if (!cbm_validate(data, r))
return -EINVAL;
- r->tmp_cbms[r->num_tmp_cbms++] = data;
+ d->new_cbm = data;
+ d->have_new_cbm = true;

return 0;
}
@@ -74,8 +78,8 @@ static int parse_cbm(char *buf, struct rdt_resource *r)
/*
* For each domain in this resource we expect to find a series of:
* id=mask
- * separated by ";". The "id" is in decimal, and must appear in the
- * right order.
+ * separated by ";". The "id" is in decimal, and must match one of
+ * the "id"s for this resource.
*/
static int parse_line(char *line, struct rdt_resource *r)
{
@@ -83,21 +87,21 @@ static int parse_line(char *line, struct rdt_resource *r)
struct rdt_domain *d;
unsigned long dom_id;

+next:
+ if (!line || line[0] == '\0')
+ return 0;
+ dom = strsep(&line, ";");
+ id = strsep(&dom, "=");
+ if (!dom || kstrtoul(id, 10, &dom_id))
+ return -EINVAL;
list_for_each_entry(d, &r->domains, list) {
- dom = strsep(&line, ";");
- if (!dom)
- return -EINVAL;
- id = strsep(&dom, "=");
- if (kstrtoul(id, 10, &dom_id) || dom_id != d->id)
- return -EINVAL;
- if (parse_cbm(dom, r))
- return -EINVAL;
+ if (d->id == dom_id) {
+ if (parse_cbm(dom, r, d))
+ return -EINVAL;
+ goto next;
+ }
}
-
- /* Any garbage at the end of the line? */
- if (line && line[0])
- return -EINVAL;
- return 0;
+ return -EINVAL;
}

static int update_domains(struct rdt_resource *r, int closid)
@@ -105,7 +109,7 @@ static int update_domains(struct rdt_resource *r, int closid)
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
- int cpu, idx = 0;
+ int cpu;

if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
@@ -115,9 +119,13 @@ static int update_domains(struct rdt_resource *r, int closid)
msr_param.res = r;

list_for_each_entry(d, &r->domains, list) {
- cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
- d->cbm[msr_param.low] = r->tmp_cbms[idx++];
+ if (d->have_new_cbm && d->new_cbm != d->cbm[closid]) {
+ cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
+ d->cbm[closid] = d->new_cbm;
+ }
}
+ if (cpumask_empty(cpu_mask))
+ goto done;
cpu = get_cpu();
/* Update CBM on this cpu if it's in cpu_mask. */
if (cpumask_test_cpu(cpu, cpu_mask))
@@ -126,6 +134,7 @@ static int update_domains(struct rdt_resource *r, int closid)
smp_call_function_many(cpu_mask, rdt_cbm_update, &msr_param, 1);
put_cpu();

+done:
free_cpumask_var(cpu_mask);

return 0;
@@ -135,10 +144,10 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct rdtgroup *rdtgrp;
+ struct rdt_domain *dom;
struct rdt_resource *r;
char *tok, *resname;
int closid, ret = 0;
- u32 *l3_cbms = NULL;

/* Valid input requires a trailing newline */
if (nbytes == 0 || buf[nbytes - 1] != '\n')
@@ -153,16 +162,9 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

closid = rdtgrp->closid;

- /* get scratch space to save all the masks while we validate input */
- for_each_enabled_rdt_resource(r) {
- r->tmp_cbms = kcalloc(r->num_domains, sizeof(*l3_cbms),
- GFP_KERNEL);
- if (!r->tmp_cbms) {
- ret = -ENOMEM;
- goto out;
- }
- r->num_tmp_cbms = 0;
- }
+ for_each_enabled_rdt_resource(r)
+ list_for_each_entry(dom, &r->domains, list)
+ dom->have_new_cbm = false;

while ((tok = strsep(&buf, "\n")) != NULL) {
resname = strsep(&tok, ":");
@@ -185,14 +187,6 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
}
}

- /* Did the parser find all the masks we need? */
- for_each_enabled_rdt_resource(r) {
- if (r->num_tmp_cbms != r->num_domains) {
- ret = -EINVAL;
- goto out;
- }
- }
-
for_each_enabled_rdt_resource(r) {
ret = update_domains(r, closid);
if (ret)
@@ -201,10 +195,6 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

out:
rdtgroup_kn_unlock(of->kn);
- for_each_enabled_rdt_resource(r) {
- kfree(r->tmp_cbms);
- r->tmp_cbms = NULL;
- }
return ret ?: nbytes;
}

--
1.9.1

2017-04-03 21:44:22

by Shivappa Vikas

[permalink] [raw]
Subject: [PATCH 1/3] x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid

Each resctrl directory has one CLOSid allocated which is mapped to a
control register/QOS_MSR. During an rmdir this CLOSid is freed and can
be reused later when a new directory is created. Currently we do not
reset the QOS_MSR to a default when the CLOSid is freed. So when the
next mkdir uses a freed CLOSid, the new directory is associated with a
stale QOS_MSR.

Fix this issue by writing a default value to the QOS_MSR when the
associated CLOSid is freed. The default value is all 1s for CBM which
means no control is enforced when a mkdir reuses this CLOSid.

Signed-off-by: Vikas Shivappa <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 9ac2a5c..77e88c0 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -780,7 +780,7 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type,
return dentry;
}

-static int reset_all_cbms(struct rdt_resource *r)
+static int reset_all_ctrls(struct rdt_resource *r, u32 closid, u32 count)
{
struct msr_param msr_param;
cpumask_var_t cpu_mask;
@@ -791,8 +791,8 @@ static int reset_all_cbms(struct rdt_resource *r)
return -ENOMEM;

msr_param.res = r;
- msr_param.low = 0;
- msr_param.high = r->num_closid;
+ msr_param.low = closid;
+ msr_param.high = closid + count;

/*
* Disable resource control for this resource by setting all
@@ -802,7 +802,7 @@ static int reset_all_cbms(struct rdt_resource *r)
list_for_each_entry(d, &r->domains, list) {
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);

- for (i = 0; i < r->num_closid; i++)
+ for (i = closid; i < closid + count; i++)
d->cbm[i] = r->max_cbm;
}
cpu = get_cpu();
@@ -896,7 +896,7 @@ static void rdt_kill_sb(struct super_block *sb)

/*Put everything back to default values. */
for_each_enabled_rdt_resource(r)
- reset_all_cbms(r);
+ reset_all_ctrls(r, 0, r->num_closid);
cdp_disable();
rmdir_all_sub();
static_branch_disable(&rdt_enable_key);
@@ -991,6 +991,7 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)
{
int ret, cpu, closid = rdtgroup_default.closid;
struct rdtgroup *rdtgrp;
+ struct rdt_resource *r;
cpumask_var_t tmpmask;

if (!zalloc_cpumask_var(&tmpmask, GFP_KERNEL))
@@ -1019,6 +1020,13 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)
cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
rdt_update_closid(tmpmask, NULL);

+ /*
+ * Put domain control values back to default for the
+ * rdtgrp thats being removed.
+ */
+ for_each_enabled_rdt_resource(r)
+ reset_all_ctrls(r, rdtgrp->closid, 1);
+
rdtgrp->flags = RDT_DELETED;
closid_free(rdtgrp->closid);
list_del(&rdtgrp->rdtgroup_list);
--
1.9.1

2017-04-03 21:43:56

by Shivappa Vikas

[permalink] [raw]
Subject: [PATCH 3/3] x86/intel_rdt: Update schemata read to show data in tabular format

The schemata file would display data from different resources on all
domains. Its cumbersome to read since they are not tabular and
data/names could be of different widths. Make the schemata file to
display data in a tabular format thereby making it nice and simple to
read.

Signed-off-by: Vikas Shivappa <[email protected]>
---
arch/x86/include/asm/intel_rdt.h | 4 ++++
arch/x86/kernel/cpu/intel_rdt.c | 30 ++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/intel_rdt_schemata.c | 5 +++--
3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index d770527..3f31399 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -40,6 +40,8 @@ struct rdtgroup {
/* List of all resource groups */
extern struct list_head rdt_all_groups;

+extern int max_name_width, max_data_width;
+
int __init rdtgroup_init(void);

/**
@@ -73,6 +75,7 @@ struct rftype {
* @name: Name to use in "schemata" file
* @num_closid: Number of CLOSIDs available
* @max_cbm: Largest Cache Bit Mask allowed
+ * @data_width: Character width of data when displaying
* @min_cbm_bits: Minimum number of consecutive bits to be set
* in a cache bit mask
* @domains: All domains for this resource
@@ -90,6 +93,7 @@ struct rdt_resource {
int cbm_len;
int min_cbm_bits;
u32 max_cbm;
+ int data_width;
struct list_head domains;
int msr_base;
int cache_level;
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 329b887..70a3307 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -39,6 +39,12 @@

#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].domains)

+/*
+ * Used to store the max resource name width and max resource data width
+ * to display the schemata in a tabular format
+ */
+int max_name_width, max_data_width;
+
struct rdt_resource rdt_resources_all[] = {
{
.name = "L3",
@@ -140,6 +146,7 @@ static void rdt_get_config(int idx, struct rdt_resource *r)
r->num_closid = edx.split.cos_max + 1;
r->cbm_len = eax.split.cbm_len + 1;
r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
+ r->data_width = (r->cbm_len + 3) / 4;
r->capable = true;
r->enabled = true;
}
@@ -152,6 +159,7 @@ static void rdt_get_cdp_l3_config(int type)
r->num_closid = r_l3->num_closid / 2;
r->cbm_len = r_l3->cbm_len;
r->max_cbm = r_l3->max_cbm;
+ r->data_width = (r->cbm_len + 3) / 4;
r->capable = true;
/*
* By default, CDP is disabled. CDP can be enabled by mount parameter
@@ -160,6 +168,26 @@ static void rdt_get_cdp_l3_config(int type)
r->enabled = false;
}

+/**
+ * Choose a width for the resource name
+ * and resource data based on the resource that has
+ * widest name and cbm.
+ */
+static void rdt_init_padding(void)
+{
+ struct rdt_resource *r;
+ int cl;
+
+ for_each_enabled_rdt_resource(r) {
+ cl = strlen(r->name);
+ if (cl > max_name_width)
+ max_name_width = cl;
+
+ if (r->data_width > max_data_width)
+ max_data_width = r->data_width;
+ }
+}
+
static inline bool get_rdt_resources(void)
{
bool ret = false;
@@ -184,6 +212,8 @@ static inline bool get_rdt_resources(void)
ret = true;
}

+ rdt_init_padding();
+
return ret;
}

diff --git a/arch/x86/kernel/cpu/intel_rdt_schemata.c b/arch/x86/kernel/cpu/intel_rdt_schemata.c
index 52e83ea..8594db4 100644
--- a/arch/x86/kernel/cpu/intel_rdt_schemata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_schemata.c
@@ -203,11 +203,12 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
struct rdt_domain *dom;
bool sep = false;

- seq_printf(s, "%s:", r->name);
+ seq_printf(s, "%*s:", max_name_width, r->name);
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_puts(s, ";");
- seq_printf(s, "%d=%x", dom->id, dom->cbm[closid]);
+ seq_printf(s, "%d=%0*x", dom->id, max_data_width,
+ dom->cbm[closid]);
sep = true;
}
seq_puts(s, "\n");
--
1.9.1

2017-04-05 15:20:32

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid

On Mon, 3 Apr 2017, Vikas Shivappa wrote:

> Subject: x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid

This subject line is useless again. It want's to be descriptive.

"Fix issue" Which issue?

> Each resctrl directory has one CLOSid allocated which is mapped to a
> control register/QOS_MSR. During an rmdir this CLOSid is freed and can
> be reused later when a new directory is created. Currently we do not
> reset the QOS_MSR to a default when the CLOSid is freed. So when the
> next mkdir uses a freed CLOSid, the new directory is associated with a
> stale QOS_MSR.

That's not an issue. That's maybe something people are not expecting.

> Fix this issue by writing a default value to the QOS_MSR when the
> associated CLOSid is freed. The default value is all 1s for CBM which
> means no control is enforced when a mkdir reuses this CLOSid.

That's just wrong.

The proper behaviour for a new control group is, that at the time when it
is created it copies the CBM values of the default group and not claiming
access to ALL of the cache by default.

Thanks,

tglx


Subject: [tip:x86/cpu] x86/intel_rdt: Implement "update" mode when writing schemata file

Commit-ID: c4026b7b95a4b852e404afa2cd7720866159d118
Gitweb: http://git.kernel.org/tip/c4026b7b95a4b852e404afa2cd7720866159d118
Author: Tony Luck <[email protected]>
AuthorDate: Mon, 3 Apr 2017 14:44:16 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Wed, 5 Apr 2017 17:22:31 +0200

x86/intel_rdt: Implement "update" mode when writing schemata file

The schemata file can have multiple lines and it is cumbersome to update
all lines.

Remove code that requires that the user provides values for every resource
(in the right order). If the user provides values for just a few
resources, update them and leave the rest unchanged.

Side benefit: we now check which values were updated and only send IPIs to
cpus that actually have updates.

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Vikas Shivappa <[email protected]>
Tested-by: Sai Praneeth Prakhya <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
Documentation/x86/intel_rdt_ui.txt | 14 ++++++
arch/x86/include/asm/intel_rdt.h | 10 ++---
arch/x86/kernel/cpu/intel_rdt.c | 2 -
arch/x86/kernel/cpu/intel_rdt_schemata.c | 76 ++++++++++++++------------------
4 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index 51cf6fa..3ea1984 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -129,6 +129,20 @@ schemata format is always:

L2:<cache_id0>=<cbm>;<cache_id1>=<cbm>;...

+Reading/writing the schemata file
+---------------------------------
+Reading the schemata file will show the state of all resources
+on all domains. When writing you only need to specify those values
+which you wish to change. E.g.
+
+# cat schemata
+L3DATA:0=fffff;1=fffff;2=fffff;3=fffff
+L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+# echo "L3DATA:2=3c0;" > schemata
+# cat schemata
+L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
+L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+
Example 1
---------
On a two socket machine (one L3 cache per socket) with just four bits
diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index 0d64397..d770527 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -76,10 +76,7 @@ struct rftype {
* @min_cbm_bits: Minimum number of consecutive bits to be set
* in a cache bit mask
* @domains: All domains for this resource
- * @num_domains: Number of domains active
* @msr_base: Base MSR address for CBMs
- * @tmp_cbms: Scratch space when updating schemata
- * @num_tmp_cbms: Number of CBMs in tmp_cbms
* @cache_level: Which cache level defines scope of this domain
* @cbm_idx_multi: Multiplier of CBM index
* @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
@@ -94,10 +91,7 @@ struct rdt_resource {
int min_cbm_bits;
u32 max_cbm;
struct list_head domains;
- int num_domains;
int msr_base;
- u32 *tmp_cbms;
- int num_tmp_cbms;
int cache_level;
int cbm_idx_multi;
int cbm_idx_offset;
@@ -109,12 +103,16 @@ struct rdt_resource {
* @id: unique id for this instance
* @cpu_mask: which cpus share this resource
* @cbm: array of cache bit masks (indexed by CLOSID)
+ * @new_cbm: new cbm value to be loaded
+ * @have_new_cbm: did user provide new_cbm for this domain
*/
struct rdt_domain {
struct list_head list;
int id;
struct cpumask cpu_mask;
u32 *cbm;
+ u32 new_cbm;
+ bool have_new_cbm;
};

/**
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 5a533fe..329b887 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -309,7 +309,6 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)

cpumask_set_cpu(cpu, &d->cpu_mask);
list_add_tail(&d->list, add_pos);
- r->num_domains++;
}

static void domain_remove_cpu(int cpu, struct rdt_resource *r)
@@ -325,7 +324,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)

cpumask_clear_cpu(cpu, &d->cpu_mask);
if (cpumask_empty(&d->cpu_mask)) {
- r->num_domains--;
kfree(d->cbm);
list_del(&d->list);
kfree(d);
diff --git a/arch/x86/kernel/cpu/intel_rdt_schemata.c b/arch/x86/kernel/cpu/intel_rdt_schemata.c
index f369cb8..52e83ea 100644
--- a/arch/x86/kernel/cpu/intel_rdt_schemata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_schemata.c
@@ -56,17 +56,21 @@ static bool cbm_validate(unsigned long var, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-static int parse_cbm(char *buf, struct rdt_resource *r)
+static int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d)
{
unsigned long data;
int ret;

+ if (d->have_new_cbm)
+ return -EINVAL;
+
ret = kstrtoul(buf, 16, &data);
if (ret)
return ret;
if (!cbm_validate(data, r))
return -EINVAL;
- r->tmp_cbms[r->num_tmp_cbms++] = data;
+ d->new_cbm = data;
+ d->have_new_cbm = true;

return 0;
}
@@ -74,8 +78,8 @@ static int parse_cbm(char *buf, struct rdt_resource *r)
/*
* For each domain in this resource we expect to find a series of:
* id=mask
- * separated by ";". The "id" is in decimal, and must appear in the
- * right order.
+ * separated by ";". The "id" is in decimal, and must match one of
+ * the "id"s for this resource.
*/
static int parse_line(char *line, struct rdt_resource *r)
{
@@ -83,21 +87,21 @@ static int parse_line(char *line, struct rdt_resource *r)
struct rdt_domain *d;
unsigned long dom_id;

+next:
+ if (!line || line[0] == '\0')
+ return 0;
+ dom = strsep(&line, ";");
+ id = strsep(&dom, "=");
+ if (!dom || kstrtoul(id, 10, &dom_id))
+ return -EINVAL;
list_for_each_entry(d, &r->domains, list) {
- dom = strsep(&line, ";");
- if (!dom)
- return -EINVAL;
- id = strsep(&dom, "=");
- if (kstrtoul(id, 10, &dom_id) || dom_id != d->id)
- return -EINVAL;
- if (parse_cbm(dom, r))
- return -EINVAL;
+ if (d->id == dom_id) {
+ if (parse_cbm(dom, r, d))
+ return -EINVAL;
+ goto next;
+ }
}
-
- /* Any garbage at the end of the line? */
- if (line && line[0])
- return -EINVAL;
- return 0;
+ return -EINVAL;
}

static int update_domains(struct rdt_resource *r, int closid)
@@ -105,7 +109,7 @@ static int update_domains(struct rdt_resource *r, int closid)
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
- int cpu, idx = 0;
+ int cpu;

if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
@@ -115,9 +119,13 @@ static int update_domains(struct rdt_resource *r, int closid)
msr_param.res = r;

list_for_each_entry(d, &r->domains, list) {
- cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
- d->cbm[msr_param.low] = r->tmp_cbms[idx++];
+ if (d->have_new_cbm && d->new_cbm != d->cbm[closid]) {
+ cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
+ d->cbm[closid] = d->new_cbm;
+ }
}
+ if (cpumask_empty(cpu_mask))
+ goto done;
cpu = get_cpu();
/* Update CBM on this cpu if it's in cpu_mask. */
if (cpumask_test_cpu(cpu, cpu_mask))
@@ -126,6 +134,7 @@ static int update_domains(struct rdt_resource *r, int closid)
smp_call_function_many(cpu_mask, rdt_cbm_update, &msr_param, 1);
put_cpu();

+done:
free_cpumask_var(cpu_mask);

return 0;
@@ -135,10 +144,10 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct rdtgroup *rdtgrp;
+ struct rdt_domain *dom;
struct rdt_resource *r;
char *tok, *resname;
int closid, ret = 0;
- u32 *l3_cbms = NULL;

/* Valid input requires a trailing newline */
if (nbytes == 0 || buf[nbytes - 1] != '\n')
@@ -153,16 +162,9 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

closid = rdtgrp->closid;

- /* get scratch space to save all the masks while we validate input */
- for_each_enabled_rdt_resource(r) {
- r->tmp_cbms = kcalloc(r->num_domains, sizeof(*l3_cbms),
- GFP_KERNEL);
- if (!r->tmp_cbms) {
- ret = -ENOMEM;
- goto out;
- }
- r->num_tmp_cbms = 0;
- }
+ for_each_enabled_rdt_resource(r)
+ list_for_each_entry(dom, &r->domains, list)
+ dom->have_new_cbm = false;

while ((tok = strsep(&buf, "\n")) != NULL) {
resname = strsep(&tok, ":");
@@ -185,14 +187,6 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
}
}

- /* Did the parser find all the masks we need? */
- for_each_enabled_rdt_resource(r) {
- if (r->num_tmp_cbms != r->num_domains) {
- ret = -EINVAL;
- goto out;
- }
- }
-
for_each_enabled_rdt_resource(r) {
ret = update_domains(r, closid);
if (ret)
@@ -201,10 +195,6 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

out:
rdtgroup_kn_unlock(of->kn);
- for_each_enabled_rdt_resource(r) {
- kfree(r->tmp_cbms);
- r->tmp_cbms = NULL;
- }
return ret ?: nbytes;
}


Subject: [tip:x86/cpu] x86/intel_rdt: Update schemata read to show data in tabular format

Commit-ID: de016df88f23a5ab0cec3a8e05f6066388725b9e
Gitweb: http://git.kernel.org/tip/de016df88f23a5ab0cec3a8e05f6066388725b9e
Author: Vikas Shivappa <[email protected]>
AuthorDate: Mon, 3 Apr 2017 14:44:17 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Wed, 5 Apr 2017 17:22:31 +0200

x86/intel_rdt: Update schemata read to show data in tabular format

The schemata file displays data from different resources on all
domains. Its cumbersome to read since they are not tabular and data/names
could be of different widths. Make the schemata file to display data in a
tabular format thereby making it nice and simple to read.

Signed-off-by: Vikas Shivappa <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/intel_rdt.h | 4 ++++
arch/x86/kernel/cpu/intel_rdt.c | 30 ++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/intel_rdt_schemata.c | 5 +++--
3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/intel_rdt.h b/arch/x86/include/asm/intel_rdt.h
index d770527..3f31399 100644
--- a/arch/x86/include/asm/intel_rdt.h
+++ b/arch/x86/include/asm/intel_rdt.h
@@ -40,6 +40,8 @@ struct rdtgroup {
/* List of all resource groups */
extern struct list_head rdt_all_groups;

+extern int max_name_width, max_data_width;
+
int __init rdtgroup_init(void);

/**
@@ -73,6 +75,7 @@ struct rftype {
* @name: Name to use in "schemata" file
* @num_closid: Number of CLOSIDs available
* @max_cbm: Largest Cache Bit Mask allowed
+ * @data_width: Character width of data when displaying
* @min_cbm_bits: Minimum number of consecutive bits to be set
* in a cache bit mask
* @domains: All domains for this resource
@@ -90,6 +93,7 @@ struct rdt_resource {
int cbm_len;
int min_cbm_bits;
u32 max_cbm;
+ int data_width;
struct list_head domains;
int msr_base;
int cache_level;
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 329b887..70a3307 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -39,6 +39,12 @@ DEFINE_PER_CPU_READ_MOSTLY(int, cpu_closid);

#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].domains)

+/*
+ * Used to store the max resource name width and max resource data width
+ * to display the schemata in a tabular format
+ */
+int max_name_width, max_data_width;
+
struct rdt_resource rdt_resources_all[] = {
{
.name = "L3",
@@ -140,6 +146,7 @@ static void rdt_get_config(int idx, struct rdt_resource *r)
r->num_closid = edx.split.cos_max + 1;
r->cbm_len = eax.split.cbm_len + 1;
r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
+ r->data_width = (r->cbm_len + 3) / 4;
r->capable = true;
r->enabled = true;
}
@@ -152,6 +159,7 @@ static void rdt_get_cdp_l3_config(int type)
r->num_closid = r_l3->num_closid / 2;
r->cbm_len = r_l3->cbm_len;
r->max_cbm = r_l3->max_cbm;
+ r->data_width = (r->cbm_len + 3) / 4;
r->capable = true;
/*
* By default, CDP is disabled. CDP can be enabled by mount parameter
@@ -160,6 +168,26 @@ static void rdt_get_cdp_l3_config(int type)
r->enabled = false;
}

+/**
+ * Choose a width for the resource name
+ * and resource data based on the resource that has
+ * widest name and cbm.
+ */
+static void rdt_init_padding(void)
+{
+ struct rdt_resource *r;
+ int cl;
+
+ for_each_enabled_rdt_resource(r) {
+ cl = strlen(r->name);
+ if (cl > max_name_width)
+ max_name_width = cl;
+
+ if (r->data_width > max_data_width)
+ max_data_width = r->data_width;
+ }
+}
+
static inline bool get_rdt_resources(void)
{
bool ret = false;
@@ -184,6 +212,8 @@ static inline bool get_rdt_resources(void)
ret = true;
}

+ rdt_init_padding();
+
return ret;
}

diff --git a/arch/x86/kernel/cpu/intel_rdt_schemata.c b/arch/x86/kernel/cpu/intel_rdt_schemata.c
index 52e83ea..8594db4 100644
--- a/arch/x86/kernel/cpu/intel_rdt_schemata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_schemata.c
@@ -203,11 +203,12 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
struct rdt_domain *dom;
bool sep = false;

- seq_printf(s, "%s:", r->name);
+ seq_printf(s, "%*s:", max_name_width, r->name);
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_puts(s, ";");
- seq_printf(s, "%d=%x", dom->id, dom->cbm[closid]);
+ seq_printf(s, "%d=%0*x", dom->id, max_data_width,
+ dom->cbm[closid]);
sep = true;
}
seq_puts(s, "\n");

2017-04-05 18:02:43

by Shivappa Vikas

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid



On Wed, 5 Apr 2017, Thomas Gleixner wrote:

> On Mon, 3 Apr 2017, Vikas Shivappa wrote:
>
>> Subject: x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid
>
> This subject line is useless again. It want's to be descriptive.
>
> "Fix issue" Which issue?
>
>> Each resctrl directory has one CLOSid allocated which is mapped to a
>> control register/QOS_MSR. During an rmdir this CLOSid is freed and can
>> be reused later when a new directory is created. Currently we do not
>> reset the QOS_MSR to a default when the CLOSid is freed. So when the
>> next mkdir uses a freed CLOSid, the new directory is associated with a
>> stale QOS_MSR.
>
> That's not an issue. That's maybe something people are not expecting.
>
>> Fix this issue by writing a default value to the QOS_MSR when the
>> associated CLOSid is freed. The default value is all 1s for CBM which
>> means no control is enforced when a mkdir reuses this CLOSid.
>
> That's just wrong.
>
> The proper behaviour for a new control group is, that at the time when it
> is created it copies the CBM values of the default group and not claiming
> access to ALL of the cache by default.

The behaviour of the default/root group is having access to all cache. Because
we set the value of all CBMs to all 1s and then root group takes CLOS 0. This is
trying to set the same values for all newly created groups.

Thanks,
Vikas

>
> Thanks,
>
> tglx
>
>
>

2017-04-05 18:07:54

by Tony Luck

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid

On Wed, Apr 05, 2017 at 05:20:24PM +0200, Thomas Gleixner wrote:
> That's just wrong.
>
> The proper behaviour for a new control group is, that at the time when it
> is created it copies the CBM values of the default group and not claiming
> access to ALL of the cache by default.

I don't see that as any more helpful. When you make a new
control group it is because none of the existing groups
provides the QoS that you want. So the first thing the
user will do is write the schemata file with the values
they do want.

So "all access", or "same as default group" are both the
same to the user ... not what they want.

We do need to make sure that the schemata matches what is
in the registers. We need to make sure that changes to the
schemata file result in the MSRs being written where needed.

-Tony

2017-04-05 20:15:24

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid

On Wed, Apr 05, 2017 at 11:07:37AM -0700, Luck, Tony wrote:
> On Wed, Apr 05, 2017 at 05:20:24PM +0200, Thomas Gleixner wrote:
> > That's just wrong.
> >
> > The proper behaviour for a new control group is, that at the time when it
> > is created it copies the CBM values of the default group and not claiming
> > access to ALL of the cache by default.
>
> I don't see that as any more helpful. When you make a new
> control group it is because none of the existing groups
> provides the QoS that you want. So the first thing the
> user will do is write the schemata file with the values
> they do want.
>
> So "all access", or "same as default group" are both the
> same to the user ... not what they want.
>
> We do need to make sure that the schemata matches what is
> in the registers. We need to make sure that changes to the
> schemata file result in the MSRs being written where needed.

FYI. This behavior and a fix patch were discussed before:
http://lkml.org/lkml/2017/1/9/747

Thanks.

-Fenghua

2017-04-10 17:09:08

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid

On Wed, 5 Apr 2017, Luck, Tony wrote:
> On Wed, Apr 05, 2017 at 05:20:24PM +0200, Thomas Gleixner wrote:
> > That's just wrong.
> >
> > The proper behaviour for a new control group is, that at the time when it
> > is created it copies the CBM values of the default group and not claiming
> > access to ALL of the cache by default.
>
> I don't see that as any more helpful. When you make a new
> control group it is because none of the existing groups
> provides the QoS that you want. So the first thing the
> user will do is write the schemata file with the values
> they do want.
>
> So "all access", or "same as default group" are both the
> same to the user ... not what they want.
>
> We do need to make sure that the schemata matches what is
> in the registers. We need to make sure that changes to the
> schemata file result in the MSRs being written where needed.

That's true today. The MSRs and the schemata file of a newly created group
always match. So there's nothing to fix, right?

Thanks,

tglx


2017-04-11 18:51:11

by Shivappa Vikas

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/intel_rdt: Fix issue when mkdir uses a freed CLOSid



On Mon, 10 Apr 2017, Thomas Gleixner wrote:

> On Wed, 5 Apr 2017, Luck, Tony wrote:
>> On Wed, Apr 05, 2017 at 05:20:24PM +0200, Thomas Gleixner wrote:
>>> That's just wrong.
>>>
>>> The proper behaviour for a new control group is, that at the time when it
>>> is created it copies the CBM values of the default group and not claiming
>>> access to ALL of the cache by default.
>>
>> I don't see that as any more helpful. When you make a new
>> control group it is because none of the existing groups
>> provides the QoS that you want. So the first thing the
>> user will do is write the schemata file with the values
>> they do want.
>>
>> So "all access", or "same as default group" are both the
>> same to the user ... not what they want.
>>
>> We do need to make sure that the schemata matches what is
>> in the registers. We need to make sure that changes to the
>> schemata file result in the MSRs being written where needed.
>
> That's true today. The MSRs and the schemata file of a newly created group
> always match. So there's nothing to fix, right?

Yes. Upstream patch has the MSRs matching with what schemata shows. we can drop
this patch. I sent a new MBA version without this just on the tip which has the
other two patches.

Thanks,
Vikas

>
> Thanks,
>
> tglx
>
>
>