2020-04-30 17:06:30

by James Morse

[permalink] [raw]
Subject: [PATCH v2 00/10] x86/resctrl: Misc cleanup

Hello!

These are the miscellaneous cleanup patches that floated to the top of
the MPAM tree.

The only interesting thing are the patches to make the AMD/Intel
differences something resctrl understands, instead of just 'happening'
because of the different function pointers.
This will become more important once MPAM support is added. parse_bw()
and friends are what enforces resctrl's ABI resctrl. Allowing an
architecture/platform to provide a subtly different function here would
be bad for user-space.

MPAM would set arch_has_sparse_bitmaps, but not arch_needs_linear.

Since [v1], I've picked up all the review feedback and collected the
tags.

Nothing in this series should change any behaviour.
This series is based on v5.7-rc3.

Thanks,

James

[v1] https://lore.kernel.org/lkml/[email protected]/

James Morse (10):
x86/resctrl: Nothing uses struct mbm_state chunks_bw
x86/resctrl: Remove max_delay
x86/resctrl: Fix stale comment
x86/resctrl: use container_of() in delayed_work handlers
x86/resctrl: Include pid.h
x86/resctrl: Use is_closid_match() in more places
x86/resctrl: Add arch_needs_linear to explain AMD/Intel MBA difference
x86/resctrl: Merge AMD/Intel parse_bw() calls
x86/resctrl: Add arch_has_sparse_bitmaps to explain AMD/Intel CAT
difference
cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file

arch/x86/kernel/cpu/resctrl/core.c | 35 +++------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 89 +++--------------------
arch/x86/kernel/cpu/resctrl/internal.h | 19 ++---
arch/x86/kernel/cpu/resctrl/monitor.c | 15 +---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 32 ++++----
include/linux/cacheinfo.h | 21 ++++++
include/linux/resctrl.h | 2 +
7 files changed, 70 insertions(+), 143 deletions(-)

--
2.26.1


2020-04-30 17:07:03

by James Morse

[permalink] [raw]
Subject: [PATCH v2 04/10] x86/resctrl: use container_of() in delayed_work handlers

mbm_handle_overflow() and cqm_handle_limbo() are both provided with
the domain's work_struct when called, but use get_domain_from_cpu()
to find the domain, along with the appropriate error handling.

container_of() saves some list walking and bitmap testing, use that
instead.

Signed-off-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/monitor.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index af549df38ec6..a02a7f886a0a 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -476,19 +476,13 @@ void cqm_handle_limbo(struct work_struct *work)
mutex_lock(&rdtgroup_mutex);

r = &rdt_resources_all[RDT_RESOURCE_L3];
- d = get_domain_from_cpu(cpu, r);
-
- if (!d) {
- pr_warn_once("Failure to get domain for limbo worker\n");
- goto out_unlock;
- }
+ d = container_of(work, struct rdt_domain, cqm_limbo.work);

__check_limbo(d, false);

if (has_busy_rmid(r, d))
schedule_delayed_work_on(cpu, &d->cqm_limbo, delay);

-out_unlock:
mutex_unlock(&rdtgroup_mutex);
}

@@ -516,9 +510,7 @@ void mbm_handle_overflow(struct work_struct *work)
if (!static_branch_likely(&rdt_mon_enable_key))
goto out_unlock;

- d = get_domain_from_cpu(cpu, &rdt_resources_all[RDT_RESOURCE_L3]);
- if (!d)
- goto out_unlock;
+ d = container_of(work, struct rdt_domain, mbm_over.work);

list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
mbm_update(d, prgrp->mon.rmid);
--
2.26.1

2020-04-30 17:07:13

by James Morse

[permalink] [raw]
Subject: [PATCH v2 03/10] x86/resctrl: Fix stale comment

The comment in rdtgroup_init() refers to the non existent function
rdt_mount(), which has now been renamed rdt_get_tree(). Fix the
comment.

Signed-off-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5a359d9fcc05..9fe489904fc7 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3195,7 +3195,7 @@ int __init rdtgroup_init(void)
* It may also be ok since that would enable debugging of RDT before
* resctrl is mounted.
* The reason why the debugfs directory is created here and not in
- * rdt_mount() is because rdt_mount() takes rdtgroup_mutex and
+ * rdt_get_tree() is because rdt_get_tree() takes rdtgroup_mutex and
* during the debugfs directory creation also &sb->s_type->i_mutex_key
* (the lockdep class of inode->i_rwsem). Other filesystem
* interactions (eg. SyS_getdents) have the lock ordering:
--
2.26.1

2020-04-30 17:07:14

by James Morse

[permalink] [raw]
Subject: [PATCH v2 07/10] x86/resctrl: Add arch_needs_linear to explain AMD/Intel MBA difference

The configuration values user-space provides to the resctrl filesystem
are ABI. To make this work on another architecture we want to move all
the ABI bits out of /arch/x86 and under /fs.

To do this, the differences between AMD and Intel CPUs needs to be
explained to resctrl via resource properties, instead of function
pointers that let the arch code accept subtly different values on
different platforms/architectures.

For MBA, Intel CPUs reject configuration attempts for non-linear
resources, whereas AMD ignore this field as its MBA resource is never
linear. To merge the parse/validate functions we need to explain
this difference.

Add arch_needs_linear to indicate the arch code needs the linear
property to be true to configure this resource. AMD can set this
and delay_linear to false. Intel can set arch_needs_linear
to true to keep the existing "No support for non-linear MB domains"
error message for affected platforms.

CC: Babu Moger <[email protected]>
Signed-off-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>

---
An alternative to this is for Intel non-linear MBA resources to
clear alloc_capable as they can't be configured anyway.
---
arch/x86/kernel/cpu/resctrl/core.c | 3 +++
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 8 +++++++-
arch/x86/kernel/cpu/resctrl/internal.h | 2 ++
3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 9048f421af84..40186c54b05d 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -260,6 +260,7 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
r->num_closid = edx.split.cos_max + 1;
max_delay = eax.split.max_delay + 1;
r->default_ctrl = MAX_MBA_BW;
+ r->membw.arch_needs_linear = true;
if (ecx & MBA_IS_LINEAR) {
r->membw.delay_linear = true;
r->membw.min_bw = MAX_MBA_BW - max_delay;
@@ -267,6 +268,7 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
} else {
if (!rdt_get_mb_table(r))
return false;
+ r->membw.arch_needs_linear = false;
}
r->data_width = 3;

@@ -288,6 +290,7 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)

/* AMD does not use delay */
r->membw.delay_linear = false;
+ r->membw.arch_needs_linear = false;

r->membw.min_bw = 0;
r->membw.bw_gran = 1;
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 055c8613b531..db8e6c0cadb1 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -33,6 +33,12 @@ static bool bw_validate_amd(char *buf, unsigned long *data,
unsigned long bw;
int ret;

+ /* temporary: always false on AMD */
+ if (!r->membw.delay_linear && r->membw.arch_needs_linear) {
+ rdt_last_cmd_puts("No support for non-linear MB domains\n");
+ return false;
+ }
+
ret = kstrtoul(buf, 10, &bw);
if (ret) {
rdt_last_cmd_printf("Non-decimal digit in MB value %s\n", buf);
@@ -82,7 +88,7 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
/*
* Only linear delay values is supported for current Intel SKUs.
*/
- if (!r->membw.delay_linear) {
+ if (!r->membw.delay_linear && r->membw.arch_needs_linear) {
rdt_last_cmd_puts("No support for non-linear MB domains\n");
return false;
}
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 1ff6718307cf..215be9957acf 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -363,6 +363,7 @@ struct rdt_cache {
* struct rdt_membw - Memory bandwidth allocation related data
* @min_bw: Minimum memory bandwidth percentage user can request
* @bw_gran: Granularity at which the memory bandwidth is allocated
+ * @arch_needs_linear: True if we can't configure non-linear resources
* @delay_linear: True if memory B/W delay is in linear scale
* @mba_sc: True if MBA software controller(mba_sc) is enabled
* @mb_map: Mapping of memory B/W percentage to memory B/W delay
@@ -371,6 +372,7 @@ struct rdt_membw {
u32 min_bw;
u32 bw_gran;
u32 delay_linear;
+ bool arch_needs_linear;
bool mba_sc;
u32 *mb_map;
};
--
2.26.1

2020-04-30 17:07:57

by James Morse

[permalink] [raw]
Subject: [PATCH v2 08/10] x86/resctrl: Merge AMD/Intel parse_bw() calls

Now that we've explained arch_needs_linear to resctrl, the parse_bw()
calls are almost the same between AMD and Intel.

The difference is '!is_mba_sc()', which is not checked on AMD. This
will always be true on AMD CPUs as mba_sc cannot be enabled as
is_mba_linear() is false.

Removing this duplication means user-space visible behaviour and
error messages are not validated or generated in different places.

CC: Babu Moger <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 3 +-
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 57 +----------------------
arch/x86/kernel/cpu/resctrl/internal.h | 6 +--
3 files changed, 5 insertions(+), 61 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 40186c54b05d..7a6a6303fc05 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -168,6 +168,7 @@ struct rdt_resource rdt_resources_all[] = {
.name = "MB",
.domains = domain_init(RDT_RESOURCE_MBA),
.cache_level = 3,
+ .parse_ctrlval = parse_bw,
.format_str = "%d=%*u",
.fflags = RFTYPE_RES_MB,
},
@@ -926,7 +927,6 @@ static __init void rdt_init_res_defs_intel(void)
else if (r->rid == RDT_RESOURCE_MBA) {
r->msr_base = MSR_IA32_MBA_THRTL_BASE;
r->msr_update = mba_wrmsr_intel;
- r->parse_ctrlval = parse_bw_intel;
}
}
}
@@ -946,7 +946,6 @@ static __init void rdt_init_res_defs_amd(void)
else if (r->rid == RDT_RESOURCE_MBA) {
r->msr_base = MSR_IA32_MBA_BW_BASE;
r->msr_update = mba_wrmsr_amd;
- r->parse_ctrlval = parse_bw_amd;
}
}
}
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index db8e6c0cadb1..c983efddb537 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -21,59 +21,6 @@
#include <linux/slab.h>
#include "internal.h"

-/*
- * Check whether MBA bandwidth percentage value is correct. The value is
- * checked against the minimum and maximum bandwidth values specified by
- * the hardware. The allocated bandwidth percentage is rounded to the next
- * control step available on the hardware.
- */
-static bool bw_validate_amd(char *buf, unsigned long *data,
- struct rdt_resource *r)
-{
- unsigned long bw;
- int ret;
-
- /* temporary: always false on AMD */
- if (!r->membw.delay_linear && r->membw.arch_needs_linear) {
- rdt_last_cmd_puts("No support for non-linear MB domains\n");
- return false;
- }
-
- ret = kstrtoul(buf, 10, &bw);
- if (ret) {
- rdt_last_cmd_printf("Non-decimal digit in MB value %s\n", buf);
- return false;
- }
-
- if (bw < r->membw.min_bw || bw > r->default_ctrl) {
- rdt_last_cmd_printf("MB value %ld out of range [%d,%d]\n", bw,
- r->membw.min_bw, r->default_ctrl);
- return false;
- }
-
- *data = roundup(bw, (unsigned long)r->membw.bw_gran);
- return true;
-}
-
-int parse_bw_amd(struct rdt_parse_data *data, struct rdt_resource *r,
- struct rdt_domain *d)
-{
- unsigned long bw_val;
-
- if (d->have_new_ctrl) {
- rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
- return -EINVAL;
- }
-
- if (!bw_validate_amd(data->buf, &bw_val, r))
- return -EINVAL;
-
- d->new_ctrl = bw_val;
- d->have_new_ctrl = true;
-
- return 0;
-}
-
/*
* Check whether MBA bandwidth percentage value is correct. The value is
* checked against the minimum and max bandwidth values specified by the
@@ -110,8 +57,8 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
return true;
}

-int parse_bw_intel(struct rdt_parse_data *data, struct rdt_resource *r,
- struct rdt_domain *d)
+int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
+ struct rdt_domain *d)
{
unsigned long bw_val;

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 215be9957acf..c8d84aec62d3 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -462,10 +462,8 @@ struct rdt_resource {

int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
struct rdt_domain *d);
-int parse_bw_intel(struct rdt_parse_data *data, struct rdt_resource *r,
- struct rdt_domain *d);
-int parse_bw_amd(struct rdt_parse_data *data, struct rdt_resource *r,
- struct rdt_domain *d);
+int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
+ struct rdt_domain *d);

extern struct mutex rdtgroup_mutex;

--
2.26.1

2020-04-30 17:08:35

by James Morse

[permalink] [raw]
Subject: [PATCH v2 06/10] x86/resctrl: Use is_closid_match() in more places

rdtgroup_tasks_assigned() and show_rdt_tasks() loop over threads testing
for a CTRL/MON group match by closid/rmid with the provided rdtgrp.
Further down the file are helpers to do this, move these further up and
make use of them here.

These helpers aditionally check for alloc/mon capable. This is harmless
as rdtgroup_mkdir() tests these capable flags before allowing the config
directories to be created.

Signed-off-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 30 ++++++++++++--------------
1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9fe489904fc7..ffeb31918364 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -592,6 +592,18 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
return ret;
}

+static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
+{
+ return (rdt_alloc_capable &&
+ (r->type == RDTCTRL_GROUP) && (t->closid == r->closid));
+}
+
+static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
+{
+ return (rdt_mon_capable &&
+ (r->type == RDTMON_GROUP) && (t->rmid == r->mon.rmid));
+}
+
/**
* rdtgroup_tasks_assigned - Test if tasks have been assigned to resource group
* @r: Resource group
@@ -607,8 +619,7 @@ int rdtgroup_tasks_assigned(struct rdtgroup *r)

rcu_read_lock();
for_each_process_thread(p, t) {
- if ((r->type == RDTCTRL_GROUP && t->closid == r->closid) ||
- (r->type == RDTMON_GROUP && t->rmid == r->mon.rmid)) {
+ if (is_closid_match(t, r) || is_rmid_match(t, r)) {
ret = 1;
break;
}
@@ -706,8 +717,7 @@ static void show_rdt_tasks(struct rdtgroup *r, struct seq_file *s)

rcu_read_lock();
for_each_process_thread(p, t) {
- if ((r->type == RDTCTRL_GROUP && t->closid == r->closid) ||
- (r->type == RDTMON_GROUP && t->rmid == r->mon.rmid))
+ if (is_closid_match(t, r) || is_rmid_match(t, r))
seq_printf(s, "%d\n", t->pid);
}
rcu_read_unlock();
@@ -2244,18 +2254,6 @@ static int reset_all_ctrls(struct rdt_resource *r)
return 0;
}

-static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
-{
- return (rdt_alloc_capable &&
- (r->type == RDTCTRL_GROUP) && (t->closid == r->closid));
-}
-
-static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
-{
- return (rdt_mon_capable &&
- (r->type == RDTMON_GROUP) && (t->rmid == r->mon.rmid));
-}
-
/*
* Move tasks from one to the other group. If @from is NULL, then all tasks
* in the systems are moved unconditionally (used for teardown).
--
2.26.1

2020-04-30 17:08:35

by James Morse

[permalink] [raw]
Subject: [PATCH v2 01/10] x86/resctrl: Nothing uses struct mbm_state chunks_bw

Nothing reads struct mbm_states's chunks_bw value, its a copy of
chunks. Remove it.

Signed-off-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/internal.h | 2 --
arch/x86/kernel/cpu/resctrl/monitor.c | 3 +--
2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 3dd13f3a8b23..6a3e998753b7 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -275,7 +275,6 @@ struct rftype {
* struct mbm_state - status for each MBM counter in each domain
* @chunks: Total data moved (multiply by rdt_group.mon_scale to get bytes)
* @prev_msr Value of IA32_QM_CTR for this RMID last time we read it
- * @chunks_bw Total local data moved. Used for bandwidth calculation
* @prev_bw_msr:Value of previous IA32_QM_CTR for bandwidth counting
* @prev_bw The most recent bandwidth in MBps
* @delta_bw Difference between the current and previous bandwidth
@@ -284,7 +283,6 @@ struct rftype {
struct mbm_state {
u64 chunks;
u64 prev_msr;
- u64 chunks_bw;
u64 prev_bw_msr;
u32 prev_bw;
u32 delta_bw;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 773124b0e18a..af549df38ec6 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -279,8 +279,7 @@ static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
return;

chunks = mbm_overflow_count(m->prev_bw_msr, tval);
- m->chunks_bw += chunks;
- m->chunks = m->chunks_bw;
+ m->chunks += chunks;
cur_bw = (chunks * r->mon_scale) >> 20;

if (m->delta_comp)
--
2.26.1

2020-04-30 17:08:43

by James Morse

[permalink] [raw]
Subject: [PATCH v2 02/10] x86/resctrl: Remove max_delay

max_delay is used by x86's __get_mem_config_intel() as a local variable.
Remove it, replacing it with a local variable.

Signed-off-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 8 ++++----
arch/x86/kernel/cpu/resctrl/internal.h | 3 ---
2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index d8cc5223b7ce..9048f421af84 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -254,16 +254,16 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
{
union cpuid_0x10_3_eax eax;
union cpuid_0x10_x_edx edx;
- u32 ebx, ecx;
+ u32 ebx, ecx, max_delay;

cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
r->num_closid = edx.split.cos_max + 1;
- r->membw.max_delay = eax.split.max_delay + 1;
+ max_delay = eax.split.max_delay + 1;
r->default_ctrl = MAX_MBA_BW;
if (ecx & MBA_IS_LINEAR) {
r->membw.delay_linear = true;
- r->membw.min_bw = MAX_MBA_BW - r->membw.max_delay;
- r->membw.bw_gran = MAX_MBA_BW - r->membw.max_delay;
+ r->membw.min_bw = MAX_MBA_BW - max_delay;
+ r->membw.bw_gran = MAX_MBA_BW - max_delay;
} else {
if (!rdt_get_mb_table(r))
return false;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6a3e998753b7..1ff6718307cf 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -361,8 +361,6 @@ struct rdt_cache {

/**
* struct rdt_membw - Memory bandwidth allocation related data
- * @max_delay: Max throttle delay. Delay is the hardware
- * representation for memory bandwidth.
* @min_bw: Minimum memory bandwidth percentage user can request
* @bw_gran: Granularity at which the memory bandwidth is allocated
* @delay_linear: True if memory B/W delay is in linear scale
@@ -370,7 +368,6 @@ struct rdt_cache {
* @mb_map: Mapping of memory B/W percentage to memory B/W delay
*/
struct rdt_membw {
- u32 max_delay;
u32 min_bw;
u32 bw_gran;
u32 delay_linear;
--
2.26.1

2020-04-30 17:08:52

by James Morse

[permalink] [raw]
Subject: [PATCH v2 05/10] x86/resctrl: Include pid.h

We are about to disturb the header soup. This header uses struct pid
and struct pid_namespace. Include their header.

Signed-off-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
include/linux/resctrl.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index daf5cf64c6a6..9b05af9b3e28 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -2,6 +2,8 @@
#ifndef _RESCTRL_H
#define _RESCTRL_H

+#include <linux/pid.h>
+
#ifdef CONFIG_PROC_CPU_RESCTRL

int proc_resctrl_show(struct seq_file *m,
--
2.26.1

2020-04-30 17:09:16

by James Morse

[permalink] [raw]
Subject: [PATCH v2 09/10] x86/resctrl: Add arch_has_sparse_bitmaps to explain AMD/Intel CAT difference

Intel expects the cache bitmap provided by user-space to have on a
single span of 1s, whereas AMD can support bitmaps like 0xf00f.
Arm's MPAM support also allows sparse bitmaps.

To move resctrl out to /fs/ we need to explain platform differences
like this. Add a resource property arch_has_sparse_bitmaps. Test this
around the 'non-consecutive' test in cbm_validate().

Merging the validate calls causes AMD top gain the min_cbm_bits test
needed for Haswell, but as it always sets this value to 1, it will
never match.

CC: Babu Moger <[email protected]>
Signed-off-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 4 +--
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 36 +++++------------------
arch/x86/kernel/cpu/resctrl/internal.h | 6 ++--
3 files changed, 12 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 7a6a6303fc05..00c88f42742c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -923,7 +923,7 @@ static __init void rdt_init_res_defs_intel(void)
r->rid == RDT_RESOURCE_L2 ||
r->rid == RDT_RESOURCE_L2DATA ||
r->rid == RDT_RESOURCE_L2CODE)
- r->cbm_validate = cbm_validate_intel;
+ r->cache.arch_has_sparse_bitmaps = false;
else if (r->rid == RDT_RESOURCE_MBA) {
r->msr_base = MSR_IA32_MBA_THRTL_BASE;
r->msr_update = mba_wrmsr_intel;
@@ -942,7 +942,7 @@ static __init void rdt_init_res_defs_amd(void)
r->rid == RDT_RESOURCE_L2 ||
r->rid == RDT_RESOURCE_L2DATA ||
r->rid == RDT_RESOURCE_L2CODE)
- r->cbm_validate = cbm_validate_amd;
+ r->cache.arch_has_sparse_bitmaps = true;
else if (r->rid == RDT_RESOURCE_MBA) {
r->msr_base = MSR_IA32_MBA_BW_BASE;
r->msr_update = mba_wrmsr_amd;
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index c983efddb537..cca2dfbadf3b 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -76,12 +76,14 @@ int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
}

/*
- * Check whether a cache bit mask is valid. The SDM says:
+ * Check whether a cache bit mask is valid.
+ * For Intel the SDM says:
* Please note that all (and only) contiguous '1' combinations
* are allowed (e.g. FFFFH, 0FF0H, 003CH, etc.).
* Additionally Haswell requires at least two bits set.
+ * AMD allows non-contiguous bitmasks.
*/
-bool cbm_validate_intel(char *buf, u32 *data, struct rdt_resource *r)
+static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
{
unsigned long first_bit, zero_bit, val;
unsigned int cbm_len = r->cache.cbm_len;
@@ -101,7 +103,9 @@ bool cbm_validate_intel(char *buf, u32 *data, struct rdt_resource *r)
first_bit = find_first_bit(&val, cbm_len);
zero_bit = find_next_zero_bit(&val, cbm_len, first_bit);

- if (find_next_bit(&val, cbm_len, zero_bit) < cbm_len) {
+ /* Are non-contiguous bitmaps allowed? */
+ if (!r->cache.arch_has_sparse_bitmaps &&
+ (find_next_bit(&val, cbm_len, zero_bit) < cbm_len)) {
rdt_last_cmd_printf("The mask %lx has non-consecutive 1-bits\n", val);
return false;
}
@@ -116,30 +120,6 @@ bool cbm_validate_intel(char *buf, u32 *data, struct rdt_resource *r)
return true;
}

-/*
- * Check whether a cache bit mask is valid. AMD allows non-contiguous
- * bitmasks
- */
-bool cbm_validate_amd(char *buf, u32 *data, struct rdt_resource *r)
-{
- unsigned long val;
- int ret;
-
- ret = kstrtoul(buf, 16, &val);
- if (ret) {
- rdt_last_cmd_printf("Non-hex character in the mask %s\n", buf);
- return false;
- }
-
- if (val > r->default_ctrl) {
- rdt_last_cmd_puts("Mask out of range\n");
- return false;
- }
-
- *data = val;
- return true;
-}
-
/*
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
@@ -165,7 +145,7 @@ int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
return -EINVAL;
}

- if (!r->cbm_validate(data->buf, &cbm_val, r))
+ if (!cbm_validate(data->buf, &cbm_val, r))
return -EINVAL;

if ((rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c8d84aec62d3..290e7f96d0db 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -350,6 +350,7 @@ struct msr_param {
* in a cache bit mask
* @shareable_bits: Bitmask of shareable resource with other
* executing entities
+ * @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid.
*/
struct rdt_cache {
unsigned int cbm_len;
@@ -357,6 +358,7 @@ struct rdt_cache {
unsigned int cbm_idx_mult;
unsigned int cbm_idx_offset;
unsigned int shareable_bits;
+ bool arch_has_sparse_bitmaps;
};

/**
@@ -426,7 +428,6 @@ struct rdt_parse_data {
* @cache: Cache allocation related data
* @format_str: Per resource format string to show domain value
* @parse_ctrlval: Per resource function pointer to parse control values
- * @cbm_validate Cache bitmask validate function
* @evt_list: List of monitoring events
* @num_rmid: Number of RMIDs available
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
@@ -453,7 +454,6 @@ struct rdt_resource {
int (*parse_ctrlval)(struct rdt_parse_data *data,
struct rdt_resource *r,
struct rdt_domain *d);
- bool (*cbm_validate)(char *buf, u32 *data, struct rdt_resource *r);
struct list_head evt_list;
int num_rmid;
unsigned int mon_scale;
@@ -594,8 +594,6 @@ void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms);
void cqm_handle_limbo(struct work_struct *work);
bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
void __check_limbo(struct rdt_domain *d, bool force_free);
-bool cbm_validate_intel(char *buf, u32 *data, struct rdt_resource *r);
-bool cbm_validate_amd(char *buf, u32 *data, struct rdt_resource *r);
void rdt_domain_reconfigure_cdp(struct rdt_resource *r);

#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
--
2.26.1

2020-04-30 17:09:38

by James Morse

[permalink] [raw]
Subject: [PATCH v2 10/10] cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file

resctrl/core.c defines get_cache_id() for use in its cpu-hotplug
callbacks. This gets the id attribute of the cache at the corresponding
level of a cpu.

Later rework means this private function needs to be shared. Move
it to the header file.

The name conflicts with a different definition in intel_cacheinfo.c,
name it get_cpu_cacheinfo_id() to show its relation with
get_cpu_cacheinfo().

Now this is visible on other architectures, check the id attribute
has actually been set.

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 17 ++---------------
include/linux/cacheinfo.h | 21 +++++++++++++++++++++
2 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 00c88f42742c..f6458cefcac3 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -350,19 +350,6 @@ static void rdt_get_cdp_l2_config(void)
rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2CODE);
}

-static int get_cache_id(int cpu, int level)
-{
- struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);
- int i;
-
- for (i = 0; i < ci->num_leaves; i++) {
- if (ci->info_list[i].level == level)
- return ci->info_list[i].id;
- }
-
- return -1;
-}
-
static void
mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
@@ -560,7 +547,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
*/
static void domain_add_cpu(int cpu, struct rdt_resource *r)
{
- int id = get_cache_id(cpu, r->cache_level);
+ int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
struct list_head *add_pos = NULL;
struct rdt_domain *d;

@@ -606,7 +593,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)

static void domain_remove_cpu(int cpu, struct rdt_resource *r)
{
- int id = get_cache_id(cpu, r->cache_level);
+ int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
struct rdt_domain *d;

d = rdt_find_domain(r, id, NULL);
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 46b92cd61d0c..4f72b47973c3 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -3,6 +3,7 @@
#define _LINUX_CACHEINFO_H

#include <linux/bitops.h>
+#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/smp.h>

@@ -119,4 +120,24 @@ int acpi_find_last_cache_level(unsigned int cpu);

const struct attribute_group *cache_get_priv_group(struct cacheinfo *this_leaf);

+/*
+ * Get the id of the cache associated with @cpu at level @level.
+ * cpuhp lock must be held.
+ */
+static inline int get_cpu_cacheinfo_id(int cpu, int level)
+{
+ struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);
+ int i;
+
+ for (i = 0; i < ci->num_leaves; i++) {
+ if (ci->info_list[i].level == level) {
+ if (ci->info_list[i].attributes & CACHE_ID)
+ return ci->info_list[i].id;
+ return -1;
+ }
+ }
+
+ return -1;
+}
+
#endif /* _LINUX_CACHEINFO_H */
--
2.26.1

2020-05-11 18:17:05

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v2 09/10] x86/resctrl: Add arch_has_sparse_bitmaps to explain AMD/Intel CAT difference

Hi James,

On 4/30/2020 10:03 AM, James Morse wrote:
> Intel expects the cache bitmap provided by user-space to have on a
> single span of 1s, whereas AMD can support bitmaps like 0xf00f.
> Arm's MPAM support also allows sparse bitmaps.
>
> To move resctrl out to /fs/ we need to explain platform differences
> like this. Add a resource property arch_has_sparse_bitmaps. Test this
> around the 'non-consecutive' test in cbm_validate().
>
> Merging the validate calls causes AMD top gain the min_cbm_bits test
> needed for Haswell, but as it always sets this value to 1, it will
> never match.
>
> CC: Babu Moger <[email protected]>
> Signed-off-by: James Morse <[email protected]>
> Reviewed-by: Reinette Chatre <[email protected]>

The Intel bits do indeed look good to me but we should check the AMD
portion ... I peeked at the AMD spec [1] and found "If an L3_MASK_n
register is programmed with all 0’s, that COS will be prevented from
allocating any lines in the L3 cache" ... so AMD does allow bitmasks of
all 0's (Intel does not).

Does MPAM also allow all 0's? Perhaps "arch_has_sparse_bitmaps" can be
used to indicate that also?

Reinette
[1] https://developer.amd.com/wp-content/resources/56375.pdf



2020-05-11 20:43:51

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v2 06/10] x86/resctrl: Use is_closid_match() in more places

Hi James,

On 4/30/2020 10:03 AM, James Morse wrote:
> rdtgroup_tasks_assigned() and show_rdt_tasks() loop over threads testing
> for a CTRL/MON group match by closid/rmid with the provided rdtgrp.
> Further down the file are helpers to do this, move these further up and
> make use of them here.
>
> These helpers aditionally check for alloc/mon capable. This is harmless

A typo snuck in after the original version: aditionally->additionally

Reinette

2020-05-11 20:47:06

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v2 08/10] x86/resctrl: Merge AMD/Intel parse_bw() calls

Hi James,

On 4/30/2020 10:03 AM, James Morse wrote:
> Now that we've explained arch_needs_linear to resctrl, the parse_bw()
> calls are almost the same between AMD and Intel.
>
> The difference is '!is_mba_sc()', which is not checked on AMD. This
> will always be true on AMD CPUs as mba_sc cannot be enabled as
> is_mba_linear() is false.
>
> Removing this duplication means user-space visible behaviour and
> error messages are not validated or generated in different places.
>
> CC: Babu Moger <[email protected]>
> Signed-off-by: James Morse <[email protected]>
> ---

Thank you very much.

Reviewed-by: Reinette Chatre <[email protected]>

Reinette

2020-05-13 20:05:20

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v2 00/10] x86/resctrl: Misc cleanup

Hi James,
Patches all look good. Tested on AMD platform and working as expected.

Reviewed-by: Babu Moger <[email protected]>

Thanks

> -----Original Message-----
> From: James Morse <[email protected]>
> Sent: Thursday, April 30, 2020 12:04 PM
> To: [email protected]; [email protected]
> Cc: Fenghua Yu <[email protected]>; Reinette Chatre
> <[email protected]>; Thomas Gleixner <[email protected]>; Ingo
> Molnar <[email protected]>; Borislav Petkov <[email protected]>; H Peter Anvin
> <[email protected]>; Moger, Babu <[email protected]>; James Morse
> <[email protected]>
> Subject: [PATCH v2 00/10] x86/resctrl: Misc cleanup
>
> Hello!
>
> These are the miscellaneous cleanup patches that floated to the top of
> the MPAM tree.
>
> The only interesting thing are the patches to make the AMD/Intel
> differences something resctrl understands, instead of just 'happening'
> because of the different function pointers.
> This will become more important once MPAM support is added. parse_bw()
> and friends are what enforces resctrl's ABI resctrl. Allowing an
> architecture/platform to provide a subtly different function here would
> be bad for user-space.
>
> MPAM would set arch_has_sparse_bitmaps, but not arch_needs_linear.
>
> Since [v1], I've picked up all the review feedback and collected the
> tags.
>
> Nothing in this series should change any behaviour.
> This series is based on v5.7-rc3.
>
> Thanks,
>
> James
>
> [v1]
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.ker
> nel.org%2Flkml%2F20200214182401.39008-1-
> james.morse%40arm.com%2F&amp;data=02%7C01%7CBabu.Moger%40amd.c
> om%7C1d3c1ee27e204ebece3808d7ed288034%7C3dd8961fe4884e608e11a82d
> 994e183d%7C0%7C0%7C637238630922264101&amp;sdata=Gcfop8YPzsPFq8fk
> bgnR%2FZgQApnQYVbaCdRlhpE2YPE%3D&amp;reserved=0
>
> James Morse (10):
> x86/resctrl: Nothing uses struct mbm_state chunks_bw
> x86/resctrl: Remove max_delay
> x86/resctrl: Fix stale comment
> x86/resctrl: use container_of() in delayed_work handlers
> x86/resctrl: Include pid.h
> x86/resctrl: Use is_closid_match() in more places
> x86/resctrl: Add arch_needs_linear to explain AMD/Intel MBA difference
> x86/resctrl: Merge AMD/Intel parse_bw() calls
> x86/resctrl: Add arch_has_sparse_bitmaps to explain AMD/Intel CAT
> difference
> cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file
>
> arch/x86/kernel/cpu/resctrl/core.c | 35 +++------
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 89 +++--------------------
> arch/x86/kernel/cpu/resctrl/internal.h | 19 ++---
> arch/x86/kernel/cpu/resctrl/monitor.c | 15 +---
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 32 ++++----
> include/linux/cacheinfo.h | 21 ++++++
> include/linux/resctrl.h | 2 +
> 7 files changed, 70 insertions(+), 143 deletions(-)
>
> --
> 2.26.1

2020-05-13 20:06:54

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v2 10/10] cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file



> -----Original Message-----
> From: James Morse <[email protected]>
> Sent: Thursday, April 30, 2020 12:04 PM
> To: [email protected]; [email protected]
> Cc: Fenghua Yu <[email protected]>; Reinette Chatre
> <[email protected]>; Thomas Gleixner <[email protected]>; Ingo
> Molnar <[email protected]>; Borislav Petkov <[email protected]>; H Peter Anvin
> <[email protected]>; Moger, Babu <[email protected]>; James Morse
> <[email protected]>
> Subject: [PATCH v2 10/10] cacheinfo: Move resctrl's get_cache_id() to the
> cacheinfo header file
>
> resctrl/core.c defines get_cache_id() for use in its cpu-hotplug
> callbacks. This gets the id attribute of the cache at the corresponding
> level of a cpu.
>
> Later rework means this private function needs to be shared. Move
> it to the header file.
>
> The name conflicts with a different definition in intel_cacheinfo.c,
> name it get_cpu_cacheinfo_id() to show its relation with
> get_cpu_cacheinfo().
>
> Now this is visible on other architectures, check the id attribute
> has actually been set.
>
> Signed-off-by: James Morse <[email protected]>

Reviewed-by: Babu Moger <[email protected]>

> ---
> arch/x86/kernel/cpu/resctrl/core.c | 17 ++---------------
> include/linux/cacheinfo.h | 21 +++++++++++++++++++++
> 2 files changed, 23 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c
> b/arch/x86/kernel/cpu/resctrl/core.c
> index 00c88f42742c..f6458cefcac3 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -350,19 +350,6 @@ static void rdt_get_cdp_l2_config(void)
> rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2CODE);
> }
>
> -static int get_cache_id(int cpu, int level)
> -{
> - struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);
> - int i;
> -
> - for (i = 0; i < ci->num_leaves; i++) {
> - if (ci->info_list[i].level == level)
> - return ci->info_list[i].id;
> - }
> -
> - return -1;
> -}
> -
> static void
> mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct
> rdt_resource *r)
> {
> @@ -560,7 +547,7 @@ static int domain_setup_mon_state(struct rdt_resource
> *r, struct rdt_domain *d)
> */
> static void domain_add_cpu(int cpu, struct rdt_resource *r)
> {
> - int id = get_cache_id(cpu, r->cache_level);
> + int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
> struct list_head *add_pos = NULL;
> struct rdt_domain *d;
>
> @@ -606,7 +593,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource
> *r)
>
> static void domain_remove_cpu(int cpu, struct rdt_resource *r)
> {
> - int id = get_cache_id(cpu, r->cache_level);
> + int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
> struct rdt_domain *d;
>
> d = rdt_find_domain(r, id, NULL);
> diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
> index 46b92cd61d0c..4f72b47973c3 100644
> --- a/include/linux/cacheinfo.h
> +++ b/include/linux/cacheinfo.h
> @@ -3,6 +3,7 @@
> #define _LINUX_CACHEINFO_H
>
> #include <linux/bitops.h>
> +#include <linux/cpu.h>
> #include <linux/cpumask.h>
> #include <linux/smp.h>
>
> @@ -119,4 +120,24 @@ int acpi_find_last_cache_level(unsigned int cpu);
>
> const struct attribute_group *cache_get_priv_group(struct cacheinfo
> *this_leaf);
>
> +/*
> + * Get the id of the cache associated with @cpu at level @level.
> + * cpuhp lock must be held.
> + */
> +static inline int get_cpu_cacheinfo_id(int cpu, int level)
> +{
> + struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);
> + int i;
> +
> + for (i = 0; i < ci->num_leaves; i++) {
> + if (ci->info_list[i].level == level) {
> + if (ci->info_list[i].attributes & CACHE_ID)
> + return ci->info_list[i].id;
> + return -1;
> + }
> + }
> +
> + return -1;
> +}
> +
> #endif /* _LINUX_CACHEINFO_H */
> --
> 2.26.1

2020-05-13 20:56:15

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v2 09/10] x86/resctrl: Add arch_has_sparse_bitmaps to explain AMD/Intel CAT difference



> -----Original Message-----
> From: Reinette Chatre <[email protected]>
> Sent: Monday, May 11, 2020 1:15 PM
> To: James Morse <[email protected]>; [email protected]; linux-
> [email protected]
> Cc: Fenghua Yu <[email protected]>; Thomas Gleixner
> <[email protected]>; Ingo Molnar <[email protected]>; Borislav Petkov
> <[email protected]>; H Peter Anvin <[email protected]>; Moger, Babu
> <[email protected]>
> Subject: Re: [PATCH v2 09/10] x86/resctrl: Add arch_has_sparse_bitmaps to
> explain AMD/Intel CAT difference
>
> Hi James,
>
> On 4/30/2020 10:03 AM, James Morse wrote:
> > Intel expects the cache bitmap provided by user-space to have on a
> > single span of 1s, whereas AMD can support bitmaps like 0xf00f.
> > Arm's MPAM support also allows sparse bitmaps.
> >
> > To move resctrl out to /fs/ we need to explain platform differences
> > like this. Add a resource property arch_has_sparse_bitmaps. Test this
> > around the 'non-consecutive' test in cbm_validate().
> >
> > Merging the validate calls causes AMD top gain the min_cbm_bits test
> > needed for Haswell, but as it always sets this value to 1, it will
> > never match.
> >
> > CC: Babu Moger <[email protected]>
> > Signed-off-by: James Morse <[email protected]>
> > Reviewed-by: Reinette Chatre <[email protected]>
>
> The Intel bits do indeed look good to me but we should check the AMD
> portion ... I peeked at the AMD spec [1] and found "If an L3_MASK_n
> register is programmed with all 0’s, that COS will be prevented from
> allocating any lines in the L3 cache" ... so AMD does allow bitmasks of
> all 0's (Intel does not).
>
> Does MPAM also allow all 0's? Perhaps "arch_has_sparse_bitmaps" can be
> used to indicate that also?

That is right. AMD allows L3 mask be all 0s. I will be great if this
property can be indicated it here. Thanks

2020-05-14 15:32:34

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] x86/resctrl: Misc cleanup

Hi James,

On 4/30/2020 10:03 AM, James Morse wrote:
> Hello!
>
> These are the miscellaneous cleanup patches that floated to the top of
> the MPAM tree.
>
> The only interesting thing are the patches to make the AMD/Intel
> differences something resctrl understands, instead of just 'happening'
> because of the different function pointers.
> This will become more important once MPAM support is added. parse_bw()
> and friends are what enforces resctrl's ABI resctrl. Allowing an
> architecture/platform to provide a subtly different function here would
> be bad for user-space.
>
> MPAM would set arch_has_sparse_bitmaps, but not arch_needs_linear.
>
> Since [v1], I've picked up all the review feedback and collected the
> tags.
>
> Nothing in this series should change any behaviour.
> This series is based on v5.7-rc3.

Please note that there are currently some resctrl changes in branch
x86/cache of the tip repo that are queued for inclusion into v5.8 that
have a few conflicts with this series. When you resubmit it may be
helpful if this series is based on that instead.

Reinette




2020-05-15 18:24:14

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v2 09/10] x86/resctrl: Add arch_has_sparse_bitmaps to explain AMD/Intel CAT difference

Hi guys,

On 13/05/2020 21:03, Babu Moger wrote:
>> From: Reinette Chatre <[email protected]>
>> On 4/30/2020 10:03 AM, James Morse wrote:
>>> Intel expects the cache bitmap provided by user-space to have on a
>>> single span of 1s, whereas AMD can support bitmaps like 0xf00f.
>>> Arm's MPAM support also allows sparse bitmaps.
>>>
>>> To move resctrl out to /fs/ we need to explain platform differences
>>> like this. Add a resource property arch_has_sparse_bitmaps. Test this
>>> around the 'non-consecutive' test in cbm_validate().
>>>
>>> Merging the validate calls causes AMD top gain the min_cbm_bits test

top -> to

>>> needed for Haswell, but as it always sets this value to 1, it will
>>> never match.
>>>
>>> CC: Babu Moger <[email protected]>
>>> Signed-off-by: James Morse <[email protected]>
>>> Reviewed-by: Reinette Chatre <[email protected]>
>>
>> The Intel bits do indeed look good to me but we should check the AMD
>> portion ... I peeked at the AMD spec [1] and found "If an L3_MASK_n
>> register is programmed with all 0’s, that COS will be prevented from
>> allocating any lines in the L3 cache" ... so AMD does allow bitmasks of
>> all 0's (Intel does not).
>>
>> Does MPAM also allow all 0's? Perhaps "arch_has_sparse_bitmaps" can be
>> used to indicate that also?

It does...


> That is right. AMD allows L3 mask be all 0s. I will be great if this
> property can be indicated it here. Thanks

Ah, this is a bug I didn't intend to introduce!

Intel has:
|       if (val == 0 || val > r->default_ctrl) {

Whereas AMD:
|       if (val > r->default_ctrl) {


So this empty bitmap is permitted today with resctrl. (and, its another try-it-and-see).

I'll add an 'arch_has_empty_bitmaps', I don't think overloading 'sparse' to mean 'sparse
and empty' is a good idea.


Thanks!

James

2020-05-15 19:17:49

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v2 09/10] x86/resctrl: Add arch_has_sparse_bitmaps to explain AMD/Intel CAT difference

> -----Original Message-----
> From: James Morse <[email protected]>
> Sent: Friday, May 15, 2020 1:22 PM
> To: Moger, Babu <[email protected]>; Reinette Chatre
> <[email protected]>
> Cc: [email protected]; [email protected]; Fenghua Yu
> <[email protected]>; Thomas Gleixner <[email protected]>; Ingo Molnar
> <[email protected]>; Borislav Petkov <[email protected]>; H Peter Anvin
> <[email protected]>
> Subject: Re: [PATCH v2 09/10] x86/resctrl: Add arch_has_sparse_bitmaps to
> explain AMD/Intel CAT difference
>
> Hi guys,
>
> On 13/05/2020 21:03, Babu Moger wrote:
> >> From: Reinette Chatre <[email protected]>
> >> On 4/30/2020 10:03 AM, James Morse wrote:
> >>> Intel expects the cache bitmap provided by user-space to have on a
> >>> single span of 1s, whereas AMD can support bitmaps like 0xf00f.
> >>> Arm's MPAM support also allows sparse bitmaps.
> >>>
> >>> To move resctrl out to /fs/ we need to explain platform differences
> >>> like this. Add a resource property arch_has_sparse_bitmaps. Test this
> >>> around the 'non-consecutive' test in cbm_validate().
> >>>
> >>> Merging the validate calls causes AMD top gain the min_cbm_bits test
>
> top -> to
>
> >>> needed for Haswell, but as it always sets this value to 1, it will
> >>> never match.
> >>>
> >>> CC: Babu Moger <[email protected]>
> >>> Signed-off-by: James Morse <[email protected]>
> >>> Reviewed-by: Reinette Chatre <[email protected]>
> >>
> >> The Intel bits do indeed look good to me but we should check the AMD
> >> portion ... I peeked at the AMD spec [1] and found "If an L3_MASK_n
> >> register is programmed with all 0’s, that COS will be prevented from
> >> allocating any lines in the L3 cache" ... so AMD does allow bitmasks of
> >> all 0's (Intel does not).
> >>
> >> Does MPAM also allow all 0's? Perhaps "arch_has_sparse_bitmaps" can be
> >> used to indicate that also?
>
> It does...
>
>
> > That is right. AMD allows L3 mask be all 0s. I will be great if this
> > property can be indicated it here. Thanks
>
> Ah, this is a bug I didn't intend to introduce!
>
> Intel has:
> |       if (val == 0 || val > r->default_ctrl) {
>
> Whereas AMD:
> |       if (val > r->default_ctrl) {
>
>
> So this empty bitmap is permitted today with resctrl. (and, its another try-it-and-
> see).
>
> I'll add an 'arch_has_empty_bitmaps', I don't think overloading 'sparse' to mean
> 'sparse
> and empty' is a good idea.

Sounds good to me. Thanks