2021-06-14 20:11:09

by James Morse

Subject: [PATCH v4 00/24] x86/resctrl: Merge the CDP resources

Hi folks,

Changes since v3: fixed the 'unsigned u32', removed a few spaces and fixed a
few spelling mistakes.
----

This series re-folds the resctrl code so that the behaviour of the CDP
resources (L3CODE et al) is all contained in the filesystem parts, with a
minimal amount of arch-specific code.

Arm have some CPU support for dividing caches into portions, and
applying bandwidth limits at various points in the SoC. The collective term
for these features is MPAM: Memory Partitioning and Monitoring.

MPAM is similar enough to Intel RDT that it should use the de facto Linux
interface: resctrl. This filesystem currently lives under arch/x86, and is
tightly coupled to the architecture.
Ultimately, my plan is to split the existing resctrl code up to have an
arch<->fs abstraction, then move all the bits out to fs/resctrl. From there
MPAM can be wired up.

x86 might have two resources with cache controls (L2 and L3), but has
extra copies for CDP: L{2,3}{CODE,DATA}, which are marked as enabled
if CDP is enabled for the corresponding cache.

MPAM has an equivalent feature to CDP, but it is a property of the CPU,
not the cache. Resctrl needs to have x86's odd/even behaviour, as that
is the ABI, but this isn't how the MPAM hardware works. It is entirely
possible that an in-kernel user of MPAM would not be using CDP, whereas
resctrl is.

Pretending L3CODE and L3DATA are entirely separate resources is a neat
trick, but doing this is specific to x86.
It also leaves the arch code in control of various parts of the
filesystem ABI: the resource names, and the way the schemata are parsed.
Allowing this to vary between architectures is bad for user space.
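
For reference, this is the user-visible format in question: with CDP
enabled on L3, the schemata file carries separate code/data lines, along
these lines (domain ids and masks are illustrative):

    L3CODE:0=fffff;1=fffff
    L3DATA:0=fffff;1=fffff
    MB:0=100;1=100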

This series collapses the CODE/DATA resources, moving all the user-visible
resctrl ABI into what becomes the filesystem code. CDP becomes the type of
configuration being applied to a cache. This is done by adding a
struct resctrl_schema to the parts of resctrl that will move to fs. This
holds the arch-code resource that is in use for this schema, along with
other properties like the name, and whether the configuration being applied
is CODE/DATA/BOTH.
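
For illustration, a minimal sketch of the shape such a schema could take
(the enum and field names here are guesses based on this cover letter, not
the definition the series actually adds):

/*
 * Illustrative sketch only: the real definition is added later in the
 * series ("Add a separate schema list for resctrl") and may differ.
 */
enum resctrl_conf_type {
	CDP_BOTH,	/* no code/data split for this schema */
	CDP_CODE,	/* CDP enabled, this schema configures code */
	CDP_DATA,	/* CDP enabled, this schema configures data */
};

struct resctrl_schema {
	struct list_head	list;		/* entry in the fs-owned schema list */
	char			name[8];	/* name shown in the schemata file */
	enum resctrl_conf_type	conf_type;	/* CODE/DATA/BOTH being applied */
	struct rdt_resource	*res;		/* arch resource backing this schema */
	u32			num_closid;	/* effective closids (added later in the series) */
};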

This lets us fold the extra resources out of the arch code so that they
don't need to be duplicated if the equivalent feature to CDP is missing, or
implemented in a different way.


The first two patches split the resource and domain structs to have an
arch specific 'hw' portion, and the rest that is visible to resctrl.
Future series massage the resctrl code so there are no accesses to 'hw'
structures in the parts of resctrl that will move to fs, providing helpers
where necessary.

This series adds temporary scaffolding, which it removes a few patches
later. This is to allow things like the ctrlval arrays and resources to be
merged separately, which should make it easier to bisect. These things
are marked temporary, and should all be gone by the end of the series.

This series is a little rough around the monitors: would a fake
struct resctrl_schema for the monitors simplify things, or be a source
of bugs?

This series is based on v5.12-rc6, and can be retrieved from:
git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/resctrl_merge_cdp/v4

v3: https://lore.kernel.org/lkml/[email protected]/
v2: https://lore.kernel.org/lkml/[email protected]/
v1: https://lore.kernel.org/lkml/[email protected]/

Parts were previously posted as an RFC here:
https://lore.kernel.org/lkml/[email protected]/

James Morse (24):
x86/resctrl: Split struct rdt_resource
x86/resctrl: Split struct rdt_domain
x86/resctrl: Add a separate schema list for resctrl
x86/resctrl: Pass the schema in info dir's private pointer
x86/resctrl: Label the resources with their configuration type
x86/resctrl: Walk the resctrl schema list instead of an arch list
x86/resctrl: Store the effective num_closid in the schema
x86/resctrl: Add resctrl_arch_get_num_closid()
x86/resctrl: Pass the schema to resctrl filesystem functions
x86/resctrl: Swizzle rdt_resource and resctrl_schema in
pseudo_lock_region
x86/resctrl: Move the schemata names into struct resctrl_schema
x86/resctrl: Group staged configuration into a separate struct
x86/resctrl: Allow different CODE/DATA configurations to be staged
x86/resctrl: Rename update_domains() resctrl_arch_update_domains()
x86/resctrl: Add a helper to read a closid's configuration
x86/resctrl: Add a helper to read/set the CDP configuration
x86/resctrl: Pass configuration type to resctrl_arch_get_config()
x86/resctrl: Make ctrlval arrays the same size
x86/resctrl: Apply offset correction when config is staged
x86/resctrl: Calculate the index from the configuration type
x86/resctrl: Merge the ctrl_val arrays
x86/resctrl: Remove rdt_cdp_peer_get()
x86/resctrl: Expand resctrl_arch_update_domains()'s msr_param range
x86/resctrl: Merge the CDP resources

arch/x86/kernel/cpu/resctrl/core.c | 269 +++++--------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 164 +++++---
arch/x86/kernel/cpu/resctrl/internal.h | 234 ++++-------
arch/x86/kernel/cpu/resctrl/monitor.c | 44 ++-
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 12 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 448 ++++++++++++----------
include/linux/resctrl.h | 181 +++++++++
7 files changed, 758 insertions(+), 594 deletions(-)

--
2.30.2


2021-06-14 20:11:50

by James Morse

Subject: [PATCH v4 01/24] x86/resctrl: Split struct rdt_resource

resctrl is the de facto Linux ABI for SoC resource partitioning features.

To support it on another architecture, it needs to be abstracted from
the features provided by Intel RDT and AMD PQoS, and moved to /fs/.
struct rdt_resource contains a mix of architecture private details
and properties of the filesystem interface that user-space uses.

Start by splitting struct rdt_resource into an architecture private
'hw' struct, which contains the common resctrl structure that would be
used by any architecture. The foreach helpers are most commonly used by
the filesystem code, and should return the common resctrl structure.
for_each_rdt_resource() is changed to walk the common structure in its
parent arch private structure.
Move as much of the structure as possible into the common structure
in the core code's header file. The x86 hardware accessors remain
part of the architecture private code, as do num_closid, mon_scale
and mbm_width.
mon_scale and mbm_width are used to detect overflow of the hardware
counters, and convert them from their native size to bytes. Any
cross-architecture abstraction should be in terms of bytes, making
these properties private.
The hardware's num_closid is kept in the private structure to force
the filesystem code to use a helper to access it. MPAM would return a
single value for the system, regardless of the resource. Using the
helper prevents this field from being confused with the version of
num_closid that is being exposed to user-space (added in a later patch).
After this split, filesystem code touching a 'hw' struct indicates
where an abstraction is needed.
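
As an illustration, the kind of accessor a later patch in this series adds
("Add resctrl_arch_get_num_closid()") can be a one-liner along these lines
(a sketch, not necessarily the final signature):

/* Sketch: the filesystem code asks the arch for the hardware limit. */
u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
{
	return resctrl_to_arch_res(r)->num_closid;
}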

Splitting this structure only moves types around, and should not lead
to any change in behaviour.

Splitting rdt_domain up in a similar way happens in the next patch.

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Fixed a spelling mistake

Changes since v2:
* Added a comment above for_each_rdt_resource()
* Expanded kerneldoc for rdt_hw_resource

Changes since v1:
* Tabs space and other whitespace
* Restored for_each_rdt_resource() calls in arch code
---
arch/x86/kernel/cpu/resctrl/core.c | 250 ++++++++++++----------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 14 +-
arch/x86/kernel/cpu/resctrl/internal.h | 150 ++++---------
arch/x86/kernel/cpu/resctrl/monitor.c | 32 +--
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 4 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 68 +++---
include/linux/resctrl.h | 111 ++++++++++
7 files changed, 358 insertions(+), 271 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 23001ae03e82..415d0f04efd7 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -57,120 +57,134 @@ static void
mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);

-#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].domains)
+#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].resctrl.domains)

-struct rdt_resource rdt_resources_all[] = {
+struct rdt_hw_resource rdt_resources_all[] = {
[RDT_RESOURCE_L3] =
{
- .rid = RDT_RESOURCE_L3,
- .name = "L3",
- .domains = domain_init(RDT_RESOURCE_L3),
+ .resctrl = {
+ .rid = RDT_RESOURCE_L3,
+ .name = "L3",
+ .cache_level = 3,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 1,
+ .cbm_idx_offset = 0,
+ },
+ .domains = domain_init(RDT_RESOURCE_L3),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L3DATA] =
{
- .rid = RDT_RESOURCE_L3DATA,
- .name = "L3DATA",
- .domains = domain_init(RDT_RESOURCE_L3DATA),
+ .resctrl = {
+ .rid = RDT_RESOURCE_L3DATA,
+ .name = "L3DATA",
+ .cache_level = 3,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 2,
+ .cbm_idx_offset = 0,
+ },
+ .domains = domain_init(RDT_RESOURCE_L3DATA),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L3CODE] =
{
- .rid = RDT_RESOURCE_L3CODE,
- .name = "L3CODE",
- .domains = domain_init(RDT_RESOURCE_L3CODE),
+ .resctrl = {
+ .rid = RDT_RESOURCE_L3CODE,
+ .name = "L3CODE",
+ .cache_level = 3,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 2,
+ .cbm_idx_offset = 1,
+ },
+ .domains = domain_init(RDT_RESOURCE_L3CODE),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L2] =
{
- .rid = RDT_RESOURCE_L2,
- .name = "L2",
- .domains = domain_init(RDT_RESOURCE_L2),
+ .resctrl = {
+ .rid = RDT_RESOURCE_L2,
+ .name = "L2",
+ .cache_level = 2,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 1,
+ .cbm_idx_offset = 0,
+ },
+ .domains = domain_init(RDT_RESOURCE_L2),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L2DATA] =
{
- .rid = RDT_RESOURCE_L2DATA,
- .name = "L2DATA",
- .domains = domain_init(RDT_RESOURCE_L2DATA),
+ .resctrl = {
+ .rid = RDT_RESOURCE_L2DATA,
+ .name = "L2DATA",
+ .cache_level = 2,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 2,
+ .cbm_idx_offset = 0,
+ },
+ .domains = domain_init(RDT_RESOURCE_L2DATA),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L2CODE] =
{
- .rid = RDT_RESOURCE_L2CODE,
- .name = "L2CODE",
- .domains = domain_init(RDT_RESOURCE_L2CODE),
+ .resctrl = {
+ .rid = RDT_RESOURCE_L2CODE,
+ .name = "L2CODE",
+ .cache_level = 2,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 2,
+ .cbm_idx_offset = 1,
+ },
+ .domains = domain_init(RDT_RESOURCE_L2CODE),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_MBA] =
{
- .rid = RDT_RESOURCE_MBA,
- .name = "MB",
- .domains = domain_init(RDT_RESOURCE_MBA),
- .cache_level = 3,
- .parse_ctrlval = parse_bw,
- .format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
+ .resctrl = {
+ .rid = RDT_RESOURCE_MBA,
+ .name = "MB",
+ .cache_level = 3,
+ .domains = domain_init(RDT_RESOURCE_MBA),
+ .parse_ctrlval = parse_bw,
+ .format_str = "%d=%*u",
+ .fflags = RFTYPE_RES_MB,
+ },
},
};

@@ -199,7 +213,8 @@ static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid)
*/
static inline void cache_alloc_hsw_probe(void)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_resource *r = &hw_res->resctrl;
u32 l, h, max_cbm = BIT_MASK(20) - 1;

if (wrmsr_safe(MSR_IA32_L3_CBM_BASE, max_cbm, 0))
@@ -211,7 +226,7 @@ static inline void cache_alloc_hsw_probe(void)
if (l != max_cbm)
return;

- r->num_closid = 4;
+ hw_res->num_closid = 4;
r->default_ctrl = max_cbm;
r->cache.cbm_len = 20;
r->cache.shareable_bits = 0xc0000;
@@ -225,7 +240,7 @@ static inline void cache_alloc_hsw_probe(void)
bool is_mba_sc(struct rdt_resource *r)
{
if (!r)
- return rdt_resources_all[RDT_RESOURCE_MBA].membw.mba_sc;
+ return rdt_resources_all[RDT_RESOURCE_MBA].resctrl.membw.mba_sc;

return r->membw.mba_sc;
}
@@ -253,12 +268,13 @@ static inline bool rdt_get_mb_table(struct rdt_resource *r)

static bool __get_mem_config_intel(struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
union cpuid_0x10_3_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx, max_delay;

cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
- r->num_closid = edx.split.cos_max + 1;
+ hw_res->num_closid = edx.split.cos_max + 1;
max_delay = eax.split.max_delay + 1;
r->default_ctrl = MAX_MBA_BW;
r->membw.arch_needs_linear = true;
@@ -287,12 +303,13 @@ static bool __get_mem_config_intel(struct rdt_resource *r)

static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
union cpuid_0x10_3_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx;

cpuid_count(0x80000020, 1, &eax.full, &ebx, &ecx, &edx.full);
- r->num_closid = edx.split.cos_max + 1;
+ hw_res->num_closid = edx.split.cos_max + 1;
r->default_ctrl = MAX_MBA_BW_AMD;

/* AMD does not use delay */
@@ -317,12 +334,13 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)

static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
union cpuid_0x10_1_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx;

cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
- r->num_closid = edx.split.cos_max + 1;
+ hw_res->num_closid = edx.split.cos_max + 1;
r->cache.cbm_len = eax.split.cbm_len + 1;
r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
r->cache.shareable_bits = ebx & r->default_ctrl;
@@ -333,10 +351,12 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)

static void rdt_get_cdp_config(int level, int type)
{
- struct rdt_resource *r_l = &rdt_resources_all[level];
- struct rdt_resource *r = &rdt_resources_all[type];
+ struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
+ struct rdt_hw_resource *hw_res_l = resctrl_to_arch_res(r_l);
+ struct rdt_resource *r = &rdt_resources_all[type].resctrl;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

- r->num_closid = r_l->num_closid / 2;
+ hw_res->num_closid = hw_res_l->num_closid / 2;
r->cache.cbm_len = r_l->cache.cbm_len;
r->default_ctrl = r_l->default_ctrl;
r->cache.shareable_bits = r_l->cache.shareable_bits;
@@ -365,9 +385,10 @@ static void
mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

for (i = m->low; i < m->high; i++)
- wrmsrl(r->msr_base + i, d->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + i, d->ctrl_val[i]);
}

/*
@@ -389,19 +410,21 @@ mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

/* Write the delay values for mba. */
for (i = m->low; i < m->high; i++)
- wrmsrl(r->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
+ wrmsrl(hw_res->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
}

static void
cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

for (i = m->low; i < m->high; i++)
- wrmsrl(r->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
}

struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
@@ -420,13 +443,14 @@ struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
void rdt_ctrl_update(void *arg)
{
struct msr_param *m = arg;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(m->res);
struct rdt_resource *r = m->res;
int cpu = smp_processor_id();
struct rdt_domain *d;

d = get_domain_from_cpu(cpu, r);
if (d) {
- r->msr_update(d, m, r);
+ hw_res->msr_update(d, m, r);
return;
}
pr_warn_once("cpu %d not found in any domain for resource %s\n",
@@ -468,6 +492,7 @@ struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,

void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
int i;

/*
@@ -476,7 +501,7 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
* For Memory Allocation: Set b/w requested to 100%
* and the bandwidth in MBps to U32_MAX
*/
- for (i = 0; i < r->num_closid; i++, dc++, dm++) {
+ for (i = 0; i < hw_res->num_closid; i++, dc++, dm++) {
*dc = r->default_ctrl;
*dm = MBA_MAX_MBPS;
}
@@ -484,14 +509,15 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)

static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
struct msr_param m;
u32 *dc, *dm;

- dc = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+ dc = kmalloc_array(hw_res->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
if (!dc)
return -ENOMEM;

- dm = kmalloc_array(r->num_closid, sizeof(*d->mbps_val), GFP_KERNEL);
+ dm = kmalloc_array(hw_res->num_closid, sizeof(*d->mbps_val), GFP_KERNEL);
if (!dm) {
kfree(dc);
return -ENOMEM;
@@ -502,8 +528,8 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
setup_default_ctrlval(r, dc, dm);

m.low = 0;
- m.high = r->num_closid;
- r->msr_update(d, &m, r);
+ m.high = hw_res->num_closid;
+ hw_res->msr_update(d, &m, r);
return 0;
}

@@ -655,7 +681,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
return;
}

- if (r == &rdt_resources_all[RDT_RESOURCE_L3]) {
+ if (r == &rdt_resources_all[RDT_RESOURCE_L3].resctrl) {
if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
cancel_delayed_work(&d->mbm_over);
mbm_setup_overflow_handler(d, 0);
@@ -831,9 +857,9 @@ static __init bool get_mem_config(void)
return false;

if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
- return __get_mem_config_intel(&rdt_resources_all[RDT_RESOURCE_MBA]);
+ return __get_mem_config_intel(&rdt_resources_all[RDT_RESOURCE_MBA].resctrl);
else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
- return __rdt_get_mem_config_amd(&rdt_resources_all[RDT_RESOURCE_MBA]);
+ return __rdt_get_mem_config_amd(&rdt_resources_all[RDT_RESOURCE_MBA].resctrl);

return false;
}
@@ -849,14 +875,14 @@ static __init bool get_rdt_alloc_resources(void)
return false;

if (rdt_cpu_has(X86_FEATURE_CAT_L3)) {
- rdt_get_cache_alloc_cfg(1, &rdt_resources_all[RDT_RESOURCE_L3]);
+ rdt_get_cache_alloc_cfg(1, &rdt_resources_all[RDT_RESOURCE_L3].resctrl);
if (rdt_cpu_has(X86_FEATURE_CDP_L3))
rdt_get_cdp_l3_config();
ret = true;
}
if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
/* CPUID 0x10.2 fields are same format at 0x10.1 */
- rdt_get_cache_alloc_cfg(2, &rdt_resources_all[RDT_RESOURCE_L2]);
+ rdt_get_cache_alloc_cfg(2, &rdt_resources_all[RDT_RESOURCE_L2].resctrl);
if (rdt_cpu_has(X86_FEATURE_CDP_L2))
rdt_get_cdp_l2_config();
ret = true;
@@ -880,7 +906,7 @@ static __init bool get_rdt_mon_resources(void)
if (!rdt_mon_features)
return false;

- return !rdt_get_mon_l3_config(&rdt_resources_all[RDT_RESOURCE_L3]);
+ return !rdt_get_mon_l3_config(&rdt_resources_all[RDT_RESOURCE_L3].resctrl);
}

static __init void __check_quirks_intel(void)
@@ -918,9 +944,12 @@ static __init bool get_rdt_resources(void)

static __init void rdt_init_res_defs_intel(void)
{
+ struct rdt_hw_resource *hw_res;
struct rdt_resource *r;

for_each_rdt_resource(r) {
+ hw_res = resctrl_to_arch_res(r);
+
if (r->rid == RDT_RESOURCE_L3 ||
r->rid == RDT_RESOURCE_L3DATA ||
r->rid == RDT_RESOURCE_L3CODE ||
@@ -931,17 +960,20 @@ static __init void rdt_init_res_defs_intel(void)
r->cache.arch_has_empty_bitmaps = false;
r->cache.arch_has_per_cpu_cfg = false;
} else if (r->rid == RDT_RESOURCE_MBA) {
- r->msr_base = MSR_IA32_MBA_THRTL_BASE;
- r->msr_update = mba_wrmsr_intel;
+ hw_res->msr_base = MSR_IA32_MBA_THRTL_BASE;
+ hw_res->msr_update = mba_wrmsr_intel;
}
}
}

static __init void rdt_init_res_defs_amd(void)
{
+ struct rdt_hw_resource *hw_res;
struct rdt_resource *r;

for_each_rdt_resource(r) {
+ hw_res = resctrl_to_arch_res(r);
+
if (r->rid == RDT_RESOURCE_L3 ||
r->rid == RDT_RESOURCE_L3DATA ||
r->rid == RDT_RESOURCE_L3CODE ||
@@ -952,8 +984,8 @@ static __init void rdt_init_res_defs_amd(void)
r->cache.arch_has_empty_bitmaps = true;
r->cache.arch_has_per_cpu_cfg = true;
} else if (r->rid == RDT_RESOURCE_MBA) {
- r->msr_base = MSR_IA32_MBA_BW_BASE;
- r->msr_update = mba_wrmsr_amd;
+ hw_res->msr_base = MSR_IA32_MBA_BW_BASE;
+ hw_res->msr_update = mba_wrmsr_amd;
}
}
}
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index c877642e8a14..ab6e584c9d2d 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -284,10 +284,12 @@ int update_domains(struct rdt_resource *r, int closid)
static int rdtgroup_parse_resource(char *resname, char *tok,
struct rdtgroup *rdtgrp)
{
+ struct rdt_hw_resource *hw_res;
struct rdt_resource *r;

for_each_alloc_enabled_rdt_resource(r) {
- if (!strcmp(resname, r->name) && rdtgrp->closid < r->num_closid)
+ hw_res = resctrl_to_arch_res(r);
+ if (!strcmp(resname, r->name) && rdtgrp->closid < hw_res->num_closid)
return parse_line(tok, r, rdtgrp);
}
rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname);
@@ -394,6 +396,7 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
int rdtgroup_schemata_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
+ struct rdt_hw_resource *hw_res;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
int ret = 0;
@@ -418,7 +421,8 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
} else {
closid = rdtgrp->closid;
for_each_alloc_enabled_rdt_resource(r) {
- if (closid < r->num_closid)
+ hw_res = resctrl_to_arch_res(r);
+ if (closid < hw_res->num_closid)
show_doms(s, r, closid);
}
}
@@ -449,6 +453,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
int rdtgroup_mondata_show(struct seq_file *m, void *arg)
{
struct kernfs_open_file *of = m->private;
+ struct rdt_hw_resource *hw_res;
u32 resid, evtid, domid;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
@@ -468,7 +473,8 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
domid = md.u.domid;
evtid = md.u.evtid;

- r = &rdt_resources_all[resid];
+ hw_res = &rdt_resources_all[resid];
+ r = &hw_res->resctrl;
d = rdt_find_domain(r, domid, NULL);
if (IS_ERR_OR_NULL(d)) {
ret = -ENOENT;
@@ -482,7 +488,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
else if (rr.val & RMID_VAL_UNAVAIL)
seq_puts(m, "Unavailable\n");
else
- seq_printf(m, "%llu\n", rr.val * r->mon_scale);
+ seq_printf(m, "%llu\n", rr.val * hw_res->mon_scale);

out:
rdtgroup_kn_unlock(of->kn);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c4d320d02fd5..43c8cf6b2b12 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -2,6 +2,7 @@
#ifndef _ASM_X86_RESCTRL_INTERNAL_H
#define _ASM_X86_RESCTRL_INTERNAL_H

+#include <linux/resctrl.h>
#include <linux/sched.h>
#include <linux/kernfs.h>
#include <linux/fs_context.h>
@@ -348,67 +349,6 @@ struct msr_param {
int high;
};

-/**
- * struct rdt_cache - Cache allocation related data
- * @cbm_len: Length of the cache bit mask
- * @min_cbm_bits: Minimum number of consecutive bits to be set
- * @cbm_idx_mult: Multiplier of CBM index
- * @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
- * closid * cbm_idx_multi + cbm_idx_offset
- * in a cache bit mask
- * @shareable_bits: Bitmask of shareable resource with other
- * executing entities
- * @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid.
- * @arch_has_empty_bitmaps: True if the '0' bitmap is valid.
- * @arch_has_per_cpu_cfg: True if QOS_CFG register for this cache
- * level has CPU scope.
- */
-struct rdt_cache {
- unsigned int cbm_len;
- unsigned int min_cbm_bits;
- unsigned int cbm_idx_mult;
- unsigned int cbm_idx_offset;
- unsigned int shareable_bits;
- bool arch_has_sparse_bitmaps;
- bool arch_has_empty_bitmaps;
- bool arch_has_per_cpu_cfg;
-};
-
-/**
- * enum membw_throttle_mode - System's memory bandwidth throttling mode
- * @THREAD_THROTTLE_UNDEFINED: Not relevant to the system
- * @THREAD_THROTTLE_MAX: Memory bandwidth is throttled at the core
- * always using smallest bandwidth percentage
- * assigned to threads, aka "max throttling"
- * @THREAD_THROTTLE_PER_THREAD: Memory bandwidth is throttled at the thread
- */
-enum membw_throttle_mode {
- THREAD_THROTTLE_UNDEFINED = 0,
- THREAD_THROTTLE_MAX,
- THREAD_THROTTLE_PER_THREAD,
-};
-
-/**
- * struct rdt_membw - Memory bandwidth allocation related data
- * @min_bw: Minimum memory bandwidth percentage user can request
- * @bw_gran: Granularity at which the memory bandwidth is allocated
- * @delay_linear: True if memory B/W delay is in linear scale
- * @arch_needs_linear: True if we can't configure non-linear resources
- * @throttle_mode: Bandwidth throttling mode when threads request
- * different memory bandwidths
- * @mba_sc: True if MBA software controller(mba_sc) is enabled
- * @mb_map: Mapping of memory B/W percentage to memory B/W delay
- */
-struct rdt_membw {
- u32 min_bw;
- u32 bw_gran;
- u32 delay_linear;
- bool arch_needs_linear;
- enum membw_throttle_mode throttle_mode;
- bool mba_sc;
- u32 *mb_map;
-};
-
static inline bool is_llc_occupancy_enabled(void)
{
return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
@@ -441,56 +381,33 @@ struct rdt_parse_data {
};

/**
- * struct rdt_resource - attributes of an RDT resource
- * @rid: The index of the resource
- * @alloc_enabled: Is allocation enabled on this machine
- * @mon_enabled: Is monitoring enabled for this feature
- * @alloc_capable: Is allocation available on this machine
- * @mon_capable: Is monitor feature available on this machine
- * @name: Name to use in "schemata" file
- * @num_closid: Number of CLOSIDs available
- * @cache_level: Which cache level defines scope of this resource
- * @default_ctrl: Specifies default cache cbm or memory B/W percent.
+ * struct rdt_hw_resource - arch private attributes of a resctrl resource
+ * @resctrl: Attributes of the resource used directly by resctrl.
+ * @num_closid: Maximum number of closid this hardware can support.
* @msr_base: Base MSR address for CBMs
* @msr_update: Function pointer to update QOS MSRs
- * @data_width: Character width of data when displaying
- * @domains: All domains for this resource
- * @cache: Cache allocation related data
- * @format_str: Per resource format string to show domain value
- * @parse_ctrlval: Per resource function pointer to parse control values
- * @evt_list: List of monitoring events
- * @num_rmid: Number of RMIDs available
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
- * @fflags: flags to choose base and info files
+ * @mbm_width: Monitor width, to detect and correct for overflow.
+ *
+ * Members of this structure are either private to the architecture
+ * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
+ * msr_update and msr_base.
*/
-struct rdt_resource {
- int rid;
- bool alloc_enabled;
- bool mon_enabled;
- bool alloc_capable;
- bool mon_capable;
- char *name;
+struct rdt_hw_resource {
+ struct rdt_resource resctrl;
int num_closid;
- int cache_level;
- u32 default_ctrl;
unsigned int msr_base;
void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
- int data_width;
- struct list_head domains;
- struct rdt_cache cache;
- struct rdt_membw membw;
- const char *format_str;
- int (*parse_ctrlval)(struct rdt_parse_data *data,
- struct rdt_resource *r,
- struct rdt_domain *d);
- struct list_head evt_list;
- int num_rmid;
unsigned int mon_scale;
unsigned int mbm_width;
- unsigned long fflags;
};

+static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
+{
+ return container_of(r, struct rdt_hw_resource, resctrl);
+}
+
int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
struct rdt_domain *d);
int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
@@ -498,7 +415,7 @@ int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,

extern struct mutex rdtgroup_mutex;

-extern struct rdt_resource rdt_resources_all[];
+extern struct rdt_hw_resource rdt_resources_all[];
extern struct rdtgroup rdtgroup_default;
DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);

@@ -517,33 +434,42 @@ enum {
RDT_NUM_RESOURCES,
};

+static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
+{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
+
+ hw_res++;
+ return &hw_res->resctrl;
+}
+
+/*
+ * To return the common struct rdt_resource, which is contained in struct
+ * rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
+ * This makes the limit the resctrl member past the end of the array.
+ */
#define for_each_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++)
+ for (r = &rdt_resources_all[0].resctrl; \
+ r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl; \
+ r = resctrl_inc(r))

#define for_each_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->alloc_capable || r->mon_capable)

#define for_each_alloc_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->alloc_capable)

#define for_each_mon_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->mon_capable)

#define for_each_alloc_enabled_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->alloc_enabled)

#define for_each_mon_enabled_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->mon_enabled)

/* CPUID.(EAX=10H, ECX=ResID=1).EAX */
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index f07c10b87a87..685e7f86d908 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -174,7 +174,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
struct rdt_resource *r;
u32 crmid = 1, nrmid;

- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;

/*
* Skip RMID 0 and start from RMID 1 and check all the RMIDs that
@@ -232,7 +232,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
int cpu;
u64 val;

- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;

entry->busy = 0;
cpu = get_cpu();
@@ -287,6 +287,7 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)

static int __mon_event_count(u32 rmid, struct rmid_read *rr)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(rr->r);
struct mbm_state *m;
u64 chunks, tval;

@@ -319,7 +320,7 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
return 0;
}

- chunks = mbm_overflow_count(m->prev_msr, tval, rr->r->mbm_width);
+ chunks = mbm_overflow_count(m->prev_msr, tval, hw_res->mbm_width);
m->chunks += chunks;
m->prev_msr = tval;

@@ -334,7 +335,7 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
*/
static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(rr->r);
struct mbm_state *m = &rr->d->mbm_local[rmid];
u64 tval, cur_bw, chunks;

@@ -342,8 +343,8 @@ static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL))
return;

- chunks = mbm_overflow_count(m->prev_bw_msr, tval, rr->r->mbm_width);
- cur_bw = (get_corrected_mbm_count(rmid, chunks) * r->mon_scale) >> 20;
+ chunks = mbm_overflow_count(m->prev_bw_msr, tval, hw_res->mbm_width);
+ cur_bw = (get_corrected_mbm_count(rmid, chunks) * hw_res->mon_scale) >> 20;

if (m->delta_comp)
m->delta_bw = abs(cur_bw - m->prev_bw);
@@ -416,6 +417,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
{
u32 closid, rmid, cur_msr, cur_msr_val, new_msr_val;
struct mbm_state *pmbm_data, *cmbm_data;
+ struct rdt_hw_resource *hw_r_mba;
u32 cur_bw, delta_bw, user_bw;
struct rdt_resource *r_mba;
struct rdt_domain *dom_mba;
@@ -425,7 +427,8 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
if (!is_mbm_local_enabled())
return;

- r_mba = &rdt_resources_all[RDT_RESOURCE_MBA];
+ hw_r_mba = &rdt_resources_all[RDT_RESOURCE_MBA];
+ r_mba = &hw_r_mba->resctrl;
closid = rgrp->closid;
rmid = rgrp->mon.rmid;
pmbm_data = &dom_mbm->mbm_local[rmid];
@@ -474,7 +477,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
return;
}

- cur_msr = r_mba->msr_base + closid;
+ cur_msr = hw_r_mba->msr_base + closid;
wrmsrl(cur_msr, delay_bw_map(new_msr_val, r_mba));
dom_mba->ctrl_val[closid] = new_msr_val;

@@ -538,7 +541,7 @@ void cqm_handle_limbo(struct work_struct *work)

mutex_lock(&rdtgroup_mutex);

- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
d = container_of(work, struct rdt_domain, cqm_limbo.work);

__check_limbo(d, false);
@@ -574,7 +577,7 @@ void mbm_handle_overflow(struct work_struct *work)
if (!static_branch_likely(&rdt_mon_enable_key))
goto out_unlock;

- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
d = container_of(work, struct rdt_domain, mbm_over.work);

list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
@@ -671,15 +674,16 @@ static void l3_mon_evt_init(struct rdt_resource *r)
int rdt_get_mon_l3_config(struct rdt_resource *r)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
unsigned int cl_size = boot_cpu_data.x86_cache_size;
int ret;

- r->mon_scale = boot_cpu_data.x86_cache_occ_scale;
+ hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;
- r->mbm_width = MBM_CNTR_WIDTH_BASE;
+ hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;

if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
- r->mbm_width += mbm_offset;
+ hw_res->mbm_width += mbm_offset;
else if (mbm_offset > MBM_CNTR_WIDTH_OFFSET_MAX)
pr_warn("Ignoring impossible MBM counter offset\n");

@@ -693,7 +697,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
resctrl_cqm_threshold = cl_size * 1024 / r->num_rmid;

/* h/w works in units of "boot_cpu_data.x86_cache_occ_scale" */
- resctrl_cqm_threshold /= r->mon_scale;
+ resctrl_cqm_threshold /= hw_res->mon_scale;

ret = dom_data_init(r);
if (ret)
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 05a89e33fde2..f079561409ab 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -684,8 +684,8 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
* resource, the portion of cache used by it should be made
* unavailable to all future allocations from both resources.
*/
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled ||
- rdt_resources_all[RDT_RESOURCE_L2DATA].alloc_enabled) {
+ if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled ||
+ rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled) {
rdt_last_cmd_puts("CDP enabled\n");
return -EINVAL;
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 01fd30e7829d..5126a1e58d97 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -100,12 +100,15 @@ int closids_supported(void)

static void closid_init(void)
{
+ struct rdt_hw_resource *hw_res;
struct rdt_resource *r;
int rdt_min_closid = 32;

/* Compute rdt_min_closid across all resources */
- for_each_alloc_enabled_rdt_resource(r)
- rdt_min_closid = min(rdt_min_closid, r->num_closid);
+ for_each_alloc_enabled_rdt_resource(r) {
+ hw_res = resctrl_to_arch_res(r);
+ rdt_min_closid = min(rdt_min_closid, hw_res->num_closid);
+ }

closid_free_map = BIT_MASK(rdt_min_closid) - 1;

@@ -843,8 +846,10 @@ static int rdt_num_closids_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
struct rdt_resource *r = of->kn->parent->priv;
+ struct rdt_hw_resource *hw_res;

- seq_printf(seq, "%d\n", r->num_closid);
+ hw_res = resctrl_to_arch_res(r);
+ seq_printf(seq, "%d\n", hw_res->num_closid);
return 0;
}

@@ -1020,8 +1025,9 @@ static int max_threshold_occ_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
struct rdt_resource *r = of->kn->parent->priv;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

- seq_printf(seq, "%u\n", resctrl_cqm_threshold * r->mon_scale);
+ seq_printf(seq, "%u\n", resctrl_cqm_threshold * hw_res->mon_scale);

return 0;
}
@@ -1042,7 +1048,7 @@ static int rdt_thread_throttle_mode_show(struct kernfs_open_file *of,
static ssize_t max_threshold_occ_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct rdt_hw_resource *hw_res;
unsigned int bytes;
int ret;

@@ -1053,7 +1059,8 @@ static ssize_t max_threshold_occ_write(struct kernfs_open_file *of,
if (bytes > (boot_cpu_data.x86_cache_size * 1024))
return -EINVAL;

- resctrl_cqm_threshold = bytes / r->mon_scale;
+ hw_res = resctrl_to_arch_res(of->kn->parent->priv);
+ resctrl_cqm_threshold = bytes / hw_res->mon_scale;

return nbytes;
}
@@ -1111,16 +1118,16 @@ static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,

switch (r->rid) {
case RDT_RESOURCE_L3DATA:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L3CODE];
+ _r_cdp = &rdt_resources_all[RDT_RESOURCE_L3CODE].resctrl;
break;
case RDT_RESOURCE_L3CODE:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L3DATA];
+ _r_cdp = &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl;
break;
case RDT_RESOURCE_L2DATA:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L2CODE];
+ _r_cdp = &rdt_resources_all[RDT_RESOURCE_L2CODE].resctrl;
break;
case RDT_RESOURCE_L2CODE:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L2DATA];
+ _r_cdp = &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl;
break;
default:
ret = -ENOENT;
@@ -1867,7 +1874,7 @@ static void l2_qos_cfg_update(void *arg)

static inline bool is_mba_linear(void)
{
- return rdt_resources_all[RDT_RESOURCE_MBA].membw.delay_linear;
+ return rdt_resources_all[RDT_RESOURCE_MBA].resctrl.membw.delay_linear;
}

static int set_cache_qos_cfg(int level, bool enable)
@@ -1888,7 +1895,7 @@ static int set_cache_qos_cfg(int level, bool enable)
if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;

- r_l = &rdt_resources_all[level];
+ r_l = &rdt_resources_all[level].resctrl;
list_for_each_entry(d, &r_l->domains, list) {
if (r_l->cache.arch_has_per_cpu_cfg)
/* Pick all the CPUs in the domain instance */
@@ -1917,10 +1924,10 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r)
if (!r->alloc_capable)
return;

- if (r == &rdt_resources_all[RDT_RESOURCE_L2DATA])
+ if (r == &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl)
l2_qos_cfg_update(&r->alloc_enabled);

- if (r == &rdt_resources_all[RDT_RESOURCE_L3DATA])
+ if (r == &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl)
l3_qos_cfg_update(&r->alloc_enabled);
}

@@ -1932,7 +1939,7 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r)
*/
static int set_mba_sc(bool mba_sc)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA];
+ struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].resctrl;
struct rdt_domain *d;

if (!is_mbm_enabled() || !is_mba_linear() ||
@@ -1948,9 +1955,9 @@ static int set_mba_sc(bool mba_sc)

static int cdp_enable(int level, int data_type, int code_type)
{
- struct rdt_resource *r_ldata = &rdt_resources_all[data_type];
- struct rdt_resource *r_lcode = &rdt_resources_all[code_type];
- struct rdt_resource *r_l = &rdt_resources_all[level];
+ struct rdt_resource *r_ldata = &rdt_resources_all[data_type].resctrl;
+ struct rdt_resource *r_lcode = &rdt_resources_all[code_type].resctrl;
+ struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
int ret;

if (!r_l->alloc_capable || !r_ldata->alloc_capable ||
@@ -1980,13 +1987,13 @@ static int cdpl2_enable(void)

static void cdp_disable(int level, int data_type, int code_type)
{
- struct rdt_resource *r = &rdt_resources_all[level];
+ struct rdt_resource *r = &rdt_resources_all[level].resctrl;

r->alloc_enabled = r->alloc_capable;

- if (rdt_resources_all[data_type].alloc_enabled) {
- rdt_resources_all[data_type].alloc_enabled = false;
- rdt_resources_all[code_type].alloc_enabled = false;
+ if (rdt_resources_all[data_type].resctrl.alloc_enabled) {
+ rdt_resources_all[data_type].resctrl.alloc_enabled = false;
+ rdt_resources_all[code_type].resctrl.alloc_enabled = false;
set_cache_qos_cfg(level, false);
}
}
@@ -2003,9 +2010,9 @@ static void cdpl2_disable(void)

static void cdp_disable_all(void)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled)
+ if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled)
cdpl3_disable();
- if (rdt_resources_all[RDT_RESOURCE_L2DATA].alloc_enabled)
+ if (rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled)
cdpl2_disable();
}

@@ -2153,7 +2160,7 @@ static int rdt_get_tree(struct fs_context *fc)
static_branch_enable_cpuslocked(&rdt_enable_key);

if (is_mbm_enabled()) {
- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
list_for_each_entry(dom, &r->domains, list)
mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL);
}
@@ -2257,6 +2264,7 @@ static int rdt_init_fs_context(struct fs_context *fc)

static int reset_all_ctrls(struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
@@ -2267,7 +2275,7 @@ static int reset_all_ctrls(struct rdt_resource *r)

msr_param.res = r;
msr_param.low = 0;
- msr_param.high = r->num_closid;
+ msr_param.high = hw_res->num_closid;

/*
* Disable resource control for this resource by setting all
@@ -2277,7 +2285,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
list_for_each_entry(d, &r->domains, list) {
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);

- for (i = 0; i < r->num_closid; i++)
+ for (i = 0; i < hw_res->num_closid; i++)
d->ctrl_val[i] = r->default_ctrl;
}
cpu = get_cpu();
@@ -3124,13 +3132,13 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)

static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled)
+ if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled)
seq_puts(seq, ",cdp");

- if (rdt_resources_all[RDT_RESOURCE_L2DATA].alloc_enabled)
+ if (rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled)
seq_puts(seq, ",cdpl2");

- if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA]))
+ if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].resctrl))
seq_puts(seq, ",mba_MBps");

return 0;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 9b05af9b3e28..ee92df9c7252 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -2,6 +2,8 @@
#ifndef _RESCTRL_H
#define _RESCTRL_H

+#include <linux/kernel.h>
+#include <linux/list.h>
#include <linux/pid.h>

#ifdef CONFIG_PROC_CPU_RESCTRL
@@ -13,4 +15,113 @@ int proc_resctrl_show(struct seq_file *m,

#endif

+struct rdt_domain;
+
+/**
+ * struct resctrl_cache - Cache allocation related data
+ * @cbm_len: Length of the cache bit mask
+ * @min_cbm_bits: Minimum number of consecutive bits to be set
+ * @cbm_idx_mult: Multiplier of CBM index
+ * @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
+ * closid * cbm_idx_multi + cbm_idx_offset
+ * in a cache bit mask
+ * @shareable_bits: Bitmask of shareable resource with other
+ * executing entities
+ * @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid.
+ * @arch_has_empty_bitmaps: True if the '0' bitmap is valid.
+ * @arch_has_per_cpu_cfg: True if QOS_CFG register for this cache
+ * level has CPU scope.
+ */
+struct resctrl_cache {
+ unsigned int cbm_len;
+ unsigned int min_cbm_bits;
+ unsigned int cbm_idx_mult; // TODO remove this
+ unsigned int cbm_idx_offset; // TODO remove this
+ unsigned int shareable_bits;
+ bool arch_has_sparse_bitmaps;
+ bool arch_has_empty_bitmaps;
+ bool arch_has_per_cpu_cfg;
+};
+
+/**
+ * enum membw_throttle_mode - System's memory bandwidth throttling mode
+ * @THREAD_THROTTLE_UNDEFINED: Not relevant to the system
+ * @THREAD_THROTTLE_MAX: Memory bandwidth is throttled at the core
+ * always using smallest bandwidth percentage
+ * assigned to threads, aka "max throttling"
+ * @THREAD_THROTTLE_PER_THREAD: Memory bandwidth is throttled at the thread
+ */
+enum membw_throttle_mode {
+ THREAD_THROTTLE_UNDEFINED = 0,
+ THREAD_THROTTLE_MAX,
+ THREAD_THROTTLE_PER_THREAD,
+};
+
+/**
+ * struct resctrl_membw - Memory bandwidth allocation related data
+ * @min_bw: Minimum memory bandwidth percentage user can request
+ * @bw_gran: Granularity at which the memory bandwidth is allocated
+ * @delay_linear: True if memory B/W delay is in linear scale
+ * @arch_needs_linear: True if we can't configure non-linear resources
+ * @throttle_mode: Bandwidth throttling mode when threads request
+ * different memory bandwidths
+ * @mba_sc: True if MBA software controller(mba_sc) is enabled
+ * @mb_map: Mapping of memory B/W percentage to memory B/W delay
+ */
+struct resctrl_membw {
+ u32 min_bw;
+ u32 bw_gran;
+ u32 delay_linear;
+ bool arch_needs_linear;
+ enum membw_throttle_mode throttle_mode;
+ bool mba_sc;
+ u32 *mb_map;
+};
+
+struct rdt_parse_data;
+
+/**
+ * struct rdt_resource - attributes of a resctrl resource
+ * @rid: The index of the resource
+ * @alloc_enabled: Is allocation enabled on this machine
+ * @mon_enabled: Is monitoring enabled for this feature
+ * @alloc_capable: Is allocation available on this machine
+ * @mon_capable: Is monitor feature available on this machine
+ * @num_rmid: Number of RMIDs available.
+ * @cache_level: Which cache level defines scope of this resource
+ * @cache: If the component has cache controls, their properties.
+ * @membw: If the component has bandwidth controls, their properties.
+ * @domains: All domains for this resource
+ * @name: Name to use in "schemata" file.
+ * @data_width: Character width of data when displaying.
+ * @default_ctrl: Specifies default cache cbm or memory B/W percent.
+ * @format_str: Per resource format string to show domain value
+ * @parse_ctrlval: Per resource function pointer to parse control values
+ *
+ * @evt_list: List of monitoring events
+ * @fflags: flags to choose base and info files
+ */
+struct rdt_resource {
+ int rid;
+ bool alloc_enabled;
+ bool mon_enabled;
+ bool alloc_capable;
+ bool mon_capable;
+ int num_rmid;
+ int cache_level;
+ struct resctrl_cache cache;
+ struct resctrl_membw membw;
+ struct list_head domains;
+ char *name;
+ int data_width;
+ u32 default_ctrl;
+ const char *format_str;
+ int (*parse_ctrlval)(struct rdt_parse_data *data,
+ struct rdt_resource *r,
+ struct rdt_domain *d);
+ struct list_head evt_list;
+ unsigned long fflags;
+
+};
+
#endif /* _RESCTRL_H */
--
2.30.2

2021-06-14 20:11:57

by James Morse

Subject: [PATCH v4 02/24] x86/resctrl: Split struct rdt_domain

resctrl is the de facto Linux ABI for SoC resource partitioning features.

To support it on another architecture, it needs to be abstracted from
the features provided by Intel RDT and AMD PQoS, and moved to /fs/.
struct rdt_domain contains a mix of architecture private details
and properties of the filesystem interface that user-space uses.

Continue by splitting struct rdt_domain into an architecture private
'hw' struct, which contains the common resctrl structure that would be
used by any architecture. The hardware values in ctrl_val and mbps_val
need to be accessed via helpers to allow another architecture to convert
these into a different format if necessary. After this split, filesystem
code paths touching a 'hw' struct indicate where an abstraction
is needed.
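
As a sketch of the access pattern this split enables (the function name
here is made up for illustration; the real helpers arrive in later
patches):

/* Sketch only: arch code reaches the hardware array via container_of(). */
static u32 example_read_ctrl_val(struct rdt_domain *d, u32 closid)
{
	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);

	return hw_dom->ctrl_val[closid];
}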

Splitting this structure only moves types around, and should not lead
to any change in behaviour.

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Removed a double word, removed a space.

Changes since v2:
* Shuffled commit message,
* Changed kerneldoc text above rdt_hw_domain

Changes since v1:
* Tabs space and other whitespace
* cpu becomes CPU in all comments touched
---
arch/x86/kernel/cpu/resctrl/core.c | 32 ++++++++++-------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++--
arch/x86/kernel/cpu/resctrl/internal.h | 43 +++++++----------------
arch/x86/kernel/cpu/resctrl/monitor.c | 8 +++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 29 +++++++++------
include/linux/resctrl.h | 32 ++++++++++++++++-
6 files changed, 94 insertions(+), 60 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 415d0f04efd7..aff5d0dde6c1 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -385,10 +385,11 @@ static void
mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

for (i = m->low; i < m->high; i++)
- wrmsrl(hw_res->msr_base + i, d->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
}

/*
@@ -410,21 +411,23 @@ mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

/* Write the delay values for mba. */
for (i = m->low; i < m->high; i++)
- wrmsrl(hw_res->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
+ wrmsrl(hw_res->msr_base + i, delay_bw_map(hw_dom->ctrl_val[i], r));
}

static void
cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

for (i = m->low; i < m->high; i++)
- wrmsrl(hw_res->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + cbm_idx(r, i), hw_dom->ctrl_val[i]);
}

struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
@@ -510,21 +513,22 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
struct msr_param m;
u32 *dc, *dm;

- dc = kmalloc_array(hw_res->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+ dc = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
if (!dc)
return -ENOMEM;

- dm = kmalloc_array(hw_res->num_closid, sizeof(*d->mbps_val), GFP_KERNEL);
+ dm = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->mbps_val), GFP_KERNEL);
if (!dm) {
kfree(dc);
return -ENOMEM;
}

- d->ctrl_val = dc;
- d->mbps_val = dm;
+ hw_dom->ctrl_val = dc;
+ hw_dom->mbps_val = dm;
setup_default_ctrlval(r, dc, dm);

m.low = 0;
@@ -586,6 +590,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
{
int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
struct list_head *add_pos = NULL;
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;

d = rdt_find_domain(r, id, &add_pos);
@@ -601,10 +606,11 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
return;
}

- d = kzalloc_node(sizeof(*d), GFP_KERNEL, cpu_to_node(cpu));
- if (!d)
+ hw_dom = kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu));
+ if (!hw_dom)
return;

+ d = &hw_dom->resctrl;
d->id = id;
cpumask_set_cpu(cpu, &d->cpu_mask);

@@ -633,6 +639,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
static void domain_remove_cpu(int cpu, struct rdt_resource *r)
{
int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;

d = rdt_find_domain(r, id, NULL);
@@ -640,6 +647,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
pr_warn("Couldn't find cache id for CPU %d\n", cpu);
return;
}
+ hw_dom = resctrl_to_arch_dom(d);

cpumask_clear_cpu(cpu, &d->cpu_mask);
if (cpumask_empty(&d->cpu_mask)) {
@@ -672,12 +680,12 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
if (d->plr)
d->plr->d = NULL;

- kfree(d->ctrl_val);
- kfree(d->mbps_val);
+ kfree(hw_dom->ctrl_val);
+ kfree(hw_dom->mbps_val);
bitmap_free(d->rmid_busy_llc);
kfree(d->mbm_total);
kfree(d->mbm_local);
- kfree(d);
+ kfree(hw_dom);
return;
}

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index ab6e584c9d2d..2e7466659af3 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -238,6 +238,7 @@ static int parse_line(char *line, struct rdt_resource *r,

int update_domains(struct rdt_resource *r, int closid)
{
+ struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
@@ -254,7 +255,8 @@ int update_domains(struct rdt_resource *r, int closid)

mba_sc = is_mba_sc(r);
list_for_each_entry(d, &r->domains, list) {
- dc = !mba_sc ? d->ctrl_val : d->mbps_val;
+ hw_dom = resctrl_to_arch_dom(d);
+ dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;
if (d->have_new_ctrl && d->new_ctrl != dc[closid]) {
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
dc[closid] = d->new_ctrl;
@@ -375,17 +377,19 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
{
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *dom;
bool sep = false;
u32 ctrl_val;

seq_printf(s, "%*s:", max_name_width, r->name);
list_for_each_entry(dom, &r->domains, list) {
+ hw_dom = resctrl_to_arch_dom(dom);
if (sep)
seq_puts(s, ";");

- ctrl_val = (!is_mba_sc(r) ? dom->ctrl_val[closid] :
- dom->mbps_val[closid]);
+ ctrl_val = (!is_mba_sc(r) ? hw_dom->ctrl_val[closid] :
+ hw_dom->mbps_val[closid]);
seq_printf(s, r->format_str, dom->id, max_data_width,
ctrl_val);
sep = true;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 43c8cf6b2b12..235cf621c878 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -299,44 +299,25 @@ struct mbm_state {
};

/**
- * struct rdt_domain - group of cpus sharing an RDT resource
- * @list: all instances of this resource
- * @id: unique id for this instance
- * @cpu_mask: which cpus share this resource
- * @rmid_busy_llc:
- * bitmap of which limbo RMIDs are above threshold
- * @mbm_total: saved state for MBM total bandwidth
- * @mbm_local: saved state for MBM local bandwidth
- * @mbm_over: worker to periodically read MBM h/w counters
- * @cqm_limbo: worker to periodically read CQM h/w counters
- * @mbm_work_cpu:
- * worker cpu for MBM h/w counters
- * @cqm_work_cpu:
- * worker cpu for CQM h/w counters
+ * struct rdt_hw_domain - Arch private attributes of a set of CPUs that share
+ * a resource
+ * @resctrl: Properties exposed to the resctrl file system
* @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID)
* @mbps_val: When mba_sc is enabled, this holds the bandwidth in MBps
- * @new_ctrl: new ctrl value to be loaded
- * @have_new_ctrl: did user provide new_ctrl for this domain
- * @plr: pseudo-locked region (if any) associated with domain
+ *
+ * Members of this structure are accessed via helpers that provide abstraction.
*/
-struct rdt_domain {
- struct list_head list;
- int id;
- struct cpumask cpu_mask;
- unsigned long *rmid_busy_llc;
- struct mbm_state *mbm_total;
- struct mbm_state *mbm_local;
- struct delayed_work mbm_over;
- struct delayed_work cqm_limbo;
- int mbm_work_cpu;
- int cqm_work_cpu;
+struct rdt_hw_domain {
+ struct rdt_domain resctrl;
u32 *ctrl_val;
u32 *mbps_val;
- u32 new_ctrl;
- bool have_new_ctrl;
- struct pseudo_lock_region *plr;
};

+static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r)
+{
+ return container_of(r, struct rdt_hw_domain, resctrl);
+}
+
/**
* struct msr_param - set a range of MSRs from a domain
* @res: The resource to use
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 685e7f86d908..76eea10f2e2c 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -418,6 +418,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
u32 closid, rmid, cur_msr, cur_msr_val, new_msr_val;
struct mbm_state *pmbm_data, *cmbm_data;
struct rdt_hw_resource *hw_r_mba;
+ struct rdt_hw_domain *hw_dom_mba;
u32 cur_bw, delta_bw, user_bw;
struct rdt_resource *r_mba;
struct rdt_domain *dom_mba;
@@ -438,11 +439,12 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
pr_warn_once("Failure to get domain for MBA update\n");
return;
}
+ hw_dom_mba = resctrl_to_arch_dom(dom_mba);

cur_bw = pmbm_data->prev_bw;
- user_bw = dom_mba->mbps_val[closid];
+ user_bw = hw_dom_mba->mbps_val[closid];
delta_bw = pmbm_data->delta_bw;
- cur_msr_val = dom_mba->ctrl_val[closid];
+ cur_msr_val = hw_dom_mba->ctrl_val[closid];

/*
* For Ctrl groups read data from child monitor groups.
@@ -479,7 +481,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)

cur_msr = hw_r_mba->msr_base + closid;
wrmsrl(cur_msr, delay_bw_map(new_msr_val, r_mba));
- dom_mba->ctrl_val[closid] = new_msr_val;
+ hw_dom_mba->ctrl_val[closid] = new_msr_val;

/*
* Delta values are updated dynamically package wise for each
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5126a1e58d97..9a8665c8ab89 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -915,7 +915,7 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_putc(seq, ';');
- ctrl = dom->ctrl_val;
+ ctrl = resctrl_to_arch_dom(dom)->ctrl_val;
sw_shareable = 0;
exclusive = 0;
seq_printf(seq, "%d=", dom->id);
@@ -1193,7 +1193,7 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d
}

/* Check for overlap with other resource groups */
- ctrl = d->ctrl_val;
+ ctrl = resctrl_to_arch_dom(d)->ctrl_val;
for (i = 0; i < closids_supported(); i++, ctrl++) {
ctrl_b = *ctrl;
mode = rdtgroup_mode_by_closid(i);
@@ -1262,6 +1262,7 @@ bool rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d,
*/
static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
{
+ struct rdt_hw_domain *hw_dom;
int closid = rdtgrp->closid;
struct rdt_resource *r;
bool has_cache = false;
@@ -1272,7 +1273,8 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
continue;
has_cache = true;
list_for_each_entry(d, &r->domains, list) {
- if (rdtgroup_cbm_overlaps(r, d, d->ctrl_val[closid],
+ hw_dom = resctrl_to_arch_dom(d);
+ if (rdtgroup_cbm_overlaps(r, d, hw_dom->ctrl_val[closid],
rdtgrp->closid, false)) {
rdt_last_cmd_puts("Schemata overlaps\n");
return false;
@@ -1404,6 +1406,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
static int rdtgroup_size_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
+ struct rdt_hw_domain *hw_dom;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
struct rdt_domain *d;
@@ -1438,14 +1441,15 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
sep = false;
seq_printf(s, "%*s:", max_name_width, r->name);
list_for_each_entry(d, &r->domains, list) {
+ hw_dom = resctrl_to_arch_dom(d);
if (sep)
seq_putc(s, ';');
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
size = 0;
} else {
ctrl = (!is_mba_sc(r) ?
- d->ctrl_val[rdtgrp->closid] :
- d->mbps_val[rdtgrp->closid]);
+ hw_dom->ctrl_val[rdtgrp->closid] :
+ hw_dom->mbps_val[rdtgrp->closid]);
if (r->rid == RDT_RESOURCE_MBA)
size = ctrl;
else
@@ -1940,6 +1944,7 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r)
static int set_mba_sc(bool mba_sc)
{
struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].resctrl;
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;

if (!is_mbm_enabled() || !is_mba_linear() ||
@@ -1947,8 +1952,10 @@ static int set_mba_sc(bool mba_sc)
return -EINVAL;

r->membw.mba_sc = mba_sc;
- list_for_each_entry(d, &r->domains, list)
- setup_default_ctrlval(r, d->ctrl_val, d->mbps_val);
+ list_for_each_entry(d, &r->domains, list) {
+ hw_dom = resctrl_to_arch_dom(d);
+ setup_default_ctrlval(r, hw_dom->ctrl_val, hw_dom->mbps_val);
+ }

return 0;
}
@@ -2265,6 +2272,7 @@ static int rdt_init_fs_context(struct fs_context *fc)
static int reset_all_ctrls(struct rdt_resource *r)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+ struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
@@ -2283,10 +2291,11 @@ static int reset_all_ctrls(struct rdt_resource *r)
* from each domain to update the MSRs below.
*/
list_for_each_entry(d, &r->domains, list) {
+ hw_dom = resctrl_to_arch_dom(d);
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);

for (i = 0; i < hw_res->num_closid; i++)
- d->ctrl_val[i] = r->default_ctrl;
+ hw_dom->ctrl_val[i] = r->default_ctrl;
}
cpu = get_cpu();
/* Update CBM on this cpu if it's in cpu_mask. */
@@ -2665,7 +2674,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct rdt_resource *r,
d->have_new_ctrl = false;
d->new_ctrl = r->cache.shareable_bits;
used_b = r->cache.shareable_bits;
- ctrl = d->ctrl_val;
+ ctrl = resctrl_to_arch_dom(d)->ctrl_val;
for (i = 0; i < closids_supported(); i++, ctrl++) {
if (closid_allocated(i) && i != closid) {
mode = rdtgroup_mode_by_closid(i);
@@ -2682,7 +2691,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct rdt_resource *r,
* with an exclusive group.
*/
if (d_cdp)
- peer_ctl = d_cdp->ctrl_val[i];
+ peer_ctl = resctrl_to_arch_dom(d_cdp)->ctrl_val[i];
else
peer_ctl = 0;
used_b |= *ctrl | peer_ctl;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index ee92df9c7252..be6f5df78e31 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -15,7 +15,37 @@ int proc_resctrl_show(struct seq_file *m,

#endif

-struct rdt_domain;
+/**
+ * struct rdt_domain - group of CPUs sharing a resctrl resource
+ * @list: all instances of this resource
+ * @id: unique id for this instance
+ * @cpu_mask: which CPUs share this resource
+ * @new_ctrl: new ctrl value to be loaded
+ * @have_new_ctrl: did user provide new_ctrl for this domain
+ * @rmid_busy_llc: bitmap of which limbo RMIDs are above threshold
+ * @mbm_total: saved state for MBM total bandwidth
+ * @mbm_local: saved state for MBM local bandwidth
+ * @mbm_over: worker to periodically read MBM h/w counters
+ * @cqm_limbo: worker to periodically read CQM h/w counters
+ * @mbm_work_cpu: worker CPU for MBM h/w counters
+ * @cqm_work_cpu: worker CPU for CQM h/w counters
+ * @plr: pseudo-locked region (if any) associated with domain
+ */
+struct rdt_domain {
+ struct list_head list;
+ int id;
+ struct cpumask cpu_mask;
+ u32 new_ctrl;
+ bool have_new_ctrl;
+ unsigned long *rmid_busy_llc;
+ struct mbm_state *mbm_total;
+ struct mbm_state *mbm_local;
+ struct delayed_work mbm_over;
+ struct delayed_work cqm_limbo;
+ int mbm_work_cpu;
+ int cqm_work_cpu;
+ struct pseudo_lock_region *plr;
+};

/**
* struct resctrl_cache - Cache allocation related data
--
2.30.2

2021-06-14 20:12:07

by James Morse

[permalink] [raw]
Subject: [PATCH v4 03/24] x86/resctrl: Add a separate schema list for resctrl

Resctrl exposes schemata to user-space, which allow the control values
to be specified for a group of tasks.

User-visible properties of the interface (such as the schemata names
and how the values are parsed) are rooted in a struct provided by the
architecture code (struct rdt_hw_resource). Once a second architecture
uses resctrl, this would allow user-visible properties to diverge
between architectures.

These properties should come from the resctrl code that will be common
to all architectures. Resctrl has no per-schema structure, only struct
rdt_{hw_,}resource. Create a struct resctrl_schema to hold the
rdt_resource. Before a second architecture can be supported, this
structure will also need to hold the schema name visible to user-space
and the type of configuration values for resctrl.
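
As a rough illustration of the shape this takes (a user-space sketch, not
kernel code; the schema_add()/schema_all names and the open-coded list are
invented for the example), each schema wraps a pointer to the arch-provided
resource and sits on a single list owned by the filesystem code:

#include <stdio.h>
#include <stdlib.h>

struct rdt_resource {
        const char *name;               /* arch-provided hardware description */
};

struct resctrl_schema {
        struct resctrl_schema *next;    /* stand-in for the kernel's list_head */
        struct rdt_resource *res;       /* resource configured via this schema */
};

static struct resctrl_schema *schema_all;

/* Rough equivalent of schemata_list_create(): one entry per resource. */
static int schema_add(struct rdt_resource *r)
{
        struct resctrl_schema *s = calloc(1, sizeof(*s));

        if (!s)
                return -1;
        s->res = r;
        s->next = schema_all;
        schema_all = s;
        return 0;
}

int main(void)
{
        struct rdt_resource l3 = { "L3" }, mba = { "MB" };
        struct resctrl_schema *s;

        schema_add(&l3);
        schema_add(&mba);
        for (s = schema_all; s; s = s->next)
                printf("schema for %s\n", s->res->name);
        return 0;
}

Later patches add the schema name and configuration type to this structure;
the sketch only shows the resource pointer introduced here.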

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Fixed a spelling mistake, removed a space.

Changes since v2:
* Expanded comments.
* Shuffled commit message,

Changes since v1:
* Renamed resctrl_all_schema list
* Used schemata_list as a prefix to make these easier to search for
* Added kerneldoc string
* Removed 'pending configuration' reference in commit message
---
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 43 +++++++++++++++++++++++++-
include/linux/resctrl.h | 11 +++++++
3 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 235cf621c878..f6790d03f056 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -106,6 +106,7 @@ extern unsigned int resctrl_cqm_threshold;
extern bool rdt_alloc_capable;
extern bool rdt_mon_capable;
extern unsigned int rdt_mon_features;
+extern struct list_head resctrl_schema_all;

enum rdt_group_type {
RDTCTRL_GROUP = 0,
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9a8665c8ab89..14ea1212f476 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -39,6 +39,9 @@ static struct kernfs_root *rdt_root;
struct rdtgroup rdtgroup_default;
LIST_HEAD(rdt_all_groups);

+/* list of entries for the schemata file */
+LIST_HEAD(resctrl_schema_all);
+
/* Kernel fs node for "info" directory under root */
static struct kernfs_node *kn_info;

@@ -2109,6 +2112,35 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
return ret;
}

+static int schemata_list_create(void)
+{
+ struct resctrl_schema *s;
+ struct rdt_resource *r;
+
+ for_each_alloc_enabled_rdt_resource(r) {
+ s = kzalloc(sizeof(*s), GFP_KERNEL);
+ if (!s)
+ return -ENOMEM;
+
+ s->res = r;
+
+ INIT_LIST_HEAD(&s->list);
+ list_add(&s->list, &resctrl_schema_all);
+ }
+
+ return 0;
+}
+
+static void schemata_list_destroy(void)
+{
+ struct resctrl_schema *s, *tmp;
+
+ list_for_each_entry_safe(s, tmp, &resctrl_schema_all, list) {
+ list_del(&s->list);
+ kfree(s);
+ }
+}
+
static int rdt_get_tree(struct fs_context *fc)
{
struct rdt_fs_context *ctx = rdt_fc2context(fc);
@@ -2130,11 +2162,17 @@ static int rdt_get_tree(struct fs_context *fc)
if (ret < 0)
goto out_cdp;

+ ret = schemata_list_create();
+ if (ret) {
+ schemata_list_destroy();
+ goto out_mba;
+ }
+
closid_init();

ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
if (ret < 0)
- goto out_mba;
+ goto out_schemata_free;

if (rdt_mon_capable) {
ret = mongroup_create_dir(rdtgroup_default.kn,
@@ -2184,6 +2222,8 @@ static int rdt_get_tree(struct fs_context *fc)
kernfs_remove(kn_mongrp);
out_info:
kernfs_remove(kn_info);
+out_schemata_free:
+ schemata_list_destroy();
out_mba:
if (ctx->enable_mba_mbps)
set_mba_sc(false);
@@ -2425,6 +2465,7 @@ static void rdt_kill_sb(struct super_block *sb)
rmdir_all_sub();
rdt_pseudo_lock_release();
rdtgroup_default.mode = RDT_MODE_SHAREABLE;
+ schemata_list_destroy();
static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
static_branch_disable_cpuslocked(&rdt_mon_enable_key);
static_branch_disable_cpuslocked(&rdt_enable_key);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index be6f5df78e31..425e7913dc8d 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -154,4 +154,15 @@ struct rdt_resource {

};

+/**
+ * struct resctrl_schema - configuration abilities of a resource presented to
+ * user-space
+ * @list: Member of resctrl_schema_all.
+ * @res: The resource structure exported by the architecture to describe
+ * the hardware that is configured by this schema.
+ */
+struct resctrl_schema {
+ struct list_head list;
+ struct rdt_resource *res;
+};
#endif /* _RESCTRL_H */
--
2.30.2

2021-06-14 20:12:21

by James Morse

[permalink] [raw]
Subject: [PATCH v4 06/24] x86/resctrl: Walk the resctrl schema list instead of an arch list

When parsing a schema configuration value from user-space, resctrl
walks the architecture's rdt_resources_all[] array to find a
matching struct rdt_resource.

Once the CDP resources are merged there will be one resource in
use by two schemata. Anything walking rdt_resources_all[] on behalf
of a user-space request should walk the list of struct resctrl_schema
instead.

Change the users of for_each_alloc_enabled_rdt_resource() to walk
the schema instead. Schemata were only created for alloc_enabled resources
so these two lists are currently equivalent.

schemata_list_create() and rdt_kill_sb() are ignored. The first
creates the schema list, and will eventually loop over the resource
indexes using an arch helper to retrieve the resource. rdt_kill_sb()
will eventually make use of an arch 'reset everything' helper.

After the filesystem code is moved, rdtgroup_pseudo_locked_in_hierarchy()
remains part of the x86-specific hooks to support pseudo lock. This code
walks each domain, and still does this after the separate resources are
merged.
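
To illustrate the difference (a plain C sketch, not the kernel code; the
lookup_by_arch_array()/lookup_by_schema() helpers and the simplified
structures are invented for the example), a resource-name lookup done on
behalf of user-space moves from the arch-owned array to the schema list:

#include <stddef.h>
#include <string.h>

struct rdt_resource {
        const char *name;
};

struct resctrl_schema {
        struct resctrl_schema *next;    /* stand-in for the kernel's list_head */
        struct rdt_resource *res;
};

/* Before: match a user-supplied name against the arch-owned array. */
static struct rdt_resource *
lookup_by_arch_array(struct rdt_resource *all, int nr, const char *resname)
{
        int i;

        for (i = 0; i < nr; i++)
                if (!strcmp(resname, all[i].name))
                        return &all[i];
        return NULL;
}

/* After: the same request walks the filesystem-owned schema list. */
static struct resctrl_schema *
lookup_by_schema(struct resctrl_schema *list, const char *resname)
{
        struct resctrl_schema *s;

        for (s = list; s; s = s->next)
                if (!strcmp(resname, s->res->name))
                        return s;
        return NULL;
}

Once the CDP resources are merged, only the second form has one entry per
line of the schemata file.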

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* No one can spell pseudo

Changes since v2:
* Shuffled commit message,

Changes since v1:
* Expanded commit message
* Split from a larger patch
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 23 +++++++++++++++--------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 18 ++++++++++++------
2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 2e7466659af3..a6f9548a8a59 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -287,10 +287,12 @@ static int rdtgroup_parse_resource(char *resname, char *tok,
struct rdtgroup *rdtgrp)
{
struct rdt_hw_resource *hw_res;
+ struct resctrl_schema *s;
struct rdt_resource *r;

- for_each_alloc_enabled_rdt_resource(r) {
- hw_res = resctrl_to_arch_res(r);
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
+ hw_res = resctrl_to_arch_res(s->res);
if (!strcmp(resname, r->name) && rdtgrp->closid < hw_res->num_closid)
return parse_line(tok, r, rdtgrp);
}
@@ -301,6 +303,7 @@ static int rdtgroup_parse_resource(char *resname, char *tok,
ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
+ struct resctrl_schema *s;
struct rdtgroup *rdtgrp;
struct rdt_domain *dom;
struct rdt_resource *r;
@@ -331,8 +334,8 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
goto out;
}

- for_each_alloc_enabled_rdt_resource(r) {
- list_for_each_entry(dom, &r->domains, list)
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ list_for_each_entry(dom, &s->res->domains, list)
dom->have_new_ctrl = false;
}

@@ -353,7 +356,8 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
goto out;
}

- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
ret = update_domains(r, rdtgrp->closid);
if (ret)
goto out;
@@ -401,6 +405,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
struct rdt_hw_resource *hw_res;
+ struct resctrl_schema *schema;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
int ret = 0;
@@ -409,8 +414,10 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
rdtgrp = rdtgroup_kn_lock_live(of->kn);
if (rdtgrp) {
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
- for_each_alloc_enabled_rdt_resource(r)
+ list_for_each_entry(schema, &resctrl_schema_all, list) {
+ r = schema->res;
seq_printf(s, "%s:uninitialized\n", r->name);
+ }
} else if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
if (!rdtgrp->plr->d) {
rdt_last_cmd_clear();
@@ -424,8 +431,8 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
}
} else {
closid = rdtgrp->closid;
- for_each_alloc_enabled_rdt_resource(r) {
- hw_res = resctrl_to_arch_res(r);
+ list_for_each_entry(schema, &resctrl_schema_all, list) {
+ hw_res = resctrl_to_arch_res(schema->res);
if (closid < hw_res->num_closid)
show_doms(s, r, closid);
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 199b3035dfbf..aad891d691e0 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -104,12 +104,12 @@ int closids_supported(void)
static void closid_init(void)
{
struct rdt_hw_resource *hw_res;
- struct rdt_resource *r;
+ struct resctrl_schema *s;
int rdt_min_closid = 32;

/* Compute rdt_min_closid across all resources */
- for_each_alloc_enabled_rdt_resource(r) {
- hw_res = resctrl_to_arch_res(r);
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ hw_res = resctrl_to_arch_res(s->res);
rdt_min_closid = min(rdt_min_closid, hw_res->num_closid);
}

@@ -1276,11 +1276,13 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
{
struct rdt_hw_domain *hw_dom;
int closid = rdtgrp->closid;
+ struct resctrl_schema *s;
struct rdt_resource *r;
bool has_cache = false;
struct rdt_domain *d;

- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
if (r->rid == RDT_RESOURCE_MBA)
continue;
has_cache = true;
@@ -1418,6 +1420,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
static int rdtgroup_size_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
+ struct resctrl_schema *schema;
struct rdt_hw_domain *hw_dom;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
@@ -1449,7 +1452,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
goto out;
}

- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(schema, &resctrl_schema_all, list) {
+ r = schema->res;
sep = false;
seq_printf(s, "%*s:", max_name_width, r->name);
list_for_each_entry(d, &r->domains, list) {
@@ -2815,10 +2819,12 @@ static void rdtgroup_init_mba(struct rdt_resource *r)
/* Initialize the RDT group's allocations. */
static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
{
+ struct resctrl_schema *s;
struct rdt_resource *r;
int ret;

- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
if (r->rid == RDT_RESOURCE_MBA) {
rdtgroup_init_mba(r);
} else {
--
2.30.2

2021-06-14 20:12:21

by James Morse

[permalink] [raw]
Subject: [PATCH v4 05/24] x86/resctrl: Label the resources with their configuration type

The names of resources are used for the schema name presented to
user-space. The name used is rooted in a structure provided by
the architecture code because the names are different when CDP
is enabled. x86 implements this by swapping between two sets of
resource structures based on their alloc_enabled flag. The type
of configuration in use is encoded in the name (and cbm_idx_offset).

Once the CDP behaviour is moved into the parts of resctrl that will
move to /fs/, there will be two struct resctrl_schema for one
struct rdt_resource. The schema describes the type of configuration
being applied to the resource. The name of the schema should be
generated by resctrl, based on the type of configuration. To do this,
struct resctrl_schema needs to store the type of configuration in use
for a schema.

Create an enum resctrl_conf_type describing the options, and add
it to struct resctrl_schema. The underlying resources are still
separate, as cbm_idx_offset is still in use.

Temporarily label all the entries in rdt_resources_all[] and copy
that value to struct resctrl_schema. Copying the value ensures there
is no mismatch while the filesystem parts of resctrl are modified
to use the schema. Once the resources are merged, the filesystem
code can assign this value based on the schema being created.
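
For example (sketch only; schema_name() is an invented helper and not part
of this patch), generating the user-visible name from the resource name plus
the configuration type would look something like this once resctrl owns the
naming:

#include <stdio.h>

enum resctrl_conf_type {
        CDP_NONE,       /* no code/data prioritisation */
        CDP_CODE,
        CDP_DATA,
};

/* Derive "L3", "L3CODE" or "L3DATA" from the base name and the type. */
static void
schema_name(char *buf, size_t len, const char *resname, enum resctrl_conf_type type)
{
        const char *suffix = "";

        if (type == CDP_CODE)
                suffix = "CODE";
        else if (type == CDP_DATA)
                suffix = "DATA";
        snprintf(buf, len, "%s%s", resname, suffix);
}

int main(void)
{
        char name[16];

        schema_name(name, sizeof(name), "L3", CDP_DATA);
        printf("%s\n", name);           /* prints "L3DATA" */
        return 0;
}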

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Removed a space.

Changes since v2:
* Renamed CDP_BOTH as CDP_NONE and described as 'no prioritisation'
* Shuffled commit message,

Changes since v1:
* {cdp,conf}_type typo
* Added kerneldoc comment
---
arch/x86/kernel/cpu/resctrl/core.c | 7 +++++++
arch/x86/kernel/cpu/resctrl/internal.h | 2 ++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 1 +
include/linux/resctrl.h | 10 ++++++++++
4 files changed, 20 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index aff5d0dde6c1..9ba12c3d175a 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -62,6 +62,7 @@ mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m,
struct rdt_hw_resource rdt_resources_all[] = {
[RDT_RESOURCE_L3] =
{
+ .conf_type = CDP_NONE,
.resctrl = {
.rid = RDT_RESOURCE_L3,
.name = "L3",
@@ -81,6 +82,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
},
[RDT_RESOURCE_L3DATA] =
{
+ .conf_type = CDP_DATA,
.resctrl = {
.rid = RDT_RESOURCE_L3DATA,
.name = "L3DATA",
@@ -100,6 +102,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
},
[RDT_RESOURCE_L3CODE] =
{
+ .conf_type = CDP_CODE,
.resctrl = {
.rid = RDT_RESOURCE_L3CODE,
.name = "L3CODE",
@@ -119,6 +122,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
},
[RDT_RESOURCE_L2] =
{
+ .conf_type = CDP_NONE,
.resctrl = {
.rid = RDT_RESOURCE_L2,
.name = "L2",
@@ -138,6 +142,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
},
[RDT_RESOURCE_L2DATA] =
{
+ .conf_type = CDP_DATA,
.resctrl = {
.rid = RDT_RESOURCE_L2DATA,
.name = "L2DATA",
@@ -157,6 +162,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
},
[RDT_RESOURCE_L2CODE] =
{
+ .conf_type = CDP_CODE,
.resctrl = {
.rid = RDT_RESOURCE_L2CODE,
.name = "L2CODE",
@@ -176,6 +182,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
},
[RDT_RESOURCE_MBA] =
{
+ .conf_type = CDP_NONE,
.resctrl = {
.rid = RDT_RESOURCE_MBA,
.name = "MB",
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f6790d03f056..e800de110230 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -364,6 +364,7 @@ struct rdt_parse_data {

/**
* struct rdt_hw_resource - arch private attributes of a resctrl resource
+ * @conf_type: The type that should be used when configuring. temporary
* @resctrl: Attributes of the resource used directly by resctrl.
* @num_closid: Maximum number of closid this hardware can support.
* @msr_base: Base MSR address for CBMs
@@ -376,6 +377,7 @@ struct rdt_parse_data {
* msr_update and msr_base.
*/
struct rdt_hw_resource {
+ enum resctrl_conf_type conf_type;
struct rdt_resource resctrl;
int num_closid;
unsigned int msr_base;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6ad9df322282..199b3035dfbf 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2135,6 +2135,7 @@ static int schemata_list_create(void)
return -ENOMEM;

s->res = r;
+ s->conf_type = resctrl_to_arch_res(r)->conf_type;

INIT_LIST_HEAD(&s->list);
list_add(&s->list, &resctrl_schema_all);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 425e7913dc8d..81073d0751c9 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -15,6 +15,14 @@ int proc_resctrl_show(struct seq_file *m,

#endif

+enum resctrl_conf_type {
+ /* No prioritisation, both code and data are controlled or monitored. */
+ CDP_NONE,
+
+ CDP_CODE,
+ CDP_DATA,
+};
+
/**
* struct rdt_domain - group of CPUs sharing a resctrl resource
* @list: all instances of this resource
@@ -158,11 +166,13 @@ struct rdt_resource {
* struct resctrl_schema - configuration abilities of a resource presented to
* user-space
* @list: Member of resctrl_schema_all.
+ * @conf_type: Whether this schema is specific to code/data.
* @res: The resource structure exported by the architecture to describe
* the hardware that is configured by this schema.
*/
struct resctrl_schema {
struct list_head list;
+ enum resctrl_conf_type conf_type;
struct rdt_resource *res;
};
#endif /* _RESCTRL_H */
--
2.30.2

2021-06-14 20:12:28

by James Morse

[permalink] [raw]
Subject: [PATCH v4 07/24] x86/resctrl: Store the effective num_closid in the schema

Struct resctrl_schema holds properties that vary with the style of
configuration that resctrl applies to a resource. There are already
two values for the hardware's num_closid, depending on whether the
architecture presents the L3 or L3CODE/L3DATA resources.

As the way CDP changes the number of control groups that resctrl can create
is part of the user-space interface, it should be managed by the filesystem
parts of resctrl. This allows the architecture code to only describe the
value the hardware supports.

Add num_closid to resctrl_schema. This is the value seen by the filesystem,
which may be different to the maximum value described by the arch code
when CDP is enabled.
These functions operate on the num_closid value that is exposed to
user-space:
* rdtgroup_parse_resource()
* rdtgroup_schemata_show()
* rdt_num_closids_show()
* closid_init()

These are changed to use the schema value instead. schemata_list_create()
sets this value, and reaches into the architecture-specific structure to
get the value. This will eventually be replaced with a helper.
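
As a sketch of the closid_init() part (plain C; min_num_closid() is an
invented helper), the minimum is now taken over the per-schema values, which
already account for CDP, rather than over the hardware values:

/* One num_closid per schema, as exposed to user-space. */
static unsigned int min_num_closid(const unsigned int *num_closid, int nr_schema)
{
        unsigned int rdt_min_closid = 32;       /* closid_init()'s start value */
        int i;

        for (i = 0; i < nr_schema; i++)
                if (num_closid[i] < rdt_min_closid)
                        rdt_min_closid = num_closid[i];
        return rdt_min_closid;
}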

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Fixed two spelling mistakes.

Changes since v2:
* Expanded kerneldoc comment.
* Shuffled commit message,

Changes since v1:
* Added missing : in a comment.
* Expanded commit message.
* Reordered patches
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 9 +++------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 13 ++++---------
include/linux/resctrl.h | 4 ++++
3 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index a6f9548a8a59..fcd6ca73ac41 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -286,14 +286,12 @@ int update_domains(struct rdt_resource *r, int closid)
static int rdtgroup_parse_resource(char *resname, char *tok,
struct rdtgroup *rdtgrp)
{
- struct rdt_hw_resource *hw_res;
struct resctrl_schema *s;
struct rdt_resource *r;

list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- hw_res = resctrl_to_arch_res(s->res);
- if (!strcmp(resname, r->name) && rdtgrp->closid < hw_res->num_closid)
+ if (!strcmp(resname, r->name) && rdtgrp->closid < s->num_closid)
return parse_line(tok, r, rdtgrp);
}
rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname);
@@ -404,7 +402,6 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
int rdtgroup_schemata_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
- struct rdt_hw_resource *hw_res;
struct resctrl_schema *schema;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
@@ -432,8 +429,8 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
} else {
closid = rdtgrp->closid;
list_for_each_entry(schema, &resctrl_schema_all, list) {
- hw_res = resctrl_to_arch_res(schema->res);
- if (closid < hw_res->num_closid)
+ r = schema->res;
+ if (closid < schema->num_closid)
show_doms(s, r, closid);
}
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index aad891d691e0..da2e1c3a414e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -103,15 +103,12 @@ int closids_supported(void)

static void closid_init(void)
{
- struct rdt_hw_resource *hw_res;
struct resctrl_schema *s;
int rdt_min_closid = 32;

/* Compute rdt_min_closid across all resources */
- list_for_each_entry(s, &resctrl_schema_all, list) {
- hw_res = resctrl_to_arch_res(s->res);
- rdt_min_closid = min(rdt_min_closid, hw_res->num_closid);
- }
+ list_for_each_entry(s, &resctrl_schema_all, list)
+ rdt_min_closid = min(rdt_min_closid, s->num_closid);

closid_free_map = BIT_MASK(rdt_min_closid) - 1;

@@ -849,11 +846,8 @@ static int rdt_num_closids_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
struct resctrl_schema *s = of->kn->parent->priv;
- struct rdt_resource *r = s->res;
- struct rdt_hw_resource *hw_res;

- hw_res = resctrl_to_arch_res(r);
- seq_printf(seq, "%d\n", hw_res->num_closid);
+ seq_printf(seq, "%u\n", s->num_closid);
return 0;
}

@@ -2140,6 +2134,7 @@ static int schemata_list_create(void)

s->res = r;
s->conf_type = resctrl_to_arch_res(r)->conf_type;
+ s->num_closid = resctrl_to_arch_res(r)->num_closid;

INIT_LIST_HEAD(&s->list);
list_add(&s->list, &resctrl_schema_all);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 81073d0751c9..de0bef438e54 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -169,10 +169,14 @@ struct rdt_resource {
* @conf_type: Whether this schema is specific to code/data.
* @res: The resource structure exported by the architecture to describe
* the hardware that is configured by this schema.
+ * @num_closid: The number of closid that can be used with this schema. When
+ * features like CDP are enabled, this will be lower than the
+ * hardware supports for the resource.
*/
struct resctrl_schema {
struct list_head list;
enum resctrl_conf_type conf_type;
struct rdt_resource *res;
+ int num_closid;
};
#endif /* _RESCTRL_H */
--
2.30.2

2021-06-14 20:12:30

by James Morse

[permalink] [raw]
Subject: [PATCH v4 08/24] x86/resctrl: Add resctrl_arch_get_num_closid()

To initialise struct resctrl_schema's num_closid, schemata_list_create()
reaches into the architecture's private structure to retrieve num_closid
from the struct rdt_hw_resource. The 'half the closids' behaviour
should be part of the filesystem parts of resctrl that are the same
on any architecture. struct resctrl_schema's num_closid should
include any correction for CDP.

Having two properties called num_closid is likely to be confusing when
they have different values.

Add a helper to read the resource's num_closid from the arch code. This
should return the number of closid that the resource supports, regardless
of whether CDP is in use. Once the CDP resources are merged,
schemata_list_create() can apply the correction itself.

Using a type with an obvious size for the arch helper means changing
the type of num_closid to u32, which matches the type already used by
struct rdtgroup.

reset_all_ctrls() does not use resctrl_arch_get_num_closid(), even
though it sets up a structure for modifying the hardware. This function
will be part of the architecture code; the maximum closid should be the
maximum value the hardware has, regardless of the way resctrl is using
it. All the uses of num_closid in core.c are naturally part of the
architecture-specific code.
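
A sketch of the intended split (illustrative names only;
resctrl_arch_get_num_closid_sketch() and schema_num_closid() are not the
kernel functions): the arch side reports the hardware limit, and the
filesystem side halves it when CDP is in use, since the CODE and DATA
schemata share the closid space:

#include <stdbool.h>

/* Stand-in for the value the arch helper would read from the hardware. */
static unsigned int resctrl_arch_get_num_closid_sketch(void)
{
        return 16;
}

/* Filesystem view: apply the CDP correction on top of the hardware value. */
static unsigned int schema_num_closid(bool cdp_enabled)
{
        unsigned int hw_num_closid = resctrl_arch_get_num_closid_sketch();

        return cdp_enabled ? hw_num_closid / 2 : hw_num_closid;
}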

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Fixed a spelling mistake, removed four spaces.

Changes since v2:
* Added comment in rdt_hw_resource
* Shuffled commit message,

Changes since v1:
* Rewrote commit message
* Whitespace fixes
* num_closid becomes u32 in all occurrences to reduce surprises
---
arch/x86/kernel/cpu/resctrl/core.c | 5 +++++
arch/x86/kernel/cpu/resctrl/internal.h | 8 ++++++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ++--
include/linux/resctrl.h | 6 +++++-
4 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 9ba12c3d175a..b406cca56ed4 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -450,6 +450,11 @@ struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
return NULL;
}

+u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
+{
+ return resctrl_to_arch_res(r)->num_closid;
+}
+
void rdt_ctrl_update(void *arg)
{
struct msr_param *m = arg;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e800de110230..5e6bfe27513c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -366,7 +366,11 @@ struct rdt_parse_data {
* struct rdt_hw_resource - arch private attributes of a resctrl resource
* @conf_type: The type that should be used when configuring. temporary
* @resctrl: Attributes of the resource used directly by resctrl.
- * @num_closid: Maximum number of closid this hardware can support.
+ * @num_closid: Maximum number of closid this hardware can support,
+ * regardless of CDP. This is exposed via
+ * resctrl_arch_get_num_closid() to avoid confusion
+ * with struct resctrl_schema's property of the same name,
+ * which has been corrected for features like CDP.
* @msr_base: Base MSR address for CBMs
* @msr_update: Function pointer to update QOS MSRs
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
@@ -379,7 +383,7 @@ struct rdt_parse_data {
struct rdt_hw_resource {
enum resctrl_conf_type conf_type;
struct rdt_resource resctrl;
- int num_closid;
+ u32 num_closid;
unsigned int msr_base;
void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index da2e1c3a414e..39ec0344963e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -104,7 +104,7 @@ int closids_supported(void)
static void closid_init(void)
{
struct resctrl_schema *s;
- int rdt_min_closid = 32;
+ u32 rdt_min_closid = 32;

/* Compute rdt_min_closid across all resources */
list_for_each_entry(s, &resctrl_schema_all, list)
@@ -2134,7 +2134,7 @@ static int schemata_list_create(void)

s->res = r;
s->conf_type = resctrl_to_arch_res(r)->conf_type;
- s->num_closid = resctrl_to_arch_res(r)->num_closid;
+ s->num_closid = resctrl_arch_get_num_closid(r);

INIT_LIST_HEAD(&s->list);
list_add(&s->list, &resctrl_schema_all);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index de0bef438e54..78c6dafd42c1 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -177,6 +177,10 @@ struct resctrl_schema {
struct list_head list;
enum resctrl_conf_type conf_type;
struct rdt_resource *res;
- int num_closid;
+ u32 num_closid;
};
+
+/* The number of closid supported by this resource regardless of CDP */
+u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
+
#endif /* _RESCTRL_H */
--
2.30.2

2021-06-14 20:12:32

by James Morse

[permalink] [raw]
Subject: [PATCH v4 12/24] x86/resctrl: Group staged configuration into a separate struct

When configuration changes are made, the new value is written
to struct rdt_domain's new_ctrl field and the have_new_ctrl flag
is set. Later new_ctrl is copied to hardware by a call to
update_domains().

Once the CDP resources are merged, there will be one new_ctrl
field in use by two struct resctrl_schema, requiring a per-schema
IPI to copy the value to hardware.

Move new_ctrl and have_new_ctrl into a new struct resctrl_staged_config.
Before the CDP resources can be merged, struct rdt_domain will
need an array of these, one per type of configuration. Using the
type as an index into the array will ensure that a schema configuration
string can't specify the same domain twice.
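
The pattern, roughly (a user-space sketch; stage_config(),
apply_config_sketch() and the fake_domain stand-in are invented names):
parsing fills the staging area, and a later pass writes only the values that
actually changed:

#include <stdbool.h>
#include <stdint.h>

struct resctrl_staged_config {
        uint32_t new_ctrl;
        bool have_new_ctrl;
};

struct fake_domain {
        struct resctrl_staged_config staged_config;
        uint32_t ctrl_val[16];          /* stand-in for the per-closid registers */
};

/* Parsing stage: remember what the user asked for, touch no hardware. */
static void stage_config(struct fake_domain *d, uint32_t val)
{
        d->staged_config.new_ctrl = val;
        d->staged_config.have_new_ctrl = true;
}

/* Apply stage, update_domains() style: only write when something changed. */
static void apply_config_sketch(struct fake_domain *d, int closid)
{
        struct resctrl_staged_config *cfg = &d->staged_config;

        if (cfg->have_new_ctrl && d->ctrl_val[closid] != cfg->new_ctrl)
                d->ctrl_val[closid] = cfg->new_ctrl;
}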

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,

Changes since v1:
* Expanded commit message
* Removed explicit clearing of have_new_ctrl,
* Moved ARRAY_SIZE() trickery to a later patch
* Removed extra whitespace
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 43 +++++++++++++++--------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 22 +++++++-----
include/linux/resctrl.h | 16 ++++++---
3 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 57c2b0e121d2..a47a792fdcb3 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -62,16 +62,17 @@ int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
{
struct rdt_resource *r = s->res;
unsigned long bw_val;
+ struct resctrl_staged_config *cfg = &d->staged_config;

- if (d->have_new_ctrl) {
+ if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
return -EINVAL;
}

if (!bw_validate(data->buf, &bw_val, r))
return -EINVAL;
- d->new_ctrl = bw_val;
- d->have_new_ctrl = true;
+ cfg->new_ctrl = bw_val;
+ cfg->have_new_ctrl = true;

return 0;
}
@@ -129,11 +130,12 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d)
{
+ struct resctrl_staged_config *cfg = &d->staged_config;
struct rdtgroup *rdtgrp = data->rdtgrp;
struct rdt_resource *r = s->res;
u32 cbm_val;

- if (d->have_new_ctrl) {
+ if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
return -EINVAL;
}
@@ -175,8 +177,8 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
}
}

- d->new_ctrl = cbm_val;
- d->have_new_ctrl = true;
+ cfg->new_ctrl = cbm_val;
+ cfg->have_new_ctrl = true;

return 0;
}
@@ -190,6 +192,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
static int parse_line(char *line, struct resctrl_schema *s,
struct rdtgroup *rdtgrp)
{
+ struct resctrl_staged_config *cfg;
struct rdt_resource *r = s->res;
struct rdt_parse_data data;
char *dom = NULL, *id;
@@ -219,6 +222,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
if (r->parse_ctrlval(&data, s, d))
return -EINVAL;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
+ cfg = &d->staged_config;
/*
* In pseudo-locking setup mode and just
* parsed a valid CBM that should be
@@ -229,7 +233,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
*/
rdtgrp->plr->s = s;
rdtgrp->plr->d = d;
- rdtgrp->plr->cbm = d->new_ctrl;
+ rdtgrp->plr->cbm = cfg->new_ctrl;
d->plr = rdtgrp->plr;
return 0;
}
@@ -239,14 +243,27 @@ static int parse_line(char *line, struct resctrl_schema *s,
return -EINVAL;
}

+static void apply_config(struct rdt_hw_domain *hw_dom,
+ struct resctrl_staged_config *cfg, int closid,
+ cpumask_var_t cpu_mask, bool mba_sc)
+{
+ struct rdt_domain *dom = &hw_dom->resctrl;
+ u32 *dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;
+
+ if (cfg->new_ctrl != dc[closid]) {
+ cpumask_set_cpu(cpumask_any(&dom->cpu_mask), cpu_mask);
+ dc[closid] = cfg->new_ctrl;
+ }
+}
+
int update_domains(struct rdt_resource *r, int closid)
{
+ struct resctrl_staged_config *cfg;
struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
bool mba_sc;
- u32 *dc;
int cpu;

if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
@@ -259,11 +276,9 @@ int update_domains(struct rdt_resource *r, int closid)
mba_sc = is_mba_sc(r);
list_for_each_entry(d, &r->domains, list) {
hw_dom = resctrl_to_arch_dom(d);
- dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;
- if (d->have_new_ctrl && d->new_ctrl != dc[closid]) {
- cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
- dc[closid] = d->new_ctrl;
- }
+ cfg = &hw_dom->resctrl.staged_config;
+ if (cfg->have_new_ctrl)
+ apply_config(hw_dom, cfg, closid, cpu_mask, mba_sc);
}

/*
@@ -335,7 +350,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

list_for_each_entry(s, &resctrl_schema_all, list) {
list_for_each_entry(dom, &s->res->domains, list)
- dom->have_new_ctrl = false;
+ memset(&dom->staged_config, 0, sizeof(dom->staged_config));
}

while ((tok = strsep(&buf, "\n")) != NULL) {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5551331ac94e..4a850d6f77c0 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2729,6 +2729,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
u32 closid)
{
struct rdt_resource *r_cdp = NULL;
+ struct resctrl_staged_config *cfg;
struct rdt_domain *d_cdp = NULL;
struct rdt_resource *r = s->res;
u32 used_b = 0, unused_b = 0;
@@ -2738,8 +2739,9 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
int i;

rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp);
- d->have_new_ctrl = false;
- d->new_ctrl = r->cache.shareable_bits;
+ cfg = &d->staged_config;
+ cfg->have_new_ctrl = false;
+ cfg->new_ctrl = r->cache.shareable_bits;
used_b = r->cache.shareable_bits;
ctrl = resctrl_to_arch_dom(d)->ctrl_val;
for (i = 0; i < closids_supported(); i++, ctrl++) {
@@ -2763,29 +2765,29 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
peer_ctl = 0;
used_b |= *ctrl | peer_ctl;
if (mode == RDT_MODE_SHAREABLE)
- d->new_ctrl |= *ctrl | peer_ctl;
+ cfg->new_ctrl |= *ctrl | peer_ctl;
}
}
if (d->plr && d->plr->cbm > 0)
used_b |= d->plr->cbm;
unused_b = used_b ^ (BIT_MASK(r->cache.cbm_len) - 1);
unused_b &= BIT_MASK(r->cache.cbm_len) - 1;
- d->new_ctrl |= unused_b;
+ cfg->new_ctrl |= unused_b;
/*
* Force the initial CBM to be valid, user can
* modify the CBM based on system availability.
*/
- d->new_ctrl = cbm_ensure_valid(d->new_ctrl, r);
+ cfg->new_ctrl = cbm_ensure_valid(cfg->new_ctrl, r);
/*
* Assign the u32 CBM to an unsigned long to ensure that
* bitmap_weight() does not access out-of-bound memory.
*/
- tmp_cbm = d->new_ctrl;
+ tmp_cbm = cfg->new_ctrl;
if (bitmap_weight(&tmp_cbm, r->cache.cbm_len) < r->cache.min_cbm_bits) {
rdt_last_cmd_printf("No space on %s:%d\n", s->name, d->id);
return -ENOSPC;
}
- d->have_new_ctrl = true;
+ cfg->have_new_ctrl = true;

return 0;
}
@@ -2817,11 +2819,13 @@ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid)
/* Initialize MBA resource with default values. */
static void rdtgroup_init_mba(struct rdt_resource *r)
{
+ struct resctrl_staged_config *cfg;
struct rdt_domain *d;

list_for_each_entry(d, &r->domains, list) {
- d->new_ctrl = is_mba_sc(r) ? MBA_MAX_MBPS : r->default_ctrl;
- d->have_new_ctrl = true;
+ cfg = &d->staged_config;
+ cfg->new_ctrl = is_mba_sc(r) ? MBA_MAX_MBPS : r->default_ctrl;
+ cfg->have_new_ctrl = true;
}
}

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8fa806c85cec..8fad1af8f15e 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -23,13 +23,21 @@ enum resctrl_conf_type {
CDP_DATA,
};

+/**
+ * struct resctrl_staged_config - parsed configuration to be applied
+ * @new_ctrl: new ctrl value to be loaded
+ * @have_new_ctrl: whether the user provided new_ctrl is valid
+ */
+struct resctrl_staged_config {
+ u32 new_ctrl;
+ bool have_new_ctrl;
+};
+
/**
* struct rdt_domain - group of CPUs sharing a resctrl resource
* @list: all instances of this resource
* @id: unique id for this instance
* @cpu_mask: which CPUs share this resource
- * @new_ctrl: new ctrl value to be loaded
- * @have_new_ctrl: did user provide new_ctrl for this domain
* @rmid_busy_llc: bitmap of which limbo RMIDs are above threshold
* @mbm_total: saved state for MBM total bandwidth
* @mbm_local: saved state for MBM local bandwidth
@@ -38,13 +46,12 @@ enum resctrl_conf_type {
* @mbm_work_cpu: worker CPU for MBM h/w counters
* @cqm_work_cpu: worker CPU for CQM h/w counters
* @plr: pseudo-locked region (if any) associated with domain
+ * @staged_config: parsed configuration to be applied
*/
struct rdt_domain {
struct list_head list;
int id;
struct cpumask cpu_mask;
- u32 new_ctrl;
- bool have_new_ctrl;
unsigned long *rmid_busy_llc;
struct mbm_state *mbm_total;
struct mbm_state *mbm_local;
@@ -53,6 +60,7 @@ struct rdt_domain {
int mbm_work_cpu;
int cqm_work_cpu;
struct pseudo_lock_region *plr;
+ struct resctrl_staged_config staged_config;
};

/**
--
2.30.2

2021-06-14 20:12:43

by James Morse

[permalink] [raw]
Subject: [PATCH v4 09/24] x86/resctrl: Pass the schema to resctrl filesystem functions

Once the CDP resources are merged, there will be two struct
resctrl_schema for one struct rdt_resource. CDP becomes a type of
configuration that belongs to the schema.

Helpers like rdtgroup_cbm_overlaps() need access to the schema to
query the configuration (or configurations) based on schema properties.

Change these functions to take a struct resctrl_schema instead of the
struct rdt_resource. All the modified functions are part of the filesystem
code that will move to /fs/resctrl once it is possible to support a
second architecture.

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,

Changes since v1:
* split from a larger patch
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 23 +++++++++++++----------
arch/x86/kernel/cpu/resctrl/internal.h | 6 +++---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 19 +++++++++++--------
include/linux/resctrl.h | 3 ++-
4 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index fcd6ca73ac41..dbbdd9f275e9 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -57,9 +57,10 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
return true;
}

-int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d)
{
+ struct rdt_resource *r = s->res;
unsigned long bw_val;

if (d->have_new_ctrl) {
@@ -125,10 +126,11 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d)
{
struct rdtgroup *rdtgrp = data->rdtgrp;
+ struct rdt_resource *r = s->res;
u32 cbm_val;

if (d->have_new_ctrl) {
@@ -160,12 +162,12 @@ int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
* The CBM may not overlap with the CBM of another closid if
* either is exclusive.
*/
- if (rdtgroup_cbm_overlaps(r, d, cbm_val, rdtgrp->closid, true)) {
+ if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, true)) {
rdt_last_cmd_puts("Overlaps with exclusive group\n");
return -EINVAL;
}

- if (rdtgroup_cbm_overlaps(r, d, cbm_val, rdtgrp->closid, false)) {
+ if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, false)) {
if (rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
rdt_last_cmd_puts("Overlaps with other group\n");
@@ -185,9 +187,10 @@ int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
* separated by ";". The "id" is in decimal, and must match one of
* the "id"s for this resource.
*/
-static int parse_line(char *line, struct rdt_resource *r,
+static int parse_line(char *line, struct resctrl_schema *s,
struct rdtgroup *rdtgrp)
{
+ struct rdt_resource *r = s->res;
struct rdt_parse_data data;
char *dom = NULL, *id;
struct rdt_domain *d;
@@ -213,7 +216,7 @@ static int parse_line(char *line, struct rdt_resource *r,
if (d->id == dom_id) {
data.buf = dom;
data.rdtgrp = rdtgrp;
- if (r->parse_ctrlval(&data, r, d))
+ if (r->parse_ctrlval(&data, s, d))
return -EINVAL;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
/*
@@ -292,7 +295,7 @@ static int rdtgroup_parse_resource(char *resname, char *tok,
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
if (!strcmp(resname, r->name) && rdtgrp->closid < s->num_closid)
- return parse_line(tok, r, rdtgrp);
+ return parse_line(tok, s, rdtgrp);
}
rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname);
return -EINVAL;
@@ -377,8 +380,9 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}

-static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
+static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
{
+ struct rdt_resource *r = schema->res;
struct rdt_hw_domain *hw_dom;
struct rdt_domain *dom;
bool sep = false;
@@ -429,9 +433,8 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
} else {
closid = rdtgrp->closid;
list_for_each_entry(schema, &resctrl_schema_all, list) {
- r = schema->res;
if (closid < schema->num_closid)
- show_doms(s, r, closid);
+ show_doms(s, schema, closid);
}
}
} else {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 5e6bfe27513c..07aeff6c1af9 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -396,9 +396,9 @@ static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r
return container_of(r, struct rdt_hw_resource, resctrl);
}

-int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d);
-int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d);

extern struct mutex rdtgroup_mutex;
@@ -501,7 +501,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off);
int rdtgroup_schemata_show(struct kernfs_open_file *of,
struct seq_file *s, void *v);
-bool rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d,
+bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d,
unsigned long cbm, int closid, bool exclusive);
unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_domain *d,
unsigned long cbm);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 39ec0344963e..301a8acfaaf3 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1221,7 +1221,7 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d

/**
* rdtgroup_cbm_overlaps - Does CBM overlap with other use of hardware
- * @r: Resource to which domain instance @d belongs.
+ * @s: Schema for the resource to which domain instance @d belongs.
* @d: The domain instance for which @closid is being tested.
* @cbm: Capacity bitmask being tested.
* @closid: Intended closid for @cbm.
@@ -1239,9 +1239,10 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d
*
* Return: true if CBM overlap detected, false if there is no overlap
*/
-bool rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d,
+bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d,
unsigned long cbm, int closid, bool exclusive)
{
+ struct rdt_resource *r = s->res;
struct rdt_resource *r_cdp;
struct rdt_domain *d_cdp;

@@ -1282,7 +1283,8 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
has_cache = true;
list_for_each_entry(d, &r->domains, list) {
hw_dom = resctrl_to_arch_dom(d);
- if (rdtgroup_cbm_overlaps(r, d, hw_dom->ctrl_val[closid],
+ if (rdtgroup_cbm_overlaps(s, d,
+ hw_dom->ctrl_val[closid],
rdtgrp->closid, false)) {
rdt_last_cmd_puts("Schemata overlaps\n");
return false;
@@ -2712,11 +2714,12 @@ static u32 cbm_ensure_valid(u32 _val, struct rdt_resource *r)
* Set the RDT domain up to start off with all usable allocations. That is,
* all shareable and unused bits. All-zero CBM is invalid.
*/
-static int __init_one_rdt_domain(struct rdt_domain *d, struct rdt_resource *r,
+static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
u32 closid)
{
struct rdt_resource *r_cdp = NULL;
struct rdt_domain *d_cdp = NULL;
+ struct rdt_resource *r = s->res;
u32 used_b = 0, unused_b = 0;
unsigned long tmp_cbm;
enum rdtgrp_mode mode;
@@ -2786,13 +2789,13 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct rdt_resource *r,
* If there are no more shareable bits available on any domain then
* the entire allocation will fail.
*/
-static int rdtgroup_init_cat(struct rdt_resource *r, u32 closid)
+static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid)
{
struct rdt_domain *d;
int ret;

- list_for_each_entry(d, &r->domains, list) {
- ret = __init_one_rdt_domain(d, r, closid);
+ list_for_each_entry(d, &s->res->domains, list) {
+ ret = __init_one_rdt_domain(d, s, closid);
if (ret < 0)
return ret;
}
@@ -2823,7 +2826,7 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
if (r->rid == RDT_RESOURCE_MBA) {
rdtgroup_init_mba(r);
} else {
- ret = rdtgroup_init_cat(r, rdtgrp->closid);
+ ret = rdtgroup_init_cat(s, rdtgrp->closid);
if (ret < 0)
return ret;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 78c6dafd42c1..6c9e9692eaba 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -117,6 +117,7 @@ struct resctrl_membw {
};

struct rdt_parse_data;
+struct resctrl_schema;

/**
* struct rdt_resource - attributes of a resctrl resource
@@ -155,7 +156,7 @@ struct rdt_resource {
u32 default_ctrl;
const char *format_str;
int (*parse_ctrlval)(struct rdt_parse_data *data,
- struct rdt_resource *r,
+ struct resctrl_schema *s,
struct rdt_domain *d);
struct list_head evt_list;
unsigned long fflags;
--
2.30.2

2021-06-14 20:12:44

by James Morse

[permalink] [raw]
Subject: [PATCH v4 10/24] x86/resctrl: Swizzle rdt_resource and resctrl_schema in pseudo_lock_region

struct pseudo_lock_region points to the rdt_resource.

Once the resources are merged, this won't be unique. The resource name
is moving into the schema, so that the filesystem portions of resctrl can
generate it.

Swap pseudo_lock_region's rdt_resource pointer for a schema pointer.

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 4 ++--
arch/x86/kernel/cpu/resctrl/internal.h | 6 +++---
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 8 ++++----
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ++--
4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index dbbdd9f275e9..4428ec499037 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -227,7 +227,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
* the required initialization for single
* region and return.
*/
- rdtgrp->plr->r = r;
+ rdtgrp->plr->s = s;
rdtgrp->plr->d = d;
rdtgrp->plr->cbm = d->new_ctrl;
d->plr = rdtgrp->plr;
@@ -426,7 +426,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
ret = -ENODEV;
} else {
seq_printf(s, "%s:%d=%x\n",
- rdtgrp->plr->r->name,
+ rdtgrp->plr->s->res->name,
rdtgrp->plr->d->id,
rdtgrp->plr->cbm);
}
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 07aeff6c1af9..60155c20bd33 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -158,8 +158,8 @@ struct mongroup {

/**
* struct pseudo_lock_region - pseudo-lock region information
- * @r: RDT resource to which this pseudo-locked region
- * belongs
+ * @s: Resctrl schema for the resource to which this
+ * pseudo-locked region belongs
* @d: RDT domain to which this pseudo-locked region
* belongs
* @cbm: bitmask of the pseudo-locked region
@@ -179,7 +179,7 @@ struct mongroup {
* @pm_reqs: Power management QoS requests related to this region
*/
struct pseudo_lock_region {
- struct rdt_resource *r;
+ struct resctrl_schema *s;
struct rdt_domain *d;
u32 cbm;
wait_queue_head_t lock_thread_wq;
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index f079561409ab..2f99210a9d69 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -246,7 +246,7 @@ static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
plr->line_size = 0;
kfree(plr->kmem);
plr->kmem = NULL;
- plr->r = NULL;
+ plr->s = NULL;
if (plr->d)
plr->d->plr = NULL;
plr->d = NULL;
@@ -290,10 +290,10 @@ static int pseudo_lock_region_init(struct pseudo_lock_region *plr)

ci = get_cpu_cacheinfo(plr->cpu);

- plr->size = rdtgroup_cbm_to_size(plr->r, plr->d, plr->cbm);
+ plr->size = rdtgroup_cbm_to_size(plr->s->res, plr->d, plr->cbm);

for (i = 0; i < ci->num_leaves; i++) {
- if (ci->info_list[i].level == plr->r->cache_level) {
+ if (ci->info_list[i].level == plr->s->res->cache_level) {
plr->line_size = ci->info_list[i].coherency_line_size;
return 0;
}
@@ -796,7 +796,7 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm
unsigned long cbm_b;

if (d->plr) {
- cbm_len = d->plr->r->cache.cbm_len;
+ cbm_len = d->plr->s->res->cache.cbm_len;
cbm_b = d->plr->cbm;
if (bitmap_intersects(&cbm, &cbm_b, cbm_len))
return true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 301a8acfaaf3..eaad9c8e6c04 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1439,8 +1439,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
ret = -ENODEV;
} else {
seq_printf(s, "%*s:", max_name_width,
- rdtgrp->plr->r->name);
- size = rdtgroup_cbm_to_size(rdtgrp->plr->r,
+ rdtgrp->plr->s->res->name);
+ size = rdtgroup_cbm_to_size(rdtgrp->plr->s->res,
rdtgrp->plr->d,
rdtgrp->plr->cbm);
seq_printf(s, "%d=%u\n", rdtgrp->plr->d->id, size);
--
2.30.2

2021-06-14 20:13:06

by James Morse

[permalink] [raw]
Subject: [PATCH v4 13/24] x86/resctrl: Allow different CODE/DATA configurations to be staged

Before the CDP resources can be merged, struct rdt_domain will
need an array of struct resctrl_staged_config, one per type of
configuration.

Use the type as an index into the array to ensure that a schema
configuration string can't specify the same domain twice. This
will allow two schemata to apply configuration changes to one resource.
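
To illustrate the staging rules, here is a standalone sketch with
miniature stand-in types (not the kernel definitions): each schema
stages into its own slot of the domain's array, so the CODE and DATA
schemata no longer collide, while naming the same domain twice within
one schema string is still rejected.

#include <stdbool.h>
#include <stdio.h>

enum resctrl_conf_type { CDP_NONE, CDP_CODE, CDP_DATA, CDP_NUM_TYPES };

struct resctrl_staged_config {
	unsigned int new_ctrl;
	bool have_new_ctrl;
};

struct rdt_domain {
	struct resctrl_staged_config staged_config[CDP_NUM_TYPES];
};

static int stage(struct rdt_domain *d, enum resctrl_conf_type t,
		 unsigned int val)
{
	struct resctrl_staged_config *cfg = &d->staged_config[t];

	if (cfg->have_new_ctrl)
		return -1;	/* same domain named twice in one schema */
	cfg->new_ctrl = val;
	cfg->have_new_ctrl = true;
	return 0;
}

int main(void)
{
	struct rdt_domain d = { 0 };

	printf("%d\n", stage(&d, CDP_CODE, 0xf));	/* 0: staged */
	printf("%d\n", stage(&d, CDP_DATA, 0x3));	/* 0: other type, ok */
	printf("%d\n", stage(&d, CDP_CODE, 0x1));	/* -1: duplicate */
	return 0;
}

Note the kernel defines CDP_NUM_TYPES separately as (CDP_DATA + 1); the
enum above folds it in only to keep the sketch short.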

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Add an empty line

Changes since v2:
* Shuffled commit message,

Changes since v1:
* Renamed max enum value CDP_NUM_TYPES
* Whitespace and parenthesis
* Missing word in the commit message
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 20 ++++++++++++++------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 +++--
include/linux/resctrl.h | 4 +++-
3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index a47a792fdcb3..c46300bce210 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -60,10 +60,11 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d)
{
+ struct resctrl_staged_config *cfg;
struct rdt_resource *r = s->res;
unsigned long bw_val;
- struct resctrl_staged_config *cfg = &d->staged_config;

+ cfg = &d->staged_config[s->conf_type];
if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
return -EINVAL;
@@ -130,11 +131,12 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d)
{
- struct resctrl_staged_config *cfg = &d->staged_config;
struct rdtgroup *rdtgrp = data->rdtgrp;
+ struct resctrl_staged_config *cfg;
struct rdt_resource *r = s->res;
u32 cbm_val;

+ cfg = &d->staged_config[s->conf_type];
if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
return -EINVAL;
@@ -192,6 +194,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
static int parse_line(char *line, struct resctrl_schema *s,
struct rdtgroup *rdtgrp)
{
+ enum resctrl_conf_type t = s->conf_type;
struct resctrl_staged_config *cfg;
struct rdt_resource *r = s->res;
struct rdt_parse_data data;
@@ -222,7 +225,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
if (r->parse_ctrlval(&data, s, d))
return -EINVAL;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
- cfg = &d->staged_config;
+ cfg = &d->staged_config[t];
/*
* In pseudo-locking setup mode and just
* parsed a valid CBM that should be
@@ -261,6 +264,7 @@ int update_domains(struct rdt_resource *r, int closid)
struct resctrl_staged_config *cfg;
struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
+ enum resctrl_conf_type t;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
bool mba_sc;
@@ -276,9 +280,13 @@ int update_domains(struct rdt_resource *r, int closid)
mba_sc = is_mba_sc(r);
list_for_each_entry(d, &r->domains, list) {
hw_dom = resctrl_to_arch_dom(d);
- cfg = &hw_dom->resctrl.staged_config;
- if (cfg->have_new_ctrl)
+ for (t = 0; t < CDP_NUM_TYPES; t++) {
+ cfg = &hw_dom->resctrl.staged_config[t];
+ if (!cfg->have_new_ctrl)
+ continue;
+
apply_config(hw_dom, cfg, closid, cpu_mask, mba_sc);
+ }
}

/*
@@ -350,7 +358,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

list_for_each_entry(s, &resctrl_schema_all, list) {
list_for_each_entry(dom, &s->res->domains, list)
- memset(&dom->staged_config, 0, sizeof(dom->staged_config));
+ memset(dom->staged_config, 0, sizeof(dom->staged_config));
}

while ((tok = strsep(&buf, "\n")) != NULL) {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 4a850d6f77c0..d03cb388916c 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2728,6 +2728,7 @@ static u32 cbm_ensure_valid(u32 _val, struct rdt_resource *r)
static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
u32 closid)
{
+ enum resctrl_conf_type t = s->conf_type;
struct rdt_resource *r_cdp = NULL;
struct resctrl_staged_config *cfg;
struct rdt_domain *d_cdp = NULL;
@@ -2739,7 +2740,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
int i;

rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp);
- cfg = &d->staged_config;
+ cfg = &d->staged_config[t];
cfg->have_new_ctrl = false;
cfg->new_ctrl = r->cache.shareable_bits;
used_b = r->cache.shareable_bits;
@@ -2823,7 +2824,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r)
struct rdt_domain *d;

list_for_each_entry(d, &r->domains, list) {
- cfg = &d->staged_config;
+ cfg = &d->staged_config[CDP_NONE];
cfg->new_ctrl = is_mba_sc(r) ? MBA_MAX_MBPS : r->default_ctrl;
cfg->have_new_ctrl = true;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8fad1af8f15e..47f245a0e092 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -23,6 +23,8 @@ enum resctrl_conf_type {
CDP_DATA,
};

+#define CDP_NUM_TYPES (CDP_DATA + 1)
+
/**
* struct resctrl_staged_config - parsed configuration to be applied
* @new_ctrl: new ctrl value to be loaded
@@ -60,7 +62,7 @@ struct rdt_domain {
int mbm_work_cpu;
int cqm_work_cpu;
struct pseudo_lock_region *plr;
- struct resctrl_staged_config staged_config;
+ struct resctrl_staged_config staged_config[CDP_NUM_TYPES];
};

/**
--
2.30.2

2021-06-14 20:13:29

by James Morse

[permalink] [raw]
Subject: [PATCH v4 16/24] x86/resctrl: Add a helper to read/set the CDP configuration

Whether CDP is enabled for a hardware resource like the L3 cache can
be found by inspecting the alloc_enabled flags of the L3CODE/L3DATA
struct rdt_hw_resources, even if they aren't in use.

Once these resources are merged, the flags can't be compared. Whether
CDP is enabled needs to be tracked explicitly. If another architecture
is emulating CDP, the behaviour may not be per-resource.

Add cdp_capable and cdp_enabled to struct rdt_hw_resource and
add resctrl_arch_set_cdp_enabled() to let resctrl enable or disable
CDP on a resource. resctrl_arch_get_cdp_enabled() lets it read the
current state.
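
A standalone sketch of how callers see the pair of helpers (miniature
stand-in types only; the real helpers also rewrite the QOS_CFG MSRs via
cdp_enable()/cdp_disable(), which is omitted here):

#include <stdbool.h>
#include <stdio.h>

enum resctrl_res_level { RDT_RESOURCE_L3, RDT_RESOURCE_L2, RDT_NUM_RESOURCES };

struct rdt_hw_resource { bool cdp_capable; bool cdp_enabled; };

static struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
	[RDT_RESOURCE_L3] = { .cdp_capable = true },
	[RDT_RESOURCE_L2] = { .cdp_capable = false },
};

static bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
{
	return rdt_resources_all[l].cdp_enabled;
}

static int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable)
{
	if (!rdt_resources_all[l].cdp_capable)
		return -1;	/* -EINVAL in the kernel */

	rdt_resources_all[l].cdp_enabled = enable;
	return 0;
}

int main(void)
{
	printf("%d\n", resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, true));
	printf("%d\n", resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3));
	printf("%d\n", resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, true));
	return 0;
}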

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Fixed a spelling mistake.

Changes since v2:
* Merged rdt_domain_reconfigure_cdp() changes here.
* Shuffled commit message,

Changes since v1:
* Added prototype for resctrl_arch_set_cdp_enabled()
* s/Currently/Previously/
* rdt_get_cdp_config() accesses the array directly as most of the code
here disappears once the resources are merged.

It isn't practical for MPAM to hide the CDP emulation by applying the same
'L3' configuration to the two closids that are in use, as this would
then consume two monitors, which are likely to be in short supply.
---
arch/x86/kernel/cpu/resctrl/core.c | 4 ++
arch/x86/kernel/cpu/resctrl/internal.h | 13 +++-
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 4 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 75 +++++++++++++----------
4 files changed, 62 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index be974517ba0d..a2cbd2832d73 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -374,6 +374,10 @@ static void rdt_get_cdp_config(int level, int type)
* "cdp" during resctrl file system mount time.
*/
r->alloc_enabled = false;
+ rdt_resources_all[level].cdp_enabled = false;
+ rdt_resources_all[type].cdp_enabled = false;
+ rdt_resources_all[level].cdp_capable = true;
+ rdt_resources_all[type].cdp_capable = true;
}

static void rdt_get_cdp_l3_config(void)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 44f2b8f7b5d7..af230135ad7c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -375,6 +375,8 @@ struct rdt_parse_data {
* @msr_update: Function pointer to update QOS MSRs
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
* @mbm_width: Monitor width, to detect and correct for overflow.
+ * @cdp_enabled: CDP state of this resource
+ * @cdp_capable: Is the CDP feature available on this resource
*
* Members of this structure are either private to the architecture
* e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
@@ -389,6 +391,8 @@ struct rdt_hw_resource {
struct rdt_resource *r);
unsigned int mon_scale;
unsigned int mbm_width;
+ bool cdp_enabled;
+ bool cdp_capable;
};

static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
@@ -409,7 +413,7 @@ DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);

extern struct dentry *debugfs_resctrl;

-enum {
+enum resctrl_res_level {
RDT_RESOURCE_L3,
RDT_RESOURCE_L3DATA,
RDT_RESOURCE_L3CODE,
@@ -430,6 +434,13 @@ static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
return &hw_res->resctrl;
}

+static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
+{
+ return rdt_resources_all[l].cdp_enabled;
+}
+
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
+
/*
* To return the common struct rdt_resource, which is contained in struct
* rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 2f99210a9d69..f322f5c78a41 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -684,8 +684,8 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
* resource, the portion of cache used by it should be made
* unavailable to all future allocations from both resources.
*/
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled ||
- rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled) {
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3) ||
+ resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2)) {
rdt_last_cmd_puts("CDP enabled\n");
return -EINVAL;
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 57e4d793f576..6b01902ac037 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1933,14 +1933,16 @@ static int set_cache_qos_cfg(int level, bool enable)
/* Restore the qos cfg state when a domain comes online */
void rdt_domain_reconfigure_cdp(struct rdt_resource *r)
{
- if (!r->alloc_capable)
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+ if (!hw_res->cdp_capable)
return;

if (r == &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl)
- l2_qos_cfg_update(&r->alloc_enabled);
+ l2_qos_cfg_update(&hw_res->cdp_enabled);

if (r == &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl)
- l3_qos_cfg_update(&r->alloc_enabled);
+ l3_qos_cfg_update(&hw_res->cdp_enabled);
}

/*
@@ -1984,51 +1986,62 @@ static int cdp_enable(int level, int data_type, int code_type)
r_l->alloc_enabled = false;
r_ldata->alloc_enabled = true;
r_lcode->alloc_enabled = true;
+ rdt_resources_all[level].cdp_enabled = true;
+ rdt_resources_all[data_type].cdp_enabled = true;
+ rdt_resources_all[code_type].cdp_enabled = true;
}
return ret;
}

-static int cdpl3_enable(void)
-{
- return cdp_enable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA,
- RDT_RESOURCE_L3CODE);
-}
-
-static int cdpl2_enable(void)
-{
- return cdp_enable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA,
- RDT_RESOURCE_L2CODE);
-}
-
static void cdp_disable(int level, int data_type, int code_type)
{
- struct rdt_resource *r = &rdt_resources_all[level].resctrl;
+ struct rdt_hw_resource *r_hw = &rdt_resources_all[level];
+ struct rdt_resource *r = &r_hw->resctrl;

r->alloc_enabled = r->alloc_capable;

- if (rdt_resources_all[data_type].resctrl.alloc_enabled) {
+ if (r_hw->cdp_enabled) {
rdt_resources_all[data_type].resctrl.alloc_enabled = false;
rdt_resources_all[code_type].resctrl.alloc_enabled = false;
set_cache_qos_cfg(level, false);
+ r_hw->cdp_enabled = false;
+ rdt_resources_all[data_type].cdp_enabled = false;
+ rdt_resources_all[code_type].cdp_enabled = false;
}
}

-static void cdpl3_disable(void)
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable)
{
- cdp_disable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA, RDT_RESOURCE_L3CODE);
-}
+ struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
+ enum resctrl_res_level code_type, data_type;

-static void cdpl2_disable(void)
-{
- cdp_disable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA, RDT_RESOURCE_L2CODE);
+ if (!hw_res->cdp_capable)
+ return -EINVAL;
+
+ if (l == RDT_RESOURCE_L3) {
+ code_type = RDT_RESOURCE_L3CODE;
+ data_type = RDT_RESOURCE_L3DATA;
+ } else if (l == RDT_RESOURCE_L2) {
+ code_type = RDT_RESOURCE_L2CODE;
+ data_type = RDT_RESOURCE_L2DATA;
+ } else {
+ return -EINVAL;
+ }
+
+ if (enable)
+ return cdp_enable(l, data_type, code_type);
+
+ cdp_disable(l, data_type, code_type);
+
+ return 0;
}

static void cdp_disable_all(void)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled)
- cdpl3_disable();
- if (rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled)
- cdpl2_disable();
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3))
+ resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, false);
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
+ resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, false);
}

/*
@@ -2106,10 +2119,10 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
int ret = 0;

if (ctx->enable_cdpl2)
- ret = cdpl2_enable();
+ ret = resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, true);

if (!ret && ctx->enable_cdpl3)
- ret = cdpl3_enable();
+ ret = resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, true);

if (!ret && ctx->enable_mba_mbps)
ret = set_mba_sc(true);
@@ -3208,10 +3221,10 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)

static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled)
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3))
seq_puts(seq, ",cdp");

- if (rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled)
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
seq_puts(seq, ",cdpl2");

if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].resctrl))
--
2.30.2

2021-06-14 20:13:30

by James Morse

[permalink] [raw]
Subject: [PATCH v4 17/24] x86/resctrl: Pass configuration type to resctrl_arch_get_config()

The ctrl_val[] array for a struct rdt_hw_resource only holds
configurations of one type. The type is implicit.

Once the CDP resources are merged, the ctrl_val[] array will hold
all the configurations for the hardware resource. When a particular
type of configuration is needed, it must be specified explicitly.

Pass the expected type from the schema into resctrl_arch_get_config().
Nothing uses this yet, but once a single ctrl_val[] array is used
for the three struct rdt_hw_resources that share hardware, the type
will be used to return the correct configuration value from the
shared array.
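
As a standalone sketch of why the type matters once the array is shared
(the even/odd layout below is only an illustration; the actual index
mapping is defined by the later patch that merges the ctrl_val[] arrays):

#include <stdio.h>

enum resctrl_conf_type { CDP_NONE, CDP_CODE, CDP_DATA };

static unsigned int get_config_index(unsigned int closid,
				     enum resctrl_conf_type type)
{
	switch (type) {
	case CDP_CODE:
		return closid * 2 + 1;	/* odd entries */
	case CDP_DATA:
		return closid * 2;	/* even entries */
	case CDP_NONE:
	default:
		return closid;		/* CDP off: one entry per closid */
	}
}

int main(void)
{
	/* closid 3 with CDP enabled occupies two slots of the shared array */
	printf("code=%u data=%u\n", get_config_index(3, CDP_CODE),
	       get_config_index(3, CDP_DATA));
	return 0;
}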

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 5 ++--
arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 35 +++++++++++++++--------
include/linux/resctrl.h | 3 +-
4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 1cd54402b02a..72a8cf52de47 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -402,7 +402,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
}

void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
- u32 closid, u32 *value)
+ u32 closid, enum resctrl_conf_type type, u32 *value)
{
struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);

@@ -424,7 +424,8 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
if (sep)
seq_puts(s, ";");

- resctrl_arch_get_config(r, dom, closid, &ctrl_val);
+ resctrl_arch_get_config(r, dom, closid, schema->conf_type,
+ &ctrl_val);
seq_printf(s, r->format_str, dom->id, max_data_width,
ctrl_val);
sep = true;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 647c0be76ea6..3f00dd54fb03 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -442,7 +442,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
hw_dom_mba = resctrl_to_arch_dom(dom_mba);

cur_bw = pmbm_data->prev_bw;
- resctrl_arch_get_config(r_mba, dom_mba, closid, &user_bw);
+ resctrl_arch_get_config(r_mba, dom_mba, closid, CDP_NONE, &user_bw);
delta_bw = pmbm_data->delta_bw;
/*
* resctrl_arch_get_config() chooses the mbps/ctrl value to return
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6b01902ac037..740d2d0ff4df 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -923,7 +923,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
for (i = 0; i < closids_supported(); i++) {
if (!closid_allocated(i))
continue;
- resctrl_arch_get_config(r, dom, i, &ctrl_val);
+ resctrl_arch_get_config(r, dom, i, s->conf_type,
+ &ctrl_val);
mode = rdtgroup_mode_by_closid(i);
switch (mode) {
case RDT_MODE_SHAREABLE:
@@ -1099,6 +1100,7 @@ static int rdtgroup_mode_show(struct kernfs_open_file *of,
* Used to return the result.
* @d_cdp: RDT domain that shares hardware with @d (RDT domain peer)
* Used to return the result.
+ * @peer_type: The CDP configuration type of the peer resource.
*
* RDT resources are managed independently and by extension the RDT domains
* (RDT resource instances) are managed independently also. The Code and
@@ -1116,7 +1118,8 @@ static int rdtgroup_mode_show(struct kernfs_open_file *of,
*/
static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,
struct rdt_resource **r_cdp,
- struct rdt_domain **d_cdp)
+ struct rdt_domain **d_cdp,
+ enum resctrl_conf_type *peer_type)
{
struct rdt_resource *_r_cdp = NULL;
struct rdt_domain *_d_cdp = NULL;
@@ -1125,15 +1128,19 @@ static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,
switch (r->rid) {
case RDT_RESOURCE_L3DATA:
_r_cdp = &rdt_resources_all[RDT_RESOURCE_L3CODE].resctrl;
+ *peer_type = CDP_CODE;
break;
case RDT_RESOURCE_L3CODE:
_r_cdp = &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl;
+ *peer_type = CDP_DATA;
break;
case RDT_RESOURCE_L2DATA:
_r_cdp = &rdt_resources_all[RDT_RESOURCE_L2CODE].resctrl;
+ *peer_type = CDP_CODE;
break;
case RDT_RESOURCE_L2CODE:
_r_cdp = &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl;
+ *peer_type = CDP_DATA;
break;
default:
ret = -ENOENT;
@@ -1184,7 +1191,8 @@ static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,
* Return: false if CBM does not overlap, true if it does.
*/
static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d,
- unsigned long cbm, int closid, bool exclusive)
+ unsigned long cbm, int closid,
+ enum resctrl_conf_type type, bool exclusive)
{
enum rdtgrp_mode mode;
unsigned long ctrl_b;
@@ -1199,7 +1207,7 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d

/* Check for overlap with other resource groups */
for (i = 0; i < closids_supported(); i++) {
- resctrl_arch_get_config(r, d, i, (u32 *)&ctrl_b);
+ resctrl_arch_get_config(r, d, i, type, (u32 *)&ctrl_b);
mode = rdtgroup_mode_by_closid(i);
if (closid_allocated(i) && i != closid &&
mode != RDT_MODE_PSEUDO_LOCKSETUP) {
@@ -1240,17 +1248,19 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d
bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d,
unsigned long cbm, int closid, bool exclusive)
{
+ enum resctrl_conf_type peer_type;
struct rdt_resource *r = s->res;
struct rdt_resource *r_cdp;
struct rdt_domain *d_cdp;

- if (__rdtgroup_cbm_overlaps(r, d, cbm, closid, exclusive))
+ if (__rdtgroup_cbm_overlaps(r, d, cbm, closid, s->conf_type,
+ exclusive))
return true;

- if (rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp) < 0)
+ if (rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp, &peer_type) < 0)
return false;

- return __rdtgroup_cbm_overlaps(r_cdp, d_cdp, cbm, closid, exclusive);
+ return __rdtgroup_cbm_overlaps(r_cdp, d_cdp, cbm, closid, peer_type, exclusive);
}

/**
@@ -1280,7 +1290,7 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
continue;
has_cache = true;
list_for_each_entry(d, &r->domains, list) {
- resctrl_arch_get_config(r, d, closid, &ctrl);
+ resctrl_arch_get_config(r, d, closid, s->conf_type, &ctrl);
if (rdtgroup_cbm_overlaps(s, d, ctrl, closid, false)) {
rdt_last_cmd_puts("Schemata overlaps\n");
return false;
@@ -1454,7 +1464,7 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
size = 0;
} else {
resctrl_arch_get_config(r, d, rdtgrp->closid,
- &ctrl);
+ schema->conf_type, &ctrl);
if (r->rid == RDT_RESOURCE_MBA)
size = ctrl;
else
@@ -2737,6 +2747,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
enum resctrl_conf_type t = s->conf_type;
struct rdt_resource *r_cdp = NULL;
struct resctrl_staged_config *cfg;
+ enum resctrl_conf_type peer_type;
struct rdt_domain *d_cdp = NULL;
struct rdt_resource *r = s->res;
u32 used_b = 0, unused_b = 0;
@@ -2745,7 +2756,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
u32 peer_ctl, ctrl_val;
int i;

- rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp);
+ rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp, &peer_type);
cfg = &d->staged_config[t];
cfg->have_new_ctrl = false;
cfg->new_ctrl = r->cache.shareable_bits;
@@ -2766,10 +2777,10 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
* with an exclusive group.
*/
if (d_cdp)
- resctrl_arch_get_config(r_cdp, d_cdp, i, &peer_ctl);
+ resctrl_arch_get_config(r_cdp, d_cdp, i, peer_type, &peer_ctl);
else
peer_ctl = 0;
- resctrl_arch_get_config(r, d, i, &ctrl_val);
+ resctrl_arch_get_config(r, d, i, s->conf_type, &ctrl_val);
used_b |= ctrl_val | peer_ctl;
if (mode == RDT_MODE_SHAREABLE)
cfg->new_ctrl |= ctrl_val | peer_ctl;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index d8c9080f0237..93bd8d3bbeb6 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -197,6 +197,7 @@ struct resctrl_schema {
u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
- u32 closid, u32 *value);
+ u32 closid, enum resctrl_conf_type type,
+ u32 *value);

#endif /* _RESCTRL_H */
--
2.30.2

2021-06-14 20:13:35

by James Morse

[permalink] [raw]
Subject: [PATCH v4 18/24] x86/resctrl: Make ctrlval arrays the same size

The CODE and DATA resources report a num_closid that is half the
actual size supported by the hardware. This behaviour is visible
to user-space when CDP is enabled.
The CODE and DATA resources have their own ctrlval arrays, which are half
the size of the underlying hardware because num_closid was already
adjusted. One holds the odd configuration values, the other the even.

Before the CDP resources can be merged, the 'half the closids'
behaviour needs to be implemented by schemata_list_create(), but
this causes the ctrl_val[] array to be full-sized.

Remove the logic from the architecture specific rdt_get_cdp_config()
setup, and add it to schemata_list_create(). Functions that
take num_closid directly from struct rdt_hw_resource also
have to halve num_closid, as only the lower half of each array is
in use. domain_setup_ctrlval() and reset_all_ctrls() both copy
struct rdt_hw_resource's num_closid to a struct msr_param. Correct
the value here. This is temporary, as a subsequent patch will merge
all three ctrl_val[] arrays such that when CDP is in use, the
CODE/DATA layout in the array matches the hardware. reset_all_ctrls()'s
loop over the whole of ctrl_val[] is not touched as this is harmless,
and will be required as-is once the resources are merged.
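
A standalone sketch of the bookkeeping being moved around (example
numbers only): the hardware count stays full-sized for the ctrl_val[]
allocation, while the value that reaches user-space via the schema, and
the number of MSRs written, are halved when CDP is enabled.

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
	unsigned int hw_num_closid = 16;	/* example value from CPUID */
	bool cdp_enabled = true;

	/* the array is allocated full-size ... */
	unsigned int array_entries = hw_num_closid;

	/* ... but the schema and msr_param.high see half of it */
	unsigned int schema_num_closid = cdp_enabled ?
					 hw_num_closid / 2 : hw_num_closid;
	unsigned int msr_high = schema_num_closid;

	printf("ctrl_val entries=%u, user-visible closids=%u, msrs written=%u\n",
	       array_entries, schema_num_closid, msr_high);
	return 0;
}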

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,
---
arch/x86/kernel/cpu/resctrl/core.c | 10 +++++++++-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 +++++++++
2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a2cbd2832d73..0d18227a366b 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -363,7 +363,7 @@ static void rdt_get_cdp_config(int level, int type)
struct rdt_resource *r = &rdt_resources_all[type].resctrl;
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

- hw_res->num_closid = hw_res_l->num_closid / 2;
+ hw_res->num_closid = hw_res_l->num_closid;
r->cache.cbm_len = r_l->cache.cbm_len;
r->default_ctrl = r_l->default_ctrl;
r->cache.shareable_bits = r_l->cache.shareable_bits;
@@ -549,6 +549,14 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)

m.low = 0;
m.high = hw_res->num_closid;
+
+ /*
+ * temporary: the array is full-size, but cat_wrmsr() still re-maps
+ * the index.
+ */
+ if (hw_res->conf_type != CDP_NONE)
+ m.high /= 2;
+
hw_res->msr_update(d, &m, r);
return 0;
}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 740d2d0ff4df..e8006e332d1a 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2154,6 +2154,8 @@ static int schemata_list_create(void)
s->res = r;
s->conf_type = resctrl_to_arch_res(r)->conf_type;
s->num_closid = resctrl_arch_get_num_closid(r);
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ s->num_closid /= 2;

ret = snprintf(s->name, sizeof(s->name), r->name);
if (ret >= sizeof(s->name)) {
@@ -2366,6 +2368,13 @@ static int reset_all_ctrls(struct rdt_resource *r)
msr_param.low = 0;
msr_param.high = hw_res->num_closid;

+ /*
+ * temporary: the array is full-sized, but cat_wrmsr() still re-maps
+ * the index.
+ */
+ if (hw_res->cdp_enabled)
+ msr_param.high /= 2;
+
/*
* Disable resource control for this resource by setting all
* CBMs in all domains to the maximum mask value. Pick one CPU
--
2.30.2

2021-06-14 20:13:43

by James Morse

[permalink] [raw]
Subject: [PATCH v4 04/24] x86/resctrl: Pass the schema in info dir's private pointer

Many of resctrl's per-schema files return a value from struct
rdt_resource, which they take as their 'priv' pointer.

Moving properties that resctrl exposes to user-space into the core
'fs' code (e.g. the name of the schema) means some of the functions
that back the filesystem need the schema struct (where the properties
are moved to), but currently take struct rdt_resource. For example, once
the CDP resources are merged, struct rdt_resource no longer reflects all
the properties of the schema.
For the info dirs that represent a control, the information needed
will be accessed via struct resctrl_schema, as this is how the resource
is being used. For the monitors, it's still struct rdt_resource as the
monitors aren't described as a schema.
This difference means the type of the private pointers varies
between control and monitor info dirs.

Change the 'priv' pointer to point to struct resctrl_schema for
the per-schema files that represent a control. The type can be determined
from the fflags field. If the flags are RF_MON_INFO, it's a struct
rdt_resource. If the flags are RF_CTRL_INFO, it's a struct resctrl_schema.
No entry in res_common_files[] has both flags.
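
A standalone sketch of the resulting convention (miniature stand-in
types and handlers, not the kernel code): each handler knows which kind
of object 'priv' is from the fflags its file was registered with.

#include <stdio.h>

struct rdt_resource { const char *name; };
struct resctrl_schema { struct rdt_resource *res; unsigned int num_closid; };

/* registered with RF_CTRL_INFO: 'priv' is a struct resctrl_schema */
static void show_num_closids(void *priv)
{
	struct resctrl_schema *s = priv;

	printf("%u\n", s->num_closid);
}

/* registered with RF_MON_INFO: 'priv' is a struct rdt_resource */
static void show_mon_resource(void *priv)
{
	struct rdt_resource *r = priv;

	printf("%s\n", r->name);
}

int main(void)
{
	struct rdt_resource l3 = { .name = "L3" };
	struct resctrl_schema s = { .res = &l3, .num_closid = 16 };

	show_num_closids(&s);
	show_mon_resource(&l3);
	return 0;
}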

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,

Changes since v1:
* Added comment above removed for_each_alloc_enabled_rdt_resource() to hint
at symmetry.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 38 +++++++++++++++++---------
1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 14ea1212f476..6ad9df322282 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -848,7 +848,8 @@ static int rdt_last_cmd_status_show(struct kernfs_open_file *of,
static int rdt_num_closids_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;
struct rdt_hw_resource *hw_res;

hw_res = resctrl_to_arch_res(r);
@@ -859,7 +860,8 @@ static int rdt_num_closids_show(struct kernfs_open_file *of,
static int rdt_default_ctrl_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;

seq_printf(seq, "%x\n", r->default_ctrl);
return 0;
@@ -868,7 +870,8 @@ static int rdt_default_ctrl_show(struct kernfs_open_file *of,
static int rdt_min_cbm_bits_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;

seq_printf(seq, "%u\n", r->cache.min_cbm_bits);
return 0;
@@ -877,7 +880,8 @@ static int rdt_min_cbm_bits_show(struct kernfs_open_file *of,
static int rdt_shareable_bits_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;

seq_printf(seq, "%x\n", r->cache.shareable_bits);
return 0;
@@ -900,13 +904,14 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
static int rdt_bit_usage_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
/*
* Use unsigned long even though only 32 bits are used to ensure
* test_bit() is used safely.
*/
unsigned long sw_shareable = 0, hw_shareable = 0;
unsigned long exclusive = 0, pseudo_locked = 0;
+ struct rdt_resource *r = s->res;
struct rdt_domain *dom;
int i, hwb, swb, excl, psl;
enum rdtgrp_mode mode;
@@ -978,7 +983,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
static int rdt_min_bw_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;

seq_printf(seq, "%u\n", r->membw.min_bw);
return 0;
@@ -1009,7 +1015,8 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
static int rdt_bw_gran_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;

seq_printf(seq, "%u\n", r->membw.bw_gran);
return 0;
@@ -1018,7 +1025,8 @@ static int rdt_bw_gran_show(struct kernfs_open_file *of,
static int rdt_delay_linear_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;

seq_printf(seq, "%u\n", r->membw.delay_linear);
return 0;
@@ -1038,7 +1046,8 @@ static int max_threshold_occ_show(struct kernfs_open_file *of,
static int rdt_thread_throttle_mode_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;

if (r->membw.throttle_mode == THREAD_THROTTLE_PER_THREAD)
seq_puts(seq, "per-thread\n");
@@ -1771,14 +1780,14 @@ int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name,
return ret;
}

-static int rdtgroup_mkdir_info_resdir(struct rdt_resource *r, char *name,
+static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
unsigned long fflags)
{
struct kernfs_node *kn_subdir;
int ret;

kn_subdir = kernfs_create_dir(kn_info, name,
- kn_info->mode, r);
+ kn_info->mode, priv);
if (IS_ERR(kn_subdir))
return PTR_ERR(kn_subdir);

@@ -1795,6 +1804,7 @@ static int rdtgroup_mkdir_info_resdir(struct rdt_resource *r, char *name,

static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
{
+ struct resctrl_schema *s;
struct rdt_resource *r;
unsigned long fflags;
char name[32];
@@ -1809,9 +1819,11 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
if (ret)
goto out_destroy;

- for_each_alloc_enabled_rdt_resource(r) {
+ /* loop over enabled controls, these are all alloc_enabled */
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
fflags = r->fflags | RF_CTRL_INFO;
- ret = rdtgroup_mkdir_info_resdir(r, r->name, fflags);
+ ret = rdtgroup_mkdir_info_resdir(s, r->name, fflags);
if (ret)
goto out_destroy;
}
--
2.30.2

2021-06-14 20:13:48

by James Morse

[permalink] [raw]
Subject: [PATCH v4 22/24] x86/resctrl: Remove rdt_cdp_peer_get()

When CDP is enabled, rdt_cdp_peer_get() finds the alternative
CODE/DATA resource and returns the alternative domain. This is used
to determine if bitmaps overlap when there are aliased entries
in the two struct rdt_hw_resources.

Now that the ctrl_val[] used by the CODE/DATA resources is the same,
the search for an alternate resource/domain is not needed.

Replace rdt_cdp_peer_get() with resctrl_peer_type(), which returns
the alternative type. This can be passed to resctrl_arch_get_config()
with the same resource and domain.

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,

Changes since v1:
* Expanded commit message.
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 99 ++++----------------------
1 file changed, 14 insertions(+), 85 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index bc0fd909ee31..cdf21deafd2a 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1092,82 +1092,17 @@ static int rdtgroup_mode_show(struct kernfs_open_file *of,
return 0;
}

-/**
- * rdt_cdp_peer_get - Retrieve CDP peer if it exists
- * @r: RDT resource to which RDT domain @d belongs
- * @d: Cache instance for which a CDP peer is requested
- * @r_cdp: RDT resource that shares hardware with @r (RDT resource peer)
- * Used to return the result.
- * @d_cdp: RDT domain that shares hardware with @d (RDT domain peer)
- * Used to return the result.
- * @peer_type: The CDP configuration type of the peer resource.
- *
- * RDT resources are managed independently and by extension the RDT domains
- * (RDT resource instances) are managed independently also. The Code and
- * Data Prioritization (CDP) RDT resources, while managed independently,
- * could refer to the same underlying hardware. For example,
- * RDT_RESOURCE_L2CODE and RDT_RESOURCE_L2DATA both refer to the L2 cache.
- *
- * When provided with an RDT resource @r and an instance of that RDT
- * resource @d rdt_cdp_peer_get() will return if there is a peer RDT
- * resource and the exact instance that shares the same hardware.
- *
- * Return: 0 if a CDP peer was found, <0 on error or if no CDP peer exists.
- * If a CDP peer was found, @r_cdp will point to the peer RDT resource
- * and @d_cdp will point to the peer RDT domain.
- */
-static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,
- struct rdt_resource **r_cdp,
- struct rdt_domain **d_cdp,
- enum resctrl_conf_type *peer_type)
+static enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type)
{
- struct rdt_resource *_r_cdp = NULL;
- struct rdt_domain *_d_cdp = NULL;
- int ret = 0;
-
- switch (r->rid) {
- case RDT_RESOURCE_L3DATA:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L3CODE].resctrl;
- *peer_type = CDP_CODE;
- break;
- case RDT_RESOURCE_L3CODE:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl;
- *peer_type = CDP_DATA;
- break;
- case RDT_RESOURCE_L2DATA:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L2CODE].resctrl;
- *peer_type = CDP_CODE;
- break;
- case RDT_RESOURCE_L2CODE:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl;
- *peer_type = CDP_DATA;
- break;
+ switch (my_type) {
+ case CDP_CODE:
+ return CDP_DATA;
+ case CDP_DATA:
+ return CDP_CODE;
default:
- ret = -ENOENT;
- goto out;
- }
-
- /*
- * When a new CPU comes online and CDP is enabled then the new
- * RDT domains (if any) associated with both CDP RDT resources
- * are added in the same CPU online routine while the
- * rdtgroup_mutex is held. It should thus not happen for one
- * RDT domain to exist and be associated with its RDT CDP
- * resource but there is no RDT domain associated with the
- * peer RDT CDP resource. Hence the WARN.
- */
- _d_cdp = rdt_find_domain(_r_cdp, d->id, NULL);
- if (WARN_ON(IS_ERR_OR_NULL(_d_cdp))) {
- _r_cdp = NULL;
- _d_cdp = NULL;
- ret = -EINVAL;
+ case CDP_NONE:
+ return CDP_NONE;
}
-
-out:
- *r_cdp = _r_cdp;
- *d_cdp = _d_cdp;
-
- return ret;
}

/**
@@ -1248,19 +1183,16 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d
bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d,
unsigned long cbm, int closid, bool exclusive)
{
- enum resctrl_conf_type peer_type;
+ enum resctrl_conf_type peer_type = resctrl_peer_type(s->conf_type);
struct rdt_resource *r = s->res;
- struct rdt_resource *r_cdp;
- struct rdt_domain *d_cdp;

if (__rdtgroup_cbm_overlaps(r, d, cbm, closid, s->conf_type,
exclusive))
return true;

- if (rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp, &peer_type) < 0)
+ if (!resctrl_arch_get_cdp_enabled(r->rid))
return false;
-
- return __rdtgroup_cbm_overlaps(r_cdp, d_cdp, cbm, closid, peer_type, exclusive);
+ return __rdtgroup_cbm_overlaps(r, d, cbm, closid, peer_type, exclusive);
}

/**
@@ -2746,11 +2678,9 @@ static u32 cbm_ensure_valid(u32 _val, struct rdt_resource *r)
static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
u32 closid)
{
+ enum resctrl_conf_type peer_type = resctrl_peer_type(s->conf_type);
enum resctrl_conf_type t = s->conf_type;
- struct rdt_resource *r_cdp = NULL;
struct resctrl_staged_config *cfg;
- enum resctrl_conf_type peer_type;
- struct rdt_domain *d_cdp = NULL;
struct rdt_resource *r = s->res;
u32 used_b = 0, unused_b = 0;
unsigned long tmp_cbm;
@@ -2758,7 +2688,6 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
u32 peer_ctl, ctrl_val;
int i;

- rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp, &peer_type);
cfg = &d->staged_config[t];
cfg->have_new_ctrl = false;
cfg->new_ctrl = r->cache.shareable_bits;
@@ -2778,8 +2707,8 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
* usage to ensure there is no overlap
* with an exclusive group.
*/
- if (d_cdp)
- resctrl_arch_get_config(r_cdp, d_cdp, i, peer_type, &peer_ctl);
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ resctrl_arch_get_config(r, d, i, peer_type, &peer_ctl);
else
peer_ctl = 0;
resctrl_arch_get_config(r, d, i, s->conf_type, &ctrl_val);
--
2.30.2

2021-06-14 20:14:01

by James Morse

[permalink] [raw]
Subject: [PATCH v4 24/24] x86/resctrl: Merge the CDP resources

resctrl uses struct rdt_resource to describe the available hardware
resources. The domains of the CDP aliases share a single ctrl_val[]
array. The only differences between the struct rdt_hw_resource
aliases are the name and conf_type.

The name from struct rdt_hw_resource is visible to user-space. To
support another architecture, as many user-visible details as possible
should be handled in the filesystem parts of the code, which are common
to all architectures. The name and conf_type go together.

Remove conf_type and the CDP aliases. When CDP is supported and
enabled, schemata_list_create() can create two schemata from the
single resource, appending the CODE/DATA suffix to the schema
name itself.
This allows alloc_ctrlval_array() and the complications around free()ing
the ctrl_val arrays to be removed.
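
A standalone sketch of the name generation this moves into the
filesystem code (illustrative only; the real schemata_list_add() is in
the diff below): one resource yields either one schema or a CODE/DATA
pair, depending on the CDP state.

#include <stdbool.h>
#include <stdio.h>

enum resctrl_conf_type { CDP_NONE, CDP_CODE, CDP_DATA };

static void add_schema(const char *res_name, enum resctrl_conf_type type)
{
	const char *suffix = (type == CDP_CODE) ? "CODE" :
			     (type == CDP_DATA) ? "DATA" : "";

	printf("schema: %s%s\n", res_name, suffix);
}

int main(void)
{
	bool cdp_enabled = true;

	if (cdp_enabled) {
		add_schema("L3", CDP_CODE);	/* "L3CODE" */
		add_schema("L3", CDP_DATA);	/* "L3DATA" */
	} else {
		add_schema("L3", CDP_NONE);	/* "L3" */
	}
	return 0;
}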

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Added braces around an else
* Removed a space.

Changes since v2:
* Removed stray conf_type that remained in the arch specific struct
* Shuffled commit message,

Changes since v1:
* rdt_get_cdp_config() is kept for its comment.
---
arch/x86/kernel/cpu/resctrl/core.c | 176 ++-----------------------
arch/x86/kernel/cpu/resctrl/internal.h | 6 -
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 123 +++++++++--------
3 files changed, 76 insertions(+), 229 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index f0e147a209e7..daaa326e5a25 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -62,7 +62,6 @@ mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m,
struct rdt_hw_resource rdt_resources_all[] = {
[RDT_RESOURCE_L3] =
{
- .conf_type = CDP_NONE,
.resctrl = {
.rid = RDT_RESOURCE_L3,
.name = "L3",
@@ -78,45 +77,8 @@ struct rdt_hw_resource rdt_resources_all[] = {
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
},
- [RDT_RESOURCE_L3DATA] =
- {
- .conf_type = CDP_DATA,
- .resctrl = {
- .rid = RDT_RESOURCE_L3DATA,
- .name = "L3DATA",
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- },
- .domains = domain_init(RDT_RESOURCE_L3DATA),
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- .msr_base = MSR_IA32_L3_CBM_BASE,
- .msr_update = cat_wrmsr,
- },
- [RDT_RESOURCE_L3CODE] =
- {
- .conf_type = CDP_CODE,
- .resctrl = {
- .rid = RDT_RESOURCE_L3CODE,
- .name = "L3CODE",
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- },
- .domains = domain_init(RDT_RESOURCE_L3CODE),
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- .msr_base = MSR_IA32_L3_CBM_BASE,
- .msr_update = cat_wrmsr,
- },
[RDT_RESOURCE_L2] =
{
- .conf_type = CDP_NONE,
.resctrl = {
.rid = RDT_RESOURCE_L2,
.name = "L2",
@@ -132,45 +94,8 @@ struct rdt_hw_resource rdt_resources_all[] = {
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
},
- [RDT_RESOURCE_L2DATA] =
- {
- .conf_type = CDP_DATA,
- .resctrl = {
- .rid = RDT_RESOURCE_L2DATA,
- .name = "L2DATA",
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- },
- .domains = domain_init(RDT_RESOURCE_L2DATA),
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- .msr_base = MSR_IA32_L2_CBM_BASE,
- .msr_update = cat_wrmsr,
- },
- [RDT_RESOURCE_L2CODE] =
- {
- .conf_type = CDP_CODE,
- .resctrl = {
- .rid = RDT_RESOURCE_L2CODE,
- .name = "L2CODE",
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- },
- .domains = domain_init(RDT_RESOURCE_L2CODE),
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- .msr_base = MSR_IA32_L2_CBM_BASE,
- .msr_update = cat_wrmsr,
- },
[RDT_RESOURCE_MBA] =
{
- .conf_type = CDP_NONE,
.resctrl = {
.rid = RDT_RESOURCE_MBA,
.name = "MB",
@@ -339,40 +264,24 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
r->alloc_enabled = true;
}

-static void rdt_get_cdp_config(int level, int type)
+static void rdt_get_cdp_config(int level)
{
- struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
- struct rdt_hw_resource *hw_res_l = resctrl_to_arch_res(r_l);
- struct rdt_resource *r = &rdt_resources_all[type].resctrl;
- struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
-
- hw_res->num_closid = hw_res_l->num_closid;
- r->cache.cbm_len = r_l->cache.cbm_len;
- r->default_ctrl = r_l->default_ctrl;
- r->cache.shareable_bits = r_l->cache.shareable_bits;
- r->data_width = (r->cache.cbm_len + 3) / 4;
- r->alloc_capable = true;
/*
* By default, CDP is disabled. CDP can be enabled by mount parameter
* "cdp" during resctrl file system mount time.
*/
- r->alloc_enabled = false;
rdt_resources_all[level].cdp_enabled = false;
- rdt_resources_all[type].cdp_enabled = false;
rdt_resources_all[level].cdp_capable = true;
- rdt_resources_all[type].cdp_capable = true;
}

static void rdt_get_cdp_l3_config(void)
{
- rdt_get_cdp_config(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA);
- rdt_get_cdp_config(RDT_RESOURCE_L3, RDT_RESOURCE_L3CODE);
+ rdt_get_cdp_config(RDT_RESOURCE_L3);
}

static void rdt_get_cdp_l2_config(void)
{
- rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA);
- rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2CODE);
+ rdt_get_cdp_config(RDT_RESOURCE_L2);
}

static void
@@ -509,57 +418,6 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
}
}

-static u32 *alloc_ctrlval_array(struct rdt_resource *r, struct rdt_domain *d,
- bool mba_sc)
-{
- /* these are for the underlying hardware, they may not match r/d */
- struct rdt_domain *underlying_domain;
- struct rdt_hw_resource *hw_res;
- struct rdt_hw_domain *hw_dom;
- bool remapped;
-
- switch (r->rid) {
- case RDT_RESOURCE_L3DATA:
- case RDT_RESOURCE_L3CODE:
- hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
- remapped = true;
- break;
- case RDT_RESOURCE_L2DATA:
- case RDT_RESOURCE_L2CODE:
- hw_res = &rdt_resources_all[RDT_RESOURCE_L2];
- remapped = true;
- break;
- default:
- hw_res = resctrl_to_arch_res(r);
- remapped = false;
- }
-
- /*
- * If we changed the resource, we need to search for the underlying
- * domain. Doing this for all resources would make it tricky to add the
- * first resource, as domains aren't added to a resource list until
- * after the ctrlval arrays have been allocated.
- */
- if (remapped)
- underlying_domain = rdt_find_domain(&hw_res->resctrl, d->id,
- NULL);
- else
- underlying_domain = d;
- hw_dom = resctrl_to_arch_dom(underlying_domain);
-
- if (mba_sc) {
- if (hw_dom->mbps_val)
- return hw_dom->mbps_val;
- return kmalloc_array(hw_res->num_closid,
- sizeof(*hw_dom->mbps_val), GFP_KERNEL);
- } else {
- if (hw_dom->ctrl_val)
- return hw_dom->ctrl_val;
- return kmalloc_array(hw_res->num_closid,
- sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
- }
-}
-
static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
@@ -567,11 +425,13 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
struct msr_param m;
u32 *dc, *dm;

- dc = alloc_ctrlval_array(r, d, false);
+ dc = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->ctrl_val),
+ GFP_KERNEL);
if (!dc)
return -ENOMEM;

- dm = alloc_ctrlval_array(r, d, true);
+ dm = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->mbps_val),
+ GFP_KERNEL);
if (!dm) {
kfree(dc);
return -ENOMEM;
@@ -730,14 +590,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
if (d->plr)
d->plr->d = NULL;

- /* temporary: these four don't have a unique ctrlval array */
- if (r->rid != RDT_RESOURCE_L3CODE &&
- r->rid != RDT_RESOURCE_L3DATA &&
- r->rid != RDT_RESOURCE_L2CODE &&
- r->rid != RDT_RESOURCE_L2DATA) {
- kfree(hw_dom->ctrl_val);
- kfree(hw_dom->mbps_val);
- }
+ kfree(hw_dom->ctrl_val);
+ kfree(hw_dom->mbps_val);
bitmap_free(d->rmid_busy_llc);
kfree(d->mbm_total);
kfree(d->mbm_local);
@@ -1010,11 +864,7 @@ static __init void rdt_init_res_defs_intel(void)
hw_res = resctrl_to_arch_res(r);

if (r->rid == RDT_RESOURCE_L3 ||
- r->rid == RDT_RESOURCE_L3DATA ||
- r->rid == RDT_RESOURCE_L3CODE ||
- r->rid == RDT_RESOURCE_L2 ||
- r->rid == RDT_RESOURCE_L2DATA ||
- r->rid == RDT_RESOURCE_L2CODE) {
+ r->rid == RDT_RESOURCE_L2) {
r->cache.arch_has_sparse_bitmaps = false;
r->cache.arch_has_empty_bitmaps = false;
r->cache.arch_has_per_cpu_cfg = false;
@@ -1034,11 +884,7 @@ static __init void rdt_init_res_defs_amd(void)
hw_res = resctrl_to_arch_res(r);

if (r->rid == RDT_RESOURCE_L3 ||
- r->rid == RDT_RESOURCE_L3DATA ||
- r->rid == RDT_RESOURCE_L3CODE ||
- r->rid == RDT_RESOURCE_L2 ||
- r->rid == RDT_RESOURCE_L2DATA ||
- r->rid == RDT_RESOURCE_L2CODE) {
+ r->rid == RDT_RESOURCE_L2) {
r->cache.arch_has_sparse_bitmaps = true;
r->cache.arch_has_empty_bitmaps = true;
r->cache.arch_has_per_cpu_cfg = true;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index ce3abbe33f78..a41d8d057356 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -364,7 +364,6 @@ struct rdt_parse_data {

/**
* struct rdt_hw_resource - arch private attributes of a resctrl resource
- * @conf_type: The type that should be used when configuring. temporary
* @resctrl: Attributes of the resource used directly by resctrl.
* @num_closid: Maximum number of closid this hardware can support,
* regardless of CDP. This is exposed via
@@ -383,7 +382,6 @@ struct rdt_parse_data {
* msr_update and msr_base.
*/
struct rdt_hw_resource {
- enum resctrl_conf_type conf_type;
struct rdt_resource resctrl;
u32 num_closid;
unsigned int msr_base;
@@ -415,11 +413,7 @@ extern struct dentry *debugfs_resctrl;

enum resctrl_res_level {
RDT_RESOURCE_L3,
- RDT_RESOURCE_L3DATA,
- RDT_RESOURCE_L3CODE,
RDT_RESOURCE_L2,
- RDT_RESOURCE_L2DATA,
- RDT_RESOURCE_L2CODE,
RDT_RESOURCE_MBA,

/* Must be the last */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index cdf21deafd2a..85ff1176f968 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1880,10 +1880,10 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r)
if (!hw_res->cdp_capable)
return;

- if (r == &rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl)
+ if (r->rid == RDT_RESOURCE_L2)
l2_qos_cfg_update(&hw_res->cdp_enabled);

- if (r == &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl)
+ if (r->rid == RDT_RESOURCE_L3)
l3_qos_cfg_update(&hw_res->cdp_enabled);
}

@@ -1912,68 +1912,42 @@ static int set_mba_sc(bool mba_sc)
return 0;
}

-static int cdp_enable(int level, int data_type, int code_type)
+static int cdp_enable(int level)
{
- struct rdt_resource *r_ldata = &rdt_resources_all[data_type].resctrl;
- struct rdt_resource *r_lcode = &rdt_resources_all[code_type].resctrl;
struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
int ret;

- if (!r_l->alloc_capable || !r_ldata->alloc_capable ||
- !r_lcode->alloc_capable)
+ if (!r_l->alloc_capable)
return -EINVAL;

ret = set_cache_qos_cfg(level, true);
- if (!ret) {
- r_l->alloc_enabled = false;
- r_ldata->alloc_enabled = true;
- r_lcode->alloc_enabled = true;
+ if (!ret)
rdt_resources_all[level].cdp_enabled = true;
- rdt_resources_all[data_type].cdp_enabled = true;
- rdt_resources_all[code_type].cdp_enabled = true;
- }
+
return ret;
}

-static void cdp_disable(int level, int data_type, int code_type)
+static void cdp_disable(int level)
{
struct rdt_hw_resource *r_hw = &rdt_resources_all[level];
- struct rdt_resource *r = &r_hw->resctrl;
-
- r->alloc_enabled = r->alloc_capable;

if (r_hw->cdp_enabled) {
- rdt_resources_all[data_type].resctrl.alloc_enabled = false;
- rdt_resources_all[code_type].resctrl.alloc_enabled = false;
set_cache_qos_cfg(level, false);
r_hw->cdp_enabled = false;
- rdt_resources_all[data_type].cdp_enabled = false;
- rdt_resources_all[code_type].cdp_enabled = false;
}
}

int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable)
{
struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
- enum resctrl_res_level code_type, data_type;

if (!hw_res->cdp_capable)
return -EINVAL;

- if (l == RDT_RESOURCE_L3) {
- code_type = RDT_RESOURCE_L3CODE;
- data_type = RDT_RESOURCE_L3DATA;
- } else if (l == RDT_RESOURCE_L2) {
- code_type = RDT_RESOURCE_L2CODE;
- data_type = RDT_RESOURCE_L2DATA;
- } else {
- return -EINVAL;
- }
-
if (enable)
- return cdp_enable(l, data_type, code_type);
+ return cdp_enable(l);

- cdp_disable(l, data_type, code_type);
+ cdp_disable(l);

return 0;
}
@@ -2072,40 +2046,73 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
return ret;
}

-static int schemata_list_create(void)
+static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type)
{
struct resctrl_schema *s;
- struct rdt_resource *r;
+ const char *suffix = "";
int ret, cl;

- for_each_alloc_enabled_rdt_resource(r) {
- s = kzalloc(sizeof(*s), GFP_KERNEL);
- if (!s)
- return -ENOMEM;
-
- s->res = r;
- s->conf_type = resctrl_to_arch_res(r)->conf_type;
- s->num_closid = resctrl_arch_get_num_closid(r);
- if (resctrl_arch_get_cdp_enabled(r->rid))
- s->num_closid /= 2;
-
- ret = snprintf(s->name, sizeof(s->name), r->name);
- if (ret >= sizeof(s->name)) {
- kfree(s);
- return -EINVAL;
- }
+ s = kzalloc(sizeof(*s), GFP_KERNEL);
+ if (!s)
+ return -ENOMEM;

- cl = strlen(s->name);
- if (cl > max_name_width)
- max_name_width = cl;
+ s->res = r;
+ s->num_closid = resctrl_arch_get_num_closid(r);
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ s->num_closid /= 2;

- INIT_LIST_HEAD(&s->list);
- list_add(&s->list, &resctrl_schema_all);
+ s->conf_type = type;
+ switch (type) {
+ case CDP_CODE:
+ suffix = "CODE";
+ break;
+ case CDP_DATA:
+ suffix = "DATA";
+ break;
+ case CDP_NONE:
+ suffix = "";
+ break;
}

+ ret = snprintf(s->name, sizeof(s->name), "%s%s", r->name, suffix);
+ if (ret >= sizeof(s->name)) {
+ kfree(s);
+ return -EINVAL;
+ }
+
+ cl = strlen(s->name);
+ if (cl > max_name_width)
+ max_name_width = cl;
+
+ INIT_LIST_HEAD(&s->list);
+ list_add(&s->list, &resctrl_schema_all);
+
return 0;
}

+static int schemata_list_create(void)
+{
+ struct rdt_resource *r;
+ int ret = 0;
+
+ for_each_alloc_enabled_rdt_resource(r) {
+ if (resctrl_arch_get_cdp_enabled(r->rid)) {
+ ret = schemata_list_add(r, CDP_CODE);
+ if (ret)
+ break;
+
+ ret = schemata_list_add(r, CDP_DATA);
+ } else {
+ ret = schemata_list_add(r, CDP_NONE);
+ }
+
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
static void schemata_list_destroy(void)
{
struct resctrl_schema *s, *tmp;
--
2.30.2

2021-06-14 20:14:14

by James Morse

[permalink] [raw]
Subject: [PATCH v4 14/24] x86/resctrl: Rename update_domains() resctrl_arch_update_domains()

update_domains() merges the staged configuration changes into the
arch code's configuration array. Rename it to make it clear it's part of
the arch code interface to resctrl.

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,

Changes since v1:
* The closid is no longer staged as, from resctrl, it's always going to be
the same number even with CDP.
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 4 ++--
arch/x86/kernel/cpu/resctrl/internal.h | 1 -
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
include/linux/resctrl.h | 1 +
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index c46300bce210..271f5d28412a 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -259,7 +259,7 @@ static void apply_config(struct rdt_hw_domain *hw_dom,
}
}

-int update_domains(struct rdt_resource *r, int closid)
+int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
{
struct resctrl_staged_config *cfg;
struct rdt_hw_domain *hw_dom;
@@ -380,7 +380,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- ret = update_domains(r, rdtgrp->closid);
+ ret = resctrl_arch_update_domains(r, rdtgrp->closid);
if (ret)
goto out;
}
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 60155c20bd33..44f2b8f7b5d7 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -516,7 +516,6 @@ void rdt_pseudo_lock_release(void);
int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r);
-int update_domains(struct rdt_resource *r, int closid);
int closids_supported(void);
void closid_free(int closid);
int alloc_rmid(void);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d03cb388916c..1dec9afd9ff4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2847,7 +2847,7 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
return ret;
}

- ret = update_domains(r, rdtgrp->closid);
+ ret = resctrl_arch_update_domains(r, rdtgrp->closid);
if (ret < 0) {
rdt_last_cmd_puts("Failed to initialize allocations\n");
return ret;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 47f245a0e092..c7a187de5708 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -195,5 +195,6 @@ struct resctrl_schema {

/* The number of closid supported by this resource regardless of CDP */
u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
+int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);

#endif /* _RESCTRL_H */
--
2.30.2

2021-06-14 20:14:43

by James Morse

[permalink] [raw]
Subject: [PATCH v4 11/24] x86/resctrl: Move the schemata names into struct resctrl_schema

resctrl 'info' directories and schema parsing use the schema name.
This lives in the struct rdt_resource, and is specified by the
architecture code.

Once the CDP resources are merged, there will only be one resource
(and one name) in use by two schemata. To allow the CDP CODE/DATA
property to be the type of configuration the schema uses, the
name should also be per-schema.

Add a name field to struct resctrl_schema, and use this wherever
the schema name is exposed to (or read from) user-space. Calculating
max_name_width for padding the schemata file also moves, as this is
visible to user-space. As the names in struct rdt_resource already
include the CDP information, schemata_list_create() copies them.

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Removed a space,

Changes since v2:
* Shuffled commit message,

Changes since v1:
* Don't hardcode max_name_width, that leads to bugs
* Move max_name_width to live with the code that will generate the name.
* Fixed name/names typo
---
arch/x86/kernel/cpu/resctrl/core.c | 5 -----
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 +++-------
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 19 +++++++++++++++----
include/linux/resctrl.h | 2 ++
4 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index b406cca56ed4..be974517ba0d 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -778,13 +778,8 @@ static int resctrl_offline_cpu(unsigned int cpu)
static __init void rdt_init_padding(void)
{
struct rdt_resource *r;
- int cl;

for_each_alloc_capable_rdt_resource(r) {
- cl = strlen(r->name);
- if (cl > max_name_width)
- max_name_width = cl;
-
if (r->data_width > max_data_width)
max_data_width = r->data_width;
}
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 4428ec499037..57c2b0e121d2 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -290,11 +290,9 @@ static int rdtgroup_parse_resource(char *resname, char *tok,
struct rdtgroup *rdtgrp)
{
struct resctrl_schema *s;
- struct rdt_resource *r;

list_for_each_entry(s, &resctrl_schema_all, list) {
- r = s->res;
- if (!strcmp(resname, r->name) && rdtgrp->closid < s->num_closid)
+ if (!strcmp(resname, s->name) && rdtgrp->closid < s->num_closid)
return parse_line(tok, s, rdtgrp);
}
rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname);
@@ -388,7 +386,7 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
bool sep = false;
u32 ctrl_val;

- seq_printf(s, "%*s:", max_name_width, r->name);
+ seq_printf(s, "%*s:", max_name_width, schema->name);
list_for_each_entry(dom, &r->domains, list) {
hw_dom = resctrl_to_arch_dom(dom);
if (sep)
@@ -408,7 +406,6 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
{
struct resctrl_schema *schema;
struct rdtgroup *rdtgrp;
- struct rdt_resource *r;
int ret = 0;
u32 closid;

@@ -416,8 +413,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
if (rdtgrp) {
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
list_for_each_entry(schema, &resctrl_schema_all, list) {
- r = schema->res;
- seq_printf(s, "%s:uninitialized\n", r->name);
+ seq_printf(s, "%s:uninitialized\n", schema->name);
}
} else if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
if (!rdtgrp->plr->d) {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index eaad9c8e6c04..5551331ac94e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1439,7 +1439,7 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
ret = -ENODEV;
} else {
seq_printf(s, "%*s:", max_name_width,
- rdtgrp->plr->s->res->name);
+ rdtgrp->plr->s->name);
size = rdtgroup_cbm_to_size(rdtgrp->plr->s->res,
rdtgrp->plr->d,
rdtgrp->plr->cbm);
@@ -1451,7 +1451,7 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
list_for_each_entry(schema, &resctrl_schema_all, list) {
r = schema->res;
sep = false;
- seq_printf(s, "%*s:", max_name_width, r->name);
+ seq_printf(s, "%*s:", max_name_width, schema->name);
list_for_each_entry(d, &r->domains, list) {
hw_dom = resctrl_to_arch_dom(d);
if (sep)
@@ -1823,7 +1823,7 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
fflags = r->fflags | RF_CTRL_INFO;
- ret = rdtgroup_mkdir_info_resdir(s, r->name, fflags);
+ ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
if (ret)
goto out_destroy;
}
@@ -2128,6 +2128,7 @@ static int schemata_list_create(void)
{
struct resctrl_schema *s;
struct rdt_resource *r;
+ int ret, cl;

for_each_alloc_enabled_rdt_resource(r) {
s = kzalloc(sizeof(*s), GFP_KERNEL);
@@ -2138,6 +2139,16 @@ static int schemata_list_create(void)
s->conf_type = resctrl_to_arch_res(r)->conf_type;
s->num_closid = resctrl_arch_get_num_closid(r);

+ ret = snprintf(s->name, sizeof(s->name), r->name);
+ if (ret >= sizeof(s->name)) {
+ kfree(s);
+ return -EINVAL;
+ }
+
+ cl = strlen(s->name);
+ if (cl > max_name_width)
+ max_name_width = cl;
+
INIT_LIST_HEAD(&s->list);
list_add(&s->list, &resctrl_schema_all);
}
@@ -2771,7 +2782,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
*/
tmp_cbm = d->new_ctrl;
if (bitmap_weight(&tmp_cbm, r->cache.cbm_len) < r->cache.min_cbm_bits) {
- rdt_last_cmd_printf("No space on %s:%d\n", r->name, d->id);
+ rdt_last_cmd_printf("No space on %s:%d\n", s->name, d->id);
return -ENOSPC;
}
d->have_new_ctrl = true;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 6c9e9692eaba..8fa806c85cec 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -167,6 +167,7 @@ struct rdt_resource {
* struct resctrl_schema - configuration abilities of a resource presented to
* user-space
* @list: Member of resctrl_schema_all.
+ * @name: The name to use in the "schemata" file.
* @conf_type: Whether this schema is specific to code/data.
* @res: The resource structure exported by the architecture to describe
* the hardware that is configured by this schema.
@@ -176,6 +177,7 @@ struct rdt_resource {
*/
struct resctrl_schema {
struct list_head list;
+ char name[8];
enum resctrl_conf_type conf_type;
struct rdt_resource *res;
u32 num_closid;
--
2.30.2

2021-06-14 20:15:16

by James Morse

[permalink] [raw]
Subject: [PATCH v4 15/24] x86/resctrl: Add a helper to read a closid's configuration

Functions like show_doms() reach into the architecture's private
structure to retrieve the configuration from the struct rdt_hw_domain.

The hardware configuration may look completely different to the
values resctrl gets from user-space. The staged configuration
and resctrl_arch_update_domains() allow the architecture to
convert or translate these values.

Resctrl shouldn't read or write the ctrl_val[] values directly.
Add a helper to read the current configuration. This will allow another
architecture to scale the bitmaps if necessary, and possibly use controls
that don't take the user-space control format at all.
Of the remaining functions that access ctrl_val[] directly,
apply_config() is part of the architecture specific code, and is called
via resctrl_arch_update_domains(). reset_all_ctrls() will be an
architecture specific helper.
update_mba_bw() manipulates ctrl_val[], mbps_val[] and the
hardware. The mbps_val[] that matches the mba_sc state of the
resource is changed, but the other is left unchanged. Abstracting
this is the subject of later patches that affect set_mba_sc() too.
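
To illustrate the point of the helper, here is a minimal sketch of how
another architecture might back it (illustrative only; example_hw_domain,
to_example_hw_dom() and hw_to_resctrl_cbm() are made-up names, not part
of this series):

void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
                             u32 closid, u32 *value)
{
        /* Hypothetical arch-private domain, the equivalent of rdt_hw_domain */
        struct example_hw_domain *hw_dom = to_example_hw_dom(d);

        /* Translate the hardware's native control format into resctrl's */
        *value = hw_to_resctrl_cbm(hw_dom->hw_ctrl[closid]);
}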

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Fixed a spelling mistake.

Changes since v2:
* Shuffled commit message,
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 16 ++++++---
arch/x86/kernel/cpu/resctrl/monitor.c | 6 +++-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 43 ++++++++++-------------
include/linux/resctrl.h | 2 ++
4 files changed, 37 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 271f5d28412a..1cd54402b02a 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -401,22 +401,30 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}

+void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
+ u32 closid, u32 *value)
+{
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+
+ if (!is_mba_sc(r))
+ *value = hw_dom->ctrl_val[closid];
+ else
+ *value = hw_dom->mbps_val[closid];
+}
+
static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
{
struct rdt_resource *r = schema->res;
- struct rdt_hw_domain *hw_dom;
struct rdt_domain *dom;
bool sep = false;
u32 ctrl_val;

seq_printf(s, "%*s:", max_name_width, schema->name);
list_for_each_entry(dom, &r->domains, list) {
- hw_dom = resctrl_to_arch_dom(dom);
if (sep)
seq_puts(s, ";");

- ctrl_val = (!is_mba_sc(r) ? hw_dom->ctrl_val[closid] :
- hw_dom->mbps_val[closid]);
+ resctrl_arch_get_config(r, dom, closid, &ctrl_val);
seq_printf(s, r->format_str, dom->id, max_data_width,
ctrl_val);
sep = true;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 76eea10f2e2c..647c0be76ea6 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -442,8 +442,12 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
hw_dom_mba = resctrl_to_arch_dom(dom_mba);

cur_bw = pmbm_data->prev_bw;
- user_bw = hw_dom_mba->mbps_val[closid];
+ resctrl_arch_get_config(r_mba, dom_mba, closid, &user_bw);
delta_bw = pmbm_data->delta_bw;
+ /*
+ * resctrl_arch_get_config() chooses the mbps/ctrl value to return
+ * based on is_mba_sc(). For now, reach into the hw_dom.
+ */
cur_msr_val = hw_dom_mba->ctrl_val[closid];

/*
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1dec9afd9ff4..57e4d793f576 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -910,27 +910,27 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
int i, hwb, swb, excl, psl;
enum rdtgrp_mode mode;
bool sep = false;
- u32 *ctrl;
+ u32 ctrl_val;

mutex_lock(&rdtgroup_mutex);
hw_shareable = r->cache.shareable_bits;
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_putc(seq, ';');
- ctrl = resctrl_to_arch_dom(dom)->ctrl_val;
sw_shareable = 0;
exclusive = 0;
seq_printf(seq, "%d=", dom->id);
- for (i = 0; i < closids_supported(); i++, ctrl++) {
+ for (i = 0; i < closids_supported(); i++) {
if (!closid_allocated(i))
continue;
+ resctrl_arch_get_config(r, dom, i, &ctrl_val);
mode = rdtgroup_mode_by_closid(i);
switch (mode) {
case RDT_MODE_SHAREABLE:
- sw_shareable |= *ctrl;
+ sw_shareable |= ctrl_val;
break;
case RDT_MODE_EXCLUSIVE:
- exclusive |= *ctrl;
+ exclusive |= ctrl_val;
break;
case RDT_MODE_PSEUDO_LOCKSETUP:
/*
@@ -1188,7 +1188,6 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d
{
enum rdtgrp_mode mode;
unsigned long ctrl_b;
- u32 *ctrl;
int i;

/* Check for any overlap with regions used by hardware directly */
@@ -1199,9 +1198,8 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d
}

/* Check for overlap with other resource groups */
- ctrl = resctrl_to_arch_dom(d)->ctrl_val;
- for (i = 0; i < closids_supported(); i++, ctrl++) {
- ctrl_b = *ctrl;
+ for (i = 0; i < closids_supported(); i++) {
+ resctrl_arch_get_config(r, d, i, (u32 *)&ctrl_b);
mode = rdtgroup_mode_by_closid(i);
if (closid_allocated(i) && i != closid &&
mode != RDT_MODE_PSEUDO_LOCKSETUP) {
@@ -1269,12 +1267,12 @@ bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d,
*/
static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
{
- struct rdt_hw_domain *hw_dom;
int closid = rdtgrp->closid;
struct resctrl_schema *s;
struct rdt_resource *r;
bool has_cache = false;
struct rdt_domain *d;
+ u32 ctrl;

list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
@@ -1282,10 +1280,8 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
continue;
has_cache = true;
list_for_each_entry(d, &r->domains, list) {
- hw_dom = resctrl_to_arch_dom(d);
- if (rdtgroup_cbm_overlaps(s, d,
- hw_dom->ctrl_val[closid],
- rdtgrp->closid, false)) {
+ resctrl_arch_get_config(r, d, closid, &ctrl);
+ if (rdtgroup_cbm_overlaps(s, d, ctrl, closid, false)) {
rdt_last_cmd_puts("Schemata overlaps\n");
return false;
}
@@ -1417,7 +1413,6 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
struct resctrl_schema *schema;
- struct rdt_hw_domain *hw_dom;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
struct rdt_domain *d;
@@ -1453,15 +1448,13 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
sep = false;
seq_printf(s, "%*s:", max_name_width, schema->name);
list_for_each_entry(d, &r->domains, list) {
- hw_dom = resctrl_to_arch_dom(d);
if (sep)
seq_putc(s, ';');
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
size = 0;
} else {
- ctrl = (!is_mba_sc(r) ?
- hw_dom->ctrl_val[rdtgrp->closid] :
- hw_dom->mbps_val[rdtgrp->closid]);
+ resctrl_arch_get_config(r, d, rdtgrp->closid,
+ &ctrl);
if (r->rid == RDT_RESOURCE_MBA)
size = ctrl;
else
@@ -2736,7 +2729,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
u32 used_b = 0, unused_b = 0;
unsigned long tmp_cbm;
enum rdtgrp_mode mode;
- u32 peer_ctl, *ctrl;
+ u32 peer_ctl, ctrl_val;
int i;

rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp);
@@ -2744,8 +2737,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
cfg->have_new_ctrl = false;
cfg->new_ctrl = r->cache.shareable_bits;
used_b = r->cache.shareable_bits;
- ctrl = resctrl_to_arch_dom(d)->ctrl_val;
- for (i = 0; i < closids_supported(); i++, ctrl++) {
+ for (i = 0; i < closids_supported(); i++) {
if (closid_allocated(i) && i != closid) {
mode = rdtgroup_mode_by_closid(i);
if (mode == RDT_MODE_PSEUDO_LOCKSETUP)
@@ -2761,12 +2753,13 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
* with an exclusive group.
*/
if (d_cdp)
- peer_ctl = resctrl_to_arch_dom(d_cdp)->ctrl_val[i];
+ resctrl_arch_get_config(r_cdp, d_cdp, i, &peer_ctl);
else
peer_ctl = 0;
- used_b |= *ctrl | peer_ctl;
+ resctrl_arch_get_config(r, d, i, &ctrl_val);
+ used_b |= ctrl_val | peer_ctl;
if (mode == RDT_MODE_SHAREABLE)
- cfg->new_ctrl |= *ctrl | peer_ctl;
+ cfg->new_ctrl |= ctrl_val | peer_ctl;
}
}
if (d->plr && d->plr->cbm > 0)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index c7a187de5708..d8c9080f0237 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -196,5 +196,7 @@ struct resctrl_schema {
/* The number of closid supported by this resource regardless of CDP */
u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
+void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
+ u32 closid, u32 *value);

#endif /* _RESCTRL_H */
--
2.30.2

2021-06-14 20:15:32

by James Morse

[permalink] [raw]
Subject: [PATCH v4 19/24] x86/resctrl: Apply offset correction when config is staged

When resctrl comes to copy the CAT MSR values from the ctrl_val[] array
into hardware, it applies an offset adjustment based on the type of the
resource. CODE and DATA resources have their closid mapped into an
odd/even range. This mapping is based on a property of the resource.

This happens once the new control value has been written to the ctrl_val[]
array. Once the CDP resources are merged, there will only be a single
property that needs to cover both odd/even mappings to the single
ctrl_val[] array. The offset adjustment must be applied before the new
value is written to the array.

Move the logic from cat_wrmsr() to resctrl_arch_update_domains().
The value provided to apply_config() is now an index in the array,
not the closid. The parameters provided via struct msr_param are now
indexes too. As resctrl's use of closid is a u32, struct msr_param's
type is changed to match.
With this, the CODE and DATA resources only use the odd or even
indexes in the array. This allows the temporary num_closid/2 fixes in
domain_setup_ctrlval() and reset_all_ctrls() to be removed.
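
As a worked example of the odd/even mapping described above (not part of
the patch, numbers taken from the existing cbm_idx_mult/cbm_idx_offset
values): a CODE resource uses mult = 2, offset = 1 and a DATA resource
uses mult = 2, offset = 0, so for closid 3 the CODE configuration lands
at index 3 * 2 + 1 = 7 and the DATA configuration at 3 * 2 + 0 = 6.
After this patch it is these indexes, not the closid, that reach
apply_config() and struct msr_param.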

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Fixed a fat-fingered 'unsigned u32' - oops!
* Fixed a spelling mistake.

Changes since v2:
* Shuffled commit message,

Changes since v1:
* Removing the patch that moved the closid to the staged config means the
min/max and return from apply_config() appears here.
---
arch/x86/kernel/cpu/resctrl/core.c | 15 +----------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 32 +++++++++++++++++------
arch/x86/kernel/cpu/resctrl/internal.h | 4 +--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 7 -----
4 files changed, 27 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 0d18227a366b..15b57f70564b 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -195,11 +195,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
},
};

-static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid)
-{
- return closid * r->cache.cbm_idx_mult + r->cache.cbm_idx_offset;
-}
-
/*
* cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
* as they do not have CPUID enumeration support for Cache allocation.
@@ -438,7 +433,7 @@ cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);

for (i = m->low; i < m->high; i++)
- wrmsrl(hw_res->msr_base + cbm_idx(r, i), hw_dom->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
}

struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
@@ -549,14 +544,6 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)

m.low = 0;
m.high = hw_res->num_closid;
-
- /*
- * temporary: the array is full-size, but cat_wrmsr() still re-maps
- * the index.
- */
- if (hw_res->conf_type != CDP_NONE)
- m.high /= 2;
-
hw_res->msr_update(d, &m, r);
return 0;
}
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 72a8cf52de47..ebeab130f7eb 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -246,17 +246,29 @@ static int parse_line(char *line, struct resctrl_schema *s,
return -EINVAL;
}

-static void apply_config(struct rdt_hw_domain *hw_dom,
- struct resctrl_staged_config *cfg, int closid,
+static u32 cbm_idx(struct rdt_resource *r, unsigned int closid)
+{
+ if (r->rid == RDT_RESOURCE_MBA)
+ return closid;
+
+ return closid * r->cache.cbm_idx_mult + r->cache.cbm_idx_offset;
+}
+
+static bool apply_config(struct rdt_hw_domain *hw_dom,
+ struct resctrl_staged_config *cfg, u32 idx,
cpumask_var_t cpu_mask, bool mba_sc)
{
struct rdt_domain *dom = &hw_dom->resctrl;
u32 *dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;

- if (cfg->new_ctrl != dc[closid]) {
+ if (cfg->new_ctrl != dc[idx]) {
cpumask_set_cpu(cpumask_any(&dom->cpu_mask), cpu_mask);
- dc[closid] = cfg->new_ctrl;
+ dc[idx] = cfg->new_ctrl;
+
+ return true;
}
+
+ return false;
}

int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
@@ -269,11 +281,12 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
struct rdt_domain *d;
bool mba_sc;
int cpu;
+ u32 idx;

if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;

- msr_param.low = closid;
+ msr_param.low = cbm_idx(r, closid);
msr_param.high = msr_param.low + 1;
msr_param.res = r;

@@ -285,7 +298,9 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
if (!cfg->have_new_ctrl)
continue;

- apply_config(hw_dom, cfg, closid, cpu_mask, mba_sc);
+ idx = cbm_idx(r, closid);
+ if (!apply_config(hw_dom, cfg, idx, cpu_mask, mba_sc))
+ continue;
}
}

@@ -405,11 +420,12 @@ void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
u32 closid, enum resctrl_conf_type type, u32 *value)
{
struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+ u32 idx = cbm_idx(r, closid);

if (!is_mba_sc(r))
- *value = hw_dom->ctrl_val[closid];
+ *value = hw_dom->ctrl_val[idx];
else
- *value = hw_dom->mbps_val[closid];
+ *value = hw_dom->mbps_val[idx];
}

static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index af230135ad7c..ce3abbe33f78 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -327,8 +327,8 @@ static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r)
*/
struct msr_param {
struct rdt_resource *res;
- int low;
- int high;
+ u32 low;
+ u32 high;
};

static inline bool is_llc_occupancy_enabled(void)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e8006e332d1a..bc0fd909ee31 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2368,13 +2368,6 @@ static int reset_all_ctrls(struct rdt_resource *r)
msr_param.low = 0;
msr_param.high = hw_res->num_closid;

- /*
- * temporary: the array is full-sized, but cat_wrmsr() still re-maps
- * the index.
- */
- if (hw_res->cdp_enabled)
- msr_param.high /= 2;
-
/*
* Disable resource control for this resource by setting all
* CBMs in all domains to the maximum mask value. Pick one CPU
--
2.30.2

2021-06-14 20:15:39

by James Morse

[permalink] [raw]
Subject: [PATCH v4 20/24] x86/resctrl: Calculate the index from the configuration type

resctrl uses cbm_idx() to map a closid to an index in the
configuration array. This is based on a multiplier and offset
that are held in the resource.

To merge the resources, the resctrl arch code needs to calculate
the index from something else, as there will only be one resource.

Decide based on the staged configuration type. This makes the
static mult and offset parameters redundant.
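
For reference, the index the type-based calculation produces for closid 5
(illustrative only): CDP_NONE gives 5, CDP_DATA gives 5 * 2 = 10 and
CDP_CODE gives 5 * 2 + 1 = 11, matching what the old mult/offset pairs
(1/0, 2/0 and 2/1) produced for the NONE, DATA and CODE resources.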

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* Shuffled commit message,
---
arch/x86/kernel/cpu/resctrl/core.c | 12 -----------
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 25 ++++++++++++++---------
include/linux/resctrl.h | 6 ------
3 files changed, 15 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 15b57f70564b..08603487cb7d 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -69,8 +69,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 3,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
},
.domains = domain_init(RDT_RESOURCE_L3),
.parse_ctrlval = parse_cbm,
@@ -89,8 +87,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 3,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
},
.domains = domain_init(RDT_RESOURCE_L3DATA),
.parse_ctrlval = parse_cbm,
@@ -109,8 +105,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 3,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
},
.domains = domain_init(RDT_RESOURCE_L3CODE),
.parse_ctrlval = parse_cbm,
@@ -129,8 +123,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 2,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
},
.domains = domain_init(RDT_RESOURCE_L2),
.parse_ctrlval = parse_cbm,
@@ -149,8 +141,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 2,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
},
.domains = domain_init(RDT_RESOURCE_L2DATA),
.parse_ctrlval = parse_cbm,
@@ -169,8 +159,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 2,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
},
.domains = domain_init(RDT_RESOURCE_L2CODE),
.parse_ctrlval = parse_cbm,
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index ebeab130f7eb..e384398374da 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -246,12 +246,17 @@ static int parse_line(char *line, struct resctrl_schema *s,
return -EINVAL;
}

-static u32 cbm_idx(struct rdt_resource *r, unsigned int closid)
+static u32 get_config_index(u32 closid, enum resctrl_conf_type type)
{
- if (r->rid == RDT_RESOURCE_MBA)
+ switch (type) {
+ default:
+ case CDP_NONE:
return closid;
-
- return closid * r->cache.cbm_idx_mult + r->cache.cbm_idx_offset;
+ case CDP_CODE:
+ return (closid * 2) + 1;
+ case CDP_DATA:
+ return (closid * 2);
+ }
}

static bool apply_config(struct rdt_hw_domain *hw_dom,
@@ -286,10 +291,6 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;

- msr_param.low = cbm_idx(r, closid);
- msr_param.high = msr_param.low + 1;
- msr_param.res = r;
-
mba_sc = is_mba_sc(r);
list_for_each_entry(d, &r->domains, list) {
hw_dom = resctrl_to_arch_dom(d);
@@ -298,9 +299,13 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
if (!cfg->have_new_ctrl)
continue;

- idx = cbm_idx(r, closid);
+ idx = get_config_index(closid, t);
if (!apply_config(hw_dom, cfg, idx, cpu_mask, mba_sc))
continue;
+
+ msr_param.low = idx;
+ msr_param.high = msr_param.low + 1;
+ msr_param.res = r;
}
}

@@ -420,7 +425,7 @@ void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
u32 closid, enum resctrl_conf_type type, u32 *value)
{
struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
- u32 idx = cbm_idx(r, closid);
+ u32 idx = get_config_index(closid, type);

if (!is_mba_sc(r))
*value = hw_dom->ctrl_val[idx];
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 93bd8d3bbeb6..53ca55d5be1d 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -69,10 +69,6 @@ struct rdt_domain {
* struct resctrl_cache - Cache allocation related data
* @cbm_len: Length of the cache bit mask
* @min_cbm_bits: Minimum number of consecutive bits to be set
- * @cbm_idx_mult: Multiplier of CBM index
- * @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
- * closid * cbm_idx_multi + cbm_idx_offset
- * in a cache bit mask
* @shareable_bits: Bitmask of shareable resource with other
* executing entities
* @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid.
@@ -83,8 +79,6 @@ struct rdt_domain {
struct resctrl_cache {
unsigned int cbm_len;
unsigned int min_cbm_bits;
- unsigned int cbm_idx_mult; // TODO remove this
- unsigned int cbm_idx_offset; // TODO remove this
unsigned int shareable_bits;
bool arch_has_sparse_bitmaps;
bool arch_has_empty_bitmaps;
--
2.30.2

2021-06-14 20:15:41

by James Morse

[permalink] [raw]
Subject: [PATCH v4 21/24] x86/resctrl: Merge the ctrl_val arrays

Each struct rdt_hw_resource has its own ctrl_val[] array. When CDP is
enabled, two resources are in use, each with its own ctrl_val[] array
that holds half of the configuration used by hardware. One uses the
odd slots, the other the even. rdt_cdp_peer_get() is the helper to
find the alternate resource, its domain, and corresponding entry
in the other ctrl_val[] array.

Once the CDP resources are merged there will be one struct rdt_hw_resource
and one ctrl_val[] array for each hardware resource. This will
include changes to rdt_cdp_peer_get(), making it hard to bisect any
issue.

Merge the ctrl_val[] arrays of the three CODE/DATA/NONE resources first.
Doing this before merging the resources temporarily complicates
allocating and freeing the ctrl_val arrays. Add a helper to allocate
the ctrl_val array, which returns the array already allocated on the
L2 or L3 resource if one exists. This helper gets removed once the
resources are merged, and there really is only one ctrl_val[] array.
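
As a concrete example of what the helper gives us (not code from the
patch): for a given L3 cache domain, RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA
and RDT_RESOURCE_L3CODE all end up sharing the array allocated against
the RDT_RESOURCE_L3 domain, which is why domain_remove_cpu() now skips
kfree() for the four CODE/DATA resources.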

Reviewed-by: Jamie Iles <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v3:
* Removed some parenthesis that disappear in a later patch.

Changes since v2:
* Shuffled commit message,

Changes since v1:
* Added underscores to ctrlval when its not in a function name
* Removed temporary free_ctrlval_arrays() function.
* Reduced churn in domain_setup_ctrlval().
---
arch/x86/kernel/cpu/resctrl/core.c | 65 ++++++++++++++++++++++++++++--
1 file changed, 61 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 08603487cb7d..f0e147a209e7 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -509,6 +509,57 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
}
}

+static u32 *alloc_ctrlval_array(struct rdt_resource *r, struct rdt_domain *d,
+ bool mba_sc)
+{
+ /* these are for the underlying hardware, they may not match r/d */
+ struct rdt_domain *underlying_domain;
+ struct rdt_hw_resource *hw_res;
+ struct rdt_hw_domain *hw_dom;
+ bool remapped;
+
+ switch (r->rid) {
+ case RDT_RESOURCE_L3DATA:
+ case RDT_RESOURCE_L3CODE:
+ hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
+ remapped = true;
+ break;
+ case RDT_RESOURCE_L2DATA:
+ case RDT_RESOURCE_L2CODE:
+ hw_res = &rdt_resources_all[RDT_RESOURCE_L2];
+ remapped = true;
+ break;
+ default:
+ hw_res = resctrl_to_arch_res(r);
+ remapped = false;
+ }
+
+ /*
+ * If we changed the resource, we need to search for the underlying
+ * domain. Doing this for all resources would make it tricky to add the
+ * first resource, as domains aren't added to a resource list until
+ * after the ctrlval arrays have been allocated.
+ */
+ if (remapped)
+ underlying_domain = rdt_find_domain(&hw_res->resctrl, d->id,
+ NULL);
+ else
+ underlying_domain = d;
+ hw_dom = resctrl_to_arch_dom(underlying_domain);
+
+ if (mba_sc) {
+ if (hw_dom->mbps_val)
+ return hw_dom->mbps_val;
+ return kmalloc_array(hw_res->num_closid,
+ sizeof(*hw_dom->mbps_val), GFP_KERNEL);
+ } else {
+ if (hw_dom->ctrl_val)
+ return hw_dom->ctrl_val;
+ return kmalloc_array(hw_res->num_closid,
+ sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
+ }
+}
+
static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
@@ -516,11 +567,11 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
struct msr_param m;
u32 *dc, *dm;

- dc = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
+ dc = alloc_ctrlval_array(r, d, false);
if (!dc)
return -ENOMEM;

- dm = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->mbps_val), GFP_KERNEL);
+ dm = alloc_ctrlval_array(r, d, true);
if (!dm) {
kfree(dc);
return -ENOMEM;
@@ -679,8 +730,14 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
if (d->plr)
d->plr->d = NULL;

- kfree(hw_dom->ctrl_val);
- kfree(hw_dom->mbps_val);
+ /* temporary: these four don't have a unique ctrlval array */
+ if (r->rid != RDT_RESOURCE_L3CODE &&
+ r->rid != RDT_RESOURCE_L3DATA &&
+ r->rid != RDT_RESOURCE_L2CODE &&
+ r->rid != RDT_RESOURCE_L2DATA) {
+ kfree(hw_dom->ctrl_val);
+ kfree(hw_dom->mbps_val);
+ }
bitmap_free(d->rmid_busy_llc);
kfree(d->mbm_total);
kfree(d->mbm_local);
--
2.30.2

2021-06-14 20:15:49

by James Morse

[permalink] [raw]
Subject: [PATCH v4 23/24] x86/resctrl: Expand resctrl_arch_update_domains()'s msr_param range

resctrl_arch_update_domains() specifies the one closid that has
been modified and needs copying to the hardware.
resctrl_arch_update_domains() takes a struct rdt_resource and a
closid as arguments, but copies all the staged configurations
for that closid into the ctrl_val[] array.

resctrl_arch_update_domains() is called once per schema, but
once the resources and domains are merged, the second call of
an L2CODE/L2DATA pair will find no staged configurations, as
they were applied by the first call. The msr_param of the first call
only has one index, so it would only have updated the hardware
for the last staged configuration.

To avoid a second round of IPIs when changing L2CODE and L2DATA
in one go, expand the range of the msr_param if multiple staged
configurations are found.
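
A worked example (illustrative only): with CDP enabled and closid 5, the
DATA schema stages index 10 and the CODE schema stages index 11.
Whichever is applied first sets msr_param.low/high to cover one index;
the second expands the range to low = 10, high = 12, so a single IPI per
domain writes both MSRs.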

Signed-off-by: James Morse <[email protected]>
---
No changes since v3.

Changes since v2:
* This patch is new.
---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index e384398374da..0a1b878ca0db 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -292,6 +292,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
return -ENOMEM;

mba_sc = is_mba_sc(r);
+ msr_param.res = NULL;
list_for_each_entry(d, &r->domains, list) {
hw_dom = resctrl_to_arch_dom(d);
for (t = 0; t < CDP_NUM_TYPES; t++) {
@@ -303,9 +304,14 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
if (!apply_config(hw_dom, cfg, idx, cpu_mask, mba_sc))
continue;

- msr_param.low = idx;
- msr_param.high = msr_param.low + 1;
- msr_param.res = r;
+ if (!msr_param.res) {
+ msr_param.low = idx;
+ msr_param.high = msr_param.low + 1;
+ msr_param.res = r;
+ } else {
+ msr_param.low = min(msr_param.low, idx);
+ msr_param.high = max(msr_param.high, idx + 1);
+ }
}
}

--
2.30.2

2021-06-15 16:20:03

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 00/24] x86/resctrl: Merge the CDP resources

Hi James,

On 6/14/2021 1:09 PM, James Morse wrote:
> This series re-folds the resctrl code so the CDP resources (L3CODE et al)
> behaviour is all contained in the filesystem parts, with a minimum amount
> of arch specific code.
>
> Arm have some CPU support for dividing caches into portions, and
> applying bandwidth limits at various points in the SoC. The collective term
> for these features is MPAM: Memory Partitioning and Monitoring.
>
> MPAM is similar enough to Intel RDT, that it should use the defacto linux
> interface: resctrl. This filesystem currently lives under arch/x86, and is
> tightly coupled to the architecture.
> Ultimately, my plan is to split the existing resctrl code up to have an
> arch<->fs abstraction, then move all the bits out to fs/resctrl. From there
> MPAM can be wired up.
>
> x86 might have two resources with cache controls, (L2 and L3) but has
> extra copies for CDP: L{2,3}{CODE,DATA}, which are marked as enabled
> if CDP is enabled for the corresponding cache.
>
> MPAM has an equivalent feature to CDP, but its a property of the CPU,
> not the cache. Resctrl needs to have x86's odd/even behaviour, as that
> its the ABI, but this isn't how the MPAM hardware works. It is entirely
> possible that an in-kernel user of MPAM would not be using CDP, whereas
> resctrl is.
>
> Pretending L3CODE and L3DATA are entirely separate resources is a neat
> trick, but doing this is specific to x86.
> Doing this leaves the arch code in control of various parts of the
> filesystem ABI: the resources names, and the way the schemata are parsed.
> Allowing this stuff to vary between architectures is bad for user space.
>
> This series collapses the CODE/DATA resources, moving all the user-visible
> resctrl ABI into what becomes the filesystem code. CDP becomes the type of
> configuration being applied to a cache. This is done by adding a
> struct resctrl_schema to the parts of resctrl that will move to fs. This
> holds the arch-code resource that is in use for this schema, along with
> other properties like the name, and whether the configuration being applied
> is CODE/DATA/BOTH.
>
> This lets us fold the extra resources out of the arch code so that they
> don't need to be duplicated if the equivalent feature to CDP is missing, or
> implemented in a different way.
>
>
> The first two patches split the resource and domain structs to have an
> arch specific 'hw' portion, and the rest that is visible to resctrl.
> Future series massage the resctrl code so there are no accesses to 'hw'
> structures in the parts of resctrl that will move to fs, providing helpers
> where necessary.
>
> This series adds temporary scaffolding, which it removes a few patches
> later. This is to allow things like the ctrlval arrays and resources to be
> merged separately, which should make is easier to bisect. These things
> are marked temporary, and should all be gone by the end of the series.
>
> This series is a little rough around the monitors, would a fake
> struct resctrl_schema for the monitors simplify things, or be a source
> of bugs?
>
> This series is based on v5.12-rc6, and can be retrieved from:
> git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/resctrl_merge_cdp/v4
>
> v3: https://lore.kernel.org/lkml/[email protected]/
> v2: https://lore.kernel.org/lkml/[email protected]/
> v1: https://lore.kernel.org/lkml/[email protected]/
>
> Parts were previously posted as an RFC here:
> https://lore.kernel.org/lkml/[email protected]/
>

For the most part I think this series looks good. The one thing I am
concerned about is the resctrl user interface change. On a system that
supports L3 CDP there is a visible change when CDP is not enabled:

Before this series:
# cat schemata
    L3:0=fffff;1=fffff

After this series:
# cat schemata
L3:0=fffff;1=fffff

There are a few user space tools that parse the resctrl schemata file
and it may be easier to keep the interface consistent than to find and
audit them all to ensure they will keep working.

Apart from that, I do think that the dmesg change that Babu pointed out
deserves a mention in the cover letter. I agree with your response in
this regard, but this is indeed a user-visible change, and if anybody has
an issue with that then mentioning it in the cover letter will hopefully
catch it sooner.

A heads-up is that there are some kernel-doc fixups in the works that
will conflict with your series. You yourself fix at least one of these
kernel-doc issues in this series - the description of mbm_width in the
first patch. I will ask the submitter of the kernel-doc fixups to use
your text to help with the merging.

Finally, I did catch a few typos that I will respond to individually.

Thank you very much James

Reinette

2021-06-15 16:50:07

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v4 00/24] x86/resctrl: Merge the CDP resources

Hi Reinette,

On 15/06/2021 17:16, Reinette Chatre wrote:
> On 6/14/2021 1:09 PM, James Morse wrote:
>> This series re-folds the resctrl code so the CDP resources (L3CODE et al)
>> behaviour is all contained in the filesystem parts, with a minimum amount
>> of arch specific code.

[..]

>> This series collapses the CODE/DATA resources, moving all the user-visible
>> resctrl ABI into what becomes the filesystem code. CDP becomes the type of
>> configuration being applied to a cache. This is done by adding a
>> struct resctrl_schema to the parts of resctrl that will move to fs. This
>> holds the arch-code resource that is in use for this schema, along with
>> other properties like the name, and whether the configuration being applied
>> is CODE/DATA/BOTH.
>>
>> This lets us fold the extra resources out of the arch code so that they
>> don't need to be duplicated if the equivalent feature to CDP is missing, or
>> implemented in a different way.

[...]

>> This series is based on v5.12-rc6, and can be retrieved from:
>> git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/resctrl_merge_cdp/v4


> For the most part I think this series looks good. The one thing I am concerned about is
> the resctrl user interface change. On a system that supports L3 CDP there is a visible
> change when CDP is not enabled:
>
> Before this series:
> # cat schemata
>    L3:0=fffff;1=fffff
>
> After this series:
> # cat schemata
> L3:0=fffff;1=fffff

Hmm, I thought I'd fixed this with v2, ... I see this is subtly different.

This could be tweaked by getting schemata_list_add() to include the length of the longest
suffix if the resource supports CDP, but it's not enabled. (Discovering that means
cdp_capable moves to be something the 'fs' bits of resctrl can see.)
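
Something along these lines in schemata_list_add() might do it (a rough
sketch only, assuming a cdp_capable flag becomes visible to the fs code
as described above; untested):

        cl = strlen(s->name);
        /*
         * Pad as if the CDP suffix were present so the schemata layout
         * doesn't change when CDP is toggled.
         */
        if (r->cdp_capable && !resctrl_arch_get_cdp_enabled(r->rid))
                cl += strlen("CODE");
        if (cl > max_name_width)
                max_name_width = cl;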

I'm a little nervous about 'adding 4 spaces' because user-space expects them. (I agree that if it
breaks user-space it has to be done.) I guess this is the problem with string parsing as
part of the interface!

I assume that in the (distant) future, having CDP-capable resources with names longer than 2
characters isn't going to be a problem. (I don't have an example)


> There are a few user space tools that parse the resctrl schemata file and it may be easier
> to keep the interface consistent than to find and audit them all to ensure they will keep
> working.


> Apart from that, I do think that the dmesg change that Babu pointed out deserves a mention
> in the cover letter. I agree with your response in this regard but this is indeed a user
> visible change and if anybody has issue with that then mentioning it in the cover letter
> will hopefully catch it sooner.

Ah, okay.


> A heads-up is that there are some kernel-doc fixups in the works that will conflict with
> your series. You yourself fix at least one of these kernel-doc issues in this series - the
> description of mbm_width in the first patch. I will ask the submitter of the kernel-doc
> fixups to use your text to help with the merging.

Please point me at something to rebase onto!
(as far as I can see, tip/x86/cache hasn't moved)


> Finally, I did catch a few typos that I will respond to individually.

Thanks!

James

2021-06-15 17:27:09

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 00/24] x86/resctrl: Merge the CDP resources

On Tue, Jun 15, 2021 at 05:48:25PM +0100, James Morse wrote:
> Please point me at something to rebase onto!
> (as far as I can see, tip/x86/cache hasn't moved)

Just use tip/master when that branch hasn't moved. We usually
fast-forward those branches to the latest Linus -rc before applying new
stuff.

And I'm sure we should be able to handle kernel-doc conflicts when
applying.

:-)

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-06-15 17:52:22

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v4 00/24] x86/resctrl: Merge the CDP resources

Hi James,
Thanks for the patches. Sanity tested the patches on an AMD machine here.
Everything looks good. I have a few comments on some of the individual patches.
Thanks

On 6/14/21 3:09 PM, James Morse wrote:
> Hi folks,
>
> Changes since v3? Fixed the 'unsigned u32', remove a few spaces and fixed a
> few spelling mistakes.
> ----
>
> This series re-folds the resctrl code so the CDP resources (L3CODE et al)
> behaviour is all contained in the filesystem parts, with a minimum amount
> of arch specific code.
>
> Arm have some CPU support for dividing caches into portions, and
> applying bandwidth limits at various points in the SoC. The collective term
> for these features is MPAM: Memory Partitioning and Monitoring.
>
> MPAM is similar enough to Intel RDT, that it should use the defacto linux
> interface: resctrl. This filesystem currently lives under arch/x86, and is
> tightly coupled to the architecture.
> Ultimately, my plan is to split the existing resctrl code up to have an
> arch<->fs abstraction, then move all the bits out to fs/resctrl. From there
> MPAM can be wired up.
>
> x86 might have two resources with cache controls, (L2 and L3) but has
> extra copies for CDP: L{2,3}{CODE,DATA}, which are marked as enabled
> if CDP is enabled for the corresponding cache.
>
> MPAM has an equivalent feature to CDP, but its a property of the CPU,
> not the cache. Resctrl needs to have x86's odd/even behaviour, as that
> its the ABI, but this isn't how the MPAM hardware works. It is entirely
> possible that an in-kernel user of MPAM would not be using CDP, whereas
> resctrl is.
>
> Pretending L3CODE and L3DATA are entirely separate resources is a neat
> trick, but doing this is specific to x86.
> Doing this leaves the arch code in control of various parts of the
> filesystem ABI: the resources names, and the way the schemata are parsed.
> Allowing this stuff to vary between architectures is bad for user space.
>
> This series collapses the CODE/DATA resources, moving all the user-visible
> resctrl ABI into what becomes the filesystem code. CDP becomes the type of
> configuration being applied to a cache. This is done by adding a
> struct resctrl_schema to the parts of resctrl that will move to fs. This
> holds the arch-code resource that is in use for this schema, along with
> other properties like the name, and whether the configuration being applied
> is CODE/DATA/BOTH.
>
> This lets us fold the extra resources out of the arch code so that they
> don't need to be duplicated if the equivalent feature to CDP is missing, or
> implemented in a different way.
>
>
> The first two patches split the resource and domain structs to have an
> arch specific 'hw' portion, and the rest that is visible to resctrl.
> Future series massage the resctrl code so there are no accesses to 'hw'
> structures in the parts of resctrl that will move to fs, providing helpers
> where necessary.
>
> This series adds temporary scaffolding, which it removes a few patches
> later. This is to allow things like the ctrlval arrays and resources to be
> merged separately, which should make is easier to bisect. These things
> are marked temporary, and should all be gone by the end of the series.
>
> This series is a little rough around the monitors, would a fake
> struct resctrl_schema for the monitors simplify things, or be a source
> of bugs?
>
> This series is based on v5.12-rc6, and can be retrieved from:
> git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/resctrl_merge_cdp/v4
>
> v3: https://lore.kernel.org/lkml/[email protected]/
> v2: https://lore.kernel.org/lkml/[email protected]/
> v1: https://lore.kernel.org/lkml/[email protected]/
>
> Parts were previously posted as an RFC here:
> https://lore.kernel.org/lkml/[email protected]/
>
> James Morse (24):
> x86/resctrl: Split struct rdt_resource
> x86/resctrl: Split struct rdt_domain
> x86/resctrl: Add a separate schema list for resctrl
> x86/resctrl: Pass the schema in info dir's private pointer
> x86/resctrl: Label the resources with their configuration type
> x86/resctrl: Walk the resctrl schema list instead of an arch list
> x86/resctrl: Store the effective num_closid in the schema
> x86/resctrl: Add resctrl_arch_get_num_closid()
> x86/resctrl: Pass the schema to resctrl filesystem functions
> x86/resctrl: Swizzle rdt_resource and resctrl_schema in
> pseudo_lock_region
> x86/resctrl: Move the schemata names into struct resctrl_schema
> x86/resctrl: Group staged configuration into a separate struct
> x86/resctrl: Allow different CODE/DATA configurations to be staged
> x86/resctrl: Rename update_domains() resctrl_arch_update_domains()
> x86/resctrl: Add a helper to read a closid's configuration
> x86/resctrl: Add a helper to read/set the CDP configuration
> x86/resctrl: Pass configuration type to resctrl_arch_get_config()
> x86/resctrl: Make ctrlval arrays the same size
> x86/resctrl: Apply offset correction when config is staged
> x86/resctrl: Calculate the index from the configuration type
> x86/resctrl: Merge the ctrl_val arrays
> x86/resctrl: Remove rdt_cdp_peer_get()
> x86/resctrl: Expand resctrl_arch_update_domains()'s msr_param range
> x86/resctrl: Merge the CDP resources
>
> arch/x86/kernel/cpu/resctrl/core.c | 269 +++++--------
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 164 +++++---
> arch/x86/kernel/cpu/resctrl/internal.h | 234 ++++-------
> arch/x86/kernel/cpu/resctrl/monitor.c | 44 ++-
> arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 12 +-
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 448 ++++++++++++----------
> include/linux/resctrl.h | 181 +++++++++
> 7 files changed, 758 insertions(+), 594 deletions(-)
>

2021-06-15 17:53:09

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v4 03/24] x86/resctrl: Add a separate schema list for resctrl



On 6/14/21 3:09 PM, James Morse wrote:
> Resctrl exposes schemata to user-space, which allow the control values
> to be specified for a group of tasks.
>
> User-visible properties of the interface, (such as the schemata names
> and how the values are parsed) are rooted in a struct provided by the
> architecture code. (struct rdt_hw_resource). Once a second architecture
> uses resctrl, this would allow user-visible properties to diverge
> between architectures.
>
> These properties should come from the resctrl code that will be common
> to all architectures. Resctrl has no per-schema structure, only struct
> rdt_{hw_,}resource. Create a struct resctrl_schema to hold the
> rdt_resource. Before a second architecture can be supported, this
> structure will also need to hold the schema name visible to user-space
> and the type of configuration values for resctrl.
>
> Reviewed-by: Jamie Iles <[email protected]>
> Signed-off-by: James Morse <[email protected]>
> ---
> Changes since v3:
> * Fixed a spelling mistake, removed a space.
>
> Changes since v2:
> * Expanded comments.
> * Shuffled commit message,
>
> Changes since v1:
> * Renamed resctrl_all_schema list
> * Used schemata_list as a prefix to make these easier to search for
> * Added kerneldoc string
> * Removed 'pending configuration' reference in commit message
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 43 +++++++++++++++++++++++++-
> include/linux/resctrl.h | 11 +++++++
> 3 files changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 235cf621c878..f6790d03f056 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -106,6 +106,7 @@ extern unsigned int resctrl_cqm_threshold;
> extern bool rdt_alloc_capable;
> extern bool rdt_mon_capable;
> extern unsigned int rdt_mon_features;
> +extern struct list_head resctrl_schema_all;
>
> enum rdt_group_type {
> RDTCTRL_GROUP = 0,
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 9a8665c8ab89..14ea1212f476 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -39,6 +39,9 @@ static struct kernfs_root *rdt_root;
> struct rdtgroup rdtgroup_default;
> LIST_HEAD(rdt_all_groups);
>
> +/* list of entries for the schemata file */
> +LIST_HEAD(resctrl_schema_all);
> +
> /* Kernel fs node for "info" directory under root */
> static struct kernfs_node *kn_info;
>
> @@ -2109,6 +2112,35 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
> return ret;
> }
>
> +static int schemata_list_create(void)
> +{
> + struct resctrl_schema *s;
> + struct rdt_resource *r;
> +
> + for_each_alloc_enabled_rdt_resource(r) {
> + s = kzalloc(sizeof(*s), GFP_KERNEL);
> + if (!s)
> + return -ENOMEM;
> +
> + s->res = r;
> +
> + INIT_LIST_HEAD(&s->list);
> + list_add(&s->list, &resctrl_schema_all);
> + }
> +
> + return 0;
> +}
> +
> +static void schemata_list_destroy(void)
> +{
> + struct resctrl_schema *s, *tmp;
> +
> + list_for_each_entry_safe(s, tmp, &resctrl_schema_all, list) {
> + list_del(&s->list);
> + kfree(s);
> + }
> +}
> +
> static int rdt_get_tree(struct fs_context *fc)
> {
> struct rdt_fs_context *ctx = rdt_fc2context(fc);
> @@ -2130,11 +2162,17 @@ static int rdt_get_tree(struct fs_context *fc)
> if (ret < 0)
> goto out_cdp;
>
> + ret = schemata_list_create();
> + if (ret) {
> + schemata_list_destroy();
> + goto out_mba;
> + }
> +
> closid_init();
>
> ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
> if (ret < 0)
> - goto out_mba;
> + goto out_schemata_free;
>
> if (rdt_mon_capable) {
> ret = mongroup_create_dir(rdtgroup_default.kn,
> @@ -2184,6 +2222,8 @@ static int rdt_get_tree(struct fs_context *fc)
> kernfs_remove(kn_mongrp);
> out_info:
> kernfs_remove(kn_info);
> +out_schemata_free:
> + schemata_list_destroy();
> out_mba:
> if (ctx->enable_mba_mbps)
> set_mba_sc(false);
> @@ -2425,6 +2465,7 @@ static void rdt_kill_sb(struct super_block *sb)
> rmdir_all_sub();
> rdt_pseudo_lock_release();
> rdtgroup_default.mode = RDT_MODE_SHAREABLE;
> + schemata_list_destroy();
> static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
> static_branch_disable_cpuslocked(&rdt_mon_enable_key);
> static_branch_disable_cpuslocked(&rdt_enable_key);
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index be6f5df78e31..425e7913dc8d 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -154,4 +154,15 @@ struct rdt_resource {
>
> };
>
> +/**
> + * struct resctrl_schema - configuration abilities of a resource presented to
> + * user-space
> + * @list: Member of resctrl_schema_all.
> + * @res: The resource structure exported by the architecture to describe
> + * the hardware that is configured by this schema.
> + */
> +struct resctrl_schema {
> + struct list_head list;
> + struct rdt_resource *res;

It will be better to be consistent with the naming.
struct rdt_resource *resctrl;

> +};
> #endif /* _RESCTRL_H */
>

2021-06-15 17:53:57

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v4 02/24] x86/resctrl: Split struct rdt_domain



On 6/14/21 3:09 PM, James Morse wrote:
> resctrl is the defacto Linux ABI for SoC resource partitioning features.
>
> To support it on another architecture, it needs to be abstracted from
> the features provided by Intel RDT and AMD PQoS, and moved to /fs/.
> struct rdt_resource contains a mix of architecture private details
> and properties of the filesystem interface user-space users.
>
> Continue by splitting struct rdt_domain, into an architecture private
> 'hw' struct, which contains the common resctrl structure that would be
> used by any architecture. The hardware values in ctrl_val and mbps_val
> need to be accessed via helpers to allow another architecture to convert
> these into a different format if necessary. After this split, filesystem
> code paths touching a 'hw' struct indicates where an abstraction
> is needed.
>
> Splitting this structure only moves types around, and should not lead
> to any change in behaviour.
>
> Reviewed-by: Jamie Iles <[email protected]>
> Signed-off-by: James Morse <[email protected]>
> ---
> Changes since v3:
> * Removed a double word, removed a space.
>
> Changes since v2:
> * Shuffled commit message,
> * Changed kerneldoc text above rdt_hw_domain
>
> Changes since v1:
> * Tabs space and other whitespace
> * cpu becomes CPU in all comments touched
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 32 ++++++++++-------
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++--
> arch/x86/kernel/cpu/resctrl/internal.h | 43 +++++++----------------
> arch/x86/kernel/cpu/resctrl/monitor.c | 8 +++--
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 29 +++++++++------
> include/linux/resctrl.h | 32 ++++++++++++++++-
> 6 files changed, 94 insertions(+), 60 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 415d0f04efd7..aff5d0dde6c1 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -385,10 +385,11 @@ static void
> mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
> {
> unsigned int i;
> + struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>
> for (i = m->low; i < m->high; i++)
> - wrmsrl(hw_res->msr_base + i, d->ctrl_val[i]);
> + wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
> }
>
> /*
> @@ -410,21 +411,23 @@ mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
> struct rdt_resource *r)
> {
> unsigned int i;
> + struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>
> /* Write the delay values for mba. */
> for (i = m->low; i < m->high; i++)
> - wrmsrl(hw_res->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
> + wrmsrl(hw_res->msr_base + i, delay_bw_map(hw_dom->ctrl_val[i], r));
> }
>
> static void
> cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
> {
> unsigned int i;
> + struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>
> for (i = m->low; i < m->high; i++)
> - wrmsrl(hw_res->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
> + wrmsrl(hw_res->msr_base + cbm_idx(r, i), hw_dom->ctrl_val[i]);
> }
>
> struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
> @@ -510,21 +513,22 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
> static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
> {
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> + struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
> struct msr_param m;
> u32 *dc, *dm;
>
> - dc = kmalloc_array(hw_res->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
> + dc = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
> if (!dc)
> return -ENOMEM;
>
> - dm = kmalloc_array(hw_res->num_closid, sizeof(*d->mbps_val), GFP_KERNEL);
> + dm = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->mbps_val), GFP_KERNEL);
> if (!dm) {
> kfree(dc);
> return -ENOMEM;
> }
>
> - d->ctrl_val = dc;
> - d->mbps_val = dm;
> + hw_dom->ctrl_val = dc;
> + hw_dom->mbps_val = dm;
> setup_default_ctrlval(r, dc, dm);
>
> m.low = 0;
> @@ -586,6 +590,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
> {
> int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
> struct list_head *add_pos = NULL;
> + struct rdt_hw_domain *hw_dom;
> struct rdt_domain *d;
>
> d = rdt_find_domain(r, id, &add_pos);
> @@ -601,10 +606,11 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
> return;
> }
>
> - d = kzalloc_node(sizeof(*d), GFP_KERNEL, cpu_to_node(cpu));
> - if (!d)
> + hw_dom = kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu));
> + if (!hw_dom)
> return;
>
> + d = &hw_dom->resctrl;
> d->id = id;
> cpumask_set_cpu(cpu, &d->cpu_mask);
>
> @@ -633,6 +639,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
> static void domain_remove_cpu(int cpu, struct rdt_resource *r)
> {
> int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
> + struct rdt_hw_domain *hw_dom;
> struct rdt_domain *d;
>
> d = rdt_find_domain(r, id, NULL);
> @@ -640,6 +647,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
> pr_warn("Couldn't find cache id for CPU %d\n", cpu);
> return;
> }
> + hw_dom = resctrl_to_arch_dom(d);
>
> cpumask_clear_cpu(cpu, &d->cpu_mask);
> if (cpumask_empty(&d->cpu_mask)) {
> @@ -672,12 +680,12 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
> if (d->plr)
> d->plr->d = NULL;
>
> - kfree(d->ctrl_val);
> - kfree(d->mbps_val);
> + kfree(hw_dom->ctrl_val);
> + kfree(hw_dom->mbps_val);
> bitmap_free(d->rmid_busy_llc);
> kfree(d->mbm_total);
> kfree(d->mbm_local);
> - kfree(d);
> + kfree(hw_dom);
> return;
> }
>
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index ab6e584c9d2d..2e7466659af3 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -238,6 +238,7 @@ static int parse_line(char *line, struct rdt_resource *r,
>
> int update_domains(struct rdt_resource *r, int closid)
> {
> + struct rdt_hw_domain *hw_dom;
> struct msr_param msr_param;
> cpumask_var_t cpu_mask;
> struct rdt_domain *d;
> @@ -254,7 +255,8 @@ int update_domains(struct rdt_resource *r, int closid)
>
> mba_sc = is_mba_sc(r);
> list_for_each_entry(d, &r->domains, list) {
> - dc = !mba_sc ? d->ctrl_val : d->mbps_val;
> + hw_dom = resctrl_to_arch_dom(d);
> + dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;
> if (d->have_new_ctrl && d->new_ctrl != dc[closid]) {
> cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
> dc[closid] = d->new_ctrl;
> @@ -375,17 +377,19 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
>
> static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
> {
> + struct rdt_hw_domain *hw_dom;
> struct rdt_domain *dom;
> bool sep = false;
> u32 ctrl_val;
>
> seq_printf(s, "%*s:", max_name_width, r->name);
> list_for_each_entry(dom, &r->domains, list) {
> + hw_dom = resctrl_to_arch_dom(dom);
> if (sep)
> seq_puts(s, ";");
>
> - ctrl_val = (!is_mba_sc(r) ? dom->ctrl_val[closid] :
> - dom->mbps_val[closid]);
> + ctrl_val = (!is_mba_sc(r) ? hw_dom->ctrl_val[closid] :
> + hw_dom->mbps_val[closid]);
> seq_printf(s, r->format_str, dom->id, max_data_width,
> ctrl_val);
> sep = true;
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 43c8cf6b2b12..235cf621c878 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -299,44 +299,25 @@ struct mbm_state {
> };
>
> /**
> - * struct rdt_domain - group of cpus sharing an RDT resource
> - * @list: all instances of this resource
> - * @id: unique id for this instance
> - * @cpu_mask: which cpus share this resource
> - * @rmid_busy_llc:
> - * bitmap of which limbo RMIDs are above threshold
> - * @mbm_total: saved state for MBM total bandwidth
> - * @mbm_local: saved state for MBM local bandwidth
> - * @mbm_over: worker to periodically read MBM h/w counters
> - * @cqm_limbo: worker to periodically read CQM h/w counters
> - * @mbm_work_cpu:
> - * worker cpu for MBM h/w counters
> - * @cqm_work_cpu:
> - * worker cpu for CQM h/w counters
> + * struct rdt_hw_domain - Arch private attributes of a set of CPUs that share
> + * a resource
> + * @resctrl: Properties exposed to the resctrl file system
> * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID)
> * @mbps_val: When mba_sc is enabled, this holds the bandwidth in MBps
> - * @new_ctrl: new ctrl value to be loaded
> - * @have_new_ctrl: did user provide new_ctrl for this domain
> - * @plr: pseudo-locked region (if any) associated with domain
> + *
> + * Members of this structure are accessed via helpers that provide abstraction.
> */
> -struct rdt_domain {
> - struct list_head list;
> - int id;
> - struct cpumask cpu_mask;
> - unsigned long *rmid_busy_llc;
> - struct mbm_state *mbm_total;
> - struct mbm_state *mbm_local;
> - struct delayed_work mbm_over;
> - struct delayed_work cqm_limbo;
> - int mbm_work_cpu;
> - int cqm_work_cpu;
> +struct rdt_hw_domain {
> + struct rdt_domain resctrl;

Naming is a bit confusing here. There is another field with the same
name (patch 1).

+struct rdt_hw_resource {
+ struct rdt_resource resctrl;

I think we should make this a bit clearer. Maybe something like this, or similar:

struct rdt_hw_domain {
struct rdt_domain d_resctrl;
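
(and the container_of() helper updated to match, e.g.:)

static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r)
{
        return container_of(r, struct rdt_hw_domain, d_resctrl);
}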

> u32 *ctrl_val;
> u32 *mbps_val;
> - u32 new_ctrl;
> - bool have_new_ctrl;
> - struct pseudo_lock_region *plr;
> };
>
> +static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r)
> +{
> + return container_of(r, struct rdt_hw_domain, resctrl);
> +}
> +
> /**
> * struct msr_param - set a range of MSRs from a domain
> * @res: The resource to use
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 685e7f86d908..76eea10f2e2c 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -418,6 +418,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
> u32 closid, rmid, cur_msr, cur_msr_val, new_msr_val;
> struct mbm_state *pmbm_data, *cmbm_data;
> struct rdt_hw_resource *hw_r_mba;
> + struct rdt_hw_domain *hw_dom_mba;
> u32 cur_bw, delta_bw, user_bw;
> struct rdt_resource *r_mba;
> struct rdt_domain *dom_mba;
> @@ -438,11 +439,12 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
> pr_warn_once("Failure to get domain for MBA update\n");
> return;
> }
> + hw_dom_mba = resctrl_to_arch_dom(dom_mba);
>
> cur_bw = pmbm_data->prev_bw;
> - user_bw = dom_mba->mbps_val[closid];
> + user_bw = hw_dom_mba->mbps_val[closid];
> delta_bw = pmbm_data->delta_bw;
> - cur_msr_val = dom_mba->ctrl_val[closid];
> + cur_msr_val = hw_dom_mba->ctrl_val[closid];
>
> /*
> * For Ctrl groups read data from child monitor groups.
> @@ -479,7 +481,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
>
> cur_msr = hw_r_mba->msr_base + closid;
> wrmsrl(cur_msr, delay_bw_map(new_msr_val, r_mba));
> - dom_mba->ctrl_val[closid] = new_msr_val;
> + hw_dom_mba->ctrl_val[closid] = new_msr_val;
>
> /*
> * Delta values are updated dynamically package wise for each
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 5126a1e58d97..9a8665c8ab89 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -915,7 +915,7 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
> list_for_each_entry(dom, &r->domains, list) {
> if (sep)
> seq_putc(seq, ';');
> - ctrl = dom->ctrl_val;
> + ctrl = resctrl_to_arch_dom(dom)->ctrl_val;
> sw_shareable = 0;
> exclusive = 0;
> seq_printf(seq, "%d=", dom->id);
> @@ -1193,7 +1193,7 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d
> }
>
> /* Check for overlap with other resource groups */
> - ctrl = d->ctrl_val;
> + ctrl = resctrl_to_arch_dom(d)->ctrl_val;
> for (i = 0; i < closids_supported(); i++, ctrl++) {
> ctrl_b = *ctrl;
> mode = rdtgroup_mode_by_closid(i);
> @@ -1262,6 +1262,7 @@ bool rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d,
> */
> static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
> {
> + struct rdt_hw_domain *hw_dom;
> int closid = rdtgrp->closid;
> struct rdt_resource *r;
> bool has_cache = false;
> @@ -1272,7 +1273,8 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
> continue;
> has_cache = true;
> list_for_each_entry(d, &r->domains, list) {
> - if (rdtgroup_cbm_overlaps(r, d, d->ctrl_val[closid],
> + hw_dom = resctrl_to_arch_dom(d);
> + if (rdtgroup_cbm_overlaps(r, d, hw_dom->ctrl_val[closid],
> rdtgrp->closid, false)) {
> rdt_last_cmd_puts("Schemata overlaps\n");
> return false;
> @@ -1404,6 +1406,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
> static int rdtgroup_size_show(struct kernfs_open_file *of,
> struct seq_file *s, void *v)
> {
> + struct rdt_hw_domain *hw_dom;
> struct rdtgroup *rdtgrp;
> struct rdt_resource *r;
> struct rdt_domain *d;
> @@ -1438,14 +1441,15 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
> sep = false;
> seq_printf(s, "%*s:", max_name_width, r->name);
> list_for_each_entry(d, &r->domains, list) {
> + hw_dom = resctrl_to_arch_dom(d);
> if (sep)
> seq_putc(s, ';');
> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
> size = 0;
> } else {
> ctrl = (!is_mba_sc(r) ?
> - d->ctrl_val[rdtgrp->closid] :
> - d->mbps_val[rdtgrp->closid]);
> + hw_dom->ctrl_val[rdtgrp->closid] :
> + hw_dom->mbps_val[rdtgrp->closid]);
> if (r->rid == RDT_RESOURCE_MBA)
> size = ctrl;
> else
> @@ -1940,6 +1944,7 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r)
> static int set_mba_sc(bool mba_sc)
> {
> struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].resctrl;
> + struct rdt_hw_domain *hw_dom;
> struct rdt_domain *d;
>
> if (!is_mbm_enabled() || !is_mba_linear() ||
> @@ -1947,8 +1952,10 @@ static int set_mba_sc(bool mba_sc)
> return -EINVAL;
>
> r->membw.mba_sc = mba_sc;
> - list_for_each_entry(d, &r->domains, list)
> - setup_default_ctrlval(r, d->ctrl_val, d->mbps_val);
> + list_for_each_entry(d, &r->domains, list) {
> + hw_dom = resctrl_to_arch_dom(d);
> + setup_default_ctrlval(r, hw_dom->ctrl_val, hw_dom->mbps_val);
> + }
>
> return 0;
> }
> @@ -2265,6 +2272,7 @@ static int rdt_init_fs_context(struct fs_context *fc)
> static int reset_all_ctrls(struct rdt_resource *r)
> {
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> + struct rdt_hw_domain *hw_dom;
> struct msr_param msr_param;
> cpumask_var_t cpu_mask;
> struct rdt_domain *d;
> @@ -2283,10 +2291,11 @@ static int reset_all_ctrls(struct rdt_resource *r)
> * from each domain to update the MSRs below.
> */
> list_for_each_entry(d, &r->domains, list) {
> + hw_dom = resctrl_to_arch_dom(d);
> cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
>
> for (i = 0; i < hw_res->num_closid; i++)
> - d->ctrl_val[i] = r->default_ctrl;
> + hw_dom->ctrl_val[i] = r->default_ctrl;
> }
> cpu = get_cpu();
> /* Update CBM on this cpu if it's in cpu_mask. */
> @@ -2665,7 +2674,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct rdt_resource *r,
> d->have_new_ctrl = false;
> d->new_ctrl = r->cache.shareable_bits;
> used_b = r->cache.shareable_bits;
> - ctrl = d->ctrl_val;
> + ctrl = resctrl_to_arch_dom(d)->ctrl_val;
> for (i = 0; i < closids_supported(); i++, ctrl++) {
> if (closid_allocated(i) && i != closid) {
> mode = rdtgroup_mode_by_closid(i);
> @@ -2682,7 +2691,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct rdt_resource *r,
> * with an exclusive group.
> */
> if (d_cdp)
> - peer_ctl = d_cdp->ctrl_val[i];
> + peer_ctl = resctrl_to_arch_dom(d_cdp)->ctrl_val[i];
> else
> peer_ctl = 0;
> used_b |= *ctrl | peer_ctl;
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index ee92df9c7252..be6f5df78e31 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -15,7 +15,37 @@ int proc_resctrl_show(struct seq_file *m,
>
> #endif
>
> -struct rdt_domain;
> +/**
> + * struct rdt_domain - group of CPUs sharing a resctrl resource
> + * @list: all instances of this resource
> + * @id: unique id for this instance
> + * @cpu_mask: which CPUs share this resource
> + * @new_ctrl: new ctrl value to be loaded
> + * @have_new_ctrl: did user provide new_ctrl for this domain
> + * @rmid_busy_llc: bitmap of which limbo RMIDs are above threshold
> + * @mbm_total: saved state for MBM total bandwidth
> + * @mbm_local: saved state for MBM local bandwidth
> + * @mbm_over: worker to periodically read MBM h/w counters
> + * @cqm_limbo: worker to periodically read CQM h/w counters
> + * @mbm_work_cpu: worker CPU for MBM h/w counters
> + * @cqm_work_cpu: worker CPU for CQM h/w counters
> + * @plr: pseudo-locked region (if any) associated with domain
> + */
> +struct rdt_domain {
> + struct list_head list;
> + int id;
> + struct cpumask cpu_mask;
> + u32 new_ctrl;
> + bool have_new_ctrl;
> + unsigned long *rmid_busy_llc;
> + struct mbm_state *mbm_total;
> + struct mbm_state *mbm_local;
> + struct delayed_work mbm_over;
> + struct delayed_work cqm_limbo;
> + int mbm_work_cpu;
> + int cqm_work_cpu;
> + struct pseudo_lock_region *plr;
> +};
>
> /**
> * struct resctrl_cache - Cache allocation related data
>

2021-06-15 18:07:19

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 00/24] x86/resctrl: Merge the CDP resources

Hi James,

On 6/15/2021 9:48 AM, James Morse wrote:
> On 15/06/2021 17:16, Reinette Chatre wrote:
>> On 6/14/2021 1:09 PM, James Morse wrote:

>> For the most part I think this series looks good. The one thing I am concerned about is
>> the resctrl user interface change. On a system that supports L3 CDP there is a visible
>> change when CDP is not enabled:
>>
>> Before this series:
>> # cat schemata
>>    L3:0=fffff;1=fffff
>>
>> After this series:
>> # cat schemata
>> L3:0=fffff;1=fffff
>
> Hmm, I thought I'd fixed this with v2, ... I see this is subtly different.
>
> This could be tweaked by getting schemata_list_add() to include the length of the longest
> suffix if the resource supports CDP, but its not enabled. (Discovering that means
> cdp_capable moves to be something the 'fs' bits of resctrl can see.)
>
> I'm a little nervous 'adding 4 spaces' because user-space expects them. (I agree if it
> breaks user-space it has to be done). I guess this is the problem with string parsing as
> part of the interface!

This is a tricky and interesting one. It seems that the original
intended behavior is indeed the way you changed it to:
de016df88f23 ("x86/intel_rdt: Update schemata read to show data in
tabular format") originally used for_each_enabled_rdt_resource() to
determine the maximum width. That was added in v4.12 and dictated the
interface until v4.13. It changed in 1b5c0b758317 ("x86/intel_rdt:
Cleanup namespace to support RDT monitoring"), which used
for_each_alloc_capable_rdt_resource(r) instead. That landed in v4.14,
which is a stable kernel and the most likely interface used by users.
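
(For reference, the width is derived by walking the resources and
taking the longest name - roughly the sketch below; the exact helper
has moved around between the versions above:)

        struct rdt_resource *r;
        int cl;

        /* was for_each_enabled_rdt_resource() before 1b5c0b758317 */
        for_each_alloc_capable_rdt_resource(r) {
                cl = strlen(r->name);
                if (cl > max_name_width)
                        max_name_width = cl;
        }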

To me the less risky change is to maintain the existing interface, but
perhaps there is some other guidance in this regard?

> I assume that in the (distant) future having CDP capable resources with names more than 2
> characters isn't going to be a problem. (I don't have an example)

The last statement is not clear to me. Could you please elaborate on
why two characters would be significant? From what I understand, the
expectation would be that the width is the maximum name length of all
possible schemata, whether they are enabled or not.

>> There are a few user space tools that parse the resctrl schemata file and it may be easier
>> to keep the interface consistent than to find and audit them all to ensure they will keep
>> working.

To me this remains the biggest hurdle for the behavior as you have it
in this series.


>> A heads-up is that there are some kernel-doc fixups in the works that will conflict with
>> your series. You yourself fix at least one of these kernel-doc issues in this series - the
>> description of mbm_width in the first patch. I will ask the submitter of the kernel-doc
>> fixups to use your text to help with the merging.
>
> Please point me at something to rebase onto!
> (as far as I can see, tip/x86/cache hasn't moved)

These patches have not yet been merged. The most recent version was sent
yesterday. Your current base is good.

Reinette

2021-06-15 18:09:03

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 01/24] x86/resctrl: Split struct rdt_resource

Hi James,

On 6/14/2021 1:09 PM, James Morse wrote:
> resctrl is the defacto Linux ABI for SoC resource partitioning features.
>
> To support it on another architecture, it needs to be abstracted from
> the features provided by Intel RDT and AMD PQoS, and moved to /fs/.
> struct rdt_resource contains a mix of architecture private details
> and properties of the filesystem interface user-space users.

"user-space users" -> "user-space uses" ?

...

> +struct rdt_parse_data;
> +
> +/**
> + * struct rdt_resource - attributes of a resctrl resource
> + * @rid: The index of the resource
> + * @alloc_enabled: Is allocation enabled on this machine
> + * @mon_enabled: Is monitoring enabled for this feature
> + * @alloc_capable: Is allocation available on this machine
> + * @mon_capable: Is monitor feature available on this machine
> + * @num_rmid: Number of RMIDs available.
> + * @cache_level: Which cache level defines scope of this resource
> + * @cache: If the component has cache controls, their properties.
> + * @membw: If the component has bandwidth controls, their properties.
> + * @domains: All domains for this resource
> + * @name: Name to use in "schemata" file.
> + * @data_width: Character width of data when displaying.
> + * @default_ctrl: Specifies default cache cbm or memory B/W percent.
> + * @format_str: Per resource format string to show domain value
> + * @parse_ctrlval: Per resource function pointer to parse control values
> + *

Unexpected space here.

> + * @evt_list: List of monitoring events
> + * @fflags: flags to choose base and info files
> + */
> +struct rdt_resource {
> + int rid;
> + bool alloc_enabled;
> + bool mon_enabled;
> + bool alloc_capable;
> + bool mon_capable;
> + int num_rmid;
> + int cache_level;
> + struct resctrl_cache cache;
> + struct resctrl_membw membw;
> + struct list_head domains;
> + char *name;
> + int data_width;
> + u32 default_ctrl;
> + const char *format_str;
> + int (*parse_ctrlval)(struct rdt_parse_data *data,
> + struct rdt_resource *r,
> + struct rdt_domain *d);
> + struct list_head evt_list;
> + unsigned long fflags;
> +
> +};
> +
> #endif /* _RESCTRL_H */
>

Reinette

2021-06-15 18:09:44

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 09/24] x86/resctrl: Pass the schema to resctrl filesystem functions

Hi James,

On 6/14/2021 1:09 PM, James Morse wrote:
> Once the CDP resources are merged, there will be two struct
> resctrl_schema for one struct rdt_resource. CDP becomes a type of
> configuration that belongs to the schema.
>
> Heplers like rdtgroup_cbm_overlaps() need access to the schema to

Heplers -> Helpers

Reinette

2021-06-15 18:10:01

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 11/24] x86/resctrl: Move the schemata names into struct resctrl_schema

Hi James,

On 6/14/2021 1:09 PM, James Morse wrote:
> resctrl 'info' directories and schema parsing use the schema name.
> This lives in the struct rdt_resource, and is specified by the
> architecture code.
>
> Once the CDP resources are merged, there will only be one resource
> (and one name) in use by two schema. To allow the CDP CODE/DATA
> property to be the tyoe of configuration the schema uses, the

tyoe -> type

Reinette

2021-06-15 18:10:31

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 12/24] x86/resctrl: Group staged configuration into a separate struct

Hi James,

On 6/14/2021 1:09 PM, James Morse wrote:
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 8fa806c85cec..8fad1af8f15e 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -23,13 +23,21 @@ enum resctrl_conf_type {
> CDP_DATA,
> };
>
> +/**
> + * struct resctrl_staged_config - parsed configuration to be applied
> + * @new_ctrl: new ctrl value to be loaded
> + * @have_new_ctrl: whether the user provide new_ctrl is valid

user provide -> user provided?

Reinette

2021-06-15 18:10:56

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 05/24] x86/resctrl: Label the resources with their configuration type

Hi James,

On 6/14/2021 1:09 PM, James Morse wrote:
...

> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 425e7913dc8d..81073d0751c9 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -15,6 +15,14 @@ int proc_resctrl_show(struct seq_file *m,
>
> #endif
>
> +enum resctrl_conf_type {
> + /* No prioritisation, both code and data are controlled or monitored. */
> + CDP_NONE,
> +
> + CDP_CODE,
> + CDP_DATA,
> +};
> +


Please follow the style of the rest of this file - no inline
documentation; use proper kernel-doc instead.
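
E.g. something along these lines (the descriptions are illustrative):

/**
 * enum resctrl_conf_type - How a resource is configured.
 * @CDP_NONE: No prioritisation, both code and data are controlled or monitored.
 * @CDP_CODE: Configuration applies to instruction fetches.
 * @CDP_DATA: Configuration applies to data accesses.
 */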

Reinette

2021-06-15 18:11:29

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 02/24] x86/resctrl: Split struct rdt_domain

Hi James,

On 6/14/2021 1:09 PM, James Morse wrote:
> resctrl is the defacto Linux ABI for SoC resource partitioning features.
>
> To support it on another architecture, it needs to be abstracted from
> the features provided by Intel RDT and AMD PQoS, and moved to /fs/.
> struct rdt_resource contains a mix of architecture private details
> and properties of the filesystem interface user-space users.

rdt_resource -> rdt_domain ?
user-space users -> user-space uses ?

>
> Continue by splitting struct rdt_domain, into an architecture private
> 'hw' struct, which contains the common resctrl structure that would be
> used by any architecture. The hardware values in ctrl_val and mbps_val
> need to be accessed via helpers to allow another architecture to convert
> these into a different format if necessary. After this split, filesystem
> code paths touching a 'hw' struct indicates where an abstraction
> is needed.
>
> Splitting this structure only moves types around, and should not lead
> to any change in behaviour.
>
> Reviewed-by: Jamie Iles <[email protected]>
> Signed-off-by: James Morse <[email protected]>

Reinette

2021-06-15 18:11:49

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v4 18/24] x86/resctrl: Make ctrlval arrays the same size

Hi James,

On 6/14/2021 1:09 PM, James Morse wrote:
> The CODE and DATA resources report a num_closid that is half the
> actual size supported by the hardware. This behaviour is visible
> to user-space when CDP is enabled.
> The CODE and DATA resources have their own ctrlval arrays which are half
> the size of the underlying hardware because num_closid was already
> adjusted. One holds the odd configurations values, the other even.
>
> Before the CDP resources can be merged, the 'half the closids'
> behaviour needs to be implemented by schemata_list_create(), but
> this causes the ctrl_val[] array to be full sized.
>
> Remove the logic from the architecture specific rdt_get_cdp_config()
> setup, and add it to schemata_list_create(). Functions that
> walk take num_closid directly from struct rdt_hw_resource also

This is unclear to me ... "Functions that walk ..." seems to be
missing a description of what they are walking.

> have to halve num_closid as only the lower half of each array is
> in use. domain_setup_ctrlval() and reset_all_ctrls() both copy
> struct rdt_hw_resource's num_closid to a struct msr_param. Correct
> the value here. This is temporary as a subsequent patch will merge
> the all three ctrl_val[] arrays such that when CDP is in use, the

the all three -> all three ?

Reinette

2021-06-17 17:04:01

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v4 03/24] x86/resctrl: Add a separate schema list for resctrl

Hi Babu,

On 15/06/2021 18:51, Babu Moger wrote:
> On 6/14/21 3:09 PM, James Morse wrote:
>> Resctrl exposes schemata to user-space, which allow the control values
>> to be specified for a group of tasks.
>>
>> User-visible properties of the interface, (such as the schemata names
>> and how the values are parsed) are rooted in a struct provided by the
>> architecture code. (struct rdt_hw_resource). Once a second architecture
>> uses resctrl, this would allow user-visible properties to diverge
>> between architectures.
>>
>> These properties should come from the resctrl code that will be common
>> to all architectures. Resctrl has no per-schema structure, only struct
>> rdt_{hw_,}resource. Create a struct resctrl_schema to hold the
>> rdt_resource. Before a second architecture can be supported, this
>> structure will also need to hold the schema name visible to user-space
>> and the type of configuration values for resctrl.

>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index be6f5df78e31..425e7913dc8d 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -154,4 +154,15 @@ struct rdt_resource {
>>
>> };
>>
>> +/**
>> + * struct resctrl_schema - configuration abilities of a resource presented to
>> + * user-space
>> + * @list: Member of resctrl_schema_all.
>> + * @res: The resource structure exported by the architecture to describe
>> + * the hardware that is configured by this schema.
>> + */
>> +struct resctrl_schema {
>> + struct list_head list;
>> + struct rdt_resource *res;

> It will be better to be consistent with the naming.
> struct rdt_resource *resctrl;

Consistent with what? The rest of the code base conventionally calls a resource 'r'.

The structures with '_hw_' are private to the architecture; the 'resctrl' member there
is the part of the struct that the architecture makes visible to the filesystem code,
hence the name.

That pattern doesn't apply here: 'struct resctrl_schema' is an invention of the
filesystem; the architecture code doesn't need to know anything about it. (This is why it
goes in include/linux from the beginning.)
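
To show where this is heading, the later patches grow it to roughly the following (the
exact field names/sizes may differ):

struct resctrl_schema {
        enum resctrl_conf_type  conf_type;      /* CODE/DATA/BOTH for this schema */
        char                    name[8];        /* name shown in the schemata file */
        struct list_head        list;
        struct rdt_resource     *res;
};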


I think the choices are 'r' (too terse), 'resource' (too much typing, and obvious from the
type), or 'res'...


Thanks!

James

2021-06-17 17:04:58

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v4 18/24] x86/resctrl: Make ctrlval arrays the same size

Hi Reinette,

On 15/06/2021 19:09, Reinette Chatre wrote:
> On 6/14/2021 1:09 PM, James Morse wrote:
>> The CODE and DATA resources report a num_closid that is half the
>> actual size supported by the hardware. This behaviour is visible
>> to user-space when CDP is enabled.
>> The CODE and DATA resources have their own ctrlval arrays which are half
>> the size of the underlying hardware because num_closid was already
>> adjusted. One holds the odd configurations values, the other even.
>>
>> Before the CDP resources can be merged, the 'half the closids'
>> behaviour needs to be implemented by schemata_list_create(), but
>> this causes the ctrl_val[] array to be full sized.
>>
>> Remove the logic from the architecture specific rdt_get_cdp_config()
>> setup, and add it to schemata_list_create(). Functions that
>> walk take num_closid directly from struct rdt_hw_resource also
>
> This is unclear to me ... "Functions that walk ..." seems like it is missing to describe
> what they are walking.

Yup, I'm missing at least a word here! Fixed as:
| functions that walk all the configurations, such as domain_setup_ctrlval() and
| reset_all_ctrls(), take num_closid directly from...


>> have to halve num_closid as only the lower half of each array is
>> in use. domain_setup_ctrlval() and reset_all_ctrls() both copy
>> struct rdt_hw_resource's num_closid to a struct msr_param. Correct
>> the value here. This is temporary as a subsequent patch will merge
>> the all three ctrl_val[] arrays such that when CDP is in use, the
>
> the all three -> all three ?

Yes. Thanks!

(I've never managed to spot things like this in text I wrote)


Thanks,

James

2021-06-17 17:05:06

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v4 02/24] x86/resctrl: Split struct rdt_domain

Hi Babu,

On 15/06/2021 18:51, Babu Moger wrote:
> On 6/14/21 3:09 PM, James Morse wrote:
>> resctrl is the defacto Linux ABI for SoC resource partitioning features.
>>
>> To support it on another architecture, it needs to be abstracted from
>> the features provided by Intel RDT and AMD PQoS, and moved to /fs/.
>> struct rdt_resource contains a mix of architecture private details
>> and properties of the filesystem interface user-space users.
>>
>> Continue by splitting struct rdt_domain, into an architecture private
>> 'hw' struct, which contains the common resctrl structure that would be
>> used by any architecture. The hardware values in ctrl_val and mbps_val
>> need to be accessed via helpers to allow another architecture to convert
>> these into a different format if necessary. After this split, filesystem
>> code paths touching a 'hw' struct indicates where an abstraction
>> is needed.
>>
>> Splitting this structure only moves types around, and should not lead
>> to any change in behaviour.

>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 43c8cf6b2b12..235cf621c878 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -299,44 +299,25 @@ struct mbm_state {

>> -struct rdt_domain {
>> - struct list_head list;
>> - int id;
>> - struct cpumask cpu_mask;
>> - unsigned long *rmid_busy_llc;
>> - struct mbm_state *mbm_total;
>> - struct mbm_state *mbm_local;
>> - struct delayed_work mbm_over;
>> - struct delayed_work cqm_limbo;
>> - int mbm_work_cpu;
>> - int cqm_work_cpu;
>> +struct rdt_hw_domain {
>> + struct rdt_domain resctrl;


> Naming is bit confusing here. There is another field with the same
> name(patch1).

But it's a totally different type; you'd only access its members via the resource or domain,
so it's always clear which one it is. (And if you get them wrong, it won't build.)
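
E.g. a caller that needs the hardware values does something like:

        struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);

        hw_dom->ctrl_val[closid] = val;         /* arch private */
        cpumask_set_cpu(cpu, &d->cpu_mask);     /* visible to resctrl */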


> +struct rdt_hw_resource {
> + struct rdt_resource resctrl;
>
> I think we should make this bit more clearer. May be or something similar.
>
> struct rdt_hw_domain {
> struct rdt_domain d_resctrl;
Sure, I guess it makes it clearer when quoting something.


Thanks,

James

2021-06-17 17:05:57

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v4 00/24] x86/resctrl: Merge the CDP resources

Hi Reinette,

On 15/06/2021 19:05, Reinette Chatre wrote:
> On 6/15/2021 9:48 AM, James Morse wrote:
>> On 15/06/2021 17:16, Reinette Chatre wrote:
>>> On 6/14/2021 1:09 PM, James Morse wrote:
>>> For the most part I think this series looks good. The one thing I am concerned about is
>>> the resctrl user interface change. On a system that supports L3 CDP there is a visible
>>> change when CDP is not enabled:
>>>
>>> Before this series:
>>> # cat schemata
>>>     L3:0=fffff;1=fffff
>>>
>>> After this series:
>>> # cat schemata
>>> L3:0=fffff;1=fffff
>>
>> Hmm, I thought I'd fixed this with v2, ... I see this is subtly different.
>>
>> This could be tweaked by getting schemata_list_add() to include the length of the longest
>> suffix if the resource supports CDP, but its not enabled. (Discovering that means
>> cdp_capable moves to be something the 'fs' bits of resctrl can see.)

>> I'm a little nervous 'adding 4 spaces' because user-space expects them. (I agree if it
>> breaks user-space it has to be done). I guess this is the problem with string parsing as
>> part of the interface!

> This is a tricky and interesting one. It seems that the original intended behavior is
> indeed the way you changed it to. By originally using for_each_enabled_rdt_resource() to
> determine the maximum width in de016df88f23 ("x86/intel_rdt: Update schemata read to show
> data in tabular format"). This was added in v4.12 and dictated the interface until v4.13.
> This was changed in 1b5c0b758317 ("x86/intel_rdt: Cleanup namespace to support RDT
> monitoring") when it used for_each_alloc_capable_rdt_resource(r) instead, added in v4.14
> that is a stable kernel and the most likely interface used by users.
>
> To me the less risky change is to maintain the existing interface, but perhaps there are
> some other guidance in this regard?

I think this is just the problem with having anything other than the 'one value per file'
rule that sysfs uses. Maintaining it means getting painted into a corner by the worst
parser user-space manages to come up with!


>> I assume that in the (distant) future having CDP capable resources with names more than 2
>> characters isn't going to be a problem. (I don't have an example)
>
> The last statement is not clear to me. Could you please elaborate why two characters would
> be significant? From what I understand the expectation would be that the width is the
> maximum name length of all possible schema, whether they are enabled or not.

Great - I was nervous that if shorter strings are a problem, longer ones might be too.

( Arm SoCs often have a system-cache that lives between the LLC and DRAM. It's not a CPU
cache, so it's not really L4. Because of the way CDP gets emulated, it affects all caches.
If people build these things with MPAM support - and we choose to add a schema for them -
you'd end up with 'SYSTEM-CACHECODE' and 'SYSTEM-CACHEDATA'. It's not a real example, as
if it's needed, 'SC' is probably acceptable.)


>>> There are a few user space tools that parse the resctrl schemata file and it may be easier
>>> to keep the interface consistent than to find and audit them all to ensure they will keep
>>> working.
>
> To me this continues the biggest hurdle in maintaining the behavior as you have it in this
> series.

No problem - I've changed it as described.
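
(Roughly, when building the schema list the width calculation becomes something like the
sketch below; 'cdp_capable'/'cdp_enabled' are the flags discussed above, the exact shape
may differ:)

        cl = strlen(s->name);
        /*
         * If CDP is supported by this resource but not enabled, include the
         * CODE/DATA suffix in the width so the padding doesn't change when
         * CDP is toggled.
         */
        if (r->cdp_capable && !r->cdp_enabled)
                cl += 4;        /* strlen("CODE") == strlen("DATA") */
        if (cl > max_name_width)
                max_name_width = cl;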


>>> A heads-up is that there are some kernel-doc fixups in the works that will conflict with
>>> your series. You yourself fix at least one of these kernel-doc issues in this series - the
>>> description of mbm_width in the first patch. I will ask the submitter of the kernel-doc
>>> fixups to use your text to help with the merging.
>>
>> Please point me at something to rebase onto!
>> (as far as I can see, tip/x86/cache hasn't moved)
>
> These patches have not yet been merged. The most recent version was sent yesterday. Your
> current base is good.

I've based it on tip/master, which merged rc6...


Thanks,

James