2018-08-24 10:47:37

by James Morse

Subject: [RFC PATCH 00/20] x86/intel_rdt: Start abstraction for a second arch

Hi folks,

ARM have some upcoming CPU features that are similar to Intel RDT. Resctrl
is the de facto ABI for this sort of thing, but it lives under arch/x86.

To get existing software working, we need to make resctrl work with arm64.
This series is the first chunk of that. The aim is to move the filesystem/ABI
parts into /fs/resctrl, and implement a second arch backend.


What are the ARM features?
Future ARM SoCs may have a feature called MPAM: Memory Partitioning and
Monitoring. This is an umbrella term like RDT, and covers a range of controls
(like CAT) and monitors (like MBM, CMT).

This series is almost all about CDP. MPAM has equivalent functionality, but
it doesn't need enabling, and doesn't affect the available closids. (I'll
try to use Intel terms.) MPAM expects the equivalent of IA32_PQR_ASSOC to
be configured with an Instruction closid and a Data closid. These are the
same when CDP is disabled, and different otherwise. There is no need for
them to be adjacent.
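
As a rough sketch of the difference (mine, not from the series; the
names are invented):

/* x86 CDP: one closid picks an adjacent code/data pair of entries. */
static unsigned int cdp_cfg_idx(unsigned int closid, bool code)
{
	return closid * 2 + (code ? 1 : 0);
}

/*
 * MPAM: instruction and data closids are programmed independently,
 * and need not be adjacent or otherwise related.
 */
struct mpam_closid_pair {
	u16 instr_closid;
	u16 data_closid;	/* == instr_closid when CDP is off */
};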

To avoid emulating CDP in arm64's arch code, this series moves all the ABI
parts of the CDP behaviour (halving the closid space, with each closid
having two configurations) into the filesystem parts of resctrl. These will
eventually be moved to /fs/.

MPAM's control and monitor configuration is all memory mapped, and the base
addresses are discovered via firmware tables, so we won't have a static
table of possible resources that just need alloc_enabling.

Is this it? No: there are another two series of a similar size that
abstract the MBM/CMT overflow threads and keep the 'fs' code from accessing
things that have moved into the 'hw' arch-specific struct.


I'm after feedback on the general approach taken here, on bugs (as there
are certainly subtleties I've missed), and on any strong opinions about
what should be arch-specific and what shouldn't.

This series is based on v4.18, and can be retrieved from:
git://linux-arm.org/linux-jm.git -b mpam/resctrl_rework/rfc_1


Thanks,

James Morse (20):
x86/intel_rdt: Split struct rdt_resource
x86/intel_rdt: Split struct rdt_domain
x86/intel_rdt: Group staged configuration into a separate struct
x86/intel_rdt: Add closid to the staged config
x86/intel_rdt: Make update_domains() learn the affected closids
x86/intel_rdt: Add a helper to read a closid's configuration for
show_doms()
x86/intel_rdt: Expose update_domains() as an arch helper
x86/intel_rdt: Make cdp enable/disable global
x86/intel_rdt: Track the actual number of closids separately
x86/intel_rdt: Let resctrl change the resource's num_closid
x86/intel_rdt: Pass in the code/data/both configuration value when
parsing
x86/intel_rdt: Correct the closid when staging configuration changes
x86/intel_rdt: Allow different CODE/DATA configurations to be staged
x86/intel_rdt: Add a separate resource list for resctrl
x86/intel_rdt: Walk the resctrl schema list instead of the arch's
resource list
x86/intel_rdt: Move the schemata names into struct resctrl_schema
x86/intel_rdt: Stop using Lx CODE/DATA resources
x86/intel_rdt: Remove the CODE/DATA illusionary caches
x86/intel_rdt: Kill off alloc_enabled
x86/intel_rdt: Merge cdp enable/disable calls

arch/x86/kernel/cpu/intel_rdt.c | 298 +++++++-------------
arch/x86/kernel/cpu/intel_rdt.h | 161 ++++-------
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 142 +++++++---
arch/x86/kernel/cpu/intel_rdt_monitor.c | 78 ++---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 216 +++++++++-----
include/linux/resctrl.h | 166 +++++++++++
6 files changed, 621 insertions(+), 440 deletions(-)
create mode 100644 include/linux/resctrl.h

--
2.18.0



2018-08-24 10:47:35

by James Morse

Subject: [RFC PATCH 01/20] x86/intel_rdt: Split struct rdt_resource

resctrl is the de facto Linux ABI for SoC resource partitioning features.
To support it on another architecture, we need to abstract it from
Intel RDT, and move it to /fs/.

Let's start by splitting struct rdt_resource (the name is kept for now
to keep the noise down), and adding some type-trickery to keep the
for-each helpers working.

Move everything that is particular to resctrl into a new header
file, keeping the x86 MSR-specific stuff where it is. Resctrl code
paths that touch a 'hw' struct indicate where an abstraction is needed.

We split rdt_domain up in a similar way in the next patch.
No change in behaviour; this patch just moves types around.
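
The type-trickery in question: the resctrl-visible struct is embedded
in an arch-private wrapper, container_of() recovers the wrapper, and
the for-each macros step through the wrapper array while only ever
handing out the embedded part. Condensed from the hunks below:

struct rdt_hw_resource {
	struct rdt_resource resctrl;	/* what resctrl sees */
	/* ... x86 MSR-specific members ... */
};

static inline struct rdt_hw_resource *resctrl_to_rdt(struct rdt_resource *r)
{
	return container_of(r, struct rdt_hw_resource, resctrl);
}

/* Advance to the next resource, for the for_each_*_rdt_resource() macros. */
static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
{
	struct rdt_hw_resource *hw_res = resctrl_to_rdt(res);

	hw_res++;
	return &hw_res->resctrl;
}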

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 193 +++++++++++---------
arch/x86/kernel/cpu/intel_rdt.h | 112 +++---------
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 6 +-
arch/x86/kernel/cpu/intel_rdt_monitor.c | 23 ++-
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 37 ++--
include/linux/resctrl.h | 103 +++++++++++
6 files changed, 275 insertions(+), 199 deletions(-)
create mode 100644 include/linux/resctrl.h

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index ec4754f81cbd..8cb2639b8a56 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -64,122 +64,137 @@ mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
static void
cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);

-#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].domains)
+#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].resctrl.domains)

-struct rdt_resource rdt_resources_all[] = {
+struct rdt_hw_resource rdt_resources_all[] = {
[RDT_RESOURCE_L3] =
{
.rid = RDT_RESOURCE_L3,
- .name = "L3",
- .domains = domain_init(RDT_RESOURCE_L3),
+ .resctrl = {
+ .name = "L3",
+ .cache_level = 3,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 1,
+ .cbm_idx_offset = 0,
+ },
+ .domains = domain_init(RDT_RESOURCE_L3),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L3DATA] =
{
.rid = RDT_RESOURCE_L3DATA,
- .name = "L3DATA",
- .domains = domain_init(RDT_RESOURCE_L3DATA),
+ .resctrl = {
+ .name = "L3DATA",
+ .cache_level = 3,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 2,
+ .cbm_idx_offset = 0,
+ },
+ .domains = domain_init(RDT_RESOURCE_L3DATA),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
+
},
[RDT_RESOURCE_L3CODE] =
{
.rid = RDT_RESOURCE_L3CODE,
- .name = "L3CODE",
- .domains = domain_init(RDT_RESOURCE_L3CODE),
+ .resctrl = {
+ .name = "L3CODE",
+ .cache_level = 3,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 2,
+ .cbm_idx_offset = 1,
+ },
+ .domains = domain_init(RDT_RESOURCE_L3CODE),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L2] =
{
.rid = RDT_RESOURCE_L2,
- .name = "L2",
- .domains = domain_init(RDT_RESOURCE_L2),
+ .resctrl = {
+ .name = "L2",
+ .cache_level = 2,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 1,
+ .cbm_idx_offset = 0,
+ },
+ .domains = domain_init(RDT_RESOURCE_L2),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L2DATA] =
{
.rid = RDT_RESOURCE_L2DATA,
- .name = "L2DATA",
- .domains = domain_init(RDT_RESOURCE_L2DATA),
+ .resctrl = {
+ .name = "L2DATA",
+ .cache_level = 2,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 2,
+ .cbm_idx_offset = 0,
+ },
+ .domains = domain_init(RDT_RESOURCE_L2DATA),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L2CODE] =
{
.rid = RDT_RESOURCE_L2CODE,
- .name = "L2CODE",
- .domains = domain_init(RDT_RESOURCE_L2CODE),
+ .resctrl = {
+ .name = "L2CODE",
+ .cache_level = 2,
+ .cache = {
+ .min_cbm_bits = 1,
+ .cbm_idx_mult = 2,
+ .cbm_idx_offset = 1,
+ },
+ .domains = domain_init(RDT_RESOURCE_L2CODE),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
+ },
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_MBA] =
{
.rid = RDT_RESOURCE_MBA,
- .name = "MB",
- .domains = domain_init(RDT_RESOURCE_MBA),
+ .resctrl = {
+ .name = "MB",
+ .cache_level = 3,
+ .domains = domain_init(RDT_RESOURCE_MBA),
+ .parse_ctrlval = parse_bw,
+ .format_str = "%d=%*u",
+ .fflags = RFTYPE_RES_MB,
+ },
.msr_base = IA32_MBA_THRTL_BASE,
.msr_update = mba_wrmsr,
- .cache_level = 3,
- .parse_ctrlval = parse_bw,
- .format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
},
};

@@ -208,7 +223,7 @@ static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid)
*/
static inline void cache_alloc_hsw_probe(void)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
u32 l, h, max_cbm = BIT_MASK(20) - 1;

if (wrmsr_safe(IA32_L3_CBM_BASE, max_cbm, 0))
@@ -233,7 +248,7 @@ static inline void cache_alloc_hsw_probe(void)
bool is_mba_sc(struct rdt_resource *r)
{
if (!r)
- return rdt_resources_all[RDT_RESOURCE_MBA].membw.mba_sc;
+ return rdt_resources_all[RDT_RESOURCE_MBA].resctrl.membw.mba_sc;

return r->membw.mba_sc;
}
@@ -303,8 +318,8 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)

static void rdt_get_cdp_config(int level, int type)
{
- struct rdt_resource *r_l = &rdt_resources_all[level];
- struct rdt_resource *r = &rdt_resources_all[type];
+ struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
+ struct rdt_resource *r = &rdt_resources_all[type].resctrl;

r->num_closid = r_l->num_closid / 2;
r->cache.cbm_len = r_l->cache.cbm_len;
@@ -362,19 +377,21 @@ static void
mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);

/* Write the delay values for mba. */
for (i = m->low; i < m->high; i++)
- wrmsrl(r->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
+ wrmsrl(hw_res->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
}

static void
cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);

for (i = m->low; i < m->high; i++)
- wrmsrl(r->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
}

struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
@@ -394,12 +411,13 @@ void rdt_ctrl_update(void *arg)
{
struct msr_param *m = arg;
struct rdt_resource *r = m->res;
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);
int cpu = smp_processor_id();
struct rdt_domain *d;

d = get_domain_from_cpu(cpu, r);
if (d) {
- r->msr_update(d, m, r);
+ hw_res->msr_update(d, m, r);
return;
}
pr_warn_once("cpu %d not found in any domain for resource %s\n",
@@ -457,6 +475,7 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)

static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);
struct msr_param m;
u32 *dc, *dm;

@@ -476,7 +495,7 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)

m.low = 0;
m.high = r->num_closid;
- r->msr_update(d, &m, r);
+ hw_res->msr_update(d, &m, r);
return 0;
}

@@ -619,7 +638,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
return;
}

- if (r == &rdt_resources_all[RDT_RESOURCE_L3]) {
+ if (r == &rdt_resources_all[RDT_RESOURCE_L3].resctrl) {
if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
cancel_delayed_work(&d->mbm_over);
mbm_setup_overflow_handler(d, 0);
@@ -800,21 +819,21 @@ static __init bool get_rdt_alloc_resources(void)
return false;

if (rdt_cpu_has(X86_FEATURE_CAT_L3)) {
- rdt_get_cache_alloc_cfg(1, &rdt_resources_all[RDT_RESOURCE_L3]);
+ rdt_get_cache_alloc_cfg(1, &rdt_resources_all[RDT_RESOURCE_L3].resctrl);
if (rdt_cpu_has(X86_FEATURE_CDP_L3))
rdt_get_cdp_l3_config();
ret = true;
}
if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
/* CPUID 0x10.2 fields are same format at 0x10.1 */
- rdt_get_cache_alloc_cfg(2, &rdt_resources_all[RDT_RESOURCE_L2]);
+ rdt_get_cache_alloc_cfg(2, &rdt_resources_all[RDT_RESOURCE_L2].resctrl);
if (rdt_cpu_has(X86_FEATURE_CDP_L2))
rdt_get_cdp_l2_config();
ret = true;
}

if (rdt_cpu_has(X86_FEATURE_MBA)) {
- if (rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
+ if (rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA].resctrl))
ret = true;
}
return ret;
@@ -832,7 +851,7 @@ static __init bool get_rdt_mon_resources(void)
if (!rdt_mon_features)
return false;

- return !rdt_get_mon_l3_config(&rdt_resources_all[RDT_RESOURCE_L3]);
+ return !rdt_get_mon_l3_config(&rdt_resources_all[RDT_RESOURCE_L3].resctrl);
}

static __init void rdt_quirks(void)
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index 39752825e376..20a6674ac67c 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -2,6 +2,7 @@
#ifndef _ASM_X86_INTEL_RDT_H
#define _ASM_X86_INTEL_RDT_H

+#include <linux/resctrl.h>
#include <linux/sched.h>
#include <linux/kernfs.h>
#include <linux/jump_label.h>
@@ -246,44 +247,6 @@ struct msr_param {
int high;
};

-/**
- * struct rdt_cache - Cache allocation related data
- * @cbm_len: Length of the cache bit mask
- * @min_cbm_bits: Minimum number of consecutive bits to be set
- * @cbm_idx_mult: Multiplier of CBM index
- * @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
- * closid * cbm_idx_multi + cbm_idx_offset
- * in a cache bit mask
- * @shareable_bits: Bitmask of shareable resource with other
- * executing entities
- */
-struct rdt_cache {
- unsigned int cbm_len;
- unsigned int min_cbm_bits;
- unsigned int cbm_idx_mult;
- unsigned int cbm_idx_offset;
- unsigned int shareable_bits;
-};
-
-/**
- * struct rdt_membw - Memory bandwidth allocation related data
- * @max_delay: Max throttle delay. Delay is the hardware
- * representation for memory bandwidth.
- * @min_bw: Minimum memory bandwidth percentage user can request
- * @bw_gran: Granularity at which the memory bandwidth is allocated
- * @delay_linear: True if memory B/W delay is in linear scale
- * @mba_sc: True if MBA software controller(mba_sc) is enabled
- * @mb_map: Mapping of memory B/W percentage to memory B/W delay
- */
-struct rdt_membw {
- u32 max_delay;
- u32 min_bw;
- u32 bw_gran;
- u32 delay_linear;
- bool mba_sc;
- u32 *mb_map;
-};
-
static inline bool is_llc_occupancy_enabled(void)
{
return (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID));
@@ -312,59 +275,33 @@ static inline bool is_mbm_event(int e)

/**
* struct rdt_resource - attributes of an RDT resource
+ * @resctrl: Properties exposed to the resctrl filesystem
* @rid: The index of the resource
- * @alloc_enabled: Is allocation enabled on this machine
- * @mon_enabled: Is monitoring enabled for this feature
- * @alloc_capable: Is allocation available on this machine
- * @mon_capable: Is monitor feature available on this machine
- * @name: Name to use in "schemata" file
- * @num_closid: Number of CLOSIDs available
- * @cache_level: Which cache level defines scope of this resource
- * @default_ctrl: Specifies default cache cbm or memory B/W percent.
* @msr_base: Base MSR address for CBMs
* @msr_update: Function pointer to update QOS MSRs
- * @data_width: Character width of data when displaying
- * @domains: All domains for this resource
- * @cache: Cache allocation related data
- * @format_str: Per resource format string to show domain value
- * @parse_ctrlval: Per resource function pointer to parse control values
- * @evt_list: List of monitoring events
- * @num_rmid: Number of RMIDs available
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
* @fflags: flags to choose base and info files
*/
-struct rdt_resource {
+struct rdt_hw_resource {
+ struct rdt_resource resctrl;
int rid;
- bool alloc_enabled;
- bool mon_enabled;
- bool alloc_capable;
- bool mon_capable;
- char *name;
- int num_closid;
- int cache_level;
- u32 default_ctrl;
unsigned int msr_base;
void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
- int data_width;
- struct list_head domains;
- struct rdt_cache cache;
- struct rdt_membw membw;
- const char *format_str;
- int (*parse_ctrlval) (char *buf, struct rdt_resource *r,
- struct rdt_domain *d);
- struct list_head evt_list;
- int num_rmid;
unsigned int mon_scale;
- unsigned long fflags;
};

+static inline struct rdt_hw_resource *resctrl_to_rdt(struct rdt_resource *r)
+{
+ return container_of(r, struct rdt_hw_resource, resctrl);
+}
+
int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d);
int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d);

extern struct mutex rdtgroup_mutex;

-extern struct rdt_resource rdt_resources_all[];
+extern struct rdt_hw_resource rdt_resources_all[];
extern struct rdtgroup rdtgroup_default;
DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);

@@ -383,29 +320,38 @@ enum {
RDT_NUM_RESOURCES,
};

+static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
+{
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(res);
+
+ hw_res++;
+ return &hw_res->resctrl;
+}
+
+
#define for_each_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for (r = &rdt_resources_all[0].resctrl; r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl;\
+ r = resctrl_inc(r)) \
if (r->alloc_capable || r->mon_capable)

#define for_each_alloc_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for (r = &rdt_resources_all[0].resctrl; r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl;\
+ r = resctrl_inc(r)) \
if (r->alloc_capable)

#define for_each_mon_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for (r = &rdt_resources_all[0].resctrl; r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl;\
+ r = resctrl_inc(r)) \
if (r->mon_capable)

#define for_each_alloc_enabled_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for (r = &rdt_resources_all[0].resctrl; r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl;\
+ r = resctrl_inc(r)) \
if (r->alloc_enabled)

#define for_each_mon_enabled_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for (r = &rdt_resources_all[0].resctrl; r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl;\
+ r = resctrl_inc(r)) \
if (r->mon_enabled)

/* CPUID.(EAX=10H, ECX=ResID=1).EAX */
diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index 116d57b248d3..58890612ca8d 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -348,6 +348,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_domain *d,
int rdtgroup_mondata_show(struct seq_file *m, void *arg)
{
struct kernfs_open_file *of = m->private;
+ struct rdt_hw_resource *hw_res;
u32 resid, evtid, domid;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
@@ -363,7 +364,8 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
domid = md.u.domid;
evtid = md.u.evtid;

- r = &rdt_resources_all[resid];
+ hw_res = &rdt_resources_all[resid];
+ r = &hw_res->resctrl;
d = rdt_find_domain(r, domid, NULL);
if (!d) {
ret = -ENOENT;
@@ -377,7 +379,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
else if (rr.val & RMID_VAL_UNAVAIL)
seq_puts(m, "Unavailable\n");
else
- seq_printf(m, "%llu\n", rr.val * r->mon_scale);
+ seq_printf(m, "%llu\n", rr.val * hw_res->mon_scale);

out:
rdtgroup_kn_unlock(of->kn);
diff --git a/arch/x86/kernel/cpu/intel_rdt_monitor.c b/arch/x86/kernel/cpu/intel_rdt_monitor.c
index b0f3aed76b75..493d264a0dbe 100644
--- a/arch/x86/kernel/cpu/intel_rdt_monitor.c
+++ b/arch/x86/kernel/cpu/intel_rdt_monitor.c
@@ -122,7 +122,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
struct rdt_resource *r;
u32 crmid = 1, nrmid;

- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;

/*
* Skip RMID 0 and start from RMID 1 and check all the RMIDs that
@@ -180,7 +180,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
int cpu;
u64 val;

- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;

entry->busy = 0;
cpu = get_cpu();
@@ -281,7 +281,7 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
*/
static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_hw_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];
struct mbm_state *m = &rr->d->mbm_local[rmid];
u64 tval, cur_bw, chunks;

@@ -365,13 +365,15 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
{
u32 closid, rmid, cur_msr, cur_msr_val, new_msr_val;
struct mbm_state *pmbm_data, *cmbm_data;
+ struct rdt_hw_resource *hw_r_mba;
u32 cur_bw, delta_bw, user_bw;
struct rdt_resource *r_mba;
struct rdt_domain *dom_mba;
struct list_head *head;
struct rdtgroup *entry;

- r_mba = &rdt_resources_all[RDT_RESOURCE_MBA];
+ hw_r_mba = &rdt_resources_all[RDT_RESOURCE_MBA];
+ r_mba = &hw_r_mba->resctrl;
closid = rgrp->closid;
rmid = rgrp->mon.rmid;
pmbm_data = &dom_mbm->mbm_local[rmid];
@@ -420,7 +422,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
return;
}

- cur_msr = r_mba->msr_base + closid;
+ cur_msr = hw_r_mba->msr_base + closid;
wrmsrl(cur_msr, delay_bw_map(new_msr_val, r_mba));
dom_mba->ctrl_val[closid] = new_msr_val;

@@ -484,7 +486,7 @@ void cqm_handle_limbo(struct work_struct *work)

mutex_lock(&rdtgroup_mutex);

- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
d = get_domain_from_cpu(cpu, r);

if (!d) {
@@ -507,7 +509,7 @@ void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
struct rdt_resource *r;
int cpu;

- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;

cpu = cpumask_any(&dom->cpu_mask);
dom->cqm_work_cpu = cpu;
@@ -528,7 +530,7 @@ void mbm_handle_overflow(struct work_struct *work)
if (!static_branch_likely(&rdt_enable_key))
goto out_unlock;

- d = get_domain_from_cpu(cpu, &rdt_resources_all[RDT_RESOURCE_L3]);
+ d = get_domain_from_cpu(cpu, &rdt_resources_all[RDT_RESOURCE_L3].resctrl);
if (!d)
goto out_unlock;

@@ -626,8 +628,9 @@ static void l3_mon_evt_init(struct rdt_resource *r)
int rdt_get_mon_l3_config(struct rdt_resource *r)
{
int ret;
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);

- r->mon_scale = boot_cpu_data.x86_cache_occ_scale;
+ hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;

/*
@@ -640,7 +643,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
intel_cqm_threshold = boot_cpu_data.x86_cache_size * 1024 / r->num_rmid;

/* h/w works in units of "boot_cpu_data.x86_cache_occ_scale" */
- intel_cqm_threshold /= r->mon_scale;
+ intel_cqm_threshold /= hw_res->mon_scale;

ret = dom_data_init(r);
if (ret)
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 749856a2e736..3afe642e3ede 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -715,8 +715,9 @@ static int max_threshold_occ_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
struct rdt_resource *r = of->kn->parent->priv;
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);

- seq_printf(seq, "%u\n", intel_cqm_threshold * r->mon_scale);
+ seq_printf(seq, "%u\n", intel_cqm_threshold * hw_res->mon_scale);

return 0;
}
@@ -725,6 +726,7 @@ static ssize_t max_threshold_occ_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
struct rdt_resource *r = of->kn->parent->priv;
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);
unsigned int bytes;
int ret;

@@ -735,7 +737,7 @@ static ssize_t max_threshold_occ_write(struct kernfs_open_file *of,
if (bytes > (boot_cpu_data.x86_cache_size * 1024))
return -EINVAL;

- intel_cqm_threshold = bytes / r->mon_scale;
+ intel_cqm_threshold = bytes / hw_res->mon_scale;

return nbytes;
}
@@ -1007,7 +1009,7 @@ static void l2_qos_cfg_update(void *arg)

static inline bool is_mba_linear(void)
{
- return rdt_resources_all[RDT_RESOURCE_MBA].membw.delay_linear;
+ return rdt_resources_all[RDT_RESOURCE_MBA].resctrl.membw.delay_linear;
}

static int set_cache_qos_cfg(int level, bool enable)
@@ -1028,7 +1030,7 @@ static int set_cache_qos_cfg(int level, bool enable)
else
return -EINVAL;

- r_l = &rdt_resources_all[level];
+ r_l = &rdt_resources_all[level].resctrl;
list_for_each_entry(d, &r_l->domains, list) {
/* Pick one CPU from each domain instance to update MSR */
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
@@ -1054,7 +1056,7 @@ static int set_cache_qos_cfg(int level, bool enable)
*/
static int set_mba_sc(bool mba_sc)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA];
+ struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].resctrl;
struct rdt_domain *d;

if (!is_mbm_enabled() || !is_mba_linear() ||
@@ -1070,9 +1072,9 @@ static int set_mba_sc(bool mba_sc)

static int cdp_enable(int level, int data_type, int code_type)
{
- struct rdt_resource *r_ldata = &rdt_resources_all[data_type];
- struct rdt_resource *r_lcode = &rdt_resources_all[code_type];
- struct rdt_resource *r_l = &rdt_resources_all[level];
+ struct rdt_resource *r_ldata = &rdt_resources_all[data_type].resctrl;
+ struct rdt_resource *r_lcode = &rdt_resources_all[code_type].resctrl;
+ struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
int ret;

if (!r_l->alloc_capable || !r_ldata->alloc_capable ||
@@ -1102,13 +1104,13 @@ static int cdpl2_enable(void)

static void cdp_disable(int level, int data_type, int code_type)
{
- struct rdt_resource *r = &rdt_resources_all[level];
+ struct rdt_resource *r = &rdt_resources_all[level].resctrl;

r->alloc_enabled = r->alloc_capable;

- if (rdt_resources_all[data_type].alloc_enabled) {
- rdt_resources_all[data_type].alloc_enabled = false;
- rdt_resources_all[code_type].alloc_enabled = false;
+ if (rdt_resources_all[data_type].resctrl.alloc_enabled) {
+ rdt_resources_all[data_type].resctrl.alloc_enabled = false;
+ rdt_resources_all[code_type].resctrl.alloc_enabled = false;
set_cache_qos_cfg(level, false);
}
}
@@ -1125,9 +1127,9 @@ static void cdpl2_disable(void)

static void cdp_disable_all(void)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled)
+ if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled)
cdpl3_disable();
- if (rdt_resources_all[RDT_RESOURCE_L2DATA].alloc_enabled)
+ if (rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled)
cdpl2_disable();
}

@@ -1303,7 +1305,7 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type,
static_branch_enable_cpuslocked(&rdt_enable_key);

if (is_mbm_enabled()) {
- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
list_for_each_entry(dom, &r->domains, list)
mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL);
}
@@ -1542,6 +1544,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
struct rdt_domain *d,
struct rdt_resource *r, struct rdtgroup *prgrp)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);
union mon_data_bits priv;
struct kernfs_node *kn;
struct mon_evt *mevt;
@@ -1569,7 +1572,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
goto out_destroy;
}

- priv.u.rid = r->rid;
+ priv.u.rid = hw_res->rid;
priv.u.domid = d->id;
list_for_each_entry(mevt, &r->evt_list, list) {
priv.u.evtid = mevt->evtid;
@@ -2030,7 +2033,7 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)

static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled)
+ if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled)
seq_puts(seq, ",cdp");
return 0;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
new file mode 100644
index 000000000000..8d32b2c6d72b
--- /dev/null
+++ b/include/linux/resctrl.h
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0
+// Based on arch/x86/kernel/cpu/intel_rdt.h
+
+#ifndef __LINUX_RESCTRL_H
+#define __LINUX_RESCTRL_H
+
+#include <linux/list.h>
+#include <linux/kernel.h>
+
+struct rdt_domain;
+
+/**
+ * struct resctrl_cache - Cache allocation related data
+ * @cbm_len: Length of the cache bit mask
+ * @min_cbm_bits: Minimum number of consecutive bits to be set
+ * @cbm_idx_mult: Multiplier of CBM index
+ * @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
+ * closid * cbm_idx_multi + cbm_idx_offset
+ * in a cache bit mask
+ * @shareable_bits: Bitmask of shareable resource with other
+ * executing entities
+ */
+struct resctrl_cache {
+ u32 cbm_len;
+ u32 min_cbm_bits;
+ unsigned int cbm_idx_mult; // TODO remove this
+ unsigned int cbm_idx_offset; // TODO remove this
+ u32 shareable_bits;
+};
+
+/**
+ * struct resctrl_membw - Memory bandwidth allocation related data
+ * @max_delay: Max throttle delay. Delay is the hardware
+ * representation for memory bandwidth.
+ * @min_bw: Minimum memory bandwidth percentage user can request
+ * @bw_gran: Granularity at which the memory bandwidth is allocated
+ * @delay_linear: True if memory B/W delay is in linear scale
+ * @mba_sc: True if MBA software controller(mba_sc) is enabled
+ * @mb_map: Mapping of memory B/W percentage to memory B/W delay
+ */
+struct resctrl_membw {
+ u32 max_delay;
+ u32 min_bw;
+ u32 bw_gran;
+ u32 delay_linear;
+ bool mba_sc;
+ u32 *mb_map;
+};
+
+/**
+ * @alloc_enabled: Is allocation enabled on this machine
+ * @mon_enabled: Is monitoring enabled for this feature
+ * @alloc_capable: Is allocation available on this machine
+ * @mon_capable: Is monitor feature available on this machine
+ *
+ * @cache_level: Which cache level defines scope of this resource.
+ *
+ * @cache: If the component has cache controls, their properties.
+ * @membw: If the component has bandwidth controls, their properties.
+ *
+ * @num_closid: Number of CLOSIDs available.
+ * @num_rmid: Number of RMIDs available.
+ *
+ * @domains: All domains for this resource
+ *
+ * @name: Name to use in "schemata" file.
+ * @data_width: Character width of data when displaying.
+ * @default_ctrl: Specifies default cache cbm or memory B/W percent.
+ * @format_str: Per resource format string to show domain value
+ * @parse_ctrlval: Per resource function pointer to parse control values
+ *
+ * @evt_list: List of monitoring events
+ * @fflags: flags to choose base and info files
+ */
+struct rdt_resource {
+ bool alloc_enabled;
+ bool mon_enabled;
+ bool alloc_capable;
+ bool mon_capable;
+
+ int cache_level;
+
+ struct resctrl_cache cache;
+ struct resctrl_membw membw;
+
+ int num_closid;
+ int num_rmid;
+
+ struct list_head domains;
+
+ char *name;
+ int data_width;
+ u32 default_ctrl;
+ const char *format_str;
+ int (*parse_ctrlval) (char *buf, struct rdt_resource *r,
+ struct rdt_domain *d);
+
+ struct list_head evt_list;
+ unsigned long fflags;
+
+};
+
+#endif /* __LINUX_RESCTRL_H */
--
2.18.0


2018-08-24 10:47:37

by James Morse

Subject: [RFC PATCH 02/20] x86/intel_rdt: Split struct rdt_domain

resctrl is the de facto Linux ABI for SoC resource partitioning features.
To support it on another architecture, we need to abstract it from
Intel RDT, and move it to /fs/.

Split struct rdt_domain up too. Move everything that is particular
to resctrl into a new header file. Resctrl code paths that touch a 'hw'
struct indicate where an abstraction is needed.

No change in behaviour; this patch just moves types around.
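
As with patch 01, the arch code allocates the wrapper and everything
outside the arch code sees only the embedded part; condensed from the
domain_add_cpu() hunk below:

	struct rdt_hw_domain *hw_dom;
	struct rdt_domain *d;

	hw_dom = kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu));
	if (!hw_dom)
		return;
	d = &hw_dom->resctrl;
	/*
	 * Everything from here works on 'd'; only arch code uses
	 * rc_dom_to_rdt(d) to get back to hw_dom.
	 */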

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 87 +++++++++++----------
arch/x86/kernel/cpu/intel_rdt.h | 30 ++++---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 12 ++-
arch/x86/kernel/cpu/intel_rdt_monitor.c | 55 +++++++------
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 14 +++-
include/linux/resctrl.h | 17 +++-
6 files changed, 127 insertions(+), 88 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 8cb2639b8a56..c4e6dcdd235b 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -377,21 +377,23 @@ static void
mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_domain *hw_dom = rc_dom_to_rdt(d);
struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);

/* Write the delay values for mba. */
for (i = m->low; i < m->high; i++)
- wrmsrl(hw_res->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
+ wrmsrl(hw_res->msr_base + i, delay_bw_map(hw_dom->ctrl_val[i], r));
}

static void
cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_domain *hw_dom = rc_dom_to_rdt(d);
struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);

for (i = m->low; i < m->high; i++)
- wrmsrl(hw_res->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + cbm_idx(r, i), hw_dom->ctrl_val[i]);
}

struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
@@ -476,21 +478,22 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);
+ struct rdt_hw_domain *hw_dom = rc_dom_to_rdt(d);
struct msr_param m;
u32 *dc, *dm;

- dc = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+ dc = kmalloc_array(r->num_closid, sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
if (!dc)
return -ENOMEM;

- dm = kmalloc_array(r->num_closid, sizeof(*d->mbps_val), GFP_KERNEL);
+ dm = kmalloc_array(r->num_closid, sizeof(*hw_dom->mbps_val), GFP_KERNEL);
if (!dm) {
kfree(dc);
return -ENOMEM;
}

- d->ctrl_val = dc;
- d->mbps_val = dm;
+ hw_dom->ctrl_val = dc;
+ hw_dom->mbps_val = dm;
setup_default_ctrlval(r, dc, dm);

m.low = 0;
@@ -502,36 +505,37 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
{
size_t tsize;
+ struct rdt_hw_domain *hw_dom = rc_dom_to_rdt(d);

if (is_llc_occupancy_enabled()) {
- d->rmid_busy_llc = kcalloc(BITS_TO_LONGS(r->num_rmid),
+ hw_dom->rmid_busy_llc = kcalloc(BITS_TO_LONGS(r->num_rmid),
sizeof(unsigned long),
GFP_KERNEL);
- if (!d->rmid_busy_llc)
+ if (!hw_dom->rmid_busy_llc)
return -ENOMEM;
- INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo);
+ INIT_DELAYED_WORK(&hw_dom->cqm_limbo, cqm_handle_limbo);
}
if (is_mbm_total_enabled()) {
- tsize = sizeof(*d->mbm_total);
- d->mbm_total = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
- if (!d->mbm_total) {
- kfree(d->rmid_busy_llc);
+ tsize = sizeof(*hw_dom->mbm_total);
+ hw_dom->mbm_total = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
+ if (!hw_dom->mbm_total) {
+ kfree(hw_dom->rmid_busy_llc);
return -ENOMEM;
}
}
if (is_mbm_local_enabled()) {
- tsize = sizeof(*d->mbm_local);
- d->mbm_local = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
- if (!d->mbm_local) {
- kfree(d->rmid_busy_llc);
- kfree(d->mbm_total);
+ tsize = sizeof(*hw_dom->mbm_local);
+ hw_dom->mbm_local = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
+ if (!hw_dom->mbm_local) {
+ kfree(hw_dom->rmid_busy_llc);
+ kfree(hw_dom->mbm_total);
return -ENOMEM;
}
}

if (is_mbm_enabled()) {
- INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow);
- mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL);
+ INIT_DELAYED_WORK(&hw_dom->mbm_over, mbm_handle_overflow);
+ mbm_setup_overflow_handler(hw_dom, MBM_OVERFLOW_INTERVAL);
}

return 0;
@@ -554,6 +558,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
{
int id = get_cache_id(cpu, r->cache_level);
struct list_head *add_pos = NULL;
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;

d = rdt_find_domain(r, id, &add_pos);
@@ -567,10 +572,10 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
return;
}

- d = kzalloc_node(sizeof(*d), GFP_KERNEL, cpu_to_node(cpu));
- if (!d)
+ hw_dom = kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu));
+ if (!hw_dom)
return;
-
+ d = &hw_dom->resctrl;
d->id = id;
cpumask_set_cpu(cpu, &d->cpu_mask);

@@ -597,6 +602,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
static void domain_remove_cpu(int cpu, struct rdt_resource *r)
{
int id = get_cache_id(cpu, r->cache_level);
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;

d = rdt_find_domain(r, id, NULL);
@@ -604,6 +610,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
pr_warn("Could't find cache id for cpu %d\n", cpu);
return;
}
+ hw_dom = rc_dom_to_rdt(d);

cpumask_clear_cpu(cpu, &d->cpu_mask);
if (cpumask_empty(&d->cpu_mask)) {
@@ -615,8 +622,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
rmdir_mondata_subdir_allrdtgrp(r, d->id);
list_del(&d->list);
if (is_mbm_enabled())
- cancel_delayed_work(&d->mbm_over);
- if (is_llc_occupancy_enabled() && has_busy_rmid(r, d)) {
+ cancel_delayed_work(&hw_dom->mbm_over);
+ if (is_llc_occupancy_enabled() && has_busy_rmid(r, hw_dom)) {
/*
* When a package is going down, forcefully
* decrement rmid->ebusy. There is no way to know
@@ -625,28 +632,28 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
* the RMID as busy creates RMID leaks if the
* package never comes back.
*/
- __check_limbo(d, true);
- cancel_delayed_work(&d->cqm_limbo);
+ __check_limbo(hw_dom, true);
+ cancel_delayed_work(&hw_dom->cqm_limbo);
}

- kfree(d->ctrl_val);
- kfree(d->mbps_val);
- kfree(d->rmid_busy_llc);
- kfree(d->mbm_total);
- kfree(d->mbm_local);
- kfree(d);
+ kfree(hw_dom->ctrl_val);
+ kfree(hw_dom->mbps_val);
+ kfree(hw_dom->rmid_busy_llc);
+ kfree(hw_dom->mbm_total);
+ kfree(hw_dom->mbm_local);
+ kfree(hw_dom);
return;
}

if (r == &rdt_resources_all[RDT_RESOURCE_L3].resctrl) {
- if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
- cancel_delayed_work(&d->mbm_over);
- mbm_setup_overflow_handler(d, 0);
+ if (is_mbm_enabled() && cpu == hw_dom->mbm_work_cpu) {
+ cancel_delayed_work(&hw_dom->mbm_over);
+ mbm_setup_overflow_handler(hw_dom, 0);
}
- if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
- has_busy_rmid(r, d)) {
- cancel_delayed_work(&d->cqm_limbo);
- cqm_setup_limbo_handler(d, 0);
+ if (is_llc_occupancy_enabled() && cpu == hw_dom->cqm_work_cpu &&
+ has_busy_rmid(r, hw_dom)) {
+ cancel_delayed_work(&hw_dom->cqm_limbo);
+ cqm_setup_limbo_handler(hw_dom, 0);
}
}
}
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index 20a6674ac67c..7c17d74fd36c 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -64,7 +64,7 @@ union mon_data_bits {

struct rmid_read {
struct rdtgroup *rgrp;
- struct rdt_domain *d;
+ struct rdt_hw_domain *d;
int evtid;
bool first;
u64 val;
@@ -200,9 +200,7 @@ struct mbm_state {

/**
* struct rdt_domain - group of cpus sharing an RDT resource
- * @list: all instances of this resource
- * @id: unique id for this instance
- * @cpu_mask: which cpus share this resource
+ * @resctrl: Properties exposed to the resctrl filesystem
* @rmid_busy_llc:
* bitmap of which limbo RMIDs are above threshold
* @mbm_total: saved state for MBM total bandwidth
@@ -215,13 +213,9 @@ struct mbm_state {
* worker cpu for CQM h/w counters
* @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID)
* @mbps_val: When mba_sc is enabled, this holds the bandwidth in MBps
- * @new_ctrl: new ctrl value to be loaded
- * @have_new_ctrl: did user provide new_ctrl for this domain
*/
-struct rdt_domain {
- struct list_head list;
- int id;
- struct cpumask cpu_mask;
+struct rdt_hw_domain {
+ struct rdt_domain resctrl;
unsigned long *rmid_busy_llc;
struct mbm_state *mbm_total;
struct mbm_state *mbm_local;
@@ -231,10 +225,14 @@ struct rdt_domain {
int cqm_work_cpu;
u32 *ctrl_val;
u32 *mbps_val;
- u32 new_ctrl;
- bool have_new_ctrl;
};

+static inline struct rdt_hw_domain *rc_dom_to_rdt(struct rdt_domain *r)
+{
+ return container_of(r, struct rdt_hw_domain, resctrl);
+}
+
+
/**
* struct msr_param - set a range of MSRs from a domain
* @res: The resource to use
@@ -403,15 +401,15 @@ void mkdir_mondata_subdir_allrdtgrp(struct rdt_resource *r,
struct rdt_domain *d);
void mon_event_read(struct rmid_read *rr, struct rdt_domain *d,
struct rdtgroup *rdtgrp, int evtid, int first);
-void mbm_setup_overflow_handler(struct rdt_domain *dom,
+void mbm_setup_overflow_handler(struct rdt_hw_domain *dom,
unsigned long delay_ms);
void mbm_handle_overflow(struct work_struct *work);
bool is_mba_sc(struct rdt_resource *r);
void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm);
u32 delay_bw_map(unsigned long bw, struct rdt_resource *r);
-void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms);
+void cqm_setup_limbo_handler(struct rdt_hw_domain *dom, unsigned long delay_ms);
void cqm_handle_limbo(struct work_struct *work);
-bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
-void __check_limbo(struct rdt_domain *d, bool force_free);
+bool has_busy_rmid(struct rdt_resource *r, struct rdt_hw_domain *d);
+void __check_limbo(struct rdt_hw_domain *d, bool force_free);

#endif /* _ASM_X86_INTEL_RDT_H */
diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index 58890612ca8d..e3dcb5161122 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -177,6 +177,7 @@ static int parse_line(char *line, struct rdt_resource *r)

static int update_domains(struct rdt_resource *r, int closid)
{
+ struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
@@ -193,7 +194,8 @@ static int update_domains(struct rdt_resource *r, int closid)

mba_sc = is_mba_sc(r);
list_for_each_entry(d, &r->domains, list) {
- dc = !mba_sc ? d->ctrl_val : d->mbps_val;
+ hw_dom = rc_dom_to_rdt(d);
+ dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;
if (d->have_new_ctrl && d->new_ctrl != dc[closid]) {
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
dc[closid] = d->new_ctrl;
@@ -290,17 +292,19 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
{
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *dom;
bool sep = false;
u32 ctrl_val;

seq_printf(s, "%*s:", max_name_width, r->name);
list_for_each_entry(dom, &r->domains, list) {
+ hw_dom = rc_dom_to_rdt(dom);
if (sep)
seq_puts(s, ";");

- ctrl_val = (!is_mba_sc(r) ? dom->ctrl_val[closid] :
- dom->mbps_val[closid]);
+ ctrl_val = (!is_mba_sc(r) ? hw_dom->ctrl_val[closid] :
+ hw_dom->mbps_val[closid]);
seq_printf(s, r->format_str, dom->id, max_data_width,
ctrl_val);
sep = true;
@@ -338,7 +342,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_domain *d,
*/
rr->rgrp = rdtgrp;
rr->evtid = evtid;
- rr->d = d;
+ rr->d = rc_dom_to_rdt(d);
rr->val = 0;
rr->first = first;

diff --git a/arch/x86/kernel/cpu/intel_rdt_monitor.c b/arch/x86/kernel/cpu/intel_rdt_monitor.c
index 493d264a0dbe..c05f1cecf6cd 100644
--- a/arch/x86/kernel/cpu/intel_rdt_monitor.c
+++ b/arch/x86/kernel/cpu/intel_rdt_monitor.c
@@ -116,7 +116,7 @@ static bool rmid_dirty(struct rmid_entry *entry)
* decrement the count. If the busy count gets to zero on an RMID, we
* free the RMID
*/
-void __check_limbo(struct rdt_domain *d, bool force_free)
+void __check_limbo(struct rdt_hw_domain *d, bool force_free)
{
struct rmid_entry *entry;
struct rdt_resource *r;
@@ -147,7 +147,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
}
}

-bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d)
+bool has_busy_rmid(struct rdt_resource *r, struct rdt_hw_domain *d)
{
return find_first_bit(d->rmid_busy_llc, r->num_rmid) != r->num_rmid;
}
@@ -175,6 +175,7 @@ int alloc_rmid(void)

static void add_rmid_to_limbo(struct rmid_entry *entry)
{
+ struct rdt_hw_domain *hw_dom;
struct rdt_resource *r;
struct rdt_domain *d;
int cpu;
@@ -185,6 +186,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
entry->busy = 0;
cpu = get_cpu();
list_for_each_entry(d, &r->domains, list) {
+ hw_dom = rc_dom_to_rdt(d);
if (cpumask_test_cpu(cpu, &d->cpu_mask)) {
val = __rmid_read(entry->rmid, QOS_L3_OCCUP_EVENT_ID);
if (val <= intel_cqm_threshold)
@@ -195,9 +197,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
* For the first limbo RMID in the domain,
* setup up the limbo worker.
*/
- if (!has_busy_rmid(r, d))
- cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL);
- set_bit(entry->rmid, d->rmid_busy_llc);
+ if (!has_busy_rmid(r, hw_dom))
+ cqm_setup_limbo_handler(hw_dom, CQM_LIMBOCHECK_INTERVAL);
+ set_bit(entry->rmid, hw_dom->rmid_busy_llc);
entry->busy++;
}
put_cpu();
@@ -363,9 +365,11 @@ void mon_event_count(void *info)
*/
static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
{
+ struct rdt_hw_domain *hw_dom_mbm = rc_dom_to_rdt(dom_mbm);
u32 closid, rmid, cur_msr, cur_msr_val, new_msr_val;
struct mbm_state *pmbm_data, *cmbm_data;
struct rdt_hw_resource *hw_r_mba;
+ struct rdt_hw_domain *hw_dom_mba;
u32 cur_bw, delta_bw, user_bw;
struct rdt_resource *r_mba;
struct rdt_domain *dom_mba;
@@ -376,25 +380,26 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
r_mba = &hw_r_mba->resctrl;
closid = rgrp->closid;
rmid = rgrp->mon.rmid;
- pmbm_data = &dom_mbm->mbm_local[rmid];
+ pmbm_data = &hw_dom_mbm->mbm_local[rmid];

dom_mba = get_domain_from_cpu(smp_processor_id(), r_mba);
if (!dom_mba) {
pr_warn_once("Failure to get domain for MBA update\n");
return;
}
+ hw_dom_mba = rc_dom_to_rdt(dom_mba);

cur_bw = pmbm_data->prev_bw;
- user_bw = dom_mba->mbps_val[closid];
+ user_bw = hw_dom_mba->mbps_val[closid];
delta_bw = pmbm_data->delta_bw;
- cur_msr_val = dom_mba->ctrl_val[closid];
+ cur_msr_val = hw_dom_mba->ctrl_val[closid];

/*
* For Ctrl groups read data from child monitor groups.
*/
head = &rgrp->mon.crdtgrp_list;
list_for_each_entry(entry, head, mon.crdtgrp_list) {
- cmbm_data = &dom_mbm->mbm_local[entry->mon.rmid];
+ cmbm_data = &hw_dom_mbm->mbm_local[entry->mon.rmid];
cur_bw += cmbm_data->prev_bw;
delta_bw += cmbm_data->delta_bw;
}
@@ -424,7 +429,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)

cur_msr = hw_r_mba->msr_base + closid;
wrmsrl(cur_msr, delay_bw_map(new_msr_val, r_mba));
- dom_mba->ctrl_val[closid] = new_msr_val;
+ hw_dom_mba->ctrl_val[closid] = new_msr_val;

/*
* Delta values are updated dynamically package wise for each
@@ -438,17 +443,17 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
*/
pmbm_data->delta_comp = true;
list_for_each_entry(entry, head, mon.crdtgrp_list) {
- cmbm_data = &dom_mbm->mbm_local[entry->mon.rmid];
+ cmbm_data = &hw_dom_mbm->mbm_local[entry->mon.rmid];
cmbm_data->delta_comp = true;
}
}

-static void mbm_update(struct rdt_domain *d, int rmid)
+static void mbm_update(struct rdt_hw_domain *hw_dom, int rmid)
{
struct rmid_read rr;

rr.first = false;
- rr.d = d;
+ rr.d = hw_dom;

/*
* This is protected from concurrent reads from user
@@ -480,6 +485,7 @@ static void mbm_update(struct rdt_domain *d, int rmid)
void cqm_handle_limbo(struct work_struct *work)
{
unsigned long delay = msecs_to_jiffies(CQM_LIMBOCHECK_INTERVAL);
+ struct rdt_hw_domain *hw_dom;
int cpu = smp_processor_id();
struct rdt_resource *r;
struct rdt_domain *d;
@@ -493,17 +499,18 @@ void cqm_handle_limbo(struct work_struct *work)
pr_warn_once("Failure to get domain for limbo worker\n");
goto out_unlock;
}
+ hw_dom = rc_dom_to_rdt(d);

- __check_limbo(d, false);
+ __check_limbo(hw_dom, false);

- if (has_busy_rmid(r, d))
- schedule_delayed_work_on(cpu, &d->cqm_limbo, delay);
+ if (has_busy_rmid(r, hw_dom))
+ schedule_delayed_work_on(cpu, &hw_dom->cqm_limbo, delay);

out_unlock:
mutex_unlock(&rdtgroup_mutex);
}

-void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
+void cqm_setup_limbo_handler(struct rdt_hw_domain *dom, unsigned long delay_ms)
{
unsigned long delay = msecs_to_jiffies(delay_ms);
struct rdt_resource *r;
@@ -511,7 +518,7 @@ void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)

r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;

- cpu = cpumask_any(&dom->cpu_mask);
+ cpu = cpumask_any(&dom->resctrl.cpu_mask);
dom->cqm_work_cpu = cpu;

schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);
@@ -522,6 +529,7 @@ void mbm_handle_overflow(struct work_struct *work)
unsigned long delay = msecs_to_jiffies(MBM_OVERFLOW_INTERVAL);
struct rdtgroup *prgrp, *crgrp;
int cpu = smp_processor_id();
+ struct rdt_hw_domain *hw_dom;
struct list_head *head;
struct rdt_domain *d;

@@ -533,32 +541,33 @@ void mbm_handle_overflow(struct work_struct *work)
d = get_domain_from_cpu(cpu, &rdt_resources_all[RDT_RESOURCE_L3].resctrl);
if (!d)
goto out_unlock;
+ hw_dom = rc_dom_to_rdt(d);

list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
- mbm_update(d, prgrp->mon.rmid);
+ mbm_update(hw_dom, prgrp->mon.rmid);

head = &prgrp->mon.crdtgrp_list;
list_for_each_entry(crgrp, head, mon.crdtgrp_list)
- mbm_update(d, crgrp->mon.rmid);
+ mbm_update(hw_dom, crgrp->mon.rmid);

if (is_mba_sc(NULL))
update_mba_bw(prgrp, d);
}

- schedule_delayed_work_on(cpu, &d->mbm_over, delay);
+ schedule_delayed_work_on(cpu, &hw_dom->mbm_over, delay);

out_unlock:
mutex_unlock(&rdtgroup_mutex);
}

-void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
+void mbm_setup_overflow_handler(struct rdt_hw_domain *dom, unsigned long delay_ms)
{
unsigned long delay = msecs_to_jiffies(delay_ms);
int cpu;

if (!static_branch_likely(&rdt_enable_key))
return;
- cpu = cpumask_any(&dom->cpu_mask);
+ cpu = cpumask_any(&dom->resctrl.cpu_mask);
dom->mbm_work_cpu = cpu;
schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
}
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 3afe642e3ede..3ed88d4fedd0 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -1057,6 +1057,7 @@ static int set_cache_qos_cfg(int level, bool enable)
static int set_mba_sc(bool mba_sc)
{
struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].resctrl;
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;

if (!is_mbm_enabled() || !is_mba_linear() ||
@@ -1064,8 +1065,10 @@ static int set_mba_sc(bool mba_sc)
return -EINVAL;

r->membw.mba_sc = mba_sc;
- list_for_each_entry(d, &r->domains, list)
- setup_default_ctrlval(r, d->ctrl_val, d->mbps_val);
+ list_for_each_entry(d, &r->domains, list) {
+ hw_dom = rc_dom_to_rdt(d);
+ setup_default_ctrlval(r, hw_dom->ctrl_val, hw_dom->mbps_val);
+ }

return 0;
}
@@ -1307,7 +1310,8 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type,
if (is_mbm_enabled()) {
r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
list_for_each_entry(dom, &r->domains, list)
- mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL);
+ mbm_setup_overflow_handler(rc_dom_to_rdt(dom),
+ MBM_OVERFLOW_INTERVAL);
}

goto out;
@@ -1332,6 +1336,7 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type,

static int reset_all_ctrls(struct rdt_resource *r)
{
+ struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
@@ -1350,10 +1355,11 @@ static int reset_all_ctrls(struct rdt_resource *r)
* from each domain to update the MSRs below.
*/
list_for_each_entry(d, &r->domains, list) {
+ hw_dom = rc_dom_to_rdt(d);
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);

for (i = 0; i < r->num_closid; i++)
- d->ctrl_val[i] = r->default_ctrl;
+ hw_dom->ctrl_val[i] = r->default_ctrl;
}
cpu = get_cpu();
/* Update CBM on this cpu if it's in cpu_mask. */
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8d32b2c6d72b..5950c30fcc30 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -7,7 +7,22 @@
#include <linux/list.h>
#include <linux/kernel.h>

-struct rdt_domain;
+/**
+ * struct rdt_domain - group of cpus sharing an RDT resource
+ * @list: all instances of this resource
+ * @id: unique id for this instance
+ * @cpu_mask: which cpus share this resource
+ * @new_ctrl: new ctrl value to be loaded
+ * @have_new_ctrl: did user provide new_ctrl for this domain
+ */
+struct rdt_domain {
+ struct list_head list;
+ int id;
+ struct cpumask cpu_mask;
+
+ u32 new_ctrl;
+ bool have_new_ctrl;
+};

/**
* struct resctrl_cache - Cache allocation related data
--
2.18.0


2018-08-24 10:47:48

by James Morse

Subject: [RFC PATCH 06/20] x86/intel_rdt: Add a helper to read a closid's configuration for show_doms()

The configuration values used by the arch code may not be the same
as the bitmaps generated by resctrl, so resctrl shouldn't read or
write them directly.

update_domains() and the staged config are suitable for letting the
arch code perform any conversion. Add a helper to read the current
configuration.

This will allow another architecture to scale the bitmaps if
necessary, and possibly use controls that don't take a bitmap at all.
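
For example, show_doms() below stops reading ctrl_val/mbps_val directly
and asks the arch code instead:

	u32 ctrl_val;

	resctrl_arch_get_config(r, dom, closid, &ctrl_val);
	seq_printf(s, r->format_str, dom->id, max_data_width, ctrl_val);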

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 17 +++++++++++++----
arch/x86/kernel/cpu/intel_rdt_monitor.c | 2 +-
include/linux/resctrl.h | 3 +++
3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index 01ffd455313a..ec3c15ee3473 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -322,21 +322,30 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}

+void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
+ u32 closid, u32 *value)
+{
+ struct rdt_hw_domain *hw_dom = rc_dom_to_rdt(d);
+
+ if (!is_mba_sc(r))
+ *value = hw_dom->ctrl_val[closid];
+ else
+ *value = hw_dom->mbps_val[closid];
+}
+
static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
{
- struct rdt_hw_domain *hw_dom;
+
struct rdt_domain *dom;
bool sep = false;
u32 ctrl_val;

seq_printf(s, "%*s:", max_name_width, r->name);
list_for_each_entry(dom, &r->domains, list) {
- hw_dom = rc_dom_to_rdt(dom);
if (sep)
seq_puts(s, ";");

- ctrl_val = (!is_mba_sc(r) ? hw_dom->ctrl_val[closid] :
- hw_dom->mbps_val[closid]);
+ resctrl_arch_get_config(r, dom, closid, &ctrl_val);
seq_printf(s, r->format_str, dom->id, max_data_width,
ctrl_val);
sep = true;
diff --git a/arch/x86/kernel/cpu/intel_rdt_monitor.c b/arch/x86/kernel/cpu/intel_rdt_monitor.c
index c05f1cecf6cd..42ddcefc7065 100644
--- a/arch/x86/kernel/cpu/intel_rdt_monitor.c
+++ b/arch/x86/kernel/cpu/intel_rdt_monitor.c
@@ -390,7 +390,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
hw_dom_mba = rc_dom_to_rdt(dom_mba);

cur_bw = pmbm_data->prev_bw;
- user_bw = hw_dom_mba->mbps_val[closid];
+ resctrl_arch_get_config(r_mba, dom_mba, closid, &user_bw);
delta_bw = pmbm_data->delta_bw;
cur_msr_val = hw_dom_mba->ctrl_val[closid];

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 370db085ee77..03d9fbc230af 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -125,4 +125,7 @@ struct rdt_resource {

};

+void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
+ u32 closid, u32 *value);
+
#endif /* __LINUX_RESCTRL_H */
--
2.18.0


2018-08-24 10:47:57

by James Morse

Subject: [RFC PATCH 09/20] x86/intel_rdt: Track the actual number of closids separately

num_closid is different for the illusionary CODE/DATA caches, and
these resources' ctrlval arrays are sized on this parameter. When it
comes to writing the configuration values into hardware, a correction
is applied.

The next step in moving this behaviour into the resctrl code is
to make the arch code always work with the full range of closids, and
size its ctrlval arrays based on this number.

This means another architecture doesn't need to emulate CDP.

Add a separate field to hold hw_num_closid and use this in the
arch code. The CODE/DATA caches use the full range for their hardware
struct, but the half-sized version for the resctrl-visible part.
This means the ctrlval array is the full size, but only the first
half is used.

A later patch will correct the closid when the configuration is
written, at which point we can merge the illusionary caches.

A short-lived quirk of this is that when a resource is reset(), both
the code and data illusionary caches reset the full closid range.
This disappears in a later patch that merges the caches together.
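
For example, on a cache with 16 hardware closids the two fields end up
as follows (a sketch of the intended state, not code from this patch):

    /*
     *                          no CDP    CDP (each CODE/DATA resource)
     * r->num_closid              16        8  (what resctrl is shown)
     * hw_res->hw_num_closid      16       16  (sizes the ctrl_val array)
     */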

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 19 ++++++++++++++-----
arch/x86/kernel/cpu/intel_rdt.h | 2 ++
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 3 ++-
3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 0e651447956e..c035280b4398 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -223,7 +223,8 @@ static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid)
*/
static inline void cache_alloc_hsw_probe(void)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
+ struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_resource *r = &hw_res->resctrl;
u32 l, h, max_cbm = BIT_MASK(20) - 1;

if (wrmsr_safe(IA32_L3_CBM_BASE, max_cbm, 0))
@@ -235,6 +236,7 @@ static inline void cache_alloc_hsw_probe(void)
return;

r->num_closid = 4;
+ hw_res->hw_num_closid = 4;
r->default_ctrl = max_cbm;
r->cache.cbm_len = 20;
r->cache.shareable_bits = 0xc0000;
@@ -276,12 +278,14 @@ static inline bool rdt_get_mb_table(struct rdt_resource *r)

static bool rdt_get_mem_config(struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);
union cpuid_0x10_3_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx;

cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
r->num_closid = edx.split.cos_max + 1;
+ hw_res->hw_num_closid = r->num_closid;
r->membw.max_delay = eax.split.max_delay + 1;
r->default_ctrl = MAX_MBA_BW;
if (ecx & MBA_IS_LINEAR) {
@@ -302,12 +306,14 @@ static bool rdt_get_mem_config(struct rdt_resource *r)

static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);
union cpuid_0x10_1_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx;

cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
r->num_closid = edx.split.cos_max + 1;
+ hw_res->hw_num_closid = r->num_closid;
r->cache.cbm_len = eax.split.cbm_len + 1;
r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
r->cache.shareable_bits = ebx & r->default_ctrl;
@@ -319,9 +325,11 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
static void rdt_get_cdp_config(int level, int type)
{
struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
- struct rdt_resource *r = &rdt_resources_all[type].resctrl;
+ struct rdt_hw_resource *hw_res_t = &rdt_resources_all[type];
+ struct rdt_resource *r = &hw_res_t->resctrl;

r->num_closid = r_l->num_closid / 2;
+ hw_res_t->hw_num_closid = r_l->num_closid;
r->cache.cbm_len = r_l->cache.cbm_len;
r->default_ctrl = r_l->default_ctrl;
r->cache.shareable_bits = r_l->cache.shareable_bits;
@@ -463,6 +471,7 @@ struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
{
int i;
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);

/*
* Initialize the Control MSRs to having no control.
@@ -470,7 +479,7 @@ void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
* For Memory Allocation: Set b/w requested to 100%
* and the bandwidth in MBps to U32_MAX
*/
- for (i = 0; i < r->num_closid; i++, dc++, dm++) {
+ for (i = 0; i < hw_res->hw_num_closid; i++, dc++, dm++) {
*dc = r->default_ctrl;
*dm = MBA_MAX_MBPS;
}
@@ -483,7 +492,7 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
struct msr_param m;
u32 *dc, *dm;

- dc = kmalloc_array(r->num_closid, sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
+ dc = kmalloc_array(hw_res->hw_num_closid, sizeof(*hw_dom->ctrl_val), GFP_KERNEL);
if (!dc)
return -ENOMEM;

@@ -498,7 +507,7 @@ static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
setup_default_ctrlval(r, dc, dm);

m.low = 0;
- m.high = r->num_closid;
+ m.high = hw_res->hw_num_closid;
hw_res->msr_update(d, &m, r);
return 0;
}
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index 8df549ef016d..92822ff99f1a 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -275,6 +275,7 @@ static inline bool is_mbm_event(int e)
* struct rdt_resource - attributes of an RDT resource
* @resctrl: Properties exposed to the resctrl filesystem
* @rid: The index of the resource
+ * @hw_num_closid: The actual number of closids, regardless of CDP
* @msr_base: Base MSR address for CBMs
* @msr_update: Function pointer to update QOS MSRs
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
@@ -283,6 +284,7 @@ static inline bool is_mbm_event(int e)
struct rdt_hw_resource {
struct rdt_resource resctrl;
int rid;
+ u32 hw_num_closid;
unsigned int msr_base;
void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index f4f76c193495..58dceaad6863 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -1362,6 +1362,7 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type,

static int reset_all_ctrls(struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);
struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
@@ -1384,7 +1385,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
hw_dom = rc_dom_to_rdt(d);
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);

- for (i = 0; i < r->num_closid; i++)
+ for (i = 0; i < hw_res->hw_num_closid; i++)
hw_dom->ctrl_val[i] = r->default_ctrl;
}
cpu = get_cpu();
--
2.18.0


2018-08-24 10:47:58

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 10/20] x86/intel_rdt: Let resctrl change the resources's num_closid

Today we switch between different alloc_enabled resources which
have differing preset num_closid to account for CDP.
We want to merge these illusionary caches together, at which
point something needs to change resctrl's view of num_closid.

The arch code now has its own idea of how many closids there are,
and as the two configurations for one rdtgroup are part of resctrl's
ABI, we should get resctrl to change it.

We change num_closid on the L2/L3 resources, which aren't yet in
use when cdp is enabled, then change them back afterwards. Once we
merge the illusionary caches, resctrl will see the value it changed
here.
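
For example (sketch), with a 16-closid L3 and cdp enabled at mount
time:

    /* try_to_enable_cdp(): two configurations now share the space */
    l3->num_closid /= 2;        /* 16 -> 8 visible to resctrl */

    /* cdp_disable_all(): restore the arch code's value */
    l3->num_closid *= 2;        /* 8 -> 16 */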

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 58dceaad6863..e2a9202674f3 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -1149,16 +1149,36 @@ int resctrl_arch_set_cdp_enabled(bool enable)

static int try_to_enable_cdp(int level)
{
+ int ret;
struct rdt_resource *r = &rdt_resources_all[level].resctrl;
+ struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
+ struct rdt_resource *l2 = &rdt_resources_all[RDT_RESOURCE_L2].resctrl;

if (!r->cdp_capable)
return -EINVAL;
+ if (r->cdp_enabled)
+ return 0;

- return resctrl_arch_set_cdp_enabled(true);
+ ret = resctrl_arch_set_cdp_enabled(true);
+ if (!ret) {
+ if (l2->cdp_enabled)
+ l2->num_closid /= 2;
+ if (l3->cdp_enabled)
+ l3->num_closid /= 2;
+ }
+
+ return ret;
}

static void cdp_disable_all(void)
{
+ struct rdt_resource *l2 = &rdt_resources_all[RDT_RESOURCE_L2].resctrl;
+ struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].resctrl;
+
+ if (l2->cdp_enabled)
+ l2->num_closid *= 2;
+ if (l3->cdp_enabled)
+ l3->num_closid *= 2;
resctrl_arch_set_cdp_enabled(false);
}

--
2.18.0


2018-08-24 10:48:03

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 12/20] x86/intel_rdt: Correct the closid when staging configuration changes

Now that apply_config() and update_domains() know the code/data/both
value of what they are writing, and ctrl_val is correctly sized, use
the hardware closid slot based on the configuration type.

This means cbm_idx() and its illusionary cache-properties can go.
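
For example, with CDP enabled resctrl's closid 3 lands in the
following hardware slots (values follow the helper added by this
patch; the even MSRs hold the data masks and the odd MSRs the code
masks):

    resctrl_closid_cdp_map(3, CDP_DATA);  /* == 6 */
    resctrl_closid_cdp_map(3, CDP_CODE);  /* == 7 */
    resctrl_closid_cdp_map(3, CDP_BOTH);  /* == 3, CDP disabled */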

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 18 +-----------
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 32 ++++++++++++++-------
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 2 +-
include/linux/resctrl.h | 6 ++--
4 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 8d3544b6c149..6466c172c045 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -75,8 +75,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 3,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
},
.domains = domain_init(RDT_RESOURCE_L3),
.parse_ctrlval = parse_cbm,
@@ -95,8 +93,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 3,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
},
.domains = domain_init(RDT_RESOURCE_L3DATA),
.parse_ctrlval = parse_cbm,
@@ -116,8 +112,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 3,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
},
.domains = domain_init(RDT_RESOURCE_L3CODE),
.parse_ctrlval = parse_cbm,
@@ -136,8 +130,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 2,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
},
.domains = domain_init(RDT_RESOURCE_L2),
.parse_ctrlval = parse_cbm,
@@ -156,8 +148,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 2,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
},
.domains = domain_init(RDT_RESOURCE_L2DATA),
.parse_ctrlval = parse_cbm,
@@ -176,8 +166,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.cache_level = 2,
.cache = {
.min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
},
.domains = domain_init(RDT_RESOURCE_L2CODE),
.parse_ctrlval = parse_cbm,
@@ -204,10 +192,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
},
};

-static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid)
-{
- return closid * r->cache.cbm_idx_mult + r->cache.cbm_idx_offset;
-}

/*
* cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
@@ -408,7 +392,7 @@ cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
struct rdt_hw_resource *hw_res = resctrl_to_rdt(r);

for (i = m->low; i < m->high; i++)
- wrmsrl(hw_res->msr_base + cbm_idx(r, i), hw_dom->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
}

struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index bab6032704c3..05c14d9f797c 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -28,6 +28,16 @@
#include <linux/slab.h>
#include "intel_rdt.h"

+static u32 resctrl_closid_cdp_map(u32 closid, enum resctrl_conf_type t)
+{
+ if (t == CDP_CODE)
+ return (closid * 2) + 1;
+ else if (t == CDP_DATA)
+ return (closid * 2);
+ else
+ return closid;
+}
+
/*
* Check whether MBA bandwidth percentage value is correct. The value is
* checked against the minimum and max bandwidth values specified by the
@@ -77,7 +87,7 @@ int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d,

if (!bw_validate(buf, &data, r))
return -EINVAL;
- cfg->closid = closid;
+ cfg->hw_closid = resctrl_closid_cdp_map(closid, t);
cfg->new_ctrl = data;
cfg->new_ctrl_type = t;
cfg->have_new_ctrl = true;
@@ -143,7 +153,7 @@ int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d,

if(!cbm_validate(buf, &data, r))
return -EINVAL;
- cfg->closid = closid;
+ cfg->hw_closid = resctrl_closid_cdp_map(closid, t);
cfg->new_ctrl = data;
cfg->new_ctrl_type = t;
cfg->have_new_ctrl = true;
@@ -190,10 +200,10 @@ static void apply_config(struct rdt_hw_domain *hw_dom,
{
u32 *dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;

- if (cfg->new_ctrl != dc[cfg->closid]) {
+ if (cfg->new_ctrl != dc[cfg->hw_closid]) {
cpumask_set_cpu(cpumask_any(&hw_dom->resctrl.cpu_mask),
cpu_mask);
- dc[cfg->closid] = cfg->new_ctrl;
+ dc[cfg->hw_closid] = cfg->new_ctrl;
cfg->have_new_ctrl = false;
}
}
@@ -222,10 +232,9 @@ int resctrl_arch_update_domains(struct rdt_resource *r)
cfg = &hw_dom->resctrl.staged_config[i];
if (!cfg->have_new_ctrl)
continue;
-
apply_config(hw_dom, cfg, cpu_mask, mba_sc);

- closid = cfg->closid;
+ closid = cfg->hw_closid;
if (!msr_param_init) {
msr_param.low = closid;
msr_param.high = closid;
@@ -328,14 +337,14 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
}

void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
- u32 closid, u32 *value)
+ u32 hw_closid, u32 *value)
{
struct rdt_hw_domain *hw_dom = rc_dom_to_rdt(d);

if (!is_mba_sc(r))
- *value = hw_dom->ctrl_val[closid];
+ *value = hw_dom->ctrl_val[hw_closid];
else
- *value = hw_dom->mbps_val[closid];
+ *value = hw_dom->mbps_val[hw_closid];
}

static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
@@ -343,14 +352,15 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)

struct rdt_domain *dom;
bool sep = false;
- u32 ctrl_val;
+ u32 ctrl_val, hw_closid;

seq_printf(s, "%*s:", max_name_width, r->name);
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_puts(s, ";");

- resctrl_arch_get_config(r, dom, closid, &ctrl_val);
+ hw_closid = resctrl_closid_cdp_map(closid, resctrl_to_rdt(r)->cdp_type);
+ resctrl_arch_get_config(r, dom, hw_closid, &ctrl_val);
seq_printf(s, r->format_str, dom->id, max_data_width,
ctrl_val);
sep = true;
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index e2a9202674f3..f3dfed9c609a 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -1394,7 +1394,7 @@ static int reset_all_ctrls(struct rdt_resource *r)

msr_param.res = r;
msr_param.low = 0;
- msr_param.high = r->num_closid;
+ msr_param.high = hw_res->hw_num_closid;

/*
* Disable resource control for this resource by setting all
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 592242635204..dad266f9b0fe 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -15,13 +15,13 @@ enum resctrl_conf_type {

/**
* struct resctrl_staged_config - parsed configuration to be applied
- * @closid: The closid the new configuration applies to
+ * @hw_closid: raw closid for this configuration, regardless of CDP
* @new_ctrl: new ctrl value to be loaded
* @have_new_ctrl: did user provide new_ctrl for this domain
* @new_ctrl_type: CDP property of the new ctrl
*/
struct resctrl_staged_config {
- u32 closid;
+ u32 hw_closid;
u32 new_ctrl;
bool have_new_ctrl;
enum resctrl_conf_type new_ctrl_type;
@@ -56,8 +56,6 @@ struct rdt_domain {
struct resctrl_cache {
u32 cbm_len;
u32 min_cbm_bits;
- unsigned int cbm_idx_mult; // TODO remove this
- unsigned int cbm_idx_offset; // TODO remove this
u32 shareable_bits;
};

--
2.18.0


2018-08-24 10:48:05

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 13/20] x86/intel_rdt: Allow different CODE/DATA configurations to be staged

Now that the staged configuration holds its CDP type and hardware
closid, allow resctrl to stage more than one configuration at a time
for a single resource.

To detect the same schema being specified twice when the schemata file
is written, the same slot in the staged_config array must be used for
each schema. Use the enum resctrl_conf_type value directly as an index.
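
For example (sketch), a schemata write covering both halves of a CDP
resource now stages two entries against the same domain:

    /* from "L3CODE:0=ff" */
    d->staged_config[CDP_CODE].new_ctrl = 0xff;
    /* from "L3DATA:0=3f" */
    d->staged_config[CDP_DATA].new_ctrl = 0x3f;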

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++--
include/linux/resctrl.h | 4 +++-
2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index 05c14d9f797c..f80a838cc36d 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -78,7 +78,7 @@ int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d,
enum resctrl_conf_type t, u32 closid)
{
unsigned long data;
- struct resctrl_staged_config *cfg = &d->staged_config[0];
+ struct resctrl_staged_config *cfg = &d->staged_config[t];

if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("duplicate domain %d\n", d->id);
@@ -144,7 +144,7 @@ int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d,
enum resctrl_conf_type t, u32 closid)
{
unsigned long data;
- struct resctrl_staged_config *cfg = &d->staged_config[0];
+ struct resctrl_staged_config *cfg = &d->staged_config[t];

if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("duplicate domain %d\n", d->id);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index dad266f9b0fe..ede5c40756b4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -12,6 +12,8 @@ enum resctrl_conf_type {
CDP_CODE,
CDP_DATA,
};
+#define NUM_CDP_TYPES (CDP_DATA + 1)
+

/**
* struct resctrl_staged_config - parsed configuration to be applied
@@ -39,7 +41,7 @@ struct rdt_domain {
int id;
struct cpumask cpu_mask;

- struct resctrl_staged_config staged_config[1];
+ struct resctrl_staged_config staged_config[NUM_CDP_TYPES];
};

/**
--
2.18.0


2018-08-24 10:48:06

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 14/20] x86/intel_rdt: Add a separate resource list for resctrl

We want to merge the L2/L2CODE/L2DATA resources together so that
there is one resource per cache. The CDP properties are then
part of the configuration.

Currently the cdp type to use with the configuration is hidden
in the resource. This needs to be part of the schema, but resctrl
doesn't have a structure for this (it's all flattened out into
extra resources).

Create a list of the schemas that resctrl presents via the schemata
file. We want to move the illusion of an "L2CODE" cache into resctrl so
that this part of the ABI is dealt with by core code.
This change will allow us to have the same resource represented twice,
as code/data, with the appropriate cdp_type for its configuration.

This will also let us generate the names in resctrl.
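
At this point in the series the list mirrors the alloc_enabled
resources, so with cdp enabled on L3 it would contain entries like
(sketch, order not significant):

    { .res = &rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl, .conf_type = CDP_DATA }
    { .res = &rdt_resources_all[RDT_RESOURCE_L3CODE].resctrl, .conf_type = CDP_CODE }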

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 45 +++++++++++++++++++++++-
include/linux/resctrl.h | 13 +++++++
2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index f3dfed9c609a..2015d99ca388 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -43,6 +43,9 @@ static struct kernfs_root *rdt_root;
struct rdtgroup rdtgroup_default;
LIST_HEAD(rdt_all_groups);

+/* list of entries for the schemata file */
+LIST_HEAD(resctrl_all_schema);
+
/* Kernel fs node for "info" directory under root */
static struct kernfs_node *kn_info;

@@ -1287,6 +1290,37 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn,
struct rdtgroup *prgrp,
struct kernfs_node **mon_data_kn);

+
+static int create_schemata_list(void)
+{
+ struct rdt_resource *r;
+ struct resctrl_schema *s;
+
+ for_each_alloc_enabled_rdt_resource(r) {
+ s = kzalloc(sizeof(*s), GFP_KERNEL);
+ if (!s)
+ return -ENOMEM;
+
+ s->res = r;
+ s->conf_type = resctrl_to_rdt(r)->cdp_type;
+
+ INIT_LIST_HEAD(&s->list);
+ list_add(&s->list, &resctrl_all_schema);
+ }
+
+ return 0;
+}
+
+static void destroy_schemata_list(void)
+{
+ struct resctrl_schema *s, *tmp;
+
+ list_for_each_entry_safe(s, tmp, &resctrl_all_schema, list) {
+ list_del(&s->list);
+ kfree(s);
+ }
+}
+
static struct dentry *rdt_mount(struct file_system_type *fs_type,
int flags, const char *unused_dev_name,
void *data)
@@ -1312,12 +1346,18 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type,
goto out_cdp;
}

+ ret = create_schemata_list();
+ if (ret) {
+ dentry = ERR_PTR(ret);
+ goto out_schemata_free;
+ }
+
closid_init();

ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
if (ret) {
dentry = ERR_PTR(ret);
- goto out_cdp;
+ goto out_schemata_free;
}

if (rdt_mon_capable) {
@@ -1370,6 +1410,8 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type,
kernfs_remove(kn_mongrp);
out_info:
kernfs_remove(kn_info);
+out_schemata_free:
+ destroy_schemata_list();
out_cdp:
cdp_disable_all();
out:
@@ -1538,6 +1580,7 @@ static void rdt_kill_sb(struct super_block *sb)
reset_all_ctrls(r);
cdp_disable_all();
rmdir_all_sub();
+ destroy_schemata_list();
static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
static_branch_disable_cpuslocked(&rdt_mon_enable_key);
static_branch_disable_cpuslocked(&rdt_enable_key);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index ede5c40756b4..071b2cc9c402 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -145,4 +145,17 @@ void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
/* Enable/Disable CDP on all applicable resources */
int resctrl_arch_set_cdp_enabled(bool enable);

+/**
+ * @list: Member of resctrl's schema list
+ * @cdp_type: Whether this entry is for code/data/both
+ * @res: The rdt_resource for this entry
+ */
+struct resctrl_schema {
+ struct list_head list;
+ enum resctrl_conf_type conf_type;
+ struct rdt_resource *res;
+};
+
+extern struct list_head resctrl_all_schema;
+
#endif /* __LINUX_RESCTRL_H */
--
2.18.0


2018-08-24 10:48:08

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 15/20] x86/intel_rdt: Walk the resctrl schema list instead of the arch's resource list

Now that resctrl has a list of resources it is using, walk that list
instead of the architecture's list. This lets us keep schema properties
with the resource that is using them.

Most users of for_each_alloc_enabled_rdt_resource() are per-schema;
switch these to walk the schema list. The remainder are working with
a per-resource property.

Previously we littered resctrl_to_rdt() wherever we needed to know the
cdp_type of a cache. Now that this has a home, fix all those callers
to read the value from the relevant schema entry.

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 24 +++++++++++++--------
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 4 +++-
include/linux/resctrl.h | 2 +-
3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index f80a838cc36d..3038ecfdeec0 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -271,10 +271,12 @@ int resctrl_arch_update_domains(struct rdt_resource *r)
static int rdtgroup_parse_resource(char *resname, char *tok, int closid)
{
struct rdt_resource *r;
+ struct resctrl_schema *s;

- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(s, &resctrl_all_schema, list) {
+ r = s->res;
if (!strcmp(resname, r->name) && closid < r->num_closid)
- return parse_line(tok, r, resctrl_to_rdt(r)->cdp_type, closid);
+ return parse_line(tok, r, s->conf_type, closid);
}
rdt_last_cmd_printf("unknown/unsupported resource name '%s'\n", resname);
return -EINVAL;
@@ -283,6 +285,7 @@ static int rdtgroup_parse_resource(char *resname, char *tok, int closid)
ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
+ struct resctrl_schema *s;
struct rdtgroup *rdtgrp;
struct rdt_domain *dom;
struct rdt_resource *r;
@@ -303,9 +306,10 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

closid = rdtgrp->closid;

- for_each_alloc_enabled_rdt_resource(r) {
- list_for_each_entry(dom, &r->domains, list)
+ list_for_each_entry(s, &resctrl_all_schema, list) {
+ list_for_each_entry(dom, &s->res->domains, list) {
memset(dom->staged_config, 0, sizeof(dom->staged_config));
+ }
}

while ((tok = strsep(&buf, "\n")) != NULL) {
@@ -347,9 +351,9 @@ void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
*value = hw_dom->mbps_val[hw_closid];
}

-static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
+static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
{
-
+ struct rdt_resource *r = schema->res;
struct rdt_domain *dom;
bool sep = false;
u32 ctrl_val, hw_closid;
@@ -359,7 +363,7 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
if (sep)
seq_puts(s, ";");

- hw_closid = resctrl_closid_cdp_map(closid, resctrl_to_rdt(r)->cdp_type);
+ hw_closid = resctrl_closid_cdp_map(closid, schema->conf_type);
resctrl_arch_get_config(r, dom, hw_closid, &ctrl_val);
seq_printf(s, r->format_str, dom->id, max_data_width,
ctrl_val);
@@ -371,6 +375,7 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
int rdtgroup_schemata_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
+ struct resctrl_schema *schema;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
int ret = 0;
@@ -379,9 +384,10 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
rdtgrp = rdtgroup_kn_lock_live(of->kn);
if (rdtgrp) {
closid = rdtgrp->closid;
- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(schema, &resctrl_all_schema, list) {
+ r = schema->res;
if (closid < r->num_closid)
- show_doms(s, r, closid);
+ show_doms(s, schema, closid);
}
} else {
ret = -ENOENT;
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 2015d99ca388..0bd748defc73 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -913,6 +913,7 @@ static int rdtgroup_mkdir_info_resdir(struct rdt_resource *r, char *name,

static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
{
+ struct resctrl_schema *s;
struct rdt_resource *r;
unsigned long fflags;
char name[32];
@@ -928,7 +929,8 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
if (ret)
goto out_destroy;

- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(s, &resctrl_all_schema, list) {
+ r = s->res;
fflags = r->fflags | RF_CTRL_INFO;
ret = rdtgroup_mkdir_info_resdir(r, r->name, fflags);
if (ret)
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 071b2cc9c402..9ed0beb241d8 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -147,7 +147,7 @@ int resctrl_arch_set_cdp_enabled(bool enable);

/**
* @list: Member of resctrl's schema list
- * @cdp_type: Whether this entry is for code/data/both
+ * @conf_type: Type of configuration, e.g. code/data/both
* @res: The rdt_resource for this entry
*/
struct resctrl_schema {
--
2.18.0


2018-08-24 10:48:12

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 17/20] x86/intel_rdt: Stop using Lx CODE/DATA resources

Now that CDP enable/disable is global, and the closid offset correction
is based on the configuration being applied, we can use the same
Lx resource twice for CDP's CODE/DATA schemas. This keeps the illusion
of separate caches in the resctrl code.

When CDP is enabled for a cache, create two schemas, generating the
names and setting the configuration type.

We can now remove the initialisation of the illusionary hw_resources:
'cdp_capable' just requires setting a flag; resctrl knows what to do
from there.
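
For example, with CDP enabled on L3 only, create_schemata_list() now
generates "L3CODE" and "L3DATA" entries from the single L3 resource,
so a two-domain system's schemata file would look something like:

    L3CODE:0=fffff;1=fffff
    L3DATA:0=fffff;1=fffff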

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 49 ++----------
arch/x86/kernel/cpu/intel_rdt.h | 1 -
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 98 +++++++++++++-----------
3 files changed, 58 insertions(+), 90 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 3a0d7de15afa..96b1aab36053 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -81,7 +81,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
- .cdp_type = CDP_BOTH,
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -99,7 +98,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
- .cdp_type = CDP_DATA,
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,

@@ -118,7 +116,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
- .cdp_type = CDP_CODE,
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -136,7 +133,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
- .cdp_type = CDP_BOTH,
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -154,7 +150,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
- .cdp_type = CDP_DATA,
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -172,7 +167,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
- .cdp_type = CDP_CODE,
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -312,39 +306,6 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
r->alloc_enabled = true;
}

-static void rdt_get_cdp_config(int level, int type)
-{
- struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
- struct rdt_hw_resource *hw_res_t = &rdt_resources_all[type];
- struct rdt_resource *r = &hw_res_t->resctrl;
-
- r->num_closid = r_l->num_closid / 2;
- hw_res_t->hw_num_closid = r_l->num_closid;
- r->cache.cbm_len = r_l->cache.cbm_len;
- r->default_ctrl = r_l->default_ctrl;
- r->cache.shareable_bits = r_l->cache.shareable_bits;
- r->data_width = (r->cache.cbm_len + 3) / 4;
- r->alloc_capable = true;
- /*
- * By default, CDP is disabled. CDP can be enabled by mount parameter
- * "cdp" during resctrl file system mount time.
- */
- r_l->cdp_capable = true;
- r->alloc_enabled = false;
-}
-
-static void rdt_get_cdp_l3_config(void)
-{
- rdt_get_cdp_config(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA);
- rdt_get_cdp_config(RDT_RESOURCE_L3, RDT_RESOURCE_L3CODE);
-}
-
-static void rdt_get_cdp_l2_config(void)
-{
- rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA);
- rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2CODE);
-}
-
static int get_cache_id(int cpu, int level)
{
struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);
@@ -813,6 +774,8 @@ static bool __init rdt_cpu_has(int flag)
static __init bool get_rdt_alloc_resources(void)
{
bool ret = false;
+ struct rdt_hw_resource *l2 = &rdt_resources_all[RDT_RESOURCE_L2];
+ struct rdt_hw_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3];

if (rdt_alloc_capable)
return true;
@@ -821,16 +784,16 @@ static __init bool get_rdt_alloc_resources(void)
return false;

if (rdt_cpu_has(X86_FEATURE_CAT_L3)) {
- rdt_get_cache_alloc_cfg(1, &rdt_resources_all[RDT_RESOURCE_L3].resctrl);
+ rdt_get_cache_alloc_cfg(1, &l3->resctrl);
if (rdt_cpu_has(X86_FEATURE_CDP_L3))
- rdt_get_cdp_l3_config();
+ l3->resctrl.cdp_capable = true;
ret = true;
}
if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
/* CPUID 0x10.2 fields are same format at 0x10.1 */
- rdt_get_cache_alloc_cfg(2, &rdt_resources_all[RDT_RESOURCE_L2].resctrl);
+ rdt_get_cache_alloc_cfg(2, &l2->resctrl);
if (rdt_cpu_has(X86_FEATURE_CDP_L2))
- rdt_get_cdp_l2_config();
+ l2->resctrl.cdp_capable = true;
ret = true;
}

diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index b72448186532..fd5c0b3dc797 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -285,7 +285,6 @@ struct rdt_hw_resource {
struct rdt_resource resctrl;
int rid;
u32 hw_num_closid;
- enum resctrl_conf_type cdp_type; // temporary
unsigned int msr_base;
void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index b3d3acbb2ef7..39038bdfa7d6 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -1078,47 +1078,28 @@ static int set_mba_sc(bool mba_sc)
return 0;
}

-static int cdp_enable(int level, int data_type, int code_type)
+static int cdp_enable(int level)
{
- struct rdt_resource *r_ldata = &rdt_resources_all[data_type].resctrl;
- struct rdt_resource *r_lcode = &rdt_resources_all[code_type].resctrl;
struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
int ret;

- if (!r_l->alloc_capable || !r_ldata->alloc_capable ||
- !r_lcode->alloc_capable || !r_l->cdp_capable)
+ if (!r_l->alloc_capable || !r_l->cdp_capable)
return -EINVAL;

ret = set_cache_qos_cfg(level, true);
- if (!ret) {
- r_l->alloc_enabled = false;
- r_ldata->alloc_enabled = true;
- r_lcode->alloc_enabled = true;
-
+ if (!ret)
r_l->cdp_enabled = true;
- r_ldata->cdp_enabled = true;
- r_lcode->cdp_enabled = true;
- }
+
return ret;
}

-static void cdp_disable(int level, int data_type, int code_type)
+static void cdp_disable(int level)
{
struct rdt_resource *r = &rdt_resources_all[level].resctrl;

- if (!r->cdp_enabled)
- return;
-
- r->alloc_enabled = r->alloc_capable;
-
- if (rdt_resources_all[data_type].resctrl.alloc_enabled) {
- rdt_resources_all[data_type].resctrl.alloc_enabled = false;
- rdt_resources_all[code_type].resctrl.alloc_enabled = false;
+ if (r->cdp_enabled) {
set_cache_qos_cfg(level, false);
-
r->cdp_enabled = false;
- rdt_resources_all[data_type].resctrl.cdp_enabled = false;
- rdt_resources_all[code_type].resctrl.cdp_enabled = false;
}
}

@@ -1130,22 +1111,18 @@ int resctrl_arch_set_cdp_enabled(bool enable)

if (l3 && l3->resctrl.cdp_capable) {
if (!enable) {
- cdp_disable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA,
- RDT_RESOURCE_L3CODE);
+ cdp_disable(RDT_RESOURCE_L3);
ret = 0;
} else {
- ret = cdp_enable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA,
- RDT_RESOURCE_L3CODE);
+ ret = cdp_enable(RDT_RESOURCE_L3);
}
}
if (l2 && l2->resctrl.cdp_capable) {
if (!enable) {
- cdp_disable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA,
- RDT_RESOURCE_L2CODE);
+ cdp_disable(RDT_RESOURCE_L2);
ret = 0;
} else {
- ret = cdp_enable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA,
- RDT_RESOURCE_L2CODE);
+ ret = cdp_enable(RDT_RESOURCE_L2);
}
}

@@ -1293,28 +1270,57 @@ static int mkdir_mondata_all(struct kernfs_node *parent_kn,
struct kernfs_node **mon_data_kn);


-static int create_schemata_list(void)
+static int add_schema(enum resctrl_conf_type t, struct rdt_resource *r)
{
- struct rdt_resource *r;
+ char *suffix = "";
struct resctrl_schema *s;

- for_each_alloc_enabled_rdt_resource(r) {
- s = kzalloc(sizeof(*s), GFP_KERNEL);
- if (!s)
- return -ENOMEM;
+ s = kzalloc(sizeof(*s), GFP_KERNEL);
+ if (!s)
+ return -ENOMEM;

- s->res = r;
- s->conf_type = resctrl_to_rdt(r)->cdp_type;
+ s->res = r;
+ s->conf_type = t;
+
+ switch (t) {
+ case CDP_CODE:
+ suffix = "CODE";
+ break;
+ case CDP_DATA:
+ suffix = "DATA";
+ break;
+ case CDP_BOTH:
+ suffix = "";
+ break;
+ }

- snprintf(s->name, sizeof(s->name), "%s", r->name);
+ snprintf(s->name, sizeof(s->name), "%s%s", r->name, suffix);

- INIT_LIST_HEAD(&s->list);
- list_add(&s->list, &resctrl_all_schema);
- }
+ INIT_LIST_HEAD(&s->list);
+ list_add(&s->list, &resctrl_all_schema);

return 0;
}

+static int create_schemata_list(void)
+{
+ int ret = 0;
+ struct rdt_resource *r;
+
+ for_each_alloc_enabled_rdt_resource(r) {
+ if (r->cdp_enabled) {
+ ret = add_schema(CDP_CODE, r);
+ ret |= add_schema(CDP_DATA, r);
+ } else {
+ ret = add_schema(CDP_BOTH, r);
+ }
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
static void destroy_schemata_list(void)
{
struct resctrl_schema *s, *tmp;
@@ -2133,7 +2139,7 @@ static int rdtgroup_rmdir(struct kernfs_node *kn)

static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled)
+ if (rdt_resources_all[RDT_RESOURCE_L3].resctrl.cdp_enabled)
seq_puts(seq, ",cdp");
return 0;
}
--
2.18.0


2018-08-24 10:48:18

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 20/20] x86/intel_rdt: Merge cdp enable/disable calls

Now that the cdp_enable() and cdp_disable() calls are basically the same,
merge them into cdp_set_enabled(true/false).

All these functions are behind resctrl_arch_set_cdp_enabled(),
so they can take the rdt_hw_resource directly.

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 49 ++++++------------------
1 file changed, 12 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 38cd463443e8..b9b7375ef8a9 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -1017,10 +1017,9 @@ static inline bool is_mba_linear(void)
return rdt_resources_all[RDT_RESOURCE_MBA].resctrl.membw.delay_linear;
}

-static int set_cache_qos_cfg(int level, bool enable)
+static int set_cache_qos_cfg(struct rdt_hw_resource *hw_res, bool enable)
{
void (*update)(void *arg);
- struct rdt_resource *r_l;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
int cpu;
@@ -1028,15 +1027,14 @@ static int set_cache_qos_cfg(int level, bool enable)
if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;

- if (level == RDT_RESOURCE_L3)
+ if (hw_res == &rdt_resources_all[RDT_RESOURCE_L3])
update = l3_qos_cfg_update;
- else if (level == RDT_RESOURCE_L2)
+ else if (hw_res == &rdt_resources_all[RDT_RESOURCE_L2])
update = l2_qos_cfg_update;
else
return -EINVAL;

- r_l = &rdt_resources_all[level].resctrl;
- list_for_each_entry(d, &r_l->domains, list) {
+ list_for_each_entry(d, &hw_res->resctrl.domains, list) {
/* Pick one CPU from each domain instance to update MSR */
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
}
@@ -1078,53 +1076,30 @@ static int set_mba_sc(bool mba_sc)
return 0;
}

-static int cdp_enable(int level)
+static int cdp_set_enabled(struct rdt_hw_resource *hw_res, bool enable)
{
- struct rdt_resource *r_l = &rdt_resources_all[level].resctrl;
int ret;

- if (!r_l->alloc_capable || !r_l->cdp_capable)
+ if (!hw_res->resctrl.cdp_capable)
return -EINVAL;

- ret = set_cache_qos_cfg(level, true);
+ ret = set_cache_qos_cfg(hw_res, enable);
if (!ret)
- r_l->cdp_enabled = true;
+ hw_res->resctrl.cdp_enabled = enable;

return ret;
}

-static void cdp_disable(int level)
-{
- struct rdt_resource *r = &rdt_resources_all[level].resctrl;
-
- if (r->cdp_enabled) {
- set_cache_qos_cfg(level, false);
- r->cdp_enabled = false;
- }
-}
-
int resctrl_arch_set_cdp_enabled(bool enable)
{
int ret = -EINVAL;
struct rdt_hw_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3];
struct rdt_hw_resource *l2 = &rdt_resources_all[RDT_RESOURCE_L2];

- if (l3 && l3->resctrl.cdp_capable) {
- if (!enable) {
- cdp_disable(RDT_RESOURCE_L3);
- ret = 0;
- } else {
- ret = cdp_enable(RDT_RESOURCE_L3);
- }
- }
- if (l2 && l2->resctrl.cdp_capable) {
- if (!enable) {
- cdp_disable(RDT_RESOURCE_L2);
- ret = 0;
- } else {
- ret = cdp_enable(RDT_RESOURCE_L2);
- }
- }
+ if (l3 && l3->resctrl.cdp_capable)
+ ret = cdp_set_enabled(l3, enable);
+ if (l2 && l2->resctrl.cdp_capable)
+ ret = cdp_set_enabled(l2, enable);

return ret;
}
--
2.18.0


2018-08-24 10:48:24

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 18/20] x86/intel_rdt: Remove the CODE/DATA illusionary caches

Now that nothing uses these caches, remove them.

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 69 ---------------------------------
arch/x86/kernel/cpu/intel_rdt.h | 4 --
2 files changed, 73 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 96b1aab36053..f6f1eceb366f 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -84,41 +84,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
},
- [RDT_RESOURCE_L3DATA] =
- {
- .rid = RDT_RESOURCE_L3DATA,
- .resctrl = {
- .name = "L3DATA",
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- },
- .domains = domain_init(RDT_RESOURCE_L3DATA),
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- .msr_base = IA32_L3_CBM_BASE,
- .msr_update = cat_wrmsr,
-
- },
- [RDT_RESOURCE_L3CODE] =
- {
- .rid = RDT_RESOURCE_L3CODE,
- .resctrl = {
- .name = "L3CODE",
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- },
- .domains = domain_init(RDT_RESOURCE_L3CODE),
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- .msr_base = IA32_L3_CBM_BASE,
- .msr_update = cat_wrmsr,
- },
[RDT_RESOURCE_L2] =
{
.rid = RDT_RESOURCE_L2,
@@ -136,40 +101,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
},
- [RDT_RESOURCE_L2DATA] =
- {
- .rid = RDT_RESOURCE_L2DATA,
- .resctrl = {
- .name = "L2DATA",
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- },
- .domains = domain_init(RDT_RESOURCE_L2DATA),
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- .msr_base = IA32_L2_CBM_BASE,
- .msr_update = cat_wrmsr,
- },
- [RDT_RESOURCE_L2CODE] =
- {
- .rid = RDT_RESOURCE_L2CODE,
- .resctrl = {
- .name = "L2CODE",
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- },
- .domains = domain_init(RDT_RESOURCE_L2CODE),
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- .msr_base = IA32_L2_CBM_BASE,
- .msr_update = cat_wrmsr,
- },
[RDT_RESOURCE_MBA] =
{
.rid = RDT_RESOURCE_MBA,
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index fd5c0b3dc797..a4aba005cfea 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -311,11 +311,7 @@ int __init rdtgroup_init(void);

enum {
RDT_RESOURCE_L3,
- RDT_RESOURCE_L3DATA,
- RDT_RESOURCE_L3CODE,
RDT_RESOURCE_L2,
- RDT_RESOURCE_L2DATA,
- RDT_RESOURCE_L2CODE,
RDT_RESOURCE_MBA,

/* Must be the last */
--
2.18.0


2018-08-24 10:48:36

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 16/20] x86/intel_rdt: Move the schemata names into struct resctrl_schema

Move the names used for the schemata file out of the resource and
into struct resctrl_schema. This lets us give one resource two
different names, based on the other schema properties.

For now we copy the name; once we merge the L2/L2CODE/L2DATA
resources, resctrl will generate it.

Remove the arch code's max_name_width; this is now resctrl's
problem.
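
RESCTRL_NAME_LEN is sized for the longest name we expect to generate:
"L2CODE" or "L3DATA" are six characters plus the terminating NUL. For
example (a sketch of the name generation a later patch adds):

    char name[RESCTRL_NAME_LEN];    /* RESCTRL_NAME_LEN == 7 */

    snprintf(name, sizeof(name), "%s%s", "L3", "CODE");  /* "L3CODE" */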

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 9 ++-------
arch/x86/kernel/cpu/intel_rdt.h | 2 +-
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++--
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 4 +++-
include/linux/resctrl.h | 7 +++++++
5 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 6466c172c045..3a0d7de15afa 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -48,10 +48,10 @@ DEFINE_MUTEX(rdtgroup_mutex);
DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);

/*
- * Used to store the max resource name width and max resource data width
+ * Used to store the max resource data width
* to display the schemata in a tabular format
*/
-int max_name_width, max_data_width;
+int max_data_width;

/*
* Global boolean for rdt_alloc which is true if any
@@ -722,13 +722,8 @@ static int intel_rdt_offline_cpu(unsigned int cpu)
static __init void rdt_init_padding(void)
{
struct rdt_resource *r;
- int cl;

for_each_alloc_capable_rdt_resource(r) {
- cl = strlen(r->name);
- if (cl > max_name_width)
- max_name_width = cl;
-
if (r->data_width > max_data_width)
max_data_width = r->data_width;
}
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index cc8dea58b74f..b72448186532 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -146,7 +146,7 @@ struct rdtgroup {
/* List of all resource groups */
extern struct list_head rdt_all_groups;

-extern int max_name_width, max_data_width;
+extern int max_data_width;

int __init rdtgroup_init(void);

diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index 3038ecfdeec0..e8264637a4d3 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -275,7 +275,7 @@ static int rdtgroup_parse_resource(char *resname, char *tok, int closid)

list_for_each_entry(s, &resctrl_all_schema, list) {
r = s->res;
- if (!strcmp(resname, r->name) && closid < r->num_closid)
+ if (!strcmp(resname, s->name) && closid < r->num_closid)
return parse_line(tok, r, s->conf_type, closid);
}
rdt_last_cmd_printf("unknown/unsupported resource name '%s'\n", resname);
@@ -358,7 +358,7 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
bool sep = false;
u32 ctrl_val, hw_closid;

- seq_printf(s, "%*s:", max_name_width, r->name);
+ seq_printf(s, "%*s:", (int)sizeof(schema->name), schema->name);
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_puts(s, ";");
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 0bd748defc73..b3d3acbb2ef7 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -932,7 +932,7 @@ static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
list_for_each_entry(s, &resctrl_all_schema, list) {
r = s->res;
fflags = r->fflags | RF_CTRL_INFO;
- ret = rdtgroup_mkdir_info_resdir(r, r->name, fflags);
+ ret = rdtgroup_mkdir_info_resdir(r, s->name, fflags);
if (ret)
goto out_destroy;
}
@@ -1306,6 +1306,8 @@ static int create_schemata_list(void)
s->res = r;
s->conf_type = resctrl_to_rdt(r)->cdp_type;

+ snprintf(s->name, sizeof(s->name), "%s", r->name);
+
INIT_LIST_HEAD(&s->list);
list_add(&s->list, &resctrl_all_schema);
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 9ed0beb241d8..8b06ed8e7407 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -7,6 +7,11 @@
#include <linux/list.h>
#include <linux/kernel.h>

+/*
+ * The longest name we expect in the schemata file:
+ */
+#define RESCTRL_NAME_LEN 7
+
enum resctrl_conf_type {
CDP_BOTH = 0,
CDP_CODE,
@@ -147,11 +152,13 @@ int resctrl_arch_set_cdp_enabled(bool enable);

/**
* @list: Member of resctrl's schema list
+ * @name: Name visible in the schemata file
* @conf_type: Type of configuration, e.g. code/data/both
* @res: The rdt_resource for this entry
*/
struct resctrl_schema {
struct list_head list;
+ char name[RESCTRL_NAME_LEN];
enum resctrl_conf_type conf_type;
struct rdt_resource *res;
};
--
2.18.0


2018-08-24 10:48:40

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 19/20] x86/intel_rdt: Kill off alloc_enabled

Now that the L2/L2CODE/L2DATA resources are merged together, alloc_enabled
doesn't mean anything: it's the same as alloc_capable, which indicates
CAT is supported by this cache.

Take the opportunity to kill off alloc_enabled and its helpers.

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 3 ---
arch/x86/kernel/cpu/intel_rdt.h | 5 -----
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 6 +++---
include/linux/resctrl.h | 2 --
5 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index f6f1eceb366f..982897839711 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -157,7 +157,6 @@ static inline void cache_alloc_hsw_probe(void)
r->cache.shareable_bits = 0xc0000;
r->cache.min_cbm_bits = 2;
r->alloc_capable = true;
- r->alloc_enabled = true;

rdt_alloc_capable = true;
}
@@ -214,7 +213,6 @@ static bool rdt_get_mem_config(struct rdt_resource *r)
r->data_width = 3;

r->alloc_capable = true;
- r->alloc_enabled = true;

return true;
}
@@ -234,7 +232,6 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
r->cache.shareable_bits = ebx & r->default_ctrl;
r->data_width = (r->cache.cbm_len + 3) / 4;
r->alloc_capable = true;
- r->alloc_enabled = true;
}

static int get_cache_id(int cpu, int level)
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index a4aba005cfea..735d7bd4bcd9 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -342,11 +342,6 @@ static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
r = resctrl_inc(r)) \
if (r->mon_capable)

-#define for_each_alloc_enabled_rdt_resource(r) \
- for (r = &rdt_resources_all[0].resctrl; r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl;\
- r = resctrl_inc(r)) \
- if (r->alloc_enabled)
-
#define for_each_mon_enabled_rdt_resource(r) \
for (r = &rdt_resources_all[0].resctrl; r < &rdt_resources_all[RDT_NUM_RESOURCES].resctrl;\
r = resctrl_inc(r)) \
diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index e8264637a4d3..ab8bc97dee0b 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -329,7 +329,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
goto out;
}

- for_each_alloc_enabled_rdt_resource(r) {
+ for_each_alloc_capable_rdt_resource(r) {
ret = resctrl_arch_update_domains(r);
if (ret)
goto out;
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 39038bdfa7d6..38cd463443e8 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -103,7 +103,7 @@ static void closid_init(void)
int rdt_min_closid = 32;

/* Compute rdt_min_closid across all resources */
- for_each_alloc_enabled_rdt_resource(r)
+ for_each_alloc_capable_rdt_resource(r)
rdt_min_closid = min(rdt_min_closid, r->num_closid);

closid_free_map = BIT_MASK(rdt_min_closid) - 1;
@@ -1307,7 +1307,7 @@ static int create_schemata_list(void)
int ret = 0;
struct rdt_resource *r;

- for_each_alloc_enabled_rdt_resource(r) {
+ for_each_alloc_capable_rdt_resource(r) {
if (r->cdp_enabled) {
ret = add_schema(CDP_CODE, r);
ret |= add_schema(CDP_DATA, r);
@@ -1586,7 +1586,7 @@ static void rdt_kill_sb(struct super_block *sb)
set_mba_sc(false);

/*Put everything back to default values. */
- for_each_alloc_enabled_rdt_resource(r)
+ for_each_capable_rdt_resource(r)
reset_all_ctrls(r);
cdp_disable_all();
rmdir_all_sub();
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8b06ed8e7407..7a955ce93988 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -86,7 +86,6 @@ struct resctrl_membw {
};

/**
- * @alloc_enabled: Is allocation enabled on this machine
* @mon_enabled: Is monitoring enabled for this feature
* @cdp_enabled Is CDP enabled for this resource
* @alloc_capable: Is allocation available on this machine
@@ -113,7 +112,6 @@ struct resctrl_membw {
* @fflags: flags to choose base and info files
*/
struct rdt_resource {
- bool alloc_enabled;
bool mon_enabled;
bool cdp_enabled;
bool alloc_capable;
--
2.18.0


2018-08-24 10:48:54

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 11/20] x86/intel_rdt: Pass in the code/data/both configuration value when parsing

The illusion of three types of cache at each level is a neat trick to
allow a static table of resources to be used. This is a problem if the
cache topology and partitioning abilities have to be discovered at boot.

We want to fold the three code/data/both caches into one, and move the
CDP configuration details to be a property of the configuration and
its closid, not the cache. The resctrl filesystem can then re-create
the illusion of separate caches.

Temporarily label the configuration property of the cache, and pass
this value down to the configuration helpers. Eventually we will move
this label up to become a property of the schema.

A later patch will correct the closid for CDP when the configuration is
staged, which will let us merge the three types of resource.
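
For example (sketch), after this patch a write of "L3DATA:0=3f"
reaches the parser with the resource's label attached:

    /* rdtgroup_parse_resource() */
    parse_line(tok, r, resctrl_to_rdt(r)->cdp_type, closid);
        /* for L3DATA: t == CDP_DATA */

    /* parse_cbm() then records it in the staged config */
    cfg->new_ctrl_type = t;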

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 6 ++++++
arch/x86/kernel/cpu/intel_rdt.h | 7 +++++--
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 15 ++++++++++-----
include/linux/resctrl.h | 11 ++++++++++-
4 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index c035280b4398..8d3544b6c149 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -83,6 +83,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
+ .cdp_type = CDP_BOTH,
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -102,6 +103,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
+ .cdp_type = CDP_DATA,
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,

@@ -122,6 +124,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
+ .cdp_type = CDP_CODE,
.msr_base = IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -141,6 +144,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
+ .cdp_type = CDP_BOTH,
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -160,6 +164,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
+ .cdp_type = CDP_DATA,
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
},
@@ -179,6 +184,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
.format_str = "%d=%0*x",
.fflags = RFTYPE_RES_CACHE,
},
+ .cdp_type = CDP_CODE,
.msr_base = IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
},
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index 92822ff99f1a..cc8dea58b74f 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -285,6 +285,7 @@ struct rdt_hw_resource {
struct rdt_resource resctrl;
int rid;
u32 hw_num_closid;
+ enum resctrl_conf_type cdp_type; // temporary
unsigned int msr_base;
void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
@@ -296,8 +297,10 @@ static inline struct rdt_hw_resource *resctrl_to_rdt(struct rdt_resource *r)
return container_of(r, struct rdt_hw_resource, resctrl);
}

-int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid);
-int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid);
+int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d,
+ enum resctrl_conf_type t, u32 closid);
+int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d,
+ enum resctrl_conf_type t, u32 closid);

extern struct mutex rdtgroup_mutex;

diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index 766c3e62ad91..bab6032704c3 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -64,7 +64,8 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
return true;
}

-int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid)
+int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d,
+ enum resctrl_conf_type t, u32 closid)
{
unsigned long data;
struct resctrl_staged_config *cfg = &d->staged_config[0];
@@ -78,6 +79,7 @@ int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid
return -EINVAL;
cfg->closid = closid;
cfg->new_ctrl = data;
+ cfg->new_ctrl_type = t;
cfg->have_new_ctrl = true;

return 0;
@@ -128,7 +130,8 @@ static bool cbm_validate(char *buf, unsigned long *data, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid)
+int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d,
+ enum resctrl_conf_type t, u32 closid)
{
unsigned long data;
struct resctrl_staged_config *cfg = &d->staged_config[0];
@@ -142,6 +145,7 @@ int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closi
return -EINVAL;
cfg->closid = closid;
cfg->new_ctrl = data;
+ cfg->new_ctrl_type = t;
cfg->have_new_ctrl = true;

return 0;
@@ -153,7 +157,8 @@ int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closi
* separated by ";". The "id" is in decimal, and must match one of
* the "id"s for this resource.
*/
-static int parse_line(char *line, struct rdt_resource *r, u32 closid)
+static int parse_line(char *line, struct rdt_resource *r,
+ enum resctrl_conf_type t, u32 closid)
{
char *dom = NULL, *id;
struct rdt_domain *d;
@@ -171,7 +176,7 @@ static int parse_line(char *line, struct rdt_resource *r, u32 closid)
dom = strim(dom);
list_for_each_entry(d, &r->domains, list) {
if (d->id == dom_id) {
- if (r->parse_ctrlval(dom, r, d, closid))
+ if (r->parse_ctrlval(dom, r, d, t, closid))
return -EINVAL;
goto next;
}
@@ -260,7 +265,7 @@ static int rdtgroup_parse_resource(char *resname, char *tok, int closid)

for_each_alloc_enabled_rdt_resource(r) {
if (!strcmp(resname, r->name) && closid < r->num_closid)
- return parse_line(tok, r, closid);
+ return parse_line(tok, r, resctrl_to_rdt(r)->cdp_type, closid);
}
rdt_last_cmd_printf("unknown/unsupported resource name '%s'\n", resname);
return -EINVAL;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8bf813071039..592242635204 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -7,16 +7,24 @@
#include <linux/list.h>
#include <linux/kernel.h>

+enum resctrl_conf_type {
+ CDP_BOTH = 0,
+ CDP_CODE,
+ CDP_DATA,
+};
+
/**
* struct resctrl_staged_config - parsed configuration to be applied
* @closid: The closid the new configuration applies to
* @new_ctrl: new ctrl value to be loaded
* @have_new_ctrl: did user provide new_ctrl for this domain
+ * @new_ctrl_type: CDP property of the new ctrl
*/
struct resctrl_staged_config {
u32 closid;
u32 new_ctrl;
bool have_new_ctrl;
+ enum resctrl_conf_type new_ctrl_type;
};

/**
@@ -122,7 +130,8 @@ struct rdt_resource {
u32 default_ctrl;
const char *format_str;
int (*parse_ctrlval) (char *buf, struct rdt_resource *r,
- struct rdt_domain *d, u32 closid);
+ struct rdt_domain *d,
+ enum resctrl_conf_type t, u32 closid);

struct list_head evt_list;
unsigned long fflags;
--
2.18.0


2018-08-24 10:49:05

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 08/20] x86/intel_rdt: Make cdp enable/disable global

If the CPU supports Intel's Code and Data Prioritization (CDP), software
can specify separate bitmaps for code and data. This feature needs
enabling in a model-specific register, and changes the properties of
the cache controls: it halves the effective number of closids.

This changes how closids are allocated, and so applies to all
alloc_enabled caches. If a system has multiple levels of RDT-like
controls, CDP should be enabled/disabled across them all.

Make the CDP enable/disable calls global.

Add CDP capable/enabled flags, and unify the enable/disable behind a
single resctrl_arch_set_cdp_enabled(true/false) call. Architectures
that have nothing to do here can just update the flags.
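
(For illustration: on an architecture where CDP needs no hardware enable
and doesn't change the number of closids, the arch hook could reduce to
flag updates. A minimal sketch; 'my_arch_llc_resource' is hypothetical,
and nothing here beyond resctrl_arch_set_cdp_enabled() is part of this
series:)

	int resctrl_arch_set_cdp_enabled(bool enable)
	{
		/* Hypothetical per-arch resource, for illustration only. */
		struct rdt_resource *r = &my_arch_llc_resource;

		if (!r->cdp_capable)
			return -EINVAL;

		/* No MSR to write; the number of closids is unchanged. */
		r->cdp_enabled = enable;

		return 0;
	}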

This subtly changes resctrl's '-o cdp' (l3) and '-o cdpl2' parameters
to mean: enable CDP globally, provided this level supports it. The
difference can't be seen on a system that only has one of the two.

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.c | 1 +
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 72 ++++++++++++++++--------
include/linux/resctrl.h | 7 +++
3 files changed, 57 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index c4e6dcdd235b..0e651447956e 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -331,6 +331,7 @@ static void rdt_get_cdp_config(int level, int type)
* By default, CDP is disabled. CDP can be enabled by mount parameter
* "cdp" during resctrl file system mount time.
*/
+ r_l->cdp_capable = true;
r->alloc_enabled = false;
}

diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 3ed88d4fedd0..f4f76c193495 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -1081,7 +1081,7 @@ static int cdp_enable(int level, int data_type, int code_type)
int ret;

if (!r_l->alloc_capable || !r_ldata->alloc_capable ||
- !r_lcode->alloc_capable)
+ !r_lcode->alloc_capable || !r_l->cdp_capable)
return -EINVAL;

ret = set_cache_qos_cfg(level, true);
@@ -1089,51 +1089,77 @@ static int cdp_enable(int level, int data_type, int code_type)
r_l->alloc_enabled = false;
r_ldata->alloc_enabled = true;
r_lcode->alloc_enabled = true;
+
+ r_l->cdp_enabled = true;
+ r_ldata->cdp_enabled = true;
+ r_lcode->cdp_enabled = true;
}
return ret;
}

-static int cdpl3_enable(void)
-{
- return cdp_enable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA,
- RDT_RESOURCE_L3CODE);
-}
-
-static int cdpl2_enable(void)
-{
- return cdp_enable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA,
- RDT_RESOURCE_L2CODE);
-}
-
static void cdp_disable(int level, int data_type, int code_type)
{
struct rdt_resource *r = &rdt_resources_all[level].resctrl;

+ if (!r->cdp_enabled)
+ return;
+
r->alloc_enabled = r->alloc_capable;

if (rdt_resources_all[data_type].resctrl.alloc_enabled) {
rdt_resources_all[data_type].resctrl.alloc_enabled = false;
rdt_resources_all[code_type].resctrl.alloc_enabled = false;
set_cache_qos_cfg(level, false);
+
+ r->cdp_enabled = false;
+ rdt_resources_all[data_type].resctrl.cdp_enabled = false;
+ rdt_resources_all[code_type].resctrl.cdp_enabled = false;
}
}

-static void cdpl3_disable(void)
+int resctrl_arch_set_cdp_enabled(bool enable)
{
- cdp_disable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA, RDT_RESOURCE_L3CODE);
+ int ret = -EINVAL;
+ struct rdt_hw_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_hw_resource *l2 = &rdt_resources_all[RDT_RESOURCE_L2];
+
+ if (l3 && l3->resctrl.cdp_capable) {
+ if (!enable) {
+ cdp_disable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA,
+ RDT_RESOURCE_L3CODE);
+ ret = 0;
+ } else {
+ ret = cdp_enable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA,
+ RDT_RESOURCE_L3CODE);
+ }
+ }
+ if (l2 && l2->resctrl.cdp_capable) {
+ if (!enable) {
+ cdp_disable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA,
+ RDT_RESOURCE_L2CODE);
+ ret = 0;
+ } else {
+ ret = cdp_enable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA,
+ RDT_RESOURCE_L2CODE);
+ }
+ }
+
+ return ret;
}

-static void cdpl2_disable(void)
+static int try_to_enable_cdp(int level)
{
- cdp_disable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA, RDT_RESOURCE_L2CODE);
+ struct rdt_resource *r = &rdt_resources_all[level].resctrl;
+
+ if (!r->cdp_capable)
+ return -EINVAL;
+
+ return resctrl_arch_set_cdp_enabled(true);
}

static void cdp_disable_all(void)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].resctrl.alloc_enabled)
- cdpl3_disable();
- if (rdt_resources_all[RDT_RESOURCE_L2DATA].resctrl.alloc_enabled)
- cdpl2_disable();
+ resctrl_arch_set_cdp_enabled(false);
}

static int parse_rdtgroupfs_options(char *data)
@@ -1148,11 +1174,11 @@ static int parse_rdtgroupfs_options(char *data)
}

if (!strcmp(token, "cdp")) {
- ret = cdpl3_enable();
+ ret = try_to_enable_cdp(RDT_RESOURCE_L3);
if (ret)
goto out;
} else if (!strcmp(token, "cdpl2")) {
- ret = cdpl2_enable();
+ ret = try_to_enable_cdp(RDT_RESOURCE_L2);
if (ret)
goto out;
} else if (!strcmp(token, "mba_MBps")) {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 9fe7d7de53d7..8bf813071039 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -75,8 +75,10 @@ struct resctrl_membw {
/**
* @alloc_enabled: Is allocation enabled on this machine
* @mon_enabled: Is monitoring enabled for this feature
+ * @cdp_enabled: Is CDP enabled for this resource
* @alloc_capable: Is allocation available on this machine
* @mon_capable: Is monitor feature available on this machine
+ * @cdp_capable: Is CDP feature available on this resource
*
* @cache_level: Which cache level defines scope of this resource.
*
@@ -100,8 +102,10 @@ struct resctrl_membw {
struct rdt_resource {
bool alloc_enabled;
bool mon_enabled;
+ bool cdp_enabled;
bool alloc_capable;
bool mon_capable;
+ bool cdp_capable;

int cache_level;

@@ -129,4 +133,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r);
void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
u32 closid, u32 *value);

+/* Enable/Disable CDP on all applicable resources */
+int resctrl_arch_set_cdp_enabled(bool enable);
+
#endif /* __LINUX_RESCTRL_H */
--
2.18.0


2018-08-24 10:49:21

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 07/20] x86/intel_rdt: Expose update_domains() as an arch helper

update_domains() applies the staged configuration to the hw_dom's
configuration array and updates the hardware. Make it part of the
interface between resctrl and the arch code.
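
(A hedged sketch of what a second arch's implementation of this hook
might look like, using only the staged-config fields this series
introduces; the my_arch_apply_ctrl() helper is hypothetical:)

	int resctrl_arch_update_domains(struct rdt_resource *r)
	{
		struct resctrl_staged_config *cfg;
		struct rdt_domain *d;
		int i;

		list_for_each_entry(d, &r->domains, list) {
			for (i = 0; i < ARRAY_SIZE(d->staged_config); i++) {
				cfg = &d->staged_config[i];
				if (!cfg->have_new_ctrl)
					continue;

				/* Program this arch's control registers. */
				my_arch_apply_ctrl(d, cfg->closid, cfg->new_ctrl);
				cfg->have_new_ctrl = false;
			}
		}

		return 0;
	}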

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++--
include/linux/resctrl.h | 1 +
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index ec3c15ee3473..766c3e62ad91 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -193,7 +193,7 @@ static void apply_config(struct rdt_hw_domain *hw_dom,
}
}

-static int update_domains(struct rdt_resource *r)
+int resctrl_arch_update_domains(struct rdt_resource *r)
{
struct resctrl_staged_config *cfg;
struct rdt_hw_domain *hw_dom;
@@ -312,7 +312,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
}

for_each_alloc_enabled_rdt_resource(r) {
- ret = update_domains(r);
+ ret = resctrl_arch_update_domains(r);
if (ret)
goto out;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 03d9fbc230af..9fe7d7de53d7 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -125,6 +125,7 @@ struct rdt_resource {

};

+int resctrl_arch_update_domains(struct rdt_resource *r);
void resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
u32 closid, u32 *value);

--
2.18.0


2018-08-24 10:49:27

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 05/20] x86/intel_rdt: make update_domains() learn the affected closids

Now that the closid is present in the staged configuration,
update_domains() can learn which low/high values it should update.

Remove the single passed-in closid, and update msr_param as we
apply each staged config.

Once the L2/L2CODE/L2DATA resources are merged, this will allow
update_domains() to be called once for the single resource, even
when CDP is in use. This results in both CODE and DATA
configurations being applied and the two consecutive closids being
updated with a single smp_call_function_many(). For example, staging
configurations for closids 2 and 3 yields msr_param.low = 2 and
msr_param.high = 4, so one IPI covers both.

This will let us keep the CDP odd/even closid behaviour inside
resctrl, so that architectures whose hardware doesn't behave this
way don't need to emulate it.

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.h | 4 ++--
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 21 ++++++++++++++++-----
2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index 5e271e0fe1f5..8df549ef016d 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -241,8 +241,8 @@ static inline struct rdt_hw_domain *rc_dom_to_rdt(struct rdt_domain *r)
*/
struct msr_param {
struct rdt_resource *res;
- int low;
- int high;
+ u32 low;
+ u32 high;
};

static inline bool is_llc_occupancy_enabled(void)
diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index 0c849653a99d..01ffd455313a 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -193,22 +193,21 @@ static void apply_config(struct rdt_hw_domain *hw_dom,
}
}

-static int update_domains(struct rdt_resource *r, int closid)
+static int update_domains(struct rdt_resource *r)
{
struct resctrl_staged_config *cfg;
struct rdt_hw_domain *hw_dom;
+ bool msr_param_init = false;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
bool mba_sc;
+ u32 closid;
int i, cpu;

if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;

- /* TODO: learn these two by looping the config */
- msr_param.low = closid;
- msr_param.high = msr_param.low + 1;
msr_param.res = r;

mba_sc = is_mba_sc(r);
@@ -220,9 +219,21 @@ static int update_domains(struct rdt_resource *r, int closid)
continue;

apply_config(hw_dom, cfg, cpu_mask, mba_sc);
+
+ closid = cfg->closid;
+ if (!msr_param_init) {
+ msr_param.low = closid;
+ msr_param.high = closid;
+ msr_param_init = true;
+ } else {
+ msr_param.low = min(msr_param.low, closid);
+ msr_param.high = max(msr_param.high, closid);
+ }
}
}

+ msr_param.high += 1;
+
/*
* Avoid writing the control msr with control values when
* MBA software controller is enabled
@@ -301,7 +312,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
}

for_each_alloc_enabled_rdt_resource(r) {
- ret = update_domains(r, closid);
+ ret = update_domains(r);
if (ret)
goto out;
}
--
2.18.0


2018-08-24 10:49:34

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 04/20] x86/intel_rdt: Add closid to the staged config

Once we merge the L2/L2CODE/L2DATA resources, we still want to have
two configurations staged for one resource when CDP is enabled.

These two configurations would have different closids as far as the
hardware is concerned.

In preparation, add closid as a staged parameter, and pass it down
when the schema is being parsed. In the future this will be the
hardware closid, with the CDP correction already applied by resctrl.
This allows another architecture to work with resctrl, without
having to emulate CDP.
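
(To illustrate that direction, a sketch of the CDP correction resctrl
could apply before passing the closid down; this helper is hypothetical
and not part of this patch:)

	/*
	 * Hypothetical: map a user-visible closid to a hardware closid when
	 * CDP gives each control group two consecutive closids (data even,
	 * code odd).
	 */
	static u32 resctrl_hw_closid(u32 closid, bool cdp_enabled, bool is_code)
	{
		if (!cdp_enabled)
			return closid;

		return 2 * closid + (is_code ? 1 : 0);
	}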

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt.h | 4 ++--
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 21 ++++++++++++---------
include/linux/resctrl.h | 4 +++-
3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index 7c17d74fd36c..5e271e0fe1f5 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -294,8 +294,8 @@ static inline struct rdt_hw_resource *resctrl_to_rdt(struct rdt_resource *r)
return container_of(r, struct rdt_hw_resource, resctrl);
}

-int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d);
-int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d);
+int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid);
+int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid);

extern struct mutex rdtgroup_mutex;

diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index 1068a19e03c5..0c849653a99d 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -64,7 +64,7 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
return true;
}

-int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d)
+int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid)
{
unsigned long data;
struct resctrl_staged_config *cfg = &d->staged_config[0];
@@ -76,6 +76,7 @@ int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d)

if (!bw_validate(buf, &data, r))
return -EINVAL;
+ cfg->closid = closid;
cfg->new_ctrl = data;
cfg->have_new_ctrl = true;

@@ -127,7 +128,7 @@ static bool cbm_validate(char *buf, unsigned long *data, struct rdt_resource *r)
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d)
+int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d, u32 closid)
{
unsigned long data;
struct resctrl_staged_config *cfg = &d->staged_config[0];
@@ -139,6 +140,7 @@ int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d)

if(!cbm_validate(buf, &data, r))
return -EINVAL;
+ cfg->closid = closid;
cfg->new_ctrl = data;
cfg->have_new_ctrl = true;

@@ -151,7 +153,7 @@ int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d)
* separated by ";". The "id" is in decimal, and must match one of
* the "id"s for this resource.
*/
-static int parse_line(char *line, struct rdt_resource *r)
+static int parse_line(char *line, struct rdt_resource *r, u32 closid)
{
char *dom = NULL, *id;
struct rdt_domain *d;
@@ -169,7 +171,7 @@ static int parse_line(char *line, struct rdt_resource *r)
dom = strim(dom);
list_for_each_entry(d, &r->domains, list) {
if (d->id == dom_id) {
- if (r->parse_ctrlval(dom, r, d))
+ if (r->parse_ctrlval(dom, r, d, closid))
return -EINVAL;
goto next;
}
@@ -178,15 +180,15 @@ static int parse_line(char *line, struct rdt_resource *r)
}

static void apply_config(struct rdt_hw_domain *hw_dom,
- struct resctrl_staged_config *cfg, int closid,
+ struct resctrl_staged_config *cfg,
cpumask_var_t cpu_mask, bool mba_sc)
{
u32 *dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;

- if (cfg->new_ctrl != dc[closid]) {
+ if (cfg->new_ctrl != dc[cfg->closid]) {
cpumask_set_cpu(cpumask_any(&hw_dom->resctrl.cpu_mask),
cpu_mask);
- dc[closid] = cfg->new_ctrl;
+ dc[cfg->closid] = cfg->new_ctrl;
cfg->have_new_ctrl = false;
}
}
@@ -204,6 +206,7 @@ static int update_domains(struct rdt_resource *r, int closid)
if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;

+ /* TODO: learn these two by looping the config */
msr_param.low = closid;
msr_param.high = msr_param.low + 1;
msr_param.res = r;
@@ -216,7 +219,7 @@ static int update_domains(struct rdt_resource *r, int closid)
if (!cfg->have_new_ctrl)
continue;

- apply_config(hw_dom, cfg, closid, cpu_mask, mba_sc);
+ apply_config(hw_dom, cfg, cpu_mask, mba_sc);
}
}

@@ -246,7 +249,7 @@ static int rdtgroup_parse_resource(char *resname, char *tok, int closid)

for_each_alloc_enabled_rdt_resource(r) {
if (!strcmp(resname, r->name) && closid < r->num_closid)
- return parse_line(tok, r);
+ return parse_line(tok, r, closid);
}
rdt_last_cmd_printf("unknown/unsupported resource name '%s'\n", resname);
return -EINVAL;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 0b57a55f4b3d..370db085ee77 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -9,10 +9,12 @@

/**
* struct resctrl_staged_config - parsed configuration to be applied
+ * @closid: The closid the new configuration applies to
* @new_ctrl: new ctrl value to be loaded
* @have_new_ctrl: did user provide new_ctrl for this domain
*/
struct resctrl_staged_config {
+ u32 closid;
u32 new_ctrl;
bool have_new_ctrl;
};
@@ -116,7 +118,7 @@ struct rdt_resource {
u32 default_ctrl;
const char *format_str;
int (*parse_ctrlval) (char *buf, struct rdt_resource *r,
- struct rdt_domain *d);
+ struct rdt_domain *d, u32 closid);

struct list_head evt_list;
unsigned long fflags;
--
2.18.0


2018-08-24 10:50:05

by James Morse

[permalink] [raw]
Subject: [RFC PATCH 03/20] x86/intel_rdt: Group staged configuration into a separate struct

We want to merge the L2/L2CODE/L2DATA resources into a single resource,
but we still need to be able to apply separate CODE and DATA schemas to
the domain.

Move the new_ctrl bitmap value and flag into a struct, and create an
array of them. Today there is only one element in the array, but
eventually resctrl will use the array slots for CODE/DATA/BOTH to
detect a duplicate schema being written.
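
(As a sketch of that end state: with the array grown to one slot per
configuration type and indexed by the CDP_BOTH/CDP_CODE/CDP_DATA values
a later patch adds, the duplicate check in the parsers could become:)

	/* Hypothetical: one staged slot per configuration type. */
	struct resctrl_staged_config *cfg = &d->staged_config[CDP_CODE];

	if (cfg->have_new_ctrl) {
		rdt_last_cmd_printf("duplicate domain %d\n", d->id);
		return -EINVAL;
	}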

Signed-off-by: James Morse <[email protected]>
---
arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 44 +++++++++++++++------
include/linux/resctrl.h | 16 ++++++--
2 files changed, 43 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
index e3dcb5161122..1068a19e03c5 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
@@ -67,16 +67,17 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
int parse_bw(char *buf, struct rdt_resource *r, struct rdt_domain *d)
{
unsigned long data;
+ struct resctrl_staged_config *cfg = &d->staged_config[0];

- if (d->have_new_ctrl) {
+ if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("duplicate domain %d\n", d->id);
return -EINVAL;
}

if (!bw_validate(buf, &data, r))
return -EINVAL;
- d->new_ctrl = data;
- d->have_new_ctrl = true;
+ cfg->new_ctrl = data;
+ cfg->have_new_ctrl = true;

return 0;
}
@@ -129,16 +130,17 @@ static bool cbm_validate(char *buf, unsigned long *data, struct rdt_resource *r)
int parse_cbm(char *buf, struct rdt_resource *r, struct rdt_domain *d)
{
unsigned long data;
+ struct resctrl_staged_config *cfg = &d->staged_config[0];

- if (d->have_new_ctrl) {
+ if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("duplicate domain %d\n", d->id);
return -EINVAL;
}

if(!cbm_validate(buf, &data, r))
return -EINVAL;
- d->new_ctrl = data;
- d->have_new_ctrl = true;
+ cfg->new_ctrl = data;
+ cfg->have_new_ctrl = true;

return 0;
}
@@ -175,15 +177,29 @@ static int parse_line(char *line, struct rdt_resource *r)
return -EINVAL;
}

+static void apply_config(struct rdt_hw_domain *hw_dom,
+ struct resctrl_staged_config *cfg, int closid,
+ cpumask_var_t cpu_mask, bool mba_sc)
+{
+ u32 *dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;
+
+ if (cfg->new_ctrl != dc[closid]) {
+ cpumask_set_cpu(cpumask_any(&hw_dom->resctrl.cpu_mask),
+ cpu_mask);
+ dc[closid] = cfg->new_ctrl;
+ cfg->have_new_ctrl = false;
+ }
+}
+
static int update_domains(struct rdt_resource *r, int closid)
{
+ struct resctrl_staged_config *cfg;
struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
bool mba_sc;
- u32 *dc;
- int cpu;
+ int i, cpu;

if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
@@ -195,10 +211,12 @@ static int update_domains(struct rdt_resource *r, int closid)
mba_sc = is_mba_sc(r);
list_for_each_entry(d, &r->domains, list) {
hw_dom = rc_dom_to_rdt(d);
- dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;
- if (d->have_new_ctrl && d->new_ctrl != dc[closid]) {
- cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
- dc[closid] = d->new_ctrl;
+ for (i = 0; i < ARRAY_SIZE(d->staged_config); i++) {
+ cfg = &hw_dom->resctrl.staged_config[i];
+ if (!cfg->have_new_ctrl)
+ continue;
+
+ apply_config(hw_dom, cfg, closid, cpu_mask, mba_sc);
}
}

@@ -259,7 +277,7 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,

for_each_alloc_enabled_rdt_resource(r) {
list_for_each_entry(dom, &r->domains, list)
- dom->have_new_ctrl = false;
+ memset(dom->staged_config, 0, sizeof(dom->staged_config));
}

while ((tok = strsep(&buf, "\n")) != NULL) {
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 5950c30fcc30..0b57a55f4b3d 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -7,21 +7,29 @@
#include <linux/list.h>
#include <linux/kernel.h>

+/**
+ * struct resctrl_staged_config - parsed configuration to be applied
+ * @new_ctrl: new ctrl value to be loaded
+ * @have_new_ctrl: did user provide new_ctrl for this domain
+ */
+struct resctrl_staged_config {
+ u32 new_ctrl;
+ bool have_new_ctrl;
+};
+
/**
* struct rdt_domain - group of cpus sharing an RDT resource
* @list: all instances of this resource
* @id: unique id for this instance
* @cpu_mask: which cpus share this resource
- * @new_ctrl: new ctrl value to be loaded
- * @have_new_ctrl: did user provide new_ctrl for this domain
+ * @staged_config: parsed configuration to be applied
*/
struct rdt_domain {
struct list_head list;
int id;
struct cpumask cpu_mask;

- u32 new_ctrl;
- bool have_new_ctrl;
+ struct resctrl_staged_config staged_config[1];
};

/**
--
2.18.0


2018-08-27 14:25:41

by Fenghua Yu

[permalink] [raw]
Subject: Re: [RFC PATCH 00/20] x86/intel_rdt: Start abstraction for a second arch

On Fri, Aug 24, 2018 at 11:44:59AM +0100, James Morse wrote:
> Hi folks,
>
> ARM have some upcoming CPU features that are similar to Intel RDT. Resctrl
> is the defacto ABI for this sort of thing, but it lives under arch/x86.
>
> To get existing software working, we need to make resctrl work with arm64.
> This series is the first chunk of that. The aim is to move the filesystem/ABI
> parts into /fs/resctrl, and implement a second arch backend.
>
>
> What are the ARM features?
> Future ARM SoCs may have a feature called MPAM: Memory Partitioning and
> Monitoring. This is an umbrella term like RDT, and covers a range of controls
> (like CAT) and monitors (like MBM, CMT).

Please send a link to the MPAM spec.

>
> This series is almost all about CDP. MPAM has equivalent functionality, but
> it doesn't need enabling, and doesn't affect the available closids. (I'll
> try and use Intel terms). MPAM expects the equivalent to IA32_PRQ_MSR to
> be configured with an Instruction closid and a Data closid. These are the
> same for no-CDP, and different otherwise. There is no need for them to be
> adjacent.
>
> To avoid emulating CDP in arm64's arch code, this series moves all the ABI
> parts of the CDP behaviour, (half the closid-space, each having two
> configurations) into the filesystem parts of resctrl. These will eventually
> be moved to /fs/.

Do you have the patches that move code to /fs/resctrl?

>
> MPAMs control and monitor configuration is all memory mapped, the base
> addresses are discovered via firmware tables, so we won't have a table of
> possible resources that just need alloc_enabling.
>
> Is this it? No... there are another two series of a similar size that
> abstract the MBM/CMT overflow threads and avoid 'fs' code accessing things
> that have moved into the 'hw' arch specific struct.
>
>
> I'm after feedback on the general approach taken here, bugs, as there are
> certainly subtleties I've missed, and any strong-opinions on what should be
> arch-specific, and what shouldn't.
>
> This series is based on v4.18, and can be retrieved from:
> git://linux-arm.org/linux-jm.git -b mpam/resctrl_rework/rfc_1

Could you please publish the MPAM patches as well? Then we can have a better
idea of ARM's specific code. This patch set only has the Intel RDT part.

Thanks.

-Fenghua

2018-08-31 15:36:09

by James Morse

[permalink] [raw]
Subject: Re: [RFC PATCH 00/20] x86/intel_rdt: Start abstraction for a second arch

Hi Fenghua,

On 27/08/18 15:22, Fenghua Yu wrote:
> On Fri, Aug 24, 2018 at 11:44:59AM +0100, James Morse wrote:
>> ARM have some upcoming CPU features that are similar to Intel RDT. Resctrl
>> is the defacto ABI for this sort of thing, but it lives under arch/x86.
>>
>> To get existing software working, we need to make resctrl work with arm64.
>> This series is the first chunk of that. The aim is to move the filesystem/ABI
>> parts into /fs/resctrl, and implement a second arch backend.
>>
>>
>> What are the ARM features?
>> Future ARM SoCs may have a feature called MPAM: Memory Partitioning and
>> Monitoring. This is an umbrella term like RDT, and covers a range of controls
>> (like CAT) and monitors (like MBM, CMT).
>
> Please send a link to the MPAM spec.

I'm afraid there isn't a public spec yet, hopefully September/October.


>> This series is almost all about CDP. MPAM has equivalent functionality, but
>> it doesn't need enabling, and doesn't affect the available closids. (I'll
>> try and use Intel terms). MPAM expects the equivalent to IA32_PRQ_MSR to
>> be configured with an Instruction closid and a Data closid. These are the
>> same for no-CDP, and different otherwise. There is no need for them to be
>> adjacent.
>>
>> To avoid emulating CDP in arm64's arch code, this series moves all the ABI
>> parts of the CDP behaviour, (half the closid-space, each having two
>> configurations) into the filesystem parts of resctrl. These will eventually
>> be moved to /fs/.
>
> Do you have the patches that move code to /fs/resctrl?

Not currently in a usable state; I haven't successfully rebased this stuff
over the mba_sc additions yet.

My intention was to make any abstracting changes where the code is now, then
move the fs-specific parts in one go when it's done. This is so the work can
be broken up into manageable chunks.


>> MPAMs control and monitor configuration is all memory mapped, the base
>> addresses are discovered via firmware tables, so we won't have a table of
>> possible resources that just need alloc_enabling.
>>
>> Is this it? No... there are another two series of a similar size that
>> abstract the MBM/CMT overflow threads and avoid 'fs' code accessing things
>> that have moved into the 'hw' arch specific struct.
>>
>>
>> I'm after feedback on the general approach taken here, bugs, as there are
>> certainly subtleties I've missed, and any strong-opinions on what should be
>> arch-specific, and what shouldn't.
>>
>> This series is based on v4.18, and can be retrieved from:
>> git://linux-arm.org/linux-jm.git -b mpam/resctrl_rework/rfc_1
>
> Could you please publish the MPAM patches as well? Then we can have a better
> idea of ARM's specific code. This patch set only has the Intel RDT part.

You want to see it all at once (great!). I'm not quite ready with all this
yet, so it will be a while. I assumed 'all at once' would be too much to ask
of reviewers, hence this attempt to break it into small chunks and post it
over a longer period of time.


Thanks,

James

2018-09-06 19:19:38

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [RFC PATCH 00/20] x86/intel_rdt: Start abstraction for a second arch

On Fri, 31 Aug 2018, James Morse wrote:
> You want to see it all at once (great!). I'm not quite ready with all this yet,
> so it will be a while. I assumed 'all at once' would be to much to ask from
> reviewers, hence this attempt to break it into small chunks and post it over a
> longer period of time.

And that's perfectly fine. We know where you are heading, and we also
don't want duplication of that stuff. So separating it in place, which
also makes further modification follow that separate model, is the
right thing to do.

That said, can the RDT people @Intel please look at that separation as a
piece of work in its own right? Moving the separated part over to fs/xxx
is a no-brainer after that.

I skimmed through the pile and couldn't spot anything that made me
immediately barf, but my capacity doesn't allow me to do a thorough
review.

Thanks,

tglx

2018-11-27 12:37:14

by Yury Norov

[permalink] [raw]
Subject: Re: [RFC PATCH 00/20] x86/intel_rdt: Start abstraction for a second arch

Hi James,

On Fri, Aug 24, 2018 at 11:44:59AM +0100, James Morse wrote:
> Hi folks,
>
> ARM have some upcoming CPU features that are similar to Intel RDT. Resctrl
> is the defacto ABI for this sort of thing, but it lives under arch/x86.
>
> To get existing software working, we need to make resctrl work with arm64.
> This series is the first chunk of that. The aim is to move the filesystem/ABI
> parts into /fs/resctrl, and implement a second arch backend.
>
>
> What are the ARM features?
> Future ARM SoCs may have a feature called MPAM: Memory Partitioning and
> Monitoring. This is an umbrella term like RDT, and covers a range of controls
> (like CAT) and monitors (like MBM, CMT).
>
> This series is almost all about CDP. MPAM has equivalent functionality, but
> it doesn't need enabling, and doesn't affect the available closids. (I'll
> try and use Intel terms). MPAM expects the equivalent to IA32_PRQ_MSR to
> be configured with an Instruction closid and a Data closid. These are the
> same for no-CDP, and different otherwise. There is no need for them to be
> adjacent.
>
> To avoid emulating CDP in arm64's arch code, this series moves all the ABI
> parts of the CDP behaviour, (half the closid-space, each having two
> configurations) into the filesystem parts of resctrl. These will eventually
> be moved to /fs/.
>
> MPAMs control and monitor configuration is all memory mapped, the base
> addresses are discovered via firmware tables, so we won't have a table of
> possible resources that just need alloc_enabling.
>
> Is this it? No... there are another two series of a similar size that
> abstract the MBM/CMT overflow threads and avoid 'fs' code accessing things
> that have moved into the 'hw' arch specific struct.
>
>
> I'm after feedback on the general approach taken here, bugs, as there are
> certainly subtleties I've missed, and any strong-opinions on what should be
> arch-specific, and what shouldn't.
>
> This series is based on v4.18, and can be retrieved from:
> git://linux-arm.org/linux-jm.git -b mpam/resctrl_rework/rfc_1

Thanks a lot for this work on cache allocation.

We are very interested in enabling CAT on Cavium / Marvell devices.
Could you please share the other two series you mentioned above?

Do you have a working ARM64 CAT driver? It would help us a lot in our
experiments.

Thanks in advance,
Yury

2018-11-30 19:24:44

by James Morse

[permalink] [raw]
Subject: Re: [RFC PATCH 00/20] x86/intel_rdt: Start abstraction for a second arch

Hi Yury,

On 27/11/2018 12:33, Yury Norov wrote:
> On Fri, Aug 24, 2018 at 11:44:59AM +0100, James Morse wrote:
>> ARM have some upcoming CPU features that are similar to Intel RDT. Resctrl
>> is the defacto ABI for this sort of thing, but it lives under arch/x86.
>>
>> To get existing software working, we need to make resctrl work with arm64.
>> This series is the first chunk of that. The aim is to move the filesystem/ABI
>> parts into /fs/resctrl, and implement a second arch backend.
>>
>>
>> What are the ARM features?
>> Future ARM SoCs may have a feature called MPAM: Memory Partitioning and
>> Monitoring. This is an umbrella term like RDT, and covers a range of controls
>> (like CAT) and monitors (like MBM, CMT).

>> This series is based on v4.18, and can be retrieved from:
>> git://linux-arm.org/linux-jm.git -b mpam/resctrl_rework/rfc_1
>
> Thank you a lot for this work on cache allocation.
>
> We are very interested in enabling CAT on Cavium / Marvell devices.
> Could you please share another two series you mentioned above?

I'm working on assembling the whole thing, as Fenghua asked; there should be
a tree showing the shape of it all soon.


> Do you have a working ARM64 CAT driver? It will much help us in our
> experimenting.

For moving the arch-independent parts to /fs/, I'm afraid it's all-or-nothing.
Abstracting just the 'CAT' parts would generate much more churn for both
architectures.


Thanks,

James