2022-08-01 21:15:58

by Moger, Babu

Subject: [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix

New AMD processors support the following QoS features.
1. Slow Memory Bandwidth Configuration
With this feature, the QOS enforcement policies can be applied
to the external slow memory connected to the host. QOS enforcement
is accomplished by assigning a Class Of Service (COS) to a processor
and specifying allocations or limits for that COS for each resource
to be allocated.

2. Bandwidth Monitoring Event Configuration (BMEC)
The bandwidth monitoring events mbm_total_event and mbm_local_event
are set to count all the total and local reads/writes respectively.
With the introduction of slow memory, the two counters are not enough
to count all the different types of memory events. With BMEC, users
can configure mbm_total_event and mbm_local_event to count specific
types of events.

The following event bits are supported:
Bits    Description
6       Dirty Victims from the QOS domain to all types of memory
5       Reads to slow memory in the non-local NUMA domain
4       Reads to slow memory in the local NUMA domain
3       Non-temporal writes to non-local NUMA domain
2       Non-temporal writes to local NUMA domain
1       Reads to memory in the non-local NUMA domain
0       Reads to memory in the local NUMA domain
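
The bit positions in the table above can be written down as masks. The following is only a sketch: the macro and function names are hypothetical and are not taken from this series, and the validity check only rejects bits outside the seven listed ones.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions follow the table above; all names here are illustrative. */
#define BMEC_READS_LOCAL_MEM        (1u << 0)
#define BMEC_READS_REMOTE_MEM       (1u << 1)
#define BMEC_NT_WRITES_LOCAL        (1u << 2)
#define BMEC_NT_WRITES_REMOTE       (1u << 3)
#define BMEC_READS_LOCAL_SLOW_MEM   (1u << 4)
#define BMEC_READS_REMOTE_SLOW_MEM  (1u << 5)
#define BMEC_DIRTY_VICTIMS          (1u << 6)

/* Only bits 0..6 are described in the table. */
#define BMEC_VALID_MASK             0x7fu

/* Returns 1 if cfg uses only the seven documented bits, 0 otherwise. */
static int bmec_config_is_valid(uint32_t cfg)
{
	return (cfg & ~BMEC_VALID_MASK) == 0;
}

/* All events that target the local NUMA domain (bits 0, 2 and 4). */
static uint32_t bmec_local_events(void)
{
	return BMEC_READS_LOCAL_MEM | BMEC_NT_WRITES_LOCAL |
	       BMEC_READS_LOCAL_SLOW_MEM;
}

/* Every event listed in the table. */
static uint32_t bmec_all_events(void)
{
	return BMEC_VALID_MASK;
}
```

A configuration value is then the bitwise OR of the events of interest, for example BMEC_READS_LOCAL_MEM | BMEC_READS_REMOTE_MEM to count reads only.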

This series adds support for these features.

The feature description is available in the specification, "AMD64 Technology Platform Quality
of Service Extensions, Publication # 56375, Revision: 1.03, Issue Date: February 2022".

Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

Also added a bugfix submitted by Stephane Eranian.

---
v2:
a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
b. Added the patch to fix CBM issue on AMD. This was originally discussed at
https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/

v1:
https://lore.kernel.org/lkml/165757543252.416408.13547339307237713464.stgit@bmoger-ubuntu/

Babu Moger (10):
x86/resctrl: Fix min_cbm_bits for AMD
x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
x86/resctrl: Detect and configure Slow Memory Bandwidth allocation
x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
x86/resctrl: Introduce mon_configurable to detect Bandwidth Monitoring Event Configuration
x86/resctrl: Add sysfs interface files to read/write event configuration
x86/resctrl: Add the sysfs interface to read the event configuration
x86/resctrl: Add sysfs interface to write the event configuration
Documentation/x86: Update resctrl_ui.rst for new features


Documentation/x86/resctrl.rst | 123 +++++++++++
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/kernel/cpu/resctrl/core.c | 70 ++++++-
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 26 +++
arch/x86/kernel/cpu/resctrl/monitor.c | 16 ++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 237 +++++++++++++++++++++-
arch/x86/kernel/cpu/scattered.c | 2 +
include/linux/resctrl.h | 1 +
9 files changed, 469 insertions(+), 10 deletions(-)

--
Signature



2022-08-01 21:15:58

by Moger, Babu

Subject: [PATCH v2 04/10] x86/resctrl: Detect and configure Slow Memory Bandwidth allocation

The QoS slow memory configuration details are available via
CPUID_Fn80000020_EDX_x02. Detect the available details and
initialize the rest to defaults.
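
As a rough illustration of the detection step, the sketch below decodes a raw EDX value the same way the patch does with edx.split.cos_max. It assumes COS_MAX occupies the low 16 bits of EDX, matching the existing union cpuid_0x10_x_edx in the kernel; the function name is hypothetical, and the raw value would come from CPUID leaf 0x80000020, subleaf 2.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: COS_MAX is the low 16 bits of EDX, as in cpuid_0x10_x_edx. */
#define SMBA_COS_MAX_MASK 0xffffu

/*
 * Decode the number of CLOSIDs from the raw EDX value returned by
 * CPUID Fn8000_0020, subleaf 2. The hardware reports the highest COS
 * number, so the count is one more than that.
 */
static uint32_t smba_num_closid(uint32_t edx)
{
	return (edx & SMBA_COS_MAX_MASK) + 1;
}
```

For example, an EDX value of 0xf would correspond to 16 classes of service.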

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 50 +++++++++++++++++++++++++++++
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 +++--
4 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6c38427b71a2..36ad97ab7342 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -253,6 +253,37 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
return true;
}

+static bool __rdt_get_s_mem_config_amd(struct rdt_resource *r)
+{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+ union cpuid_0x10_3_eax eax;
+ union cpuid_0x10_x_edx edx;
+ u32 ebx, ecx;
+
+ cpuid_count(0x80000020, 2, &eax.full, &ebx, &ecx, &edx.full);
+ hw_res->num_closid = edx.split.cos_max + 1;
+ r->default_ctrl = MAX_MBA_BW_AMD;
+
+ /* AMD does not use delay */
+ r->membw.delay_linear = false;
+ r->membw.arch_needs_linear = false;
+
+ /*
+ * AMD does not use memory delay throttle model to control
+ * the allocation like Intel does.
+ */
+ r->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
+ r->membw.min_bw = 0;
+ r->membw.bw_gran = 1;
+ /* Max value is 2048, Data width should be 4 in decimal */
+ r->data_width = 4;
+
+ r->alloc_capable = true;
+ r->alloc_enabled = true;
+
+ return true;
+}
+
static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
@@ -787,6 +818,19 @@ static __init bool get_mem_config(void)
return false;
}

+static __init bool get_s_mem_config(void)
+{
+ struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_SMBA];
+
+ if (!rdt_cpu_has(X86_FEATURE_SMBA))
+ return false;
+
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+ return __rdt_get_s_mem_config_amd(&hw_res->r_resctrl);
+
+ return false;
+}
+
static __init bool get_rdt_alloc_resources(void)
{
struct rdt_resource *r;
@@ -817,6 +861,9 @@ static __init bool get_rdt_alloc_resources(void)
if (get_mem_config())
ret = true;

+ if (get_s_mem_config())
+ ret = true;
+
return ret;
}

@@ -908,6 +955,9 @@ static __init void rdt_init_res_defs_amd(void)
} else if (r->rid == RDT_RESOURCE_MBA) {
hw_res->msr_base = MSR_IA32_MBA_BW_BASE;
hw_res->msr_update = mba_wrmsr_amd;
+ } else if (r->rid == RDT_RESOURCE_SMBA) {
+ hw_res->msr_base = MSR_IA32_SMBA_BW_BASE;
+ hw_res->msr_update = mba_wrmsr_amd;
}
}
}
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 87666275eed9..11ec3577db40 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -203,7 +203,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
unsigned long dom_id;

if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
- r->rid == RDT_RESOURCE_MBA) {
+ (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)) {
rdt_last_cmd_puts("Cannot pseudo-lock MBA resource\n");
return -EINVAL;
}
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 24a1dfeb6cb2..c049a274383c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -14,6 +14,7 @@
#define MSR_IA32_L2_CBM_BASE 0xd10
#define MSR_IA32_MBA_THRTL_BASE 0xd50
#define MSR_IA32_MBA_BW_BASE 0xc0000200
+#define MSR_IA32_SMBA_BW_BASE 0xc0000280

#define MSR_IA32_QM_CTR 0x0c8e
#define MSR_IA32_QM_EVTSEL 0x0c8d
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f276aff521e8..fc5286067201 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1218,7 +1218,7 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)

list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- if (r->rid == RDT_RESOURCE_MBA)
+ if (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)
continue;
has_cache = true;
list_for_each_entry(d, &r->domains, list) {
@@ -1399,7 +1399,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
ctrl = resctrl_arch_get_config(r, d,
rdtgrp->closid,
schema->conf_type);
- if (r->rid == RDT_RESOURCE_MBA)
+ if (r->rid == RDT_RESOURCE_MBA ||
+ r->rid == RDT_RESOURCE_SMBA)
size = ctrl;
else
size = rdtgroup_cbm_to_size(r, d, ctrl);
@@ -2807,7 +2808,9 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)

list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- if (r->rid == RDT_RESOURCE_MBA) {
+ if (r->rid == RDT_RESOURCE_MBA ||
+ r->rid == RDT_RESOURCE_SMBA) {
+
rdtgroup_init_mba(r);
} else {
ret = rdtgroup_init_cat(s, rdtgrp->closid);



2022-08-01 21:23:29

by Moger, Babu

Subject: [PATCH v2 06/10] x86/resctrl: Introduce mon_configurable to detect Bandwidth Monitoring Event Configuration

Newer AMD processors support the new feature Bandwidth Monitoring Event
Configuration (BMEC). The events mbm_total_bytes and mbm_local_bytes
are configurable when this feature is present.

Set mon_configurable if the feature is available.
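
From user space, the new file could be read along these lines. This is a sketch under assumptions: it presumes resctrl is mounted at /sys/fs/resctrl and that the file shows up as info/L3_MON/mon_configurable (inferred from the RF_MON_INFO flag in this patch); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed location; depends on where resctrl is mounted. */
#define MON_CONFIGURABLE_PATH "/sys/fs/resctrl/info/L3_MON/mon_configurable"

/* Parse the file contents: returns 0 or 1, or -1 if the text is not 0/1. */
static int parse_mon_configurable(const char *text)
{
	int val;

	assert(text != NULL);
	if (sscanf(text, "%d", &val) != 1)
		return -1;
	return (val == 0 || val == 1) ? val : -1;
}

/* Read and parse the file; returns -1 if it is missing or unreadable. */
static int read_mon_configurable(const char *path)
{
	char line[16];
	FILE *f;
	int ret = -1;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = parse_mon_configurable(line);
	fclose(f);
	return ret;
}
```

Calling read_mon_configurable(MON_CONFIGURABLE_PATH) would return 1 on a system where the events are configurable, and -1 where the file does not exist.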

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/monitor.c | 14 ++++++++++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++++++++++++
include/linux/resctrl.h | 1 +
3 files changed, 32 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index eaf25a234ff5..b9de417dac1c 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -682,6 +682,16 @@ static void l3_mon_evt_init(struct rdt_resource *r)
list_add_tail(&mbm_local_event.list, &r->evt_list);
}

+
+void __rdt_get_mon_l3_config_amd(struct rdt_resource *r)
+{
+ /*
+ * Check if CPU supports the Bandwidth Monitoring Event Configuration
+ */
+ if (boot_cpu_has(X86_FEATURE_BMEC))
+ r->mon_configurable = true;
+}
+
int rdt_get_mon_l3_config(struct rdt_resource *r)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
@@ -714,6 +724,10 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
if (ret)
return ret;

+ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+ __rdt_get_mon_l3_config_amd(r);
+
+
l3_mon_evt_init(r);

r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index fc5286067201..855483b297a8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -995,6 +995,16 @@ static int rdt_num_rmids_show(struct kernfs_open_file *of,
return 0;
}

+static int rdt_mon_configurable_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct rdt_resource *r = of->kn->parent->priv;
+
+ seq_printf(seq, "%d\n", r->mon_configurable);
+
+ return 0;
+}
+
static int rdt_mon_features_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
@@ -1447,6 +1457,13 @@ static struct rftype res_common_files[] = {
.seq_show = rdt_num_rmids_show,
.fflags = RF_MON_INFO,
},
+ {
+ .name = "mon_configurable",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = rdt_mon_configurable_show,
+ .fflags = RF_MON_INFO,
+ },
{
.name = "cbm_mask",
.mode = 0444,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 21deb5212bbd..4ee2b606ac14 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -154,6 +154,7 @@ struct rdt_resource {
bool mon_enabled;
bool alloc_capable;
bool mon_capable;
+ bool mon_configurable;
int num_rmid;
int cache_level;
struct resctrl_cache cache;



2022-08-01 21:24:11

by Moger, Babu

Subject: [PATCH v2 03/10] x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA

Add a new resource type RDT_RESOURCE_SMBA to handle the QoS
enforcement policies on the external slow memory.
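
To illustrate how the resource name "SB" and the format string "%d=%*u" from this patch would shape a schemata line, here is a simplified user-space sketch. The helper name and the sample domain values are hypothetical, and the kernel's own output code may pad the resource name differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same format string as the .format_str field added in this patch. */
#define SMBA_FORMAT_STR "%d=%*u"

/*
 * Build "<name>:<id>=<val>;<id>=<val>..." into buf.
 * Returns the string length, or -1 if buf is too small.
 */
static int format_schema_line(char *buf, size_t len, const char *name,
			      const int *dom_ids, const unsigned int *vals,
			      size_t ndoms, int data_width)
{
	size_t off, i;
	int n;

	assert(buf != NULL && len > 0);

	n = snprintf(buf, len, "%s:", name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	for (i = 0; i < ndoms; i++) {
		if (i > 0) {
			/* Domains are separated by ';' */
			if (off + 1 >= len)
				return -1;
			buf[off++] = ';';
			buf[off] = '\0';
		}
		n = snprintf(buf + off, len - off, SMBA_FORMAT_STR,
			     dom_ids[i], data_width, vals[i]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```

With a data width of 4, two domains holding 2048 and 16 would be rendered as "SB:0=2048;1=  16", since "%*u" right-aligns each value in the given width.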

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++++++++
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
2 files changed, 13 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a5c51a14fbce..6c38427b71a2 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -100,6 +100,18 @@ struct rdt_hw_resource rdt_resources_all[] = {
.fflags = RFTYPE_RES_MB,
},
},
+ [RDT_RESOURCE_SMBA] =
+ {
+ .r_resctrl = {
+ .rid = RDT_RESOURCE_SMBA,
+ .name = "SB",
+ .cache_level = 3,
+ .domains = domain_init(RDT_RESOURCE_SMBA),
+ .parse_ctrlval = parse_bw,
+ .format_str = "%d=%*u",
+ .fflags = RFTYPE_RES_MB,
+ },
+ },
};

/*
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 1d647188a43b..24a1dfeb6cb2 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -418,6 +418,7 @@ enum resctrl_res_level {
RDT_RESOURCE_L3,
RDT_RESOURCE_L2,
RDT_RESOURCE_MBA,
+ RDT_RESOURCE_SMBA,

/* Must be the last */
RDT_NUM_RESOURCES,



2022-08-02 02:41:24

by Bagas Sanjaya

Subject: Re: [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix

On 8/2/22 03:55, Babu Moger wrote:
> v2:
> a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
> b. Added the patch to fix CBM issue on AMD. This was originally discussed
> https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/
>

Shouldn't this series be rebased on tip tree? I think it's odd to base
new feature series on stable tree, since patches on the latter are mostly
bugfixes backported from mainline.

--
An old man doll... just what I always wanted! - Clara

2022-08-02 09:56:51

by Ingo Molnar

Subject: Re: [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix


* Bagas Sanjaya <[email protected]> wrote:

> On 8/2/22 03:55, Babu Moger wrote:
> > v2:
> > a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
> > b. Added the patch to fix CBM issue on AMD. This was originally discussed
> > https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/
> >
>
> Shouldn't this series be rebased on tip tree? I think it's odd to base
> new feature series on stable tree, since patches on the latter are mostly
> bugfixes backported from mainline.

Normally that's true, but AFAICS the patchset applies cleanly to latest
-tip as well, so 'stable == tip' in this regard.

Series looks good to me (without having tested it):

Reviewed-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

2022-08-02 13:50:13

by Moger, Babu

Subject: Re: [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix


On 8/2/22 04:39, Ingo Molnar wrote:
> * Bagas Sanjaya <[email protected]> wrote:
>
>> On 8/2/22 03:55, Babu Moger wrote:
>>> v2:
>>> a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
>>> b. Added the patch to fix CBM issue on AMD. This was originally discussed
>>> https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/
>>>
>> Shouldn't this series be rebased on tip tree? I think it's odd to base
Sure. Will rebase on top of tip in the next revision.
>> new feature series on stable tree, since patches on the latter are mostly
>> bugfixes backported from mainline.
> Normally that's true, but AFAICS the patchset applies cleanly to latest
> -tip as well, so 'stable == tip' in this regard.
>
> Series looks good to me (without having tested it):
>
> Reviewed-by: Ingo Molnar <[email protected]>

Thank you.

Babu Moger

>
> Thanks,
>
> Ingo

--
Thanks
Babu Moger