2022-11-04 20:35:10

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v8 00/13] Support for AMD QoS new features

New AMD processors can now support following QoS features.

1. Slow Memory Bandwidth Allocation (SMBA)
With this feature, the QOS enforcement policies can be applied
to the external slow memory connected to the host. QOS enforcement
is accomplished by assigning a Class Of Service (COS) to a processor
and specifying allocations or limits for that COS for each resource
to be allocated.

Currently, CXL.memory is the only supported "slow" memory device. With
the support of SMBA feature the hardware enables bandwidth allocation
on the slow memory devices.

2. Bandwidth Monitoring Event Configuration (BMEC)
The bandwidth monitoring events mbm_total_event and mbm_local_event
are set to count all the total and local reads/writes respectively.
With the introduction of slow memory, the two counters are not enough
to count all the different types are memory events. With the feature
BMEC, the users have the option to configure mbm_total_event and
mbm_local_event to count the specific type of events.

Following are the bitmaps of events supported.
Bits Description
6 Dirty Victims from the QOS domain to all types of memory
5 Reads to slow memory in the non-local NUMA domain
4 Reads to slow memory in the local NUMA domain
3 Non-temporal writes to non-local NUMA domain
2 Non-temporal writes to local NUMA domain
1 Reads to memory in the non-local NUMA domain
0 Reads to memory in the local NUMA domain

This series adds support for these features.

Feature description is available in the specification, "AMD64 Technology Platform Quality of Service Extensions, Revision: 1.03 Publication # 56375
Revision: 1.03 Issue Date: February 2022".

Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
---
v8:
Changes:
1. Removed init attribute for rdt_cpu_has to make it available for all the files.
2. Updated the change log for mon_features to correct the names of config files.
3. Changed configuration file name from mbm_total_config to mbm_total_bytes_config.
This is more consistant with other changes.
4. Added lock protection while reading/writing the config file.
5. Other few minor text changes. I have been missing few comments in last couple of
revisions. Hope I have addressed all of them this time.

v7:
https://lore.kernel.org/lkml/166604543832.5345.9696970469830919982.stgit@bmoger-ubuntu/
Changes:
Not much of a change. Missed one comment from Reinette from v5. Corrected it now.
Few format corrections from Sanjaya.

v6:
https://lore.kernel.org/lkml/166543345606.23830.3120625408601531368.stgit@bmoger-ubuntu/
Summary of changes:
1. Rebased on top of lastest tip tree. Fixed few minor conflicts.
2. Fixed format issue with scattered.c.
3. Removed config_name from the structure mon_evt. It is not required.
4. The read/write format for mbm_total_config and mbm_local_config will be same
as schemata format "id0=val0;id1=val1;...". This is comment from Fenghua.
5. Added more comments MSR_IA32_EVT_CFG_BASE writng.
5. Few text changes in resctrl.rst

v5:
https://lore.kernel.org/lkml/166431016617.373387.1968875281081252467.stgit@bmoger-ubuntu/
Summary of changes.
1. Split the series into two. The first two patches are bug fixes. So, sent them separate.
2. The config files mbm_total_config and mbm_local_config are now under
/sys/fs/resctrl/info/L3_MON/. Removed these config files from mon groups.
3. Ran "checkpatch --strict --codespell" on all the patches. Looks good with few known exceptions.
4. Few minor text changes in resctrl.rst file.

v4:
https://lore.kernel.org/lkml/166257348081.1043018.11227924488792315932.stgit@bmoger-ubuntu/
Got numerios of comments from Reinette Chatre. Addressed most of them.
Summary of changes.
1. Removed mon_configurable under /sys/fs/resctrl/info/L3_MON/.
2. Updated mon_features texts if the BMEC is supported.
3. Added more explanation about the slow memory support.
4. Replaced smp_call_function_many with on_each_cpu_mask call.
5. Removed arch_has_empty_bitmaps
6. Few other text changes.
7. Removed Reviewed-by if the patch is modified.
8. Rebased the patches to latest tip.

v3:
https://lore.kernel.org/lkml/166117559756.6695.16047463526634290701.stgit@bmoger-ubuntu/
a. Rebased the patches to latest tip. Resolved some conflicts.
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
b. Taken care of feedback from Bagas Sanjaya.
c. Added Reviewed by from Mingo.
Note: I am still looking for comments from Reinette or Fenghua.

v2:
https://lore.kernel.org/lkml/165938717220.724959.10931629283087443782.stgit@bmoger-ubuntu/
a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
b. Added the patch to fix CBM issue on AMD. This was originally discussed
https://lore.kernel.org/lkml/[email protected]/

v1:
https://lore.kernel.org/lkml/165757543252.416408.13547339307237713464.stgit@bmoger-ubuntu/

Babu Moger (13):
x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
x86/resctrl: Include new features in command line options
x86/resctrl: Detect and configure Slow Memory Bandwidth Allocation
x86/resctrl: Remove the init attribute for rdt_cpu_has()
x86/resctrl: Introduce data structure to support monitor configuration
x86/resctrl: Add sysfs interface to read mbm_total_bytes_config
x86/resctrl: Add sysfs interface to read mbm_local_bytes_config
x86/resctrl: Add sysfs interface to write mbm_total_bytes_config
x86/resctrl: Add sysfs interface to write mbm_local_bytes_config
x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()
Documentation/x86: Update resctrl.rst for new features


.../admin-guide/kernel-parameters.txt | 2 +-
Documentation/x86/resctrl.rst | 139 +++++++-
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/resctrl/core.c | 56 +++-
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 33 ++
arch/x86/kernel/cpu/resctrl/monitor.c | 7 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 304 ++++++++++++++++--
arch/x86/kernel/cpu/scattered.c | 2 +
10 files changed, 515 insertions(+), 33 deletions(-)

--



2022-11-04 20:39:07

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v8 06/13] x86/resctrl: Remove the init attribute for rdt_cpu_has()

The monitor code in resctrl/monitor.c needs to call rdt_cpu_has() to
detect the monitor related features. It has the init attribute and
cannot be called in non-init routines. Remove the init attribute and
make it available for all the resctrl files.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 4 ++--
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6571d08e2b0d..b33a541f5c80 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -688,7 +688,7 @@ struct rdt_options {
bool force_off, force_on;
};

-static struct rdt_options rdt_options[] __initdata = {
+static struct rdt_options rdt_options[] = {
RDT_OPT(RDT_FLAG_CMT, "cmt", X86_FEATURE_CQM_OCCUP_LLC),
RDT_OPT(RDT_FLAG_MBM_TOTAL, "mbmtotal", X86_FEATURE_CQM_MBM_TOTAL),
RDT_OPT(RDT_FLAG_MBM_LOCAL, "mbmlocal", X86_FEATURE_CQM_MBM_LOCAL),
@@ -728,7 +728,7 @@ static int __init set_rdt_options(char *str)
}
__setup("rdt", set_rdt_options);

-static bool __init rdt_cpu_has(int flag)
+bool rdt_cpu_has(int flag)
{
bool ret = boot_cpu_has(flag);
struct rdt_options *o;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 16e3c6e03c79..e30e8b23f6b5 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -499,6 +499,7 @@ int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name,
umode_t mask);
struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id,
struct list_head **pos);
+bool rdt_cpu_has(int flag);
ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off);
int rdtgroup_schemata_show(struct kernfs_open_file *of,



2022-11-04 20:39:07

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v8 08/13] x86/resctrl: Add sysfs interface to read mbm_total_bytes_config

The current event configuration can be viewed by the user by reading
the configuration file /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config.
The event configuration settings are domain specific and will affect all
the CPUs in the domain.

Following are the types of events supported:
==== ===========================================================
Bits Description
==== ===========================================================
6 Dirty Victims from the QOS domain to all types of memory
5 Reads to slow memory in the non-local NUMA domain
4 Reads to slow memory in the local NUMA domain
3 Non-temporal writes to non-local NUMA domain
2 Non-temporal writes to local NUMA domain
1 Reads to memory in the non-local NUMA domain
0 Reads to memory in the local NUMA domain
==== ===========================================================

By default, the mbm_total_bytes_config is set to 0x7f to count all the
event types.

For example:
$cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
0=0x7f;1=0x7f;2=0x7f;3=0x7f

In this case, the event mbm_total_bytes is currently configured
with 0x7f on domains 0 to 3.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/internal.h | 28 ++++++++++
arch/x86/kernel/cpu/resctrl/monitor.c | 1
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 87 ++++++++++++++++++++++++++++++++
3 files changed, 116 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 5459b5022760..c74285fd0f6e 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -15,6 +15,7 @@
#define MSR_IA32_MBA_THRTL_BASE 0xd50
#define MSR_IA32_MBA_BW_BASE 0xc0000200
#define MSR_IA32_SMBA_BW_BASE 0xc0000280
+#define MSR_IA32_EVT_CFG_BASE 0xc0000400

#define MSR_IA32_QM_CTR 0x0c8e
#define MSR_IA32_QM_EVTSEL 0x0c8d
@@ -41,6 +42,32 @@
*/
#define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)

+/* Reads to Local DRAM Memory */
+#define READS_TO_LOCAL_MEM BIT(0)
+
+/* Reads to Remote DRAM Memory */
+#define READS_TO_REMOTE_MEM BIT(1)
+
+/* Non-Temporal Writes to Local Memory */
+#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
+
+/* Non-Temporal Writes to Remote Memory */
+#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
+
+/* Reads to Local Memory the system identifies as "Slow Memory" */
+#define READS_TO_LOCAL_S_MEM BIT(4)
+
+/* Reads to Remote Memory the system identifies as "Slow Memory" */
+#define READS_TO_REMOTE_S_MEM BIT(5)
+
+/* Dirty Victims to All Types of Memory */
+#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
+
+/* Max event bits supported */
+#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
+
+/* Max configurable events */
+#define MAX_CONFIG_EVENTS 2

struct rdt_fs_context {
struct kernfs_fs_context kfc;
@@ -542,5 +569,6 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
void __check_limbo(struct rdt_domain *d, bool force_free);
void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
void __init thread_throttle_mode_init(void);
+void mbm_config_rftype_init(void);

#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 06c2dc980855..a188dacab6c8 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -787,6 +787,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
if (mon_configurable) {
mbm_total_event.configurable = true;
mbm_local_event.configurable = true;
+ mbm_config_rftype_init();
}

l3_mon_evt_init(r);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 8342feb54a7f..dea58b6b4aa4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1423,6 +1423,78 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
return ret;
}

+struct mon_config_info {
+ u32 evtid;
+ u32 mon_config;
+};
+
+/**
+ * mon_event_config_index_get - get the index for the configurable event
+ * @evtid: event id.
+ *
+ * Return: 0 for evtid == QOS_L3_MBM_TOTAL_EVENT_ID
+ * 1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID
+ * > 1 otherwise
+ */
+static inline unsigned int mon_event_config_index_get(u32 evtid)
+{
+ return evtid - QOS_L3_MBM_TOTAL_EVENT_ID;
+}
+
+static void mon_event_config_read(void *info)
+{
+ struct mon_config_info *mon_info = info;
+ u32 h, index;
+
+ index = mon_event_config_index_get(mon_info->evtid);
+ if (index >= MAX_CONFIG_EVENTS) {
+ pr_warn_once("Invalid event id %d\n", mon_info->evtid);
+ return;
+ }
+ rdmsr(MSR_IA32_EVT_CFG_BASE + index, mon_info->mon_config, h);
+}
+
+static void mondata_config_read(struct rdt_domain *d, struct mon_config_info *mon_info)
+{
+ smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1);
+}
+
+static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
+{
+ struct mon_config_info mon_info = {0};
+ struct rdt_domain *dom;
+ bool sep = false;
+
+ mutex_lock(&rdtgroup_mutex);
+
+ list_for_each_entry(dom, &r->domains, list) {
+ if (sep)
+ seq_puts(s, ";");
+
+ mon_info.evtid = evtid;
+ mondata_config_read(dom, &mon_info);
+
+ seq_printf(s, "%d=0x%02lx", dom->id,
+ mon_info.mon_config & MAX_EVT_CONFIG_BITS);
+ sep = true;
+ }
+ seq_puts(s, "\n");
+
+ mutex_unlock(&rdtgroup_mutex);
+
+ return 0;
+}
+
+static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct rdt_resource *r = of->kn->parent->priv;
+
+ mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
+
+ return 0;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -1521,6 +1593,12 @@ static struct rftype res_common_files[] = {
.seq_show = max_threshold_occ_show,
.fflags = RF_MON_INFO | RFTYPE_RES_CACHE,
},
+ {
+ .name = "mbm_total_bytes_config",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = mbm_total_bytes_config_show,
+ },
{
.name = "cpus",
.mode = 0644,
@@ -1627,6 +1705,15 @@ void __init thread_throttle_mode_init(void)
rft->fflags = RF_CTRL_INFO | RFTYPE_RES_MB;
}

+void mbm_config_rftype_init(void)
+{
+ struct rftype *rft;
+
+ rft = rdtgroup_get_rftype_by_name("mbm_total_bytes_config");
+ if (rft)
+ rft->fflags = RF_MON_INFO | RFTYPE_RES_CACHE;
+}
+
/**
* rdtgroup_kn_mode_restrict - Restrict user access to named resctrl file
* @r: The resource group with which the file is associated.



2022-11-04 20:39:47

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v8 13/13] Documentation/x86: Update resctrl.rst for new features

Update the documentation for the new features:
1. Slow Memory Bandwidth allocation (SMBA).
With this feature, the QOS enforcement policies can be applied
to the external slow memory connected to the host. QOS enforcement
is accomplished by assigning a Class Of Service (COS) to a processor
and specifying allocations or limits for that COS for each resource
to be allocated.

2. Bandwidth Monitoring Event Configuration (BMEC).
The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
are set to count all the total and local reads/writes respectively.
With the introduction of slow memory, the two counters are not
enough to count all the different types of memory events. With the
feature BMEC, the users have the option to configure mbm_total_bytes
and mbm_local_bytes to count the specific type of events.

Also add configuration instructions with examples.

Signed-off-by: Babu Moger <[email protected]>
Reviewed-by: Bagas Sanjaya <[email protected]>
---
Documentation/x86/resctrl.rst | 139 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 137 insertions(+), 2 deletions(-)

diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 71a531061e4e..12adba98afc5 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -17,14 +17,16 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
flag bits:

-============================================= ================================
+=============================================== ================================
RDT (Resource Director Technology) Allocation "rdt_a"
CAT (Cache Allocation Technology) "cat_l3", "cat_l2"
CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2"
CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc"
MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local"
MBA (Memory Bandwidth Allocation) "mba"
-============================================= ================================
+SMBA (Slow Memory Bandwidth Allocation) "smba"
+BMEC (Bandwidth Monitoring Event Configuration) "bmec"
+=============================================== ================================

To use the feature mount the file system::

@@ -161,6 +163,79 @@ with the following files:
"mon_features":
Lists the monitoring events if
monitoring is enabled for the resource.
+ Example::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
+ llc_occupancy
+ mbm_total_bytes
+ mbm_local_bytes
+
+ If the system supports Bandwidth Monitoring Event
+ Configuration (BMEC), then the bandwidth events will
+ be configurable. The output will be::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
+ llc_occupancy
+ mbm_total_bytes
+ mbm_total_bytes_config
+ mbm_local_bytes
+ mbm_local_bytes_config
+
+"mbm_total_bytes_config", "mbm_local_bytes_config":
+ These files contain the current event configuration for the events
+ mbm_total_bytes and mbm_local_bytes, respectively, when the
+ Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
+ The event configuration settings are domain specific and will affect
+ all the CPUs in the domain.
+
+ Following are the types of events supported:
+
+ ==== ========================================================
+ Bits Description
+ ==== ========================================================
+ 6 Dirty Victims from the QOS domain to all types of memory
+ 5 Reads to slow memory in the non-local NUMA domain
+ 4 Reads to slow memory in the local NUMA domain
+ 3 Non-temporal writes to non-local NUMA domain
+ 2 Non-temporal writes to local NUMA domain
+ 1 Reads to memory in the non-local NUMA domain
+ 0 Reads to memory in the local NUMA domain
+ ==== ========================================================
+
+ By default, the mbm_total_bytes configuration is set to 0x7f to count
+ all the event types and the mbm_local_bytes configuration is set to
+ 0x15 to count all the local memory events.
+
+ Examples:
+
+ * To view the current configuration::
+ ::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+ 0=0x7f;1=0x7f;2=0x7f;3=0x7f
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+ 0=0x15;1=0x15;3=0x15;4=0x15
+
+ * To change the mbm_total_bytes to count only reads on domain 0,
+ the bits 0, 1, 4 and 5 needs to be set, which is 110011b in binary
+ (in hexadecimal 0x33):
+ ::
+
+ # echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+ 0=0x33;1=0x7f;2=0x7f;3=0x7f
+
+ * To change the mbm_local_bytes to count all the slow memory reads on
+ domain 0 and 1, the bits 4 and 5 needs to be set, which is 110000b
+ in binary (in hexadecimal 0x30):
+ ::
+
+ # echo "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+ 0=0x30;1=0x30;3=0x15;4=0x15

"max_threshold_occupancy":
Read/write file provides the largest value (in
@@ -464,6 +539,26 @@ Memory bandwidth domain is L3 cache.

MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...

+Slow Memory Bandwidth Allocation (SMBA)
+---------------------------------------
+AMD hardware support Slow Memory Bandwidth Allocation (SMBA) feature.
+Currently, CXL.memory is the only supported "slow" memory device.
+With the support of SMBA, the hardware enables bandwidth allocation
+on the slow memory devices. If there are multiple such devices in the
+system, the throttling logic groups all the slow sources together
+and applies the limit on them as a whole.
+
+The presence of SMBA (with CXL.memory) is independent of slow memory
+devices presence. If there is no such devices on the system, then
+setting the configuring SMBA will have no impact on the performance
+of the system.
+
+The bandwidth domain for slow memory is L3 cache. Its schemata file
+is formatted as:
+::
+
+ SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
+
Reading/writing the schemata file
---------------------------------
Reading the schemata file will show the state of all resources
@@ -479,6 +574,46 @@ which you wish to change. E.g.
L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff

+Reading/writing the schemata file (on AMD systems)
+--------------------------------------------------
+Reading the schemata file will show the current bandwidth limit on all
+domains. The allocated resources are in multiples of one eighth GB/s.
+When writing to the file, you need to specify what cache id you wish to
+configure the bandwidth limit.
+
+For example, to allocate 2GB/s limit on the first cache id:
+
+::
+
+ # cat schemata
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "MB:1=16" > schemata
+ # cat schemata
+ MB:0=2048;1= 16;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+Reading/writing the schemata file (on AMD systems) with SMBA feature
+--------------------------------------------------------------------
+Reading and writing the schemata file is the same as without SMBA in
+above section.
+
+For example, to allocate 8GB/s limit on the first cache id:
+
+::
+
+ # cat schemata
+ SMBA:0=2048;1=2048;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "SMBA:1=64" > schemata
+ # cat schemata
+ SMBA:0=2048;1= 64;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
Cache Pseudo-Locking
====================
CAT enables a user to specify the amount of cache space that an



2022-11-04 20:40:11

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v8 12/13] x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()

The call on_each_cpu_mask() can run the function on each CPU specified
by cpumask, which may include the local processor. So, replace the call
smp_call_function_many() with on_each_cpu_mask() to simplify the code.

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 29 ++++++++---------------------
1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f37ecc16b34b..6b222f8e58ae 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -325,12 +325,7 @@ static void update_cpu_closid_rmid(void *info)
static void
update_closid_rmid(const struct cpumask *cpu_mask, struct rdtgroup *r)
{
- int cpu = get_cpu();
-
- if (cpumask_test_cpu(cpu, cpu_mask))
- update_cpu_closid_rmid(r);
- smp_call_function_many(cpu_mask, update_cpu_closid_rmid, r, 1);
- put_cpu();
+ on_each_cpu_mask(cpu_mask, update_cpu_closid_rmid, r, 1);
}

static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
@@ -2130,13 +2125,9 @@ static int set_cache_qos_cfg(int level, bool enable)
/* Pick one CPU from each domain instance to update MSR */
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
}
- cpu = get_cpu();
- /* Update QOS_CFG MSR on this cpu if it's in cpu_mask. */
- if (cpumask_test_cpu(cpu, cpu_mask))
- update(&enable);
- /* Update QOS_CFG MSR on all other cpus in cpu_mask. */
- smp_call_function_many(cpu_mask, update, &enable, 1);
- put_cpu();
+
+ /* Update QOS_CFG MSR on all the CPUs in cpu_mask */
+ on_each_cpu_mask(cpu_mask, update, &enable, 1);

free_cpumask_var(cpu_mask);

@@ -2613,7 +2604,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
- int i, cpu;
+ int i;

if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
@@ -2634,13 +2625,9 @@ static int reset_all_ctrls(struct rdt_resource *r)
for (i = 0; i < hw_res->num_closid; i++)
hw_dom->ctrl_val[i] = r->default_ctrl;
}
- cpu = get_cpu();
- /* Update CBM on this cpu if it's in cpu_mask. */
- if (cpumask_test_cpu(cpu, cpu_mask))
- rdt_ctrl_update(&msr_param);
- /* Update CBM on all other cpus in cpu_mask. */
- smp_call_function_many(cpu_mask, rdt_ctrl_update, &msr_param, 1);
- put_cpu();
+
+ /* Update CBM on all the CPUs in cpu_mask */
+ on_each_cpu_mask(cpu_mask, rdt_ctrl_update, &msr_param, 1);

free_cpumask_var(cpu_mask);




2022-11-04 20:51:47

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v8 07/13] x86/resctrl: Introduce data structure to support monitor configuration

Add a new field in mon_evt to support Bandwidth Monitoring Event
Configuration(BMEC) and also update the "mon_features" display.

The sysfs file "mon_features" will display the monitor configuration
if supported.

Before the change.
$cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_local_bytes

After the change when BMEC is supported.
$cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_total_bytes_config
mbm_local_bytes
mbm_local_bytes_config

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/internal.h | 2 ++
arch/x86/kernel/cpu/resctrl/monitor.c | 6 ++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++++-
3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e30e8b23f6b5..5459b5022760 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -63,11 +63,13 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
* struct mon_evt - Entry in the event list of a resource
* @evtid: event id
* @name: name of the event
+ * @configurable: true if the event is configurable
* @list: entry in &rdt_resource->evt_list
*/
struct mon_evt {
enum resctrl_event_id evtid;
char *name;
+ bool configurable;
struct list_head list;
};

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index efe0c30d3a12..06c2dc980855 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -750,6 +750,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+ bool mon_configurable = rdt_cpu_has(X86_FEATURE_BMEC);
unsigned int threshold;
int ret;

@@ -783,6 +784,11 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
if (ret)
return ret;

+ if (mon_configurable) {
+ mbm_total_event.configurable = true;
+ mbm_local_event.configurable = true;
+ }
+
l3_mon_evt_init(r);

r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 8a3dafc0dbf7..8342feb54a7f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1001,8 +1001,11 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
struct rdt_resource *r = of->kn->parent->priv;
struct mon_evt *mevt;

- list_for_each_entry(mevt, &r->evt_list, list)
+ list_for_each_entry(mevt, &r->evt_list, list) {
seq_printf(seq, "%s\n", mevt->name);
+ if (mevt->configurable)
+ seq_printf(seq, "%s_config\n", mevt->name);
+ }

return 0;
}



2022-11-04 20:53:31

by Moger, Babu

[permalink] [raw]
Subject: [PATCH v8 11/13] x86/resctrl: Add sysfs interface to write mbm_local_bytes_config

The current event configuration for mbm_local_bytes can be changed by
the user by writing to the configuration file
/sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config.

The event configuration settings are domain specific and will affect all
the CPUs in the domain.

Following are the types of events supported:
==== ===========================================================
Bits Description
==== ===========================================================
6 Dirty Victims from the QOS domain to all types of memory
5 Reads to slow memory in the non-local NUMA domain
4 Reads to slow memory in the local NUMA domain
3 Non-temporal writes to non-local NUMA domain
2 Non-temporal writes to local NUMA domain
1 Reads to memory in the non-local NUMA domain
0 Reads to memory in the local NUMA domain
==== ===========================================================

For example:
To change the mbm_local_bytes_config to count all the non-temporal writes
on domain 0, the bits 2 and 3 needs to be set which is 1100b (in hex 0xc).
Run the command.
$echo 0=0xc > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config

To change the mbm_local_bytes to count only reads to local NUMA domain 1,
the bit 0 needs to be set which 1b (in hex 0x1). Run the command.
$echo 1=0x1 > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 0cdccb69386e..f37ecc16b34b 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1632,6 +1632,32 @@ static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}

+static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes,
+ loff_t off)
+{
+ struct rdt_resource *r = of->kn->parent->priv;
+ int ret;
+
+ /* Valid input requires a trailing newline */
+ if (nbytes == 0 || buf[nbytes - 1] != '\n')
+ return -EINVAL;
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+
+ buf[nbytes - 1] = '\0';
+
+ ret = mon_config_parse(r, buf, QOS_L3_MBM_LOCAL_EVENT_ID);
+
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+
+ return ret ?: nbytes;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -1739,9 +1765,10 @@ static struct rftype res_common_files[] = {
},
{
.name = "mbm_local_bytes_config",
- .mode = 0444,
+ .mode = 0644,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = mbm_local_bytes_config_show,
+ .write = mbm_local_bytes_config_write,
},
{
.name = "cpus",



2022-11-15 21:41:57

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v8 00/13] Support for AMD QoS new features

Hi Reinette and Others,

I was planning to refresh the series later this week. I have one comment
from Peter Newman.  Let me know if you have any comments.

Thanks

Babu


On 11/4/22 14:59, Babu Moger wrote:
> New AMD processors can now support following QoS features.
>
> 1. Slow Memory Bandwidth Allocation (SMBA)
> With this feature, the QOS enforcement policies can be applied
> to the external slow memory connected to the host. QOS enforcement
> is accomplished by assigning a Class Of Service (COS) to a processor
> and specifying allocations or limits for that COS for each resource
> to be allocated.
>
> Currently, CXL.memory is the only supported "slow" memory device. With
> the support of SMBA feature the hardware enables bandwidth allocation
> on the slow memory devices.
>
> 2. Bandwidth Monitoring Event Configuration (BMEC)
> The bandwidth monitoring events mbm_total_event and mbm_local_event
> are set to count all the total and local reads/writes respectively.
> With the introduction of slow memory, the two counters are not enough
> to count all the different types are memory events. With the feature
> BMEC, the users have the option to configure mbm_total_event and
> mbm_local_event to count the specific type of events.
>
> Following are the bitmaps of events supported.
> Bits Description
> 6 Dirty Victims from the QOS domain to all types of memory
> 5 Reads to slow memory in the non-local NUMA domain
> 4 Reads to slow memory in the local NUMA domain
> 3 Non-temporal writes to non-local NUMA domain
> 2 Non-temporal writes to local NUMA domain
> 1 Reads to memory in the non-local NUMA domain
> 0 Reads to memory in the local NUMA domain
>
> This series adds support for these features.
>
> Feature description is available in the specification, "AMD64 Technology Platform Quality of Service Extensions, Revision: 1.03 Publication # 56375
> Revision: 1.03 Issue Date: February 2022".
>
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> ---
> v8:
> Changes:
> 1. Removed init attribute for rdt_cpu_has to make it available for all the files.
> 2. Updated the change log for mon_features to correct the names of config files.
> 3. Changed configuration file name from mbm_total_config to mbm_total_bytes_config.
> This is more consistant with other changes.
> 4. Added lock protection while reading/writing the config file.
> 5. Other few minor text changes. I have been missing few comments in last couple of
> revisions. Hope I have addressed all of them this time.
>
> v7:
> https://lore.kernel.org/lkml/166604543832.5345.9696970469830919982.stgit@bmoger-ubuntu/
> Changes:
> Not much of a change. Missed one comment from Reinette from v5. Corrected it now.
> Few format corrections from Sanjaya.
>
> v6:
> https://lore.kernel.org/lkml/166543345606.23830.3120625408601531368.stgit@bmoger-ubuntu/
> Summary of changes:
> 1. Rebased on top of lastest tip tree. Fixed few minor conflicts.
> 2. Fixed format issue with scattered.c.
> 3. Removed config_name from the structure mon_evt. It is not required.
> 4. The read/write format for mbm_total_config and mbm_local_config will be same
> as schemata format "id0=val0;id1=val1;...". This is comment from Fenghua.
> 5. Added more comments MSR_IA32_EVT_CFG_BASE writng.
> 5. Few text changes in resctrl.rst
>
> v5:
> https://lore.kernel.org/lkml/166431016617.373387.1968875281081252467.stgit@bmoger-ubuntu/
> Summary of changes.
> 1. Split the series into two. The first two patches are bug fixes. So, sent them separate.
> 2. The config files mbm_total_config and mbm_local_config are now under
> /sys/fs/resctrl/info/L3_MON/. Removed these config files from mon groups.
> 3. Ran "checkpatch --strict --codespell" on all the patches. Looks good with few known exceptions.
> 4. Few minor text changes in resctrl.rst file.
>
> v4:
> https://lore.kernel.org/lkml/166257348081.1043018.11227924488792315932.stgit@bmoger-ubuntu/
> Got numerios of comments from Reinette Chatre. Addressed most of them.
> Summary of changes.
> 1. Removed mon_configurable under /sys/fs/resctrl/info/L3_MON/.
> 2. Updated mon_features texts if the BMEC is supported.
> 3. Added more explanation about the slow memory support.
> 4. Replaced smp_call_function_many with on_each_cpu_mask call.
> 5. Removed arch_has_empty_bitmaps
> 6. Few other text changes.
> 7. Removed Reviewed-by if the patch is modified.
> 8. Rebased the patches to latest tip.
>
> v3:
> https://lore.kernel.org/lkml/166117559756.6695.16047463526634290701.stgit@bmoger-ubuntu/
> a. Rebased the patches to latest tip. Resolved some conflicts.
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> b. Taken care of feedback from Bagas Sanjaya.
> c. Added Reviewed by from Mingo.
> Note: I am still looking for comments from Reinette or Fenghua.
>
> v2:
> https://lore.kernel.org/lkml/165938717220.724959.10931629283087443782.stgit@bmoger-ubuntu/
> a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
> b. Added the patch to fix CBM issue on AMD. This was originally discussed
> https://lore.kernel.org/lkml/[email protected]/
>
> v1:
> https://lore.kernel.org/lkml/165757543252.416408.13547339307237713464.stgit@bmoger-ubuntu/
>
> Babu Moger (13):
> x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
> x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
> x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
> x86/resctrl: Include new features in command line options
> x86/resctrl: Detect and configure Slow Memory Bandwidth Allocation
> x86/resctrl: Remove the init attribute for rdt_cpu_has()
> x86/resctrl: Introduce data structure to support monitor configuration
> x86/resctrl: Add sysfs interface to read mbm_total_bytes_config
> x86/resctrl: Add sysfs interface to read mbm_local_bytes_config
> x86/resctrl: Add sysfs interface to write mbm_total_bytes_config
> x86/resctrl: Add sysfs interface to write mbm_local_bytes_config
> x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()
> Documentation/x86: Update resctrl.rst for new features
>
>
> .../admin-guide/kernel-parameters.txt | 2 +-
> Documentation/x86/resctrl.rst | 139 +++++++-
> arch/x86/include/asm/cpufeatures.h | 2 +
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> arch/x86/kernel/cpu/resctrl/core.c | 56 +++-
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
> arch/x86/kernel/cpu/resctrl/internal.h | 33 ++
> arch/x86/kernel/cpu/resctrl/monitor.c | 7 +
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 304 ++++++++++++++++--
> arch/x86/kernel/cpu/scattered.c | 2 +
> 10 files changed, 515 insertions(+), 33 deletions(-)
>
> --
>
--
Thanks
Babu Moger


2022-11-15 22:14:02

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v8 00/13] Support for AMD QoS new features

Hi Reinette,

On 11/15/22 15:07, Reinette Chatre wrote:
> Hi Babu,
>
> On 11/15/2022 12:50 PM, Moger, Babu wrote:
>> Hi Reinette and Others,
>>
>> I was planning to refresh the series later this week. I have one comment
>> from Peter Newman.  Let me know if you have any comments.
>>
> I am behind on resctrl work and have not had a chance to look
> at this series yet.

Sure. Thanks for the update. I will wait.

Thanks

Babu



2022-11-15 22:36:24

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v8 00/13] Support for AMD QoS new features

Hi Babu,

On 11/15/2022 12:50 PM, Moger, Babu wrote:
> Hi Reinette and Others,
>
> I was planning to refresh the series later this week. I have one comment
> from Peter Newman.  Let me know if you have any comments.
>

I am behind on resctrl work and have not had a chance to look
at this series yet.

Reinette

2022-11-23 00:47:44

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v8 08/13] x86/resctrl: Add sysfs interface to read mbm_total_bytes_config

Hi Babu,

On 11/4/2022 1:00 PM, Babu Moger wrote:
> The current event configuration can be viewed by the user by reading
> the configuration file /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config.
> The event configuration settings are domain specific and will affect all
> the CPUs in the domain.
>
> Following are the types of events supported:
> ==== ===========================================================
> Bits Description
> ==== ===========================================================
> 6 Dirty Victims from the QOS domain to all types of memory
> 5 Reads to slow memory in the non-local NUMA domain
> 4 Reads to slow memory in the local NUMA domain
> 3 Non-temporal writes to non-local NUMA domain
> 2 Non-temporal writes to local NUMA domain
> 1 Reads to memory in the non-local NUMA domain
> 0 Reads to memory in the local NUMA domain
> ==== ===========================================================
>
> By default, the mbm_total_bytes_config is set to 0x7f to count all the
> event types.
>
> For example:
> $cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 0=0x7f;1=0x7f;2=0x7f;3=0x7f
>
> In this case, the event mbm_total_bytes is currently configured
> with 0x7f on domains 0 to 3.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 28 ++++++++++
> arch/x86/kernel/cpu/resctrl/monitor.c | 1
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 87 ++++++++++++++++++++++++++++++++
> 3 files changed, 116 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 5459b5022760..c74285fd0f6e 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -15,6 +15,7 @@
> #define MSR_IA32_MBA_THRTL_BASE 0xd50
> #define MSR_IA32_MBA_BW_BASE 0xc0000200
> #define MSR_IA32_SMBA_BW_BASE 0xc0000280
> +#define MSR_IA32_EVT_CFG_BASE 0xc0000400
>
> #define MSR_IA32_QM_CTR 0x0c8e
> #define MSR_IA32_QM_EVTSEL 0x0c8d
> @@ -41,6 +42,32 @@
> */
> #define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
>
> +/* Reads to Local DRAM Memory */
> +#define READS_TO_LOCAL_MEM BIT(0)
> +
> +/* Reads to Remote DRAM Memory */
> +#define READS_TO_REMOTE_MEM BIT(1)
> +
> +/* Non-Temporal Writes to Local Memory */
> +#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
> +
> +/* Non-Temporal Writes to Remote Memory */
> +#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
> +
> +/* Reads to Local Memory the system identifies as "Slow Memory" */
> +#define READS_TO_LOCAL_S_MEM BIT(4)
> +
> +/* Reads to Remote Memory the system identifies as "Slow Memory" */
> +#define READS_TO_REMOTE_S_MEM BIT(5)
> +
> +/* Dirty Victims to All Types of Memory */
> +#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
> +
> +/* Max event bits supported */
> +#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
> +
> +/* Max configurable events */
> +#define MAX_CONFIG_EVENTS 2
>

This max being disconnected from what it is a max of looks like
a source of future confusion.

> struct rdt_fs_context {
> struct kernfs_fs_context kfc;
> @@ -542,5 +569,6 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
> void __check_limbo(struct rdt_domain *d, bool force_free);
> void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
> void __init thread_throttle_mode_init(void);
> +void mbm_config_rftype_init(void);
>
> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 06c2dc980855..a188dacab6c8 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -787,6 +787,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
> if (mon_configurable) {
> mbm_total_event.configurable = true;
> mbm_local_event.configurable = true;
> + mbm_config_rftype_init();
> }
>
> l3_mon_evt_init(r);
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 8342feb54a7f..dea58b6b4aa4 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1423,6 +1423,78 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
> return ret;
> }
>
> +struct mon_config_info {
> + u32 evtid;
> + u32 mon_config;
> +};
> +
> +/**
> + * mon_event_config_index_get - get the index for the configurable event
> + * @evtid: event id.
> + *
> + * Return: 0 for evtid == QOS_L3_MBM_TOTAL_EVENT_ID
> + * 1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID
> + * > 1 otherwise
> + */
> +static inline unsigned int mon_event_config_index_get(u32 evtid)
> +{
> + return evtid - QOS_L3_MBM_TOTAL_EVENT_ID;
> +}

It seems strange that the validation of the index is split
from where the index is determined. I think it would be easier
to understand, and reduce code duplication, it if is done together.

How about:
#define INVALID_CONFIG_INDEX UINT_MAX

static inline unsigned int mon_event_config_index_get(u32 evtid)
{
switch (evtid) {
case QOS_L3_MBM_TOTAL_EVENT_ID:
return 0;
case QOS_L3_MBM_LOCAL_EVENT_ID:
return 1;
default:
/* WARN */
return INVALID_CONFIG_INDEX;
}
}

What do you think?

> +
> +static void mon_event_config_read(void *info)
> +{
> + struct mon_config_info *mon_info = info;
> + u32 h, index;
> +
> + index = mon_event_config_index_get(mon_info->evtid);
> + if (index >= MAX_CONFIG_EVENTS) {
> + pr_warn_once("Invalid event id %d\n", mon_info->evtid);
> + return;
> + }
> + rdmsr(MSR_IA32_EVT_CFG_BASE + index, mon_info->mon_config, h);
> +}
> +
> +static void mondata_config_read(struct rdt_domain *d, struct mon_config_info *mon_info)
> +{
> + smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1);
> +}
> +
> +static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
> +{
> + struct mon_config_info mon_info = {0};

> + struct rdt_domain *dom;
> + bool sep = false;
> +
> + mutex_lock(&rdtgroup_mutex);
> +
> + list_for_each_entry(dom, &r->domains, list) {
> + if (sep)
> + seq_puts(s, ";");
> +
> + mon_info.evtid = evtid;
> + mondata_config_read(dom, &mon_info);
> +
> + seq_printf(s, "%d=0x%02lx", dom->id,

This is a u32 ... is just x sufficient?

> + mon_info.mon_config & MAX_EVT_CONFIG_BITS);

Please do this masking within mondata_config_read(). It should
not be required for every mon_config_read() caller to validate the
data because they may forget (re. patch 10).

> + sep = true;
> + }
> + seq_puts(s, "\n");
> +
> + mutex_unlock(&rdtgroup_mutex);
> +
> + return 0;
> +}
> +
> +static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
> + struct seq_file *seq, void *v)
> +{
> + struct rdt_resource *r = of->kn->parent->priv;
> +
> + mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
> +
> + return 0;
> +}
> +
> /* rdtgroup information files for one cache resource. */
> static struct rftype res_common_files[] = {
> {
> @@ -1521,6 +1593,12 @@ static struct rftype res_common_files[] = {
> .seq_show = max_threshold_occ_show,
> .fflags = RF_MON_INFO | RFTYPE_RES_CACHE,
> },
> + {
> + .name = "mbm_total_bytes_config",
> + .mode = 0444,
> + .kf_ops = &rdtgroup_kf_single_ops,
> + .seq_show = mbm_total_bytes_config_show,
> + },
> {
> .name = "cpus",
> .mode = 0644,
> @@ -1627,6 +1705,15 @@ void __init thread_throttle_mode_init(void)
> rft->fflags = RF_CTRL_INFO | RFTYPE_RES_MB;
> }
>
> +void mbm_config_rftype_init(void)

Does this need __init?

> +{
> + struct rftype *rft;
> +
> + rft = rdtgroup_get_rftype_by_name("mbm_total_bytes_config");
> + if (rft)
> + rft->fflags = RF_MON_INFO | RFTYPE_RES_CACHE;
> +}
> +
> /**
> * rdtgroup_kn_mode_restrict - Restrict user access to named resctrl file
> * @r: The resource group with which the file is associated.
>
>

Reinette

2022-11-23 00:48:06

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v8 13/13] Documentation/x86: Update resctrl.rst for new features

Hi Babu,

On 11/4/2022 1:01 PM, Babu Moger wrote:
...

> @@ -464,6 +539,26 @@ Memory bandwidth domain is L3 cache.
>
> MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
>
> +Slow Memory Bandwidth Allocation (SMBA)
> +---------------------------------------
> +AMD hardware support Slow Memory Bandwidth Allocation (SMBA) feature.

How about
AMD hardware supports the Slow Memory Bandwidth Allocation (SMBA) feature.
or
AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).

> +Currently, CXL.memory is the only supported "slow" memory device.

What does "Currently" mean here? If there is a plan for changes, could
that be shared? Otherwise maybe just remove it: "CXL.memory is the only
supported "slow" memory device."

> +With the support of SMBA, the hardware enables bandwidth allocation
> +on the slow memory devices. If there are multiple such devices in the
> +system, the throttling logic groups all the slow sources together
> +and applies the limit on them as a whole.
> +
> +The presence of SMBA (with CXL.memory) is independent of slow memory
> +devices presence. If there is no such devices on the system, then

Maybe "is no such device" or "are no such devices"?

> +setting the configuring SMBA will have no impact on the performance

"setting the configuring SMBA" is hard to parse. How about just
"configuring SMBA"?


Reinette

2022-11-23 00:48:54

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v8 06/13] x86/resctrl: Remove the init attribute for rdt_cpu_has()

Hi Babu,

On 11/4/2022 1:00 PM, Babu Moger wrote:
> The monitor code in resctrl/monitor.c needs to call rdt_cpu_has() to
> detect the monitor related features. It has the init attribute and
> cannot be called in non-init routines. Remove the init attribute and
> make it available for all the resctrl files.

I think this is the wrong way to go. The rdt_cpu_has() callers are
init code and they should rather get the __init attribute instead of
rdt_cpu_has() losing it.

Reinette

2022-11-23 00:59:29

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v8 07/13] x86/resctrl: Introduce data structure to support monitor configuration

Hi Babu,

On 11/4/2022 1:00 PM, Babu Moger wrote:
> Add a new field in mon_evt to support Bandwidth Monitoring Event
> Configuration(BMEC) and also update the "mon_features" display.
>
> The sysfs file "mon_features" will display the monitor configuration

sysfs -> resctrl ?

> if supported.

This is not clear. "mon_features" does not display the monitor
configuration, it displays the name of the file that can be used to
see the monitor configuration.

>
> Before the change.
> $cat /sys/fs/resctrl/info/L3_MON/mon_features
> llc_occupancy
> mbm_total_bytes
> mbm_local_bytes
>
> After the change when BMEC is supported.
> $cat /sys/fs/resctrl/info/L3_MON/mon_features
> llc_occupancy
> mbm_total_bytes
> mbm_total_bytes_config
> mbm_local_bytes
> mbm_local_bytes_config
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 2 ++
> arch/x86/kernel/cpu/resctrl/monitor.c | 6 ++++++
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++++-
> 3 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index e30e8b23f6b5..5459b5022760 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -63,11 +63,13 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
> * struct mon_evt - Entry in the event list of a resource
> * @evtid: event id
> * @name: name of the event
> + * @configurable: true if the event is configurable
> * @list: entry in &rdt_resource->evt_list
> */
> struct mon_evt {
> enum resctrl_event_id evtid;
> char *name;
> + bool configurable;
> struct list_head list;
> };
>
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index efe0c30d3a12..06c2dc980855 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -750,6 +750,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
> {
> unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> + bool mon_configurable = rdt_cpu_has(X86_FEATURE_BMEC);
> unsigned int threshold;
> int ret;
>
> @@ -783,6 +784,11 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
> if (ret)
> return ret;
>
> + if (mon_configurable) {
> + mbm_total_event.configurable = true;
> + mbm_local_event.configurable = true;
> + }
> +

Is the local variable needed? Why not just:
if (rdt_cpu_has(X86_FEATURE_BMEC))


Reinette

2022-11-23 18:29:20

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v8 06/13] x86/resctrl: Remove the init attribute for rdt_cpu_has()

Hi Reinette,


On 11/22/22 18:13, Reinette Chatre wrote:
> Hi Babu,
>
> On 11/4/2022 1:00 PM, Babu Moger wrote:
>> The monitor code in resctrl/monitor.c needs to call rdt_cpu_has() to
>> detect the monitor related features. It has the init attribute and
>> cannot be called in non-init routines. Remove the init attribute and
>> make it available for all the resctrl files.
> I think this is the wrong way to go. The rdt_cpu_has() callers are
> init code and they should rather get the __init attribute instead of
> rdt_cpu_has() losing it.

Ok. I will add __init  attribute to rdt_get_mon_l3_config. That should work.

Thanks

Babu

2022-11-23 18:38:32

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v8 07/13] x86/resctrl: Introduce data structure to support monitor configuration

Hi Reinette,

On 11/22/22 18:14, Reinette Chatre wrote:
> Hi Babu,
>
> On 11/4/2022 1:00 PM, Babu Moger wrote:
>> Add a new field in mon_evt to support Bandwidth Monitoring Event
>> Configuration(BMEC) and also update the "mon_features" display.
>>
>> The sysfs file "mon_features" will display the monitor configuration
> sysfs -> resctrl ?
Sure.
>
>> if supported.
> This is not clear. "mon_features" does not display the monitor
> configuration, it displays the name of the file that can be used to
> see the monitor configuration.

Will change it to:

The sysfs -> resctrl file "mon_features" will display the supported events
and files that can be used to configure those events if monitor
configuration is supported.


>
>> Before the change.
>> $cat /sys/fs/resctrl/info/L3_MON/mon_features
>> llc_occupancy
>> mbm_total_bytes
>> mbm_local_bytes
>>
>> After the change when BMEC is supported.
>> $cat /sys/fs/resctrl/info/L3_MON/mon_features
>> llc_occupancy
>> mbm_total_bytes
>> mbm_total_bytes_config
>> mbm_local_bytes
>> mbm_local_bytes_config
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/internal.h | 2 ++
>> arch/x86/kernel/cpu/resctrl/monitor.c | 6 ++++++
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++++-
>> 3 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index e30e8b23f6b5..5459b5022760 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -63,11 +63,13 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
>> * struct mon_evt - Entry in the event list of a resource
>> * @evtid: event id
>> * @name: name of the event
>> + * @configurable: true if the event is configurable
>> * @list: entry in &rdt_resource->evt_list
>> */
>> struct mon_evt {
>> enum resctrl_event_id evtid;
>> char *name;
>> + bool configurable;
>> struct list_head list;
>> };
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index efe0c30d3a12..06c2dc980855 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -750,6 +750,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
>> {
>> unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
>> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>> + bool mon_configurable = rdt_cpu_has(X86_FEATURE_BMEC);
>> unsigned int threshold;
>> int ret;
>>
>> @@ -783,6 +784,11 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
>> if (ret)
>> return ret;
>>
>> + if (mon_configurable) {
>> + mbm_total_event.configurable = true;
>> + mbm_local_event.configurable = true;
>> + }
>> +
> Is the local variable needed? Why not just:
> if (rdt_cpu_has(X86_FEATURE_BMEC))


Local variable not requited. Will change it.

Thanks

Babu

>
>
> Reinette

--
Thanks
Babu Moger

2022-11-23 19:29:04

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v8 07/13] x86/resctrl: Introduce data structure to support monitor configuration

Hi Babu,

On 11/23/2022 10:23 AM, Moger, Babu wrote:
> Hi Reinette,
>
> On 11/22/22 18:14, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 11/4/2022 1:00 PM, Babu Moger wrote:
>>> Add a new field in mon_evt to support Bandwidth Monitoring Event
>>> Configuration(BMEC) and also update the "mon_features" display.
>>>
>>> The sysfs file "mon_features" will display the monitor configuration
>> sysfs -> resctrl ?
> Sure.
>>
>>> if supported.
>> This is not clear. "mon_features" does not display the monitor
>> configuration, it displays the name of the file that can be used to
>> see the monitor configuration.
>
> Will change it to:
>
> The sysfs -> resctrl file "mon_features" will display the supported events
> and files that can be used to configure those events if monitor
> configuration is supported.
>

I meant that "sysfs" should be replaced by "resctrl".

Reinette

2022-11-23 19:54:10

by Moger, Babu

[permalink] [raw]
Subject: Re: [PATCH v8 08/13] x86/resctrl: Add sysfs interface to read mbm_total_bytes_config

Hi Reinette,


On 11/22/22 18:19, Reinette Chatre wrote:
> Hi Babu,
>
> On 11/4/2022 1:00 PM, Babu Moger wrote:
>> The current event configuration can be viewed by the user by reading
>> the configuration file /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config.
>> The event configuration settings are domain specific and will affect all
>> the CPUs in the domain.
>>
>> Following are the types of events supported:
>> ==== ===========================================================
>> Bits Description
>> ==== ===========================================================
>> 6 Dirty Victims from the QOS domain to all types of memory
>> 5 Reads to slow memory in the non-local NUMA domain
>> 4 Reads to slow memory in the local NUMA domain
>> 3 Non-temporal writes to non-local NUMA domain
>> 2 Non-temporal writes to local NUMA domain
>> 1 Reads to memory in the non-local NUMA domain
>> 0 Reads to memory in the local NUMA domain
>> ==== ===========================================================
>>
>> By default, the mbm_total_bytes_config is set to 0x7f to count all the
>> event types.
>>
>> For example:
>> $cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>> 0=0x7f;1=0x7f;2=0x7f;3=0x7f
>>
>> In this case, the event mbm_total_bytes is currently configured
>> with 0x7f on domains 0 to 3.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/internal.h | 28 ++++++++++
>> arch/x86/kernel/cpu/resctrl/monitor.c | 1
>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 87 ++++++++++++++++++++++++++++++++
>> 3 files changed, 116 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 5459b5022760..c74285fd0f6e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -15,6 +15,7 @@
>> #define MSR_IA32_MBA_THRTL_BASE 0xd50
>> #define MSR_IA32_MBA_BW_BASE 0xc0000200
>> #define MSR_IA32_SMBA_BW_BASE 0xc0000280
>> +#define MSR_IA32_EVT_CFG_BASE 0xc0000400
>>
>> #define MSR_IA32_QM_CTR 0x0c8e
>> #define MSR_IA32_QM_EVTSEL 0x0c8d
>> @@ -41,6 +42,32 @@
>> */
>> #define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
>>
>> +/* Reads to Local DRAM Memory */
>> +#define READS_TO_LOCAL_MEM BIT(0)
>> +
>> +/* Reads to Remote DRAM Memory */
>> +#define READS_TO_REMOTE_MEM BIT(1)
>> +
>> +/* Non-Temporal Writes to Local Memory */
>> +#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
>> +
>> +/* Non-Temporal Writes to Remote Memory */
>> +#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
>> +
>> +/* Reads to Local Memory the system identifies as "Slow Memory" */
>> +#define READS_TO_LOCAL_S_MEM BIT(4)
>> +
>> +/* Reads to Remote Memory the system identifies as "Slow Memory" */
>> +#define READS_TO_REMOTE_S_MEM BIT(5)
>> +
>> +/* Dirty Victims to All Types of Memory */
>> +#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
>> +
>> +/* Max event bits supported */
>> +#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
>> +
>> +/* Max configurable events */
>> +#define MAX_CONFIG_EVENTS 2
>>
> This max being disconnected from what it is a max of looks like
> a source of future confusion.

ok, Not required anymore with your suggested change below.  Will remove it.

>
>> struct rdt_fs_context {
>> struct kernfs_fs_context kfc;
>> @@ -542,5 +569,6 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
>> void __check_limbo(struct rdt_domain *d, bool force_free);
>> void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>> void __init thread_throttle_mode_init(void);
>> +void mbm_config_rftype_init(void);
>>
>> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index 06c2dc980855..a188dacab6c8 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -787,6 +787,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
>> if (mon_configurable) {
>> mbm_total_event.configurable = true;
>> mbm_local_event.configurable = true;
>> + mbm_config_rftype_init();
>> }
>>
>> l3_mon_evt_init(r);
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 8342feb54a7f..dea58b6b4aa4 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1423,6 +1423,78 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
>> return ret;
>> }
>>
>> +struct mon_config_info {
>> + u32 evtid;
>> + u32 mon_config;
>> +};
>> +
>> +/**
>> + * mon_event_config_index_get - get the index for the configurable event
>> + * @evtid: event id.
>> + *
>> + * Return: 0 for evtid == QOS_L3_MBM_TOTAL_EVENT_ID
>> + * 1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID
>> + * > 1 otherwise
>> + */
>> +static inline unsigned int mon_event_config_index_get(u32 evtid)
>> +{
>> + return evtid - QOS_L3_MBM_TOTAL_EVENT_ID;
>> +}
> It seems strange that the validation of the index is split
> from where the index is determined. I think it would be easier
> to understand, and reduce code duplication, it if is done together.
>
> How about:
> #define INVALID_CONFIG_INDEX UINT_MAX
>
> static inline unsigned int mon_event_config_index_get(u32 evtid)
> {
> switch (evtid) {
> case QOS_L3_MBM_TOTAL_EVENT_ID:
> return 0;
> case QOS_L3_MBM_LOCAL_EVENT_ID:
> return 1;
> default:
> /* WARN */
> return INVALID_CONFIG_INDEX;
> }
> }
>
> What do you think?
Yes. It should work
>
>> +
>> +static void mon_event_config_read(void *info)
>> +{
>> + struct mon_config_info *mon_info = info;
>> + u32 h, index;
>> +
>> + index = mon_event_config_index_get(mon_info->evtid);
>> + if (index >= MAX_CONFIG_EVENTS) {
>> + pr_warn_once("Invalid event id %d\n", mon_info->evtid);
>> + return;
>> + }
>> + rdmsr(MSR_IA32_EVT_CFG_BASE + index, mon_info->mon_config, h);
>> +}
>> +
>> +static void mondata_config_read(struct rdt_domain *d, struct mon_config_info *mon_info)
>> +{
>> + smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1);
>> +}
>> +
>> +static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
>> +{
>> + struct mon_config_info mon_info = {0};
>> + struct rdt_domain *dom;
>> + bool sep = false;
>> +
>> + mutex_lock(&rdtgroup_mutex);
>> +
>> + list_for_each_entry(dom, &r->domains, list) {
>> + if (sep)
>> + seq_puts(s, ";");
>> +
>> + mon_info.evtid = evtid;
>> + mondata_config_read(dom, &mon_info);
>> +
>> + seq_printf(s, "%d=0x%02lx", dom->id,
> This is a u32 ... is just x sufficient?

I have added 0x%02lx to silence the compiler. Not required anymore.


>
>> + mon_info.mon_config & MAX_EVT_CONFIG_BITS);
> Please do this masking within mondata_config_read(). It should
> not be required for every mon_config_read() caller to validate the
> data because they may forget (re. patch 10).
Sure. Will do.
>
>> + sep = true;
>> + }
>> + seq_puts(s, "\n");
>> +
>> + mutex_unlock(&rdtgroup_mutex);
>> +
>> + return 0;
>> +}
>> +
>> +static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
>> + struct seq_file *seq, void *v)
>> +{
>> + struct rdt_resource *r = of->kn->parent->priv;
>> +
>> + mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +
>> + return 0;
>> +}
>> +
>> /* rdtgroup information files for one cache resource. */
>> static struct rftype res_common_files[] = {
>> {
>> @@ -1521,6 +1593,12 @@ static struct rftype res_common_files[] = {
>> .seq_show = max_threshold_occ_show,
>> .fflags = RF_MON_INFO | RFTYPE_RES_CACHE,
>> },
>> + {
>> + .name = "mbm_total_bytes_config",
>> + .mode = 0444,
>> + .kf_ops = &rdtgroup_kf_single_ops,
>> + .seq_show = mbm_total_bytes_config_show,
>> + },
>> {
>> .name = "cpus",
>> .mode = 0644,
>> @@ -1627,6 +1705,15 @@ void __init thread_throttle_mode_init(void)
>> rft->fflags = RF_CTRL_INFO | RFTYPE_RES_MB;
>> }
>>
>> +void mbm_config_rftype_init(void)
> Does this need __init?

Not. Required. Will remove it.

Thanks

Babu

>
>> +{
>> + struct rftype *rft;
>> +
>> + rft = rdtgroup_get_rftype_by_name("mbm_total_bytes_config");
>> + if (rft)
>> + rft->fflags = RF_MON_INFO | RFTYPE_RES_CACHE;
>> +}
>> +
>> /**
>> * rdtgroup_kn_mode_restrict - Restrict user access to named resctrl file
>> * @r: The resource group with which the file is associated.
>>
>>
> Reinette

--
Thanks
Babu Moger

2022-11-23 21:59:33

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v8 07/13] x86/resctrl: Introduce data structure to support monitor configuration

[AMD Official Use Only - General]



> -----Original Message-----
> From: Reinette Chatre <[email protected]>
> Sent: Wednesday, November 23, 2022 1:06 PM
> To: Moger, Babu <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; Das1, Sandipan
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH v8 07/13] x86/resctrl: Introduce data structure to support
> monitor configuration
>
> Hi Babu,
>
> On 11/23/2022 10:23 AM, Moger, Babu wrote:
> > Hi Reinette,
> >
> > On 11/22/22 18:14, Reinette Chatre wrote:
> >> Hi Babu,
> >>
> >> On 11/4/2022 1:00 PM, Babu Moger wrote:
> >>> Add a new field in mon_evt to support Bandwidth Monitoring Event
> >>> Configuration(BMEC) and also update the "mon_features" display.
> >>>
> >>> The sysfs file "mon_features" will display the monitor configuration
> >> sysfs -> resctrl ?
> > Sure.
> >>
> >>> if supported.
> >> This is not clear. "mon_features" does not display the monitor
> >> configuration, it displays the name of the file that can be used to
> >> see the monitor configuration.
> >
> > Will change it to:
> >
> > The sysfs -> resctrl file "mon_features" will display the supported
> > events and files that can be used to configure those events if monitor
> > configuration is supported.
> >
>
> I meant that "sysfs" should be replaced by "resctrl".

Ok. Got it.
Thanks
Babu


Attachments:
winmail.dat (17.81 kB)

2022-11-23 22:38:46

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH v8 08/13] x86/resctrl: Add sysfs interface to read mbm_total_bytes_config

Hi Babu,

On 11/23/2022 10:35 AM, Moger, Babu wrote:
> On 11/22/22 18:19, Reinette Chatre wrote:
>> On 11/4/2022 1:00 PM, Babu Moger wrote:

...

>>> @@ -1521,6 +1593,12 @@ static struct rftype res_common_files[] = {
>>> .seq_show = max_threshold_occ_show,
>>> .fflags = RF_MON_INFO | RFTYPE_RES_CACHE,
>>> },
>>> + {
>>> + .name = "mbm_total_bytes_config",
>>> + .mode = 0444,
>>> + .kf_ops = &rdtgroup_kf_single_ops,
>>> + .seq_show = mbm_total_bytes_config_show,
>>> + },
>>> {
>>> .name = "cpus",
>>> .mode = 0644,
>>> @@ -1627,6 +1705,15 @@ void __init thread_throttle_mode_init(void)
>>> rft->fflags = RF_CTRL_INFO | RFTYPE_RES_MB;
>>> }
>>>
>>> +void mbm_config_rftype_init(void)
>> Does this need __init?
>
> Not. Required. Will remove it.
>

Your response is not clear to me. I am not asking for any removal. My
question is whether the function needs the __init attribute. That is,
should this be:

void __init mbm_config_rftype_init(void)

Reinette

2022-11-23 23:20:46

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v8 08/13] x86/resctrl: Add sysfs interface to read mbm_total_bytes_config

[AMD Official Use Only - General]



> -----Original Message-----
> From: Reinette Chatre <[email protected]>
> Sent: Wednesday, November 23, 2022 4:28 PM
> To: Moger, Babu <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; Das1, Sandipan
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH v8 08/13] x86/resctrl: Add sysfs interface to read
> mbm_total_bytes_config
>
> Hi Babu,
>
> On 11/23/2022 10:35 AM, Moger, Babu wrote:
> > On 11/22/22 18:19, Reinette Chatre wrote:
> >> On 11/4/2022 1:00 PM, Babu Moger wrote:
>
> ...
>
> >>> @@ -1521,6 +1593,12 @@ static struct rftype res_common_files[] = {
> >>> .seq_show = max_threshold_occ_show,
> >>> .fflags = RF_MON_INFO | RFTYPE_RES_CACHE,
> >>> },
> >>> + {
> >>> + .name = "mbm_total_bytes_config",
> >>> + .mode = 0444,
> >>> + .kf_ops = &rdtgroup_kf_single_ops,
> >>> + .seq_show = mbm_total_bytes_config_show,
> >>> + },
> >>> {
> >>> .name = "cpus",
> >>> .mode = 0644,
> >>> @@ -1627,6 +1705,15 @@ void __init thread_throttle_mode_init(void)
> >>> rft->fflags = RF_CTRL_INFO | RFTYPE_RES_MB; }
> >>>
> >>> +void mbm_config_rftype_init(void)
> >> Does this need __init?
> >
> > Not. Required. Will remove it.
> >
>
> Your response is not clear to me. I am not asking for any removal. My question
> is whether the function needs the __init attribute. That is, should this be:
>
> void __init mbm_config_rftype_init(void)

Oh.. I mis-understood.
Yes. It is called from rdt_get_mon_l3_config which will be __init routine. It seems appropriate to keep the __init attribute.
Thanks
Babu


Attachments:
winmail.dat (18.03 kB)

2022-11-23 23:20:46

by Moger, Babu

[permalink] [raw]
Subject: RE: [PATCH v8 13/13] Documentation/x86: Update resctrl.rst for new features

[AMD Official Use Only - General]

Hi Reinette,

> -----Original Message-----
> From: Reinette Chatre <[email protected]>
> Sent: Tuesday, November 22, 2022 6:26 PM
> To: Moger, Babu <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; Das1, Sandipan
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH v8 13/13] Documentation/x86: Update resctrl.rst for new
> features
>
> Hi Babu,
>
> On 11/4/2022 1:01 PM, Babu Moger wrote:
> ...
>
> > @@ -464,6 +539,26 @@ Memory bandwidth domain is L3 cache.
> >
> > MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
> >
> > +Slow Memory Bandwidth Allocation (SMBA)
> > +---------------------------------------
> > +AMD hardware support Slow Memory Bandwidth Allocation (SMBA) feature.
>
> How about
> AMD hardware supports the Slow Memory Bandwidth Allocation (SMBA)
> feature.
> or
> AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).

Sure.
>
> > +Currently, CXL.memory is the only supported "slow" memory device.
>
> What does "Currently" mean here? If there is a plan for changes, could that be
> shared? Otherwise maybe just remove it: "CXL.memory is the only supported
> "slow" memory device."

There is no change of plan. I will remove "Currently"

>
> > +With the support of SMBA, the hardware enables bandwidth allocation
> > +on the slow memory devices. If there are multiple such devices in the
> > +system, the throttling logic groups all the slow sources together and
> > +applies the limit on them as a whole.
> > +
> > +The presence of SMBA (with CXL.memory) is independent of slow memory
> > +devices presence. If there is no such devices on the system, then
>
> Maybe "is no such device" or "are no such devices"?

It should be "If there are no such devices". Will correct it.

>
> > +setting the configuring SMBA will have no impact on the performance
>
> "setting the configuring SMBA" is hard to parse. How about just "configuring
> SMBA"?

Sure.
Thanks
Babu


Attachments:
winmail.dat (17.89 kB)