2022-09-27 20:45:24

by Moger, Babu

Subject: [PATCH v5 00/12] x86/resctrl: Support for AMD QoS new features

New AMD processors can now support the following QoS features.

1. Slow Memory Bandwidth Allocation (SMBA)
With this feature, the QOS enforcement policies can be applied
to the external slow memory connected to the host. QOS enforcement
is accomplished by assigning a Class Of Service (COS) to a processor
and specifying allocations or limits for that COS for each resource
to be allocated.

Currently, CXL.memory is the only supported "slow" memory device. With
the SMBA feature, the hardware enables bandwidth allocation on slow
memory devices.

2. Bandwidth Monitoring Event Configuration (BMEC)
The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
are set to count all the total and local reads/writes respectively.
With the introduction of slow memory, the two counters are not enough
to count all the different types of memory events. With the BMEC
feature, users have the option to configure mbm_total_bytes and
mbm_local_bytes to count specific types of events.

Following are the bitmaps of events supported.

==== ========================================================
Bits Description
==== ========================================================
6    Dirty Victims from the QOS domain to all types of memory
5    Reads to slow memory in the non-local NUMA domain
4    Reads to slow memory in the local NUMA domain
3    Non-temporal writes to non-local NUMA domain
2    Non-temporal writes to local NUMA domain
1    Reads to memory in the non-local NUMA domain
0    Reads to memory in the local NUMA domain
==== ========================================================
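
The bit layout above can be sketched in C as follows. This is only an
illustration: the macro and function names are hypothetical and do not come
from the series. The values 0x7f, 0x15, 0x33 and 0x30 that the sketch
reproduces are the ones used in the later patches and documentation examples.

```c
#include <stdint.h>

/* Hypothetical names for the seven event-type bits in the table above. */
#define BMEC_LOCAL_READS        (1u << 0)
#define BMEC_REMOTE_READS       (1u << 1)
#define BMEC_LOCAL_NT_WRITES    (1u << 2)
#define BMEC_REMOTE_NT_WRITES   (1u << 3)
#define BMEC_LOCAL_SLOW_READS   (1u << 4)
#define BMEC_REMOTE_SLOW_READS  (1u << 5)
#define BMEC_DIRTY_VICTIMS      (1u << 6)

/* All seven event types: the default for mbm_total_bytes. */
uint32_t bmec_mask_all(void)
{
    return BMEC_LOCAL_READS | BMEC_REMOTE_READS |
           BMEC_LOCAL_NT_WRITES | BMEC_REMOTE_NT_WRITES |
           BMEC_LOCAL_SLOW_READS | BMEC_REMOTE_SLOW_READS |
           BMEC_DIRTY_VICTIMS;
}

/* Local NUMA domain events only: the default for mbm_local_bytes. */
uint32_t bmec_mask_local(void)
{
    return BMEC_LOCAL_READS | BMEC_LOCAL_NT_WRITES | BMEC_LOCAL_SLOW_READS;
}

/* Every kind of read (bits 0, 1, 4 and 5). */
uint32_t bmec_mask_reads(void)
{
    return BMEC_LOCAL_READS | BMEC_REMOTE_READS |
           BMEC_LOCAL_SLOW_READS | BMEC_REMOTE_SLOW_READS;
}

/* Slow memory reads only (bits 4 and 5). */
uint32_t bmec_mask_slow_reads(void)
{
    return BMEC_LOCAL_SLOW_READS | BMEC_REMOTE_SLOW_READS;
}
```

Composing the masks from named bits makes it easy to check that a value
written to a config file selects exactly the intended event types.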

This series adds support for these features.

Feature description is available in the specification, "AMD64 Technology Platform Quality
of Service Extensions, Publication # 56375, Revision: 1.03, Issue Date: February 2022".

Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

---
v5:
Summary of changes.
1. Split the series into two. The first two patches are bug fixes, so they were sent separately.
2. The config files mbm_total_config and mbm_local_config are now under
/sys/fs/resctrl/info/L3_MON/. Removed these config files from mon groups.
3. Ran "checkpatch --strict --codespell" on all the patches. Looks good with few known exceptions.
4. Few minor text changes in resctrl.rst file.

v4:
https://lore.kernel.org/lkml/166257348081.1043018.11227924488792315932.stgit@bmoger-ubuntu/
Got numerous comments from Reinette Chatre. Addressed most of them.
Summary of changes.
1. Removed mon_configurable under /sys/fs/resctrl/info/L3_MON/.
2. Updated mon_features texts if the BMEC is supported.
3. Added more explanation about the slow memory support.
4. Replaced smp_call_function_many with on_each_cpu_mask call.
5. Removed arch_has_empty_bitmaps
6. Few other text changes.
7. Removed Reviewed-by if the patch is modified.
8. Rebased the patches to latest tip.

v3:
https://lore.kernel.org/lkml/166117559756.6695.16047463526634290701.stgit@bmoger-ubuntu/
a. Rebased the patches to latest tip. Resolved some conflicts.
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
b. Taken care of feedback from Bagas Sanjaya.
c. Added Reviewed by from Mingo.
Note: I am still looking for comments from Reinette or Fenghua.

v2:
https://lore.kernel.org/lkml/165938717220.724959.10931629283087443782.stgit@bmoger-ubuntu/
a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
b. Added the patch to fix CBM issue on AMD. This was originally discussed
https://lore.kernel.org/lkml/[email protected]/

v1:
https://lore.kernel.org/lkml/165757543252.416408.13547339307237713464.stgit@bmoger-ubuntu/

Babu Moger (12):
x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
x86/resctrl: Include new features in command line options
x86/resctrl: Detect and configure Slow Memory Bandwidth allocation
x86/resctrl: Introduce data structure to support monitor configuration
x86/resctrl: Add sysfs interface to read mbm_total_bytes event configuration
x86/resctrl: Add sysfs interface to read mbm_local_bytes event configuration
x86/resctrl: Add sysfs interface to write mbm_total_bytes event configuration
x86/resctrl: Add sysfs interface to write mbm_local_bytes event configuration
x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()
Documentation/x86: Update resctrl_ui.rst for new features


.../admin-guide/kernel-parameters.txt | 2 +-
Documentation/x86/resctrl.rst | 130 +++++++-
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/resctrl/core.c | 51 ++-
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 33 +-
arch/x86/kernel/cpu/resctrl/monitor.c | 9 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 298 ++++++++++++++++--
arch/x86/kernel/cpu/scattered.c | 2 +
10 files changed, 496 insertions(+), 34 deletions(-)

--


2022-09-27 20:46:50

by Moger, Babu

Subject: [PATCH v5 08/12] x86/resctrl: Add sysfs interface to read mbm_local_bytes event configuration

The current event configuration can be viewed by the user by reading
the configuration file /sys/fs/resctrl/info/L3_MON/mbm_local_config.

Following are the types of events supported:

==== ===========================================================
Bits Description
==== ===========================================================
6    Dirty Victims from the QOS domain to all types of memory
5    Reads to slow memory in the non-local NUMA domain
4    Reads to slow memory in the local NUMA domain
3    Non-temporal writes to non-local NUMA domain
2    Non-temporal writes to local NUMA domain
1    Reads to memory in the non-local NUMA domain
0    Reads to memory in the local NUMA domain
==== ===========================================================

By default, the mbm_local_bytes configuration is set to 0x15 to count
all the local event types. The event configuration settings are domain
specific. Changing the configuration on one CPU in a domain would
affect the whole domain.

For example:
$cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
0:0x15;1:0x15;2:0x15;3:0x15

In this case the event mbm_local_bytes is currently configured with
0x15 on domains 0 to 3.
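
A userspace tool that reads this file needs to split the line into per-domain
values. The following is a minimal sketch of such a parser, assuming the
"id:0xvalue" pairs separated by ';' shown in the example above. The struct and
function names are hypothetical and are not part of this patch.

```c
#include <stdlib.h>

/* One "domain:config" pair read from the config file. */
struct dom_config {
    int id;
    unsigned int config;
};

/*
 * Parse a line such as "0:0x15;1:0x15;2:0x15;3:0x15\n" into out[].
 * Returns the number of entries parsed, or -1 on malformed input or
 * when the line holds more than max entries.
 */
int parse_mbm_config(const char *line, struct dom_config *out, int max)
{
    const char *p = line;
    int n = 0;

    while (*p && *p != '\n') {
        char *end;
        long id;
        unsigned long val;

        if (n == max)
            return -1;

        /* Domain id in decimal, followed by ':' */
        id = strtol(p, &end, 10);
        if (end == p || *end != ':')
            return -1;
        p = end + 1;

        /* Event configuration in hex; base 16 accepts the "0x" prefix */
        val = strtoul(p, &end, 16);
        if (end == p)
            return -1;

        out[n].id = (int)id;
        out[n].config = (unsigned int)val;
        n++;

        p = end;
        if (*p == ';')
            p++;
    }

    return n;
}
```

For the example output above, the parser yields four entries, each with the
value 0x15.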

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b51fae77ba5c..27bf6ade0dbf 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1481,6 +1481,16 @@ static int mbm_total_config_show(struct kernfs_open_file *of,
return 0;
}

+static int mbm_local_config_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct rdt_resource *r = of->kn->parent->priv;
+
+ mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
+
+ return 0;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -1585,6 +1595,12 @@ static struct rftype res_common_files[] = {
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = mbm_total_config_show,
},
+ {
+ .name = "mbm_local_config",
+ .mode = 0644,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = mbm_local_config_show,
+ },
{
.name = "cpus",
.mode = 0644,
@@ -1698,6 +1714,10 @@ void __init mbm_config_rftype_init(void)
rft = rdtgroup_get_rftype_by_name("mbm_total_config");
if (rft)
rft->fflags = RF_MON_INFO | RFTYPE_RES_CACHE;
+
+ rft = rdtgroup_get_rftype_by_name("mbm_local_config");
+ if (rft)
+ rft->fflags = RF_MON_INFO | RFTYPE_RES_CACHE;
}

/**


2022-09-27 20:50:30

by Moger, Babu

Subject: [PATCH v5 07/12] x86/resctrl: Add sysfs interface to read mbm_total_bytes event configuration

The current event configuration can be viewed by the user by reading
the configuration file /sys/fs/resctrl/info/L3_MON/mbm_total_config.

Following are the types of events supported:

==== ===========================================================
Bits Description
==== ===========================================================
6    Dirty Victims from the QOS domain to all types of memory
5    Reads to slow memory in the non-local NUMA domain
4    Reads to slow memory in the local NUMA domain
3    Non-temporal writes to non-local NUMA domain
2    Non-temporal writes to local NUMA domain
1    Reads to memory in the non-local NUMA domain
0    Reads to memory in the local NUMA domain
==== ===========================================================

By default, the mbm_total_bytes configuration is set to 0x7f to count
all the event types. The event configuration settings are domain
specific. Changing the configuration on one CPU in a domain would
affect the whole domain.

For example:
$cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
0:0x7f;1:0x7f;2:0x7f;3:0x7f

In this case, the event mbm_total_bytes is currently configured
with 0x7f on domains 0 to 3.
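
The output format can be modeled outside the kernel as below. This is a
userspace sketch only: it mirrors the "%d:0x%02x" format and ';' separator
used by mbm_config_show() in this patch, but writes into a caller-supplied
buffer instead of a seq_file. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* One domain id and its event configuration value. */
struct dom_config {
    int id;
    unsigned int config;
};

/*
 * Write "id:0xNN;id:0xNN;...\n" for n domains into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small.
 */
int format_mbm_config(char *buf, size_t size,
                      const struct dom_config *doms, int n)
{
    size_t used = 0;
    int len;

    for (int i = 0; i < n; i++) {
        /* Separator goes before every entry except the first */
        len = snprintf(buf + used, size - used, "%s%d:0x%02x",
                       i ? ";" : "", doms[i].id, doms[i].config);
        if (len < 0 || (size_t)len >= size - used)
            return -1;
        used += (size_t)len;
    }

    len = snprintf(buf + used, size - used, "\n");
    if (len < 0 || (size_t)len >= size - used)
        return -1;

    return (int)(used + (size_t)len);
}
```

Emitting the separator before each entry other than the first avoids a
trailing ';', which matches the example output above.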

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 3 +
arch/x86/kernel/cpu/resctrl/internal.h | 2 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 76 ++++++++++++++++++++++++++++++++
3 files changed, 81 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 513e6a00f58e..b3bb8badbaaa 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -861,6 +861,9 @@ static __init bool get_rdt_mon_resources(void)
if (!rdt_mon_features)
return false;

+ if (mon_configurable)
+ mbm_config_rftype_init();
+
return !rdt_get_mon_l3_config(r, mon_configurable);
}

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 4d03f443b353..44d3f18dfd69 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -15,6 +15,7 @@
#define MSR_IA32_MBA_THRTL_BASE 0xd50
#define MSR_IA32_MBA_BW_BASE 0xc0000200
#define MSR_IA32_SMBA_BW_BASE 0xc0000280
+#define MSR_IA32_EVT_CFG_BASE 0xc0000400

#define MSR_IA32_QM_CTR 0x0c8e
#define MSR_IA32_QM_EVTSEL 0x0c8d
@@ -556,5 +557,6 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
void __check_limbo(struct rdt_domain *d, bool force_free);
void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
void __init thread_throttle_mode_init(void);
+void __init mbm_config_rftype_init(void);

#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 834a55d78e3f..b51fae77ba5c 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1420,6 +1420,67 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
return ret;
}

+struct mon_config_info {
+ u32 evtid;
+ u32 mon_config;
+};
+
+void mon_event_config_read(void *info)
+{
+ struct mon_config_info *mon_info = info;
+ u32 h, msr_index;
+
+ switch (mon_info->evtid) {
+ case QOS_L3_MBM_TOTAL_EVENT_ID:
+ msr_index = 0;
+ break;
+ case QOS_L3_MBM_LOCAL_EVENT_ID:
+ msr_index = 1;
+ break;
+ default:
+ /* Not expected to come here */
+ return;
+ }
+
+ rdmsr(MSR_IA32_EVT_CFG_BASE + msr_index, mon_info->mon_config, h);
+}
+
+void mondata_config_read(struct rdt_domain *d, struct mon_config_info *mon_info)
+{
+ smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1);
+}
+
+unsigned int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
+{
+ struct mon_config_info mon_info = {0};
+ struct rdt_domain *dom;
+ bool sep = false;
+
+ list_for_each_entry(dom, &r->domains, list) {
+ if (sep)
+ seq_puts(s, ";");
+
+ mon_info.evtid = evtid;
+ mondata_config_read(dom, &mon_info);
+
+ seq_printf(s, "%d:0x%02x", dom->id, mon_info.mon_config);
+ sep = true;
+ }
+ seq_puts(s, "\n");
+
+ return 0;
+}
+
+static int mbm_total_config_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct rdt_resource *r = of->kn->parent->priv;
+
+ mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
+
+ return 0;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -1518,6 +1579,12 @@ static struct rftype res_common_files[] = {
.seq_show = max_threshold_occ_show,
.fflags = RF_MON_INFO | RFTYPE_RES_CACHE,
},
+ {
+ .name = "mbm_total_config",
+ .mode = 0644,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = mbm_total_config_show,
+ },
{
.name = "cpus",
.mode = 0644,
@@ -1624,6 +1691,15 @@ void __init thread_throttle_mode_init(void)
rft->fflags = RF_CTRL_INFO | RFTYPE_RES_MB;
}

+void __init mbm_config_rftype_init(void)
+{
+ struct rftype *rft;
+
+ rft = rdtgroup_get_rftype_by_name("mbm_total_config");
+ if (rft)
+ rft->fflags = RF_MON_INFO | RFTYPE_RES_CACHE;
+}
+
/**
* rdtgroup_kn_mode_restrict - Restrict user access to named resctrl file
* @r: The resource group with which the file is associated.


2022-09-27 20:52:36

by Moger, Babu

Subject: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Update the documentation for the new features:
1. Slow Memory Bandwidth allocation (SMBA).
With this feature, the QOS enforcement policies can be applied
to the external slow memory connected to the host. QOS enforcement
is accomplished by assigning a Class Of Service (COS) to a processor
and specifying allocations or limits for that COS for each resource
to be allocated.

2. Bandwidth Monitoring Event Configuration (BMEC).
The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
are set to count all the total and local reads/writes respectively.
With the introduction of slow memory, the two counters are not
enough to count all the different types of memory events. With the
BMEC feature, users have the option to configure mbm_total_bytes
and mbm_local_bytes to count specific types of events.

Also add configuration instructions with examples.
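
The resctrl.rst text added by this patch expresses AMD bandwidth limits in
1/8 GB/s increments (2 GB/s is written as 16, 8 GB/s as 64). A small sketch
of that conversion follows. The macro and function names are hypothetical
and only illustrate the arithmetic.

```c
#include <stdint.h>

/* One schemata unit is 1/8 GB/s, so 8 units make up 1 GB/s. */
#define BW_UNITS_PER_GBPS 8u

/* Convert a limit in whole GB/s into the value written to schemata. */
uint32_t gbps_to_schemata(uint32_t gbps)
{
    return gbps * BW_UNITS_PER_GBPS;
}

/* Whole GB/s represented by a schemata value (fraction dropped). */
uint32_t schemata_to_gbps(uint32_t units)
{
    return units / BW_UNITS_PER_GBPS;
}

/* Remaining eighths of a GB/s after the whole part is removed. */
uint32_t schemata_eighths(uint32_t units)
{
    return units % BW_UNITS_PER_GBPS;
}
```

With these helpers, the value 2048 shown in the schemata examples
corresponds to 256 GB/s.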

Signed-off-by: Babu Moger <[email protected]>
---
Documentation/x86/resctrl.rst | 130 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 128 insertions(+), 2 deletions(-)

diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 71a531061e4e..b4fe54f219b6 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -17,14 +17,16 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
flag bits:

-============================================= ================================
+=============================================== ================================
RDT (Resource Director Technology) Allocation "rdt_a"
CAT (Cache Allocation Technology) "cat_l3", "cat_l2"
CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2"
CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc"
MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local"
MBA (Memory Bandwidth Allocation) "mba"
-============================================= ================================
+SMBA (Slow Memory Bandwidth Allocation) "smba"
+BMEC (Bandwidth Monitoring Event Configuration) "bmec"
+=============================================== ================================

To use the feature mount the file system::

@@ -161,6 +163,73 @@ with the following files:
"mon_features":
Lists the monitoring events if
monitoring is enabled for the resource.
+ Example::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
+ llc_occupancy
+ mbm_total_bytes
+ mbm_local_bytes
+
+ If the system supports Bandwidth Monitoring Event
+ Configuration (BMEC), then the bandwidth events will
+ be configurable. The output will be::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
+ llc_occupancy
+ mbm_total_bytes
+ mbm_total_config
+ mbm_local_bytes
+ mbm_local_config
+
+"mbm_total_config", "mbm_local_config":
+ These files contain the current event configuration for the events
+ mbm_total_bytes and mbm_local_bytes, respectively, when the
+ Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
+ The event configuration settings are domain specific. Changing the
+ configuration on one CPU in a domain would affect the whole domain.
+
+ Following are the types of events supported:
+
+ ==== ========================================================
+ Bits Description
+ ==== ========================================================
+ 6 Dirty Victims from the QOS domain to all types of memory
+ 5 Reads to slow memory in the non-local NUMA domain
+ 4 Reads to slow memory in the local NUMA domain
+ 3 Non-temporal writes to non-local NUMA domain
+ 2 Non-temporal writes to local NUMA domain
+ 1 Reads to memory in the non-local NUMA domain
+ 0 Reads to memory in the local NUMA domain
+ ==== ========================================================
+
+ By default, the mbm_total_bytes configuration is set to 0x7f to count
+ all the event types and the mbm_local_bytes configuration is set to
+ 0x15 to count all the local memory events.
+
+ Example::
+
+ To view the current configuration, run the command.
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
+ 0:0x7f;1:0x7f;2:0x7f;3:0x7f
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
+ 0:0x15;1:0x15;3:0x15;4:0x15
+
+ To change the mbm_total_bytes to count only reads on domain 0,
+ run the command. The bits 0, 1, 4 and 5 need to be set.
+
+ # echo "0:0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
+ 0:0x33;1:0x7f;2:0x7f;3:0x7f
+
+ To change the mbm_local_bytes to count all the slow memory reads on
+ domain 1, run the command. The bits 4 and 5 need to be set.
+
+ # echo "1:0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
+ 0:0x15;1:0x30;3:0x15;4:0x15

"max_threshold_occupancy":
Read/write file provides the largest value (in
@@ -264,6 +333,7 @@ When monitoring is enabled all MON groups will also contain:
the sum for all tasks in the CTRL_MON group and all tasks in
MON groups. Please see example section for more details on usage.

+
Resource allocation rules
-------------------------

@@ -464,6 +534,24 @@ Memory bandwidth domain is L3 cache.

MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...

+Slow Memory bandwidth Allocation (when supported)
+-------------------------------------------------
+Currently, CXL.memory is the only supported "slow" memory device.
+With the support of the SMBA feature, the hardware enables bandwidth
+allocation on the slow memory devices. If there are multiple slow
+memory devices in the system, then the throttling logic groups all
+the slow sources together and applies the limit on them as a whole.
+
+The presence of the SMBA feature (with CXL.memory) is independent
+of whether a slow memory device is actually present in the system.
+If there is no slow memory in the system, then setting an SMBA limit
+will have no impact on the performance of the system.
+
+Slow Memory b/w domain is L3 cache.
+::
+
+ SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
+
Reading/writing the schemata file
---------------------------------
Reading the schemata file will show the state of all resources
@@ -479,6 +567,44 @@ which you wish to change. E.g.
L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff

+Reading/writing the schemata file (on AMD systems)
+--------------------------------------------------
+Reading the schemata file will show the state of all resources
+on all domains. When writing the memory bandwidth allocation you
+only need to specify those values in an absolute number expressed
+in 1/8 GB/s increments. To allocate a bandwidth limit of 2 GB/s, you
+need to specify the value 16 (16 * 1/8 = 2). E.g.
+::
+
+ # cat schemata
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "MB:1=16" > schemata
+ # cat schemata
+ MB:0=2048;1= 16;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+Reading/writing the schemata file (on AMD systems) with slow memory
+-------------------------------------------------------------------
+Reading the schemata file will show the state of all resources
+on all domains. When writing the memory bandwidth allocation you
+only need to specify those values in an absolute number expressed
+in 1/8 GB/s increments. To allocate a bandwidth limit of 8 GB/s, you
+need to specify the value 64 (64 * 1/8 = 8). E.g.
+::
+
+ # cat schemata
+ SMBA:0=2048;1=2048;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "SMBA:1=64" > schemata
+ # cat schemata
+ SMBA:0=2048;1= 64;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
Cache Pseudo-Locking
====================
CAT enables a user to specify the amount of cache space that an


2022-09-27 21:14:04

by Moger, Babu

Subject: [PATCH v5 04/12] x86/resctrl: Include new features in command line options

Add the command line options to disable the new features.
smba : Slow Memory Bandwidth Allocation
bmec : Bandwidth Monitoring Event Configuration
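
The "rdt=" parameter takes a comma-separated list in which a leading '!'
turns a feature off, as in the "rdt=cmt,!mba" example in
kernel-parameters.txt. The following userspace sketch models that parsing
for the option names listed there, including the two new ones. It is an
illustration under those assumptions, not the kernel's parser, and the
struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One entry per option name accepted after "rdt=". */
struct rdt_opt {
    const char *name;
    bool force_on;
    bool force_off;
};

static struct rdt_opt rdt_opts[] = {
    { "cmt" }, { "mbmtotal" }, { "mbmlocal" }, { "l3cat" }, { "l3cdp" },
    { "l2cat" }, { "l2cdp" }, { "mba" }, { "smba" }, { "bmec" },
};

#define NUM_OPTS (sizeof(rdt_opts) / sizeof(rdt_opts[0]))

/* Look up an option by its full name; returns NULL if unknown. */
struct rdt_opt *rdt_find_opt(const char *name, size_t len)
{
    for (size_t i = 0; i < NUM_OPTS; i++) {
        if (strlen(rdt_opts[i].name) == len &&
            strncmp(rdt_opts[i].name, name, len) == 0)
            return &rdt_opts[i];
    }
    return NULL;
}

/* Parse the text that follows "rdt=", e.g. "smba,!bmec". */
void rdt_parse(const char *arg)
{
    while (*arg) {
        size_t len = strcspn(arg, ",");   /* length of this token */
        bool off = (*arg == '!');         /* '!' prefix disables */
        struct rdt_opt *o = rdt_find_opt(arg + off, len - off);

        if (o) {
            if (off)
                o->force_off = true;
            else
                o->force_on = true;
        }

        arg += len;
        if (*arg == ',')
            arg++;
    }
}
```

Unknown names are silently ignored in this sketch; whether a real parser
should warn about them is a separate design choice.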

Signed-off-by: Babu Moger <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 2 +-
arch/x86/kernel/cpu/resctrl/core.c | 4 ++++
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 426fa892d311..71b397cc776c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5169,7 +5169,7 @@
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
- mba.
+ mba, smba, bmec.
E.g. to turn on cmt and turn off mba use:
rdt=cmt,!mba

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a7e9aabff8c8..53fbc3acad81 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -700,6 +700,8 @@ enum {
RDT_FLAG_L2_CAT,
RDT_FLAG_L2_CDP,
RDT_FLAG_MBA,
+ RDT_FLAG_SMBA,
+ RDT_FLAG_BMEC,
};

#define RDT_OPT(idx, n, f) \
@@ -723,6 +725,8 @@ static struct rdt_options rdt_options[] __initdata = {
RDT_OPT(RDT_FLAG_L2_CAT, "l2cat", X86_FEATURE_CAT_L2),
RDT_OPT(RDT_FLAG_L2_CDP, "l2cdp", X86_FEATURE_CDP_L2),
RDT_OPT(RDT_FLAG_MBA, "mba", X86_FEATURE_MBA),
+ RDT_OPT(RDT_FLAG_SMBA, "smba", X86_FEATURE_SMBA),
+ RDT_OPT(RDT_FLAG_BMEC, "bmec", X86_FEATURE_BMEC),
};
#define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)



2022-09-27 21:23:43

by Moger, Babu

Subject: [PATCH v5 11/12] x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()

The call on_each_cpu_mask() can run the function on each CPU specified
by cpumask, which may include the local processor. So, replace the call
smp_call_function_many() with on_each_cpu_mask() to simplify the code.
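
The difference between the two calls can be modeled in plain C. The sketch
below is a userspace model, not kernel code: CPUs are bits in a 64-bit mask,
"self" stands for the calling CPU, and all function names are hypothetical.
It shows why the removed open-coded pattern (test the local CPU, then call
smp_call_function_many()) visits the same CPUs as a single
on_each_cpu_mask() call.

```c
#include <stdint.h>

typedef void (*cpu_fn)(int cpu, void *info);

/* Model of smp_call_function_many(): skips the calling CPU. */
void model_call_function_many(uint64_t mask, int self, cpu_fn fn, void *info)
{
    for (int cpu = 0; cpu < 64; cpu++) {
        if (((mask >> cpu) & 1) && cpu != self)
            fn(cpu, info);
    }
}

/* Model of on_each_cpu_mask(): includes the calling CPU if it is in mask. */
void model_on_each_cpu_mask(uint64_t mask, cpu_fn fn, void *info)
{
    for (int cpu = 0; cpu < 64; cpu++) {
        if ((mask >> cpu) & 1)
            fn(cpu, info);
    }
}

/* Model of the open-coded pattern this patch removes. */
void model_old_pattern(uint64_t mask, int self, cpu_fn fn, void *info)
{
    if ((mask >> self) & 1)
        fn(self, info);
    model_call_function_many(mask, self, fn, info);
}

/* Example callback: record each visited CPU in a bitmask. */
void record_cpu(int cpu, void *info)
{
    *(uint64_t *)info |= 1ull << cpu;
}
```

In this model the old pattern and the new call always produce the same set
of visited CPUs, while the bare "many" variant misses the caller.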

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 29 ++++++++---------------------
1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 87f3f8018c92..532bb0025054 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -325,12 +325,7 @@ static void update_cpu_closid_rmid(void *info)
static void
update_closid_rmid(const struct cpumask *cpu_mask, struct rdtgroup *r)
{
- int cpu = get_cpu();
-
- if (cpumask_test_cpu(cpu, cpu_mask))
- update_cpu_closid_rmid(r);
- smp_call_function_many(cpu_mask, update_cpu_closid_rmid, r, 1);
- put_cpu();
+ on_each_cpu_mask(cpu_mask, update_cpu_closid_rmid, r, 1);
}

static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
@@ -2121,13 +2116,9 @@ static int set_cache_qos_cfg(int level, bool enable)
/* Pick one CPU from each domain instance to update MSR */
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
}
- cpu = get_cpu();
- /* Update QOS_CFG MSR on this cpu if it's in cpu_mask. */
- if (cpumask_test_cpu(cpu, cpu_mask))
- update(&enable);
- /* Update QOS_CFG MSR on all other cpus in cpu_mask. */
- smp_call_function_many(cpu_mask, update, &enable, 1);
- put_cpu();
+
+ /* Update QOS_CFG MSR on all the CPUs in cpu_mask */
+ on_each_cpu_mask(cpu_mask, update, &enable, 1);

free_cpumask_var(cpu_mask);

@@ -2569,7 +2560,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
- int i, cpu;
+ int i;

if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
@@ -2590,13 +2581,9 @@ static int reset_all_ctrls(struct rdt_resource *r)
for (i = 0; i < hw_res->num_closid; i++)
hw_dom->ctrl_val[i] = r->default_ctrl;
}
- cpu = get_cpu();
- /* Update CBM on this cpu if it's in cpu_mask. */
- if (cpumask_test_cpu(cpu, cpu_mask))
- rdt_ctrl_update(&msr_param);
- /* Update CBM on all other cpus in cpu_mask. */
- smp_call_function_many(cpu_mask, rdt_ctrl_update, &msr_param, 1);
- put_cpu();
+
+ /* Update CBM on all the CPUs in cpu_mask */
+ on_each_cpu_mask(cpu_mask, rdt_ctrl_update, &msr_param, 1);

free_cpumask_var(cpu_mask);



2022-09-27 21:23:51

by Moger, Babu

Subject: [PATCH v5 03/12] x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag

Newer AMD processors support the new feature Bandwidth Monitoring Event
Configuration (BMEC).

The feature support is identified via CPUID Fn8000_0020_EBX_x0 (ECX=0).
Bits    Field Name    Description
3       EVT_CFG       Bandwidth Monitoring Event Configuration (BMEC)
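
Testing for the feature comes down to checking one bit of EBX from CPUID
leaf 0x80000020, subleaf 0. The sketch below takes the EBX value as a
parameter so that it stays independent of how CPUID is executed. The macro
and function names are hypothetical. The SMBA bit position (2) is taken from
the scattered.c context lines in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define QOS_EXT_CPUID_LEAF  0x80000020u  /* CPUID Fn8000_0020, ECX=0 */
#define QOS_EBX_SMBA_BIT    2            /* Slow Memory Bandwidth Allocation */
#define QOS_EBX_BMEC_BIT    3            /* EVT_CFG: BMEC */

/* True if the given EBX value advertises BMEC. */
bool qos_ebx_has_bmec(uint32_t ebx)
{
    return (ebx >> QOS_EBX_BMEC_BIT) & 1u;
}

/* True if the given EBX value advertises SMBA. */
bool qos_ebx_has_smba(uint32_t ebx)
{
    return (ebx >> QOS_EBX_SMBA_BIT) & 1u;
}
```

On GCC or Clang, the EBX value could be obtained in userspace with
__get_cpuid_count() from <cpuid.h>; that step is left out here so the
sketch does not depend on the host CPU.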

Currently, the bandwidth monitoring events mbm_total_bytes and
mbm_local_bytes are set to count all the total and local reads/writes
respectively. With the introduction of slow memory, the two counters
are not enough to count all the different types of memory events. With
the feature BMEC, the users have the option to configure
mbm_total_bytes and mbm_local_bytes to count the specific type of
events.

Each BMEC event has a configuration MSR, QOS_EVT_CFG (0xc000_0400h +
EventID) which contains one field for each Bandwidth Type that can be
used to configure the bandwidth event to track any combination of
supported bandwidth types. The event will count requests from every
Bandwidth Type bit that is set in the corresponding configuration
register.
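
As a rough sketch of the register layout described above: the read-side
patch in this series selects the MSR by adding an index of 0 for the total
event and 1 for the local event to the base 0xc0000400. The enum and
function names below are hypothetical, and the code only computes addresses
and tests bits. It does not access any MSR.

```c
#include <stdint.h>

#define EVT_CFG_BASE 0xc0000400u

/* Hypothetical stand-ins for the two configurable MBM events. */
enum mbm_event {
    MBM_TOTAL,
    MBM_LOCAL,
};

/* MSR address holding the configuration for an event, or 0 if unknown. */
uint32_t evt_cfg_msr(enum mbm_event ev)
{
    switch (ev) {
    case MBM_TOTAL:
        return EVT_CFG_BASE + 0;
    case MBM_LOCAL:
        return EVT_CFG_BASE + 1;
    }
    return 0;
}

/* An event counts a bandwidth type only if that type's bit is set. */
int evt_counts_type(uint32_t cfg, unsigned int type_bit)
{
    return (cfg >> type_bit) & 1u;
}
```

For example, a configuration of 0x15 counts type bit 2 (non-temporal writes
to the local NUMA domain) but not type bit 1 (reads to the non-local NUMA
domain).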

Following are the types of events supported:

==== ========================================================
Bits Description
==== ========================================================
6    Dirty Victims from the QOS domain to all types of memory
5    Reads to slow memory in the non-local NUMA domain
4    Reads to slow memory in the local NUMA domain
3    Non-temporal writes to non-local NUMA domain
2    Non-temporal writes to local NUMA domain
1    Reads to memory in the non-local NUMA domain
0    Reads to memory in the local NUMA domain
==== ========================================================

By default, the mbm_total_bytes configuration is set to 0x7F to count
all the event types and the mbm_local_bytes configuration is set to
0x15 to count all the local memory events.

Feature description is available in the specification, "AMD64
Technology Platform Quality of Service Extensions, Publication
# 56375, Revision: 1.03, Issue Date: February 2022".

Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kernel/cpu/scattered.c | 1 +
3 files changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 349852b9daa4..896226c5470b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -305,6 +305,7 @@
#define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
#define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */
#define X86_FEATURE_SMBA (11*32+18) /* Slow Memory Bandwidth Allocation */
+#define X86_FEATURE_BMEC (11*32+19) /* AMD Bandwidth Monitoring Event Configuration (BMEC) */

/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index c881bcafba7d..4555f9596ccf 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -68,6 +68,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_CQM_OCCUP_LLC, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
+ { X86_FEATURE_BMEC, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
{ X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
{ X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 885ecf46abb2..7981df0b910e 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -45,6 +45,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
{ X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
+ { X86_FEATURE_BMEC, CPUID_EBX, 3, 0x80000020, 0 },
{ X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
{ 0, 0, 0, 0, 0 }
};


2022-09-27 21:27:09

by Moger, Babu

Subject: [PATCH v5 10/12] x86/resctrl: Add sysfs interface to write mbm_local_bytes event configuration

The current event configuration can be changed by the user by writing
to the configuration file /sys/fs/resctrl/info/L3_MON/mbm_local_config.
The event configuration settings are domain specific. Changing the
configuration on one CPU in a domain would affect the whole domain.

Following are the types of events supported:
==== ===========================================================
Bits Description
==== ===========================================================
6    Dirty Victims from the QOS domain to all types of memory
5    Reads to slow memory in the non-local NUMA domain
4    Reads to slow memory in the local NUMA domain
3    Non-temporal writes to non-local NUMA domain
2    Non-temporal writes to local NUMA domain
1    Reads to memory in the non-local NUMA domain
0    Reads to memory in the local NUMA domain
==== ===========================================================

For example:
To change the mbm_local_bytes to count all the reads on domain 0,
run the command.
$echo 0:0x33 > /sys/fs/resctrl/info/L3_MON/mbm_local_config

To change the mbm_local_bytes to count all the slow memory reads on
domain 1, run the command.
$echo 1:0x30 > /sys/fs/resctrl/info/L3_MON/mbm_local_config

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index c1d43d03846a..87f3f8018c92 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1629,6 +1629,26 @@ static ssize_t mbm_total_config_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}

+static ssize_t mbm_local_config_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes,
+ loff_t off)
+{
+ struct rdt_resource *r = of->kn->parent->priv;
+ int ret;
+
+ /* Valid input requires a trailing newline */
+ if (nbytes == 0 || buf[nbytes - 1] != '\n')
+ return -EINVAL;
+
+ rdt_last_cmd_clear();
+
+ buf[nbytes - 1] = '\0';
+
+ ret = mon_config_parse(r, buf, QOS_L3_MBM_LOCAL_EVENT_ID);
+
+ return ret ?: nbytes;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
@@ -1739,6 +1759,7 @@ static struct rftype res_common_files[] = {
.mode = 0644,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = mbm_local_config_show,
+ .write = mbm_local_config_write,
},
{
.name = "cpus",
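
A note on the write handler in this patch: it rejects any input whose last byte is not a newline, which `echo` supplies automatically. The following is a minimal user-space sketch of producing such a write from C. The helper names format_mbm_config() and write_mbm_config() are hypothetical and not part of the patch, and the "<domain>:<mask>" syntax is the one used in this v5 commit message.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): format "<domain>:0x<mask>\n".
 * The trailing newline matters because mbm_local_config_write() returns
 * -EINVAL when the last byte written is not '\n'.
 * Returns the number of bytes to write, or -1 if the buffer is too small.
 */
int format_mbm_config(char *buf, size_t len, unsigned int domain,
		      unsigned int mask)
{
	int n = snprintf(buf, len, "%u:0x%x\n", domain, mask);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/* Write one domain's event mask to a resctrl config file (path is caller-supplied). */
int write_mbm_config(const char *path, unsigned int domain, unsigned int mask)
{
	char buf[32];
	int n = format_mbm_config(buf, sizeof(buf), domain, mask);
	FILE *f;
	int ret = 0;

	if (n < 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fwrite(buf, 1, (size_t)n, f) != (size_t)n)
		ret = -1;
	if (fclose(f) != 0)
		ret = -1;
	return ret;
}
```

For instance, write_mbm_config("/sys/fs/resctrl/info/L3_MON/mbm_local_config", 1, 0x30) would correspond to the second echo command in the commit message.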


2022-09-27 21:27:19

by Moger, Babu

Subject: [PATCH v5 06/12] x86/resctrl: Introduce data structure to support monitor configuration

Add couple of fields in mon_evt to support Bandwidth Monitoring Event
Configuratio (BMEC) and also update the "mon_features".

The sysfs file "mon_features" will display the monitor configuration if
supported.

Before the change.
$cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_local_bytes

After the change if BMEC is supported.
$cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_total_config
mbm_local_bytes
mbm_local_config

Signed-off-by: Babu Moger <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 3 ++-
arch/x86/kernel/cpu/resctrl/internal.h | 6 +++++-
arch/x86/kernel/cpu/resctrl/monitor.c | 9 ++++++++-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++++-
4 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 56c96607259c..513e6a00f58e 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -849,6 +849,7 @@ static __init bool get_rdt_alloc_resources(void)
static __init bool get_rdt_mon_resources(void)
{
struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ bool mon_configurable = rdt_cpu_has(X86_FEATURE_BMEC);

if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC))
rdt_mon_features |= (1 << QOS_L3_OCCUP_EVENT_ID);
@@ -860,7 +861,7 @@ static __init bool get_rdt_mon_resources(void)
if (!rdt_mon_features)
return false;

- return !rdt_get_mon_l3_config(r);
+ return !rdt_get_mon_l3_config(r, mon_configurable);
}

static __init void __check_quirks_intel(void)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c049a274383c..4d03f443b353 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -72,11 +72,15 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
* struct mon_evt - Entry in the event list of a resource
* @evtid: event id
* @name: name of the event
+ * @configurable: true if the event is configurable
+ * @config_name: sysfs file name of the event if configurable
* @list: entry in &rdt_resource->evt_list
*/
struct mon_evt {
u32 evtid;
char *name;
+ bool configurable;
+ char *config_name;
struct list_head list;
};

@@ -529,7 +533,7 @@ int closids_supported(void);
void closid_free(int closid);
int alloc_rmid(void);
void free_rmid(u32 rmid);
-int rdt_get_mon_l3_config(struct rdt_resource *r);
+int rdt_get_mon_l3_config(struct rdt_resource *r, bool configurable);
void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
void rmdir_mondata_subdir_allrdtgrp(struct rdt_resource *r,
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index eaf25a234ff5..dc97aa7a3b3d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -656,11 +656,13 @@ static struct mon_evt llc_occupancy_event = {
static struct mon_evt mbm_total_event = {
.name = "mbm_total_bytes",
.evtid = QOS_L3_MBM_TOTAL_EVENT_ID,
+ .config_name = "mbm_total_config",
};

static struct mon_evt mbm_local_event = {
.name = "mbm_local_bytes",
.evtid = QOS_L3_MBM_LOCAL_EVENT_ID,
+ .config_name = "mbm_local_config",
};

/*
@@ -682,7 +684,7 @@ static void l3_mon_evt_init(struct rdt_resource *r)
list_add_tail(&mbm_local_event.list, &r->evt_list);
}

-int rdt_get_mon_l3_config(struct rdt_resource *r)
+int rdt_get_mon_l3_config(struct rdt_resource *r, bool configurable)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
@@ -714,6 +716,11 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
if (ret)
return ret;

+ if (configurable) {
+ mbm_total_event.configurable = true;
+ mbm_local_event.configurable = true;
+ }
+
l3_mon_evt_init(r);

r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 04b519bca50d..834a55d78e3f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1001,8 +1001,11 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
struct rdt_resource *r = of->kn->parent->priv;
struct mon_evt *mevt;

- list_for_each_entry(mevt, &r->evt_list, list)
+ list_for_each_entry(mevt, &r->evt_list, list) {
seq_printf(seq, "%s\n", mevt->name);
+ if (mevt->configurable)
+ seq_printf(seq, "%s\n", mevt->config_name);
+ }

return 0;
}


2022-09-27 22:37:10

by Fenghua Yu

Subject: RE: [PATCH v5 06/12] x86/resctrl: Introduce data structure to support monitor configuration

Hi, Babu,

> Add couple of fields in mon_evt to support Bandwidth Monitoring Event
> Configuratio (BMEC) and also update the "mon_features".

s/Configuratio/ Configuration/

>
> The sysfs file "mon_features" will display the monitor configuration if supported.
>
> Before the change.
> $cat /sys/fs/resctrl/info/L3_MON/mon_features
> llc_occupancy
> mbm_total_bytes
> mbm_local_bytes
>
> After the change if BMEC is supported.
> $cat /sys/fs/resctrl/info/L3_MON/mon_features
> llc_occupancy
> mbm_total_bytes
> mbm_total_config
> mbm_local_bytes
> mbm_local_config
>
> Signed-off-by: Babu Moger <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 3 ++-
> arch/x86/kernel/cpu/resctrl/internal.h | 6 +++++-
> arch/x86/kernel/cpu/resctrl/monitor.c | 9 ++++++++-
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++++-
> 4 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c
> b/arch/x86/kernel/cpu/resctrl/core.c
> index 56c96607259c..513e6a00f58e 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -849,6 +849,7 @@ static __init bool get_rdt_alloc_resources(void) static
> __init bool get_rdt_mon_resources(void) {
> struct rdt_resource *r =
> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> + bool mon_configurable = rdt_cpu_has(X86_FEATURE_BMEC);
>
> if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC))
> rdt_mon_features |= (1 << QOS_L3_OCCUP_EVENT_ID); @@ -
> 860,7 +861,7 @@ static __init bool get_rdt_mon_resources(void)
> if (!rdt_mon_features)
> return false;
>
> - return !rdt_get_mon_l3_config(r);
> + return !rdt_get_mon_l3_config(r, mon_configurable);
> }
>
> static __init void __check_quirks_intel(void) diff --git
> a/arch/x86/kernel/cpu/resctrl/internal.h
> b/arch/x86/kernel/cpu/resctrl/internal.h
> index c049a274383c..4d03f443b353 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -72,11 +72,15 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
> * struct mon_evt - Entry in the event list of a resource
> * @evtid: event id
> * @name: name of the event
> + * @configurable: true if the event is configurable
> + * @config_name: sysfs file name of the event if configurable
> * @list: entry in &rdt_resource->evt_list
> */
> struct mon_evt {
> u32 evtid;
> char *name;
> + bool configurable;
> + char *config_name;

It seems config_name is only used for display in mon_features. Is the field necessary?

> struct list_head list;
> };
>
> @@ -529,7 +533,7 @@ int closids_supported(void); void closid_free(int closid);
> int alloc_rmid(void); void free_rmid(u32 rmid); -int
> rdt_get_mon_l3_config(struct rdt_resource *r);
> +int rdt_get_mon_l3_config(struct rdt_resource *r, bool configurable);
> void mon_event_count(void *info);
> int rdtgroup_mondata_show(struct seq_file *m, void *arg); void
> rmdir_mondata_subdir_allrdtgrp(struct rdt_resource *r, diff --git
> a/arch/x86/kernel/cpu/resctrl/monitor.c
> b/arch/x86/kernel/cpu/resctrl/monitor.c
> index eaf25a234ff5..dc97aa7a3b3d 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -656,11 +656,13 @@ static struct mon_evt llc_occupancy_event = { static
> struct mon_evt mbm_total_event = {
> .name = "mbm_total_bytes",
> .evtid = QOS_L3_MBM_TOTAL_EVENT_ID,
> + .config_name = "mbm_total_config",
> };

struct mon_evt mbm_total_config_event = {
	.name = "mbm_total_config",

>
> static struct mon_evt mbm_local_event = {
> .name = "mbm_local_bytes",
> .evtid = QOS_L3_MBM_LOCAL_EVENT_ID,
> + .config_name = "mbm_local_config",
> };
>
> /*
> @@ -682,7 +684,7 @@ static void l3_mon_evt_init(struct rdt_resource *r)
> list_add_tail(&mbm_local_event.list, &r->evt_list); }
>
> -int rdt_get_mon_l3_config(struct rdt_resource *r)
> +int rdt_get_mon_l3_config(struct rdt_resource *r, bool configurable)
> {
> unsigned int mbm_offset =
> boot_cpu_data.x86_cache_mbm_width_offset;
> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); @@ -714,6
> +716,11 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
> if (ret)
> return ret;
>
> + if (configurable) {
> + mbm_total_event.configurable = true;
> + mbm_local_event.configurable = true;
> + }
> +
> l3_mon_evt_init(r);
>
> r->mon_capable = true;
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 04b519bca50d..834a55d78e3f 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1001,8 +1001,11 @@ static int rdt_mon_features_show(struct
> kernfs_open_file *of,
> struct rdt_resource *r = of->kn->parent->priv;
> struct mon_evt *mevt;
>
> - list_for_each_entry(mevt, &r->evt_list, list)
> + list_for_each_entry(mevt, &r->evt_list, list) {
> seq_printf(seq, "%s\n", mevt->name);
> + if (mevt->configurable)
> + seq_printf(seq, "%s\n", mevt->config_name);

If "config_name" is not defined, it could be:
if (mevt->configurable)
	seq_printf(seq, "%s_config\n", mevt->name);
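
To illustrate what the suggested format string would print, here is a small user-space model. It is only a sketch: struct mon_evt_model and show_mon_feature() are hypothetical stand-ins for the kernel's struct mon_evt and the seq_printf() loop, with snprintf() used in place of seq_printf().

```c
#include <stdio.h>

/* Hypothetical user-space stand-in for the kernel's struct mon_evt. */
struct mon_evt_model {
	const char *name;
	int configurable;
};

/*
 * Emit the mon_features lines for one event into buf, deriving the
 * config name as "<name>_config" instead of storing a config_name field.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int show_mon_feature(char *buf, size_t len, const struct mon_evt_model *evt)
{
	int n = snprintf(buf, len, "%s\n", evt->name);

	if (n < 0 || (size_t)n >= len)
		return -1;

	if (evt->configurable) {
		size_t left = len - (size_t)n;
		int m = snprintf(buf + n, left, "%s_config\n", evt->name);

		if (m < 0 || (size_t)m >= left)
			return -1;
		n += m;
	}
	return n;
}
```

Note that with name set to "mbm_total_bytes" this prints "mbm_total_bytes_config", not the "mbm_total_config" shown in the commit message, so dropping config_name this way would also change the name listed in mon_features.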

Thanks.

-Fenghua

2022-09-27 22:57:16

by Fenghua Yu

Subject: RE: [PATCH v5 08/12] x86/resctrl: Add sysfs interface to read mbm_local_bytes event configuration

Hi, Babu,

> By default, the mbm_local_bytes configuration is set to 0x15 to count all the
> local event types. The event configuration settings are domain specific.
> Changing the configuration on one CPU in a domain would affect the whole
> domain.
>
> For example:
> $cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
> 0:0x15;1:0x15;2:0x15;3:0x15

The schemata file uses the format "id0=val0;id1=val1;...". Maybe it would be
better to use a similar format here, e.g. 0=0x15;1=0x15;2=0x15;3=0x15, so that
formats are uniform across resctrl.
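
As a rough illustration of the "id=val;id=val" format proposed here, the sketch below parses such a string in user space. parse_domain_list() is a hypothetical name; this is not the kernel's parser, only one possible way to read the format.

```c
#include <stdlib.h>

/*
 * Hypothetical parser for "id0=val0;id1=val1;...".
 * Ids are decimal; values may be decimal or 0x-prefixed hex.
 * Fills ids[] and vals[] with up to max entries.
 * Returns the number of entries parsed, or -1 on malformed input.
 */
int parse_domain_list(const char *s, unsigned long *ids,
		      unsigned long *vals, int max)
{
	int count = 0;
	char *end;

	while (*s) {
		if (count == max)
			return -1;

		/* Domain id, terminated by '='. */
		ids[count] = strtoul(s, &end, 10);
		if (end == s || *end != '=')
			return -1;
		s = end + 1;

		/* Value; base 0 lets strtoul accept "0x15" as well as "21". */
		vals[count] = strtoul(s, &end, 0);
		if (end == s)
			return -1;
		count++;

		/* Entries are separated by ';'; anything else is an error. */
		if (*end == ';')
			end++;
		else if (*end != '\0')
			return -1;
		s = end;
	}
	return count;
}
```

With this sketch, the v5 output "0:0x15;1:0x15" is rejected because ':' is not accepted as the separator, which shows that readers written for one format would not accept the other.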

Thanks.

-Fenghua

2022-09-28 04:46:23

by Bagas Sanjaya

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

On Tue, Sep 27, 2022 at 03:27:00PM -0500, Babu Moger wrote:
> + Following are the types of events supported:
> +
> + ==== ========================================================
> + Bits Description
> + ==== ========================================================
> + 6 Dirty Victims from the QOS domain to all types of memory
> + 5 Reads to slow memory in the non-local NUMA domain
> + 4 Reads to slow memory in the local NUMA domain
> + 3 Non-temporal writes to non-local NUMA domain
> + 2 Non-temporal writes to local NUMA domain
> + 1 Reads to memory in the non-local NUMA domain
> + 0 Reads to memory in the local NUMA domain
> + ==== ========================================================
> +
> + By default, the mbm_total_bytes configuration is set to 0x7f to count
> + all the event types and the mbm_local_bytes configuration is set to
> + 0x15 to count all the local memory events.
> +
> + Example::
> +
> + To view the current configuration, run the command.
> + # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
> + 0:0x7f;1:0x7f;2:0x7f;3:0x7f
> +
> + # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
> + 0:0x15;1:0x15;3:0x15;4:0x15
> +
> + To change the mbm_total_bytes to count only reads on domain 0,
> + run the command. The bits 0,1,4 and 5 needs to set.
> +
> + # echo "0:0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config
> +
> + # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
> + 0:0x33;1:0x7f;2:0x7f;3:0x7f
> +
> + To change the mbm_local_bytes to count all the slow memory reads on
> + domain 1, run the command. The bits 4 and 5 needs to set.
> +
> + # echo "1:0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config
> +
> + # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
> + 0:0x15;1:0x30;3:0x15;4:0x15
>

Hi Babu,

The description text for each snippet above shouldn't be in the code
block. Also, split the block into three code blocks in a list:

---- >8 ----
diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index b4fe54f219b6f3..ec578b069276ce 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -206,25 +206,26 @@ with the following files:
all the event types and the mbm_local_bytes configuration is set to
0x15 to count all the local memory events.

- Example::
+ Examples:
+
+ * To view the current configuration::

- To view the current configuration, run the command.
# cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
0:0x7f;1:0x7f;2:0x7f;3:0x7f

# cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
0:0x15;1:0x15;3:0x15;4:0x15

- To change the mbm_total_bytes to count only reads on domain 0,
- run the command. The bits 0,1,4 and 5 needs to set.
+ * To change the mbm_total_bytes to count only reads on domain 0
+ (the bits 0, 1, 4 and 5 needs to be set)::

# echo "0:0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config

# cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
0:0x33;1:0x7f;2:0x7f;3:0x7f

- To change the mbm_local_bytes to count all the slow memory reads on
- domain 1, run the command. The bits 4 and 5 needs to set.
+ * To change the mbm_local_bytes to count all the slow memory reads on
+ domain 1 (the bits 4 and 5 needs to be set)::

# echo "1:0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config


Also, there is no description of how the bits in the supported events
table map to the value written to mbm_{total,local}_config.

> +Slow Memory b/w domain is L3 cache.
> +::
> +
> + SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
> +

What does b/w stand for in the context above?

> Reading/writing the schemata file
> ---------------------------------
> Reading the schemata file will show the state of all resources
> @@ -479,6 +567,44 @@ which you wish to change. E.g.
> L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
> L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
>
> +Reading/writing the schemata file (on AMD systems)
> +--------------------------------------------------
> +Reading the schemata file will show the state of all resources
> +on all domains. When writing the memory bandwidth allocation you
> +only need to specify those values in an absolute number expressed
> +in 1/8 GB/s increments. To allocate bandwidth limit of 2GB, you
> +need to specify the value 16 (16 * 1/8 = 2). E.g.
> <snipped>...
> +Reading the schemata file will show the state of all resources
> +on all domains. When writing the memory bandwidth allocation you
> +only need to specify those values in an absolute number expressed
> +in 1/8 GB/s increments. To allocate bandwidth limit of 8GB, you
> +need to specify the value 64 (64 * 1/8 = 8). E.g.

s/E.g./For example:/
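
The unit arithmetic in the quoted text (values in 1/8 GB/s steps, so 2 GB/s is 16 and 8 GB/s is 64) can be sketched as follows. The function names are hypothetical, and passing "MB" or "SMBA" as the resource name is an assumption about which schemata lines use this unit; only the "<resource>:<id>=<value>" shape is taken from the quoted documentation.

```c
#include <stdio.h>

/* The quoted text gives bandwidth values in 1/8 GB/s increments. */
#define SCHEMATA_UNITS_PER_GBPS 8u

/* Convert a whole-GB/s limit to the value written to schemata. */
unsigned int gbps_to_schemata(unsigned int gbps)
{
	return gbps * SCHEMATA_UNITS_PER_GBPS;
}

/*
 * Format one schemata line for a single domain, e.g. "SMBA:1=64\n".
 * The resource name is supplied by the caller (assumption, see above).
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int format_bw_line(char *buf, size_t len, const char *resource,
		   unsigned int cache_id, unsigned int gbps)
{
	int n = snprintf(buf, len, "%s:%u=%u\n", resource, cache_id,
			 gbps_to_schemata(gbps));

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```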

Thanks.

--
An old man doll... just what I always wanted! - Clara



2022-09-28 13:29:27

by Moger, Babu

Subject: RE: [PATCH v5 06/12] x86/resctrl: Introduce data structure to support monitor configuration


Hi Fenghua,

> -----Original Message-----
> From: Yu, Fenghua <[email protected]>
> Sent: Tuesday, September 27, 2022 5:25 PM
> To: Moger, Babu <[email protected]>; [email protected]; Chatre, Reinette
> <[email protected]>; [email protected]; [email protected];
> [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Bae, Chang Seok <[email protected]>;
> [email protected]; [email protected];
> [email protected]; Das1, Sandipan <[email protected]>;
> Luck, Tony <[email protected]>; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> Eranian, Stephane <[email protected]>
> Subject: RE: [PATCH v5 06/12] x86/resctrl: Introduce data structure to support
> monitor configuration
>
> Hi, Babu,
>
> > Add couple of fields in mon_evt to support Bandwidth Monitoring Event
> > Configuratio (BMEC) and also update the "mon_features".
>
> s/Configuratio/ Configuration/

Sure.
>
> >
> > The sysfs file "mon_features" will display the monitor configuration if
> supported.
> >
> > Before the change.
> > $cat /sys/fs/resctrl/info/L3_MON/mon_features
> > llc_occupancy
> > mbm_total_bytes
> > mbm_local_bytes
> >
> > After the change if BMEC is supported.
> > $cat /sys/fs/resctrl/info/L3_MON/mon_features
> > llc_occupancy
> > mbm_total_bytes
> > mbm_total_config
> > mbm_local_bytes
> > mbm_local_config
> >
> > Signed-off-by: Babu Moger <[email protected]>
> > ---
> > arch/x86/kernel/cpu/resctrl/core.c | 3 ++-
> > arch/x86/kernel/cpu/resctrl/internal.h | 6 +++++-
> > arch/x86/kernel/cpu/resctrl/monitor.c | 9 ++++++++-
> > arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++++-
> > 4 files changed, 19 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/resctrl/core.c
> > b/arch/x86/kernel/cpu/resctrl/core.c
> > index 56c96607259c..513e6a00f58e 100644
> > --- a/arch/x86/kernel/cpu/resctrl/core.c
> > +++ b/arch/x86/kernel/cpu/resctrl/core.c
> > @@ -849,6 +849,7 @@ static __init bool get_rdt_alloc_resources(void)
> > static __init bool get_rdt_mon_resources(void) {
> > struct rdt_resource *r =
> > &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> > + bool mon_configurable = rdt_cpu_has(X86_FEATURE_BMEC);
> >
> > if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC))
> > rdt_mon_features |= (1 << QOS_L3_OCCUP_EVENT_ID); @@ -
> > 860,7 +861,7 @@ static __init bool get_rdt_mon_resources(void)
> > if (!rdt_mon_features)
> > return false;
> >
> > - return !rdt_get_mon_l3_config(r);
> > + return !rdt_get_mon_l3_config(r, mon_configurable);
> > }
> >
> > static __init void __check_quirks_intel(void) diff --git
> > a/arch/x86/kernel/cpu/resctrl/internal.h
> > b/arch/x86/kernel/cpu/resctrl/internal.h
> > index c049a274383c..4d03f443b353 100644
> > --- a/arch/x86/kernel/cpu/resctrl/internal.h
> > +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> > @@ -72,11 +72,15 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
> > * struct mon_evt - Entry in the event list of a resource
> > * @evtid: event id
> > * @name: name of the event
> > + * @configurable: true if the event is configurable
> > + * @config_name: sysfs file name of the event if configurable
> > * @list: entry in &rdt_resource->evt_list
> > */
> > struct mon_evt {
> > u32 evtid;
> > char *name;
> > + bool configurable;
> > + char *config_name;
>
> Seems config_name is only used to be shown in mon_features. Is it necessary to
> have the field?

Sure. I can remove it.

>
> > struct list_head list;
> > };
> >
> > @@ -529,7 +533,7 @@ int closids_supported(void); void closid_free(int
> > closid); int alloc_rmid(void); void free_rmid(u32 rmid); -int
> > rdt_get_mon_l3_config(struct rdt_resource *r);
> > +int rdt_get_mon_l3_config(struct rdt_resource *r, bool configurable);
> > void mon_event_count(void *info);
> > int rdtgroup_mondata_show(struct seq_file *m, void *arg); void
> > rmdir_mondata_subdir_allrdtgrp(struct rdt_resource *r, diff --git
> > a/arch/x86/kernel/cpu/resctrl/monitor.c
> > b/arch/x86/kernel/cpu/resctrl/monitor.c
> > index eaf25a234ff5..dc97aa7a3b3d 100644
> > --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> > +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> > @@ -656,11 +656,13 @@ static struct mon_evt llc_occupancy_event = {
> > static struct mon_evt mbm_total_event = {
> > .name = "mbm_total_bytes",
> > .evtid = QOS_L3_MBM_TOTAL_EVENT_ID,
> > + .config_name = "mbm_total_config",
> > };
>
> Struct mon_evt mbm_total_config_event = {
> .name = "mbm_total_config",
>
> >
> > static struct mon_evt mbm_local_event = {
> > .name = "mbm_local_bytes",
> > .evtid = QOS_L3_MBM_LOCAL_EVENT_ID,
> > + .config_name = "mbm_local_config",
> > };
> >
> > /*
> > @@ -682,7 +684,7 @@ static void l3_mon_evt_init(struct rdt_resource *r)
> > list_add_tail(&mbm_local_event.list, &r->evt_list); }
> >
> > -int rdt_get_mon_l3_config(struct rdt_resource *r)
> > +int rdt_get_mon_l3_config(struct rdt_resource *r, bool configurable)
> > {
> > unsigned int mbm_offset =
> > boot_cpu_data.x86_cache_mbm_width_offset;
> > struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); @@ -714,6
> > +716,11 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
> > if (ret)
> > return ret;
> >
> > + if (configurable) {
> > + mbm_total_event.configurable = true;
> > + mbm_local_event.configurable = true;
> > + }
> > +
> > l3_mon_evt_init(r);
> >
> > r->mon_capable = true;
> > diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > index 04b519bca50d..834a55d78e3f 100644
> > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> > @@ -1001,8 +1001,11 @@ static int rdt_mon_features_show(struct
> > kernfs_open_file *of,
> > struct rdt_resource *r = of->kn->parent->priv;
> > struct mon_evt *mevt;
> >
> > - list_for_each_entry(mevt, &r->evt_list, list)
> > + list_for_each_entry(mevt, &r->evt_list, list) {
> > seq_printf(seq, "%s\n", mevt->name);
> > + if (mevt->configurable)
> > + seq_printf(seq, "%s\n", mevt->config_name);
>
> If "config_name" is not defined, it could be:
> If (mevt->configurable)
> Seq_printf(seq, "%s_config\n", mevt->name);
>
Sure. Thanks
Babu

2022-09-28 15:21:36

by Moger, Babu

Subject: Re: [PATCH v5 08/12] x86/resctrl: Add sysfs interface to read mbm_local_bytes event configuration

Hi Fenghua,

On 9/27/22 17:42, Yu, Fenghua wrote:
> Hi, Babu,
>
>> By default, the mbm_local_bytes configuration is set to 0x15 to count all the
>> local event types. The event configuration settings are domain specific.
>> Changing the configuration on one CPU in a domain would affect the whole
>> domain.
>>
>> For example:
>> $cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
>> 0:0x15;1:0x15;2:0x15;3:0x15
> Schemata has format: "id0=val0;id1=val1;...". Maybe it's better to use
> similar format here: 0=0x15;1=0x15;2=0x15;3=0x15? So we can have uniform formats across
> resctrl.

Sure. Will change it.

Thanks

Babu

2022-09-28 15:45:01

by Moger, Babu

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Hi Sanjaya,


On 9/27/22 23:25, Bagas Sanjaya wrote:
> On Tue, Sep 27, 2022 at 03:27:00PM -0500, Babu Moger wrote:
>> + Following are the types of events supported:
>> +
>> + ==== ========================================================
>> + Bits Description
>> + ==== ========================================================
>> + 6 Dirty Victims from the QOS domain to all types of memory
>> + 5 Reads to slow memory in the non-local NUMA domain
>> + 4 Reads to slow memory in the local NUMA domain
>> + 3 Non-temporal writes to non-local NUMA domain
>> + 2 Non-temporal writes to local NUMA domain
>> + 1 Reads to memory in the non-local NUMA domain
>> + 0 Reads to memory in the local NUMA domain
>> + ==== ========================================================
>> +
>> + By default, the mbm_total_bytes configuration is set to 0x7f to count
>> + all the event types and the mbm_local_bytes configuration is set to
>> + 0x15 to count all the local memory events.
>> +
>> + Example::
>> +
>> + To view the current configuration, run the command.
>> + # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
>> + 0:0x7f;1:0x7f;2:0x7f;3:0x7f
>> +
>> + # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
>> + 0:0x15;1:0x15;3:0x15;4:0x15
>> +
>> + To change the mbm_total_bytes to count only reads on domain 0,
>> + run the command. The bits 0,1,4 and 5 needs to set.
>> +
>> + # echo "0:0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config
>> +
>> + # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
>> + 0:0x33;1:0x7f;2:0x7f;3:0x7f
>> +
>> + To change the mbm_local_bytes to count all the slow memory reads on
>> + domain 1, run the command. The bits 4 and 5 needs to set.
>> +
>> + # echo "1:0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config
>> +
>> + # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
>> + 0:0x15;1:0x30;3:0x15;4:0x15
>>
> Hi Babu,
>
> The description text for each snippets above shouldn't in the code
> block. Also, split the block into three code blocks in the lists:
Did you mean that I need to remove the descriptive text from the code block?
>
> ---- >8 ----
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index b4fe54f219b6f3..ec578b069276ce 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -206,25 +206,26 @@ with the following files:
> all the event types and the mbm_local_bytes configuration is set to
> 0x15 to count all the local memory events.
>
> - Example::
> + Examples:
> +
> + * To view the current configuration::
>
> - To view the current configuration, run the command.
> # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
> 0:0x7f;1:0x7f;2:0x7f;3:0x7f
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
> 0:0x15;1:0x15;3:0x15;4:0x15
>
> - To change the mbm_total_bytes to count only reads on domain 0,
> - run the command. The bits 0,1,4 and 5 needs to set.
> + * To change the mbm_total_bytes to count only reads on domain 0
> + (the bits 0, 1, 4 and 5 needs to be set)::
>
> # echo "0:0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
> 0:0x33;1:0x7f;2:0x7f;3:0x7f
>
> - To change the mbm_local_bytes to count all the slow memory reads on
> - domain 1, run the command. The bits 4 and 5 needs to set.
> + * To change the mbm_local_bytes to count all the slow memory reads on
> + domain 1 (the bits 4 and 5 needs to be set)::
>
> # echo "1:0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config
>

Thanks for the diff. I cannot get this right for some reason. I will
probably send the diff before the final series.


>
> Also, there isn't description of mapping from bits from the supported events
> table to the bytes input for mbm_{total,local}_config.

It is already there. Is that not clear?

+        Following are the types of events supported:
+
+        ====    ========================================================
+        Bits    Description
+        ====    ========================================================
+        6       Dirty Victims from the QOS domain to all types of memory
+        5       Reads to slow memory in the non-local NUMA domain
+        4       Reads to slow memory in the local NUMA domain
+        3       Non-temporal writes to non-local NUMA domain
+        2       Non-temporal writes to local NUMA domain
+        1       Reads to memory in the non-local NUMA domain
+        0       Reads to memory in the local NUMA domain
+        ====    ========================================================


>
>> +Slow Memory b/w domain is L3 cache.
>> +::
>> +
>> + SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
>> +
> What b/w stands for in the context above?
b/w is bandwidth. I will correct it.
>
>> Reading/writing the schemata file
>> ---------------------------------
>> Reading the schemata file will show the state of all resources
>> @@ -479,6 +567,44 @@ which you wish to change. E.g.
>> L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
>> L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
>>
>> +Reading/writing the schemata file (on AMD systems)
>> +--------------------------------------------------
>> +Reading the schemata file will show the state of all resources
>> +on all domains. When writing the memory bandwidth allocation you
>> +only need to specify those values in an absolute number expressed
>> +in 1/8 GB/s increments. To allocate bandwidth limit of 2GB, you
>> +need to specify the value 16 (16 * 1/8 = 2). E.g.
>> <snipped>...
>> +Reading the schemata file will show the state of all resources
>> +on all domains. When writing the memory bandwidth allocation you
>> +only need to specify those values in an absolute number expressed
>> +in 1/8 GB/s increments. To allocate bandwidth limit of 8GB, you
>> +need to specify the value 64 (64 * 1/8 = 8). E.g.
> s/E.g./For example:/

Thanks

Babu Moger


2022-09-29 08:59:44

by Bagas Sanjaya

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

On 9/28/22 22:23, Moger, Babu wrote:
>> Hi Babu,
>>
>> The description text for each snippets above shouldn't in the code
>> block. Also, split the block into three code blocks in the lists:
> Did you mean, I need to remove similar texts from code?

I mean extracting the descriptions from the code block; see the diff below.

>>
>> ---- >8 ----
>> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
>> index b4fe54f219b6f3..ec578b069276ce 100644
>> --- a/Documentation/x86/resctrl.rst
>> +++ b/Documentation/x86/resctrl.rst
>> @@ -206,25 +206,26 @@ with the following files:
>> all the event types and the mbm_local_bytes configuration is set to
>> 0x15 to count all the local memory events.
>>
>> - Example::
>> + Examples:
>> +
>> + * To view the current configuration::
>>
>> - To view the current configuration, run the command.
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
>> 0:0x7f;1:0x7f;2:0x7f;3:0x7f
>>
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
>> 0:0x15;1:0x15;3:0x15;4:0x15
>>
>> - To change the mbm_total_bytes to count only reads on domain 0,
>> - run the command. The bits 0,1,4 and 5 needs to set.
>> + * To change the mbm_total_bytes to count only reads on domain 0
>> + (the bits 0, 1, 4 and 5 needs to be set)::
>>
>> # echo "0:0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config
>>
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
>> 0:0x33;1:0x7f;2:0x7f;3:0x7f
>>
>> - To change the mbm_local_bytes to count all the slow memory reads on
>> - domain 1, run the command. The bits 4 and 5 needs to set.
>> + * To change the mbm_local_bytes to count all the slow memory reads on
>> + domain 1 (the bits 4 and 5 needs to be set)::
>>
>> # echo "1:0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config
>>
>
> Thanks for the diff. I cannot get this right for some reason. I will
> probably send the diff before the final series.
>
>

OK.

>>
>> Also, there isn't description of mapping from bits from the supported events
>> table to the bytes input for mbm_{total,local}_config.
>
> It is already there. Is that not clear?

No. I don't see why setting bits 0, 1, 4, and 5 on domain 0 translates to
`0:0x33`, for example.

>>
>>> +Slow Memory b/w domain is L3 cache.
>>> +::
>>> +
>>> + SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
>>> +
>> What b/w stands for in the context above?
> b/w is bandwidth. I will correct it.

OK.

Thanks for replying.

--
An old man doll... just what I always wanted! - Clara

2022-09-29 13:37:58

by Moger, Babu

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Hi Sanjaya,

On 9/29/22 03:48, Bagas Sanjaya wrote:
> On 9/28/22 22:23, Moger, Babu wrote:
>>> Hi Babu,
>>>
>>> The description text for each snippets above shouldn't in the code
>>> block. Also, split the block into three code blocks in the lists:
>> Did you mean, I need to remove similar texts from code?
> I mean extracting code description from the code block, see the diff below.
>
>>> ---- >8 ----
>>> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
>>> index b4fe54f219b6f3..ec578b069276ce 100644
>>> --- a/Documentation/x86/resctrl.rst
>>> +++ b/Documentation/x86/resctrl.rst
>>> @@ -206,25 +206,26 @@ with the following files:
>>> all the event types and the mbm_local_bytes configuration is set to
>>> 0x15 to count all the local memory events.
>>>
>>> - Example::
>>> + Examples:
>>> +
>>> + * To view the current configuration::
>>>
>>> - To view the current configuration, run the command.
>>> # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
>>> 0:0x7f;1:0x7f;2:0x7f;3:0x7f
>>>
>>> # cat /sys/fs/resctrl/info/L3_MON/mbm_local_config
>>> 0:0x15;1:0x15;3:0x15;4:0x15
>>>
>>> - To change the mbm_total_bytes to count only reads on domain 0,
>>> - run the command. The bits 0,1,4 and 5 needs to set.
>>> + * To change the mbm_total_bytes to count only reads on domain 0
>>> + (the bits 0, 1, 4 and 5 needs to be set)::
>>>
>>> # echo "0:0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_config
>>>
>>> # cat /sys/fs/resctrl/info/L3_MON/mbm_total_config
>>> 0:0x33;1:0x7f;2:0x7f;3:0x7f
>>>
>>> - To change the mbm_local_bytes to count all the slow memory reads on
>>> - domain 1, run the command. The bits 4 and 5 needs to set.
>>> + * To change the mbm_local_bytes to count all the slow memory reads on
>>> + domain 1 (the bits 4 and 5 needs to be set)::
>>>
>>> # echo "1:0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_config
>>>
>> Thanks for the diff. I cannot get this right for some reason. I will
>> probably send the diff before the final series.
>>
>>
> OK.
>
>>> Also, there isn't description of mapping from bits from the supported events
>>> table to the bytes input for mbm_{total,local}_config.
>> It is already there. Is that not clear?
> No. I don't see why setting bits 0, 1, 4, and 5 on domain 0 translates to
> `0:0x33`, for example.

Bits 0, 1, 4 and 5 set give 110011b (binary), which is 0x33. I can make that a little clearer.
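
As a rough illustration of that mapping (a sketch only, not part of the patch): each event in the supported-events table owns one bit, and the value written to the config file is the OR of the selected bits, printed in hex. The helper names events_to_mask and config_entry are made up for this example.

```python
# Bit positions come from the supported-events table in the cover letter
# (bits 0..6). The helper names below are hypothetical, not kernel API.
VALID_BITS = range(7)


def events_to_mask(bits):
    """OR together one bit per selected event and return the mask."""
    mask = 0
    for bit in bits:
        if bit not in VALID_BITS:
            raise ValueError(f"unsupported event bit: {bit}")
        mask |= 1 << bit
    return mask


def config_entry(domain, bits):
    """Format "<domain>:<hex mask>" as echoed into the config file."""
    return f"{domain}:{events_to_mask(bits):#x}"


# Only reads on domain 0: bits 0, 1, 4 and 5 -> 0b110011
print(config_entry(0, [0, 1, 4, 5]))  # 0:0x33
# Slow memory reads on domain 1: bits 4 and 5 -> 0b110000
print(config_entry(1, [4, 5]))        # 1:0x30
```

The two printed strings match the values used in the documentation examples ("0:0x33" and "1:0x30").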

Thanks

Babu


>
>>>> +Slow Memory b/w domain is L3 cache.
>>>> +::
>>>> +
>>>> + SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
>>>> +
>>> What b/w stands for in the context above?
>> b/w is bandwidth. I will correct it.
> OK.
>
> Thanks for replying.
>
--
Thanks
Babu Moger

2022-09-29 14:04:59

by Bagas Sanjaya

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

On 9/29/22 20:22, Moger, Babu wrote:
>>>> Also, there isn't description of mapping from bits from the supported events
>>>> table to the bytes input for mbm_{total,local}_config.
>>> It is already there. Is that not clear?
>> No. I don't see why setting bits 0, 1, 4, and 5 on domain 0 translates to
>> `0:0x33`, for example.
>
> It is 110011b(binary) which is 0x33. I can make that little more clear.
>

Ah! I see, each event is enabled by setting its bit in the mask. Thanks for
the explanation.
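
For anyone else reading along, here is a small sketch of the reverse direction: taking a line as read from mbm_total_config and listing which events a domain counts. The event descriptions are copied from the table in the cover letter; the function names parse_config and set_bits are hypothetical and exist only for this example.

```python
# Event descriptions copied from the supported-events table in the cover
# letter. parse_config and set_bits are hypothetical helper names.
EVENTS = {
    0: "Reads to memory in the local NUMA domain",
    1: "Reads to memory in the non-local NUMA domain",
    2: "Non-temporal writes to local NUMA domain",
    3: "Non-temporal writes to non-local NUMA domain",
    4: "Reads to slow memory in the local NUMA domain",
    5: "Reads to slow memory in the non-local NUMA domain",
    6: "Dirty Victims from the QOS domain to all types of memory",
}


def parse_config(line):
    """Parse "0:0x33;1:0x7f" into {0: 0x33, 1: 0x7f}."""
    config = {}
    for entry in line.strip().split(";"):
        domain, mask = entry.split(":")
        config[int(domain)] = int(mask, 16)
    return config


def set_bits(mask):
    """Return the event bit numbers that are set in mask, lowest first."""
    return [bit for bit in sorted(EVENTS) if mask & (1 << bit)]


config = parse_config("0:0x33;1:0x7f;2:0x7f;3:0x7f")
for bit in set_bits(config[0]):
    print(bit, EVENTS[bit])
```

With the example line above, domain 0 decodes to bits 0, 1, 4 and 5 (the four read events), while the other domains still carry 0x7f, i.e. all seven events.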

--
An old man doll... just what I always wanted! - Clara

2022-09-29 22:50:44

by Reinette Chatre

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Hi Babu,

In subject: resctrl_ui.rst -> resctrl.rst

On 9/27/2022 1:27 PM, Babu Moger wrote:
> Update the documentation for the new features:
> 1. Slow Memory Bandwidth allocation (SMBA).
> With this feature, the QOS enforcement policies can be applied
> to the external slow memory connected to the host. QOS enforcement
> is accomplished by assigning a Class Of Service (COS) to a processor
> and specifying allocations or limits for that COS for each resource
> to be allocated.
>
> 2. Bandwidth Monitoring Event Configuration (BMEC).
> The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
> are set to count all the total and local reads/writes respectively.
> With the introduction of slow memory, the two counters are not
> enough to count all the different types are memory events. With the

types are memory events -> types of memory events?

> feature BMEC, the users have the option to configure mbm_total_bytes
> and mbm_local_bytes to count the specific type of events.
>
> Also add configuration instructions with examples.
>
> Signed-off-by: Babu Moger <[email protected]>
> ---

...

> +
> +"mbm_total_config", "mbm_local_config":
> + These files contain the current event configuration for the events
> + mbm_total_bytes and mbm_local_bytes, respectively, when the
> + Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
> + The event configuration settings are domain specific. Changing the
> + configuration on one CPU in a domain would affect the whole domain.

This contradicts the implementation done in this series where the
configuration is changed on every CPU in the domain.

Reinette

2022-10-03 14:37:16

by Moger, Babu

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Hi Reinette,

On 9/29/22 17:10, Reinette Chatre wrote:
> Hi Babu,
>
> In subject: resctrl_ui.rst -> resctrl.rst
>
> On 9/27/2022 1:27 PM, Babu Moger wrote:
>> Update the documentation for the new features:
>> 1. Slow Memory Bandwidth allocation (SMBA).
>> With this feature, the QOS enforcement policies can be applied
>> to the external slow memory connected to the host. QOS enforcement
>> is accomplished by assigning a Class Of Service (COS) to a processor
>> and specifying allocations or limits for that COS for each resource
>> to be allocated.
>>
>> 2. Bandwidth Monitoring Event Configuration (BMEC).
>> The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
>> are set to count all the total and local reads/writes respectively.
>> With the introduction of slow memory, the two counters are not
>> enough to count all the different types are memory events. With the
> types are memory events -> types of memory events?
Ok Sure
>
>> feature BMEC, the users have the option to configure mbm_total_bytes
>> and mbm_local_bytes to count the specific type of events.
>>
>> Also add configuration instructions with examples.
>>
>> Signed-off-by: Babu Moger <[email protected]>
>> ---
> ...
>
>> +
>> +"mbm_total_config", "mbm_local_config":
>> + These files contain the current event configuration for the events
>> + mbm_total_bytes and mbm_local_bytes, respectively, when the
>> + Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
>> + The event configuration settings are domain specific. Changing the
>> + configuration on one CPU in a domain would affect the whole domain.
> This contradicts the implementation done in this series where the
> configuration is changed on every CPU in the domain.

How about this?

The event configuration settings are domain specific and will affect all the CPUs in the domain.

Thanks

Babu

2022-10-03 15:46:14

by Reinette Chatre

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Hi Babu,

On 10/3/2022 7:28 AM, Moger, Babu wrote:
> Hi Reinette,
>
> On 9/29/22 17:10, Reinette Chatre wrote:
>> Hi Babu,
>>
>> In subject: resctrl_ui.rst -> resctrl.rst
>>
>> On 9/27/2022 1:27 PM, Babu Moger wrote:
>>> Update the documentation for the new features:
>>> 1. Slow Memory Bandwidth allocation (SMBA).
>>> With this feature, the QOS enforcement policies can be applied
>>> to the external slow memory connected to the host. QOS enforcement
>>> is accomplished by assigning a Class Of Service (COS) to a processor
>>> and specifying allocations or limits for that COS for each resource
>>> to be allocated.
>>>
>>> 2. Bandwidth Monitoring Event Configuration (BMEC).
>>> The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
>>> are set to count all the total and local reads/writes respectively.
>>> With the introduction of slow memory, the two counters are not
>>> enough to count all the different types are memory events. With the
>> types are memory events -> types of memory events?
> Ok Sure
>>
>>> feature BMEC, the users have the option to configure mbm_total_bytes
>>> and mbm_local_bytes to count the specific type of events.
>>>
>>> Also add configuration instructions with examples.
>>>
>>> Signed-off-by: Babu Moger <[email protected]>
>>> ---
>> ...
>>
>>> +
>>> +"mbm_total_config", "mbm_local_config":
>>> + These files contain the current event configuration for the events
>>> + mbm_total_bytes and mbm_local_bytes, respectively, when the
>>> + Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
>>> + The event configuration settings are domain specific. Changing the
>>> + configuration on one CPU in a domain would affect the whole domain.
>> This contradicts the implementation done in this series where the
>> configuration is changed on every CPU in the domain.
>
> How about this?
>
> The event configuration settings are domain specific and will affect all the CPUs in the domain.

There remains a disconnect between this and the implementation that writes the
configuration to every CPU.

You could make this change to the documentation but then the
implementation needs more than "Update MSR_IA32_EVT_CFG_BASE MSR on all
the CPUs in cpu_mask" - that comment needs to highlight that the
implementation does not follow the architecture and scope rules nor how
configuration changes are made in the rest of the driver and why. Previously [1]
you indicated that this is based on guidance from hardware team so perhaps you
could document it as a hardware quirk related to this feature? At the minimum
it should acknowledge the disconnect.

Reinette

[1] https://lore.kernel.org/lkml/[email protected]/

2022-10-04 14:21:03

by Moger, Babu

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Hi Reinette,

I already responded to this, but I don't see my response in the archives yet.

On 10/3/22 10:36, Reinette Chatre wrote:
> Hi Babu,
>
> On 10/3/2022 7:28 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 9/29/22 17:10, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> In subject: resctrl_ui.rst -> resctrl.rst
>>>
>>> On 9/27/2022 1:27 PM, Babu Moger wrote:
>>>> Update the documentation for the new features:
>>>> 1. Slow Memory Bandwidth allocation (SMBA).
>>>> With this feature, the QOS enforcement policies can be applied
>>>> to the external slow memory connected to the host. QOS enforcement
>>>> is accomplished by assigning a Class Of Service (COS) to a processor
>>>> and specifying allocations or limits for that COS for each resource
>>>> to be allocated.
>>>>
>>>> 2. Bandwidth Monitoring Event Configuration (BMEC).
>>>> The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
>>>> are set to count all the total and local reads/writes respectively.
>>>> With the introduction of slow memory, the two counters are not
>>>> enough to count all the different types are memory events. With the
>>> types are memory events -> types of memory events?
>> Ok Sure
>>>> feature BMEC, the users have the option to configure mbm_total_bytes
>>>> and mbm_local_bytes to count the specific type of events.
>>>>
>>>> Also add configuration instructions with examples.
>>>>
>>>> Signed-off-by: Babu Moger <[email protected]>
>>>> ---
>>> ...
>>>
>>>> +
>>>> +"mbm_total_config", "mbm_local_config":
>>>> + These files contain the current event configuration for the events
>>>> + mbm_total_bytes and mbm_local_bytes, respectively, when the
>>>> + Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
>>>> + The event configuration settings are domain specific. Changing the
>>>> + configuration on one CPU in a domain would affect the whole domain.
>>> This contradicts the implementation done in this series where the
>>> configuration is changed on every CPU in the domain.
>> How about this?
>>
>> The event configuration settings are domain specific and will affect all the CPUs in the domain.
> There remains a disconnect between this and the implementation that writes the
> configuration to every CPU.
>
> You could make this change to the documentation but then the
> implementation needs more than "Update MSR_IA32_EVT_CFG_BASE MSR on all
> the CPUs in cpu_mask" - that comment needs to highlight that the
> implementation does not follow the architecture and scope rules nor how
> configuration changes are made in the rest of the driver and why. Previously [1]
> you indicated that this is based on guidance from hardware team so perhaps you
> could document it as a hardware quirk related to this feature? At the minimum
> it should acknowledge the disconnect.

OK. I could document this in the code in patch 9 ([PATCH v5 09/12]
x86/resctrl: Add sysfs interface to write mbm_total_bytes event configuration),
something like this:

+ /*
+ * Update MSR_IA32_EVT_CFG_BASE MSR on all the CPUs in cpu_mask.
+ * The MSR MSR_IA32_EVT_CFG_BASE is domain specific. Writing the
+ * MSR on one CPU will affect all the CPUs in the domain.
+ * However, the hardware team recommends to update the MSR on
+ * all the CPU threads. It is not clear in the document yet.
+ * Doc will be updated in the next revision.
+ */
+ on_each_cpu_mask(cpu_mask, mon_event_config_write, &mon_info, 1);
+

Thanks
Babu


2022-10-04 16:49:35

by Reinette Chatre

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Hi Babu,

On 10/4/2022 7:00 AM, Moger, Babu wrote:
> On 10/3/22 10:36, Reinette Chatre wrote:
>> On 10/3/2022 7:28 AM, Moger, Babu wrote:
>>> On 9/29/22 17:10, Reinette Chatre wrote:
>>>> Hi Babu,
>>>>
>>>> In subject: resctrl_ui.rst -> resctrl.rst
>>>>
>>>> On 9/27/2022 1:27 PM, Babu Moger wrote:
>>>>> Update the documentation for the new features:
>>>>> 1. Slow Memory Bandwidth allocation (SMBA).
>>>>> With this feature, the QOS enforcement policies can be applied
>>>>> to the external slow memory connected to the host. QOS enforcement
>>>>> is accomplished by assigning a Class Of Service (COS) to a processor
>>>>> and specifying allocations or limits for that COS for each resource
>>>>> to be allocated.
>>>>>
>>>>> 2. Bandwidth Monitoring Event Configuration (BMEC).
>>>>> The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
>>>>> are set to count all the total and local reads/writes respectively.
>>>>> With the introduction of slow memory, the two counters are not
>>>>> enough to count all the different types are memory events. With the
>>>> types are memory events -> types of memory events?
>>> Ok Sure
>>>>> feature BMEC, the users have the option to configure mbm_total_bytes
>>>>> and mbm_local_bytes to count the specific type of events.
>>>>>
>>>>> Also add configuration instructions with examples.
>>>>>
>>>>> Signed-off-by: Babu Moger <[email protected]>
>>>>> ---
>>>> ...
>>>>
>>>>> +
>>>>> +"mbm_total_config", "mbm_local_config":
>>>>> + These files contain the current event configuration for the events
>>>>> + mbm_total_bytes and mbm_local_bytes, respectively, when the
>>>>> + Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
>>>>> + The event configuration settings are domain specific. Changing the
>>>>> + configuration on one CPU in a domain would affect the whole domain.
>>>> This contradicts the implementation done in this series where the
>>>> configuration is changed on every CPU in the domain.
>>> How about this?
>>>
>>> The event configuration settings are domain specific and will affect all the CPUs in the domain.
>> There remains a disconnect between this and the implementation that writes the
>> configuration to every CPU.
>>
>> You could make this change to the documentation but then the
>> implementation needs more than "Update MSR_IA32_EVT_CFG_BASE MSR on all
>> the CPUs in cpu_mask" - that comment needs to highlight that the
>> implementation does not follow the architecture and scope rules nor how
>> configuration changes are made in the rest of the driver and why. Previously [1]
>> you indicated that this is based on guidance from hardware team so perhaps you
>> could document it as a hardware quirk related to this feature? At the minimum
>> it should acknowledge the disconnect.
>
> ok. I could document this in the code patch 9([PATCH v5 09/12]
> x86/resctrl: Add sysfs interface to write mbm_total_bytes event configuration.
> Something like this.
>
> /*
> + * Update MSR_IA32_EVT_CFG_BASE MSR on all the CPUs in cpu_mask.

Since multiple MSRs are impacted, how about:

"Update MSR_IA32_EVT_CFG_BASE MSRs ..."

> + * The MSR MSR_IA32_EVT_CFG_BASE is domain specific. Writing the

"The MSRs offset from MSR MSR_IA32_EVT_CFG_BASE are scoped at the domain
level. Writing any of these MSRs on one CPU is supposed to be observed
by all CPUs in the domain."

> + * MSR on one CPU will affect all the CPUs in the domain.

Since this is not the case, perhaps it should be " ...
is supposed to affect all the CPUs ..." instead?

> + * However, the hardware team recommends to update the MSR on
> + * all the CPU threads. It is not clear in the document yet.

To be consistent, could "CPU threads" be "CPUs"?

Could you please be specific about which document you refer to? Although,
I do not think that writing the last part about "the document" adds value
here. You are representing AMD with this submission and you document that
you are following the guidance from the hardware team in this regard.
I think that is sufficient.


> * * Doc will be updated in the next revision.

This is a change that will be made to the kernel source ... what does
"next revision" mean when somebody reads this comment in a few years?

Putting all of the above together, how about:

"Update MSR_IA32_EVT_CFG_BASE MSRs on all the CPUs in cpu_mask. The MSRs
offset from MSR MSR_IA32_EVT_CFG_BASE are scoped at the domain level.
Writing any of these MSRs on one CPU is supposed to be observed by all
CPUs in the domain. However, the hardware team recommends to update these
MSRs on all the CPUs in the domain."

Reinette

2022-10-04 20:03:34

by Reinette Chatre

Subject: Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

Hi Babu,

On 10/4/2022 11:18 AM, Moger, Babu wrote:
> On 10/4/22 11:15, Reinette Chatre wrote:
>> On 10/4/2022 7:00 AM, Moger, Babu wrote:
>>> On 10/3/22 10:36, Reinette Chatre wrote:
>>>> On 10/3/2022 7:28 AM, Moger, Babu wrote:
>>>>> On 9/29/22 17:10, Reinette Chatre wrote:
>>>>>> Hi Babu,
>>>>>>
>>>>>> In subject: resctrl_ui.rst -> resctrl.rst
>>>>>>
>>>>>> On 9/27/2022 1:27 PM, Babu Moger wrote:

...

>>> + * However, the hardware team recommends to update the MSR on
>>> + * all the CPU threads. It is not clear in the document yet.
>> To be consistent, could "CPU threads" be "CPUs"?
> sure.
>>
>> Could you please be specific about which document you refer to? Although,
> I am talking about AMD64 Technology Platform Quality

I know that. I was referring to the text just referring to "the document"
without any indication what document it actually refers to.

>
> of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
>
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
>
> Will add this link in the commit message.

Adding the link to the commit message will be helpful to support the
change but it will not help people make sense of terms like "the document"
when reading the comments in the code after the change has been merged.

Reinette

2022-10-07 08:50:56

by Bagas Sanjaya

Subject: Re: [PATCH v5 00/12] x86/resctrl: Support for AMD QoS new features

On 9/28/22 03:25, Babu Moger wrote:
> New AMD processors can now support following QoS features.
>
> 1. Slow Memory Bandwidth Allocation (SMBA)
> With this feature, the QOS enforcement policies can be applied
> to the external slow memory connected to the host. QOS enforcement
> is accomplished by assigning a Class Of Service (COS) to a processor
> and specifying allocations or limits for that COS for each resource
> to be allocated.
>
> Currently, CXL.memory is the only supported "slow" memory device. With
> the support of SMBA feature the hardware enables bandwidth allocation
> on the slow memory devices.
>
> 2. Bandwidth Monitoring Event Configuration (BMEC)
> The bandwidth monitoring events mbm_total_event and mbm_local_event
> are set to count all the total and local reads/writes respectively.
> With the introduction of slow memory, the two counters are not enough
> to count all the different types are memory events. With the feature
> BMEC, the users have the option to configure mbm_total_event and
> mbm_local_event to count the specific type of events.
>
> Following are the bitmaps of events supported.
> Bits Description
> 6 Dirty Victims from the QOS domain to all types of memory
> 5 Reads to slow memory in the non-local NUMA domain
> 4 Reads to slow memory in the local NUMA domain
> 3 Non-temporal writes to non-local NUMA domain
> 2 Non-temporal writes to local NUMA domain
> 1 Reads to memory in the non-local NUMA domain
> 0 Reads to memory in the local NUMA domain
>
> This series adds support for these features.
>
> Feature description is available in the specification, "AMD64 Technology Platform Quality
> of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
>
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>
> ---
> v5:
> Summary of changes.
> 1. Split the series into two. The first two patches are bug fixes. So, sent them separate.
> 2. The config files mbm_total_config and mbm_local_config are now under
> /sys/fs/resctrl/info/L3_MON/. Removed these config files from mon groups.
> 3. Ran "checkpatch --strict --codespell" on all the patches. Looks good with few known exceptions.
> 4. Few minor text changes in resctrl.rst file.
>
> v4:
> https://lore.kernel.org/lkml/166257348081.1043018.11227924488792315932.stgit@bmoger-ubuntu/
> Got numerios of comments from Reinette Chatre. Addressed most of them.
> Summary of changes.
> 1. Removed mon_configurable under /sys/fs/resctrl/info/L3_MON/.
> 2. Updated mon_features texts if the BMEC is supported.
> 3. Added more explanation about the slow memory support.
> 4. Replaced smp_call_function_many with on_each_cpu_mask call.
> 5. Removed arch_has_empty_bitmaps
> 6. Few other text changes.
> 7. Removed Reviewed-by if the patch is modified.
> 8. Rebased the patches to latest tip.
>
> v3:
> https://lore.kernel.org/lkml/166117559756.6695.16047463526634290701.stgit@bmoger-ubuntu/
> a. Rebased the patches to latest tip. Resolved some conflicts.
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> b. Taken care of feedback from Bagas Sanjaya.
> c. Added Reviewed by from Mingo.
> Note: I am still looking for comments from Reinette or Fenghua.
>
> v2:
> https://lore.kernel.org/lkml/165938717220.724959.10931629283087443782.stgit@bmoger-ubuntu/
> a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
> b. Added the patch to fix CBM issue on AMD. This was originally discussed
> https://lore.kernel.org/lkml/[email protected]/
>
> v1:
> https://lore.kernel.org/lkml/165757543252.416408.13547339307237713464.stgit@bmoger-ubuntu/
>
> Babu Moger (12):
> x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
> x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
> x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
> x86/resctrl: Include new features in command line options
> x86/resctrl: Detect and configure Slow Memory Bandwidth allocation
> x86/resctrl: Introduce data structure to support monitor configuration
> x86/resctrl: Add sysfs interface to read mbm_total_bytes event configuration
> x86/resctrl: Add sysfs interface to read mbm_local_bytes event configuration
> x86/resctrl: Add sysfs interface to write mbm_total_bytes event configuration
> x86/resctrl: Add sysfs interface to write mbm_local_bytes event configuration
> x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()
> Documentation/x86: Update resctrl_ui.rst for new features
>
>
> .../admin-guide/kernel-parameters.txt | 2 +-
> Documentation/x86/resctrl.rst | 130 +++++++-
> arch/x86/include/asm/cpufeatures.h | 2 +
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> arch/x86/kernel/cpu/resctrl/core.c | 51 ++-
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
> arch/x86/kernel/cpu/resctrl/internal.h | 33 +-
> arch/x86/kernel/cpu/resctrl/monitor.c | 9 +-
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 298 ++++++++++++++++--
> arch/x86/kernel/cpu/scattered.c | 2 +
> 10 files changed, 496 insertions(+), 34 deletions(-)
>

Hi Babu, sorry for having to reply publicly to this v5 cover letter:
I accidentally deleted the preview documentation patch for your
upcoming v6.

Thanks for privately sending me the preview patch. At a glance, it
LGTM. Please send the full v6 series for us to review.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2022-10-07 16:36:22

by Moger, Babu

Subject: RE: [PATCH v5 00/12] x86/resctrl: Support for AMD QoS new features

Hi Sanjaya,

> -----Original Message-----
> From: Bagas Sanjaya <[email protected]>
> Sent: Friday, October 7, 2022 3:33 AM
> To: Moger, Babu <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; Das1, Sandipan
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH v5 00/12] x86/resctrl: Support for AMD QoS new features
>
> On 9/28/22 03:25, Babu Moger wrote:
> > New AMD processors can now support following QoS features.
> >
> > 1. Slow Memory Bandwidth Allocation (SMBA)
> > With this feature, the QOS enforcement policies can be applied
> > to the external slow memory connected to the host. QOS enforcement
> > is accomplished by assigning a Class Of Service (COS) to a processor
> > and specifying allocations or limits for that COS for each resource
> > to be allocated.
> >
> > Currently, CXL.memory is the only supported "slow" memory device. With
> > the support of SMBA feature the hardware enables bandwidth allocation
> > on the slow memory devices.
> >
> > 2. Bandwidth Monitoring Event Configuration (BMEC)
> > The bandwidth monitoring events mbm_total_event and mbm_local_event
> > are set to count all the total and local reads/writes respectively.
> > With the introduction of slow memory, the two counters are not enough
> > to count all the different types are memory events. With the feature
> > BMEC, the users have the option to configure mbm_total_event and
> > mbm_local_event to count the specific type of events.
> >
> > Following are the bitmaps of events supported.
> > Bits Description
> > 6 Dirty Victims from the QOS domain to all types of memory
> > 5 Reads to slow memory in the non-local NUMA domain
> > 4 Reads to slow memory in the local NUMA domain
> > 3 Non-temporal writes to non-local NUMA domain
> > 2 Non-temporal writes to local NUMA domain
> > 1 Reads to memory in the non-local NUMA domain
> > 0 Reads to memory in the local NUMA domain
> >
> > This series adds support for these features.
> >
> > Feature description is available in the specification, "AMD64
> > Technology Platform Quality of Service Extensions, Revision: 1.03 Publication
> > # 56375 Revision: 1.03 Issue Date: February 2022".
> >
> > Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> >
> > ---
> > v5:
> > Summary of changes.
> > 1. Split the series into two. The first two patches are bug fixes. So, sent them separate.
> > 2. The config files mbm_total_config and mbm_local_config are now under
> > /sys/fs/resctrl/info/L3_MON/. Removed these config files from mon groups.
> > 3. Ran "checkpatch --strict --codespell" on all the patches. Looks good with few known exceptions.
> > 4. Few minor text changes in resctrl.rst file.
> >
> > v4:
> > https://lore.kernel.org/lkml/166257348081.1043018.11227924488792315932.stgit@bmoger-ubuntu/
> > Got numerios of comments from Reinette Chatre. Addressed most of them.
> > Summary of changes.
> > 1. Removed mon_configurable under /sys/fs/resctrl/info/L3_MON/.
> > 2. Updated mon_features texts if the BMEC is supported.
> > 3. Added more explanation about the slow memory support.
> > 4. Replaced smp_call_function_many with on_each_cpu_mask call.
> > 5. Removed arch_has_empty_bitmaps
> > 6. Few other text changes.
> > 7. Removed Reviewed-by if the patch is modified.
> > 8. Rebased the patches to latest tip.
> >
> > v3:
> > https://lore.kernel.org/lkml/166117559756.6695.16047463526634290701.stgit@bmoger-ubuntu/
> > a. Rebased the patches to latest tip. Resolved some conflicts.
> > https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> > b. Taken care of feedback from Bagas Sanjaya.
> > c. Added Reviewed by from Mingo.
> > Note: I am still looking for comments from Reinette or Fenghua.
> >
> > v2:
> >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kern
> el.org%2Flkml%2F165938717220.724959.10931629283087443782.stgit%40bmo
> ger-
> ubuntu%2F&amp;data=05%7C01%7Cbabu.moger%40amd.com%7Cda5fc8069ca
> 2484b2aef08daa83ea08a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0
> %7C638007284215915604%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7
> C&amp;sdata=14viyG9elsn6BYGpDOwrqYQFNOlpyC6oqoJwJm49oV0%3D&amp;
> reserved=0
> > a. Rebased the patches to latest stable tree (v5.18.15). Resolved some
> conflicts.
> > b. Added the patch to fix CBM issue on AMD. This was originally discussed
> >
> > https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/
> >
> > v1:
> >
> > https://lore.kernel.org/lkml/165757543252.416408.13547339307237713464.stgit@bmoger-ubuntu/
> >
> > Babu Moger (12):
> > x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
> > x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
> > x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
> > x86/resctrl: Include new features in command line options
> > x86/resctrl: Detect and configure Slow Memory Bandwidth allocation
> > x86/resctrl: Introduce data structure to support monitor configuration
> > x86/resctrl: Add sysfs interface to read mbm_total_bytes event configuration
> > x86/resctrl: Add sysfs interface to read mbm_local_bytes event configuration
> > x86/resctrl: Add sysfs interface to write mbm_total_bytes event configuration
> > x86/resctrl: Add sysfs interface to write mbm_local_bytes event configuration
> > x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()
> > Documentation/x86: Update resctrl_ui.rst for new features
> >
> >
> > .../admin-guide/kernel-parameters.txt | 2 +-
> > Documentation/x86/resctrl.rst | 130 +++++++-
> > arch/x86/include/asm/cpufeatures.h | 2 +
> > arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> > arch/x86/kernel/cpu/resctrl/core.c | 51 ++-
> > arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
> > arch/x86/kernel/cpu/resctrl/internal.h | 33 +-
> > arch/x86/kernel/cpu/resctrl/monitor.c | 9 +-
> > arch/x86/kernel/cpu/resctrl/rdtgroup.c | 298 ++++++++++++++++--
> > arch/x86/kernel/cpu/scattered.c | 2 +
> > 10 files changed, 496 insertions(+), 34 deletions(-)
> >
>
> Hi Babu, sorry for replying publicly to this v5 cover letter; I accidentally
> deleted the preview documentation patch for your upcoming v6.
>
> Thanks for privately sending me the preview patch. At a glance, it
> LGTM. Please send the full v6 series for us to review.
Sure. Will send the whole series.
Thanks
Babu