2024-01-23 09:22:31

by Haifeng Xu

Subject: [PATCH 3/3] x86/resctrl: Display cache occupancy of busy RMIDs

If llc_occupancy is enabled, the RMID may not be freed immediately: it is
moved to the limbo list until its llc_occupancy drops below
resctrl_rmid_realloc_threshold.

In our production environment, unused RMIDs get stuck on the limbo list
forever because their llc_occupancy stays above the threshold. After
raising the threshold we can free the unused RMIDs and create new
monitor groups again. To acquire the llc_occupancy of the RMIDs in each
rdt domain, we currently use perf to trace the events and filter the log
manually, which is not efficient.

Therefore, add an RFTYPE_TOP_INFO file 'busy_rmids_info' that reports
the llc_occupancy of busy RMIDs. It can also help guide users in
choosing a suitable resctrl_rmid_realloc_threshold.

Signed-off-by: Haifeng Xu <[email protected]>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 70 ++++++++++++++++++++++++++
1 file changed, 70 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1eac0ca97b81..88dadb87f4e1 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -32,6 +32,12 @@
#include <asm/resctrl.h>
#include "internal.h"

+struct busy_rmids_info {
+ struct rdt_resource *r;
+ struct rdt_domain *d;
+ struct seq_file *seq;
+};
+
DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
DEFINE_STATIC_KEY_FALSE(rdt_mon_enable_key);
DEFINE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
@@ -934,6 +940,63 @@ static int rdt_free_rmids_show(struct kernfs_open_file *of,
return 0;
}

+void rdt_domain_busy_rmids_show(void *info)
+{
+ struct rdt_resource *r;
+ struct rdt_domain *d;
+ struct seq_file *seq;
+ struct busy_rmids_info *rmids_info = info;
+ u32 crmid = 1, nrmid;
+ u64 val;
+ int ret;
+
+ r = rmids_info->r;
+ d = rmids_info->d;
+ seq = rmids_info->seq;
+
+ seq_printf(seq, "domain-%d busy rmids.\n", d->id);
+
+ for (;;) {
+ nrmid = find_next_bit(d->rmid_busy_llc, r->num_rmid, crmid);
+ if (nrmid >= r->num_rmid)
+ break;
+
+ ret = resctrl_arch_rmid_read(r, d, nrmid, QOS_L3_OCCUP_EVENT_ID, &val);
+ switch (ret) {
+ case -EIO:
+ seq_printf(seq, "I/O Error\n");
+ return;
+ case -EINVAL:
+ seq_printf(seq, "Invalid Argument\n");
+ return;
+ default:
+ seq_printf(seq, "rmid:%d llc_occupancy:%llu\n", nrmid, val);
+ }
+ crmid = nrmid + 1;
+ }
+}
+
+static int rdt_busy_rmids_info_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct rdt_domain *d;
+ struct rdt_resource *r;
+ struct busy_rmids_info info;
+
+ mutex_lock(&rdtgroup_mutex);
+ r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ info.r = r;
+ info.seq = seq;
+ list_for_each_entry(d, &r->domains, list) {
+ info.d = d;
+ smp_call_function_any(&d->cpu_mask, rdt_domain_busy_rmids_show, &info, 1);
+ }
+ mutex_unlock(&rdtgroup_mutex);
+ return 0;
+}
+
static int rdt_num_closids_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
@@ -1791,6 +1854,13 @@ static struct rftype res_common_files[] = {
.seq_show = rdt_free_rmids_show,
.fflags = RFTYPE_TOP_INFO,
},
+ {
+ .name = "busy_rmids_info",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = rdt_busy_rmids_info_show,
+ .fflags = RFTYPE_TOP_INFO,
+ },
{
.name = "num_closids",
.mode = 0444,
--
2.25.1



2024-01-24 22:25:24

by Reinette Chatre

Subject: Re: [PATCH 3/3] x86/resctrl: Display cache occupancy of busy RMIDs

(+James)

Hi Haifeng,

On 1/23/2024 1:20 AM, Haifeng Xu wrote:
> If llc_occupancy is enabled, the RMID may not be freed immediately: it is
> moved to the limbo list until its llc_occupancy drops below
> resctrl_rmid_realloc_threshold.
>
> In our production environment, unused RMIDs get stuck on the limbo list
> forever because their llc_occupancy stays above the threshold. After
> raising the threshold we can free the unused RMIDs and create new
> monitor groups again. To acquire the llc_occupancy of the RMIDs in each
> rdt domain, we currently use perf to trace the events and filter the log
> manually, which is not efficient.
>
> Therefore, add an RFTYPE_TOP_INFO file 'busy_rmids_info' that reports
> the llc_occupancy of busy RMIDs. It can also help guide users in
> choosing a suitable resctrl_rmid_realloc_threshold.

I am addressing both patch 2/3 and patch 3/3 here.

First, please note that resctrl is gaining support for Arm's Memory
System Resource Partitioning and Monitoring (MPAM), whose monitoring
is done with a monitoring group that is dependent on the control group,
not independent as on Intel and AMD. Please see [1] for more details.

resctrl is the generic interface that will be used to interact with RDT
on Intel, PQoS on AMD, and also MPAM on Arm. We thus need to ensure that
the interface is appropriate for all. Specifically, for Arm there is
no global "free RMID list", on Arm the free RMIDs (PMG in Arm language,
but rmid is the term that made it into resctrl) are per control group.

Second, this addition seems to be purely a debugging aid. I thus don't see
this as something that users may want/need all the time, yet when users do
want/need it, accurate data is preferred. To that end, the limbo
code already walks the busy list once per second. What if there is a
new tracepoint within the limbo code that shares the exact data used during
limbo list management? From what I can tell, this data, combined with the
per-monitor-group "mon_hw_id", should give user space sufficient data to
debug the scenarios mentioned in these patches.
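For illustration, such a tracepoint could take roughly the following shape.
The event name and fields here are assumptions made for the sake of
discussion, not an existing kernel API:

```c
/*
 * Sketch only: a tracepoint emitted from the periodic limbo scan,
 * exposing the same data the kernel uses for limbo list management.
 * Name and fields are hypothetical.
 */
TRACE_EVENT(mon_llc_occupancy_limbo,
	TP_PROTO(u32 rmid, int domain_id, u64 llc_occupancy),
	TP_ARGS(rmid, domain_id, llc_occupancy),
	TP_STRUCT__entry(
		__field(u32, rmid)
		__field(int, domain_id)
		__field(u64, llc_occupancy)
	),
	TP_fast_assign(
		__entry->rmid = rmid;
		__entry->domain_id = domain_id;
		__entry->llc_occupancy = llc_occupancy;
	),
	TP_printk("rmid=%u domain=%d llc_occupancy=%llu",
		  __entry->rmid, __entry->domain_id, __entry->llc_occupancy)
);
```

User space could then enable the event under tracefs and correlate the
reported rmid with each group's "mon_hw_id" file.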

I did add James to this discussion to make him aware of your requirements.
Please do include him in future submissions.

Reinette

[1] https://lore.kernel.org/all/[email protected]/

2024-01-25 07:56:09

by Haifeng Xu

Subject: Re: [PATCH 3/3] x86/resctrl: Display cache occupancy of busy RMIDs



On 2024/1/25 06:25, Reinette Chatre wrote:
> (+James)
>
> Hi Haifeng,
>
> On 1/23/2024 1:20 AM, Haifeng Xu wrote:
>> If llc_occupancy is enabled, the RMID may not be freed immediately: it is
>> moved to the limbo list until its llc_occupancy drops below
>> resctrl_rmid_realloc_threshold.
>>
>> In our production environment, unused RMIDs get stuck on the limbo list
>> forever because their llc_occupancy stays above the threshold. After
>> raising the threshold we can free the unused RMIDs and create new
>> monitor groups again. To acquire the llc_occupancy of the RMIDs in each
>> rdt domain, we currently use perf to trace the events and filter the log
>> manually, which is not efficient.
>>
>> Therefore, add an RFTYPE_TOP_INFO file 'busy_rmids_info' that reports
>> the llc_occupancy of busy RMIDs. It can also help guide users in
>> choosing a suitable resctrl_rmid_realloc_threshold.
>
> I am addressing both patch 2/3 and patch 3/3 here.
>
> First, please note that resctrl is gaining support for Arm's Memory
> System Resource Partitioning and Monitoring (MPAM), whose monitoring
> is done with a monitoring group that is dependent on the control group,
> not independent as on Intel and AMD. Please see [1] for more details.
>
> resctrl is the generic interface that will be used to interact with RDT
> on Intel, PQoS on AMD, and also MPAM on Arm. We thus need to ensure that
> the interface is appropriate for all. Specifically, for Arm there is
> no global "free RMID list", on Arm the free RMIDs (PMG in Arm language,
> but rmid is the term that made it into resctrl) are per control group.
>
> Second, this addition seems to be purely a debugging aid. I thus don't see
> this as something that users may want/need all the time, yet when users do
> want/need it, accurate data is preferred. To that end, the limbo
> code already walks the busy list once per second. What if there is a
> new tracepoint within the limbo code that shares the exact data used during
> limbo list management?

OK, I'll try this way.

> From what I can tell, this data, combined with the
> per-monitor-group "mon_hw_id", should give user space sufficient data to
> debug the scenarios mentioned in these patches.
>
> I did add James to this discussion to make him aware of your requirements.
> Please do include him in future submissions.
>
> Reinette
>
> [1] https://lore.kernel.org/all/[email protected]/

Thanks.