2022-02-17 18:43:35

by James Morse

Subject: [PATCH v3 16/21] x86/resctrl: Pass the required parameters into resctrl_arch_rmid_read()

resctrl_arch_rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a hardware register. Currently the function
returns the MBM values in chunks directly from hardware.

To convert this to bytes, some correction and overflow calculations
are needed. These depend on the resource and domain structures.
Overflow detection requires the old chunks value. None of this
is available to resctrl_arch_rmid_read(). MPAM requires the
resource and domain structures to find the MMIO device that holds
the registers.

Pass the resource and domain to resctrl_arch_rmid_read(). This make
rmid_dirty() too big, instead merge it with its only caller, the name is
kept as a local variable.

Signed-off-by: James Morse <[email protected]>
---
Changes since v2:
* Typos.
* Kerneldoc fixes.

This is all a little noisy for __mon_event_count(), as the switch
statement work is now before the resctrl_arch_rmid_read() call.
---
arch/x86/kernel/cpu/resctrl/monitor.c | 31 +++++++++++++++------------
include/linux/resctrl.h | 16 +++++++++++++-
2 files changed, 32 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b6ad290fda8d..277c22f8c976 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -167,10 +167,14 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
memset(am, 0, sizeof(*am));
}

-int resctrl_arch_rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
+int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
+ u32 rmid, enum resctrl_event_id eventid, u64 *val)
{
u64 msr_val;

+ if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))
+ return -EINVAL;
+
/*
* As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
* with a valid event code for supported resource type and the bits
@@ -192,16 +196,6 @@ int resctrl_arch_rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
return 0;
}

-static bool rmid_dirty(struct rmid_entry *entry)
-{
- u64 val = 0;
-
- if (resctrl_arch_rmid_read(entry->rmid, QOS_L3_OCCUP_EVENT_ID, &val))
- return true;
-
- return val >= resctrl_cqm_threshold;
-}
-
/*
* Check the RMIDs that are marked as busy for this domain. If the
* reported LLC occupancy is below the threshold clear the busy bit and
@@ -213,6 +207,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
struct rmid_entry *entry;
struct rdt_resource *r;
u32 crmid = 1, nrmid;
+ bool rmid_dirty;
+ u64 val = 0;

r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;

@@ -228,7 +224,14 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
break;

entry = __rmid_entry(nrmid);
- if (force_free || !rmid_dirty(entry)) {
+
+ if (resctrl_arch_rmid_read(r, d, entry->rmid,
+ QOS_L3_OCCUP_EVENT_ID, &val))
+ rmid_dirty = true;
+ else
+ rmid_dirty = (val >= resctrl_cqm_threshold);
+
+ if (force_free || !rmid_dirty) {
clear_bit(entry->rmid, d->rmid_busy_llc);
if (!--entry->busy) {
rmid_limbo_count--;
@@ -278,7 +281,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
cpu = get_cpu();
list_for_each_entry(d, &r->domains, list) {
if (cpumask_test_cpu(cpu, &d->cpu_mask)) {
- err = resctrl_arch_rmid_read(entry->rmid,
+ err = resctrl_arch_rmid_read(r, d, entry->rmid,
QOS_L3_OCCUP_EVENT_ID,
&val);
if (err || val <= resctrl_cqm_threshold)
@@ -336,7 +339,7 @@ static u64 __mon_event_count(u32 rmid, struct rmid_read *rr)
if (rr->first)
resctrl_arch_reset_rmid(rr->r, rr->d, rmid, rr->evtid);

- rr->err = resctrl_arch_rmid_read(rmid, rr->evtid, &tval);
+ rr->err = resctrl_arch_rmid_read(rr->r, rr->d, rmid, rr->evtid, &tval);
if (rr->err)
return rr->err;

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 70112dbfa128..5d57e2610c79 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -219,7 +219,21 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
u32 closid, enum resctrl_conf_type type);
int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d);
void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
-int resctrl_arch_rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *res);
+
+/**
+ * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid
+ * for this resource and domain.
+ * @r: resource that the counter should be read from.
+ * @d: domain that the counter should be read from.
+ * @rmid: rmid of the counter to read.
+ * @eventid: eventid to read, e.g. L3 occupancy.
+ * @val: result of the counter read in chunks.
+ *
+ * Return:
+ * 0 on success, or -EIO, -EINVAL etc on error.
+ */
+int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
+ u32 rmid, enum resctrl_event_id eventid, u64 *val);

/**
* resctrl_arch_reset_rmid() - Reset any private state associated with rmid
--
2.30.2
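
The commit message says that converting chunks to bytes needs overflow handling based on the old chunks value. The following is a minimal userspace sketch of that kind of calculation, not the code from this series. The names chunk_delta() and chunks_to_bytes(), the counter width parameter, and the bytes-per-chunk scale are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: difference between two samples of a hardware
 * counter that is only 'width' bits wide (assumed 1..64).
 */
static uint64_t chunk_delta(uint64_t prev, uint64_t cur, unsigned int width)
{
	unsigned int shift = 64 - width;

	/*
	 * Move both samples to the top of the 64-bit word so that the
	 * unsigned subtraction wraps at 'width' bits, then move back.
	 */
	uint64_t delta = (cur << shift) - (prev << shift);

	return delta >> shift;
}

/* Hypothetical sketch: 'scale' is the assumed number of bytes per chunk. */
static uint64_t chunks_to_bytes(uint64_t prev, uint64_t cur,
				unsigned int width, uint64_t scale)
{
	return chunk_delta(prev, cur, width) * scale;
}
```

With a 24-bit counter, a previous sample of 0xFFFFF0 and a current sample of 0x10 give a delta of 32 chunks, even though the raw value went down. This is why the previous sample has to be reachable from wherever the conversion happens.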


2022-03-25 15:29:28

by Rob Herring (Arm)

Subject: Re: [PATCH v3 16/21] x86/resctrl: Pass the required parameters into resctrl_arch_rmid_read()

On Thu, Feb 17, 2022 at 06:21:05PM +0000, James Morse wrote:
> resctrl_arch_rmid_read() is intended as the function that an
> architecture agnostic resctrl filesystem driver can use to
> read a value in bytes from a hardware register. Currently the function
> returns the MBM values in chunks directly from hardware.
>
> To convert this to bytes, some correction and overflow calculations
> are needed. These depend on the resource and domain structures.
> Overflow detection requires the old chunks value. None of this
> is available to resctrl_arch_rmid_read(). MPAM requires the
> resource and domain structures to find the MMIO device that holds
> the registers.
>
> Pass the resource and domain to resctrl_arch_rmid_read(). This make

s/make/makes/

> rmid_dirty() too big, instead merge it with its only caller, the name is
> kept as a local variable.

... big. Instead, merge it with its only caller, and the name...

>
> Signed-off-by: James Morse <[email protected]>
> ---
> Changes since v2:
> * Typos.
> * Kerneldoc fixes.
>
> This is all a little noisy for __mon_event_count(), as the switch
> statement work is now before the resctrl_arch_rmid_read() call.
> ---
> arch/x86/kernel/cpu/resctrl/monitor.c | 31 +++++++++++++++------------
> include/linux/resctrl.h | 16 +++++++++++++-
> 2 files changed, 32 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index b6ad290fda8d..277c22f8c976 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -167,10 +167,14 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
> memset(am, 0, sizeof(*am));
> }
>
> -int resctrl_arch_rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
> +int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
> + u32 rmid, enum resctrl_event_id eventid, u64 *val)
> {
> u64 msr_val;
>
> + if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))

We already tested this and disabled preemption. (At least from some
caller AFAICT from this patch.) I'd assume we'd want the fs code to
handle preemption disable and checking cpumask. In any case, it should
be clear what guarantees resctrl_arch_rmid_read() has.

> @@ -278,7 +281,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
> cpu = get_cpu();
> list_for_each_entry(d, &r->domains, list) {
> if (cpumask_test_cpu(cpu, &d->cpu_mask)) {
> - err = resctrl_arch_rmid_read(entry->rmid,
> + err = resctrl_arch_rmid_read(r, d, entry->rmid,
> QOS_L3_OCCUP_EVENT_ID,
> &val);
> if (err || val <= resctrl_cqm_threshold)

2022-03-31 03:15:41

by James Morse

Subject: Re: [PATCH v3 16/21] x86/resctrl: Pass the required parameters into resctrl_arch_rmid_read()

Hi Rob,

On 23/03/2022 20:58, Rob Herring wrote:
> On Thu, Feb 17, 2022 at 06:21:05PM +0000, James Morse wrote:
>> resctrl_arch_rmid_read() is intended as the function that an
>> architecture agnostic resctrl filesystem driver can use to
>> read a value in bytes from a hardware register. Currently the function
>> returns the MBM values in chunks directly from hardware.
>>
>> To convert this to bytes, some correction and overflow calculations
>> are needed. These depend on the resource and domain structures.
>> Overflow detection requires the old chunks value. None of this
>> is available to resctrl_arch_rmid_read(). MPAM requires the
>> resource and domain structures to find the MMIO device that holds
>> the registers.

>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index b6ad290fda8d..277c22f8c976 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -167,10 +167,14 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
>> memset(am, 0, sizeof(*am));
>> }
>>
>> -int resctrl_arch_rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
>> +int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
>> + u32 rmid, enum resctrl_event_id eventid, u64 *val)
>> {
>> u64 msr_val;
>>
>> + if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))

> We already tested this and disabled preemption. (At least from some
> caller AFAICT from this patch.) I'd assume we'd want the fs code to
> handle preemption disable and checking cpumask. In any case, it should
> be clear what guarantees resctrl_arch_rmid_read() has.

This started as a lockdep warning for things that don't matter on x86, but would break
arm64, combined with some half-baked thinking about RT.

I'll add a comment (in the header file).

It needs to be called on the correct CPU, but from process context, as MPAM needs to send
an IPI from here. I didn't want to add a preempt_disable() lockdep annotation here as it's not
pre-emption that's the problem, but migration. cqm_handle_limbo() and
mbm_handle_overflow() are the two main routes in here, and they both use
schedule_delayed_work_on() to target the 'correct' CPU.


Thanks,

James
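
The contract discussed in this thread (the read must run on a CPU that belongs to the domain, and a failed read is treated as dirty by the limbo check) can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: struct domain_model, current_cpu, model_rmid_read() and model_rmid_dirty() are hypothetical stand-ins for struct rdt_domain, smp_processor_id(), resctrl_arch_rmid_read() and the local rmid_dirty variable, and the "hardware" value is simply passed in as an argument.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rdt_domain: bit n set means CPU n is in the domain. */
struct domain_model {
	uint64_t cpu_mask;
};

/* Hypothetical stand-in for smp_processor_id(); callers set it directly. */
static unsigned int current_cpu;

/* Refuse the read unless the calling CPU is part of the domain. */
static int model_rmid_read(const struct domain_model *d, uint64_t hw_val,
			   uint64_t *val)
{
	assert(d && val);

	if (current_cpu >= 64 || !(d->cpu_mask & (UINT64_C(1) << current_cpu)))
		return -EINVAL;

	*val = hw_val;
	return 0;
}

/* Mirrors the merged rmid_dirty logic: a failed read counts as dirty. */
static int model_rmid_dirty(const struct domain_model *d, uint64_t hw_val,
			    uint64_t threshold)
{
	uint64_t val = 0;

	if (model_rmid_read(d, hw_val, &val))
		return 1;

	return val >= threshold;
}
```

In this model a caller on the wrong CPU gets -EINVAL rather than a value read from some other domain's counter, and the limbo check then errs on the side of keeping the RMID busy.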