2021-05-04 06:41:27

by Suthikulpanit, Suravee

Subject: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

On certain AMD platforms, when the IOMMU performance counter source
(csource) field is zero, power-gating for the counter is enabled, which
prevents write access and returns zero for read access.

This can cause an invalid perf result, especially when event multiplexing
is needed (i.e. more events than available counters), since the current
logic keeps track of the previously read counter value and subsequently
re-programs the counter to continue counting the event. With power-gating
enabled, we cannot guarantee successful re-programming of the counter.

Work around this issue by:

1. Modifying the ordering of setting/reading counters and enabling/
disabling csources to only access the counter when the csource
is set to non-zero.

2. Since AMD IOMMU PMU does not support interrupt mode, the logic
can be simplified to always start counting with value zero,
and accumulate the counter value when stopping without the need
to keep track and reprogram the counter with the previously read
counter value.

This has been tested on systems with and without power-gating.

Fixes: 994d6608efe4 ("iommu/amd: Remove performance counter pre-initialization test")
Suggested-by: Alexander Monakov <[email protected]>
Cc: David Coe <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/events/amd/iommu.c | 47 ++++++++++++++++++++-----------------
1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 1c1a7e45dc64..913745f1419b 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -19,8 +19,6 @@
#include "../perf_event.h"
#include "iommu.h"

-#define COUNTER_SHIFT 16
-
/* iommu pmu conf masks */
#define GET_CSOURCE(x) ((x)->conf & 0xFFULL)
#define GET_DEVID(x) (((x)->conf >> 8) & 0xFFFFULL)
@@ -286,22 +284,31 @@ static void perf_iommu_start(struct perf_event *event, int flags)
WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
hwc->state = 0;

+ /*
+ * To account for power-gating, which prevents write to
+ * the counter, we need to enable the counter
+ * before setting up counter register.
+ */
+ perf_iommu_enable_event(event);
+
if (flags & PERF_EF_RELOAD) {
- u64 prev_raw_count = local64_read(&hwc->prev_count);
+ u64 count = 0;
struct amd_iommu *iommu = perf_event_2_iommu(event);

+ /*
+ * Since the IOMMU PMU only support counting mode,
+ * the counter always start with value zero.
+ */
amd_iommu_pc_set_reg(iommu, hwc->iommu_bank, hwc->iommu_cntr,
- IOMMU_PC_COUNTER_REG, &prev_raw_count);
+ IOMMU_PC_COUNTER_REG, &count);
}

- perf_iommu_enable_event(event);
perf_event_update_userpage(event);
-
}

static void perf_iommu_read(struct perf_event *event)
{
- u64 count, prev, delta;
+ u64 count;
struct hw_perf_event *hwc = &event->hw;
struct amd_iommu *iommu = perf_event_2_iommu(event);

@@ -312,14 +319,11 @@ static void perf_iommu_read(struct perf_event *event)
/* IOMMU pc counter register is only 48 bits */
count &= GENMASK_ULL(47, 0);

- prev = local64_read(&hwc->prev_count);
- if (local64_cmpxchg(&hwc->prev_count, prev, count) != prev)
- return;
-
- /* Handle 48-bit counter overflow */
- delta = (count << COUNTER_SHIFT) - (prev << COUNTER_SHIFT);
- delta >>= COUNTER_SHIFT;
- local64_add(delta, &event->count);
+ /*
+ * Since the counter always start with value zero,
+ * simply just accumulate the count for the event.
+ */
+ local64_add(count, &event->count);
}

static void perf_iommu_stop(struct perf_event *event, int flags)
@@ -329,15 +333,16 @@ static void perf_iommu_stop(struct perf_event *event, int flags)
if (hwc->state & PERF_HES_UPTODATE)
return;

+ /*
+ * To account for power-gating, in which reading the counter would
+ * return zero, we need to read the register before disabling.
+ */
+ perf_iommu_read(event);
+ hwc->state |= PERF_HES_UPTODATE;
+
perf_iommu_disable_event(event);
WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
hwc->state |= PERF_HES_STOPPED;
-
- if (hwc->state & PERF_HES_UPTODATE)
- return;
-
- perf_iommu_read(event);
- hwc->state |= PERF_HES_UPTODATE;
}

static int perf_iommu_add(struct perf_event *event, int flags)
--
2.17.1


2021-05-04 09:43:30

by Peter Zijlstra

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

On Tue, May 04, 2021 at 01:52:36AM -0500, Suravee Suthikulpanit wrote:

> 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
> can be simplified to always start counting with value zero,
> and accumulate the counter value when stopping without the need
> to keep track and reprogram the counter with the previously read
> counter value.

This relies on the hardware counter being the full 64bit wide, is it?

2021-05-04 12:01:10

by Suthikulpanit, Suravee

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

Peter,

On 5/4/2021 4:39 PM, Peter Zijlstra wrote:
> On Tue, May 04, 2021 at 01:52:36AM -0500, Suravee Suthikulpanit wrote:
>
>> 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
>> can be simplified to always start counting with value zero,
>> and accumulate the counter value when stopping without the need
>> to keep track and reprogram the counter with the previously read
>> counter value.
>
> This relies on the hardware counter being the full 64bit wide, is it?
>

The HW counter value is 48-bit. Not sure why it needs to be 64-bit?
I might be missing some points here? Could you please describe?

Thanks,
Suravee


2021-05-04 12:15:07

by Peter Zijlstra

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

On Tue, May 04, 2021 at 06:58:29PM +0700, Suthikulpanit, Suravee wrote:
> Peter,
>
> On 5/4/2021 4:39 PM, Peter Zijlstra wrote:
> > On Tue, May 04, 2021 at 01:52:36AM -0500, Suravee Suthikulpanit wrote:
> >
> > > 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
> > > can be simplified to always start counting with value zero,
> > > and accumulate the counter value when stopping without the need
> > > to keep track and reprogram the counter with the previously read
> > > counter value.
> >
> > This relies on the hardware counter being the full 64bit wide, is it?
> >
>
> The HW counter value is 48-bit. Not sure why it needs to be 64-bit?
> I might be missing some points here? Could you please describe?

How do you deal with the 48bit overflow if you don't use the interrupt?
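For context, the delta path removed by this patch tolerated a single 48-bit
wrap between two successive reads by left-aligning both samples so the
subtraction wraps naturally in 64-bit arithmetic; that only helps if the
counter is read at least once per wrap period. A minimal standalone sketch
of the arithmetic (values are purely illustrative):

#include <stdint.h>
#include <stdio.h>

#define COUNTER_SHIFT	16	/* 64 - 48: left-align the 48-bit sample */

/* Delta between two 48-bit counter reads, tolerating one wrap. */
static uint64_t delta48(uint64_t prev, uint64_t count)
{
	return ((count << COUNTER_SHIFT) - (prev << COUNTER_SHIFT)) >> COUNTER_SHIFT;
}

int main(void)
{
	uint64_t prev  = 0xFFFFFFFFFFF0ULL;	/* just below 2^48   */
	uint64_t count = 0x000000000010ULL;	/* wrapped past zero */

	/* Prints 32, i.e. (count - prev) modulo 2^48. */
	printf("%llu\n", (unsigned long long)delta48(prev, count));
	return 0;
}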

2021-05-04 17:19:08

by David Coe

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

Hi again!

On 04/05/2021 07:52, Suravee Suthikulpanit wrote:
> On certain AMD platforms, when the IOMMU performance counter source
> (csource) field is zero, power-gating for the counter is enabled, which
> prevents write access and returns zero for read access.
>
> This can cause an invalid perf result, especially when event multiplexing
> is needed (i.e. more events than available counters), since the current
> logic keeps track of the previously read counter value and subsequently
> re-programs the counter to continue counting the event. With power-gating
> enabled, we cannot guarantee successful re-programming of the counter.
>
> Work around this issue by:
>
> 1. Modifying the ordering of setting/reading counters and enabling/
> disabling csources to only access the counter when the csource
> is set to non-zero.
>
> 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
> can be simplified to always start counting with value zero,
> and accumulate the counter value when stopping without the need
> to keep track and reprogram the counter with the previously read
> counter value.
>

Results for Ryzen 4700U running Ubuntu 21.04 kernel 5.11.0-16 patched as
above.

All amd_iommu events:

Performance counter stats for 'system wide':

18 amd_iommu_0/cmd_processed/ (33.29%)
9 amd_iommu_0/cmd_processed_inv/ (33.33%)
0 amd_iommu_0/ign_rd_wr_mmio_1ff8h/ (33.36%)
308 amd_iommu_0/int_dte_hit/ (33.40%)
5 amd_iommu_0/int_dte_mis/ (33.45%)
346 amd_iommu_0/mem_dte_hit/ (33.46%)
8,954 amd_iommu_0/mem_dte_mis/ (33.48%)
0 amd_iommu_0/mem_iommu_tlb_pde_hit/ (33.46%)
771 amd_iommu_0/mem_iommu_tlb_pde_mis/ (33.44%)
14 amd_iommu_0/mem_iommu_tlb_pte_hit/ (33.40%)
836 amd_iommu_0/mem_iommu_tlb_pte_mis/ (33.36%)
0 amd_iommu_0/mem_pass_excl/ (33.32%)
0 amd_iommu_0/mem_pass_pretrans/ (33.28%)
1,601 amd_iommu_0/mem_pass_untrans/ (33.27%)
0 amd_iommu_0/mem_target_abort/ (33.27%)
1,130 amd_iommu_0/mem_trans_total/ (33.27%)
0 amd_iommu_0/page_tbl_read_gst/ (33.27%)
312 amd_iommu_0/page_tbl_read_nst/ (33.28%)
279 amd_iommu_0/page_tbl_read_tot/ (33.27%)
0 amd_iommu_0/smi_blk/ (33.29%)
0 amd_iommu_0/smi_recv/ (33.27%)
0 amd_iommu_0/tlb_inv/ (33.26%)
0 amd_iommu_0/vapic_int_guest/ (33.25%)
366 amd_iommu_0/vapic_int_non_guest/ (33.27%)

10.001941666 seconds time elapsed


Groups of 8 amd_iommu events:

Performance counter stats for 'system wide':

14 amd_iommu_0/cmd_processed/
7 amd_iommu_0/cmd_processed_inv/
0 amd_iommu_0/ign_rd_wr_mmio_1ff8h/
502 amd_iommu_0/int_dte_hit/
6 amd_iommu_0/int_dte_mis/
532 amd_iommu_0/mem_dte_hit/
13,622 amd_iommu_0/mem_dte_mis/
159 amd_iommu_0/mem_iommu_tlb_pde_hit/

10.002170562 seconds time elapsed


Performance counter stats for 'system wide':

762 amd_iommu_0/mem_iommu_tlb_pde_mis/
20 amd_iommu_0/mem_iommu_tlb_pte_hit/
698 amd_iommu_0/mem_iommu_tlb_pte_mis/
0 amd_iommu_0/mem_pass_excl/
0 amd_iommu_0/mem_pass_pretrans/
15 amd_iommu_0/mem_pass_untrans/
0 amd_iommu_0/mem_target_abort/
718 amd_iommu_0/mem_trans_total/

10.001683428 seconds time elapsed


Performance counter stats for 'system wide':

0 amd_iommu_0/page_tbl_read_gst/
33 amd_iommu_0/page_tbl_read_nst/
33 amd_iommu_0/page_tbl_read_tot/
0 amd_iommu_0/smi_blk/
0 amd_iommu_0/smi_recv/
0 amd_iommu_0/tlb_inv/
0 amd_iommu_0/vapic_int_guest/
11,638 amd_iommu_0/vapic_int_non_guest/

10.002205748 seconds time elapsed

2021-05-05 12:05:06

by David Coe

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

Hi, once more!

On 04/05/2021 07:52, Suravee Suthikulpanit wrote:
> On certain AMD platforms, when the IOMMU performance counter source
> (csource) field is zero, power-gating for the counter is enabled, which
> prevents write access and returns zero for read access.
>
> This can cause an invalid perf result, especially when event multiplexing
> is needed (i.e. more events than available counters), since the current
> logic keeps track of the previously read counter value and subsequently
> re-programs the counter to continue counting the event. With power-gating
> enabled, we cannot guarantee successful re-programming of the counter.
>
> Work around this issue by:
>
> 1. Modifying the ordering of setting/reading counters and enabling/
> disabling csources to only access the counter when the csource
> is set to non-zero.
>
> 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
> can be simplified to always start counting with value zero,
> and accumulate the counter value when stopping without the need
> to keep track and reprogram the counter with the previously read
> counter value.
>
> This has been tested on systems with and without power-gating.
>

Just as a final sanity check, I've loaded the same patched kernel
5.11.0-16 onto an old AMD Athlon FX8350. So far, all seems in order: it
loads IOMMUv1 and runs Ubuntu 21.04 without incident!

Much appreciate all your efforts, Suravee, Alex et al. Best regards.

--
David

2021-05-05 12:42:06

by Suthikulpanit, Suravee

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

Peter,

On 5/4/2021 7:13 PM, Peter Zijlstra wrote:
> On Tue, May 04, 2021 at 06:58:29PM +0700, Suthikulpanit, Suravee wrote:
>> Peter,
>>
>> On 5/4/2021 4:39 PM, Peter Zijlstra wrote:
>>> On Tue, May 04, 2021 at 01:52:36AM -0500, Suravee Suthikulpanit wrote:
>>>
>>>> 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
>>>> can be simplified to always start counting with value zero,
>>>> and accumulate the counter value when stopping without the need
>>>> to keep track and reprogram the counter with the previously read
>>>> counter value.
>>>
>>> This relies on the hardware counter being the full 64bit wide, is it?
>>>
>>
>> The HW counter value is 48-bit. Not sure why it needs to be 64-bit?
>> I might be missing some points here? Could you please describe?
>
> How do you deal with the 48bit overflow if you don't use the interrupt?

The IOMMU Perf driver does not currently handle counter overflow, since the overflow
notification mechanism (i.e. the IOMMU creates an EVENT_COUNTER_ZERO event in the IOMMU
event log and generates an IOMMU MSI interrupt to signal the IOMMU driver to process the
event) is not currently supported. When the counter overflows, it becomes zero, and Perf
reports value zero for the event.

Alternatively, to detect overflow, we might start counting with value 1 so that
we can detect overflow when the value becomes zero, in which case the Perf driver
could generate an error message.

Regards,
Suravee
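
A rough sketch of the start-at-1 heuristic described above, written in the
driver's style; sketch_iommu_start()/sketch_iommu_read() are hypothetical
helpers (not the driver's actual callbacks), and the check only fires when
the read-back happens to be exactly zero:

/*
 * Sketch only: program the 48-bit counter to 1 instead of 0 when
 * starting, so that a later read-back of exactly 0 can be flagged as
 * "wrapped past 2^48 (or the read was power-gated)" rather than being
 * silently accounted as zero events.
 */
static void sketch_iommu_start(struct perf_event *event)
{
	struct hw_perf_event *hwc = &event->hw;
	struct amd_iommu *iommu = perf_event_2_iommu(event);
	u64 start = 1;

	amd_iommu_pc_set_reg(iommu, hwc->iommu_bank, hwc->iommu_cntr,
			     IOMMU_PC_COUNTER_REG, &start);
}

static void sketch_iommu_read(struct perf_event *event)
{
	struct hw_perf_event *hwc = &event->hw;
	struct amd_iommu *iommu = perf_event_2_iommu(event);
	u64 count;

	if (amd_iommu_pc_get_reg(iommu, hwc->iommu_bank, hwc->iommu_cntr,
				 IOMMU_PC_COUNTER_REG, &count))
		return;

	count &= GENMASK_ULL(47, 0);
	if (!count) {
		/* Wrapped (or power-gated read): warn instead of undercounting. */
		pr_warn_once("amd_iommu PMU: counter overflow detected\n");
		return;
	}

	/* Subtract the initial 1 before accumulating into the event. */
	local64_add(count - 1, &event->count);
}

As described, this only catches a wrap whose read-back lands exactly on zero,
so it is a best-effort diagnostic rather than a full fix.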

2021-05-05 13:09:17

by Peter Zijlstra

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

On Wed, May 05, 2021 at 07:39:14PM +0700, Suthikulpanit, Suravee wrote:
> Peter,
>
> On 5/4/2021 7:13 PM, Peter Zijlstra wrote:
> > On Tue, May 04, 2021 at 06:58:29PM +0700, Suthikulpanit, Suravee wrote:
> > > Peter,
> > >
> > > On 5/4/2021 4:39 PM, Peter Zijlstra wrote:
> > > > On Tue, May 04, 2021 at 01:52:36AM -0500, Suravee Suthikulpanit wrote:
> > > >
> > > > > 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
> > > > > can be simplified to always start counting with value zero,
> > > > > and accumulate the counter value when stopping without the need
> > > > > to keep track and reprogram the counter with the previously read
> > > > > counter value.
> > > >
> > > > This relies on the hardware counter being the full 64bit wide, is it?
> > > >
> > >
> > > The HW counter value is 48-bit. Not sure why it needs to be 64-bit?
> > > I might be missing some points here? Could you please describe?
> >
> > How do you deal with the 48bit overflow if you don't use the interrupt?
>
> The IOMMU Perf driver does not currently handle counter overflow, since the overflow
> notification mechanism (i.e. the IOMMU creates an EVENT_COUNTER_ZERO event in the IOMMU
> event log and generates an IOMMU MSI interrupt to signal the IOMMU driver to process the
> event) is not currently supported. When the counter overflows, it becomes zero, and Perf
> reports value zero for the event.
>
> Alternatively, to detect overflow, we might start counting with value 1 so that
> we can detect overflow when the value becomes zero, in which case the Perf driver
> could generate an error message.

Urgh.. the intel uncore driver programs an hrtimer to periodically fold
deltas. That way the counter will never be short.
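
A rough sketch of that uncore-style periodic folding applied to this driver,
assuming an hrtimer and an active-event list were added to struct
perf_amd_iommu; those fields, the helper names, and the 60-second interval
are illustrative assumptions, with perf_iommu_read() doing the actual folding:

#include <linux/hrtimer.h>
#include <linux/ktime.h>
#include <linux/list.h>
#include <linux/perf_event.h>

/* Must be well under the time the fastest event needs to wrap 2^48. */
#define IOMMU_PC_HRTIMER_INTERVAL	(60ULL * NSEC_PER_SEC)

static enum hrtimer_restart perf_iommu_hrtimer(struct hrtimer *hrtimer)
{
	struct perf_amd_iommu *pi = container_of(hrtimer, struct perf_amd_iommu,
						 hrtimer);
	struct perf_event *event;

	/*
	 * Fold the current hardware value into event->count for every
	 * active event, so no interval between folds can span a wrap.
	 */
	list_for_each_entry(event, &pi->active_list, active_entry)
		perf_iommu_read(event);

	hrtimer_forward_now(hrtimer, ns_to_ktime(IOMMU_PC_HRTIMER_INTERVAL));
	return HRTIMER_RESTART;
}

static void perf_iommu_start_hrtimer(struct perf_amd_iommu *pi)
{
	hrtimer_init(&pi->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	pi->hrtimer.function = perf_iommu_hrtimer;
	hrtimer_start(&pi->hrtimer, ns_to_ktime(IOMMU_PC_HRTIMER_INTERVAL),
		      HRTIMER_MODE_REL);
}

The corresponding stop path would cancel the timer with hrtimer_cancel()
once no events remain active on that PMU.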

Subject: [tip: perf/urgent] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID: e10de314287c2c14b0e6f0e3e961975ce2f4a83d
Gitweb: https://git.kernel.org/tip/e10de314287c2c14b0e6f0e3e961975ce2f4a83d
Author: Suravee Suthikulpanit <[email protected]>
AuthorDate: Tue, 04 May 2021 01:52:36 -05:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 06 May 2021 15:33:37 +02:00

x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

On certain AMD platforms, when the IOMMU performance counter source
(csource) field is zero, power-gating for the counter is enabled, which
prevents write access and returns zero for read access.

This can cause an invalid perf result, especially when event multiplexing
is needed (i.e. more events than available counters), since the current
logic keeps track of the previously read counter value and subsequently
re-programs the counter to continue counting the event. With power-gating
enabled, we cannot guarantee successful re-programming of the counter.

Work around this issue by:

1. Modifying the ordering of setting/reading counters and enabling/
disabling csources to only access the counter when the csource
is set to non-zero.

2. Since AMD IOMMU PMU does not support interrupt mode, the logic
can be simplified to always start counting with value zero,
and accumulate the counter value when stopping without the need
to keep track and reprogram the counter with the previously read
counter value.

This has been tested on systems with and without power-gating.

Fixes: 994d6608efe4 ("iommu/amd: Remove performance counter pre-initialization test")
Suggested-by: Alexander Monakov <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/amd/iommu.c | 47 +++++++++++++++++++-----------------
1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
index 6a98a76..2da6139 100644
--- a/arch/x86/events/amd/iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -18,8 +18,6 @@
#include "../perf_event.h"
#include "iommu.h"

-#define COUNTER_SHIFT 16
-
/* iommu pmu conf masks */
#define GET_CSOURCE(x) ((x)->conf & 0xFFULL)
#define GET_DEVID(x) (((x)->conf >> 8) & 0xFFFFULL)
@@ -285,22 +283,31 @@ static void perf_iommu_start(struct perf_event *event, int flags)
WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
hwc->state = 0;

+ /*
+ * To account for power-gating, which prevents write to
+ * the counter, we need to enable the counter
+ * before setting up counter register.
+ */
+ perf_iommu_enable_event(event);
+
if (flags & PERF_EF_RELOAD) {
- u64 prev_raw_count = local64_read(&hwc->prev_count);
+ u64 count = 0;
struct amd_iommu *iommu = perf_event_2_iommu(event);

+ /*
+ * Since the IOMMU PMU only support counting mode,
+ * the counter always start with value zero.
+ */
amd_iommu_pc_set_reg(iommu, hwc->iommu_bank, hwc->iommu_cntr,
- IOMMU_PC_COUNTER_REG, &prev_raw_count);
+ IOMMU_PC_COUNTER_REG, &count);
}

- perf_iommu_enable_event(event);
perf_event_update_userpage(event);
-
}

static void perf_iommu_read(struct perf_event *event)
{
- u64 count, prev, delta;
+ u64 count;
struct hw_perf_event *hwc = &event->hw;
struct amd_iommu *iommu = perf_event_2_iommu(event);

@@ -311,14 +318,11 @@ static void perf_iommu_read(struct perf_event *event)
/* IOMMU pc counter register is only 48 bits */
count &= GENMASK_ULL(47, 0);

- prev = local64_read(&hwc->prev_count);
- if (local64_cmpxchg(&hwc->prev_count, prev, count) != prev)
- return;
-
- /* Handle 48-bit counter overflow */
- delta = (count << COUNTER_SHIFT) - (prev << COUNTER_SHIFT);
- delta >>= COUNTER_SHIFT;
- local64_add(delta, &event->count);
+ /*
+ * Since the counter always start with value zero,
+ * simply just accumulate the count for the event.
+ */
+ local64_add(count, &event->count);
}

static void perf_iommu_stop(struct perf_event *event, int flags)
@@ -328,15 +332,16 @@ static void perf_iommu_stop(struct perf_event *event, int flags)
if (hwc->state & PERF_HES_UPTODATE)
return;

+ /*
+ * To account for power-gating, in which reading the counter would
+ * return zero, we need to read the register before disabling.
+ */
+ perf_iommu_read(event);
+ hwc->state |= PERF_HES_UPTODATE;
+
perf_iommu_disable_event(event);
WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
hwc->state |= PERF_HES_STOPPED;
-
- if (hwc->state & PERF_HES_UPTODATE)
- return;
-
- perf_iommu_read(event);
- hwc->state |= PERF_HES_UPTODATE;
}

static int perf_iommu_add(struct perf_event *event, int flags)

2021-05-10 02:18:04

by Suthikulpanit, Suravee

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating



On 5/5/2021 8:05 PM, Peter Zijlstra wrote:
> On Wed, May 05, 2021 at 07:39:14PM +0700, Suthikulpanit, Suravee wrote:
>> Peter,
>>
>> On 5/4/2021 7:13 PM, Peter Zijlstra wrote:
>>> On Tue, May 04, 2021 at 06:58:29PM +0700, Suthikulpanit, Suravee wrote:
>>>> Peter,
>>>>
>>>> On 5/4/2021 4:39 PM, Peter Zijlstra wrote:
>>>>> On Tue, May 04, 2021 at 01:52:36AM -0500, Suravee Suthikulpanit wrote:
>>>>>
>>>>>> 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
>>>>>> can be simplified to always start counting with value zero,
>>>>>> and accumulate the counter value when stopping without the need
>>>>>> to keep track and reprogram the counter with the previously read
>>>>>> counter value.
>>>>>
>>>>> This relies on the hardware counter being the full 64bit wide, is it?
>>>>>
>>>>
>>>> The HW counter value is 48-bit. Not sure why it needs to be 64-bit?
>>>> I might be missing some points here? Could you please describe?
>>>
>>> How do you deal with the 48bit overflow if you don't use the interrupt?
>>
>> The IOMMU Perf driver does not currently handle counter overflow, since the overflow
>> notification mechanism (i.e. the IOMMU creates an EVENT_COUNTER_ZERO event in the IOMMU
>> event log and generates an IOMMU MSI interrupt to signal the IOMMU driver to process the
>> event) is not currently supported. When the counter overflows, it becomes zero, and Perf
>> reports value zero for the event.
>>
>> Alternatively, to detect overflow, we might start counting with value 1 so that
>> we can detect overflow when the value becomes zero, in which case the Perf driver
>> could generate an error message.
>
> Urgh.. the intel uncore driver programs an hrtimer to periodically fold
> deltas. That way the counter will never be short.
>

Thanks for the info. I'll look into ways to detect and handle counter overflow for this.
So far, the current power-gating scheme imposes several restrictions on when
the HW counter can be accessed, which makes this more difficult to handle.

Thanks,
Suravee

2021-05-14 11:06:26

by David Coe

Subject: Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

Hi all!

On 04/05/2021 07:52, Suravee Suthikulpanit wrote:
> On certain AMD platforms, when the IOMMU performance counter source
> (csource) field is zero, power-gating for the counter is enabled, which
> prevents write access and returns zero for read access.
>
> This can cause an invalid perf result, especially when event multiplexing
> is needed (i.e. more events than available counters), since the current
> logic keeps track of the previously read counter value and subsequently
> re-programs the counter to continue counting the event. With power-gating
> enabled, we cannot guarantee successful re-programming of the counter.
>
> Work around this issue by:
>
> 1. Modifying the ordering of setting/reading counters and enabling/
> disabling csources to only access the counter when the csource
> is set to non-zero.
>
> 2. Since AMD IOMMU PMU does not support interrupt mode, the logic
> can be simplified to always start counting with value zero,
> and accumulate the counter value when stopping without the need
> to keep track and reprogram the counter with the previously read
> counter value.
>
> This has been tested on systems with and without power-gating.

I've just noticed kernel-5.13-rc1 includes your full iommu enchilada. A
quick test with Ubuntu's mainline ppa debs (and a home-spun perf) gives
what seem very satisfactory results on a Ryzen 2400G. Bravo!

Performance counter stats for 'system wide':

0 amd_iommu_0/cmd_processed/ (33.32%)
0 amd_iommu_0/cmd_processed_inv/ (33.34%)
0 amd_iommu_0/ign_rd_wr_mmio_1ff8h/ (33.38%)
615 amd_iommu_0/int_dte_hit/ (33.44%)
5 amd_iommu_0/int_dte_mis/ (33.44%)
1,347 amd_iommu_0/mem_dte_hit/ (33.46%)
19,127 amd_iommu_0/mem_dte_mis/ (33.44%)
71 amd_iommu_0/mem_iommu_tlb_pde_hit/ (33.43%)
754 amd_iommu_0/mem_iommu_tlb_pde_mis/ (33.41%)
1,777 amd_iommu_0/mem_iommu_tlb_pte_hit/ (33.36%)
20,163 amd_iommu_0/mem_iommu_tlb_pte_mis/ (33.32%)
0 amd_iommu_0/mem_pass_excl/ (33.25%)
0 amd_iommu_0/mem_pass_pretrans/ (33.28%)
27,283 amd_iommu_0/mem_pass_untrans/ (33.27%)
0 amd_iommu_0/mem_target_abort/ (33.29%)
645 amd_iommu_0/mem_trans_total/ (33.32%)
0 amd_iommu_0/page_tbl_read_gst/ (33.28%)
183 amd_iommu_0/page_tbl_read_nst/ (33.30%)
45 amd_iommu_0/page_tbl_read_tot/ (33.30%)
0 amd_iommu_0/smi_blk/ (33.32%)
0 amd_iommu_0/smi_recv/ (33.28%)
0 amd_iommu_0/tlb_inv/ (33.27%)
0 amd_iommu_0/vapic_int_guest/ (33.28%)
613 amd_iommu_0/vapic_int_non_guest/ (33.26%)

9.998673791 seconds time elapsed

Running Windows 10 etc. under QEMU/KVM produces nothing untoward.
Again, congratulations and many thanks.

--
David