2017-04-04 19:14:44

by Liang, Kan

Subject: [PATCH V2] perf/x86: fix spurious NMI with PEBS Load Latency event

From: Kan Liang <[email protected]>

Spurious NMIs will be observed when applying the following command.
while true ; do sudo perf record -b -a -e
"cpu/umask=0x01,event=0xcd,ldlat=0x80/pp,cpu/umask=0x03,event=0x0/,
cpu/umask=0x02,event=0x0/,cycles,branches,cache-misses,
cache-references" -- sleep 10 ; done

The issue was introduced by
commit 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL
status on HSW+")

The previous patch clear the status's bits for the counters used for
PEBS events, by masking the whole 64 bits pebs_enabled.
However, only the lower 32 bits of both status and pebs_enabled are
reserved for PEBS-able counters.
For status, the first three bits of upper 32 bits are fixed counter
overflow bit.
For pebs_enabled, the first three bits of upper 32 bits are for PEBS
Load Latency event.
In the test case, the PEBS Load Latency event and fixed counter event
could be overflowed at the same time. The fixed counter overflow bit
will
be cleared by mistake. Once it is cleared, the fixed counter overflow
never be processed, which finally trigger spurious NMI.

Correct the PEBS enabled mask by ignoring the non-PEBS bits.

Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL
status on HSW+")
Signed-off-by: Kan Liang <[email protected]>
---

Change since V1:
- Use a macro for ((1ULL << MAX_PEBS_EVENTS) - 1)

arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/perf_event.h | 1 +
3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 319da60..3411e79 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2151,7 +2151,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
* counters from the GLOBAL_STATUS mask and we always process PEBS
* events via drain_pebs().
*/
- status &= ~cpuc->pebs_enabled;
+ status &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);

/*
* PEBS overflow sets bit 62 in the global status register
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 9dfeeec..c6d23ff 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1222,7 +1222,7 @@ get_next_pebs_record_by_bit(void *base, void *top, int bit)

/* clear non-PEBS bit and re-check */
pebs_status = p->status & cpuc->pebs_enabled;
- pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
+ pebs_status &= PEBS_COUNTER_MASK;
if (pebs_status == (1 << bit))
return at;
}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 416f565..2469fba 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -79,6 +79,7 @@ struct amd_nb {

/* The maximal number of PEBS events: */
#define MAX_PEBS_EVENTS 8
+#define PEBS_COUNTER_MASK ((1ULL << MAX_PEBS_EVENTS) - 1)

/*
* Flags PEBS can handle without an PMI.
--
2.4.3
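
The masking problem described in the changelog can be reproduced outside the kernel. The following is a minimal user-space sketch, not the kernel code itself: the helper names clear_pebs_bits_old() and clear_pebs_bits_new() and the two bit-32 constants are hypothetical, chosen only to mirror the bit layout stated above (bit 32 of status is a fixed counter overflow bit, bit 32 of pebs_enabled is a PEBS Load Latency bit).

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PEBS_EVENTS   8
#define PEBS_COUNTER_MASK ((1ULL << MAX_PEBS_EVENTS) - 1)

/* With MAX_PEBS_EVENTS == 8 the mask covers exactly bits 0-7. */
static_assert(PEBS_COUNTER_MASK == 0xffULL, "mask must cover the low 8 bits");

/* Illustrative constants (assumed names), following the changelog's layout. */
#define STATUS_FIXED_CTR0_OVF  (1ULL << 32) /* status: fixed counter 0 overflow */
#define PEBS_EN_LDLAT_CTR0     (1ULL << 32) /* pebs_enabled: load latency, counter 0 */

/* Behaviour before the patch: every set bit of pebs_enabled is cleared. */
static inline uint64_t clear_pebs_bits_old(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~pebs_enabled;
}

/* Behaviour after the patch: only the PEBS counter bits are cleared. */
static inline uint64_t clear_pebs_bits_new(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~(pebs_enabled & PEBS_COUNTER_MASK);
}
```

With pebs_enabled = 0x1 | PEBS_EN_LDLAT_CTR0 and status = 0x1 | STATUS_FIXED_CTR0_OVF, the old helper returns 0, so the fixed counter overflow is lost, while the new helper returns STATUS_FIXED_CTR0_OVF and the overflow can still be handled.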


2017-04-06 09:11:14

by Peter Zijlstra

Subject: Re: [PATCH V2] perf/x86: fix spurious NMI with PEBS Load Latency event

On Tue, Apr 04, 2017 at 03:14:06PM -0400, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> Spurious NMIs will be observed when applying the following command.
> while true ; do sudo perf record -b -a -e
> "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp,cpu/umask=0x03,event=0x0/,
> cpu/umask=0x02,event=0x0/,cycles,branches,cache-misses,
> cache-references" -- sleep 10 ; done
>
> The issue was introduced by
> commit 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL
> status on HSW+")
>
> The previous patch clear the status's bits for the counters used for
> PEBS events, by masking the whole 64 bits pebs_enabled.
> However, only the lower 32 bits of both status and pebs_enabled are
> reserved for PEBS-able counters.
> For status, the first three bits of upper 32 bits are fixed counter
> overflow bit.
> For pebs_enabled, the first three bits of upper 32 bits are for PEBS
> Load Latency event.
> In the test case, the PEBS Load Latency event and fixed counter event
> could be overflowed at the same time. The fixed counter overflow bit
> will
> be cleared by mistake. Once it is cleared, the fixed counter overflow
> never be processed, which finally trigger spurious NMI.
>
> Correct the PEBS enabled mask by ignoring the non-PEBS bits.
>
> Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL
> status on HSW+")
> Signed-off-by: Kan Liang <[email protected]>

That is atrocious unreadable garbage; in a large part due to the random
line breaks.

Any half sane editor can reflow text; use that.


---
Subject: perf/x86: Fix spurious NMI with PEBS Load Latency event
From: Kan Liang <[email protected]>
Date: Tue, 4 Apr 2017 15:14:06 -0400

Spurious NMIs will be observed with the following command:

  while :; do
    perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp" \
                  -e "cpu/umask=0x03,event=0x0/" \
                  -e "cpu/umask=0x02,event=0x0/" \
                  -e cycles,branches,cache-misses \
                  -e cache-references -- sleep 10
  done

The issue was introduced by commit:

8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")

That commit clears the status bits for the counters used for PEBS
events by masking with all 64 bits of pebs_enabled. However, only the
low 32 bits of both status and pebs_enabled are reserved for PEBS-able
counters.

For status, bits 32-34 are the fixed counter overflow bits. For
pebs_enabled, bits 32-34 are for PEBS Load Latency.

In the test case, the PEBS Load Latency event and a fixed counter event
can overflow at the same time. The fixed counter overflow bit is then
cleared by mistake. Once it is cleared, the fixed counter overflow is
never processed, which eventually triggers a spurious NMI.

Correct the PEBS enabled mask by ignoring the non-PEBS bits.

Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/perf_event.h | 1 +
3 files changed, 3 insertions(+), 2 deletions(-)

--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2151,7 +2151,7 @@ static int intel_pmu_handle_irq(struct p
* counters from the GLOBAL_STATUS mask and we always process PEBS
* events via drain_pebs().
*/
- status &= ~cpuc->pebs_enabled;
+ status &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);

/*
* PEBS overflow sets bit 62 in the global status register
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1222,7 +1222,7 @@ get_next_pebs_record_by_bit(void *base,

/* clear non-PEBS bit and re-check */
pebs_status = p->status & cpuc->pebs_enabled;
- pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
+ pebs_status &= PEBS_COUNTER_MASK;
if (pebs_status == (1 << bit))
return at;
}
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -79,6 +79,7 @@ struct amd_nb {

/* The maximal number of PEBS events: */
#define MAX_PEBS_EVENTS 8
+#define PEBS_COUNTER_MASK ((1ULL << MAX_PEBS_EVENTS) - 1)

/*
* Flags PEBS can handle without an PMI.
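
The ds.c hunk uses the same mask when deciding whether a PEBS record belongs to exactly one counter. As a rough user-space sketch of that check (the helper name record_is_only_for() is hypothetical, and the record is reduced to its status word for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PEBS_EVENTS   8
#define PEBS_COUNTER_MASK ((1ULL << MAX_PEBS_EVENTS) - 1)

static_assert(PEBS_COUNTER_MASK == 0xffULL, "mask must cover the low 8 bits");

/*
 * Hypothetical helper mirroring the check in the hunk above: drop the
 * bits that are not enabled for PEBS, drop the non-counter bits, then
 * test whether only the requested counter bit remains.
 */
static inline bool record_is_only_for(uint64_t record_status,
				      uint64_t pebs_enabled, int bit)
{
	uint64_t pebs_status = record_status & pebs_enabled;

	pebs_status &= PEBS_COUNTER_MASK;	/* clear non-PEBS bits */
	return pebs_status == (1ULL << bit);
}
```

A record whose status has bit 0 and bit 32 set, with both bits also set in pebs_enabled, is still attributed to counter 0, because bit 32 is outside PEBS_COUNTER_MASK. A record with bits 0 and 1 set matches neither counter on its own.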

Subject: [tip:perf/core] perf/x86: Fix spurious NMI with PEBS Load Latency event

Commit-ID: fd583ad1563bec5f00140e1f2444adbcd331caad
Gitweb: http://git.kernel.org/tip/fd583ad1563bec5f00140e1f2444adbcd331caad
Author: Kan Liang <[email protected]>
AuthorDate: Tue, 4 Apr 2017 15:14:06 -0400
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 14 Apr 2017 10:31:39 +0200

perf/x86: Fix spurious NMI with PEBS Load Latency event

Spurious NMIs will be observed with the following command:

  while :; do
    perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp" \
                  -e "cpu/umask=0x03,event=0x0/" \
                  -e "cpu/umask=0x02,event=0x0/" \
                  -e cycles,branches,cache-misses \
                  -e cache-references -- sleep 10
  done

The bug was introduced by commit:

8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")

That commit clears the status bits for the counters used for PEBS
events by masking with all 64 bits of pebs_enabled. However, only the
low 32 bits of both status and pebs_enabled are reserved for PEBS-able
counters.

For status, bits 32-34 are the fixed counter overflow bits. For
pebs_enabled, bits 32-34 are for PEBS Load Latency.

In the test case, the PEBS Load Latency event and a fixed counter event
can overflow at the same time. The fixed counter overflow bit is then
cleared by mistake. Once it is cleared, the fixed counter overflow is
never processed, which eventually triggers a spurious NMI.

Correct the PEBS enabled mask by ignoring the non-PEBS bits.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/perf_event.h | 1 +
3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4244bed..a6d91d4 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2151,7 +2151,7 @@ again:
* counters from the GLOBAL_STATUS mask and we always process PEBS
* events via drain_pebs().
*/
- status &= ~cpuc->pebs_enabled;
+ status &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);

/*
* PEBS overflow sets bit 62 in the global status register
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 9dfeeec..c6d23ff 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1222,7 +1222,7 @@ get_next_pebs_record_by_bit(void *base, void *top, int bit)

/* clear non-PEBS bit and re-check */
pebs_status = p->status & cpuc->pebs_enabled;
- pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
+ pebs_status &= PEBS_COUNTER_MASK;
if (pebs_status == (1 << bit))
return at;
}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index bcbb1d2..be3d362 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -79,6 +79,7 @@ struct amd_nb {

/* The maximal number of PEBS events: */
#define MAX_PEBS_EVENTS 8
+#define PEBS_COUNTER_MASK ((1ULL << MAX_PEBS_EVENTS) - 1)

/*
* Flags PEBS can handle without an PMI.