LinuxLists.cc - [PATCH] perf/x86: Fix overlap counter scheduling bug

2016-11-01 15:44:34

Subject: [PATCH] perf/x86: Fix overlap counter scheduling bug

My fuzzer testing hits following warning in the counter scheduling code:

WARNING: CPU: 0 PID: 0 at arch/x86/events/core.c:718 perf_assign_events+0x2ae/0x2c0
Call Trace:
<IRQ>
dump_stack+0x68/0x93
__warn+0xcb/0xf0
warn_slowpath_null+0x1d/0x20
perf_assign_events+0x2ae/0x2c0
uncore_assign_events+0x1a7/0x250 [intel_uncore]
uncore_pmu_event_add+0x7a/0x3c0 [intel_uncore]
event_sched_in.isra.104+0xf6/0x2e0
group_sched_in+0x6e/0x190
...

The reason is that the counter scheduling code assumes
overlap constraints with mask weight < SCHED_STATES_MAX.

This assumption is broken with uncore cbox constraint
added for snbep in:
3b19e4c98c03 perf/x86: Fix event constraint for SandyBridge-EP C-Box

It's also easily triggered by running following perf command
on snbep box:
# perf stat -e 'uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/' -a

Fixing this by increasing the SCHED_STATES_MAX to 3
and adding build check for EVENT_CONSTRAINT_OVERLAP
macro.

Cc: Vince Weaver <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Yan Zheng <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
arch/x86/events/core.c | 3 ---
arch/x86/events/perf_event.h | 10 +++++++++-
2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d31735f37ed7..c725f8854216 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -676,9 +676,6 @@ struct sched_state {
unsigned long used[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
};

-/* Total max is X86_PMC_IDX_MAX, but we are O(n!) limited */
-#define SCHED_STATES_MAX 2
-
struct perf_sched {
int max_weight;
int max_events;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 5874d8de1f8d..4553275b37d4 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -277,8 +277,16 @@ struct cpu_hw_events {
* dramatically. The number of such EVENT_CONSTRAINT_OVERLAP() macros
* and its counter masks must be kept at a minimum.
*/
+
+/* Total max is X86_PMC_IDX_MAX, but we are O(n!) limited */
+#define SCHED_STATES_MAX 3
+
+/* Check we dont overlap beyond the states max. */
+#define OVERLAP_CHECK(n) (!!sizeof(char[1 - 2*!!(HWEIGHT(n) > SCHED_STATES_MAX)]))
+#define OVERLAP_HWEIGHT(n) (OVERLAP_CHECK(n)*HWEIGHT(n))
+
#define EVENT_CONSTRAINT_OVERLAP(c, n, m) \
- __EVENT_CONSTRAINT(c, n, m, HWEIGHT(n), 1, 0)
+ __EVENT_CONSTRAINT(c, n, m, OVERLAP_HWEIGHT(n), 1, 0)

/*
* Constraint on the Event code.
--
2.7.4

2016-11-08 12:20:41

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

On Tue, Nov 01, 2016 at 04:44:28PM +0100, Jiri Olsa wrote:
> My fuzzer testing hits following warning in the counter scheduling code:
>
> WARNING: CPU: 0 PID: 0 at arch/x86/events/core.c:718 perf_assign_events+0x2ae/0x2c0
> Call Trace:
> <IRQ>
> dump_stack+0x68/0x93
> __warn+0xcb/0xf0
> warn_slowpath_null+0x1d/0x20
> perf_assign_events+0x2ae/0x2c0
> uncore_assign_events+0x1a7/0x250 [intel_uncore]
> uncore_pmu_event_add+0x7a/0x3c0 [intel_uncore]
> event_sched_in.isra.104+0xf6/0x2e0
> group_sched_in+0x6e/0x190
> ...
>
> The reason is that the counter scheduling code assumes
> overlap constraints with mask weight < SCHED_STATES_MAX.
>
> This assumption is broken with uncore cbox constraint
> added for snbep in:

> 3b19e4c98c03 perf/x86: Fix event constraint for SandyBridge-EP C-Box

3b19e4c98c03 ("perf/x86: Fix event constraint for SandyBridge-EP C-Box")

Is the right form.

> It's also easily triggered by running following perf command
> on snbep box:
> # perf stat -e 'uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/' -a
>
> Fixing this by increasing the SCHED_STATES_MAX to 3 and adding build
> check for EVENT_CONSTRAINT_OVERLAP macro.

> -#define SCHED_STATES_MAX 2
> +#define SCHED_STATES_MAX 3

Us having to increase this is ff'ing sad :/ That's seriously challenged
hardware :/

> +
> +/* Check we dont overlap beyond the states max. */
> +#define OVERLAP_CHECK(n) (!!sizeof(char[1 - 2*!!(HWEIGHT(n) > SCHED_STATES_MAX)]))
> +#define OVERLAP_HWEIGHT(n) (OVERLAP_CHECK(n)*HWEIGHT(n))

I'm not sure I get how this is correct at all. You cannot tell by a
single mask what the overlap is. You need all the masks.

The point is that that PMU has constraints like:

0x01 - 0001
0x03 - 0011
0x0e - 1110
0x0c - 1100

Which gets us a total of 4 overlapping counter masks, and that would
indeed lead to max 3 retries I think.

Now, I would much rather solve this by changing the constraint like the
below, that yields:

0x01 - 0001
0x03 - 0011

0x0c - 1100

Which is two distinct groups, only one of which has overlap. And the one
with overlap only has 2 overlapping masks, giving a max reties of 1.

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 272427700d48..71bc348736bd 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
- EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
+ UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that gives scheduling pain */
UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
UNCORE_EVENT_CONSTRAINT(0x31, 0x3),

2016-11-08 13:14:35

by Jiri Olsa

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

On Tue, Nov 08, 2016 at 01:20:39PM +0100, Peter Zijlstra wrote:

SNIP

> Now, I would much rather solve this by changing the constraint like the
> below, that yields:
>
> 0x01 - 0001
> 0x03 - 0011
>
> 0x0c - 1100
>
> Which is two distinct groups, only one of which has overlap. And the one
> with overlap only has 2 overlapping masks, giving a max reties of 1.

works for me..

thanks,
jirka

>
>
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 272427700d48..71bc348736bd 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
> UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
> UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
> UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
> - EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
> + UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that gives scheduling pain */
> UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
> UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
> UNCORE_EVENT_CONSTRAINT(0x31, 0x3),

2016-11-08 15:09:53

by Andi Kleen

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

Adding Kan who actually maintains the uncore drivers these days.

> Which is two distinct groups, only one of which has overlap. And the one
> with overlap only has 2 overlapping masks, giving a max reties of 1.

Looks reasonable to limit the mask.

Are we sure this problem isn't in the other 0xc masks too ?

-Andi

>
>
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 272427700d48..71bc348736bd 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
> UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
> UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
> UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
> - EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
> + UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that gives scheduling pain */
> UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
> UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
> UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
>

2016-11-08 16:22:19

by Liang, Kan

[permalink] [raw]

Subject: RE: [PATCH] perf/x86: Fix overlap counter scheduling bug

> >
> >
> > diff --git a/arch/x86/events/intel/uncore_snbep.c
> > b/arch/x86/events/intel/uncore_snbep.c
> > index 272427700d48..71bc348736bd 100644
> > --- a/arch/x86/events/intel/uncore_snbep.c
> > +++ b/arch/x86/events/intel/uncore_snbep.c
> > @@ -669,7 +669,7 @@ static struct event_constraint
> snbep_uncore_cbox_constraints[] = {
> > UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
> > UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
> > UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
> > - EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
> > + UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that
> gives
> > +scheduling pain */

I think the crash is caused by the overlap bit.
Why not just revert the previous patch?

If overlap bit is removed, the perf_sched_save_state will never be touched.
Why we have to reduce a counter?

Thanks,
Kan

> > UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
> > UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
> > UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
> >

2016-11-08 16:57:52

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

On Tue, Nov 08, 2016 at 04:22:13PM +0000, Liang, Kan wrote:
>
>
> > >
> > >
> > > diff --git a/arch/x86/events/intel/uncore_snbep.c
> > > b/arch/x86/events/intel/uncore_snbep.c
> > > index 272427700d48..71bc348736bd 100644
> > > --- a/arch/x86/events/intel/uncore_snbep.c
> > > +++ b/arch/x86/events/intel/uncore_snbep.c
> > > @@ -669,7 +669,7 @@ static struct event_constraint
> > snbep_uncore_cbox_constraints[] = {
> > > UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
> > > UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
> > > UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
> > > - EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
> > > + UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that
> > gives
> > > +scheduling pain */
>
> I think the crash is caused by the overlap bit.
> Why not just revert the previous patch?
>
> If overlap bit is removed, the perf_sched_save_state will never be touched.
> Why we have to reduce a counter?

By simply removing the overlap bit you'll still get bad scheduling
(we'll just not crash).

I think all the 0x3 mask need the overlap flag set, since they clearly
overlap with the 0x1 masks. That would improve the scheduling.

But as Jiri noted, you cannot do 0x1 + 0x3 + 0xc + 0xe without also
raising the retry limit, because that are 4 overlapping masks, you'll
have to, worst case, pop 3 attempts.

By reducing 0xe to 0xc you'll not have 4 overlapping masks anymore.

In any case, overlapping masks stink (because they make scheduling
O(n!)) and ideally hardware would not do this.

2016-11-08 17:25:44

by Liang, Kan

[permalink] [raw]

Subject: RE: [PATCH] perf/x86: Fix overlap counter scheduling bug

> >
> > > >
> > > >
> > > > diff --git a/arch/x86/events/intel/uncore_snbep.c
> > > > b/arch/x86/events/intel/uncore_snbep.c
> > > > index 272427700d48..71bc348736bd 100644
> > > > --- a/arch/x86/events/intel/uncore_snbep.c
> > > > +++ b/arch/x86/events/intel/uncore_snbep.c
> > > > @@ -669,7 +669,7 @@ static struct event_constraint
> > > snbep_uncore_cbox_constraints[] = {
> > > > UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
> > > > UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
> > > > UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
> > > > - EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
> > > > + UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that
> > > gives
> > > > +scheduling pain */
> >
> > I think the crash is caused by the overlap bit.
> > Why not just revert the previous patch?
> >
> > If overlap bit is removed, the perf_sched_save_state will never be
> touched.
> > Why we have to reduce a counter?
>
> By simply removing the overlap bit you'll still get bad scheduling (we'll just
> not crash).

But in some cases, it may be good to have 0xe.
For example,
uncore_cbox_0/event=0x1d/,uncore_cbox_0/event=0x1e/,uncore_cbox_0/event=0x1f/
events 1d and 1e have constraint 0xc.

>
> I think all the 0x3 mask need the overlap flag set, since they clearly overlap
> with the 0x1 masks. That would improve the scheduling.
>

How much the overlap hint can improve the scheduling?
Because there is not only snbep_uncore_cbox, but also other uncore events
which have overlapping masks.
If it's a significant improvement, I need to set overlap flag for all of them.

Thanks,
Kan
> But as Jiri noted, you cannot do 0x1 + 0x3 + 0xc + 0xe without also raising
> the retry limit, because that are 4 overlapping masks, you'll have to, worst
> case, pop 3 attempts.
>
> By reducing 0xe to 0xc you'll not have 4 overlapping masks anymore.
>
> In any case, overlapping masks stink (because they make scheduling
> O(n!)) and ideally hardware would not do this.

2016-11-08 18:27:44

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

On Tue, Nov 08, 2016 at 05:25:34PM +0000, Liang, Kan wrote:

> > I think all the 0x3 mask need the overlap flag set, since they clearly overlap
> > with the 0x1 masks. That would improve the scheduling.
> >
>
> How much the overlap hint can improve the scheduling?
> Because there is not only snbep_uncore_cbox, but also other uncore events
> which have overlapping masks.

Hurm, not much. We're saved by the fact that we schedule from wwin to
wmax, which means that we first place the 0x01 events, and then try and
fit the 0x03 events on top. That should already be good.

/me ponders more..

The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
the counter mask of such an event is not a subset of any other counter
mask of a constraint with an equal or higher weight".

Esp. that latter part is of interest here I think, our overlapping mask
is 0x0e, that has 3 bits set and is the highest weight mask in on the
PMU, therefore it will be placed last. Can we still create a scenario
where we would need to rewind that?

The scenario for AMD Fam15h is we're having masks like:

0x3F -- 111111
0x38 -- 111000
0x07 -- 000111

0x09 -- 001001

And we mark 0x09 as overlapping, because it is not a direct subset of
0x38 or 0x07 and has less weight than either of those. This means we'll
first try and place the 0x09 event, then try and place 0x38/0x07 events.
Now imagine we have:

3 * 0x07 + 0x09

and the initial pick for the 0x09 event is counter 0, then we'll fail to
place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
event, and then re-try all 0x07 events, which will now work.

But given, that in the uncore case, the overlapping event is the
heaviest mask, I don't think this can happen. Or did I overlook
something.... takes a bit to page all this back in.

2016-11-09 14:25:23

by Robert Richter

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

On 08.11.16 19:27:39, Peter Zijlstra wrote:
> The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
> the counter mask of such an event is not a subset of any other counter
> mask of a constraint with an equal or higher weight".
>
> Esp. that latter part is of interest here I think, our overlapping mask
> is 0x0e, that has 3 bits set and is the highest weight mask in on the
> PMU, therefore it will be placed last. Can we still create a scenario
> where we would need to rewind that?
>
> The scenario for AMD Fam15h is we're having masks like:
>
> 0x3F -- 111111
> 0x38 -- 111000
> 0x07 -- 000111
>
> 0x09 -- 001001
>
> And we mark 0x09 as overlapping, because it is not a direct subset of
> 0x38 or 0x07 and has less weight than either of those. This means we'll
> first try and place the 0x09 event, then try and place 0x38/0x07 events.
> Now imagine we have:
>
> 3 * 0x07 + 0x09
>
> and the initial pick for the 0x09 event is counter 0, then we'll fail to
> place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
> event, and then re-try all 0x07 events, which will now work.
>
>
>
> But given, that in the uncore case, the overlapping event is the
> heaviest mask, I don't think this can happen. Or did I overlook
> something.... takes a bit to page all this back in.

Right, IMO 0xE mask may not be marked as overlapping. It is placed
last and if there is no space left we are lost. There is no other
combination or state we could try then. So the fix is to remove the
overlapping bit for that counter, the state is then never saved.

This assumes there are no other counters than 0x3 and 0xc that overlap
with 0xe. It becomes a bit tricky if there is another counter with the
same or higher weight that overlaps with 0xe, e.g. 0x7.

-Robert

2016-11-09 15:51:55

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

On Wed, Nov 09, 2016 at 03:25:15PM +0100, Robert Richter wrote:
> On 08.11.16 19:27:39, Peter Zijlstra wrote:
> > The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
> > the counter mask of such an event is not a subset of any other counter
> > mask of a constraint with an equal or higher weight".
> >
> > Esp. that latter part is of interest here I think, our overlapping mask
> > is 0x0e, that has 3 bits set and is the highest weight mask in on the
> > PMU, therefore it will be placed last. Can we still create a scenario
> > where we would need to rewind that?
> >
> > The scenario for AMD Fam15h is we're having masks like:
> >
> > 0x3F -- 111111
> > 0x38 -- 111000
> > 0x07 -- 000111
> >
> > 0x09 -- 001001
> >
> > And we mark 0x09 as overlapping, because it is not a direct subset of
> > 0x38 or 0x07 and has less weight than either of those. This means we'll
> > first try and place the 0x09 event, then try and place 0x38/0x07 events.
> > Now imagine we have:
> >
> > 3 * 0x07 + 0x09
> >
> > and the initial pick for the 0x09 event is counter 0, then we'll fail to
> > place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
> > event, and then re-try all 0x07 events, which will now work.
> >
> >
> >
> > But given, that in the uncore case, the overlapping event is the
> > heaviest mask, I don't think this can happen. Or did I overlook
> > something.... takes a bit to page all this back in.
>
> Right, IMO 0xE mask may not be marked as overlapping. It is placed
> last and if there is no space left we are lost. There is no other
> combination or state we could try then. So the fix is to remove the
> overlapping bit for that counter, the state is then never saved.
>
> This assumes there are no other counters than 0x3 and 0xc that overlap
> with 0xe. It becomes a bit tricky if there is another counter with the
> same or higher weight that overlaps with 0xe, e.g. 0x7.

As per a prior mail, the masks on the PMU in question are:

0x01 - 0001
0x03 - 0011
0x0e - 1110
0x0c - 1100

But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
0x1) are of heavier weight, it should all work out I think.

So yes, something like the below (removing the OVERLAP bit) looks like
its sufficient.

---
arch/x86/events/intel/uncore_snbep.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 272427700d48..e6832be714bc 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
- EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
+ UNCORE_EVENT_CONSTRAINT(0x1f, 0xe),
UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
UNCORE_EVENT_CONSTRAINT(0x31, 0x3),

2016-11-10 08:00:19

by Ingo Molnar

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

* Peter Zijlstra <[email protected]> wrote:

> On Wed, Nov 09, 2016 at 03:25:15PM +0100, Robert Richter wrote:
> > On 08.11.16 19:27:39, Peter Zijlstra wrote:
> > > The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
> > > the counter mask of such an event is not a subset of any other counter
> > > mask of a constraint with an equal or higher weight".
> > >
> > > Esp. that latter part is of interest here I think, our overlapping mask
> > > is 0x0e, that has 3 bits set and is the highest weight mask in on the
> > > PMU, therefore it will be placed last. Can we still create a scenario
> > > where we would need to rewind that?
> > >
> > > The scenario for AMD Fam15h is we're having masks like:
> > >
> > > 0x3F -- 111111
> > > 0x38 -- 111000
> > > 0x07 -- 000111
> > >
> > > 0x09 -- 001001
> > >
> > > And we mark 0x09 as overlapping, because it is not a direct subset of
> > > 0x38 or 0x07 and has less weight than either of those. This means we'll
> > > first try and place the 0x09 event, then try and place 0x38/0x07 events.
> > > Now imagine we have:
> > >
> > > 3 * 0x07 + 0x09
> > >
> > > and the initial pick for the 0x09 event is counter 0, then we'll fail to
> > > place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
> > > event, and then re-try all 0x07 events, which will now work.
> > >
> > >
> > >
> > > But given, that in the uncore case, the overlapping event is the
> > > heaviest mask, I don't think this can happen. Or did I overlook
> > > something.... takes a bit to page all this back in.
> >
> > Right, IMO 0xE mask may not be marked as overlapping. It is placed
> > last and if there is no space left we are lost. There is no other
> > combination or state we could try then. So the fix is to remove the
> > overlapping bit for that counter, the state is then never saved.
> >
> > This assumes there are no other counters than 0x3 and 0xc that overlap
> > with 0xe. It becomes a bit tricky if there is another counter with the
> > same or higher weight that overlaps with 0xe, e.g. 0x7.
>
> As per a prior mail, the masks on the PMU in question are:
>
> 0x01 - 0001
> 0x03 - 0011
> 0x0e - 1110
> 0x0c - 1100
>
> But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
> 0x1) are of heavier weight, it should all work out I think.
>
> So yes, something like the below (removing the OVERLAP bit) looks like
> its sufficient.

Would it be possible to also add debug code (or some other mechanism) to disallow
such buggy EVENT_CONSTRAINT_OVERLAP() definitions?

Thanks,

Ingo

2016-11-10 16:41:32

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

On Thu, Nov 10, 2016 at 09:00:13AM +0100, Ingo Molnar wrote:
> Would it be possible to also add debug code (or some other mechanism) to disallow
> such buggy EVENT_CONSTRAINT_OVERLAP() definitions?

Should certainly be possible if someone has the time for it I think. The
rules are fairly straight forward, the only tricky bit is detectoring
the actual overlap condition.

2016-12-14 15:59:45

by Jiri Olsa

[permalink] [raw]

Subject: Re: [PATCH] perf/x86: Fix overlap counter scheduling bug

On Wed, Nov 09, 2016 at 04:51:53PM +0100, Peter Zijlstra wrote:

SNIP

>
> As per a prior mail, the masks on the PMU in question are:
>
> 0x01 - 0001
> 0x03 - 0011
> 0x0e - 1110
> 0x0c - 1100
>
> But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
> 0x1) are of heavier weight, it should all work out I think.
>
> So yes, something like the below (removing the OVERLAP bit) looks like
> its sufficient.

Peter,
could you please take this one?

thanks,
jirka

>
> ---
> arch/x86/events/intel/uncore_snbep.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 272427700d48..e6832be714bc 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
> UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
> UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
> UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
> - EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
> + UNCORE_EVENT_CONSTRAINT(0x1f, 0xe),
> UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
> UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
> UNCORE_EVENT_CONSTRAINT(0x31, 0x3),

2016-12-22 16:53:08

by tip-bot for Vasyl Gomonovych

[permalink] [raw]

Subject: [tip:perf/urgent] perf/x86: Fix overlap counter scheduling bug

Commit-ID: 1134c2b5cb840409ffd966d8c2a9468f64e6a494
Gitweb: http://git.kernel.org/tip/1134c2b5cb840409ffd966d8c2a9468f64e6a494
Author: Peter Zijlstra <[email protected]>
AuthorDate: Wed, 9 Nov 2016 16:51:53 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 22 Dec 2016 17:45:43 +0100

perf/x86: Fix overlap counter scheduling bug

Jiri reported the overlap scheduling exceeding its max stack.

Looking at the constraint that triggered this, it turns out the
overlap marker isn't needed.

The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
the counter mask of such an event is not a subset of any other counter
mask of a constraint with an equal or higher weight".

Esp. that latter part is of interest here I think, our overlapping mask
is 0x0e, that has 3 bits set and is the highest weight mask in on the
PMU, therefore it will be placed last. Can we still create a scenario
where we would need to rewind that?

The scenario for AMD Fam15h is we're having masks like:

0x3F -- 111111
0x38 -- 111000
0x07 -- 000111

0x09 -- 001001

And we mark 0x09 as overlapping, because it is not a direct subset of
0x38 or 0x07 and has less weight than either of those. This means we'll
first try and place the 0x09 event, then try and place 0x38/0x07 events.
Now imagine we have:

3 * 0x07 + 0x09

and the initial pick for the 0x09 event is counter 0, then we'll fail to
place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
event, and then re-try all 0x07 events, which will now work.

The masks on the PMU in question are:

0x01 - 0001
0x03 - 0011
0x0e - 1110
0x0c - 1100

But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
0x1) are of heavier weight, it should all work out.

Reported-by: Jiri Olsa <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Liang Kan <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: Vince Weaver <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/events/intel/uncore_snbep.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 2724277..e6832be 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
- EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
+ UNCORE_EVENT_CONSTRAINT(0x1f, 0xe),
UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
UNCORE_EVENT_CONSTRAINT(0x31, 0x3),