2013-07-08 15:18:40

by Vince Weaver

Subject: perf: more ABI breakage


I guess I should have been noisier about this at the time.

Linux 3.9 came with commit
e259514eef764a5286873618e34c560ecb6cff13
that enabled AMD fam15h Northbridge support, by exposing the events as
part of the core CPU.

Then Linux 3.10 changed this with
c43ca5091a374c1f6778bd7e4a39a5a10735a917
and split them out as a separate PMU.

This of course breaks libpfm4 and thus PAPI, and so now that 3.10 has come
out we've had multiple people complaining about this on the PAPI lists (I
can never get them to come complain here, which is why I always look like
the lone whiner in these cases).

So can perf_event just break the ABI with impunity?

How many kernel releases do we need to wait before we implement features?

There's not really even a good backwards-compatible fix for this issue, as
PAPI presents core-CPU events and uncore-CPU events through different
interfaces.

Vince


2013-07-08 16:48:10

by Peter Zijlstra

Subject: Re: perf: more ABI breakage

On Mon, Jul 08, 2013 at 11:24:18AM -0400, Vince Weaver wrote:
>
> I guess I should have been noisier about this at the time.
>
> Linux 3.9 came with commit
> e259514eef764a5286873618e34c560ecb6cff13
> that enabled AMD fam15h Northbridge support, by exposing the events as
> part of the core CPU.
>
> Then Linux 3.10 changed this with
> c43ca5091a374c1f6778bd7e4a39a5a10735a917
> and split them out as a separate PMU.
>
> This of course breaks libpfm4 and thus PAPI

Urgh, so the 3.9 patches should never have been merged and sunk in while I was
doing my vegetable imitation.

Stephane agreed with the change in 3.10; and I suppose he overlooked the fact
that people were already using it :/ I specifically asked if there were already
users as that would indeed require some form of backwards compatibility --
however annoying.

But to answer your question, no we should not blindly break stuff like this --
but yeah it would have been ever so much more useful for people to report this
before we ship a release.

Stephane; do you see a sane way to bridge this now?

2013-07-08 19:02:36

by Vince Weaver

Subject: Re: perf: more ABI breakage

On Mon, 8 Jul 2013, Peter Zijlstra wrote:

> Urgh, so the 3.9 patches should never have been merged and sunk in while I was
> doing my vegetable imitation.
>
> Stephane agreed with the change in 3.10; and I suppose he overlooked the fact
> that people were already using it :/ I specifically asked if there were already
> users as that would indeed require some form of backwards compatibility --
> however annoying.

I probably was just hoping no one was using it, and since I don't own any
fam15h hardware myself it didn't affect me directly.

The problem is that AMD NB support was so long in coming that everyone
eagerly jumped on it when it arrived, to the extent of running 3.9-rc
kernels.

We can only hope that 3.9 is not chosen as long-term-stable, and that
RHEL and the HPC vendors don't decide to backport the 3.9 support to their
older kernels. If so we're stuck with it for a while :(

It looks like it will be possible for libpfm4/PAPI to sort of paper over
this but it's going to be annoying :(

Vince
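
As a rough illustration of what "papering over" the split might involve in
userspace, a tool could probe sysfs at startup to see whether the kernel
registers the Northbridge counters as their own PMU. This is only a sketch
and not what libpfm4 or PAPI actually do. The PMU name "amd_nb" is an
assumption about how the 3.10 kernel names the separate PMU, and the
function names and returned labels are hypothetical. The
/sys/bus/event_source/devices/<pmu>/type file is the usual way perf exposes
a dynamically assigned perf_event_attr.type value.

```python
import os

# Standard sysfs location where perf lists registered PMUs.
SYSFS_PMU_DIR = "/sys/bus/event_source/devices"


def read_pmu_type(pmu_name, sysfs_root=SYSFS_PMU_DIR):
    """Return the integer perf_event_attr.type for a PMU, or None if absent."""
    path = os.path.join(sysfs_root, pmu_name, "type")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Missing directory, unreadable file, or unexpected contents.
        return None


def nb_event_route(sysfs_root=SYSFS_PMU_DIR, nb_pmu_name="amd_nb"):
    """Guess how Northbridge events would have to be opened on this kernel.

    nb_pmu_name is an assumed name for the separate NB PMU; adjust it to
    whatever the running kernel actually exports.
    """
    pmu_type = read_pmu_type(nb_pmu_name, sysfs_root)
    if pmu_type is None:
        # Either a 3.9-style kernel (NB events go through the core PMU) or a
        # kernel with no NB support at all. Telling those apart would need
        # an extra check, such as the kernel version, that is not shown here.
        return ("core-or-unsupported", None)
    # Separate PMU present: events must be opened with this type value.
    return ("separate-pmu", pmu_type)


if __name__ == "__main__":
    print(nb_event_route())
```

Even with a probe like this, the tool still has to expose the same events
through two different interfaces depending on the kernel, which is the
annoying part described above.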