2011-05-24 13:59:48

by Vince Weaver

[permalink] [raw]
Subject: perf: regression -- missing /sys/devices/system/cpu/perf_events


I know this is a bit late, but for some reason our users sit on these
things and then they aren't willing to take it up with linux-kernel
themselves

> > > commit 15ac9a395a753cb28c674e7ea80386ffdff21785
> > > Author: Peter Zijlstra <[email protected]>
> > > Date: Mon Sep 6 15:51:45 2010 +0200
> > >
> > > perf: Remove the sysfs bits

removed the /sys/devices/system/cpu/perf_events
I thought things in /sys were stable ABI?

Apparently it's common for people to have scripts to check if
perf_events is available in a kernel by checking the existence of
/sys/devices/system/cpu/perf_events, even if they didn't use the
files within.

Now that 2.6.38 kernels are starting to hit the distros we're getting
complaints that it's missing.

Vince


2011-05-24 14:12:23

by Peter Zijlstra

[permalink] [raw]
Subject: Re: perf: regression -- missing /sys/devices/system/cpu/perf_events

On Tue, 2011-05-24 at 09:59 -0400, Vince Weaver wrote:
> I know this is a bit late, but for some reason our users sit on these
> things and then they aren't willing to take it up with linux-kernel
> themselves
>
> > > > commit 15ac9a395a753cb28c674e7ea80386ffdff21785
> > > > Author: Peter Zijlstra <[email protected]>
> > > > Date: Mon Sep 6 15:51:45 2010 +0200
> > > >
> > > > perf: Remove the sysfs bits
>
> removed the /sys/devices/system/cpu/perf_events
> I thought things in /sys were stable ABI?

if only.. that stuff changes way too often (not to say that's a good
thing).

> Apparently it's common for people to have scripts to check if
> perf_events is available in a kernel by checking the existence of
> /sys/devices/system/cpu/perf_events, even if they didn't use the
> files within.

A much more reliable way is simply doing the syscall and seeing what
happens. But if you want to poke around in sysfs, /sys/bus/event_source/
is the new location.

> Now that 2.6.38 kernels are starting to hit the distros we're getting
> complaints that it's missing.

Urgh, they'd been broken long before.. and I hadn't received any
complaints from people about that. I didn't see the point in keeping
broken interfaces around, esp. since with moving to multiple-pmu they
don't make any sense at all.

Ingo, any idea what to do here?

2011-05-24 17:42:48

by Vince Weaver

[permalink] [raw]
Subject: Re: perf: regression -- missing /sys/devices/system/cpu/perf_events

On Tue, 24 May 2011, Peter Zijlstra wrote:

> A much more reliable way is simply doing the syscall and seeing what
> happens. But if you want to poke around in sysfs, /sys/bus/event_source/
> is the new location.

It's been suggested that they can look for the existence of:
/proc/sys/kernel/perf_event_paranoid

is that something not likely to go away?

Vince

2011-05-24 19:48:26

by Ingo Molnar

[permalink] [raw]
Subject: Re: perf: regression -- missing /sys/devices/system/cpu/perf_events


* Vince Weaver <[email protected]> wrote:

> On Tue, 24 May 2011, Peter Zijlstra wrote:
>
> > A much more reliable way is simply doing the syscall and seeing what
> > happens. But if you want to poke around in sysfs, /sys/bus/event_source/
> > is the new location.
>
> It's been suggested that they can look for the existence of:
> /proc/sys/kernel/perf_event_paranoid
>
> is that something not likely to go away?

Well, we indeed do not remove kernel parameters, but checking for that would
break on systems (or chroot environments) that do not have /proc mounted or
visible.

So, what is wrong with the method Peter suggested: the presence of the perf
syscall (it not returning -ENOSYS) is bona fide evidence that perf is
available.

Thanks,

Ingo

2011-05-24 20:13:00

by Vince Weaver

[permalink] [raw]
Subject: Re: perf: regression -- missing /sys/devices/system/cpu/perf_events

On Tue, 24 May 2011, Ingo Molnar wrote:
>
> So, what is wrong with the method Peter suggested: the presence of the perf
> syscall (it not returning -ENOSYS) is bona fide evidence that perf is
> available.

it's just hard to do that from a shell script.

also, running the perf syscall can be tricky if you have a new kernel but
an older set of header files that doesn't have the syscall number defined.

Vince
[email protected]

2011-05-24 20:57:26

by David Ahern

[permalink] [raw]
Subject: Re: perf: regression -- missing /sys/devices/system/cpu/perf_events



On 05/24/11 14:12, Vince Weaver wrote:
> On Tue, 24 May 2011, Ingo Molnar wrote:
>>
>> So, what is wrong with the method Peter suggested: the presence of the perf
>> syscall (it not returning -ENOSYS) is bona fide evidence that perf is
>> available.
>
> it's just hard to do that from a shell script.

What about kallsyms:

grep sys_perf_event_open /proc/kallsyms

even with the new security feature you should be able to see that it
exists. The name has been the same since the counters->events rename in
cdd6c48. egrep for both names for kernels older than 2.6.31.

David


>
> also, running the perf syscall can be tricky if you have a new kernel but
> an older set of header files that doesn't have the syscall number defined.
>
> Vince
> [email protected]
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2011-05-24 21:29:40

by Ingo Molnar

[permalink] [raw]
Subject: Re: perf: regression -- missing /sys/devices/system/cpu/perf_events


* David Ahern <[email protected]> wrote:

> On 05/24/11 14:12, Vince Weaver wrote:
> > On Tue, 24 May 2011, Ingo Molnar wrote:
> >>
> >> So, what is wrong with the method Peter suggested: the presence of the perf
> >> syscall (it not returning -ENOSYS) is bona fide evidence that perf is
> >> available.
> >
> > it's just hard to do that from a shell script.
>
> What about kallsyms:
>
> grep sys_perf_event_open /proc/kallsyms

that looks pretty roundabout and expensive - kallsyms can be large.

Nor is there a guarantee that the function will always be called
sys_perf_event_open() - we already renamed it from sys_perf_counter_open() as
you yourself mentioned it :-)

Plus the kernel can be built without /proc/kallsyms, and root can chmod off the
file permissions of /proc/kallsyms for unprivileged user-space as well. So it's
not a particularly robust check.

I agree with Vince that as far as shell scripts are concerned, checking
/proc/sys/kernel/perf_event_paranoid would work best - and it works better than
having to check the perf syscall.

Vince: mind sending a patch that adds a comment to perf_event_paranoid that
userspace relies on the existence of that file as a feature check? Having such
reminders in the code works even better than frequent testing ;-)

As far as the actual PAPI library goes i really hope it checks the syscall
itself: that's much faster and more robust than an
access("/proc/sys/kernel/perf_event_paranoid") call ...

Thanks,

Ingo

2011-05-24 21:37:48

by Ingo Molnar

[permalink] [raw]
Subject: Re: perf: regression -- missing /sys/devices/system/cpu/perf_events


* Vince Weaver <[email protected]> wrote:

> On Tue, 24 May 2011, Ingo Molnar wrote:
> >
> > So, what is wrong with the method Peter suggested: the presence of the perf
> > syscall (it not returning -ENOSYS) is bona fide evidence that perf is
> > available.
>
> it's just hard to do that from a shell script.

Yeah.

> also, running the perf syscall can be tricky if you have a new kernel but an
> older set of header files that doesn't have the syscall number defined.

I suspect you could add some quick band-aid like:

#ifndef __NR_perf_event_open
# ifdef __i386__
# define __NR_perf_event_open 336
# endif
# ifdef __x86_64__
# define __NR_perf_event_open 298
# endif
# ifdef __powerpc__
# define __NR_perf_event_open 319
# endif
# ifdef __arm__
# define __NR_perf_event_open 364
# endif
#endif

#ifndef __NR_perf_event_open
# error Please add the __NR_perf_event_open definition for this architecture!
#endif

This should cover 99% of the users - and fill in the table as people report
build failures :)

Thanks,

Ingo

2011-06-03 21:54:55

by Vince Weaver

[permalink] [raw]
Subject: [patch] perf - comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI

On Tue, 24 May 2011, Ingo Molnar wrote:

> I agree with Vince that as far as shell scripts are concerned, checking
> /proc/sys/kernel/perf_event_paranoid would work best - and it works better than
> having to check the perf syscall.
>
> Vince: mind sending a patch that adds a comment to perf_event_paranoid that
> userspace relies on the existence of that file as a feature check? Having such
> reminders in the code works even better than frequent testing ;-)

Such a patch is included below. Not sure if this is exactly what you
meant.

> As far as the actual PAPI library goes i really hope it checks the syscall
> itself: that's much faster and more robust than an
> access("/proc/sys/kernel/perf_event_paranoid") call ...

For PAPI itself we decide with substrate to use at compile time.
This casme up because one of the vendors who ships PAPI on various kernel
revisions had a script to choose the right package to install, and as of
2.6.37 this broke due to /sys/devices/system/cpu/perf_events going away.

Thanks,

Vince
[email protected]

Signed-off-by: Vince Weaver <[email protected]>

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4fc9244..cbdc7bc 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -938,6 +938,8 @@ static struct ctl_table kern_table[] = {
},
#endif
#ifdef CONFIG_PERF_EVENTS
+ /* Userspace relies on this file existing as a check for */
+ /* perf_events being enabled. Do not remove! */
{
.procname = "perf_event_paranoid",
.data = &sysctl_perf_event_paranoid,

2011-06-04 10:24:07

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch] perf - comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI


* Vince Weaver <[email protected]> wrote:

> On Tue, 24 May 2011, Ingo Molnar wrote:
>
> > I agree with Vince that as far as shell scripts are concerned, checking
> > /proc/sys/kernel/perf_event_paranoid would work best - and it works better than
> > having to check the perf syscall.
> >
> > Vince: mind sending a patch that adds a comment to perf_event_paranoid that
> > userspace relies on the existence of that file as a feature check? Having such
> > reminders in the code works even better than frequent testing ;-)
>
> Such a patch is included below. Not sure if this is exactly what you
> meant.

Yeah, that's exactly what i meant - we don't need more really.

Most sysctls are not ABIs (there's no userspace that relies on them)
so the general attitude is to change them freely and backtrack if
something breaks unexpectedly. We can avoid that by commenting the
dependency.

Thanks,

Ingo

2011-06-04 11:06:28

by Vince Weaver

[permalink] [raw]
Subject: [tip:perf/core] perf: Comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI

Commit-ID: aa4a221875873d2a1f9656cb7fd7e545e952b4fa
Gitweb: http://git.kernel.org/tip/aa4a221875873d2a1f9656cb7fd7e545e952b4fa
Author: Vince Weaver <[email protected]>
AuthorDate: Fri, 3 Jun 2011 17:54:40 -0400
Committer: Ingo Molnar <[email protected]>
CommitDate: Sat, 4 Jun 2011 12:22:04 +0200

perf: Comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI

Turns out that distro packages use this file as an indicator of
the perf event subsystem - this is easier to check for from scripts
than the existence of the system call.

This is easy enough to keep around for the kernel, so add a
comment to make sure it stays so.

Signed-off-by: Vince Weaver <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sysctl.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4fc9244..f175d98 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -938,6 +938,12 @@ static struct ctl_table kern_table[] = {
},
#endif
#ifdef CONFIG_PERF_EVENTS
+ /*
+ * User-space scripts rely on the existence of this file
+ * as a feature check for perf_events being enabled.
+ *
+ * So it's an ABI, do not remove!
+ */
{
.procname = "perf_event_paranoid",
.data = &sysctl_perf_event_paranoid,