2015-02-19 00:02:08

by David Ahern

Subject: [PATCH] perf: Fix probing for PERF_FLAG_FD_CLOEXEC flag

Commit f6edb53c4993ffe92ce521fb449d1c146cea6ec2 converted the probe to
a CPU wide event first (pid == -1). For kernels that do not support
the PERF_FLAG_FD_CLOEXEC flag the probe fails with EINVAL. Since this
errno is not handled pid is not reset to 0 and the subsequent use of
pid = -1 as an argument brings in an additional failure path if
perf_event_paranoid > 0:

$ perf record -- sleep 1
perf_event_open(..., 0) failed unexpectedly with error 13 (Permission denied)
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.007 MB /tmp/perf.data (11 samples) ]

Since this function only needs to get past this check in kernel/events/core.c:

	/* for future expandability... */
	if (flags & ~PERF_FLAG_ALL)
		return -EINVAL;

pid = 0 is sufficient to confirm if the flag is supported or not.

Also, ensure the fd of the confirmation check is closed.

Needs to go to 3.18 stable tree as well.

Signed-off-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
---
tools/perf/util/cloexec.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
index 47b78b3f0325..3cc34edf2403 100644
--- a/tools/perf/util/cloexec.c
+++ b/tools/perf/util/cloexec.c
@@ -47,16 +47,17 @@ static int perf_flag_probe(void)
err, strerror_r(err, sbuf, sizeof(sbuf)));

/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
- fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
+ fd = sys_perf_event_open(&attr, 0, cpu, -1, 0);
err = errno;

+ if (fd >= 0)
+ close(fd);
+
if (WARN_ONCE(fd < 0 && err != EBUSY,
"perf_event_open(..., 0) failed unexpectedly with error %d (%s)\n",
err, strerror_r(err, sbuf, sizeof(sbuf))))
return -1;

- close(fd);
-
return 0;
}

--
1.9.3
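
As a rough illustration of the flag check quoted in the commit message above, the sketch below models only that one test in user space. The MODEL_ names and the model_flags_check() helper are made up for this note; the bit values are intended to mirror include/uapi/linux/perf_event.h, and the "old" mask assumes a kernel whose PERF_FLAG_ALL does not yet include the cloexec bit.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names; bit values mirror the uapi PERF_FLAG_* definitions. */
#define MODEL_FLAG_FD_NO_GROUP	(1UL << 0)
#define MODEL_FLAG_FD_OUTPUT	(1UL << 1)
#define MODEL_FLAG_PID_CGROUP	(1UL << 2)
#define MODEL_FLAG_FD_CLOEXEC	(1UL << 3)

/* PERF_FLAG_ALL as assumed for a kernel without and with the cloexec flag. */
#define MODEL_FLAG_ALL_OLD	(MODEL_FLAG_FD_NO_GROUP | \
				 MODEL_FLAG_FD_OUTPUT | \
				 MODEL_FLAG_PID_CGROUP)
#define MODEL_FLAG_ALL_NEW	(MODEL_FLAG_ALL_OLD | MODEL_FLAG_FD_CLOEXEC)

/*
 * Model of the early flags test only: unknown bits are rejected with
 * -EINVAL regardless of what pid and cpu were passed.
 */
static int model_flags_check(unsigned long flags, unsigned long flag_all)
{
	if (flags & ~flag_all)
		return -EINVAL;
	return 0;	/* the real syscall would go on to other checks */
}

/* Usage example: the three cases the probe cares about. */
static void model_selfcheck(void)
{
	/* Old kernel, flag requested: rejected by the flags test. */
	assert(model_flags_check(MODEL_FLAG_FD_CLOEXEC, MODEL_FLAG_ALL_OLD) == -EINVAL);
	/* New kernel, flag requested: passes the flags test. */
	assert(model_flags_check(MODEL_FLAG_FD_CLOEXEC, MODEL_FLAG_ALL_NEW) == 0);
	/* Old kernel, no flags: passes, so later failures have other causes. */
	assert(model_flags_check(0, MODEL_FLAG_ALL_OLD) == 0);
}
```

The third case is why the confirmation open with flags = 0 can still fail with EACCES when pid = -1 and perf_event_paranoid > 0: that failure comes from a later check, not from the flags test.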


2015-02-19 07:08:08

by Adrian Hunter

Subject: Re: [PATCH] perf: Fix probing for PERF_FLAG_FD_CLOEXEC flag

On 19/02/15 02:01, David Ahern wrote:
> Commit f6edb53c4993ffe92ce521fb449d1c146cea6ec2 converted the probe to
> a CPU wide event first (pid == -1). For kernels that do not support
> the PERF_FLAG_FD_CLOEXEC flag the probe fails with EINVAL. Since this
> errno is not handled pid is not reset to 0 and the subsequent use of
> pid = -1 as an argument brings in an additional failure path if
> perf_event_paranoid > 0:
>
> $ perf record -- sleep 1
> perf_event_open(..., 0) failed unexpectedly with error 13 (Permission denied)
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.007 MB /tmp/perf.data (11 samples) ]
>
> Since this function only needs to get past this check in kernel/events/core.c:
>
> /* for future expandability... */
> if (flags & ~PERF_FLAG_ALL)
> return -EINVAL;
>
> pid = 0 is sufficient to confirm if the flag is supported or not.
>
> Also, ensure the fd of the confirmation check is closed.
>
> Needs to go to 3.18 stable tree as well.
>
> Signed-off-by: David Ahern <[email protected]>
> Cc: Adrian Hunter <[email protected]>
> ---
> tools/perf/util/cloexec.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
> index 47b78b3f0325..3cc34edf2403 100644
> --- a/tools/perf/util/cloexec.c
> +++ b/tools/perf/util/cloexec.c
> @@ -47,16 +47,17 @@ static int perf_flag_probe(void)
> err, strerror_r(err, sbuf, sizeof(sbuf)));
>
> /* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
> - fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
> + fd = sys_perf_event_open(&attr, 0, cpu, -1, 0);

I would prefer to avoid pid = 0 unless necessary and so just do the same
thing again i.e.

	while (1) {
		fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
		if (fd < 0 && pid == -1 && errno == EACCES) {
			pid = 0;
			continue;
		}
		break;
	}

> err = errno;
>
> + if (fd >= 0)
> + close(fd);
> +
> if (WARN_ONCE(fd < 0 && err != EBUSY,
> "perf_event_open(..., 0) failed unexpectedly with error %d (%s)\n",
> err, strerror_r(err, sbuf, sizeof(sbuf))))
> return -1;
>
> - close(fd);
> -
> return 0;
> }
>
>

2015-02-19 14:55:35

by David Ahern

Subject: Re: [PATCH] perf: Fix probing for PERF_FLAG_FD_CLOEXEC flag

On 2/19/15 12:06 AM, Adrian Hunter wrote:
>> /* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
>> - fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
>> + fd = sys_perf_event_open(&attr, 0, cpu, -1, 0);
>
> I would prefer to avoid pid = 0 unless necessary and so just do the same
> thing again i.e.
>
> 	while (1) {
> 		fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
> 		if (fd < 0 && pid == -1 && errno == EACCES) {
> 			pid = 0;
> 			continue;
> 		}
> 		break;
> 	}
>

The probing is getting out of hand. In this case the intent is a probe for a
flag, and flags are the first thing checked kernel side. Given that, the
parameters passed to sys_perf_event_open should be as simple and known
safe as possible. pid = -1 has known limitations. Why can't pid just be
getpid() in both cases?

Simplifies this function a lot and removes the need for sched_getcpu(). So
pid = getpid();

fd = sys_perf_event_open(&attr, pid, -1, -1, PERF_FLAG_FD_CLOEXEC);

and if that fails

fd = sys_perf_event_open(&attr, pid, -1, -1, 0);

Why is anything more complicated needed?

David
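
A minimal, self-contained sketch of the simplified probe proposed in this message, for readers who want to try it. It only illustrates the proposal and is not the perf tools code: the sketch_ names are hypothetical, and the attr settings (a software cpu-clock event with exclude_kernel set) are an assumption chosen to keep the open as unprivileged as possible.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <linux/perf_event.h>

#ifndef PERF_FLAG_FD_CLOEXEC
#define PERF_FLAG_FD_CLOEXEC	(1UL << 3)	/* for older userspace headers */
#endif

/* Thin wrapper around the raw syscall (hypothetical name). */
static int sketch_perf_event_open(struct perf_event_attr *attr, pid_t pid,
				  int cpu, int group_fd, unsigned long flags)
{
	return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/*
 * Returns 1 if the kernel accepts PERF_FLAG_FD_CLOEXEC, 0 if only the
 * open without the flag works, -1 if neither works (inconclusive).
 */
static int sketch_probe_cloexec(void)
{
	struct perf_event_attr attr;
	pid_t pid = getpid();
	int fd;

	assert(pid > 0);	/* per-task event, so cpu = -1 is allowed */

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;		/* assumption, see above */
	attr.config = PERF_COUNT_SW_CPU_CLOCK;
	attr.exclude_kernel = 1;

	/* First attempt: with the flag under test. */
	fd = sketch_perf_event_open(&attr, pid, -1, -1, PERF_FLAG_FD_CLOEXEC);
	if (fd >= 0) {
		close(fd);
		return 1;
	}

	/* "and if that fails": same arguments, flags = 0. */
	fd = sketch_perf_event_open(&attr, pid, -1, -1, 0);
	if (fd >= 0) {
		close(fd);
		return 0;
	}
	return -1;
}
```

Both opens use the same pid and cpu, so the only thing that differs between them is the flag itself. Whether the opens succeed at all depends on the environment (for example the perf_event_paranoid setting or a sandbox that blocks the syscall), which is why the sketch keeps an inconclusive result.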

2015-02-19 16:18:15

by Adrian Hunter

Subject: Re: [PATCH] perf: Fix probing for PERF_FLAG_FD_CLOEXEC flag

On 19/02/2015 4:55 p.m., David Ahern wrote:
> On 2/19/15 12:06 AM, Adrian Hunter wrote:
>>> /* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
>>> - fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
>>> + fd = sys_perf_event_open(&attr, 0, cpu, -1, 0);
>>
>> I would prefer to avoid pid = 0 unless necessary and so just do the same
>> thing again i.e.
>>
>> 	while (1) {
>> 		fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
>> 		if (fd < 0 && pid == -1 && errno == EACCES) {
>> 			pid = 0;
>> 			continue;
>> 		}
>> 		break;
>> 	}
>>
>
> The probing is getting out of hand. In this case the intent is a probe for a flag,
> and flags are the first thing checked kernel side. Given that, the parameters
> passed to sys_perf_event_open should be as simple and known safe as possible.
> pid = -1 has known limitations. Why can't pid just be getpid() in both cases?
>
> Simplifies this function a lot and removes the need for sched_getcpu(). So
> pid = getpid();
>
> fd = sys_perf_event_open(&attr, pid, -1, -1, PERF_FLAG_FD_CLOEXEC);
>
> and if that fails
>
> fd = sys_perf_event_open(&attr, pid, -1, -1, 0);
>
> Why is anything more complicated needed?

Yes, I am sorry it is a pain. I don't know why I didn't add a comment
to the code :-(. Using -1 for the pid is a workaround to avoid gratuitous
jump label changes. If pid=0 is used and then a system-wide trace is done
with Intel PT, there will be a jump label change shortly after the tracing
starts. That means the running code gets changed, but Intel PT decoding
has to walk the code to reconstruct the trace - so errors result. There
will always be occasional jump label changes, but this avoids one that
would otherwise always happen.

2015-02-19 16:27:39

by David Ahern

Subject: Re: [PATCH] perf: Fix probing for PERF_FLAG_FD_CLOEXEC flag

On 2/19/15 9:17 AM, Adrian Hunter wrote:
> Yes, I am sorry it is a pain. I don't know why I didn't add a comment
> to the code :-(. Using -1 for the pid is a workaround to avoid gratuitous
> jump label changes. If pid=0 is used and then a system-wide trace is done
> with Intel PT, there will be a jump label change shortly after the tracing
> starts. That means the running code gets changed, but Intel PT decoding
> has to walk the code to reconstruct the trace - so errors result. There
> will always be occasional jump label changes, but this avoids one that
> would otherwise always happen.

I don't understand the response. Why can't pid == getpid() (i.e., pid > 0)
be used for this test? pid = -1 and pid = 0 are not needed. With pid > 0
the cpu value does not matter, so cpu = -1 can be used. Again, this is
just to determine if the kernel supports PERF_FLAG_FD_CLOEXEC. Existence
of PT should not be involved here.

David

2015-02-19 17:28:46

by Adrian Hunter

Subject: Re: [PATCH] perf: Fix probing for PERF_FLAG_FD_CLOEXEC flag

On 19/02/2015 6:22 p.m., David Ahern wrote:
> On 2/19/15 9:17 AM, Adrian Hunter wrote:
>> Yes, I am sorry it is a pain. I don't know why I didn't add a comment
>> to the code :-(. Using -1 for the pid is a workaround to avoid gratuitous
>> jump label changes. If pid=0 is used and then a system-wide trace is done
>> with Intel PT, there will be a jump label change shortly after the tracing
>> starts. That means the running code gets changed, but Intel PT decoding
>> has to walk the code to reconstruct the trace - so errors result. There
>> will always be occasional jump label changes, but this avoids one that
>> would otherwise always happen.
>
> I don't understand the response. Why can't pid == getpid() (ie., pid > 0)

IIRC pid == getpid() is the same as pid = 0

> be used for this test? pid = -1 and pid = 0 are not needed. With pid > 0
> cpu value does not matter so cpu = -1 can be used. Again this is just to
> determine if the kernel supports PERF_FLAG_FD_CLOEXEC. Existence of PT
> should not be involved here.

This is about the side-effects of opening perf events. One of the side-effects
is that some jump labels get switched. For optimization reasons, there is then
a delay before they switch back. That means that a side-effect of probing the
API is that jump label changes, that otherwise would not have happened, appear
during the trace.

This is not only about Intel PT. From an abstract point of view, it is
about minimizing the disturbance to the system under test.

2015-02-24 11:33:21

by Adrian Hunter

Subject: Re: [PATCH] perf: Fix probing for PERF_FLAG_FD_CLOEXEC flag

On 19/02/15 19:28, Adrian Hunter wrote:
> On 19/02/2015 6:22 p.m., David Ahern wrote:
>> On 2/19/15 9:17 AM, Adrian Hunter wrote:
>>> Yes, I am sorry it is a pain. I don't know why I didn't add a comment
>>> to the code :-(. Using -1 for the pid is a workaround to avoid gratuitous
>>> jump label changes. If pid=0 is used and then a system-wide trace is done
>>> with Intel PT, there will be a jump label change shortly after the tracing
>>> starts. That means the running code gets changed, but Intel PT decoding
>>> has to walk the code to reconstruct the trace - so errors result. There
>>> will always be occasional jump label changes, but this avoids one that
>>> would otherwise always happen.
>>
>> I don't understand the response. Why can't pid == getpid() (ie., pid > 0)
>
> IIRC pid == getpid() is the same as pid = 0
>
>> be used for this test? pid = -1 and pid = 0 are not needed. With pid > 0
>> cpu value does not matter so cpu = -1 can be used. Again this is just to
>> determine if the kernel supports PERF_FLAG_FD_CLOEXEC. Existence of PT
>> should not be involved here.
>
> This is about the side-effects of opening perf events. One of the side-effects
> is that some jump labels get switched. For optimization reasons, there is then
> a delay before they switch back. That means that a side-effect of probing the
> API is that jump label changes, that otherwise would not have happened, appear
> during the trace.
>
> This is not only about Intel PT. From an abstract point of view, it is
> about minimizing the disturbance to the system under test.
>
>
>

How about this:

From: Adrian Hunter <[email protected]>
Date: Tue, 24 Feb 2015 13:20:59 +0200
Subject: [PATCH] perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag

Commit f6edb53c4993ffe92ce521fb449d1c146cea6ec2 converted the probe to
a CPU wide event first (pid == -1). For kernels that do not support
the PERF_FLAG_FD_CLOEXEC flag the probe fails with EINVAL. Since this
errno is not handled pid is not reset to 0 and the subsequent use of
pid = -1 as an argument brings in an additional failure path if
perf_event_paranoid > 0:

$ perf record -- sleep 1
perf_event_open(..., 0) failed unexpectedly with error 13 (Permission denied)
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.007 MB /tmp/perf.data (11 samples) ]

Since this function only needs to get past this check in kernel/events/core.c:

/* for future expandability... */
if (flags & ~PERF_FLAG_ALL)
return -EINVAL;

Also, ensure the fd of the confirmation check is closed and comment
why pid = -1 is used.

Needs to go to 3.18 stable tree as well.

Based-on-patch-by: David Ahern <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/cloexec.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
index 47b78b3..6da965b 100644
--- a/tools/perf/util/cloexec.c
+++ b/tools/perf/util/cloexec.c
@@ -25,6 +25,10 @@ static int perf_flag_probe(void)
if (cpu < 0)
cpu = 0;

+ /*
+ * Using -1 for the pid is a workaround to avoid gratuitous jump label
+ * changes.
+ */
while (1) {
/* check cloexec flag */
fd = sys_perf_event_open(&attr, pid, cpu, -1,
@@ -47,16 +51,24 @@ static int perf_flag_probe(void)
err, strerror_r(err, sbuf, sizeof(sbuf)));

/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
- fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
+ while (1) {
+ fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
+ if (fd < 0 && pid == -1 && errno == EACCES) {
+ pid = 0;
+ continue;
+ }
+ break;
+ }
err = errno;

+ if (fd >= 0)
+ close(fd);
+
if (WARN_ONCE(fd < 0 && err != EBUSY,
"perf_event_open(..., 0) failed unexpectedly with error %d (%s)\n",
err, strerror_r(err, sbuf, sizeof(sbuf))))
return -1;

- close(fd);
-
return 0;
}

--
1.9.1
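
The control flow of the patch above can be exercised without a real kernel by putting the open call behind a function pointer. The sketch below is an illustration under that assumption, not the perf tools source: open_fn, close_fn, probe_cloexec() and the fake_ helpers are hypothetical names, and the fake opener imitates a kernel that lacks the flag while perf_event_paranoid > 0.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

#define SKETCH_FLAG_FD_CLOEXEC	(1UL << 3)	/* same bit as PERF_FLAG_FD_CLOEXEC */

/* Hypothetical hooks standing in for sys_perf_event_open() and close(). */
typedef int (*open_fn)(pid_t pid, int cpu, unsigned long flags);
typedef void (*close_fn)(int fd);

/* Try the current pid; fall back from pid = -1 to pid = 0 on EACCES. */
static int open_with_fallback(open_fn do_open, pid_t *pid, int cpu,
			      unsigned long flags)
{
	int fd;

	while (1) {
		fd = do_open(*pid, cpu, flags);
		if (fd < 0 && *pid == -1 && errno == EACCES) {
			*pid = 0;
			continue;
		}
		break;
	}
	return fd;
}

/* 1: flag supported, 0: flag not supported, -1: unexpected failure. */
static int probe_cloexec(open_fn do_open, close_fn do_close)
{
	pid_t pid = -1;	/* CPU wide first, as in the patch */
	int cpu = 0;
	int fd, err;

	assert(do_open && do_close);

	fd = open_with_fallback(do_open, &pid, cpu, SKETCH_FLAG_FD_CLOEXEC);
	if (fd >= 0) {
		do_close(fd);
		return 1;
	}

	/* Confirm the failure is related to the flag: retry without it. */
	fd = open_with_fallback(do_open, &pid, cpu, 0);
	err = errno;
	if (fd >= 0)
		do_close(fd);
	if (fd < 0 && err != EBUSY)
		return -1;
	return 0;
}

/* Fake kernel: no cloexec flag, and pid = -1 is refused with EACCES. */
static int fake_closed;

static int fake_old_kernel_open(pid_t pid, int cpu, unsigned long flags)
{
	(void)cpu;
	if (flags & SKETCH_FLAG_FD_CLOEXEC) {
		errno = EINVAL;	/* flags test fails before pid is looked at */
		return -1;
	}
	if (pid == -1) {
		errno = EACCES;
		return -1;
	}
	return 100;	/* fake descriptor, never passed to close(2) */
}

static void fake_close(int fd)
{
	(void)fd;
	fake_closed++;
}
```

With the fake opener, the first open fails with EINVAL while pid is still -1, the confirmation open then hits EACCES, falls back to pid = 0 and succeeds, and the descriptor is closed exactly once. That is the sequence that previously ended in the "Permission denied" warning.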

2015-02-24 16:31:35

by David Ahern

Subject: Re: [PATCH] perf: Fix probing for PERF_FLAG_FD_CLOEXEC flag

On 2/24/15 4:31 AM, Adrian Hunter wrote:
> How about this:
>
> From: Adrian Hunter <[email protected]>
> Date: Tue, 24 Feb 2015 13:20:59 +0200
> Subject: [PATCH] perf tools: Fix probing for PERF_FLAG_FD_CLOEXEC flag
>
> Commit f6edb53c4993ffe92ce521fb449d1c146cea6ec2 converted the probe to
> a CPU wide event first (pid == -1). For kernels that do not support
> the PERF_FLAG_FD_CLOEXEC flag the probe fails with EINVAL. Since this
> errno is not handled pid is not reset to 0 and the subsequent use of
> pid = -1 as an argument brings in an additional failure path if
> perf_event_paranoid > 0:
>
> $ perf record -- sleep 1
> perf_event_open(..., 0) failed unexpectedly with error 13 (Permission denied)
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.007 MB /tmp/perf.data (11 samples) ]

This part of the commit message can be removed:

> Since this function only needs to get past this check in kernel/events/core.c:
>
> /* for future expandability... */
> if (flags & ~PERF_FLAG_ALL)
> return -EINVAL;

---

>
> Also, ensure the fd of the confirmation check is closed and comment
> why pid = -1 is used.
>
> Needs to go to 3.18 stable tree as well.
>
> Based-on-patch-by: David Ahern <[email protected]>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/util/cloexec.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c

Acked-by: David Ahern <[email protected]>