2022-02-09 06:11:15

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH v6 10/12] perf tools: Improve IBS error handling

From: Kim Phillips <[email protected]>

improve the error message returned on failed perf_event_open() on AMD when
using IBS.

Output of executing 'perf record -e ibs_op// true' BEFORE this patch:

The sys_perf_event_open() syscall returned with 22 (Invalid argument)for event (ibs_op//u).
/bin/dmesg | grep -i perf may provide additional information.

Output after:

AMD IBS cannot exclude kernel events. Try running at a higher privilege level.

Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:

Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
/bin/dmesg | grep -i perf may provide additional information.

Output after:

Error:
AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity

Signed-off-by: Kim Phillips <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Stephane Eranian <[email protected]>
---
tools/perf/util/evsel.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 22d3267ce294..d42f63a484df 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2847,9 +2847,22 @@ static bool find_process(const char *name)
return ret ? false : true;
}

+static bool is_amd(const char *arch, const char *cpuid)
+{
+ return arch && !strcmp("x86", arch) && cpuid && strstarts(cpuid, "AuthenticAMD");
+}
+
+static bool is_amd_ibs(struct evsel *evsel)
+{
+ return evsel->core.attr.precise_ip || !strncmp(evsel->pmu_name, "ibs", 3);
+}
+
int evsel__open_strerror(struct evsel *evsel, struct target *target,
int err, char *msg, size_t size)
{
+ struct perf_env *env = evsel__env(evsel);
+ const char *arch = perf_env__arch(env);
+ const char *cpuid = perf_env__cpuid(env);
char sbuf[STRERR_BUFSIZE];
int printed = 0, enforced = 0;

@@ -2949,6 +2962,17 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
return scnprintf(msg, size,
"Invalid event (%s) in per-thread mode, enable system wide with '-a'.",
evsel__name(evsel));
+ if (is_amd(arch, cpuid)) {
+ if (is_amd_ibs(evsel)) {
+ if (evsel->core.attr.exclude_kernel)
+ return scnprintf(msg, size,
+ "AMD IBS can't exclude kernel events. Try running at a higher privilege level.");
+ if (!evsel->core.system_wide)
+ return scnprintf(msg, size,
+ "AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity");
+ }
+ }
+
break;
case ENODATA:
return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. "
--
2.35.0.263.gb82422642f-goog



2022-02-09 16:38:08

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling

Em Tue, Feb 08, 2022 at 01:16:35PM -0800, Stephane Eranian escreveu:
> From: Kim Phillips <[email protected]>
>
> improve the error message returned on failed perf_event_open() on AMD when
> using IBS.
>
> Output of executing 'perf record -e ibs_op// true' BEFORE this patch:
>
> The sys_perf_event_open() syscall returned with 22 (Invalid argument)for event (ibs_op//u).
> /bin/dmesg | grep -i perf may provide additional information.

Humm, here on a

$ grep -m1 'model name' /proc/cpuinfo
model name : AMD Ryzen 9 5950X 16-Core Processor
$ ls -la /sys/devices/ibs_op
total 0
drwxr-xr-x. 4 root root 0 Feb 9 07:12 .
drwxr-xr-x. 21 root root 0 Feb 9 07:12 ..
drwxr-xr-x. 2 root root 0 Feb 9 12:17 format
-rw-r--r--. 1 root root 4096 Feb 9 12:21 perf_event_mux_interval_ms
drwxr-xr-x. 2 root root 0 Feb 9 12:21 power
lrwxrwxrwx. 1 root root 0 Feb 9 07:12 subsystem -> ../../bus/event_source
-r--r--r--. 1 root root 4096 Feb 9 12:17 type
-rw-r--r--. 1 root root 4096 Feb 9 07:12 uevent
$ cat /sys/devices/ibs_op/type
9
$

Running without this patch:

$ uname -a
Linux five 5.15.14-100.fc34.x86_64 #1 SMP Tue Jan 11 16:53:51 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$

$ cat /etc/redhat-release
Fedora release 34 (Thirty Four)
$

$ perf record -e ibs_op// true
Error:
Invalid event (ibs_op//u) in per-thread mode, enable system wide with '-a'.
$

Trying with system wide:

$ perf record -a -e ibs_op// true
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//u).
/bin/dmesg | grep -i perf may provide additional information.

$

So you're missing -a in all examples? Am I missing something?

> Output after:
>
> AMD IBS cannot exclude kernel events. Try running at a higher privilege level.
>
> Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:
>
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
> /bin/dmesg | grep -i perf may provide additional information.

Here, as root:

[root@five ~]# perf record -e ibs_op// true
Error:
Invalid event (ibs_op//) in per-thread mode, enable system wide with '-a'.
[root@five ~]# perf record -a -e ibs_op// true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.482 MB perf.data (175 samples) ]
[root@five ~]#

- Arnaldo

> Output after:
>
> Error:
> AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity
>
> Signed-off-by: Kim Phillips <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Boris Ostrovsky <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Joao Martins <[email protected]>
> Cc: Konrad Rzeszutek Wilk <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Michael Petlan <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Robert Richter <[email protected]>
> Cc: Stephane Eranian <[email protected]>
> ---
> tools/perf/util/evsel.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 22d3267ce294..d42f63a484df 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2847,9 +2847,22 @@ static bool find_process(const char *name)
> return ret ? false : true;
> }
>
> +static bool is_amd(const char *arch, const char *cpuid)
> +{
> + return arch && !strcmp("x86", arch) && cpuid && strstarts(cpuid, "AuthenticAMD");
> +}
> +
> +static bool is_amd_ibs(struct evsel *evsel)
> +{
> + return evsel->core.attr.precise_ip || !strncmp(evsel->pmu_name, "ibs", 3);
> +}
> +
> int evsel__open_strerror(struct evsel *evsel, struct target *target,
> int err, char *msg, size_t size)
> {
> + struct perf_env *env = evsel__env(evsel);
> + const char *arch = perf_env__arch(env);
> + const char *cpuid = perf_env__cpuid(env);
> char sbuf[STRERR_BUFSIZE];
> int printed = 0, enforced = 0;
>
> @@ -2949,6 +2962,17 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
> return scnprintf(msg, size,
> "Invalid event (%s) in per-thread mode, enable system wide with '-a'.",
> evsel__name(evsel));
> + if (is_amd(arch, cpuid)) {
> + if (is_amd_ibs(evsel)) {
> + if (evsel->core.attr.exclude_kernel)
> + return scnprintf(msg, size,
> + "AMD IBS can't exclude kernel events. Try running at a higher privilege level.");
> + if (!evsel->core.system_wide)
> + return scnprintf(msg, size,
> + "AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity");
> + }
> + }
> +
> break;
> case ENODATA:
> return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. "
> --
> 2.35.0.263.gb82422642f-goog

--

- Arnaldo

2022-03-15 08:31:18

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling

Stephane,

On 09-Feb-22 2:46 AM, Stephane Eranian wrote:
> From: Kim Phillips <[email protected]>
>
> improve the error message returned on failed perf_event_open() on AMD when
> using IBS.
>
> Output of executing 'perf record -e ibs_op// true' BEFORE this patch:
>
> The sys_perf_event_open() syscall returned with 22 (Invalid argument)for event (ibs_op//u).
> /bin/dmesg | grep -i perf may provide additional information.
>
> Output after:
>
> AMD IBS cannot exclude kernel events. Try running at a higher privilege level.
>
> Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:
>
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
> /bin/dmesg | grep -i perf may provide additional information.
>
> Output after:
>
> Error:
> AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity

This patch seems to be causing regression to perf python test.

Without this patch:

$ sudo ./perf test -vvv 19
19: 'import perf' in python :
--- start ---
test child forked, pid 145391
python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/usr/bin/python2' "
test child finished with 0
---- end ----
'import perf' in python: Ok

With this patch:

$ sudo ./perf test -vvv 19
19: 'import perf' in python :
--- start ---
test child forked, pid 144415
python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/usr/bin/python2' "
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: python/perf.so: undefined symbol: perf_env__cpuid
test child finished with -1
---- end ----
'import perf' in python: FAILED!

Thanks,
Ravi

2022-03-15 10:46:26

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling

>> +static bool is_amd(const char *arch, const char *cpuid)
>> +{
>> + return arch && !strcmp("x86", arch) && cpuid && strstarts(cpuid, "AuthenticAMD");
>> +}
>> +
>> +static bool is_amd_ibs(struct evsel *evsel)
>> +{
>> + return evsel->core.attr.precise_ip || !strncmp(evsel->pmu_name, "ibs", 3);
>> +}
>> +
>> int evsel__open_strerror(struct evsel *evsel, struct target *target,
>> int err, char *msg, size_t size)
>> {
>> + struct perf_env *env = evsel__env(evsel);
>> + const char *arch = perf_env__arch(env);
>> + const char *cpuid = perf_env__cpuid(env);
>
>
> This code dies for me on the latest tip.git because env = NULL and
> perf_env_cpuid() is broken for NULL argument.
> I don't quite know where this env global variable is set but I hope
> there is a better way of doing this, maybe using
> the evsel__env() function in the same util/evsel.c file.
>
> Similarly, the is_amd_ibs() suffers from a NULL pointer dereference
> because evsel->pmu_name maybe NULL:
>
> $ perf record -e rc2 .....
>
> causes a NULL pmu_name.

This should fix the issue:

@@ -2965,7 +2989,7 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,

struct perf_env *evsel__env(struct evsel *evsel)
{
- if (evsel && evsel->evlist)
+ if (evsel && evsel->evlist && evsel->evlist->env)
return evsel->evlist->env;
return &perf_env;
}

Thanks,
Ravi

2022-03-17 03:25:58

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling

On Mon, Mar 14, 2022 at 11:23 PM Ravi Bangoria <[email protected]> wrote:
>
> >> +static bool is_amd(const char *arch, const char *cpuid)
> >> +{
> >> + return arch && !strcmp("x86", arch) && cpuid && strstarts(cpuid, "AuthenticAMD");
> >> +}
> >> +
> >> +static bool is_amd_ibs(struct evsel *evsel)
> >> +{
> >> + return evsel->core.attr.precise_ip || !strncmp(evsel->pmu_name, "ibs", 3);
> >> +}
> >> +
> >> int evsel__open_strerror(struct evsel *evsel, struct target *target,
> >> int err, char *msg, size_t size)
> >> {
> >> + struct perf_env *env = evsel__env(evsel);
> >> + const char *arch = perf_env__arch(env);
> >> + const char *cpuid = perf_env__cpuid(env);
> >
> >
> > This code dies for me on the latest tip.git because env = NULL and
> > perf_env_cpuid() is broken for NULL argument.
> > I don't quite know where this env global variable is set but I hope
> > there is a better way of doing this, maybe using
> > the evsel__env() function in the same util/evsel.c file.
> >
> > Similarly, the is_amd_ibs() suffers from a NULL pointer dereference
> > because evsel->pmu_name maybe NULL:
> >
> > $ perf record -e rc2 .....
> >
> > causes a NULL pmu_name.
>
> This should fix the issue:
>
> @@ -2965,7 +2989,7 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
>
> struct perf_env *evsel__env(struct evsel *evsel)
> {
> - if (evsel && evsel->evlist)
> + if (evsel && evsel->evlist && evsel->evlist->env)
> return evsel->evlist->env;
> return &perf_env;
> }
>
I will check with that change. Similarly, the IBS helper needs to have
the following change:
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2854,13 +2854,18 @@ static bool is_amd(const char *arch, const char *cpuid)

static bool is_amd_ibs(struct evsel *evsel)
{
- return evsel->core.attr.precise_ip ||
!strncmp(evsel->pmu_name, "ibs", 3);
+ return evsel->core.attr.precise_ip
+ || (evsel->pmu_name && !strncmp(evsel->pmu_name, "ibs", 3));
}

2022-03-17 03:33:20

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling

Stephane,

On 16-Mar-22 5:33 AM, Stephane Eranian wrote:
> On Tue, Mar 15, 2022 at 12:46 AM Ravi Bangoria <[email protected]> wrote:
>>
>> Stephane,
>>
>> On 09-Feb-22 2:46 AM, Stephane Eranian wrote:
>>> From: Kim Phillips <[email protected]>
>>>
>>> improve the error message returned on failed perf_event_open() on AMD when
>>> using IBS.
>>>
>>> Output of executing 'perf record -e ibs_op// true' BEFORE this patch:
>>>
>>> The sys_perf_event_open() syscall returned with 22 (Invalid argument)for event (ibs_op//u).
>>> /bin/dmesg | grep -i perf may provide additional information.
>>>
>>> Output after:
>>>
>>> AMD IBS cannot exclude kernel events. Try running at a higher privilege level.
>>>
>>> Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:
>>>
>>> Error:
>>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
>>> /bin/dmesg | grep -i perf may provide additional information.
>>>
>>> Output after:
>>>
>>> Error:
>>> AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity
>>
>> This patch seems to be causing regression to perf python test.
>>
>> Without this patch:
>>
>> $ sudo ./perf test -vvv 19
>> 19: 'import perf' in python :
>> --- start ---
>> test child forked, pid 145391
>> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/usr/bin/python2' "
>> test child finished with 0
>> ---- end ----
>> 'import perf' in python: Ok
>>
>> With this patch:
>>
>> $ sudo ./perf test -vvv 19
>> 19: 'import perf' in python :
>> --- start ---
>> test child forked, pid 144415
>> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/usr/bin/python2' "
>> Traceback (most recent call last):
>> File "<stdin>", line 1, in <module>
>> ImportError: python/perf.so: undefined symbol: perf_env__cpuid
>> test child finished with -1
>> ---- end ----
>> 'import perf' in python: FAILED!
>>
>
> The fix I sent you is just to prevent a potential SEGFAULT in
> is_amd_ibs(). I bet the test fails some perf_event_open()
> and ends up in strerror function and from there I don't see how the
> patch could impact the test, given you'd segfault
> otherwise.
>
> I tried on my side and with or without this patch this test fails. I
> think this looks like an unrelated issue.

That's strange. IIUC, python/perf.so needs to know about util/evsel.c
which can be done by adding an entry in util/python-ext-sources.

In any case, do we really need this patch now? For both the example given
in description, I see no difference with or without patch:

Without patch:

$ sudo ./perf record -e ibs_op// true
Error:
Invalid event (ibs_op//) in per-thread mode, enable system wide with '-a'.

$ ./perf record -e ibs_op// true
Error:
Invalid event (ibs_op//u) in per-thread mode, enable system wide with '-a'.

Same o/p with the patch as well.

Thanks,
Ravi

2022-03-17 04:52:35

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling



On 16-Mar-22 4:37 PM, Ravi Bangoria wrote:
> Stephane,
>
> On 16-Mar-22 5:33 AM, Stephane Eranian wrote:
>> On Tue, Mar 15, 2022 at 12:46 AM Ravi Bangoria <[email protected]> wrote:
>>>
>>> Stephane,
>>>
>>> On 09-Feb-22 2:46 AM, Stephane Eranian wrote:
>>>> From: Kim Phillips <[email protected]>
>>>>
>>>> improve the error message returned on failed perf_event_open() on AMD when
>>>> using IBS.
>>>>
>>>> Output of executing 'perf record -e ibs_op// true' BEFORE this patch:
>>>>
>>>> The sys_perf_event_open() syscall returned with 22 (Invalid argument)for event (ibs_op//u).
>>>> /bin/dmesg | grep -i perf may provide additional information.
>>>>
>>>> Output after:
>>>>
>>>> AMD IBS cannot exclude kernel events. Try running at a higher privilege level.
>>>>
>>>> Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:
>>>>
>>>> Error:
>>>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
>>>> /bin/dmesg | grep -i perf may provide additional information.
>>>>
>>>> Output after:
>>>>
>>>> Error:
>>>> AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity
>>>
>>> This patch seems to be causing regression to perf python test.
>>>
>>> Without this patch:
>>>
>>> $ sudo ./perf test -vvv 19
>>> 19: 'import perf' in python :
>>> --- start ---
>>> test child forked, pid 145391
>>> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/usr/bin/python2' "
>>> test child finished with 0
>>> ---- end ----
>>> 'import perf' in python: Ok
>>>
>>> With this patch:
>>>
>>> $ sudo ./perf test -vvv 19
>>> 19: 'import perf' in python :
>>> --- start ---
>>> test child forked, pid 144415
>>> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/usr/bin/python2' "
>>> Traceback (most recent call last):
>>> File "<stdin>", line 1, in <module>
>>> ImportError: python/perf.so: undefined symbol: perf_env__cpuid
>>> test child finished with -1
>>> ---- end ----
>>> 'import perf' in python: FAILED!
>>>
>>
>> The fix I sent you is just to prevent a potential SEGFAULT in
>> is_amd_ibs(). I bet the test fails some perf_event_open()
>> and ends up in strerror function and from there I don't see how the
>> patch could impact the test, given you'd segfault
>> otherwise.
>>
>> I tried on my side and with or without this patch this test fails. I
>> think this looks like an unrelated issue.
>
> That's strange. IIUC, python/perf.so needs to know about util/evsel.c

I meant util/env.c not util/evsel.c

> which can be done by adding an entry in util/python-ext-sources.
>
> In any case, do we really need this patch now? For both the example given
> in description, I see no difference with or without patch:
>
> Without patch:
>
> $ sudo ./perf record -e ibs_op// true
> Error:
> Invalid event (ibs_op//) in per-thread mode, enable system wide with '-a'.
>
> $ ./perf record -e ibs_op// true
> Error:
> Invalid event (ibs_op//u) in per-thread mode, enable system wide with '-a'.
>
> Same o/p with the patch as well.
>
> Thanks,
> Ravi

2022-03-17 04:58:53

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling

Hi Kim,

On Tue, Feb 8, 2022 at 1:17 PM Stephane Eranian <[email protected]> wrote:
>
> From: Kim Phillips <[email protected]>
>
> improve the error message returned on failed perf_event_open() on AMD when
> using IBS.
>
> Output of executing 'perf record -e ibs_op// true' BEFORE this patch:
>
> The sys_perf_event_open() syscall returned with 22 (Invalid argument)for event (ibs_op//u).
> /bin/dmesg | grep -i perf may provide additional information.
>
> Output after:
>
> AMD IBS cannot exclude kernel events. Try running at a higher privilege level.
>
> Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:
>
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
> /bin/dmesg | grep -i perf may provide additional information.
>
> Output after:
>
> Error:
> AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity
>
> Signed-off-by: Kim Phillips <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Boris Ostrovsky <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Joao Martins <[email protected]>
> Cc: Konrad Rzeszutek Wilk <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Michael Petlan <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Robert Richter <[email protected]>
> Cc: Stephane Eranian <[email protected]>
> ---
> tools/perf/util/evsel.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 22d3267ce294..d42f63a484df 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2847,9 +2847,22 @@ static bool find_process(const char *name)
> return ret ? false : true;
> }
>
> +static bool is_amd(const char *arch, const char *cpuid)
> +{
> + return arch && !strcmp("x86", arch) && cpuid && strstarts(cpuid, "AuthenticAMD");
> +}
> +
> +static bool is_amd_ibs(struct evsel *evsel)
> +{
> + return evsel->core.attr.precise_ip || !strncmp(evsel->pmu_name, "ibs", 3);
> +}
> +
> int evsel__open_strerror(struct evsel *evsel, struct target *target,
> int err, char *msg, size_t size)
> {
> + struct perf_env *env = evsel__env(evsel);
> + const char *arch = perf_env__arch(env);
> + const char *cpuid = perf_env__cpuid(env);


This code dies for me on the latest tip.git because env = NULL and
perf_env_cpuid() is broken for NULL argument.
I don't quite know where this env global variable is set but I hope
there is a better way of doing this, maybe using
the evsel__env() function in the same util/evsel.c file.

Similarly, the is_amd_ibs() suffers from a NULL pointer dereference
because evsel->pmu_name maybe NULL:

$ perf record -e rc2 .....

causes a NULL pmu_name.

Could you please send me an updated version to integrate with the
branch sampling code?

Thanks.


>
> char sbuf[STRERR_BUFSIZE];
> int printed = 0, enforced = 0;
>
> @@ -2949,6 +2962,17 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
> return scnprintf(msg, size,
> "Invalid event (%s) in per-thread mode, enable system wide with '-a'.",
> evsel__name(evsel));
> + if (is_amd(arch, cpuid)) {
> + if (is_amd_ibs(evsel)) {
> + if (evsel->core.attr.exclude_kernel)
> + return scnprintf(msg, size,
> + "AMD IBS can't exclude kernel events. Try running at a higher privilege level.");
> + if (!evsel->core.system_wide)
> + return scnprintf(msg, size,
> + "AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity");
> + }
> + }
> +
> break;
> case ENODATA:
> return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. "
> --
> 2.35.0.263.gb82422642f-goog
>

2022-03-17 05:57:54

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling

Arnaldo,

On 09-Feb-22 9:17 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Feb 08, 2022 at 01:16:35PM -0800, Stephane Eranian escreveu:
>> From: Kim Phillips <[email protected]>
>>
>> improve the error message returned on failed perf_event_open() on AMD when
>> using IBS.
>>
>> Output of executing 'perf record -e ibs_op// true' BEFORE this patch:
>>
>> The sys_perf_event_open() syscall returned with 22 (Invalid argument)for event (ibs_op//u).
>> /bin/dmesg | grep -i perf may provide additional information.
>
> Humm, here on a
>
> $ grep -m1 'model name' /proc/cpuinfo
> model name : AMD Ryzen 9 5950X 16-Core Processor
> $ ls -la /sys/devices/ibs_op
> total 0
> drwxr-xr-x. 4 root root 0 Feb 9 07:12 .
> drwxr-xr-x. 21 root root 0 Feb 9 07:12 ..
> drwxr-xr-x. 2 root root 0 Feb 9 12:17 format
> -rw-r--r--. 1 root root 4096 Feb 9 12:21 perf_event_mux_interval_ms
> drwxr-xr-x. 2 root root 0 Feb 9 12:21 power
> lrwxrwxrwx. 1 root root 0 Feb 9 07:12 subsystem -> ../../bus/event_source
> -r--r--r--. 1 root root 4096 Feb 9 12:17 type
> -rw-r--r--. 1 root root 4096 Feb 9 07:12 uevent
> $ cat /sys/devices/ibs_op/type
> 9
> $
>
> Running without this patch:
>
> $ uname -a
> Linux five 5.15.14-100.fc34.x86_64 #1 SMP Tue Jan 11 16:53:51 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
> $
>
> $ cat /etc/redhat-release
> Fedora release 34 (Thirty Four)
> $
>
> $ perf record -e ibs_op// true
> Error:
> Invalid event (ibs_op//u) in per-thread mode, enable system wide with '-a'.
> $
>
> Trying with system wide:
>
> $ perf record -a -e ibs_op// true
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//u).
> /bin/dmesg | grep -i perf may provide additional information.
>
> $
>
> So you're missing -a in all examples? Am I missing something?

AMD IBS does not support per-thread monitoring because it's configured
as uncore pmu (perf_invalid_context).

Thanks,
Ravi

2022-03-17 06:11:11

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH v6 10/12] perf tools: Improve IBS error handling

On Tue, Mar 15, 2022 at 12:46 AM Ravi Bangoria <[email protected]> wrote:
>
> Stephane,
>
> On 09-Feb-22 2:46 AM, Stephane Eranian wrote:
> > From: Kim Phillips <[email protected]>
> >
> > improve the error message returned on failed perf_event_open() on AMD when
> > using IBS.
> >
> > Output of executing 'perf record -e ibs_op// true' BEFORE this patch:
> >
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument)for event (ibs_op//u).
> > /bin/dmesg | grep -i perf may provide additional information.
> >
> > Output after:
> >
> > AMD IBS cannot exclude kernel events. Try running at a higher privilege level.
> >
> > Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:
> >
> > Error:
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
> > /bin/dmesg | grep -i perf may provide additional information.
> >
> > Output after:
> >
> > Error:
> > AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity
>
> This patch seems to be causing regression to perf python test.
>
> Without this patch:
>
> $ sudo ./perf test -vvv 19
> 19: 'import perf' in python :
> --- start ---
> test child forked, pid 145391
> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/usr/bin/python2' "
> test child finished with 0
> ---- end ----
> 'import perf' in python: Ok
>
> With this patch:
>
> $ sudo ./perf test -vvv 19
> 19: 'import perf' in python :
> --- start ---
> test child forked, pid 144415
> python usage test: "echo "import sys ; sys.path.append('python'); import perf" | '/usr/bin/python2' "
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ImportError: python/perf.so: undefined symbol: perf_env__cpuid
> test child finished with -1
> ---- end ----
> 'import perf' in python: FAILED!
>

The fix I sent you is just to prevent a potential SEGFAULT in
is_amd_ibs(). I bet the test fails some perf_event_open()
and ends up in strerror function and from there I don't see how the
patch could impact the test, given you'd segfault
otherwise.

I tried on my side and with or without this patch this test fails. I
think this looks like an unrelated issue.