2021-06-02 10:35:25

by Leo Yan

Subject: [PATCH v2 8/8] perf record: Directly bail out for compat case

Since 32-bit perf cannot guarantee atomic access to 64-bit values,
report an error and bail out directly in this case.

For now the check only applies to x86_64 and Arm64 platforms.

Suggested-by: Adrian Hunter <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/builtin-record.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3337b5f93336..f47e298281f7 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -74,6 +74,7 @@
#include <linux/zalloc.h>
#include <linux/bitmap.h>
#include <sys/time.h>
+#include <sys/utsname.h>

struct switch_output {
bool enabled;
@@ -848,6 +849,22 @@ static int record__mmap_evlist(struct record *rec,
opts->auxtrace_sample_mode;
char msg[512];

+#ifndef __LP64__
+ struct utsname uts;
+ int ret;
+
+ ret = uname(&uts);
+ if (ret < 0)
+ return ret;
+
+ if (!strncmp(uts.machine, "x86_64", 6) || !strncmp(uts.machine, "aarch64", 7) ||
+ !strncmp(uts.machine, "arm64", 5)) {
+ pr_err("Error, 32-bit perf cannot record from a 64-bit kernel.\n"
+ "Please use a 64-bit version of perf instead.\n");
+ return -ENOTSUP;
+ }
+#endif
+
if (opts->affinity != PERF_AFFINITY_SYS)
cpu__setup_cpunode_map();

--
2.25.1


2021-06-02 11:19:55

by Adrian Hunter

Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case

On 2/06/21 1:30 pm, Leo Yan wrote:
> Since the 64-bit atomicity is not promised in 32-bit perf, directly
> report the error and bail out for this case.
>
> Now only applies on x86_64 and Arm64 platforms.
>
> Suggested-by: Adrian Hunter <[email protected]>

Maybe we can do better for the compat case.

We can assume the upper 32-bits change very seldom,
and always increase. So for the 'read' case:

u64 first, second, last;
u64 mask = (u64)((u32)-1) << 32;

do {
first = READ_ONCE(pc->aux_head);
rmb();
second = READ_ONCE(pc->aux_head);
rmb();
last = READ_ONCE(pc->aux_head);
} while ((first & mask) != (last & mask));
return second;

For the write case, we can cause a fatal error only if the new
tail has non-zero upper 32-bits. That gives up to 4GiB of data
before aborting:

if (tail & mask)
return -1;
smp_mb();
WRITE_ONCE(pc->aux_tail, tail);
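
To illustrate the read and write cases above, here is a minimal userspace sketch. It is not the perf implementation: `struct aux_page` is a hypothetical stand-in for the mmap'ed control page, volatile accesses stand in for READ_ONCE()/WRITE_ONCE(), and the GCC/Clang builtin `__sync_synchronize()` stands in for rmb()/smp_mb(). The helper names are assumptions as well.

```c
#include <stdint.h>

/* Hypothetical stand-in for the aux_head/aux_tail fields of the control page. */
struct aux_page {
	volatile uint64_t aux_head;	/* advanced by the kernel */
	volatile uint64_t aux_tail;	/* advanced by perf */
};

#define UPPER_32_MASK	((uint64_t)UINT32_MAX << 32)

/* Full barrier (GCC/Clang builtin); assumed substitute for rmb()/smp_mb(). */
#define barrier_full()	__sync_synchronize()

/*
 * Read the head three times and retry if the upper 32 bits differ between
 * the first and last read, returning the middle value once they agree.
 */
uint64_t compat_read_head(struct aux_page *pc)
{
	uint64_t first, second, last;

	do {
		first = pc->aux_head;
		barrier_full();
		second = pc->aux_head;
		barrier_full();
		last = pc->aux_head;
	} while ((first & UPPER_32_MASK) != (last & UPPER_32_MASK));

	return second;
}

/*
 * Refuse any tail that needs the upper 32 bits, so only the low word of
 * aux_tail ever changes. Returns 0 on success, -1 once 4GiB is reached.
 */
int compat_write_tail(struct aux_page *pc, uint64_t tail)
{
	if (tail & UPPER_32_MASK)
		return -1;

	barrier_full();
	pc->aux_tail = tail;
	return 0;
}
```

The sketch only shows the control flow; whether a plain volatile access is sufficient on a given 32-bit target is a separate question that the real READ_ONCE()/WRITE_ONCE() helpers would have to answer.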

> Signed-off-by: Leo Yan <[email protected]>
> ---
> tools/perf/builtin-record.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 3337b5f93336..f47e298281f7 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -74,6 +74,7 @@
> #include <linux/zalloc.h>
> #include <linux/bitmap.h>
> #include <sys/time.h>
> +#include <sys/utsname.h>
>
> struct switch_output {
> bool enabled;
> @@ -848,6 +849,22 @@ static int record__mmap_evlist(struct record *rec,
> opts->auxtrace_sample_mode;
> char msg[512];
>
> +#ifndef __LP64__
> + struct utsname uts;
> + int ret;
> +
> + ret = uname(&uts);
> + if (ret < 0)
> + return ret;
> +
> + if (!strncmp(uts.machine, "x86_64", 6) || !strncmp(uts.machine, "aarch64", 7) ||
> + !strncmp(uts.machine, "arm64", 5)) {
> + pr_err("Error, 32-bit perf cannot record from a 64-bit kernel.\n"
> + "Please use a 64-bit version of perf instead.\n");
> + return -ENOTSUP;
> + }
> +#endif
> +
> if (opts->affinity != PERF_AFFINITY_SYS)
> cpu__setup_cpunode_map();
>
>

2021-06-02 12:41:56

by Leo Yan

Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case

Hi Adrian,

On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
> On 2/06/21 1:30 pm, Leo Yan wrote:
> > Since the 64-bit atomicity is not promised in 32-bit perf, directly
> > report the error and bail out for this case.
> >
> > Now only applies on x86_64 and Arm64 platforms.
> >
> > Suggested-by: Adrian Hunter <[email protected]>
>
> Maybe we can do better for the compat case.
>
> We can assume the upper 32-bits change very seldom,
> and always increase. So for the 'read' case:
>
> u64 first, second, last;
> u64 mask = (u64)((u32)-1) << 32;
>
> do {
> first = READ_ONCE(pc->aux_head);
> rmb();
> second = READ_ONCE(pc->aux_head);
> rmb();
> last = READ_ONCE(pc->aux_head);
> } while ((first & mask) != (last & mask));
> return second;
>
> For the write case, we can cause a fatal error only if the new
> tail has non-zero upper 32-bits. That gives up to 4GiB of data
> before aborting:
>
> if (tail & mask)
> return -1;
> smp_mb();
> WRITE_ONCE(pc->aux_tail, tail);

It seems pointless to me to support a 64-bit aux_head while limiting
aux_tail to 32 bits. I understand this helps snapshot mode, which only
uses aux_head, but it still fails to support the normal case where the
AUX ring buffer uses a 64-bit head and tail.

Thanks,
Leo

2021-06-07 10:25:45

by Adrian Hunter

Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case

On 2/06/21 3:38 pm, Leo Yan wrote:
> Hi Adrian,
>
> On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
>> On 2/06/21 1:30 pm, Leo Yan wrote:
>>> Since the 64-bit atomicity is not promised in 32-bit perf, directly
>>> report the error and bail out for this case.
>>>
>>> Now only applies on x86_64 and Arm64 platforms.
>>>
>>> Suggested-by: Adrian Hunter <[email protected]>
>>
>> Maybe we can do better for the compat case.
>>
>> We can assume the upper 32-bits change very seldom,
>> and always increase. So for the 'read' case:
>>
>> u64 first, second, last;
>> u64 mask = (u64)((u32)-1) << 32;
>>
>> do {
>> first = READ_ONCE(pc->aux_head);
>> rmb();
>> second = READ_ONCE(pc->aux_head);
>> rmb();
>> last = READ_ONCE(pc->aux_head);
>> } while ((first & mask) != (last & mask));
>> return second;
>>
>> For the write case, we can cause a fatal error only if the new
>> tail has non-zero upper 32-bits. That gives up to 4GiB of data
>> before aborting:
>>
>> if (tail & mask)
>> return -1;
>> smp_mb();
>> WRITE_ONCE(pc->aux_tail, tail);
>
> Seems to me, it's pointless to only support aux_head for 64-bit and
> support aux_tail for 32-bit. I understand this can be helpful for the
> snapshot mode which only uses aux_head, but it still fails to support
> the normal case for AUX ring buffer using 64-bit head/tail.

I am not sure why you say it is pointless. 'perf record' would still be
able to capture up to 4GiB of data. Do you mean you usually capture more
than 4GiB of data?

I was thinking we would separate out the compat case:

#if BITS_PER_LONG == 32
if (kernel_is_64_bit)
return compat_auxtrace_mmap__[read_head/write_tail]()
#endif

So the non-compat cases would not be affected.

2021-06-07 15:11:58

by Leo Yan

Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case

On Mon, Jun 07, 2021 at 01:23:43PM +0300, Adrian Hunter wrote:
> On 2/06/21 3:38 pm, Leo Yan wrote:
> > Hi Adrian,
> >
> > On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
> >> On 2/06/21 1:30 pm, Leo Yan wrote:
> >>> Since the 64-bit atomicity is not promised in 32-bit perf, directly
> >>> report the error and bail out for this case.
> >>>
> >>> Now only applies on x86_64 and Arm64 platforms.
> >>>
> >>> Suggested-by: Adrian Hunter <[email protected]>
> >>
> >> Maybe we can do better for the compat case.
> >>
> >> We can assume the upper 32-bits change very seldom,
> >> and always increase. So for the 'read' case:
> >>
> >> u64 first, second, last;
> >> u64 mask = (u64)((u32)-1) << 32;
> >>
> >> do {
> >> first = READ_ONCE(pc->aux_head);
> >> rmb();
> >> second = READ_ONCE(pc->aux_head);
> >> rmb();
> >> last = READ_ONCE(pc->aux_head);
> >> } while ((first & mask) != (last & mask));
> >> return second;
> >>
> >> For the write case, we can cause a fatal error only if the new
> >> tail has non-zero upper 32-bits. That gives up to 4GiB of data
> >> before aborting:
> >>
> >> if (tail & mask)
> >> return -1;
> >> smp_mb();
> >> WRITE_ONCE(pc->aux_tail, tail);
> >
> > Seems to me, it's pointless to only support aux_head for 64-bit and
> > support aux_tail for 32-bit. I understand this can be helpful for the
> > snapshot mode which only uses aux_head, but it still fails to support
> > the normal case for AUX ring buffer using 64-bit head/tail.
>
> I am not sure why you say it is pointless. 'perf record' would still be
> able to capture up to 4GiB of data. Do you mean you usually capture more
> than 4GiB of data?

Okay, understood. We can support 32-bit perf in compat mode as long as
the trace data is less than 4GiB.

> I was thinking we would separate out the compat case:
>
> #if BITS_PER_LONG == 32
> if (kernel_is_64_bit)
> return compat_auxtrace_mmap__[read_head/write_tail]()
> #endif
>
> So the non-compat cases would not be affected.

I don't want to introduce extra complexity for reading and writing the
head and tail, and we also need to handle the same issue for the perf
ring buffer. So how about the change below?

The main idea is to let perf run normally in compat mode and exit if it
detects that the buffer head is close to overflowing the low 32 bits:
when the low 32-bit value reaches 0xf0000000 (leaving a 256MiB margin
before the overflow), perf reports an error and exits.

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 1b4091a3b508..2a9965bfeab4 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map,
pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n",
mm->idx, old, head, head - old);

+#if BITS_PER_LONG == 32
+ if (perf_kernel_is_64bit() && head >= 0xf0000000) {
+ pr_err("32-bit perf cannot read 64-bit value atomically;\n");
+ pr_err("exit to avoid the 4GB (32-bit) AUX buffer overflow on compat mode.\n");
+ return -ENOMEM;
+ }
+#endif
+
if (mm->mask) {
head_off = head & mm->mask;
old_off = old & mm->mask;
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 9130f6fad8d5..823b69895b85 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -405,3 +405,20 @@ int perf_env__numa_node(struct perf_env *env, int cpu)

return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1;
}
+
+int perf_kernel_is_64bit(void)
+{
+ struct utsname uts;
+ int ret;
+
+ ret = uname(&uts);
+ if (ret < 0)
+ return 0;
+
+ if (!strncmp(uts.machine, "x86_64", 6) ||
+ !strncmp(uts.machine, "aarch64", 7) ||
+ !strncmp(uts.machine, "arm64", 5))
+ return 1;
+
+ return 0;
+}
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index ca249bf5e984..c6c034fc08f6 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -147,4 +147,6 @@ void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);

int perf_env__numa_node(struct perf_env *env, int cpu);
+
+int perf_kernel_is_64bit(void);
#endif /* __PERF_ENV_H */
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index ab7108d22428..f1d3725d599a 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -323,6 +323,14 @@ int perf_mmap__push(struct mmap *md, void *to,
if (rc < 0)
return (rc == -EAGAIN) ? 1 : -1;

+#if BITS_PER_LONG == 32
+ if (perf_kernel_is_64bit() && head >= 0xf0000000) {
+ pr_err("32-bit perf cannot read 64-bit value atomically;\n");
+ pr_err("exit to avoid the 4GB (32-bit) buffer overflow on compat mode.\n");
+ return -ENOMEM;
+ }
+#endif
+
size = md->core.end - md->core.start;

if ((md->core.start & md->core.mask) + size != (md->core.end & md->core.mask)) {

Thanks,
Leo

2021-06-09 14:03:31

by Leo Yan

Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case

Hi Adrian,

On Wed, Jun 09, 2021 at 11:23:25AM +0300, Adrian Hunter wrote:

[...]

> >> I was thinking we would separate out the compat case:
> >>
> >> #if BITS_PER_LONG == 32
> >> if (kernel_is_64_bit)
> >> return compat_auxtrace_mmap__[read_head/write_tail]()
> >> #endif
> >>
> >> So the non-compat cases would not be affected.
> >
> > Because I don't want to introduce the complexity for read/write head
> > and tail, and we also need to handle the same issue for the perf ring
> > buffer. So how about below change?
> >
> > The main idea for below change is it allows the perf to run normally
> > on the compat mode and exitly if detects the buffer head is close to
> > the low 32-bit's overflow: when detect the low 32-bit value is bigger
> > than 0xf0000000 (so we have 256MiB margin to the overflow), it reports
> > error and exit.
> >
> > diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> > index 1b4091a3b508..2a9965bfeab4 100644
> > --- a/tools/perf/util/auxtrace.c
> > +++ b/tools/perf/util/auxtrace.c
> > @@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map,
> > pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n",
> > mm->idx, old, head, head - old);
> >
> > > +#if BITS_PER_LONG == 32
> > > + if (perf_kernel_is_64bit() && head >= 0xf0000000) {
>
> You are assuming the head never increases by more than 256MiB which
> means you should limit the buffer size to 256MiB maximum.
>
> To me this seems a bit too far from an ideal solution.
>
> I would have thought separating out the compat case makes things
> simpler to understand.

Agreed. I will follow the suggestion and add compat variants for
accessing the AUX head and tail, and will distinguish the compat case
with a global env variable that records whether the kernel is 64-bit.

Once it is ready, I will send it out for review. Thanks a lot for the suggestions!

Leo

2021-06-09 17:14:11

by Adrian Hunter

Subject: Re: [PATCH v2 8/8] perf record: Directly bail out for compat case

On 7/06/21 6:09 pm, Leo Yan wrote:
> On Mon, Jun 07, 2021 at 01:23:43PM +0300, Adrian Hunter wrote:
>> On 2/06/21 3:38 pm, Leo Yan wrote:
>>> Hi Adrian,
>>>
>>> On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
>>>> On 2/06/21 1:30 pm, Leo Yan wrote:
>>>>> Since the 64-bit atomicity is not promised in 32-bit perf, directly
>>>>> report the error and bail out for this case.
>>>>>
>>>>> Now only applies on x86_64 and Arm64 platforms.
>>>>>
>>>>> Suggested-by: Adrian Hunter <[email protected]>
>>>>
>>>> Maybe we can do better for the compat case.
>>>>
>>>> We can assume the upper 32-bits change very seldom,
>>>> and always increase. So for the 'read' case:
>>>>
>>>> u64 first, second, last;
>>>> u64 mask = (u64)((u32)-1) << 32;
>>>>
>>>> do {
>>>> first = READ_ONCE(pc->aux_head);
>>>> rmb();
>>>> second = READ_ONCE(pc->aux_head);
>>>> rmb();
>>>> last = READ_ONCE(pc->aux_head);
>>>> } while ((first & mask) != (last & mask));
>>>> return second;
>>>>
>>>> For the write case, we can cause a fatal error only if the new
>>>> tail has non-zero upper 32-bits. That gives up to 4GiB of data
>>>> before aborting:
>>>>
>>>> if (tail & mask)
>>>> return -1;
>>>> smp_mb();
>>>> WRITE_ONCE(pc->aux_tail, tail);
>>>
>>> Seems to me, it's pointless to only support aux_head for 64-bit and
>>> support aux_tail for 32-bit. I understand this can be helpful for the
>>> snapshot mode which only uses aux_head, but it still fails to support
>>> the normal case for AUX ring buffer using 64-bit head/tail.
>>
>> I am not sure why you say it is pointless. 'perf record' would still be
>> able to capture up to 4GiB of data. Do you mean you usually capture more
>> than 4GiB of data?
>
> Okay, understand. We can support 32-bit perf for compat mode when the
> trace data is less than 4GiB.
>
>> I was thinking we would separate out the compat case:
>>
>> #if BITS_PER_LONG == 32
>> if (kernel_is_64_bit)
>> return compat_auxtrace_mmap__[read_head/write_tail]()
>> #endif
>>
>> So the non-compat cases would not be affected.
>
> Because I don't want to introduce the complexity for read/write head
> and tail, and we also need to handle the same issue for the perf ring
> buffer. So how about below change?
>
> The main idea for below change is it allows the perf to run normally
> on the compat mode and exitly if detects the buffer head is close to
> the low 32-bit's overflow: when detect the low 32-bit value is bigger
> than 0xf0000000 (so we have 256MiB margin to the overflow), it reports
> error and exit.
>
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 1b4091a3b508..2a9965bfeab4 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map,
> pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n",
> mm->idx, old, head, head - old);
>
> +#if BITS_PER_LONG == 32
> + if (perf_kernel_is_64bit() && head >= 0xf0000000) {

You are assuming the head never increases by more than 256MiB which
means you should limit the buffer size to 256MiB maximum.

To me this seems a bit too far from an ideal solution.

I would have thought separating out the compat case makes things
simpler to understand.
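
The arithmetic behind the 256MiB point can be sketched as follows. This is only an illustration under the assumption that the head may advance by up to one full buffer between two consecutive checks; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold used by the proposed check. */
#define COMPAT_HEAD_LIMIT	0xf0000000ULL
/* Distance from the threshold to the 32-bit wrap: 0x10000000 = 256MiB. */
#define COMPAT_HEAD_MARGIN	((1ULL << 32) - COMPAT_HEAD_LIMIT)

/*
 * If the head can jump by more than the margin between two checks, it can
 * go from just below the threshold to past the 32-bit wrap without the
 * check ever observing a value at or above the threshold. The check is
 * only dependable when the buffer is no larger than the margin.
 */
bool head_check_cannot_be_skipped(uint64_t buffer_size)
{
	return buffer_size <= COMPAT_HEAD_MARGIN;
}
```

For example, a head of 0xefffffff plus a 256MiB advance lands on 0xffffffff, still below the wrap, whereas a 512MiB buffer could carry the low word straight past it.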

> + pr_err("32-bit perf cannot read 64-bit value atomically;\n");
> + pr_err("exit to avoid the 4GB (32-bit) AUX buffer overflow on compat mode.\n");
> + return -ENOMEM;
> + }
> +#endif
> +
> if (mm->mask) {
> head_off = head & mm->mask;
> old_off = old & mm->mask;
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index 9130f6fad8d5..823b69895b85 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -405,3 +405,20 @@ int perf_env__numa_node(struct perf_env *env, int cpu)
>
> return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1;
> }
> +
> +int perf_kernel_is_64bit(void)
> +{
> + struct utsname uts;
> + int ret;
> +
> + ret = uname(&uts);
> + if (ret < 0)
> + return 0;
> +
> + if (!strncmp(uts.machine, "x86_64", 6) ||
> + !strncmp(uts.machine, "aarch64", 7) ||
> + !strncmp(uts.machine, "arm64", 5))
> + return 1;
> +
> + return 0;
> +}

Obviously, we don't need to keep checking uname. It could be a global variable
that is always set up early.
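
A minimal sketch of that idea, assuming a hypothetical global `kernel_is_64_bit` and hypothetical helper names; the machine-string matching is copied from the patch above and is not meant to cover every 64-bit architecture.

```c
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical global; -1 means "not probed yet". */
int kernel_is_64_bit = -1;

/* Same prefix matches as the patch; other 64-bit machines are not covered. */
int machine_is_64bit(const char *machine)
{
	return !strncmp(machine, "x86_64", 6) ||
	       !strncmp(machine, "aarch64", 7) ||
	       !strncmp(machine, "arm64", 5);
}

/* Call once during start-up; later users just read kernel_is_64_bit. */
void init_kernel_is_64_bit(void)
{
	struct utsname uts;

	if (uname(&uts) < 0)
		kernel_is_64_bit = 0;	/* as in the patch, treat failure as "not 64-bit" */
	else
		kernel_is_64_bit = machine_is_64bit(uts.machine);
}
```

Splitting the string match from the uname() call keeps the hot paths free of system calls and makes the matching logic easy to exercise on its own.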


> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index ca249bf5e984..c6c034fc08f6 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -147,4 +147,6 @@ void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
> struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
>
> int perf_env__numa_node(struct perf_env *env, int cpu);
> +
> +int perf_kernel_is_64bit(void);
> #endif /* __PERF_ENV_H */
> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
> index ab7108d22428..f1d3725d599a 100644
> --- a/tools/perf/util/mmap.c
> +++ b/tools/perf/util/mmap.c
> @@ -323,6 +323,14 @@ int perf_mmap__push(struct mmap *md, void *to,
> if (rc < 0)
> return (rc == -EAGAIN) ? 1 : -1;
>
> +#if BITS_PER_LONG == 32
> + if (perf_kernel_is_64bit() && head >= 0xf0000000) {
> + pr_err("32-bit perf cannot read 64-bit value atomically;\n");
> + pr_err("exit to avoid the 4GB (32-bit) buffer overflow on compat mode.\n");
> + return -ENOMEM;
> + }
> +#endif
> +
> size = md->core.end - md->core.start;
>
> if ((md->core.start & md->core.mask) + size != (md->core.end & md->core.mask)) {
>
> Thanks,
> Leo
>