2020-11-05 04:52:15

by Daniel Xu

Subject: [PATCH bpf v2 0/2] Fix bpf_probe_read_user_str() overcopying

6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user,
kernel}_str helpers") introduced a subtle bug where
bpf_probe_read_user_str() would potentially copy a few extra bytes after
the NUL terminator.

This issue is particularly nefarious when strings are used as map keys,
as seemingly identical strings can occupy multiple entries in a map.

This patchset fixes the issue and introduces a selftest to prevent
future regressions.

Daniel Xu (2):
  lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator
  selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes
    after NUL

 lib/strncpy_from_user.c                       |  9 ++-
 .../bpf/prog_tests/probe_read_user_str.c      | 60 +++++++++++++++++++
 .../bpf/progs/test_probe_read_user_str.c      | 34 +++++++++++
 3 files changed, 101 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_probe_read_user_str.c

--
2.28.0
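
To make the map-key problem in the cover letter concrete, here is a minimal user-space sketch. It is not kernel or BPF code: KEY_SIZE and the helper names are hypothetical, and keys_equal() only assumes that a fixed-width key is compared as raw bytes over its full width. Two keys holding the same C string compare as different once one of them carries stray bytes after the NUL.

```c
#include <assert.h>
#include <string.h>

#define KEY_SIZE 16 /* hypothetical fixed key width, chosen for illustration */

/* Build a key the way a clean copy would: string bytes, then zeros. */
static void make_clean_key(unsigned char key[KEY_SIZE], const char *s)
{
	size_t len = strlen(s);

	assert(len < KEY_SIZE);
	memset(key, 0, KEY_SIZE);
	memcpy(key, s, len + 1);
}

/* Build a key with stray bytes left after the NUL, as an overcopy might. */
static void make_dirty_key(unsigned char key[KEY_SIZE], const char *s,
			   unsigned char junk)
{
	size_t len = strlen(s);

	assert(len < KEY_SIZE);
	memset(key, junk, KEY_SIZE);
	memcpy(key, s, len + 1);
}

/* Compare the whole key width byte by byte, ignoring string semantics. */
static int keys_equal(const unsigned char a[KEY_SIZE],
		      const unsigned char b[KEY_SIZE])
{
	return memcmp(a, b, KEY_SIZE) == 0;
}
```

With "mestring" as the string, strcmp() on the two buffers reports equality while keys_equal() does not, which is the "seemingly identical strings" situation described above.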


2020-11-05 04:52:20

by Daniel Xu

Subject: [PATCH bpf v2 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator

do_strncpy_from_user() may copy some extra bytes after the NUL
terminator into the destination buffer. This usually does not matter for
normal string operations. However, when BPF programs key BPF maps with
strings, this matters a lot.

A BPF program may read strings from user memory by calling the
bpf_probe_read_user_str() helper which eventually calls
do_strncpy_from_user(). The program can then key a map with the
resulting string. BPF map keys are fixed-width and string-agnostic,
meaning that map keys are treated as a set of bytes.

The issue is when do_strncpy_from_user() overcopies bytes after the NUL
terminator, it can result in seemingly identical strings occupying
multiple slots in a BPF map. This behavior is subtle and totally
unexpected by the user.

This commit uses the proper word-at-a-time APIs to avoid overcopying.

Fixes: 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
Signed-off-by: Daniel Xu <[email protected]>
---
lib/strncpy_from_user.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index e6d5fcc2cdf3..d084189eb05c 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
 		goto byte_at_a_time;
 
 	while (max >= sizeof(unsigned long)) {
-		unsigned long c, data;
+		unsigned long c, data, mask, *out;
 
 		/* Fall back to byte-at-a-time if we get a page fault */
 		unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
 
-		*(unsigned long *)(dst+res) = c;
 		if (has_zero(c, &data, &constants)) {
 			data = prep_zero_mask(c, data, &constants);
 			data = create_zero_mask(data);
+			mask = zero_bytemask(data);
+			out = (unsigned long *)(dst+res);
+			*out = (*out & ~mask) | (c & mask);
 			return res + find_zero(data);
+		} else {
+			*(unsigned long *)(dst+res) = c;
 		}
+
 		res += sizeof(unsigned long);
 		max -= sizeof(unsigned long);
 	}
--
2.28.0
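
The merge expression in the hunk above, (*out & ~mask) | (c & mask), can be tried out in user space with a rough sketch like the one below. This is an illustration under stated assumptions, not the kernel implementation: copy_word_masked() is a hypothetical helper, it builds its mask byte by byte in memory order instead of using has_zero()/zero_bytemask(), and its mask is chosen here to cover the NUL byte itself, which may differ from what the kernel helpers produce.

```c
#include <assert.h>
#include <string.h>

#define WORD sizeof(unsigned long)

/*
 * Copy one word from src to dst, leaving dst bytes after the NUL untouched.
 * Returns the index of the NUL within the word, or WORD if there is none.
 * The mask is assembled in memory order, so the result does not depend on
 * endianness.
 */
static size_t copy_word_masked(unsigned char *dst, const unsigned char *src)
{
	unsigned char mask_bytes[WORD];
	unsigned long c, out, mask;
	size_t i, nul = WORD;

	assert(dst != NULL && src != NULL);

	for (i = 0; i < WORD; i++) {
		if (src[i] == '\0') {
			nul = i;
			break;
		}
	}

	if (nul == WORD) {
		memcpy(dst, src, WORD);	/* no terminator: plain full-word store */
		return WORD;
	}

	/* 0xff for bytes up to and including the NUL, 0x00 afterwards */
	for (i = 0; i < WORD; i++)
		mask_bytes[i] = (i <= nul) ? 0xff : 0x00;

	memcpy(&c, src, WORD);
	memcpy(&out, dst, WORD);
	memcpy(&mask, mask_bytes, WORD);

	out = (out & ~mask) | (c & mask);	/* same merge shape as the patch */
	memcpy(dst, &out, WORD);
	return nul;
}
```

Because only the masked bytes are taken from the source word, whatever the destination held after the terminator (zeros in a freshly initialized buffer) survives the store.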

2020-11-05 04:54:15

by Daniel Xu

Subject: [PATCH bpf v2 2/2] selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL

Previously, bpf_probe_read_user_str() could potentially overcopy the
trailing bytes after the NUL due to how do_strncpy_from_user() does the
copy in long-sized strides. The issue has been fixed in the previous
commit.

This commit adds a selftest that ensures we don't regress
bpf_probe_read_user_str() again.

Signed-off-by: Daniel Xu <[email protected]>
---
.../bpf/prog_tests/probe_read_user_str.c | 60 +++++++++++++++++++
.../bpf/progs/test_probe_read_user_str.c | 34 +++++++++++
2 files changed, 94 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
create mode 100644 tools/testing/selftests/bpf/progs/test_probe_read_user_str.c

diff --git a/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c b/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
new file mode 100644
index 000000000000..597a166e6c8d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include "test_probe_read_user_str.skel.h"
+
+static const char str[] = "mestring";
+
+void test_probe_read_user_str(void)
+{
+	struct test_probe_read_user_str *skel;
+	int fd, err, duration = 0;
+	char buf[256];
+	ssize_t n;
+
+	skel = test_probe_read_user_str__open_and_load();
+	if (CHECK(!skel, "test_probe_read_user_str__open_and_load",
+		  "skeleton open and load failed\n"))
+		goto out;
+
+	err = test_probe_read_user_str__attach(skel);
+	if (CHECK(err, "test_probe_read_user_str__attach",
+		  "skeleton attach failed: %d\n", err))
+		goto out;
+
+	fd = open("/dev/null", O_WRONLY);
+	if (CHECK(fd < 0, "open", "open /dev/null failed: %d\n", fd))
+		goto out;
+
+	/* Give pid to bpf prog so it doesn't read from anyone else */
+	skel->bss->pid = getpid();
+
+	/* Ensure bytes after string are ones */
+	memset(buf, 1, sizeof(buf));
+	memcpy(buf, str, sizeof(str));
+
+	/* Trigger tracepoint */
+	n = write(fd, buf, sizeof(buf));
+	if (CHECK(n != sizeof(buf), "write", "write failed: %ld\n", n))
+		goto fd_out;
+
+	/* Did helper fail? */
+	if (CHECK(skel->bss->ret < 0, "prog ret", "prog returned: %d\n",
+		  skel->bss->ret))
+		goto fd_out;
+
+	/* Check that string was copied correctly */
+	err = memcmp(skel->bss->buf, str, sizeof(str));
+	if (CHECK(err, "memcmp", "prog copied wrong string"))
+		goto fd_out;
+
+	/* Now check that no extra trailing bytes were copied */
+	memset(buf, 0, sizeof(buf));
+	err = memcmp(skel->bss->buf + sizeof(str), buf, sizeof(buf) - sizeof(str));
+	if (CHECK(err, "memcmp", "trailing bytes were not stripped"))
+		goto fd_out;
+
+fd_out:
+	close(fd);
+out:
+	test_probe_read_user_str__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
new file mode 100644
index 000000000000..41c3e296566e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
@@ -0,0 +1,34 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#include <sys/types.h>
+
+struct sys_enter_write_args {
+	unsigned long long pad;
+	int syscall_nr;
+	int pad1; /* 4 byte hole */
+	unsigned int fd;
+	int pad2; /* 4 byte hole */
+	const char *buf;
+	size_t count;
+};
+
+pid_t pid = 0;
+int ret = 0;
+char buf[256] = {};
+
+SEC("tracepoint/syscalls/sys_enter_write")
+int on_write(struct sys_enter_write_args *ctx)
+{
+	if (pid != (bpf_get_current_pid_tgid() >> 32))
+		return 0;
+
+	ret = bpf_probe_read_user_str(buf, sizeof(buf), ctx->buf);
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
--
2.28.0

2020-11-05 09:01:55

by David Laight

Subject: RE: [PATCH bpf v2 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator

From: Daniel Xu
> Sent: 05 November 2020 02:26
...
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
> goto byte_at_a_time;
>
> while (max >= sizeof(unsigned long)) {
> - unsigned long c, data;
> + unsigned long c, data, mask, *out;
>
> /* Fall back to byte-at-a-time if we get a page fault */
> unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);

It's not related to this change, but since both addresses
are aligned (checked earlier) a page fault on the word read
is fatal.

David


2020-11-05 18:20:53

by Song Liu

Subject: Re: [PATCH bpf v2 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator



> On Nov 4, 2020, at 6:25 PM, Daniel Xu <[email protected]> wrote:
>
> do_strncpy_from_user() may copy some extra bytes after the NUL

We have multiple use of "NUL" here, should be "NULL"?

> terminator into the destination buffer. This usually does not matter for
> normal string operations. However, when BPF programs key BPF maps with
> strings, this matters a lot.
>
> A BPF program may read strings from user memory by calling the
> bpf_probe_read_user_str() helper which eventually calls
> do_strncpy_from_user(). The program can then key a map with the
> resulting string. BPF map keys are fixed-width and string-agnostic,
> meaning that map keys are treated as a set of bytes.
>
> The issue is when do_strncpy_from_user() overcopies bytes after the NUL
> terminator, it can result in seemingly identical strings occupying
> multiple slots in a BPF map. This behavior is subtle and totally
> unexpected by the user.
>
> This commit uses the proper word-at-a-time APIs to avoid overcopying.
>
> Fixes: 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
> Signed-off-by: Daniel Xu <[email protected]>
> ---
> lib/strncpy_from_user.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> index e6d5fcc2cdf3..d084189eb05c 100644
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
> goto byte_at_a_time;
>
> while (max >= sizeof(unsigned long)) {
> - unsigned long c, data;
> + unsigned long c, data, mask, *out;
>
> /* Fall back to byte-at-a-time if we get a page fault */
> unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
>
> - *(unsigned long *)(dst+res) = c;
> if (has_zero(c, &data, &constants)) {
> data = prep_zero_mask(c, data, &constants);
> data = create_zero_mask(data);
> + mask = zero_bytemask(data);
> + out = (unsigned long *)(dst+res);
> + *out = (*out & ~mask) | (c & mask);
> return res + find_zero(data);
> + } else {

This else clause is not needed, as we return in the if clause.

> + *(unsigned long *)(dst+res) = c;
> }
> +
> res += sizeof(unsigned long);
> max -= sizeof(unsigned long);
> }
> --
> 2.28.0
>

2020-11-05 18:22:59

by Song Liu

Subject: Re: [PATCH bpf v2 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator



> On Nov 5, 2020, at 10:16 AM, Song Liu <[email protected]> wrote:
>
>
>
>> On Nov 4, 2020, at 6:25 PM, Daniel Xu <[email protected]> wrote:
>>
>> do_strncpy_from_user() may copy some extra bytes after the NUL
>
> We have multiple use of "NUL" here, should be "NULL"?

Just realized strncpy_from_user.c uses "NUL", so never mind...

>
>> terminator into the destination buffer. This usually does not matter for
>> normal string operations. However, when BPF programs key BPF maps with
>> strings, this matters a lot.
>>
>> A BPF program may read strings from user memory by calling the
>> bpf_probe_read_user_str() helper which eventually calls
>> do_strncpy_from_user(). The program can then key a map with the
>> resulting string. BPF map keys are fixed-width and string-agnostic,
>> meaning that map keys are treated as a set of bytes.
>>
>> The issue is when do_strncpy_from_user() overcopies bytes after the NUL
>> terminator, it can result in seemingly identical strings occupying
>> multiple slots in a BPF map. This behavior is subtle and totally
>> unexpected by the user.
>>
>> This commit uses the proper word-at-a-time APIs to avoid overcopying.
>>
>> Fixes: 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
>> Signed-off-by: Daniel Xu <[email protected]>
>> ---
>> lib/strncpy_from_user.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
>> index e6d5fcc2cdf3..d084189eb05c 100644
>> --- a/lib/strncpy_from_user.c
>> +++ b/lib/strncpy_from_user.c
>> @@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
>> goto byte_at_a_time;
>>
>> while (max >= sizeof(unsigned long)) {
>> - unsigned long c, data;
>> + unsigned long c, data, mask, *out;
>>
>> /* Fall back to byte-at-a-time if we get a page fault */
>> unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
>>
>> - *(unsigned long *)(dst+res) = c;
>> if (has_zero(c, &data, &constants)) {
>> data = prep_zero_mask(c, data, &constants);
>> data = create_zero_mask(data);
>> + mask = zero_bytemask(data);
>> + out = (unsigned long *)(dst+res);
>> + *out = (*out & ~mask) | (c & mask);
>> return res + find_zero(data);
>> + } else {
>
> This else clause is not needed, as we return in the if clause.
>
>> + *(unsigned long *)(dst+res) = c;
>> }
>> +
>> res += sizeof(unsigned long);
>> max -= sizeof(unsigned long);
>> }
>> --
>> 2.28.0

2020-11-05 18:32:21

by Song Liu

Subject: Re: [PATCH bpf v2 2/2] selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL



> On Nov 4, 2020, at 6:25 PM, Daniel Xu <[email protected]> wrote:
>
> Previously, bpf_probe_read_user_str() could potentially overcopy the
> trailing bytes after the NUL due to how do_strncpy_from_user() does the
> copy in long-sized strides. The issue has been fixed in the previous
> commit.
>
> This commit adds a selftest that ensures we don't regress
> bpf_probe_read_user_str() again.
>
> Signed-off-by: Daniel Xu <[email protected]>
> ---
> .../bpf/prog_tests/probe_read_user_str.c | 60 +++++++++++++++++++
> .../bpf/progs/test_probe_read_user_str.c | 34 +++++++++++
> 2 files changed, 94 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> create mode 100644 tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c b/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> new file mode 100644
> index 000000000000..597a166e6c8d
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> @@ -0,0 +1,60 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <test_progs.h>
> +#include "test_probe_read_user_str.skel.h"
> +
> +static const char str[] = "mestring";
> +
> +void test_probe_read_user_str(void)
> +{
> + struct test_probe_read_user_str *skel;
> + int fd, err, duration = 0;
> + char buf[256];
> + ssize_t n;
> +
> + skel = test_probe_read_user_str__open_and_load();
> + if (CHECK(!skel, "test_probe_read_user_str__open_and_load",
> + "skeleton open and load failed\n"))
> + goto out;

nit: we can just return here.

> +
> + err = test_probe_read_user_str__attach(skel);
> + if (CHECK(err, "test_probe_read_user_str__attach",
> + "skeleton attach failed: %d\n", err))
> + goto out;
> +
> + fd = open("/dev/null", O_WRONLY);
> + if (CHECK(fd < 0, "open", "open /dev/null failed: %d\n", fd))
> + goto out;
> +
> + /* Give pid to bpf prog so it doesn't read from anyone else */
> + skel->bss->pid = getpid();

It is better to set pid before attaching skel.

> +
> + /* Ensure bytes after string are ones */
> + memset(buf, 1, sizeof(buf));
> + memcpy(buf, str, sizeof(str));
> +
> + /* Trigger tracepoint */
> + n = write(fd, buf, sizeof(buf));
> + if (CHECK(n != sizeof(buf), "write", "write failed: %ld\n", n))
> + goto fd_out;
> +
> + /* Did helper fail? */
> + if (CHECK(skel->bss->ret < 0, "prog ret", "prog returned: %d\n",

In most cases, we use underscore instead of spaces in the second argument
of CHECK(). IOW, please use "prog_ret" instead of "prog ret".

> + skel->bss->ret))
> + goto fd_out;
> +
> + /* Check that string was copied correctly */
> + err = memcmp(skel->bss->buf, str, sizeof(str));
> + if (CHECK(err, "memcmp", "prog copied wrong string"))
> + goto fd_out;
> +
> + /* Now check that no extra trailing bytes were copied */
> + memset(buf, 0, sizeof(buf));
> + err = memcmp(skel->bss->buf + sizeof(str), buf, sizeof(buf) - sizeof(str));
> + if (CHECK(err, "memcmp", "trailing bytes were not stripped"))
> + goto fd_out;
> +
> +fd_out:
> + close(fd);
> +out:
> + test_probe_read_user_str__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> new file mode 100644
> index 000000000000..41c3e296566e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> @@ -0,0 +1,34 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +#include <sys/types.h>
> +
> +struct sys_enter_write_args {
> + unsigned long long pad;
> + int syscall_nr;
> + int pad1; /* 4 byte hole */
> + unsigned int fd;
> + int pad2; /* 4 byte hole */
> + const char *buf;
> + size_t count;
> +};
> +
> +pid_t pid = 0;
> +int ret = 0;
> +char buf[256] = {};
> +
> +SEC("tracepoint/syscalls/sys_enter_write")
> +int on_write(struct sys_enter_write_args *ctx)
> +{
> + if (pid != (bpf_get_current_pid_tgid() >> 32))
> + return 0;
> +
> + ret = bpf_probe_read_user_str(buf, sizeof(buf), ctx->buf);

bpf_probe_read_user_str() returns "long". Let's use "long ret;"

> +
> + return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> --
> 2.28.0
>

2020-11-05 19:31:52

by Daniel Xu

Subject: Re: [PATCH bpf v2 2/2] selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL

On Thu Nov 5, 2020 at 10:30 AM PST, Song Liu wrote:
>
>
> > On Nov 4, 2020, at 6:25 PM, Daniel Xu <[email protected]> wrote:
> >
> > Previously, bpf_probe_read_user_str() could potentially overcopy the
> > trailing bytes after the NUL due to how do_strncpy_from_user() does the
> > copy in long-sized strides. The issue has been fixed in the previous
> > commit.
> >
> > This commit adds a selftest that ensures we don't regress
> > bpf_probe_read_user_str() again.
> >
> > Signed-off-by: Daniel Xu <[email protected]>
> > ---
> > .../bpf/prog_tests/probe_read_user_str.c | 60 +++++++++++++++++++
> > .../bpf/progs/test_probe_read_user_str.c | 34 +++++++++++
> > 2 files changed, 94 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> > create mode 100644 tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c b/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> > new file mode 100644
> > index 000000000000..597a166e6c8d
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> > @@ -0,0 +1,60 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <test_progs.h>
> > +#include "test_probe_read_user_str.skel.h"
> > +
> > +static const char str[] = "mestring";
> > +
> > +void test_probe_read_user_str(void)
> > +{
> > + struct test_probe_read_user_str *skel;
> > + int fd, err, duration = 0;
> > + char buf[256];
> > + ssize_t n;
> > +
> > + skel = test_probe_read_user_str__open_and_load();
> > + if (CHECK(!skel, "test_probe_read_user_str__open_and_load",
> > + "skeleton open and load failed\n"))
> > + goto out;
>
> nit: we can just return here.
>
> > +
> > + err = test_probe_read_user_str__attach(skel);
> > + if (CHECK(err, "test_probe_read_user_str__attach",
> > + "skeleton attach failed: %d\n", err))
> > + goto out;
> > +
> > + fd = open("/dev/null", O_WRONLY);
> > + if (CHECK(fd < 0, "open", "open /dev/null failed: %d\n", fd))
> > + goto out;
> > +
> > + /* Give pid to bpf prog so it doesn't read from anyone else */
> > + skel->bss->pid = getpid();
>
> It is better to set pid before attaching skel.
>
> > +
> > + /* Ensure bytes after string are ones */
> > + memset(buf, 1, sizeof(buf));
> > + memcpy(buf, str, sizeof(str));
> > +
> > + /* Trigger tracepoint */
> > + n = write(fd, buf, sizeof(buf));
> > + if (CHECK(n != sizeof(buf), "write", "write failed: %ld\n", n))
> > + goto fd_out;
> > +
> > + /* Did helper fail? */
> > + if (CHECK(skel->bss->ret < 0, "prog ret", "prog returned: %d\n",
>
> In most cases, we use underscore instead of spaces in the second
> argument
> of CHECK(). IOW, please use "prog_ret" instead of "prog ret".
>
> > + skel->bss->ret))
> > + goto fd_out;
> > +
> > + /* Check that string was copied correctly */
> > + err = memcmp(skel->bss->buf, str, sizeof(str));
> > + if (CHECK(err, "memcmp", "prog copied wrong string"))
> > + goto fd_out;
> > +
> > + /* Now check that no extra trailing bytes were copied */
> > + memset(buf, 0, sizeof(buf));
> > + err = memcmp(skel->bss->buf + sizeof(str), buf, sizeof(buf) - sizeof(str));
> > + if (CHECK(err, "memcmp", "trailing bytes were not stripped"))
> > + goto fd_out;
> > +
> > +fd_out:
> > + close(fd);
> > +out:
> > + test_probe_read_user_str__destroy(skel);
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> > new file mode 100644
> > index 000000000000..41c3e296566e
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> > @@ -0,0 +1,34 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/bpf.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +
> > +#include <sys/types.h>
> > +
> > +struct sys_enter_write_args {
> > + unsigned long long pad;
> > + int syscall_nr;
> > + int pad1; /* 4 byte hole */
> > + unsigned int fd;
> > + int pad2; /* 4 byte hole */
> > + const char *buf;
> > + size_t count;
> > +};
> > +
> > +pid_t pid = 0;
> > +int ret = 0;
> > +char buf[256] = {};
> > +
> > +SEC("tracepoint/syscalls/sys_enter_write")
> > +int on_write(struct sys_enter_write_args *ctx)
> > +{
> > + if (pid != (bpf_get_current_pid_tgid() >> 32))
> > + return 0;
> > +
> > + ret = bpf_probe_read_user_str(buf, sizeof(buf), ctx->buf);
>
> bpf_probe_read_user_str() returns "long". Let's use "long ret;"

Thanks for the review, will send v3 with these changes.

[...]

2020-11-05 19:33:09

by Daniel Xu

Subject: Re: [PATCH bpf v2 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator

On Thu Nov 5, 2020 at 10:16 AM PST, Song Liu wrote:
>
>
> > On Nov 4, 2020, at 6:25 PM, Daniel Xu <[email protected]> wrote:
> >
> > do_strncpy_from_user() may copy some extra bytes after the NUL
>
> We have multiple use of "NUL" here, should be "NULL"?
>
> > terminator into the destination buffer. This usually does not matter for
> > normal string operations. However, when BPF programs key BPF maps with
> > strings, this matters a lot.
> >
> > A BPF program may read strings from user memory by calling the
> > bpf_probe_read_user_str() helper which eventually calls
> > do_strncpy_from_user(). The program can then key a map with the
> > resulting string. BPF map keys are fixed-width and string-agnostic,
> > meaning that map keys are treated as a set of bytes.
> >
> > The issue is when do_strncpy_from_user() overcopies bytes after the NUL
> > terminator, it can result in seemingly identical strings occupying
> > multiple slots in a BPF map. This behavior is subtle and totally
> > unexpected by the user.
> >
> > This commit uses the proper word-at-a-time APIs to avoid overcopying.
> >
> > Fixes: 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
> > Signed-off-by: Daniel Xu <[email protected]>
> > ---
> > lib/strncpy_from_user.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> > index e6d5fcc2cdf3..d084189eb05c 100644
> > --- a/lib/strncpy_from_user.c
> > +++ b/lib/strncpy_from_user.c
> > @@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
> > goto byte_at_a_time;
> >
> > while (max >= sizeof(unsigned long)) {
> > - unsigned long c, data;
> > + unsigned long c, data, mask, *out;
> >
> > /* Fall back to byte-at-a-time if we get a page fault */
> > unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
> >
> > - *(unsigned long *)(dst+res) = c;
> > if (has_zero(c, &data, &constants)) {
> > data = prep_zero_mask(c, data, &constants);
> > data = create_zero_mask(data);
> > + mask = zero_bytemask(data);
> > + out = (unsigned long *)(dst+res);
> > + *out = (*out & ~mask) | (c & mask);
> > return res + find_zero(data);
> > + } else {
>
> This else clause is not needed, as we return in the if clause.

Thanks, will change in v3.

[..]

2020-11-05 21:36:54

by Andrii Nakryiko

Subject: Re: [PATCH bpf v2 2/2] selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL

On Wed, Nov 4, 2020 at 8:51 PM Daniel Xu <[email protected]> wrote:
>
> Previously, bpf_probe_read_user_str() could potentially overcopy the
> trailing bytes after the NUL due to how do_strncpy_from_user() does the
> copy in long-sized strides. The issue has been fixed in the previous
> commit.
>
> This commit adds a selftest that ensures we don't regress
> bpf_probe_read_user_str() again.
>
> Signed-off-by: Daniel Xu <[email protected]>
> ---
> .../bpf/prog_tests/probe_read_user_str.c | 60 +++++++++++++++++++
> .../bpf/progs/test_probe_read_user_str.c | 34 +++++++++++
> 2 files changed, 94 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> create mode 100644 tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c b/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> new file mode 100644
> index 000000000000..597a166e6c8d
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/probe_read_user_str.c
> @@ -0,0 +1,60 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <test_progs.h>
> +#include "test_probe_read_user_str.skel.h"
> +
> +static const char str[] = "mestring";
> +
> +void test_probe_read_user_str(void)
> +{
> + struct test_probe_read_user_str *skel;
> + int fd, err, duration = 0;
> + char buf[256];
> + ssize_t n;
> +
> + skel = test_probe_read_user_str__open_and_load();
> + if (CHECK(!skel, "test_probe_read_user_str__open_and_load",
> + "skeleton open and load failed\n"))
> + goto out;
> +
> + err = test_probe_read_user_str__attach(skel);
> + if (CHECK(err, "test_probe_read_user_str__attach",
> + "skeleton attach failed: %d\n", err))
> + goto out;
> +
> + fd = open("/dev/null", O_WRONLY);
> + if (CHECK(fd < 0, "open", "open /dev/null failed: %d\n", fd))
> + goto out;
> +
> + /* Give pid to bpf prog so it doesn't read from anyone else */
> + skel->bss->pid = getpid();
> +
> + /* Ensure bytes after string are ones */
> + memset(buf, 1, sizeof(buf));
> + memcpy(buf, str, sizeof(str));
> +
> + /* Trigger tracepoint */
> + n = write(fd, buf, sizeof(buf));
> + if (CHECK(n != sizeof(buf), "write", "write failed: %ld\n", n))
> + goto fd_out;
> +
> + /* Did helper fail? */
> + if (CHECK(skel->bss->ret < 0, "prog ret", "prog returned: %d\n",
> + skel->bss->ret))
> + goto fd_out;
> +
> + /* Check that string was copied correctly */
> + err = memcmp(skel->bss->buf, str, sizeof(str));
> + if (CHECK(err, "memcmp", "prog copied wrong string"))
> + goto fd_out;
> +
> + /* Now check that no extra trailing bytes were copied */
> + memset(buf, 0, sizeof(buf));
> + err = memcmp(skel->bss->buf + sizeof(str), buf, sizeof(buf) - sizeof(str));
> + if (CHECK(err, "memcmp", "trailing bytes were not stripped"))
> + goto fd_out;
> +
> +fd_out:
> + close(fd);
> +out:
> + test_probe_read_user_str__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> new file mode 100644
> index 000000000000..41c3e296566e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> @@ -0,0 +1,34 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +#include <sys/types.h>
> +
> +struct sys_enter_write_args {
> + unsigned long long pad;
> + int syscall_nr;
> + int pad1; /* 4 byte hole */

I have a hunch that this explicit padding might break on big-endian
architectures?..

Can you instead include "vmlinux.h" in this file and use struct
trace_event_raw_sys_enter? you'll just need ctx->args[2] to get that
buffer pointer.

Alternatively, and it's probably simpler overall would be to just
provide user-space pointer through global variable:

void *user_ptr;


bpf_probe_read_user_str(buf, ..., user_ptr);

From user-space:

skel->bss->user_ptr = &my_userspace_buf;

Full control. You can trigger tracepoint with just an usleep(1), for instance.

> + unsigned int fd;
> + int pad2; /* 4 byte hole */
> + const char *buf;
> + size_t count;
> +};
> +
> +pid_t pid = 0;
> +int ret = 0;
> +char buf[256] = {};
> +
> +SEC("tracepoint/syscalls/sys_enter_write")
> +int on_write(struct sys_enter_write_args *ctx)
> +{
> + if (pid != (bpf_get_current_pid_tgid() >> 32))
> + return 0;
> +
> + ret = bpf_probe_read_user_str(buf, sizeof(buf), ctx->buf);
> +
> + return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> --
> 2.28.0
>

2020-11-05 23:26:04

by Daniel Xu

Subject: Re: [PATCH bpf v2 2/2] selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL

On Thu Nov 5, 2020 at 1:32 PM PST, Andrii Nakryiko wrote:
> On Wed, Nov 4, 2020 at 8:51 PM Daniel Xu <[email protected]> wrote:
[...]
> > diff --git a/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> > new file mode 100644
> > index 000000000000..41c3e296566e
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> > @@ -0,0 +1,34 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/bpf.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +
> > +#include <sys/types.h>
> > +
> > +struct sys_enter_write_args {
> > + unsigned long long pad;
> > + int syscall_nr;
> > + int pad1; /* 4 byte hole */
>
> I have a hunch that this explicit padding might break on big-endian
> architectures?..
>
> Can you instead include "vmlinux.h" in this file and use struct
> trace_event_raw_sys_enter? you'll just need ctx->args[2] to get that
> buffer pointer.
>
> Alternatively, and it's probably simpler overall would be to just
> provide user-space pointer through global variable:
>
> void *user_ptr;
>
>
> bpf_probe_read_user_str(buf, ..., user_ptr);
>
> From user-space:
>
> skel->bss->user_ptr = &my_userspace_buf;
>
> Full control. You can trigger tracepoint with just an usleep(1), for
> instance.

Yeah, that sounds better. I'll send a v4 with passing a ptr.

Thanks,
Daniel

[...]

2020-11-05 23:34:06

by Song Liu

Subject: Re: [PATCH bpf v2 2/2] selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL



> On Nov 5, 2020, at 3:22 PM, Daniel Xu <[email protected]> wrote:
>
> On Thu Nov 5, 2020 at 1:32 PM PST, Andrii Nakryiko wrote:
>> On Wed, Nov 4, 2020 at 8:51 PM Daniel Xu <[email protected]> wrote:
> [...]
>>> diff --git a/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
>>> new file mode 100644
>>> index 000000000000..41c3e296566e
>>> --- /dev/null
>>> +++ b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
>>> @@ -0,0 +1,34 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +
>>> +#include <linux/bpf.h>
>>> +#include <bpf/bpf_helpers.h>
>>> +#include <bpf/bpf_tracing.h>
>>> +
>>> +#include <sys/types.h>
>>> +
>>> +struct sys_enter_write_args {
>>> + unsigned long long pad;
>>> + int syscall_nr;
>>> + int pad1; /* 4 byte hole */
>>
>> I have a hunch that this explicit padding might break on big-endian
>> architectures?..
>>
>> Can you instead include "vmlinux.h" in this file and use struct
>> trace_event_raw_sys_enter? you'll just need ctx->args[2] to get that
>> buffer pointer.
>>
>> Alternatively, and it's probably simpler overall would be to just
>> provide user-space pointer through global variable:
>>
>> void *user_ptr;
>>
>>
>> bpf_probe_read_user_str(buf, ..., user_ptr);
>>
>> From user-space:
>>
>> skel->bss->user_ptr = &my_userspace_buf;
>>
>> Full control. You can trigger tracepoint with just an usleep(1), for
>> instance.
>
> Yeah, that sounds better. I'll send a v4 with passing a ptr.
>
> Thanks,
> Daniel

One more comment: how about we test multiple strings with different
lengths? That way, we can catch other alignment issues.

Thanks,
Song

2020-11-05 23:58:05

by Daniel Xu

Subject: Re: [PATCH bpf v2 2/2] selftest/bpf: Test bpf_probe_read_user_str() strips trailing bytes after NUL

On Thu Nov 5, 2020 at 3:31 PM PST, Song Liu wrote:
>
>
> > On Nov 5, 2020, at 3:22 PM, Daniel Xu <[email protected]> wrote:
> >
> > On Thu Nov 5, 2020 at 1:32 PM PST, Andrii Nakryiko wrote:
> >> On Wed, Nov 4, 2020 at 8:51 PM Daniel Xu <[email protected]> wrote:
> > [...]
> >>> diff --git a/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> >>> new file mode 100644
> >>> index 000000000000..41c3e296566e
> >>> --- /dev/null
> >>> +++ b/tools/testing/selftests/bpf/progs/test_probe_read_user_str.c
> >>> @@ -0,0 +1,34 @@
> >>> +// SPDX-License-Identifier: GPL-2.0
> >>> +
> >>> +#include <linux/bpf.h>
> >>> +#include <bpf/bpf_helpers.h>
> >>> +#include <bpf/bpf_tracing.h>
> >>> +
> >>> +#include <sys/types.h>
> >>> +
> >>> +struct sys_enter_write_args {
> >>> + unsigned long long pad;
> >>> + int syscall_nr;
> >>> + int pad1; /* 4 byte hole */
> >>
> >> I have a hunch that this explicit padding might break on big-endian
> >> architectures?..
> >>
> >> Can you instead include "vmlinux.h" in this file and use struct
> >> trace_event_raw_sys_enter? you'll just need ctx->args[2] to get that
> >> buffer pointer.
> >>
> >> Alternatively, and it's probably simpler overall would be to just
> >> provide user-space pointer through global variable:
> >>
> >> void *user_ptr;
> >>
> >>
> >> bpf_probe_read_user_str(buf, ..., user_ptr);
> >>
> >> From user-space:
> >>
> >> skel->bss->user_ptr = &my_userspace_buf;
> >>
> >> Full control. You can trigger tracepoint with just an usleep(1), for
> >> instance.
> >
> > Yeah, that sounds better. I'll send a v4 with passing a ptr.
> >
> > Thanks,
> > Daniel
>
> One more comment, how about we test multiple strings with different
> lengths? In this way, we can catch other alignment issues.

Sure, will do that in v4 also.
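
For illustration, the multi-length check discussed above can be modelled
entirely in user space. This is only a sketch under stated assumptions:
copy_str_exact() and copy_str_overcopy() are hypothetical stand-ins (not
the kernel's do_strncpy_from_user()), the latter imitating a
word-at-a-time copy that leaks bytes after the NUL. The check pre-zeroes
the destination, copies strings of several lengths, and verifies that
every byte after the terminator is still zero.

```c
#include <stddef.h>
#include <string.h>

#define BUF_LEN 64

typedef long (*copy_fn)(char *dst, const char *src, size_t size);

/* Hypothetical reference copy: stops right after the NUL terminator. */
static long copy_str_exact(char *dst, const char *src, size_t size)
{
	for (size_t i = 0; i < size; i++) {
		dst[i] = src[i];
		if (!src[i])
			return (long)(i + 1);
	}
	return (long)size;
}

/* Hypothetical model of the bug: copies whole words, so bytes that
 * follow the NUL inside the same word land in dst as well. */
static long copy_str_overcopy(char *dst, const char *src, size_t size)
{
	const size_t w = sizeof(unsigned long);
	size_t off = 0;

	while (off + w <= size) {
		memcpy(dst + off, src + off, w);
		for (size_t i = 0; i < w; i++)
			if (!src[off + i])
				return (long)(off + i + 1);
		off += w;
	}
	return copy_str_exact(dst + off, src + off, size - off) + (long)off;
}

/* Returns 1 if the copy is correct and no byte after the NUL was written. */
static int trailing_bytes_clean(copy_fn copy, size_t len)
{
	char src[BUF_LEN], dst[BUF_LEN];
	long ret;

	memset(src, 'X', sizeof(src));	/* garbage after the terminator */
	memset(src, 'a', len);
	src[len] = '\0';
	memset(dst, 0, sizeof(dst));

	ret = copy(dst, src, sizeof(dst));
	if (ret != (long)len + 1 || strcmp(dst, src) != 0)
		return 0;
	for (size_t i = len + 1; i < sizeof(dst); i++)
		if (dst[i])
			return 0;
	return 1;
}

/* Runs the check for every length in [0, max_len]; max_len < BUF_LEN. */
static int count_failures(copy_fn copy, size_t max_len)
{
	int failures = 0;

	for (size_t len = 0; len <= max_len; len++)
		if (!trailing_bytes_clean(copy, len))
			failures++;
	return failures;
}
```

Sweeping lengths across a few word boundaries, for example
count_failures(copy_str_overcopy, 3 * sizeof(unsigned long)), matters
because the overcopying model passes whenever the NUL happens to be the
last byte of a word; a single fixed-length string could miss the problem.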

2020-11-15 10:41:59

by kernel test robot

Subject: [lib/strncpy_from_user.c] 2903f3d558: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=


Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 2903f3d55815f9bdbd4eff4f8e58e76400741f84 ("[PATCH bpf v2 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator")
url: https://github.com/0day-ci/linux/commits/Daniel-Xu/Fix-bpf_probe_read_user_str-overcopying/20201105-102832
base: https://git.kernel.org/cgit/linux/kernel/git/bpf/bpf.git master

in testcase: apachebench
version:
with following parameters:

runtime: 300s
concurrency: 4000
cluster: cs-localhost
cpufreq_governor: performance
ucode: 0x7000019

test-description: apachebench is a tool for benchmarking your Apache Hypertext Transfer Protocol (HTTP) server.
test-url: https://httpd.apache.org/docs/2.4/programs/ab.html


on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 48G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <[email protected]>


[ 29.725008] max_uptime=3600
[ 29.728144] RESULT_ROOT=/result/apachebench/cs-localhost-4000-performance-300s-ucode=0x7000019-monitor=3472ca3d/lkp-bdw-de1/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/2903f3d55815f9bdbd4eff4f8e58e76400741f84/0
[ 29.748008] LKP_SERVER=internal-lkp-server
[ 29.752464] softlockup_panic=1
[ 29.755861] prompt_ramdisk=0
[ 29.759420] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[ 29.767066] CPU: 7 PID: 1 Comm: init Not tainted 5.9.0-13438-g2903f3d55815 #1
[ 29.774206] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
[ 29.782369] Call Trace:
[ 29.784820] dump_stack+0x57/0x6a
[ 29.788136] panic+0x102/0x2d2
[ 29.791205] ? vfs_statx+0x7b/0x120
[ 29.794687] do_exit.cold+0xb2/0xbe
[ 29.798173] do_group_exit+0x3a/0xa0
[ 29.801749] __x64_sys_exit_group+0x14/0x20
[ 29.805928] do_syscall_64+0x33/0x40
[ 29.809498] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 29.814542] RIP: 0033:0x7fb35bfd4806
[ 29.818113] Code: 83 c8 ff c3 89 fa 41 b8 e7 00 00 00 be 3c 00 00 00 eb 10 90 89 d7 89 f0 0f 05 48 3d 00 f0 ff ff 77 22 f4 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff ff 76 e2 f7 d8 89 05 2a e9 00 00 eb d8 0f 1f 84 00
[ 29.836850] RSP: 002b:00007fffe0fe35f8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[ 29.844406] RAX: ffffffffffffffda RBX: 00007fb35bfdd208 RCX: 00007fb35bfd4806
[ 29.851528] RDX: 000000000000007f RSI: 000000000000003c RDI: 000000000000007f
[ 29.858654] RBP: 00007fb35bfe3e80 R08: 00000000000000e7 R09: 00007fffe0fe3508
[ 29.865778] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000002
[ 29.872901] R13: 0000000000000001 R14: 00007fb35bfe3eb0 R15: 0000000000000000
[ 29.880070] Kernel Offset: disabled
ACPI MEMORY or I/O RESET_REG.
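
The exitcode value in the panic line follows the usual Linux wait-status
layout: the exit code sits in bits 8-15 and the terminating signal in the
low 7 bits. A minimal sketch of the decoding, with hypothetical helper
names that are not part of the report:

```c
/* Exit code passed to exit()/exit_group(): bits 8-15 of the status. */
static int status_exit_code(unsigned int status)
{
	return (int)((status >> 8) & 0xff);
}

/* Terminating signal number: low 7 bits; 0 means a normal exit. */
static int status_term_signal(unsigned int status)
{
	return (int)(status & 0x7f);
}
```

For 0x00007f00 this gives exit code 127 and signal 0, which agrees with
RDI = 0x7f in the register dump (the first argument of exit_group,
ORIG_RAX = 0xe7). The log by itself does not show why init exited with
that status.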


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml



Thanks,
Oliver Sang


Attachments:
(No filename) (3.06 kB)
config-5.9.0-13438-g2903f3d55815 (174.22 kB)
job-script (8.10 kB)
dmesg.xz (17.39 kB)
job.yaml (5.29 kB)